Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[supervisor] support shutdown task #11287

Closed
wants to merge 1 commit into from

Conversation

svenefftinge
Copy link
Member

@svenefftinge svenefftinge commented Jul 11, 2022

fixes #3966

Description

Related Issue(s)

Fixes #

How to test

Release Notes

Documentation

Werft options:

  • /werft with-preview

Front logo Front conversations

Sorry, something went wrong.

@roboquat
Copy link
Contributor

@svenefftinge: Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@werft-gitpod-dev-com
Copy link

started the job as gitpod-build-sefftinge-allow-end-tasks-which-3966.1 because the annotations in the pull request description changed
(with .werft/ from main)

@svenefftinge svenefftinge force-pushed the sefftinge/allow-end-tasks-which-3966 branch 4 times, most recently from 4d5ccc4 to 097d76f Compare July 12, 2022 12:04
@csweichel
Copy link
Contributor

Had a discussion with @svenefftinge about this:

  • shutdown tasks are subject to the terminationGracePeriod of a workspace, hence that time should be configurable in the .gitpod.yml. We would introduce a terminationGracePeriod property on the top level of the file.
  • We would introduce a new task type: shutdown. When a shutdown task is present, upon workspace stop, the child processes of the task executing terminal would receive SIGTERM followed by SIGKILL 10 seconds later. Then we'd execute the shutdown command in the same terminal. If the task terminal has been closed in the meantime we open a new terminal and execute the shutdown command there.
  • If a task has no shutdown command, the tasks command will keep running until all shutdown tasks have completed.
  • The log output of the shutdown tasks are stored in the workspace filesystem so that the logs are available when the workspace is restarted, or if we choose to upload them. Previously existing shutdown logs are overwritten.

Example .gitpod.yml:

# give the workspace enough time to finish the shutdown tasks
terminationGracePeriodSeconds: 300

tasks:
  # Note that the shutdown command is NOT part of the same task so that the database
  # keeps running until the `foobardb flush` command has been executed.
  - command: foobardb --datadir /workspace/data/db
  - shutdown: foobardb flush

  # Say you wanted to carry an image across multiple workspaces, using the shutdown command
  # you could push that image to a registry prior before the workspace shuts down
  - before: echo "$DOCKER_PWD" | docker login -u $DOCKER_USER --password-stdin myreg.company.dev
    init: docker pull myreg.company.dev/devstate/$GITPOD_OWNER_ID/server:latest
    command: docker run -it --name server -v /workspace/data/server:/data myreg.company.dev/devstate/$GITPOD_OWNER_ID/server:latest
    shutdown: docker container commit server myreg.company.dev/devstate/$GITPOD_OWNER_ID/server:latest && docker push myreg.company.dev/devstate/$GITPOD_OWNER_ID/server:latest

@svenefftinge
Copy link
Member Author

svenefftinge commented Jul 20, 2022

/werft run

👍 started the job as gitpod-build-sefftinge-allow-end-tasks-which-3966.9
(with .werft/ from main)

@svenefftinge
Copy link
Member Author

svenefftinge commented Jul 25, 2022

/werft run

👍 started the job as gitpod-build-sefftinge-allow-end-tasks-which-3966.10
(with .werft/ from main)

@stale
Copy link

stale bot commented Aug 11, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the meta: stale This issue/PR is stale and will be closed soon label Aug 11, 2022
@svenefftinge svenefftinge added meta: never-stale This issue can never become stale and removed meta: stale This issue/PR is stale and will be closed soon labels Sep 6, 2022
@MarcusSorealheis
Copy link

Is this PR still alive?

@svenefftinge
Copy link
Member Author

svenefftinge commented Sep 20, 2022

Yes, we got to prioritize some other topics for the time being but will pick this up soon again. Could you outline what you plan to use this for?

@metcalfc
Copy link
Contributor

We have folks that are affected by #11183 which currently seems to be caused by not running docker compose down.

This same set of folks would also use shutdown to push images to Docker registries and persist data to object store on workspace shutdown. They'd do the opposite on workspace startup load data from object store and pull images.

@svenefftinge svenefftinge force-pushed the sefftinge/allow-end-tasks-which-3966 branch from 097d76f to 801840b Compare September 26, 2022 14:48
@sagor999
Copy link
Contributor

Hey @svenefftinge
I see you recently pushed into this PR. Are you resuming work on this?
I am asking because we have this task in workspace: #11183 that seems like might get fixed by this PR. 🙏

@svenefftinge
Copy link
Member Author

svenefftinge commented Sep 27, 2022

@sagor999 there is real analysis of the underlying problem in #11183 can you elaborate a bit on how you hope this PR will fix those issues? Is the underlying problem that docker doesn't terminate gracefully on workspace stop and increasing the time would help?
Edit: Ah just saw this comment. 👍
It seems like docker-compose down is doing it's job on SIGTERM [ref]. Is it just that we don't give it enough time?

@sagor999
Copy link
Contributor

Yeah, it seems that way indeed.

@svenefftinge
Copy link
Member Author

update: depends on #13495

@akosyakov
Copy link
Member

akosyakov commented Oct 6, 2022

@csweichel Just to understand how I can get proper docker-compose running in the background similar how people are doing it with systemd for instance [1]. I would do following?

tasks:
  - init: docker-compose pull --quiet --parallel
    command: |
      docker-compose up &
      exit
    shutdown: docker-compose down

With this solution though task terminal still can pop up in VS Code while pulling for instance in a workspace without the prebuild. I wish we would have something like background property for tasks that it does not create pty service but only shell, for example:

tasks:
  - init: docker-compose pull --quiet --parallel
    command: docker-compose up
    shutdown: docker-compose down
    background: true

shutdown tasks are subject to the terminationGracePeriod of a workspace, hence that time should be configurable in the .gitpod.yml. We would introduce a terminationGracePeriod property on the top level of the file.

👍

We would introduce a new task type: shutdown. When a shutdown task is present, upon workspace stop, the child processes of the task executing terminal would receive SIGTERM followed by SIGKILL 10 seconds later. Then we'd execute the shutdown command in the same terminal. If the task terminal has been closed in the meantime we open a new terminal and execute the shutdown command there.

I'm not sure about the order consider another use case with docker-compose with terminal:

tasks:
  - init: docker-compose pull --quiet --parallel
    command: docker-compose up
    shutdown: docker-compose down

Sending SIGTERM to docker-compose up is not the same like running docker-compose down. It is not going to terminate gracefully. First shutdown command should be executed independently in own shell and only then SIGTERM should be sent to the terminal shell. I also not sure why we need terminal device for shutdown command at all.

If a task has no shutdown command, the tasks command will keep running until all shutdown tasks have completed.

I'm not sure what is reasoning, i.e. shutdown of one task can affect execution of another task? Should not we rather design that it should be local to the task itself?

The log output of the shutdown tasks are stored in the workspace filesystem so that the logs are available when the workspace is restarted, or if we choose to upload them. Previously existing shutdown logs are overwritten.

👍

@svenefftinge
Copy link
Member Author

Sending SIGTERM to docker-compose up is not the same like running docker-compose down. It is not going to terminate gracefully. First shutdown command should be executed independently in own shell and only then SIGTERM should be sent to the terminal shell. I also not sure why we need terminal device for shutdown command at all.

In that case people would need to start the shutdown in a separate terminal:

tasks:
  - init: docker-compose pull --quiet --parallel
    command: docker-compose up
  - shutdown: docker-compose down

@akosyakov
Copy link
Member

In that case people would need to start the shutdown in a separate terminal:

Is not it make it complex for a user there will be 2 different semantics depending whether shutdown is present or not? I wonder what use case won't be covered by all tasks are closed in parallel, shutdown is executed before SIGTERM? There is also gp sync-await if someone wants to couple command and shutdown between 2 tasks already?

@svenefftinge
Copy link
Member Author

Yes, makes sense. So you suggest to go back to the design where we don't have a shutdown lifecycle on terminals but one global one. I.e. what I propose here?

@akosyakov
Copy link
Member

Yes, makes sense. So you suggest to go back to the design where we don't have a shutdown lifecycle on terminals but one global one. I.e. what I propose #3966 (comment)?

I think it makes sense to have shutdown per a task, i.e. to reload docker-compose configuration. Having it under another task or globally won't allow it. I meant that the contract can be different:

  • each task is self contained and can have own shutdown command
  • shutdown command is executed first in separate shell
  • then SIGTERM is send to the terminal task process if it is still alive

We have gp tasks stop already and discussing to add gp tasks restart. I would think running gp tasks stop for some task should execute it shutdown command regardless of other tasks.

@svenefftinge
Copy link
Member Author

svenefftinge commented Oct 6, 2022

I think instead of introducing a hard-coded concept of a lifecycle for which the contract are not really clear (i.e. shutdown). We could look into building (or use an existing), convenient CLI that allows me to run something and run something else on signal. So we would just rely on the contract that processes receive a SIGTERM, or SIGHUP on stop.
Users can already do this with bash and trap or writing their own tools.

@metcalfc
Copy link
Contributor

metcalfc commented Oct 6, 2022

Users can already do this with bash and trap or writing their own tools.

This requires developers to be very experienced. Many developers do not live on the command line, do not use shell job management, don't know about the various signals, or how to trap them. I couldn't make this work, @axonasif had to give me a snippet to make it work. Many developers find Dockerfile challenging enough.

Most folks do not have to manage this today. They've got full VMs with real inits and systemd doing clean up. Having to manage this is something for us and our abstraction.

The original thoughts around shutdown came from a prospective customer, who wanted to manage db state, containers, etc at shutdown. In much the same way as they think of before, init, and command.

Its easy to drill into the specifics of the particular docker compose example but that wasn't the original signal this came from.

@mbrevoort
Copy link
Contributor

I quite dislike that we even have a notion of "stop" and "start". I wish that "stop" was hibernete and "restart" was resume. I wish that we could snapshot the current state versus go through a full lifecycle of starting and stopping a container. Though we use containers it feels like a leaky abstraction.

There are things I would like to be able to do as a developer to distinguish between shutdown and workspace deletion. (but like I mentioned, I don't like that we have start/stop in the first place)

For instance, for a project, for each workspace I create, I want to provision an SQS queue. And when the workspace is deleted I want to deprovision the SQS queue. When the workspace is running I want to be polling the queue.

I can see following the same lifecycle with a cloud based database as well. Provision a new database or keyspace, load with seed data and clean up when the workspace is deleted. Every "ephemeral" workspace gets it's own ephemeral cloud resources.

@axonasif
Copy link
Member

axonasif commented Oct 7, 2022

  • @akosyakov: then SIGTERM is send to the terminal task process if it is still alive

I would like to have an option within .gitpod.yml to disable that behavior for a task-group. This could be helpful when I want to have manually handle from shutdown of a task when necessary. As you mentioned, docker-compose up doesn't like SIGTERM as they just simply handle SIGTERM that way, I may not want to start it on the background as a daemon to avoid SIGTERM if that's the current implementation of where SIGTERMs are sent.

@akosyakov: Sending SIGTERM to docker-compose up is not the same like running docker-compose down. It is not going to terminate gracefully. First shutdown command should be executed independently in own shell and only then SIGTERM should be sent to the terminal shell. I also not sure why we need terminal device for shutdown command at all.

This could work too 👍

  • @csweichel: If a task has no shutdown command, the tasks command will keep running until all shutdown tasks have completed.

I assume this is close to what I'm asking.

@svenefftinge
Copy link
Member Author

This requires developers to be very experienced.

That's why I suggested building or using a convenient CLI.

@svenefftinge svenefftinge deleted the sefftinge/allow-end-tasks-which-3966 branch August 21, 2023 07:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow end-tasks, which will run when a workspace stops (timeout)
9 participants