
WIP: #191: Container friendliness #200

Closed
wants to merge 12 commits

Conversation

Tristan971

@Tristan971 Tristan971 commented Sep 29, 2020

Hi,

Following #191, I finally had time to put together an MR for "container friendliness", as I called it over there.

This is achieved mostly by implementing a mix of what I did a few weeks back on my own, and recommendations by @d--j in the aforementioned issue.

While not completely "shell-free", I tried to keep scripting to a minimum while still allowing reasonable environment-variable-driven configuration.

Now there are two things to note:

  • a slightly unfortunate limitation relating to signal handling and the docker entrypoint that I ran into, detailed hereafter
  • there are no tests (or CI setup) for this yet, and I do plan to at least have a go at it. However, this would involve some docker-compose (un)fun, which will be a bit tedious, so I'll wait until the implementation at least seems fine before committing that time 😅

Limitations:

  • you cannot use syslog logging, mostly because setting it up in a docker container would be overly complex, have little value, and considerably complicate and bloat the image
  • you can use the none logging, but you will then not see the logs, which is due to the next limitation
  • I was not able to come up with a solution using the foreground mode of mount.s3ql

For the latter, it seems to me that mount.s3ql does indeed terminate on receiving a stop signal, and unmounts the fuse mount, but doesn't cleanly close the filesystem. This means you should never "cancel" that process yourself, but rather run fusermount/umount.s3ql separately, which will eventually stop it.

In this setup, a stop signal has the following path:

  1. it is handled by docker (either by Ctrl+C-ing an interactively-attached container, or by docker stop and friends),
  2. the container's init, dumb-init, ensures it is proxied to entrypoint.sh rather than swallowed by docker's default init (and never seen by the entrypoint),
  3. it is received by entrypoint.sh,
  4. which executes the shutdown hook.

However, to make this work with the foreground mode, the signal needs to somehow be read by entrypoint.sh, and invoke the shutdown hook, then wait for the mount.s3ql process to finish, without the latter ever knowing about the signal...

Unfortunately I couldn't set this up; even when starting mount.s3ql as a background subshell, it would still always see the signal before the shutdown hook was called...

In the end, I believe this limitation is a reasonable tradeoff, unless some shell signals/concurrency guru is willing to help out.
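For reference, the flow I was aiming for can be sketched as follows. This is a minimal runnable sketch: `sleep` stands in for `mount.s3ql --fg` and an echo stands in for the real shutdown hook. One possibly relevant detail: dumb-init forwards signals to the whole process group by default, and only to the direct child when given `--single-child`, which may explain the background process always seeing the signal.

```shell
#!/bin/sh
# Sketch of the intended flow; `sleep` stands in for `mount.s3ql --fg ...`
# and the echo stands in for the real shutdown hook (both are stand-ins).
shutdown_hook() {
    echo "running shutdown hook (would call umount.s3ql here)"
}
trap shutdown_hook INT TERM

sleep 1 &            # stand-in for: mount.s3ql --fg ... &
mount_pid=$!
wait "$mount_pid"    # the trap fires here; the child must never see the signal
exit_code=$?
echo "mount process finished with code $exit_code"
```

The crux is that the `wait` must be interruptible by the trap while the backgrounded mount process remains untouched by the signal.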

@Tristan971 Tristan971 marked this pull request as draft September 29, 2020 03:21
README.rst
@d--j
Collaborator

d--j commented Sep 29, 2020

Limitations:

you cannot use the syslog logging, mostly because setting it up in a docker container would be overly complex, have little value, and considerably complexify/bloat the image
you can use the none logging, but you will then not see the logs, which is due to the next limitation
I was not able to come up with a solution using the foreground mode of mount.s3ql

We should probably find a solution so that you can use --fg and --log none. AFAIK logging directly to stdout is the preferred solution in a docker environment - and the tail -f while loop in the entrypoint script is suboptimal...

@Nikratio could we change mount.s3ql so that it cleanly unmounts the file system on a SIGINT/SIGTERM? Maybe activate this behaviour with a new argument --clean-umount-on-signal?

contrib/docker/1-start.sh
@Tristan971
Author

Limitations:
you cannot use the syslog logging, mostly because setting it up in a docker container would be overly complex, have little value, and considerably complexify/bloat the image
you can use the none logging, but you will then not see the logs, which is due to the next limitation
I was not able to come up with a solution using the foreground mode of mount.s3ql

We should probably find a solution so that you can use --fg and --log none. AFAIK logging directly to stdout is the most preferred solution in a docker environment - and the tail -f while loop in the entry point script is suboptimal...

@Nikratio could we change mount.s3ql so that it cleanly unmounts the file system on a SIGINT/SIGTERM? Maybe activate this behaviour with a new argument --clean-umount-on-signal?

I was thinking of avoiding suggesting extraneous changes to support this, but it'd be ideal indeed (and slightly less surprising behaviour too, maybe?)

contrib/docker/1-start.sh
contrib/docker/entrypoint.sh
@Nikratio
Collaborator

Nikratio commented Sep 30, 2020

I think there's some confusion around signals here. There is SIGSTOP (which is closest to the "stop signal" term that you use), SIGINT (which is generated on Ctrl-C) and SIGTERM (which I think is the signal you actually mean).

Could you clarify which signal(s) you are trying to use for what?

@Nikratio
Collaborator

@Nikratio could we change mount.s3ql so that it cleanly unmounts the file system on a SIGINT/SIGTERM? Maybe activate this behaviour with a new argument --clean-umount-on-signal?

I think it would be a good idea to use a signal for this purpose, yes. SIGINT or SIGHUP (or both) would be great choices.

I do not think we should change the meaning of SIGTERM, because that is currently the only way to terminate the process quickly without risk of data corruption (SIGKILL is not safe).

@Tristan971
Author

I think there's some confusion around signals here. There is SIGSTOP (which is closest to the "stop signal" term that you use), SIGINT (which is generated on Ctrl-C) and SIGTERM (which I think is the signal you actually mean).

Could you clarify which signal(s) you are trying to use for what?

I mean the signal sent to the container by the docker daemon (either via Ctrl+C or docker stop).

This is indeed INT on Ctrl+C and TERM on docker stop.

I'm not using them directly (i.e. not invoking kill myself), but rather want to make it work nicely with whatever dockerd and containerd use.

@d--j
Collaborator

d--j commented Sep 30, 2020

@Nikratio could we change mount.s3ql so that it cleanly unmounts the file system on a SIGINT/SIGTERM? Maybe activate this behaviour with a new argument --clean-umount-on-signal?

I think it would be a good idea to use a signal for this purpose, yes. SIGINT or SIGHUP (or both) would be great choices.

I do not think we should change the meaning of SIGTERM, because that is currently the only way to terminate the process quickly without risk of data corruption (SIGKILL is not safe).

Docker stop always uses SIGTERM, but since the container already uses dumb-init we could use dumb-init to rewrite SIGTERM to SIGINT and teach S3QL to do a graceful shutdown on SIGINT.

I have just created a pull request that makes S3QL unconditionally shut down cleanly on KeyboardInterrupt: #202
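The rewrite mentioned above is supported directly by dumb-init's `--rewrite` option, which takes signal numbers. A hypothetical Dockerfile fragment (the entrypoint path is an assumption for this sketch):

```dockerfile
# Rewrite SIGTERM (15) from `docker stop` into SIGINT (2) before it
# reaches the entrypoint; /entrypoint.sh is a hypothetical path
ENTRYPOINT ["dumb-init", "--rewrite", "15:2", "--", "/entrypoint.sh"]
```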

@Tristan971
Author

As per the comments so far, I've added a few changes, which include rewriting SIGTERM as SIGINT (so, assuming #202 goes in first, the none and --fg issues will go away), and a few other small things like forwarding return codes.

@@ -100,6 +100,8 @@ pre-release testers) we cannot guarantee that every release will run
on all non-Linux systems. Please report any bugs you find, and we will
try to fix them.

You can also run S3QL in a Docker container. See <insert docker README link
Collaborator

I'd be curious under which situations someone would want to do that. Given that the fuse module still has to be loaded in the host system kernel and that special privileges are needed for the guest, security and isolation are probably not applicable here. For dependency management, Python's venv system would probably also fit the bill better. So maybe add a sentence describing the conditions under which this makes sense?

Author

In most cases the fuse module is already loaded on the host by default, and if worst comes to worst, it's a fairly standard thing to enable on linux, at least compared to installing the whole toolchain needed to build and run S3QL.

Even with venvs it's not all that nice to do things directly on the host. For instance, Fedora 33 currently defaults to Python 3.9 as the provider of python3, which doesn't successfully build and run the project; a "whatever we baked into this image will run fine" guarantee is quite convenient for that kind of situation.
I don't want system updates to suddenly break my apps anymore (short of never updating and running debian oldstable till the end of times 🙂). I don't necessarily want to turn this into container proselytism (they have their issues for sure), but many other things like restart/update/log behaviours are, in my opinion, much simpler with them. Only matched by a well-oiled systemd setup, which is quite a bit more difficult to achieve.

Lastly, even if you disagree with all of the above: in my case, apps must be containerised to run on k8s anyway; there isn't a "run directly on the host" option that isn't awfully hacky.

However, I agree that this should be documented, either in the main README or the one in contrib/docker, whichever you think is more appropriate!

Collaborator

I think it's best to have 1-3 sentences in README. If you want, you can go into more detail in the Docker readme.

@@ -0,0 +1,19 @@
# build
Collaborator

Do we truly need this file? I believe we considered adding it in the past, but rejected it because it wasn't clear which versions should go in there and when they should be updated, and because it was a bit misleading (this is certainly not the only set of versions that works). If anything, this also looks like something that ought to be generated when creating a release tarball, not tracked in Git...

Author

Hmm... Not necessarily.

I was wondering about that, and just took it from my original setup. The only issue I had with not having it is the pip install step potentially pulling versions that don't actually work well (or are not compatible anymore, etc.).

I'd be happy to have one generated at build/test time and ingested in the docker image build step, however, so we know the versions are the same as the ones used to run the tests!

@Nikratio
Collaborator

Nikratio commented Oct 1, 2020

This is not a blocker for merging, but out of curiosity:

Do the 0-init.sh, 1-start.sh and similar files need to be shell scripts? It seems to me they contain nothing but boilerplate to pass-through options. It may be much simpler to turn them into Python scripts that completely replace the standard mount.s3ql entry points. That way, you could e.g. write a loop over the arguments that S3QL defines instead of having to hardcode them all.

@Tristan971
Author

This is not a blocker for merging, but out of curiosity:

Do the 0-init.sh, 1-start.sh and similar files need to be shell scripts? It seems to me they contain nothing but boilerplate to pass-through options. It may be much simpler to turn them into Python scripts that completely replace the standard mount.s3ql entry points. That way, you could e.g. write a loop over the arguments that S3QL defines instead of having to hardcode them all.

You are correct that they could be Python scripts; but at that point I would almost lean towards editing S3QL directly to accept environment variables as a fallback argument input?

@Nikratio
Collaborator

Nikratio commented Oct 1, 2020

This is not a blocker for merging, but out of curiosity:
Do the 0-init.sh, 1-start.sh and similar files need to be shell scripts? It seems to me they contain nothing but boilerplate to pass-through options. It may be much simpler to turn them into Python scripts that completely replace the standard mount.s3ql entry points. That way, you could e.g. write a loop over the arguments that S3QL defines instead of having to hardcode them all.

You are correct that they could be Python scripts; but at that point I would almost lean towards editing S3QL directly to accept environment variables as a fallback argument input?

Yeah, that's pretty much what I mean. Add some new --read-params-from-environment parameter and extend the parse_args code to read the environment when it is present.
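A minimal sketch of what such a flag could look like (the flag name follows the suggestion above; the two options and their defaults are a reduced, hypothetical subset of the real S3QL parser):

```python
import argparse
import os

def parse_args(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--cachedir", default="~/.s3ql")
    parser.add_argument("--compress", default="lzma-6")
    # When present, any option still holding its parser default falls back
    # to S3QL_<NAME> from the environment before the hardcoded default.
    parser.add_argument("--read-params-from-environment", action="store_true")
    options = parser.parse_args(argv)

    if options.read_params_from_environment:
        for action in parser._actions:
            env_name = "S3QL_" + action.dest.upper()
            # Only fill in options the user did not pass explicitly, i.e.
            # those still equal to the parser default.
            if (env_name in os.environ
                    and getattr(options, action.dest, None) == action.default):
                setattr(options, action.dest, os.environ[env_name])
    return options
```

One caveat of this sketch: an option explicitly set to its default value on the command line is indistinguishable from one left unset, so the environment would still override it.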

@IvoPereira

Hi there.

Sorry for "hijacking the thread", but what is the supposed behaviour of the Docker image when an S3QL file system is not mounted at the given storage URL?

What would be the best way to initialize it in case it doesn't already exist? Would the PR benefit from the addition of such a thing or would it be better to keep it separate?

@Tristan971
Author

Hi there.

Sorry for "hijacking the thread", but what is the supposed behaviour of the Docker image when an S3QL file system is not mounted at the given storage URL?

What would be the best way to initialize it in case it doesn't already exist? Would the PR benefit from the addition of such a thing or would it be better to keep it separate?

It would, in this case, be the same behaviour as running fsck.s3ql/mount.s3ql on such a storage (which is a failure)

Initialising it could be handy, but does seem of limited value to me? (if you do indeed suggest running mkfs.s3ql in these cases)

@d--j
Collaborator

d--j commented Oct 1, 2020

Sorry for "hijacking the thread", but what is the supposed behaviour of the Docker image when an S3QL file system is not mounted at the given storage URL?

What would be the best way to initialize it in case it doesn't already exist? Would the PR benefit from the addition of such a thing or would it be better to keep it separate?

We would probably need a way to do it – since when you run S3QL in docker, you probably do not have another S3QL installation lying around for calling mkfs.s3ql.

I suppose you can just overwrite the CMD of this image to execute mkfs.s3ql. Something like this should probably work right now:

docker run \
    -it \
    --rm \
    --device /dev/fuse \
    --cap-add SYS_ADMIN \
    -v /cache:/cache \
    s3ql-image-name:latest \
    mkfs.s3ql --cachedir /cache your://storage/path

We could of course unconditionally run mkfs.s3ql on every execution (before running fsck.s3ql) – mkfs.s3ql will not overwrite a file system unless you add --force.
I would see FS creation as a separate and deliberate task, though.

- rewrite SIGTERM as SIGINT in dumb-init,
- handle only SIGINT at the s3ql level
- remove unnecessary loop in entrypoint
- forward exit codes in all places
@Tristan971
Author

Tristan971 commented Oct 2, 2020

Did a sizable update:

  • extended the argparser (not sure this is the ideal way, however...) to support environment variables as fallback defaults for missing parameters, consulted before the hardcoded defaults
  • removed shell code that became unnecessary after this

I quite like how it ended up, but I've also got to admit I'm not a Python expert, and there's probably a better way to go about these things, so let me know.


I encountered one issue and one surprising thing:

  • issue first: I can't get --debug / S3QL_DEBUG to work, as it is an action="append_const" argument. Since I don't know how to tell argparse that it is "present" when S3QL_DEBUG=true, it seems to override const's value rather than just invoking its action (which kind of makes sense, I suppose)
  • surprise second: --log / S3QL_LOG has type str_or_None_type, which seems to basically do the "none" -> None conversion, but surprisingly also applies it to the default value; it works here, but it certainly wasn't what I expected

@IvoPereira

IvoPereira commented Oct 2, 2020

Initialising it could be handy, but does seem of limited value to me? (if you do indeed suggest running mkfs.s3ql in these cases)

What would you suggest then for initializing it in case it doesn't exist yet?

I mean, as @d--j suggests, we could even have the same docker image running in parallel, executing mkfs instead of fsck/mount. However, looking at this from the "one-click deployment" perspective that Docker usually implies, needing to run two containers seems somewhat hacky.

Wouldn't it be better if we were able to achieve some kind of idempotence here? Thinking about it, perhaps we could pass an additional environment variable to the s3ql Docker instance, create a flag file (or use a similar strategy so it won't happen again), and run an additional mkfs?

We could of course unconditionally run a mkfs.s3ql in every execution (before running fsck.s3ql) – mkfs.s3ql will not overwrite a file system unless you add --force.
I would see FS creation as a separate and deliberate task, tho.

You could do this as well, but wouldn't running it every single time be a waste of time and resources?
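The flag-file idea sketched above could look something like this. All names here are hypothetical, and MKFS defaults to a dry-run echo so the sketch runs anywhere without an S3QL installation; in a real entrypoint it would be mkfs.s3ql itself.

```shell
#!/bin/sh
# Sketch: run mkfs only on the very first start, guarded by a marker file.
# MKFS is a hypothetical indirection defaulting to a dry-run echo.
MKFS="${MKFS:-echo mkfs.s3ql}"
CACHEDIR="${S3QL_CACHEDIR:-$(mktemp -d)}"
STORAGE_URL="${S3QL_STORAGE_URL:-local:///data}"
MARKER="$CACHEDIR/.s3ql-fs-created"

if [ ! -e "$MARKER" ]; then
    # First run: create the filesystem, then record that we did
    $MKFS --cachedir "$CACHEDIR" "$STORAGE_URL" && touch "$MARKER"
else
    echo "filesystem already initialised, skipping mkfs"
fi
```

This keeps the container idempotent without re-running mkfs.s3ql on every start, at the cost of trusting the marker file (which lives in the cache dir and would be lost with it).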

src/s3ql/fsck.py
parser.add_argument("--force-remote", action="store_true", default=False,
help="Force use of remote metadata even when this would "
"likely result in data loss.")
parser.add_argument("--keep-cache", action="store_true",
Collaborator

Instead of making all these changes to every single add_argument() call, I think it's better to modify the add_argument function itself. You can derive the environment variable name from the parameter name and can leave all the add_argument() calls unchanged.

Author

Hi, just did that! Hope it's how you saw it being done.
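What "modifying add_argument itself" could look like, as a minimal sketch (the class name is hypothetical, and this is not the actual implementation in the PR): for every long option, derive S3QL_<NAME> and, when that variable is set, use its value as the option's default.

```python
import argparse
import os

class EnvDefaultArgumentParser(argparse.ArgumentParser):
    """Hypothetical sketch: env-var fallback applied inside add_argument,
    so the individual add_argument() calls stay unchanged."""

    def add_argument(self, *args, **kwargs):
        for opt in args:
            if isinstance(opt, str) and opt.startswith("--"):
                # --keep-cache -> S3QL_KEEP_CACHE
                env_name = "S3QL_" + opt[2:].replace("-", "_").upper()
                if env_name in os.environ:
                    kwargs["default"] = os.environ[env_name]
                    # An env-provided value also satisfies required options
                    kwargs["required"] = False
        return super().add_argument(*args, **kwargs)
```

This sidesteps touching every add_argument() call, though flags such as --debug (action="append_const") would still need special handling, and flags shared between binaries all map to the same variable name.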

…and automatic in-argparser kwargs default filtering
@Tristan971
Author

Tristan971 commented Oct 4, 2020

Another update (sorry it took longer; at least I learned some more Python going about it): this time it automatically tries to override the default of any argument with S3QL_<long form with dashes replaced by underscores>.

It works nicely; however, it introduces one outstanding issue: for flags like --keep-cache, --force, etc., which are shared amongst multiple binaries, it isn't possible anymore to differentiate them (e.g. force for unmount, but not for something else).

One option would be to add some property (a namespace of sorts) to binary-specific arguments (usually the ones outside of parse_args.py) and use it, when defined, in the generation of the environment variable name.

@IvoPereira

Just checking in to see whether there are still intentions to merge this into the main branch. If so, what is its current status? Is there anything left to be done?

@Tristan971
Author

Well I'm still using it personally, and hope to get it merged -- afaik what's missing is feedback on the final environment-variables-parsing setup (by Nikolaus), and writing some tests and additional documentation accordingly (by me)

I've been a little bit busy lately, but haven't forgotten

@Nikratio
Collaborator

Apologies, I didn't know you were waiting for my feedback. The environment parsing approach is fine with me.

@Tristan971
Author

Alright, I shall work on getting this cleaned-up, tested and documented in the next few days then

+ entrypoint creation of mount dir if absent (on first run or new node)
@Nikratio Nikratio force-pushed the master branch 3 times, most recently from c7188f0 to 6407f54 Compare December 30, 2020 20:56
@zhanghai

zhanghai commented Jan 4, 2021

Note that Dockerfile STOPSIGNAL can be used to configure docker to send SIGINT instead of SIGTERM - Nginx has been using it to get SIGQUIT at nginxinc/docker-nginx#377.

And for the stop grace period, one can use something like docker-compose down -t 60 or docker stop -t 60.
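The STOPSIGNAL part, as a Dockerfile fragment (the rest of the image is assumed):

```dockerfile
# Deliver SIGINT instead of the default SIGTERM on `docker stop`
STOPSIGNAL SIGINT
```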

For docker-compose.yml, there is also:

    stop_signal: SIGINT
    stop_grace_period: 1m

And here is a simple script for using the Ubuntu packaged version which doesn't have graceful stop on SIGINT/SIGTERM:

#!/bin/bash
set -e

# Create the mountpoint if it does not exist yet
if [[ ! -d "${S3QL_MOUNTPOINT}" ]]; then
    mkdir -p "${S3QL_MOUNTPOINT}"
fi

# On SIGINT/SIGTERM, unmount cleanly via umount.s3ql instead of letting
# mount.s3ql be killed (renamed from `umount` to avoid shadowing the system command)
unmount() {
    umount.s3ql --log none "${S3QL_MOUNTPOINT}"
}

trap unmount INT TERM

# Run mount.s3ql in the background and wait, so the trap can fire
mount.s3ql --allow-other --authfile "${S3QL_AUTHFILE}" --cachedir "${S3QL_CACHEDIR}" --cachesize "${S3QL_CACHESIZE}" --compress "${S3QL_COMPRESS}" --fg --keep-cache --log none "${S3QL_STORAGE_URL}" "${S3QL_MOUNTPOINT}" &
wait

I'm also configuring my docker files now, and thanks for the great work!

@Nikratio
Collaborator

Nikratio commented May 4, 2021

Is this ready for review? If not, are you still interested in finishing this up?

@Nikratio Nikratio force-pushed the master branch 3 times, most recently from 15d7789 to 9b9c95f Compare June 22, 2021 10:16
@IvoPereira

Any updates on this @Tristan971?

@Nikratio
Collaborator

Nikratio commented Sep 8, 2021

I'm closing this pull request for now. Feel free to re-open if/when you resume work on this again.

@Nikratio Nikratio closed this Sep 8, 2021