Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversion to execline-2.7.0.0 #317

Closed
skarnet opened this issue Jan 14, 2021 · 21 comments
Closed

Conversion to execline-2.7.0.0 #317

skarnet opened this issue Jan 14, 2021 · 21 comments

Comments

@skarnet
Copy link
Contributor

skarnet commented Jan 14, 2021

Hi @jprjr !

So, my big January release has possibly broken a few things, some of which are obvious (xpathexec_r() in justc-envdir needs to be rewritten as xmexec_fm(), for instance), and some of which are... not.

Here, I'm concerned about how the changes I made to execline with 2.7.0.0 interact with the large amount of legacy execline code that exists in s6-overlay. I made the changes because they're QoL fixes for a vast majority of the execline code I write, they clarify the semantics and simplify the syntax, but there are a few edge cases that are incompatible (which is why it is a major version bump). The relevant changes are the following: (excerpt from the release announcement)

  • The forstdin program has changed. It now exits 0 if it has read at least one line, and 1 otherwise.
  • The default list of delimiters for backtick, withstdinas, forstdin and forbacktickx has been set to "\n", so by default those programs will read and/or split on lines and only lines.

I had a quick glance at the code and there are quite a few instances of forstdin, backtick, and forbacktickx. In any "normal" situation, the changes wouldn't impact them; but a quick glance isn't enough to guarantee that. It is possible that there are situations where forstdin/forbacktickx needs to exit 0 even when it has nothing to read. It is possible that forbacktickx is run with the default set of delimiters and it expects to split on spaces or tabs, not only on newlines, even though it's unlikely.

You've used s6-overlay much more than I have, and despite the code being Gorka's, you're probably more familiar with it than I am :-) so could you please have a deep look to see if there are problematic uses of those programs? If it's the case, the changes should be pretty trivial.

There are also parts where using the new execline behaviour would massively simplify the code: for instance, the size of with-retries could probably be cut in half. I would like to do a complete pass on it to modernize and simplify the code; but I don't think it's really worth it, because ultimately what would be needed is a complete rewrite using the new tools, s6-linux-init in particular - which offers a full sysvinit-compatible API and now supports containers. I would like to do this rewrite this year (I have the time for it), but I will need your help to get the specs down and for all the ops/build/CI/testing parts.

Hope to talk to you soon!

@jprjr
Copy link
Member

jprjr commented Jan 15, 2021

@skarnet ! Good to hear from you!

I'll take a look this weekend, I think the quickest/easiest path for delimiters changing is to just find instances of forstdin, backtick, backtickx without delimiters set, and set them to use the old defaults. That would guarantee that part of the behavior remains the same, then I could look more closely into what kind of input they're getting and reducing the delimiters to just what's needed.

For the forstdin exit code changing, I know in init-stage3, we can just replace an instance of if with foreground, to ignore the exit code. I think init-stage2 has some instances that check its exit code (running cont-init.d scripts), I'll get the automated skaware builds created with the latest versions of everything and test that out.

@jprjr
Copy link
Member

jprjr commented Jan 17, 2021

Looks like there's an issue - init-stage2 wants to loop over the files in /etc/fix-attrs.d (technically /var/run/s6/etc/fix-attrs.d), which results in forstdin being called with no input, it exits 1, and the rest of the init stops running. This will likely apply to cont-init.d, etc.

One idea is to write a utility to count the number of entries in a directory and use that to drive running the fix-attrs.d scripts, cont-init.d, etc. I'm not sure how to piece that together with just execline and s6-portable-utils.

Could a switch be added to forstdin (etc) to revert to the old behavior? I'm not sure how to disambiguate if a non-zero exit code is coming from forstdin, or if it's an approximation of a child exit code,

@jprjr
Copy link
Member

jprjr commented Jan 17, 2021

Scratch that, I may be able to rework this with elglob and forx

@skarnet
Copy link
Contributor Author

skarnet commented Jan 17, 2021

Just run the forstdin part in an additional foreground block. Then, if you need to catch child errors, you can test on $?; few execline programs actually use 1 as a valid exit code (only forstdin and the if family, IIRC).

@jprjr
Copy link
Member

jprjr commented Jan 17, 2021

I think the issue with that may be - if a script under cont-finish.d exits non-zero and S6_BEHAVIOUR_IF_STAGE2_FAILS is non-zero, the expectation is for everything to quit.

So a user-written script running under cont-init.d could return any non-zero exit code, not sure how to determine where that 1 exit code came from.

I think this elglob + forx method is working out, still doing some testing.

@jprjr
Copy link
Member

jprjr commented Jan 17, 2021

OK, I think I replicated the forstdin logic using elglob and forx, but now I'm hitting an issue with upgraded binaries.

If I take the last release, but overwrite the init-stage2 and init-stage3 scripts with the newest versions from the skaware-update branch, everything works fine:

FROM ubuntu:20.04

RUN apt-get update && apt-get install -y curl

RUN curl -R -L -o /tmp/s6-overlay-amd64.tar.gz \
    https://github.com/just-containers/s6-overlay/releases/download/v2.1.0.2/s6-overlay-amd64.tar.gz

RUN tar zxf /tmp/s6-overlay-amd64.tar.gz -C / --exclude="./bin" && \
    tar zxf /tmp/s6-overlay-amd64.tar.gz -C /usr ./bin

ADD https://github.com/just-containers/socklog-overlay/releases/download/v3.1.1-1/socklog-overlay-amd64.tar.gz /tmp/
RUN tar xzf /tmp/socklog-overlay-amd64.tar.gz -C /

RUN touch /etc/fix-attrs.d/dummy
RUN touch /etc/fix-attrs.d/dummy2
RUN touch /etc/fix-attrs.d/"dummy with spaces"

RUN curl -R -L -o /etc/s6/init/init-stage2 \
  https://raw.githubusercontent.com/just-containers/s6-overlay/skaware-update/builder/overlay-rootfs/etc/s6/init/init-stage2

RUN curl -R -L -o /etc/s6/init/init-stage3 \
  https://raw.githubusercontent.com/just-containers/s6-overlay/skaware-update/builder/overlay-rootfs/etc/s6/init/init-stage3

RUN chmod +x /etc/s6/init/init-stage2 && \
    chmod +x /etc/s6/init/init-stage3

ENTRYPOINT ["/init"]

I'm able to start a container like:

docker build -t test .
docker run --rm -ti test /bin/bash
(output appears, a shell comes up)
^D
(usual shutting down output appears, container shuts down)

Whereas if I use the latest build (so updated scripts and updated binaries), everything seems to start, but once I ctrl-D out of the shell, the container just hangs until I use docker kill.

Here's a Dockerfile for pulling the latest zip from Github Actions:

FROM ubuntu:20.04

RUN apt-get update && apt-get install -y unzip curl

RUN curl -R -L -o /tmp/dist.zip \
  https://nightly.link/just-containers/s6-overlay/workflows/all/skaware-update/dist.zip

RUN cd /tmp && unzip dist.zip

RUN tar zxf /tmp/s6-overlay-amd64.tar.gz -C / --exclude="./bin" && \
    tar zxf /tmp/s6-overlay-amd64.tar.gz -C /usr ./bin

ADD https://github.com/just-containers/socklog-overlay/releases/download/v3.1.1-1/socklog-overlay-amd64.tar.gz /tmp/
RUN tar xzf /tmp/socklog-overlay-amd64.tar.gz -C /

ENTRYPOINT ["/init"]

@skarnet
Copy link
Contributor Author

skarnet commented Jan 18, 2021

I don't have the infrastructure to reproduce, sorry. But I think I know where it comes from. Just in case, can you show the output of s6-ps -H -o user,pid,vsize,rss,tty,s,start,cttime,args in the container?

My suspicion: on s6-svscanctl -t, s6-svscan will now wait for its children to exit before exec'ing into .s6-svscan/finish. To get the old behaviour, you now need s6-svscanctl -tb; that's what needs to change at the end of init-stage2.

Man, both the new behaviour and s6-linux-init make the shutdown procedure waaay simpler and less clunky. I really want to dive down into this as soon as I can.

@jprjr
Copy link
Member

jprjr commented Jan 22, 2021

I've gone ahead and gotten everything to work with execline-2.7.0.0 with that s6-svscanctl -tb change, so we're squared away with the upgrade. But yeah, changing to s6-linux-init seems like the way to go.

I've gone ahead and added s6-linux-init to the just-containers/skaware repo, the latest release includes s6-linux-init binaries: https://github.com/just-containers/skaware/releases/tag/v2.0.5

@skarnet
Copy link
Contributor Author

skarnet commented Jan 22, 2021

Excellent, thanks!
Hijacking this issue now to talk about the upgrade and related stuff. :-)

I've heard from several people that it's best for an overlay to install its stuff in a restricted set of out-of-the-way directories, to avoid conflicting with pre-existing files. What do you think of

  • using slashpackage for the skaware build? (so all the binaries would be under subdirectories of /package, symlinked in /command)
  • also using slashpackage for s6-overlay, so installing the scripts into e.g. /package/admin/s6-overlay?

The user scripts in /etc would obviously not change.

On a related note, is /var/run/s6/services supposed to be public API? Are users supposed to be able to rely on a fixed place for the scandir and the repo of servicedirs?

Thanks for your answers - I'll certainly have more questions later.

@jprjr
Copy link
Member

jprjr commented Jan 22, 2021

Regarding /var/run/s6/services - that's a public API, we've had instructions that reference it in our README for some time, see: https://github.com/just-containers/s6-overlay#writing-an-optional-finish-script - we tell users to call s6-svscanctl -t /var/run/s6/services in your finish script to bring the whole thing down if your service dies.

Not sure re: slashpackage. I do like the idea of moving out of /usr and /bin (especially because of that whole "some distros symlink /bin to /usr/bin" nonsense), but I'd be very concerned about breaking existing scripts that call binaries directly. There are likely to be a lot of run scripts out there with #!/bin/execlineb for the interpreter.

@skarnet
Copy link
Contributor Author

skarnet commented Jan 22, 2021

Yeah, we'd have a symlink for /bin/execlineb, obviously. And maybe for other stuff if it appears that users rely on absolute paths - which they should definitely not do on FHS.

As for the scandir, how does s6-overlay populate it, exactly? The thing is, s6-linux-init provides an empty scandir and expects the rc.init scripts to fill it up (typically by invoking a service manager). Is /var/run/s6/services supposed to be already populated at container launch time, or are the servicedirs stored somewhere else and s6-overlay's init sequence links them? (Sorry for the obvious question, I'm working on something else and don't want to take the time to dive in the code...)

I suggest the new scandir should be in /run/service, which is the s6-l-i default, and /var/run/s6/services can be a compatibility symlink to it.

@jprjr
Copy link
Member

jprjr commented Jan 22, 2021

/var/run/s6/services gets populated at runtime, the user isn't expected to have it already created. When launched we:

  • s6-rmrf the /var/run/s6/services directory
  • use s6-hiercopy to copy in /etc/s6/services (our internal services, like the catchall logger)
  • bring up s6-svscan
  • run scripts in fix-attrs.d, cont-init.d, etc
  • Do another s6-hiercopy to bring services from /etc/services.d in and s6-svscanctl -a to add them.

@skarnet
Copy link
Contributor Author

skarnet commented Jan 22, 2021

Ah, so the user service repo is /etc/services.d. Perfect, this will be easy to convert to s6-l-i.

@skarnet
Copy link
Contributor Author

skarnet commented Jan 24, 2021

My next question: do you think it's acceptable to introduce a dependency to /bin/sh? Are there any containers out there that do not include a shell?
Rationale: execline is only worth it, performance-wise, for simple scripts, and at the level of complexity of the scripts we have, a shell would actually be more performant, if we can use it. Introducing s6-linux-init will significantly simplify things, but still, only using execline when it's an actual benefit would be useful, if we have access to sh.

@jprjr
Copy link
Member

jprjr commented Jan 24, 2021

Not sure, to be honest. I've always really liked that, in theory, you could use s6-overlay as an init system along with a binary and literally nothing else in the image. But I know I personally don't use that, and I don't know how popular that concept is.

@skarnet
Copy link
Contributor Author

skarnet commented Jan 24, 2021

Hm. If that case arises, would it be acceptable to just package a busybox containing only ash, for the people who need it, the same way you are currently packaging socklog for people who need a syslogd?

@crazy-max
Copy link

crazy-max commented Mar 3, 2021

Hi @jprjr @skarnet, I'm here about the CI/packaging part.

I've created an alpine-s6 repository to be able to provide the whole "s6 and overlay stack" for major usecases. You only need Docker (with buildx/bake) for this purpose.

Here is the 3 distribution packages provided by this repo through GitHub Actions. Let me know if you're interested to use it here or improve. Thanks for your work!

Artifacts

# Build artifacts and output to ./dist
docker buildx bake artifact-all

This command will build the Dockerfile artifacts target as stipulated in the bake definition for the following platforms: linux/amd64, linux/arm/v6, linux/arm/v7, linux/arm64, linux/386, linux/ppc64le, linux/s390x and will be released on GitHub with the specified git semver tag: https://github.com/crazy-max/docker-alpine-s6/releases/tag/2.2.0.3-r4

image

Docker image

# Build multi-platform image
docker buildx bake image-all

And also a multi-platform Docker image for Alpine 3.11, 3.12, 3.13 and edge:

image

And can be used as a base image for everyone like this:

FROM crazymax/alpine-s6:3.13-2.1.0.2

Docker dist image

# Build multi-platform dist image
docker buildx bake image-dist-all

This Docker image is a "distribution image" which only contains the required artifacts (scratch based):

image

This image can be useful if you want to copy the required artifacts for a specific platform which is not based on vanilla alpine like the python one:

FROM crazymax/alpine-s6-dist:3.12-2.1.0.2 AS s6
FROM python:3.8-alpine3.12
COPY --from=s6 / /

@dermotbradley
Copy link
Contributor

@crazy-max

Hi. Just to point out that I'm the Alpine package maintainer for s6-overlay, s6-overlay-preinit, and justc-envdir. They are packaged in Alpine v3.13 and Edge with appropriate dependencies defined on the other s6 packages.

https://pkgs.alpinelinux.org/packages?name=s6-overlay&branch=v3.13
https://pkgs.alpinelinux.org/packages?name=s6-overlay&branch=edge

@crazy-max
Copy link

@dermotbradley

That's great but unfortunately was not available on 3.12 when I created this repo. But can't wait to be able to use Alpine packages! Thanks for your work on this!

@skarnet
Copy link
Contributor Author

skarnet commented Jan 20, 2022

s6-overlay-3.0.0.0 is out, which should fix all this; new Alpine package coming soon.
Closing this, please reopen if problems still arise.

@skarnet skarnet closed this as completed Jan 20, 2022
@crazy-max
Copy link

@skarnet Nice work, sounds really promising, will check that!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants