Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose output configuration via CLI arguments #1062

Merged
merged 13 commits into from
Oct 28, 2021
Merged

Conversation

shonfeder
Copy link
Contributor

@shonfeder shonfeder commented Oct 26, 2021

This plumbs configuration of the OutputManager's settings
for the out-dir and write-intermediate fields through to the CLI.

This is needed for us to control the location and kind of output for the
benchmark tests, and was planned in #1025

  • Tests added for any new code
  • Ran make fmt-fix (or had formatting run automatically on all files edited)
  • Documentation added for any new functionality
  • Entry added to UNRELEASED.md for any new functionality

@shonfeder
Copy link
Contributor Author

There's an issue with the docker container interacting with the file system in integration tests, but it doesn't have any impact on the core functionality added here, so I think this is ready for review, even while I figure out how to account for that detail.

Shon Feder added 7 commits October 27, 2021 13:16
Previously indicated incorrect config file
- Followup to #1025
- Required for #1059

This plumbs configuration of the OutputManager's settings
for the out-dir and write-intermediate fields through to the CLI.

This is needed for us to control the location and kind of output for the
benchmark tests, and was planned in #1025
@shonfeder
Copy link
Contributor Author

Oof. The docker CI job has been given me hell here. Been trapped for hours now trying to reconciling between broken permissions on generated output files and broken users home/dir for creation of the config directory. :(

The placement of env was shadowing the intended user
@shonfeder
Copy link
Contributor Author

shonfeder commented Oct 28, 2021

I have no idea why these new tests are failing. It seems clear it's a docker issue (cause the other integration tests are working as expected), and 2/3 aren't related to anything added in this pr.

I may need to try to re-engineer the way we're testing the docker images, but maybe it's a good idea to stop testing the docker container in our CI for now (it introduces a lot of IO/system-level complexity, and nearly doubles our longest CI time), until we can budget time to re-architect the dockerization? (And maybe just depricate this docker support in favor of some proper packaging?).

I'll thing about it a bit tomorrow after some rest.

@konnov
Copy link
Contributor

konnov commented Oct 28, 2021

Looking at the logs, I think the problem is in find -type f. Should not it be find -type d instead?

@shonfeder
Copy link
Contributor Author

Looking at the logs, I think the problem is in find -type f. Should not it be find -type d instead?

Thanks for looking! I don’t think that’s the problem, however. I’m only meaning to list the files, not the dirs (since the dir names are different every run). Note that the same tests are passing on the non-docker integration tests, and there are like 3 other tests using the same find command that are passing even in the docker container.

@shonfeder
Copy link
Contributor Author

My main conclusion after 2 days of fighting with the docerized integration tests, is there is way too much ad hoc complexity in the current way we are dockerization (and then testing) the application. I think it's taken me 3 times as long to debug the IO and environment interaction between the dockerized app and the tests as it took to add the feature introduced here. (This is mostly my own fault, as the complexity has gradually piled on over the last year+).

To illustrate, here's a brief summary of what we currently have to do to get the IO and environment threaded through correctly:

  • in test/.envrc, we look for an envvar USE_DOCKER, which will then shadow the path to apalache-mc with an alias to script/run-docker.sh. This enables using the dockerized version of the app in the integration tests.

  • in run-docker.sh, we are hard-coding in all environment variables that have to preserved inside of the docker container:

    cmd="docker run \
        -e OUT_DIR -e WRITE_INTERMEDIATE -e SMT_ENCODING \
        -e TLA_PATH -e JVM_ARGS \
        -e USER_ID=$(id -u) -e GROUP_ID=$(id -g) \
        --rm -v $(pwd):/var/apalache ${img} ${*}"
    

    without passing exposing these environment variables, we lose the functionality which these are meant to provide whenever the app is run in the container.

    Except the story is different with USER_ID and GROUP_ID...

  • we are also dynamically fetching the user id and group id of the user who invokes the script. This has to be threaded through the container for reasons described shortly.

  • the entrypoint of the docker container is bin/run-in-docker-container, where we are creating a new user and and group dynamically inside the container:

    # Create a user and group matching the invoking user
    # this is critical to avoid generating files with bad permissions
    USER_ID=${USER_ID:-1000}
    GROUP_ID=${GROUP_ID:-1000}
    groupadd --gid "$GROUP_ID" --non-unique apalache
    useradd --create-home --shell /bin/bash --uid "$USER_ID" --non-unique --gid "$GROUP_ID" apalache
    

    And then invoking the contained apalache with the new user:

    exec sudo --preserve-env=PATH --set-home -u apalache env /opt/apalache/bin/apalache-mc $@
    

    Without this crap, the files that are created by apalache inside of the docker container end up belonging to the user inside of the docker container, along with whatever permissions and groups they belong to. This means they can (and were before) ending up not being readable/writeable by the actual user running the docker container (without running as root, in the worst case).

  • Note that we now have to pass the environment that holds in the docker container on to the new user. this is what is currently causing the few docker tests to fail! We are only forwarding the PATH here, so the JVM envvars and the apalache-specific envvars are getting stripped out of the environment.

There must be a better way!

However, I think getting to that better way will require some thought and reworking the architecture for this whole chain of nested environments.

(Note that these complications are not arising from anything particular added in this PR, they are entailed by the way our dockerization currently is configured and by the nature of docker itself, coming into conflict with the desire to test our new file outputting design. Moreover, these complications are not introduced by the new design for output files! Rather, afaict, we just weren't testing anything about the output files before, so we weren't hitting the permissions errors. I think this has always been a problem, tho I haven't confirmed by checking log file permissions produced by an older container image.)

So my current plan is:

  • Get this PR fixed by patching up the missing environment vars.
  • Merge in this PR, which will allow me to fix the benchmark tests.
  • Then we can consider prioritization of a rewrite of the whole container story and maybe of the integration test harness.

@shonfeder
Copy link
Contributor Author

Wow. I can't believe it finally worked... lol 😅

@@ -143,9 +143,10 @@ jobs:
- name: Build dev-shell
run: nix develop -c bash -c exit
- name: Build the docker image
run: docker build -t informalsystems/apalache:ci .
run: docker build -t ghcr.io/informalsystems/apalache:ci .
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just cleaning up some remnants from the move from dockerhub to github container registry.

bin/run-in-docker-container Outdated Show resolved Hide resolved
bin/run-in-docker-container Outdated Show resolved Hide resolved
Comment on lines -5 to +9
- `out-dir` (String): The value of `out-dir` defines a path to the directory, in which all Apalache runs write their outputs. Each run will produce a unique subdirectory inside `out-dir` using the following convention: `<SPECNAME>_yyyy-MM-dd_HH-mm-ss_<UNIQUEID>`. The use of `~` in the path specification is permitted. The directory need not already exist, however, its parent directory must.
- `out-dir` (String): The value of `out-dir` defines a path to the directory,
in which all Apalache runs write their outputs. Each run will produce a unique
subdirectory inside `out-dir` using the following convention:
`<SPECNAME>_yyyy-MM-dd_HH-mm-ss_<UNIQUEID>`. The use of `~` in the path
specification is permitted. The directory path need not already exist.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change here is removing the caveat about parent dirs needing to exist. I changed it so that qualification is not needed.

docs/src/apalache/running.md Show resolved Hide resolved
@@ -2,11 +2,11 @@
"nodes": {
"flake-utils": {
"locked": {
"lastModified": 1631561581,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just due to an update of depss in our nix dev-shell (only used in integration test currently).

Comment on lines +50 to +52
# Required to avoid polluting the dev-shell's ocaml environment with
# dynamically liked libs from the host environment
unset CAML_LD_LIBRARY_PATH
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This just fixes a problem I was hitting on my machine, but would likely impact other OCaml users

@shonfeder shonfeder requested review from rodrigo7491 and removed request for konnov and Kukovec October 28, 2021 15:20
Copy link
Collaborator

@rodrigo7491 rodrigo7491 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@shonfeder shonfeder merged commit ae73922 into unstable Oct 28, 2021
@shonfeder shonfeder deleted the shon/cli-output-args branch October 28, 2021 15:56
@apalache-bot apalache-bot mentioned this pull request Oct 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants