Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Discussion] Refactor ldmx-env.sh to use denv for container interaction #1232

Open
tomeichlersmith opened this issue Nov 20, 2023 · 8 comments · May be fixed by #1269
Open

[Discussion] Refactor ldmx-env.sh to use denv for container interaction #1232

tomeichlersmith opened this issue Nov 20, 2023 · 8 comments · May be fixed by #1269

Comments

@tomeichlersmith
Copy link
Member

Using denv instead of the bash functions currently in ldmx-env.sh has several benefits and disadvantages.

Benefits

  • I have learned a lot about running containers since originally designing ldmx-env.sh and have put that knowledge into denv.
  • denv removes the bash requirement on ldmx-env.sh, supporting other POSIX-compliant shells.
  • denv supports podman (unsupported by ldmx-env.sh) and apptainer explicitly (only supported by ldmx-env.sh through its singularity alias)
  • denv can be configured at run-time with environment variables as well as a text file (.denv/config) which we could distribute within our repository.
    • In all honesty, we probably don't want to distribute .denv/config in ldmx-sw since some users of denv will want to include additional mounting paths corresponding to data locations at their clusters and that would incur a breaking change for many workflows.
  • denv supports spawning an interactive shell within the containerized environment with a helpful remapping of the in-container HOME to the workspace so things like .bashrc work. ldmx-env.sh can spawn a shell but does not do this remapping and does not update the prompt.
  • denv passes environment variables into the container automatically so we do not need to have a setenv-style command

Disadvantages

  • Moving container-wrapping to its own program means that it is technically another dependency to install. I don't see this as a huge disadvantage because the program is so light and can easily be installed in any directory in PATH.
  • denv makes the opinionated choice to define the workspace as the home directory. ldmx-env.sh makes the choice to have the parent directory of ldmx-sw be the workspace. This means we would not be able to store .denv/config within ldmx-sw unless we make the breaking change of moving the workspace to be ldmx-sw and not its parent.
  • I have not fully tested denv on OS's that I don't have access to (MacOS and Windoze) so there may be some bugs to squash as it starts to be used.
  • denv passes environment variables into the container automatically so we do not have a setenv-style command isolating environment variables inside the container from outside the container
  • We'd need to do some development to make sure that the new mounting procedure and the new entrypoint script collaborate to get PATHs setup correctly.
@EinarElen
Copy link
Contributor

EinarElen commented Nov 20, 2023

Using denv instead of the bash functions currently in ldmx-env.sh has several benefits and disadvantages.

Benefits

* denv removes the `bash` requirement on ldmx-env.sh, supporting other POSIX-compliant shells.

HUGE

* denv supports spawning an interactive shell within the containerized environment with a helpful remapping of the in-container HOME to the workspace so things like `.bashrc` work. ldmx-env.sh can spawn a shell but does not do this remapping and does not update the prompt.

Neat

* denv passes environment variables into the container automatically so we do not need to have a `setenv`-style command

I think for most things having the environment variables passed in would be nice (how does it handle conflicts? I'm especially thinking about LD_LIBRARY_PATH and all the G4_WHATEVER variables. However, I do actually like the ability to "toggle" things that only end up affecting the container environment. Is this still possible somehow?

Disadvantages

* Moving container-wrapping to its own program means that it is technically another dependency to install. I don't see this as a huge disadvantage because the program is so light and can easily be installed in any directory in `PATH`.

I don't think installing another dependency is a big problem w.r.t. having to build something. I could imagine some people being queezy about encouraging curl | sh, especially on things like clusters (I'm a bit ambivalent).

I do think it will be a bit harder to teach "run denv command" rather than "run ldmx command", especially for people that aren't familiar with the technical side of things. It is a bit silly but it is a lot easier for people to grok that "oh when I'm working on ldmx things I put ldmx in front of the things" than "i need to use denv".

Would it be possible to have some kind of setup with ldmx-sw/scripts/ldmx-env.sh that uses denv but calls it through something called ldmx? I am not kidding, I genuinely think this would be valuable

* denv makes the opinionated choice to define the workspace as the home directory. ldmx-env.sh makes the choice to have the _parent_ directory of ldmx-sw be the workspace. This means we would _not_ be able to store `.denv/config` within ldmx-sw unless we make the breaking change of moving the workspace to be ldmx-sw and not its parent.

Could you elaborate? What do you mean with workspace? And what do you mean with home-directory?

* I have not fully tested denv on OS's that I don't have access to (MacOS and Windoze) so there may be some bugs to squash as it starts to be used.

* denv passes environment variables into the container automatically so we do not have a `setenv`-style command isolating environment variables inside the container from outside the container

* We'd need to do some development to make sure that the new mounting procedure and the new entrypoint script collaborate to get `PATH`s setup correctly.

@tomeichlersmith
Copy link
Member Author

how does it handle conflicts?

The current strategy is simply to overwrite container-internal env-vars with host env-vars. This is the choice made by the singularity/apptainer container runner family and so I adopted it. The docker/podman also makes this choice but since you have to specify each env-var for the container-running command, there aren't as many conflicts. I could forsee a strategy where denv has more configurations:

  • copy "all" (all in quotes since it would skip hidden and special vars)
  • copy some (only the specified vars are copied)
  • define some (vars are defined in the config and are not read from the host environment)

I initially avoided this due to its complexity but I don't think it would be too difficult to implement.


Would it be possible to have some kind of setup with ldmx-sw/scripts/ldmx-env.sh that uses denv but calls it through something called ldmx?

This was my plan actually. ldmx-env.sh would still exist and do a lot of things for us but it would be significantly shorter. It would also give us an opportunity to re-map calls from the current ldmx <command> to how denv is structured for backwards compatibility.


Could you elaborate? What do you mean with workspace? And what do you mean with home-directory?

"Workspace" is just my short-hand for "that directory on your system that holds all the crap you are working on". Right now, the environment variable LDMX_BASE is what I would call the "workspace". For my purposes, the home directory is where-ever the HOME environment variable is pointing. Re-defining the HOME inside the container to be the same as the "workspace" directory means that programs look into the workspace directory for their special config files. The biggest example of this is launching a shell - for example, when bash is run within a denv, it looks at <workspace>/.bashrc instead of the home directory of the host filesystem (or the probably empty home directory within the container filesystem).

@EinarElen
Copy link
Contributor

Right ok, so you mean that inside the container environment whatever you put as your workspace will be treated as $HOME. That seems reasonable to me for a whole bunch of reasons.

When it comes to environment variables/overriding I think it is worth to be careful with. I recently lost an annoying number of hours having to debug something that was caused by LD_LIBRARY_PATH pointing somewhere surprising... I think that it is a safe assumption that any thing that someone might put in an environment variable no matter how unreasonable some user will have a good reason for why they needed That One Weird Thing (www.hyrumslaw.com example?) so being able to handle things is important

@bryngemark
Copy link
Contributor

Mmm yeah, this mixing of local and container variables actually makes me a little uneasy. Not just that it's a time sink to confuse them but also that it messes with reproducibility (and trouble shooting/debugging support). Are there ways around this? Any handy tools we can set up to print env variables, or warnings when a local env variable overrides the one in the container, or similar?

The other part, about adding dependencies. For me this is usually not really a matter of bloating the container (maybe I should be more concerned with that) but more a matter of depending on something that will invariably change while we are using it. Outsourcing work is nice while it lasts but inevitably comes with having to take a leap now and then, and having handed off control over features to someone else. So I always think it's a matter of understanding which approach will be harder to maintain in the long run. I will err on the side of stability over convenience (which I know is not everyone's preference, and I think that's great, then we might strike a healthy balance). And admittedly this adds a touch of tealeaf reading to the decision workflow (what will development look like in five years?).

Ok. Grumpy old physics coder comment done. Will be happy to learn why I shouldn't worry.

@EinarElen
Copy link
Contributor

Ok after trying things out a bit for actual work I have some thoughts

  • This workflow kind of would require us to add more developer tools into the development container (I am not necessarily opposed to this). Since denv is an entire environment rather than just something you prefix a command with, you end up doing things like
> denv 
> fire myconfig.py 
# Realise there was a typo in myconfig.py 
> vim myconfig.py
zsh: command not found: vim

which is quite painful. I get that you can run it like denv fire myconfig.py, but I doubt I will be the last to try the above (and it removes a useful feature of the setup)

  • The image set in the config is just "ldmx/dev", will this always pick dev-latest? What if I want dev-myweirdversion?

@tomeichlersmith
Copy link
Member Author

Are there ways around this? Any handy tools we can set up to print env variables, or warnings when a local env variable overrides the one in the container, or similar?

There are many ways around this and I think the most stable solution is to put better env-var handling into denv so that users can be restrictive if they wish. ldmx-env.sh could setup denv in such a way to be more restrictive with a focus on reproducibility.

One can use printenv within the container to see how the container environment has been set upon launch. The warnings could be generated by comparing denv printenv with printenv to see where the overlaps are.

about adding dependencies

Since you mentioned bloating the container, I want to emphasize that this would not be a new dependency in the container. It would be outside the container and thus would need to be installed separately like docker/apptainer. Since denv is so light, one could imagine a situation where ldmx-env.sh will just install it if it's not already available, but that would be a delicate solution to implement.


add more developer tools to the container

There is nothing stopping us (on the denv side) to add vim and the like, but I am worried about doing that because then we get into the territory of "why doesn't my vim look the same" or "why doesn't my emacs plugin load". My initial suggestion to folks would be to have a editor window and a running window (split manually with tmux or like an IDE), but that is also a lame "solution" since it basically is saying "stop doing that".

How to pick an environment.

denv just stores whatever image tag you want. You can change it with denv config image ldmx/dev:myweirdversion and switch back with denv config image ldmx/dev:latest. It will not pull down the image unless the image doesn't exist or if you ask it to pull it down (either in denv config image when it prompts you or with denv config image pull). In this way, it is similar to the current functions in ldmx-env.sh: one container image at a time.

@tomeichlersmith
Copy link
Member Author

I've updated denv to support better handling of the environment variables. Now one can choose to disable the copy-all feature and instead have a more isolated denv.

denv init --no-copy-all ldmx/dev:latest
denv config env copy LDMX_BASE MY_SPECIAL_VAR=1

This would copy the value of LDMX_BASE from the host and set the value of MY_SPECIAL_VAR to 1 within the denv.

@tomeichlersmith
Copy link
Member Author

I've started a branch 1232-use-denv-in-ldmx-env beginning this refactor. I think the most intricate part is designing a smooth transition. I don't think replicating all of the sub-commands of ldmx with denv is helpful or maintainable; however, I could foresee a simple wrapper script that offers some automatic translation in order to inform our user base. The long-term goal would be for ldmx-env.sh to evolve into a very simple script that insures (1) a container runner is available (2) denv is available and (3) a denv holding ldmx-sw is initialized (denv init has been run). This would mean (in the far future), people would be

. ldmx-sw/scripts/ldmx-env.sh
denv <command in ldmx container environment>

I find this a helpful goal so that people know they are using denv and can be guided to its documentation and its development.

The largest separation between how denv runs an image and ldmx is that denv uses its own entry-point rather than expecting an entry-point to exist at /etc/entry.sh within the image. This allows denv to insure certain initialization procedures are done including sourcing shell init scripts (e.g. .profile that exist within the workspace it is initialized within). This would allow users to more easily customize the environment. Most urgently, we would need to document how to use a custom locally-compiled Geant4 version similar to how it can currently be done with the 4.2.0 image.

Draft Roadmap

This first roadmap draft is mainly focused on moving fast and breaking everyone's set up after documenting how they could do the same tasks with the new ecosystem. We could insert some extra steps by (for example) adding deprecation/removal warnings to ldmx commands some amount of time before actually removing them. We could also write a transitory script focused on softly introducing the user to denv by removing the now-unsupported ldmx commands and auto-translating the denv-equivalent ones. The branch I linked to has a ldmx-denv-init.sh started and a ldmx-denv-transition.sh where I outline how we could wrap the denv equivalent commands with bash functions to make the transition easier on the users. Unsure if this will be helpful or not.

  1. Discussion on refactoring away from ldmx bash function <- we are here
  2. Testing of denv with lighter environment script on trunk (different name e.g. ldmx-denv-init.sh)
  3. Document how users would translate current ldmx commands into denv (initial table below)
  4. Update default install path to be ~/.local so that it is found by the container automatically
  5. Remove ldmx bash function (and the .ldmxrc file), basically a PR replacing ldmx-env.sh with ldmx-denv-init.sh
  6. Remove custom entry-point from image
ldmx Command denv Equivalent
help help
config config print
use config image <image>
pull config image pull
mount config mounts <dir>
setenv config env copy
run 'cd <dir>; <cmd>'
checkout removed
source removed
list removed
clean removed

Explanations for Removals

All of the ldmx commands that do not have a denv equivalent are not included within denv because they do not directly pertain to interfacing with the container runner. I also am proposing removing them entirely from our ecosystem largely because they are either (a) broken, (b) not used widely, or (c) both.

  • checkout: this tries to code a more efficient combination of git checkout and git submodule update so that users can more easily switch ldmx-sw branches; however, this is not functional. Moreover, git provides a better description of how to switch between branches with different submodule commits and I think we should simply guide developers to this resource.
  • source: this is here to allow users to run multiple ldmx commands at once intending to have a single file defining the configuration of the environment conducted by ldmx. This command is not necessary in denv since denv stores the environment configuration within a text file at <workspace>/.denv/config (for us, ldmx-sw/../.denv/config since the workspace is the parent of ldmx-sw).
  • list: I tried to write some plain-shell code to list the available tags of the image. Since time of writing, DockerHub has changed its API breaking this code. Nobody else has noticed (or even if they have noticed, it wasn't a big enough deal to make an issue or post on slack), so I think removing it and just pointing people to the GitHub releases and DockerHub is good.
  • clean: This is the messiest command and has three sub-tasks.
    • src: Clean out all untracked files from the source file tree. Just a wrapper around git clean basically and with the moving of the generated files out from Framework and into the build directory, something more complicated that rm -r build install is not necessary anymore.
    • container: Remove the downloaded container images. This is a dangerous action and I feel more comfortable instructing people on how to list the images already downloaded and remove them manually (either with docker image rm or rm depending on the runner).
    • env: reset the environment variables. denv does not store any important information in environment variables preferring instead to write and read the config file.

@tomeichlersmith tomeichlersmith linked a pull request Mar 8, 2024 that will close this issue
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants