Executing command using spython needs --cleanenv flag, while it's not necessary using Singularity #172

Closed
maawoo opened this issue Mar 16, 2021 · 12 comments

Comments

maawoo commented Mar 16, 2021

Expected Behavior

Executing a command within a container using spython works equally to when using Singularity.
E.g. If I need additional options to make a command work with Singularity, I expect to need that additional option using spython as well.

Actual Behavior

The command I'm executing in my container works without any additional options when I'm using Singularity, but needs the --cleanenv flag when I'm using spython to work properly.

Steps to Reproduce

If actually needed, I can provide more detailed information.
In short: My container is based on the Docker image of this project and I noticed this bug while running a command described here in more detail.

Context

  • Operating System: Pop!_OS 20.10
  • singularity version: 3.7.0
  • spython version: 0.1.1
  • python version: 3.7

Failure Logs

Possible Fix

Member

vsoch commented Mar 16, 2021

Could you show me the actual command you are trying to run, and why it does not work?

Author

maawoo commented Mar 16, 2021

Hi Vanessa!

Command that works using Singularity natively:
singularity exec force.sif force-level1-csd --no-act -s LC08 -d 20200601,20200701 -c 0,75 /path/to/metadata/files /path/to/output/dir queue.txt /path/to/a/geojson/file

Same command in spython, that only works if I add --cleanenv:
Client.execute("/path/to/same/force.sif", ["force-level1-csd", "--no-act", "-s", "LC08", "-d", "20200601,20200701", "-c", "0,75", "/path/to/metadata/files", "/path/to/output/dir", "queue.txt", "/path/to/a/geojson/file"], options=["--cleanenv"])

If I don't use --cleanenv, I get the following error message from my containerized software related to GDAL:
ERROR 4: Unable to open EPSG support file gcs.csv. Try setting the GDAL_DATA environment variable to point to the directory containing EPSG csv files.
ERROR 1: Failed to process SRS definition: EPSG:4326
I use GDAL version 3.1.3 on my Laptop, while it uses version 2.2.3 in the container... so I'm not surprised that it results in compatibility issues when the environment variables get mixed up.

Member

vsoch commented Mar 16, 2021

Ok good - so you did find that the exec command has options, singularity_options, and sudo_options. https://github.com/singularityhub/singularity-cli/blob/master/spython/main/execute.py#L26

For the cleanenv, I don't have intuition off the bat for why that happens - the environ variable gets passed as None to the ultimate function to run the command, here https://github.com/singularityhub/singularity-cli/blob/master/spython/utils/terminal.py#L179. Could you add --debug to see what the final command looks like? That might at least show us the difference in running the command. If they are the same, then it's some issue with executing from Python, and we can go from there.

Here is an example that shows how the debug prints.

from spython.main import Client
Client.debug = True
Client.pull('docker://busybox')
singularity --debug pull --name busybox.sif docker://busybox

Author

maawoo commented Mar 17, 2021

Sorry for my late answer! Couldn't get back to it yesterday.

Here are the debug outputs:
debug_singularity_okay.txt
debug_spython_fail.txt
debug_spython_okay__cleanenv.txt

So spython forwards a different set of env variables to the container in comparison to native Singularity.
Including some variables that probably cause my container to fail (e.g. GDAL_DATA and PROJ_LIB)... even though I actually don't seem to have set them on my system.
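One way to check where such variables come from is to inspect the environment of the Python process that calls spython, since Singularity forwards host variables into the container unless `--cleanenv` is given. The following is only a sketch: the helper names `find_leaky_vars` and `scrubbed_env` and the watch list are assumptions made for illustration and are not part of spython.

```python
import os

# Variables named in this thread as likely troublemakers (assumed watch list).
WATCHED = ("GDAL_DATA", "PROJ_LIB")


def find_leaky_vars(env, watched=WATCHED):
    """Return the watched variables that are set in env, with their values."""
    return {name: env[name] for name in watched if name in env}


def scrubbed_env(env, watched=WATCHED):
    """Return a copy of env without the watched variables."""
    return {k: v for k, v in env.items() if k not in watched}


if __name__ == "__main__":
    # Show which watched variables the current Python process would pass on.
    for name, value in find_leaky_vars(os.environ).items():
        print(f"{name}={value}")
```

If the variables show up here but not in a plain shell, the Python environment itself is probably setting them. A scrubbed copy could then be handed to `subprocess.run(..., env=...)` when launching Singularity directly; whether spython accepts such a mapping has not been verified here.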

Member

vsoch commented Mar 17, 2021

Sorry, I’m a bit confused - we want debug output from the Singularity Python client, which should (most importantly) show us the Singularity command being run. Is that included in there? (I only see native Singularity output.) The first thing to do is compare the actual commands.

Author

maawoo commented Mar 17, 2021

Ah sorry! I got a bit ahead of myself, but running your simple example in the shell (as you suggested) made it clear to me, as the Singularity command is highlighted there:
Screenshot from 2021-03-17 18-42-27

Well, here's the output of my command without the --cleanenv flag:
Screenshot from 2021-03-17 18-43-54

It just prints out what I uploaded earlier with debug_spython_fail.txt, but without formatting obviously.
Running the command without the --cleanenv flag also doesn't print the Singularity command.

Member

vsoch commented Mar 17, 2021

Oh that's a bug, let's fix that so we can see the command. I should have some time this weekend unless you beat me to it.

Author

maawoo commented Mar 17, 2021

Alright! I'll try to find some time in the next few days and see if I can find anything 🙂

Member

vsoch commented Mar 20, 2021

hey @maawoo ! As promised, I did a PR that will add extra logging to the execute command when you have debug=True, either on the client or provided to the function:

#173

Would you care to review, and test out for your use case? We'd first want to compare the commands generated by the singularity python client vs. when you run something locally for differences. If there are no differences, then we need to think about the context of execution (and why the environment is different). If it gets to that, I'll need a dummy example to reproduce the problem.
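The comparison step can be sketched with the standard library alone. This is an illustration, not spython code: `token_diff` is a hypothetical helper, and the `generated` string below is a made-up stand-in for whatever the client prints in debug mode, not real spython output.

```python
import shlex


def token_diff(native_cmd, generated_cmd):
    """Return (only_in_native, only_in_generated) token lists, order preserved."""
    native = shlex.split(native_cmd)
    generated = shlex.split(generated_cmd)
    only_native = [t for t in native if t not in generated]
    only_generated = [t for t in generated if t not in native]
    return only_native, only_generated


# Command typed in the shell (shortened from the one earlier in this thread).
native = "singularity exec force.sif force-level1-csd --no-act -s LC08"
# Assumed example of a generated command; replace with the real debug output.
generated = "singularity exec --cleanenv force.sif force-level1-csd --no-act -s LC08"

print(token_diff(native, generated))
```

The check only looks at which tokens are present, so it ignores ordering and repeated arguments. If both lists come back empty, the commands match at that level and the difference is more likely in the execution context.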

Author

maawoo commented Mar 22, 2021

Hi @vsoch !
Thanks a lot for your effort and spending more time on this!
Your addition works and now I'm able to see the command when using debug=True and quiet=False. The generated command seems okay and I guess there actually was never a problem with my failing command to begin with...

To dig a little bit into the singularity-cli code, I cloned the repo and created a fresh conda environment. I didn't get very far with the digging, but noticed that my command worked in the new env. So I checked the installation and all dependencies in the other env and made sure that both are using the same versions, including your PR (so v0.1.11).
Still, the command works in the new conda env (left) and doesn't in the other (right):

Screenshot from 2021-03-22 21-06-25

So, I installed some packages that could be causing the issue into the new env and quickly found the troublemaker: rasterio, which depends on the GDAL library and (not surprisingly) caused my command to fail.

So, like I said... there actually was never a problem and the difference in behavior was to be expected :)
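For anyone chasing a similar difference between two conda environments, one option is to snapshot the process environment in each and compare the snapshots. The sketch below assumes nothing about spython; `dump_env` and `env_diff` are hypothetical helper names, and the file path passed to `dump_env` is up to the user.

```python
import json
import os


def dump_env(path):
    """Write the current process environment to a JSON file."""
    with open(path, "w") as fh:
        json.dump(dict(os.environ), fh, indent=2, sort_keys=True)


def env_diff(working, failing):
    """Return variables that are new or changed in failing relative to working."""
    return {
        name: value
        for name, value in failing.items()
        if working.get(name) != value
    }
```

Running `dump_env` once inside each activated environment and feeding the two loaded JSON dictionaries to `env_diff` would list candidates such as GDAL_DATA or PROJ_LIB, if the failing environment sets them.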

maawoo closed this as completed Mar 22, 2021

Member

vsoch commented Mar 22, 2021

Ah, so glad you found the bugger! 🎉 It was worth improving the debug output for spython to help with that (and future users that might run into the same issue!). Should I go ahead and merge that PR so it's added to singularity python proper?

Author

maawoo commented Mar 22, 2021

Yeah, at least something else could be fixed along the way!
I don't see why not. The output it provides looks good to me 👍🏻

Again, thanks for your help and the work you do in general :)
