Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add registerStringMapVar to handle --env with commas! #674

Merged
merged 1 commit into from
Mar 30, 2022

Conversation

vsoch
Copy link
Contributor

@vsoch vsoch commented Mar 27, 2022

Description of the Pull Request (PR):

There was a bug reported in slack that --env is truncating variables at commas:

Hello, I am trying to use --env to specify an environment variable (--env clustering_threshold='100,97') but I am running into an issue with

WARNING: Ignore environment variable "97": '=' is missing .

I have tried to use a backslash to escape the comma but that doesn't seem to work.
I am currently using singularity-ce version 3.9.0

And I think I've seen this before? google/go-containerregistry#1178. So because I love Go and don't get to work on enough Go projects (and this seemed like maybe a good way to engage with Singularity again) I decided to try for a fix! The bug looks to be the same here - there are just a few more layers added on top of it with the custom cmdline package provided here. To go through the changes:

  • we ultimately need to use StringToStringVarP/StringToStringVar that can handle taking a varA=varB and parsing into map[string]string. We need to do this because StringArrayVarP considers a comma an ending delimiter to separate something out into a different string. There could be a way to tweak this particular parsing, but for other projects I've found a more straight forward fix is to use a map[string]string.
  • This means that we cannot just parse the SIngularityEnvFile into SingularityEnv by appending to the list. Instead we parse the list of env directly from the file, and then check if it's already defined (by --env) or not. We issue a warning if the flag is also provided with --env and do not overwrite (--env takes precedence!)
  • Finally, when we go over the list we only skip a variable when the envName is not defined. We can technically allow an empty value for envValue because the user might be trying to unset something.

This same type can be used for any other variables that might come up to truncate on commas!

Apologies in advance for anything I did wrong - I'll fix things as the CI brings up errors!

@dtrudg
Copy link
Member

dtrudg commented Mar 28, 2022

Thanks @vsoch - I had to look at some of the command line comma splitting behavior in #324 for --mount before, but didn't think to go and investigate --env etc. at that time...

Note also that CLI flag handling was extended to allow StringArray flags, where values are not split into a slice on ','. This was a requirement for the --mount spec syntax.

... that stuff for --mount is actually a bit weird as it uses CSV rules for splitting things but I will have a quick think about whether there is any more generalization that can be done here when I review. I'll aim to review the code properly tomorrow, but please excuse any slight delay due to my being stuck working on #670

What we definitely do want to make sure, is that there are some tests that cover cases with commas in env vars. SINGULARITYENV_ / --env / --env-file are tested in 3 different Test functions in:

https://github.com/sylabs/singularity/blob/master/e2e/env/env.go

I'm very happy to add those to the PR, or assist if you need it - our E2E framework is not always the easiest to work with.

Also, one thing that comes to mind is does this satisfy the case of an env var with an = in the value, which is valid?

$ export FOO=BAR=BAZ

$ echo $FOO
BAR=BAZ

@vsoch
Copy link
Contributor Author

vsoch commented Mar 28, 2022

I'm very happy to add those to the PR, or assist if you need it - our E2E framework is not always the easiest to work with.

I'd be okay if it's lots of headache and easy for you to do, or you want the fix in soon (I can do this in free time but not official work time since the lab is pretty strict about that) and if you want me to take a shot I could! I'd ask for some basic guidance "here is the strategy to use, what to add, what you should look at in terms of files / sections, etc."

Also, one thing that comes to mind is does this satisfy the case of an env var with an = in the value, which is valid?

Yep seems to be ok?

$ ./singularity exec --env clustering_threshold=another=97 ../busybox_latest.sif  env | grep clustering_threshold
clustering_threshold=another=97

And your example:

$ ./singularity exec --env FOO=BAR=BAZ ../busybox_latest.sif  env | grep FOO
FOO=BAR=BAZ

Kind of weird but I guess FOO sometimes wants its BAR and BAR wants its BAZ!

@dtrudg
Copy link
Member

dtrudg commented Mar 28, 2022

I'm very happy to add those to the PR, or assist if you need it - our E2E framework is not always the easiest to work with.

I'd be okay if it's lots of headache and easy for you to do, or you want the fix in soon (I can do this in free time but not official work time since the lab is pretty strict about that) and if you want me to take a shot I could! I'd ask for some basic guidance "here is the strategy to use, what to add, what you should look at in terms of files / sections, etc."

Cool - I'll provide some guidance here tomorrow. Thanks for checking the FOO=BAR=BAZ also.

Copy link
Member

@dtrudg dtrudg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @vsoch - this looks good to me, subject to adding some testing so we capture the expected behavior going forward.

The handling of a ',' is clearly a fix for both expected success and failure modes, so this can definitely go under the bug fix heading rather than behavior change, and is available for backport to a 3.9 release.

singularity-ce version 3.9.1+193-gcc92d1b05

$ singularity run --env TEST1=foo,TEST2=bar library://alpine env | grep TEST
TEST1=foo
TEST2=bar
$ singularity run --env TEST1=foo,TEST2=bar,bob library://alpine env | grep TEST
WARNING: Ignore environment variable "bob": '=' is missing
TEST1=foo
TEST2=bar
$ singularity run --env TEST1=foo --env TEST2=bar,bob library://alpine env | grep TEST
WARNING: Ignore environment variable "bob": '=' is missing
TEST1=foo
TEST2=bar

This PR 674

$ singularity run --env TEST1=foo,TEST2=bar library://alpine env | grep TEST
TEST1=foo
TEST2=bar
$ singularity run --env TEST1=foo,TEST2=bar,bob library://alpine env | grep TEST
Error for command "run": invalid argument "TEST1=foo,TEST2=bar,bob" for "--env" flag: bob must be formatted as key=value
$ singularity run --env TEST1=foo --env TEST2=bar,bob library://alpine env | grep TEST
TEST1=foo
TEST2=bar,bob

Test guidance

In the unit tests, we probably need to add a map[string][string] to the flags that are tested here:

... and then add test case(s) to the ttData struct. A good pattern would be to take a look at the "string slice flag" test which is defined here, as it should be relatively straightforward to copy that case and adapt it:

desc: "string slice flag",

A make short-unit-test will cover running those, or just use e.g. the VSCode Go plugin built-in functionality to run the specific tests you are working on.

We also need e2e tests to verify behavior through the CLI, as called by users. In the e2e tests, environment handling is tested in:

https://github.com/sylabs/singularity/blob/master/e2e/env/env.go

There are 3 test functions:

  • singularityEnv tests SINGULARITYENV_ variables
  • singularityEnvOption tests --env
  • singularityEnvFile tests --env-file

In each of these we should test the setting of env vars that contain commas. Ideally it'd be good to test the case where there is a trailing comma in the value, etc. so our tests would pick up any behavior changes from the upstream library or similar... which are unlikely but would be good to know about.

The 3 test functions each use table driven tests, with tests defined in a test slice of struct. They use the PATH environment variable as it has some extra APPEND/PREPEND handling that other vars do not. Since , is valid in a PATH then it's fine to continue that.

I would be inclined to base the test on the "OverwriteDefaultPath" test cases... copy those and change them to test using a path string that includes a comma. Bonus points for considering trailing commas etc.

The e2e framework is unwieldy to work with when developing on a small area of the e2e test code. I'd suggest that while working on the tests you comment out all of the suite.AddGroup lines in e2e/suite.go, except the one for ENV so that make e2e-test will only run the env var tests. Make sure those comments aren't committed, though.

You can also comment out the line:

e2e.PrepRegistry(t, testenv)

... to speed things up a bit as those tests don't need us to spin up an OCI registry.

Let me know if you need any more guidance. Thanks again!

@vsoch
Copy link
Contributor Author

vsoch commented Mar 28, 2022

For the first part, how do we account for the map[string]string for this part?

} else if len(cm.GetError()) == 0 && d.envValue != "" && len(d.flag.EnvKeys) > 0 {
			os.Setenv(d.flag.EnvKeys[0], d.envValue)
			cmds[d.cmd] = c
}

It feels like we'd need to set something too, but I'm not familiar with this setup and want to ask for help. Thank you!

@dtrudg
Copy link
Member

dtrudg commented Mar 28, 2022

For the first part, how do we account for the map[string]string for this part?

} else if len(cm.GetError()) == 0 && d.envValue != "" && len(d.flag.EnvKeys) > 0 {
			os.Setenv(d.flag.EnvKeys[0], d.envValue)
			cmds[d.cmd] = c
}

It feels like we'd need to set something too, but I'm not familiar with this setup and want to ask for help. Thank you!

We shouldn't need any additions there, only a new test case (or cases) in the tt struct for the map flag type. We aren't actually testing calling the flag code with CLI like values (e.g. --env FOO=BAR,BAZ). Instead this test is purely about checking the Singularity specific wrappings around the pflag code:

  • Checking that each type of flag can be registered with our Command manger.
  • If the flag can be registered, our test supplies a value, and no error yet... set an env var matching the flag to a value (the os.Setenv line)
  • Check that our UpdateCmdFlagFromEnv updates the value of the flag from the environment variable, with no error.
  • Check that if we ask the flag for the string representation of its value, we get what we expect given the env var that UpdateCmdFlagFromEnv used.

All the magic is in pflag already for handling the different types. And, we're going to check the actual CLI level functionality of passing e.g. --env FOO=BAR,BAZ in our e2e tests. Therefore a first test case to satisfy the above for the new map[string]string might be as simple as:

diff --git a/pkg/cmdline/flag_test.go b/pkg/cmdline/flag_test.go
index 88fe5369a..3b00b4376 100644
--- a/pkg/cmdline/flag_test.go
+++ b/pkg/cmdline/flag_test.go
@@ -18,6 +18,7 @@ var (
        testBool        bool
        testStringSlice []string
        testStringArray []string
+       testStringMap   map[string]string
        testInt         int
        testUint32      uint32
 )
@@ -232,6 +233,21 @@ var ttData = []struct {
                },
                cmd: parentCmd,
        },
+       {
+               desc: "string map flag",
+               flag: &Flag{
+                       ID:           "testStringMapFlag",
+                       Value:        &testStringMap,
+                       DefaultValue: testStringMap,
+                       Name:         "string-map",
+                       Usage:        "a string map flag",
+                       EnvKeys:      []string{"MAP"},
+                       StringArray:  true,
+               },
+               cmd:        parentCmd,
+               envValue:   "FOO=BAR,ABC=123",
+               matchValue: "[FOO=BAR,ABC=123]",
+       },
 }
 
 func TestCmdFlag(t *testing.T) {

@vsoch
Copy link
Contributor Author

vsoch commented Mar 28, 2022

Oh okay! I was caught up on how to implement a short flag (I saw it for the other ones) but it looks like I don't need to. Thank you!

@vsoch
Copy link
Contributor Author

vsoch commented Mar 28, 2022

okay onto the e2e tests - how do I run this subset? Same make unit test command or something different?

@vsoch
Copy link
Contributor Author

vsoch commented Mar 28, 2022

oh derp

make e2e-test

Copy link
Member

@dtrudg dtrudg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good so far - if a similar e2e test case with a comma in a var value can be added to:

  • singularityEnvOption tests for --env
  • singularityEnvFile tests for --env-file

... and the changes to suite.go reverted (generally let them be local only, don't submit the shortened tests to GH in the PR).... then I think it's ready to be merged.

I do see opportunities to improve the tests further, but that is on us more generally, doesn't need to be in the scope of this PR.

Thanks!

@vsoch
Copy link
Contributor Author

vsoch commented Mar 30, 2022

@dtrudg I'll need to ask for your help on the last test - couldn't get anything working. I'm thinking reading a trailing comma from file is out of scope for here, because we don't actually retrieve it via the command line parser. My issue in particular was aimed toward fixing the command line parsing of --env and the others come in via different avenues (why I think it doesn't work).

@dtrudg
Copy link
Member

dtrudg commented Mar 30, 2022

@dtrudg I'll need to ask for your help on the last test - couldn't get anything working. I'm thinking reading a trailing comma from file is out of scope for here, because we don't actually retrieve it via the command line parser. My issue in particular was aimed toward fixing the command line parsing of --env and the others come in via different avenues (why I think it doesn't work).

Okay - let's leave that one out from this, and I'll open a separate issue to tackle it. It's good to know you couldn't get anything working, as that likely indicates work needed, which is the benefit of the e2e test. Thanks for your efforts there!

@vsoch
Copy link
Contributor Author

vsoch commented Mar 30, 2022

Sure! Ping me in the new issue and if/when I get free time and bored (happens quite a lot lol) I'll take a look and see if I can help there too. I suspect it just comes down to finding where the paths are read from the file and then doing a strip of the trailing comma. But I did see a function somewhere that (I think?) is intended to handle it, so I'd want to ask for tips about if we should be using that (or not!)

@dtrudg
Copy link
Member

dtrudg commented Mar 30, 2022

Sorry @vsoch - I was a bit confused here and thought you were talking about the last of 3 tests as in "singularityEnvFile" but just saw the revert...
image
... and this is singularityEnvOption which isn't for the env-file... it's actually for --env

I can take a look at this later this afternoon or tomorrow morning.

Edit - isn't it just that the matchVal here should have foo,: not foo:?

@dtrudg
Copy link
Member

dtrudg commented Mar 30, 2022

This will fix the occasional unit test error, which is due to Go maps being randomly ordered. Sorry I neglected that in earlier reviewing!

diff --git a/pkg/cmdline/flag_test.go b/pkg/cmdline/flag_test.go
index 9153ddd33..a5124a624 100644
--- a/pkg/cmdline/flag_test.go
+++ b/pkg/cmdline/flag_test.go
@@ -24,11 +24,13 @@ var (
 )
 
 var ttData = []struct {
-       desc            string
-       flag            *Flag
-       cmd             *cobra.Command
-       envValue        string
-       matchValue      string
+       desc       string
+       flag       *Flag
+       cmd        *cobra.Command
+       envValue   string
+       matchValue string
+       // Alternative match to accommodate random map ordering
+       altMatchValue   string
        expectedFailure bool
 }{
        {
@@ -163,9 +165,10 @@ var ttData = []struct {
                        Usage:        "a string map flag",
                        EnvKeys:      []string{"STRING_MAP"},
                },
-               cmd:        parentCmd,
-               envValue:   "key1=arg1,key2=arg2",
-               matchValue: "[key1=arg1,key2=arg2]",
+               cmd:           parentCmd,
+               envValue:      "key1=arg1,key2=arg2",
+               matchValue:    "[key1=arg1,key2=arg2]",
+               altMatchValue: "[key2=arg2,key1=arg1]",
        },
        {
                desc: "string array flag",
@@ -289,7 +292,9 @@ func TestCmdFlag(t *testing.T) {
                }
                if d.envValue != "" {
                        v := d.cmd.Flags().Lookup(d.flag.Name).Value.String()
-                       if v != d.matchValue {
+                       // We allow matching against an optional altMatchValue, so that we can accommodate
+                       // the string forms of both orderings of maps with 2 keys.
+                       if v != d.matchValue && (d.altMatchValue == "" || v != d.altMatchValue) {
                                t.Errorf("unexpected value for %s, returned %s instead of %s", d.desc, v, d.matchValue)
                        }
                }

the current flag using a splice type will truncate on commas. In order to better
support commas in --env variables we should have support for types of map[string]string.
This change will add this new option and fix the current --env bug described in slack
that splits name=val1,val2 into just name=val1 and outputs a warning. I saw some custom
parsing code in a different file that maybe was intended to handle this, but it never
gets called because the envar is skipped. If it is not skipped the comma would already
be truncated by the flag parser.

Apologies for mistakes - have not looked at Singularity source code in years it seems!

Signed-off-by: vsoch <vsoch@users.noreply.github.com>
Copy link
Member

@dtrudg dtrudg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Many thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants