- 
                Notifications
    
You must be signed in to change notification settings  - Fork 49
 
Fix parsing of environment variables to prevent object parsing. #132 #139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally for environment variables I would not expect them to support the object syntax. But I'm not sure if we are facing a bug that has turned into a feature :)
For arrays we have to support top-level arrays. There are definitely users configuring multiple hosts by separating them via commas.
Really the bug could be solved by ensuring that the environment variable is single quoted:
Good to know we have a workaround.
@ph WDYT?
        
          
                internal/parse/parse.go
              
                Outdated
          
        
      | // | ||
| // The parser implements a superset of JSON, but only a subset of YAML by | ||
| // allowing for arrays and objects having a trailing comma. In addition 3 | ||
| // strings types are supported: | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/strings/string
        
          
                internal/parse/parse.go
              
                Outdated
          
        
      | // | ||
| // 1. single quoted string (no unescaping of any characters) | ||
| // 2. double quoted strings (characters are escaped) | ||
| // 3. strings without quotes. String parsing stops in | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but shouldn't it be stops at or stops at (non-native english speaker here :) )
| switch in[0] { | ||
| case '[': | ||
| return p.parseArray() | ||
| if p.cfg.Array { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
even if p.cfg.Array is false we might still parse top-level arrays if users have an environment variable like 1,2,3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the Env case p.cfg.Array is still true which allows the flowing to be in the environment variables:
		{`[]`, nil},
		{
			`a,b,c`,
			[]interface{}{"a", "b", "c"},
		},
		{
			`C:\Windows\Path1,C:\Windows\Path2`,
			[]interface{}{
				`C:\Windows\Path1`,
				`C:\Windows\Path2`,
			},
		},
		{
			`[array, 1, true, "abc"]`,
			[]interface{}{"array", uint64(1), true, "abc"},
		},
		{
			`[test, [1,2,3], on]`,
			[]interface{}{
				"test",
				[]interface{}{uint64(1), uint64(2), uint64(3)},
				true,
			},
		},
		{
			`[host1:1234, host2:1234]`,
			[]interface{}{
				"host1:1234",
				"host2:1234",
			},
		},
| if p.cfg.Array { | ||
| return p.parseArray() | ||
| } | ||
| return p.parsePrimitive(stopSet) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we adapt the stop set based on the config? e.g. if Array and Object are false and the string does not start with quotes we could return the original string as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case that Array and Object are false the stopSet will just be "," (the topLevelStopSet) and it will not change unless the parser hits a double or single quoted string (when enabled).
Being that we want top level array parsing to still be allowed when p.cfg.Array is false, I believe this path is correct. I am going to add more tests for all the combinations of configs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see. Top-level array parsing can not be disabled and is done independent of the cfg.Array setting. My impression was that Array parsing also disables the top-level array support.
I think that makes sense. Was just not what I expected. Maybe we should document that top-level array splitting is active always.
        
          
                types.go
              
                Outdated
          
        
      | return nil, err | ||
| } | ||
| return parseValue(p, opts, str) | ||
| return parseValue(p, opts, str, parse.EnvParserConfig) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will affect parsing for every single resolver. go-ucfg has 3 resolver options by default: ResolveEnv, ResolveNOOP, and Resolve. The last one accepts a custom function.
I think it makes sense to have the resolver also configure the parsing mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is also a Keystore resolver in libbeat.
| 
           @urso @blakerouse Even if this is breaking change I doubt it is really used in the wild (the parsing). But as you have said, this might be a bug turned into a feature. Even I believe this behavior leads to confusion and it is a usability problem.  | 
    
| 
           Also, I don't think the object reference used in the variables is a well-understood behavior, I wish we had stats to understand how people are actually using it.  | 
    
| 
           @urso this is ready for another review. It became a much bigger change with adding the config to the resolvers but I like the result.  | 
    
          Codecov Report
 @@            Coverage Diff             @@
##           master     #139      +/-   ##
==========================================
+ Coverage    77.8%   77.94%   +0.14%     
==========================================
  Files          23       23              
  Lines        2595     2612      +17     
==========================================
+ Hits         2019     2036      +17     
  Misses        425      425              
  Partials      151      151
 Continue to review full report at Codecov. 
  | 
    
…ust Config, so import is parse.Config. Add parse.Config to return of resolvers.
245ea61    to
    fdebcfd      
    Compare
  
    | 
           Although it is a breaking change, I like it. I think we should go forward with it. In case users have problems in Beats, we can still point them towards the  Please add changelog entries. Besides fixing #132, the change breaks API + removes object syntax for env variables.  | 
    
| 
           @urso Added changelog entries.  | 
    
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, Lets also make it obvious in the beats changelog when this change goes in 7.7
Currently
internal/parsehandles the values from the environment variable just like values from the configuration file. This PR changes the parser to limit environment variable parsing to not parse object notation.This is a breaking change because if any environment relies on this behavior it would break them. I question if this is correct way to solve the bug.
Really the bug could be solved by ensuring that the environment variable is single quoted:
I would be okay with closing this PR and closing the bug with this comment.