Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

regexFirstGroup return first group from regexp, useful with .env file… #895

Merged

Conversation

alexandregz
Copy link
Contributor

@alexandregz alexandregz commented Apr 21, 2024

…s and another config files.

From my server I need to access config.inc.php file to know user, password and BD to execute a mysql command to check data using "command" resource. Also can use with .env files or similar.

Example:

{{ $regexBDRC := "\'mysql:\/\/[a-z0-9]+:[a-z0-9]+@localhost\/(roundcube[a-z0-9]+)\';"}}
{{ $BDRC := readFile $fileRCConfig | regexFirstGroup $regexBDRC}}

Extract BD (first group from regexp) from Roundcube config, to check table "users" with something like that:

command:
'mysql RC':
exit-status: 0
exec: {{ print "mysql -u " $UserBDRC " -p" $PassBDRC " " $BDRC " -e 'select user_id from users limit 1' --skip-column-names" }}
stdout:
and:
- 1
stderr: []
timeout: 10000 # in milliseconds
skip: false

Checklist
  • make test-all (UNIX) passes. CI will also test this
  • unit and/or integration tests are included (if applicable)
  • documentation is changed or added (if applicable)

Description of change

I just modified template to add func regexFirstGroup only. I checked on my own docker image with success.

Copy link
Member

@aelsabbahy aelsabbahy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Wondering if it can be simplified to something like this:

func regexFindSubmatch(regex, input string, index int) (string {
    re := regexp.MustCompile(regex)
    matches := re.FindStringSubmatch(input)
    if len(matches) > index {
        return matches[index] // Return the requested group
    }
    return "" // Return empty if the group index does not exist
}

On phone so got too lazy to do error handling, but the above with Compile instead of MustCompile. That would enable the user to pick which group they want, which seems more flexible.

The alternative is to have it return the array of matches, something like this:

func regexFindSubmatch(pattern, input string) []string {
            re := regexp.MustCompile(pattern)
            return re.FindStringSubmatch(input)
        },

And combine it with get:

{{ findStringSubmatch "(\d+)-(\d+)-(\d+)" . | get 2 "Group not found" }}

Trying to see if there's a balance of convenience + flexibility.

@aelsabbahy
Copy link
Member

With all the options above, I think care should be given to how we handle cases where the group isn't matched. Empty string, error, something else? 🤔

@alexandregz
Copy link
Contributor Author

alexandregz commented Apr 23, 2024

With all the options above, I think care should be given to how we handle cases where the group isn't matched. Empty string, error, something else? 🤔

Personally, I prefer error 👍

If you want, I can put changes to commit, basically use your function "func regexFindSubmatch(regex, input string, index int) (string, error)" and documentation change to show regexFindSubmatch use, because your implementation is really better than mine 😄

@ripienaar
Copy link
Collaborator

I also prefer {{ findStringSubmatch "(\d+)-(\d+)-(\d+)" . | get 2 "Group not found" }} which for the specific case of first can be {{ findStringSubmatch "(\d+)-(\d+)-(\d+)" . | first }} which would return empty string but if you needed an error would be {{ findStringSubmatch "(\d+)-(\d+)-(\d+)" . | mustFirst }}

So I think we dont need to decide how missing data are handled in this function the downstream access into the resulting array has all the options we need

@alexandregz
Copy link
Contributor Author

I also prefer {{ findStringSubmatch "(\d+)-(\d+)-(\d+)" . | get 2 "Group not found" }} which for the specific case of first can be {{ findStringSubmatch "(\d+)-(\d+)-(\d+)" . | first }} which would return empty string but if you needed an error would be {{ findStringSubmatch "(\d+)-(\d+)-(\d+)" . | mustFirst }}

So I think we dont need to decide how missing data are handled in this function the downstream access into the resulting array has all the options we need

Something like that?:

func findStringSubmatch(pattern, input string) map[string]interface{} {
	re := regexp.MustCompile(pattern)
	els := re.FindStringSubmatch(input)

	elsMap := make(map[string]interface{})
	for i := 0; i < len(els); i++ {
		elsMap[strconv.Itoa(i)] = els[i]
	}
	return elsMap
}

because Sprig get() needs map[string]interface{} ( https://github.com/Masterminds/sprig/blob/master/dict.go#L8-L13 )

@ripienaar
Copy link
Collaborator

Yeah indeed, get wont work - index probably will. I would not turn it into a map.

@alexandregz
Copy link
Contributor Author

alexandregz commented Apr 26, 2024

Yep, index works but without pipe: string is the first param, not last, you can't use something like {{ findStringSubmatch "(\d+)-(\d+)-(\d+)" $str | index 2 }}.

For me, works with:

{{ $str := readFile "/etc/nginx/sites-available/mydomain.gal" | findStringSubmatch "\\s+ssl_certificate (.*);|\\s+ssl_certificate_key (.*);" }}
{{ $CERT := index $str 1 }}
{{ $PRIVKEY := index $str 2 }}

@aelsabbahy
Copy link
Member

aelsabbahy commented Apr 26, 2024

So it seems like the options are:

Function takes index:

func regexFindSubmatch(regex, input string, index int) (string, error)
  1. Function returns map + get

  2. Same as 2 but maybe also create keys for named captures, ex: https://stackoverflow.com/a/20751656

  3. Array + index, but clunky user experience.

It seems to me option three allows for the most flexibility and will allow using named captures which can be helpful for making a regex self documenting. Feel free to push back if you disagree.

…h get and named parenthesized subexpressions or stringfied array string
@alexandregz
Copy link
Contributor Author

alexandregz commented May 3, 2024

Hi. I modified findStringSubmatch to returns:

  • keys with named captures (?P<first>,...): elsMapsNamed: [map[string]interface {}{"user":"my_user"}]
  • or stringfied array string (``): elsMap:[map[string]interface {}{"0":"xxxx", "1":"xxx", "2":"my_user"}]

both to use with get

If don't exists named captures, function returns the stringfied array (because user don't introduce something with the (?P<name>...) syntax.

@aelsabbahy
Copy link
Member

This looks great! Any reason to not just have one map and always return it?

This way:

  1. You can get the first group using either get "1" or get "name"
  2. You can mix and match named groups and non named groups. Not saying this is a good idea, but not sure we should actively prevent it
  3. Consistency.. adding one named groups doesn't completely change the map

Let me know if there's a danger or something that would break with the single map approach.

@alexandregz
Copy link
Contributor Author

alexandregz commented May 13, 2024

Always returns a map[string]interface{} to use with get: when code have named captures, returns this; if not, returns same kind of map, but with keys "1", "2", "3", etc.

But I think mix both (named captures and "standard" map with numbered keys) is complicated to manage and not useful.

@aelsabbahy
Copy link
Member

Hmm, if that's the case, I wonder if we should only support only named captures at that point? Essentially you're opting into a more power-user flow.

Thoughts?

cc: @ripienaar also curious on your thoughts here.

PS: I think this PR is really close and is a great feature! Appreciate all the effort that went into this.

@alexandregz
Copy link
Contributor Author

alexandregz commented May 16, 2024

Because returns only named captures, you force to use named captures and maybe sometimes you want don't use that and use a "standard" regexp, it's much more flexible

@aelsabbahy
Copy link
Member

Since the behavior is well documented, I'll merge this. If in the future it causes issues we can look into either splitting them into separate functions or combining them in a more merged way.

Sorry for the late response, thought I had addressed this PR, but apparently I didn't.

@aelsabbahy aelsabbahy merged commit a6451d2 into goss-org:master Jun 8, 2024
1 check passed
@aelsabbahy
Copy link
Member

Many thanks for this great addition, it'll be in the next Goss release! 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants