Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: alias glaure.match as ngx.re.match #26

Merged
merged 1 commit into from
Sep 13, 2022

Conversation

omegabytes
Copy link
Contributor

@omegabytes omegabytes commented Aug 20, 2022

Injects the find and match functions from external gluare repo into the overridden ngx package.
This enables in-place substitution without updating the import target.

Hold until after #27

@omegabytes omegabytes force-pushed the feat/re-find-match-fn-aliasing branch from f306f48 to 4e4c7e0 Compare August 22, 2022 17:33
@omegabytes omegabytes force-pushed the feat/re-find-match-fn-aliasing branch from 4e4c7e0 to 7d9974b Compare August 22, 2022 17:41
@omegabytes omegabytes marked this pull request as ready for review August 22, 2022 17:42
@omegabytes omegabytes requested a review from a team as a code owner August 22, 2022 17:42
@omegabytes
Copy link
Contributor Author

Related to #25

Copy link
Contributor

@mikefero mikefero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is starting to take shape as well. I think you will run into conflicts since you have 2 PRs touching the patch, but should be pretty easy to take care of (we shall see which PR lands first). This will be nice when it is complete.

},
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is very confusing and I feel a little incomplete; let's use a custom_validator here to test that local aliases re_find and re_match work against some fields. You can create fields for the find and the match and pass those to the custom_validator function along with another schema field which would be the thing you are searching for.

Comment on lines 27 to 45
ltbl := l.ToTable(ldr)
findFn := l.GetField(ltbl, "find")
matchFn := l.GetField(ltbl, "match")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These values could be null, we should handle that case and panic. Granted it shouldn't happen, but let's handle appropriately.

patches/lua-tree.patch Show resolved Hide resolved
@omegabytes omegabytes force-pushed the feat/re-find-match-fn-aliasing branch 3 times, most recently from d208cd1 to d5b8e31 Compare August 23, 2022 23:10
Copy link
Contributor

@mikefero mikefero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't squash commits during a review; it makes it harder to review the changes made during a request.

Let's improve the tests, goks is integral to Koko and we need to make sure we thoroughly test all that we can.

-local re_match = ngx.re.match
-local re_find = ngx.re.find
+local re_match = require "go.re2".match
local tostring = tostring
local concat = table.concat
local insert = table.insert
local re_match = ngx.re.match
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch doesn't represent what is in the actual kong/db/schema/init.lua file within the lua-tree/share directory.

  • Please re-add re_find
  • Let's not remove local variables that are not affected by these changes.

Copy link
Contributor Author

@omegabytes omegabytes Aug 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The line after + local re_match = ngx.re.match isn't removed or changed; it isn't reflected in the diff since it is 3 lines after the last changed line. By way of example:

-- a/init.lua
local tablex       = require "pl.tablex"
local pretty       = require "pl.pretty"
local utils        = require "kong.tools.utils"
local cjson        = require "cjson"
local vault        = require "kong.pdk.vault".new()


local is_reference = vault.is_reference
local dereference  = vault.get
local setmetatable = setmetatable
local getmetatable = getmetatable
local re_match     = ngx.re.match
local re_find      = ngx.re.find
local tostring     = tostring
local concat       = table.concat
local insert       = table.insert
-- b/init.lua
local tablex       = require "pl.tablex"
local pretty       = require "pl.pretty"
local utils        = require "kong.tools.utils"
local cjson        = { array_mt = {} } --- TODO(hbagdi) XXX analyze the impact
local uuid         = require "go.uuid".generate

local setmetatable = setmetatable
local getmetatable = getmetatable
local re_match     = ngx.re.match
local re_find      = ngx.re.find
local tostring     = tostring
local concat       = table.concat
local insert       = table.insert

image

I think there is some confusion being caused because this is a diff of a file containing a list diffs. Take another look at what's being added and removed from lua-tree.patch (ie changes in my commit) vs what each diff contained in lua-tree.patch is indicating should be added/removed (ie what should be the outcome of running patch -p1 < patches/lua-tree.patch).
image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the init.lua file that is checked in, not the diff between the 2.8.0 Kong Gateway OSS files in the luarock.

diff goks/patches/lua-tree.patch goks-orig/patches/lua-tree.patch
40c40
< - Use gluare instead of ngx.re
---
> - Use go.re2 instead of ngx.re
52c52
< - use gluare::match instead of string.match
---
> - use go.re2::match instead of string.match
64c64
< - Use gluare instead of ngx.re
---
> - Use go.re2 instead of ngx.re
167c167
< @@ -1,12 +1,9 @@
---
> @@ -1,16 +1,12 @@
181c181,186
<  local re_match     = ngx.re.match
---
> -local re_match     = ngx.re.match
> -local re_find      = ngx.re.find
> +local re_match     = require "go.re2".match
>  local tostring     = tostring
>  local concat       = table.concat
>  local insert       = table.insert
304c309
< +local match = ngx.re.match
---
> +local match = require "go.re2".match
413,414c418,425
< @@ -30,30 +30,8 @@ local fmt           = string.format
<  local re_match      = ngx.re.match
---
> @@ -33,34 +26,12 @@ local fmt           = string.format
>  local find          = string.find
>  local gsub          = string.gsub
>  local split         = pl_stringx.split
> -local re_find       = ngx.re.find
> -local re_match      = ngx.re.match
> +local re_find       = require "go.re2".find
> +local re_match      = require "go.re2".match

Here you can see that there are more entries is the original patch then the current.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are you confused about?

Copy link
Contributor Author

@omegabytes omegabytes Aug 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry, I don't understand. I've looked at this several different ways and validated the desired contents of lua-tree.patch against the contents of the corresponding files in lua-tree/share.

Comment on lines 418 to 534
@@ -33,34 +26,12 @@ local fmt = string.format
local find = string.find
local gsub = string.gsub
local split = pl_stringx.split
-local re_find = ngx.re.find
-local re_match = ngx.re.match
+local re_find = require "go.re2".find
+local re_match = require "go.re2".match
@@ -30,30 +30,8 @@ local fmt = string.format
local re_match = ngx.re.match
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch doesn't represent what is in the actual kong/tools/utils.lua file within the lua-tree/share directory.

  • Please re-add re_find
  • Let's not remove local variables that are not affected by these changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I'm also removing pre-loading the gluare package, I've removed re_find where it is being declared but it is unused. I've only done so where we were previously using "go.re2".find. I would probably need to keep the pre-loader in vm.go to accurately leave this unchanged.

},
},
},
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • nit: Two spaces for indents of Lua files (not four)
  • fields could have two regexes and a subject to simplify
    • custom validator should utilize these new regex fields for extending tests on the go side
  • nit: no need to create re_find and re_match, just use ngx.re.find|match directly
  • nit: just call plugin name ngx-re-test

A better schema for validation and testing with varying values could be (untested schema)

local typedefs = require "kong.db.schema.typedefs"

return {
  name = "ngx-re-test",
  fields = {
    { config = {
      type = "record",
      fields = {
        { find_regex = { type = "string", is_regex = true }, },
        { match_regex = { type = "string", is_regex = true }, },
        { subject = { type = "string", required = true }, },
      },
      custom_validator = function(config)
        if config.find_regex ~= ngx.null and #config.find_regex > 0 then
          local from = ngx.re.find(config.subject, config.find_regex)
          if not from then
            return false, "failed to find subject " .. config.subject ..
              " using regex " .. config.find_regex
          end
        end
        if config.match_regex ~= ngx.null and #config.match_regex > 0 then
          local m = ngx.re.match(config.subject, config.match_regex)
          if not m then
            return false, "failed to match subject " .. config.subject ..
              " using regex " .. config.match_regex
          end
        end
        return true
      end,
      },
    },
  },
  entity_checks = {
    { at_least_one_of = { "config.find_regex", "config.match_regex" } },
  },
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at 19 regex/target strings used in match calls across several lua files. I've added 4 unit tests that use existing literals. I came across several regexes that seemingly have a malformed input. Others did not have existing tests or adequate documentation for me to efficiently derive a potential matching string to test against.

Complete list:

tested:
* uuid regex ^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$
* header name ^[a-zA-Z0-9-_]+$
* authorization_header: ([^:]+):(.*)
* /directory$

malformed?
* vault: [[([^/]+)$]] mismatched )
* vault: (/[^/]+) unescaped / 
* casandra: [==[[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\:\$\ dangling backslash
* get response type: [[^\s*q=([0-9]*[\.][0-9]+)\s*$]] unmatched closing peren )
* parse_directive_header: [[^\s*([^=]+)(?:=(.+))?]] unmatched closing peren )

unsure of input:
* decode array arg: [[(.+)\[(\d*)\]$]]
* decode arg [[^\.+|\.$]]
* get response type [[^\s*(\S+\/\S+)\s*$]]
* no Route matched with those values
* /$
* [[\.well-known/]] .. acme-challenge .. /(.+)
* ^([" ..  [[\x21-\x2B\x2D\x2E\x30-\x7E\x80-\xF4]] .. "]+(?:,|$))+$
* ^([" ..  [[\x21-\x2B\x2D\x2E\x30-\x7E\x80-\xF4]] .. "]+(?:/|$))+$
* (VALUE\)]==]
* specific string "any string"

err = v.Validate(testCase)
assert.Nil(t, err)
}
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Let's isolate this into it's own test; doesn't need to be with the LoadSchema tests.
  • Add more robust tests; see schema update suggestions

Something like this (test is not complete not tested)

func TestValidator_NginxRegex(t *testing.T) {
	schema, err := os.ReadFile("testdata/override-ngx-re-test.lua")
	require.NoError(t, err)
	require.NotNil(t, schema)

	v, err := NewValidator(ValidatorOpts{})
	require.NoError(t, err)
	require.NotNil(t, v)
	pluginName, err := v.LoadSchema(string(schema))
	require.NoError(t, err)
	require.Equal(t, "override-ngx-re-test", pluginName)

	tests := []struct {
		name        string
		config      string
		expectedErr string
	}{
		{
			name: "ensure ngx.re.find is loaded and contains subject",
			config: `"find_regex": "([0-9])",
				"subject": "A123"`,
		},
		{
			name: "ensure ngx.re.find is loaded and fails to contain subject",
			config: `"find_regex": "([a-z][0-9]+)",
				"subject": "A123"`,
			expectedErr: "failed to find subject A123 using regex ([a-z][0-9]+)",
		},
		{
			name: "ensure ngx.re.match is loaded and matches subject",
			config: `"match_regex": "([A-Z][0-9]+)",
				"subject": "A123"`,
		},
		{
			name: "ensure ngx.re.match is loaded and fails to match subject",
			config: `"match_regex": "([a-z][0-9]+)",
				"subject": "A123"`,
			expectedErr: "failed to match subject A123 using regex ([a-z][0-9]+)",
		},
	}

	for _, test := range tests {
		plugin := fmt.Sprintf(`{
				"name": "ngx-re-test",
				"config": {
					%s
				},
				"enabled": true,
				"protocols": ["http"]
			}`, test.config)
		err = v.Validate(plugin)
		if len(test.expectedErr) > 0 {
			require.ErrorContains(t, err, test.expectedErr)
		} else {
			require.NoError(t, err)
		}
	}
}

Comment on lines 2345 to 2349
// {
// name: "ensure ngx.re.find works for validate_path",
// config: `"find_regex": "[[^[a-zA-Z0-9\.\-_~/%]*$]]",
// "subject": "/path"`,
// },
Copy link
Contributor Author

@omegabytes omegabytes Aug 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing what we want to do with these tests is validate some (all?) of the in-use regex strings. I would hope the existing tests that cover these functions would be sufficient, but I understand why we might want to directly test the behavior with specific unit tests.

Either way, when I placed the regex string found in validate_path_with_regexes under test, it is rejected with an unexpected error: invalid character '.' in string escape code. Dropping the escape characters from the regex resolves ([[^[a-zA-Z0-9\.\-_~/%]*$]] -> [[^[a-zA-Z0-9.-_~/%]*$]]). That would suggest that the existing regex isn't valid? Running the full test suite (the failing test excluded) does not result in any failures, so maybe this specific example isn't exercised by an existing test and I've found a regression.

I also noticed the following signature in use: ngx.re.find("", path, "aj"). It's interesting to me that the 3rd arg (options) is a string originally, still a string here but is rejected by our aliased gluare.re_find function when I use it in a test as it requires an int, not a string.

My understanding was the overrides in lualibs/go/ngx/loader.go are global. These observations make it seem like that might not be so?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it looks like the signature for gluare::reFind() is different than ngx.re.find so this replacement is not a 1-to-1 and something that wasn't noticed early in the development of goks. This must also mean that ngx.re.find is not used in our plugin validation or only the subject and regex parameters were used.

gluare::reFind()

  • parameters
    • subject
    • regex
    • starting index
    • plaintext flag
  • returns
    • found position index
    • error

ngx.re.find

  • parameters
    • subject
    • regex
    • options; how the operation is performed
    • context
    • captured index to return
  • returns
    • from
    • to
    • error

Since parameters 1 and 2 are required for both implementations it is a close fit, but definitely not an exact; this may need to be re-evaluated and a ticket created to handle find in the event we need it in the future.

@omegabytes omegabytes force-pushed the feat/re-find-match-fn-aliasing branch 2 times, most recently from 8b20df1 to 02b50b2 Compare August 30, 2022 19:00
@omegabytes omegabytes changed the title feat: add ngx.re.find and ngx.re.match feat: ngx.re.match Aug 30, 2022
@omegabytes omegabytes changed the title feat: ngx.re.match feat: alias glaure.match as ngx.re.match Aug 30, 2022
Copy link
Contributor

@mikefero mikefero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of nickel and dime things left; let's clean these up and try and merge these two goks PRs you have open.

patches/lua-tree.patch Show resolved Hide resolved
Comment on lines 2345 to 2349
// {
// name: "ensure ngx.re.find works for validate_path",
// config: `"find_regex": "[[^[a-zA-Z0-9\.\-_~/%]*$]]",
// "subject": "/path"`,
// },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it looks like the signature for gluare::reFind() is different than ngx.re.find so this replacement is not a 1-to-1 and something that wasn't noticed early in the development of goks. This must also mean that ngx.re.find is not used in our plugin validation or only the subject and regex parameters were used.

gluare::reFind()

  • parameters
    • subject
    • regex
    • starting index
    • plaintext flag
  • returns
    • found position index
    • error

ngx.re.find

  • parameters
    • subject
    • regex
    • options; how the operation is performed
    • context
    • captured index to return
  • returns
    • from
    • to
    • error

Since parameters 1 and 2 are required for both implementations it is a close fit, but definitely not an exact; this may need to be re-evaluated and a ticket created to handle find in the event we need it in the future.

@@ -55,7 +54,6 @@ func New(opts Opts) (*VM, error) {
l.PreloadModule("go.rand", rand.Loader)
l.PreloadModule("go.uuid", uuid.Loader)
l.PreloadModule("go.ipmatcher", ipmatcher.Loader)
l.PreloadModule("go.re2", gluare.Loader)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should keep these; no reason to remove them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, does that cause problems?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I can tell. I've made some updates such that we don't reference go.re2 at all. If you agree with those changes, then I suggest we don't import here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So because we have multiple goks PRs in flight and @tjasko has #27 in flight which uses go.re2 we should keep it. We can always make another PR to clean up the last little bit of go.re2 usage in our Lua code, but since it doesn't hurt I think we should keep it; especially since it is not a maintenance burden.

@@ -26,8 +26,7 @@ local fmt = string.format
local find = string.find
local gsub = string.gsub
local split = pl_stringx.split
local re_find = require "go.re2".find
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird that this is even here, since this removal has been in the patch for months. Nothing to do here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's why I wanted to clean it up. I'll leave it in per your suggestion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I went back and looked at both find and match signatures between gluare and ngx.re. They both methods are not an exact match. I've changed this line to instead reference the original ngx.re.find (even though it is still unused). This satisfies the goal of removing the need to patch custom plugins.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope they are not a 1:1 either unfortunately; we have had to make changes to some of the regex formats used.

Comment on lines 24 to 25
// fetch gluare.find and gluare.match functions
// see https://github.com/yuin/gluare/blob/master/gluare.go
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a blurb here talking about the differences between ngx.re.find and go.re2.find? Since ngx.re.find uses PCRE and go.re2.find uses RE2 we know there is a difference and we should let futureselves know when we look at the code again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documented LoadNgx as well as the overrides of find and match.

@@ -37,7 +37,7 @@ schema/init.lua:
- rework goto statements
- Use an empty lua table for cjson.array_mt, this can potentially
introduce serilialization errors
- Use gluare instead of ngx.re
- Use go.re2::match instead of ngx.re.match
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for all of these we can remove the lines like you did previously; we aren't modifying the Lua files anymore so we will know to check the ngx loader.

patches/lua-tree.patch Show resolved Hide resolved
patches/lua-tree.patch Show resolved Hide resolved
@omegabytes omegabytes force-pushed the feat/re-find-match-fn-aliasing branch 2 times, most recently from bd23651 to 752597f Compare September 8, 2022 20:49
mikefero
mikefero previously approved these changes Sep 12, 2022
Copy link
Contributor

@mikefero mikefero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small request for the comments in the patch, please make an overall comment about ngx.re.<find|match> usage and remove the comments from the 3 file sections.

patches/lua-tree.patch Outdated Show resolved Hide resolved
@mikefero mikefero force-pushed the feat/re-find-match-fn-aliasing branch 2 times, most recently from 8ce5c69 to a598673 Compare September 13, 2022 21:40
Injects the find and match functions from external gluare repo into
the overridden ngx package. This enables in-place substitution without
updating the import target.
@mikefero mikefero force-pushed the feat/re-find-match-fn-aliasing branch from a598673 to a4fb835 Compare September 13, 2022 21:43
Copy link
Contributor

@mikefero mikefero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @omegabytes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants