Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: adds configurable query parameters to tile endpoints #867

Closed

Conversation

bemyak
Copy link
Contributor

@bemyak bemyak commented Jun 22, 2022

This PR is a successor of #795.

The main differences/features are:

  • Avoid SQL injection vulnerability (more on the approach bellow)
  • Allow using several types of parameters, not only int
  • Support omitting a query entirely or replacing it with an arbitrary expression in case no parameter is passed
  • Go is updated to 1.18.3 (dropped, as upstream already uses 1.19)

A user can define custom parameters on per-map basis like this:

[[maps.params]]
name = "param2"
token = "!PARAM2!"
# only very simply types are suppored: `int`, `string`, `float`, `bool`
type = "int"
# `?` will be replaced with the actual value
# if `sql` is not specified, it is considered to be `?`
sql = "AND ANSWER = ?"
# if the parameter value is missing from the HTTP query, this value will be used
# `default_value` is an optional parameter. If it is not defined, the parameter is considered as required and cliet will get 400 error
default_value = "42"
# instead of `default_value`, `default_sql` can be defined, e.g. to omit a param query entirely:
#default_sql = " "
# defining both `default_value` and `default_sql` is forbidden

When a request is received, a value of QueryParameterValue is constructed to hold the resulting parameter query (still with ?) and a value (parsed to the proper type).

Then (params map[string]QueryParameterValue) ReplaceParams(sql string, args *[]interface{}) string function can be called which replaces parameters with positional arguments (e.g., $1, $2) and fills an array with their values. This should prevent SQL injections.

Copy link
Member

@gdey gdey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not seeing when we check to see if both caching and map params are enabled during config validation. I'll need to go through it again, but I'm worried that could lead to come confusion when trying to startup and have caching.

Were the caching check is happening right now, it ignores the caching if the map has any parameters, which is the correct behavior, but how does the user know that the cache does not contain that map?

I made a few code clean-up comments as well. I kind of want to get some of the simpler code cleanups into the main branch; that would make reviewing this easier as well, I think, as those things would not distract from the main part of the code.

Overall I think this will work; bit concerned about number generation for placeholders. Does this force anyone writing SQL to use $%d instead of '?'? What happens if one types a number that is generated in the SQL by mistake? Where would the failure point be? How, would the user know this? We need to be able to document this so that people can trouble shoot issues like this? Are we able to catch issues like this before hand?

cmd/tegola/cmd/cache/cache.go Outdated Show resolved Hide resolved
config/config.go Outdated Show resolved Hide resolved
provider/query_parameter.go Outdated Show resolved Hide resolved
provider/query_parameter.go Outdated Show resolved Hide resolved
provider/query_parameter_value.go Outdated Show resolved Hide resolved
@bemyak
Copy link
Contributor Author

bemyak commented Oct 17, 2022

@gdey, thank you so much for the review!

I've tried to preserve the commits from the original feature author, as my impression was that they were already reviewed and approved. However, it might be easier if I redo the commit history and separate code cleanups from the actual feature code. What do you think?

Regarding caching, will it be enough to log a warning? Or should we fail the config validation? We don't support per-map caching, so the best thing we can do without implementing it is to skip maps with configured parameters.

Does this force anyone writing SQL to use $%d instead of '?'?

No, the numbered arguments are handled by the Tegola internals and are not exposed to the end user in any way. In maps' configuration, one uses only "tokens", like !PARAM1!. In the parameter configuration, only ?. The substitution will happen transparently.

What happens if one types a number that is generated in the SQL by mistake?

If that's just a mistake, Tegola will probably catch it during startup while executing inspectLayerGeomType since the query isn't valid. If the geometry_type was specified, that function is optional so that the parameter will be passed to the query and replaced with some value. I don't think we should protect against it. It's the same attack vector as if an admin would put DROP DATABASE in the query 🤷

We definitely must ensure that the documentation clearly states that only ? is accepted as a placeholder, and numbered parameters here produce undefined behaviour.

@bemyak bemyak force-pushed the feature/query-parameters-shsh branch 2 times, most recently from cb3e35e to def63c4 Compare October 17, 2022 11:21
@gdey
Copy link
Member

gdey commented Oct 17, 2022

I've tried to preserve the commits from the original feature author, as my impression was that they were already reviewed and approved. However, it might be easier if I redo the commit history and separate code cleanups from the actual feature code. What do you think?

I would say go ahead and rebase the code to separate out the clean-up from the actual feature code. This provides two benefits as I see it.

  1. I can cherry-pick the clean-up and get them into dev, as we discuss this. (Through I think it's really close, the only thing I have a concern with is how caching is going to interact with this feature, from a user perspective.)
  2. It's easier to see how everything interacts without the clean-up code distracting.

Regarding caching, will it be enough to log a warning? Or should we fail the config validation? We don't support per-map caching, so the best thing we can do without implementing it is to skip maps with configured parameters.

I feel we should fail on config validation. At least warn there that caching is being turned off due to maps with parameters.

Just skipping a map with parameters seems like the worst of both worlds, because the user has to remember which maps have parameters and which don't.
The downside is that there isn't caching for any of the maps.

The PR is logging (with your review changes) when a map is not going to be cached at the time of tile buildup; this is going to inflate the logs (I believe) since this will have every time, an (uncached) tile is requested.

Failing at config validation time, or logging at that time the maps that will not be included in the cache may be better.

@dechristopher @mojodna @ARolek @ear7h opinions?

Does this force anyone writing SQL to use $%d instead of '?'?

No, the numbered arguments are handled by the Tegola internals and are not exposed to the end user in any way. In maps' configuration, one uses only "tokens", like !PARAM1!. In the parameter configuration, only ?. The substitution will happen transparently.

Okay, cool. Could not remember how we had implemented that.

What happens if one types a number that is generated in the SQL by mistake?

If that's just a mistake, Tegola will probably catch it during startup while executing inspectLayerGeomType since the query isn't valid. If the geometry_type was specified, that function is optional so that the parameter will be passed to the query and replaced with some value. I don't think we should protect against it. It's the same attack vector as if an admin would put DROP DATABASE in the query shrug

I need to think about this.

We definitely must ensure that the documentation clearly states that only ? is accepted as a placeholder, and numbered parameters here produce undefined behaviour.

👍

@coveralls
Copy link

coveralls commented Oct 17, 2022

Pull Request Test Coverage Report for Build fe493a746-PR-867

  • 218 of 396 (55.05%) changed or added relevant lines in 22 files are covered.
  • 2 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+0.007%) to 45.322%

Changes Missing Coverage Covered Lines Changed/Added Lines %
cmd/tegola/cmd/root.go 0 1 0.0%
atlas/map.go 6 12 50.0%
atlas/atlas.go 0 7 0.0%
provider/postgis/util.go 18 25 72.0%
cmd/tegola/cmd/cache/cache.go 0 10 0.0%
config/config.go 76 86 88.37%
provider/paramater_decoders.go 0 12 0.0%
provider/map_layer.go 0 15 0.0%
provider/postgis/postgis.go 28 46 60.87%
server/handle_map_layer_zxy.go 12 34 35.29%
Files with Coverage Reduction New Missed Lines %
config/config.go 1 77.61%
provider/postgis/postgis.go 1 62.07%
Totals Coverage Status
Change from base Build dd8c61dfc: 0.007%
Covered Lines: 5745
Relevant Lines: 12676

💛 - Coveralls

@bemyak
Copy link
Contributor Author

bemyak commented Oct 18, 2022

Just summarizing my thoughts and findings on the caching

The problems

  1. Caching is configured on a per-Atlas, not per-map level. Moving it to a map level will introduce extra complexity to the end users while not providing too much benefit. However, we can add a simple cache=true/false param and ensure that all maps with custom params have it off.
  2. I think having cache for some maps is better than having it off for everything.
  3. It is possible to create cache for custom parameters, but this comes with a new set of challenges:
    • Cache key can be extended with a key=value part for every custom parameter used. However, this value must be hashed to avoid attacks like specifying a value as ../../123 and reading tiles from another cache this way
    • The order of parameters, their names, and even queries now matter. If the config is changed, all cache must be invalidated. So we either need to always do it on startup or store a config's checksum somewhere.
      All in all, I think we should leave this out of this PR's scope.
  4. In this PR, cache is still used for every map, disregarding custom parameters. Changes in the cmd/tegola/cmd/cache/cache.go only affect the tegola cache subcommand. I'll fix it when we decide what approach should be taken.

Possible solutions

  1. Disable all caching if any custom parameter is configured
  2. Skip caching for maps that have custom parameters, printing a warning on startup
  3. Add a cache=false key to the map configuration and require it to be set for every map with custom params

Personally, I don't like №3 as the users don't have any saying in this; by adding this line, they just acknowledge the fact, so it's not a "configuration option".

@bemyak bemyak force-pushed the feature/query-parameters-shsh branch from def63c4 to ad9fa71 Compare October 18, 2022 12:09
@bemyak
Copy link
Contributor Author

bemyak commented Oct 18, 2022

@gdey, I rewrote the commit history. It should be much easier to follow it commit by commit now, as there are less back and forth changes.

@bemyak bemyak force-pushed the feature/query-parameters-shsh branch 2 times, most recently from 6ffd38a to 2f0f188 Compare October 18, 2022 12:20
@gdey
Copy link
Member

gdey commented Oct 19, 2022

Thank you for the great summation!

Possible solutions

1. Disable all caching if any custom parameter is configured

Sounds like this would be true Per Atlas and not for the entirety of Tegola instance.

2. Skip caching for maps that have custom parameters, printing a warning on startup

This seems to be the natural solution but will require documentation around people missing the warning on startup, and further clarification in the feature documentation.

3. Add a `cache=false` key to the map configuration and require it to be set for every map with custom params

Personally, I don't like №3 as the users don't have any saying in this; by adding this line, they just acknowledge the fact, so it's not a "configuration option".
I agree, 3 is not a good solution, and add caching on a per map bases, should be out of scope for this PR.

@gdey
Copy link
Member

gdey commented Oct 19, 2022

@ARolek are we able to upgrade to go1.18, or do we still need to wait on that?

@gdey
Copy link
Member

gdey commented Oct 26, 2022

@ARolek are we able to upgrade to go1.18, or do we still need to wait on that?

Talk to @ARolek and we are fine with upgrading to go1.18, and can even move to go1.19.

@sjb-citian
Copy link

sjb-citian commented Oct 31, 2022

Got an internal error when trying to build a lambda version of this branch. Has anyone else come across this issue?

Oh yea to clarify, this is after building tegola_lambda and uploading it to my lambda function on AWS that typically works when building the main branch. When opening the /capabilities url I get an internal error.

@bemyak
Copy link
Contributor Author

bemyak commented Nov 1, 2022

Thanks for giving it a try, @sjb-citian !

I haven't tested the lambda build. Are there any logs or a stacktrace so I could investigate?

@sjb-citian
Copy link

@bemyak I rebuilt it this morning using the command specified in the read.me of the cmd/lambda folder and reconfigured my endpoint on AWS and it worked! Likely a build error on my part or I configured the endpoint incorrectly, either this branch definitely works with lambda.

@ARolek
Copy link
Member

ARolek commented Nov 10, 2022

@sjb-citian thanks for being diligent on this contribution, and again apologies for the slow response on my end. Regarding caching, I think we should go with:

  1. Skip caching for maps that have custom parameters, printing a warning on startup

This seems the most natural in my mind. If the user wants some caching, we can suggest they out a cache in front of tegola, like a CDN. Your breakdown of the complications with caching dynamic tiles is very correct. It opens up a whole new world of complexities that will likely need to be solved, but I think are beyond the scope of this initial enhancement.

I think the cache detail is the only final outstanding detail. What I would like to do next is the following:

  • Release a new version of tegola ahead of this PR hitting main. I just need to finish up some documentation and a few other minor details to do this.
  • If you could please implement the above cache solution, we can work on getting this PR into main.
  • Let the feature simmer for a bit in main so we can battle test it for a bit prior to the next tegola release.

I'm excited to play with this feature. It has been long requested and again, I appreciate your diligence and patience. Your contribution is greatly appreciated!

@bemyak bemyak force-pushed the feature/query-parameters-shsh branch from e693c24 to 60b3db9 Compare November 15, 2022 14:56
@bemyak
Copy link
Contributor Author

bemyak commented Nov 15, 2022

@gdey , @ARolek , I added cache bypassing and the warning for maps with configured parameters.

Is there anything else remaining unresolved?

@bemyak bemyak force-pushed the feature/query-parameters-shsh branch from 8572cfb to 713967e Compare November 16, 2022 07:49
Copy link
Member

@gdey gdey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@bemyak
Copy link
Contributor Author

bemyak commented Nov 21, 2022

@ARolek, can I do something else here?

@ARolek
Copy link
Member

ARolek commented Nov 21, 2022

@bemyak nope! Action on me to get the release notes and documentation updates for the current version polished out. If you would like to help on that front I can push up a branch that you can help on. I'm mainly trying to cut over to using mvt_postgis in our readmes.

@ARolek
Copy link
Member

ARolek commented Nov 29, 2022

@bemyak can you rebase this branch against master please?

dechristopher and others added 2 commits November 30, 2022 09:23
fix: small typos and doc cleanup in provider/postgis
feat: implemented Query Parameters into postgres provider
feat: implemented default parameter values
feat: revamp param specification in config
- Fixed test in case no env vars are defined
- QueryParams inserted only once per token
- Minor fixes
Warn about disabled caching
This is needed to access configured map parameters from the provider's
initialization code to inspect geom type. Unfortunately, config pkg
depends on the provider pkg, so simply referencing these types creates a
dependency loop.
@bemyak bemyak force-pushed the feature/query-parameters-shsh branch from 713967e to ebf58d2 Compare November 30, 2022 07:23
@bemyak
Copy link
Contributor Author

bemyak commented Nov 30, 2022

@ARolek, done! I've just dropped two commits with the Go update and CI fix

@bemyak bemyak force-pushed the feature/query-parameters-shsh branch from ebf58d2 to 1201df2 Compare November 30, 2022 07:34
@ARolek
Copy link
Member

ARolek commented Dec 1, 2022

@bemyak alright, I just cut the v0.16.0 version of tegola. Going to run the CI on this PR and then I will merge it in. Thanks again for the commitment and patience on this PR. I would like to let this sit in pre-release state for a bit so we can let the community give it a try.

@ARolek
Copy link
Member

ARolek commented Dec 1, 2022

PR now in master: https://github.com/go-spatial/tegola/commits/master. I'm going to open a discussion for anyone that needs help with the feature or has feedback. Onward!

@ARolek
Copy link
Member

ARolek commented Dec 1, 2022

@bemyak when opening this discussion, I realized the configuration instructions for the feature are only in the PR. Could you send in some README documentation to help guide future users?

@ARolek ARolek closed this Dec 1, 2022
@bemyak
Copy link
Contributor Author

bemyak commented Dec 2, 2022

@ARolek, thank you! I've opened a PR with the README update: #896

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants