fix(influxql): don't expand large regex char sets #30

Merged
merged 1 commit into from Sep 12, 2019

@dgnorton dgnorton commented Sep 11, 2019

fixes influxdata/influxdb#13929

InfluxQL optimizes some queries by expanding regular expressions in the
WHERE clause to literal expressions. This causes a problem when the
expression expands to a large number of literals. This change caps it at
100 literals. If the expression would expand to more, it is not
optimized at all (i.e., no partial optimization).
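
To make the all-or-nothing cap concrete, here is a minimal sketch, not the code from this PR. It assumes a hypothetical helper `expandRegex` that only handles fully anchored patterns built from literals and character classes. It counts the expansions first, and bails out before building anything if the count would exceed `maxLiterals`.

```go
package main

import (
	"fmt"
	"regexp/syntax"
)

// maxLiterals is the cap discussed in this PR. Everything else in this
// file is a hypothetical illustration, not the influxql implementation.
const maxLiterals = 100

// expandRegex returns every string matched by pattern, or ok=false when the
// pattern is not a simple anchored concatenation of literals and character
// classes, or when it would expand to more than maxLiterals strings.
func expandRegex(pattern string) (lits []string, ok bool) {
	re, err := syntax.Parse(pattern, syntax.Perl)
	if err != nil || re.Op != syntax.OpConcat || len(re.Sub) < 3 {
		return nil, false
	}
	subs := re.Sub
	if subs[0].Op != syntax.OpBeginText || subs[len(subs)-1].Op != syntax.OpEndText {
		return nil, false
	}
	subs = subs[1 : len(subs)-1]

	// First pass: count the expansions without allocating them.
	total := 1
	for _, s := range subs {
		switch s.Op {
		case syntax.OpLiteral:
			if s.Flags&syntax.FoldCase != 0 {
				return nil, false // case-insensitive literals are not handled here
			}
		case syntax.OpCharClass:
			n := 0
			for i := 0; i < len(s.Rune); i += 2 { // Rune holds [lo, hi] pairs
				n += int(s.Rune[i+1]-s.Rune[i]) + 1
				if n > maxLiterals {
					return nil, false
				}
			}
			total *= n
		default:
			return nil, false
		}
		if total > maxLiterals {
			return nil, false // too many literals: no partial expansion
		}
	}

	// Second pass: build the cross product of all alternatives.
	lits = []string{""}
	for _, s := range subs {
		var next []string
		for _, prefix := range lits {
			if s.Op == syntax.OpLiteral {
				next = append(next, prefix+string(s.Rune))
				continue
			}
			for i := 0; i < len(s.Rune); i += 2 {
				for r := s.Rune[i]; r <= s.Rune[i+1]; r++ {
					next = append(next, prefix+string(r))
				}
			}
		}
		lits = next
	}
	return lits, true
}

func main() {
	fmt.Println(expandRegex(`^host[1-3]$`))       // [host1 host2 host3] true
	fmt.Println(expandRegex(`^[0-9][0-9][0-9]$`)) // [] false (1000 literals exceeds the cap)
}
```

Counting before building is the point of the sketch: a pattern that would explode is rejected without ever materializing the literals.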


@jsternberg jsternberg left a comment

I think the max should be lower, but the code is solid so whichever number it is this can be approved.

ast.go Outdated
// this is exceeded, no expansion will be done. This allows reasonable
// optimizations of regex by expansion to literals but prevents cases
// where that expansion would result in a large number of literals.
const maxLiterals = 1000

I think 100 is a better value. 1000 is likely too high.

Contributor Author

👍 fixed

@e-dard e-dard left a comment

This is the right approach, but we need to change a couple of things here:

  1. we can't introduce a change like this that could break existing users' queries without a form of control. We need to propagate maxLiterals all the way back to a configuration file option in InfluxDB. The default of 100 seems reasonable.
  2. when I tested this I just got a normal info message saying the query executed, but no error or indication the query failed. We need to be able to inform the user that their query failed, if indeed it did.

Further, when the PR to InfluxDB is made, we need to add the config option to the demo config with some description of what it does. We also need to let the docs team know about this change.

@dgnorton

@e-dard

  1. This change shouldn't break any existing queries because it is just limiting a query optimization. E.g., take a query with a regex that would expand to 100,000 literal comparisons. Prior to this change, the expansion would have happened and the rewritten query might have executed on a machine with sufficient resources. After this change, the query will not be rewritten and it will execute correctly on any machine.
  2. No error occurs. It's just a matter of whether a query was optimized or not.

e-dard commented Sep 12, 2019

@dgnorton ah OK I misunderstood the function's purpose. This makes a lot of sense! We should still document in the InfluxDB release notes (top of the changelog) that the optimisation has now been limited to XYZ expressions.

Successfully merging this pull request may close these issues.

Some regexes will cause a stack overflow during evaluation