Shard Mappers perform query re-writing #3414

Closed
wants to merge 2 commits into from

@otoolep otoolep commented Jul 21, 2015

With this change, query re-writing (most importantly, wildcard re-writing) is performed by the LocalMappers rather than by the Executors. This is necessary because only shard indexes know which fields should replace * within a given SELECT query.
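To illustrate why the expansion has to happen next to the shard, here is a minimal sketch, not the code from this PR. The `shardIndex` type and the `expandWildcard` function are hypothetical names invented for the example; the only assumption is that each shard can list the fields it has seen for a measurement.

```go
package main

import (
	"fmt"
	"sort"
)

// shardIndex is a hypothetical stand-in for a shard's local index: it maps
// a measurement name to the set of field names that shard has seen.
type shardIndex map[string]map[string]struct{}

// expandWildcard returns the field names that should replace "*" for the
// given measurement on this shard, in alphabetical order. An explicit field
// list is returned unchanged.
func expandWildcard(idx shardIndex, measurement string, selected []string) []string {
	if len(selected) != 1 || selected[0] != "*" {
		return selected
	}
	names := make([]string, 0, len(idx[measurement]))
	for name := range idx[measurement] {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	// Two decoupled shards that have seen different fields for "cpu".
	shardA := shardIndex{"cpu": {"idle": {}, "user": {}}}
	shardB := shardIndex{"cpu": {"system": {}, "user": {}}}

	fmt.Println(expandWildcard(shardA, "cpu", []string{"*"})) // [idle user]
	fmt.Println(expandWildcard(shardB, "cpu", []string{"*"})) // [system user]
}
```

The same `SELECT *` expands differently on each shard, which is something a single up-front rewrite in the Executor could not know.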

This required the following changes:

  • Mapper responses now include the list of field names, always in alphabetical order, for the data being returned to the Executor.
  • The Executor, when in raw mode, can use this list to form the complete set of column names.
  • The Executor no longer re-writes queries, but still uses the version of the statement passed to it to initialize reducer functions (in the case of aggregate queries) and mathematical functions. Fortunately, this processing does not require wildcards or regexes to be replaced, so the Executor can continue to work with the non-rewritten version of the SELECT statement.
  • Aggregate queries did not need much work: because aggregate queries may not contain wildcards, re-writing is irrelevant to them.
  • RemoteMapper updated as needed.
  • Unit tests updated as needed. An important new test was added -- SELECT * FROM cpu, where the query requires that two completely decoupled shard (and TSDB store) objects must be processed.
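As a rough sketch of the Executor side in raw mode, the field names reported by each mapper can be merged into one column list. The `mapperResponse` type and `columnNames` function are hypothetical and much simpler than the real structures; placing a `time` column first is also an assumption made for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// mapperResponse is a hypothetical, simplified mapper response. Alongside
// its data (omitted here) it carries the field names in alphabetical order.
type mapperResponse struct {
	Fields []string
}

// columnNames forms the column list for a raw query: "time" first (an
// assumption of this sketch), then the sorted union of all field names
// reported by the mappers.
func columnNames(responses []mapperResponse) []string {
	seen := make(map[string]struct{})
	for _, r := range responses {
		for _, f := range r.Fields {
			seen[f] = struct{}{}
		}
	}
	fields := make([]string, 0, len(seen))
	for f := range seen {
		fields = append(fields, f)
	}
	sort.Strings(fields)
	return append([]string{"time"}, fields...)
}

func main() {
	// Responses from two decoupled shards for SELECT * FROM cpu.
	responses := []mapperResponse{
		{Fields: []string{"idle", "user"}},
		{Fields: []string{"system", "user"}},
	}
	fmt.Println(columnNames(responses)) // [time idle system user]
}
```

Because the union is sorted again after merging, the result does not depend on the order in which mapper responses arrive.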

@otoolep otoolep force-pushed the statement_rewrite2 branch 2 times, most recently from 1c5cdee to bf43a44 Compare July 21, 2015 21:45
@otoolep otoolep changed the title Statement rewrite2 Shard Mappers perform query re-writing Jul 21, 2015
@otoolep otoolep self-assigned this Jul 22, 2015
@otoolep otoolep modified the milestones: 0.9.3, 0.9.2 Jul 22, 2015

otoolep commented Jul 22, 2015

@pauldix @dgnorton

With this change, query re-writing (most importantly, wildcard
re-writing) is performed by Shard Mappers. This is necessary because only
shard indexes know which fields should replace * within a given SELECT query.

This required the following changes:

- Mapper responses now include the list of field names, always in alphabetical
  order, for the data being returned to the Executor.
- The Executor, when in raw mode, can use this list to form the complete set
  of column names.
- The Executor no longer re-writes queries, but still uses the version
  of the statement passed to it to initialize reducer functions (in the
  case of aggregate queries) and mathematical functions. Fortunately, this
  processing does not require wildcards or regexes to be replaced, so the
  Executor can continue to work with the non-rewritten version of the
  SELECT statement.
- Aggregate queries did not need much work: because aggregate queries may
  not contain wildcards, re-writing is irrelevant to them.
- RemoteMapper updated as needed.
- Unit tests updated as needed.
dgnorton added a commit that referenced this pull request Jul 29, 2015
@dgnorton dgnorton closed this in d661bf1 Aug 4, 2015
@otoolep otoolep deleted the statement_rewrite2 branch August 18, 2015 20:46