Add AnalyticsExecutionEngine and query routing for Parquet-backed indices #5258
…ices (opensearch-project#5247)

Implements the execution engine adapter and query routing infrastructure for Project Mustang's unified query pipeline (Option B). PPL queries targeting parquet_ prefixed indices are routed through UnifiedQueryPlanner and AnalyticsExecutionEngine instead of the existing Lucene path.

Key changes:
- QueryPlanExecutor interface: contract for analytics engine execution
- AnalyticsExecutionEngine: converts QueryPlanExecutor results to QueryResponse
- RestUnifiedQueryAction: orchestrates schema building, planning, execution
- RestPPLQueryAction: routing branch for parquet_ indices
- StubQueryPlanExecutor: canned data for development/testing

Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
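The routing rule in this commit message can be illustrated with a minimal sketch. Everything here is hypothetical and simplified: the class name `QueryRouter`, the `Route` enum, and the regex-based index extraction are assumptions for illustration, not the code in `RestPPLQueryAction`. Only the `parquet_` prefix comes from the commit message.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of prefix-based routing; not the PR's actual implementation. */
final class QueryRouter {
    enum Route { ANALYTICS, LUCENE }

    // Prefix named in the commit message.
    static final String PARQUET_PREFIX = "parquet_";

    // Assumption: only handles queries that start with "source = <index>"
    // or "search source = <index>". A real parser would cover more forms.
    private static final Pattern SOURCE = Pattern.compile(
        "^\\s*(?:search\\s+)?source\\s*=\\s*`?([\\w.\\-*]+)`?",
        Pattern.CASE_INSENSITIVE);

    /** Returns the first index name in the query, if one can be found. */
    static Optional<String> extractIndexName(String ppl) {
        if (ppl == null) {
            return Optional.empty();
        }
        Matcher m = SOURCE.matcher(ppl);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** Routes parquet_ indices to the analytics path, everything else to Lucene. */
    static Route route(String ppl) {
        return extractIndexName(ppl)
            .filter(name -> name.toLowerCase(Locale.ROOT).startsWith(PARQUET_PREFIX))
            .map(name -> Route.ANALYTICS)
            .orElse(Route.LUCENE);
    }
}
```

In this sketch, any query whose index cannot be extracted falls back to the Lucene path, which keeps existing behavior as the default.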
- Classify client vs server errors in RestUnifiedQueryAction (syntax errors return 400, engine failures return 500)
- Fix JSON escaping in error responses for special characters
- Add integration tests: explain with filter/aggregation, syntax error handling, response format validation, regression checks
- Increment correct metrics (PPL_FAILED_REQ_COUNT_CUS vs SYS)

Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
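The first two items of this commit (400 vs 500 classification and JSON escaping) could look roughly like the sketch below. The class `ErrorResponses`, the stand-in `SyntaxError` exception, and the response body shape are all assumptions made for illustration. The actual exception types and error format in `RestUnifiedQueryAction` are not shown in this conversation.

```java
/** Hypothetical sketch of error classification and escaping; names are illustrative. */
final class ErrorResponses {
    /** Stand-in for whatever exception the PPL parser throws on bad syntax. */
    static final class SyntaxError extends RuntimeException {
        SyntaxError(String message) {
            super(message);
        }
    }

    /** Client mistakes map to 400; anything else is treated as a server failure. */
    static int statusFor(Throwable t) {
        if (t instanceof SyntaxError || t instanceof IllegalArgumentException) {
            return 400;
        }
        return 500;
    }

    /** Escapes quotes, backslashes, and control characters for a JSON string value. */
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the 4-digit hex form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds an error body; this JSON shape is assumed, not taken from the PR. */
    static String errorBody(Throwable t) {
        String reason = t.getMessage() == null ? "" : t.getMessage();
        return "{\"error\":{\"type\":\"" + escapeJson(t.getClass().getSimpleName())
            + "\",\"reason\":\"" + escapeJson(reason)
            + "\"},\"status\":" + statusFor(t) + "}";
    }
}
```

Hand-rolled escaping is shown only to make the failure mode concrete. In production code a JSON builder from the project's existing dependencies would usually be the safer choice.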
10d08f2 to ccd1665
Signed-off-by: Heng Qian <qianheng@amazon.com>
- Parse explain JSON and verify calcite.logical, calcite.physical (null), calcite.extended (null) structure
- Verify exact plan operators (LogicalProject, LogicalFilter, LogicalSort, LogicalAggregate) and their contents (column names, filter conditions)
- Add sort plan explain test

Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Summary
Implements the execution engine adapter and query routing infrastructure for Project Mustang's unified query pipeline (#5247).
- QueryPlanExecutor: @FunctionalInterface contract for analytics engine plan execution
- AnalyticsExecutionEngine: Implements ExecutionEngine, bridges QueryPlanExecutor output to QueryResponse pipeline
- RestUnifiedQueryAction: Orchestrates the analytics query path: routing, planning via UnifiedQueryPlanner, execution, response formatting on sql-worker thread pool
- RestPPLQueryAction: Routing branch: parquet_ prefixed indices → analytics path; all others → existing Lucene path
- StubQueryPlanExecutor: Canned data for parquet_logs and parquet_metrics tables for development/testing
- AnalyticsPPLIT: Integration tests validating full pipeline end-to-end

Test plan
- AnalyticsExecutionEngine (type mapping, row conversion, query size limit, null handling, error propagation, explain)
- RestUnifiedQueryAction (index name extraction, routing detection)
- AnalyticsPPLIT: full pipeline with stub executor against real test cluster
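To make the adapter idea from the summary concrete, here is a minimal sketch of a functional-interface executor plus a bridge that applies a query size limit and tolerates null values, two of the behaviors listed in the test plan. The names `PlanExecutor`, `AdapterSketch`, and `Response` are hypothetical stand-ins. The real `QueryPlanExecutor` signature and `QueryResponse` type may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for QueryPlanExecutor; the PR's real signature may differ. */
@FunctionalInterface
interface PlanExecutor<P> {
    Iterable<Object[]> execute(P plan);
}

/** Hypothetical sketch of the bridge from executor rows to a response object. */
final class AdapterSketch {
    /** Simplified response: column names plus row values. */
    static final class Response {
        final List<String> columns;
        final List<List<Object>> rows;

        Response(List<String> columns, List<List<Object>> rows) {
            this.columns = columns;
            this.rows = rows;
        }
    }

    /**
     * Runs the plan and converts at most querySizeLimit rows.
     * Exceptions from the executor are not caught, so they reach the caller.
     */
    static <P> Response run(
            PlanExecutor<P> executor, P plan, List<String> columns, int querySizeLimit) {
        List<List<Object>> rows = new ArrayList<>();
        for (Object[] raw : executor.execute(plan)) {
            if (rows.size() >= querySizeLimit) {
                break; // stop once the limit is reached
            }
            // Arrays.asList accepts null elements, unlike List.of.
            rows.add(Collections.unmodifiableList(new ArrayList<>(Arrays.asList(raw))));
        }
        return new Response(columns, rows);
    }
}
```

Because the executor is a functional interface, a stub for tests can be a lambda that returns canned rows, which is the same role `StubQueryPlanExecutor` plays in this PR.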