Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESQL: PushdownRegexExtract doesn't take into account shadow attributes #105434

Closed
costin opened this issue Feb 13, 2024 · 2 comments · Fixed by #105650
Closed

ESQL: PushdownRegexExtract doesn't take into account shadow attributes #105434

costin opened this issue Feb 13, 2024 · 2 comments · Fixed by #105650
Assignees
Labels
:Analytics/ES|QL AKA ESQL >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.14.0

Comments

@costin
Copy link
Member

costin commented Feb 13, 2024

Description

Grok and Dissect produce attributes which can clash with existing fields creating new attributes (with the same name).
When reordering this through the tree however, the shadow attributes are not considered leading to inconsistencies in the tree:

// notice the two emp_no attributes
 [test] Rule optimizer.LogicalPlanOptimizer$PushDownRegexExtract applied
...
  \_Dissect[full_name{r}#5,Parser[pattern=%{emp_no} %{b}, appendSeparator=, parser=org.elasticsearch.dissect.DissectParser@ !   \_OrderBy[[Order[emp_no{f}#15,ASC,LAST]]]
15e5e308],[emp_no{r}#6, b{r}#7]]                                                                                            !     \_Dissect[full_name{r}#5,Parser[pattern=%{emp_no} %{b}, appendSeparator=, parser=org.elasticsearch.dissect.DissectParser@
    \_OrderBy[[Order[emp_no{f}#15,ASC,LAST]]]                                                                               ! 15e5e308],[emp_no{r}#6, b{r}#7]]

@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Feb 13, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@alex-spies alex-spies changed the title ESQL: PushdownRegexEval doesn't take into account shadow attributes ESQL: PushdownRegexExtract doesn't take into account shadow attributes Feb 15, 2024
@alex-spies
Copy link
Contributor

PushDownEval has the exact same problem.

I have reproducers for both that result in NPEs if run as CSV tests:

//eval
FROM employees
| SORT emp_no ASC
| eval emp_no = -emp_no
| LIMIT 3
| KEEP emp_no

//dissect
FROM employees
| MV_EXPAND job_positions
| SORT emp_no ASC
| DISSECT job_positions "%{emp_no} %{} %{}"
| LIMIT 1
;

elasticsearchmachine pushed a commit that referenced this issue Feb 26, 2024
Fix #105434

Fixes accidental shadowing when pushing down `GROK`/`DISSECT`, `EVAL` or
`ENRICH` past a `SORT`.

Example for how this works:

```
...
| SORT x
| EVAL x = y
...

pushing this down just like that would be incorrect as x is used in the SORT, so we turn this essentially into

...
| EVAL $$x = x
| EVAL x = y
| SORT $$x
| DROP $$x
...
```

The same logic is applied to `GROK`/`DISSECT` and `ENRICH`.

This allows to re-enable the dependency checker (after fixing a small
bug in it when handling `ENRICH`).
alex-spies added a commit to alex-spies/elasticsearch that referenced this issue Feb 26, 2024
Fix elastic#105434

Fixes accidental shadowing when pushing down `GROK`/`DISSECT`, `EVAL` or
`ENRICH` past a `SORT`.

Example for how this works:

```
...
| SORT x
| EVAL x = y
...

pushing this down just like that would be incorrect as x is used in the SORT, so we turn this essentially into

...
| EVAL $$x = x
| EVAL x = y
| SORT $$x
| DROP $$x
...
```

The same logic is applied to `GROK`/`DISSECT` and `ENRICH`.

This allows to re-enable the dependency checker (after fixing a small
bug in it when handling `ENRICH`).
elasticsearchmachine pushed a commit that referenced this issue Feb 26, 2024
#105808)

* ESQL: Fix wrong attribute shadowing in pushdown rules (#105650)

Fix #105434

Fixes accidental shadowing when pushing down `GROK`/`DISSECT`, `EVAL` or
`ENRICH` past a `SORT`.

Example for how this works:

```
...
| SORT x
| EVAL x = y
...

pushing this down just like that would be incorrect as x is used in the SORT, so we turn this essentially into

...
| EVAL $$x = x
| EVAL x = y
| SORT $$x
| DROP $$x
...
```

The same logic is applied to `GROK`/`DISSECT` and `ENRICH`.

This allows to re-enable the dependency checker (after fixing a small
bug in it when handling `ENRICH`).

* Make OptimizerRules compile again
fang-xing-esql pushed a commit to fang-xing-esql/Elasticsearch that referenced this issue Mar 8, 2024
Fix elastic#105434

Fixes accidental shadowing when pushing down `GROK`/`DISSECT`, `EVAL` or
`ENRICH` past a `SORT`.

Example for how this works:

```
...
| SORT x
| EVAL x = y
...

pushing this down just like that would be incorrect as x is used in the SORT, so we turn this essentially into

...
| EVAL $$x = x
| EVAL x = y
| SORT $$x
| DROP $$x
...
```

The same logic is applied to `GROK`/`DISSECT` and `ENRICH`.

This allows to re-enable the dependency checker (after fixing a small
bug in it when handling `ENRICH`).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.14.0
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants