Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

the map() returns wrong value #9002

Closed
1 task done
l1t1 opened this issue Sep 20, 2023 · 7 comments · Fixed by #9046
Closed
1 task done

the map() returns wrong value #9002

l1t1 opened this issue Sep 20, 2023 · 7 comments · Fixed by #9046
Assignees

Comments

@l1t1
Copy link

l1t1 commented Sep 20, 2023

What happens?

map(list1,list2)[item of list1] return empty or wrong value of list2

To Reproduce

see #8986

OS:

windows7 x64

DuckDB Version:

0.8.1

DuckDB Client:

CLI

Full Name:

lutao

Affiliation:

study

Have you tried this on the latest main branch?

I have tested with a main build

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • Yes, I have
@l1t1

This comment was marked as abuse.

@l1t1

This comment was marked as abuse.

@l1t1

This comment was marked as abuse.

@ghost
Copy link

ghost commented Sep 21, 2023

here is a minimal example. the first two selects are fine, the last one should return zero rows but with extra r[1] > 5 we get a mismatched row.

create table data as from (
  values
  ([0], [3]),
  ([0], [9])
) as t(l, r);
-- this works.
select r[1], map(l, r)[l[1]] as v from data where v[1] = r[1]; 
-- this works
select r[1], map(l, r)[l[1]] as v from data where v[1] != r[1];
-- this fails
select r[1], map(l, r)[l[1]] as v from data where v[1] != r[1] and r[1] > 5;
┌───────┬─────────┐
│ r[1]  │    v    │
│ int32 │ int32[] │
├───────┼─────────┤
│     3 │ [3]     │
│     9 │ [9]     │
└───────┴─────────┘
┌───────┬─────────┐
│ r[1]  │    v    │
│ int32 │ int32[] │
├─────────────────┤
│     0 rows      │
└─────────────────┘
┌───────┬─────────┐
│ r[1]  │    v    │
│ int32 │ int32[] │
├───────┼─────────┤
│     9 │ [3]     │
└───────┴─────────┘

@Tishj Tishj self-assigned this Sep 21, 2023
@Tishj
Copy link
Contributor

Tishj commented Sep 21, 2023

As always with issues like this:

pragma disable_optimizer

Makes it work, so I'm going to have to have a look at the optimizers that affect this


Hmm it's the reordering that causes it, but that means that
select r[1], map(l, r)[l[1]] as v from data where r[1] > 5 and v[1] != r[1] will fail even when optimizer is disabled
That shouldn't happen 😅

@ghost
Copy link

ghost commented Sep 22, 2023

As always with issues like this:

pragma disable_optimizer

Makes it work, so I'm going to have to have a look at the optimizers that affect this

Hmm it's the reordering that causes it, but that means that select r[1], map(l, r)[l[1]] as v from data where r[1] > 5 and v[1] != r[1] will fail even when optimizer is disabled That shouldn't happen 😅

i should've had a case without the != clause.
select r[1], map(l, r)[l[1]] as v from data where r[1] > 5 is sufficient to cause the bug. or
select r[1], map(l, r)[l[1]] as v from data where r[1] != 3.

the output in the map column, regardless of the where clause, is always what would appear in the first n rows without the where clause.

eg:

create table data as from (
  values
  ([1], [3]),
  ([2], [9]),
  ([3], [15]),
  ([4], [21]),
) as t(l, r);
-- this works
select l[1], r[1], map(l, r) from data;
-- map output ignores where filter.
select l[1], r[1], map(l, r) from data where r[1] != 3;
-- map output ignores where filter.
select l[1], r[1], map(l, r) from data where r[1] != 9;
-- map output ignores where filter.
select l[1], r[1], map(l, r) from data where r[1] != 15;
┌───────┬───────┬───────────────────────┐
│ l[1]  │ r[1]  │       map(l, r)       │
│ int32 │ int32 │ map(integer, integer) │
├───────┼───────┼───────────────────────┤
│     1 │     3 │ {1=3}                 │
│     2 │     9 │ {2=9}                 │
│     3 │    15 │ {3=15}                │
│     4 │    21 │ {4=21}                │
└───────┴───────┴───────────────────────┘
┌───────┬───────┬───────────────────────┐
│ l[1]  │ r[1]  │       map(l, r)       │
│ int32 │ int32 │ map(integer, integer) │
├───────┼───────┼───────────────────────┤
│     2 │     9 │ {1=3}                 │
│     3 │    15 │ {2=9}                 │
│     4 │    21 │ {3=15}                │
└───────┴───────┴───────────────────────┘
┌───────┬───────┬───────────────────────┐
│ l[1]  │ r[1]  │       map(l, r)       │
│ int32 │ int32 │ map(integer, integer) │
├───────┼───────┼───────────────────────┤
│     1 │     3 │ {1=3}                 │
│     3 │    15 │ {2=9}                 │
│     4 │    21 │ {3=15}                │
└───────┴───────┴───────────────────────┘
┌───────┬───────┬───────────────────────┐
│ l[1]  │ r[1]  │       map(l, r)       │
│ int32 │ int32 │ map(integer, integer) │
├───────┼───────┼───────────────────────┤
│     1 │     3 │ {1=3}                 │
│     2 │     9 │ {2=9}                 │
│     4 │    21 │ {3=15}                │
└───────┴───────┴───────────────────────┘

@Tishj
Copy link
Contributor

Tishj commented Sep 22, 2023

Ah thanks, that's a good example
To explain:
When we create a filter, we also create a dictionary vector to mask out the results that are filtered out
map did not respect this and that's why the first tuple that shows up is always the first tuple from the total result.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants