Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Materialize view doesn't populate columns after join and migration to 24.3.2.23 #63345

Open
p-pekala opened this issue May 3, 2024 · 7 comments · May be fixed by #63519
Open

Materialize view doesn't populate columns after join and migration to 24.3.2.23 #63345

p-pekala opened this issue May 3, 2024 · 7 comments · May be fixed by #63519
Assignees
Labels
analyzer Issues and pull-requests related to new analyzer bug Confirmed user-visible misbehaviour in official release unexpected behaviour

Comments

@p-pekala
Copy link

p-pekala commented May 3, 2024

Describe what's wrong

After migration from 23.8 to 24.3 I noticed that some columns are not populated in the destination table (after materialized view processing). After investigating and narrowing down the issue, I noticed that it applies only when we use join in the materialized view and final column names are the same as in the joined table. It doesn't matter if columns from the joined table are used.

CREATE TABLE IF NOT EXISTS bug_reproduction (
missingId LowCardinality(String),
someId LowCardinality(String),
dateTime DateTime64(3, 'UTC')
) ENGINE MergeTree()
PARTITION BY toYYYYMM(dateTime)
ORDER BY (someId, dateTime);

CREATE MATERIALIZED VIEW IF NOT EXISTS bug_reproduction_mv TO bug_reproduction AS
select
st.missingId,
st.someId,
st.dateTime
from
source_table st
LEFT JOIN joined_table i
using (missingId, some_column1, some_column2, some_column3);

If I remove the left join from materialized view or change the column names in the final table (and use aliast in MV) the missingId is populated correctly. It also worked correctly in version 23.8.

Expected behavior

missingId should be populated

@p-pekala p-pekala added the potential bug To be reviewed by developers and confirmed/rejected. label May 3, 2024
@den-crane
Copy link
Contributor

try

CREATE MATERIALIZED VIEW IF NOT EXISTS bug_reproduction_mv TO bug_reproduction AS
select
st.missingId as missingId,
st.someId as someId,
st.dateTime as dateTime
...

@den-crane
Copy link
Contributor

@p-pekala
Copy link
Author

p-pekala commented May 3, 2024

@den-crane Yes the alias seems to help. I also updated your example that reproduces the problem (not sure which modification changed the behavior but I adjusted it to more like my case - with a dictionary and different types).

@den-crane
Copy link
Contributor

den-crane commented May 3, 2024

I also updated your example that reproduces the problem

You cannot update my example. fiddle generates a different url on each execution. You need to share your own URL.

@p-pekala
Copy link
Author

p-pekala commented May 3, 2024

@den-crane
Copy link
Contributor

den-crane commented May 3, 2024

related to an allow_experimental_analyzer and the join with a dictionary.

https://fiddle.clickhouse.com/76d74308-b8ba-4df6-b519-3b0de9b120b5

@Algunenano Algunenano added analyzer Issues and pull-requests related to new analyzer unexpected behaviour and removed potential bug To be reviewed by developers and confirmed/rejected. labels May 6, 2024
@novikd novikd self-assigned this May 7, 2024
@novikd novikd added the bug Confirmed user-visible misbehaviour in official release label May 7, 2024
@novikd
Copy link
Member

novikd commented May 7, 2024

In the QueryAnalyzer::tryResolveIdentifierFromJoin function resolved column is promoted to the common type of columns in the both tables.

auto left_resolved_column_clone = std::static_pointer_cast<ColumnNode>(left_resolved_column.clone());
left_resolved_column_clone->setColumnType(using_column_node_it->second->getColumnType());
resolved_identifier = std::move(left_resolved_column_clone);

Currently when this promotion takes place, node_to_projection_name hash table is not updated, so projection name is not properly calculated and the fall back is used:
result_projection_names.push_back(unresolved_identifier.getFullName());

I'm working on the fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
analyzer Issues and pull-requests related to new analyzer bug Confirmed user-visible misbehaviour in official release unexpected behaviour
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants