Make Schema inference works for CREATE AS SELECT #48679

Merged 1 commit on Apr 13, 2023

ucasfl
Collaborator

@ucasfl ucasfl commented Apr 12, 2023

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Make schema inference work for CREATE AS SELECT. Closes #47599.

@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-improvement Pull request with some product improvements label Apr 12, 2023
@pufit pufit self-assigned this Apr 13, 2023
@pufit pufit merged commit a8c892f into ClickHouse:master Apr 13, 2023
138 checks passed
@pufit
Member

pufit commented Apr 13, 2023

We have some concerns about why the performance build report failed, so I will temporarily revert this PR.

@pufit
Member

pufit commented Apr 18, 2023

@ucasfl I found a query that was broken by this commit. It can be reproduced like this:

cat mock_data.tsv

UserName	Age	Tags
String	Int8	Map(String, UInt64)
user127	20	{'test': 123}
user405	43	{'test': 123}
user902	43	{'test': 123}

CREATE VIEW users AS SELECT * FROM file('mock_data.tsv', TSVWithNamesAndTypes);
CREATE TABLE users_output ENGINE File(TSV, 'output.tsv')
AS
WITH (SELECT groupUniqArrayArray(mapKeys(Tags))
      FROM users) AS unique_tags
SELECT UserName, length(unique_tags)
FROM users;

The last query fails with the following error:

Code: 47. DB::Exception: Received from localhost:9000. DB::Exception: Unknown column: Tags, there are only columns : While processing (SELECT groupUniqArrayArray(mapKeys(Tags)) FROM users) AS unique_tags. (UNKNOWN_IDENTIFIER)
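The sample file has to be tab-separated for TSVWithNamesAndTypes to parse it (a names row, then a types row, then the data rows), and tabs are easy to lose when copying from a web page. A minimal sketch for recreating the file, assuming a POSIX shell with `printf`; the file name and contents are taken from the report above, nothing else is implied about the original setup:

```shell
#!/bin/sh
# Recreate mock_data.tsv with real tab separators (\t is expanded by printf).
# Row 1: column names, row 2: column types, rows 3-5: data.
printf 'UserName\tAge\tTags\n'               >  mock_data.tsv
printf 'String\tInt8\tMap(String, UInt64)\n' >> mock_data.tsv
printf "user127\t20\t{'test': 123}\n"        >> mock_data.tsv
printf "user405\t43\t{'test': 123}\n"        >> mock_data.tsv
printf "user902\t43\t{'test': 123}\n"        >> mock_data.tsv
```

Running `cat mock_data.tsv` afterwards should show the same five rows as in the report.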

@ucasfl ucasfl deleted the scheme-infer branch July 23, 2023 09:53
@ucasfl
Collaborator Author

ucasfl commented Jul 23, 2023

@pufit This change does break the query above. However, even without the change, the same query fails if we first create the table users_output and then execute:

INSERT INTO users_output
WITH (SELECT groupUniqArrayArray(mapKeys(Tags))
      FROM users) AS unique_tags
SELECT UserName, length(unique_tags)
FROM users;

Code: 47. DB::Exception: Received from localhost:9000. DB::Exception: Unknown column: Tags, there are only columns : While processing (SELECT groupUniqArrayArray(mapKeys(Tags)) FROM users) AS unique_tags. (UNKNOWN_IDENTIFIER)

So this is a separate issue with schema inference for INSERT.

Development

Successfully merging this pull request may close these issues.

Schema inference for INSERT SELECT should also work for CREATE AS SELECT