Allow smart "merging" of schema for multi-file schema inference #55428

Closed
seandavi opened this issue Oct 9, 2023 · 0 comments · Fixed by #55892

seandavi commented Oct 9, 2023

Use case

When performing schema inference on multiple files, ClickHouse could be more flexible by merging the per-file schemata by name ("union_by_name"), so that files with schema variants load as a single unified table. This would be similar to read_parquet('*.parquet', union_by_name=True): schema inference would simply add a column to the unified schema when it is not present in all files.

The same logic could apply to CSV, TSV, JSON Lines, etc., but my use case is Parquet, so feel free to scope this as you see fit.
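
To make the request concrete, here is a minimal sketch of what merging schemas by name could look like. It is plain Python, not ClickHouse code. The helper names `union_by_name` and `align_row`, the example type names, and the choice to raise on conflicting types are all assumptions made for illustration; a real implementation might widen or coerce types instead.

```python
from typing import Any, Dict, List

# Column name -> type name, in the order the columns appear in one file.
# (Hypothetical representation, used only for this sketch.)
Schema = Dict[str, str]


def union_by_name(schemas: List[Schema]) -> Schema:
    """Merge per-file schemas by column name, keeping first-seen order."""
    merged: Schema = {}
    for schema in schemas:
        for name, dtype in schema.items():
            if name not in merged:
                # New column: append it to the unified schema.
                merged[name] = dtype
            elif merged[name] != dtype:
                # Assumption: conflicting types are an error in this sketch.
                raise TypeError(
                    f"column {name!r} has conflicting types: "
                    f"{merged[name]} vs {dtype}"
                )
    return merged


def align_row(row: Dict[str, Any], merged: Schema) -> Dict[str, Any]:
    """Project a row onto the unified schema; absent columns become None."""
    return {name: row.get(name) for name in merged}


# Two files whose schemas overlap only on "id".
schemas = [
    {"id": "Int64", "name": "String"},
    {"id": "Int64", "score": "Float64"},
]
merged = union_by_name(schemas)

# A row from the second file, padded out to the unified schema.
row = align_row({"id": 2, "score": 0.5}, merged)
```

With this approach every file loads into the same set of columns, and a column missing from a given file is filled with a null for that file's rows.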

2 participants