[ch140705] Sanitize column names on overwrite import #16208

Merged

Shylpx
Collaborator

@Shylpx Shylpx commented Mar 10, 2021

Resources

Context

This corner case appears when a table is imported with the overwrite collision strategy and its column names require sanitization. For example, when a column name starts with a number (1234_column), an underscore is prepended (_1234_column). The first import succeeds, but the second one compares the schema of the existing table with that of the incoming one. Since sanitization runs at table creation (almost the last step), the code compares 1234_column with _1234_column, which are indeed different, so an IncompatibleSchemas exception is raised.

The idea is to sanitize the column names of the original schemas before the comparison, so that the existing table's schema is compared with the real names the new table would have.
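
To illustrate the mismatch and the fix, here is a minimal Python sketch. It is not the project's code: the function names `sanitize_column_name` and `schemas_match` are hypothetical, and the sanitization rule is reduced to the single case described above (prepend an underscore when the name starts with a digit).

```python
import re


def sanitize_column_name(name: str) -> str:
    """Hypothetical rule: prepend an underscore when the name starts with a digit."""
    return f"_{name}" if re.match(r"\d", name) else name


def schemas_match(existing_columns, incoming_columns, sanitize=True):
    """Compare two lists of column names, optionally sanitizing the incoming ones first."""
    if sanitize:
        incoming_columns = [sanitize_column_name(c) for c in incoming_columns]
    return sorted(existing_columns) == sorted(incoming_columns)


# Columns of the table created by the first import (already sanitized).
existing = ["name", "_1234_column"]
# Raw column names seen by the second import, before table creation.
incoming = ["name", "1234_column"]

# Comparing raw names reproduces the mismatch behind IncompatibleSchemas.
raw_match = schemas_match(existing, incoming, sanitize=False)
# Sanitizing before the comparison makes the two schemas agree.
sanitized_match = schemas_match(existing, incoming, sanitize=True)
```

With these inputs, the raw comparison fails and the sanitized comparison succeeds, which is the behaviour the overwrite strategy needs.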

Changes

  • Sanitize column names on comparison when the collision strategy is overwrite.
  • Add column sanitization to the result table on overwrite_register.

@Shylpx Shylpx self-assigned this Mar 10, 2021
@Shylpx Shylpx requested a review from amiedes March 10, 2021 11:14
Contributor

@amiedes amiedes left a comment

👍

@Shylpx Shylpx merged commit fb6490a into master Mar 11, 2021