AnalysisException after using mergeSchema option to add a new column containing only null
#944
Comments
Thanks for reporting this!
Hey @NicolasGuary, I wanted to provide an update. The issue here is that the addition of a column containing only null values gives it the void type (Spark's NullType), which Delta cannot correctly merge into the table schema. You can avoid this error by explicitly specifying a type for null-only columns. 989078f throws an error for any void columns in schema updates.
Thank you for the explanation @allisonport-db! Let me know how that sounds to you and whether it makes sense for Delta to implement such a rule!
Ignoring the column could cause later operations to fail, since the user expects that column to now exist in the table's schema, and there are a lot of corner cases. I think for now throwing an error is the solution that is most explicitly clear to the user. Closing this issue for now, since this specific bug should no longer be possible; void column support is not in good shape and there are a lot of edge cases to deal with.
I'm facing this error, please tell me how to tackle it:

Error in SQL statement: AnalysisException: The schema of your Delta table has changed in an incompatible way since your DataFrame or …

I introduced these two columns, modified_timestamp and created_timestamp, after creating the dataframe.
Tested this on DBR 9.1 LTS and DBR 10.3.

Hello, I am currently facing an issue that, for me, makes the `mergeSchema` option unusable.

My goal is to append new columns to an existing table, but sometimes a new column arrives with only null values at first and contains non-null values afterwards. That's what I've tried to reproduce, and I am getting the error above when trying to merge a new record with the non-null value.
Here's what you can do to reproduce the bug:

1. Create a Delta table.
2. Append a record with a new column `X` that has only a `null` value. Note that at this point, if you `display(spark.read.format("delta").load(path))`, column `X` won't even exist on this table.
3. Append a record where `X` contains a non-null value.

After running command 3, you should get the error above.
Thank you for your time and consideration, have a great day!