New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-38412][SS] Fix the swapped sequence of from and to in StateSchemaCompatibilityChecker #35731
Conversation
…emaCompatibilityChecker
} | ||
|
||
test("changing the nullability of nullable to non-nullable in value should fail") { | ||
test("storing non-nullable to nullable in value schema should be allowed") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can you word this the same as the test above, this felt a bit harder to understand
"storing non-nullable column into nullable column in value should be allowed"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. explicitly mentioning column
seems to be better to understand. will update the test name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I already did that and I just missed this one. Will fix. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Thanks all for reviewing! I'll merge this once test passes, given the last commit only refines the test name. |
Test passed. Merging to master/3.2/3.1. |
…emaCompatibilityChecker ### What changes were proposed in this pull request? This PR fixes the StateSchemaCompatibilityChecker which mistakenly swapped `from` (should be provided schema) and `to` (should be existing schema). ### Why are the changes needed? The bug mistakenly allows the case where it should not be allowed, and disallows the case where it should be allowed. That allows nullable column to be stored into non-nullable column, which should be prohibited. This is less likely making runtime problem since state schema is conceptual one and row can be stored even not respecting the state schema. The opposite case is worse, that disallows non-nullable column to be stored into nullable column, which should be allowed. Spark fails the query for this case. ### Does this PR introduce _any_ user-facing change? Yes, after the fix, storing non-nullable column into nullable column for state will be allowed, which should have been allowed. ### How was this patch tested? Modified UTs. Closes #35731 from HeartSaVioR/SPARK-38412. Authored-by: Jungtaek Lim <kabhwan.opensource@gmail.com> Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com> (cherry picked from commit 43c7824) Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
…emaCompatibilityChecker ### What changes were proposed in this pull request? This PR fixes the StateSchemaCompatibilityChecker which mistakenly swapped `from` (should be provided schema) and `to` (should be existing schema). ### Why are the changes needed? The bug mistakenly allows the case where it should not be allowed, and disallows the case where it should be allowed. That allows nullable column to be stored into non-nullable column, which should be prohibited. This is less likely making runtime problem since state schema is conceptual one and row can be stored even not respecting the state schema. The opposite case is worse, that disallows non-nullable column to be stored into nullable column, which should be allowed. Spark fails the query for this case. ### Does this PR introduce _any_ user-facing change? Yes, after the fix, storing non-nullable column into nullable column for state will be allowed, which should have been allowed. ### How was this patch tested? Modified UTs. Closes #35731 from HeartSaVioR/SPARK-38412. Authored-by: Jungtaek Lim <kabhwan.opensource@gmail.com> Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com> (cherry picked from commit 43c7824) Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
Merged into master/3.2/3.1. |
…emaCompatibilityChecker ### What changes were proposed in this pull request? This PR fixes the StateSchemaCompatibilityChecker which mistakenly swapped `from` (should be provided schema) and `to` (should be existing schema). ### Why are the changes needed? The bug mistakenly allows the case where it should not be allowed, and disallows the case where it should be allowed. That allows nullable column to be stored into non-nullable column, which should be prohibited. This is less likely making runtime problem since state schema is conceptual one and row can be stored even not respecting the state schema. The opposite case is worse, that disallows non-nullable column to be stored into nullable column, which should be allowed. Spark fails the query for this case. ### Does this PR introduce _any_ user-facing change? Yes, after the fix, storing non-nullable column into nullable column for state will be allowed, which should have been allowed. ### How was this patch tested? Modified UTs. Closes apache#35731 from HeartSaVioR/SPARK-38412. Authored-by: Jungtaek Lim <kabhwan.opensource@gmail.com> Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com> (cherry picked from commit 43c7824) Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com> (cherry picked from commit f1efc95) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
This PR fixes the StateSchemaCompatibilityChecker which mistakenly swapped
from
(should be provided schema) andto
(should be existing schema).Why are the changes needed?
The bug mistakenly allows the case where it should not be allowed, and disallows the case where it should be allowed.
That allows nullable column to be stored into non-nullable column, which should be prohibited. This is less likely making runtime problem since state schema is conceptual one and row can be stored even not respecting the state schema.
The opposite case is worse, that disallows non-nullable column to be stored into nullable column, which should be allowed. Spark fails the query for this case.
Does this PR introduce any user-facing change?
Yes, after the fix, storing non-nullable column into nullable column for state will be allowed, which should have been allowed.
How was this patch tested?
Modified UTs.