[SPARK-47462][SQL] Align mappings of other unsigned numeric types with TINYINT in MySQLDialect #45588
yaooqinn wants to merge 3 commits into apache:master
dongjoon-hyun
left a comment
Hi, @yaooqinn. This looks correct and is aligned with the previous one.
BTW, do you think there is a chance of a regression (or breaking change) due to the table schema change? Although I don't remember it exactly, there was an incident before caused by a table schema change like this. I'm worried about that kind of situation.

The regression you mentioned was introduced in SPARK-43049 and reverted in SPARK-46478. SPARK-43049 modified the … In this PR, the changes happen in the read path. The table schema change on the Spark side can happen when users perform CTAS against MySQL, i.e. … It's important to keep in mind that the results of arithmetic operations can differ based on the data type that is returned. Since SPARK-45561 already had such an impact for TINYINT in Spark 3.5.1, it seems okay to extend it to the other types.
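
The point about arithmetic results depending on the returned type can be made concrete with a minimal sketch. It assumes fixed-width two's-complement wraparound on overflow, which is an illustration only: how Spark actually handles overflow depends on its configuration, and `wrap_signed` is a hypothetical helper, not a Spark API.

```python
def wrap_signed(value: int, bits: int) -> int:
    """Interpret `value` modulo 2**bits as a signed two's-complement integer.

    Hypothetical helper for illustration; not part of Spark.
    """
    mask = (1 << bits) - 1
    value &= mask
    # Values with the top bit set represent negative numbers.
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


a, b = 100, 100

# The same sum, stored in an 8-bit type versus a 16-bit type.
as_byte = wrap_signed(a + b, 8)
as_short = wrap_signed(a + b, 16)

print(as_byte, as_short)
```

Under that assumption the 8-bit result wraps around while the 16-bit result does not, so a query that computes on the column can return different values after the mapped type changes.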

No, it was a slightly different issue. IIRC, a user read a table and tried to write it back (with overwrite?), and it broke their existing database schema. Their whole backend systems were screwed, @yaooqinn. Maybe we had better add a legacy configuration for this kind of potential schema change.

Thank you, @dongjoon-hyun. For the case where users read and write in a roundtrip:
I'm not sure the existing behavior works well; it seems like a bug to me, and we don't have test cases for it.

Is this correct?
According to

It's incorrect; it's like we read a smallint and write an int back.

So,

dongjoon-hyun
left a comment
+1, LGTM (Given that #45588 (comment))
Could you add a migration guide after all these PRs, @yaooqinn?

cc @cloud-fan and @HyukjinKwon

Thank you, @yaooqinn and @cloud-fan.

Thank you, @dongjoon-hyun and @cloud-fan. I will send follow-ups for the migration guides.

What changes were proposed in this pull request?
Align the mappings of the other unsigned numeric types with TINYINT in MySQLDialect. TINYINT maps to ByteType and TINYINT UNSIGNED maps to ShortType.
In this PR, we
Other unsigned/signed types remain unchanged; for those, this PR only improves the test coverage.
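
The TINYINT pair above is consistent with picking the narrowest signed type that can hold every value of the column. A minimal sketch of that rule follows, under the assumption that it is the guiding idea; the description here does not confirm that the PR applies it to every other type. `SIGNED_TYPES` and `narrowest_signed_type` are hypothetical names for illustration, not part of MySQLDialect.

```python
# Signed integer widths (in bits) paired with Spark type names.
# Illustrative table only; not taken from MySQLDialect.
SIGNED_TYPES = [
    (8, "ByteType"),
    (16, "ShortType"),
    (32, "IntegerType"),
    (64, "LongType"),
]


def narrowest_signed_type(bits: int, unsigned: bool) -> str:
    """Return the narrowest signed type that holds every value of the column.

    `bits` is the storage width of the source column. Hypothetical helper.
    """
    # Largest value the source column can store.
    max_value = (1 << bits) - 1 if unsigned else (1 << (bits - 1)) - 1
    for width, name in SIGNED_TYPES:
        if max_value <= (1 << (width - 1)) - 1:
            return name
    raise ValueError("no signed integer type in the table is wide enough")


print(narrowest_signed_type(8, unsigned=False))
print(narrowest_signed_type(8, unsigned=True))
```

An 8-bit unsigned column can store values up to 255, which exceeds the signed 8-bit maximum of 127, so the rule moves it up to the 16-bit type, matching the TINYINT UNSIGNED to ShortType mapping stated above.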
Why are the changes needed?
Consistency and efficiency while reading MySQL numeric values
Does this PR introduce any user-facing change?
Yes, the mappings described in the first section.
How was this patch tested?
new tests
Was this patch authored or co-authored using generative AI tooling?
no