Commit cd95799
authored
[Spark] Drop Type widening feature: read Parquet footers to collect files to rewrite (#3155)
## What changes were proposed in this pull request?
The initial approach to identify files that contain a type that differs
from the table schema and that must be rewritten before dropping the
type widening table feature is convoluted and turns out to be more
brittle than intended.
This change switches instead to directly reading the file schema from
the Parquet footer and rewriting all files that have a mismatching type.
### Additional Context
Files are identified using their default row commit version (a part of
the row tracking feature) and matched against type changes previously
applied to the table and recorded in the table metadata: any file
written before the latest type change should use a different type and
must be rewritten.
This requires multiple pieces of information to be accurately tracked:
- Default row commit versions must be correctly assigned to all files.
E.p. files that are copied over without modification must never be
assigned a new default row commit version. On the other hand, default
row commit versions are preserved across CLONE but these versions don't
match anything in the new cloned table.
- Type change history must be reliably recorded and preserved across
schema changes, e.g. column mapping.
Any bug will likely lead to files not being correctly rewritten before
removing the table feature, potentially leaving the table in an
unreadable state.
## How was this patch tested?
Tests added in previous PR to cover CLONE and RESTORE:
#3053
Tests added and updated in this PR to cover rewriting files with
different column types when removing the table feature.1 parent 4b102d3 commit cd95799
File tree
6 files changed
+174
-45
lines changed- spark/src
- main/scala/org/apache/spark/sql/delta
- commands
- test/scala/org/apache/spark/sql/delta
6 files changed
+174
-45
lines changedLines changed: 7 additions & 5 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
23 | 23 | | |
24 | 24 | | |
25 | 25 | | |
| 26 | + | |
26 | 27 | | |
27 | 28 | | |
28 | | - | |
29 | 29 | | |
30 | 30 | | |
31 | 31 | | |
| |||
309 | 309 | | |
310 | 310 | | |
311 | 311 | | |
312 | | - | |
313 | | - | |
| 312 | + | |
| 313 | + | |
| 314 | + | |
314 | 315 | | |
315 | 316 | | |
316 | 317 | | |
| |||
323 | 324 | | |
324 | 325 | | |
325 | 326 | | |
326 | | - | |
327 | | - | |
| 327 | + | |
| 328 | + | |
| 329 | + | |
328 | 330 | | |
329 | 331 | | |
330 | 332 | | |
| |||
Lines changed: 0 additions & 29 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
96 | 96 | | |
97 | 97 | | |
98 | 98 | | |
99 | | - | |
100 | | - | |
101 | | - | |
102 | | - | |
103 | | - | |
104 | | - | |
105 | | - | |
106 | | - | |
107 | | - | |
108 | | - | |
109 | | - | |
110 | | - | |
111 | | - | |
112 | | - | |
113 | | - | |
114 | | - | |
115 | | - | |
116 | | - | |
117 | | - | |
118 | | - | |
119 | | - | |
120 | | - | |
121 | | - | |
122 | | - | |
123 | | - | |
124 | | - | |
125 | | - | |
126 | | - | |
127 | | - | |
128 | 99 | | |
Lines changed: 14 additions & 7 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
16 | 16 | | |
17 | 17 | | |
18 | 18 | | |
19 | | - | |
| 19 | + | |
20 | 20 | | |
21 | 21 | | |
22 | 22 | | |
| |||
97 | 97 | | |
98 | 98 | | |
99 | 99 | | |
100 | | - | |
| 100 | + | |
101 | 101 | | |
102 | 102 | | |
103 | 103 | | |
104 | 104 | | |
105 | 105 | | |
106 | 106 | | |
107 | | - | |
| 107 | + | |
| 108 | + | |
108 | 109 | | |
109 | 110 | | |
110 | 111 | | |
| |||
115 | 116 | | |
116 | 117 | | |
117 | 118 | | |
118 | | - | |
| 119 | + | |
| 120 | + | |
119 | 121 | | |
120 | 122 | | |
121 | 123 | | |
| |||
129 | 131 | | |
130 | 132 | | |
131 | 133 | | |
132 | | - | |
133 | | - | |
134 | | - | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
135 | 142 | | |
Lines changed: 4 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
265 | 265 | | |
266 | 266 | | |
267 | 267 | | |
268 | | - | |
269 | | - | |
| 268 | + | |
| 269 | + | |
| 270 | + | |
| 271 | + | |
270 | 272 | | |
271 | 273 | | |
272 | 274 | | |
| |||
Lines changed: 111 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
Lines changed: 38 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
16 | 16 | | |
17 | 17 | | |
18 | 18 | | |
| 19 | + | |
19 | 20 | | |
20 | 21 | | |
21 | 22 | | |
| |||
1017 | 1018 | | |
1018 | 1019 | | |
1019 | 1020 | | |
1020 | | - | |
| 1021 | + | |
1021 | 1022 | | |
1022 | 1023 | | |
1023 | 1024 | | |
| |||
1351 | 1352 | | |
1352 | 1353 | | |
1353 | 1354 | | |
| 1355 | + | |
| 1356 | + | |
| 1357 | + | |
| 1358 | + | |
| 1359 | + | |
| 1360 | + | |
| 1361 | + | |
| 1362 | + | |
| 1363 | + | |
| 1364 | + | |
| 1365 | + | |
| 1366 | + | |
| 1367 | + | |
| 1368 | + | |
| 1369 | + | |
1354 | 1370 | | |
1355 | 1371 | | |
1356 | 1372 | | |
| |||
1453 | 1469 | | |
1454 | 1470 | | |
1455 | 1471 | | |
1456 | | - | |
| 1472 | + | |
1457 | 1473 | | |
1458 | 1474 | | |
1459 | 1475 | | |
| |||
1498 | 1514 | | |
1499 | 1515 | | |
1500 | 1516 | | |
| 1517 | + | |
| 1518 | + | |
| 1519 | + | |
| 1520 | + | |
| 1521 | + | |
| 1522 | + | |
| 1523 | + | |
| 1524 | + | |
| 1525 | + | |
| 1526 | + | |
| 1527 | + | |
| 1528 | + | |
| 1529 | + | |
| 1530 | + | |
| 1531 | + | |
| 1532 | + | |
| 1533 | + | |
| 1534 | + | |
| 1535 | + | |
| 1536 | + | |
1501 | 1537 | | |
1502 | 1538 | | |
1503 | 1539 | | |
| |||
0 commit comments