-
Notifications
You must be signed in to change notification settings - Fork 506
ORC-642: update PatchedBase doc with patch ceiling in spec #1868
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
site/specification/ORCv1.md
Outdated
| (PGW + PW) | ceil(PGW + PW) | ||
| :------------ | :------------- | ||
| 1 <= x <= 24 | x | ||
| 25 | 26 | ||
| 26 | 26 | ||
| 27 | 28 | ||
| 28 | 28 | ||
| 29 | 30 | ||
| 30 | 30 | ||
| 31 | 32 | ||
| 32 | 32 | ||
| 33 <= x <= 40 | 40 | ||
| 41 <= x <= 48 | 48 | ||
| 49 <= x <= 56 | 56 | ||
| 57 <= x <= 64 | 64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From
Lines 29 to 33 in 9b79de9
| // Map bit length i to closest fixed bit width that can contain i bits. | |
| const uint8_t ClosestFixedBitsMap[65] = { | |
| 1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, | |
| 22, 23, 24, 26, 26, 28, 28, 30, 30, 32, 32, 40, 40, 40, 40, 40, 40, 40, 40, 48, 48, 48, | |
| 48, 48, 48, 48, 48, 56, 56, 56, 56, 56, 56, 56, 56, 64, 64, 64, 64, 64, 64, 64, 64}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it better to follow the java code to add 0 as well?
orc/java/core/src/java/org/apache/orc/impl/SerializationUtils.java
Lines 362 to 391 in 513922a
| public int getClosestFixedBits(int n) { | |
| if (n == 0) { | |
| return 1; | |
| } | |
| if (n <= 24) { | |
| return n; | |
| } | |
| if (n <= 26) { | |
| return 26; | |
| } | |
| if (n <= 28) { | |
| return 28; | |
| } | |
| if (n <= 30) { | |
| return 30; | |
| } | |
| if (n <= 32) { | |
| return 32; | |
| } | |
| if (n <= 40) { | |
| return 40; | |
| } | |
| if (n <= 48) { | |
| return 48; | |
| } | |
| if (n <= 56) { | |
| return 56; | |
| } | |
| return 64; | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it makes sense to have 0 as the input is PGW + PW, and the spec states that PGW is 1 to 8 bits and PW is 1 to 64 bits, so it can never be 0 (or 1, for that matter, I guess 🤔 )
site/specification/ORCv1.md
Outdated
| 64. (PGW + PW) is padded to the nearest fixed bit size according to the | ||
| below table before being encoded in the patch list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to
Line 332 in 9b79de9
| uint32_t cfb = getClosestFixedBits(patchBitSize + pgw); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please rename ceil to closestFixedBits or cfb? ceil does not seems to have its common meaning here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing, done
wgtmac
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix! Could you please also add them to ORCv2.md as well?
site/specification/ORCv1.md
Outdated
| 64. (PGW + PW) is padded to the nearest fixed bit size according to the | ||
| below table before being encoded in the patch list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please rename ceil to closestFixedBits or cfb? ceil does not seems to have its common meaning here.
site/specification/ORCv1.md
Outdated
| (PGW + PW) | ceil(PGW + PW) | ||
| :------------ | :------------- | ||
| 1 <= x <= 24 | x | ||
| 25 | 26 | ||
| 26 | 26 | ||
| 27 | 28 | ||
| 28 | 28 | ||
| 29 | 30 | ||
| 30 | 30 | ||
| 31 | 32 | ||
| 32 | 32 | ||
| 33 <= x <= 40 | 40 | ||
| 41 <= x <= 48 | 48 | ||
| 49 <= x <= 56 | 56 | ||
| 57 <= x <= 64 | 64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it better to follow the java code to add 0 as well?
orc/java/core/src/java/org/apache/orc/impl/SerializationUtils.java
Lines 362 to 391 in 513922a
| public int getClosestFixedBits(int n) { | |
| if (n == 0) { | |
| return 1; | |
| } | |
| if (n <= 24) { | |
| return n; | |
| } | |
| if (n <= 26) { | |
| return 26; | |
| } | |
| if (n <= 28) { | |
| return 28; | |
| } | |
| if (n <= 30) { | |
| return 30; | |
| } | |
| if (n <= 32) { | |
| return 32; | |
| } | |
| if (n <= 40) { | |
| return 40; | |
| } | |
| if (n <= 48) { | |
| return 48; | |
| } | |
| if (n <= 56) { | |
| return 56; | |
| } | |
| return 64; | |
| } |
wgtmac
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM +1
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM.
|
Thanks @Jefffrey @wgtmac @dongjoon-hyun . Merged to main. |
What changes were proposed in this pull request?
Update PatchedBase specification doc to include details about the behaviour of padding the patch gap + patch width bits to nearest fixed btis.
Why are the changes needed?
Ensure spec is accurate to implementation details
How was this patch tested?
Was this patch authored or co-authored using generative AI tooling?
No