Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ruby: Fix object cache lookups on 32-bit platforms #13494

Closed

Conversation

stanhu
Copy link
Contributor

@stanhu stanhu commented Aug 9, 2023

#13204 refactored the Ruby object cache to use a key of LL2NUM(key_val) instead of LL2NUM(key_val >> 2). On 32-bit systems, LL2NUM(key_val) returns inconsistent results because a large value has to be stored as a Bignum on the heap. This causes cache lookups to fail.

This commit restores the previous behavior of using ObjectCache_GetKey, which discards the lower 2 bits, which are zero. This enables a key to be stored as a Fixnum on both 32 and 64-bit platforms.

As https://patshaughnessy.net/2014/1/9/how-big-is-a-bignum describes, a Fixnum uses:

  • 1 bit for the FIXNUM_FLAG.
  • 1 bit for the sign flag.

Therefore the largest possible Fixnum value on a 64-bit value is 4611686018427387903 (2^62 - 1). On a 32-bit system, the largest value is 1073741823 (2^30 - 1).

For example, a possible VALUE pointer address on a 32-bit system:

0xff5b4af8 => 4284173048

Dropping the lower 2 bits makes up for the loss of range to these flags. In the example above, we see that shifting by 2 turns the value into a 30-bit number, which can be represented as a Fixnum:

(0xff5b4af8 >> 2) => 1071043262

This bug can also manifest on a 64-bit system if the upper bits are 0xff.

Closes #13481

@stanhu stanhu requested a review from a team as a code owner August 9, 2023 20:37
@stanhu stanhu requested review from JasonLunn and removed request for a team August 9, 2023 20:37
@fowles fowles requested a review from esrauchg August 9, 2023 20:41
@JasonLunn JasonLunn added the ruby label Aug 9, 2023
VALUE key_val = (VALUE)key;
PBRUBY_ASSERT((key_val & 3) == 0);
return rb_funcall(weak_obj_cache, item_try_add, 2, LL2NUM(key_val), val);
// Avoid overflow on 32-bit systems by discarding the bottom zeros
Copy link
Contributor Author

@stanhu stanhu Aug 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the issue isn't overflow, but in https://github.com/ruby/ruby/blob/abd15ac775d41e6485f728fe0fad4cddf138d3ec/include/ruby/internal/arithmetic/long_long.h#L75-L86 it appears that if the value can fit as a Fixnum it will be stored that way. I suspect that on a 32-bit system, the value doesn't fit, so it has to be allocated on a heap via Bignum.

Example values:

0xff5b4af8 => 4284173048
(0xff5b4af8 >> 2) => 1071043262

Every time we get a new key on a 32-bit system, LL2NUM(key_val) previously allocated a new Bignum, which has a different value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an unit test or an assert so that we can't regress this in the future? Most code paths aren't picky about a Fixnum vs a Bignum, but this one does.

Copy link
Contributor Author

@stanhu stanhu Aug 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've confirmed that this is indeed a Bignum vs. Fixnum issue. I've added an assertion, though the assertion will only be triggered on a 32-bit system with the NDEBUG compile flag omitted. The entire test suite fails with v3.24.0 at the moment even without this assertion.

@stanhu stanhu force-pushed the sh-fix-ruby-protobuf-32bit branch from 84070be to 21b1bb6 Compare August 9, 2023 21:43
@JasonLunn JasonLunn added the 🅰️ safe for tests Mark a commit as safe to run presubmits over label Aug 9, 2023
@github-actions github-actions bot removed the 🅰️ safe for tests Mark a commit as safe to run presubmits over label Aug 9, 2023
@esrauchg
Copy link

It would be great if we can get this PR merged this week to get it released as soon as possible; tests would be appreciated but since this is a partial rollback of a recent change if we can only get this in without tests this week that is likely still preferable.

@stanhu
Copy link
Contributor Author

stanhu commented Aug 10, 2023

@esrauchg The best way to test this would be to build and run the Ruby tests on a 32-bit image, such as i386/ubuntu:latest. Unfortunately, I don't have the bandwidth to do this.

UPDATE: Never mind, done.

protocolbuffers#13204 refactored the
Ruby object cache to use a key of `LL2NUM(key_val)` instead of
`LL2NUM(key_val >> 2)`. On 32-bit systems, `LL2NUM(key_val)` returns
inconsistent results because a large value has to be stored as a
Bignum on the heap. This causes cache lookups to fail.

This commit restores the previous behavior of using
`ObjectCache_GetKey`, which discards the lower 2 bits, which are
zero. This enables a key to be stored as a Fixnum on both 32 and
64-bit platforms.

As https://patshaughnessy.net/2014/1/9/how-big-is-a-bignum describes,
a Fixnum uses:

* 1 bit for the `FIXNUM_FLAG`.
* 1 bit for the sign bit.

Therefore the largest possible Fixnum value on a 64-bit value is
4611686018427387903 (2^62 - 1). On a 32-bit system, the largest value
is 1073741823 (2^30 - 1).

For example, a possible VALUE pointer address on a 32-bit system:

0xff5b4af8 => 4284173048

Dropping the lower 2 bits makes up for the loss of range to these
flags. In the example above, we see that shifting by 2 turns the value
into a 30-bit number, which can be represented as a Fixnum:

(0xff5b4af8 >> 2) => 1071043262

This bug can also manifest on a 64-bit system if the upper bits are 0xff.

Closes protocolbuffers#13481
@stanhu stanhu requested a review from a team as a code owner August 11, 2023 00:59
@stanhu stanhu requested review from haberman and removed request for a team August 11, 2023 00:59
This will prevent regressions in a 32-bit environment.
@zhangskz zhangskz added the 🅰️ safe for tests Mark a commit as safe to run presubmits over label Aug 15, 2023
@github-actions github-actions bot removed the 🅰️ safe for tests Mark a commit as safe to run presubmits over label Aug 15, 2023
@@ -43,6 +43,39 @@ jobs:
bazel-cache: ruby_linux/${{ matrix.ruby }}_${{ matrix.bazel }}
bazel: test //ruby/... //ruby/tests:ruby_version --test_env=KOKORO_RUBY_VERSION --test_env=BAZEL=true ${{ matrix.ffi == 'FFI' && '--//ruby:ffi=enabled --test_env=PROTOCOL_BUFFERS_RUBY_IMPLEMENTATION=FFI' || '' }}

linux-32bit:
name: Linux aarch64
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, there is a typo here.

copybara-service bot pushed a commit that referenced this pull request Aug 15, 2023
#13204 refactored the Ruby object cache to use a key of `LL2NUM(key_val)` instead of `LL2NUM(key_val >> 2)`. On 32-bit systems, `LL2NUM(key_val)` returns inconsistent results because a large value has to be stored as a Bignum on the heap. This causes cache lookups to fail.

This commit restores the previous behavior of using `ObjectCache_GetKey`, which discards the lower 2 bits, which are zero. This enables a key to be stored as a Fixnum on both 32 and 64-bit platforms.

As https://patshaughnessy.net/2014/1/9/how-big-is-a-bignum describes, a Fixnum uses:

* 1 bit for the `FIXNUM_FLAG`.
* 1 bit for the sign flag.

Therefore the largest possible Fixnum value on a 64-bit value is 4611686018427387903 (2^62 - 1). On a 32-bit system, the largest value  is 1073741823 (2^30 - 1).

For example, a possible VALUE pointer address on a 32-bit system:

0xff5b4af8 => 4284173048

Dropping the lower 2 bits makes up for the loss of range to these flags. In the example above, we see that shifting by 2 turns the value into a 30-bit number, which can be represented as a Fixnum:

(0xff5b4af8 >> 2) => 1071043262

This bug can also manifest on a 64-bit system if the upper bits are 0xff.

Closes #13481

Closes #13494

COPYBARA_INTEGRATE_REVIEW=#13494 from stanhu:sh-fix-ruby-protobuf-32bit d63122a
FUTURE_COPYBARA_INTEGRATE_REVIEW=#13494 from stanhu:sh-fix-ruby-protobuf-32bit d63122a
PiperOrigin-RevId: 557189479
copybara-service bot pushed a commit that referenced this pull request Aug 15, 2023
#13204 refactored the Ruby object cache to use a key of `LL2NUM(key_val)` instead of `LL2NUM(key_val >> 2)`. On 32-bit systems, `LL2NUM(key_val)` returns inconsistent results because a large value has to be stored as a Bignum on the heap. This causes cache lookups to fail.

This commit restores the previous behavior of using `ObjectCache_GetKey`, which discards the lower 2 bits, which are zero. This enables a key to be stored as a Fixnum on both 32 and 64-bit platforms.

As https://patshaughnessy.net/2014/1/9/how-big-is-a-bignum describes, a Fixnum uses:

* 1 bit for the `FIXNUM_FLAG`.
* 1 bit for the sign flag.

Therefore the largest possible Fixnum value on a 64-bit value is 4611686018427387903 (2^62 - 1). On a 32-bit system, the largest value  is 1073741823 (2^30 - 1).

For example, a possible VALUE pointer address on a 32-bit system:

0xff5b4af8 => 4284173048

Dropping the lower 2 bits makes up for the loss of range to these flags. In the example above, we see that shifting by 2 turns the value into a 30-bit number, which can be represented as a Fixnum:

(0xff5b4af8 >> 2) => 1071043262

This bug can also manifest on a 64-bit system if the upper bits are 0xff.

Closes #13481

Closes #13494

COPYBARA_INTEGRATE_REVIEW=#13494 from stanhu:sh-fix-ruby-protobuf-32bit d63122a
FUTURE_COPYBARA_INTEGRATE_REVIEW=#13494 from stanhu:sh-fix-ruby-protobuf-32bit d63122a
PiperOrigin-RevId: 557189479
copybara-service bot pushed a commit that referenced this pull request Aug 15, 2023
Every language has very different handling of utf8 validation.  Any with proto2/proto3 differences will receive language-specific features for edition zero to better model these subtle differences.

COPYBARA_INTEGRATE_REVIEW=#13494 from stanhu:sh-fix-ruby-protobuf-32bit d63122a
FUTURE_COPYBARA_INTEGRATE_REVIEW=#13494 from stanhu:sh-fix-ruby-protobuf-32bit d63122a
PiperOrigin-RevId: 552625656
copybara-service bot pushed a commit that referenced this pull request Aug 15, 2023
This helps make the API more complete, since the FeatureSet object will always be fully resolved on any accessible features.  This specifically targets C++ plugins though, which will now have their features filled in by default.  Before, any proto files that didn't include the language-specific features would result in unresolved extensions in the generators.

COPYBARA_INTEGRATE_REVIEW=#13494 from stanhu:sh-fix-ruby-protobuf-32bit d63122a
FUTURE_COPYBARA_INTEGRATE_REVIEW=#13494 from stanhu:sh-fix-ruby-protobuf-32bit d63122a
PiperOrigin-RevId: 553224298
copybara-service bot pushed a commit that referenced this pull request Aug 15, 2023
This has been replaced with language-specific features and will not be included in Edition 2023.

COPYBARA_INTEGRATE_REVIEW=#13494 from stanhu:sh-fix-ruby-protobuf-32bit d63122a
FUTURE_COPYBARA_INTEGRATE_REVIEW=#13494 from stanhu:sh-fix-ruby-protobuf-32bit d63122a
PiperOrigin-RevId: 557215503
copybara-service bot pushed a commit that referenced this pull request Aug 15, 2023
#13204 refactored the Ruby object cache to use a key of `LL2NUM(key_val)` instead of `LL2NUM(key_val >> 2)`. On 32-bit systems, `LL2NUM(key_val)` returns inconsistent results because a large value has to be stored as a Bignum on the heap. This causes cache lookups to fail.

This commit restores the previous behavior of using `ObjectCache_GetKey`, which discards the lower 2 bits, which are zero. This enables a key to be stored as a Fixnum on both 32 and 64-bit platforms.

As https://patshaughnessy.net/2014/1/9/how-big-is-a-bignum describes, a Fixnum uses:

* 1 bit for the `FIXNUM_FLAG`.
* 1 bit for the sign flag.

Therefore the largest possible Fixnum value on a 64-bit value is 4611686018427387903 (2^62 - 1). On a 32-bit system, the largest value  is 1073741823 (2^30 - 1).

For example, a possible VALUE pointer address on a 32-bit system:

0xff5b4af8 => 4284173048

Dropping the lower 2 bits makes up for the loss of range to these flags. In the example above, we see that shifting by 2 turns the value into a 30-bit number, which can be represented as a Fixnum:

(0xff5b4af8 >> 2) => 1071043262

This bug can also manifest on a 64-bit system if the upper bits are 0xff.

Closes #13481

Closes #13494

COPYBARA_INTEGRATE_REVIEW=#13494 from stanhu:sh-fix-ruby-protobuf-32bit d63122a
FUTURE_COPYBARA_INTEGRATE_REVIEW=#13494 from stanhu:sh-fix-ruby-protobuf-32bit d63122a
PiperOrigin-RevId: 557216800
zhangskz pushed a commit that referenced this pull request Aug 17, 2023
#13204 refactored the Ruby object cache to use a key of `LL2NUM(key_val)` instead of `LL2NUM(key_val >> 2)`. On 32-bit systems, `LL2NUM(key_val)` returns inconsistent results because a large value has to be stored as a Bignum on the heap. This causes cache lookups to fail.

This commit restores the previous behavior of using `ObjectCache_GetKey`, which discards the lower 2 bits, which are zero. This enables a key to be stored as a Fixnum on both 32 and 64-bit platforms.

As https://patshaughnessy.net/2014/1/9/how-big-is-a-bignum describes, a Fixnum uses:

* 1 bit for the `FIXNUM_FLAG`.
* 1 bit for the sign flag.

Therefore the largest possible Fixnum value on a 64-bit value is 4611686018427387903 (2^62 - 1). On a 32-bit system, the largest value  is 1073741823 (2^30 - 1).

For example, a possible VALUE pointer address on a 32-bit system:

0xff5b4af8 => 4284173048

Dropping the lower 2 bits makes up for the loss of range to these flags. In the example above, we see that shifting by 2 turns the value into a 30-bit number, which can be represented as a Fixnum:

(0xff5b4af8 >> 2) => 1071043262

This bug can also manifest on a 64-bit system if the upper bits are 0xff.

Closes #13481

Closes #13494

COPYBARA_INTEGRATE_REVIEW=#13494 from stanhu:sh-fix-ruby-protobuf-32bit d63122a
PiperOrigin-RevId: 557211768
esrauchg pushed a commit that referenced this pull request Aug 17, 2023
#13204 refactored the Ruby object cache to use a key of `LL2NUM(key_val)` instead of `LL2NUM(key_val >> 2)`. On 32-bit systems, `LL2NUM(key_val)` returns inconsistent results because a large value has to be stored as a Bignum on the heap. This causes cache lookups to fail.

This commit restores the previous behavior of using `ObjectCache_GetKey`, which discards the lower 2 bits, which are zero. This enables a key to be stored as a Fixnum on both 32 and 64-bit platforms.

As https://patshaughnessy.net/2014/1/9/how-big-is-a-bignum describes, a Fixnum uses:

* 1 bit for the `FIXNUM_FLAG`.
* 1 bit for the sign flag.

Therefore the largest possible Fixnum value on a 64-bit value is 4611686018427387903 (2^62 - 1). On a 32-bit system, the largest value  is 1073741823 (2^30 - 1).

For example, a possible VALUE pointer address on a 32-bit system:

0xff5b4af8 => 4284173048

Dropping the lower 2 bits makes up for the loss of range to these flags. In the example above, we see that shifting by 2 turns the value into a 30-bit number, which can be represented as a Fixnum:

(0xff5b4af8 >> 2) => 1071043262

This bug can also manifest on a 64-bit system if the upper bits are 0xff.

Closes #13481

Closes #13494

COPYBARA_INTEGRATE_REVIEW=#13494 from stanhu:sh-fix-ruby-protobuf-32bit d63122a
PiperOrigin-RevId: 557211768
zhangskz pushed a commit that referenced this pull request Aug 17, 2023
#13204 refactored the Ruby object cache to use a key of `LL2NUM(key_val)` instead of `LL2NUM(key_val >> 2)`. On 32-bit systems, `LL2NUM(key_val)` returns inconsistent results because a large value has to be stored as a Bignum on the heap. This causes cache lookups to fail.

This commit restores the previous behavior of using `ObjectCache_GetKey`, which discards the lower 2 bits, which are zero. This enables a key to be stored as a Fixnum on both 32 and 64-bit platforms.

As https://patshaughnessy.net/2014/1/9/how-big-is-a-bignum describes, a Fixnum uses:

* 1 bit for the `FIXNUM_FLAG`.
* 1 bit for the sign flag.

Therefore the largest possible Fixnum value on a 64-bit value is 4611686018427387903 (2^62 - 1). On a 32-bit system, the largest value  is 1073741823 (2^30 - 1).

For example, a possible VALUE pointer address on a 32-bit system:

0xff5b4af8 => 4284173048

Dropping the lower 2 bits makes up for the loss of range to these flags. In the example above, we see that shifting by 2 turns the value into a 30-bit number, which can be represented as a Fixnum:

(0xff5b4af8 >> 2) => 1071043262

This bug can also manifest on a 64-bit system if the upper bits are 0xff.

Closes #13481

Closes #13494

COPYBARA_INTEGRATE_REVIEW=#13494 from stanhu:sh-fix-ruby-protobuf-32bit d63122a
PiperOrigin-RevId: 557211768

Co-authored-by: Stan Hu <stanhu@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ruby] Unable to instantiate a descriptor in v3.24.0 on 32-bit system
4 participants