Skip to content

Fix bad any cast errors caused by hash collisions#208

Merged
jasonroelofs merged 3 commits intoruby-rice:masterfrom
ankane:bad-any-cast2
Oct 7, 2024
Merged

Fix bad any cast errors caused by hash collisions#208
jasonroelofs merged 3 commits intoruby-rice:masterfrom
ankane:bad-any-cast2

Conversation

@ankane
Copy link
Copy Markdown
Contributor

@ankane ankane commented Oct 7, 2024

This updates the NativeRegistry to handle hash collisions, which should fix random bad any cast errors.

Ref: #207 (comment)

Edit: Not very happy with the latest approach. There should be a much more performant way to do this.

Edit 2: I'm not sure putting VALUEs on the heap is a good idea, so it may be better to go back to the previous approach: 819515b. Also, choosing a larger prime in the key function reduces collisions significantly, but I think they need to be handled anyways.

@jasonroelofs
Copy link
Copy Markdown
Collaborator

Thanks for tracking this down. Can we get a test added that shows this is the fix?

@ankane
Copy link
Copy Markdown
Contributor Author

ankane commented Oct 7, 2024

Thanks @jasonroelofs, added.

@jasonroelofs
Copy link
Copy Markdown
Collaborator

Not sure what's up with Ruby 3.1 on mac but that does not look related to these changes. Thanks for the update! I am not in a position to push a release with this fix right away but will get to it when I've got a chance.

@jasonroelofs jasonroelofs merged commit a82515b into ruby-rice:master Oct 7, 2024
@ankane
Copy link
Copy Markdown
Contributor Author

ankane commented Oct 7, 2024

Great, thanks @jasonroelofs! I probably should have included #209 in this as well, but just submitted as a separate PR.

@cfis
Copy link
Copy Markdown
Collaborator

cfis commented Oct 8, 2024

Why do you think this is caused by hash collisions?

If it is, wouldn't the better solution be fix the size_t NativeRegistry::key(VALUE klass, ID id) method? See:

inline size_t NativeRegistry::key(VALUE klass, ID id)

I always wondered if that method was giving something back that was legitimate.

@ankane
Copy link
Copy Markdown
Contributor Author

ankane commented Oct 8, 2024

Why do you think this is caused by hash collisions?

The easiest way to reproduce is to change key to always return 1, create two functions with different return types, and call them. When it tries to lookup the type that was overwritten, it'll throw a bad any cast error.

If it is, wouldn't the better solution be fix the size_t NativeRegistry::key(VALUE klass, ID id) method?

With a non-perfect hash function, there will always be the possibility of collisions.

@cfis
Copy link
Copy Markdown
Collaborator

cfis commented Oct 8, 2024

Sure collisions can happen by my understanding is that it is very unlikely (maybe this answer is incorrect- https://stackoverflow.com/a/62667633). I also assume that the number of native functions a binding is tracking is not very big, maybe in the hundreds or thousands?

But its sounds like you see it a lot. So that makes me wonder if:

  • The hash function is doing a bad job
  • Maybe it isn't a has collision

Are you able to reproduce this error with an particular project? Do the tests you added reproduce it (that would be awesome if yes)?

@ankane
Copy link
Copy Markdown
Contributor Author

ankane commented Oct 8, 2024

Are you able to reproduce this error with an particular project?

Yes, it's been happening across many projects.

And the test case should reproduce it consistently.

@cfis
Copy link
Copy Markdown
Collaborator

cfis commented Oct 8, 2024

Ok, good news on test case and thanks for the explanation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants