Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark buffers held onto the stack #346

Closed
wants to merge 2 commits into from

Conversation

casperisfine
Copy link

Fix: #341
Fix: #344

What is happening is that when we call into a recursive packing proc, we first save the packer buffer state onto the stack and then reset the buffer.

Once we return from the proc, the original buffer state is copied back.

The problem with this is that if any of the chunk has a mapped string then they are not reachable by any Ruby object and may be garbage collected at any moment.

To fix this, we instantiate a special Buffer that we keep on the stack and is in change of marking the parent buffer we copied.

This really isn't ideal as we have to go over the linked list to update the pointer to the tail.

The buffer implementation is in need of a more serious refactor but this patch is a quick fix.

cc @peterzhu2118 @pavel-workato @sitano

@casperisfine
Copy link
Author

Hum, it passes quite consistently locally, but fail about half the time on CI it seems, I'll have to dig more.

What is happening is that when we call into a recursive packing
proc, we first save the packer buffer state onto the stack
and then reset the buffer.

Once we return from the proc, the original buffer state is copied
back.

The problem with this is that if any of the chunk has a mapped string
then they are not reachable by any Ruby object and may be garbage collected
at any moment.
We instantiate a special Buffer that we keep on the stack
and is in change of marking the parent buffer we copied.

This really isn't ideal as we have to go over the linked list
to update the pointer to the tail.

The buffer implementation is in need of a more serious refactor
but this patch is a quick fix.
0x02,
hash_with_indifferent_access,
packer: ->(value, packer) do
packer.write("a".b * 600_0000) # Over MSGPACK_BUFFER_STRING_WRITE_REFERENCE_DEFAULT
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is what was likely missing to repro, on writes msgpack only reference the original string if it's over 500kiB, which is huge and rarely happens.

We should consider reducing that default very significantly in the future.

@casperisfine
Copy link
Author

Ok, so I can repro on a linux machine with running:

export MSGPACK_DEBUG=true
bundle exec rake clean compile
bin/rspec spec/factory_spec.rb spec/packer_spec.rb

It crashes in buffer_mark

* thread #1, name = 'ruby', stop reason = signal SIGSEGV: invalid address (fault address: 0x20)
    frame #0: 0x00007ffff25ad7e0 msgpack.so`msgpack_buffer_mark(ptr=0x00007fffffffa3f0) at buffer.c:138:9
   135 	    /* head is always available */
   136 	    msgpack_buffer_chunk_t* c = b->head;
   137 	    while(c != &b->tail) {
-> 138 	        rb_gc_mark(c->mapped_string);
   139 	        c = c->next;
   140 	    }
   141 	    rb_gc_mark(c->mapped_string);

Somehow a buffer chunk's ->next points to NULL, so I must have messed something.

(lldb) p b->head->next->next
(msgpack_buffer_chunk_t *) $4 = NULL

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Attempt to GC mark already marked object
2 participants