Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#8293 fix concurrent access to PQ persisted bytesize throwing #8317

Closed
wants to merge 1 commit into from

Conversation

original-brownbear
Copy link
Member

@original-brownbear original-brownbear commented Sep 19, 2017

See error reported in #8293, obviously looping over the tail pages will throw on concurrent modification of the underlying ArrayList.
Unfortunately, the stack trace of that error doesn't get forwarded to Ruby and also seems to prevent the mutex covering the PQ data poll on the Ruby side from unlocking (that's a different issue imo, no clue why this is happening).
=> Either way, fixed by locking that operation in PQ and preventing concurrent modification exceptions.

  • I'm aware that this needs a stronger fix, in the long run, it's not ok having to lock the pipeline for this, but for the current implementation, I see no other fix that doesn't add the same degree of locking.

@jordansissel
Copy link
Contributor

Most tests passing. Only 1 DLQ test failing and is unrelated.

LGTM

@suyograo
Copy link
Contributor

@original-brownbear can we backport this to 5.6 as well?

@original-brownbear
Copy link
Member Author

@suyograo sure, let's see this might even merge clean :)

@elasticsearch-bot
Copy link

Armin Braun merged this into the following branches!

Branch Commits
master ffb72d3
6.x fc1b548

@elasticsearch-bot
Copy link

Armin Braun merged this into the following branches!

Branch Commits
6.0 cacaa62

@elasticsearch-bot
Copy link

Armin Braun merged this into the following branches!

Branch Commits
5.6 64dacdd

@original-brownbear
Copy link
Member Author

@suyograo done :)

@colinsurprenant
Copy link
Contributor

Just helped diagnose an issue on discuss which pointed to this rather obvious lack of locking and found this fix - thanks @original-brownbear this one was pretty nasty, good thing we also merged it in 5.6.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants