Skip to content

MSQ extension: Fix over-capacity write in ScanQueryFrameProcessor.#13036

Merged
abhishekagarwal87 merged 2 commits intoapache:masterfrom
gianm:fix_double_frame_write
Sep 7, 2022
Merged

MSQ extension: Fix over-capacity write in ScanQueryFrameProcessor.#13036
abhishekagarwal87 merged 2 commits intoapache:masterfrom
gianm:fix_double_frame_write

Conversation

@gianm
Copy link
Contributor

@gianm gianm commented Sep 7, 2022

Frame processors are meant to write only one output frame per cycle. The ScanQueryFrameProcessor would write two when reading from a channel if the input frame cursor cycled and then the output frame filled up while reading from the next frame.

This patch fixes the bug, and adds a test. It also makes some adjustments to the processor code in order to make it easier to test.

Frame processors are meant to write only one output frame per cycle.
The ScanQueryFrameProcessor would write two when reading from a channel
if the input frame cursor cycled and then the output frame filled up
while reading from the next frame.

This patch fixes the bug, and adds a test. It also makes some adjustments
to the processor code in order to make it easier to test.
@gianm gianm added Bug Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 labels Sep 7, 2022
@gianm
Copy link
Contributor Author

gianm commented Sep 7, 2022

This part is the fix: https://github.com/apache/druid/pull/13036/files#diff-9b472caafa614f6c7b93fc27e09e8192b1bd6b665857ed0e5946f73035ac633cL206-R207

The rest is refactoring to make it possible to write the test.

@gianm gianm added this to the 24.0.0 milestone Sep 7, 2022
Copy link
Contributor

@cryptoe cryptoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
It would be cool to understand how you debugged this also.

Copy link
Contributor

@vogievetsky vogievetsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw Gian write the fix :-)

@gianm
Copy link
Contributor Author

gianm commented Sep 7, 2022

It would be cool to understand how you debugged this also.

I ran into an error "Channel has no capacity" from BlockingQueueFrameChannel in a test cluster. The stack trace indicated it was a frame write from ScanQueryFrameProcessor populateFrameWriterAndFlushIfNeeded. Knowing that this error means that a processor wrote two frames in a single cycle, I read through the code and thought about ways that this might happen. I found this one, and wrote a test case & fix for it.

@abhishekagarwal87 abhishekagarwal87 merged commit f00f1f7 into apache:master Sep 7, 2022
abhishekagarwal87 pushed a commit that referenced this pull request Sep 7, 2022
…13036)

* MSQ extension: Fix over-capacity write in ScanQueryFrameProcessor.

Frame processors are meant to write only one output frame per cycle.
The ScanQueryFrameProcessor would write two when reading from a channel
if the input frame cursor cycled and then the output frame filled up
while reading from the next frame.

This patch fixes the bug, and adds a test. It also makes some adjustments
to the processor code in order to make it easier to test.

* Add license header.
@gianm gianm deleted the fix_double_frame_write branch September 23, 2022 19:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 Bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants