Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-17841][STREAMING][KAFKA] drain commitQueue #15407

Closed
wants to merge 1 commit into from

Conversation

koeninger
Copy link
Contributor

@koeninger koeninger commented Oct 9, 2016

What changes were proposed in this pull request?

Actually drain commit queue rather than just iterating it.
iterator() on a concurrent linked queue won't remove items from the queue, poll() will.

How was this patch tested?

Unit tests

@SparkQA
Copy link

SparkQA commented Oct 9, 2016

Test build #66609 has finished for PR 15407 at commit 8810a79.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Copy link
Contributor

rxin commented Oct 17, 2016

Is there a test we can add to cover this?

I realized "How was this patch tested?" was confusing. That was meant to ask how is the change tested itself, and a bug fix obviously can't be tested by existing unit tests.

@koeninger
Copy link
Contributor Author

This doesn't affect correctness (only the highest offset for a given partition is used in any case), just memory leaks. I'm not sure what a good way to unit test memory leaks is, short of exposing the internal state of that queue, which seems less than optimal.

@tdas
Copy link
Contributor

tdas commented Oct 18, 2016

Can you explain what is the memory that can currently leak with the iterator?

@rxin
Copy link
Contributor

rxin commented Oct 18, 2016

@koeninger can you put more information into the description of the pull request? At the very least we should talk about the current implementation causes memory leaks.

@koeninger
Copy link
Contributor Author

@rxin @tdas right now, items to be committed can be added to the queue, but they will never actually be removed from the queue. poll() removes, iterator() does not. I updated the description of the PR.

@rxin
Copy link
Contributor

rxin commented Oct 18, 2016

Thanks for the explanation. Merging in master/branch-2.0.

asfgit pushed a commit that referenced this pull request Oct 18, 2016
## What changes were proposed in this pull request?

Actually drain commit queue rather than just iterating it.
iterator() on a concurrent linked queue won't remove items from the queue, poll() will.

## How was this patch tested?
Unit tests

Author: cody koeninger <cody@koeninger.org>

Closes #15407 from koeninger/SPARK-17841.

(cherry picked from commit cd106b0)
Signed-off-by: Reynold Xin <rxin@databricks.com>
@asfgit asfgit closed this in cd106b0 Oct 18, 2016
@zsxwing
Copy link
Member

zsxwing commented Oct 18, 2016

LGTM

robert3005 pushed a commit to palantir/spark that referenced this pull request Nov 1, 2016
## What changes were proposed in this pull request?

Actually drain commit queue rather than just iterating it.
iterator() on a concurrent linked queue won't remove items from the queue, poll() will.

## How was this patch tested?
Unit tests

Author: cody koeninger <cody@koeninger.org>

Closes apache#15407 from koeninger/SPARK-17841.
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
## What changes were proposed in this pull request?

Actually drain commit queue rather than just iterating it.
iterator() on a concurrent linked queue won't remove items from the queue, poll() will.

## How was this patch tested?
Unit tests

Author: cody koeninger <cody@koeninger.org>

Closes apache#15407 from koeninger/SPARK-17841.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants