Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-27428][CORE][TEST] Increase receive buffer size used in StatsdSinkSuite #29819

Closed
wants to merge 1 commit into from

Conversation

mundaym
Copy link
Contributor

@mundaym mundaym commented Sep 21, 2020

What changes were proposed in this pull request?

Increase size of socket receive buffer in these tests.

Why are the changes needed?

The socket receive buffer size set in this test was too small for
the StatsdSinkSuite tests to run reliably on some systems. For a
test in this suite to run reliably the buffer needs to be large
enough to hold all the data in the packets being sent in a test
along with any additional kernel or protocol overhead. The amount
of kernel overhead per packet can vary from system to system but is
typically far higher than the protocol overhead.

If the receive buffer is too small and fills up then packets are
silently dropped. This leads to the test failing with a timeout.

If the socket defaults to a larger receive buffer (normally true)
then we should keep that size.

As well as increasing the minimum buffer size I've also decoupled
the datagram packet buffer size from the receive buffer size. The
receive buffer should in general be far larger to account for the
fact that multiple packets might be buffered, as well as the
aforementioned overhead. Any truncated data in individual packets
will be picked up by the tests.

Does this PR introduce any user-facing change?

No, this only affects the tests.

How was this patch tested?

Existing tests on IBM Z and x86.

…SinkSuite.

The socket receive buffer size set in this test was too small for
the StatsdSinkSuite tests to run reliably on some systems. For a
test in this suite to run reliably the buffer needs to be large
enough to hold all the data in the packets being sent in a test
along with any additional kernel or protocol overhead. The amount
of kernel overhead per packet can vary from system to system but is
typically far higher than the protocol overhead.

If the receive buffer is too small and fills up then packets are
silently dropped. This leads to the test failing with a timeout.

If the socket defaults to a larger receive buffer (normally true)
then we should keep that size.

As well as increasing the minimum buffer size I've also decoupled
the datagram packet buffer size from the receive buffer size. The
receive buffer should in general be far larger to account for the
fact that multiple packets might be buffered, as well as the
aforementioned overhead. Any truncated data in individual packets
will be picked up by the tests.
@HyukjinKwon
Copy link
Member

cc @jerryshao FYI

@dongjoon-hyun
Copy link
Member

ok to test

@SparkQA
Copy link

SparkQA commented Sep 22, 2020

Test build #128991 has finished for PR 29819 at commit 9181c18.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-27428][CORE][TEST] Increase receive buffer size used in Statsd… [SPARK-27428][CORE][TEST] Increase receive buffer size used in StatsdSinkSuite Oct 1, 2020
// This value was determined experimentally and should be
// increased if timeouts are seen.
private val socketMinRecvBufferSize = 16384 // bytes
private val socketTimeout = 30000 // milliseconds
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this to the original position to reduce the unnecessary diff and don't add space between 30000 and // milliseconds.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mundaym . Why don't we simply use new value? Then, this PR can be one-liner.

- private val socketBufferSize = 8192
+ private val socketBufferSize = 16384

@mundaym
Copy link
Contributor Author

mundaym commented Oct 1, 2020

Thanks for the review @dongjoon-hyun.

Why don't we simply use new value? Then, this PR can be one-liner.

Could do. It's seems to me though that having the values the same misleadingly implies that the socket receive buffer should be the same size as the UDP payload, when in fact the socket receive buffer needs to be a lot bigger (multiple packets + kernel overhead). The more I think about though it the more I'm inclined to say we should just leave the size of the socket receive buffer alone (i.e. delete the socket.setReceiveBufferSize call completely). The default is always going to be large enough for this test, overriding it seems unnecessary.

Copy link
Member

@srowen srowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine pending tests

@dongjoon-hyun
Copy link
Member

@mundaym What is the purpose of this contribution? Technically, this PR looks worth for a testing on specific machine types like IBM. Given that, the change looks a little too much. If we have one-liner, I believe we can cherry-pick to all live branches. Otherwise, maybe, this PR is limited to master only.

@srowen
Copy link
Member

srowen commented Oct 1, 2020

This is a small change to a test suite, to just increase a buffer size. I can't see that it's controversial or that a few more or less lines matters. Where are you getting that it has to be one line to backport??

@dongjoon-hyun
Copy link
Member

No, it's not a rule~ I wrote like that because I would do if I merge this PR.

Feel free to proceed in your way, @srowen . I'll leave this to you.

@srowen
Copy link
Member

srowen commented Oct 3, 2020

Jenkins retest this please

@srowen
Copy link
Member

srowen commented Oct 3, 2020

@mundaym are you looking for this to go into 3.0.x as well?

@SparkQA
Copy link

SparkQA commented Oct 3, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33989/

@SparkQA
Copy link

SparkQA commented Oct 3, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33989/

@SparkQA
Copy link

SparkQA commented Oct 3, 2020

Test build #129381 has finished for PR 29819 at commit 9181c18.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen srowen closed this in b5e4b8c Oct 6, 2020
@srowen
Copy link
Member

srowen commented Oct 6, 2020

Merged to master/3.0

srowen pushed a commit that referenced this pull request Oct 6, 2020
…SinkSuite

### What changes were proposed in this pull request?

Increase size of socket receive buffer in these tests.

### Why are the changes needed?

The socket receive buffer size set in this test was too small for
the StatsdSinkSuite tests to run reliably on some systems. For a
test in this suite to run reliably the buffer needs to be large
enough to hold all the data in the packets being sent in a test
along with any additional kernel or protocol overhead. The amount
of kernel overhead per packet can vary from system to system but is
typically far higher than the protocol overhead.

If the receive buffer is too small and fills up then packets are
silently dropped. This leads to the test failing with a timeout.

If the socket defaults to a larger receive buffer (normally true)
then we should keep that size.

As well as increasing the minimum buffer size I've also decoupled
the datagram packet buffer size from the receive buffer size. The
receive buffer should in general be far larger to account for the
fact that multiple packets might be buffered, as well as the
aforementioned overhead. Any truncated data in individual packets
will be picked up by the tests.

### Does this PR introduce _any_ user-facing change?

No, this only affects the tests.

### How was this patch tested?
Existing tests on IBM Z and x86.

Closes #29819 from mundaym/fix-statsd.

Authored-by: Michael Munday <mike.munday@ibm.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
(cherry picked from commit b5e4b8c)
Signed-off-by: Sean Owen <srowen@gmail.com>
@mundaym
Copy link
Contributor Author

mundaym commented Oct 7, 2020

Thanks for reviewing this.

@mundaym are you looking for this to go into 3.0.x as well?
Merged to master/3.0

Happy for this to be backported (looks like already done).

holdenk pushed a commit to holdenk/spark that referenced this pull request Oct 27, 2020
…SinkSuite

### What changes were proposed in this pull request?

Increase size of socket receive buffer in these tests.

### Why are the changes needed?

The socket receive buffer size set in this test was too small for
the StatsdSinkSuite tests to run reliably on some systems. For a
test in this suite to run reliably the buffer needs to be large
enough to hold all the data in the packets being sent in a test
along with any additional kernel or protocol overhead. The amount
of kernel overhead per packet can vary from system to system but is
typically far higher than the protocol overhead.

If the receive buffer is too small and fills up then packets are
silently dropped. This leads to the test failing with a timeout.

If the socket defaults to a larger receive buffer (normally true)
then we should keep that size.

As well as increasing the minimum buffer size I've also decoupled
the datagram packet buffer size from the receive buffer size. The
receive buffer should in general be far larger to account for the
fact that multiple packets might be buffered, as well as the
aforementioned overhead. Any truncated data in individual packets
will be picked up by the tests.

### Does this PR introduce _any_ user-facing change?

No, this only affects the tests.

### How was this patch tested?
Existing tests on IBM Z and x86.

Closes apache#29819 from mundaym/fix-statsd.

Authored-by: Michael Munday <mike.munday@ibm.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
(cherry picked from commit b5e4b8c)
Signed-off-by: Sean Owen <srowen@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants