
[SPARK-6667] [PySpark] remove setReuseAddress #5324

Closed
davies wants to merge 3 commits into master from davies/collect_hang

Conversation

@davies davies (Contributor) commented Apr 2, 2015

setReuseAddress on the server socket caused the server to fail to acknowledge accepted connections, so remove it.

This PR retries the accept once after a timeout, and also adds a timeout on the client side.
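For illustration, the pattern described above can be sketched in Python (the actual server-side change is in PythonRDD.scala on the JVM side, so the helper name and the 3-second value here are assumptions, not the exact implementation): bind the server socket without SO_REUSEADDR, put a timeout on accept(), and retry the accept once if it times out.

```python
import socket

def accept_with_one_retry(server):
    """Accept a connection, retrying once if the first accept() times out.

    Hypothetical helper illustrating the PR's retry-once behavior.
    """
    try:
        return server.accept()
    except socket.timeout:
        # Retry the accept once after a timeout, per the PR description.
        return server.accept()

# Deliberately do NOT set SO_REUSEADDR: with it enabled, the server could
# fail to acknowledge the client's connection, which is the hang this PR
# removes.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("localhost", 0))   # ephemeral port on the loopback interface
server.listen(1)
server.settimeout(3)            # accept() raises socket.timeout after 3s
port = server.getsockname()[1]  # a client would connect to this port
```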

@davies davies (Contributor, Author) commented Apr 2, 2015

cc @JoshRosen

@SparkQA commented Apr 2, 2015

Test build #29587 has started for PR 5324 at commit b838f35.

@davies davies (Contributor, Author) commented Apr 2, 2015

After testing for a while, it seems that the retry does not work, but the timeout on the client side does help:

15/04/01 22:42:14 WARN PythonRDD: Timed out after 4 seconds, retry once
15/04/01 22:42:14 ERROR PythonRDD: Error while sending iterator
java.net.SocketTimeoutException: Accept timed out
    at java.net.PlainSocketImpl.socketAccept(Native Method)
    at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
    at java.net.ServerSocket.implAccept(ServerSocket.java:530)
    at java.net.ServerSocket.accept(ServerSocket.java:498)
    at org.apache.spark.api.python.PythonRDD$$anon$2.run(PythonRDD.scala:624)
**********************************************************************
File "/Users/davies/work/spark/python/pyspark/rdd.py", line 1090, in __main__.RDD.variance
Failed example:
    sc.parallelize([1, 2, 3]).variance()
Exception raised:
    Traceback (most recent call last):
      File "//anaconda/lib/python2.7/doctest.py", line 1315, in __run
        compileflags, 1) in test.globs
      File "<doctest __main__.RDD.variance[0]>", line 1, in <module>
        sc.parallelize([1, 2, 3]).variance()
      File "/Users/davies/work/spark/python/pyspark/rdd.py", line 1093, in variance
        return self.stats().variance()
      File "/Users/davies/work/spark/python/pyspark/rdd.py", line 948, in stats
        return self.mapPartitions(lambda i: [StatCounter(i)]).reduce(redFunc)
      File "/Users/davies/work/spark/python/pyspark/rdd.py", line 745, in reduce
        vals = self.mapPartitions(func).collect()
      File "/Users/davies/work/spark/python/pyspark/rdd.py", line 720, in collect
        return list(_load_from_socket(port, self._jrdd_deserializer))
      File "/Users/davies/work/spark/python/pyspark/rdd.py", line 120, in _load_from_socket
        for item in serializer.load_stream(rf):
      File "/Users/davies/work/spark/python/pyspark/serializers.py", line 131, in load_stream
        yield self._read_with_length(stream)
      File "/Users/davies/work/spark/python/pyspark/serializers.py", line 148, in _read_with_length
        length = read_int(stream)
      File "/Users/davies/work/spark/python/pyspark/serializers.py", line 526, in read_int
        length = stream.read(4)
      File "//anaconda/lib/python2.7/socket.py", line 380, in read
        data = self._sock.recv(left)
    timeout: timed out
**********************************************************************
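The client-side timeout that does help can be sketched as follows; this is modeled on PySpark's `_load_from_socket` in rdd.py, which the traceback above passes through, but the exact signature, buffer size, and the 3-second value are assumptions:

```python
import socket

def load_from_socket(port, serializer):
    """Sketch of a client that fails fast instead of hanging forever.

    With a socket timeout set, a recv() on a dead connection raises
    socket.timeout (the `timeout: timed out` in the doctest above)
    rather than blocking indefinitely.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(3)  # assumed value; bounds both connect() and recv()
    try:
        sock.connect(("localhost", port))
        rf = sock.makefile("rb", 65536)
        # serializer is assumed to expose load_stream(), as PySpark's do.
        for item in serializer.load_stream(rf):
            yield item
    finally:
        sock.close()
```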

@SparkQA commented Apr 2, 2015

Test build #29589 has started for PR 5324 at commit 7977c2f.

@SparkQA commented Apr 2, 2015

Test build #29592 has started for PR 5324 at commit e5a51a2.

@SparkQA commented Apr 2, 2015

Test build #29587 has finished for PR 5324 at commit b838f35.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class MessageWithHeader extends AbstractReferenceCounted implements FileRegion
  • This patch does not change any dependencies.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29587/

@SparkQA commented Apr 2, 2015

Test build #29589 has finished for PR 5324 at commit 7977c2f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29589/

@SparkQA commented Apr 2, 2015

Test build #29592 has finished for PR 5324 at commit e5a51a2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29592/

@davies davies changed the title from "[SPARK-6667] [PySpark] retry after timeout to accept" to "[SPARK-6667] [PySpark] remove setReuseAddress" on Apr 2, 2015
asfgit pushed a commit that referenced this pull request Apr 2, 2015
setReuseAddress on the server socket caused the server to fail to acknowledge accepted connections, so remove it.

This PR retries the accept once after a timeout, and also adds a timeout on the client side.

Author: Davies Liu <davies@databricks.com>

Closes #5324 from davies/collect_hang and squashes the following commits:

e5a51a2 [Davies Liu] remove setReuseAddress
7977c2f [Davies Liu] do retry on client side
b838f35 [Davies Liu] retry after timeout

(cherry picked from commit 0cce545)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>
@asfgit asfgit closed this in 0cce545 on Apr 2, 2015
@JoshRosen JoshRosen (Contributor) commented Apr 2, 2015

I've merged this to master (1.4.0), branch-1.3 (1.3.1), and branch-1.2 (1.2.2). Thanks!

asfgit pushed a commit that referenced this pull request Apr 2, 2015
markhamstra pushed a commit to markhamstra/spark that referenced this pull request Apr 15, 2015