
[SPARK-5124][Core] A standard RPC interface and an Akka implementation #4588

Closed
wants to merge 33 commits

Conversation

@zsxwing (Member) commented Feb 13, 2015

This PR adds a standard internal RPC interface for Spark and an Akka implementation of it. See the design document for more details.

I will split the whole work into multiple PRs to make code review easier. This is the first PR, and it avoids touching too many files.
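
For readers following along, here is a minimal sketch of the kind of abstraction under discussion. The names (RpcEnv, RpcEndpoint, RpcEndpointRef, RpcAddress, setupEndpointRef) come from this review thread, but the signatures below are illustrative only, not the API as merged.

package org.apache.spark.rpc

// Illustrative sketch only: a trimmed-down view of the abstraction this PR introduces.
private[spark] case class RpcAddress(host: String, port: Int)

private[spark] trait RpcEndpoint {
  // Concrete endpoints override this to handle incoming messages.
  def receive: PartialFunction[Any, Unit]
}

private[spark] abstract class RpcEndpointRef {
  // Fire-and-forget message to the endpoint this ref points to.
  def send(message: Any): Unit
}

private[spark] abstract class RpcEnv {
  // Register a local endpoint under a name and get a ref to it.
  def setupEndpoint(name: String, endpoint: RpcEndpoint): RpcEndpointRef

  // Look up a remote endpoint by system name, address and endpoint name
  // (compare the setupEndpointRef signature quoted later in this review).
  def setupEndpointRef(
      systemName: String, address: RpcAddress, endpointName: String): RpcEndpointRef
}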

@SparkQA commented Feb 13, 2015

Test build #27440 has started for PR 4588 at commit 45b2317.

  • This patch merges cleanly.

@SparkQA commented Feb 13, 2015

Test build #27440 has finished for PR 4588 at commit 45b2317.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class OutputCommitCoordinatorEndpoint(

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27440/

@@ -45,30 +42,37 @@ private[spark] class WorkerWatcher(workerUrl: String)
private var isTesting = false

Contributor: Can you document what this does?

@zsxwing (Member, Author): Added some docs to explain it.
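
For context, a hypothetical rendering of the kind of documentation added here; the class wrapper and the exact wording are made up, though the isShutDown / setTesting lines quoted later in this review suggest the intent.

package org.apache.spark.deploy.worker

// Hypothetical rendering only; the class wrapper and the exact wording are made up.
private[spark] class WorkerWatcherDocSketch {
  // Whether this watcher is running under a unit test. When true, a requested
  // shutdown is only recorded (compare the isShutDown / setTesting lines quoted
  // later in this review) rather than exiting the JVM, so tests can observe the
  // behaviour without killing the test process.
  private var isTesting = false
}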

@SparkQA commented Feb 26, 2015

Test build #28002 has started for PR 4588 at commit 04a106e.

  • This patch merges cleanly.

@zsxwing (Member, Author) commented Feb 26, 2015

I will update this PR as per discussion in https://issues.apache.org/jira/browse/SPARK-5124

@SparkQA commented Feb 26, 2015

Test build #28002 has finished for PR 4588 at commit 04a106e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class OutputCommitCoordinatorEndpoint(

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28002/

* Retrieve the [[RpcEndpointRef]] represented by `systemName`, `address` and `endpointName`
*/
def setupEndpointRef(
systemName: String, address: RpcAddress, endpointName: String): RpcEndpointRef
Contributor: Nitpick: I think the indentation is off by one here.
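
As a usage illustration, the method quoted above might be called like this. The system name, host, port and endpoint name are made up, and minimal stand-in types are repeated so the sketch stands alone.

package org.apache.spark.rpc

// Stand-in types for the sketch; not the real Spark definitions.
private[spark] case class RpcAddress(host: String, port: Int)
private[spark] abstract class RpcEndpointRef
private[spark] abstract class RpcEnv {
  def setupEndpointRef(
      systemName: String, address: RpcAddress, endpointName: String): RpcEndpointRef
}

object SetupEndpointRefExample {
  // Resolve a remote endpoint through the quoted method; names here are hypothetical.
  def lookupWorker(rpcEnv: RpcEnv): RpcEndpointRef = {
    rpcEnv.setupEndpointRef("sparkWorker", RpcAddress("worker-host", 7078), "Worker")
  }
}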

@SparkQA commented Mar 4, 2015

Test build #28264 has started for PR 4588 at commit fe7d1ff.

  • This patch merges cleanly.

@SparkQA commented Mar 4, 2015

Test build #28264 has finished for PR 4588 at commit fe7d1ff.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class OutputCommitCoordinatorEndpoint(

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28264/

* @param address the remote address of the connection which this error happens on.
* @param cause the cause of the network error.
*/
private[spark] case class NetworkErrorEvent(address: RpcAddress, cause: Throwable)
Contributor: I actually like your old design better: NetworkErrorEvent is not a message, but rather just a function that can be overridden. I think it is clearer that way how errors are handled. Similarly for AssociatedEvent and DisassociatedEvent, btw.

@zsxwing (Member, Author): Do you mean adding the methods from the previous NetworkRpcEndpoint to RpcEndpoint?

Contributor: Yes.
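
To make the two options concrete: the diff above delivers network events as messages (NetworkErrorEvent and, analogously, the association events), while the reviewer prefers overridable callbacks on the endpoint itself. Below is a sketch of that callback style; the method names are illustrative and may differ from the merged API.

package org.apache.spark.rpc

// Sketch of the callback style discussed above: instead of delivering
// NetworkErrorEvent / AssociatedEvent / DisassociatedEvent as messages, the
// endpoint exposes hooks it can override.
private[spark] case class RpcAddress(host: String, port: Int)

private[spark] trait RpcEndpoint {
  // Normal message handling.
  def receive: PartialFunction[Any, Unit]

  // Called when a network error occurs on a connection to `remoteAddress`.
  def onNetworkError(cause: Throwable, remoteAddress: RpcAddress): Unit = {}

  // Called when a remote node connects to or disconnects from this endpoint.
  def onConnected(remoteAddress: RpcAddress): Unit = {}
  def onDisconnected(remoteAddress: RpcAddress): Unit = {}
}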

private[deploy] var isShutDown = false
private[deploy] def setTesting(testing: Boolean) = isTesting = testing
private var isTesting = false

// Lets us filter events only from the worker's actor system
private val expectedHostPort = AddressFromURIString(workerUrl).hostPort
private def isWorker(address: Address) = address.hostPort == expectedHostPort
private val expectedHostPort = new java.net.URI(workerUrl)
Contributor: Perhaps RpcAddress.fromUriString(workerUrl) and use of the == method?

@zsxwing (Member, Author): Updated to use RpcAddress.fromUriString.
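
A sketch of what that helper and the resulting check could look like; the factory name follows the review comment above, and the real implementation may differ in name and error handling.

package org.apache.spark.rpc

import java.net.URI

// Sketch of the suggested helper and how WorkerWatcher-style code could use it.
// Error handling for malformed URLs is omitted here.
private[spark] case class RpcAddress(host: String, port: Int)

private[spark] object RpcAddress {
  def fromUriString(uri: String): RpcAddress = {
    val parsed = new URI(uri)
    RpcAddress(parsed.getHost, parsed.getPort)
  }
}

// Inside WorkerWatcher, the check can then rely on case-class equality:
//   private val expectedAddress = RpcAddress.fromUriString(workerUrl)
//   private def isWorker(address: RpcAddress) = address == expectedAddress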

@aarondav (Contributor): Just a few mainly minor comments, otherwise LGTM. We can iterate on some of the API moving forward as we get more endpoints using it.

@SparkQA commented Mar 30, 2015

Test build #29378 has started for PR 4588 at commit 8bd1097.

@SparkQA commented Mar 30, 2015

Test build #29379 has started for PR 4588 at commit f6f3287.

lazy val actorRef = actorSystem.actorOf(Props(new Actor with ActorLogReceive with Logging {

assert(endpointRef != null)
registerEndpoint(endpoint, endpointRef)
Contributor: Ah, I think there was a miscommunication on these lines. I was not worried about require vs assert; I was wondering if we could just invoke registerEndpoint(endpoint, endpointRef) right before endpointRef.init(), on L158, since endpointRef is only used inside this call inside the constructor.

@zsxwing (Member, Author): Ah, sorry. Yes, it can be moved. Done.
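
A self-contained sketch of the agreed reordering (not the real AkkaRpcEnv code; all types and method bodies here are stand-ins paraphrased from the thread).

package org.apache.spark.rpc

import scala.collection.concurrent.TrieMap

// Sketch only: register the endpoint immediately before initializing its ref,
// rather than earlier in the construction path, since the ref is not needed
// before that point.
private[spark] trait RpcEndpoint

private[spark] class RpcEndpointRef(val name: String) {
  // Placeholder for whatever initialization the real ref performs.
  def init(): Unit = ()
}

private[spark] class SketchRpcEnv {
  private val endpoints = new TrieMap[RpcEndpoint, RpcEndpointRef]()

  def setupEndpoint(name: String, endpoint: RpcEndpoint): RpcEndpointRef = {
    val endpointRef = new RpcEndpointRef(name)
    registerEndpoint(endpoint, endpointRef)  // moved to just before init(), per the review
    endpointRef.init()
    endpointRef
  }

  private def registerEndpoint(endpoint: RpcEndpoint, ref: RpcEndpointRef): Unit = {
    endpoints.put(endpoint, ref)
  }
}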

@SparkQA commented Mar 30, 2015

Test build #29380 has started for PR 4588 at commit fe3df4c.

@SparkQA commented Mar 30, 2015

Test build #29378 has finished for PR 4588 at commit 8bd1097.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29378/

@SparkQA commented Mar 30, 2015

Test build #29379 has finished for PR 4588 at commit f6f3287.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29379/

@SparkQA commented Mar 30, 2015

Test build #29380 has finished for PR 4588 at commit fe3df4c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29380/

@rxin (Contributor) commented Mar 30, 2015

Thanks. I've merged this in master. Let's start replacing the direct use of actors with this API. Great work!

@asfgit asfgit closed this in a8d53af Mar 30, 2015
@zsxwing zsxwing deleted the rpc-part1 branch March 30, 2015 05:12
@zsxwing (Member, Author) commented Mar 30, 2015

@rxin @CodingCat @vanzin @aarondav Thanks for reviewing this PR.

asfgit pushed a commit that referenced this pull request Mar 31, 2015
…t does not require a reply

Hotfix for #4588

cc rxin

Author: zsxwing <zsxwing@gmail.com>

Closes #5283 from zsxwing/hotfix and squashes the following commits:

cf3e5a7 [zsxwing] Move StopCoordinator to the receive method since it does not require a reply
guavuslabs-builder pushed a commit to ThalesGroup/spark that referenced this pull request Apr 16, 2015
Fixed my mistake in apache#4588

Author: zsxwing <zsxwing@gmail.com>

Closes apache#5529 from zsxwing/SPARK-6934 and squashes the following commits:

9890b2d [zsxwing] Use 'spark.akka.askTimeout' for the ask timeout
Labels: none yet
Projects: none yet
7 participants