Skip to content

Adding an RPC client server pair between client and AM

Anthony Hsu edited this page Apr 22, 2019 · 4 revisions

TonY relies on Hadoop RPC for communication between clients (TaskExecutor and TonyClient) and the ApplicationMaster. RPC is used to register tasks, fetch task information, and send metrics.

To get an idea of how Hadoop RPC works, check out this very good guide: https://wiki.apache.org/hadoop/HadoopRpc. There are 2 RPC engines Hadoop supports out-of-the-box: WritableRpcEngine (the default) and ProtobufRpcEngine. WritableRpcEngine is currently used for MetricsRpc, while ProtobufRpcEngine is used for TensorFlowCluster.

WritableRpcEngine requires that all method arguments and return types implement Hadoop's Writable interface. Implementations for all primitive types are provided for Hadoop. If you want to serialize objects, those objects need to implement Writable. This is fairly straightforward to do; you just need to implement write and readFields methods. See MetricWritable and MetricsWritable for some examples.

For ProtobufRpcEngine, the APIs are generally defined in a .proto file and then Java classes are generated by the protoc compiler. Protobuf requires that all methods accept a single argument. In order to pass multiple arguments, you need to wrap them in a single object.

In a secure cluster, where hadoop.security.authentication is non-simple and hadoop.security.authorization is true, you will also need to do some security setup for the RPC server and client. At a high-level, the server needs to generate a token for authentication and pass this to the client, which will use the token to securely connect. To plumb this through, this requires

  • setting a SecretManager on the RPC server
  • using the SecretManager to create a Token
  • adding the tokens to the client container environment
  • defining a PolicyProvider that defines what protocols are supported (e.g.: MetricsRpc)
  • defining a SecurityInfo that maps a given protocol (e.g.: MetricsRpc) to the token selector to use (e.g.: ClientToAMTokenSelector)

Currently, both the application RPC server and the metrics RPC server share the same SecretManager (ClientToAMTokenSecretManager), PolicyProvider (TonyPolicyProvider), and SecurityInfo (TonyClientAMSecurityInfo). To add a new secure RPC server, you can probably leverage these existing classes.

When developing an RPC server and client, it's helpful to enable debug logging. You can do this by updating the log4j.properties files or by adding

org.apache.log4j.Logger logger4j = org.apache.log4j.Logger.getRootLogger();
logger4j.setLevel(org.apache.log4j.Level.toLevel("DEBUG"));

to your code to override the log4j.properties settings.

Clone this wiki locally