HDDS-4440- s3 performance #2259

Closed
neils-dev wants to merge 14 commits into apache:HDDS-4440-s3-performance from neils-dev:HDDS-4440-s3-performance

neils-dev commented May 18, 2021

What changes were proposed in this pull request?

Initial commit for s3g gRPC transport of OmRequest and OmResponse messages. It:

  1. creates a gRPC service for the existing Om protocol (protoc 2.5): hadoop-ozone/interface-client/pom.xml, hadoop-ozone/interface-client/target/generated-sources/protobuf/java/org/apache/hadoop/ozone/protocol/proto/OzoneManagerServiceGrpc.java - HDDS-5210
  2. starts the OM gRPC server as part of OM bootstrap: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/GrpcOzoneManagerServer.java, OzoneManager.java - HDDS-5211
  3. creates an implementation of OmTransport for gRPC: hadoop-ozone/interface-client/pom.xml, GrpcOmTransport.java - HDDS-5212
  4. creates a specific OmTransportFactory for GrpcOmTransport: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/protocolPB/GrpcOmTransportFactory.java; the class is selected dynamically for s3g through the ServiceLoader provider file hadoop-ozone/s3gateway/src/main/resources/META-INF/services/org.apache.hadoop.ozone.om.protocolPB.OmTransportFactory - HDDS-5213
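
For readers unfamiliar with the META-INF/services mechanism in item 4, here is a minimal sketch of how a factory can be picked up at runtime with java.util.ServiceLoader. The type names (TransportFactory, DefaultTransportFactory, Transport) are hypothetical stand-ins and not the classes in this patch; the fallback to a default factory when no provider file is on the classpath is also an assumption made for illustration.

```java
import java.util.Iterator;
import java.util.ServiceLoader;

class TransportFactorySketch {

  /** Hypothetical stand-in for a transport to the Ozone Manager. */
  interface Transport {
    String name();
  }

  /** Hypothetical stand-in for a transport factory interface. */
  interface TransportFactory {
    Transport create();
  }

  /** Assumed fallback, used when no provider is registered. */
  static final class DefaultTransportFactory implements TransportFactory {
    @Override
    public Transport create() {
      return () -> "hadoop-rpc";
    }
  }

  /**
   * Returns the first factory registered through a META-INF/services
   * provider file, or the default factory when none is registered.
   */
  static TransportFactory load() {
    Iterator<TransportFactory> providers =
        ServiceLoader.load(TransportFactory.class).iterator();
    return providers.hasNext() ? providers.next() : new DefaultTransportFactory();
  }

  public static void main(String[] args) {
    // With no provider file on the classpath this uses the fallback.
    System.out.println(load().create().name());
  }
}
```

With this pattern, a module such as s3g can switch transports by shipping a provider file that names its own factory class, without any change to the calling code.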

The initial commit required Ratis to be disabled for OM (ozone.om.ratis.enable=false in ozone-site.xml); this has since been fixed/resolved. To try the patch, use the IntelliJ hadoop-ozone/dev-support/intellij/ozone-site.xml together with the IntelliJ s3g run configuration, and invoke s3g with simple bucket commands through the s3 cli.

In addition, for s3g gRPC OmRequest/OmResponse a single OmTransport connection is used between s3g and the ozone manager. A cached OzoneClient is implemented to service every s3g request.
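
The idea of one cached client serving every request can be sketched as follows. This is only an illustration under stated assumptions: SketchClient and ClientCache are hypothetical names, and the lazy double-checked initialization shown here is one possible approach, not necessarily how OzoneClientProducer does it in this patch.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class ClientCacheSketch {

  /** Hypothetical stand-in for a client that owns one transport connection. */
  static final class SketchClient {
    private final int id;

    SketchClient(int id) {
      this.id = id;
    }

    int id() {
      return id;
    }
  }

  /** Creates the client on first use and hands the same instance to every caller. */
  static final class ClientCache {
    private final Supplier<SketchClient> factory;
    private volatile SketchClient client;

    ClientCache(Supplier<SketchClient> factory) {
      this.factory = factory;
    }

    SketchClient get() {
      SketchClient local = client;
      if (local == null) {
        synchronized (this) {
          local = client;
          if (local == null) {
            // Only the first caller pays the connection setup cost.
            local = factory.get();
            client = local;
          }
        }
      }
      return local;
    }
  }

  public static void main(String[] args) {
    AtomicInteger created = new AtomicInteger();
    ClientCache cache =
        new ClientCache(() -> new SketchClient(created.incrementAndGet()));
    SketchClient first = cache.get();
    SketchClient second = cache.get();
    System.out.println("same instance: " + (first == second));
    System.out.println("clients created: " + created.get());
  }
}
```

The trade-off of a single shared client is that it must be safe for concurrent use by all request threads, and its lifecycle (close on shutdown, recreate on failure) has to be handled in one place.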

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-5210 - (see 1. previous section)
https://issues.apache.org/jira/browse/HDDS-5211 - (see 2. previous section)
https://issues.apache.org/jira/browse/HDDS-5212 - (see 3. previous section)
https://issues.apache.org/jira/browse/HDDS-5213 - (see 4. previous section)

How was this patch tested?

The patch is an initial POC for s3g gRPC OmRequest/OmResponse. It originally required Ratis to be disabled for OM: when the Ratis server is invoked through OzoneManagerRatisServer submitRequest, createWriteRaftClientRequest(omRequest) fails its precondition check because the Hadoop Server.Call is null, so propagating the OM SubmitRequest through the GrpcServer needed attention. (fixed/resolved)

To test, use the IntelliJ Ozone development cluster, started through the runConfigurations for SCM, OM, Recon, Datanodes and S3Gateway. The commit includes a modified ozone-site.xml (hadoop-ozone/dev-support/intellij/ozone-site.xml) that disables Ratis for OM (<property><name>ozone.om.ratis.enable</name><value>false</value></property>). (fixed/resolved) Run simple s3 file operations through the aws cli:
[image: end_to_end_s3Om_gRPC_]

To compare s3g gRPC performance against the existing s3g Hadoop RPC, a freon test, S3BucketGenerator.java, is implemented and used.
It was run through IntelliJ on a cluster consisting of SCM, OM, Recon, Datanodes and S3Gateway:

$ ozone freon s3bg -t 2 -n 5

[image: freon_s3bg]

neils-dev added 5 commits May 13, 2021 22:35
…rt, client grpc-netty channel, server grpc-netty service. Generates grpc s3OmGrpc server protobuf and client stubs. Changes made to pom files in hadoop-common and hadoop-client-interfaces.
…PC service def for Om protocol (subtask 1 hdds-4440 - OzoneManagerServiceGrpc.java); OM gRPC server started by OM (subtask 2 hdds-4440 - GrpcOzoneManagerServer.java, OzoneManager.java); implementation of OmTransport interface (subtask 3 hdds-4440 - GrpcOmTransport.java); implementation of OmTransportFactory for s3g gRPC (subtask 4 hdds-4440 - GrpcOmTransportFactory.java). - Necessary to disable Om ratis for this commit, set ozone.om.ratis.enable=false in ozone-site.xml; use intellij hadoop-ozone/dev-support/intellij/ozone-site.xml and intellij s3g, and invoke s3g with simple bucket commands through the s3 cli.
…ent requests - fixes reported limitation on disabled om ratis configuration. S3g gRPC NOW functioning with om ratis. Added maven pom file dependency to s3gateway to fix problem with runtime not finding grpc netty ChannelFactory when run in docker cluster mode. Minor cleanup to maven pom files.
… OmRequest/OmResponse. Created OzoneClient cache in OzoneClientProducer to have single OzoneClient (GrpcOmTransport connection) process all s3 requests.

elek commented May 25, 2021

Thanks for the patch, @neils-dev. It looks like it has a lot of unrelated changes. Is it possible that master was merged into your branch but not into the target branch?

yoowonsuk and others added 9 commits May 25, 2021 16:54
…ions 1.38.0 and 4.1.63.Final as defined in main ozone pom build properties. Also minor modifications to GrpcServer and OzoneManager for checkstyle related fixes.
…thentication. Includes client creating OzoneToken with s3 string2sign, aws id and signature, transport through grpc (modified OzoneManagerProtocol) with request thread context. On the server side, authenticating request with s3 signature comparing with secret key stored through the delegationTokenManager. Datanode delegation token handling to follow in next commit.
…request filter to create OzoneToken for authentication and client request thread context; added changes to Om server to handle s3g grpc with ACLs through OmClientRequest; changes made to handle delegationToken based s3 object operations (put, get, list, delete requests); unit test cases added for s3g filters and aws signature; changes made to pass all current s3g unit tests.
…cation header date validator. General clean up / fixes for attempting a green build for the feature branch.
…d ACCESS_DENIED to http responses from authentication (secret key missing and signature mismatch) errors. Also added support for error propagation back to client from server through gRPC OMResponse with proper failure status and error message. Simplified error handling from http servlet UgiFilter - interpreted as fatal error at this level; s3 request errors should be handled at the s3 endpoint command level and not at the http filter level.
@adoroszlai

@neils-dev Given that

  • HDDS-4440 is split into sub-tasks
  • we had a separate PR for the first sub-task
  • which had improvements over this one (the CSI test is failing here and was OK there)
  • the separate PR just got merged
  • we had a merge conflict even before that

I think we should close this mega PR and open a new one for the second sub-task.

@adoroszlai adoroszlai closed this Sep 16, 2021