Adds SigV4 capability #48

Merged
merged 31 commits into from Feb 8, 2023

harshavamsi
Collaborator

@harshavamsi harshavamsi commented Jan 11, 2023

Description

Adds SigV4 signing capabilities to Hive, Spark, MR, Storm, and Pig.

Issues Resolved

Closes #28

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@harshavamsi
Collaborator Author

@dblock @nknize I'd appreciate your help and input here.

I keep running into this issue when I build and do a dry run with Hive and SigV4. I can't understand why there is a ClassNotFoundException when the JAR is being included.

Exception in thread "main" java.lang.NoClassDefFoundError: software/amazon/awssdk/auth/credentials/AwsCredentialsProvider
        at org.opensearch.hadoop.rest.commonshttp.CommonsHttpTransport.awsSigV4SignRequest(CommonsHttpTransport.java:755)
        at org.opensearch.hadoop.rest.commonshttp.CommonsHttpTransport.execute(CommonsHttpTransport.java:715)
        at org.opensearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:126)
        at org.opensearch.hadoop.rest.RestClient.execute(RestClient.java:433)
        at org.opensearch.hadoop.rest.RestClient.execute(RestClient.java:429)
        at org.opensearch.hadoop.rest.RestClient.execute(RestClient.java:397)
        at org.opensearch.hadoop.rest.RestClient.mainInfo(RestClient.java:694)
        at org.opensearch.hadoop.rest.InitializationUtils.discoverClusterInfo(InitializationUtils.java:430)
        at org.opensearch.hadoop.hive.HiveUtils.init(HiveUtils.java:207)
        at org.opensearch.hadoop.hive.OpenSearchHiveInputFormat.getSplits(OpenSearchHiveInputFormat.java:122)
        at org.opensearch.hadoop.hive.OpenSearchHiveInputFormat.getSplits(OpenSearchHiveInputFormat.java:61)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.generateWrappedSplits(FetchOperator.java:425)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextSplits(FetchOperator.java:395)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:314)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:540)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:509)
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
        at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2691)
        at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Caused by: java.lang.ClassNotFoundException: software.amazon.awssdk.auth.credentials.AwsCredentialsProvider
        at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        ... 31 more
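
One way to narrow down an error like this is to ask the JVM directly whether the class is visible. The sketch below is a hypothetical diagnostic helper (`ClasspathProbe` is not part of opensearch-hadoop or the AWS SDK); it only reports whether a given class name can be located by the class loader that loaded the probe, so it is meaningful only when launched with the same classpath as the failing Hive session.

```java
// Hypothetical diagnostic helper: reports whether a class can be located on the current classpath.
final class ClasspathProbe {

    /** Returns true if the named class is visible to the class loader that loaded this probe. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the stack trace above; pass other names as arguments.
        String[] names = args.length > 0 ? args : new String[] {
            "software.amazon.awssdk.auth.credentials.AwsCredentialsProvider"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "MISSING"));
        }
    }
}
```

If the probe prints MISSING for a class, the JAR that provides it is not on the runtime classpath, even if it was available at compile time.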

@reta
Contributor

reta commented Jan 17, 2023

These dependencies need to be on the classpath along with the Hadoop client:

    implementation("software.amazon.awssdk:aws-core:${project.ext.awsSdkVersion}")
    implementation("software.amazon.awssdk:auth:${project.ext.awsSdkVersion}")
    implementation("software.amazon.awssdk:utils:${project.ext.awsSdkVersion}")

Alternatively, you could shade and relocate them, the same way you did in https://github.com/opensearch-project/opensearch-hadoop/pull/48/files#diff-b7b046db19a707ccd2cec27ba2fdcd1a33ab3219125d95c187459c9cc26888cc:

dependencies {
    ...
    shaded("software.amazon.awssdk:aws-core:${project.ext.awsSdkVersion}")
    shaded("software.amazon.awssdk:auth:${project.ext.awsSdkVersion}")
    shaded("software.amazon.awssdk:utils:${project.ext.awsSdkVersion}")
}

shadowJar {
    ...
    relocate 'software.amazon', 'org.opensearch.hadoop.thirdparty.software.amazon'
}

In this case, you don't need to alter the classpath.
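
To illustrate what the `relocate` line does to class names, here is a simplified model of the rename. It is not the Shadow plugin's implementation; `RelocationSketch` is a hypothetical name, and the pattern and destination are copied from the snippet above. The point is that relocated code refers to SDK classes by their new names, so any SDK class it references has to be present in the shaded JAR under that new name.

```java
// Simplified model of a package relocation; not the Shadow plugin's actual code.
final class RelocationSketch {

    // Pattern and destination taken from the `relocate` line in the comment above.
    static final String PATTERN = "software.amazon";
    static final String SHADED = "org.opensearch.hadoop.thirdparty.software.amazon";

    /** Rewrites a fully qualified class name if it lives under the relocated package. */
    static String relocate(String className) {
        // Assumption of this sketch: match only on a package boundary.
        boolean matches = className.equals(PATTERN) || className.startsWith(PATTERN + ".");
        return matches ? SHADED + className.substring(PATTERN.length()) : className;
    }

    public static void main(String[] args) {
        String[] samples = {
            "software.amazon.awssdk.auth.credentials.AwsCredentialsProvider",
            "org.apache.hadoop.util.RunJar"
        };
        for (String name : samples) {
            System.out.println(name + " -> " + relocate(name));
        }
    }
}
```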

@harshavamsi
Collaborator Author

@reta I updated the code to use the native signer with Aws4Signer, but now I get a build error that I cannot understand: https://github.com/opensearch-project/opensearch-hadoop/actions/runs/4018598709/jobs/6904413460#step:7:211. What am I doing wrong?

@reta
Contributor

reta commented Jan 26, 2023

@harshavamsi please add implementation("software.amazon.awssdk:sdk-core:${project.ext.awsSdkVersion}") to your dependencies.

@harshavamsi
Collaborator Author

Thanks! That fixed the problem. I updated the AWS SDK to use shaded jars, but I still get this error:

Exception in thread "main" java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: org/opensearch/hadoop/thirdparty/amazon/awssdk/profiles/ProfileFile

@reta
Contributor

reta commented Jan 26, 2023

You need to shade profiles as well (and maybe a few more modules):

shaded("software.amazon.awssdk:profiles:${project.ext.awsSdkVersion}")

@harshavamsi
Collaborator Author

Ah okay. I added

shaded("software.amazon.awssdk:profiles:${project.ext.awsSdkVersion}")
shaded("software.amazon.awssdk:protocols:${project.ext.awsSdkVersion}")

I still get

Exception in thread "main" java.lang.NoClassDefFoundError: org/opensearch/hadoop/thirdparty/amazon/awssdk/protocols/jsoncore/JsonNodeParser

even though I added protocols.

@harshavamsi
Collaborator Author

@reta I was able to fix this by including shaded("software.amazon.awssdk:json-utils:${project.ext.awsSdkVersion}").

But now it's org/opensearch/hadoop/thirdparty/amazon/awssdk/thirdparty/jackson/core/JsonFactory that's missing. I wonder if the dependencies are colliding?

@harshavamsi
Collaborator Author

https://github.com/aws/aws-sdk-java-v2/blob/303a375e8edbe117f8f5a34a554e1b3830def460/third-party/third-party-jackson-core/pom.xml: it looks like Jackson itself is a third-party shaded dependency.

@reta
Contributor

reta commented Jan 26, 2023

Hehe, you may need it as well: https://mvnrepository.com/artifact/software.amazon.awssdk/third-party-jackson-core
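
Rather than discovering missing modules one error at a time, it can help to search a directory of JARs for the class that the error names. The sketch below is a hypothetical helper (`JarClassFinder` is not part of any project mentioned here); it uses only the JDK's `java.util.zip.ZipFile` and looks at JARs directly inside one directory. The entry to search for is the class's original path, for example `software/amazon/awssdk/thirdparty/jackson/core/JsonFactory.class`, assuming that is the pre-relocation name of the class in the error above.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipFile;

// Hypothetical helper: finds which JARs in a directory contain a given class file.
final class JarClassFinder {

    /** Returns the JARs directly under dir that contain entryName, e.g. "a/b/C.class". */
    static List<Path> jarsContaining(Path dir, String entryName) throws IOException {
        List<Path> hits = new ArrayList<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(dir, "*.jar")) {
            for (Path jar : jars) {
                try (ZipFile zip = new ZipFile(jar.toFile())) {
                    if (zip.getEntry(entryName) != null) {
                        hits.add(jar);
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: JarClassFinder <dir-with-jars> <path/to/Some.class>");
            return;
        }
        for (Path jar : jarsContaining(Paths.get(args[0]), args[1])) {
            System.out.println(jar);
        }
    }
}
```

The name of the JAR that is printed usually points to the artifact that still needs to be added to the shaded dependencies.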

@harshavamsi
Collaborator Author

@reta I've tested that this works as expected with the managed service. This is now complete. I would gladly accept feedback.

@harshavamsi harshavamsi marked this pull request as ready for review January 27, 2023 17:11
@VachaShah VachaShah left a comment

Does this change work for both service names "es" and "aoss"?
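
For context on why the service name matters: in SigV4 the service name is one of the inputs to the signing key, so a request signed for "es" carries a different signature from one signed for "aoss". The sketch below shows only the standard key derivation using the JDK's `javax.crypto`; it is not the code in this PR (which uses the SDK's Aws4Signer), `SigV4KeySketch` is a hypothetical name, and the secret, date, and region are made-up placeholders.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Illustrative sketch of SigV4 signing-key derivation; not the PR's implementation.
final class SigV4KeySketch {

    static byte[] hmacSha256(byte[] key, String data) throws GeneralSecurityException {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        return mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
    }

    /** Derives the signing key; dateStamp is the UTC date as yyyyMMdd. */
    static byte[] signingKey(String secretKey, String dateStamp, String region, String service)
            throws GeneralSecurityException {
        byte[] kDate = hmacSha256(("AWS4" + secretKey).getBytes(StandardCharsets.UTF_8), dateStamp);
        byte[] kRegion = hmacSha256(kDate, region);
        byte[] kService = hmacSha256(kRegion, service); // the service name enters here
        return hmacSha256(kService, "aws4_request");
    }

    public static void main(String[] args) throws GeneralSecurityException {
        String secret = "EXAMPLE-SECRET-NOT-A-REAL-KEY"; // placeholder, not a real credential
        for (String service : new String[] {"es", "aoss"}) {
            byte[] key = signingKey(secret, "20230208", "us-east-1", service);
            StringBuilder hex = new StringBuilder();
            for (byte b : key) {
                hex.append(String.format("%02x", b));
            }
            System.out.println(service + " -> " + hex);
        }
    }
}
```

Because the two keys differ, the signer has to be configured with the service name that matches the target endpoint.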

Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
@harshavamsi
Collaborator Author

@penghuo @reta @VachaShah any more suggestions?

@reta
Contributor

reta commented Feb 7, 2023

@harshavamsi LGTM, thank you!

@VachaShah VachaShah left a comment

LGTM!

In a separate PR, let's also add the documentation for signing requests.

@harshavamsi harshavamsi merged commit 48c7a59 into opensearch-project:main Feb 8, 2023
Development

Successfully merging this pull request may close these issues.

[Feature Request] IAM Role Based Authentication for Spark to Elasticsearch