Adding YARNContainerFactory. This allows OpenWhisk to run actions on Apache Hadoop clusters. #4129
🤯 this is excellent!
For anyone wanting to test this out, here is a quick guide for enabling Docker on YARN: It was tested with HDP 3.0.1 from Hortonworks. Available here: The steps are similar with a base Apache Hadoop installation (version 3.1.1 or higher).
This looks generally straightforward to follow but there are some things to consider in the use of futures with synchronized and blocking operations. I'm curious to try this and will find some time to do that.
@rabbah Thanks for the feedback. I refactored the solution to use an Actor instead of synchronized blocks. The TravisCI build errored due to:
Nice! Thanks for improving the PR. I will go through the changes and restart Travis for you.
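To illustrate the concern about mixing futures with synchronized blocks, here is a minimal sketch of the general idea behind the actor refactoring: route every state change through one sequential worker so that no lock is needed. It uses a single-threaded `ExecutionContext` from the standard library as a stand-in for an actor's mailbox. `ContainerCounter` and `ContainerCounterDemo` are hypothetical names invented for this example and are not taken from the PR.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical bookkeeping class for illustration only; not the PR's code.
final class ContainerCounter {
  // One dedicated thread plays the role of an actor's mailbox: tasks submitted
  // here run one at a time, so `counts` is never touched concurrently.
  private val executor = Executors.newSingleThreadExecutor()
  private val serial: ExecutionContext = ExecutionContext.fromExecutor(executor)

  private var counts = Map.empty[String, Int]

  // Increment the count for an action kind and return the new value.
  def add(actionKind: String): Future[Int] = Future {
    val updated = counts.getOrElse(actionKind, 0) + 1
    counts = counts.updated(actionKind, updated)
    updated
  }(serial)

  // Reads also go through the serial context, so they see a consistent state.
  def current(actionKind: String): Future[Int] =
    Future(counts.getOrElse(actionKind, 0))(serial)

  def shutdown(): Unit = executor.shutdown()
}

object ContainerCounterDemo {
  // Fire `requests` increments from the global pool and return the final count.
  def run(requests: Int): Int = {
    import ExecutionContext.Implicits.global
    val counter = new ContainerCounter
    try {
      val increments = (1 to requests).map { _ =>
        // Hop onto the global pool first so submissions come from several threads.
        Future.unit.flatMap(_ => counter.add("nodejs"))
      }
      val total = Future.sequence(increments).flatMap(_ => counter.current("nodejs"))
      Await.result(total, 10.seconds)
    } finally counter.shutdown()
  }

  def main(args: Array[String]): Unit = println(run(1000))
}
```

Callers never block on a monitor; they receive a `Future` and the serial context orders the updates. An Akka actor gives the same ordering guarantee through its mailbox, with supervision and messaging on top.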
Codecov Report
@@            Coverage Diff             @@
##           master    #4129      +/-   ##
==========================================
- Coverage   85.75%    80.7%   -5.05%
==========================================
  Files         163      168       +5
  Lines        7594     7838     +244
  Branches      502      525      +23
==========================================
- Hits         6512     6326     -186
- Misses       1082     1512     +430
The test coverage is missing some of the error handling and all the Kerberos/SPNEGO code. Is it possible to test Kerberos/SPNEGO with TravisCI?
I don't know @SamHjelmfelt - not something I've tried. We do have a Jenkins job we can try once it's fully operational.
Here is a YARN sandbox that will simplify testing. It is a ~5 GB Docker image with YARN pre-installed and pre-configured. Run this container and configure the invoker for a YARN RM at localhost:8088, using yarnquickstart-sample-hotfix.ini instead of yarnquickstart-sample.ini. Additionally, the following project has this pull request pre-configured with a single-command quickstart. The relevant YARN configurations are commented out in the docker-whisk-controller.env file.
Sorry @SamHjelmfelt, I'm behind on this, but I'm going through the PRs and regaining velocity, so I'm hoping to get to it sooner rather than later.
…Apache Hadoop clusters. Rebased and squashed.
…decommissioning (YARN-8761)
Rebasing
Thanks for reviewing this code with me and showing me a demo. Minor nits but LGTM generally.
@rabbah I believe I addressed everything you requested, but I am happy to make further improvements.
@SamHjelmfelt this is a very neat addition! |
…Apache Hadoop clusters. (apache#4129)
Thousands of organizations have Apache Hadoop clusters today. By implementing a YARNContainerFactory, OpenWhisk can run actions on Hadoop clusters. This will lower the barrier to adoption and expand the potential use cases for OpenWhisk.
Description
The YARNContainerFactory uses the Apache Hadoop Services REST API to create a single YARN service with a component for each action type. Both simple authentication and Kerberos/SPNEGO are supported. This was tested with Apache Hadoop 3.1.1.
https://hadoop.apache.org/docs/r3.1.1/hadoop-yarn/hadoop-yarn-site/yarn-service/YarnServiceAPI.html
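As a rough illustration of "a single YARN service with a component for each action type", the sketch below builds a service definition as JSON using only the Scala standard library. The field names (`name`, `version`, `components`, `number_of_containers`, `artifact`, `resource`) and the two paths follow the YARN Services REST API linked above. The `Component` case class, the `YarnServiceSpec` object, and the example service and image names are assumptions made for this example, not the PR's actual types or values. A real definition would also need a launch command or entrypoint configuration, which is omitted here.

```scala
// Hypothetical model of one YARN service component per action kind.
// Illustrative only; not the data model used in the PR.
final case class Component(name: String, image: String, containers: Int, cpus: Int, memoryMb: Int)

object YarnServiceSpec {
  // Minimal JSON helpers so the example needs no external library.
  private def str(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""
  private def obj(fields: (String, String)*): String =
    fields.map { case (k, v) => str(k) + ":" + v }.mkString("{", ",", "}")
  private def arr(items: Seq[String]): String = items.mkString("[", ",", "]")

  // One component entry; the API expects `memory` as a string of megabytes.
  def component(c: Component): String = obj(
    "name" -> str(c.name),
    "number_of_containers" -> c.containers.toString,
    "artifact" -> obj("id" -> str(c.image), "type" -> str("DOCKER")),
    "resource" -> obj("cpus" -> c.cpus.toString, "memory" -> str(c.memoryMb.toString)))

  // The whole service: a name, a version, and the list of components.
  def service(name: String, version: String, components: Seq[Component]): String = obj(
    "name" -> str(name),
    "version" -> str(version),
    "components" -> arr(components.map(component)))

  // POST a definition here to create the service.
  val createPath: String = "/app/v1/services"

  // PUT {"number_of_containers": n} here to scale ("flex") one component.
  def flexPath(service: String, component: String): String =
    s"/app/v1/services/$service/components/$component"

  def main(args: Array[String]): Unit = {
    // Example values are made up for the demonstration.
    val nodejs = Component("nodejs", "openwhisk/action-nodejs-v10", 0, 1, 256)
    println(service("openwhisk-actions", "1.0.0", Seq(nodejs)))
    println(flexPath("openwhisk-actions", "nodejs"))
  }
}
```

Under this model, starting another container for an action type amounts to raising `number_of_containers` on the matching component rather than creating a new service.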
The implementation is based on the MesosContainerFactory.
This was first implemented using Akka HTTP, but was rewritten to use the Apache HTTP client in order to support SPNEGO.
There is a MockYARNRM in the tests directory. This mock RM simulates the YARN Resource Manager REST API and is used for the YARNContainerFactory tests.
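The idea of a mock Resource Manager can be sketched with the JDK's built-in `com.sun.net.httpserver` package. This is not the PR's MockYARNRM: `TinyMockRM` and `TinyMockRMDemo` are hypothetical names, and the sketch assumes only that service creation is a POST to `/app/v1/services` answered with HTTP 202, as described in the YARN Services API documentation. It records request bodies so a test can assert on what the client sent.

```scala
import com.sun.net.httpserver.{HttpExchange, HttpHandler, HttpServer}
import java.net.{HttpURLConnection, InetSocketAddress, URL}
import java.nio.charset.StandardCharsets.UTF_8
import scala.io.Source

// Hypothetical miniature mock; it only understands service creation.
final class TinyMockRM {
  // Port 0 asks the operating system for any free port.
  private val server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0)
  @volatile private var bodies = Vector.empty[String]

  server.createContext("/app/v1/services", new HttpHandler {
    override def handle(exchange: HttpExchange): Unit =
      try {
        if (exchange.getRequestMethod == "POST") {
          // Remember what the client sent, then accept with an empty body.
          val body = Source.fromInputStream(exchange.getRequestBody, "UTF-8").mkString
          bodies = bodies :+ body
          exchange.sendResponseHeaders(202, -1)
        } else exchange.sendResponseHeaders(405, -1)
      } finally exchange.close()
  })

  def start(): Unit = server.start()
  def stop(): Unit = server.stop(0)
  def port: Int = server.getAddress.getPort
  def received: Vector[String] = bodies
}

object TinyMockRMDemo {
  // POST a JSON document to the mock and return the HTTP status code.
  def post(port: Int, json: String): Int = {
    val conn = new URL(s"http://127.0.0.1:$port/app/v1/services")
      .openConnection().asInstanceOf[HttpURLConnection]
    try {
      conn.setRequestMethod("POST")
      conn.setDoOutput(true)
      conn.setRequestProperty("Content-Type", "application/json")
      val out = conn.getOutputStream
      try out.write(json.getBytes(UTF_8)) finally out.close()
      conn.getResponseCode
    } finally conn.disconnect()
  }

  // Start a mock, send one request, and report the status and recorded bodies.
  def roundTrip(json: String): (Int, Vector[String]) = {
    val rm = new TinyMockRM
    rm.start()
    try {
      val status = post(rm.port, json)
      (status, rm.received)
    } finally rm.stop()
  }

  def main(args: Array[String]): Unit = println(roundTrip("""{"name":"demo"}"""))
}
```

Binding to an ephemeral port keeps parallel test runs from colliding, and the factory under test only needs to be pointed at `http://127.0.0.1:<port>` instead of a real cluster.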
Checklist:
The website should be updated as well