Support Spark 3.3 and EMR 6.10 #113
  processors: ["cpu"]
  python: ["py39"]
- sm_version: "1.1"
+ sm_version: "1.0"
Why are we going down to 1.0?
As discussed offline, this represents the container's minor version. Hence, we reset it to 1.0 every time the major version is upgraded because of a Spark or Python version update.
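For readers following the thread, the versioning rule described in this reply could be sketched roughly as below. This is only an illustration: `next_sm_version` is a hypothetical helper that does not exist in this repository, and incrementing the minor part by one on ordinary releases is an assumption, since the thread only states that the value goes back to 1.0 on a Spark or Python upgrade.

```python
def next_sm_version(current: str, framework_upgraded: bool) -> str:
    """Hypothetical helper: compute the next sm_version value.

    framework_upgraded is True when the Spark or Python version changes,
    which the thread treats as a major version upgrade of the container.
    """
    if framework_upgraded:
        # Major upgrade: the container's minor version starts over at 1.0.
        return "1.0"
    # Assumption: otherwise only the last component is bumped.
    major, minor = current.split(".")
    return f"{major}.{int(minor) + 1}"


# Moving to Spark 3.3 resets the value, as in this PR.
print(next_sm_version("1.1", framework_upgraded=True))
# A release without a Spark or Python change would bump it instead.
print(next_sm_version("1.1", framework_upgraded=False))
```

Under these assumptions, the change from "1.1" to "1.0" in this PR is the reset branch, not a regression.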
@@ -0,0 +1,132 @@
+FROM 137112412989.dkr.ecr.us-west-2.amazonaws.com/amazonlinux:2
New file? I thought we already had one Dockerfile and would just need to update it. Why do we need a new Dockerfile?
Ah, I see. So for each major version upgrade we create a new folder (I missed the 3.3 folder earlier).
Yeah, that's right!
+    && yum install -y awscli bigtop-utils curl gcc gzip unzip zip gunzip tar wget liblapack* libblas* libopencv* libopenblas*
+
+# Install python 3.9
+ARG PYTHON_BASE_VERSION=3.9
Is this the latest version supported by the EMR release?
As discussed offline, we are retaining the current Python version (Python 3.9) as per the plan.
Description of changes:
Added support for Spark 3.3 and EMR 6.10
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.