Operator 0.1.0 #2
Conversation
Thanks for the contribution! A 19k diff patch is unfortunately not going to be easy to review, or result in good quality reviews. We can keep this PR as a reference to help drive the actual PR reviews.
Force-pushed from e347a16 to 7fc3ef4.
Thanks for the quick turnaround, @mridulm! I understand the concern regarding its size. This commit is the result of an effort to break down the original scope as much as possible. Given that this is a new project, this PR includes several fundamental modules that are essential to make the operator functional. I'd also like to highlight that ~7k lines are auto-generated CRD yaml, which can be skipped in favor of reviewing its POJO source (~1k lines). Another ~3k lines comprise documentation and examples, which may require less scrutiny. I appreciate your understanding and am open to discussing how we can streamline the review process together.
{{/*Licensed to the Apache Software Foundation (ASF) under one*/}}
{{/*or more contributor license agreements. See the NOTICE file*/}}
{{/*distributed with this work for additional information*/}}
{{/*regarding copyright ownership. The ASF licenses this file*/}}
{{/*to you under the Apache License, Version 2.0 (the*/}}
{{/*"License"); you may not use this file except in compliance*/}}
{{/*with the License. You may obtain a copy of the License at*/}}
{{/* http://www.apache.org/licenses/LICENSE-2.0*/}}
{{/*Unless required by applicable law or agreed to in writing,*/}}
{{/*software distributed under the License is distributed on an*/}}
{{/*"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY*/}}
{{/*KIND, either express or implied. See the License for the*/}}
{{/*specific language governing permissions and limitations*/}}
{{/*under the License.*/}}
Suggested change:

{{/*
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
Maybe we can get the file to pass the license check by writing it this way.
BTW, it seems CI runs the license check with apache/skywalking-eyes; we should move the excludes to .github/.licenserc.yaml.
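For context, apache/skywalking-eyes reads its settings from `.licenserc.yaml`; a minimal sketch of what the excludes might look like there, assuming the standard license-eye config layout (the ignored paths below are placeholders, not taken from this PR):

```yaml
header:
  license:
    spdx-id: Apache-2.0
    copyright-owner: Apache Software Foundation
  # placeholder excludes for generated or non-source files; adjust to this repo's layout
  paths-ignore:
    - '**/*.md'
    - 'gradlew*'
    - '**/generated/**'
  comment: on-failure
```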
Co-authored-by: Qi Tan <qi.tan.jade@gmail.com>
Force-pushed from 7fc3ef4 to 233c145.
.github/workflows/build_and_test.yml (outdated)
runs-on: ubuntu-latest
strategy:
  matrix:
    java-version: [ 11, 17 ]
We need to support Java 21 to be consistent with SPARK-43831.
Updated the workflow for the Java version - can you please re-approve the workflow?
approved.
.bzrignore
.hg/
.hgignore
.svn/
Do we need .svn and .hg?
We do not - it's a blanket ignore. Let me remove the unused entries. Thanks for the catch!
- v3_3_2
- v3_3_1
- v3_3_0
- v3_2_0
Please follow the Apache Spark community release policy by removing all EOL versions.
Removed old versions except 3.4.1 - once spark-docker releases 3.4.2, I'll replace it ASAP.
@@ -0,0 +1,6947 @@
# Generated by Fabric8 CRDGenerator, manual edits might get overwritten! |
I'm wondering if we can avoid checking this in by generating it at compile time.
Yes, this can be achieved. It's checked in for:
- reference
- packaging and use by the Helm chart

But this is not mandatory. We already have a Gradle task to generate it automatically at compile time. If it's preferred not to have the generated YAML checked in, I can remove it when implementing chart publishing to the Helm repository.
If it's easy to generate the constant code (without much effort to set up external build tools, especially OS-coupled ones), I think we can remove it, just like we did for ANTLR and gRPC in the Spark main repo.
apiVersion: org.apache.spark/v1alpha1
kind: SparkApplication
metadata:
  name: spark-pi-341-212
ditto.
FYI, this is not the latest even in the Spark 3.4.x line because we already have 3.4.2. Let's use 3.5.1.
Ack - the e2e-tests dir aims to include examples for all currently supported Spark versions. I started this effort with all versions that have an official image. I'll update the examples to remove old versions to avoid confusing people.
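For illustration, a rough sketch of what an updated e2e example might look like on Spark 3.5.1. It reuses only field names that appear in snippets quoted elsewhere in this review; the `spec:` nesting, the metadata name, and the `sparkConf` image key are assumptions for illustration, not confirmed by this PR:

```yaml
apiVersion: org.apache.spark/v1alpha1
kind: SparkApplication
metadata:
  name: spark-pi-351-212          # hypothetical name following the naming pattern above
spec:
  restartPolicy: NEVER
  deploymentMode: CLUSTER_MODE
  driverArgs: []
  jars: local:///opt/spark/examples/jars/spark-examples_2.12-3.5.1.jar
  sparkConf:
    # assumption: the operator passes sparkConf through; the key itself is a standard Spark config
    spark.kubernetes.container.image: "apache/spark:3.5.1"
```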
spark-operator-api/src/main/java/org/apache/spark/kubernetes/operator/Constants.java (outdated, resolved)
spark-operator-api/src/main/java/org/apache/spark/kubernetes/operator/SparkApplication.java (resolved)
spark-operator-api/src/main/java/org/apache/spark/kubernetes/operator/spec/BaseSpec.java (outdated, resolved)
public enum JDKVersion {
    Java11,
    Java17
Please add Java 21.
spark-operator-api/src/main/java/org/apache/spark/kubernetes/operator/spec/SparkVersion.java (outdated, resolved)
restartPolicy: NEVER
deploymentMode: CLUSTER_MODE
driverArgs: []
jars: local:///opt/spark/examples/jars/spark-examples_2.12-3.4.1.jar
Please use the latest one, 3.5.2, because this is getting_started.md.
Thanks for catching this! Will update to 3.5.2 as soon as it's available.
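For reference, once a 3.5.2 image ships, the getting_started.md fragment would presumably only need the jar path bumped; a sketch under the assumption that the 3.5.x examples jar keeps the same layout and Scala 2.12 suffix:

```yaml
restartPolicy: NEVER
deploymentMode: CLUSTER_MODE
driverArgs: []
# assumes the examples jar keeps the same path and _2.12 suffix in 3.5.2
jars: local:///opt/spark/examples/jars/spark-examples_2.12-3.5.2.jar
```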
| kubernetes.client.<ResourceName>.<Method> | Meter | Tracks the rate of HTTP requests for a combination of one Kubernetes resource and one HTTP method |
| kubernetes.client.<NamespaceName>.<ResourceName>.<Method> | Meter | Tracks the rate of HTTP requests for a combination of one namespace-scoped Kubernetes resource and one HTTP method |

## Forward Metrics to Prometheus
Thank you for providing this section.
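To make the forwarding concrete, here is a minimal sketch of a Prometheus scrape job pointed at the operator's metrics endpoint; the service name, port, and metrics path are assumptions for illustration and would come from the operator's actual metrics configuration:

```yaml
# prometheus.yml (sketch): scrape the operator's metrics endpoint
scrape_configs:
  - job_name: "spark-operator"
    metrics_path: /prometheus        # assumed path
    static_configs:
      - targets: ["spark-operator.spark-operator.svc:19090"]   # assumed service/port
```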
This is a breakdown from #2: set up [gradle](https://gradle.org/) as the build tool for the operator.
Closes #4 from jiangzho/gradle.
Authored-by: zhou-jiang <zhou_jiang@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Please rebase this PR, @jiangzho.
The rebase pick failed due to the schedule change - I attempted a non-fast-forward merge commit instead. I may do a final squash to make the history clear.
runs-on: ubuntu-latest
strategy:
  matrix:
    java-version: [ 11, 17, 21 ]
Please remove Java 11 and make a spin-off PR for this build_and_test.yml, @jiangzho.
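Based on the matrix quoted above, the entry after dropping Java 11 would presumably look like this (sketch only; the rest of the job definition is unchanged):

```yaml
runs-on: ubuntu-latest
strategy:
  matrix:
    # Java 11 dropped per review; 17 and 21 align with SPARK-43831
    java-version: [ 17, 21 ]
```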
# Packages #
############
*.7z
For my understanding, where did this come from?
*.dmg
*.gz
*.iso
*.rar
Ditto.
# Logs and databases #
######################
*.log
This looks too broad.
additionalContainers: { }
# additionalContainers:
#   - name: ""
#     image: ""
Please remove lines 80 ~ 82 because we already have line 79.
watchGivenNamespacesOnly: false
data:
  # - "spark-demo"
  # - "spark-demo"
May I ask why we have the same comment twice?
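Presumably the duplicated line was meant to illustrate listing more than one namespace; a sketch of what the commented example might have intended (the second namespace name is hypothetical):

```yaml
watchGivenNamespacesOnly: true
data:
  - "spark-demo"
  - "spark-test"   # hypothetical second namespace, shown only to illustrate the list form
```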
}
}

assert JavaVersion.current().isJava11Compatible(): "Java 11 or newer is required"
Please drop Java 11.
-->

<module name="Checker">
Please make a spin-off PR for the Static Analysis-related stuff, @jiangzho.
### What changes were proposed in this pull request?
This is a breakdown PR from #2 - setting up common Java build tasks and corresponding plugins.
### Why are the changes needed?
This PR includes checkstyle, pmd, and spotbugs. It also includes jacoco for coverage analysis and spotless for formatting. These tasks help enhance the quality of future Java contributions and can also be referenced in CI tasks for automation.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Tested manually.
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #6 from jiangzho/builder_task.
Authored-by: zhou-jiang <zhou_jiang@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Hi, All. For the record, the following basic build and test PRs are merged. We are moving on to the API part.
To @jiangzho, please make a PR for [...]. Since Apache Spark 4.0.0-preview will start very soon, please use the matched dependency of [...].
To @jiangzho, please make a PR for [...].
Hi, All. For the record,
Thank you for driving this, @jiangzho.
### What changes were proposed in this pull request?
This is a breakdown PR of #2 - adding a submission worker implementation for SparkApplication.
### Why are the changes needed?
The Spark Operator needs a submission worker to convert its abstraction (the SparkApplication API) into a k8s resource spec. This is a lightweight implementation based on the native k8s integration. As of now, it's based on Spark 4.0.0-preview1, but it's expected to serve all Spark LTS versions. This is feasible because it covers only spec generation; Spark core jars are still brought in by the application images. E2E tests will later be set up with the operator to ensure that. Per the SPIP doc, future operator versions may add more submission worker implementations based on different Spark versions to become 100% version-agnostic, at the cost of keeping multiple workers on standby.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Added unit test coverage.
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #10 from jiangzho/worker.
Authored-by: zhou-jiang <zhou_jiang@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
For the record, from the Spark repo side, @jiangzho also made the following improvements additionally for [...]
Thank you again, @jiangzho. Shall we move forward to the next modules?
Let me close this because this PR is obsolete as of now, @jiangzho. :)
What changes were proposed in this pull request?
This PR proposes Spark Operator v1.0.0-alpha, which supports operating Spark Applications on k8s.
Please also refer to the README & docs for design details.
To quickly start and test the operator with a sample application, use the getting started guide at spark-operator-docs/getting_started.md.
Why are the changes needed?
Spark on k8s Operator is a project that helps to effectively manage the Spark application lifecycle on k8s, in addition to the native integration. For more details, please also refer to the Spark Operator SPIP doc.
Does this PR introduce any user-facing change?
No changes are introduced in the Spark API.
The Spark Operator extends the k8s API with SparkApplications.
How was this patch tested?
Unit test coverage is being actively added to this repository.
The getting started guide provides an example to start and test this project locally or on a remote cluster.
Operator e2e tests will be added incrementally with CI integration.
Was this patch authored or co-authored using generative AI tooling?
no