
Operator 0.1.0 #2

Closed
wants to merge 15 commits

Conversation


@jiangzho jiangzho commented Apr 2, 2024

What changes were proposed in this pull request?

This PR proposes Spark Operator v1.0.0-alpha, which supports operating Spark applications on k8s.

Please also refer to the README & docs for design details.

For a quick start, and to test the operator with a sample application, use the getting-started guide at spark-operator-docs/getting_started.md.

Why are the changes needed?

The Spark on k8s Operator is a project that helps manage the Spark application lifecycle on k8s, complementing the native integration. For more details, please also refer to the Spark Operator SPIP doc.

Does this PR introduce any user-facing change?

No changes are introduced in the Spark API.

The Spark Operator extends the k8s API with SparkApplications.
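For illustration only, a SparkApplication custom resource could look roughly like the following sketch. The apiVersion, kind, and most field names mirror the example snippets quoted later in this thread; `mainClass` is a hypothetical field added here for readability, not a confirmed schema field:

```yaml
# Hypothetical sketch of a SparkApplication custom resource.
# apiVersion/kind and most field names follow the examples in this PR;
# mainClass is an illustrative assumption, not a confirmed schema field.
apiVersion: org.apache.spark/v1alpha1
kind: SparkApplication
metadata:
  name: spark-pi
spec:
  deploymentMode: CLUSTER_MODE
  restartPolicy: NEVER
  jars: local:///opt/spark/examples/jars/spark-examples.jar
  mainClass: org.apache.spark.examples.SparkPi
```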

How was this patch tested?

Unit test coverage is being actively added to this repository.

The getting-started guide provides an example for starting and testing this project locally or on a remote cluster.

Operator e2e tests will be added incrementally with CI integration.

Was this patch authored or co-authored using generative AI tooling?

no


mridulm commented Apr 2, 2024

Thanks for the contribution!

A 19k-line diff patch is unfortunately not going to be easy to review, or to result in good-quality reviews.
Can you please split this up into more manageable chunks to aid with that?

We can keep this PR as a reference to help drive the actual PR reviews.


jiangzho commented Apr 2, 2024

Thanks for the quick turnaround, @mridulm!

I understand the concern regarding its size. This commit is the result of an effort to break the original scope down as much as possible. Given that this is a new project, this PR includes several fundamental modules that are essential to make the operator functional. I'd also like to highlight that ~7k lines are auto-generated CRD yaml, which can be bypassed by reviewing its POJO (1k lines) instead. Another ~3k lines comprise documentation and examples, which may require less scrutiny. I appreciate your understanding and am open to discussing how we can streamline the review process together.

Comment on lines 1 to 16
{{/*Licensed to the Apache Software Foundation (ASF) under one*/}}
{{/*or more contributor license agreements. See the NOTICE file*/}}
{{/*distributed with this work for additional information*/}}
{{/*regarding copyright ownership. The ASF licenses this file*/}}
{{/*to you under the Apache License, Version 2.0 (the*/}}
{{/*"License"); you may not use this file except in compliance*/}}
{{/*with the License. You may obtain a copy of the License at*/}}

{{/* http://www.apache.org/licenses/LICENSE-2.0*/}}

{{/*Unless required by applicable law or agreed to in writing,*/}}
{{/*software distributed under the License is distributed on an*/}}
{{/*"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY*/}}
{{/*KIND, either express or implied. See the License for the*/}}
{{/*specific language governing permissions and limitations*/}}
{{/*under the License.*/}}

@zwangsheng zwangsheng Apr 3, 2024


Suggested change
{{/*Licensed to the Apache Software Foundation (ASF) under one*/}}
{{/*or more contributor license agreements. See the NOTICE file*/}}
{{/*distributed with this work for additional information*/}}
{{/*regarding copyright ownership. The ASF licenses this file*/}}
{{/*to you under the Apache License, Version 2.0 (the*/}}
{{/*"License"); you may not use this file except in compliance*/}}
{{/*with the License. You may obtain a copy of the License at*/}}
{{/* http://www.apache.org/licenses/LICENSE-2.0*/}}
{{/*Unless required by applicable law or agreed to in writing,*/}}
{{/*software distributed under the License is distributed on an*/}}
{{/*"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY*/}}
{{/*KIND, either express or implied. See the License for the*/}}
{{/*specific language governing permissions and limitations*/}}
{{/*under the License.*/}}
{{/*
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}

Maybe we can get the file to pass the license check by writing it this way.


BTW, it seems CI runs the license check with apache/skywalking-eyes; we should move the excludes to .github/.licenserc.yaml.

Co-authored-by: Qi Tan <qi.tan.jade@gmail.com>
runs-on: ubuntu-latest
strategy:
matrix:
java-version: [ 11, 17 ]
Member


We need to support Java 21 to be consistent with SPARK-43831.

Contributor Author


Updated the workflow for the Java version; can you please re-approve the workflow?

Member


approved.

.bzrignore
.hg/
.hgignore
.svn/
Member


Do we need .svn and .hg?

Contributor Author


We do not; it's a blanket ignore. Let me remove these unused entries. Thanks for the catch!

- v3_3_2
- v3_3_1
- v3_3_0
- v3_2_0
Member


Please follow the Apache Spark community release policy by removing all EOL versions.

Contributor Author


Removed the old versions except 3.4.1; once spark-docker releases 3.4.2, I'll replace that ASAP.

@@ -0,0 +1,6947 @@
# Generated by Fabric8 CRDGenerator, manual edits might get overwritten!
Member


I'm wondering if we can avoid having this by generating it at compile time.

Contributor Author


Yes, this can be achieved. It's added for:

  • reference
  • being packed & used by the helm chart

But this is not mandatory. We already have a gradle task to automatically generate this at compile time.

If it's preferred not to have the generated yaml checked in, I can remove it when implementing the chart publishing to the helm repository.
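As a sketch of what such compile-time generation could look like with Gradle and the Fabric8 CRDGenerator mentioned above (the artifact coordinates and version below are assumptions, not taken from this PR):

```groovy
// Hypothetical build.gradle fragment: let Fabric8's annotation processor
// emit the CRD yaml into the build output at compile time instead of
// checking the generated file in. Artifact/version are assumed.
dependencies {
    annotationProcessor 'io.fabric8:crd-generator-apt:6.12.1'
}
```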

Member


If it's easy (without much effort to set up external build tools, especially OS-coupled ones) to generate the constant code, I think we can remove it, just as we did for antlr and grpc in the Spark main repo.

Contributor Author


Should be straightforward; removed this in 39a74e1.

Docs at d448092.

apiVersion: org.apache.spark/v1alpha1
kind: SparkApplication
metadata:
name: spark-pi-341-212
Member


ditto.

Member


FYI, this is not the latest even in the Spark 3.4.x line, because we have 3.4.2 already. Let's use 3.5.1.

Contributor Author


Ack; the e2e-tests dir aims to include examples for all currently supported Spark versions. I started this effort with all versions that have an official image. I'll update the examples to remove old versions to avoid confusing people.


public enum JDKVersion {
Java11,
Java17
Member


Please add Java 21.
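A minimal sketch of the requested change, shown self-contained here (the original enum's surrounding class, if any, and any other members are omitted):

```java
// Sketch: the quoted enum extended with the Java 21 entry the reviewer requested.
enum JDKVersion {
    Java11,
    Java17,
    Java21
}
```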

restartPolicy: NEVER
deploymentMode: CLUSTER_MODE
driverArgs: []
jars: local:///opt/spark/examples/jars/spark-examples_2.12-3.4.1.jar
Member


Please use the latest one, 3.5.2, because this is getting_started.md.

Contributor Author


Thanks for catching this! Will update to 3.5.2 as soon as it's available.

| kubernetes.client.<ResourceName>.<Method> | Meter | Tracks the rate of HTTP requests for a combination of one Kubernetes resource and one HTTP method |
| kubernetes.client.<NamespaceName>.<ResourceName>.<Method> | Meter | Tracks the rate of HTTP requests for a combination of one namespace-scoped Kubernetes resource and one HTTP method |

## Forward Metrics to Prometheus
Member


Thank you for providing this section.
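As a side illustration of the forwarding described in the quoted section, a Prometheus scrape config might resemble the following sketch; the job name, service address, port, and metrics path here are all assumptions, not taken from the operator docs:

```yaml
# Hypothetical prometheus.yml fragment; every endpoint detail here is assumed.
scrape_configs:
  - job_name: spark-operator
    metrics_path: /prometheus
    static_configs:
      - targets: ['spark-operator.spark-system.svc:19090']
```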


dongjoon-hyun commented Apr 16, 2024

Yes, the Apache Iceberg community didn't include it. I'd prefer to exclude that jar, @aaruna and @pan3793.

dongjoon-hyun pushed a commit that referenced this pull request Apr 18, 2024
This is a breakdown from #2 : set up [gradle](https://gradle.org/) as the build-tool for operator

Closes #4 from jiangzho/gradle.

Authored-by: zhou-jiang <zhou_jiang@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@dongjoon-hyun
Member

Please rebase this PR, @jiangzho.

@jiangzho jiangzho changed the title Operator 1.0.0-alpha Operator 0.1.0 Apr 19, 2024
@jiangzho
Contributor Author

The rebase pick failed due to the schedule change, so I attempted a non-fast-forward merge commit instead. I may do a final squash to keep the history clean.

runs-on: ubuntu-latest
strategy:
matrix:
java-version: [ 11, 17, 21 ]
Member


Please remove Java 11 and make a spin-off PR for this build_and_test.yml, @jiangzho .


# Packages #
############
*.7z
Member


For my understanding, where did this come from?

*.dmg
*.gz
*.iso
*.rar
Member


Ditto.


# Logs and databases #
######################
*.log
Member


This looks too broad.

additionalContainers: { }
# additionalContainers:
# - name: ""
# image: ""
Member


Please remove lines 80 ~ 82 because we have line 79.

watchGivenNamespacesOnly: false
data:
# - "spark-demo"
# - "spark-demo"
Member


May I ask why we have the same comment twice?

}
}

assert JavaVersion.current().isJava11Compatible(): "Java 11 or newer is required"
Member


Please drop Java 11.


-->

<module name="Checker">
Member

@dongjoon-hyun dongjoon-hyun Apr 19, 2024


Please make a spin-off PR for Static Analysis-related stuff, @jiangzho .

dongjoon-hyun pushed a commit that referenced this pull request Apr 23, 2024
### What changes were proposed in this pull request?

This is a breakdown PR from #2  - setting up common build Java tasks and corresponding plugins.

### Why are the changes needed?

This PR includes checkstyle, pmd, and spotbugs. It also includes jacoco for coverage analysis and spotless for formatting. These tasks can help enhance the quality of future Java contributions. They can also be referenced in CI tasks for automation.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Tested manually.

### Was this patch authored or co-authored using generative AI tooling?

no

Closes #6 from jiangzho/builder_task.

Authored-by: zhou-jiang <zhou_jiang@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@dongjoon-hyun
Member

Hi, All.

For the record, the following basic build and test PRs have been merged. We are moving to the API part.

To @jiangzho, please make a PR for the spark-operator-api part first.

Since Apache Spark 4.0.0-preview will start very soon, please use the dependencies matching the master branch instead of the Apache Spark 3.5.1 dependencies. For example, the K8s client library should be 6.12.1 instead of 6.7.1.

@dongjoon-hyun
Member

The Java API module is merged.

To @jiangzho, please make a PR for spark-operator and whatever you think is tightly related to it.


dongjoon-hyun commented May 3, 2024

Hi, All.

For the record,

Thank you for driving this, @jiangzho .

dongjoon-hyun pushed a commit that referenced this pull request May 20, 2024
### What changes were proposed in this pull request?

This is a breakdown PR of #2  - adding a submission worker implementation for SparkApplication.

### Why are the changes needed?

Spark Operator needs a submission worker to convert its abstraction (the SparkApplication API) into k8s resource spec.
This is a light-weight implementation based on native k8s integration.

As of now, it's based on Spark 4.0.0-preview1, but it's expected to serve all Spark LTS versions. This is feasible because it aims to cover only the spec generation; the Spark core jars are still brought in by the application images. E2E tests will be set up with the operator later to ensure that.

Per the SPIP doc, in future operator version(s) we may add more submission worker implementations based on different Spark versions to become 100% version-agnostic, at the cost of having multiple workers on standby.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Added unit test coverage.

### Was this patch authored or co-authored using generative AI tooling?

no

Closes #10 from jiangzho/worker.

Authored-by: zhou-jiang <zhou_jiang@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@dongjoon-hyun
Member

For the record, the submission module was finally merged today with SPARK-48326 (TODO).

From the Spark repo side, @jiangzho also made the following improvements for the submission module.

Thank you again, @jiangzho. Shall we move forward to the next modules?


dongjoon-hyun commented Jul 29, 2024

@dongjoon-hyun
Member

Let me close this, because this PR is obsolete as of now, @jiangzho. :)
