Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-47978][CONNECT] Use v1 as spark connect go library starting version #19

Closed
wants to merge 1 commit into from

Conversation

hiboyang
Copy link
Contributor

@hiboyang hiboyang commented Apr 22, 2024

What changes were proposed in this pull request?

Use v1 for this Spark Connect Go Client library, instead of previously using v34.

Also regenerate protobuf files based on Spark latest 3.5 release (3.5.1).

Why are the changes needed?

There is a recent discussion in Spark community for Spark Operator version naming convention. People like to use version independent of Spark versions. That applies to Spark Connect Go Client as well. Thus change the version here to start from v1.

Does this PR introduce any user-facing change?

Yes, Spark Connect Go Client users will use v1 instead of use v34.

How was this patch tested?

Run unit test.

@viirya
Copy link
Member

viirya commented Apr 23, 2024

Does this Spark connect go library have any previous release yet?

@@ -13,7 +13,7 @@
// See the License for the specific language governing permissions and
// limitations under the License.

module github.com/apache/spark-connect-go/v34
module github.com/apache/spark-connect-go/v1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this entry point which defines the version for this go library?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this v34 is accumulated since this go library began. Is it the same as the release version? I'm not sure if it is okay to change it like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously we use v34 to make it align with Spark 3.4 at that time. It is not accumulated version number. Things will still work after changing v34 to v1.

@viirya
Copy link
Member

viirya commented Apr 23, 2024

Use v1 for this Spark Connect Go Client library, instead of previously using v34.

For Spart K8s operator, we begin with 0.1.0 as the first release version as discussion in mailing list.

For the v1 and v34 versions here, I'm not sure about if it is same case. Do you already have v34 release?

@hiboyang
Copy link
Contributor Author

hiboyang commented Apr 23, 2024

Does this Spark connect go library have any previous release yet?

No release previously. For GoLang repo, people can just import the repo directly. Thus we did not do "releasing" previously.

@hiboyang
Copy link
Contributor Author

Use v1 for this Spark Connect Go Client library, instead of previously using v34.

For Spart K8s operator, we begin with 0.1.0 as the first release version as discussion in mailing list.

For the v1 and v34 versions here, I'm not sure about if it is same case. Do you already have v34 release?

Hmm... feel there is some confusion with the meaning of v1 and v34 here. v34 is part of Go module name like github.com/apache/spark-connect-go/v34. It is not a release. Due to the recent discussion, we want to decouple the versioning of this project from Spark versioning. Thus modify the module name from github.com/apache/spark-connect-go/v34 to github.com/apache/spark-connect-go/v1.

@viirya
Copy link
Member

viirya commented Apr 23, 2024

Got it. I didn't know the actual meaning of v34 so I thought that you had v1, v2, ...v34 before.

I am also not sure why the version number is embedded into the Go module name. Is it a common practice seen on other Go modules too? So once this Go library has v2 release, you will go to increase v1 to v2 in the module name again?

I am not familiar with how Go module naming policy. I try to compare this with JVM package name and release version. For example, we don't have 3.4 in Spark JVM package name actually.

So I'm wondering if it makes sense to remove the version number v34 from this Go module name? 🤔

Of course it is commonly seen in other Go modules, then it is okay to rename it from v34 to v1.

It is also better to let other contributors of this Go library to take a look too.

@hiboyang
Copy link
Contributor Author

Yeah, Go module name normally put v2/v3/etc in the end to indicate major version bump, like module example.com/mymodule/v2, see example in the end of https://go.dev/doc/modules/version-numbers. Kind of different convention from JVM packages.

Comment on lines 25 to +26
"github.com/apache/arrow/go/v12/arrow/ipc"
proto "github.com/apache/spark-connect-go/v34/internal/generated"
proto "github.com/apache/spark-connect-go/v1/internal/generated"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, okay, I see. I saw the above arrow go libraries also have version number.

@@ -211,6 +211,7 @@ type Expression struct {
// *Expression_UpdateFields_
// *Expression_UnresolvedNamedLambdaVariable_
// *Expression_CommonInlineUserDefinedFunction
// *Expression_CallFunction
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is generated, but is it intentional change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is generated and better to keep the change so the generated code is fully compatible with the protobuf protocol.

Comment on lines +54 to +55
golang.org/x/sys v0.13.0 // indirect
golang.org/x/text v0.13.0 // indirect
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you also upgrade these libraries in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not update these intentionally. Seems caused by Go automatically. Looks not hurting (also may be better) to use these new versions.

Copy link
Member

@viirya viirya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification. Makes sense to me now.

viirya
viirya previously approved these changes Apr 24, 2024
Copy link
Member

@viirya viirya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know if there are other community members who frequently look at this Go library? It would be great to have another look.

Otherwise it looks good to me.

@viirya
Copy link
Member

viirya commented Apr 24, 2024

maybe cc @HyukjinKwon ?

@viirya
Copy link
Member

viirya commented Apr 24, 2024

Do we need to create JIRA ticket for this? @hiboyang

@viirya viirya dismissed their stale review April 24, 2024 18:19

Wait for JIRA ticket etc.

@hiboyang hiboyang changed the title [CONNECT] Use v1 as spark connect go library starting version [CONNECT] [SPARK-47842] Use v1 as spark connect go library starting version Apr 24, 2024
@hiboyang
Copy link
Contributor Author

Do we need to create JIRA ticket for this? @hiboyang

Created JIRA just now: https://issues.apache.org/jira/projects/SPARK/issues/SPARK-47842

@hiboyang
Copy link
Contributor Author

Thanks @viirya for the review!

@HyukjinKwon @grundprinzip do you get time to look at this PR?

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go ahead if it works for now.

Copy link
Member

@viirya viirya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @hiboyang

@viirya viirya changed the title [CONNECT] [SPARK-47842] Use v1 as spark connect go library starting version [CONNECT][SPARK-47842] Use v1 as spark connect go library starting version Apr 25, 2024
@viirya viirya changed the title [CONNECT][SPARK-47842] Use v1 as spark connect go library starting version [SPARK-47842][CONNECT] Use v1 as spark connect go library starting version Apr 25, 2024
@hiboyang
Copy link
Contributor Author

Thanks @HyukjinKwon @grundprinzip for reviewing! Would you help to merge the PR?

@HyukjinKwon
Copy link
Member

Merged to master.

@HyukjinKwon
Copy link
Member

@hiboyang is this https://issues.apache.org/jira/browse/SPARK-47842 correct? you should create a new JIRA, and link it to PR title.

@hiboyang hiboyang changed the title [SPARK-47842][CONNECT] Use v1 as spark connect go library starting version [SPARK-47978][CONNECT] Use v1 as spark connect go library starting version Apr 26, 2024
@hiboyang
Copy link
Contributor Author

@hiboyang is this https://issues.apache.org/jira/browse/SPARK-47842 correct? you should create a new JIRA, and link it to PR title.

Oops, should be this one: SPARK-47978. I updated this PR title with correct one.

Thanks for merging the PR!

@ptxmac
Copy link
Contributor

ptxmac commented Jul 3, 2024

Fun fact, you can't have v1 of go libraries 🫠
They must either have no version, or at least be version 2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants