Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-42901][CONNECT][PYTHON] Move StorageLevel into a separate file to avoid potential file recursively imports #40518

Closed
wants to merge 5 commits into from

Conversation

LuciferYang
Copy link
Contributor

@LuciferYang LuciferYang commented Mar 22, 2023

What changes were proposed in this pull request?

#40510 introduce message StorageLevel to base.proto, but if we try to import base.proto in catalog.proto to reuse StorageLevel in message CacheTable and run `build/sbt "connect-common/compile" to compile, there will be following message in compile log:

spark/connect/base.proto:23:1: File recursively imports itself: spark/connect/base.proto -> spark/connect/commands.proto -> spark/connect/relations.proto -> spark/connect/catalog.proto -> spark/connect/base.proto
spark/connect/catalog.proto:22:1: Import "spark/connect/base.proto" was not found or had errors.
spark/connect/catalog.proto:144:12: "spark.connect.DataType" seems to be defined in "spark/connect/types.proto", which is not imported by "spark/connect/catalog.proto".  To use it here, please add the necessary import.
spark/connect/catalog.proto:161:12: "spark.connect.DataType" seems to be defined in "spark/connect/types.proto", which is not imported by "spark/connect/catalog.proto".  To use it here, please add the necessary import.
spark/connect/relations.proto:25:1: Import "spark/connect/catalog.proto" was not found or had errors.
spark/connect/relations.proto:84:5: "Catalog" is not defined.
spark/connect/commands.proto:22:1: Import "spark/connect/relations.proto" was not found or had errors.
spark/connect/commands.proto:63:3: "Relation" is not defined.
spark/connect/commands.proto:81:3: "Relation" is not defined.
spark/connect/commands.proto:142:3: "Relation" is not defined.
spark/connect/base.proto:23:1: Import "spark/connect/commands.proto" was not found or had errors.
spark/connect/base.proto:25:1: Import "spark/connect/relations.proto" was not found or had errors.
....

So this pr move message StorageLevel to a separate file to avoid this potential file recursively imports.

Why are the changes needed?

To avoid potential file recursively imports.

Does this PR introduce any user-facing change?

No

How was this patch tested?

  • Pass GitHub Actions
  • Manual check:
    • Add import "spark/connect/common.proto"; to catalog.proto
    • run build/sbt "connect-common/compile"

No compilation logs related to File recursively imports itself .

@LuciferYang LuciferYang changed the title [SPARK-42889][CONNECT][FOLLOWUP] Move StorageLevel into a separate file to avoid potential file recursively imports [SPARK-42889][CONNECT][PYTHON][FOLLOWUP] Move StorageLevel into a separate file to avoid potential file recursively imports Mar 22, 2023
option java_package = "org.apache.spark.connect.proto";

// StorageLevel for persisting Datasets/Tables.
message StorageLevel {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -0,0 +1,37 @@
/*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we rename this to something else at a high level? e.g., common.proto or utils.proto. Otherwise, LGTM from my end.

cc @hvanhovell @grundprinzip @ueshin

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I also think there should be a high level name, let me change it to common.proto first. If others have better suggestions, I will rename it again

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that connect-gen-protos.sh won't automatically delete storage_level_pb2.py & storage_level_pb2.pyi, hahaha

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

common.proto sounds good to me.

@LuciferYang
Copy link
Contributor Author

rebase due to #40516 merged

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LuciferYang . This looks worthy of having a new JIRA. Please create a new JIRA for this PR and use it. This PR is a good contribution of yours. 😄

@ueshin
Copy link
Member

ueshin commented Mar 23, 2023

@LuciferYang nit: you need to update the PR description. There is an old file name storage_level.proto.

@LuciferYang LuciferYang changed the title [SPARK-42889][CONNECT][PYTHON][FOLLOWUP] Move StorageLevel into a separate file to avoid potential file recursively imports [SPARK-42901][CONNECT][PYTHON] Move StorageLevel into a separate file to avoid potential file recursively imports Mar 23, 2023
@LuciferYang LuciferYang changed the title [SPARK-42901][CONNECT][PYTHON] Move StorageLevel into a separate file to avoid potential file recursively imports [SPARK-42901][CONNECT][PYTHON] Move StorageLevel into a separate file to avoid potential file recursively imports Mar 23, 2023
@LuciferYang
Copy link
Contributor Author

@LuciferYang nit: you need to update the PR description. There is an old file name storage_level.proto.

Thanks ~ fixed

@LuciferYang
Copy link
Contributor Author

@LuciferYang . This looks worthy of having a new JIRA. Please create a new JIRA for this PR and use it. This PR is a good contribution of yours. 😄

@dongjoon-hyun Thanks for your suggestion ~ I created SPARK-42901 and updated the pr title 😄

@LuciferYang
Copy link
Contributor Author

GA passed ~

@HyukjinKwon
Copy link
Member

Merged to master and branch-3.4.

HyukjinKwon pushed a commit that referenced this pull request Mar 23, 2023
…le to avoid potential `file recursively imports`

#40510 introduce `message StorageLevel` to `base.proto`, but if we try to import `base.proto` in `catalog.proto` to reuse `StorageLevel` in `message CacheTable` and run `build/sbt "connect-common/compile" to compile, there will be following message in compile log:

```
spark/connect/base.proto:23:1: File recursively imports itself: spark/connect/base.proto -> spark/connect/commands.proto -> spark/connect/relations.proto -> spark/connect/catalog.proto -> spark/connect/base.proto
spark/connect/catalog.proto:22:1: Import "spark/connect/base.proto" was not found or had errors.
spark/connect/catalog.proto:144:12: "spark.connect.DataType" seems to be defined in "spark/connect/types.proto", which is not imported by "spark/connect/catalog.proto".  To use it here, please add the necessary import.
spark/connect/catalog.proto:161:12: "spark.connect.DataType" seems to be defined in "spark/connect/types.proto", which is not imported by "spark/connect/catalog.proto".  To use it here, please add the necessary import.
spark/connect/relations.proto:25:1: Import "spark/connect/catalog.proto" was not found or had errors.
spark/connect/relations.proto:84:5: "Catalog" is not defined.
spark/connect/commands.proto:22:1: Import "spark/connect/relations.proto" was not found or had errors.
spark/connect/commands.proto:63:3: "Relation" is not defined.
spark/connect/commands.proto:81:3: "Relation" is not defined.
spark/connect/commands.proto:142:3: "Relation" is not defined.
spark/connect/base.proto:23:1: Import "spark/connect/commands.proto" was not found or had errors.
spark/connect/base.proto:25:1: Import "spark/connect/relations.proto" was not found or had errors.
....
```

So this pr move `message StorageLevel` to a separate file to avoid this potential file recursively imports.

To avoid potential file recursively imports.

No

- Pass GitHub Actions
- Manual check:
   - Add `import "spark/connect/common.proto";` to `catalog.proto`
   - run `build/sbt "connect-common/compile"`

No compilation logs related to `File recursively imports itself` .

Closes #40518 from LuciferYang/SPARK-42889-FOLLOWUP.

Lead-authored-by: yangjie01 <yangjie01@baidu.com>
Co-authored-by: YangJie <yangjie01@baidu.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 88cc239)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
@LuciferYang
Copy link
Contributor Author

Thanks @HyukjinKwon @dongjoon-hyun @ueshin

@dongjoon-hyun
Copy link
Member

+1, late LGTM.

snmvaughan pushed a commit to snmvaughan/spark that referenced this pull request Jun 20, 2023
…le to avoid potential `file recursively imports`

apache#40510 introduce `message StorageLevel` to `base.proto`, but if we try to import `base.proto` in `catalog.proto` to reuse `StorageLevel` in `message CacheTable` and run `build/sbt "connect-common/compile" to compile, there will be following message in compile log:

```
spark/connect/base.proto:23:1: File recursively imports itself: spark/connect/base.proto -> spark/connect/commands.proto -> spark/connect/relations.proto -> spark/connect/catalog.proto -> spark/connect/base.proto
spark/connect/catalog.proto:22:1: Import "spark/connect/base.proto" was not found or had errors.
spark/connect/catalog.proto:144:12: "spark.connect.DataType" seems to be defined in "spark/connect/types.proto", which is not imported by "spark/connect/catalog.proto".  To use it here, please add the necessary import.
spark/connect/catalog.proto:161:12: "spark.connect.DataType" seems to be defined in "spark/connect/types.proto", which is not imported by "spark/connect/catalog.proto".  To use it here, please add the necessary import.
spark/connect/relations.proto:25:1: Import "spark/connect/catalog.proto" was not found or had errors.
spark/connect/relations.proto:84:5: "Catalog" is not defined.
spark/connect/commands.proto:22:1: Import "spark/connect/relations.proto" was not found or had errors.
spark/connect/commands.proto:63:3: "Relation" is not defined.
spark/connect/commands.proto:81:3: "Relation" is not defined.
spark/connect/commands.proto:142:3: "Relation" is not defined.
spark/connect/base.proto:23:1: Import "spark/connect/commands.proto" was not found or had errors.
spark/connect/base.proto:25:1: Import "spark/connect/relations.proto" was not found or had errors.
....
```

So this pr move `message StorageLevel` to a separate file to avoid this potential file recursively imports.

To avoid potential file recursively imports.

No

- Pass GitHub Actions
- Manual check:
   - Add `import "spark/connect/common.proto";` to `catalog.proto`
   - run `build/sbt "connect-common/compile"`

No compilation logs related to `File recursively imports itself` .

Closes apache#40518 from LuciferYang/SPARK-42889-FOLLOWUP.

Lead-authored-by: yangjie01 <yangjie01@baidu.com>
Co-authored-by: YangJie <yangjie01@baidu.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 88cc239)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants