Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-40816][CONNECT][PYTHON] Rename LogicalPlan.collect to LogicalPlan.to_proto #38279

Closed

Conversation

amaliujia
Copy link
Contributor

@amaliujia amaliujia commented Oct 17, 2022

What changes were proposed in this pull request?

The collect method in class LogicalPlan is really to generate connect proto plan. It's confusing to use collect which overlaps with collect in dataframe API that returns materialized data.

This PR proposes to rename this method to to_proto to match its implementation.

Why are the changes needed?

Improve codebase readability.

Does this PR introduce any user-facing change?

No

How was this patch tested?

UT

@amaliujia
Copy link
Contributor Author

R: @HyukjinKwon @zhengruifeng

@zhengruifeng
Copy link
Contributor

should it be a private method?

@HyukjinKwon
Copy link
Member

If this class isn't going to be documented as an API, it's fine.

@amaliujia
Copy link
Contributor Author

amaliujia commented Oct 17, 2022

If this class isn't going to be documented as an API, it's fine.

This class is not API class so sounds like it's ok.

only session, dataframe, client, readwriter (possibly catalog) will be API class I think.

Copy link
Contributor

@zhengruifeng zhengruifeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, that is fine

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@amaliujia
Copy link
Contributor Author

also happening on this PR

starting mypy annotations test...
annotations failed mypy checks:
python/pyspark/sql/connect/plan.py:335: error: unused "type: ignore" comment
python/pyspark/sql/connect/plan.py:337: error: unused "type: ignore" comment
python/pyspark/sql/connect/plan.py:341: error: unused "type: ignore" comment
python/pyspark/sql/connect/plan.py:352: error: unused "type: ignore" comment
python/pyspark/sql/connect/plan.py:356: error: unused "type: ignore" comment
python/pyspark/sql/connect/plan.py:358: error: unused "type: ignore" comment
Found 6 errors in 1 file (checked 366 source files)
1

@@ -80,9 +80,19 @@ def _verify(self, session: "RemoteSparkSession") -> bool:

return test_plan == plan

def collect(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's problably fix the error here together.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

errors should have been fixed.

HyukjinKwon pushed a commit that referenced this pull request Oct 18, 2022
### What changes were proposed in this pull request?

The mypy check in master seems to be broken:

<img width="676" alt="Screen Shot 2022-10-17 at 1 09 13 PM" src="https://user-images.githubusercontent.com/1938382/196273759-64372862-6caa-4a9e-b700-b665c7ff7e6c.png">

(see it on #38279 and #38275, also reproducible locally).

This PR propose to remove the relevant `type: ignore` comments.

### Why are the changes needed?

Fix python lint check.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

UT

Closes #38287 from amaliujia/test_python_lint.

Authored-by: Rui Wang <rui.wang@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
@HyukjinKwon
Copy link
Member

Merged to master.

SandishKumarHN pushed a commit to SandishKumarHN/spark that referenced this pull request Dec 12, 2022
### What changes were proposed in this pull request?

The mypy check in master seems to be broken:

<img width="676" alt="Screen Shot 2022-10-17 at 1 09 13 PM" src="https://user-images.githubusercontent.com/1938382/196273759-64372862-6caa-4a9e-b700-b665c7ff7e6c.png">

(see it on apache#38279 and apache#38275, also reproducible locally).

This PR propose to remove the relevant `type: ignore` comments.

### Why are the changes needed?

Fix python lint check.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

UT

Closes apache#38287 from amaliujia/test_python_lint.

Authored-by: Rui Wang <rui.wang@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
SandishKumarHN pushed a commit to SandishKumarHN/spark that referenced this pull request Dec 12, 2022
…lan.to_proto

### What changes were proposed in this pull request?

The `collect` method in `class LogicalPlan` is really to generate connect proto plan. It's confusing to use `collect` which overlaps with `collect` in dataframe API that returns materialized data.

This PR proposes to rename this method to `to_proto` to match its implementation.

### Why are the changes needed?

Improve codebase readability.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

UT

Closes apache#38279 from amaliujia/rename_logical_plan_collect2.

Authored-by: Rui Wang <rui.wang@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants