Skip to content

Comments

[MINOR][PYTHON] Change TypeVar to private symbols#40338

Closed
MaicoTimmerman wants to merge 1 commit intoapache:masterfrom
MaicoTimmerman:master
Closed

[MINOR][PYTHON] Change TypeVar to private symbols#40338
MaicoTimmerman wants to merge 1 commit intoapache:masterfrom
MaicoTimmerman:master

Conversation

@MaicoTimmerman
Copy link
Contributor

What changes were proposed in this pull request?

I've convert internal typing symbols in __init__.py to private. When these changes are agreed upon, I can expand this MR with more TypeVar this applies to.

Why are the changes needed?

Editors consider all public symbols in a library importable. A common pattern is to use a shorthand for pyspark functions:

import pyspark.sql.functions as F
F.col(...)

Since pyspark.F is a valid symbol according to __init__.py, editors will suggest this to users, while it is not a valid use-case of pyspark.

This change is in line with Pyright's Typing Guidance for Python Libraries

Does this PR introduce any user-facing change?

No

How was this patch tested?

Verified with PyCharm auto-importing.

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MaicoTimmerman
Copy link
Contributor Author

Looks pretty good. Mind taking a look at https://github.com/apache/spark/pull/40338/checks?check_run_id=11851128079?

@HyukjinKwon I've rebased against the latest master, however it seems that workflows won't start for new first-time contributors not inside the organization. If I read the documentation [1][2] correctly, a maintainer will need to activate the workflow?

[1] https://docs.github.com/en/actions/managing-workflow-runs/approving-workflow-runs-from-public-forks
[2] https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/enabling-features-for-your-repository/managing-github-actions-settings-for-a-repository#controlling-changes-from-forks-to-workflows-in-public-repositories

@HyukjinKwon
Copy link
Member

Apache Spark has a custom implementation that leverages the build (and resources) from forked repository. Mind checking if the workload is enabled in your fork (https://github.com/MaicoTimmerman/spark/actions/workflows/build_and_test.yml), and rebase this?

@HyukjinKwon
Copy link
Member

Merged to master and branch-3.4.

@HyukjinKwon
Copy link
Member

cc @zero323 in case you have some feedback on this.

HyukjinKwon pushed a commit that referenced this pull request Mar 15, 2023
### What changes were proposed in this pull request?
I've convert internal typing symbols in `__init__.py` to private. When these changes are agreed upon, I can expand this MR with more `TypeVar` this applies to.

### Why are the changes needed?
Editors consider all public symbols in a library importable. A common pattern is to use a shorthand for pyspark functions:
```python
import pyspark.sql.functions as F
F.col(...)
```
Since `pyspark.F` is a valid symbol according to `__init__.py`, editors will suggest this to users, while it is not a valid use-case of pyspark.

This change is in line with Pyright's [Typing Guidance for Python Libraries](https://github.com/microsoft/pyright/blob/main/docs/typed-libraries.md#generic-classes-and-functions)

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Verified with PyCharm auto-importing.

Closes #40338 from MaicoTimmerman/master.

Authored-by: Maico Timmerman <maico.timmerman@gmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 6d3587a)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
@zero323
Copy link
Member

zero323 commented Mar 15, 2023

cc @zero323 in case you have some feedback on this.

@HyukjinKwon I am OK with that, though there is a bigger issue here. We have TypeVars in py modules in quite a few places, which are a side effect of migrating to inline hints before dropping Python 3.8 support (looking at you, Protocol...).

Ideally, we'd handle all of that consistently and use consistent naming convention, but that's not something will be able to do any time soon.

snmvaughan pushed a commit to snmvaughan/spark that referenced this pull request Jun 20, 2023
### What changes were proposed in this pull request?
I've convert internal typing symbols in `__init__.py` to private. When these changes are agreed upon, I can expand this MR with more `TypeVar` this applies to.

### Why are the changes needed?
Editors consider all public symbols in a library importable. A common pattern is to use a shorthand for pyspark functions:
```python
import pyspark.sql.functions as F
F.col(...)
```
Since `pyspark.F` is a valid symbol according to `__init__.py`, editors will suggest this to users, while it is not a valid use-case of pyspark.

This change is in line with Pyright's [Typing Guidance for Python Libraries](https://github.com/microsoft/pyright/blob/main/docs/typed-libraries.md#generic-classes-and-functions)

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Verified with PyCharm auto-importing.

Closes apache#40338 from MaicoTimmerman/master.

Authored-by: Maico Timmerman <maico.timmerman@gmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 6d3587a)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants