[BEAM-7746] Minor typing updates / fixes #10822

Merged
udim merged 7 commits into apache:master from chadrik:python-static-typing-misc
Feb 28, 2020

@chadrik
Contributor

@chadrik chadrik commented Feb 10, 2020

This PR corrects a few problems that have occurred over the past few weeks:

  • bad annotations added in PRs
  • mangling of type ignore comments by yapf (fixed up the issues and added an option to the config to prevent it in the future)

Plus it adds some more type annotations. Nothing too radical or dangerous here.


See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.

@chadrik
Contributor Author

chadrik commented Feb 11, 2020

R: @udim

@chadrik chadrik force-pushed the python-static-typing-misc branch from df9a769 to 35258c7 Compare February 12, 2020 04:26
@iemejia iemejia added the python label Feb 12, 2020
@chadrik
Contributor Author

chadrik commented Feb 19, 2020

R: @robertwb

@chadrik chadrik force-pushed the python-static-typing-misc branch from 35258c7 to 9d5ed3e Compare February 22, 2020 19:29
@chadrik chadrik force-pushed the python-static-typing-misc branch 3 times, most recently from e112249 to 19ba9d8 Compare February 25, 2020 17:20
Contributor

@robertwb robertwb left a comment

Just some minor comments, and it looks like there's some lint to fix, but other than that looks good to go.

Contributor

Why this change? (And below.)

Contributor Author

hmmm... well, at the time that I made this change I think it resolved an error, but either I am mistaken or something changed in the module. I'm rolling this back.

Contributor

Same.

Contributor

I prefer the previous format, where there was a single assignment rather than a re-assignment. Is there any benefit to typing this?

Contributor Author

I completely agree. Unfortunately, this is a syntax error in mypy:

try:
  from apache_beam.runners.dataflow.internal import apiclient  # type: Optional[types.ModuleType]
except ImportError:
  apiclient = None

As is this:

try:
  import apache_beam.runners.dataflow.internal.apiclient as apiclient  # type: Optional[types.ModuleType]
except ImportError:
  apiclient = None

The reason is that type comments are (by design) not capable of doing anything that the new PEP 526 variable annotations are not. In other words, this is obviously wrong:

try:
  Optional[types.ModuleType]: import apache_beam.runners.dataflow.internal.apiclient as apiclient 
except ImportError:
  apiclient = None

In that context, this makes more sense:

apiclient: Optional[types.ModuleType] = None
try:
  from apache_beam.runners.dataflow.internal import apiclient
except ImportError:
  pass

Contributor

My question is why we need to type apiclient as types.ModuleType at all.

Contributor Author

@chadrik chadrik Feb 26, 2020

My question is why we need to type apiclient as types.ModuleType at all.

Fair question.

If we don't do this (i.e. as with the original code), we get the following error:

apache_beam/transforms/external_java.py:46: error: Incompatible types in assignment (expression has type "None", variable has type Module)  [assignment]

To resolve this, we need to mark apiclient as Optional.


Why can't we save ourselves some headache and leave out the types.ModuleType part?

apiclient = None  # type: Optional
try:
  from apache_beam.runners.dataflow.internal import apiclient
except ImportError:
  pass

If we do this, apiclient will become Optional[Any].


Why can't we ignore the error?

We can, but then mypy will mark the type as non-Optional, and that would remove the added protections that mypy provides against accidentally using the variable when it's None.
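
To illustrate the protection mentioned above, here is a minimal sketch. It uses the stdlib `json` module as a stand-in for the optional dependency, and `optional_mod` and `dump_job` are hypothetical names chosen for illustration (neither exists in Beam). With the variable typed as Optional, mypy with strict optional checking reports attribute access on a possibly-None value, so a guard like this one is needed:

```python
import types
from typing import Optional

# Declare the Optional type up front; the import below rebinds the name.
optional_mod = None  # type: Optional[types.ModuleType]
try:
  import json as optional_mod  # stand-in for an optional dependency
except ImportError:
  pass


def dump_job(payload):
  # type: (dict) -> str
  # Hypothetical helper: refuse to run when the dependency is absent.
  if optional_mod is None:
    raise ImportError('optional dependency is not installed')
  return optional_mod.dumps(payload, sort_keys=True)
```

Without the `is None` check, the call would fail at runtime with an AttributeError on None whenever the import is unavailable, which is the mistake the Optional type is meant to surface early.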


Why can't we just add the type comment on the original apiclient = None line?

try:
  from apache_beam.runners.dataflow.internal import apiclient
except ImportError:
  apiclient = None  # type: Optional[types.ModuleType]

With this, we get the following error:

apache_beam/transforms/external_java.py:46: error: Name 'apiclient' already defined (by an import)  [no-redef]

There is only one opportunity to override/influence the inferred type of a variable: on the first line where it is defined (think of typed variable declarations in C/C++, but with Python scoping rules). However, apiclient is defined via an import rather than an assignment, which forces us to preface the import with a variable definition.

Contributor Author

I did some more research on this, and I found this mypy issue: python/mypy#1297

It suggests this idiom:

try:
  from apache_beam.runners.dataflow.internal import apiclient as _apiclient
except ImportError:
  apiclient = None
else:
  apiclient = _apiclient

The import is a bit longer and uglier, but it has 2 advantages:

  • no need to import Optional or ModuleType
  • the idiom I was using was actually making apiclient a generic ModuleType, dropping all knowledge of the members of apache_beam.runners.dataflow.internal. That's bad!

The reason this works without an explicit Optional annotation is that mypy will automatically determine optionality in some cases, like this:

if some_conditional():
  x = None
else:
  x = 1
reveal_type(x)  # Revealed type is 'Union[builtins.int, None]'
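
A runnable sketch of the try/except/else idiom, exercising both branches at runtime. The stdlib `json` module stands in for an installed optional dependency, and `_hypothetical_missing_module` is a made-up name (an assumption for illustration) that forces the ImportError path. The `# type: ignore` is there only because mypy cannot resolve the made-up module:

```python
# Branch 1: the dependency is installed, so the else clause binds the module.
try:
  import json as _json
except ImportError:
  json_mod = None
else:
  json_mod = _json

# Branch 2: the dependency is missing, so the name is bound to None instead.
try:
  import _hypothetical_missing_module as _missing  # type: ignore
except ImportError:
  missing_mod = None
else:
  missing_mod = _missing
```

Because the public name is only ever bound by plain assignments, no import of Optional or ModuleType is needed, matching the first advantage listed above.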

@chadrik chadrik force-pushed the python-static-typing-misc branch 3 times, most recently from d16797b to 22b349b Compare February 26, 2020 19:19
@chadrik
Contributor Author

chadrik commented Feb 26, 2020

Down to 79 errors!

Btw, mypy may have revealed some legitimate errors in a recent change:

apache_beam/runners/worker/operations.py:837: error: "float" not callable  [operator]
apache_beam/runners/worker/operations.py:838: error: "float" not callable  [operator]
apache_beam/runners/worker/operations.py:839: error: "float" not callable  [operator]
apache_beam/runners/worker/operations.py:841: error: "float" not callable  [operator]
apache_beam/runners/worker/operations.py:842: error: "float" not callable  [operator]
apache_beam/runners/worker/operations.py:846: error: Module has no attribute "LATEST_DOUBLES_URN"; maybe "LATEST_DOUBLES_TYPE"?  [attr-defined]
apache_beam/runners/worker/operations.py:852: error: Module has no attribute "LATEST_DOUBLES_URN"; maybe "LATEST_DOUBLES_TYPE"?  [attr-defined]
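
The lines in operations.py that triggered these messages are not shown in this thread. Purely as an illustration of the class of mistake behind `"float" not callable`, here is a sketch with a hypothetical `Sampler` class (not Beam code) whose attribute holds a float but is called like a method:

```python
class Sampler(object):
  def __init__(self):
    # type: () -> None
    self.rate = 0.5  # a plain float attribute, not a method


sampler = Sampler()
try:
  # A type checker would be expected to flag this call, since `rate` is a float.
  value = sampler.rate()
except TypeError as exc:
  # At runtime the same mistake only surfaces when the line executes.
  message = str(exc)
```

The static check catches the problem without running the code path, which matters for branches that tests rarely reach.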

@chadrik
Contributor Author

chadrik commented Feb 26, 2020

Run Python2_PVR_Flink PreCommit

@chadrik chadrik force-pushed the python-static-typing-misc branch 3 times, most recently from 7852b4b to b6afeb2 Compare February 26, 2020 23:49
@chadrik
Contributor Author

chadrik commented Feb 27, 2020

Run Python PreCommit

@chadrik
Contributor Author

chadrik commented Feb 27, 2020

@robertwb I don't think the test failures are my fault because they were passing before I rebased onto master...

@udim
Member

udim commented Feb 27, 2020

LGTM, tests already failing here:
https://builds.apache.org/job/beam_PreCommit_Python_Cron/2443/

@chadrik chadrik force-pushed the python-static-typing-misc branch from b6afeb2 to 1379717 Compare February 27, 2020 07:02
@chadrik
Contributor Author

chadrik commented Feb 27, 2020

Run PythonLint PreCommit

@chadrik
Contributor Author

chadrik commented Feb 27, 2020

Do we think this is safe to merge? I've been watching master for something that looks like it could solve the current test problems, and rebasing periodically.

@chadrik chadrik force-pushed the python-static-typing-misc branch from f5f4193 to 114bbc4 Compare February 28, 2020 17:31
@chadrik
Contributor Author

chadrik commented Feb 28, 2020

@udim All tests passing now

@udim
Member

udim commented Feb 28, 2020

Thanks, Chad! merging

@udim udim merged commit 2f10df9 into apache:master Feb 28, 2020
@angoenka
Contributor

Thanks @chadrik @udim

Just a quick reminder that we should squash the fixup-related commits to keep the history clean, per the committer guide: https://beam.apache.org/contribute/committer-guide/#finishing-touches
It's OK to keep commits separate if they are unrelated changes.

@udim
Member

udim commented Mar 2, 2020

I want to leave some commits and squash others. There is no way to do that in the GH UI.
