New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix mypy errors #16726
Fix mypy errors #16726
Conversation
@@ -271,7 +271,7 @@ | |||
_FILE_CONTENT_TEMPLATE = """ | |||
# THIS FILE WAS AUTOMATICALLY GENERATED BY deprecated_modules.py | |||
import sys | |||
from . import {new_module_name} | |||
from . import {new_module_name} # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are often C extensions (which raise mypy type errors), and since this module is deprecated it doesn't matter anyway.
@@ -395,7 +395,7 @@ def score(self, X, y, sample_weight=None): | |||
X = np.zeros(shape=(len(y), 1)) | |||
return super().score(X, y, sample_weight) | |||
|
|||
@deprecated( | |||
@deprecated( # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally decorators of properties are not supported in mypy
HistGradientBoostingClassifier # type: ignore | ||
) | ||
ensemble.HistGradientBoostingRegressor = ( | ||
HistGradientBoostingRegressor # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Errors with a message that module doesn't include HistGradientBoostingRegressor
.
If we want to use type hints, we should also add some documentation for it similar to https://pandas.pydata.org/docs/development/contributing.html?highlight=mypy#type-hints. I would rather not do it in this PR however, so maybe meanwhile we should disable mypy in CI... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know why a bunch of the places in this PR need the type: ignore
directive. Should we put a comment before every single one, or should we put somewhere in maintaners.rst
or something and explain where those are needed?
azure-pipelines.yml
Outdated
- bash: conda create --name flake8_env --yes flake8 | ||
- bash: | | ||
conda create --name flake8_env --yes python=3.8 | ||
source activate flake8_env |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
source activate
is deprecated though.
sklearn/impute/__init__.py
Outdated
if typing.TYPE_CHECKING: | ||
# Workaround for type checkers (e.g. mypy) to avoid | ||
# import errors for experimenal estimators. | ||
# TODO: remove the above check once the estimator is no longer | ||
# experimental. | ||
from ._iterative import IterativeImputer # noqa | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add this this to where we explain how to have an experimental module? is it in maintainer.rst
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The section on experimental estimators, says to copy and adapt the code from HistGradientBoosting and/or Iterative regressor, which should be no need to more specific documentation, I think.
Thanks @adrinjalali !
I added a comment above each line where In the end I also added a sentence about workarounds needed for experimental modules to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but I do not understand the following:
pckg[1] for pckg in walk_packages( | ||
prefix='sklearn.', | ||
# mypy error: Module has no attribute "__path__" | ||
path=sklearn.__path__) # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this causing a typing error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a known mypy issue python/mypy#1422. For now the the workaround is to ignore it. I'll add an inline comment.
@@ -22,7 +22,8 @@ | |||
from ..decomposition import PCA | |||
from ..metrics.pairwise import pairwise_distances | |||
from . import _utils | |||
from . import _barnes_hut_tsne | |||
# mypy error: Module 'sklearn.manifold' has no attribute '_barnes_hut_tsne' | |||
from . import _barnes_hut_tsne # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is _barnes_hut_tsne
causing a typing error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is _barnes_hut_tsne causing a typing error?
There is some mypy issue with importing full C extension modules (but not individual objects from these modules). Couldn't find relevant information in the mypy issue tracker and opened python/mypy#8575
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of the above linked mypy issue is that's expected behavour,
Mypy doesn't actually import or interact with C extension code. It can only parse and gain type information from .py and .pyi files.
so it sounds like all C extensions need to be ignored for typing in some way, either explicitly with type: ignore
or via the --ignore-missing-import
option.
@WillAyd since you worked on this in pandas, what's the current approach for mypy and C extensions there? As far as I can tell pandas doesn't use neither of the above options, and yet I don't see any C extension related errors with mypy pandas
or errors about missing stabs for dependencies for that matter such as,
sklearn/cluster/_spectral.py:10: error: Skipping analyzing 'numpy': found module but no type hints or library stubs
that we currently skip in this PR using --ignore-missing-import
. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For better or worse we set ignore_missing_imports=True
in our config, so that this problem is "solved" globally rather than having to ignore individual imports
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, indeed, finally found pandas mypy config. Was previously grepping option names with "-" instead of "_". Thanks for confirming @WillAyd!
For the record, the above mentioned # type: ignore
is still necessary since this particular import doesn't get skipped by the global --ignore-missing-import
option as discussed in the above linked mypy issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @rth
This feels like it can be maintainable. If passing mypy becomes a significant burden for contributors, we can turn it off.
Thanks @thomasjpfan, I added a comment to the documentation about mypy workarounds as you suggested. Please let me know if you have other comments. Also cc @adrinjalali |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @rth
Merging with +2; also since I didn't hear much objections to this while it was mentioned in the dev meeting today. Please ping me if it mypy linting becomes too annoying in PRs. |
Also we should have a discussion in #16705 as to which parts of the code it might be good to add (moderate) type annotations first. |
Solves mypy errors and closes #12953
Solving these errors is a pre-requisite to consider adding some light type annotations #16705
The full error log on master can be found below (58 errors),
and generally errors are due to mypy not being able to determine type. That's happens very rarely as by default anything unknown is of type
Any
. In particular following type of errors are found,mypy is also added in CI. I don't have strong feeling about it. pandas does this. I would propose to try it and if it's too annoying for any reason in PRs disable it.
CC maybe @thomasjpfan @jnothman