[python-package] fix mypy errors in Dataset construction #6106

Merged
merged 4 commits into from Sep 22, 2023


jameslamb
Collaborator

Contributes to #3756.
Contributes to #3867.

Fixes the following errors from mypy:

basic.py:1922: error: Argument 1 to "__init_from_list_np2d" of "Dataset" has incompatible type "list[Any] | list[Sequence] | list[ndarray[Any, Any]]"; expected "list[ndarray[Any, Any]]"  [arg-type]
basic.py:1924: error: Argument 1 to "__init_from_seqs" of "Dataset" has incompatible type "list[Any] | list[Sequence] | list[ndarray[Any, Any]]"; expected "list[Sequence]"  [arg-type]
basic.py:2874: error: Argument 1 to "_yield_row_from_seqlist" of "Dataset" has incompatible type "list[Any] | list[Sequence] | list[ndarray[Any, Any]]"; expected "list[Sequence]"  [arg-type]

These errors all come from the fact that you can't use an expression like

if isinstance(l, list) and all(isinstance(x, np.ndarray) for x in l):
    # ...

to help mypy understand which member of a union like Union[List[np.ndarray], List[Sequence]] a particular code block is working with.

As described in https://mypy.readthedocs.io/en/stable/type_narrowing.html#typeguards-with-parameters, Python 3.10 added typing.TypeGuard (PEP 647), which mypy supports, to do exactly that. If you set up a function like f(l) -> typing.TypeGuard[List[np.ndarray]], that tells the type checker:

  • f() returns a bool
  • if f(l) returns True, then l must be a List[np.ndarray]
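
For illustration, here is a minimal sketch of what such guards could look like. The function names (is_list_of_numpy_arrays, is_list_of_sequences, count_rows) are hypothetical and chosen for this example; they are not necessarily the names or signatures used in lightgbm. The typing_extensions fallback is an assumption for running on Python versions older than 3.10.

```python
from collections.abc import Sequence as AbcSequence
from typing import Any, List, Sequence, Union

import numpy as np

try:
    from typing import TypeGuard  # available in Python 3.10+
except ImportError:  # assumption: typing_extensions is installed on older Pythons
    from typing_extensions import TypeGuard


def is_list_of_numpy_arrays(data: Any) -> TypeGuard[List[np.ndarray]]:
    """Hypothetical guard: True if `data` is a list containing only numpy arrays."""
    return isinstance(data, list) and all(isinstance(x, np.ndarray) for x in data)


def is_list_of_sequences(data: Any) -> TypeGuard[List[Sequence]]:
    """Hypothetical guard: True if `data` is a list containing only Sequence objects."""
    return isinstance(data, list) and all(isinstance(x, AbcSequence) for x in data)


def count_rows(data: Union[List[np.ndarray], List[Sequence]]) -> int:
    """Hypothetical dispatcher showing how each branch sees a narrowed type."""
    if is_list_of_numpy_arrays(data):
        # Type checkers treat `data` as List[np.ndarray] here, so `.shape` is accepted.
        return sum(arr.shape[0] for arr in data)
    if is_list_of_sequences(data):
        # Type checkers treat `data` as List[Sequence] here, so `len()` is accepted.
        return sum(len(seq) for seq in data)
    raise TypeError("expected a list of numpy arrays or a list of sequences")
```

At runtime the guards are ordinary functions returning bool; the narrowing only affects static analysis. Note that both guards return True for an empty list, because all() over no elements is True, so the order of the checks decides which branch an empty list takes.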

This PR proposes adding such guards to lightgbm. This will help type checkers (like mypy in this project's CI and others in users' IDEs) to catch errors deeper in the code paths involving lists of numpy arrays and lists of Sequence objects.

@jameslamb jameslamb changed the title from "WIP: [python-package] fix mypy errors in Dataset construction" to "[python-package] fix mypy errors in Dataset construction" Sep 19, 2023
@jameslamb jameslamb marked this pull request as ready for review September 19, 2023 23:09
Collaborator

@jmoralez jmoralez left a comment

Nice!

@jameslamb jameslamb merged commit 7c9a985 into master Sep 22, 2023
41 checks passed
@jameslamb jameslamb deleted the python/mypy-array-type-narrowing branch September 22, 2023 02:37

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 27, 2023