Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MyPy Cleanup Incompatible Types Errors #1245

Merged
merged 10 commits into from
Oct 28, 2022

Conversation

ryankarlos
Copy link
Contributor

@ryankarlos ryankarlos commented Oct 19, 2022

Signed-off-by: Ryan Nazareth ryankarlos@gmail.com

TL;DR

This PR fixes all the mypy Incompatible Types errors

Complete description

This PR fixes the following mypy errors thrown when running make lint | grep Incompatible

Screenshot 2022-10-19 at 21 01 38

Additonally, mypy complains that plugins/flytekit-kf-mpiinvalid package name - theinit.py` has been removed
in this PR as this should be a directory anyway based on other plugins structures.

Tracking Issue

Addresses part of issue flyteorg/flyte#1030

Follow-up issue

NA

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>
Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>
Copy link
Member

@pingsutw pingsutw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd feel like some places we shouldn't use ignore. we can use cast or update the type hint in the class or function. However, It's fine to use ignore after dataclass fields.

@@ -80,14 +80,14 @@ class ExecutionParameters(object):
@dataclass(init=False)
class Builder(object):
stats: taggable.TaggableStats
execution_date: datetime
logging: _logging.Logger
execution_date: typing.Optional[datetime]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add = None if the type is optional

Suggested change
execution_date: typing.Optional[datetime]
execution_date: typing.Optional[datetime] = None

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should i set them for all the attributes bound to dataclass which have optional type ? Also not sure where these are being used elsewhere (or redundant if they don't have defaults anyway) - also they are being set in __init__

    @dataclass(init=False)
    class Builder(object):
        stats: taggable.TaggableStats
        execution_date: typing.Optional[datetime]
        logging: Optional[_logging.Logger]
        execution_id: typing.Optional[_identifier.WorkflowExecutionIdentifier]
        attrs: typing.Dict[str, typing.Any]
        working_dir: typing.Optional[utils.AutoDeletingTempDir]
        checkpoint: typing.Optional[Checkpoint]
        decks: List[Deck]
        raw_output_prefix: Optional[str]
        task_id: typing.Optional[_identifier.Identifier]

        def __init__(self, current: typing.Optional[ExecutionParameters] = None):
            self.stats = current.stats if current else None
            self.execution_date = current.execution_date if current else None
            self.working_dir = current.working_directory if current else None
            self.execution_id = current.execution_id if current else None
            self.logging = current.logging if current else None
            self.checkpoint = current._checkpoint if current else None
            self.decks = current._decks if current else []
            self.attrs = current._attrs if current else {}
            self.raw_output_prefix = current.raw_output_prefix if current else None
            self.task_id = current.task_id if current else None

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, they are duplicates, I'm not sure if we have a specific reason to do so. Maybe keep it for now.

Should i set them for all the attributes bound to dataclass which have optional type

yup, Let's set the default value of all the optional variables to None in the dataclass.

flytekit/core/python_customized_container_task.py Outdated Show resolved Hide resolved
Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>
Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>
else:
mod = inspect.getmodule(f)
mod = inspect.getmodule(f) # type: ignore
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couldn't find an easy way to fix this
mypy error is expression has type "Optional[Module]", variable has type Module

I tried mod = inspect.getmodule(f) or None (and also to ensure that variable is None if inspect.getmodule(f) is None due to check afterwards) ....... however that didn't seem to work. Any ideas ?

Maybe im being silly but I actually don't understand why mypy is expecting mod to be Module - im guessing this is based on previous assignment to mod = importlib.import_module(f.instantiated_in) but should it matter if we are reassigning it anyway ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, using ignore is fine here

@ryankarlos
Copy link
Contributor Author

@pingsutw i've fixed the type hints in most places where previously had # type:ignore. Also, see my comments above.

else:
mod = inspect.getmodule(f)
mod = inspect.getmodule(f) # type: ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, using ignore is fine here

@@ -80,14 +80,14 @@ class ExecutionParameters(object):
@dataclass(init=False)
class Builder(object):
stats: taggable.TaggableStats
execution_date: datetime
logging: _logging.Logger
execution_date: typing.Optional[datetime]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, they are duplicates, I'm not sure if we have a specific reason to do so. Maybe keep it for now.

Should i set them for all the attributes bound to dataclass which have optional type

yup, Let's set the default value of all the optional variables to None in the dataclass.

@@ -326,12 +326,13 @@ class StructuredDatasetTransformerEngine(TypeTransformer[StructuredDataset]):
str: type_models.LiteralType(simple=type_models.SimpleType.STRING),
}

ENCODERS: Dict[Type, Dict[str, Dict[str, StructuredDatasetEncoder]]] = {}
DECODERS: Dict[Type, Dict[str, Dict[str, StructuredDatasetDecoder]]] = {}
ENCODERS: Dict[Type, Dict[str, Dict[str, Handlers]]] = {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like using Handlers is not correct here because the type of the class in the ENCODERS must be StructuredDatasetEncoder

Copy link
Contributor Author

@ryankarlos ryankarlos Oct 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes i agree . i only did this to stop mypy complaining, otherwise i get

flytekit/types/structured/structured_dataset.py:366: error: Incompatible types in assignment (expression has type "Dict[Type[Any], Dict[str, Dict[str, StructuredDatasetDecoder]]]", variable has type "Dict[Type[Any], Dict[str, Dict[str, StructuredDatasetEncoder]]]")

i think its because of assignment within the if block so mypy is expecting h to be top_level type nested dict key value to be Union[[cls.DECODERS, cls.ENCODERS]]

  @classmethod
  def _handler_finder(cls, h: Handlers, protocol: str) -> Dict[str, Any]:
      if isinstance(h, StructuredDatasetEncoder):
          top_level = cls.ENCODERS
      elif isinstance(h, StructuredDatasetDecoder):
          top_level = cls.DECODERS

Copy link
Contributor Author

@ryankarlos ryankarlos Oct 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh actually i managed to find a workaround and explcitly cast top_level variable before assignment. Ive reverted the signature for ENCODERS and DECODERS as they were previously

  @classmethod
  def _handler_finder(cls, h: Handlers, protocol: str) -> Dict[str, Handlers]:
      top_level = typing.cast(Dict[Type[Any], Dict[str, Dict[str, Union[cls.DECODERS, cls.ENCODERS]]]], {})
      if isinstance(h, StructuredDatasetEncoder):
          top_level = cls.ENCODERS
      elif isinstance(h, StructuredDatasetDecoder):
          top_level = cls.DECODERS

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mypy making things painful 😄

flytekit/types/structured/structured_dataset.py Outdated Show resolved Hide resolved
flytekit/types/structured/structured_dataset.py Outdated Show resolved Hide resolved
def to_python_value(self, ctx: FlyteContext, lv: Literal, expected_python_type: Type[T]) -> T:
...

@overload
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to add to_python_value? we have three to_python_value...

Copy link
Contributor Author

@ryankarlos ryankarlos Oct 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i was doing this function overloading to map types (couldn't find a better way) https://mypy.readthedocs.io/en/stable/more_types.html#function-overloading as mypy was complaining when I had union on both param and return value so it couldn't map the correct ones i.e.
expected_python_type: Type[T] -> T and expected_python_type: StructuredDataset -> StructuredDataset
However, i removed it now in latest changes and weirdly it seemed to be fine running locally now - so i'll wait for ci to see if it passes.

Don't know why mypy makes things so painful 😄

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>
@pingsutw
Copy link
Member

@ryankarlos do you mind if I push some commits to your branch?

@ryankarlos
Copy link
Contributor Author

@pingsutw yup go for it

Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Kevin Su <pingsutw@apache.org>
@codecov
Copy link

codecov bot commented Oct 27, 2022

Codecov Report

Merging #1245 (a1e3e05) into master (ab9aa65) will increase coverage by 0.02%.
The diff coverage is 53.03%.

@@            Coverage Diff             @@
##           master    #1245      +/-   ##
==========================================
+ Coverage   68.68%   68.70%   +0.02%     
==========================================
  Files         288      286       -2     
  Lines       26323    26326       +3     
  Branches     2939     2490     -449     
==========================================
+ Hits        18079    18088       +9     
+ Misses       7766     7758       -8     
- Partials      478      480       +2     
Impacted Files Coverage Δ
flytekit/core/base_task.py 46.88% <ø> (+0.19%) ⬆️
flytekit/core/context_manager.py 39.61% <0.00%> (ø)
flytekit/core/launch_plan.py 57.89% <0.00%> (ø)
flytekit/core/map_task.py 43.11% <0.00%> (ø)
flytekit/core/promise.py 51.95% <0.00%> (ø)
flytekit/core/python_function_task.py 52.63% <ø> (ø)
flytekit/types/directory/types.py 53.71% <0.00%> (-1.29%) ⬇️
flytekit/types/file/file.py 59.72% <0.00%> (ø)
flytekit/core/type_engine.py 58.65% <40.00%> (-0.74%) ⬇️
flytekit/core/python_auto_container.py 49.54% <42.85%> (-0.46%) ⬇️
... and 33 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Kevin Su <pingsutw@apache.org>
@samhita-alla
Copy link
Contributor

@pingsutw, can we merge this PR?

@pingsutw pingsutw merged commit 19eaf89 into flyteorg:master Oct 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants