Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix type mismatches in io_pyjags.py. #1498

Merged
merged 4 commits into from
Jan 20, 2021

Conversation

obi1kenobi
Copy link
Contributor

Description

Fix the following mypy type errors, without affecting any of the underlying functionality. Working toward #1496.

arviz/data/io_pyjags.py:32: error: "Mapping[str, Any]" has no attribute "copy"  [attr-defined]
arviz/data/io_pyjags.py:40: error: Item "Mapping[str, Any]" of "Optional[Mapping[str, Any]]" has no attribute "pop"  [union-attr]
arviz/data/io_pyjags.py:40: error: Item "None" of "Optional[Mapping[str, Any]]" has no attribute "pop"  [union-attr]

Checklist

  • Follows official PR format
  • Code style correct (follows pylint and black guidelines)

@codecov
Copy link

codecov bot commented Jan 14, 2021

Codecov Report

Merging #1498 (6491375) into master (14875f5) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #1498   +/-   ##
=======================================
  Coverage   91.76%   91.76%           
=======================================
  Files         105      105           
  Lines       11316    11322    +6     
=======================================
+ Hits        10384    10390    +6     
  Misses        932      932           
Impacted Files Coverage Δ
arviz/data/io_pyjags.py 94.00% <100.00%> (+0.38%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 14875f5...6491375. Read the comment docs.

Copy link
Member

@OriolAbril OriolAbril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the assertion errors, otherwise it looks good (assertion errors may too once I understand them)

@obi1kenobi
Copy link
Contributor Author

I don't understand the assertion errors, otherwise it looks good (assertion errors may too once I understand them)

Ah yes. Let me use the one about log_likelihood as an example, since all three checks are there for the same reason:

@requires("log_likelihood")
def log_likelihood_to_xarray(self) -> tp.Tuple[xarray.Dataset, xarray.Dataset]:
    """Extract log likelihood samples from fit."""
    if self.log_likelihood is None:
        # This should be impossible because of @requires above.
        raise AssertionError("self.log_likelihood was unexpectedly None. This is a bug.")

    return self._pyjags_samples_to_xarray(self.log_likelihood)

The type of self.log_likelihood is Optional[Mapping[str, str]], but self._pyjags_samples_to_xarray() expects an argument of type Mapping, and not an Optional[Mapping]. The @requires("log_likelihood") decorator on the function will actually check if self.log_likelihood is None and automatically handle the case, so we have a guarantee that this function is only called if self.log_likelihood is not None.

However, mypy doesn't know this: that transformation of our expectations cannot be expressed in the decorator's type signature. But with an explicit check for None on self.log_likelihood and a raised error in that case, mypy is able to prove that we only hit the return line in cases where self.log_likelihood is not None i.e. when its type is Mapping and not Optional[Mapping]. This process is called "type refinement," and is the reason that the code with the AssertionError type-checks correctly whereas the code without it fails with a type error.

Unrelatedly, it seems that due to the lack of type hints, the np.ndarray type is perhaps turning into Any. I bring this up because _pyjags_samples_to_xarray() is defined as taking Mapping[str, np.ndarray] but in the log_likelihood example, it is called with Mapping[str, str], which mypy doesn't seem to complain about. This is not an issue with this PR — this concern exists in the current codebase as-is — I simply spotted it while typing up this example. Is the symptom here perhaps an incorrect type hint on _pyjags_samples_to_xarray()? Should I open a separate issue for this? @ColCarroll might also be interested in this, since we were talking about whether type-hinting more of the package might help surface some bugs, and this is certainly a candidate.

@OriolAbril
Copy link
Member

Okey, thanks for the explanation, I see the need now, but it's kind of a bummer it is needed as it basicallly kills the usefulness of the requires decorator, we could now change them to if None: -> return None :/

Regarding log_likelihood, in the __init__ method, roughly lines 29-46 the strings passed as log likelihood are converted to dicts of arrays by taking the arrays from the posterior argument, there should be no case where string get passed to _pyjags_samples_to_xarray I think

@obi1kenobi
Copy link
Contributor Author

Regarding log_likelihood, in the __init__ method, roughly lines 29-46 the strings passed as log likelihood are converted to dicts of arrays by taking the arrays from the posterior argument, there should be no case where string get passed to _pyjags_samples_to_xarray I think

Ah yes, I see it now. And the if isinstance(log_likelihood, str): and if isinstance(log_likelihood, (list, tuple)): checks seem to indicate that the log_likelihood input to __init__ could also be str or List[str] or Tuple[str, ...] in addition to the declared tp.Optional[tp.Mapping[str, str]]. I'll update the type hints appropriately.

Okey, thanks for the explanation, I see the need now, but it's kind of a bummer it is needed as it basicallly kills the usefulness of the requires decorator, we could now change them to if None: -> return None :/

The other alternative could be to make the decorator pass the checked value into the function as an argument. Since the argument's type would be Mapping and not Optional, we wouldn't need another None check there.

But to be honest, I think overall the code structure would be more readable and obvious with an if None -> return None clause and without the decorator. So if you're alright with that change, I can apply that refactor — let me know what you think!

@obi1kenobi
Copy link
Contributor Author

In the most recent commit, I ended up removing the use of @requires for one additional reason not discussed above: it can actually misdirect the reader as to the return type of the decorated methods. For example:

@requires("posterior")
def posterior_to_xarray(self) -> tp.Tuple[xarray.Dataset, xarray.Dataset]:
    ...

Calling posterior_to_xarray() would seem to always output a tuple of two Dataset objects, but could also return None because of the @requires. This means that the actual return type hint of that function should be Optional[Tuple[xarray.Dataset, xarray.Dataset]].

The @requires decorator could be made to correctly transform the function's return type to make this change visible to mypy. However, this doesn't change the fact that the reader of the code will still see Tuple and not Optional[Tuple] as the type hint -- because humans aren't mypy and won't look up or evaluate the return type transformation implied by the use of the decorator.

If I were the maintainer of this project, this would be my preferred approach. However, I am not the maintainer, so if there is another approach you'd prefer, I'm happy to change the code accordingly or give my 2 cents on it -- whatever is most helpful here.

@obi1kenobi
Copy link
Contributor Author

Also, just to clarify one point — whatever we decide here is only for the io_pyjags.py file and doesn't have to apply to other uses of @requires in general.

In principle, we could decide to merge this PR to unblock #1496 and start using typing_copilot, then update the definition of @requires and/or switch to an alternative (type-safe) approach here, such as using typing.cast() on the attributes checked by @requires.

In other words, we don't have to decide the "fate" of using @requires in this PR. We can make progress by making a small incremental change now, then easily change our mind down the road.

@OriolAbril
Copy link
Member

Sounds good, let's merge and start trying more serious typing

@OriolAbril OriolAbril merged commit b963996 into arviz-devs:master Jan 20, 2021
@obi1kenobi obi1kenobi deleted the fix_io_pyjags_types branch July 4, 2021 14:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants