Dev/fix latest numpy types #403

Merged · jklaise merged 30 commits into SeldonIO:master on Jan 6, 2022

Conversation

@jklaise (Member) commented on Dec 6, 2021:

This PR is based on the latest numpy release, which ships type information. This flags a few issues with our types, which have been resolved in this PR. It is a prerequisite for upgrading to tensorflow 2.7.

Most of this PR is pretty straightforward; however, I've left TODO: TBD notes where I would welcome discussion. These mostly concern implicit narrowing of types that either mypy can't handle (fixed easily with an appropriate type: ignore[code]) or, more seriously, gaps in our logic where we haven't considered that a type might not be narrowed down sufficiently (e.g. is the user's list always an np.ndarray by the time it reaches a function that only works on np.ndarray?).
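As a minimal sketch of the narrowing issue described above (the names are hypothetical, not the actual library code):

from typing import Union
import numpy as np

def needs_array(x: np.ndarray) -> np.ndarray:
    return x * 2

def process(x: Union[np.ndarray, list]) -> np.ndarray:
    # mypy flags needs_array(x) here with [arg-type], because x may still
    # be a list. Two ways out:
    #   1. suppress the specific error code once verified by hand:
    #      return needs_array(x)  # type: ignore[arg-type]
    #   2. narrow explicitly, so both mypy and readers see the intent:
    if isinstance(x, list):
        x = np.asarray(x)
    return needs_array(x)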

There will be a follow-up PR enabling strict optional typing, which will likely reveal a few more issues with the current typing.

@ascillitoe (Contributor) left a comment:

Generally LGTM! The only bit I wonder about is whether to allow type redefinitions now?

@@ -247,10 +249,12 @@ def score(self, X: np.ndarray, batch_size: int = int(1e10), return_predictions:
# model predictions
y = predict_batch(X, self.model, batch_size=batch_size)
y_recon = predict_batch(X_recon, self.model, batch_size=batch_size)
y = cast(np.ndarray, y) # help mypy out
@ascillitoe (Contributor) commented:

Is there a specific reason not to use a type annotation like the one below? Or is it just a convention we are going with?

y: np.ndarray = predict_batch(X, self.model, batch_size=batch_size)

@jklaise (Member, Author) replied:

It doesn't work because y is already defined and --allow-redefinition is not enabled.
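A minimal sketch of this mypy behaviour (the predict function is hypothetical, not the real predict_batch signature):

from typing import Union, cast
import numpy as np

def predict(x: np.ndarray) -> Union[np.ndarray, tuple]:
    return x

def score(x: np.ndarray) -> np.ndarray:
    y = predict(x)                # inferred as Union[np.ndarray, tuple]
    # y: np.ndarray = predict(x)  # rejected: "y" already has a type from the
    #                             # line above; re-annotation requires
    #                             # --allow-redefinition
    y = cast(np.ndarray, y)       # no-op at runtime; narrows the type for mypy
    return y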

@ascillitoe (Contributor) replied:

Ah that makes sense, thanks!

@@ -83,16 +83,20 @@ def __init__(
else:
self.device = torch.device('cpu')

# TODO: TBD: the several type:ignore's below are because x_ref is typed as an np.ndarray
@ascillitoe (Contributor) commented:

Do you think --allow-redefinition would cause mypy to miss anything significant elsewhere? I wonder if it's best to activate it now, as redefining variable types is pretty common for us?

Otherwise, we could go with a convention of renaming variables instead, i.e. something like

x_ref_torch = cast(torch.Tensor, torch.as_tensor(self.x_ref).to(self.device))

Although the latter seems undesirable to me...

@jklaise (Member, Author) replied:

Renaming wouldn't need the cast at all, I think.

I'm a bit wary of enabling --allow-redefinition, as it's unclear, as you say, what we could potentially miss. Even so, it is quite limited, as redefinitions would be allowed only within the same scope.
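For concreteness, a sketch of the rename option (using the x_ref_torch name from the snippet above; the wrapper function is hypothetical):

import numpy as np
import torch

def to_tensor(x_ref: np.ndarray, device: torch.device) -> torch.Tensor:
    # Option A: rename, so each name keeps a single type and no cast is needed
    x_ref_torch = torch.as_tensor(x_ref).to(device)
    return x_ref_torch
    # Option B, reassigning x_ref itself, would need --allow-redefinition,
    # since x_ref is annotated as np.ndarray:
    #   x_ref = torch.as_tensor(x_ref).to(device)  # error: [assignment]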

@ascillitoe (Contributor) replied:

Sorry, yes, cast is making more sense to me now!

I'd have thought that "within the same scope" would mostly be fine for us? But in any case, it's not clear to me what "within the same block and nesting depth" in the mypy docs means. Also, I take your point that it's probably worth being cautious about changing this.

@jklaise (Member, Author) commented:

Well, I don't know what happened, but I just removed all these ignores and the mypy check now passes locally...

@jklaise (Member, Author) commented:

Never mind, I was being silly, as I had downgraded back to numpy 1.19...

@ascillitoe (Contributor) replied:

Too good to be true!

@@ -20,6 +20,7 @@ exclude =
[mypy]
ignore_missing_imports = True
strict_optional = False
show_error_codes = True
@ascillitoe (Contributor) commented:
Good idea!
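For reference, a small sketch of what show_error_codes enables (the snippet is hypothetical):

# With show_error_codes = True, mypy appends the error code to each message:
#   error: Incompatible types in assignment (expression has type "str",
#          variable has type "int")  [assignment]
# so a suppression can target exactly that code instead of the whole line:
x: int = 0
x = "oops"  # type: ignore[assignment]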

@@ -245,7 +245,8 @@ def logp(self, dist, X: np.ndarray, return_per_feature: bool = False, batch_size
Log probabilities.
"""
logp_fn = partial(dist.log_prob, return_per_feature=return_per_feature)
return predict_batch(X, logp_fn, batch_size=batch_size)
# TODO: TBD: can this be any of the other types from predict_batch? i.e. tf.Tensor or tuple
@ascillitoe (Contributor) commented:

I'm not sure about this one. I'd have guessed that predict_batch won't return a tf.Tensor here, since we don't explicitly set dtype (so return_np=True), but it looks like a tuple might be possible? Hopefully @arnaudvl has more input!
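To illustrate the kind of narrowing the TODO is asking about (the Union below is an assumption for illustration; the real predict_batch return type may differ):

from typing import Tuple, Union
import numpy as np
import tensorflow as tf

# Assumed return type, for illustration only:
BatchResult = Union[np.ndarray, tf.Tensor, Tuple[np.ndarray, ...]]

def as_array(result: BatchResult) -> np.ndarray:
    # if a tuple is possible, callers must narrow before treating the
    # result as a single array
    if isinstance(result, tuple):
        return np.concatenate([np.asarray(r) for r in result])
    return np.asarray(result)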

@jklaise (Member, Author) commented on Jan 5, 2022:

There are a few more issues with the recently released numpy 1.22, for which I will push another commit shortly.

@jklaise merged commit d2d413c into SeldonIO:master on Jan 6, 2022.