-
-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use plugin to support numpy #3540
Comments
This project looks somewhat related: http://datashape.readthedocs.io, https://github.com/blaze/datashape |
There is some conversation about type annotations in this NumPy issue: numpy/numpy#7370 |
Here's my brain dump on static typing for arrays: TensorFlow (which copies the design of NumPy in many ways) has a form of static typing for array shapes that is useful precedence here. Notably, it distinguishes between three cases (copied from that page):
TensorFlow sets shapes using C++ or Python, mostly with In particular, one important use case supported by most NumPy functions is broadcasting, where output shapes are propagated via broadcasting rules. This could allow for identifying broadcasting errors (one of the most frequent user-errors in NumPy) at compile-time instead of run-time. Finally, a concept like |
I think dependent types are needed in very rare cases. What we probably need more, is something like static constraints. So dependent types would mean that some part of the type is based on value of function argument. For example, if function would be:
Where But what we first need is even support to express "numpy with dtype" type. I see this as an extra constraint on numpy type. The issue is that it is not enough just to have support for static constraints, but one needs a language so that you can describe that input arguments to a function can have some relations between them (like one argument array has to be 2x in width than the other argument array). It gets pretty tricky soon. See contracts project for an attempt at this. |
Working on a large codebase of scientific python code, a generic array-type annotation would be incredibly useful. An example that would be nice to type-check:
This documents that:
It would be good to add more flexibility to this example. For example it may be the case that the data points are not flattened along one dimension, e.g.
This would indicate that the predictions and labels can have any shapes as long as they match. It would also be great to have some kind of support for Generics Types. For example:
This annotation would show that
|
FYI, the https://github.com/numpy/numpy_stubs project is now working on support for typing numpy-using code. |
Yes, and we are very much interested in collaboration. See here for ongoing discussion about typing shapes: https://github.com/numpy/numpy_stubs/issues/5 |
Who is "we" here and who do you want to collaborate with? Is there a
complete proposal for the syntax yet? It would have to aim at becoming a
PEP (a sibling to PEP 484 and PEP 526) and someone should attempt an
implementation for mypy, perhaps in the form of a plugin. The mypy team
doesn't have the resources or priorities to help much, alas.
|
@gvanrossum |
@gvanrossum I suppose by "We" I was referring to NumPy fellow developers, and others in the broader numerical Python community. (But to be honest, I wasn't being very carefully with my post.) There are a few others in the mypy community who have also expressed interest here. I met up with @JelleZijlstra and @ethanhs along with @njsmith (from the NumPy side) a few months ago to discuss the issues in python/typing#513. This prompted the creation of numpy-stubs, but there hasn't been significant progress on the major issues since then (which of course is totally fine, this being open source I have no expectations on anyone else's time).
My preliminary partial proposal (which @ilevkivskyi linked to above) takes a stab at syntax, and also outlines some of the broader issues for typing. I agree that a PEP will be necessary eventually, but I'm not sure we're quite ready for that yet. A proof of concept implementation would certainly help clarify many of these issues. @rmcgibbo wrote an experimental mypy plugin for NumPy in https://github.com/rmcgibbo/numpy-mypy, but it only supports checks on the number of array dimensions (ndim), not shapes or dimension identity. I think dimension identity checks could be very powerful (along the same lines as what @petered writes above), but certainly support for ndim would be better than nothing. |
Ah, very interesting. Are you coming to PyCon in Cleveland this year? If so
you should make sure to have an in-person meeting with the mypy team (Jukka
and Ivan in particular) and any other folks interested (you mentioned a
few).
…On Sun, Mar 25, 2018 at 11:55 PM, Stephan Hoyer ***@***.***> wrote:
@gvanrossum <https://github.com/gvanrossum> I suppose by "We" I was
referring to NumPy fellow developers, and others in the broader numerical
Python community. (But to be honest, I wasn't being very carefully with my
post.)
There are a few others in the mypy community who have also expressed
interest here. I met up with @JelleZijlstra
<https://github.com/JelleZijlstra> and @ethanhs
<https://github.com/ethanhs> along with @njsmith
<https://github.com/njsmith> (from the NumPy side) a few months ago to
discuss the issues in python/typing#513
<python/typing#513>. This prompted the creation
of numpy-stubs <https://github.com/numpy/numpy_stubs>, but there hasn't
been significant progress on the major issues since then (which of course
is totally fine, this being open source I have no expectations on anyone
else's time).
Is there a complete proposal for the syntax yet? It would have to aim at
becoming a
PEP (a sibling to PEP 484 and PEP 526) and someone should attempt an
implementation for mypy, perhaps in the form of a plugin.
My preliminary partial proposal
<https://docs.google.com/document/d/1vpMse4c6DrWH5rq2tQSx3qwP_m_0lyn-Ij4WHqQqRHY/edit#>
(which @ilevkivskyi <https://github.com/ilevkivskyi> linked to above)
takes a stab at syntax, and also outlines some of the broader issues for
typing. I agree that a PEP will be necessary eventually, but I'm not sure
we're quite ready for that yet. A proof of concept implementation would
certainly help clarify many of these issues.
@rmcgibbo <https://github.com/rmcgibbo> wrote an experimental mypy plugin
for NumPy in https://github.com/rmcgibbo/numpy-mypy, but it only supports
checks on the *number* of array dimensions (ndim), not shapes or
dimension identity. I think dimension identity checks could be very
powerful (along the same lines as what @petered
<https://github.com/petered> writes above), but certainly support for
ndim would be better than nothing.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#3540 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ACwrMsb8o_LlqbCinfGUfwdz6mH8guW-ks5tiJDpgaJpZM4N5lFE>
.
--
--Guido van Rossum (python.org/~guido)
|
@gvanrossum unfortunately, I don't think I'll be able to make it to PyCon this year (though it is definitely tempting!). But if you could still have a good meeting with @njsmith and @mrocklin from the numpy/scientific python side. I would also be happy to do a video-conference/call with anyone interested in higher bandwidth discussion. |
This topic has been subsequently discussed in various other forums (including the Bay Area typing meetups -- and PyCon 2019, I think). Keeping this issue open doesn't seem worthwhile, since it's missing a lot of context. |
@JukkaL Is there a link to the discussions you mentioned? How can we get more context? |
There has been some recent discussion in the typing-sig@ mailing list. That seems like the best place to find up-to-date information. |
This was originally discussed at #1240. It's unclear how this would work exactly -- we may first need to design a static type system extension for numpy.
The text was updated successfully, but these errors were encountered: