Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Please support static typing #376

Open
MarcSkovMadsen opened this issue Feb 5, 2020 · 26 comments
Open

Please support static typing #376

MarcSkovMadsen opened this issue Feb 5, 2020 · 26 comments
Labels
doc docs, interactive help, auto-completion, etc status: discussion Discussion. Not yet a specific feature/bug. Likely to result in multiple PRs/issues. type-feature Feature request
Milestone

Comments

@MarcSkovMadsen
Copy link
Collaborator

MarcSkovMadsen commented Feb 5, 2020

My Pain

I'm developing my applications in the VS Code editor. Furthermore I use pylint and mypy to help me create code of high quality. And google style docstring for documentation.

Param unfortunately does not work that well with that.

I've tried something like that.

class Scatter2dWithSelectionComponent(param.Parameterized):
    """The Scatter2dWithSelectionComponent enables a user to

    - provide a DataFrame of data
 
    Parameters
    ----------
    data : pd.DataFrame
        A dataframe of data to be shown and selected
    """
    data: pd.DataFrame = param.DataFrame(precedence=0.1)

But

  • VS Code cannot provide tab completion for instance.data. or help text on hover of instance.data.
  • Pylint complains for example that E1101: Instance of 'DataFrame' has no 'iloc' member (no-member) if I have a line like instance.data.iloc.

Solution

Support static type checking and enable all the help you can get in modern editors.

@jbednar
Copy link
Member

jbednar commented Feb 5, 2020

We have been focusing on simultaneously supporting Python 2 and Python 3 in the same codebases for the past 5 years, so we haven't been able to make use of any Python3-only syntax like type hinting in our own code. Once we are finally able to leave Python 2 behind, we can become experts in Python 3-only stuff, but for the moment I'm not actually sure what the above type annotation is meant to indicate. At the class level "data" is not a pd.DataFrame; it's a param.DataFrame object, so I'm not sure that's the correct type annotation to supply. But in the instance it will be a pd.DataFrame, which is what matters for instance.data.iloc.

I really don't know if Python 3's type hints are expressive enough to distinguish between type differences between the class and instance levels. So I can't say if this is something that is supported now but with a different syntax, if it can be supported but would require changes to the ParameterizedMetaclass, or if it is simply not possible given the type differences between Parameterized classes and instances (which are inherent to Param's design). Personally, I've found Parameter's type declarations to be vastly more powerful than static types (as they allow specifying ranges and allowed values and not simply type name), but it sounds like the fact that the type is static is something your editor can exploit, and of course the editor doesn't know about Param. Anyway, I wouldn't personally be ready to tackle this issue until we've completely left Python2 behind for good, but if someone else can see a way, be my guest!

@MarcSkovMadsen
Copy link
Collaborator Author

Thanks for the discussion.

I understand that that is a matter of resources and priority. I just wan't to raise it because it's important to the way I work and people working similarly.

I actually like param a lot. That is part of why I like Panel. I have no experiences from python 2 :-)

@ceball ceball added doc docs, interactive help, auto-completion, etc status: discussion Discussion. Not yet a specific feature/bug. Likely to result in multiple PRs/issues. type-feature Feature request labels Apr 13, 2020
@tonyfast tonyfast added this to the Wishlist milestone Aug 31, 2020
@MarcSkovMadsen
Copy link
Collaborator Author

I believe this link explains how to add type annotations to descriptors and thus to Param. https://stackoverflow.com/questions/57400478/using-typing-and-mypy-with-descriptors.

@jbednar
Copy link
Member

jbednar commented Jun 14, 2021

I have no experiences from python 2 :-)

Lucky you! :-)

I'm happy for param 2 to move away from Python2.7 support. Param 2 can happen this summer; it would be the next big thing after we have proper docs (as it basically requires deleting whatever's not in the docs as being no longer supported!) I.e., it's not a big jump from where Param is now to 2.0; it's just an agreement to clean out a bunch of stuff and not look back. So yes, if we can make use of type hints at that point, great!

@jbednar
Copy link
Member

jbednar commented Oct 22, 2021

https://atom.readthedocs.io/en/latest/basis/typing.html might be helpful for understanding how a similar library moved to start including py3 types. Param 2.0 is very close to release, and that's when we can drop py2 and consider how to move forward. I'd also be interested in seeing if there is some lightweight way to decorate a py3 dataclass to use its types in Param, as a limited but useful entry into Param's functionality using py3 type hints.

@MarcSkovMadsen
Copy link
Collaborator Author

MarcSkovMadsen commented Jun 6, 2022

For Param to be really easy to understand and use in a modern IDE lots of type annotations and just getting rid of all the *'params would be need.

Example - Number Parameter

The arguments of the Number.__init__ method would need 1) type annotations 2) docstrings describing the arguments

image

Example Integer Parameter

The __init__ method would need named arguments instead of the **params and 1) + 2) from above.

image

Would you be open to accept that kind of PRs @jbednar ?

@philippjfr
Copy link
Member

For Param to be really easy to understand and use in a modern IDE lots of type annotations and just getting rid of all the *'params would be need.

Can you expand here, yes it might make declaring parameters easier but that's not really the important case right? Figuring out the types of the parameters themselves (like a dataclass) is what we should optimize for no?

As for this concrete question, I find the idea of repeating all arguments for a parameter whenever you subclass a parameter to be kind of crazy and ugly. Is there really no mechanism for Python typing modules to figure out types from the arguments accepted by the baseclass and the subclass?

@philippjfr
Copy link
Member

Specifically why I find this so crazy and ugly is that it introduces a significant risk of argument definitions to become outdated/drift out-of-sync. Adding an argument on the baseclass which is then not also declared on all subclasses is a significant issue that is avoided by passing through **params.

@maximlt
Copy link
Member

maximlt commented Jun 6, 2022

Would you be open to accept that kind of PRs @jbednar ?

Unfortunately this will have to wait for at least one more release though, i.e. until the moment it is declared that the next release will be Param 2.0.

As for this concrete question, I find the idea of repeating all arguments for a parameter whenever you subclass a parameter to be kind of crazy and ugly. Is there really no mechanism for Python typing modules to figure out types from the arguments accepted by the baseclass and the subclass?

I also find that a bit disappointing, I haven't yet found a solution to this. See this related discussion: github.com/python/typing/discussions/1079

@philippjfr
Copy link
Member

Thanks for that reference @maximlt. I would have no particular objection to a @copy_type decorator as outlined in that issue.

@maximlt
Copy link
Member

maximlt commented Jun 6, 2022

@MarcSkovMadsen
Copy link
Collaborator Author

declaring parameters easier but that's not really the important case right

I would say it is. You probably know the arguments of Parameter and any derived class by heart. But a newbie or one who does not use Param all the time does not. Its so much easier for them to understand Integer if the editor provides helpful tooltips telling that readonly is an available argument, what it does and what type is should be.

@philippjfr
Copy link
Member

philippjfr commented Jun 6, 2022

Totally fair and I'm not objecting at all to typing of the Parameter parameters. Just was attempting to clarifying the distinction between typing Param(eters) and typing of a Parameterized, which is what the issue was originally about.

@maximlt
Copy link
Member

maximlt commented Jun 6, 2022

I agree with that too! Adding Numpy docstrings to each Parameter is something I would like to do, I was basically waiting for Param 2.0 to be the next milestone, to be able to refactor the whole code base and add docstrings all over the place.

@MarcSkovMadsen
Copy link
Collaborator Author

MarcSkovMadsen commented Jun 6, 2022

Is there really no mechanism for Python typing modules to figure out types from the arguments accepted by the baseclass and the subclass?

mypy provides a plugin mechanism. But unfortunately they only work when statically typing your code using mypy script.py.

  • Pyright that powers Pylance and VS Code tooltips does not support them Plugins? microsoft/pyright#607
  • If we could generate stubs automatically then VS Code/ Pyright/ Pylance issue would be solved but
    • unfortunately stubgen (the auto stub generator) that comes with mypy does not seem to use plugins either - at least I cannot get it working and I've not been able to find any docs stating it should work.
    • My own attempt at generating stubs Panel #3132 - Stub Experiments ended when I found out I was trying to reimplement stubgen and that is a big, big undertaking because you need to be able to generate stubs for everything not just Parameterized classes. So I would prefer to build upon stubgen and just plugin a little bit of Param.

@MarcSkovMadsen
Copy link
Collaborator Author

My guess is that when you define a custom __init__ function you have to provide all arguments and a good docstring.

How would mypy, pyright or other tools know that you did not intend to control the signature and docstring of the __init__?

At least that it the behaviour I see right now when running stubgen on a dataclass.

image

My conclusion is that the best future for param and the rest of the holoviz suite would be do be able to autogenerate stubs because then we can much better control and provide useful information. We have so much more useful information that can be autogenerated, than what I see for example PEP 681: Data Class Transforms providing providing.

@MarcSkovMadsen
Copy link
Collaborator Author

MarcSkovMadsen commented Jun 6, 2022

I have spent hours trying to understand the documentation, github discussions and other docs around typing. But as I see it you either need to be rather experienced in this area or maybe spend a month before you really understand what direction Param should take.

Do we know anybody who's an expert in this area?

@philippjfr
Copy link
Member

One perspective that may help would around what @mattpap has done and intends to do, for typing of the Bokeh property system. That fairly closely mirrors what Param provides although with a focus on serialization.

@MarcSkovMadsen
Copy link
Collaborator Author

MarcSkovMadsen commented Jun 6, 2022

I would really, really like to find a solution on this. I can see how much it means for my productivity now that Panel ships with some typing support and especially much improved docstrings.

Before everything had to be in your head or looked up in the documentation. Now the documentation is there as a nice tooltip and if its not enough I can easily click the reference link. Its such a mental relief, its so easier to discuss the code with colleagues and it feels so much more modern.

image

@jbednar
Copy link
Member

jbednar commented Jun 6, 2022

I would have no particular objection to a @copy_type decorator as outlined in that issue.

Same; using @copy_type to mirror types from a base class is a bit ugly, but it's acceptable because it is automatic and doesn't let the two definitions get out of sync. So I'd be ok with accepting PRs that add @copy_type to provide typing for constructor arguments in param, if there is a demonstrated benefit (which should be shown off in the PR's text).

@philippjfr
Copy link
Member

philippjfr commented Jun 7, 2022

Right now the only thing that can actually work would be the following:

  1. We add pytype properties to all parameters
  2. We add a utility that takes a Python module as input and rewrites it so that all parameters are annotated by type based on the pytype properties
  3. We provide a gh-action (and pre-commit hook) that checks that the parameter definitions match the type definitions

If we need to support an actual runtime checker things get a lot more involved because it will realize that param.Integer() is not actually of type int.

@jbednar
Copy link
Member

jbednar commented Jun 7, 2022

If we need to support an actual runtime checker things get a lot more involved because it will realize that param.Integer() is not actually of type int.

Depends -- at the instance level it will actually be int.

@philippjfr
Copy link
Member

Actually you're right, even at the class level that is true at runtime.

@tomascsantos
Copy link

Hi all, I've been scanning the issues and I'm trying to understand where we've landed with this? I really love the watcher / callback features of param, but it is a HUGE drawback that my code essentially requires users to read the source to understand how to use it.

Example of useless docstrings:
image
Or I have to use insanely redundant definitions to write self-documenting code:
image

Since the above is obviously unmanageable, I resort to having terrible docstrings in all my code which make adoption incredibly difficult. I can see @MarcSkovMadsen has really been pushing to fix these docstrings, so I just want to add my vote and really stress how important this is. I cannot think of any other feature besides stability that would be as important as nailing these docstrings / type hinting.

To be clear the question here is where are we now? And I'll add that I'm hesitantly interested in chipping in some manpower if there is a concrete plan of action. Thank you!

@LecrisUT
Copy link

Hi all, I've been scanning the issues and I'm trying to understand where we've landed with this? I really love the watcher / callback features of param, but it is a HUGE drawback that my code essentially requires users to read the source to understand how to use it.

The attribute documentation is rather an issue with PEP224. PyCharm and sphinx support this. I have played around with trying to add native docstring to attrs and concluded it is impossible (the docstring is getting attached to the parent type instead if even possible). Another nice tool is dostring_parser which can extract at least the static docstrings, including PEP224 format.

Otherwise for type-annotation and docstring inheritance attrs is a good reference to study I would say. Hope these comments help.

@LecrisUT
Copy link

As I have been learning how to interact with param, I've been thinking of how to best tackle this issue. I think this needs to be separated in a few chunks. It would also help those who are subscribed to this issue track what's the progress on this.

Step1: Make the base types generic

First the basics of type-annotation should be implemented: adding py.typed, CI for mypy, adding everything to mypy.exclude. Including ruff to the CI would also help for contributors to maintain some consistency.

A first target to convert is param.Parameterized.Parameter to a Generic linked by default type.

Step2: Convert the reactive expression

By priority I would say this is lower than the next step, but this one is much easier to achieve, since it primarily revolves around tpye-hinting ._obj and .value.

Step3: Start exposing the Generic type's attributes

Stuff like __getattributes__, __setattr__ are valuable to expose for the IDE to process. I am not sure at this point how to expose these

Step4: Convert the Parametrized

This one might be the most complex one, e.g. annotating __init__ and such. It might also be good to consider dataclasses and attrs approach using decorators instead of class inheritance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc docs, interactive help, auto-completion, etc status: discussion Discussion. Not yet a specific feature/bug. Likely to result in multiple PRs/issues. type-feature Feature request
Projects
None yet
Development

No branches or pull requests

8 participants