Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds docstrings to the core module #827

Merged
merged 19 commits into from
Aug 25, 2022

Conversation

MarcSkovMadsen
Copy link
Collaborator

@MarcSkovMadsen MarcSkovMadsen commented Aug 7, 2022

This PR aims to add better docstring for users. It does this by providing more descriptive text, examples and reference links similar to the docstrings recently added to Panel.

I will try to stick to the reference example below.

image

As a user I would like to understand

  • what the plot is
  • be guided when to use it
  • How this can be used
    • understand the available options and parameters and how to find out more about them
  • Quickly be able to learn more about the type of plot in general (wikipedia) and in particular for the selected backend.

As a developer or supporter on Discourse working on the specific plot it is very helpful to be able to find the Pandas .plot and backend references quickly.

For beginners the docstring should contain just enough information to get a general idea of what this is and if it is the right plot to use. For more info they can head of to the reference documentation.

For super users the docstring should have just enough information to refresh your memory and be able to work. And we should be able to navigate without to much context switching. When you have to leave your editor or IDE there is a cost.

Many users (my self included) have not been educated abstractly (we might have concretely) in data visualization. I.e. we might not know the right terms, the right plot to use for the kind of data, not understand why we cannot make a bar plot from non-categorical data when that is normal for other frameworks to support etc. So the docs should help guide and share knowledge.

@MarcSkovMadsen MarcSkovMadsen marked this pull request as draft August 7, 2022 19:37
@MarcSkovMadsen
Copy link
Collaborator Author

Would you be willing to accept a larger PR in the style of what I have started here @maximlt or @philippjfr ?

@maximlt
Copy link
Member

maximlt commented Aug 7, 2022

We recommend using 
  • .hvplot in a notebook environment as its very familiar to DataFrame users.
  • hvPlot in editors like VS Code because it provides better tooltips and TAB completion.

Recommending different APIs given what editors people use make me sad :( I think we'll have to discuss this more before committing to this.

@philippjfr
Copy link
Member

Recommending different APIs given what editors people use make me sad :( I think we'll have to discuss this more before committing to this.

Strongly agree, I can see the issue with tab-completion but on balance I'm a strong -1 on recommending two separate APIs based on your editor.

@MarcSkovMadsen
Copy link
Collaborator Author

MarcSkovMadsen commented Aug 8, 2022

For this PR I will then

  • let out the recommendation of using hvPlot api in VS Code. I will use it my self though because it makes a world of a difference for my productivity and for the ability of my colleagues to read and understand my code.
  • not add type annotations

Would you be willing to look positively at reviewing and accepting a larger PR then?

@philippjfr
Copy link
Member

Sounds good and to be clear I won't object to adding some discussion about different APIs and usage patterns in the docs. Just not excited about recommending different APIs directly in the docstrings.

Would you be willing to look positively at reviewing and accepting a larger PR then?

Absolutely.

@MarcSkovMadsen
Copy link
Collaborator Author

MarcSkovMadsen commented Aug 8, 2022

Just for the record. Besides better intellisense in editors, using hvPlot over .hvplot will also avoid linters complaining about missing imports from import hvplot.pandas.

@MarcSkovMadsen
Copy link
Collaborator Author

I've added a reference example to the top, opening post and pushed the code. Feel free to comment on the code and guide me. It will be a lot of work to do and if you don't like it @maximlt and @philippjfr I would have wasted a lot of time. Thanks.

@maximlt
Copy link
Member

maximlt commented Aug 9, 2022

I'm on a holidays today, I'll have a closer look at it later this week. What I can say from a quick glance is that it looks like a good start.

A very important point imo is that you seem to be heading towards writing docstrings that are long enough to be part, or even entirely constitute, the reference gallery page of the documented plot. There shouldn't be two separate systems of documentation (docstring and reference gallery) to maintain at the same time, otherwise I wouldn't consider that as an extra burden on a documentation we already struggle to maintain.
Your example docstring looks like the docstrings of Pandas plotting methods (e.g. https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.plot.line.html), which render pretty well in a website I think. The difference with hvPlot's site is that they're part of the API reference, while on hvPlot they're part of the reference gallery.
Therefore I feel that at this stage it would be wise to try to see if we could build the reference gallery page of the line method from the docstring you've added, there must be a way to do so but someone will have to find how! It doesn't have to be you though, that could be me but I can't commit to that right now (focusing on releasing a bug fix version of hvPlot) so that will put your work on hold. Thoughts on that @philippjfr ?

@MarcSkovMadsen
Copy link
Collaborator Author

MarcSkovMadsen commented Aug 10, 2022

Thanks for your comments. I would say that Documentation in different places serves different purposes and is really, really needed for hvPlot to be useful for users.

My opinion is that the Reference Gallery should have lots of different examples. I believe you cannot outsource to the user guide to show how to use some of the fundamental powers of hvPlot like by, groupby and the ability to sprinkle in widgets. New users might end up in the reference Gallery via a Google Search and never understand the power of hvPlot or just be confused on how to apply something that is introduced generally in the user guide to the specifics of a line plot. They just look at it and think this is like Pandas .plot but with less documentation and not my favorite plotting library. Why should I use it? When I look at the Pandas .plot documentation I go wow. can I make all of this. There is no indication of what you can really do with hvPlot. It needs to be shown. Users cannot always think that by them selves.

I am not so worried about the docstrings in the core module. After all there are not so many methods to document and keep updated. And the API is pretty stable. That is the advantage of hvPlot the api is flat and there are not so many methods.

@maximlt
Copy link
Member

maximlt commented Aug 10, 2022

I think it would be a mistake to try to have too much information in the Reference Gallery. I've been reading about the Diataxis documentation framework and I think the Reference Gallery falls into their Reference type:

Reference material should be austere and to the point. One hardly reads reference material; one consults it.

I feel your pain about the documentation and your desire to promote more hvPlot. I don't think that it's by stuffing more information in the reference gallery that this is going to work and improve the documentation, it is just going to make it even less maintainable in the future. For instance, giving you suggestion we should show groupby for the line plot, but you wouldn't just limit that to the line plot, and this means you'd have to add it to every plot reference. And why not showing a rasterized line example? You see, there are so many options that if you try to document them all, and their combinations, in the Reference Gallery, it becomes too much and looses its all point. Instead, it should just show simple example, and better link to the relevant part of the documentation (that maybe exist now, or need to be rearranged, or need to be written).

There is no indication of what you can really do with hvPlot. It needs to be shown. Users cannot always think that by them selves.

Yes, but to me really the keys to that are better cross-linking and better docs (for instance, a nice and short hvPlot tutorial that quickly demonstrates all you can do), not repeating the same information everywhere.

Something that is of a higher priority in my opinion is the need to have a reference gallery that displays not only Bokeh plots but also Matplotlib and Plotly plots. We need a system to make that work and render in a nice way.

When I look at the Pandas .plot documentation I go wow. can I make all of this

Which part of Pandas' documentation gives you this impression exactly? That page https://pandas.pydata.org/pandas-docs/stable/user_guide/visualization.html#chart-visualization?

@MarcSkovMadsen
Copy link
Collaborator Author

MarcSkovMadsen commented Aug 10, 2022

Which part of Pandas' documentation gives you this impression exactly? That page https://pandas.pydata.org/pandas-docs/stable/user_guide/visualization.html#chart-visualization?

This https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.plot.line.html and similarly the other links I've referenced for line plots.

The Pandas link is actually populated by their docstring.

Plotly knows that lots of examples are needed. It takes a long time before users have a mental model and can navigate.

image

If you are new to plotting then you wont have a chance with hvPlot as the documentation is so sparse. You really have to have all the the basic knowledge and intuition from Matplotlib and Pandas .plot first before you start to look at hvPlot currently.

@MarcSkovMadsen
Copy link
Collaborator Author

MarcSkovMadsen commented Aug 10, 2022

Regarding Diataxis framework I also believe in it. But as I see it, it refers to written documentation for books/ web sites.

My current belief is that tooltips and help in notebooks are a bit different because they are constrained. Users are looking for quick information. Reference for sure. But also a bit of the rest including "how-to". They are looking for something that will normally help them.

hvplot/plotting/core.py Outdated Show resolved Hide resolved
hvplot/plotting/core.py Outdated Show resolved Hide resolved
hvplot/plotting/core.py Outdated Show resolved Hide resolved
@MarcSkovMadsen
Copy link
Collaborator Author

MarcSkovMadsen commented Aug 20, 2022

Hi @maximlt

I've

  • Added an test_urls pytest.
  • Reviewed the current contribution.

Let me know if you think this could get into the next release of hvPlot. Then I will continue the docstring work in a new PR.

@MarcSkovMadsen
Copy link
Collaborator Author

One note on signatures and docstrings. I can see some key parameters are not in the signature or docstring.

Unless somewhere tells me otherwise (@maximlt , @philippjfr ) the guiding principles for me are

  • Special and very important parameters should be in the signature and docstring. Around 7 parameters for a signature is fine. A bit more for docstrings is ok. But for now I will focus on adding them to the docstring as changing the function signature is more risky and needs more careful review.

I say this because I can see for example the special _kind_options are often neither documented or in the signature.

image

@maximlt maximlt marked this pull request as ready for review August 25, 2022 01:08
@maximlt
Copy link
Member

maximlt commented Aug 25, 2022

Thanks @MarcSkovMadsen ! Excellent improvements to the docstrings!

Merging as is. I think there are still some changes that are due to your auto-formatter, please the next time make sure you don't commit such changes, they're difficult to revert.

The test file you have added will actually not run. All the test files are named "testthing.py" and not "test_thing.py", the former being what is picked up by pytest. I keep it anyway as ultimately I'd like to turn it into a script that would run during a doc build, I don't think it's necessary to run it part of the unit tests (and this single test takes 30 secs to run).

@maximlt maximlt merged commit ae84bb8 into holoviz:master Aug 25, 2022
@MarcSkovMadsen
Copy link
Collaborator Author

Thanks @maximlt

The docs are a tough process. There is a lot I have to learn and a lot we have to align on. I'm not done yet at all.

But I am super motivated by the fact that I can get improvements into the next release and start to get feedback and see how it works out in practice. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants