Discussion: Usage examples in documentation / docstrings #968
Comments
sniping @ericmjl. notebooks are useful for some scenarios, take for instance |
@thatlittleboy thanks for chiming in! I actually agree with your sentiments, and I also think I have not been explicit enough about what I was hoping to accomplish with #957, which has led to a bit of stagnation and confusion. You're right in observing that the docstrings became way too verbose. Additionally, maintaining the functions became difficult as the docstrings started interfering with the readability of the original source file. I think the coverage of examples in the library is in big need of a redo, and we can probably do a distributed sprint to make it happen. As I see it right now, the docs examples should fulfill the following criteria:
I did a bit of digging, and I'm still a bit unsure how to satisfy all of the conditions above simultaneously. That said, option number 2 that you mentioned above, namely:
seems to be the option that makes the most sense in the short term, and we could probably build up towards option 3 later on, using that as a base. @thatlittleboy would you be open to helping out with executing on option 2? I think we'd need to start by having one minimal example per notebook.
Sure @ericmjl, I think I should be able to help with the minimal examples / tl;dr part of the sprint. So to be clear, the "example notebooks" that we are talking about here are the ones in here, yeah?
Yes, that is right! If you could give me a day or two to template out the workflow, that'd be awesome. It'll give me a chance to work out potential kinks before we go all-in on this way of handling minimal working examples in docstrings. |
@thatlittleboy I did a few tests and ultimately found that putting minimal working examples in the docstrings is the best thing to do. We get free integration with doctests & pytest, for example! The examples also render well. In my latest PR #971, I made a few infrastructural changes as well to clear up the CI. Once that one gets merged, the other PRs that you've got should merge in latest
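To illustrate what a minimal working example in a docstring looks like when it doubles as a test, here is a small sketch. The function `clean_column_names` and the helper `run_docstring_examples` are hypothetical stand-ins written for this illustration; they are not pyjanitor's API or the code from #971. The sketch only uses the standard-library `doctest` module.

```python
import doctest


def clean_column_names(columns):
    """Return lowercase, underscore-separated versions of column names.

    Hypothetical stand-in for a DataFrame-cleaning function
    (an assumption for illustration, not pyjanitor's implementation).

    Example:

        >>> clean_column_names(["First Name", "Total Sales"])
        ['first_name', 'total_sales']
    """
    return [name.strip().lower().replace(" ", "_") for name in columns]


def run_docstring_examples(func):
    """Run the doctest examples in func's docstring.

    Returns a (failed, attempted) tuple.
    """
    parser = doctest.DocTestParser()
    # Make the function available by name inside the docstring example.
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted


if __name__ == "__main__":
    print(run_docstring_examples(clean_column_names))
```

In a real project the helper is usually unnecessary: running `python -m doctest path/to/module.py` or `pytest --doctest-modules` collects the same docstring examples, so the example shown to the reader fails the test run when it goes stale.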
Looks great @ericmjl, thank you. I think this is a good direction forward, especially for offering clear, short examples to new users of pyjanitor. 👍🏻
@thatlittleboy I'd like to invite you onto the dev team. Can you ping me on Shortwhale so I can send you a link to join the Discord server? http://www.shortwhale.com/ericmjl |
Yep, pinged! |
Hi all, just wanted to open a discussion on the state of documentation of this package.
With the most recently merged pull request (#957) and #906, I'm inferring that a decision has been made to remove all Minimal Working Examples (MWE) from the docstrings and move them instead into Jupyter notebooks -- with 1 notebook for each function (?).
Qn: If this is so, then can I understand what is the recommended way for a user to study these examples / how the pyjanitor functions should be used?
A bit more context on where the question is coming from:
I'm looking to incorporate this package more into my daily workflow, and the existing examples within the API reference have been instrumental to my understanding of what the package offers.
As far as I can tell, examples currently live in 2 locations:
After removing the MWE from the function docstrings (and thus, the API reference) as per #957 , is there then a plan to link up the API reference to the notebook examples, in any shape or form?
That is, how is the user, coming from the API reference page, to know that there are examples available that show the functionality with sample inputs/outputs?
Take this example from PR #957
The new docstring looks like the one on the right:
I would argue that the docstring on the right is in fact less informative (!!) and the remaining "skeleton" example is essentially useless (sorry for the blunt expression), since it just repeats the function parameters back to the user. (And if the point of the skeleton example is purely to inform the user that there are 3 ways pyjanitor functions can be used -- method-chaining, piping, function -- then I think it is redundant, since this has already been mentioned on the home page and there's no need to repeat it in every subsequent function docstring.)
But I digress. My main point is that the docstrings, as they are currently being modified (examples removed with no link to / mention of the examples), are confusing to the new user who is just looking to understand what each function is meant to do.
On potential solutions
On the note of "linking" each of the function docstrings to their respective notebook examples, I suppose there are a few ways to design it, with considerations of BOTH the organization of the src code AND the eventual user experience of reading the docs:
I'm personally more in favour of 1 myself, but I suppose I'm in the minority. 😆 I genuinely don't think any of the pyjanitor function examples require a notebook to be explained thoroughly -- after all, they are just syntactic sugar for cleaning / manipulating dfs? I often see notebook examples in the context of explaining ML workflows / how to use a certain NN model (think: pytorch/dgl; training & evaluating ModelXXX on the MNIST dataset).
But barring solution 1, solution 3 seems like a nice middle ground (huge fan of pandas' docs), though it is probably more complicated to implement than 2. If we do go for 2, I think we also need a tl;dr section for each notebook; but that's a different issue altogether.
Thoughts?
ps: Also don't mean to knock on the efforts made in #957 too much, forgive me 😝 Happy new year all 🎉