New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix docstrings issues #2081
Fix docstrings issues #2081
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool thank you :)
src/datasets/arrow_dataset.py
Outdated
@@ -1308,41 +1324,45 @@ def map( | |||
and update the table (if function does updated examples). | |||
|
|||
Args: | |||
function (`callable`): with one of the following signature: | |||
function (`callable`): Function with one of the following signatures: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we use double backticks here ? same for the other arguments: ``callable``, ``bool``, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should make a decision, because now different conventions are used.
In other open source projects, the convention is:
- When listing the Parameters, the types are specified just bare (not as in-line)
Args: a (int): First param. b (str): Second param. c (DatasetInfo): Third param.
- Only in the descriptions and examples, this are displayed as in-line: name of a parameter, Python code, the name of a module, function, built-in, type, literal, class, method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option might be not including by hand the types in the docstring, and let Sphinx insert automatically in the docstring the type hints from the signature ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lhoestq, however in the latter, the links to the corresponding classes (if they exist) are not created.
Once we make some decision, we could document it for contributors ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should do the same as in transformers
using :obj:<Type>
Here that would be
function (:obj:`Callable`): Function with one of the following signatures:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LysandreJik and @sgugger made a docupmentation and docstring writing guide some time ago: https://github.com/huggingface/transformers/tree/master/docs#writing-documentation---specification
Maybe you can find some answers there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @thomwolf for pointing out this guide.
I would really like to discuss with the authors @LysandreJik and @sgugger what is the reason for what I consider abusing of the usage of the role :obj:
:
- When you write:
you are telling Sphinx: try to create a cross-reference link to the definition (the corresponding description directive) of
:obj:`str`
str
in our docs. But we not definestr
in our docs; that is defined in the Python Standard Library docs. Therefore it is impossible to create a link to it. - When you write:
you are telling to Sphinx: try to create a cross-reference link to the definition of
:obj:`(batch_size, sequence_length)`
(batch_size, sequence_length)
. Well, I think this clearly makes no sense.
Moreover, I think it is important to note that I have not seen this usage of :obj:
before in other Open Source projects, like TensorFlow, TensorFlow Datasets, Keras, PyArrow,...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes using :obj:
tells Sphinx to try to cross-reference the object inside but it will fail graciously if it does not find the reference and apply proper code formatting to what is inside, which is mostly why we use it. Leaving a type between backsticks will have it be rendered in italics (since Sphinx delegates it to restructured text which interprets it that way) which is something we did not want for the documentation of Transformers, as it is incorrect.
Even using double-backsticks does not apply the same formatting.
Note that this decision has been taken a long time ago (even before I joined) and it is very unlikely we will change the way it's done on the Tranformers side, as it would require changing 204k lines of codes (not all of them are docstrings but you get the idea).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your clear explanation, @sgugger.
I am just wondering why double-backticks do not apply the same formatting...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure either. It sometimes ends up as code-formatting + bold (the same as with an :obj:
), sometimes just code-formatting without the bold.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM thanks !
Fix docstring issues.