Added google/napoleon docstring support to fix #504 #617

wolph · 2015-08-13T11:18:39Z

No description provided.

davidhalter · 2015-08-13T12:16:22Z

jedi/evaluate/docstrings.py

@@ -29,11 +29,13 @@
    r'\s*:type\s+%s:\s*([^\n]+)',  # Sphinx
    r'\s*:param\s+(\w+)\s+%s:[^\n]+',  # Sphinx param with type
    r'\s*@type\s+%s:\s*([^\n]+)',  # Epydoc
+    '\s*%s\s*\((.+?)\):.+',  # Google param with type


This doesn't check for Args. Shouldn't we do that? Otherwise it might just much

Also I don't think you need .+.

Well, it's basically a context free language which would require to count the parenthesis, since that's not possible in a regular language it seemed like .+? would be the best solution here. We could try everything except for : instead but I'm not sure that will work correctly in all cases.

As for the Args, while not impossible, that would require a fairly large rewrite. The entire _search_param_in_docstr function wouldn't be usable anymore as it's only allowed to match using this regex when Args: precedes it. Or have I missed something in the code that would easily allow for this?

it's basically a context free language

I understand that, but you could check it before hand. That makes it a little bit more complicated (in the Python area), but could still work. Obviously modifying regexes would not be enough anymore.

davidhalter · 2016-02-02T13:05:59Z

@wolph If you want this to get merged, you need to invest some time in writing more detailed regexes. Sorry, but there's too many wildcards up there for me.

wolph · 2016-02-02T13:09:01Z

@davidhalter I'm planning to, I'm just fully booked till march. After that I'll have time to work on this issue.

No worries, I haven't forgotten :)

davidhalter · 2016-02-02T13:10:21Z

Cool! :)

achalddave · 2016-03-03T20:43:08Z

@wolph Are you still working on this? I can take a look at it if not.

I can play with the regexes, or we could also modify this bit so we can specify the parse method using a function instead of a regex string. However, I do think that adding sphinxdoc as a dependency wouldn't be that bad of an idea as suggested in issue #504 - this syntax is ambiguous so it'd probably be nicer to just let someone else take care of the parsing (otherwise, the set of docstrings that Jedi handles correctly will likely not be the same as the set that Sphinx handles, and people may well be using both).

If we're sticking with regexes, I think there isn't much to improve to this one (apart from possibly replacing .+? with [^\n]+?)

wolph · 2016-03-03T22:23:36Z

@achalddave If you have time to work on it, please do. I have it planned for this month but I'm currently still finishing my backlog of tasks so help is greatly appreciated.

As for a parsing function instead of a regex, that would be the better solution but it might be a bit of a big patch in that case. To properly parse this you would need a stack based approach, although I think a simple version (count the parentheses) could work too and might be enough already for this case.

That regex will definitely fail since the arguments can actually span multiple lines.

achalddave · 2016-03-03T22:27:42Z

The arguments can span multiple lines, but I think it would be safe to say that the type will be on the same line? Maybe that's just the way I've used it - then leaving the regex as is is fine.

Even with a stack-based approach, we'd make the assumption that there are no stray open or close parentheses within the type annotation, which is a fair assumption to make, but may be a different assumption from the one that Sphinx makes (and even if we use the same assumptions as them, their assumptions may change). But if adding Sphinx is a nogo, I can try and work on this at some point after mid March.

wolph · 2016-03-03T22:31:20Z

True, that's probably a fair assumption. If it needs to span multiple lines than perhaps it's just too complex ;)

I'm personally in favour of handling this using the native Sphinx code, but it does add an external dependency and this might be part of a private API which could break if it's somehow out of date or is ever changed. That last bit is an assumption but might be worth investigating, is this part of a stable API or not.

achalddave · 2016-03-09T02:24:22Z

@davidhalter What're your thoughts? I don't actually know if this regex can actually be made more specific without rejecting valid strings. I'd say for now something simple like this would still be better than no support for Google style docstrings (at least for me :) ).

davidhalter · 2016-03-09T08:45:42Z

As I've mentioned before, I don't like the current regexes. They are too broad.

However, I don't think we can fix this by just improving regexes. We need to parse the docstring in the case of google docstrings! I would recommend to split this logic from the sphinx/epydoc parsing. It's just something different.

Adding a dependency just for this is really not an option, sorry.

davidhalter · 2016-07-03T14:07:31Z

Closing, because it's been open for quite a long time and no progress has been made. Let me know when you're done!

wolph · 2016-07-03T18:58:51Z

Sure, no problem :)

The proposed pull request works good enough for me anyhow

Erotemic · 2016-10-20T18:13:07Z

@davidhalter What exactly needs to be done on this branch to make it an acceptable merge? From what I understand, a good solution should be based on parsing rather than using context insensitive regexes. And using a third party library like pyparsing is out of the question?

If pyparsing is not ok, would a more heuristic based approach based on docstring indentation work? My basic idea is to parse a google-style docstring into blocks based on indentations. Ignoring the first line, the code would search for the first line that matches the Args tag in the correct indentation followed by a colon and then where the next non-empty line is indented. All lines in this indentation scope would then be considered part of the Args block. Then each non-empty line at the root of the Args indentation scope would be considered as new args lines. Google style enforces that continuations of lines should be indented one further, so this should isolate each arg line. Once each arg line is isolated a regex could be used to extract the name, type, and description. I believe this should ameliorate concerns about over-matching.

What do you think?

davidhalter · 2016-10-23T15:26:06Z

And using a third party library like pyparsing is out of the question?

For this issue, yes. It's just too small of an improvement for Jedi.

If pyparsing is not ok, would a more heuristic based approach based on docstring indentation work? My basic idea is to parse a google-style docstring into blocks based on indentations.

That's exactly what I would have done. Happy to merge such a pull request.

The only alternative to that is to use the Jedi parser itself. You could create a grammar file for it and somehow use that. It's probably not even too hard (if you understand parsers). However I think that's almost too much work for something that simple. (Not sure though, the parser is pretty good, might be more of an issue to write a tokenizer that fits).

added google/napoleon docstring support to fix #504

a724a7d

davidhalter reviewed Aug 13, 2015
View reviewed changes

davidhalter closed this Jul 3, 2016

Erotemic mentioned this pull request Nov 7, 2016

Added google style docstr support + numpy docstr fixes #796

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Added google/napoleon docstring support to fix #504 #617

Added google/napoleon docstring support to fix #504 #617

wolph commented Aug 13, 2015

davidhalter Aug 13, 2015

wolph Aug 13, 2015

davidhalter Aug 13, 2015

davidhalter commented Feb 2, 2016

wolph commented Feb 2, 2016

davidhalter commented Feb 2, 2016

achalddave commented Mar 3, 2016

wolph commented Mar 3, 2016

achalddave commented Mar 3, 2016

wolph commented Mar 3, 2016

achalddave commented Mar 9, 2016

davidhalter commented Mar 9, 2016

davidhalter commented Jul 3, 2016

wolph commented Jul 3, 2016

Erotemic commented Oct 20, 2016

davidhalter commented Oct 23, 2016

Added google/napoleon docstring support to fix #504 #617

Added google/napoleon docstring support to fix #504 #617

Conversation

wolph commented Aug 13, 2015

davidhalter Aug 13, 2015

Choose a reason for hiding this comment

wolph Aug 13, 2015

Choose a reason for hiding this comment

davidhalter Aug 13, 2015

Choose a reason for hiding this comment

davidhalter commented Feb 2, 2016

wolph commented Feb 2, 2016

davidhalter commented Feb 2, 2016

achalddave commented Mar 3, 2016

wolph commented Mar 3, 2016

achalddave commented Mar 3, 2016

wolph commented Mar 3, 2016

achalddave commented Mar 9, 2016

davidhalter commented Mar 9, 2016

davidhalter commented Jul 3, 2016

wolph commented Jul 3, 2016

Erotemic commented Oct 20, 2016

davidhalter commented Oct 23, 2016