Added google/napoleon docstring support to fix #504 #617
Conversation
@@ -29,11 +29,13 @@
     r'\s*:type\s+%s:\s*([^\n]+)', # Sphinx
     r'\s*:param\s+(\w+)\s+%s:[^\n]+', # Sphinx param with type
     r'\s*@type\s+%s:\s*([^\n]+)', # Epydoc
     '\s*%s\s*\((.+?)\):.+', # Google param with type
This doesn't check for `Args`. Shouldn't we do that? Otherwise it might just match too much. Also, I don't think you need the `.+`.
Well, it's basically a context-free language, which would require counting the parentheses. Since that's not possible in a regular language, `.+?` seemed like the best solution here. We could try matching everything except `:` instead, but I'm not sure that will work correctly in all cases.
As for the `Args`, while not impossible, that would require a fairly large rewrite. The entire `_search_param_in_docstr` function wouldn't be usable anymore, as it would only be allowed to match using this regex when `Args:` precedes it. Or have I missed something in the code that would easily allow for this?
> it's basically a context free language

I understand that, but you could check it beforehand. That makes it a little bit more complicated (on the Python side), but it could still work. Obviously, modifying the regexes would not be enough anymore.
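Checking for `Args:` beforehand could look roughly like this (a hypothetical helper sketching the suggestion above, not Jedi's `_search_param_in_docstr`; the function name and sample docstring are invented):

```python
import re

def param_type_under_args(docstr, param_name):
    # Only apply the Google-style param regex to lines that appear
    # under an 'Args:' heading, as suggested in the review.
    pattern = re.compile(r'\s*%s\s*\((.+?)\):' % re.escape(param_name))
    in_args = False
    for line in docstr.splitlines():
        stripped = line.strip()
        if stripped == 'Args:':
            in_args = True
            continue
        # Any other non-indented, non-empty line ends the Args block.
        if in_args and stripped and not line[:1].isspace():
            in_args = False
        if in_args:
            match = pattern.match(line)
            if match:
                return match.group(1)
    return None

doc = """Summary.

note (str): not under Args, so it should be ignored.

Args:
    count (int): number of items.
"""
print(param_type_under_args(doc, 'count'))  # → int
print(param_type_under_args(doc, 'note'))   # → None
```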
@wolph If you want this to get merged, you need to invest some time in writing more detailed regexes. Sorry, but there are too many wildcards up there for me.
@davidhalter I'm planning to, I'm just fully booked till March. After that I'll have time to work on this issue. No worries, I haven't forgotten :)
Cool! :)
@wolph Are you still working on this? I can take a look at it if not. I can play with the regexes, or we could also modify this bit so we can specify the parse method using a function instead of a regex string. However, I do think that adding Sphinx as a dependency wouldn't be that bad of an idea, as suggested in issue #504 — this syntax is ambiguous, so it'd probably be nicer to just let someone else take care of the parsing (otherwise, the set of docstrings that Jedi handles correctly will likely not be the same as the set that Sphinx handles, and people may well be using both). If we're sticking with regexes, I think there isn't much to improve in this one (apart from possibly replacing …).
@achalddave If you have time to work on it, please do. I have it planned for this month, but I'm currently still finishing my backlog of tasks, so help is greatly appreciated. As for a parsing function instead of a regex: that would be the better solution, but it might be a bit of a big patch in that case. To properly parse this you would need a stack-based approach, although I think a simple version (count the parentheses) could work too and might already be enough for this case. That regex will definitely fail, since the arguments can actually span multiple lines.
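The "count the parentheses" idea could be sketched as follows (an illustration under the single-line assumption discussed below, not Jedi code; the function name is invented):

```python
def balanced_paren_type(line, param_name):
    # Extract the type annotation between balanced parentheses
    # following the parameter name, by tracking nesting depth.
    rest = line.strip()
    if not rest.startswith(param_name):
        return None
    rest = rest[len(param_name):].lstrip()
    if not rest.startswith('('):
        return None
    depth = 0
    for i, ch in enumerate(rest):
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth == 0:
                # Google style terminates the type with '):'.
                return rest[1:i] if rest[i + 1:i + 2] == ':' else None
    return None  # unbalanced parentheses

print(balanced_paren_type('    x (dict(str, int)): a mapping', 'x'))  # → dict(str, int)
```

Unlike the non-greedy regex, this version also rejects lines with unbalanced parentheses instead of silently mis-capturing.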
The arguments can span multiple lines, but I think it would be safe to say that the type will be on the same line? Maybe that's just the way I've used it — in that case, leaving the regex as-is is fine. Even with a stack-based approach, we'd assume that there are no stray open or close parentheses within the type annotation, which is a fair assumption to make, but it may differ from the assumption that Sphinx makes (and even if we use the same assumptions as them, their assumptions may change). But if adding Sphinx is a no-go, I can try and work on this at some point after mid-March.
True, that's probably a fair assumption. If it needs to span multiple lines, then perhaps it's just too complex ;) I'm personally in favour of handling this using the native Sphinx code, but that does add an external dependency, and this might be part of a private API which could break if it's somehow out of date or is ever changed. That last bit is an assumption, but it might be worth investigating whether this is part of a stable API or not.
@davidhalter What're your thoughts? I don't actually know if this regex can be made more specific without rejecting valid strings. I'd say that for now, something simple like this would still be better than no support for Google-style docstrings (at least for me :) ).
As I've mentioned before, I don't like the current regexes. They are too broad. However, I don't think we can fix this by just improving the regexes. We need to parse the docstring in the case of Google docstrings! I would recommend splitting this logic from the sphinx/epydoc parsing — it's just something different. Adding a dependency just for this is really not an option, sorry.
Closing, because it's been open for quite a long time and no progress has been made. Let me know when you're done!
Sure, no problem :) The proposed pull request works well enough for me anyhow.
@davidhalter What exactly needs to be done on this branch to make it an acceptable merge? From what I understand, a good solution should be based on parsing rather than on context-insensitive regexes. And is using a third-party library like pyparsing out of the question? If pyparsing is not OK, would a more heuristic approach based on docstring indentation work? My basic idea is to parse a Google-style docstring into blocks based on indentation. Ignoring the first line, the code would search for the first line that matches the …

What do you think?
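The indentation-based idea above could be sketched like this (a rough illustration under simplifying assumptions — single-word section headings, one parameter per line — not the eventual implementation; the function name and sample docstring are invented):

```python
def split_google_sections(docstr):
    # Group the docstring into {heading: [lines]} sections, where a heading
    # is a non-indented, single-word line ending in ':' (e.g. 'Args:').
    sections = {}
    current = None
    for line in docstr.splitlines():
        stripped = line.strip()
        if line == stripped and stripped.endswith(':') and ' ' not in stripped:
            current = stripped[:-1]
            sections[current] = []
        elif current is not None and stripped:
            sections[current].append(stripped)
    return sections

doc = """Summary line.

Args:
    x (int): the first argument.
    y (str): the second argument.

Returns:
    bool: whether it worked.
"""
print(split_google_sections(doc))
```

Once the docstring is split into sections, the param regex would only need to run on the lines under `Args`, which sidesteps most of the false-positive concerns raised earlier.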
For this issue, yes. It's just too small of an improvement for Jedi.
That's exactly what I would have done. Happy to merge such a pull request. The only alternative to that is to use the Jedi parser itself. You could create a grammar file for it and somehow use that. It's probably not even too hard (if you understand parsers). However, I think that's almost too much work for something that simple. (Not sure though; the parser is pretty good, but it might be more of an issue to write a tokenizer that fits.)