Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Main module detection #1336

Open
zverok opened this issue Jun 9, 2020 · 7 comments
Open

Main module detection #1336

zverok opened this issue Jun 9, 2020 · 7 comments
Labels

Comments

@zverok
Copy link

zverok commented Jun 9, 2020

I just recently noticed that for, say, Nokogiri gem, main module's docs are saying just

– DO NOT MODIFY!!!! This file is automatically generated by rex 1.0.7 from lexical definition file “lib/nokogiri/css/tokenizer.rex”. ++

That's contents of lib/nokogiri/tokenizer.rb, while lib/nokogiri.rb has proper module docs. (To add insult to injury, -- and ++ in tokenizer.rb is RDoc's agreement to "ignore those comments as docs", which YARD in RDoc-compatibility mode should probably respect).

The same I noticed about my own gem Inforboxer, it is not a content of the lib/infoboxer.rb. The gem is currently in "feature-complete, just maintenance" state, I haven't seriously touched it for quite some time, but when it was developed I had a great attention to docs, and checked "how it rendered" many times, and it was OK. I am not sure whether it is something that I changed in some recent bugfix release caused the docs problem, or some new YARD version, but definitely looks like a bug to me.

Couple more examples:

  • Faraday: rdoc.info vs main file
  • Sequel: rdoc.info, main file (this one more tricky, as sequel's main file is seque/core, but YARD correctly list it first in the list of "where this module defined"...)
@lsegal
Copy link
Owner

lsegal commented Jun 9, 2020

YARD has always (read: for a very long time) parsed files in ascending file-length order by default. In addition, YARD always takes the last defined docstring of a parsed object. In other words, YARD will resolve "Comment 2" as the docstring on module A in the following example:

# Comment 1
module A; end

# Comment 2
module A; end

A full example of the behavior you're seeing is as follows:

#! in lib/a.rb:

# Module documentation
module A; end

#! in lib/a/b.rb:

# Not module documentation
module A
  class B; end
end

In the above, YARD will use "Not module documentation" as the docstring for A if the default file order is used. YARD does allow you to re-order the file list, but only by forcing files to be parsed earlier, not later, so in this case, the only fix is to not have a docstring on the second instance of module A, or, explicitly list the exact file order of all files of your codebase in the .yardopts file, which is really just offered for completeness, not an actual recommendation.

For what it's worth though, both of these heuristics have existed in YARD for quite a while, so I'd be surprised if something in YARD changed. Parsing the above with YARD v0.8.0 yields the same results, and in addition, parsing Infoboxer with YARD v0.8.0 also has the same result:

image

(YARD v0.8.0 was released in April 2012)

My guess is this has to do with added Rubocop comments, which YARD can't reasonably determine is a comment worth ignoring from documentation, though you could write a plugin to do this.

@lsegal lsegal added the Question label Jun 9, 2020
@zverok
Copy link
Author

zverok commented Jun 10, 2020

@lsegal Thanks for the explanation, now "when it has happened" is clear.

But my real question is: would you mind (accept a PR, say) changing this logic?

If I'd just be concerned about my library, I'd handle it locally (with a few experiments).

But as examples with other libraries demonstrate, I believe this logic is "wrong" in some way. Considering the common pattern of lib/foo.rb containing Foo docs, and then all lib/foo/bar/baz/blah.rb starting with module Foo, it is (again, as demonstrated) not unusual to have an accidental comment from deeply nested file to be treated as a "main" one. Are there counter-examples making current logic more desirable? Or it is just "how things always were", and they shouldn't be changed due to backward compatibility?

@lsegal
Copy link
Owner

lsegal commented Jun 10, 2020

shouldn't be changed due to backward compatibility

This one most importantly. Changing the behavior would change a ton of living documentation in which users were making use of this heuristic, and that wouldn't be a pleasant experience for them, and in the worst case (for unmaintained gems) would forever break documentation.

But using the first entry rather than the last also creates a technical problem in which if YARD has a defined docstring, it cannot be overridden by subsequent files, in other words, if you're using --use-cache or relying on a base .yardoc db, you cannot re-document any objects-- this would break a bunch of things including the live reloading yard server. There needs to be a technical ability to take a newer docstring for an object. There are ways to potentially allow for both, but it creates an inconsistent experience in which the heuristic depends on context, leading to possible bugs and user confusion.

We could add a flag to toggle this behavior, but this likely wouldn't help your case since fixing this in a gem would still require code changes + a new version to be published, at which point resolving the comment or some kind of ignore-rubocop plugin could be added too (the latter might be something you want anyway).

Finally, just pointing out that while there might be a few libraries out there with this issue, examples like Nokogiri are a little moot given that Nokogiri is explicitly targeting RDoc with RDoc-specific features (to answer your initial question, YARD does not support any RDoc directive parsing features including --/++ or :nodoc:) so it's likely there are a few other compatibility issues with the doc generation there.

@zverok
Copy link
Author

zverok commented Jun 10, 2020

But using the first entry rather than the last also creates a technical problem in which if YARD has a defined docstring, it cannot be overridden by subsequent files, in other words, if you're using --use-cache or relying on a base .yardoc db, you cannot re-document any objects-- this would break a bunch of things including the live reloading yard server.

Oh, I see now. Probably the solution(s) might be imagined (like, naively "take the first doc string, but remember in cache from where it came, and observe that location changes"), but it is hard to estimate how hard they might be to implement.

I am starting to think that probably some external "documentation linting" tool (like my slightly dated but still useful yard-junk) telling "hey, dude, you have shown YARD the wrong comment" might be the most pragmatic approach. At least for "YARD-aware" projects.

Finally, just pointing out that while there might be a few libraries out there with this issue, examples like Nokogiri are a little moot given that Nokogiri is explicitly targeting RDoc with RDoc-specific features

Yeah, I get it. The problem is that https://rubydoc.info/ is nowadays, as far as I understand, the "default" documentation server for Ruby gems (at least if the author haven't explicitly specified another one, rubygems.org point at https://www.rubydoc.info/gems/<gemname>/).

At the same time (again, as far as I can understand):

  • you support rubydoc.info on your own, without any help/sponsorship/advice from the community, so it is hard to demand that it worked this or that way, to say the least :)
  • rubydoc.info is povered by YARD, and, again, demanding/asking to generate each gems docs in different tool is dumb... I could imagine some "Full RDoc compatibility mode" in YARD, but expect it to be a lot of (thanksless...) work.

So that's where I've came from: pointing students to "read the docs", and finding out they aren't always available; and thinking about the most constructive way to help fixing it.
(Faraday, though, YARD-oriented, but still fell under the bad spell.)

@hramrach
Copy link

So yard is rdoc compatible but does not support rdoc 🤯

@lsegal
Copy link
Owner

lsegal commented Dec 24, 2023

So yard is rdoc compatible but does not support rdoc 🤯

YARD does have a pretty extensive compatibility (and thus "support") of RDoc and its syntax-- if you want good evidence of this support, note that YARD's own inline code comments rely exclusively on RDoc formatting-- but it is by no means complete. YARD was never intended to be a drop-in replacement for RDoc (what would be the point in that?) and certain RDoc features that are at odds with YARD functionality cannot be implemented. The :nodoc: and --/++ preprocessor directives are great examples of this, since YARD does not have a concept of purposefully omitting documentation, as this was one of the core inspirations for creating the tool.

That said, YARD's plugin system is fairly extensive and supporting this could be done via custom extensions or plugins, pretty easily, too.

@hramrach
Copy link

There is this kind of failure that ruby-doc web shows the :nodoc: strings including the :nodoc: tag.

I suppose that rubydoc.info would need a plugin for fixing this then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants