Doxygen parsing missing #529

Leon0402 · 2020-09-15T19:00:58Z

Currently clangd doesn't seem to process doxygen commands or at least not all of them (there are different supported syntax formats by doxygen).

A sample method produces this output by clangd

As seen in the screenshot the commands like \brief, \param are ignored.

The VS Code C/C++ extension, on the other hand, is able to parse this correctly to

canatella · 2020-10-30T12:45:41Z

I could have a look at this. I had a quick look and found the getDeclComment function, but I guess no parsing happens anywhere. Is there infrastructure somewhere in clang to parse doxygen comments? Should the parsing happen in Hover.cpp or in CodeCompletionStrings.cpp? Maybe a new source file could be added for that and shared between those two. Any other thoughts on the way to implement this?

kadircet · 2020-10-30T13:08:39Z

Clang already has a doxygen parser, you can get a parsed result from RawComment::parse. It is hard to consume the final parsed comment tree in a generic enough way to accommodate all features though. So I would suggest handling this on a feature-by-feature basis, hence having the consumer in Hover.cpp for starters.

Maybe we can convert the parsed comment tree into a flat-structure that can be easily consumed by other features in clangd. E.g. something like:

struct Comment {
  string SymbolDescription; // possibly with helpers to render as markdown or plaintext.
  Map<string, Comment> Parameters;
  // and so on
};

This will require some thinking to generalize to different types of C++ symbols and doxygen constructs. So if you want to go down that path, it would be great to see some proposals and designs before diving into implementation.
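
To make the flat-structure idea a bit more concrete, here is a minimal sketch using only the standard library. The names FlatComment and renderMarkdown are hypothetical and not part of clangd, and parameter descriptions are kept as plain strings rather than nested comments to keep the example small.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical flat representation of a parsed documentation comment.
// Field names follow the pseudocode above; nothing here is clangd API.
struct FlatComment {
  std::string SymbolDescription;
  // Parameter name -> description, kept as plain strings for simplicity.
  std::map<std::string, std::string> Parameters;
};

// Renders the comment as Markdown: the description first, then one bullet
// per parameter with the parameter name in inline code.
std::string renderMarkdown(const FlatComment &C) {
  std::string Out = C.SymbolDescription;
  if (!C.Parameters.empty())
    Out += "\n\nParameters:\n";
  for (const auto &[Name, Doc] : C.Parameters) {
    assert(!Name.empty() && "parameter names must not be empty");
    Out += "- `" + Name + "`: " + Doc + "\n";
  }
  return Out;
}
```

Note that std::map iterates in alphabetical order, so this sketch loses the declaration order of the parameters; a real design would probably want a vector of pairs instead.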

andno037 · 2021-07-29T11:14:10Z

Any updates about this problem?

nullromo · 2021-12-02T20:17:52Z

It's been a year; any updates?

Also, @Leon0402, you said

or at least not all of them

Do you have any info on which features/syntaxes may be correctly processed?

NicolasIRAGNE · 2022-07-12T12:01:07Z

Hello, any update on this? Currently migrating to clangd and this is pretty much my only problem at the moment.

tom-anders · 2022-07-13T06:07:34Z

I took a look at this yesterday, here are my thoughts/ideas:

I think the right place to start would indeed be getDeclComment() in Hover.cpp. It looks like it's currently used for the following features:

Hover
Code Completion
Signature Help

all of which would benefit from improved doxygen parsing.

Now, we could directly convert the doxygen to markdown right in getDeclComment() and return it as a string, but this has the disadvantage that the information about the doxygen documentation that we've gathered (probably in a non-trivial way) would get thrown away just a few lines later. There are some features that could benefit from this additional information though, e.g. SignatureHelp could use it for filling the documentation field in ParameterInformation which would be a really cool thing we're not doing yet.

So I agree with @kadircet's approach of storing the information gathered from doxygen in a structure like this:

/// For \param.
/// There's also \tparam, which basically works in the same way but for template parameters,
/// so in the future we might make this class more generic.
struct ParameterInformation {
  ParameterInformation(const clang::comments::ParamCommandComment &);

  // The following three fields can be filled directly from ParamCommandComment.
  unsigned index;
  std::string paramName;
  llvm::Optional<clang::comments::ParamCommandComment::PassDirection> passDirection;

  // This is everything that comes after the \param command.
  // It could in theory also contain other commands, but for a start I think
  // we can just concatenate all children into a string.
  std::string description;
};

// This would be the new result of getDeclComment().
struct SymbolDocumentation {
  SymbolDocumentation(const clang::comments::FullComment &);

  // \param
  llvm::SmallVector<ParameterInformation> parameters;

  // \note command, there might be multiple of these.
  // We could e.g. render them in italics.
  llvm::SmallVector<std::string> notes;

  // \warning command, these could maybe be bold in markdown?
  llvm::SmallVector<std::string> warnings;

  /// Everything else, i.e. all the commands that we don't have any special handling for.
  std::string documentation;
};

I think a good way to start would be the \param commands, since parsing them correctly arguably provides the most value for users. Using this information in signatureHelp for ParameterInformation/documentation could also be done in this step.

In the future we can then add support for additional commands like \warning, \return, \retval, etc.
It would also be cool if we could recognize \code and translate it into markdown code blocks.

@kadircet What do you think? If you agree with this, I'd start with implementing the \param handling.
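
As a rough illustration of what "\param handling" has to extract, here is a standalone sketch that parses a single \param line by hand. It deliberately does not use Clang's comment parser; ParamDoc, PassDirection and parseParamLine are hypothetical names, and the line-based approach is an assumption made to keep the example self-contained.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical stand-ins for the structs proposed above; not clangd API.
enum class PassDirection { In, Out, InOut };

struct ParamDoc {
  std::string Name;
  std::optional<PassDirection> Direction; // set only if "[in]" etc. is present
  std::string Description;
};

// Parses a single line such as "\param[in] count number of items".
// Returns std::nullopt if the line is not a \param or @param command.
std::optional<ParamDoc> parseParamLine(const std::string &Line) {
  std::istringstream In(Line);
  std::string Token;
  if (!(In >> Token))
    return std::nullopt;

  // Split "\param[in]" into the command name and the optional direction.
  const std::string Base = Token.substr(0, Token.find('['));
  if (Base != "\\param" && Base != "@param")
    return std::nullopt;

  ParamDoc Doc;
  const std::string Dir = Token.substr(Base.size());
  if (Dir == "[in]")
    Doc.Direction = PassDirection::In;
  else if (Dir == "[out]")
    Doc.Direction = PassDirection::Out;
  else if (Dir == "[in,out]")
    Doc.Direction = PassDirection::InOut;

  if (!(In >> Doc.Name))
    return std::nullopt; // "\param" without a parameter name
  assert(!Doc.Name.empty());

  // Everything after the name is the description (may be empty).
  std::getline(In >> std::ws, Doc.Description);
  return Doc;
}
```

The real parser also handles multi-line descriptions and nested inline commands, which this sketch ignores.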

tom-anders · 2022-07-13T20:25:46Z

One more thing I haven't mentioned above: The index currently stores a symbol's documentation as a plain std::string, but I think we'll want to change this to the newly proposed SymbolDocumentation struct...?

How big of a deal would it be to change one of the datatypes in the index? Can clangd easily detect this and just rebuild the index or would we need some kind of conversion logic? @sam-mccall @HighCommander4 @kadircet

HighCommander4 · 2022-07-13T20:41:18Z

How big of a deal would it be to change one of the datatypes in the index? Can clangd easily detect this and just rebuild the index or would we need some kind of conversion logic?

Yep, there's a version number you can bump if you make a format change.

tom-anders · 2022-07-17T18:53:35Z

I played around a bit and here's what I came up with so far: https://reviews.llvm.org/D129972
Just a draft, but I'd be happy about some feedback on whether this is heading in the right direction :)

Currently looks like this:

So far, I've implemented parsing for the most common doxygen commands and added it to Hover. CodeCompletion still uses the unparsed comment string for now.

Some stuff left to do:

Figure out where to put the UTF-8 conversion (See TODO in CodeCompletionStrings: 97)
Investigate why the argument for \throws is not correctly parsed
Use the new info also in CodeCompletion and SignatureHelp
Update the index so that we can store the whole SymbolDocumentation struct.
I added logic to convert doxygen's commands like \b and \p into markdown. Unfortunately, clangd seems to escape the *, so it's currently not rendered correctly on the client side.
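
For readers wondering what the \b and \p conversion amounts to, here is a minimal standalone sketch of the text rewrite. It is not the code from the patch: inlineCommandsToMarkdown is a hypothetical name, it works on whitespace-separated tokens, and it does not address the escaping problem mentioned above.

```cpp
#include <sstream>
#include <string>

// Rewrites Doxygen inline commands that take a single word as argument:
//   \b word          -> **word**
//   \p word, \c word -> `word`
//   \e word, \em word -> *word*
// Runs of whitespace are collapsed to single spaces as a side effect.
std::string inlineCommandsToMarkdown(const std::string &Text) {
  std::istringstream In(Text);
  std::string Out, Word, Pending;
  while (In >> Word) {
    // A command token only sets the wrapper for the next word.
    if (Word == "\\b") { Pending = "**"; continue; }
    if (Word == "\\p" || Word == "\\c") { Pending = "`"; continue; }
    if (Word == "\\e" || Word == "\\em") { Pending = "*"; continue; }
    if (!Out.empty())
      Out += ' ';
    Out += Pending + Word + Pending;
    Pending.clear();
  }
  return Out;
}
```

For example, the input "Returns \p x if \b not null" becomes "Returns `x` if **not** null". Trailing punctuation attached to the argument ends up inside the markers, which a more careful implementation would want to avoid.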

codeinred · 2022-07-21T18:10:33Z

This looks really good!

endingly · 2022-07-26T06:10:35Z

I am waiting for your good news.

I noticed this when adding a new type to the index for clangd/clangd#529. When the assertion failed, this actually caused a crash, because llvm::Expected would complain that we did not take the error.

tom-anders · 2022-08-14T10:32:15Z

Patch is now submitted for Review: https://reviews.llvm.org/D131853

tom-anders · 2022-10-16T14:43:06Z

I played around a bit with using the parsed doxygen docs in other interesting ways, here are two things I've come up with:

When hovering on a parameter, use the \param command from the function declaration to fill the documentation:
When hovering on a variable that's passed to a function, add the documentation of the parameter that the variable is passed to:

Proof of Concept: https://reviews.llvm.org/D136038

kadircet · 2022-12-16T09:49:16Z

Copy/pasting the comment from review to here for having some high-level discussions.
Thanks a lot @tom-anders for taking a look at this (and sorry for such a delayed response).

Hi!

Sorry for letting this series of patches sit around without any comments. We were having some discussions internally to understand both the value proposition and its implications for the infrastructure.
So it would help a lot if you could provide some more information about use cases. That way we can find a nice scope for this functionality, one that provides most of the value without introducing unnecessary complexity into the rest of the infrastructure. We should also try not to regress indexing of projects that don't have doxygen comments.

So first of all, what are the exact use cases you're planning to address/improve with support for doxygen parsing of comments? A couple that come to mind:

obtaining docs about params & return value
stripping doxygen commands
treating brief/detail/warning/note differently
formatting text within comments (bold etc)
getting linebreaks/indent right clangd#1040

any other use cases that you believe are important?

as you might've noticed, this list already talks about dealing with certain doxygen commands (but not all).
that list is gathered by counting occurrences of those commands in a codebase with lots of open-source third_party code. findings and some initial ideas look like this:

\brief: ~70k occurrences
- common but usefulness in practice is unclear
- can infer for non-doxy too (e.g. first sentence of a regular documentation)
- maybe just strip (or merge into regular documentation)?
\return[s]: 30k occurrences
- unclear if worth separation in hover, because it might be tied to rest of the documentation (re-ordering concerns)
- can infer for non-doxy maybe?
- probably just strip the command and keep the rest as-is.
\param: 28k occurrences
- useful for signature help. maybe hover on func calls
- probably worth storing in a structured manner.
\detail[s]: 2k
\p: 20k
\code: 1k
\warning: 2k
\note: 9k
- (for all of the above) just introduce as formatted text?

what do you think about those conclusions? any other commands that seem worth giving a thought?
One important concern we've noticed around this part is, re-ordering comment components might actually hinder readability. as the comments are usually written with the assumption that they'll be read from top to bottom, but if we re-order them during presentation (e.g. hover has its own layout) we might start referring to concepts/entities in documentation before they're introduced. so we believe it's important to avoid any sort of re-ordering. this is one of the big down sides for preserving parameter comments in a structured way.

another important thing to consider is trying to heuristically infer some of these fields for non-doxygen code bases as well. that way we can provide a similar experience for both.

some other things to discuss about the design overall:

How to store the extra information?
- Proposal from our side would be to introduce structured storage for the pieces we want (limited), and keep the rest as part of main documentation text while doing stripping/reformatting.
What to use as a parser?
- Clang's doxygen parser actually looks like a great piece of code to re-use, it's unfortunate that it can issue diagnostics, requires AST etc. It'd be great to refactor that into a state where we can use it without any AST or diagnostics, and a minimal SourceManager (this seems feasible to achieve at first glance, as most of these inputs seem to be optional or used in 1 or 2 places).
- we still need to make sure performance and behaviour on non-doxygen is reasonable though. do you have any numbers here?
How to store in the index?
- If we can strip the parser off the dependencies on an astcontext, diagnostics etc. the best option would be to just store as raw text and run the whole pipeline on demand (e.g. do the doxygen parsing and markdown-ization afterwards). This is the simplest approach as it keeps index interfaces the same.

Happy to move the discussion to some other medium as well, if you would like to have them in discourse/github etc.

founderio · 2022-12-24T22:12:23Z

re-ordering comment components might actually hinder readability

For most parts of the documentation, only formatting or stripping is probably enough, as there would not be a need to hide information.

For "specific contexts" (e.g. parameters): Why not both? Storing the info both in "original order" and "parsed" gives the best of both worlds:

Hover over e.g. the function name => You see the whole documentation, unabridged, in the "original order", with some formatting spice
Hover over specific parts e.g. parameter, context info while inside function parameters, etc => Extract the relevant parts from the documentation and only show that bit
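
One way to get "both" without duplicating text would be to keep the original comment and store only offsets into it for the parts that need to be looked up individually. The sketch below assumes that design; DocWithIndex, Span and paramDoc are hypothetical names and nothing here reflects how clangd stores documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <string_view>

// A slice of the original comment text.
struct Span {
  std::size_t Offset;
  std::size_t Length;
};

struct DocWithIndex {
  // Shown unabridged, in its original order, when hovering the function.
  std::string Original;
  // Parameter name -> slice of Original holding that parameter's docs.
  std::map<std::string, Span> ParamSpans;
};

// Returns only the documentation for one parameter, or an empty view if the
// comment does not document it. The view points into D.Original, so it must
// not outlive D.
std::string_view paramDoc(const DocWithIndex &D, const std::string &Name) {
  auto It = D.ParamSpans.find(Name);
  if (It == D.ParamSpans.end())
    return {};
  assert(It->second.Offset + It->second.Length <= D.Original.size());
  return std::string_view(D.Original).substr(It->second.Offset,
                                             It->second.Length);
}
```

Hovering the function name would render Original as-is, while hovering a parameter would call paramDoc with that parameter's name.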

tom-anders · 2022-12-27T18:42:09Z

So first of all, what are the exact use cases you're planning to address/improve with support for doxygen parsing of comments? A couple that come to mind:
* obtaining docs about params & return value

* stripping doxygen commands

* treating brief/detail/warning/note differently

* formatting text within comments (bold etc)

* getting linebreaks/indent right [clangd#1040](https://github.com/clangd/clangd/issues/1040)
any other use cases that you believe are important?

Sounds about right!

as you might've noticed, this list already talks about dealing with certain doxygen commands (but not all). that list is gathered by counting occurrences of those commands in a codebase with lots of open-source third_party code. findings and some initial ideas look like this:
* \brief: ~70k occurrences
  
  * common but usefulness in practice is unclear
  * can infer for non-doxy too (e.g. first sentence of a regular documentation)
  * maybe just strip (or merge into regular documentation)?

Inferring this from the first sentence sounds reasonable, it looks like this is what clang::comments::BriefParser already does (this is then used by Sema/CodeComplete).
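
Just to illustrate the "first sentence as brief" heuristic for non-doxygen comments, here is a small standalone sketch. It is not how BriefParser works internally; inferBrief is a hypothetical helper and the sentence-boundary rule (a period followed by whitespace, or a blank line) is an assumption.

```cpp
#include <cstddef>
#include <string>

// Returns the text up to and including the first sentence-ending period,
// or up to the first blank line, whichever comes first. If neither is
// found, the whole comment is treated as the brief.
std::string inferBrief(const std::string &Doc) {
  for (std::size_t I = 0; I < Doc.size(); ++I) {
    const bool AtEnd = I + 1 == Doc.size();
    // A period only ends a sentence if followed by whitespace or the end,
    // so "1.2" does not split the brief.
    if (Doc[I] == '.' && (AtEnd || Doc[I + 1] == ' ' || Doc[I + 1] == '\n'))
      return Doc.substr(0, I + 1);
    // A blank line ends the first paragraph.
    if (Doc[I] == '\n' && !AtEnd && Doc[I + 1] == '\n')
      return Doc.substr(0, I);
  }
  return Doc;
}
```

Abbreviations such as "e.g. " would still cut the brief short, which is one reason to keep such heuristics conservative.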

* \return[s]: 30k occurrences
  
  * unclear if worth separation in hover, because it might be tied to rest of the documentation (re-ordering concerns)
  * can infer for non-doxy maybe?
  * probably just strip the command and keep the rest as-is.

One idea here might be to add this when hovering over a local variable that got assigned to the result of a function call, e.g.

/// \return my favorite foo
int makeFoo();

int main() {
   int foo = makeFoo();
  
   int bar = foo; // Hovering over "foo" here could maybe show the \return docs from makeFoo()
}

any other commands that seem worth giving a thought?

Maybe handling \throws could be useful for some codebases?

There's also #1320 which proposes to add support for \copydoc

One important concern we've noticed around this part is, re-ordering comment components might actually hinder readability. as
the comments are usually written with the assumption that they'll be read from top to bottom, but if we re-order them during
presentation (e.g. hover has its own layout) we might start referring to concepts/entities in documentation before they're
introduced. so we believe it's important to avoid any sort of re-ordering. this is one of the big down sides for preserving
parameter comments in a structured way.

I agree with you and @founderio here, it's probably best to not reorder anything (I wonder how the VSCode extension handles reordering though...? I don't have VSCode installed, but maybe someone else can check this)

another important thing to consider is trying to heuristically infer some of these fields for non-doxygen code bases as well. that way we can provide a similar experience for both.

So what are some heuristics we can use here (apart from "first sentence -> @brief")? I haven't really worked with anything other than doxygen yet, so what are some common ways people document e.g. parameters instead?

some other things to discuss about the design overall:

* How to store the extra information?
  
  * Proposal from our side would be to introduce structured storage for the pieces we want (limited), and keep the rest as part of main documentation text while doing stripping/reformatting.

👍

* What to use as a parser?
  
  * Clang's doxygen parser actually looks like a great piece of code to re-use, it's unfortunate that it can issue diagnostics, requires AST etc. It'd be great to refactor that into a state where we can use it without any AST or diagnostics, and a minimal SourceManager (this seems feasible to achieve at first glance, as most of these inputs seem to be optional or used in 1 or 2 places).

Hmm so you're probably talking about RawComment::parse here...? That seems to use a lot of AST stuff, for example Context.getAllocator() - What would we do here if we made the ASTContext an optional parameter? Pass in our custom allocator instead? Doesn't look to me as if we could get rid of the allocator completely here.

I'd be interested in doing this refactor, but I think I need a few more pointers before I can get started with this.

  * we still need to make sure performance and behaviour on non-doxygen is reasonable though. do you have any numbers here?

Tested this out with the neovim and LLVM codebases. With my proposed patch, index size increased from 7.7 MB to 7.9 MB for neovim and from 142 MB to 145 MB for LLVM. Indexing time (on my local machine, 12 threads) increased from 3.3s to 3.45s for neovim and from 13m16s to 13m19s for LLVM.

* How to store in the index?
  
  * If we can strip the parser off the dependencies on an astcontext, diagnostics etc. the best option would be to just store as raw text and run the whole pipeline on demand (e.g. do the doxygen parsing and markdown-ization afterwards). This is the simplest approach as it keeps index interfaces the same.

Yeah that sounds like the ideal solution (if the refactor of the parsing logic succeeds)

tom-anders · 2022-12-30T10:49:06Z

What to use as a parser?

Clang's doxygen parser actually looks like a great piece of code to re-use, it's unfortunate that it can issue diagnostics, requires AST etc. It'd be great to refactor that into a state where we can use it without any AST or diagnostics, and a minimal SourceManager (this seems feasible to achieve at first glance, as most of these inputs seem to be optional or used in 1 or 2 places).
Hmm so you're probably talking about RawComment::parse here...? That seems to use a lot of AST stuff, for example Context.getAllocator() - What would we do here if we made the ASTContext an optional parameter? Pass in our custom allocator instead? Doesn't look to me as if we could get rid of the allocator completely here.

I'd be interested in doing this refactor, but I think I need a few more pointers before I can get started with this.

Looked into this a bit more, the best solution would probably be to add the allocator to our SymbolDocumentation class? (This is the class that does the doxygen parsing and stores the structured information).

The CommentParser test actually already has an example of how to implement a SourceManager that just refers to a comment string, so we can probably reuse that.

However, SourceManager also wants a reference to the diagnostics, so it's a bit of a chicken and egg problem.

tom-anders · 2022-12-30T13:13:49Z

some other things to discuss about the design overall:

* How to store the extra information?
  
  * Proposal from our side would be to introduce structured storage for the pieces we want (limited), and keep the rest as part of main documentation text while doing stripping/reformatting.

One more thing I thought of: When parsing the doxygen comment, maybe for storing the stripped/reformatted text we can directly use markup::Document. That way we could also nicely support stuff like \code, \bold etc.

kadircet · 2023-01-18T11:28:00Z

One idea here might be to add this when hovering over a local variable that got assigned to the result of a function call, e.g.

That makes sense, but I think it's orthogonal to what we do with doxygen comments. it's more about transferring/inferring docs from initializer of a vardecl.

Maybe handling \throws could be useful for some codebases?

Just to be clear, I was still suggesting stripping "all" doxygen commands from the documentation. I am not sure what else we can do for throws. I guess we can have a special place in hover cards, but it doesn't seem to be common enough.

There's also #1320 which proposes to add support for \copydoc

This looks quite cool, but I'd actually leave it out of the initial scope at least (or just say something like "Same as Foo") as going from a textual representation of a symbol name to its identity is hard and heuristic searches here are likely to be hindering in big code bases.

So what are some heuristics we can use here (apart from "first sentence -> @brief")? I haven't really worked with anything other than doxygen yet, so what are some common ways people document e.g. parameters instead?

Well this is an exercise we need to do for every command we want to treat specially, in theory for parameters (if we chose to store them separately), we can search for sentences mentioning the parameter name and synthesize using those (or only synthesize when there's a single such sentence).
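
A conservative version of that heuristic could look like the sketch below: split the comment into sentences, and only synthesize a parameter description when exactly one sentence mentions the parameter as a whole identifier. The function names are hypothetical, and splitting sentences on '.' is a simplifying assumption.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

static bool isIdentChar(char C) {
  return std::isalnum(static_cast<unsigned char>(C)) || C == '_';
}

// True if Name occurs in Text as a whole identifier (not as a substring of
// a longer identifier, e.g. "count" inside "discount").
bool mentions(const std::string &Text, const std::string &Name) {
  assert(!Name.empty() && "parameter name must not be empty");
  for (std::size_t Pos = Text.find(Name); Pos != std::string::npos;
       Pos = Text.find(Name, Pos + 1)) {
    const std::size_t End = Pos + Name.size();
    const bool StartOk = Pos == 0 || !isIdentChar(Text[Pos - 1]);
    const bool EndOk = End == Text.size() || !isIdentChar(Text[End]);
    if (StartOk && EndOk)
      return true;
  }
  return false;
}

// Returns the single sentence mentioning Name, or std::nullopt if no
// sentence or more than one sentence mentions it.
std::optional<std::string> inferParamDoc(const std::string &Doc,
                                         const std::string &Name) {
  std::optional<std::string> Found;
  std::size_t Begin = 0;
  while (Begin < Doc.size()) {
    std::size_t End = Doc.find('.', Begin);
    End = End == std::string::npos ? Doc.size() : End + 1; // keep the '.'
    std::string Sentence = Doc.substr(Begin, End - Begin);
    Sentence.erase(0, Sentence.find_first_not_of(" \t\n")); // trim left
    if (mentions(Sentence, Name)) {
      if (Found)
        return std::nullopt; // ambiguous: give up rather than guess
      Found = Sentence;
    }
    Begin = End;
  }
  return Found;
}
```

Returning nothing in the ambiguous case trades recall for fewer false positives.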

Hmm so you're probably talking about RawComment::parse here...? That seems to use a lot of AST stuff, for example Context.getAllocator() - What would we do here if we made the ASTContext an optional parameter? Pass in our custom allocator instead? Doesn't look to me as if we could get rid of the allocator completely here.

Right, in theory we can pass in any allocator we want, it doesn't need to come from ASTContext. Anything but a dependency on the AST itself (the tree) is something we can "synthesize" in theory, and it seems there's no strong dependency on the tree itself but just on the support structures (AFAICT, all pieces that use D are null-checked first. I am not sure how much functionality we'll lose though).

I'd be interested in doing this refactor, but I think I need a few more pointers before I can get started with this.

Yeah, more than happy to help.

Tested this out with the neovim and LLVM codebases. With my proposed patch, index size increased from 7.7 MB to 7.9 MB for neovim and from 142 MB to 145 MB for LLVM. Indexing time (on my local machine, 12 threads) increased from 3.3s to 3.45s for neovim and from 13m16s to 13m19s for LLVM.

This looks promising, especially considering that this will probably get better since we're planning to preserve less structured information than the proposal and also perform more stripping.

However, SourceManager also wants a reference to the diagnostics, so it's a bit of a chicken and egg problem.

We've got a helper class called SourceManagerForFile, which would provide the mock SourceManager we need for parsing this comment.

One more thing I thought of: When parsing the doxygen comment, maybe for storing the stripped/reformatted text we can directly use markup::Document. That way we could also nicely support stuff like \code, \bold etc.

Right, I guess my initial explanation was a little vague, but I feel like that's what we should do: just store as raw text and run the whole pipeline on demand (e.g. do the doxygen parsing and markdown-ization afterwards). I think it would be best if doxygen parsing turns the documentation into markdown-ified raw text; then we can simply use the existing markdown parsing logic we have to generate a markup::Document.

tom-anders · 2023-01-18T19:23:20Z

One idea here might be to add this when hovering over a local variable that got assigned to the result of a function call, e.g.

That makes sense, but I think it's orthogonal to what we do with doxygen comments. it's more about transferring/inferring docs from initializer of a vardecl.

Yeah this just crossed my mind, but it's definitely out of scope for now.

Maybe handling \throws could be useful for some codebases?

Just to be clear, I was still suggesting stripping "all" doxygen commands from the documentation. I am not sure what else we can do for throws. I guess we can have a special place in hover cards, but it doesn't seem to be common enough.

So for example for "\param foo docs for foo" you'd propose to replace it by something like "foo: docs for foo" ?

There's also #1320 which proposes to add support for \copydoc

This looks quite cool, but I'd actually leave it out of the initial scope at least (or just say something like "Same as Foo") as going from a textual representation of a symbol name to its identity is hard and heuristic searches here are likely to be hindering in big code bases.

Agreed!

So what are some heuristics we can use here (apart from "first sentence -> @brief")? I haven't really worked with anything other than doxygen yet, so what are some common ways people document e.g. parameters instead?

Well this is an exercise we need to do for every command we want to treat specially, in theory for parameters (if we chose to store them separately), we can search for sentences mentioning the parameter name and synthesize using those (or only synthesize when there's a single such sentence).

Okay, seems like a lot of potential for false positives, I think the heuristics should be pretty conservative here

However, SourceManager also wants a reference to the diagnostics, so it's a bit of a chicken and egg problem.

We've got a helper class called SourceManagerForFile, which would provide the mock SourceManager we need for parsing this comment.

That sounds useful, I think I'll take another look at this and will reach out again when I encounter problems.

One more thing I thought of: When parsing the doxygen comment, maybe for storing the stripped/reformatted text we can directly use markup::Document. That way we could also nicely support stuff like \code, \bold etc.

Right, I guess my initial explanation was a little vague, but I feel like that's what we should do: just store as raw text and run the whole pipeline on demand (e.g. do the doxygen parsing and markdown-ization afterwards). I think it would be best if doxygen parsing turns the documentation into markdown-ified raw text; then we can simply use the existing markdown parsing logic we have to generate a markup::Document.

👍

tom-anders · 2023-01-29T18:55:04Z

Right, I guess my initial explanation was a little vague, but I feel like that's what we should do: just store as raw text and run the whole pipeline on demand (e.g. do the doxygen parsing and markdown-ization afterwards). I think it would be best if doxygen parsing turns the documentation into markdown-ified raw text; then we can simply use the existing markdown parsing logic we have to generate a markup::Document.

Sorry if I'm missing something, but where's this parsing logic? support/Markup.h only has logic for generating, not for parsing as far as I can tell. There's also llvm/DebugInfo/Symbolize/Markup.h, is that the file you mean?

kadircet · 2023-02-01T13:05:25Z

So for example for "\param foo docs for foo" you'd propose to replace it by something like "foo: docs for foo" ?

yeah, and for \throws foo maybe we can say "throws foo" etc.
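
To show what such in-place stripping could look like while keeping the original order of the comment, here is a minimal line-based sketch. stripCommands is a hypothetical helper, it only recognizes backslash-style commands at the start of a line, and the exact replacement text is an assumption based on the examples above.

```cpp
#include <sstream>
#include <string>

// Rewrites a few block commands line by line without reordering anything:
//   \param foo docs  -> foo: docs
//   \throws Foo docs -> throws Foo docs
//   \brief / \return / \returns -> command removed, text kept
// All other lines are passed through unchanged.
std::string stripCommands(const std::string &Doc) {
  std::istringstream In(Doc);
  std::string Out, Line;
  while (std::getline(In, Line)) {
    std::istringstream Words(Line);
    std::string Cmd, Arg, Rest;
    Words >> Cmd;
    if (Cmd == "\\param" && (Words >> Arg)) {
      std::getline(Words >> std::ws, Rest);
      Line = Arg + ": " + Rest;
    } else if (Cmd == "\\throws") {
      std::getline(Words >> std::ws, Rest);
      Line = "throws " + Rest;
    } else if (Cmd == "\\brief" || Cmd == "\\return" || Cmd == "\\returns") {
      std::getline(Words >> std::ws, Rest);
      Line = Rest;
    }
    Out += Line + "\n";
  }
  return Out;
}
```

Because each line is rewritten where it stands, the reading order of the original comment is preserved, which was the main concern with the structured approach.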

Sorry if I'm missing something, but where's this parsing logic?

To be clear, it's not so advanced and could use some improvements: https://github.com/llvm/llvm-project/blob/main/clang-tools-extra/clangd/Hover.cpp#L1357

tom-anders · 2023-02-01T19:30:49Z

Sorry if I'm missing something, but where's this parsing logic?

To be clear, it's not so advanced and could use some improvements: https://github.com/llvm/llvm-project/blob/main/clang-tools-extra/clangd/Hover.cpp#L1357

Ah, maybe an alternative would be to parse the doxygen commands directly into a markup::Document. For example, replace \p foo with addCodeBlock(foo).

We could then move our existing markdown-parsing logic to the same class that does the doxygen parsing and use it to convert non-doxygen comments to markup::Document as well. wdyt?

sr-tream · 2023-10-29T20:56:52Z

I don't understand, why would that comment be assigned to S?

It's a bug in the comment parser in https://reviews.llvm.org/D143112

aaronliu0130 · 2023-10-29T20:58:40Z

I know it's a bug that you think exists, but I don't understand what bug you think exists

HighCommander4 · 2023-10-29T23:33:06Z

Are we allowed to ping clangd?

I'm not sure what "ping clangd" means.

If the review comments on the patch so far have been addressed and the patch is waiting for additional review, the patch author can ping the reviewer(s) periodically as a reminder.

aaronliu0130 · 2023-10-30T01:26:11Z

I mean ping the clangd organization.

anstropleuton · 2024-01-17T16:46:53Z

@sam-mccall @kirillbobyrev @kadircet @hokein @Ceron257 @HO-COOH
Hey guys, (I am sorry that I pinged all of you), but let's not ignore this issue that was opened 3 years ago?
Also, have a really late Happy New Year!

aaronliu0130 · 2024-01-17T17:10:40Z

@tom-anders Now that the phab has been set to read-only, I think we should migrate the patches to a GitHub PR.

tristan957 · 2024-01-17T17:49:54Z

@anstropleuton that is pretty rude behavior. Why don't you work on this issue instead? These people don't work for free. People like you making requests of contributors and maintainers is the reason open-source software is so toxic.

This is in preparation for implementing doxygen parsing, see discussion in clangd/clangd#529. Differential Revision: https://reviews.llvm.org/D143112

anstropleuton · 2024-01-21T05:09:53Z

@tristan957 I am not good at programming and I can't reach the coding standards of this project. And making requests or giving feedback doesn't make the community toxic. I am just a user of this project. Writing hateful comments about me doesn't change anything, so please stop.

tristan957 · 2024-01-21T05:11:41Z

@anstropleuton Then offer to pay someone to fix this issue or please stop asking people to do things for you without compensation.

aaronliu0130 · 2024-01-21T14:50:19Z

This is pretty silly… feature requests exist for a reason, but pinging random people is indeed not great. Let’s just end this conversation here, no use polluting the comments.

anstropleuton · 2024-01-21T16:01:57Z

I just saw "I mean ping the clangd organization." by aaronliu0130 so I thought it would be fine... My apologies if it wasn't.
I am not old enough to have a bank account or to earn money, so I cannot spend money online on anything.
Besides, this is the first time I have asked anything on GitHub.
With that, I conclude that people who argue that the community asking for features is toxic are the ones who are actually toxic.
Can we stop arguing on silly stuff please?

Stehsaer · 2024-02-10T19:09:02Z

Any updates? Can't wait to see clangd supporting doxygen comments.

aaronliu0130 · 2024-02-10T20:06:36Z

We're all waiting on llvm/llvm-project#78491

Stehsaer · 2024-02-11T18:42:17Z

The hover display is created in the function clang::clangd::HoverInfo::present() const in clang-tools-extra/clangd/Hover.[h/cpp]. Modifying this and writing your own simple doxygen parser seems a neat temporary solution. I'm already working on that and plan to use my modified clangd until doxygen is officially supported. It's absolutely possible and easy to parse, since the content of the comment is already parsed and stored in a string.

aaronliu0130 · 2024-02-11T18:48:32Z

You can also try Tom Anders's patches in their Phabricator merge requests.

Stehsaer · 2024-02-12T09:59:41Z

@aaronliu0130 Any link to the merge request that has doxygen parsing? Tom Anders mentioned some different links and I can't figure out which one has doxygen added.

sr-tream · 2024-02-12T13:58:04Z

@Stehsaer Here are Tom Anders's patches updated for LLVM 18.x (they also compile with 19.x)

Stehsaer · 2024-02-13T09:50:56Z

@Stehsaer Here are Tom Anders's patches updated for LLVM 18.x (they also compile with 19.x)

How do I properly patch? I tried using git apply xxx.patch, it worked, but when compiling, it went:

E:\git-proj\llvm-project-release-18.x\clang-tools-extra\clangd\SymbolDocumentation.cpp: In lambda function:
[build] E:\git-proj\llvm-project-release-18.x\clang-tools-extra\clangd\SymbolDocumentation.cpp:62:44: error: 'clang::comments::InlineCommandComment::RenderKind' has not been declared
[build]    62 |       case comments::InlineCommandComment::RenderKind::RenderMonospaced:
[build]       |                                            ^~~~~~~~~~
[build] E:\git-proj\llvm-project-release-18.x\clang-tools-extra\clangd\SymbolDocumentation.cpp:64:44: error: 'clang::comments::InlineCommandComment::RenderKind' has not been declared
[build]    64 |       case comments::InlineCommandComment::RenderKind::RenderBold:
[build]       |                                            ^~~~~~~~~~
[build] E:\git-proj\llvm-project-release-18.x\clang-tools-extra\clangd\SymbolDocumentation.cpp:66:44: error: 'clang::comments::InlineCommandComment::RenderKind' has not been declared
[build]    66 |       case comments::InlineCommandComment::RenderKind::RenderEmphasized:
[build]       |                                            ^~~~~~~~~~

The enum comments::InlineCommandComment::RenderKind seems to be missing. I'm using release/18.x as the base. The full build log is attached here:
log.txt
The branch I'm using: release/18.x
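The errors above suggest that the patch was written against a tree where RenderKind was nested inside InlineCommandComment, while the 18.x headers no longer declare it there. One plausible cause, which is an assumption to check against clang/AST/Comment.h in your own checkout, is that the nested enum was replaced by a namespace-level scoped enum. The sketch below uses a local stand-in enum to show what the switch could look like in that case; the stand-in and the `renderInline` helper are hypothetical and are not the real clang declarations.

```cpp
#include <string>

// Local stand-in for the scoped enum assumed above; NOT the real clang
// declaration. Check clang/AST/Comment.h for the actual name and values.
enum class InlineCommandRenderKind {
  Normal,
  Bold,
  Monospaced,
  Emphasized,
  Anchor
};

// Hypothetical helper: wraps Text in Markdown delimiters for the given kind.
std::string renderInline(InlineCommandRenderKind Kind,
                         const std::string &Text) {
  switch (Kind) {
  case InlineCommandRenderKind::Monospaced:
    return "`" + Text + "`";
  case InlineCommandRenderKind::Bold:
    return "**" + Text + "**";
  case InlineCommandRenderKind::Emphasized:
    return "*" + Text + "*";
  case InlineCommandRenderKind::Normal:
  case InlineCommandRenderKind::Anchor:
    return Text;
  }
  return Text; // not reached for valid enum values
}
```

If the assumption holds, the case labels in SymbolDocumentation.cpp would need the same kind of change, which is presumably what a patch targeting a newer tree already does.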

My exact steps:

1. Download the source as a zip from release/18.x and unzip it into a folder.
2. Download the patch hover-doxygen.patch from the link and put it somewhere.
3. Run git apply ../hover-doxygen.patch, which gives these warnings:

../hover-doxygen.patch:990: trailing whitespace.
    /*!
../hover-doxygen.patch:1001: trailing whitespace.
    /**
../hover-doxygen.patch:1013: trailing whitespace.
    /*!
warning: 3 lines add whitespace errors.

4. Configure the CMake project, with options:

5. Instruct CMake to build clangd only, in release mode.
6. The build reports the errors shown above.

sr-tream · 2024-02-14T12:56:29Z

> Download the patch hover-doxygen.patch from the link and put it somewhere

@Stehsaer That patch is for an older LLVM. Use hover-doxygen-trunk.patch instead. The link to it is in my previous post.

Stehsaer · 2024-02-14T13:16:13Z

Thanks, problem solved. It's working as intended and I've modified a little bit to match my own preferences.

Some features are missing:

1. `@return` and `@returns` are identical
2. `@details` is not parsed
3. No doxygen parsing in autocomplete

How can I add parsing to autocomplete? I need some pointers on where to make changes.
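For the `@details` point in the list above, here is a minimal sketch of how a detailed description could be collected from a comment string. The `extractDetails` function is a hypothetical helper, not code from the patch. As a simplifying assumption, the block is taken to end at the first blank line or at the next line that starts with a command, which is stricter than what Doxygen itself does.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Removes leading spaces and tabs.
static std::string ltrimLine(const std::string &S) {
  const auto Pos = S.find_first_not_of(" \t");
  return Pos == std::string::npos ? std::string() : S.substr(Pos);
}

// Returns true if Line starts with "\details" or "@details" as a whole word.
static bool isDetailsCommand(const std::string &Line) {
  const std::string Name = "details";
  if (Line.size() < Name.size() + 1)
    return false;
  if (Line[0] != '\\' && Line[0] != '@')
    return false;
  if (Line.compare(1, Name.size(), Name) != 0)
    return false;
  return Line.size() == Name.size() + 1 ||
         std::isspace(static_cast<unsigned char>(Line[Name.size() + 1]));
}

// Hypothetical helper: returns the text of the first \details block, with
// its lines joined by single spaces, or an empty string if there is none.
std::string extractDetails(const std::string &Comment) {
  std::istringstream In(Comment);
  std::string Line, Details;
  bool InDetails = false;
  while (std::getline(In, Line)) {
    Line = ltrimLine(Line);
    if (InDetails) {
      // Assumption: a blank line or any new command ends the block.
      if (Line.empty() || Line[0] == '\\' || Line[0] == '@')
        break;
      if (!Details.empty())
        Details += ' ';
      Details += Line;
    } else if (isDetailsCommand(Line)) {
      InDetails = true;
      Details = ltrimLine(Line.substr(8)); // skip "\details" (8 characters)
    }
  }
  return Details;
}
```

Inline commands at the start of a continuation line would end the block early in this sketch, so a real implementation would need a list of block-level commands.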

sr-tream · 2024-02-14T13:22:26Z

You can use this patch to render more fields and fix the detailed description.

> No doxygen document parsing in auto complete,

I don't know how to enable the HI window in autocomplete, so I can't look into this problem.

Stehsaer · 2024-02-28T06:53:32Z

Is there any schedule or plan to merge the patches into main?

HighCommander4 · 2024-05-06T18:58:36Z

Not sure how relevant this is to us, but there are some in-flight patches to improve the doxygen parsing code that's upstream in the clang frontend (clang/AST/CommentParser.h): https://discourse.llvm.org/t/rfc-improving-clangs-comment-parsing-to-conform-better-to-doxygen-semantics/78785

sam-mccall added the enhancement and good first issue labels Sep 16, 2020

HighCommander4 mentioned this issue Jan 14, 2021: documentation is bad comparing to ccls #645 (Closed)

HighCommander4 mentioned this issue Aug 1, 2021: Doxygen preview clangd/vscode-clangd#213 (Closed)

kirillbobyrev mentioned this issue Dec 7, 2021: SignatureHelp doesn't respect documentationFormat #945 (Closed)

HighCommander4 mentioned this issue Feb 10, 2022: Hover hint comments have no formatting clangd/vscode-clangd#10 (Open)

tom-anders mentioned this issue Jan 17, 2024: [clangd] Support parsing comments without ASTContext llvm/llvm-project#78491 (Open)

randoragon mentioned this issue Feb 3, 2024: Consider adding in-code documentation randoragon/libstaple#3 (Open)

This was referenced Feb 21, 2024: Doxygen comments not shown properly in tooltips clangd/vscode-clangd#585 (Closed); vim.lsp.buf.hover don't have any syntax highlight in C file neovim/neovim#27563 (Closed)

Stehsaer mentioned this issue Feb 25, 2024: Help: Error reported for a thousand times, literally: IMPORTED_IMPLIB not set for imported target llvm/llvm-project#82915 (Closed)
