Doxygen parsing missing #529

Open
Leon0402 opened this issue Sep 15, 2020 · 54 comments
Labels
enhancement (New feature or request), good first issue (Good for newcomers)

Comments

@Leon0402

Currently clangd doesn't seem to process doxygen commands, or at least not all of them (doxygen supports several syntax formats).

A sample method produces this output in clangd:
[image]
As seen in the screenshot, commands like \brief and \param are ignored.

The VS Code C/C++ extension, on the other hand, parses this correctly:
[image]

@sam-mccall added the "enhancement" and "good first issue" labels on Sep 16, 2020
@canatella

I could have a look at this. I had a quick look and found the getDeclComment function, but I guess it is not parsing anything. Is there infrastructure somewhere in clang to parse doxygen comments? Should the parsing happen in Hover.cpp or in CodeCompletionStrings.cpp? Maybe a new source file could be added for that and shared between those two. Any other thoughts on how to implement this?

@kadircet
Member

Clang already has a doxygen parser, you can get a parsed result from RawComment::parse. It is hard to consume the final parsed comment tree in a generic enough way to accommodate all features though. So I would suggest handling this on a feature-by-feature basis, hence having the consumer in Hover.cpp for starters.

Maybe we can convert the parsed comment tree into a flat-structure that can be easily consumed by other features in clangd. E.g. something like:

struct Comment {
  string SymbolDescription; // possibly with helpers to render as markdown or plaintext.
  Map<string, Comment> Parameters;
  // and so on
};

This will require some thinking to generalize to different types of C++ symbols and doxygen constructs, so if you want to go down that path it would be great to see some proposals and designs before diving into implementation.

@andno037

Any updates about this problem?

@nullromo

nullromo commented Dec 2, 2021

It's been a year; any updates?

Also, @Leon0402, you said

or at least not all of them

Do you have any info on which features/syntaxes may be correctly processed?

@NicolasIRAGNE

Hello, any update on this? Currently migrating to clangd and this is pretty much my only problem at the moment.

@tom-anders

I took a look at this yesterday, here are my thoughts/ideas:

I think the right place to start would indeed be getDeclComment() in Hover.cpp. It looks like it's currently used for the following features:

  • Hover
  • Code Completion
  • Signature Help

all of which would benefit from improved doxygen parsing.

Now, we could directly convert the doxygen to markdown right in getDeclComment() and return it as a string, but this has the disadvantage that the information about the doxygen documentation that we've gathered (probably in a non-trivial way) would get thrown away just a few lines later. There are some features that could benefit from this additional information though, e.g. SignatureHelp could use it for filling the documentation field in ParameterInformation which would be a really cool thing we're not doing yet.

So I agree with @kadircet's approach of storing the information gathered from doxygen in a structure like this:

/// For \param.
/// There's also \tparam, which basically works in the same way but for template parameters,
/// so in the future we might make this class more generic.
struct ParameterInformation {
  ParameterInformation(const clang::comments::ParamCommandComment &);

  // The following three fields can be filled directly from ParamCommandComment.
  unsigned index;
  std::string paramName;
  llvm::Optional<ParamCommandComment::PassDirection> passDirection;

  // This is everything that comes after the \param command.
  // It could in theory also contain other commands, but for the start I think
  // we can just concatenate all children into a string.
  std::string description;
};

// This would be the new result of getDeclComment().
struct SymbolDocumentation {
  SymbolDocumentation(const clang::comments::FullComment &);

  // \param
  llvm::SmallVector<ParameterInformation> parameters;

  // \note command, there might be multiple of these.
  // We could e.g. render them in italics.
  llvm::SmallVector<std::string> notes;

  // \warning command, these could maybe be bold in markdown?
  llvm::SmallVector<std::string> warnings;

  /// Everything else, i.e. all the commands that we don't have any special handling for.
  std::string documentation;
};

I think a good way to start would be the \param commands, since parsing them correctly arguably provides the most value for users. Using this information in signatureHelp for ParameterInformation/documentation could also be done in this step.

In the future we can then add support for additional commands like \warning, \return, \retval, etc.
It would also be cool if we could recognize \code and translate it into markdown code blocks.

@kadircet What do you think? If you agree with this, I'd start with implementing the \param handling.
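
To make the \param-first idea concrete, here is a minimal sketch that pulls parameter docs out of a comment body using only the standard library. It is not clang's comment parser and not the proposed patch: `ParamDoc` and `extractParams` are hypothetical names, the input is assumed to have its comment markers already stripped, and each \param is assumed to fit on one line.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the proposed ParameterInformation; not clangd code.
struct ParamDoc {
  std::string Name;
  std::string Description;
};

// Collects "\param <name> <text>" (or "@param") lines from a comment body
// whose comment markers (///, /**, *) have already been stripped.
// Direction annotations such as \param[in] are not handled here.
std::vector<ParamDoc> extractParams(const std::string &Comment) {
  std::vector<ParamDoc> Result;
  std::istringstream Lines(Comment);
  std::string Line;
  while (std::getline(Lines, Line)) {
    std::istringstream Words(Line);
    std::string Command, Name;
    if (!(Words >> Command >> Name))
      continue; // fewer than two words on this line
    if (Command != "\\param" && Command != "@param")
      continue;
    std::string Rest;
    std::getline(Words, Rest);
    // Drop the whitespace between the name and the description.
    std::size_t Start = Rest.find_first_not_of(" \t");
    std::string Description =
        Start == std::string::npos ? std::string() : Rest.substr(Start);
    Result.push_back(ParamDoc{Name, Description});
  }
  return Result;
}
```

A result like this could then feed the documentation field of each parameter in signature help, which is the use case described above.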

@tom-anders

One more thing I haven't mentioned above: The index currently stores a symbol's documentation as a plain std::string, but I think we'll want to change this to the newly proposed SymbolDocumentation struct...?

How big of a deal would it be to change one of the datatypes in the index? Can clangd easily detect this and just rebuild the index or would we need some kind of conversion logic? @sam-mccall @HighCommander4 @kadircet

@HighCommander4

How big of a deal would it be to change one of the datatypes in the index? Can clangd easily detect this and just rebuild the index or would we need some kind of conversion logic?

Yep, there's a version number you can bump if you make a format change.

@tom-anders

tom-anders commented Jul 17, 2022

I played around a bit and here's what I came up with so far: https://reviews.llvm.org/D129972
Just a draft, but I'd be happy about some feedback on whether this is heading in the right direction :)

Currently looks like this:
[image: hover]

So far, I've implemented parsing for the most common doxygen commands and added it to Hover. CodeCompletion still uses the unparsed comment string for now.

Some stuff left to do:

  • Figure out where to put the UTF-8 conversion (See TODO in CodeCompletionStrings: 97)
  • Investigate why the argument for \throws is not correctly parsed
  • Use the new info also in CodeCompletion and SignatureHelp
  • Update the index so that we can store the whole SymbolDocumentation struct.
  • I added logic to convert doxygen's commands like \b and \p into markdown. Unfortunately, clangd seems to escape the *, so it's currently not rendered correctly on the client side.
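
The last item can be illustrated with a small standalone sketch. It assumes a word-based conversion (`inlineCommandsToMarkdown`, a hypothetical name) and a separate escaping pass (`escapeAsterisks`, also hypothetical); neither is taken from the patch or from clangd. Running the escaper after the conversion is what turns the emphasis markers back into literal asterisks.

```cpp
#include <sstream>
#include <string>

// Rewrites the word following an inline command: "\b word" becomes "**word**"
// and "\p word" becomes "`word`". Whitespace is normalized to single spaces.
std::string inlineCommandsToMarkdown(const std::string &Text) {
  std::istringstream Words(Text);
  std::string Out, Word;
  std::string Pending; // markdown delimiter to wrap the next word with
  while (Words >> Word) {
    if (Pending.empty() && (Word == "\\b" || Word == "\\p")) {
      Pending = (Word == "\\b") ? "**" : "`";
      continue;
    }
    if (!Out.empty())
      Out += ' ';
    Out += Pending + Word + Pending;
    Pending.clear();
  }
  return Out;
}

// Escapes '*' the way a plain-text-to-markdown escaper might. Applied after
// the conversion above, it defeats the bold markers that were just produced.
std::string escapeAsterisks(const std::string &Text) {
  std::string Out;
  for (char C : Text) {
    if (C == '*')
      Out += '\\';
    Out += C;
  }
  return Out;
}
```

In this sketch the fix would be to escape the raw comment text first and add markdown syntax afterwards, or to build structured output instead of a string. Note that trailing punctuation attached to the word gets wrapped too.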

@codeinred

This looks really good!

@endingly

I am waiting for your good news.

tom-anders added a commit to llvm/llvm-project that referenced this issue Aug 11, 2022
I noticed this when adding a new type to the index for
clangd/clangd#529. When the assertion failed,
this actually caused a crash, because llvm::expected would complain that
we did not take the error.
@tom-anders

Patch is now submitted for Review: https://reviews.llvm.org/D131853

@tom-anders

I played around a bit with using the parsed doxygen docs in other interesting ways, here's two things I've come up with:

  1. When hovering on a parameter, use the \param command from the function declaration to fill the documentation:
  2. When hovering on a variable that's passed to a function, add the documentation of the parameter that the variable is passed to:
    [image: paramdoc2]

Proof of Concept: https://reviews.llvm.org/D136038

@kadircet
Member

Copy/pasting the comment from review to here for having some high-level discussions.
Thanks a lot @tom-anders for taking a look at this (and sorry for such a delayed response).


Hi!

Sorry for letting this series of patches sit around without any comments. We were having some discussions internally to understand both the value proposition and its implications for the infrastructure.
So it'd help a lot if you could provide some more information about use cases. That way we can find a nice scope for this functionality, one that provides most of the value without introducing unnecessary complexity into the rest of the infrastructure. We should also try not to regress indexing of projects that don't have doxygen comments.

So first of all, what are the exact use cases you're planning to address/improve with support for doxygen parsing of comments? Couple that comes to mind:

  • obtaining docs about params & return value
  • stripping doxygen commands
  • treating brief/detail/warning/note differently
  • formatting text within comments (bold etc)
  • getting linebreaks/indent right clangd#1040

any other use cases that you believe are important?

as you might've noticed, this list already talks about dealing with certain doxygen commands (but not all).
that list is gathered by counting occurrences of those commands in a codebase with lots of open-source third_party code. findings and some initial ideas look like this:

  • \brief: ~70k occurrences
    • common but usefulness in practice is unclear
    • can infer for non-doxy too (e.g. first sentence of a regular documentation)
    • maybe just strip (or merge into regular documentation)?
  • \return[s]: 30k occurrences
    • unclear if worth separation in hover, because it might be tied to rest of the documentation (re-ordering concerns)
    • can infer for non-doxy maybe?
    • probably just strip the command and keep the rest as-is.
  • \param: 28k occurrences
    • useful for signature help. maybe hover on func calls
    • probably worth storing in a structured manner.
  • \detail[s]: 2k
  • \p: 20k
  • \code: 1k
  • \warning: 2k
  • \note: 9k
    • (for all of the above) just introduce as formatted text?
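
A survey like the one above can be approximated with a simple counter. This is a sketch under stated assumptions, not the tool that produced those numbers: `countCommands` is a hypothetical helper, it treats any backslash or '@' followed by letters as a command, and it therefore also counts false positives such as the domain part of an e-mail address.

```cpp
#include <cctype>
#include <map>
#include <string>

// Counts doxygen-style commands ("\name" or "@name") in a blob of comment text.
// A command is a backslash or '@' followed by a run of ASCII letters.
std::map<std::string, unsigned> countCommands(const std::string &Text) {
  std::map<std::string, unsigned> Counts;
  for (std::size_t I = 0; I < Text.size(); ++I) {
    if (Text[I] != '\\' && Text[I] != '@')
      continue;
    std::size_t End = I + 1;
    while (End < Text.size() &&
           std::isalpha(static_cast<unsigned char>(Text[End])))
      ++End;
    if (End > I + 1)
      ++Counts[Text.substr(I + 1, End - I - 1)];
    I = End - 1; // resume scanning after the command name
  }
  return Counts;
}
```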

what do you think about those conclusions? any other commands that seem worth giving a thought?
One important concern we've noticed around this part is, re-ordering comment components might actually hinder readability. as the comments are usually written with the assumption that they'll be read from top to bottom, but if we re-order them during presentation (e.g. hover has its own layout) we might start referring to concepts/entities in documentation before they're introduced. so we believe it's important to avoid any sort of re-ordering. this is one of the big down sides for preserving parameter comments in a structured way.

another important thing to consider is trying to heuristically infer some of these fields for non-doxygen code bases as well. that way we can provide a similar experience for both.

some other things to discuss about the design overall:

  • How to store the extra information?
    • Proposal from our side would be to introduce structured storage for the pieces we want (limited), and keep the rest as part of main documentation text while doing stripping/reformatting.
  • What to use as a parser?
    • Clang's doxygen parser actually looks like a great piece of code to re-use, it's unfortunate that it can issue diagnostics, requires AST etc. It'd be great to refactor that into a state where we can use it without any AST or diagnostics, and a minimal SourceManager (this seems feasible to achieve at first glance, as most of these inputs seem to be optional or used in 1 or 2 places).
    • we still need to make sure performance and behaviour on non-doxygen is reasonable though. do you have any numbers here?
  • How to store in the index?
    • If we can strip the parser off the dependencies on an astcontext, diagnostics etc. the best option would be to just store as raw text and run the whole pipeline on demand (e.g. do the doxygen parsing and markdown-ization afterwards). This is the simplest approach as it keeps index interfaces the same.

Happy to move the discussion to some other medium as well, if you would like to have them in discourse/github etc.

@founderio

re-ordering comment components might actually hinder readability

For most parts of the documentation, only formatting or stripping is probably enough, as there would not be a need to hide information.

For "specific contexts" (e.g. parameters): Why not both? Storing the info both in "original order" and "parsed" gives the best of both worlds:

  • Hover over e.g. the function name => You see the whole documentation, unabridged, in the "original order", with some formatting spice
  • Hover over specific parts e.g. parameter, context info while inside function parameters, etc => Extract the relevant parts from the documentation and only show that bit

@tom-anders

So first of all, what are the exact use cases you're planning to address/improve with support for doxygen parsing of comments? Couple that comes to mind:

* obtaining docs about params & return value

* stripping doxygen commands

* treating brief/detail/warning/note differently

* formatting text within comments (bold etc)

* getting linebreaks/indent right [clangd#1040](https://github.com/clangd/clangd/issues/1040)

any other use cases that you believe are important?

Sounds about right!

as you might've noticed, this list already talks about dealing with certain doxygen commands (but not all). that list is gathered by counting occurrences of those commands in a codebase with lots of open-source third_party code. findings and some initial ideas look like this:

* \brief: ~70k occurrences
  
  * common but usefulness in practice is unclear
  * can infer for non-doxy too (e.g. first sentence of a regular documentation)
  * maybe just strip (or merge into regular documentation)?

Inferring this from the first sentence sounds reasonable, it looks like this is what clang::comments::BriefParser already does (this is then used by Sema/CodeComplete).
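
For readers unfamiliar with the idea, a first-sentence heuristic can be as small as the sketch below. This is not BriefParser's algorithm; `inferBrief` is a hypothetical helper that stops at the first sentence-ending punctuation followed by whitespace, or at a blank line, whichever comes first.

```cpp
#include <string>

// Returns the text up to and including the first '.', '!' or '?' that is
// followed by a space, a newline, or the end of the text. A blank line also
// ends the brief. Falls back to the whole text if neither is found.
std::string inferBrief(const std::string &Comment) {
  for (std::size_t I = 0; I < Comment.size(); ++I) {
    char C = Comment[I];
    bool AtEnd = I + 1 == Comment.size();
    if ((C == '.' || C == '!' || C == '?') &&
        (AtEnd || Comment[I + 1] == ' ' || Comment[I + 1] == '\n'))
      return Comment.substr(0, I + 1);
    if (C == '\n' && !AtEnd && Comment[I + 1] == '\n')
      return Comment.substr(0, I);
  }
  return Comment;
}
```

Abbreviations such as "e.g." followed by a space would cut the brief short, which is one reason a conservative heuristic is preferable.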

* \return[s]: 30k occurrences
  
  * unclear if worth separation in hover, because it might be tied to rest of the documentation (re-ordering concerns)
  * can infer for non-doxy maybe?
  * probably just strip the command and keep the rest as-is.

One idea here might be to add this when hovering over a local variable that got assigned to the result of a function call, e.g.

/// \return my favorite foo
int makeFoo();

int main() {
   int foo = makeFoo();
  
   int bar = foo; // Hovering over "foo" here could maybe show the \return docs from makeFoo()
}

any other commands that seem worth giving a thought?

Maybe handling \throws could be useful for some codebases?

There's also #1320 which proposes to add support for \copydoc

One important concern we've noticed around this part is, re-ordering comment components might actually hinder readability. as
the comments are usually written with the assumption that they'll be read from top to bottom, but if we re-order them during
presentation (e.g. hover has its own layout) we might start referring to concepts/entities in documentation before they're
introduced. so we believe it's important to avoid any sort of re-ordering. this is one of the big down sides for preserving
parameter comments in a structured way.

I agree with you and @founderio here, it's probably best to not reorder anything (I wonder how the VSCode extension handles reordering though...? I don't have VSCode installed, but maybe someone else can check this)

another important thing to consider is trying to heuristically infer some of these fields for non-doxygen code bases as well. that way we can provide a similar experience for both.

So what are some heuristics we can use here (apart from "first sentence -> @brief")? I haven't really worked with anything other than doxygen yet, so what are some common ways people document e.g. parameters instead?

some other things to discuss about the design overall:

* How to store the extra information?
  
  * Proposal from our side would be to introduce structured storage for the pieces we want (limited), and keep the rest as part of main documentation text while doing stripping/reformatting.

👍

* What to use as a parser?
  
  * Clang's doxygen parser actually looks like a great piece of code to re-use, it's unfortunate that it can issue diagnostics, requires AST etc. It'd be great to refactor that into a state where we can use it without any AST or diagnostics, and a minimal SourceManager (this seems feasible to achieve at first glance, as most of these inputs seem to be optional or used in 1 or 2 places).

Hmm so you're probably talking about RawComment::parse here...? That seems to use a lot of AST stuff, for example Context.getAllocator() - What would we do here if we made the ASTContext an optional parameter? Pass in our custom allocator instead? Doesn't look to me as if we could get rid of the allocator completely here.

I'd be interested in doing this refactor, but I think I need a few more pointers before I can get started with this.

  * we still need to make sure performance and behaviour on non-doxygen is reasonable though. do you have any numbers here?

Tested this out with the neovim and LLVM codebases. With my proposed patch, index size increased from 7.7 MB to 7.9 MB for neovim and from 142 MB to 145 MB for LLVM. Indexing time (on my local machine, 12 threads) increased from 3.3s to 3.45s for neovim and from 13m16s to 13m19s for LLVM.
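
To put those measurements on one scale, the relative overhead can be computed as below. The helper names are hypothetical; the inputs are the figures quoted in the comment.

```cpp
// Relative increase of After over Before, in percent.
// Assumes Before is nonzero.
double overheadPercent(double Before, double After) {
  return (After - Before) / Before * 100.0;
}

// Converts an "XmYs" duration to seconds so that times can be compared.
double toSeconds(unsigned Minutes, unsigned Seconds) {
  return Minutes * 60.0 + Seconds;
}
```

With the numbers above this gives roughly 2.6% (neovim) and 2.1% (LLVM) for index size, and roughly 4.5% (neovim) and 0.4% (LLVM, 796s to 799s) for indexing time.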

* How to store in the index?
  
  * If we can strip the parser off the dependencies on an astcontext, diagnostics etc. the best option would be to just store as raw text and run the whole pipeline on demand (e.g. do the doxygen parsing and markdown-ization afterwards). This is the simplest approach as it keeps index interfaces the same.

Yeah that sounds like the ideal solution (if the refactor of the parsing logic succeeds)

@tom-anders

tom-anders commented Dec 30, 2022

  • What to use as a parser?

    • Clang's doxygen parser actually looks like a great piece of code to re-use, it's unfortunate that it can issue diagnostics, requires AST etc. It'd be great to refactor that into a state where we can use it without any AST or diagnostics, and a minimal SourceManager (this seems feasible to achieve at first glance, as most of these inputs seem to be optional or used in 1 or 2 places).

Hmm so you're probably talking about RawComment::parse here...? That seems to use a lot of AST stuff, for example Context.getAllocator() - What would we do here if we made the ASTContext an optional parameter? Pass in our custom allocator instead? Doesn't look to me as if we could get rid of the allocator completely here.

I'd be interested in doing this refactor, but I think I need a few more pointers before I can get started with this.

Looked into this a bit more, the best solution would probably be to add the allocator to our SymbolDocumentation class? (This is the class that does the doxygen parsing and stores the structured information).

The CommentParser test actually already has an example of how to implement a SourceManager that just refers to a comment string, so we can probably reuse that.

However, SourceManager also wants a reference to the diagnostics, so it's a bit of a chicken and egg problem.

@tom-anders

some other things to discuss about the design overall:

* How to store the extra information?
  
  * Proposal from our side would be to introduce structured storage for the pieces we want (limited), and keep the rest as part of main documentation text while doing stripping/reformatting.

One more thing I thought of: When parsing the doxygen comment, maybe for storing the stripped/reformatted text we can directly use markup::Document. That way we could also nicely support stuff like \code, \bold etc.

@kadircet
Member

One idea here might be to add this when hovering over a local variable that got assigned to the result of a function call, e.g.

That makes sense, but I think it's orthogonal to what we do with doxygen comments. it's more about transferring/inferring docs from initializer of a vardecl.

Maybe handling \throws could be useful for some codebases?

Just to be clear, I was still suggesting stripping "all" doxygen commands from the documentation. I am not sure what else we can do for throws, i guess we can have a special place in hover cards but it doesn't seem to be common enough.

There's also #1320 which proposes to add support for \copydoc

This looks quite cool, but I'd actually leave it out of the initial scope at least (or just say something like "Same as Foo") as going from a textual representation of a symbol name to its identity is hard and heuristic searches here are likely to be hindering in big code bases.

So what are some heuristics we can use here (apart from "first sentence -> @brief")? I haven't really worked with anything other than doxygen yet, so what are some common ways people document e.g. parameters instead?

Well this is an exercise we need to do for every command we want to treat specially, in theory for parameters (if we chose to store them separately), we can search for sentences mentioning the parameter name and synthesize using those (or only synthesize when there's a single such sentence).
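
The "only synthesize when there's a single such sentence" variant could look like the following sketch. It is an illustration with hypothetical helpers (`splitSentences`, `mentions`, `inferParamDoc`), assumes sentences end with a period, and matches the parameter name only as a whole identifier.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Splits on '.', keeping the period; good enough for a sketch.
std::vector<std::string> splitSentences(const std::string &Text) {
  std::vector<std::string> Sentences;
  std::string Current;
  for (char C : Text) {
    if (Current.empty() && std::isspace(static_cast<unsigned char>(C)))
      continue; // skip whitespace between sentences
    Current += C;
    if (C == '.') {
      Sentences.push_back(Current);
      Current.clear();
    }
  }
  if (!Current.empty())
    Sentences.push_back(Current);
  return Sentences;
}

// True if Name occurs in Sentence as a whole identifier.
bool mentions(const std::string &Sentence, const std::string &Name) {
  if (Name.empty())
    return false;
  auto IsIdent = [](char C) {
    return std::isalnum(static_cast<unsigned char>(C)) || C == '_';
  };
  for (std::size_t Pos = Sentence.find(Name); Pos != std::string::npos;
       Pos = Sentence.find(Name, Pos + 1)) {
    bool StartOk = Pos == 0 || !IsIdent(Sentence[Pos - 1]);
    std::size_t End = Pos + Name.size();
    bool EndOk = End == Sentence.size() || !IsIdent(Sentence[End]);
    if (StartOk && EndOk)
      return true;
  }
  return false;
}

// Returns the one sentence that mentions Name, or "" when zero or several do.
std::string inferParamDoc(const std::string &Comment, const std::string &Name) {
  std::string Match;
  unsigned Hits = 0;
  for (const std::string &S : splitSentences(Comment)) {
    if (mentions(S, Name)) {
      Match = S;
      ++Hits;
    }
  }
  return Hits == 1 ? Match : std::string();
}
```

Refusing to guess when several sentences mention the name keeps false positives down, at the cost of leaving many parameters undocumented.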

Hmm so you're probably talking about RawComment::parse here...? That seems to use a lot of AST stuff, for example Context.getAllocator() - What would we do here if we made the ASTContext an optional parameter? Pass in our custom allocator instead? Doesn't look to me as if we could get rid of the allocator completely here.

Right, in theory we can pass in any allocator we want, it doesn't need to come from ASTContext. Anything but a dependency on the AST itself (the tree) is something we can "synthesize" in theory, and it seems there's no strong dependency on the tree itself but just the support structures (AFAICT, all pieces that use the D are null-checked first. I am not sure about how much functionality we'll lose though).

I'd be interested in doing this refactor, but I think I need a few more pointers before I can get started with this.

Yeah, more than happy to help.

Tested this out with the neovim and LLVM codebases. With my proposed patch, index size increased from 7.7 MB to 7.9 MB for neovim and from 142 MB to 145 MB for LLVM. Indexing time (on my local machine, 12 threads) increased from 3.3s to 3.45s for neovim and from 13m16s to 13m19s for LLVM.

This looks promising, especially considering that this will probably get better since we're planning to preserve less structured information than the proposal and also perform more stripping.

However, SourceManager also wants a reference to the diagnostics, so it's a bit of a chicken and egg problem.

We've got a helper class called SourceManagerForFile, which would provide the mock SourceManager we need for parsing this comment.

One more thing I thought of: When parsing the doxygen comment, maybe for storing the stripped/reformatted text we can directly use markup::Document. That way we could also nicely support stuff like \code, \bold etc.

Right, I guess my initial explanation was a little vague, but I feel like that's what we should do: "just store as raw text and run the whole pipeline on demand (e.g. do the doxygen parsing and markdown-ization afterwards)". I think it would be best if doxygen parsing turns the documentation into a state where it's markdown-ified raw text, then we can simply use the existing markdown parsing logic we have to generate a markup::Document.

@tom-anders

One idea here might be to add this when hovering over a local variable that got assigned to the result of a function call, e.g.

That makes sense, but I think it's orthogonal to what we do with doxygen comments. it's more about transferring/inferring docs from initializer of a vardecl.

Yeah this just crossed my mind, but it's definitely out of scope for now.

Maybe handling \throws could be useful for some codebases?

Just to be clear, I was still suggesting stripping "all" doxygen commands from the documentation. I am not sure what else we can do for throws, i guess we can have a special place in hover cards but it doesn't seem to be common enough.

So for example for "\param foo docs for foo" you'd propose to replace it by something like "foo: docs for foo" ?

There's also #1320 which proposes to add support for \copydoc

This looks quite cool, but I'd actually leave it out of the initial scope at least (or just say something like "Same as Foo") as going from a textual representation of a symbol name to its identity is hard and heuristic searches here are likely to be hindering in big code bases.

Agreed!

So what are some heuristics we can use here (apart from "first sentence -> @brief")? I haven't really worked with anything other than doxygen yet, so what are some common ways people document e.g. parameters instead?

Well this is an exercise we need to do for every command we want to treat specially, in theory for parameters (if we chose to store them separately), we can search for sentences mentioning the parameter name and synthesize using those (or only synthesize when there's a single such sentence).

Okay, seems like a lot of potential for false positives, I think the heuristics should be pretty conservative here

However, SourceManager also wants a reference to the diagnostics, so it's a bit of a chicken and egg problem.

We've got a helper class called SourceManagerForFile, which would provide the mock SourceManager we need for parsing this comment.

That sounds useful, I think I'll take another look at this and will reach out again when I encounter problems.

One more thing I thought of: When parsing the doxygen comment, maybe for storing the stripped/reformatted text we can directly use markup::Document. That way we could also nicely support stuff like \code, \bold etc.

Right, I guess my initial explanation was a little vague, but I feel like that's what we should do. just store as raw text and run the whole pipeline on demand (e.g. do the doxygen parsing and markdown-ization afterwards).. I think it would be best if doxygen parsing turns the documentation into a state where it's markdown-ified raw text, then we can simply use existing markdown parsing logic we have to generate a markup::Document.

👍

@tom-anders

Right, I guess my initial explanation was a little vague, but I feel like that's what we should do. just store as raw text and run the whole pipeline on demand (e.g. do the doxygen parsing and markdown-ization afterwards).. I think it would be best if doxygen parsing turns the documentation into a state where it's markdown-ified raw text, then we can simply use existing markdown parsing logic we have to generate a markup::Document.

Sorry if I'm missing something, but where's this parsing logic? support/Markup.h only has logic for generating, not for parsing as far as I can tell. There's also llvm/DebugInfo/Symbolize/Markup.h, is that the file you mean?

@kadircet
Member

kadircet commented Feb 1, 2023

So for example for "\param foo docs for foo" you'd propose to replace it by something like "foo: docs for foo" ?

yeah, and for \throws foo maybe we can say "throws foo" etc.
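
The stripping rule discussed here can be sketched per line as follows. `stripCommand` is a hypothetical helper, not clangd code; it assumes the command is the first thing on the line and handles only the two rewrites mentioned above, dropping every other leading command.

```cpp
#include <sstream>
#include <string>

// Rewrites one comment line:
//   "\param foo docs" -> "foo: docs"
//   "\throws foo ..." -> "throws foo ..."
// Any other leading command (\brief, \return, \note, ...) is dropped and only
// its text is kept. Lines without a leading command are returned unchanged.
std::string stripCommand(const std::string &Line) {
  if (Line.empty() || (Line[0] != '\\' && Line[0] != '@'))
    return Line;
  std::istringstream Words(Line.substr(1));
  std::string Command, Rest;
  Words >> Command;
  std::getline(Words >> std::ws, Rest); // remainder without leading whitespace
  if (Command == "param") {
    std::size_t Space = Rest.find(' ');
    if (Space == std::string::npos)
      return Rest; // a name without any description
    return Rest.substr(0, Space) + ":" + Rest.substr(Space);
  }
  if (Command == "throws" || Command == "throw")
    return "throws " + Rest;
  return Rest;
}
```

Because each line is rewritten in place, the original order of the comment is preserved, which addresses the re-ordering concern raised earlier in the thread.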

Sorry if I'm missing something, but where's this parsing logic?

To be clear, it's not so advanced and could use some improvements: https://github.com/llvm/llvm-project/blob/main/clang-tools-extra/clangd/Hover.cpp#L1357

@tom-anders

Sorry if I'm missing something, but where's this parsing logic?

To be clear, it's not so advanced and could use some improvements: https://github.com/llvm/llvm-project/blob/main/clang-tools-extra/clangd/Hover.cpp#L1357

Ah, maybe an alternative would be to parse the doxygen commands directly into a markup::Document. For example, replace \p foo with addCodeBlock(foo).

We could then move our existing markdown-parsing logic to the same class that does the doxygen parsing and use it to convert non-doxygen comments to markup::Document as well. wdyt?

@sr-tream

I don't understand, why would that comment be assigned to S?

It's a bug in the comment parser in https://reviews.llvm.org/D143112

@aaronliu0130

I know it's a bug that you think exists, but I don't understand what bug you think exists

@HighCommander4

Are we allowed to ping clangd?

I'm not sure what "ping clangd" means.

If the review comments on the patch so far have been addressed and the patch is waiting for additional review, the patch author can ping the reviewer(s) periodically as a reminder.

@aaronliu0130

I mean ping the clangd organization.

@anstropleuton

@sam-mccall @kirillbobyrev @kadircet @hokein @Ceron257 @HO-COOH
Hey guys, (I am sorry that I pinged all of you), but let's not ignore this issue that was opened 3 years ago?
Also, have a really late Happy New Year!

@aaronliu0130

@tom-anders Now that the phab has been set to read-only, I think we should migrate the patches to a GitHub PR.

@tristan957

@anstropleuton that is pretty rude behavior. Why don't you work on this issue instead? These people don't work for free. People like you making requests of contributors and maintainers is the reason open-source software is so toxic.

tom-anders added a commit to tom-anders/llvm-project that referenced this issue Jan 17, 2024
This is in preparation for implementing doxygen parsing, see discussion in clangd/clangd#529.

Differential Revision: https://reviews.llvm.org/D143112
@anstropleuton

@tristan957 I am not good at programming and I can't reach the coding standards of this project. And making request or giving feedback doesn't make the community toxic. I am just a user of this project. Writing a hate on me doesn't change anything so please stop it.

@tristan957

@anstropleuton Then offer to pay someone to fix this issue or please stop asking people to do things for you without compensation.

@aaronliu0130

This is pretty silly… feature requests exist for a reason, but pinging random people is indeed not pretty good. Let’s just end this conversation here now, no use polluting the comments.

@anstropleuton

I just saw "I mean ping the clangd organization." by aaronliu0130 so I thought it would be fine... My apologies if it wasn't.
I am not old enough to have a bank account or something to earn with, so I don't have online money to spend on anything.
Also, this is the first time I have asked anything on GitHub.
With that, I conclude that the people who argue that community members asking for features are toxic are the ones who are actually toxic.
Can we stop arguing on silly stuff please?

@Stehsaer

Any updates? Can't wait to see clangd supporting doxygen comments.

@aaronliu0130

We're all waiting on llvm/llvm-project#78491

@Stehsaer

The hover display is created in the function clang::clangd::HoverInfo::present() const in file clang-tools-extra/clangd/HoverInfo.[h/cpp]. Modifying this and writing your own simple doxygen parser seems like a neat temporary solution. I'm already working on that and plan to use my modified clangd until it's officially supported. It's absolutely possible and easy to parse, since the content of the comment is already parsed and stored in a string.

@aaronliu0130

You can also try Tom Anders's patches from their Phabricator reviews.

@Stehsaer

Stehsaer commented Feb 12, 2024

@aaronliu0130 Any link to the merge request that has doxygen parsing? Tom Anders mentioned several different links and I can't figure out which one has doxygen added.

@sr-tream

sr-tream commented Feb 12, 2024

@Stehsaer Here are Tom Anders's patches updated for LLVM 18.x (they also compile with 19.x)

@Stehsaer

How do I apply the patch properly? I tried git apply xxx.patch, which worked, but compilation then failed with:

E:\git-proj\llvm-project-release-18.x\clang-tools-extra\clangd\SymbolDocumentation.cpp: In lambda function:
[build] E:\git-proj\llvm-project-release-18.x\clang-tools-extra\clangd\SymbolDocumentation.cpp:62:44: error: 'clang::comments::InlineCommandComment::RenderKind' has not been declared
[build]    62 |       case comments::InlineCommandComment::RenderKind::RenderMonospaced:
[build]       |                                            ^~~~~~~~~~
[build] E:\git-proj\llvm-project-release-18.x\clang-tools-extra\clangd\SymbolDocumentation.cpp:64:44: error: 'clang::comments::InlineCommandComment::RenderKind' has not been declared
[build]    64 |       case comments::InlineCommandComment::RenderKind::RenderBold:
[build]       |                                            ^~~~~~~~~~
[build] E:\git-proj\llvm-project-release-18.x\clang-tools-extra\clangd\SymbolDocumentation.cpp:66:44: error: 'clang::comments::InlineCommandComment::RenderKind' has not been declared
[build]    66 |       case comments::InlineCommandComment::RenderKind::RenderEmphasized:
[build]       |                                            ^~~~~~~~~~

The enum comments::InlineCommandComment::RenderKind seems to be missing. I'm using release-18.x as the base. The full build log is attached here:
log.txt
The branch I'm using: release/18.x

My exact steps:

  1. Download the source as zip in release/18.x and unzip into a folder
  2. Download the patch hover-doxygen.patch from the link and put it somewhere
  3. I ran git apply ../hover-doxygen.patch, which gave these warnings:
../hover-doxygen.patch:990: trailing whitespace.
    /*!
../hover-doxygen.patch:1001: trailing whitespace.
    /**
../hover-doxygen.patch:1013: trailing whitespace.
    /*!
warning: 3 lines add whitespace errors.
  4. I configured the CMake project with these options:
    image
    and instructed CMake to build clangd only, in release mode
  5. The errors shown above were reported

@sr-tream

Download the patch hover-doxygen.patch from the link and put it somewhere

@Stehsaer That patch is for older LLVM versions. Use hover-doxygen-trunk.patch instead. The link to it is in my previous post.

@Stehsaer

Thanks, problem solved. It's working as intended, and I've modified it a little to match my own preferences.

Some features are still missing:

  1. @return and @returns are identical
  2. @details is not parsed
  3. Doxygen comments are not parsed in autocomplete:
    image

How can I add parsing to autocomplete? I need some pointers on where to modify the code.

@sr-tream

You can use this patch to render more fields and fix the detailed description.

No doxygen document parsing in auto complete,

I don't know how to enable the HI window in autocomplete, so I can't explore this problem.

@Stehsaer

Is there any schedule or plan to merge the patches into main?

@HighCommander4

Not sure how relevant this is to us, but there are some in-flight patches to improve the doxygen parsing code that's upstream in the clang frontend (clang/AST/CommentParser.h): https://discourse.llvm.org/t/rfc-improving-clangs-comment-parsing-to-conform-better-to-doxygen-semantics/78785
