Skip to content

[clang][doc]: Merge entries with duplicate content. #134089

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 7, 2025

Conversation

YLChenZ
Copy link
Contributor

@YLChenZ YLChenZ commented Apr 2, 2025

Closes #133706.

before the patch:

docb4
after the patch:

doc-after

@llvmbot llvmbot added the clang Clang issues not falling into any other category label Apr 2, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 2, 2025

@llvm/pr-subscribers-clang

Author: None (YLChenZ)

Changes

Closes #133706.

before the patch:

docb4
after the patch:

doc-after


Full diff: https://github.com/llvm/llvm-project/pull/134089.diff

1 Files Affected:

  • (modified) clang/utils/TableGen/ClangAttrEmitter.cpp (+50-18)
diff --git a/clang/utils/TableGen/ClangAttrEmitter.cpp b/clang/utils/TableGen/ClangAttrEmitter.cpp
index 06f6b073240ba..6a2cdcd4ebc8e 100644
--- a/clang/utils/TableGen/ClangAttrEmitter.cpp
+++ b/clang/utils/TableGen/ClangAttrEmitter.cpp
@@ -5144,6 +5144,14 @@ class SpellingList {
 
     Spellings[(size_t)Kind].push_back(Name);
   }
+
+  void merge(const SpellingList &Other) {
+    for (size_t Kind = 0; Kind < NumSpellingKinds; ++Kind) {
+      Spellings[Kind].insert(Spellings[Kind].end(),
+                             Other.Spellings[Kind].begin(),
+                             Other.Spellings[Kind].end());
+    }
+  }
 };
 
 class DocumentationData {
@@ -5301,44 +5309,68 @@ void EmitClangAttrDocs(const RecordKeeper &Records, raw_ostream &OS) {
       return L->getValueAsString("Name") < R->getValueAsString("Name");
     }
   };
-  std::map<const Record *, std::vector<DocumentationData>, CategoryLess>
+
+  std::map<const Record *, std::map<std::string, DocumentationData>,
+           CategoryLess>
       SplitDocs;
+
+  // Collect documentation data, grouping by category and heading.
   for (const auto *A : Records.getAllDerivedDefinitions("Attr")) {
     const Record &Attr = *A;
     std::vector<const Record *> Docs =
         Attr.getValueAsListOfDefs("Documentation");
+
     for (const auto *D : Docs) {
       const Record &Doc = *D;
       const Record *Category = Doc.getValueAsDef("Category");
       // If the category is "InternalOnly", then there cannot be any other
       // documentation categories (otherwise, the attribute would be
       // emitted into the docs).
-      const StringRef Cat = Category->getValueAsString("Name");
-      bool InternalOnly = Cat == "InternalOnly";
-      if (InternalOnly && Docs.size() > 1)
+      StringRef Cat = Category->getValueAsString("Name");
+      if (Cat == "InternalOnly" && Docs.size() > 1)
         PrintFatalError(Doc.getLoc(),
                         "Attribute is \"InternalOnly\", but has multiple "
                         "documentation categories");
 
-      if (!InternalOnly)
-        SplitDocs[Category].push_back(DocumentationData(
-            Doc, Attr, GetAttributeHeadingAndSpellings(Doc, Attr, Cat)));
+      if (Cat == "InternalOnly")
+        continue;
+
+      // Generate Heading and Spellings.
+      auto HeadingAndSpellings =
+          GetAttributeHeadingAndSpellings(Doc, Attr, Cat);
+
+      auto &CategoryDocs = SplitDocs[Category];
+
+      // If the heading already exists, merge the documentation.
+      auto It = CategoryDocs.find(HeadingAndSpellings.first);
+      if (It != CategoryDocs.end()) {
+        // Merge spellings
+        It->second.SupportedSpellings.merge(HeadingAndSpellings.second);
+        // Merge content
+        It->second.Documentation = &Doc; // Update reference
+      } else {
+        // Otherwise, add a new entry.
+        CategoryDocs.emplace(HeadingAndSpellings.first,
+                             DocumentationData(Doc, Attr, HeadingAndSpellings));
+      }
     }
   }
 
-  // Having split the attributes out based on what documentation goes where,
-  // we can begin to generate sections of documentation.
-  for (auto &I : SplitDocs) {
-    WriteCategoryHeader(I.first, OS);
+  // Write out documentation, merging attributes with the same heading.
+  for (auto &CategoryPair : SplitDocs) {
+    WriteCategoryHeader(CategoryPair.first, OS);
+
+    std::vector<DocumentationData> MergedDocs;
+    for (auto &DocPair : CategoryPair.second)
+      MergedDocs.push_back(std::move(DocPair.second));
 
-    sort(I.second,
-         [](const DocumentationData &D1, const DocumentationData &D2) {
-           return D1.Heading < D2.Heading;
-         });
+    std::sort(MergedDocs.begin(), MergedDocs.end(),
+              [](const DocumentationData &D1, const DocumentationData &D2) {
+                return D1.Heading < D2.Heading;
+              });
 
-    // Walk over each of the attributes in the category and write out their
-    // documentation.
-    for (const auto &Doc : I.second)
+    // Output each documentation entry.
+    for (const auto &Doc : MergedDocs)
       WriteDocumentation(Records, Doc, OS);
   }
 }

@AaronBallman AaronBallman changed the title [llvm][doc]: Merge the contents of identical entries. [clang][doc]: Merge the contents of identical entries. Apr 3, 2025
Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this! I verified that it does address the issue with randomize_layout and no_randomize_layout being duplicated. But it doesn't seem to handle other cases like cf_returns_not_retained, cf_returns_retained, and cf_consumed, or coro_return_type and coro_wrapper. So I think it's close to correct, but there still seems to be an issue where some aren't being merged.

Btw, you can test this locally. How I do it is:

F:\source\llvm-project>llvm\out\build\x64-Debug\bin\clang-tblgen.exe -o clang\docs\AttributeReference.rst -gen-attr-docs clang\include\clang\Basic\Attr.td -I clang\include\

F:\source\llvm-project\clang\docs>cd clang\docs

F:\source\llvm-project\clang\docs>make html

Then I open at the output in file:///F:/source/llvm-project/clang/docs/_build/html/AttributeReference.html in my web browser and look around at the generated HTML. (This does require your python to have sphinx installed, which you can get through pip.)

@YLChenZ
Copy link
Contributor Author

YLChenZ commented Apr 3, 2025

@AaronBallman thanks for the suggestion. I see the problem you pointed out, my current implementation is merging content based on attributes with the same Heading, but not for different attributes with the same Content.

@YLChenZ
Copy link
Contributor Author

YLChenZ commented Apr 4, 2025

Hi @AaronBallman, bothering you again. I have a doubt about the merge of the following case:
屏幕截图 2025-04-04 081652

There is a lot of content that is No documentation, and to merge these entries would be to create a very large entry, like this:
屏幕截图 2025-04-04 082342

It doesn't look pretty, so if this doesn't have any impact, I'll follow this proposal.

@AaronBallman
Copy link
Collaborator

AaronBallman commented Apr 4, 2025

Hi @AaronBallman, bothering you again. I have a doubt about the merge of the following case: 屏幕截图 2025-04-04 081652

There is a lot of content that is No documentation, and to merge these entries would be to create a very large entry, like this: 屏幕截图 2025-04-04 082342

It doesn't look pretty, so if this doesn't have any impact, I'll follow this proposal.

Oh yeah, that is not a good outcome. :-) I think Undocumented is a special case. For this patch, I would not merge if the category is Undocumented. In a completely separate patch, it might make sense to combine the Undocumented attributes into their own list that's a single table of the supported syntaxes, with one row per undocumented attribute. Then we're not repeating as much content such as No documentation. or the headings + the spellings. (Don't feel obligated to work on that second patch, either; I'm not trying to sign you up for more work.)

@YLChenZ
Copy link
Contributor Author

YLChenZ commented Apr 4, 2025

Oh yeah, that is not a good outcome. :-) I think Undocumented is a special case. For this patch, I would not merge if the category is Undocumented.

@AaronBallman Oh, I forgot the other merged entry that is slightly smaller compared to Undocumented like this:

屏幕截图 2025-04-04 200659

屏幕截图 2025-04-04 195903

Similarly, it doesn't look very pretty.

@AaronBallman
Copy link
Collaborator

Oh yeah, that is not a good outcome. :-) I think Undocumented is a special case. For this patch, I would not merge if the category is Undocumented.

@AaronBallman Oh, I forgot the other merged entry that is slightly smaller compared to Undocumented like this:

屏幕截图 2025-04-04 200659

屏幕截图 2025-04-04 195903

Similarly, it doesn't look very pretty.

Oh yeah, that is not a good outcome. :-) I think Undocumented is a special case. For this patch, I would not merge if the category is Undocumented.

@AaronBallman Oh, I forgot the other merged entry that is slightly smaller compared to Undocumented like this:

屏幕截图 2025-04-04 200659

屏幕截图 2025-04-04 195903

Similarly, it doesn't look very pretty.

CC @erichkeane for opinions, but I think this is... ugly-but-reasonable. Ideally, I'd like to see those split off (cf in one group, ns in another, etc) with different documentation given that these are presumably doing different things. But as it stands, I think this is better than repeating the same information 15 times.

@YLChenZ
Copy link
Contributor Author

YLChenZ commented Apr 4, 2025

But as it stands, I think this is better than repeating the same information 15 times.

Couldn't agree more.

@erichkeane
Copy link
Collaborator

Oh yeah, that is not a good outcome. :-) I think Undocumented is a special case. For this patch, I would not merge if the category is Undocumented.

@AaronBallman Oh, I forgot the other merged entry that is slightly smaller compared to Undocumented like this:
屏幕截图 2025-04-04 200659
屏幕截图 2025-04-04 195903
Similarly, it doesn't look very pretty.

Oh yeah, that is not a good outcome. :-) I think Undocumented is a special case. For this patch, I would not merge if the category is Undocumented.

@AaronBallman Oh, I forgot the other merged entry that is slightly smaller compared to Undocumented like this:
屏幕截图 2025-04-04 200659
屏幕截图 2025-04-04 195903
Similarly, it doesn't look very pretty.

CC @erichkeane for opinions, but I think this is... ugly-but-reasonable. Ideally, I'd like to see those split off (cf in one group, ns in another, etc) with different documentation given that these are presumably doing different things. But as it stands, I think this is better than repeating the same information 15 times.

Ooof, that last one is frustrating. That just SCREAMS to me people re-using 'generic' documentation for a ton of things. I'm pretty grumpy about THAT.

However, that is a vast improvement.

As far as Undocumented, I like the idea of merging them into 1 table like the rest, but I would actually suggest that we just remove the 'header' that lists all the names (or change it to 'undocumented attributes').

WDYT? (could be done in a followup).

@YLChenZ YLChenZ changed the title [clang][doc]: Merge the contents of identical entries. [clang][doc]: Merge entries with duplicate content. Apr 7, 2025
@YLChenZ YLChenZ requested a review from AaronBallman April 7, 2025 10:46
@AaronBallman
Copy link
Collaborator

I thought we decided to merge everything but the undocumented attributes? I'm still seeing cf_returns_retained that's distinct from cf_returns_not_retained, etc.

@YLChenZ
Copy link
Contributor Author

YLChenZ commented Apr 7, 2025

I'm still seeing cf_returns_retained that's distinct from cf_returns_not_retained, etc.

@AaronBallman Really? But here's what I'm getting when I run it:

屏幕截图 2025-04-07 195538

@AaronBallman
Copy link
Collaborator

This is what I'm getting (I applied the patch locally, re-ran clang-tblgen, then re-ran make html to make the docs):
Screenshot 2025-04-07 083145

@YLChenZ
Copy link
Contributor Author

YLChenZ commented Apr 7, 2025

@AaronBallman I'm sorry to bother you again. The first patch I just merged the same Heading, and then according to your suggestion, the second patch merged the same Content entries. Then the third patch: did not include Undocumented. I used the build steps you gave to get this (I worked on Ubuntu):

build steps:

ylchen@lambda:~/llvm-project/build$ cd bin
ylchen@lambda:~/llvm-project/build/bin$ ./clang-tblgen -o ../../clang/docs/AttributeReference.rst   -gen-attr-docs ../../clang/include/clang/Basic/Attr.td   -I ../../clang/include/
ylchen@lambda:~/llvm-project/build/bin$ cd ../../clang/docs
ylchen@lambda:~/llvm-project/clang/docs$ sphinx-build -b html . _build/html
Running Sphinx v8.1.3
loading translations [en]... done
loading pickled environment... done
myst v4.0.1: MdParserConfig(commonmark_only=False, gfm_only=False, enable_extensions=set(), disable_syntax=[], all_links_external=False, links_external_new_tab=False, url_schemes=('http', 'https', 'mailto', 'ftp'), ref_domains=None, fence_as_directive=set(), number_code_blocks=[], title_to_header=False, heading_anchors=0, heading_slug_func=None, html_meta={}, footnote_sort=True, footnote_transition=True, words_per_minute=200, substitutions={}, linkify_fuzzy_links=True, dmath_allow_labels=True, dmath_allow_space=True, dmath_allow_digits=True, dmath_double_inline=False, update_mathjax=True, mathjax_classes='tex2jax_process|mathjax_process|math|output_area', enable_checkboxes=False, suppress_warnings=[], highlight_code_blocks=True)
building [mo]: targets for 0 po files that are out of date
writing output... 
building [html]: targets for 1 source files that are out of date
updating environment: 0 added, 2 changed, 0 removed
reading sources... [100%] index
/home/ylchen/llvm-project/clang/docs/index.rst:16: WARNING: toctree contains reference to nonexisting document 'ClangCommandLineReference' [toc.not_readable]
/home/ylchen/llvm-project/clang/docs/index.rst:16: WARNING: toctree contains reference to nonexisting document 'DiagnosticsReference' [toc.not_readable]
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
copying assets... 
copying static files... 
Writing evaluated template result to /home/ylchen/llvm-project/clang/docs/_build/html/_static/language_data.js
Writing evaluated template result to /home/ylchen/llvm-project/clang/docs/_build/html/_static/basic.css
Writing evaluated template result to /home/ylchen/llvm-project/clang/docs/_build/html/_static/documentation_options.js
Writing evaluated template result to /home/ylchen/llvm-project/clang/docs/_build/html/_static/haiku.css
copying static files: done
copying extra files... 
copying extra files: done
copying assets: done
writing output... [100%] index
generating indices... genindex done
writing additional pages... search done
dumping search index in English (code: en)... done
dumping object inventory... done
build succeeded, 2 warnings.

The HTML pages are in _build/html.

outcome:

屏幕截图 2025-04-07 205630

And I used another way to get the documentation, running directly in the ~/llvm-project/build directory :

ylchen@lambda:~/llvm-project$ cd build
ylchen@lambda:~/llvm-project/build$ ninja docs-clang-html

I found the target file in the ~/llvm-project/build/tools/clang/docs/html directory, like this:

image

@AaronBallman
Copy link
Collaborator

Ugh, the issue was on my end. I downloaded the changes, rebuilt the documentation... and skipped the step where I rebuild clang-tblgen. When I do things properly, it works (shocking, I know).

Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! What do you think @erichkeane?

@erichkeane
Copy link
Collaborator

Yeah, LGTM. Thank you!

@AaronBallman AaronBallman merged commit 529e912 into llvm:main Apr 7, 2025
11 checks passed
@YLChenZ
Copy link
Contributor Author

YLChenZ commented Apr 7, 2025

Ugh, the issue was on my end. I downloaded the changes, rebuilt the documentation... and skipped the step where I rebuild clang-tblgen. When I do things properly, it works (shocking, I know).

Ah, the classic 'forgot to rebuild the tools' trap — we’ve all been there! Glad it’s working now. @AaronBallman @erichkeane thanks for the guidance!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Duplicate documentation entry for {no_,}randomize_layout
4 participants