Skip to content

Conversation

@frobtech
Copy link
Contributor

This adds a few new features to hdrgen, all meant to facilitate
using it with inputs and outputs that are outside the llvm-libc
source tree.

The new extra_standards field is a dictionary to augment the
set of names that can be used in standards lists. The keys are
the identifiers used in YAML ("stdc") and the values are the
pretty names generated in the header comments ("Standard C").
This lets a libc project that's leveraging the llvm-libc sources
along with its own code define new APIs outside the formal and de
facto standards that llvm-libc draws its supported APIs from.

The new license_text field is a list of lines of license text
that replaces the standard LLVM license text used at the top of
each generated header. This lets other projects use hdrgen with
their own inputs to produce generated headers that are not tied
to the LLVM project.

Finally, for any function attributes that are not in a canonical
list known to be provided by __llvm-libc-common.h, an include
will be generated for "llvm-libc-macros/{attribute name}.h",
expecting that file to define the "attribute" name as a macro.

All this can be used immediately by builds that drive hdrgen and
build libc code outside the LLVM CMake build. Future changes
could add CMake plumbing to facilitate augmenting the LLVM CMake
build of libc with outside sources via overlays and cache files.

This adds a few new features to hdrgen, all meant to facilitate
using it with inputs and outputs that are outside the llvm-libc
source tree.

The new `extra_standards` field is a dictionary to augment the
set of names that can be used in `standards` lists.  The keys are
the identifiers used in YAML ("stdc") and the values are the
pretty names generated in the header comments ("Standard C").
This lets a libc project that's leveraging the llvm-libc sources
along with its own code define new APIs outside the formal and de
facto standards that llvm-libc draws its supported APIs from.

The new `license_text` field is a list of lines of license text
that replaces the standard LLVM license text used at the top of
each generated header.  This lets other projects use hdrgen with
their own inputs to produce generated headers that are not tied
to the LLVM project.

Finally, for any function attributes that are not in a canonical
list known to be provided by __llvm-libc-common.h, an include
will be generated for "llvm-libc-macros/{attribute name}.h",
expecting that file to define the "attribute" name as a macro.

All this can be used immediately by builds that drive hdrgen and
build libc code outside the LLVM CMake build.  Future changes
could add CMake plumbing to facilitate augmenting the LLVM CMake
build of libc with outside sources via overlays and cache files.
@frobtech frobtech marked this pull request as ready for review October 28, 2025 18:52
@llvmbot llvmbot added the libc label Oct 28, 2025
@llvmbot
Copy link
Member

llvmbot commented Oct 28, 2025

@llvm/pr-subscribers-libc

Author: Roland McGrath (frobtech)

Changes

This adds a few new features to hdrgen, all meant to facilitate
using it with inputs and outputs that are outside the llvm-libc
source tree.

The new extra_standards field is a dictionary to augment the
set of names that can be used in standards lists. The keys are
the identifiers used in YAML ("stdc") and the values are the
pretty names generated in the header comments ("Standard C").
This lets a libc project that's leveraging the llvm-libc sources
along with its own code define new APIs outside the formal and de
facto standards that llvm-libc draws its supported APIs from.

The new license_text field is a list of lines of license text
that replaces the standard LLVM license text used at the top of
each generated header. This lets other projects use hdrgen with
their own inputs to produce generated headers that are not tied
to the LLVM project.

Finally, for any function attributes that are not in a canonical
list known to be provided by __llvm-libc-common.h, an include
will be generated for "llvm-libc-macros/{attribute name}.h",
expecting that file to define the "attribute" name as a macro.

All this can be used immediately by builds that drive hdrgen and
build libc code outside the LLVM CMake build. Future changes
could add CMake plumbing to facilitate augmenting the LLVM CMake
build of libc with outside sources via overlays and cache files.


Full diff: https://github.com/llvm/llvm-project/pull/165459.diff

8 Files Affected:

  • (modified) libc/utils/hdrgen/hdrgen/header.py (+55-16)
  • (modified) libc/utils/hdrgen/hdrgen/yaml_to_classes.py (+2)
  • (added) libc/utils/hdrgen/tests/expected_output/custom.h (+21)
  • (modified) libc/utils/hdrgen/tests/expected_output/test_header.h (+1)
  • (modified) libc/utils/hdrgen/tests/expected_output/test_small.json (+1)
  • (added) libc/utils/hdrgen/tests/input/custom-common.yaml (+6)
  • (added) libc/utils/hdrgen/tests/input/custom.yaml (+13)
  • (modified) libc/utils/hdrgen/tests/test_integration.py (+7)
diff --git a/libc/utils/hdrgen/hdrgen/header.py b/libc/utils/hdrgen/hdrgen/header.py
index 2118db6e5fb75..ba174b22d6d6d 100644
--- a/libc/utils/hdrgen/hdrgen/header.py
+++ b/libc/utils/hdrgen/hdrgen/header.py
@@ -35,6 +35,13 @@
 
 COMMON_HEADER = PurePosixPath("__llvm-libc-common.h")
 
+# These "attributes" are known macros defined in COMMON_HEADER.
+# Others are found in "llvm-libc-macros/{name}.h".
+COMMON_ATTRIBUTES = {
+    "_Noreturn",
+    "_Returns_twice",
+}
+
 # All the canonical identifiers are in lowercase for easy maintenance.
 # This maps them to the pretty descriptions to generate in header comments.
 LIBRARY_DESCRIPTIONS = {
@@ -50,9 +57,7 @@
 HEADER_TEMPLATE = """\
 //===-- {library} header <{header}> --===//
 //
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+{license_lines}
 //
 //===---------------------------------------------------------------------===//
 
@@ -64,6 +69,12 @@
 #endif // {guard}
 """
 
+LLVM_LICENSE_TEXT = [
+    "Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.",
+    "See https://llvm.org/LICENSE.txt for license information.",
+    "SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception",
+]
+
 
 class HeaderFile:
     def __init__(self, name):
@@ -74,8 +85,10 @@ def __init__(self, name):
         self.enumerations = []
         self.objects = []
         self.functions = []
+        self.extra_standards = {}
         self.standards = []
         self.merge_yaml_files = []
+        self.license_text = []
 
     def add_macro(self, macro):
         self.macros.append(macro)
@@ -98,6 +111,11 @@ def merge(self, other):
         self.enumerations = sorted(set(self.enumerations) | set(other.enumerations))
         self.objects = sorted(set(self.objects) | set(other.objects))
         self.functions = sorted(set(self.functions) | set(other.functions))
+        self.extra_standards |= other.extra_standards
+        if self.license_text:
+            assert not other.license_text, "only one `license_text` allowed"
+        else:
+            self.license_text = other.license_text
 
     def all_types(self):
         return reduce(
@@ -106,6 +124,13 @@ def all_types(self):
             set(self.types),
         )
 
+    def all_attributes(self):
+        return reduce(
+            lambda a, b: a | b,
+            [set(f.attributes) for f in self.functions],
+            set(),
+        )
+
     def all_standards(self):
         # FIXME: Only functions have the "standard" field, but all the entity
         # types should have one too.
@@ -114,16 +139,24 @@ def all_standards(self):
         )
 
     def includes(self):
-        return {
-            PurePosixPath("llvm-libc-macros") / macro.header
-            for macro in self.macros
-            if macro.header is not None
-        } | {
-            COMPILER_HEADER_TYPES.get(
-                typ.type_name, PurePosixPath("llvm-libc-types") / f"{typ.type_name}.h"
-            )
-            for typ in self.all_types()
-        }
+        return (
+            {
+                PurePosixPath("llvm-libc-macros") / macro.header
+                for macro in self.macros
+                if macro.header is not None
+            }
+            | {
+                COMPILER_HEADER_TYPES.get(
+                    typ.type_name,
+                    PurePosixPath("llvm-libc-types") / f"{typ.type_name}.h",
+                )
+                for typ in self.all_types()
+            }
+            | {
+                PurePosixPath("llvm-libc-macros") / f"{attr}.h"
+                for attr in self.all_attributes() - COMMON_ATTRIBUTES
+            }
+        )
 
     def header_guard(self):
         return "_LLVM_LIBC_" + "_".join(
@@ -131,24 +164,29 @@ def header_guard(self):
         )
 
     def library_description(self):
+        descriptions = LIBRARY_DESCRIPTIONS | self.extra_standards
         # If the header itself is in standard C, just call it that.
         if "stdc" in self.standards:
-            return LIBRARY_DESCRIPTIONS["stdc"]
+            return descriptions["stdc"]
         # If the header itself is in POSIX, just call it that.
         if "posix" in self.standards:
-            return LIBRARY_DESCRIPTIONS["posix"]
+            return descriptions["posix"]
         # Otherwise, consider the standards for each symbol as well.
         standards = self.all_standards()
         # Otherwise, it's described by all those that apply, but ignoring
         # "stdc" and "posix" since this is not a "stdc" or "posix" header.
         return " / ".join(
             sorted(
-                LIBRARY_DESCRIPTIONS[standard]
+                descriptions[standard]
                 for standard in standards
                 if standard not in {"stdc", "posix"}
             )
         )
 
+    def license_lines(self):
+        lines = self.license_text or LLVM_LICENSE_TEXT
+        return "\n".join([f"// {line}" for line in lines])
+
     def template(self, dir, files_read):
         if self.template_file is not None:
             # There's a custom template file, so just read it in and record
@@ -162,6 +200,7 @@ def template(self, dir, files_read):
             library=self.library_description(),
             header=self.name,
             guard=self.header_guard(),
+            license_lines=self.license_lines(),
         )
 
     def public_api(self):
diff --git a/libc/utils/hdrgen/hdrgen/yaml_to_classes.py b/libc/utils/hdrgen/hdrgen/yaml_to_classes.py
index ebe7781d449f7..9eddbe615cbba 100644
--- a/libc/utils/hdrgen/hdrgen/yaml_to_classes.py
+++ b/libc/utils/hdrgen/hdrgen/yaml_to_classes.py
@@ -37,6 +37,8 @@ def yaml_to_classes(yaml_data, header_class, entry_points=None):
     header = header_class(header_name)
     header.template_file = yaml_data.get("header_template")
     header.standards = yaml_data.get("standards", [])
+    header.extra_standards = yaml_data.get("extra_standards", {})
+    header.license_text = yaml_data.get("license_text", [])
     header.merge_yaml_files = yaml_data.get("merge_yaml_files", [])
 
     for macro_data in yaml_data.get("macros", []):
diff --git a/libc/utils/hdrgen/tests/expected_output/custom.h b/libc/utils/hdrgen/tests/expected_output/custom.h
new file mode 100644
index 0000000000000..5f9ed231490fd
--- /dev/null
+++ b/libc/utils/hdrgen/tests/expected_output/custom.h
@@ -0,0 +1,21 @@
+//===-- Wile E. Coyote header <custom.h> --===//
+//
+// Caveat emptor.
+// I never studied law.
+//
+//===---------------------------------------------------------------------===//
+
+#ifndef _LLVM_LIBC_CUSTOM_H
+#define _LLVM_LIBC_CUSTOM_H
+
+#include "__llvm-libc-common.h"
+#include "llvm-libc-types/meep.h"
+#include "llvm-libc-types/road.h"
+
+__BEGIN_C_DECLS
+
+road runner(meep, meep) __NOEXCEPT;
+
+__END_C_DECLS
+
+#endif // _LLVM_LIBC_CUSTOM_H
diff --git a/libc/utils/hdrgen/tests/expected_output/test_header.h b/libc/utils/hdrgen/tests/expected_output/test_header.h
index 748c09808c128..49112a353f7b6 100644
--- a/libc/utils/hdrgen/tests/expected_output/test_header.h
+++ b/libc/utils/hdrgen/tests/expected_output/test_header.h
@@ -12,6 +12,7 @@
 #include "__llvm-libc-common.h"
 #include "llvm-libc-macros/float16-macros.h"
 
+#include "llvm-libc-macros/CONST_FUNC_A.h"
 #include "llvm-libc-macros/test_more-macros.h"
 #include "llvm-libc-macros/test_small-macros.h"
 #include "llvm-libc-types/float128.h"
diff --git a/libc/utils/hdrgen/tests/expected_output/test_small.json b/libc/utils/hdrgen/tests/expected_output/test_small.json
index 9cc73d013a679..8502df23b9a41 100644
--- a/libc/utils/hdrgen/tests/expected_output/test_small.json
+++ b/libc/utils/hdrgen/tests/expected_output/test_small.json
@@ -4,6 +4,7 @@
     "standards": [],
     "includes": [
       "__llvm-libc-common.h",
+      "llvm-libc-macros/CONST_FUNC_A.h",
       "llvm-libc-macros/test_more-macros.h",
       "llvm-libc-macros/test_small-macros.h",
       "llvm-libc-types/float128.h",
diff --git a/libc/utils/hdrgen/tests/input/custom-common.yaml b/libc/utils/hdrgen/tests/input/custom-common.yaml
new file mode 100644
index 0000000000000..909a3ba5163a5
--- /dev/null
+++ b/libc/utils/hdrgen/tests/input/custom-common.yaml
@@ -0,0 +1,6 @@
+license_text:
+  - Caveat emptor.
+  - I never studied law.
+
+extra_standards:
+  acme: Wile E. Coyote
diff --git a/libc/utils/hdrgen/tests/input/custom.yaml b/libc/utils/hdrgen/tests/input/custom.yaml
new file mode 100644
index 0000000000000..7d3ff8ec421dd
--- /dev/null
+++ b/libc/utils/hdrgen/tests/input/custom.yaml
@@ -0,0 +1,13 @@
+merge_yaml_files:
+  - custom-common.yaml
+
+header: custom.h
+standards:
+  - acme
+
+functions:
+  - name: runner
+    return_type: road
+    arguments:
+      - type: meep
+      - type: meep
diff --git a/libc/utils/hdrgen/tests/test_integration.py b/libc/utils/hdrgen/tests/test_integration.py
index bf393d26a8101..c6e76d826a3a4 100644
--- a/libc/utils/hdrgen/tests/test_integration.py
+++ b/libc/utils/hdrgen/tests/test_integration.py
@@ -59,6 +59,13 @@ def test_generate_subdir_header(self):
         self.run_script(yaml_file, output_file)
         self.compare_files(output_file, expected_output_file)
 
+    def test_custom_license_and_standards(self):
+        yaml_file = self.source_dir / "input" / "custom.yaml"
+        expected_output_file = self.source_dir / "expected_output" / "custom.h"
+        output_file = self.output_dir / "custom.h"
+        self.run_script(yaml_file, output_file)
+        self.compare_files(output_file, expected_output_file)
+
     def test_generate_json(self):
         yaml_file = self.source_dir / "input/test_small.yaml"
         expected_output_file = self.source_dir / "expected_output/test_small.json"

@frobtech frobtech self-assigned this Oct 28, 2025
Copy link
Contributor

@ilovepi ilovepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Contributor

@michaelrj-google michaelrj-google left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM

@frobtech frobtech merged commit 22079e3 into llvm:main Oct 30, 2025
18 of 20 checks passed
@frobtech frobtech deleted the p/libc-hdrgen-extras branch October 30, 2025 17:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants