@tgs-sc
Contributor

tgs-sc commented Oct 21, 2025

This PR introduces a way to generate a schema for validating input YAML. This is useful for seeing the full structure of the YAML input file accepted by the clang-tidy tool. The PR consists of 3 main commits: the first introduces the new YamlIO class GenerateSchema and makes only the changes to YAMLTraits.h needed to keep it in a working state; the second makes the main changes to YAMLTraits.h, adding capabilities such as type names in ScalarTraits; the final commit adds an option to clang-tidy, along with some simple changes such as replacing YamlIO.outputting() with YamlIO.getKind(). I have an RFC on this topic: https://discourse.llvm.org/t/rfc-yamlgenerateschema-support-for-producing-yaml-schemas/85846.
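
To illustrate why the description replaces `outputting()` with `getKind()`, here is a minimal standalone sketch that does not use LLVM. The `IOKind` enumerators mirror the ones the patch adds to YAMLTraits.h; `MockIO`, `describeBranch`, `describeBranchLegacy`, and `checkDispatch` are hypothetical names used only for illustration, and the mock's `outputting()` is an assumption, not a statement about what `GenerateSchema::outputting()` returns. The point is that a boolean query cannot tell a third mode apart from reading, while a three-valued kind lets each `yamlize` specialization take a dedicated branch for schema generation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Mirrors the enumerators the patch adds to YAMLTraits.h.
enum class IOKind : std::uint8_t { Outputting, Inputting, GeneratingSchema };

// Hypothetical stand-in for llvm::yaml::IO; only the kind query is modeled.
struct MockIO {
  IOKind Kind;
  IOKind getKind() const { return Kind; }
  // Assumption for this sketch: the boolean query is true only when writing.
  bool outputting() const { return Kind == IOKind::Outputting; }
};

// Shaped like the patched yamlize() specializations: one branch per kind.
std::string describeBranch(const MockIO &IO) {
  if (IO.getKind() == IOKind::Outputting)
    return "write YAML";
  if (IO.getKind() == IOKind::Inputting)
    return "read YAML";
  return "emit schema";
}

// Shaped like the pre-patch code: anything that is not output is read as input.
std::string describeBranchLegacy(const MockIO &IO) {
  return IO.outputting() ? "write YAML" : "read YAML";
}

// Small usage check: the legacy form misroutes the schema generator.
void checkDispatch() {
  MockIO Gen{IOKind::GeneratingSchema};
  assert(describeBranch(Gen) == "emit schema");
  assert(describeBranchLegacy(Gen) == "read YAML");
}
```

In the real patch the input branch casts the `IO` object to `Input`, so routing a schema generator into that branch would be incorrect; the explicit `IOKind::Inputting` comparison avoids this.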

This PR depends on #133284 and #164826
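
The "type names in ScalarTraits" capability mentioned in the description can be pictured with the following standalone C++17 sketch. It is not the patch's `has_TypeNameTraits`: it uses `std::string_view` in place of `llvm::StringRef`, and `HasTypeName`, `IntTraits`, `PlainTraits`, and `schemaTypeName` are hypothetical names. The fallback to `"string"` when a traits class declares no `typeName` is an assumption made for this example only.

```cpp
#include <string_view>
#include <type_traits>

// Primary template: no usable static typeName member was found.
template <typename T, typename = void>
struct HasTypeName : std::false_type {};

// Selected only when T::typeName exists and converts to std::string_view.
template <typename T>
struct HasTypeName<T, std::enable_if_t<std::is_convertible_v<
                          decltype(T::typeName), std::string_view>>>
    : std::true_type {};

// Hypothetical traits class that opts in with a schema type name.
struct IntTraits {
  static constexpr std::string_view typeName = "integer";
};

// Hypothetical traits class that does not declare typeName.
struct PlainTraits {};

// Picks the declared name, or falls back to "string" (assumed default).
template <typename Traits>
constexpr std::string_view schemaTypeName() {
  if constexpr (HasTypeName<Traits>::value)
    return Traits::typeName;
  else
    return "string";
}
```

Checking the member's type as part of the detection, rather than only its existence, keeps a traits class with an unrelated `typeName` member (for example an `int`) from being treated as opted in.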

@llvmbot
Member

llvmbot commented Oct 21, 2025

@llvm/pr-subscribers-llvm-support

Author: Timur Golubovich (tgs-sc)


Patch is 49.67 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/164412.diff

11 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/ClangTidyOptions.cpp (+18-3)
  • (modified) clang-tools-extra/clang-tidy/ClangTidyOptions.h (+3)
  • (modified) clang-tools-extra/clang-tidy/tool/ClangTidyMain.cpp (+11)
  • (added) llvm/include/llvm/Support/YAMLGenerateSchema.h (+400)
  • (modified) llvm/include/llvm/Support/YAMLTraits.h (+109-29)
  • (modified) llvm/lib/Support/CMakeLists.txt (+1)
  • (added) llvm/lib/Support/YAMLGenerateSchema.cpp (+283)
  • (modified) llvm/lib/Support/YAMLTraits.cpp (+4)
  • (modified) llvm/unittests/Support/CMakeLists.txt (+1)
  • (added) llvm/unittests/Support/YAMLGenerateSchemaTest.cpp (+124)
  • (modified) llvm/utils/gn/secondary/llvm/lib/Support/BUILD.gn (+1)
diff --git a/clang-tools-extra/clang-tidy/ClangTidyOptions.cpp b/clang-tools-extra/clang-tidy/ClangTidyOptions.cpp
index b752a9beb0e34..b168e0dd28ddd 100644
--- a/clang-tools-extra/clang-tidy/ClangTidyOptions.cpp
+++ b/clang-tools-extra/clang-tidy/ClangTidyOptions.cpp
@@ -16,6 +16,7 @@
 #include "llvm/Support/ErrorOr.h"
 #include "llvm/Support/MemoryBufferRef.h"
 #include "llvm/Support/Path.h"
+#include "llvm/Support/YAMLGenerateSchema.h"
 #include "llvm/Support/YAMLTraits.h"
 #include <algorithm>
 #include <optional>
@@ -87,7 +88,7 @@ struct NOptionMap {
 template <>
 void yamlize(IO &IO, ClangTidyOptions::OptionMap &Val, bool,
              EmptyContext &Ctx) {
-  if (IO.outputting()) {
+  if (IO.getKind() == IOKind::Outputting) {
     // Ensure check options are sorted
     std::vector<std::pair<StringRef, StringRef>> SortedOptions;
     SortedOptions.reserve(Val.size());
@@ -108,7 +109,7 @@ void yamlize(IO &IO, ClangTidyOptions::OptionMap &Val, bool,
       IO.postflightKey(SaveInfo);
     }
     IO.endMapping();
-  } else {
+  } else if (IO.getKind() == IOKind::Inputting) {
     // We need custom logic here to support the old method of specifying check
     // options using a list of maps containing key and value keys.
     auto &I = reinterpret_cast<Input &>(IO);
@@ -128,6 +129,11 @@ void yamlize(IO &IO, ClangTidyOptions::OptionMap &Val, bool,
     } else {
       IO.setError("expected a sequence or map");
     }
+  } else {
+    MappingNormalization<NOptionMap, ClangTidyOptions::OptionMap> NOpts(IO,
+                                                                        Val);
+    EmptyContext Ctx;
+    yamlize(IO, NOpts->Options, true, Ctx);
   }
 }
 
@@ -182,7 +188,7 @@ struct ChecksVariant {
 };
 
 template <> void yamlize(IO &IO, ChecksVariant &Val, bool, EmptyContext &Ctx) {
-  if (!IO.outputting()) {
+  if (IO.getKind() == IOKind::Inputting) {
     // Special case for reading from YAML
     // Must support reading from both a string or a list
     auto &I = reinterpret_cast<Input &>(IO);
@@ -195,6 +201,9 @@ template <> void yamlize(IO &IO, ChecksVariant &Val, bool, EmptyContext &Ctx) {
     } else {
       IO.setError("expected string or sequence");
     }
+  } else if (IO.getKind() == IOKind::GeneratingSchema) {
+    Val.AsVector = std::vector<std::string>();
+    yamlize(IO, *Val.AsVector, true, Ctx);
   }
 }
 
@@ -541,6 +550,12 @@ parseConfiguration(llvm::MemoryBufferRef Config) {
   return Options;
 }
 
+void dumpConfigurationYAMLSchema(llvm::raw_fd_ostream &Stream) {
+  ClangTidyOptions Options;
+  llvm::yaml::GenerateSchema GS(Stream);
+  GS << Options;
+}
+
 static void diagHandlerImpl(const llvm::SMDiagnostic &Diag, void *Ctx) {
   (*reinterpret_cast<DiagCallback *>(Ctx))(Diag);
 }
diff --git a/clang-tools-extra/clang-tidy/ClangTidyOptions.h b/clang-tools-extra/clang-tidy/ClangTidyOptions.h
index 2aae92f1d9eb3..f0aa710c685a2 100644
--- a/clang-tools-extra/clang-tidy/ClangTidyOptions.h
+++ b/clang-tools-extra/clang-tidy/ClangTidyOptions.h
@@ -343,6 +343,9 @@ std::error_code parseLineFilter(llvm::StringRef LineFilter,
 llvm::ErrorOr<ClangTidyOptions>
 parseConfiguration(llvm::MemoryBufferRef Config);
 
+/// Dumps configuration YAML Schema to \p Stream
+void dumpConfigurationYAMLSchema(llvm::raw_fd_ostream &Stream);
+
 using DiagCallback = llvm::function_ref<void(const llvm::SMDiagnostic &)>;
 
 llvm::ErrorOr<ClangTidyOptions>
diff --git a/clang-tools-extra/clang-tidy/tool/ClangTidyMain.cpp b/clang-tools-extra/clang-tidy/tool/ClangTidyMain.cpp
index 64157f530b8c0..7c6fa7f5c40b9 100644
--- a/clang-tools-extra/clang-tidy/tool/ClangTidyMain.cpp
+++ b/clang-tools-extra/clang-tidy/tool/ClangTidyMain.cpp
@@ -355,6 +355,12 @@ see https://clang.llvm.org/extra/clang-tidy/QueryBasedCustomChecks.html.
                                               cl::init(false),
                                               cl::cat(ClangTidyCategory));
 
+static cl::opt<bool> DumpYAMLSchema("dump-yaml-schema", desc(R"(
+Dumps configuration YAML Schema in JSON format to
+stdout.
+)"),
+                                    cl::init(false),
+                                    cl::cat(ClangTidyCategory));
 namespace clang::tidy {
 
 static void printStats(const ClangTidyStats &Stats) {
@@ -684,6 +690,11 @@ int clangTidyMain(int argc, const char **argv) {
     return 0;
   }
 
+  if (DumpYAMLSchema) {
+    dumpConfigurationYAMLSchema(llvm::outs());
+    return 0;
+  }
+
   if (VerifyConfig) {
     std::vector<ClangTidyOptionsProvider::OptionsSource> RawOptions =
         OptionsProvider->getRawOptions(FileName);
diff --git a/llvm/include/llvm/Support/YAMLGenerateSchema.h b/llvm/include/llvm/Support/YAMLGenerateSchema.h
new file mode 100644
index 0000000000000..ac1609a9ee469
--- /dev/null
+++ b/llvm/include/llvm/Support/YAMLGenerateSchema.h
@@ -0,0 +1,400 @@
+//===- llvm/Support/YAMLGenerateSchema.h ------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_SUPPORT_YAMLGENERATE_SCHEMA_H
+#define LLVM_SUPPORT_YAMLGENERATE_SCHEMA_H
+
+#include "llvm/Support/Casting.h"
+#include "llvm/Support/YAMLTraits.h"
+
+namespace llvm {
+
+namespace json {
+class Value;
+}
+
+namespace yaml {
+
+class GenerateSchema : public IO {
+public:
+  GenerateSchema(raw_ostream &RO);
+  ~GenerateSchema() override = default;
+
+  IOKind getKind() const override;
+  bool outputting() const override;
+  bool mapTag(StringRef, bool) override;
+  void beginMapping() override;
+  void endMapping() override;
+  bool preflightKey(StringRef, bool, bool, bool &, void *&) override;
+  void postflightKey(void *) override;
+  std::vector<StringRef> keys() override;
+  void beginFlowMapping() override;
+  void endFlowMapping() override;
+  unsigned beginSequence() override;
+  void endSequence() override;
+  bool preflightElement(unsigned, void *&) override;
+  void postflightElement(void *) override;
+  unsigned beginFlowSequence() override;
+  bool preflightFlowElement(unsigned, void *&) override;
+  void postflightFlowElement(void *) override;
+  void endFlowSequence() override;
+  void beginEnumScalar() override;
+  bool matchEnumScalar(StringRef, bool) override;
+  bool matchEnumFallback() override;
+  void endEnumScalar() override;
+  bool beginBitSetScalar(bool &) override;
+  bool bitSetMatch(StringRef, bool) override;
+  void endBitSetScalar() override;
+  void scalarString(StringRef &, QuotingType) override;
+  void blockScalarString(StringRef &) override;
+  void scalarTag(std::string &) override;
+  NodeKind getNodeKind() override;
+  void setError(const Twine &message) override;
+  std::error_code error() override;
+  bool canElideEmptySequence() override;
+
+  bool preflightDocument();
+  void postflightDocument();
+
+  class SchemaNode {
+  public:
+    virtual json::Value toJSON() const = 0;
+
+    virtual ~SchemaNode() = default;
+  };
+
+  enum class PropertyKind : uint8_t {
+    UserDefined,
+    Properties,
+    AdditionalProperties,
+    Required,
+    Optional,
+    Type,
+    Enum,
+    Items,
+    FlowStyle,
+  };
+
+  class SchemaProperty : public SchemaNode {
+    StringRef Name;
+    PropertyKind Kind;
+
+  public:
+    SchemaProperty(StringRef Name, PropertyKind Kind)
+        : Name(Name), Kind(Kind) {}
+
+    PropertyKind getKind() const { return Kind; }
+
+    StringRef getName() const { return Name; }
+  };
+
+  class Schema;
+
+  class UserDefinedProperty final : public SchemaProperty {
+    Schema *Value;
+
+  public:
+    UserDefinedProperty(StringRef Name, Schema *Value)
+        : SchemaProperty(Name, PropertyKind::UserDefined), Value(Value) {}
+
+    Schema *getSchema() const { return Value; }
+
+    json::Value toJSON() const override;
+
+    static bool classof(const SchemaProperty *Property) {
+      return Property->getKind() == PropertyKind::UserDefined;
+    }
+  };
+
+  class PropertiesProperty final : public SchemaProperty,
+                                   SmallVector<UserDefinedProperty *, 8> {
+  public:
+    using BaseVector = SmallVector<UserDefinedProperty *, 8>;
+
+    PropertiesProperty()
+        : SchemaProperty("properties", PropertyKind::Properties) {}
+
+    using BaseVector::begin;
+    using BaseVector::emplace_back;
+    using BaseVector::end;
+    using BaseVector::size;
+
+    json::Value toJSON() const override;
+
+    static bool classof(const SchemaProperty *Property) {
+      return Property->getKind() == PropertyKind::Properties;
+    }
+  };
+
+  class AdditionalPropertiesProperty final : public SchemaProperty {
+    Schema *Value;
+
+  public:
+    AdditionalPropertiesProperty(Schema *Value = nullptr)
+        : SchemaProperty("additionalProperties",
+                         PropertyKind::AdditionalProperties),
+          Value(Value) {}
+
+    Schema *getSchema() const { return Value; }
+
+    void setSchema(Schema *S) { Value = S; }
+
+    json::Value toJSON() const override;
+
+    static bool classof(const SchemaProperty *Property) {
+      return Property->getKind() == PropertyKind::AdditionalProperties;
+    }
+  };
+
+  class RequiredProperty final : public SchemaProperty,
+                                 SmallVector<StringRef, 4> {
+  public:
+    using BaseVector = SmallVector<StringRef, 4>;
+
+    RequiredProperty() : SchemaProperty("required", PropertyKind::Required) {}
+
+    using BaseVector::begin;
+    using BaseVector::emplace_back;
+    using BaseVector::end;
+    using BaseVector::size;
+
+    json::Value toJSON() const override;
+
+    static bool classof(const SchemaProperty *Property) {
+      return Property->getKind() == PropertyKind::Required;
+    }
+  };
+
+  class OptionalProperty final : public SchemaProperty,
+                                 SmallVector<StringRef, 4> {
+  public:
+    using BaseVector = SmallVector<StringRef, 4>;
+
+    OptionalProperty() : SchemaProperty("optional", PropertyKind::Optional) {}
+
+    using BaseVector::begin;
+    using BaseVector::emplace_back;
+    using BaseVector::end;
+    using BaseVector::size;
+
+    json::Value toJSON() const override;
+
+    static bool classof(const SchemaProperty *Property) {
+      return Property->getKind() == PropertyKind::Optional;
+    }
+  };
+
+  class TypeProperty final : public SchemaProperty {
+    StringRef Value;
+
+  public:
+    TypeProperty(StringRef Value)
+        : SchemaProperty("type", PropertyKind::Type), Value(Value) {}
+
+    json::Value toJSON() const override;
+
+    static bool classof(const SchemaProperty *Property) {
+      return Property->getKind() == PropertyKind::Type;
+    }
+  };
+
+  class EnumProperty final : public SchemaProperty, SmallVector<StringRef, 4> {
+  public:
+    using BaseVector = SmallVector<StringRef, 4>;
+
+    EnumProperty() : SchemaProperty("enum", PropertyKind::Enum) {}
+
+    using BaseVector::begin;
+    using BaseVector::emplace_back;
+    using BaseVector::end;
+    using BaseVector::size;
+
+    json::Value toJSON() const override;
+
+    static bool classof(const SchemaProperty *Property) {
+      return Property->getKind() == PropertyKind::Enum;
+    }
+  };
+
+  class ItemsProperty final : public SchemaProperty {
+    Schema *Value;
+
+  public:
+    ItemsProperty(Schema *Value = nullptr)
+        : SchemaProperty("items", PropertyKind::Items), Value(Value) {}
+
+    Schema *getSchema() const { return Value; }
+
+    void setSchema(Schema *S) { Value = S; }
+
+    json::Value toJSON() const override;
+
+    static bool classof(const SchemaProperty *Property) {
+      return Property->getKind() == PropertyKind::Items;
+    }
+  };
+
+  enum class FlowStyle : bool {
+    Block,
+    Flow,
+  };
+
+  class FlowStyleProperty final : public SchemaProperty {
+    FlowStyle Style;
+
+  public:
+    FlowStyleProperty(FlowStyle Style = FlowStyle::Block)
+        : SchemaProperty("flowStyle", PropertyKind::FlowStyle), Style(Style) {}
+
+    void setStyle(FlowStyle S) { Style = S; }
+
+    FlowStyle getStyle() const { return Style; }
+
+    json::Value toJSON() const override;
+
+    static bool classof(const SchemaProperty *Property) {
+      return Property->getKind() == PropertyKind::FlowStyle;
+    }
+  };
+
+  class Schema final : public SchemaNode, SmallVector<SchemaProperty *, 8> {
+  public:
+    using BaseVector = SmallVector<SchemaProperty *, 8>;
+
+    Schema() = default;
+
+    using BaseVector::begin;
+    using BaseVector::emplace_back;
+    using BaseVector::end;
+    using BaseVector::size;
+
+    json::Value toJSON() const override;
+  };
+
+private:
+  std::vector<std::unique_ptr<SchemaNode>> SchemaNodes;
+  SmallVector<Schema *, 8> Schemas;
+  raw_ostream &RO;
+  SchemaNode *Root = nullptr;
+
+  template <typename PropertyType, typename... PropertyArgs>
+  PropertyType *createProperty(PropertyArgs &&...Args) {
+    auto UPtr =
+        std::make_unique<PropertyType>(std::forward<PropertyArgs>(Args)...);
+    auto *Ptr = UPtr.get();
+    SchemaNodes.emplace_back(std::move(UPtr));
+    return Ptr;
+  }
+
+  template <typename PropertyType, typename... PropertyArgs>
+  PropertyType *getOrCreateProperty(Schema &S, PropertyArgs... Args) {
+    auto Found = std::find_if(S.begin(), S.end(), [](SchemaProperty *Property) {
+      return isa<PropertyType>(Property);
+    });
+    if (Found != S.end()) {
+      return cast<PropertyType>(*Found);
+    }
+    PropertyType *Created =
+        createProperty<PropertyType>(std::forward<PropertyArgs>(Args)...);
+    S.emplace_back(Created);
+    return Created;
+  }
+
+  Schema *createSchema() {
+    auto UPtr = std::make_unique<Schema>();
+    auto *Ptr = UPtr.get();
+    SchemaNodes.emplace_back(std::move(UPtr));
+    return Ptr;
+  }
+
+  Schema *getTopSchema() const {
+    return Schemas.empty() ? nullptr : Schemas.back();
+  }
+};
+
+// Define non-member operator<< so that Output can stream out document list.
+template <typename T>
+inline std::enable_if_t<has_DocumentListTraits<T>::value, GenerateSchema &>
+operator<<(GenerateSchema &Gen, T &DocList) {
+  EmptyContext Ctx;
+  Gen.preflightDocument();
+  yamlize(Gen, DocumentListTraits<T>::element(Gen, DocList, 0), true, Ctx);
+  Gen.postflightDocument();
+  return Gen;
+}
+
+// Define non-member operator<< so that Output can stream out a map.
+template <typename T>
+inline std::enable_if_t<has_MappingTraits<T, EmptyContext>::value,
+                        GenerateSchema &>
+operator<<(GenerateSchema &Gen, T &Map) {
+  EmptyContext Ctx;
+  Gen.preflightDocument();
+  yamlize(Gen, Map, true, Ctx);
+  Gen.postflightDocument();
+  return Gen;
+}
+
+// Define non-member operator<< so that Output can stream out a sequence.
+template <typename T>
+inline std::enable_if_t<has_SequenceTraits<T>::value, GenerateSchema &>
+operator<<(GenerateSchema &Gen, T &Seq) {
+  EmptyContext Ctx;
+  Gen.preflightDocument();
+  yamlize(Gen, Seq, true, Ctx);
+  Gen.postflightDocument();
+  return Gen;
+}
+
+// Define non-member operator<< so that Output can stream out a block scalar.
+template <typename T>
+inline std::enable_if_t<has_BlockScalarTraits<T>::value, GenerateSchema &>
+operator<<(GenerateSchema &Gen, T &Val) {
+  EmptyContext Ctx;
+  Gen.preflightDocument();
+  yamlize(Gen, Val, true, Ctx);
+  Gen.postflightDocument();
+  return Gen;
+}
+
+// Define non-member operator<< so that Output can stream out a string map.
+template <typename T>
+inline std::enable_if_t<has_CustomMappingTraits<T>::value, GenerateSchema &>
+operator<<(GenerateSchema &Gen, T &Val) {
+  EmptyContext Ctx;
+  Gen.preflightDocument();
+  yamlize(Gen, Val, true, Ctx);
+  Gen.postflightDocument();
+  return Gen;
+}
+
+// Define non-member operator<< so that Output can stream out a polymorphic
+// type.
+template <typename T>
+inline std::enable_if_t<has_PolymorphicTraits<T>::value, GenerateSchema &>
+operator<<(GenerateSchema &Gen, T &Val) {
+  EmptyContext Ctx;
+  Gen.preflightDocument();
+  yamlize(Gen, Val, true, Ctx);
+  Gen.postflightDocument();
+  return Gen;
+}
+
+// Provide better error message about types missing a trait specialization
+template <typename T>
+inline std::enable_if_t<missingTraits<T, EmptyContext>::value, GenerateSchema &>
+operator<<(GenerateSchema &Gen, T &seq) {
+  char missing_yaml_trait_for_type[sizeof(MissingTrait<T>)];
+  return Gen;
+}
+
+} // namespace yaml
+
+} // namespace llvm
+
+#endif // LLVM_SUPPORT_YAMLGENERATE_SCHEMA_H
diff --git a/llvm/include/llvm/Support/YAMLTraits.h b/llvm/include/llvm/Support/YAMLTraits.h
index 3d36f41ca1a04..f92e26e6424a1 100644
--- a/llvm/include/llvm/Support/YAMLTraits.h
+++ b/llvm/include/llvm/Support/YAMLTraits.h
@@ -145,6 +145,7 @@ enum class QuotingType { None, Single, Double };
 ///        return StringRef();
 ///      }
 ///      static QuotingType mustQuote(StringRef) { return QuotingType::Single; }
+///      static constexpr StringRef typeName = "string";
 ///    };
 template <typename T, typename Enable = void> struct ScalarTraits {
   // Must provide:
@@ -158,6 +159,9 @@ template <typename T, typename Enable = void> struct ScalarTraits {
   //
   // Function to determine if the value should be quoted.
   // static QuotingType mustQuote(StringRef);
+  //
+  // Optional, for GeneratingSchema:
+  // static constexpr StringRef typeName = "string";
 };
 
 /// This class should be specialized by type that requires custom conversion
@@ -175,6 +179,7 @@ template <typename T, typename Enable = void> struct ScalarTraits {
 ///        // return empty string on success, or error string
 ///        return StringRef();
 ///      }
+///      static constexpr StringRef typeName = "string";
 ///    };
 template <typename T> struct BlockScalarTraits {
   // Must provide:
@@ -189,6 +194,7 @@ template <typename T> struct BlockScalarTraits {
   // Optional:
   // static StringRef inputTag(T &Val, std::string Tag)
   // static void outputTag(const T &Val, raw_ostream &Out)
+  // static constexpr StringRef typeName = "string";
 };
 
 /// This class should be specialized by type that requires custom conversion
@@ -211,6 +217,7 @@ template <typename T> struct BlockScalarTraits {
 ///      static QuotingType mustQuote(const MyType &Value, StringRef) {
 ///        return QuotingType::Single;
 ///      }
+///      static constexpr StringRef typeName = "integer";
 ///    };
 template <typename T> struct TaggedScalarTraits {
   // Must provide:
@@ -226,6 +233,9 @@ template <typename T> struct TaggedScalarTraits {
   //
   // Function to determine if the value should be quoted.
   // static QuotingType mustQuote(const T &Value, StringRef Scalar);
+  //
+  // Optional:
+  // static constexpr StringRef typeName = "string";
 };
 
 /// This class should be specialized by any type that needs to be converted
@@ -442,6 +452,14 @@ template <class T> struct has_CustomMappingTraits {
       is_detected<check, CustomMappingTraits<T>>::value;
 };
 
+// Test if typeName is defined on type T.
+template <typename T> struct has_TypeNameTraits {
+  template <class U>
+  using check = std::is_same<decltype(&U::typeName), StringRef>;
+
+  static constexpr bool value = is_detected<check, T>::value;
+};
+
 // Test if flow is defined on type T.
 template <typename T> struct has_FlowTraits {
   template <class U> using check = decltype(&U::flow);
@@ -683,12 +701,19 @@ struct unvalidatedMappingTraits
                                 !has_MappingValidateTraits<T, Context>::value> {
 };
 
+enum class IOKind : uint8_t {
+  Outputting,
+  Inputting,
+  GeneratingSchema,
+};
+
 // Base class for Input and Output.
 class LLVM_ABI IO {
 public:
   IO(void *Ctxt = nullptr);
   virtual ~IO();
 
+  virtual IOKind getKind() const = 0;
   virtual bool outputting() const = 0;
 
   virtual unsigned beginSequence() = 0;
@@ -732,7 +757,8 @@ class LLVM_ABI IO {
   virtual void setAllowUnknownKeys(bool Allow);
 
   template <typename T> void enumCase(T &Val, StringRef Str, const T ConstVal) {
-    if (matchEnumScalar(Str, outputting() && Val == ConstVal)) {
+    if (matchEnumScalar(Str,
+                        getKind() == IOKind::Outputting && Val == ConstVal)) {
       Val = ConstVal;
     }
   }
@@ -740,7 +766,8 @@ class LLVM_ABI IO {
   // allow anonymous enum values to be used with LLVM_YAML_STRONG_TYPEDEF
   template <typename T>
   void enumCase(T &Val, Str...
[truncated]

@llvmbot
Member

llvmbot commented Oct 21, 2025

@llvm/pr-subscribers-clang-tools-extra

+                                 SmallVector<StringRef, 4> {
+  public:
+    using BaseVector = SmallVector<StringRef, 4>;
+
+    RequiredProperty() : SchemaProperty("required", PropertyKind::Required) {}
+
+    using BaseVector::begin;
+    using BaseVector::emplace_back;
+    using BaseVector::end;
+    using BaseVector::size;
+
+    json::Value toJSON() const override;
+
+    static bool classof(const SchemaProperty *Property) {
+      return Property->getKind() == PropertyKind::Required;
+    }
+  };
+
+  class OptionalProperty final : public SchemaProperty,
+                                 SmallVector<StringRef, 4> {
+  public:
+    using BaseVector = SmallVector<StringRef, 4>;
+
+    OptionalProperty() : SchemaProperty("optional", PropertyKind::Optional) {}
+
+    using BaseVector::begin;
+    using BaseVector::emplace_back;
+    using BaseVector::end;
+    using BaseVector::size;
+
+    json::Value toJSON() const override;
+
+    static bool classof(const SchemaProperty *Property) {
+      return Property->getKind() == PropertyKind::Optional;
+    }
+  };
+
+  class TypeProperty final : public SchemaProperty {
+    StringRef Value;
+
+  public:
+    TypeProperty(StringRef Value)
+        : SchemaProperty("type", PropertyKind::Type), Value(Value) {}
+
+    json::Value toJSON() const override;
+
+    static bool classof(const SchemaProperty *Property) {
+      return Property->getKind() == PropertyKind::Type;
+    }
+  };
+
+  class EnumProperty final : public SchemaProperty, SmallVector<StringRef, 4> {
+  public:
+    using BaseVector = SmallVector<StringRef, 4>;
+
+    EnumProperty() : SchemaProperty("enum", PropertyKind::Enum) {}
+
+    using BaseVector::begin;
+    using BaseVector::emplace_back;
+    using BaseVector::end;
+    using BaseVector::size;
+
+    json::Value toJSON() const override;
+
+    static bool classof(const SchemaProperty *Property) {
+      return Property->getKind() == PropertyKind::Enum;
+    }
+  };
+
+  class ItemsProperty final : public SchemaProperty {
+    Schema *Value;
+
+  public:
+    ItemsProperty(Schema *Value = nullptr)
+        : SchemaProperty("items", PropertyKind::Items), Value(Value) {}
+
+    Schema *getSchema() const { return Value; }
+
+    void setSchema(Schema *S) { Value = S; }
+
+    json::Value toJSON() const override;
+
+    static bool classof(const SchemaProperty *Property) {
+      return Property->getKind() == PropertyKind::Items;
+    }
+  };
+
+  enum class FlowStyle : bool {
+    Block,
+    Flow,
+  };
+
+  class FlowStyleProperty final : public SchemaProperty {
+    FlowStyle Style;
+
+  public:
+    FlowStyleProperty(FlowStyle Style = FlowStyle::Block)
+        : SchemaProperty("flowStyle", PropertyKind::FlowStyle), Style(Style) {}
+
+    void setStyle(FlowStyle S) { Style = S; }
+
+    FlowStyle getStyle() const { return Style; }
+
+    json::Value toJSON() const override;
+
+    static bool classof(const SchemaProperty *Property) {
+      return Property->getKind() == PropertyKind::FlowStyle;
+    }
+  };
+
+  class Schema final : public SchemaNode, SmallVector<SchemaProperty *, 8> {
+  public:
+    using BaseVector = SmallVector<SchemaProperty *, 8>;
+
+    Schema() = default;
+
+    using BaseVector::begin;
+    using BaseVector::emplace_back;
+    using BaseVector::end;
+    using BaseVector::size;
+
+    json::Value toJSON() const override;
+  };
+
+private:
+  std::vector<std::unique_ptr<SchemaNode>> SchemaNodes;
+  SmallVector<Schema *, 8> Schemas;
+  raw_ostream &RO;
+  SchemaNode *Root = nullptr;
+
+  template <typename PropertyType, typename... PropertyArgs>
+  PropertyType *createProperty(PropertyArgs &&...Args) {
+    auto UPtr =
+        std::make_unique<PropertyType>(std::forward<PropertyArgs>(Args)...);
+    auto *Ptr = UPtr.get();
+    SchemaNodes.emplace_back(std::move(UPtr));
+    return Ptr;
+  }
+
+  template <typename PropertyType, typename... PropertyArgs>
+  PropertyType *getOrCreateProperty(Schema &S, PropertyArgs &&...Args) {
+    auto Found = std::find_if(S.begin(), S.end(), [](SchemaProperty *Property) {
+      return isa<PropertyType>(Property);
+    });
+    if (Found != S.end()) {
+      return cast<PropertyType>(*Found);
+    }
+    PropertyType *Created =
+        createProperty<PropertyType>(std::forward<PropertyArgs>(Args)...);
+    S.emplace_back(Created);
+    return Created;
+  }
+
+  Schema *createSchema() {
+    auto UPtr = std::make_unique<Schema>();
+    auto *Ptr = UPtr.get();
+    SchemaNodes.emplace_back(std::move(UPtr));
+    return Ptr;
+  }
+
+  Schema *getTopSchema() const {
+    return Schemas.empty() ? nullptr : Schemas.back();
+  }
+};
+
+// Define non-member operator<< so that GenerateSchema accepts a document list.
+template <typename T>
+inline std::enable_if_t<has_DocumentListTraits<T>::value, GenerateSchema &>
+operator<<(GenerateSchema &Gen, T &DocList) {
+  EmptyContext Ctx;
+  Gen.preflightDocument();
+  yamlize(Gen, DocumentListTraits<T>::element(Gen, DocList, 0), true, Ctx);
+  Gen.postflightDocument();
+  return Gen;
+}
+
+// Define non-member operator<< so that GenerateSchema accepts a map.
+template <typename T>
+inline std::enable_if_t<has_MappingTraits<T, EmptyContext>::value,
+                        GenerateSchema &>
+operator<<(GenerateSchema &Gen, T &Map) {
+  EmptyContext Ctx;
+  Gen.preflightDocument();
+  yamlize(Gen, Map, true, Ctx);
+  Gen.postflightDocument();
+  return Gen;
+}
+
+// Define non-member operator<< so that GenerateSchema accepts a sequence.
+template <typename T>
+inline std::enable_if_t<has_SequenceTraits<T>::value, GenerateSchema &>
+operator<<(GenerateSchema &Gen, T &Seq) {
+  EmptyContext Ctx;
+  Gen.preflightDocument();
+  yamlize(Gen, Seq, true, Ctx);
+  Gen.postflightDocument();
+  return Gen;
+}
+
+// Define non-member operator<< so that GenerateSchema accepts a block scalar.
+template <typename T>
+inline std::enable_if_t<has_BlockScalarTraits<T>::value, GenerateSchema &>
+operator<<(GenerateSchema &Gen, T &Val) {
+  EmptyContext Ctx;
+  Gen.preflightDocument();
+  yamlize(Gen, Val, true, Ctx);
+  Gen.postflightDocument();
+  return Gen;
+}
+
+// Define non-member operator<< so that GenerateSchema accepts a string map.
+template <typename T>
+inline std::enable_if_t<has_CustomMappingTraits<T>::value, GenerateSchema &>
+operator<<(GenerateSchema &Gen, T &Val) {
+  EmptyContext Ctx;
+  Gen.preflightDocument();
+  yamlize(Gen, Val, true, Ctx);
+  Gen.postflightDocument();
+  return Gen;
+}
+
+// Define non-member operator<< so that GenerateSchema accepts a polymorphic
+// type.
+template <typename T>
+inline std::enable_if_t<has_PolymorphicTraits<T>::value, GenerateSchema &>
+operator<<(GenerateSchema &Gen, T &Val) {
+  EmptyContext Ctx;
+  Gen.preflightDocument();
+  yamlize(Gen, Val, true, Ctx);
+  Gen.postflightDocument();
+  return Gen;
+}
+
+// Provide a better error message for types missing a trait specialization.
+template <typename T>
+inline std::enable_if_t<missingTraits<T, EmptyContext>::value, GenerateSchema &>
+operator<<(GenerateSchema &Gen, T &seq) {
+  char missing_yaml_trait_for_type[sizeof(MissingTrait<T>)];
+  return Gen;
+}
+
+} // namespace yaml
+
+} // namespace llvm
+
+#endif // LLVM_SUPPORT_YAMLGENERATE_SCHEMA_H
diff --git a/llvm/include/llvm/Support/YAMLTraits.h b/llvm/include/llvm/Support/YAMLTraits.h
index 3d36f41ca1a04..f92e26e6424a1 100644
--- a/llvm/include/llvm/Support/YAMLTraits.h
+++ b/llvm/include/llvm/Support/YAMLTraits.h
@@ -145,6 +145,7 @@ enum class QuotingType { None, Single, Double };
 ///        return StringRef();
 ///      }
 ///      static QuotingType mustQuote(StringRef) { return QuotingType::Single; }
+///      static constexpr StringRef typeName = "string";
 ///    };
 template <typename T, typename Enable = void> struct ScalarTraits {
   // Must provide:
@@ -158,6 +159,9 @@ template <typename T, typename Enable = void> struct ScalarTraits {
   //
   // Function to determine if the value should be quoted.
   // static QuotingType mustQuote(StringRef);
+  //
+  // Optional, for GeneratingSchema:
+  // static constexpr StringRef typeName = "string";
 };
 
 /// This class should be specialized by type that requires custom conversion
@@ -175,6 +179,7 @@ template <typename T, typename Enable = void> struct ScalarTraits {
 ///        // return empty string on success, or error string
 ///        return StringRef();
 ///      }
+///      static constexpr StringRef typeName = "string";
 ///    };
 template <typename T> struct BlockScalarTraits {
   // Must provide:
@@ -189,6 +194,7 @@ template <typename T> struct BlockScalarTraits {
   // Optional:
   // static StringRef inputTag(T &Val, std::string Tag)
   // static void outputTag(const T &Val, raw_ostream &Out)
+  // static constexpr StringRef typeName = "string";
 };
 
 /// This class should be specialized by type that requires custom conversion
@@ -211,6 +217,7 @@ template <typename T> struct BlockScalarTraits {
 ///      static QuotingType mustQuote(const MyType &Value, StringRef) {
 ///        return QuotingType::Single;
 ///      }
+///      static constexpr StringRef typeName = "integer";
 ///    };
 template <typename T> struct TaggedScalarTraits {
   // Must provide:
@@ -226,6 +233,9 @@ template <typename T> struct TaggedScalarTraits {
   //
   // Function to determine if the value should be quoted.
   // static QuotingType mustQuote(const T &Value, StringRef Scalar);
+  //
+  // Optional:
+  // static constexpr StringRef typeName = "string";
 };
 
 /// This class should be specialized by any type that needs to be converted
@@ -442,6 +452,14 @@ template <class T> struct has_CustomMappingTraits {
       is_detected<check, CustomMappingTraits<T>>::value;
 };
 
+// Test if typeName is defined on type T.
+template <typename T> struct has_TypeNameTraits {
+  template <class U>
+  using check = std::is_same<decltype(&U::typeName), StringRef>;
+
+  static constexpr bool value = is_detected<check, T>::value;
+};
+
 // Test if flow is defined on type T.
 template <typename T> struct has_FlowTraits {
   template <class U> using check = decltype(&U::flow);
@@ -683,12 +701,19 @@ struct unvalidatedMappingTraits
                                 !has_MappingValidateTraits<T, Context>::value> {
 };
 
+enum class IOKind : uint8_t {
+  Outputting,
+  Inputting,
+  GeneratingSchema,
+};
+
 // Base class for Input and Output.
 class LLVM_ABI IO {
 public:
   IO(void *Ctxt = nullptr);
   virtual ~IO();
 
+  virtual IOKind getKind() const = 0;
   virtual bool outputting() const = 0;
 
   virtual unsigned beginSequence() = 0;
@@ -732,7 +757,8 @@ class LLVM_ABI IO {
   virtual void setAllowUnknownKeys(bool Allow);
 
   template <typename T> void enumCase(T &Val, StringRef Str, const T ConstVal) {
-    if (matchEnumScalar(Str, outputting() && Val == ConstVal)) {
+    if (matchEnumScalar(Str,
+                        getKind() == IOKind::Outputting && Val == ConstVal)) {
       Val = ConstVal;
     }
   }
@@ -740,7 +766,8 @@ class LLVM_ABI IO {
   // allow anonymous enum values to be used with LLVM_YAML_STRONG_TYPEDEF
   template <typename T>
   void enumCase(T &Val, Str...
[truncated]
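
A side note on the `has_TypeNameTraits` check shown in the YAMLTraits.h hunk of this patch: `is_detected` only asks whether `check<U>` is a well-formed type, so the `std::is_same<...>` wrapper appears to act purely as a presence test, and its boolean result is not consulted. (`decltype(&U::typeName)` would be `const StringRef *` in any case, not `StringRef`.) The sketch below illustrates the difference between a presence-only check and one that also constrains the member's type. It is a standalone illustration, not code from the patch: `std::string_view` stands in for `llvm::StringRef`, `std::void_t` stands in for `llvm::is_detected`, and `HasTypeName`, `HasStringTypeName`, and the three sample structs are hypothetical names. It assumes C++17.

```cpp
#include <cassert>
#include <string_view>
#include <type_traits>

// Presence-only check: true whenever T has a member named typeName whose
// address can be taken, regardless of that member's type.
template <typename T, typename = void>
struct HasTypeName : std::false_type {};
template <typename T>
struct HasTypeName<T, std::void_t<decltype(&T::typeName)>> : std::true_type {};

// Stricter check: the member must exist AND be a std::string_view
// (std::string_view is a stand-in for llvm::StringRef here).
template <typename T, typename = void>
struct HasStringTypeName : std::false_type {};
template <typename T>
struct HasStringTypeName<T, std::void_t<decltype(T::typeName)>>
    : std::is_same<std::remove_cv_t<decltype(T::typeName)>, std::string_view> {
};

// Hypothetical trait-like structs used only to exercise the checks.
struct WithName {
  static constexpr std::string_view typeName = "string";
};
struct WithInt {
  static constexpr int typeName = 7;
};
struct Without {};

// Taking the address of the member yields a pointer type, not the value type.
static_assert(std::is_same_v<decltype(&WithName::typeName),
                             const std::string_view *>);

// The presence-only check accepts any typeName member.
static_assert(HasTypeName<WithName>::value);
static_assert(HasTypeName<WithInt>::value);
static_assert(!HasTypeName<Without>::value);

// The stricter check rejects a typeName of the wrong type.
static_assert(HasStringTypeName<WithName>::value);
static_assert(!HasStringTypeName<WithInt>::value);
static_assert(!HasStringTypeName<Without>::value);
```

Whether the stricter form is worth it depends on whether a mistyped `typeName` should be silently ignored or diagnosed; the presence-only form matches how `has_FlowTraits` is written in the same header.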

@llvmbot
Member

llvmbot commented Oct 21, 2025

@llvm/pr-subscribers-clang-tidy

Author: Timur Golubovich (tgs-sc)

Changes

Introduces a way to generate a YAML schema for validating input YAML, which makes it possible to see the full structure of the clang-tidy configuration file. The PR consists of three main commits. The first introduces the new YamlIO GenerateSchema and makes only the changes to YAMLTraits.h needed to keep it in a working state. The second makes the main changes to YAMLTraits.h, adding capabilities such as type names in ScalarTraits. The third adds an option to clang-tidy, along with some simple changes such as replacing YamlIO.outputting() with YamlIO.getKind(). There is an RFC on this topic: https://discourse.llvm.org/t/rfc-yamlgenerateschema-support-for-producing-yaml-schemas/85846.
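
To illustrate the `outputting()` -> `getKind()` change described above: once a third IO kind exists, a boolean query can no longer tell a reader from a schema generator, so call sites written as `if (outputting()) ... else ...` need an explicit three-way dispatch. The following is a minimal standalone sketch, not code from the patch. The `IOKind` enumerators match those the patch adds to YAMLTraits.h; `MiniIO`, the three derived structs, and `describeAction` are hypothetical stand-ins for the much larger `llvm::yaml::IO` interface.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Same enumerators as the IOKind enum added by the patch.
enum class IOKind : std::uint8_t { Outputting, Inputting, GeneratingSchema };

// Hypothetical, heavily reduced stand-in for llvm::yaml::IO.
struct MiniIO {
  virtual ~MiniIO() = default;
  virtual IOKind getKind() const = 0;
};

struct MiniOutput final : MiniIO {
  IOKind getKind() const override { return IOKind::Outputting; }
};
struct MiniInput final : MiniIO {
  IOKind getKind() const override { return IOKind::Inputting; }
};
struct MiniSchema final : MiniIO {
  IOKind getKind() const override { return IOKind::GeneratingSchema; }
};

// A custom yamlize-style routine that must behave differently in each mode.
// With only a bool, the Inputting and GeneratingSchema branches would be
// indistinguishable.
std::string describeAction(const MiniIO &IO) {
  switch (IO.getKind()) {
  case IOKind::Outputting:
    return "serialize the value";
  case IOKind::Inputting:
    return "parse the value";
  case IOKind::GeneratingSchema:
    return "describe the value's shape";
  }
  return "unknown"; // not reached for valid enumerators
}
```

This mirrors the shape of the `yamlize` specializations in ClangTidyOptions.cpp, where the former `else` branch becomes `else if (IO.getKind() == IOKind::Inputting)` and a new branch handles schema generation.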


Patch is 49.67 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/164412.diff

11 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/ClangTidyOptions.cpp (+18-3)
  • (modified) clang-tools-extra/clang-tidy/ClangTidyOptions.h (+3)
  • (modified) clang-tools-extra/clang-tidy/tool/ClangTidyMain.cpp (+11)
  • (added) llvm/include/llvm/Support/YAMLGenerateSchema.h (+400)
  • (modified) llvm/include/llvm/Support/YAMLTraits.h (+109-29)
  • (modified) llvm/lib/Support/CMakeLists.txt (+1)
  • (added) llvm/lib/Support/YAMLGenerateSchema.cpp (+283)
  • (modified) llvm/lib/Support/YAMLTraits.cpp (+4)
  • (modified) llvm/unittests/Support/CMakeLists.txt (+1)
  • (added) llvm/unittests/Support/YAMLGenerateSchemaTest.cpp (+124)
  • (modified) llvm/utils/gn/secondary/llvm/lib/Support/BUILD.gn (+1)
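
The private helpers `createProperty`, `createSchema`, and `getOrCreateProperty` in YAMLGenerateSchema.h follow a common ownership pattern: one vector of `std::unique_ptr` owns every node, while the rest of the structure links nodes through non-owning raw pointers that stay valid as the vector grows. A minimal standalone sketch of that pattern, under simplifying assumptions: `Node`, `Arena`, `create`, and `getOrCreate` are hypothetical names, and lookup here is by name, whereas the patch looks a property up by its `PropertyKind` within a single `Schema` via `isa<>`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type; the patch uses SchemaNode and its subclasses.
struct Node {
  explicit Node(std::string Name) : Name(std::move(Name)) {}
  std::string Name;
};

class Arena {
  // Sole owner of every node. Growing the vector moves the unique_ptrs,
  // not the nodes they point to, so raw pointers handed out stay valid.
  std::vector<std::unique_ptr<Node>> Nodes;

public:
  // Forwarding references let constructor arguments pass through unchanged.
  template <typename... Args> Node *create(Args &&...A) {
    auto Owned = std::make_unique<Node>(std::forward<Args>(A)...);
    Node *Raw = Owned.get();
    Nodes.emplace_back(std::move(Owned));
    return Raw;
  }

  // Return the existing node with this name, or create one if none exists.
  Node *getOrCreate(const std::string &Name) {
    auto It = std::find_if(
        Nodes.begin(), Nodes.end(),
        [&](const std::unique_ptr<Node> &N) { return N->Name == Name; });
    if (It != Nodes.end())
      return It->get();
    return create(Name);
  }

  std::size_t size() const { return Nodes.size(); }
};
```

The trade-off is that nodes live until the arena is destroyed, which fits a schema that is built once and then serialized.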
diff --git a/clang-tools-extra/clang-tidy/ClangTidyOptions.cpp b/clang-tools-extra/clang-tidy/ClangTidyOptions.cpp
index b752a9beb0e34..b168e0dd28ddd 100644
--- a/clang-tools-extra/clang-tidy/ClangTidyOptions.cpp
+++ b/clang-tools-extra/clang-tidy/ClangTidyOptions.cpp
@@ -16,6 +16,7 @@
 #include "llvm/Support/ErrorOr.h"
 #include "llvm/Support/MemoryBufferRef.h"
 #include "llvm/Support/Path.h"
+#include "llvm/Support/YAMLGenerateSchema.h"
 #include "llvm/Support/YAMLTraits.h"
 #include <algorithm>
 #include <optional>
@@ -87,7 +88,7 @@ struct NOptionMap {
 template <>
 void yamlize(IO &IO, ClangTidyOptions::OptionMap &Val, bool,
              EmptyContext &Ctx) {
-  if (IO.outputting()) {
+  if (IO.getKind() == IOKind::Outputting) {
     // Ensure check options are sorted
     std::vector<std::pair<StringRef, StringRef>> SortedOptions;
     SortedOptions.reserve(Val.size());
@@ -108,7 +109,7 @@ void yamlize(IO &IO, ClangTidyOptions::OptionMap &Val, bool,
       IO.postflightKey(SaveInfo);
     }
     IO.endMapping();
-  } else {
+  } else if (IO.getKind() == IOKind::Inputting) {
     // We need custom logic here to support the old method of specifying check
     // options using a list of maps containing key and value keys.
     auto &I = reinterpret_cast<Input &>(IO);
@@ -128,6 +129,11 @@ void yamlize(IO &IO, ClangTidyOptions::OptionMap &Val, bool,
     } else {
       IO.setError("expected a sequence or map");
     }
+  } else {
+    MappingNormalization<NOptionMap, ClangTidyOptions::OptionMap> NOpts(IO,
+                                                                        Val);
+    EmptyContext Ctx;
+    yamlize(IO, NOpts->Options, true, Ctx);
   }
 }
 
@@ -182,7 +188,7 @@ struct ChecksVariant {
 };
 
 template <> void yamlize(IO &IO, ChecksVariant &Val, bool, EmptyContext &Ctx) {
-  if (!IO.outputting()) {
+  if (IO.getKind() == IOKind::Inputting) {
     // Special case for reading from YAML
     // Must support reading from both a string or a list
     auto &I = reinterpret_cast<Input &>(IO);
@@ -195,6 +201,9 @@ template <> void yamlize(IO &IO, ChecksVariant &Val, bool, EmptyContext &Ctx) {
     } else {
       IO.setError("expected string or sequence");
     }
+  } else if (IO.getKind() == IOKind::GeneratingSchema) {
+    Val.AsVector = std::vector<std::string>();
+    yamlize(IO, *Val.AsVector, true, Ctx);
   }
 }
 
@@ -541,6 +550,12 @@ parseConfiguration(llvm::MemoryBufferRef Config) {
   return Options;
 }
 
+void dumpConfigurationYAMLSchema(llvm::raw_fd_ostream &Stream) {
+  ClangTidyOptions Options;
+  llvm::yaml::GenerateSchema GS(Stream);
+  GS << Options;
+}
+
 static void diagHandlerImpl(const llvm::SMDiagnostic &Diag, void *Ctx) {
   (*reinterpret_cast<DiagCallback *>(Ctx))(Diag);
 }
diff --git a/clang-tools-extra/clang-tidy/ClangTidyOptions.h b/clang-tools-extra/clang-tidy/ClangTidyOptions.h
index 2aae92f1d9eb3..f0aa710c685a2 100644
--- a/clang-tools-extra/clang-tidy/ClangTidyOptions.h
+++ b/clang-tools-extra/clang-tidy/ClangTidyOptions.h
@@ -343,6 +343,9 @@ std::error_code parseLineFilter(llvm::StringRef LineFilter,
 llvm::ErrorOr<ClangTidyOptions>
 parseConfiguration(llvm::MemoryBufferRef Config);
 
+/// Dumps the configuration YAML schema, in JSON format, to \p Stream.
+void dumpConfigurationYAMLSchema(llvm::raw_fd_ostream &Stream);
+
 using DiagCallback = llvm::function_ref<void(const llvm::SMDiagnostic &)>;
 
 llvm::ErrorOr<ClangTidyOptions>
diff --git a/clang-tools-extra/clang-tidy/tool/ClangTidyMain.cpp b/clang-tools-extra/clang-tidy/tool/ClangTidyMain.cpp
index 64157f530b8c0..7c6fa7f5c40b9 100644
--- a/clang-tools-extra/clang-tidy/tool/ClangTidyMain.cpp
+++ b/clang-tools-extra/clang-tidy/tool/ClangTidyMain.cpp
@@ -355,6 +355,12 @@ see https://clang.llvm.org/extra/clang-tidy/QueryBasedCustomChecks.html.
                                               cl::init(false),
                                               cl::cat(ClangTidyCategory));
 
+static cl::opt<bool> DumpYAMLSchema("dump-yaml-schema", desc(R"(
+Dumps configuration YAML Schema in JSON format to
+stdout.
+)"),
+                                    cl::init(false),
+                                    cl::cat(ClangTidyCategory));
 namespace clang::tidy {
 
 static void printStats(const ClangTidyStats &Stats) {
@@ -684,6 +690,11 @@ int clangTidyMain(int argc, const char **argv) {
     return 0;
   }
 
+  if (DumpYAMLSchema) {
+    dumpConfigurationYAMLSchema(llvm::outs());
+    return 0;
+  }
+
   if (VerifyConfig) {
     std::vector<ClangTidyOptionsProvider::OptionsSource> RawOptions =
         OptionsProvider->getRawOptions(FileName);
diff --git a/llvm/include/llvm/Support/YAMLTraits.h b/llvm/include/llvm/Support/YAMLTraits.h
index 3d36f41ca1a04..f92e26e6424a1 100644
--- a/llvm/include/llvm/Support/YAMLTraits.h
+++ b/llvm/include/llvm/Support/YAMLTraits.h
@@ -145,6 +145,7 @@ enum class QuotingType { None, Single, Double };
 ///        return StringRef();
 ///      }
 ///      static QuotingType mustQuote(StringRef) { return QuotingType::Single; }
+///      static constexpr StringRef typeName = "string";
 ///    };
 template <typename T, typename Enable = void> struct ScalarTraits {
   // Must provide:
@@ -158,6 +159,9 @@ template <typename T, typename Enable = void> struct ScalarTraits {
   //
   // Function to determine if the value should be quoted.
   // static QuotingType mustQuote(StringRef);
+  //
+  // Optional, for GeneratingSchema:
+  // static constexpr StringRef typeName = "string";
 };
 
 /// This class should be specialized by type that requires custom conversion
@@ -175,6 +179,7 @@ template <typename T, typename Enable = void> struct ScalarTraits {
 ///        // return empty string on success, or error string
 ///        return StringRef();
 ///      }
+///      static constexpr StringRef typeName = "string";
 ///    };
 template <typename T> struct BlockScalarTraits {
   // Must provide:
@@ -189,6 +194,7 @@ template <typename T> struct BlockScalarTraits {
   // Optional:
   // static StringRef inputTag(T &Val, std::string Tag)
   // static void outputTag(const T &Val, raw_ostream &Out)
+  // static constexpr StringRef typeName = "string";
 };
 
 /// This class should be specialized by type that requires custom conversion
@@ -211,6 +217,7 @@ template <typename T> struct BlockScalarTraits {
 ///      static QuotingType mustQuote(const MyType &Value, StringRef) {
 ///        return QuotingType::Single;
 ///      }
+///      static constexpr StringRef typeName = "integer";
 ///    };
 template <typename T> struct TaggedScalarTraits {
   // Must provide:
@@ -226,6 +233,9 @@ template <typename T> struct TaggedScalarTraits {
   //
   // Function to determine if the value should be quoted.
   // static QuotingType mustQuote(const T &Value, StringRef Scalar);
+  //
+  // Optional:
+  // static constexpr StringRef typeName = "string";
 };
 
 /// This class should be specialized by any type that needs to be converted
@@ -442,6 +452,14 @@ template <class T> struct has_CustomMappingTraits {
       is_detected<check, CustomMappingTraits<T>>::value;
 };
 
+// Test if typeName is defined on type T.
+template <typename T> struct has_TypeNameTraits {
+  template <class U>
+  using check = std::is_same<decltype(&U::typeName), StringRef>;
+
+  static constexpr bool value = is_detected<check, T>::value;
+};
+
 // Test if flow is defined on type T.
 template <typename T> struct has_FlowTraits {
   template <class U> using check = decltype(&U::flow);
@@ -683,12 +701,19 @@ struct unvalidatedMappingTraits
                                 !has_MappingValidateTraits<T, Context>::value> {
 };
 
+enum class IOKind : uint8_t {
+  Outputting,
+  Inputting,
+  GeneratingSchema,
+};
+
 // Base class for Input and Output.
 class LLVM_ABI IO {
 public:
   IO(void *Ctxt = nullptr);
   virtual ~IO();
 
+  virtual IOKind getKind() const = 0;
   virtual bool outputting() const = 0;
 
   virtual unsigned beginSequence() = 0;
@@ -732,7 +757,8 @@ class LLVM_ABI IO {
   virtual void setAllowUnknownKeys(bool Allow);
 
   template <typename T> void enumCase(T &Val, StringRef Str, const T ConstVal) {
-    if (matchEnumScalar(Str, outputting() && Val == ConstVal)) {
+    if (matchEnumScalar(Str,
+                        getKind() == IOKind::Outputting && Val == ConstVal)) {
       Val = ConstVal;
     }
   }
@@ -740,7 +766,8 @@ class LLVM_ABI IO {
   // allow anonymous enum values to be used with LLVM_YAML_STRONG_TYPEDEF
   template <typename T>
   void enumCase(T &Val, Str...
[truncated]

@tgs-sc tgs-sc force-pushed the users/tgs-sc/dev-auto-generate-yaml-scheme branch from 7971639 to b7ac397 Compare October 21, 2025 13:00
@tgs-sc
Contributor Author

tgs-sc commented Oct 21, 2025

@carlosgalvezp, @boomanaiden154, @DavidSpickett, could you please take a look at this?

@vbvictor
Contributor

This PR consists of 3 main commits: commit that introduces new YamlIO GenerateSchema and adds only necessary changes to YAMLTraits.h to leave it in working state, commit that adds main changes to YAMLTraits.h that add capabilities such as type names in ScalarTraits and final commit that adds an option to clang-tidy with some simple changes such as changing YamlIO.outputting() -> YamlIO.getKind()

Could we make 3 commits into 3 separate PRs?

@tgs-sc
Contributor Author

tgs-sc commented Oct 21, 2025

Could we make 3 commits into 3 separate PRs?

Well, there is actually one open PR that is in fact the same as the first commit here (#133284). Unfortunately no one responded to me there, so @DavidSpickett and I decided in the RFC to open a PR against clang-tidy as an example of usage.

@vbvictor
Contributor

vbvictor commented Oct 21, 2025

Could we make 3 commits into 3 separate PRs?

This is hard to review in one go, and when the PR lands all the commits will be squashed. I particularly don't like that all the changes to the YAML traits would land as [clang-tidy] Introduced new option

open PR to clang-tidy as example of usage.

I can't speak to the usefulness of the clang-tidy part for now, but to proceed further it should have tests; could you add them, please? For now, I don't understand what it would look like to the end user.

@tgs-sc
Contributor Author

tgs-sc commented Oct 21, 2025

This is hard to review in one go, and when the PR lands all the commits will be squashed. I particularly don't like that all the changes to the YAML traits would land as [clang-tidy] Introduced new option

Well, I agree that this might be a little bit confusing, probably I should rename this PR to something like [llvm][clang-tidy] New infrastructure for --dump-yaml-schema

I can't speak to the usefulness of the clang-tidy part for now, but to proceed further it should have tests at least. For now, I don't understand what it would look like to the end user.

This can be useful for people who use an IDE. I attached some screenshots to the RFC topic. Basically, the only action needed to enable this support is to insert the schema obtained from the tool into the IDE's config file. We could probably add a unit test that compares the YAML schema emitted at runtime with a previously obtained one.

Contributor

@EugeneZelenko EugeneZelenko left a comment

Please update documentation and mention Release Notes.

bool GenerateSchema::mapTag(StringRef, bool) { return false; }

void GenerateSchema::beginMapping() {
auto *Top = getTopSchema();
Contributor

Please do not use auto unless the type is explicitly stated in the same statement or the variable is an iterator. Same in other places.
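
A minimal sketch of that guideline, for illustration only. `Schema`, `getTop`, and `describeTop` are hypothetical names invented for this example and are not code from the patch; `getTop` merely stands in for an accessor such as `getTopSchema()`.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical type, used only to illustrate the guideline.
struct Schema {
  std::string Title;
};

// Hypothetical stand-in for an accessor such as getTopSchema().
inline Schema *getTop(std::map<std::string, std::unique_ptr<Schema>> &Schemas) {
  // Iterator types are long and obvious from begin(), so `auto` is fine here.
  auto It = Schemas.begin();
  return It == Schemas.end() ? nullptr : It->second.get();
}

inline std::string
describeTop(std::map<std::string, std::unique_ptr<Schema>> &Schemas) {
  // The return type is not visible in this statement, so it is spelled out.
  Schema *Top = getTop(Schemas);
  // The type is named on the right-hand side, so `auto` hides nothing.
  auto Fallback = std::make_unique<Schema>();
  Fallback->Title = "untitled";
  return Top ? Top->Title : Fallback->Title;
}
```

With the explicit `Schema *Top`, a reader does not need to look up the accessor to know what is being held.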

Contributor Author

Addressed.

Please update documentation and mention Release Notes.

Do I need to mention only the new clang-tidy option, or also the changes in llvm/Support/YAMLTraits.h?

Contributor Author

@EugeneZelenko, ping. And could you please leave some more comments on PR #133284, as I have now split this one into three parts.

@DavidSpickett
Collaborator

Well, I agree that this might be a little bit confusing, probably I should rename this PR to something like [llvm][clang-tidy] New infrastructure for --dump-yaml-schema

To be clear: my suggestion to @tgs-sc was that they start with it all in one PR in case reviewers preferred to see the whole scope in one place. They are not necessarily trying to land it in this form.

I knew it was 99.9% likely you would want it split, but as a reviewer sometimes I find N split (sometimes stacked) PRs even more confusing to assess when a feature is just being proposed. Certainly for landing, I also like the changes split logically.

This is hard to review in one go, and when the PR lands all the commits will be squashed. I particularly don't like that all the changes to the YAML traits would land as [clang-tidy] Introduced new option

@vbvictor would you be ok with N PRs stacked? That way each one clearly has one set of "new" code, and the CI jobs can run correctly?

Then @tgs-sc you can focus this PR purely on the clang-tidy changes and include a brief pitch about its use in IDEs. I know you said some of that in the RFC, but not everyone will want to, or have time to, read that.

If you are making split/stacked PRs please leave a comment on each one saying what it depends on. This helps reviewers focus their attention. There are some tools that can help with stacked PRs, or you can do it by hand. Up to you.

https://llvm.org/docs/GitHub.html#stacked-pull-requests

Again, assuming @vbvictor thinks this is an appropriate way to present this.

@tgs-sc
Contributor Author

tgs-sc commented Oct 22, 2025

@vbvictor would you be ok with N PRs stacked?

So, I guess I should split it into stack<3> PRs. But as I understand it, I need commit access to do this:

Use user branches in llvm/llvm-project
Create user branches in the main repository, as described above. Then:
Open a pull request from users//feature_1 → main
Open another from users//feature_2 → users//feature_1
This approach allows GitHub to display clean, incremental diffs for each PR in the stack, making it much easier for reviewers to see what has changed at each step. Once feature_1 is merged, GitHub will automatically rebase and re-target your branch feature_2 to main. For more complex stacks, you can perform this step using the web interface.
This approach requires commit access.

Then @tgs-sc you can focus this PR purely on the clang-tidy changes and include a brief pitch about its use in IDEs.

So, would it be convenient for you if I attach this pitch to this PR?

@DavidSpickett
Collaborator

Perhaps it's not very clear from the docs but option 2 does not require commit access:

Two PRs with a dependency note

Create PR_1 for feature_1 and PR_2 for feature_2. In PR_2, include a note in the PR summary indicating that it depends on PR_1 (e.g., “Depends on #PR_1”).

To make review easier, make it clear which commits are part of the base PR and which are new, e.g. “The first N commits are from the base PR”. This helps reviewers focus only on the incremental changes.

Essentially you have branch 1 with commit A, branch 2 with commits A and B, branch 3 with commits A B and C. On each PR you make, add a comment noting what it depends on.

So, would it be convenient for you if I attach this pitch to this PR?

This may be directed to other reviewers, but regardless, putting the pitch in this PR was my intent when I said that. That is, the utility of adding this to clang-tidy and the benefits you think it will bring. In the others, you will justify choices related to the focus of that PR, meaning that here you can assume the underlying changes already exist.

Introduced a way to generate a schema for validating input YAML. This can be
useful for seeing the full structure of the input YAML file for different
LLVM-based tools that use the existing YAML parser, for example clang-format,
clang-tidy, etc. This commit can also be useful for yaml-language-server.
Currently, when creating a YAML schema, the type of every scalar is
'string', which may be a bit strange. For example, if you start typing a
number in an IDE, it will complain that a 'string' was expected. This
patch fixes that by introducing a new optional TypeNameTrait. It is
optional in order not to break backward compatibility.
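
The optional-trait idea described in the commit message can be sketched without any LLVM headers. This is only an illustration under stated assumptions, not the patch's implementation: `LegacyTraits`, `IntTraits`, `HasTypeName`, and `schemaTypeName` are hypothetical names, and `std::string_view` stands in for `llvm::StringRef`. The point is that traits written before `typeName` existed still compile and simply fall back to "string".

```cpp
#include <string_view>
#include <type_traits>

// Hypothetical, simplified stand-ins for scalar trait specializations.
struct LegacyTraits {}; // written before typeName existed
struct IntTraits {
  static constexpr std::string_view typeName = "integer";
};

// Detects whether Traits declares a static member named typeName.
template <typename Traits, typename = void>
struct HasTypeName : std::false_type {};
template <typename Traits>
struct HasTypeName<Traits, std::void_t<decltype(Traits::typeName)>>
    : std::true_type {};

// Returns the declared type name, or "string" when the trait is absent.
template <typename Traits> constexpr std::string_view schemaTypeName() {
  if constexpr (HasTypeName<Traits>::value)
    return Traits::typeName;
  else
    return "string";
}

// Existing traits keep working; opting in changes only the reported type.
static_assert(schemaTypeName<LegacyTraits>() == "string");
static_assert(schemaTypeName<IntTraits>() == "integer");
```

Because detection happens at compile time, adding the member to a trait is purely opt-in and requires no change to traits that do not define it.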
@tgs-sc tgs-sc force-pushed the users/tgs-sc/dev-auto-generate-yaml-scheme branch from b7ac397 to 263582f Compare October 22, 2025 16:37
@tgs-sc
Contributor Author

tgs-sc commented Oct 23, 2025

This is a small pitch:

A quick introduction to using YAML schemas from the LLVM tool clang-tidy

There are some useful LLVM tools that everyone has used in one way or another, such as clang-format, clang-tidy, and others. These tools usually require some configuration input, which is often provided in YAML format. Similarly, at our company, an LLVM tool is being developed with a very complex input configuration. Usually, this configuration is either copied from project to project or checked against documentation. In this topic, I propose adding an option to clang-tidy that dumps the skeleton of this configuration in YAML schema format. In addition to allowing users to visualize this skeleton, if they use an IDE, they will receive hints when setting up this configuration, since modern IDEs have such support.

Example

Let's consider an example of setting up the input configuration for clang-tidy in a certain project using VSCode IDE.

Step 1

Assume you have a certain project that you want to run through clang-tidy.

step1

Step 2

In VSCode, there may be a .vscode folder where local IDE settings for each project are stored. Using the redhat.vscode-yaml extension, you can add the "yaml.schemas" setting, which associates a specific YAML schema (given as a URL, which can be local or remote) with a set of files matched by glob patterns. For simplicity, let's assume we placed our YAML schema directly in the .vscode folder.

step2

Step 3

Next, we need to obtain this schema. When this patch is merged, this can be done using the --dump-yaml-schema option.

step3

After that, copy it to the clipboard.

Step 4

Now, copy it to a file, and optionally add fields for the "title" and the "$schema" format that this schema itself conforms to.

step4

Step 5

So, now everything is ready, and you can enter the input configuration. To see the current suggestions from the IDE, press Ctrl+Space. At the top of the input file, you will see the name of the specified YAML schema (if added).

step5

This is how the proposed functionality is expected to be used.

@tgs-sc
Contributor Author

tgs-sc commented Oct 23, 2025

Essentially you have branch 1 with commit A, branch 2 with commits A and B, branch 3 with commits A B and C. On each PR you make, add a comment noting what it depends on.

@DavidSpickett, I have created these 3 PRs. Here is the right order:
#133284
#164826
#164412

Now that YAML Generate Schema has been added, it can be used to dump
clang-tidy's current YAML schema.
@tgs-sc tgs-sc force-pushed the users/tgs-sc/dev-auto-generate-yaml-scheme branch from 263582f to 9617518 Compare October 24, 2025 13:23