New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✨ [Sema, Lex, Parse] Preprocessor embed in C and C++ (and Obj-C and Obj-C++ by-proxy) #68620
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-clang-static-analyzer-1 @llvm/pr-subscribers-llvm-support ChangesThis pull request implements the entirety of the now-accepted N3017 - Preprocessor Embed and its sister C++ paper p1967. It implements everything in the specification, and includes an implementation that drastically improves the time it takes to embed data in specific scenarios (the initialization of character type arrays). The mechanisms used to do this are used under the "as-if" rule, and in general when the system cannot detect it is initializing an array object in a variable declaration, will simply expand the list out to a sequence of numbers. There are likely places where the Images are a dry run of a recently built clang and LLVM on Release mode (not RelWithDebInfo) on Windows, under the following properties: OS Name: Microsoft Windows 10 Pro All of the added tests pass under the above machine properties. I have no intention of following up on this PR. I am too tired to carry it to fruition. Pick this apart, take from it what you want, or reimplement it entirely. (I will be unsubscribing from this immediately after posting.) Good luck ✌! Patch is 1.33 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/68620.diff 82 Files Affected:
diff --git a/clang/CMakeLists.txt b/clang/CMakeLists.txt
index 9b52c58be41e7f7..1b88905da3b8597 100644
--- a/clang/CMakeLists.txt
+++ b/clang/CMakeLists.txt
@@ -300,6 +300,7 @@ configure_file(
${CMAKE_CURRENT_BINARY_DIR}/include/clang/Basic/Version.inc)
# Add appropriate flags for GCC
+option(CLANG_ENABLE_PEDANTIC "Compile with pedantic enabled." ON)
if (LLVM_COMPILER_IS_GCC_COMPATIBLE)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fno-common -Woverloaded-virtual")
if (NOT "${CMAKE_CXX_COMPILER_ID}" MATCHES "Clang")
@@ -307,7 +308,7 @@ if (LLVM_COMPILER_IS_GCC_COMPATIBLE)
endif ()
# Enable -pedantic for Clang even if it's not enabled for LLVM.
- if (NOT LLVM_ENABLE_PEDANTIC)
+ if (NOT LLVM_ENABLE_PEDANTIC AND CLANG_ENABLE_PEDANTIC)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -pedantic -Wno-long-long")
endif ()
diff --git a/clang/include/clang/AST/Expr.h b/clang/include/clang/AST/Expr.h
index b69c616b0090365..9303307fd6a8af5 100644
--- a/clang/include/clang/AST/Expr.h
+++ b/clang/include/clang/AST/Expr.h
@@ -4805,6 +4805,58 @@ class SourceLocExpr final : public Expr {
friend class ASTStmtReader;
};
+/// Represents a function call to __builtin_pp_embed().
+class PPEmbedExpr final : public Expr {
+ SourceLocation BuiltinLoc, RParenLoc;
+ DeclContext *ParentContext;
+ StringLiteral *Filename;
+ StringLiteral *BinaryData;
+
+public:
+ enum Action {
+ NotFound,
+ FoundOne,
+ Expanded,
+ };
+
+ PPEmbedExpr(const ASTContext &Ctx, QualType ResultTy, StringLiteral* Filename, StringLiteral* BinaryData,
+ SourceLocation BLoc, SourceLocation RParenLoc,
+ DeclContext *Context);
+
+ /// Build an empty call expression.
+ explicit PPEmbedExpr(EmptyShell Empty)
+ : Expr(SourceLocExprClass, Empty) {}
+
+ /// If the PPEmbedExpr has been resolved return the subexpression
+ /// representing the resolved value. Otherwise return null.
+ const DeclContext *getParentContext() const { return ParentContext; }
+ DeclContext *getParentContext() { return ParentContext; }
+
+ SourceLocation getLocation() const { return BuiltinLoc; }
+ SourceLocation getBeginLoc() const { return BuiltinLoc; }
+ SourceLocation getEndLoc() const { return RParenLoc; }
+
+ StringLiteral *getFilenameStringLiteral() const { return Filename; }
+ StringLiteral *getDataStringLiteral() const { return BinaryData; }
+
+ size_t getDataElementCount(ASTContext &Context) const;
+
+ child_range children() {
+ return child_range(child_iterator(), child_iterator());
+ }
+
+ const_child_range children() const {
+ return const_child_range(child_iterator(), child_iterator());
+ }
+
+ static bool classof(const Stmt *T) {
+ return T->getStmtClass() == PPEmbedExprClass;
+ }
+
+private:
+ friend class ASTStmtReader;
+};
+
/// Describes an C or C++ initializer list.
///
/// InitListExpr describes an initializer list, which can be used to
diff --git a/clang/include/clang/AST/RecursiveASTVisitor.h b/clang/include/clang/AST/RecursiveASTVisitor.h
index 3dd23eb38eeabfc..6b7211bb0a0d3f1 100644
--- a/clang/include/clang/AST/RecursiveASTVisitor.h
+++ b/clang/include/clang/AST/RecursiveASTVisitor.h
@@ -2809,6 +2809,7 @@ DEF_TRAVERSE_STMT(ShuffleVectorExpr, {})
DEF_TRAVERSE_STMT(ConvertVectorExpr, {})
DEF_TRAVERSE_STMT(StmtExpr, {})
DEF_TRAVERSE_STMT(SourceLocExpr, {})
+DEF_TRAVERSE_STMT(PPEmbedExpr, {})
DEF_TRAVERSE_STMT(UnresolvedLookupExpr, {
TRY_TO(TraverseNestedNameSpecifierLoc(S->getQualifierLoc()));
diff --git a/clang/include/clang/Basic/Builtins.def b/clang/include/clang/Basic/Builtins.def
index 6ea8484606cfd5d..0dfc6456daf059a 100644
--- a/clang/include/clang/Basic/Builtins.def
+++ b/clang/include/clang/Basic/Builtins.def
@@ -1766,6 +1766,9 @@ BUILTIN(__builtin_ms_va_copy, "vc*&c*&", "n")
// Arithmetic Fence: to prevent FP reordering and reassociation optimizations
LANGBUILTIN(__arithmetic_fence, "v.", "tE", ALL_LANGUAGES)
+// preprocessor embed builtin
+LANGBUILTIN(__builtin_pp_embed, "v.", "tE", ALL_LANGUAGES)
+
#undef BUILTIN
#undef LIBBUILTIN
#undef LANGBUILTIN
diff --git a/clang/include/clang/Basic/DiagnosticCommonKinds.td b/clang/include/clang/Basic/DiagnosticCommonKinds.td
index f2df283c74829f6..4df86e35eebde38 100644
--- a/clang/include/clang/Basic/DiagnosticCommonKinds.td
+++ b/clang/include/clang/Basic/DiagnosticCommonKinds.td
@@ -59,6 +59,9 @@ def err_expected_string_literal : Error<"expected string literal "
"'external_source_symbol' attribute|"
"as argument of '%1' attribute}0">;
+def err_builtin_pp_embed_invalid_argument : Error<
+ "invalid argument to '__builtin_pp_embed': %0">;
+
def err_invalid_string_udl : Error<
"string literal with user-defined suffix cannot be used here">;
def err_invalid_character_udl : Error<
@@ -80,6 +83,9 @@ def err_expected : Error<"expected %0">;
def err_expected_either : Error<"expected %0 or %1">;
def err_expected_after : Error<"expected %1 after %0">;
+def err_builtin_pp_embed_invalid_location : Error<
+ "'__builtin_pp_embed' in invalid location: %0%select{|%2}1">;
+
def err_param_redefinition : Error<"redefinition of parameter %0">;
def warn_method_param_redefinition : Warning<"redefinition of method parameter %0">;
def warn_method_param_declaration : Warning<"redeclaration of method parameter %0">,
diff --git a/clang/include/clang/Basic/DiagnosticGroups.td b/clang/include/clang/Basic/DiagnosticGroups.td
index 0b09c002191848a..7ebea56891d4654 100644
--- a/clang/include/clang/Basic/DiagnosticGroups.td
+++ b/clang/include/clang/Basic/DiagnosticGroups.td
@@ -708,6 +708,12 @@ def ReservedIdAsMacro : DiagGroup<"reserved-macro-identifier">;
def ReservedIdAsMacroAlias : DiagGroup<"reserved-id-macro", [ReservedIdAsMacro]>;
def RestrictExpansionMacro : DiagGroup<"restrict-expansion">;
def FinalMacro : DiagGroup<"final-macro">;
+// Warnings about unknown preprocessor parameters (e.g. `#embed` and extensions)
+def UnsupportedDirective : DiagGroup<"unsupported-directive">;
+def UnknownDirectiveParameters : DiagGroup<"unknown-directive-parameters">;
+def IgnoredDirectiveParameters : DiagGroup<"ignored-directive-parameters">;
+def DirectiveParameters : DiagGroup<"directive-parameters",
+ [UnknownDirectiveParameters, IgnoredDirectiveParameters]>;
// Just silence warnings about -Wstrict-aliasing for now.
def : DiagGroup<"strict-aliasing=0">;
@@ -715,6 +721,7 @@ def : DiagGroup<"strict-aliasing=1">;
def : DiagGroup<"strict-aliasing=2">;
def : DiagGroup<"strict-aliasing">;
+
// Just silence warnings about -Wstrict-overflow for now.
def : DiagGroup<"strict-overflow=0">;
def : DiagGroup<"strict-overflow=1">;
diff --git a/clang/include/clang/Basic/DiagnosticLexKinds.td b/clang/include/clang/Basic/DiagnosticLexKinds.td
index 940cca67368492f..4490f40806b0345 100644
--- a/clang/include/clang/Basic/DiagnosticLexKinds.td
+++ b/clang/include/clang/Basic/DiagnosticLexKinds.td
@@ -422,6 +422,22 @@ def warn_cxx23_compat_warning_directive : Warning<
def warn_c23_compat_warning_directive : Warning<
"#warning is incompatible with C standards before C23">,
InGroup<CPre23Compat>, DefaultIgnore;
+def warn_c23_pp_embed : Warning<
+ "'__has_embed' is a C23 extension">,
+ InGroup<CPre23Compat>,
+ DefaultIgnore;
+def warn_c23_pp_has_embed : Warning<
+ "'__has_embed' is a C23 extension">,
+ InGroup<CPre23Compat>,
+ DefaultIgnore;
+def warn_cxx26_pp_embed : Warning<
+ "'__has_embed' is a C++26 extension">,
+ InGroup<CXXPre26Compat>,
+ DefaultIgnore;
+def warn_cxx26_pp_has_embed : Warning<
+ "'__has_embed' is a C++26 extension">,
+ InGroup<CXXPre26Compat>,
+ DefaultIgnore;
def ext_pp_extra_tokens_at_eol : ExtWarn<
"extra tokens at end of #%0 directive">, InGroup<ExtraTokens>;
@@ -483,7 +499,13 @@ def ext_pp_gnu_line_directive : Extension<
def err_pp_invalid_directive : Error<
"invalid preprocessing directive%select{|, did you mean '#%1'?}0">;
def warn_pp_invalid_directive : Warning<
- err_pp_invalid_directive.Summary>, InGroup<DiagGroup<"unknown-directives">>;
+ err_pp_invalid_directive.Summary>,
+ InGroup<UnsupportedDirective>;
+def warn_pp_unknown_parameter_ignored : Warning<
+ "unknown%select{ | embed}0 preprocessor parameter '%1' ignored">,
+ InGroup<UnknownDirectiveParameters>;
+def err_pp_unsupported_directive : Error<
+ "unsupported%select{ | embed}0 directive: %1">;
def err_pp_directive_required : Error<
"%0 must be used within a preprocessing directive">;
def err_pp_file_not_found : Error<"'%0' file not found">, DefaultFatal;
diff --git a/clang/include/clang/Basic/FileManager.h b/clang/include/clang/Basic/FileManager.h
index 56cb093dd8c376f..c757f8775b425e9 100644
--- a/clang/include/clang/Basic/FileManager.h
+++ b/clang/include/clang/Basic/FileManager.h
@@ -276,11 +276,13 @@ class FileManager : public RefCountedBase<FileManager> {
/// MemoryBuffer if successful, otherwise returning null.
llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>>
getBufferForFile(FileEntryRef Entry, bool isVolatile = false,
- bool RequiresNullTerminator = true);
+ bool RequiresNullTerminator = true,
+ std::optional<int64_t> MaybeLimit = std::nullopt);
llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>>
getBufferForFile(StringRef Filename, bool isVolatile = false,
- bool RequiresNullTerminator = true) {
- return getBufferForFileImpl(Filename, /*FileSize=*/-1, isVolatile,
+ bool RequiresNullTerminator = true,
+ std::optional<int64_t> MaybeLimit = std::nullopt) {
+ return getBufferForFileImpl(Filename, /*FileSize=*/(MaybeLimit ? *MaybeLimit : -1), isVolatile,
RequiresNullTerminator);
}
diff --git a/clang/include/clang/Basic/StmtNodes.td b/clang/include/clang/Basic/StmtNodes.td
index cec301dfca2817b..e3be997dd1c86e0 100644
--- a/clang/include/clang/Basic/StmtNodes.td
+++ b/clang/include/clang/Basic/StmtNodes.td
@@ -203,6 +203,7 @@ def OpaqueValueExpr : StmtNode<Expr>;
def TypoExpr : StmtNode<Expr>;
def RecoveryExpr : StmtNode<Expr>;
def BuiltinBitCastExpr : StmtNode<ExplicitCastExpr>;
+def PPEmbedExpr : StmtNode<Expr>;
// Microsoft Extensions.
def MSPropertyRefExpr : StmtNode<Expr>;
diff --git a/clang/include/clang/Basic/TokenKinds.def b/clang/include/clang/Basic/TokenKinds.def
index 94db56a9fd5d78c..167bd614efe7bd9 100644
--- a/clang/include/clang/Basic/TokenKinds.def
+++ b/clang/include/clang/Basic/TokenKinds.def
@@ -126,6 +126,9 @@ PPKEYWORD(error)
// C99 6.10.6 - Pragma Directive.
PPKEYWORD(pragma)
+// C23 & C++26 #embed
+PPKEYWORD(embed)
+
// GNU Extensions.
PPKEYWORD(import)
PPKEYWORD(include_next)
@@ -751,6 +754,7 @@ ALIAS("__char32_t" , char32_t , KEYCXX)
KEYWORD(__builtin_bit_cast , KEYALL)
KEYWORD(__builtin_available , KEYALL)
KEYWORD(__builtin_sycl_unique_stable_name, KEYSYCL)
+KEYWORD(__builtin_pp_embed , KEYALL)
// Keywords defined by Attr.td.
#ifndef KEYWORD_ATTRIBUTE
@@ -986,6 +990,7 @@ ANNOTATION(repl_input_end)
#undef CXX11_KEYWORD
#undef KEYWORD
#undef PUNCTUATOR
+#undef BUILTINOK
#undef TOK
#undef C99_KEYWORD
#undef C23_KEYWORD
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 5415b18d3f406df..bfc4b15d5411cde 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -114,6 +114,11 @@ def IncludePath_Group : OptionGroup<"<I/i group>">, Group<Preprocessor_Group>,
DocBrief<[{
Flags controlling how ``#include``\s are resolved to files.}]>;
+def EmbedPath_Group : OptionGroup<"<Embed group>">, Group<Preprocessor_Group>,
+ DocName<"Embed path management">,
+ DocBrief<[{
+Flags controlling how ``#embed``\s and similar are resolved to files.}]>;
+
def I_Group : OptionGroup<"<I group>">, Group<IncludePath_Group>, DocFlatten;
def i_Group : OptionGroup<"<i group>">, Group<IncludePath_Group>, DocFlatten;
def clang_i_Group : OptionGroup<"<clang i group>">, Group<i_Group>, DocFlatten;
@@ -816,6 +821,14 @@ will be ignored}]>;
def L : JoinedOrSeparate<["-"], "L">, Flags<[RenderJoined]>, Group<Link_Group>,
Visibility<[ClangOption, FlangOption]>,
MetaVarName<"<dir>">, HelpText<"Add directory to library search path">;
+def embed_dir : JoinedOrSeparate<["-"], "embed-dir">,
+ Flags<[RenderJoined]>, Group<EmbedPath_Group>,
+ Visibility<[ClangOption, CC1Option, CC1AsOption, FlangOption, FC1Option]>,
+ MetaVarName<"<dir>">, HelpText<"Add directory to embed search path">;
+def embed_dir_EQ : JoinedOrSeparate<["-"], "embed-dir=">,
+ Flags<[RenderJoined]>, Group<EmbedPath_Group>,
+ Visibility<[ClangOption, CC1Option, CC1AsOption, FlangOption, FC1Option]>,
+ MetaVarName<"<dir>">, HelpText<"Add directory to embed search path">;
def MD : Flag<["-"], "MD">, Group<M_Group>,
HelpText<"Write a depfile containing user and system headers">;
def MMD : Flag<["-"], "MMD">, Group<M_Group>,
@@ -1353,6 +1366,9 @@ def dD : Flag<["-"], "dD">, Group<d_Group>, Visibility<[ClangOption, CC1Option]>
def dI : Flag<["-"], "dI">, Group<d_Group>, Visibility<[ClangOption, CC1Option]>,
HelpText<"Print include directives in -E mode in addition to normal output">,
MarshallingInfoFlag<PreprocessorOutputOpts<"ShowIncludeDirectives">>;
+def dE : Flag<["-"], "dE">, Group<d_Group>, Visibility<[ClangOption, CC1Option]>,
+ HelpText<"Print embed directives in -E mode in addition to normal output">,
+ MarshallingInfoFlag<PreprocessorOutputOpts<"ShowEmbedDirectives">>;
def dM : Flag<["-"], "dM">, Group<d_Group>, Visibility<[ClangOption, CC1Option]>,
HelpText<"Print macro definitions in -E mode instead of normal output">;
def dead__strip : Flag<["-"], "dead_strip">;
diff --git a/clang/include/clang/Frontend/PreprocessorOutputOptions.h b/clang/include/clang/Frontend/PreprocessorOutputOptions.h
index db2ec9f2ae20698..3e36db3f8ce46ea 100644
--- a/clang/include/clang/Frontend/PreprocessorOutputOptions.h
+++ b/clang/include/clang/Frontend/PreprocessorOutputOptions.h
@@ -22,6 +22,7 @@ class PreprocessorOutputOptions {
unsigned ShowMacroComments : 1; ///< Show comments, even in macros.
unsigned ShowMacros : 1; ///< Print macro definitions.
unsigned ShowIncludeDirectives : 1; ///< Print includes, imports etc. within preprocessed output.
+ unsigned ShowEmbedDirectives : 1; ///< Print embeds, etc. within preprocessed output.
unsigned RewriteIncludes : 1; ///< Preprocess include directives only.
unsigned RewriteImports : 1; ///< Include contents of transitively-imported modules.
unsigned MinimizeWhitespace : 1; ///< Ignore whitespace from input.
@@ -37,6 +38,7 @@ class PreprocessorOutputOptions {
ShowMacroComments = 0;
ShowMacros = 0;
ShowIncludeDirectives = 0;
+ ShowEmbedDirectives = 0;
RewriteIncludes = 0;
RewriteImports = 0;
MinimizeWhitespace = 0;
diff --git a/clang/include/clang/Lex/PPCallbacks.h b/clang/include/clang/Lex/PPCallbacks.h
index 94f96cf9c512541..7028c81dbc84aac 100644
--- a/clang/include/clang/Lex/PPCallbacks.h
+++ b/clang/include/clang/Lex/PPCallbacks.h
@@ -22,11 +22,11 @@
#include "llvm/ADT/StringRef.h"
namespace clang {
- class Token;
- class IdentifierInfo;
- class MacroDefinition;
- class MacroDirective;
- class MacroArgs;
+class Token;
+class IdentifierInfo;
+class MacroDefinition;
+class MacroDirective;
+class MacroArgs;
/// This interface provides a way to observe the actions of the
/// preprocessor as it does its thing.
@@ -36,9 +36,7 @@ class PPCallbacks {
public:
virtual ~PPCallbacks();
- enum FileChangeReason {
- EnterFile, ExitFile, SystemHeaderPragma, RenameFile
- };
+ enum FileChangeReason { EnterFile, ExitFile, SystemHeaderPragma, RenameFile };
/// Callback invoked whenever a source file is entered or exited.
///
@@ -47,8 +45,7 @@ class PPCallbacks {
/// the file before the new one entered for \p Reason EnterFile.
virtual void FileChanged(SourceLocation Loc, FileChangeReason Reason,
SrcMgr::CharacteristicKind FileType,
- FileID PrevFID = FileID()) {
- }
+ FileID PrevFID = FileID()) {}
enum class LexedFileChangeReason { EnterFile, ExitFile };
@@ -83,6 +80,47 @@ class PPCallbacks {
const Token &FilenameTok,
SrcMgr::CharacteristicKind FileType) {}
+ /// Callback invoked whenever the preprocessor cannot find a file for an
+ /// embed directive.
+ ///
+ /// \param FileName The name of the file being included, as written in the
+ /// source code.
+ ///
+ /// \returns true to indicate that the preprocessor should skip this file
+ /// and not issue any diagnostic.
+ virtual bool EmbedFileNotFound(StringRef FileName) { return false; }
+
+ /// Callback invoked whenever an embed directive has been processed,
+ /// regardless of whether the embed will actually find a file.
+ ///
+ /// \param HashLoc The location of the '#' that starts the embed directive.
+ ///
+ /// \param FileName The name of the file being included, as written in the
+ /// source code.
+ ///
+ /// \param IsAngled Whether the file name was enclosed in angle brackets;
+ /// otherwise, it was enclosed in quotes.
+ ///
+ /// \param FilenameRange The character range of the quotes or angle brackets
+ /// for the written file name.
+ ///
+ /// \param ParametersRange The character range of the embed parameters. An
+ /// empty range if there were no parameters.
+ ///
+ /// \param File The actual file that may be included by this embed directive.
+ ///
+ /// \param SearchPath Contains the search path which was used to find the file
+ /// in the file system. If the file was found via an absolute path,
+ /// SearchPath will be empty.
+ ///
+ /// \param RelativePath The path relative to SearchPath, at which the resource
+ /// file was found. This is equal to FileName.
+ virtual void EmbedDirective(SourceLocation HashLoc, StringRef FileName,
+ bool IsAngled, CharSourceRange FilenameRange,
+ CharSourceRange ParametersRange,
+ OptionalFileEntryRef File, StringRef SearchPath,
+ StringRef RelativePath) {}
+
/// Callback invoked whenever the preprocessor cannot find a file for an
/// inclusion directive.
///
@@ -151,7 +189,7 @@ class PPCallbacks {
/// \param ForPragma If entering from pragma directive.
///
virtual void EnteredSubmodule(Module *M, SourceLocation ImportLoc,
- bool ForPragma) { }
+ bool ForPragma) {}
/// Callback invoked whenever a submodule was left.
///
@@ -162,7 +200,7 @@ class PPCallbacks {
/// \param ForPragma If entering from pragma directive.
///
virtual void LeftSubmodule(Module *M, SourceLocation ImportLoc,
- bool ForPragma) { }
+ bool ForPragma) {}
/// Callback invoked whenever there was an explicit module-import
/// syntax.
@@ -174,49 +212,40 @@ class PPCallbacks {
///
/// \param Imported The imported module; can be null if importing failed.
///
- virtual void moduleImport(SourceLocation ImportLoc,
- ModuleIdPath Path,
- const Module *Imported) {
- }
+ virtual void moduleImport(SourceLocation ImportLoc, ModuleIdPath Path,
+ const Module *Imported) {}
/// Callback invoked when the end of the main file is reached.
///
/// No subsequent callbacks will be made.
- virtual void EndOfMainFile() {
- }
+ virtual void EndOfMainFile() {}
/// Callback invoked when a \#ident or \#sccs directive is read.
/// \param Loc The location of the directive.
/// \param str The text of the directive.
///
- virtual void Ident(SourceLocation Loc, StringRef str) {
- }
+ virtual void Ident(SourceLocation Loc, StringRef str) {}
/// Callback invoked when start reading any pragma directive.
virtual void PragmaDirective(SourceLocation Loc,
- PragmaIntroducerKind Introducer) {
- }
+ PragmaIntroducerKind Introducer) {}
/// Callback invoked when a \#p...
[truncated]
|
You can test this locally with the following command:git-clang-format --diff 96b2e35a58819eb2fbe1821650e35a1f0e085bd7 8629e19e7b227d95e676ccee80f9afb2d8922bdf -- clang/include/clang/Lex/PPDirectiveParameter.h clang/include/clang/Lex/PPEmbedParameters.h clang/test/C/C2x/Inputs/boop.h clang/test/C/C2x/n3017.c clang/test/Preprocessor/embed___has_embed.c clang/test/Preprocessor/embed___has_embed_parsing_errors.c clang/test/Preprocessor/embed___has_embed_supported.c clang/test/Preprocessor/embed_art.c clang/test/Preprocessor/embed_codegen.cpp clang/test/Preprocessor/embed_constexpr.cpp clang/test/Preprocessor/embed_dependencies.c clang/test/Preprocessor/embed_ext_compat_diags.c clang/test/Preprocessor/embed_feature_test.cpp clang/test/Preprocessor/embed_file_not_found_chevron.c clang/test/Preprocessor/embed_file_not_found_quote.c clang/test/Preprocessor/embed_init.c clang/test/Preprocessor/embed_parameter_if_empty.c clang/test/Preprocessor/embed_parameter_limit.c clang/test/Preprocessor/embed_parameter_offset.c clang/test/Preprocessor/embed_parameter_prefix.c clang/test/Preprocessor/embed_parameter_suffix.c clang/test/Preprocessor/embed_parameter_unrecognized.c clang/test/Preprocessor/embed_parsing_errors.c clang/test/Preprocessor/embed_path_chevron.c clang/test/Preprocessor/embed_path_quote.c clang/test/Preprocessor/embed_preprocess_to_file.c clang/test/Preprocessor/embed_single_entity.c clang/test/Preprocessor/embed_weird.cpp clang-tools-extra/test/clang-tidy/checkers/modernize/loop-convert-extra.cpp clang-tools-extra/test/pp-trace/pp-trace-macro.cpp clang/include/clang/AST/Expr.h clang/include/clang/AST/RecursiveASTVisitor.h clang/include/clang/AST/TextNodeDumper.h clang/include/clang/Basic/FileManager.h clang/include/clang/Frontend/PreprocessorOutputOptions.h clang/include/clang/Lex/PPCallbacks.h clang/include/clang/Lex/Preprocessor.h clang/include/clang/Lex/PreprocessorOptions.h clang/include/clang/Parse/Parser.h clang/include/clang/Sema/Sema.h clang/include/clang/Serialization/ASTBitCodes.h clang/lib/AST/Expr.cpp clang/lib/AST/ExprClassification.cpp clang/lib/AST/ExprConstant.cpp clang/lib/AST/Interp/ByteCodeExprGen.cpp clang/lib/AST/Interp/ByteCodeExprGen.h clang/lib/AST/ItaniumMangle.cpp clang/lib/AST/StmtPrinter.cpp clang/lib/AST/StmtProfile.cpp clang/lib/AST/TextNodeDumper.cpp clang/lib/Basic/FileManager.cpp clang/lib/Basic/IdentifierTable.cpp clang/lib/CodeGen/CGExprAgg.cpp clang/lib/CodeGen/CGExprConstant.cpp clang/lib/CodeGen/CGExprScalar.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInvocation.cpp clang/lib/Frontend/DependencyFile.cpp clang/lib/Frontend/DependencyGraph.cpp clang/lib/Frontend/InitPreprocessor.cpp clang/lib/Frontend/PrintPreprocessedOutput.cpp clang/lib/Lex/PPDirectives.cpp clang/lib/Lex/PPExpressions.cpp clang/lib/Lex/PPMacroExpansion.cpp clang/lib/Parse/ParseExpr.cpp clang/lib/Parse/ParseInit.cpp clang/lib/Parse/ParseTemplate.cpp clang/lib/Sema/SemaExceptionSpec.cpp clang/lib/Sema/SemaExpr.cpp clang/lib/Sema/SemaInit.cpp clang/lib/Sema/TreeTransform.h clang/lib/Serialization/ASTReaderStmt.cpp clang/lib/Serialization/ASTWriterStmt.cpp clang/lib/StaticAnalyzer/Core/ExprEngine.cpp clang/test/Preprocessor/init-aarch64.c clang/test/Preprocessor/init.c clang/tools/libclang/CXCursor.cpp View the diff from clang-format here.diff --git a/clang/include/clang/AST/Expr.h b/clang/include/clang/AST/Expr.h
index 0bd2201978..45fb5bcc0d 100644
--- a/clang/include/clang/AST/Expr.h
+++ b/clang/include/clang/AST/Expr.h
@@ -4857,9 +4857,7 @@ public:
EmbedDataStorage *getData() const { return Data; }
unsigned getStartingElementPos() const { return Begin; }
- size_t getDataElementCount() const {
- return NumOfElements;
- }
+ size_t getDataElementCount() const { return NumOfElements; }
template <bool Const>
class ChildElementIter
@@ -4937,9 +4935,7 @@ public:
return T->getStmtClass() == EmbedExprClass;
}
- ChildElementIter<false> begin() {
- return ChildElementIter<false>(this);
- }
+ ChildElementIter<false> begin() { return ChildElementIter<false>(this); }
ChildElementIter<true> begin() const {
return ChildElementIter<true>(const_cast<EmbedExpr *>(this));
diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index 882ed68fd9..3720345d53 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -11140,8 +11140,7 @@ bool ArrayExprEvaluator::VisitCXXParenListOrInitListExpr(
NumEltsToInit = NumElts;
} else {
for (auto *Init : Args) {
- if (auto *EmbedS =
- dyn_cast<EmbedExpr>(Init->IgnoreParenImpCasts())) {
+ if (auto *EmbedS = dyn_cast<EmbedExpr>(Init->IgnoreParenImpCasts())) {
NumEltsToInit += EmbedS->getDataElementCount() - 1;
}
}
@@ -11189,8 +11188,7 @@ bool ArrayExprEvaluator::VisitCXXParenListOrInitListExpr(
const Expr *Init = Index < Args.size() ? Args[Index] : ArrayFiller;
if (ArrayIndex >= NumEltsToInit)
break;
- if (auto *EmbedS =
- dyn_cast<EmbedExpr>(Init->IgnoreParenImpCasts())) {
+ if (auto *EmbedS = dyn_cast<EmbedExpr>(Init->IgnoreParenImpCasts())) {
if (!EmbedS->doForEachDataElement(Eval, ArrayIndex))
return false;
} else {
diff --git a/clang/lib/AST/TextNodeDumper.cpp b/clang/lib/AST/TextNodeDumper.cpp
index 2ddfd8d7c1..751c9a397c 100644
--- a/clang/lib/AST/TextNodeDumper.cpp
+++ b/clang/lib/AST/TextNodeDumper.cpp
@@ -2721,5 +2721,5 @@ void TextNodeDumper::VisitOpenACCConstructStmt(const OpenACCConstructStmt *S) {
void TextNodeDumper::VisitEmbedExpr(const EmbedExpr *S) {
AddChild("begin", [=] { OS << S->getStartingElementPos(); });
AddChild("number of elements", [=] { OS << S->getDataElementCount(); });
- //AddChild("embed", [=] { Visit(S->getEmbed()); });
+ // AddChild("embed", [=] { Visit(S->getEmbed()); });
}
diff --git a/clang/lib/CodeGen/CGExprAgg.cpp b/clang/lib/CodeGen/CGExprAgg.cpp
index 31ee4aec65..ecfda370a7 100644
--- a/clang/lib/CodeGen/CGExprAgg.cpp
+++ b/clang/lib/CodeGen/CGExprAgg.cpp
@@ -511,8 +511,7 @@ void AggExprEmitter::EmitArrayInit(Address DestPtr, llvm::ArrayType *AType,
uint64_t NumArrayElements = AType->getNumElements();
for (const auto *Init : Args) {
- if (const auto *Embed =
- dyn_cast<EmbedExpr>(Init->IgnoreParenImpCasts())) {
+ if (const auto *Embed = dyn_cast<EmbedExpr>(Init->IgnoreParenImpCasts())) {
NumInitElements += Embed->getDataElementCount() - 1;
if (NumInitElements > NumArrayElements) {
NumInitElements = NumArrayElements;
@@ -622,8 +621,7 @@ void AggExprEmitter::EmitArrayInit(Address DestPtr, llvm::ArrayType *AType,
for (uint64_t i = 0; i != NumInitElements; ++i) {
if (ArrayIndex >= NumInitElements)
break;
- if (auto *EmbedS =
- dyn_cast<EmbedExpr>(Args[i]->IgnoreParenImpCasts())) {
+ if (auto *EmbedS = dyn_cast<EmbedExpr>(Args[i]->IgnoreParenImpCasts())) {
EmbedS->doForEachDataElement(Emit, ArrayIndex);
} else {
Emit(Args[i], ArrayIndex);
diff --git a/clang/lib/CodeGen/CGExprConstant.cpp b/clang/lib/CodeGen/CGExprConstant.cpp
index de00dbdabc..96d0101b97 100644
--- a/clang/lib/CodeGen/CGExprConstant.cpp
+++ b/clang/lib/CodeGen/CGExprConstant.cpp
@@ -1309,8 +1309,7 @@ public:
unsigned ArrayIndex = 0;
for (unsigned i = 0; i < ILE->getNumInits(); ++i) {
const Expr *Init = ILE->getInit(i);
- if (auto *EmbedS =
- dyn_cast<EmbedExpr>(Init->IgnoreParenImpCasts())) {
+ if (auto *EmbedS = dyn_cast<EmbedExpr>(Init->IgnoreParenImpCasts())) {
if (!EmbedS->doForEachDataElement(Emit, ArrayIndex,
/*EmbedInit=*/true))
return nullptr;
diff --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp
index 3511c1867b..06a25849fa 100644
--- a/clang/lib/Sema/SemaInit.cpp
+++ b/clang/lib/Sema/SemaInit.cpp
@@ -2139,8 +2139,7 @@ void InitListChecker::CheckArrayType(const InitializedEntity &Entity,
elementIndex + CurEmbedIndex - EmbedElementIndexBeforeInit - 1;
} else {
auto Embed = cast<EmbedExpr>(Init);
- elementIndex = elementIndex +
- Embed->getDataElementCount() -
+ elementIndex = elementIndex + Embed->getDataElementCount() -
EmbedElementIndexBeforeInit - 1;
}
}
|
FWIW, I spoke offline with the original author of the PR and he said that he's fine with me picking up the changes and carrying the review forward. Because I don't know of any better way to commandeer a patch in GitHub, I'll probably grab the changes, get them into my own fork, and then recreate the PR to get the review started. Whenever we get ready to land the PR, @ThePhD will be credited as a co-author. |
As a maintainer, you can push into this branch (unless @ThePhD unchecked the default setting when creating the PR), including forcefully. For example (you could also use the github CLI, but I'm using vanilla git to demo):
|
Amazing! I had started to think about looking into getting this implemented recently, so it's really nice to see an implementation uploaded now!
I'm very sorry to hear that, but thank you for the contribution regardless! |
So just to clarify, we should wait till you post a follow-up review before commenting, to avoid fragmenting the discussion. |
I had reached out to @ThePhD with an offer to help on this before they opened this PR, and I now have push access to their fork, which means I could change the state of this PR. To be clear, I cannot offer to defend the content (semantically) in the face of review, but - if desired - I'd be happy to help with some more routine operations like rebasing, formatting, or chopping off pieces into separate PRs (all while keeping @ThePhD's authorship, of course). |
Oh, that's excellent! @ThePhD also gave me push access to his repo. If we're going to tag-team efforts on this and we both have access, it probably makes sense to just leave it in his repo and work from there. If he finds the notifications too distracting, we can always move the work elsewhere at that point. @h-vetinari -- would you mind doing some initial cleanup on the patch for things like rebasing, removing spurious formatting changes, naming, and the likes? I have WG14 meetings coming up next week (they should hopefully go quickly though), so it is likely going to be ~a week before I have the chance to work on this more heavily.
Given the above, I'd say we can keep the review going here. |
I'll try to do that! |
…+26. 🛠 [Frontend] Ensure commas inserted by #embed are properly serialized to output
OK, I've got a first rebase going. However, because iterating on a diff with 1000s of lines is extremely hard to review and bound to lead to mistakes and iterations (making it harder still to follow what's going on), I've tried to follow a more structured approach. Basically, I've pushed tags for intermediate milestones during the rebase to my fork, and am documenting below what I've done. That should make it easier to:
I've used a simple tag scheme The table is probably busted visually on mobile (sorry...), but I hope it reads alright on a desktop. Where used, equality (
Since the rebase from v3 to v4 went cleanly, I haven't run the compilation tests. I will push Footnotes
|
adc9737
to
5956fc2
Compare
⚡ [Lex] Better reservations for improved performance/memory usage. 🛠 [Lex, Frontend] Remove comma hardcoding since we are servicing a full file apply suggestions from git-clang-format
5f5d3a6
to
6a7a4c9
Compare
JeanHeyd wrote up a blog post about this implementation, which is probably helpful as background material for any prospective reviewers. :) |
@AaronBallman, hope the WG14 meeting went well? I've essentially done what I planned to do here, so I was wondering if you're ready to take it from here? I can try to incorporate the review feedback myself, but it's quite likely I'd make a mess of it (not familiar with the clang code base at all). 😅 |
So, with the latest improvement on my machine
Takes ~13 seconds to compile with 200 mb file. It used to take 4m20s without my changes. The following
Takes 1.1s to compile in my environment. With code move from
also works now. |
@Fznamznon Thanks for looking into that. I am fairly concerned by the growing complexity of this PR and the long term maintenance burden. ExpandSinglePPEmbedExpr, ModifyTemplateArguments in particular should not be needed Can we merge EmbedSubscriptExpr and EmbedExpr in a single class or find a way to make only one of these things an expression that needs to be visited by all consumers? |
Thanks for the feedback @cor3ntin .
I think we still want to be able to reference a part of #embed data, maybe reference the same data several times for cases with nested initializer lists without braces, so we understand where each data element goes. |
How aout:
|
SGTM! |
Okay, now there is only one new |
@@ -1062,7 +1062,7 @@ void f() { | |||
E Ee{ { { g( { Array[I] } ) } } }; | |||
} | |||
// CHECK-MESSAGES: :[[@LINE-7]]:3: warning: use range-based for loop instead | |||
// CHECK-FIXES: for (int I : Array) | |||
// CHECK-FIXES: for (int & I : Array) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My changes in SemaInit.cpp likely caused this, however the outcome seems correct to me, so I just changed the test.
Gentle ping :) |
This pull request implements the entirety of the now-accepted N3017 - Preprocessor Embed and its sister C++ paper p1967. It implements everything in the specification, and includes an implementation that drastically improves the time it takes to embed data in specific scenarios (the initialization of character type arrays). The mechanisms used to do this are used under the "as-if" rule, and in general when the system cannot detect it is initializing an array object in a variable declaration, will simply expand the list out to a sequence of numbers.
There are likely places where the
__builtin_pp_embed
intrinsic fails to compile properly and triggers ICEs. However, we have covered what we feel are the vast majority of cases where users would want or need the speedup.Images are a dry run of a recently built clang and LLVM on Release mode (not RelWithDebInfo) on Windows, under the following properties:
OS Name: Microsoft Windows 10 Pro
Version: 10.0.19045 Build 19045
System Type: x64-based PC
Processor: AMD Ryzen 9 5950X 16-Core Processor, 3401 Mhz, 16 Core(s), 32 Logical Processor(s)
Installed Physical Memory (RAM): 32.0 GB
Total Physical Memory: 31.9 GB
Total Virtual Memory: 36.7 GB
All of the added tests pass under the above machine properties.
I have no intention of following up on this PR. I am too tired to carry it to fruition. Pick this apart, take from it what you want, or reimplement it entirely. (I will be unsubscribing from this immediately after posting.)
Good luck ✌!