[Serialization] Load Specializations Lazily #76774
@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-clang-modules
Author: Chuanqi Xu (ChuanqiXu9)
Changes
The idea comes from @vgvassilev, who had a patch for it on Phabricator. Unfortunately Phabricator is closed and I forgot the Dxxx number of that patch. But I remember that the last comment from @vgvassilev was that we should use MultiOnDiskHashTable for it. So I followed that and rewrote the whole thing from scratch in the new year.
Background
Currently, all the specializations of a template (including instantiations, specializations and partial specializations) are loaded at once if we want to instantiate another instance of the template, find an instantiation of the template, or just complete the redecl chain. This basically means we need to load every specialization of the template once the template declaration gets loaded. This is bad because when we load a specialization, we need to load all of its template arguments, and then we have to deserialize a lot of unnecessary declarations. For example,
We should a specialization
Given that templates are heavily used in C++, this is a pain point for performance.
What this patch did
This patch adds a MultiOnDiskHashTable for specializations in the ASTReader. With it, we only deserialize the specializations that have the same template arguments. We achieve that by using the ODRHash of the template arguments as the key of the hash table.
The partial specializations are not added to the MultiOnDiskHashTable, since we can't know whether a partial specialization is needed before deciding the template declaration for an instantiation request. There may be room for further optimizations, but let's do that in the future.
To review this patch, I think
What this patch did not do
This patch doesn't solve the problem completely. Since we will add
llvm-project/clang/lib/Serialization/ASTWriterDecl.cpp
Lines 251 to 269 in 8ae73fe
That said, we can't handle this case now:
Now
But the current patch is already relatively big, so I want to split that out into the next patch. I think the current patch is already self-contained.
Patch is 53.18 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/76774.diff
20 Files Affected:
diff --git a/clang/include/clang/AST/DeclTemplate.h b/clang/include/clang/AST/DeclTemplate.h
index 832ad2de6b08a8..ab380f55c038ee 100644
--- a/clang/include/clang/AST/DeclTemplate.h
+++ b/clang/include/clang/AST/DeclTemplate.h
@@ -30,6 +30,7 @@
#include "llvm/ADT/FoldingSet.h"
#include "llvm/ADT/PointerIntPair.h"
#include "llvm/ADT/PointerUnion.h"
+#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/iterator.h"
#include "llvm/ADT/iterator_range.h"
#include "llvm/Support/Casting.h"
@@ -525,8 +526,11 @@ class FunctionTemplateSpecializationInfo final
return Function.getInt();
}
+ void loadExternalRedecls();
+
public:
friend TrailingObjects;
+ friend class ASTReader;
static FunctionTemplateSpecializationInfo *
Create(ASTContext &C, FunctionDecl *FD, FunctionTemplateDecl *Template,
@@ -789,13 +793,15 @@ class RedeclarableTemplateDecl : public TemplateDecl,
return SpecIterator<EntryType>(isEnd ? Specs.end() : Specs.begin());
}
- void loadLazySpecializationsImpl() const;
+ void loadExternalSpecializations() const;
template <class EntryType, typename ...ProfileArguments>
typename SpecEntryTraits<EntryType>::DeclType*
findSpecializationImpl(llvm::FoldingSetVector<EntryType> &Specs,
void *&InsertPos, ProfileArguments &&...ProfileArgs);
+ void loadLazySpecializationsWithArgs(ArrayRef<TemplateArgument> TemplateArgs);
+
template <class Derived, class EntryType>
void addSpecializationImpl(llvm::FoldingSetVector<EntryType> &Specs,
EntryType *Entry, void *InsertPos);
@@ -814,9 +820,13 @@ class RedeclarableTemplateDecl : public TemplateDecl,
/// If non-null, points to an array of specializations (including
/// partial specializations) known only by their external declaration IDs.
///
+ /// These specializations needs to be loaded at once in
+ /// loadExternalSpecializations to complete the redecl chain or be preparing
+ /// for template resolution.
+ ///
/// The first value in the array is the number of specializations/partial
/// specializations that follow.
- uint32_t *LazySpecializations = nullptr;
+ uint32_t *ExternalSpecializations = nullptr;
/// The set of "injected" template arguments used within this
/// template.
@@ -850,6 +860,8 @@ class RedeclarableTemplateDecl : public TemplateDecl,
friend class ASTDeclWriter;
friend class ASTReader;
template <class decl_type> friend class RedeclarableTemplate;
+ friend class ClassTemplateSpecializationDecl;
+ friend class VarTemplateSpecializationDecl;
/// Retrieves the canonical declaration of this template.
RedeclarableTemplateDecl *getCanonicalDecl() override {
@@ -977,6 +989,12 @@ SpecEntryTraits<FunctionTemplateSpecializationInfo> {
class FunctionTemplateDecl : public RedeclarableTemplateDecl {
protected:
friend class FunctionDecl;
+ friend class FunctionTemplateSpecializationInfo;
+
+ template <typename DeclTy>
+ friend void GetSpecializationsImpl(const DeclTy *,
+ llvm::SmallPtrSetImpl<const NamedDecl *> &,
+ ASTReader *Reader);
/// Data that is common to all of the declarations of a given
/// function template.
@@ -1012,13 +1030,13 @@ class FunctionTemplateDecl : public RedeclarableTemplateDecl {
void addSpecialization(FunctionTemplateSpecializationInfo* Info,
void *InsertPos);
+ /// Load any lazily-loaded specializations from the external source.
+ void LoadLazySpecializations() const;
+
public:
friend class ASTDeclReader;
friend class ASTDeclWriter;
- /// Load any lazily-loaded specializations from the external source.
- void LoadLazySpecializations() const;
-
/// Get the underlying function declaration of the template.
FunctionDecl *getTemplatedDecl() const {
return static_cast<FunctionDecl *>(TemplatedDecl);
@@ -1839,6 +1857,8 @@ class ClassTemplateSpecializationDecl
LLVM_PREFERRED_TYPE(TemplateSpecializationKind)
unsigned SpecializationKind : 3;
+ void loadExternalRedecls();
+
protected:
ClassTemplateSpecializationDecl(ASTContext &Context, Kind DK, TagKind TK,
DeclContext *DC, SourceLocation StartLoc,
@@ -1852,6 +1872,7 @@ class ClassTemplateSpecializationDecl
public:
friend class ASTDeclReader;
friend class ASTDeclWriter;
+ friend class ASTReader;
static ClassTemplateSpecializationDecl *
Create(ASTContext &Context, TagKind TK, DeclContext *DC,
@@ -2238,6 +2259,11 @@ class ClassTemplatePartialSpecializationDecl
/// Declaration of a class template.
class ClassTemplateDecl : public RedeclarableTemplateDecl {
protected:
+ template <typename DeclTy>
+ friend void GetSpecializationsImpl(const DeclTy *,
+ llvm::SmallPtrSetImpl<const NamedDecl *> &,
+ ASTReader *Reader);
+
/// Data that is common to all of the declarations of a given
/// class template.
struct Common : CommonBase {
@@ -2285,9 +2311,7 @@ class ClassTemplateDecl : public RedeclarableTemplateDecl {
friend class ASTDeclReader;
friend class ASTDeclWriter;
friend class TemplateDeclInstantiator;
-
- /// Load any lazily-loaded specializations from the external source.
- void LoadLazySpecializations() const;
+ friend class ClassTemplateSpecializationDecl;
/// Get the underlying class declarations of the template.
CXXRecordDecl *getTemplatedDecl() const {
@@ -2651,6 +2675,8 @@ class VarTemplateSpecializationDecl : public VarDecl,
LLVM_PREFERRED_TYPE(bool)
unsigned IsCompleteDefinition : 1;
+ void loadExternalRedecls();
+
protected:
VarTemplateSpecializationDecl(Kind DK, ASTContext &Context, DeclContext *DC,
SourceLocation StartLoc, SourceLocation IdLoc,
@@ -2664,6 +2690,7 @@ class VarTemplateSpecializationDecl : public VarDecl,
public:
friend class ASTDeclReader;
friend class ASTDeclWriter;
+ friend class ASTReader;
friend class VarDecl;
static VarTemplateSpecializationDecl *
@@ -3018,6 +3045,11 @@ class VarTemplatePartialSpecializationDecl
/// Declaration of a variable template.
class VarTemplateDecl : public RedeclarableTemplateDecl {
protected:
+ template <typename DeclTy>
+ friend void GetSpecializationsImpl(const DeclTy *,
+ llvm::SmallPtrSetImpl<const NamedDecl *> &,
+ ASTReader *Reader);
+
/// Data that is common to all of the declarations of a given
/// variable template.
struct Common : CommonBase {
@@ -3057,8 +3089,7 @@ class VarTemplateDecl : public RedeclarableTemplateDecl {
friend class ASTDeclReader;
friend class ASTDeclWriter;
- /// Load any lazily-loaded specializations from the external source.
- void LoadLazySpecializations() const;
+ friend class VarTemplatePartialSpecializationDecl;
/// Get the underlying variable declarations of the template.
VarDecl *getTemplatedDecl() const {
diff --git a/clang/include/clang/AST/ExternalASTSource.h b/clang/include/clang/AST/ExternalASTSource.h
index 8e573965b0a336..7f26afd53106ba 100644
--- a/clang/include/clang/AST/ExternalASTSource.h
+++ b/clang/include/clang/AST/ExternalASTSource.h
@@ -150,6 +150,11 @@ class ExternalASTSource : public RefCountedBase<ExternalASTSource> {
virtual bool
FindExternalVisibleDeclsByName(const DeclContext *DC, DeclarationName Name);
+ /// Load all the external specialzations for the Decl and the corresponding
+ /// template arguments.
+ virtual void LoadExternalSpecs(const Decl *D,
+ ArrayRef<TemplateArgument> TemplateArgs);
+
/// Ensures that the table of all visible declarations inside this
/// context is up to date.
///
diff --git a/clang/include/clang/AST/ODRHash.h b/clang/include/clang/AST/ODRHash.h
index cedf644520fc32..ddd1bb0f095e75 100644
--- a/clang/include/clang/AST/ODRHash.h
+++ b/clang/include/clang/AST/ODRHash.h
@@ -101,6 +101,9 @@ class ODRHash {
// Save booleans until the end to lower the size of data to process.
void AddBoolean(bool value);
+ // Add intergers to ID.
+ void AddInteger(unsigned Value);
+
static bool isSubDeclToBeProcessed(const Decl *D, const DeclContext *Parent);
private:
diff --git a/clang/include/clang/Sema/MultiplexExternalSemaSource.h b/clang/include/clang/Sema/MultiplexExternalSemaSource.h
index 2bf91cb5212c5e..886c3854adac6e 100644
--- a/clang/include/clang/Sema/MultiplexExternalSemaSource.h
+++ b/clang/include/clang/Sema/MultiplexExternalSemaSource.h
@@ -97,6 +97,12 @@ class MultiplexExternalSemaSource : public ExternalSemaSource {
bool FindExternalVisibleDeclsByName(const DeclContext *DC,
DeclarationName Name) override;
+ /// Load all the external specialzations for the Decl and the corresponding
+ /// template args.
+ virtual void
+ LoadExternalSpecs(const Decl *D,
+ ArrayRef<TemplateArgument> TemplateArgs) override;
+
/// Ensures that the table of all visible declarations inside this
/// context is up to date.
void completeVisibleDeclsMap(const DeclContext *DC) override;
diff --git a/clang/include/clang/Serialization/ASTBitCodes.h b/clang/include/clang/Serialization/ASTBitCodes.h
index fdd64f2abbe937..a1bf3659e91f3e 100644
--- a/clang/include/clang/Serialization/ASTBitCodes.h
+++ b/clang/include/clang/Serialization/ASTBitCodes.h
@@ -1523,6 +1523,9 @@ enum DeclCode {
/// An ImplicitConceptSpecializationDecl record.
DECL_IMPLICIT_CONCEPT_SPECIALIZATION,
+ // A decls specilization record.
+ DECL_SPECS,
+
DECL_LAST = DECL_IMPLICIT_CONCEPT_SPECIALIZATION
};
diff --git a/clang/include/clang/Serialization/ASTReader.h b/clang/include/clang/Serialization/ASTReader.h
index 21d791f5cd89a2..52ca6c76db8e37 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -340,6 +340,9 @@ class ASTIdentifierLookupTrait;
/// The on-disk hash table(s) used for DeclContext name lookup.
struct DeclContextLookupTable;
+/// The on-disk hash table(s) used for specialization decls.
+struct SpecializedDeclsLookupTable;
+
} // namespace reader
} // namespace serialization
@@ -599,6 +602,11 @@ class ASTReader
llvm::DenseMap<const DeclContext *,
serialization::reader::DeclContextLookupTable> Lookups;
+ /// Map from decls to specialized decls.
+ llvm::DenseMap<const Decl *,
+ serialization::reader::SpecializedDeclsLookupTable>
+ SpecLookups;
+
// Updates for visible decls can occur for other contexts than just the
// TU, and when we read those update records, the actual context may not
// be available yet, so have this pending map using the ID as a key. It
@@ -640,6 +648,9 @@ class ASTReader
llvm::BitstreamCursor &Cursor,
uint64_t Offset, serialization::DeclID ID);
+ bool ReadDeclsSpecs(ModuleFile &M, llvm::BitstreamCursor &Cursor,
+ uint64_t Offset, Decl *D);
+
/// A vector containing identifiers that have already been
/// loaded.
///
@@ -1343,6 +1354,11 @@ class ASTReader
const serialization::reader::DeclContextLookupTable *
getLoadedLookupTables(DeclContext *Primary) const;
+ /// Get the loaded specializations lookup tables for \p D,
+ /// if any.
+ serialization::reader::SpecializedDeclsLookupTable *
+ getLoadedSpecLookupTables(Decl *D);
+
private:
struct ImportedModule {
ModuleFile *Mod;
@@ -1982,6 +1998,9 @@ class ASTReader
bool FindExternalVisibleDeclsByName(const DeclContext *DC,
DeclarationName Name) override;
+ void LoadExternalSpecs(const Decl *D,
+ ArrayRef<TemplateArgument> TemplateArgs) override;
+
/// Read all of the declarations lexically stored in a
/// declaration context.
///
diff --git a/clang/include/clang/Serialization/ASTWriter.h b/clang/include/clang/Serialization/ASTWriter.h
index de69f99003d827..c98beaa1a24dc0 100644
--- a/clang/include/clang/Serialization/ASTWriter.h
+++ b/clang/include/clang/Serialization/ASTWriter.h
@@ -527,6 +527,10 @@ class ASTWriter : public ASTDeserializationListener,
bool isLookupResultExternal(StoredDeclsList &Result, DeclContext *DC);
bool isLookupResultEntirelyExternal(StoredDeclsList &Result, DeclContext *DC);
+ uint64_t
+ WriteSpecsLookupTable(NamedDecl *D,
+ llvm::SmallVectorImpl<const NamedDecl *> &Specs);
+
void GenerateNameLookupTable(const DeclContext *DC,
llvm::SmallVectorImpl<char> &LookupTable);
uint64_t WriteDeclContextLexicalBlock(ASTContext &Context, DeclContext *DC);
@@ -564,6 +568,8 @@ class ASTWriter : public ASTDeserializationListener,
unsigned DeclEnumAbbrev = 0;
unsigned DeclObjCIvarAbbrev = 0;
unsigned DeclCXXMethodAbbrev = 0;
+ unsigned DeclSpecsAbbrev = 0;
+
unsigned DeclDependentNonTemplateCXXMethodAbbrev = 0;
unsigned DeclTemplateCXXMethodAbbrev = 0;
unsigned DeclMemberSpecializedCXXMethodAbbrev = 0;
diff --git a/clang/lib/AST/DeclTemplate.cpp b/clang/lib/AST/DeclTemplate.cpp
index 7d7556e670f951..43c9158fb40413 100644
--- a/clang/lib/AST/DeclTemplate.cpp
+++ b/clang/lib/AST/DeclTemplate.cpp
@@ -331,14 +331,14 @@ RedeclarableTemplateDecl::CommonBase *RedeclarableTemplateDecl::getCommonPtr() c
return Common;
}
-void RedeclarableTemplateDecl::loadLazySpecializationsImpl() const {
+void RedeclarableTemplateDecl::loadExternalSpecializations() const {
// Grab the most recent declaration to ensure we've loaded any lazy
// redeclarations of this template.
CommonBase *CommonBasePtr = getMostRecentDecl()->getCommonPtr();
- if (CommonBasePtr->LazySpecializations) {
+ if (CommonBasePtr->ExternalSpecializations) {
ASTContext &Context = getASTContext();
- uint32_t *Specs = CommonBasePtr->LazySpecializations;
- CommonBasePtr->LazySpecializations = nullptr;
+ uint32_t *Specs = CommonBasePtr->ExternalSpecializations;
+ CommonBasePtr->ExternalSpecializations = nullptr;
for (uint32_t I = 0, N = *Specs++; I != N; ++I)
(void)Context.getExternalSource()->GetExternalDecl(Specs[I]);
}
@@ -358,6 +358,15 @@ RedeclarableTemplateDecl::findSpecializationImpl(
return Entry ? SETraits::getDecl(Entry)->getMostRecentDecl() : nullptr;
}
+void RedeclarableTemplateDecl::loadLazySpecializationsWithArgs(
+ ArrayRef<TemplateArgument> TemplateArgs) {
+ auto *ExternalSource = getASTContext().getExternalSource();
+ if (!ExternalSource)
+ return;
+
+ ExternalSource->LoadExternalSpecs(this->getCanonicalDecl(), TemplateArgs);
+}
+
template<class Derived, class EntryType>
void RedeclarableTemplateDecl::addSpecializationImpl(
llvm::FoldingSetVector<EntryType> &Specializations, EntryType *Entry,
@@ -430,24 +439,23 @@ FunctionTemplateDecl::newCommon(ASTContext &C) const {
return CommonPtr;
}
-void FunctionTemplateDecl::LoadLazySpecializations() const {
- loadLazySpecializationsImpl();
-}
-
llvm::FoldingSetVector<FunctionTemplateSpecializationInfo> &
FunctionTemplateDecl::getSpecializations() const {
- LoadLazySpecializations();
+ loadExternalSpecializations();
return getCommonPtr()->Specializations;
}
FunctionDecl *
FunctionTemplateDecl::findSpecialization(ArrayRef<TemplateArgument> Args,
void *&InsertPos) {
+ loadLazySpecializationsWithArgs(Args);
return findSpecializationImpl(getSpecializations(), InsertPos, Args);
}
void FunctionTemplateDecl::addSpecialization(
FunctionTemplateSpecializationInfo *Info, void *InsertPos) {
+ using SETraits = SpecEntryTraits<FunctionTemplateSpecializationInfo>;
+ loadLazySpecializationsWithArgs(SETraits::getTemplateArgs(Info));
addSpecializationImpl<FunctionTemplateDecl>(getSpecializations(), Info,
InsertPos);
}
@@ -508,19 +516,15 @@ ClassTemplateDecl *ClassTemplateDecl::CreateDeserialized(ASTContext &C,
DeclarationName(), nullptr, nullptr);
}
-void ClassTemplateDecl::LoadLazySpecializations() const {
- loadLazySpecializationsImpl();
-}
-
llvm::FoldingSetVector<ClassTemplateSpecializationDecl> &
ClassTemplateDecl::getSpecializations() const {
- LoadLazySpecializations();
+ loadExternalSpecializations();
return getCommonPtr()->Specializations;
}
llvm::FoldingSetVector<ClassTemplatePartialSpecializationDecl> &
ClassTemplateDecl::getPartialSpecializations() const {
- LoadLazySpecializations();
+ loadExternalSpecializations();
return getCommonPtr()->PartialSpecializations;
}
@@ -534,11 +538,14 @@ ClassTemplateDecl::newCommon(ASTContext &C) const {
ClassTemplateSpecializationDecl *
ClassTemplateDecl::findSpecialization(ArrayRef<TemplateArgument> Args,
void *&InsertPos) {
+ loadLazySpecializationsWithArgs(Args);
return findSpecializationImpl(getSpecializations(), InsertPos, Args);
}
void ClassTemplateDecl::AddSpecialization(ClassTemplateSpecializationDecl *D,
void *InsertPos) {
+ using SETraits = SpecEntryTraits<ClassTemplateSpecializationDecl>;
+ loadLazySpecializationsWithArgs(SETraits::getTemplateArgs(D));
addSpecializationImpl<ClassTemplateDecl>(getSpecializations(), D, InsertPos);
}
@@ -546,6 +553,7 @@ ClassTemplatePartialSpecializationDecl *
ClassTemplateDecl::findPartialSpecialization(
ArrayRef<TemplateArgument> Args,
TemplateParameterList *TPL, void *&InsertPos) {
+ loadLazySpecializationsWithArgs(Args);
return findSpecializationImpl(getPartialSpecializations(), InsertPos, Args,
TPL);
}
@@ -900,6 +908,11 @@ FunctionTemplateSpecializationInfo *FunctionTemplateSpecializationInfo::Create(
FD, Template, TSK, TemplateArgs, ArgsAsWritten, POI, MSInfo);
}
+void FunctionTemplateSpecializationInfo::loadExternalRedecls() {
+ getTemplate()->loadExternalSpecializations();
+ getTemplate()->loadLazySpecializationsWithArgs(TemplateArguments->asArray());
+}
+
//===----------------------------------------------------------------------===//
// ClassTemplateSpecializationDecl Implementation
//===----------------------------------------------------------------------===//
@@ -1024,6 +1037,12 @@ ClassTemplateSpecializationDecl::getSourceRange() const {
}
}
+void ClassTemplateSpecializationDecl::loadExternalRedecls() {
+ getSpecializedTemplate()->loadExternalSpecializations();
+ getSpecializedTemplate()->loadLazySpecializationsWithArgs(
+ getTemplateArgs().asArray());
+}
+
//===----------------------------------------------------------------------===//
// ConceptDecl Implementation
//===----------------------------------------------------------------------===//
@@ -1226,19 +1245,15 @@ VarTemplateDecl *VarTemplateDecl::CreateDeserialized(ASTContext &C,
DeclarationName(), nullptr, nullptr);
}
-void VarTemplateDecl::LoadLazySpecializations() const {
- loadLazySpecializationsImpl();
-}
-
llvm::FoldingSetVector<VarTemplateSpecializationDecl> &
VarTemplateDecl::getSpecializations() const {
- LoadLazySpecializations();
+ loadExternalSpecializations();
return getCommonPtr()->Specializations;
}
llvm::FoldingSetVector<VarTemplatePartialSpecializationDecl> &
VarTemplateDecl::getPartialSpecializations() const {
- LoadLazySpecializations();
+ loadExternalSpecializations();
return getCommonPtr()->PartialSpecializations;
}
@@ -1252,17 +1267,21 @@ VarTemplateDecl::newCommon(ASTContext &C) const {
VarTemplateSpecializationDecl *
VarTemplateDecl::findSpecialization(ArrayRef<TemplateArgument> Args,
void *&InsertPos) {
+ loadLazySpecializationsWithArgs(Args);
return findSpecializationImpl(getSpecializations(), InsertPos, Args);
}
void VarTemplateDecl::AddSpecialization(VarTemplateSpecializationDecl *D,
void *InsertPos) {
+ using SETraits = SpecEntryTraits<VarTemplateSpecializationDecl>;
+ loadLazySpecializationsWithArgs(SETraits::getTemplateArgs(D));
addSpecializationImpl<VarTemplateDecl>(getSpecializations(), D, InsertPos);
}
VarTemplatePartialSpecializationD...
[truncated]
Force-pushed from f419166 to af6f8ca.
You can test this locally with the following command:
git-clang-format --diff 4118082f651a05cca258c684ab1199578b57afac 22c9d1145eb57d9c2cb2ef490b7c474598dd5d12 -- clang/unittests/Serialization/LoadSpecLazilyTest.cpp clang/include/clang/AST/DeclTemplate.h clang/include/clang/AST/ExternalASTSource.h clang/include/clang/AST/ODRHash.h clang/include/clang/Sema/MultiplexExternalSemaSource.h clang/include/clang/Serialization/ASTBitCodes.h clang/include/clang/Serialization/ASTReader.h clang/include/clang/Serialization/ASTWriter.h clang/lib/AST/DeclTemplate.cpp clang/lib/AST/ExternalASTSource.cpp clang/lib/AST/ODRHash.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Sema/MultiplexExternalSemaSource.cpp clang/lib/Serialization/ASTCommon.h clang/lib/Serialization/ASTReader.cpp clang/lib/Serialization/ASTReaderDecl.cpp clang/lib/Serialization/ASTReaderInternals.h clang/lib/Serialization/ASTWriter.cpp clang/lib/Serialization/ASTWriterDecl.cpp
View the diff from clang-format here.
diff --git a/clang/include/clang/AST/DeclTemplate.h b/clang/include/clang/AST/DeclTemplate.h
index 515c60e51e..26a9ebc468 100644
--- a/clang/include/clang/AST/DeclTemplate.h
+++ b/clang/include/clang/AST/DeclTemplate.h
@@ -802,7 +802,7 @@ protected:
template <class EntryType, typename... ProfileArguments>
typename SpecEntryTraits<EntryType>::DeclType *
findLocalSpecialization(llvm::FoldingSetVector<EntryType> &Specs,
- void *&InsertPos, ProfileArguments &&... ProfileArgs);
+ void *&InsertPos, ProfileArguments &&...ProfileArgs);
bool loadLazySpecializationsWithArgs(ArrayRef<TemplateArgument> TemplateArgs);
diff --git a/clang/lib/AST/DeclTemplate.cpp b/clang/lib/AST/DeclTemplate.cpp
index f7d513b096..c2bbf29f61 100644
--- a/clang/lib/AST/DeclTemplate.cpp
+++ b/clang/lib/AST/DeclTemplate.cpp
@@ -344,7 +344,7 @@ void RedeclarableTemplateDecl::loadExternalSpecializations() const {
}
// We still load all the external specializations explicitly in the case
- // the writer specified `-fload-external-specializations-lazily`.
+ // the writer specified `-fload-external-specializations-lazily`.
if (!getASTContext().getLangOpts().LoadExternalSpecializationsLazily &&
getASTContext().getExternalSource())
getASTContext().getExternalSource()->LoadAllExternalSpecializations(
@@ -355,7 +355,7 @@ template <class EntryType, typename... ProfileArguments>
typename RedeclarableTemplateDecl::SpecEntryTraits<EntryType>::DeclType *
RedeclarableTemplateDecl::findLocalSpecialization(
llvm::FoldingSetVector<EntryType> &Specs, void *&InsertPos,
- ProfileArguments &&... ProfileArgs) {
+ ProfileArguments &&...ProfileArgs) {
using SETraits = SpecEntryTraits<EntryType>;
llvm::FoldingSetNodeID ID;
@@ -370,7 +370,7 @@ template <class EntryType, typename... ProfileArguments>
typename RedeclarableTemplateDecl::SpecEntryTraits<EntryType>::DeclType *
RedeclarableTemplateDecl::findSpecializationImpl(
llvm::FoldingSetVector<EntryType> &Specs, void *&InsertPos,
- ProfileArguments &&... ProfileArgs) {
+ ProfileArguments &&...ProfileArgs) {
if (auto *Ret = findLocalSpecialization(
Specs, InsertPos, std::forward<ProfileArguments>(ProfileArgs)...))
return Ret;
diff --git a/clang/lib/AST/ODRHash.cpp b/clang/lib/AST/ODRHash.cpp
index 9e274ff596..72a9a870ea 100644
--- a/clang/lib/AST/ODRHash.cpp
+++ b/clang/lib/AST/ODRHash.cpp
@@ -1318,4 +1318,3 @@ void ODRHash::AddStructuralValue(const APValue &Value) {
}
void ODRHash::AddInteger(unsigned Value) { ID.AddInteger(Value); }
-
diff --git a/clang/lib/Serialization/ASTReaderDecl.cpp b/clang/lib/Serialization/ASTReaderDecl.cpp
index 99b02f3987..1facfd4865 100644
--- a/clang/lib/Serialization/ASTReaderDecl.cpp
+++ b/clang/lib/Serialization/ASTReaderDecl.cpp
@@ -265,9 +265,10 @@ namespace clang {
: Reader(Reader), Record(Record), Loc(Loc), ThisDeclID(thisDeclID),
ThisDeclLoc(ThisDeclLoc) {}
- template <typename T> static
- void AddExternalSpecializations(T *D,
- SmallVectorImpl<serialization::DeclID>& IDs) {
+ template <typename T>
+ static void
+ AddExternalSpecializations(T *D,
+ SmallVectorImpl<serialization::DeclID> &IDs) {
if (IDs.empty())
return;
@@ -4273,11 +4274,14 @@ void ASTReader::loadDeclUpdateRecords(PendingUpdateRecord &Record) {
isa<ClassTemplateDecl, VarTemplateDecl, FunctionTemplateDecl>(D)) &&
"Must not have pending specializations");
if (auto *CTD = dyn_cast<ClassTemplateDecl>(D))
- ASTDeclReader::AddExternalSpecializations(CTD, PendingExternalSpecializationIDs);
+ ASTDeclReader::AddExternalSpecializations(CTD,
+ PendingExternalSpecializationIDs);
else if (auto *FTD = dyn_cast<FunctionTemplateDecl>(D))
- ASTDeclReader::AddExternalSpecializations(FTD, PendingExternalSpecializationIDs);
+ ASTDeclReader::AddExternalSpecializations(FTD,
+ PendingExternalSpecializationIDs);
else if (auto *VTD = dyn_cast<VarTemplateDecl>(D))
- ASTDeclReader::AddExternalSpecializations(VTD, PendingExternalSpecializationIDs);
+ ASTDeclReader::AddExternalSpecializations(VTD,
+ PendingExternalSpecializationIDs);
PendingExternalSpecializationIDs.clear();
// Load the pending visible updates for this decl context, if it has any.
diff --git a/clang/unittests/Serialization/LoadSpecLazilyTest.cpp b/clang/unittests/Serialization/LoadSpecLazilyTest.cpp
index 39b183f774..58ffd1ca38 100644
--- a/clang/unittests/Serialization/LoadSpecLazilyTest.cpp
+++ b/clang/unittests/Serialization/LoadSpecLazilyTest.cpp
@@ -46,8 +46,7 @@ public:
OS << Contents;
}
- std::string GenerateModuleInterface(StringRef ModuleName,
- StringRef Contents,
+ std::string GenerateModuleInterface(StringRef ModuleName, StringRef Contents,
bool WriteExternalSpecsTable) {
std::string FileName = llvm::Twine(ModuleName + ".cppm").str();
addFile(FileName, Contents);
@@ -65,9 +64,9 @@ public:
const char *Args[] = {"clang++",
"-std=c++20",
"--precompile",
- (WriteExternalSpecsTable ?
- "-fload-external-specializations-lazily" :
- ""),
+ (WriteExternalSpecsTable
+ ? "-fload-external-specializations-lazily"
+ : ""),
PrebuiltModulePath.c_str(),
"-working-directory",
TestDir.c_str(),
@@ -159,7 +158,8 @@ export class ShouldNotBeLoaded {};
export class Temp {
A<ShouldNotBeLoaded> AS;
};
- )cpp", /*WriteExternalSpecsTable=*/true);
+ )cpp",
+ /*WriteExternalSpecsTable=*/true);
const char *test_file_contents = R"cpp(
import M;
@@ -185,7 +185,8 @@ TEST_F(LoadSpecLazilyTest, ChainedTest) {
export module M;
export template <class T>
class A {};
- )cpp", /*WriteExternalSpecsTable=*/true);
+ )cpp",
+ /*WriteExternalSpecsTable=*/true);
GenerateModuleInterface("N", R"cpp(
export module N;
@@ -195,7 +196,8 @@ export class ShouldNotBeLoaded {};
export class Temp {
A<ShouldNotBeLoaded> AS;
};
- )cpp", /*WriteExternalSpecsTable=*/true);
+ )cpp",
+ /*WriteExternalSpecsTable=*/true);
const char *test_file_contents = R"cpp(
import N;
@@ -223,7 +225,8 @@ TEST_F(LoadSpecLazilyTest, LoadAllTest) {
export module M;
export template <class T>
class A {};
- )cpp", /*WriteExternalSpecsTable=*/true);
+ )cpp",
+ /*WriteExternalSpecsTable=*/true);
GenerateModuleInterface("N", R"cpp(
export module N;
@@ -233,7 +236,8 @@ export class ShouldBeLoaded {};
export class Temp {
A<ShouldBeLoaded> AS;
};
- )cpp", /*WriteExternalSpecsTable=*/true);
+ )cpp",
+ /*WriteExternalSpecsTable=*/true);
const char *test_file_contents = R"cpp(
import N;
Force-pushed from af6f8ca to 79cefc9.
This is a great way to start a new year ;)
The phab link is https://reviews.llvm.org/D41416.
In general, I was wondering whether we could simplify the implementation by loading the specialization hash table upon module load. That should be relatively cheap, as we will read 2 integers per specialization.
Perhaps we should put both patches together; that would allow us to test whether they are on par with https://reviews.llvm.org/D41416, which we use downstream.
Thanks for working on this!
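The approach discussed in this thread (key each specialization by a hash of its template arguments, and deserialize only the matching entries on lookup) can be sketched roughly as follows. This is a toy model under stated assumptions, not the patch's implementation: `SpecializationIndex`, `hashTemplateArgs`, `DeclID` and `ArgsHash` are hypothetical names, template arguments are modelled as strings, and `std::hash` stands in for the ODRHash the patch uses. The "2 integers per specialization" are assumed here to be a (hash, declaration ID) pair.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these names come from the patch.
using DeclID = std::uint32_t;
using ArgsHash = std::uint32_t;

// Toy hash over a textual form of the template arguments.
// The patch uses ODRHash; std::hash is only a placeholder here.
inline ArgsHash hashTemplateArgs(const std::vector<std::string> &Args) {
  std::size_t H = 0;
  for (const std::string &A : Args)
    H = H * 31 + std::hash<std::string>{}(A);
  return static_cast<ArgsHash>(H);
}

class SpecializationIndex {
public:
  // At "module load" time only the (hash, DeclID) pair is recorded;
  // no declaration is deserialized yet.
  void addEntry(ArgsHash H, DeclID ID) { Table[H].push_back(ID); }

  // On lookup, only the bucket matching the requested arguments is
  // "deserialized". The bucket is erased so it is loaded at most once.
  std::vector<DeclID> load(const std::vector<std::string> &Args) {
    std::vector<DeclID> Loaded;
    auto It = Table.find(hashTemplateArgs(Args));
    if (It == Table.end())
      return Loaded;
    Loaded = std::move(It->second);
    Table.erase(It);
    NumDeserialized += Loaded.size();
    return Loaded;
  }

  std::size_t numDeserialized() const { return NumDeserialized; }

private:
  std::unordered_map<ArgsHash, std::vector<DeclID>> Table;
  std::size_t NumDeserialized = 0;
};
```

In this model, looking up the arguments of one specialization leaves the entries for all other argument lists untouched, which is the saving the PR description is after. A hash collision only causes an extra entry to be loaded; it does not cause a wrong result, provided the caller still compares arguments afterwards.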
@@ -150,6 +150,11 @@ class ExternalASTSource : public RefCountedBase<ExternalASTSource> {
virtual bool
FindExternalVisibleDeclsByName(const DeclContext *DC, DeclarationName Name);
+ /// Load all the external specialzations for the Decl and the corresponding
+ /// template arguments.
+ virtual void LoadExternalSpecs(const Decl *D,
- virtual void LoadExternalSpecs(const Decl *D,
+ virtual void FindExternalSpecialization(const Decl *D,
This sounds more consistent with the surroundings here.
I feel `Load` may be a better name, since from the signature it doesn't find anything. And if we want consistency, I suggest renaming `FindExternalVisibleDeclsByName` to `LoadExternalVisibleDeclsByName`.
@@ -527,6 +527,10 @@ class ASTWriter : public ASTDeserializationListener,
bool isLookupResultExternal(StoredDeclsList &Result, DeclContext *DC);
bool isLookupResultEntirelyExternal(StoredDeclsList &Result, DeclContext *DC);
+ uint64_t
+ WriteSpecsLookupTable(NamedDecl *D,
Generally, `spec` would read as "specification", not "specialization". Maybe we should use the full word.
Got it. Will do in the next round.
Done
SmallVector<serialization::DeclID, 32> SpecIDs;
readDeclIDList(SpecIDs);
if (Record.readInt())
ReadDeclsSpecs(*Loc.F, D, Loc.F->DeclsCursor);
What if the TemplateDecl came from a different module file and this module file contains only specializations?
Then it won't reach this code. That is the job of the later patch (ChuanqiXu9@7f027f0).
@@ -1249,3 +1249,5 @@ void ODRHash::AddQualType(QualType T) {
void ODRHash::AddBoolean(bool Value) {
Bools.push_back(Value);
}
+
+void ODRHash::AddInteger(unsigned Value) { ID.AddInteger(Value); }
I remember @hahnjo and @zygoloid discussing that the ODR hasher is probably not the best way to hash template arguments because the hasher would not take into account semantic aspects of template arguments. For example, a fully qualified template argument would not compare the same as a non-qualified one. We might need to implement our own folding set logic.
@hahnjo, could you help me dig up that discussion?
Interesting. I wasn't aware of this. If this is true, we need to decide whether we can leave a FIXME here or whether we must fix it before proceeding.
The review related to ODRHash is this one: https://reviews.llvm.org/D153003
In short, my understanding is that ODRHash gives the following guarantee: if the hashes are different, there is guaranteed to be an ODR violation. In the other direction, if two hashes are the same, the declarations have to be compared in more detail, i.e. there may or may not be an ODR violation.
For the specializations, we need the opposite: if two template arguments are semantically the same (*), they must hash to the same value or otherwise we will not find the correct bucket. On the other hand, two different specialization arguments may have the same hash; that's fine for the map data structure.
Now the additional caveat (*) is that "semantically the same" is not the same congruence as "no ODR violation". In https://reviews.llvm.org/D153003 we discuss using declarations, but IIRC it's also possible to construct problematic cases with (nested) namespaces, top-level :: prefixes, and template template parameters. Taken together, my conclusion from the discussion above is that ODRHash is simply not the right method to find template specialization parameters in a map.
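To make the bucket requirement above concrete, here is a minimal toy sketch. It is not Clang code: ToyTemplateArg, toyHash and ToySpecializationTable are hypothetical names invented for illustration, and the "canonical" string is simply assumed to be available. It only shows that a table keyed on a spelling-sensitive hash can look in the wrong bucket, while one keyed on the meaning always finds the entry.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a template argument: how it was written in the
// source, and the entity it denotes after canonicalization.
struct ToyTemplateArg {
  std::string Spelling;  // e.g. "::ns::T" or "ns::T"
  std::string Canonical; // e.g. "ns::T" for both spellings above
};

// FNV-1a over the bytes of a string; used only so that the sketch does not
// depend on implementation details of std::hash.
inline std::uint64_t toyHash(const std::string &S) {
  std::uint64_t H = 14695981039346656037ull;
  for (unsigned char C : S) {
    H ^= C;
    H *= 1099511628211ull;
  }
  return H;
}

// Maps a hash value to the IDs of the specializations stored under it.
class ToySpecializationTable {
public:
  explicit ToySpecializationTable(bool HashCanonical)
      : HashCanonical(HashCanonical) {}

  void add(const ToyTemplateArg &Arg, unsigned DeclID) {
    Buckets[key(Arg)].push_back(DeclID);
  }

  // Returns the candidates in the bucket for Arg. A real implementation would
  // still compare the candidates in detail, since distinct arguments may
  // share a hash value.
  std::vector<unsigned> lookup(const ToyTemplateArg &Arg) const {
    auto It = Buckets.find(key(Arg));
    if (It == Buckets.end())
      return {};
    return It->second;
  }

private:
  std::uint64_t key(const ToyTemplateArg &Arg) const {
    return toyHash(HashCanonical ? Arg.Canonical : Arg.Spelling);
  }

  bool HashCanonical;
  std::unordered_map<std::uint64_t, std::vector<unsigned>> Buckets;
};
```

With a table constructed as ToySpecializationTable(true), an entry added under the spelling "ns::T" is found again when queried with "::ns::T", because both share the canonical form. With ToySpecializationTable(false) the two spellings are hashed separately, so the query will normally land in a different bucket and miss the entry.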
Great analysis. Fair enough, let's find a method to proceed.
I tried to add a test case to show the problem in 9b808a4. But the current patch works well for that. While I agree that ODRHash may be too aggressive for the problem we're solving, I don't want to write things that can't be well tested. I am wondering if we can proceed by leaving a FIXME here if we can't find a good test in time? Or maybe we can add an option -fload-specialization-lazily, so that we can fall back smoothly if there are any problems.
It looks like the qualification-related problems in ODRHash (at least some of them) are fixed in https://reviews.llvm.org/D156210
I guess the comment we are discussing is here: https://reviews.llvm.org/D154324#4524368 by @zygoloid:
"
...
For D41416, ODR hashing may not be the best mechanism to hash the template arguments, unfortunately. ODR hashing is (or perhaps, should be) about determining whether two things are spelled the same way and have the same meaning (as required by the C++ ODR), whereas I think what you're looking for is whether they have the same meaning regardless of spelling. Maybe we can get away with reusing ODR hashing anyway, on the basis that any canonical, non-dependent template argument should have the same (invented) spelling in every translation unit, but I'm not certain that's true in all cases. There may still be cases where the canonical type includes some aspect of "whatever we saw first", in which case the ODR hash can differ across translation units for non-dependent, canonical template arguments that are spelled differently but have the same meaning, though I can't think of one off-hand.
"
Yeah, I just saw it. My concern with inventing a new hash mechanism is how we can make sure it is correct. It may not be hard to write a new hasher, but I worry that it may not be well tested. I prefer to go step by step.
If the example of @hahnjo works, perhaps a FIXME referring to this discussion should be sufficient and we can revisit the issue once we have an example that breaks.
Done
Hi Vassilev, for testing purposes I pushed https://github.com/ChuanqiXu9/llvm-project/tree/LoadSpecializationUpdatesLazily. I didn't create a stacked review since I feel a standalone branch may be sufficient.
IIUC, it looks like what I do in ChuanqiXu9@7f027f0#diff-c61a3cce4bfa099b5af032fa83cbf1563f0af4bf58dc112b39571d74b6b681c1R3487-R3499. But I don't want to do that in this patch, since we can avoid loading the hash table if the template decl is not loaded.
…iased template args This is a test for #76774. In the review comments, we were concerned about the case where ODRHash may produce different hash values for semantically identical template arguments. For example, if the template argument in a specialization is not qualified and the semantically identical template argument at the point of instantiation is qualified, we should still be able to select that template specialization. This patch tests that behavior: we should be able to select the correct specialization with semantically identical template arguments.
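The language-level behavior that such a test relies on can be sketched without modules. This is only an illustration under that simplification, not the test from the commit above; the names ns::T, Alias, A and the pick* helpers are made up here. However the argument is spelled, the same explicit specialization must be selected.

```cpp
#include <cassert>
#include <type_traits>

namespace ns {
struct T {};
} // namespace ns

// An alias naming the same type as ns::T.
using Alias = ns::T;

// Primary template and one explicit specialization, written with "ns::T".
template <class X> struct A {
  static constexpr int value = 0;
};
template <> struct A<ns::T> {
  static constexpr int value = 1;
};

// Three spellings of the same template argument.
inline int pickFullyQualified() { return A< ::ns::T>::value; }
inline int pickViaAlias() { return A<Alias>::value; }
namespace ns {
inline int pickUnqualified() { return A<T>::value; }
} // namespace ns

// All spellings denote one and the same specialization.
static_assert(std::is_same<A<Alias>, A< ::ns::T>>::value,
              "alias and qualified name must name the same specialization");
```

Each helper returns the value of the explicit specialization rather than the primary template. A lazily loaded lookup table has to preserve exactly this property across module files, which is why the hash must not depend on the spelling.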
@ChuanqiXu9, I'd prefer to review both patches at the same time. Otherwise we risk missing some important details.
Got it. I can try to create a stacked review. But from what I know about the current state of stacked reviews, it would require us to lose the current context... It would still be pretty valuable if you could test this with your internal workloads; then maybe we can find something important at the high level before going into the details. I've tested this on our local workloads, and it looks good and the performance improvements remain. But I know our use of modules may not be as complex as yours.
I would just push the second commit here. It should be good enough.
79cefc9 to 50fd47f
I failed to use spr to create a stacked review... So I just created the stacked PR manually: #77417. Luckily the context is retained. I heard the current context may be lost if we switch to spr now.
Overall this looks quite promising to me. Have you run that patch on bigger workflows? Do we have some performance numbers to compare?
I will run some tests on our infrastructure and report back.
clang/lib/AST/ExternalASTSource.cpp
Outdated
@@ -100,6 +100,11 @@ ExternalASTSource::FindExternalVisibleDeclsByName(const DeclContext *DC,
  return false;
}

void ExternalASTSource::LoadExternalSpecializations(
    const Decl *D, ArrayRef<TemplateArgument> TemplateArgs) {
  return;
return;
Will do in the next cycle.
@ilya-biryukov any chance you/your folks could test this change for performance implications at Google? It's especially helpful to CERN, but the last iteration of this direction had some regressions that stalled out progress on that version a few years ago, so it'd be good to help poke this along while making sure it doesn't cause release hiccups/etc for Google.
I've tested its functionality on our largest modules workload. It runs well. But while our use of modules is large in scale, it doesn't have a lot of complexity. For performance, I plan to measure it this week. It is a little bit of additional work since I need to compile the compiler with different optimizations to have a fair comparison.
@ChuanqiXu9, this PR does not seem to compile. Can you make the second commit work before I start testing?
fd2d753 to 43648e5
Oh, sorry. It should work now.
Update: Previously we will always try to load the specializations with the This patch tries to improve this by trying to load the specializations But I didn't observe significant improvement with this change locally.
I feel it would be better to drop D153003 if it is not a blocking issue. I feel it makes things more complicated...
If I drop it then our tests will break. IIUC that's somewhere deep in the hasher and should not impact this PR. Does this make the work on the on-disk hashtable more complicated in some way?
No, it won't block the work on the on-disk hashtable. But if we want to land that, we must understand what actually happened...
We can't land that without attaching the on-disk hashtable implementation part of this PR because of what's mentioned here #76774 (comment)
I know that. But we're not talking about the same thing. This is one of the reasons that we can't land that. But my point is that we can't land that if we don't understand what goes wrong without that patch.
We understand that very well and it's described in https://reviews.llvm.org/D153003 as well as the surrounding discussions: because of the way that
(and IIRC that's the same direction that Richard was going) |
Yeah, so I tried to fix that in the following patches. And if that works, I expect it can fix the internal errors in your workloads.
Let's zoom out a little. The approach in D41416 shows that it is feasible to store a hash of the template arguments to delay eager deserializations. The ODR hash approach is a second order problem because we can swap it with something better once we need to. In order to make progress we have introduced D153003 which allows our infrastructure to work. The way I see moving forward here is:
Yeah, no problem at all. This is what I want at the higher level too. What I am confused about is the status of D153003. If it is true that we've described the problem completely on the review page, then c31d6b4 should be a proper fix for it.
I can try it on our infrastructure and if it works I will remove D153003.
Sorry for losing track of the discussion here. What is the current status? Should we run another round of testing? Also, I see proposals to land the new behaviour under a flag and have it off by default.
@ilya-biryukov, this PR is not ready to test. However, I'd appreciate it if you could test our baseline patch located here: https://github.com/llvm/llvm-project/tree/users/vgvassilev/D41416_D153003 on you
@ChuanqiXu9, you were right. We seem to not need D153003 and I have removed it from the branch.
Yeah, then let's create a new branch (the existing
Oh, I didn't notice you've already removed D153003. But the branch name doesn't look good, so I've created a PR in #83108
@vgvassilev Let me try to double check your advice. In your suggestion, you suggest to replace
Following up for #83108 This follows the suggestion literally from #76774 (comment) which introduces OnDiskHashTable for specializations based on D41416. Note that I didn't polish this patch, to reduce the diff from D41416 and make it easier to review. I'll make the polishing patch later, so that we can focus on what we're doing in this patch and on the style in the next patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am happy to defer to @vgvassilev et al. on this one.
Given we're pursuing the #83237 series, I'll close this one.
…zations when looking for one. fmt [Serialization] Introduce OnDiskHashTable for specializations Following up for llvm#83108 This follows the suggestion literally from llvm#76774 (comment) which introduces OnDiskHashTable for specializations based on D41416. Note that I didn't polish this patch to reduce the diff from D41416 to it easier to review. I'll make the polishing patch later. So that we can focus what we're doing in this patch and focus on the style in the next patch. [Serialization] Code cleanups and polish 83233 fmt load specializations before writing specialization decls address comments Revert "load specializations before writing specialization decls" This reverts commit 61c451d. Do not omit data from imported modules with same key Handle merging spec info manually
The idea comes from @vgvassilev, who had a patch for it on phab. Unfortunately phab is closed and I forgot the Dxxx number of that patch. But I remember the last comment from @vgvassilev was that we should use MultiOnDiskHashTable for it. So I followed that and rewrote the whole thing from scratch in the new year.
Background
Currently all the specializations of a template (including instantiations, specializations and partial specializations) will be loaded at once if we want to instantiate another instance of the template, find an instantiation of the template, or just complete the redecl chain.
This basically means we need to load every specialization of the template once the template declaration gets loaded. This is bad since when we load a specialization, we need to load all of its template arguments. Then we have to deserialize a lot of unnecessary declarations.
For example,
We have a specialization A<ShouldNotBeLoaded> in M.cppm and we instantiate the template A in use.cpp. Then, surprisingly, we will deserialize ShouldNotBeLoaded when compiling use.cpp. This patch tries to avoid that.
Given that templates are heavily used in C++, this is a pain point for performance.
What this patch did
This patch adds a MultiOnDiskHashTable for specializations in the ASTReader. Then we will only deserialize the specializations with the same template arguments. We do that by using the ODRHash of the template arguments as the key of the hash table.
The partial specializations are not added to the MultiOnDiskHashTable, since we can't know if a partial specialization is needed before deciding the template declaration for an instantiation request. There may be room for further optimizations, but let's do that in the future.
To review this patch, I think ASTReaderDecl::AddLazySpecializations may be a good entry point.
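As a rough mental model of the lazy strategy described under "What this patch did", here is a toy sketch. It is not the ASTReader implementation: ToyModuleFile and ToyReader are hypothetical, a std::multimap stands in for the on-disk hash table, and "deserializing" a declaration is reduced to recording its ID. Following the description, partial specializations are kept outside the table and are always loaded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical model of what a module file stores for one template.
struct ToyModuleFile {
  // Hash of the template arguments -> ID of a full specialization.
  std::multimap<std::uint64_t, unsigned> SpecsByArgHash;
  // Partial specializations are not keyed by argument hash.
  std::vector<unsigned> PartialSpecs;
};

class ToyReader {
public:
  explicit ToyReader(ToyModuleFile F) : File(std::move(F)) {}

  // Eager strategy: every specialization is deserialized at once.
  void loadAll() {
    for (const auto &Entry : File.SpecsByArgHash)
      deserialize(Entry.second);
    for (unsigned ID : File.PartialSpecs)
      deserialize(ID);
  }

  // Lazy strategy: partial specializations are still loaded, but full
  // specializations only if their argument hash matches the request.
  void loadFor(std::uint64_t ArgHash) {
    for (unsigned ID : File.PartialSpecs)
      deserialize(ID);
    auto Range = File.SpecsByArgHash.equal_range(ArgHash);
    for (auto It = Range.first; It != Range.second; ++It)
      deserialize(It->second);
  }

  std::size_t numLoaded() const { return Loaded.size(); }
  bool isLoaded(unsigned ID) const { return Loaded.count(ID) != 0; }

private:
  // Stand-in for real deserialization: just remember the ID.
  void deserialize(unsigned ID) { Loaded.insert(ID); }

  ToyModuleFile File;
  std::set<unsigned> Loaded;
};
```

In this model, a request for one argument hash touches only the matching bucket plus the partial specializations, whereas loadAll() touches every entry. Specializations with unrelated arguments, and therefore the declarations they refer to, are never visited in the lazy case.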