Use rich mangling information in Symtab::InitNameIndexes()
Summary:
I set up a new review, because not all the code I touched was marked as a change in the old one anymore.

In preparation for this review, there were two earlier ones:
* https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes
* https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function

Primary goals for this patch are:
(1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index.
(2) Provide a uniform interface.
(3) Improve indexing performance.

The central implementation in this patch is our new function for explicit demangling:
```
const RichManglingInfo *
Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *)
```

It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are:
* `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`).
* `RichManglingContext` works a bit like `LLVMContext`: it stores the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing.

The idea here is that `DemangleWithRichManglingInfo()` acts like a gatekeeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Storing it within the context avoids extra heap allocations and aids (3). As instantiations of the IPD are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle, and it helps to mimic the original behavior of `InitNameIndexes`.
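
The gatekeeper idea can be sketched in a few lines of standalone C++. This is only an illustration under stated assumptions: `SketchInfo`, `SketchContext`, `SkipNameFn`, `DemangleSketch` and `SkipShortNames` are hypothetical names, and the "parser" is a toy prefix check, not the ItaniumPartialDemangler.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration; none of these are LLDB types.
struct SketchInfo {
  std::string base_name;
  bool is_ctor_or_dtor = false;
};

// The context owns the info object, so one instance can serve many queries.
struct SketchContext {
  SketchInfo info;
};

// Filter signature: return true to skip a name before any parsing happens.
using SkipNameFn = bool(const std::string &);

// Gatekeeper: returns a pointer into `ctx` on success and nullptr otherwise,
// so the getters of SketchInfo never have to handle a "no info" state.
const SketchInfo *DemangleSketch(const std::string &mangled,
                                 SketchContext &ctx, SkipNameFn *skip) {
  if (skip && skip(mangled))
    return nullptr;
  // Toy "parser" (assumption): accept only names with an Itanium-like prefix.
  if (mangled.compare(0, 2, "_Z") != 0)
    return nullptr;
  ctx.info.base_name = mangled.substr(2);
  ctx.info.is_ctor_or_dtor = false;
  return &ctx.info;
}

// Example filter (assumption): skip names shorter than four characters.
bool SkipShortNames(const std::string &name) { return name.size() < 4; }
```

A caller would keep one `SketchContext` on the stack, call `DemangleSketch()` for each symbol, and read the result only when the returned pointer is non-null. The pointer stays valid only until the next call with the same context.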

Future potential:
* `DemangleWithRichManglingInfo()` is thread-safe if and only if different contexts are used in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.)
* The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages.
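
As a rough illustration of the one-context-per-thread rule, the following sketch gives every worker thread its own context and never shares one between threads. Nothing like this is part of the patch; `WorkerContext`, `ProcessName` and `IndexInParallel` are hypothetical names, and the per-name work is a placeholder for a real demangling query.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical per-thread context with a scratch buffer that gets reused.
struct WorkerContext {
  std::string scratch;
  std::size_t processed = 0;
};

// Placeholder for a query that reads `name` and mutates only `ctx`.
void ProcessName(const std::string &name, WorkerContext &ctx) {
  ctx.scratch = name; // overwrite the reused buffer
  ++ctx.processed;
}

// Splits `names` into contiguous chunks and handles each chunk on its own
// thread with its own context. Returns the number of names processed.
std::size_t IndexInParallel(const std::vector<std::string> &names,
                            std::size_t num_threads) {
  if (num_threads == 0)
    num_threads = 1;
  std::vector<WorkerContext> contexts(num_threads);
  std::vector<std::thread> workers;
  const std::size_t chunk = (names.size() + num_threads - 1) / num_threads;
  for (std::size_t t = 0; t < num_threads; ++t) {
    workers.emplace_back([&names, &contexts, t, chunk] {
      const std::size_t begin = t * chunk;
      const std::size_t end = std::min(names.size(), begin + chunk);
      for (std::size_t i = begin; i < end; ++i)
        ProcessName(names[i], contexts[t]); // only thread t touches contexts[t]
    });
  }
  for (std::thread &w : workers)
    w.join();
  std::size_t total = 0;
  for (const WorkerContext &c : contexts)
    total += c.processed;
  return total;
}
```

The shared input is only read, and each context is written by exactly one thread, which is the condition the bullet above states for thread safety.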

One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type-erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward declarations (but that is a rather big change I guess).
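
The type-erasure workaround can be illustrated with the standard library alone. The sketch below uses `std::any` (C++17) as a stand-in for `llvm::Any`; `InfoHolder`, `Language::MethodName` and `BaseNameVia` are hypothetical names, and unlike the patch, which asserts on a type mismatch, this version returns null.

```cpp
#include <any>
#include <cassert>
#include <string>

// "Header" part: stores a pointer without naming the concrete parser type.
class InfoHolder {
public:
  template <class ParserT> void set(ParserT *parser) { m_parser = parser; }

  // Casts back to the requested pointer type; yields nullptr when the holder
  // is empty or holds a pointer of a different type.
  template <class ParserT> ParserT *get() const {
    ParserT *const *slot = std::any_cast<ParserT *>(&m_parser);
    return slot ? *slot : nullptr;
  }

private:
  std::any m_parser; // type-erased parser pointer
};

// "Implementation" part: a nested class, which cannot be forward declared.
struct Language {
  struct MethodName {
    std::string full;
    // Returns the component after the last "::", or the whole name.
    std::string basename() const {
      const std::string::size_type pos = full.rfind("::");
      return pos == std::string::npos ? full : full.substr(pos + 2);
    }
  };
};

// Only this code needs the full definition of Language::MethodName.
std::string BaseNameVia(const InfoHolder &holder) {
  Language::MethodName *parser = holder.get<Language::MethodName>();
  return parser ? parser->basename() : std::string();
}
```

The holder keeps a non-owning pointer, so the parser object has to outlive every access through `get()`.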

First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s and 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think.

Reviewers: labath, jingham, JDevlieghere, erik.pilkington

Subscribers: zturner, clayborg, mgorny, lldb-commits

Differential Revision: https://reviews.llvm.org/D50071

llvm-svn: 339291
weliveindetail committed Aug 8, 2018
1 parent f71dd34 commit f1a98df
Showing 11 changed files with 730 additions and 147 deletions.
46 changes: 36 additions & 10 deletions lldb/include/lldb/Core/Mangled.h
@@ -11,18 +11,15 @@
#define liblldb_Mangled_h_
#if defined(__cplusplus)

#include "lldb/lldb-enumerations.h"
#include "lldb/lldb-forward.h"

#include "lldb/Utility/ConstString.h"
#include "lldb/lldb-enumerations.h" // for LanguageType
#include "llvm/ADT/StringRef.h" // for StringRef

#include <stddef.h> // for size_t
#include "llvm/ADT/StringRef.h"

namespace lldb_private {
class RegularExpression;
}
namespace lldb_private {
class Stream;
}
#include <memory>
#include <stddef.h>

namespace lldb_private {

@@ -238,7 +235,6 @@ class Mangled {
return true;
return GetDemangledName(language) == name;
}

bool NameMatches(const RegularExpression &regex,
lldb::LanguageType language) const;

@@ -300,6 +296,36 @@ class Mangled {
//----------------------------------------------------------------------
lldb::LanguageType GuessLanguage() const;

/// Function signature for filtering mangled names.
using SkipMangledNameFn = bool(llvm::StringRef, ManglingScheme);

//----------------------------------------------------------------------
/// Trigger explicit demangling to obtain rich mangling information. This is
/// optimized for batch processing while populating a name index. To get the
/// pure demangled name string for a single entity, use GetDemangledName()
/// instead.
///
/// For names that match the Itanium mangling scheme, this uses LLVM's
/// ItaniumPartialDemangler. All other names fall back to LLDB's builtin
/// parser currently.
///
/// This function is thread-safe when used with different \a context
/// instances in different threads.
///
/// @param[in] context
/// The context for this function. A single instance can be stack-
/// allocated in the caller's frame and used for multiple calls.
///
/// @param[in] skip_mangled_name
/// A filtering function for skipping entities based on name and mangling
/// scheme. This can be null if unused.
///
/// @return
/// True on success, false otherwise.
//----------------------------------------------------------------------
bool DemangleWithRichManglingInfo(RichManglingContext &context,
SkipMangledNameFn *skip_mangled_name);

private:
//----------------------------------------------------------------------
/// Mangled member variables.
110 changes: 110 additions & 0 deletions lldb/include/lldb/Core/RichManglingContext.h
@@ -0,0 +1,110 @@
//===-- RichManglingContext.h -----------------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//

#ifndef liblldb_RichManglingContext_h_
#define liblldb_RichManglingContext_h_

#include "lldb/lldb-forward.h"
#include "lldb/lldb-private.h"

#include "lldb/Utility/ConstString.h"

#include "llvm/ADT/Any.h"
#include "llvm/ADT/SmallString.h"
#include "llvm/Demangle/Demangle.h"

namespace lldb_private {

/// Uniform wrapper for access to rich mangling information from different
/// providers. See Mangled::DemangleWithRichManglingInfo()
class RichManglingContext {
public:
RichManglingContext()
: m_provider(None), m_ipd_buf_size(2048), m_ipd_str_len(0) {
m_ipd_buf = static_cast<char *>(std::malloc(m_ipd_buf_size));
m_ipd_buf[m_ipd_str_len] = '\0';
}

~RichManglingContext() { std::free(m_ipd_buf); }

/// Use the ItaniumPartialDemangler to obtain rich mangling information from
/// the given mangled name.
bool FromItaniumName(const ConstString &mangled);

/// Use the legacy language parser implementation to obtain rich mangling
/// information from the given demangled name.
bool FromCxxMethodName(const ConstString &demangled);

/// If this symbol describes a constructor or destructor.
bool IsCtorOrDtor() const;

/// If this symbol describes a function.
bool IsFunction() const;

/// Get the base name of a function. This doesn't include trailing template
/// arguments, ie "a::b<int>" gives "b". The result will overwrite the
/// internal buffer. It can be obtained via GetBufferRef().
void ParseFunctionBaseName();

/// Get the context name for a function. For "a::b::c", this function returns
/// "a::b". The result will overwrite the internal buffer. It can be obtained
/// via GetBufferRef().
void ParseFunctionDeclContextName();

/// Get the entire demangled name. The result will overwrite the internal
/// buffer. It can be obtained via GetBufferRef().
void ParseFullName();

/// Obtain a StringRef to the internal buffer that holds the result of the
/// most recent ParseXy() operation. The next ParseXy() call invalidates it.
llvm::StringRef GetBufferRef() const {
assert(m_provider != None && "Initialize a provider first");
return m_buffer;
}

private:
enum InfoProvider { None, ItaniumPartialDemangler, PluginCxxLanguage };

/// Selects the rich mangling info provider.
InfoProvider m_provider;

/// Reference to the buffer used for results of ParseXy() operations.
llvm::StringRef m_buffer;

/// Members for ItaniumPartialDemangler
llvm::ItaniumPartialDemangler m_ipd;
char *m_ipd_buf;
size_t m_ipd_buf_size;
size_t m_ipd_str_len;

/// Members for PluginCxxLanguage
/// Cannot forward declare inner class CPlusPlusLanguage::MethodName. The
/// respective header is in Plugins and including it from here causes cyclic
/// dependency. Instead keep a llvm::Any and cast it on-access in the cpp.
llvm::Any m_cxx_method_parser;

/// Clean up memory and set a new info provider for this instance.
void ResetProvider(InfoProvider new_provider);

/// Uniform handling of string buffers for ItaniumPartialDemangler.
void processIPDStrResult(char *ipd_res, size_t res_len);

/// Cast the given parser to the given type. Ideally we would have a type
/// trait to deduce \a ParserT from a given InfoProvider, but unfortunately we
/// can't access CPlusPlusLanguage::MethodName from within the header.
template <class ParserT> static ParserT *get(llvm::Any parser) {
assert(parser.hasValue());
assert(llvm::any_isa<ParserT *>(parser));
return llvm::any_cast<ParserT *>(parser);
}
};

} // namespace lldb_private

#endif
9 changes: 9 additions & 0 deletions lldb/include/lldb/Symbol/Symtab.h
@@ -197,6 +197,15 @@ class Symtab {
void SymbolIndicesToSymbolContextList(std::vector<uint32_t> &symbol_indexes,
SymbolContextList &sc_list);

void RegisterMangledNameEntry(
NameToIndexMap::Entry &entry, std::set<const char *> &class_contexts,
std::vector<std::pair<NameToIndexMap::Entry, const char *>> &backlog,
RichManglingContext &rmc);

void RegisterBacklogEntry(const NameToIndexMap::Entry &entry,
const char *decl_context,
const std::set<const char *> &class_contexts);

DISALLOW_COPY_AND_ASSIGN(Symtab);
};

11 changes: 11 additions & 0 deletions lldb/include/lldb/lldb-forward.h
@@ -191,6 +191,7 @@ class RegisterLocationList;
class RegisterValue;
class RegularExpression;
class REPL;
class RichManglingContext;
class Scalar;
class ScriptInterpreter;
class ScriptInterpreterLocker;
@@ -492,5 +493,15 @@ typedef std::shared_ptr<lldb_private::Watchpoint> WatchpointSP;

} // namespace lldb

//----------------------------------------------------------------------
// llvm forward declarations
//----------------------------------------------------------------------
namespace llvm {

struct ItaniumPartialDemangler;
class StringRef;

} // namespace llvm

#endif // #if defined(__cplusplus)
#endif // LLDB_lldb_forward_h_
20 changes: 16 additions & 4 deletions lldb/lldb.xcodeproj/project.pbxproj
@@ -487,6 +487,9 @@
8C3BD9961EF45DA50016C343 /* MainThreadCheckerRuntime.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 8C3BD9951EF45D9B0016C343 /* MainThreadCheckerRuntime.cpp */; };
2689004313353E0400698AC0 /* Mangled.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 26BC7E8010F1B85900F91463 /* Mangled.cpp */; };
4F29D3CF21010FA3003B549A /* MangledTest.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 4F29D3CD21010F84003B549A /* MangledTest.cpp */; };
4FBC04EF211A06820015A814 /* RichManglingContext.h in Headers */ = {isa = PBXBuildFile; fileRef = 4FBC04EE211A06820015A814 /* RichManglingContext.h */; };
4FBC04ED211A06200015A814 /* RichManglingContext.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 4FBC04EC211A06200015A814 /* RichManglingContext.cpp */; };
4FBC04F5211A13770015A814 /* RichManglingContextTest.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 4FBC04F3211A0F0F0015A814 /* RichManglingContextTest.cpp */; };
4CD44CFC20B37C440003557C /* ManualDWARFIndex.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 4CD44CF920B37C440003557C /* ManualDWARFIndex.cpp */; };
49DCF702170E70120092F75E /* Materializer.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 49DCF700170E70120092F75E /* Materializer.cpp */; };
2690B3711381D5C300ECFBAE /* Memory.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 2690B3701381D5C300ECFBAE /* Memory.cpp */; };
@@ -2198,6 +2201,9 @@
26BC7E8010F1B85900F91463 /* Mangled.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = Mangled.cpp; path = source/Core/Mangled.cpp; sourceTree = "<group>"; };
26BC7D6910F1B77400F91463 /* Mangled.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = Mangled.h; path = include/lldb/Core/Mangled.h; sourceTree = "<group>"; };
4F29D3CD21010F84003B549A /* MangledTest.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = MangledTest.cpp; sourceTree = "<group>"; };
4FBC04EE211A06820015A814 /* RichManglingContext.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = RichManglingContext.h; path = include/lldb/Core/RichManglingContext.h; sourceTree = "<group>"; };
4FBC04EC211A06200015A814 /* RichManglingContext.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = RichManglingContext.cpp; path = source/Core/RichManglingContext.cpp; sourceTree = "<group>"; };
4FBC04F3211A0F0F0015A814 /* RichManglingContextTest.cpp */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.cpp.cpp; path = RichManglingContextTest.cpp; sourceTree = "<group>"; };
4CD44CF920B37C440003557C /* ManualDWARFIndex.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = ManualDWARFIndex.cpp; sourceTree = "<group>"; };
4CD44D0020B37C580003557C /* ManualDWARFIndex.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = ManualDWARFIndex.h; sourceTree = "<group>"; };
2682100C143A59AE004BCF2D /* MappedHash.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = MappedHash.h; path = include/lldb/Core/MappedHash.h; sourceTree = "<group>"; };
@@ -3676,13 +3682,14 @@
23CB14E51D66CBEB00EDDDE1 /* Core */ = {
isa = PBXGroup;
children = (
23CB14E61D66CC0E00EDDDE1 /* BroadcasterTest.cpp */,
23CB14E71D66CC0E00EDDDE1 /* CMakeLists.txt */,
23CB14E81D66CC0E00EDDDE1 /* DataExtractorTest.cpp */,
58A080B12112AB2200D5580F /* HighlighterTest.cpp */,
4F29D3CD21010F84003B549A /* MangledTest.cpp */,
9A3D43E31F3237D500EB767C /* ListenerTest.cpp */,
4F29D3CD21010F84003B549A /* MangledTest.cpp */,
4FBC04F3211A0F0F0015A814 /* RichManglingContextTest.cpp */,
9A3D43E11F3237D500EB767C /* StreamCallbackTest.cpp */,
23CB14E71D66CC0E00EDDDE1 /* CMakeLists.txt */,
23CB14E61D66CC0E00EDDDE1 /* BroadcasterTest.cpp */,
23CB14E81D66CC0E00EDDDE1 /* DataExtractorTest.cpp */,
);
path = Core;
sourceTree = "<group>";
@@ -5047,6 +5054,8 @@
26BC7D7110F1B77400F91463 /* PluginManager.h */,
26BC7E8A10F1B85900F91463 /* PluginManager.cpp */,
2626B6AD143E1BEA00EF935C /* RangeMap.h */,
4FBC04EE211A06820015A814 /* RichManglingContext.h */,
4FBC04EC211A06200015A814 /* RichManglingContext.cpp */,
26BC7CF910F1B71400F91463 /* SearchFilter.h */,
26BC7E1510F1B83100F91463 /* SearchFilter.cpp */,
26BC7D7510F1B77400F91463 /* Section.h */,
@@ -6955,6 +6964,7 @@
2619C4862107A9A2009CDE81 /* RegisterContextMinidump_ARM64.h in Headers */,
AF235EB11FBE77B6009C5541 /* RegisterContextPOSIX_ppc64le.h in Headers */,
267F68501CC02E270086832B /* RegisterContextPOSIXCore_s390x.h in Headers */,
4FBC04EF211A06820015A814 /* RichManglingContext.h in Headers */,
4984BA181B979C08008658D4 /* ExpressionVariable.h in Headers */,
26C7C4841BFFEA7E009BD01F /* WindowsMiniDump.h in Headers */,
30B38A001CAAA6D7009524E3 /* ClangUtil.h in Headers */,
@@ -7473,6 +7483,7 @@
9A2057181F3B861400F6C293 /* TestType.cpp in Sources */,
9A2057171F3B861400F6C293 /* TestDWARFCallFrameInfo.cpp in Sources */,
4F29D3CF21010FA3003B549A /* MangledTest.cpp in Sources */,
4FBC04F5211A13770015A814 /* RichManglingContextTest.cpp in Sources */,
9A3D43EC1F3237F900EB767C /* ListenerTest.cpp in Sources */,
9A3D43DC1F3151C400EB767C /* TimeoutTest.cpp in Sources */,
9A3D43D61F3151C400EB767C /* ConstStringTest.cpp in Sources */,
@@ -7590,6 +7601,7 @@
4C0083401B9F9BA900D5CF24 /* UtilityFunction.cpp in Sources */,
AF415AE71D949E4400FCE0D4 /* x86AssemblyInspectionEngine.cpp in Sources */,
26474CCD18D0CB5B0073DEBA /* RegisterContextPOSIX_x86.cpp in Sources */,
4FBC04ED211A06200015A814 /* RichManglingContext.cpp in Sources */,
AEB0E4591BD6E9F800B24093 /* LLVMUserExpression.cpp in Sources */,
2689FFEF13353DB600698AC0 /* Breakpoint.cpp in Sources */,
267A47FB1B1411C40021A5BC /* NativeRegisterContext.cpp in Sources */,
1 change: 1 addition & 0 deletions lldb/source/Core/CMakeLists.txt
@@ -34,6 +34,7 @@ add_lldb_library(lldbCore
ModuleList.cpp
Opcode.cpp
PluginManager.cpp
RichManglingContext.cpp
SearchFilter.cpp
Section.cpp
SourceManager.cpp