Skip to content

[CGData] Make an option to skip reading Names into StableFunctionMap #142095

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 10, 2025

Conversation

nocchijiang
Copy link
Contributor

@nocchijiang nocchijiang commented May 30, 2025

Names are used for debugging purpose and have no impact on codegen. For a non-trivial project, reading them consumes a lot of memory and slows down the compilation significantly. This patch adds a field in the indexed CGData to remember the total size of Names, and creates a command-line option to skip reading Names by advancing the pointer when deserializing the indexed CGData.

…uilds

Names are used for debugging purpose and have no impact on codegen. For
a non-trivial project, reading them consumes a lot of memory and slows
down the compilation significantly. This patch adds a field in the
serialized CGData to remember the total size of Names, and skips reading
Names by advancing the pointer when deserializing for the Use mode in
non-assertion builds.
@nocchijiang
Copy link
Contributor Author

@kyulee-com

Copy link
Contributor

@kyulee-com kyulee-com left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great!

: DataBuffer(std::move(DataBuffer)) {}
IndexedCodeGenDataReader(std::unique_ptr<MemoryBuffer> DataBuffer,
Options Opts)
: DataBuffer(std::move(DataBuffer)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
: DataBuffer(std::move(DataBuffer)) {
: DataBuffer(std::move(DataBuffer)), Opts(Opts) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decide to fully remove Opts for simplicity.

@@ -155,7 +155,14 @@ CodeGenData &CodeGenData::getInstance() {
// Instead, just emit an warning message and fall back as if no CGData
// were available.
auto FS = vfs::getRealFileSystem();
auto ReaderOrErr = CodeGenDataReader::create(CodeGenDataUsePath, *FS);
CodeGenDataReader::Options Opts;
#ifdef NDEBUG
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to have consistent behavior between debug and release builds.
Also we can allow users to have explicit control regardless of build types.
It'd be simpler and maintainable without any confusion as the constructor sets it as true.

Can we define a command line flag, and set it based on that? Something like:

static cl::opt<bool> CodeGenDataReadFunctionMapNames(
    "codegen-data-read-function-map-names", cl::init(true), cl::Hidden,
    cl::desc("Read function map names in CodeGenData (disable to save memory and time)"));

...
Opts.ReadStableFunctionMapNames = CodeGenDataReadFunctionMapNames;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this is exactly what I initially thought. Maybe I misunderstood your response in the RFC thread.

@kyulee-com kyulee-com requested a review from ellishg June 4, 2025 19:16
@nocchijiang nocchijiang changed the title [CGData] Skip reading Names into StableFunctionMap in non-assertion b… [CGData] Make an option to skip reading Names into StableFunctionMap Jun 6, 2025
Copy link
Contributor

@kyulee-com kyulee-com left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nocchijiang
Copy link
Contributor Author

@kyulee-com Could you please merge this PR for me?

@kyulee-com kyulee-com merged commit 7137021 into llvm:main Jun 10, 2025
8 checks passed
rorth pushed a commit to rorth/llvm-project that referenced this pull request Jun 11, 2025
…lvm#142095)

Names are used for debugging purpose and have no impact on codegen. For
a non-trivial project, reading them consumes a lot of memory and slows
down the compilation significantly. This patch adds a field in the
indexed CGData to remember the total size of Names, and creates a
command-line option to skip reading Names by advancing the pointer when
deserializing the indexed CGData.
tomtor pushed a commit to tomtor/llvm-project that referenced this pull request Jun 14, 2025
…lvm#142095)

Names are used for debugging purpose and have no impact on codegen. For
a non-trivial project, reading them consumes a lot of memory and slows
down the compilation significantly. This patch adds a field in the
indexed CGData to remember the total size of Names, and creates a
command-line option to skip reading Names by advancing the pointer when
deserializing the indexed CGData.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants