Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reland "[TypeProf][InstrPGO] Introduce raw and instr profile format change for type profiling." #82711

Merged
merged 24 commits into from
Feb 27, 2024

Conversation

minglotus-6
Copy link
Contributor

@minglotus-6 minglotus-6 commented Feb 23, 2024

New change on top of reviewed patch are in commits after this one. Previous commits are restored from the remote branch with timestamps.

  1. Fix build breakage for non-ELF platforms, by defining the missing functions {__llvm_profile_begin_vtables, __llvm_profile_end_vtables, __llvm_profile_begin_vtabnames , __llvm_profile_end_vtabnames} everywhere.

    • Tested on mac laptop (for darwins) and Windows. Specifically, functions in InstrProfilingPlatformWindows.c returns NULL to make it more explicit that type prof isn't supported; see comments for the reason.
    • For the rest (AIX, other), mostly follow existing examples (like this one)
  2. Rename __llvm_prf_vtabnames -> __llvm_prf_vns for shorter section name, and make returned pointers const

Original Description

  • Raw profile format
    • Header: records the byte size of compressed vtable names, and the number of profiled vtable entries (call it VTableProfData). Header also records padded bytes of each section.
    • Payload: adds a section for compressed vtable names, and a section to store VTableProfData. Both sections are padded so the size is a multiple of 8.
  • Indexed profile format
    • Header: records the byte offset of compressed vtable names.
    • Payload: adds a section to store compressed vtable names. This section is used by llvm-profdata to show the list of vtables profiled for an instrumented site.

The originally reviewed patch will have profile reader/write change and llvm-profdata change.

rfc in https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600

Copy link

github-actions bot commented Feb 23, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@minglotus-6
Copy link
Contributor Author

This is a link (https://lab.llvm.org/buildbot/#/builders/127/builds/62532/steps/8/logs/stdio) for previously broken builds.

const char *__llvm_profile_begin_names(void) { return &NamesStart + 1; }
const char *__llvm_profile_end_names(void) { return &NamesEnd; }

const char *__llvm_profile_begin_vtabnames(void) { return NULL; }
Copy link
Contributor Author

@minglotus-6 minglotus-6 Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@minglotus-6
Copy link
Contributor Author

minglotus-6 commented Feb 23, 2024

@w2yehia I wonder if you could help take a look at the changes to compiler-rt/lib/profile/InstrProfilingPlatformAIX.c? I expect them to be no-op in terms of new features, but would like a second pair of eyes. Thanks!

Copy link
Contributor

@w2yehia w2yehia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AIX changes lgtm.

…plement the linker magic. also remove one unused variable
@minglotus-6
Copy link
Contributor Author

minglotus-6 commented Feb 26, 2024

Update: I got the Windows machine. Windows linker magic [1] works out without compile errors or bad side effects, but the current type profiling won't work under MSVC ABI. Thereby helper functions in compiler-rt/lib/profile/InstrProfilingPlatformWindows.c is made simple (just return NULL) such that it's very explicit that type profiling isn't supported.

To elaborate on why current type profiling implementation doesn't work under MSVC ABI

  • Under MSVC ABI, the name of the vtable variable is not necessarily the mangled name. Take https://gcc.godbolt.org/z/drs1n9qd5 as an example, @0, @1 and @2 are vtables.
  • For instrumentation && profile collection under Itanium ABI, the vtables' mangled names (and its MD5hash) are used to map profiled runtime address to actual vtable variables.

As the godbolt example shows, the MSVC mangled name is still in the IR (just not vtable's variable name). I added some comments.

[1] diff in https://gist.github.com/minglotus-6/9df208e14f7c20b352cdb7f89d38c228

@minglotus-6
Copy link
Contributor Author

Given this is a reland to make the patch work on non-ELF platforms, I'll go ahead to commit this.

@minglotus-6 minglotus-6 merged commit 16e74fd into main Feb 27, 2024
4 checks passed
@minglotus-6 minglotus-6 deleted the users/minglotus-6/typeprofrawformat branch February 27, 2024 19:07
alanzhao1 added a commit that referenced this pull request Mar 4, 2024
Several profdata tests pass the byte 012 to printf. This causes these
tests to fail when using GnuWin32's version of printf because printf
will detect that 012 is the LF character and will prepend the byte 015
(CR) in front of LF.

This change is required after
#82711 which bumped the version
number.
minglotus-6 added a commit that referenced this pull request Mar 28, 2024
This is a follow-up to the profile format change in #82711
@whentojump
Copy link
Member

Hi @minglotus-6 , can you please also update this place:

namespace RawInstrProf {
// Version 1: First version
// Version 2: Added value profile data section. Per-function control data
// struct has more fields to describe value profile information.
// Version 3: Compressed name section support. Function PGO name reference
// from control data struct is changed from raw pointer to Name's MD5 value.
// Version 4: ValueDataBegin and ValueDataSizes fields are removed from the
// raw header.
// Version 5: Bit 60 of FuncHash is reserved for the flag for the context
// sensitive records.
// Version 6: Added binary id.
// Version 7: Reorder binary id and include version in signature.
// Version 8: Use relative counter pointer.
// Version 9: Added relative bitmap bytes pointer and count used by MC/DC.
const uint64_t Version = INSTR_PROF_RAW_VERSION;

And may I ask whether you plan to back port this to branch 18? (I'm asking because we have some dependency on this version and would like to know in advance). Thanks!

minglotus-6 added a commit that referenced this pull request Apr 1, 2024
@minglotus-6
Copy link
Contributor Author

can you please also update this place

Thanks for the catch! Updated in f87bde2

whether you plan to back port this to branch 18? (I'm asking because we have some dependency on this version and would like to know in advance)

Before seeing your comment, I don't have plans to back port the version update to LLVM version 18. Could you elaborate on how back porting a version bump to LLVM branch 18 helps? I have seen back-ports for bug fixes but haven't done it myself. I'm glad to back port if it helps.

@whentojump
Copy link
Member

Updated in f87bde2

Thanks!

By the way, https://llvm.org/docs/InstrProfileFormat.html is a really informative page. Thanks for your contributions!

Before seeing your comment, I don't have plans to back port the version update to LLVM version 18. Could you elaborate on how back porting a version bump to LLVM branch 18 helps? I have seen back-ports for bug fixes but haven't done it myself. I'm glad to back port if it helps.

Sorry for not making it clear -- in fact, not backporting it can better help our use cases. :)) or :((

In short, we are instrumenting some OS kernels, like what this patch does.

In kernel space, it's impossible to dump the profiles using any "runtime". So instead, some kernel code itself has to do this job. As a result, this piece of kernel code has to keep synchronized with LLVM code, e.g. in terms of profile format. That's why a stable format is certainly more desirable for us. You may notice in the linked patch: they were using raw profile version 5 three years ago and now we are 10 already. (In user space, this should not be a problem, because whatever produces or consumes the profile usually comes from the same version of LLVM.)

That being said, please make the decision at your own discretion, or based on broader community feedback. :))

@minglotus-6
Copy link
Contributor Author

in fact, not backporting it can better help our use cases. :)) or :((

In short, we are instrumenting some OS kernels, like what this patch does.
...
That being said, please make the decision at your own discretion, or based on broader community feedback. :))

Ack, thanks for sharing!

DimitryAndric added a commit that referenced this pull request Jul 29, 2024
In 16e74fd (for #82711) a duplicate definition of `IntPtrT` was
added to `InstrProfiling.h`, leading to warnings:

    compiler-rt/lib/profile/InstrProfiling.h:52:15: warning: redefinition of typedef 'IntPtrT' is a C11 feature [-Wtypedef-redefinition]
       52 | typedef void *IntPtrT;
          |               ^
    compiler-rt/lib/profile/InstrProfiling.h:34:15: note: previous definition is here
       34 | typedef void *IntPtrT;
          |               ^

Fix the warnings by removing the duplicate typedef.
llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Jul 29, 2024
In 16e74fd (for llvm#82711) a duplicate definition of `IntPtrT` was
added to `InstrProfiling.h`, leading to warnings:

    compiler-rt/lib/profile/InstrProfiling.h:52:15: warning: redefinition of typedef 'IntPtrT' is a C11 feature [-Wtypedef-redefinition]
       52 | typedef void *IntPtrT;
          |               ^
    compiler-rt/lib/profile/InstrProfiling.h:34:15: note: previous definition is here
       34 | typedef void *IntPtrT;
          |               ^

Fix the warnings by removing the duplicate typedef.

(cherry picked from commit 2c376fe)
tru pushed a commit to llvmbot/llvm-project that referenced this pull request Aug 2, 2024
In 16e74fd (for llvm#82711) a duplicate definition of `IntPtrT` was
added to `InstrProfiling.h`, leading to warnings:

    compiler-rt/lib/profile/InstrProfiling.h:52:15: warning: redefinition of typedef 'IntPtrT' is a C11 feature [-Wtypedef-redefinition]
       52 | typedef void *IntPtrT;
          |               ^
    compiler-rt/lib/profile/InstrProfiling.h:34:15: note: previous definition is here
       34 | typedef void *IntPtrT;
          |               ^

Fix the warnings by removing the duplicate typedef.

(cherry picked from commit 2c376fe)
banach-space pushed a commit to banach-space/llvm-project that referenced this pull request Aug 7, 2024
In 16e74fd (for llvm#82711) a duplicate definition of `IntPtrT` was
added to `InstrProfiling.h`, leading to warnings:

    compiler-rt/lib/profile/InstrProfiling.h:52:15: warning: redefinition of typedef 'IntPtrT' is a C11 feature [-Wtypedef-redefinition]
       52 | typedef void *IntPtrT;
          |               ^
    compiler-rt/lib/profile/InstrProfiling.h:34:15: note: previous definition is here
       34 | typedef void *IntPtrT;
          |               ^

Fix the warnings by removing the duplicate typedef.
@ayermolo
Copy link
Contributor

ayermolo commented Sep 7, 2024

Do you know by any chance how raw-{32, 64}-bits-{le, be}.test were generated?
I am trying to re-generate them with this patch, but without a previous changes, and having trouble getting 64 bit ones to pass.

@minglotus-6
Copy link
Contributor Author

minglotus-6 commented Sep 8, 2024

Do you know by any chance how raw-{32, 64}-bits-{le, be}.test were generated?

I don't know how they were generated initially. When I updated these tests, I manually added the bytes in RUN: printf lines and initially had to run llvm-profdata show in a debugger (gdb) to get the test pass.

Given a binary raw profile, hexdump should print the bytes (example). But it is still painful to map output bytes to the (likely multi-byte) fields in a (multi-section) raw profile.

I am trying to re-generate them with this patch, but without a previous changes, and having trouble getting 64 bit ones to pass.

Is there a pointer showing the failed test? And which section (profile header, per function profile data, etc) is failing the raw profile reader? I remember I got stumbled on updating CounterPtr which records relative offset, illustrated with ASCII art in https://llvm.org/docs/InstrProfileFormat.html#profile-counters

@ayermolo
Copy link
Contributor

ayermolo commented Sep 9, 2024

The offset in to name section is wrong.
Thanks, will dig into hexdump output.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants