Owner

@enum-class enum-class commented Sep 22, 2025

  • Added support for Clang coverage profile versions 8, 9, and 10.
  • Initialized llvm_profile_header for every version, based on the LLVM source in compiler-rt/include/profile/InstrProfData.inc for that version.
  • On x86, I tested Xen hypervisor code coverage using Clang instrumentation on a Debian 12 (Bookworm) system. Xen was compiled from source with coverage enabled, installed, and booted in Dom0. Coverage data was extracted using xencov and compared against the expected metrics from gcov.
  • On ARM, I tested it under QEMU, following the instructions here.

Example of coverage output on ARM:
cov.zip

So far, Clang versions 14-19 are supported, with profile versions 8, 9, and 10.
On x86, a clean master branch compiles with every Clang version but does not link with any of them.
On ARM, a clean master branch does not compile with any Clang version; only clang-19 compiles and links successfully, and only if the `-mgeneral-regs-only` flag is disabled.
We tested our patch in the scenarios above: on x86 it compiles with Clang 14-19, and on ARM it compiles and links with clang-19 under the condition just described.
*/
int __llvm_profile_runtime;

extern char __start___llvm_prf_data[];
Collaborator

@whentojump whentojump Sep 29, 2025

Should char be const struct llvm_profile_data?

Owner Author

These are linker-defined symbols that mark section boundaries. The linker doesn't understand C types like struct llvm_profile_data; it only places symbols at memory addresses.
Declaring them as char arrays (char is 1 byte wide) keeps pointer arithmetic straightforward, since we are really just working with addresses.
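
As a rough illustration of the point above, the sketch below treats two boundary addresses as plain bytes and computes the size of the region between them. The array `fake_section`, the pointers `fake_start` and `fake_stop`, and the helper `region_bytes` are stand-ins invented for this example; in the real build the boundaries come from linker-defined symbols such as `__start___llvm_prf_data`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a section's contents. In the real build the linker supplies
 * the boundary symbols; this array only imitates that layout. */
static uint64_t fake_section[6];

/* Hypothetical boundary pointers, viewed as raw bytes. */
static const char *const fake_start = (const char *)fake_section;
static const char *const fake_stop = (const char *)(fake_section + 6);

/* sizeof(char) is 1, so a char pointer difference is already a byte count. */
static size_t region_bytes(const char *start, const char *stop)
{
    assert(stop >= start);
    return (size_t)(stop - start);
}
```

Calling `region_bytes(fake_start, fake_stop)` yields `6 * sizeof(uint64_t)` bytes; turning that into an element count is a separate division by the element size.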

Collaborator

The tricky thing here is that it is an array, not a pointer. So it sort of tempts people to use __start___llvm_prf_data[i], even though our patch currently does not.

In fact, in the Linux counterpart, we calculate "the number of elements" using

static inline unsigned long __llvm_prf_data_count(void)	
{								
	return __llvm_prf_data_size() /			
		sizeof(__llvm_prf_data_start[0]);		
}

I still think we should revert to the right type. The same goes for cnts, whose entries are 64 bits wide.
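
To make the trade-off concrete, here is a small sketch of the typed variant. It assumes a made-up record type `struct demo_record` in place of `struct llvm_profile_data` and a local array in place of the linker-provided section; the helper names are also invented for this example. With typed pointers the pointer difference is already an element count, and dividing the byte size by `sizeof(start[0])` gives the same number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Made-up record type standing in for struct llvm_profile_data. */
struct demo_record {
    uint64_t name_hash;
    uint64_t func_hash;
    uint32_t num_counters;
};

/* Stand-in for the section contents. */
static struct demo_record demo_records[4];

/* Typed pointers: the difference is counted in elements, not bytes. */
static size_t count_typed(const struct demo_record *start,
                          const struct demo_record *stop)
{
    assert(stop >= start);
    return (size_t)(stop - start);
}

/* Same count derived from the byte size, as in the Linux helper quoted above. */
static size_t count_from_bytes(const struct demo_record *start,
                               const struct demo_record *stop)
{
    size_t bytes = (size_t)((const char *)stop - (const char *)start);

    return bytes / sizeof(start[0]);
}
```

Both helpers return 4 for `demo_records` and `demo_records + 4`. The typed form also makes `start[i]` index whole records, which is the behavior the array declaration invites.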

typedef __PTRDIFF_TYPE__ ptrdiff_t;
typedef __UINTPTR_TYPE__ uintptr_t;
typedef __INTPTR_TYPE__ intptr_t;

Collaborator

Why is this signed type needed?

Owner Author

Signed arithmetic avoids undefined or unexpected behavior.
If uintptr_t were used, the subtraction could result in an unsigned underflow.
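
The reply does not include an example. One possible reading, sketched below with made-up addresses, is the case where the end address is lower than the start address: unsigned subtraction wraps around to a very large value (well defined in C, but rarely what is wanted), while signed subtraction yields a small negative number. The helper names are hypothetical, and whether this situation can occur for linker-provided section boundaries is not established in this thread.

```c
#include <assert.h>
#include <stdint.h>

static uintptr_t diff_unsigned(uintptr_t end, uintptr_t start)
{
    return end - start; /* wraps modulo 2^N when end < start */
}

static intptr_t diff_signed(intptr_t end, intptr_t start)
{
    return end - start; /* small negative value when end < start */
}

static void check_swapped_bounds(void)
{
    /* Made-up addresses with the end 8 bytes below the start. */
    assert(diff_unsigned(0x1000, 0x1008) == UINTPTR_MAX - 7);
    assert(diff_signed(0x1000, 0x1008) == -8);
}
```

When the end is not below the start, both helpers return the same value, so the choice of type only matters for the swapped case.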

Collaborator

Can you give an example of how the underflow could happen?

My concern is that this touches a file many other components depend on, which could block our coverage-specific patch. But if you can justify it well, go for it.

extern char __stop___llvm_prf_cnts[];

#define START_DATA ((const char *)__start___llvm_prf_data)
#define END_DATA ((const char *)__stop___llvm_prf_data)
Collaborator

@whentojump whentojump Sep 29, 2025

The existing code uses void *, so maybe let's not touch that.
I guess this is one of the reasons why git diff gets confused and bloated.

Owner Author

I tried void *, but I ran into a problem when calculating the section sizes, so I could not get any output from llvm-profdata merge.

uint64_t binary_ids_size;
uint64_t data_size;
uint64_t padding_bytes_before_counters;
uint64_t counter_size;
Collaborator

@whentojump whentojump Sep 30, 2025

Before this field was renamed to "num_counters", wasn't it always called "counter*s*_size"? llvm/llvm-project@a6f33ad

Owner Author

In newer versions these fields were renamed to num_counters and num_data.

Collaborator

@whentojump whentojump left a comment

Thanks, Saman. This version looks great. Please see my new comments. (I will follow up in some of old conversations in a bit)

#elif __clang_major__ == 18
#define LLVM_PROFILE_VERSION 9
#define LLVM_PROFILE_NUM_KINDS 2
#elif __clang_major__ >= 14 && __clang_major__ <= 17
Collaborator

@whentojump whentojump Oct 1, 2025

nit:

  1. I think && __clang_major__ <= 17 is not needed.
  2. Three lines above: if the rest of the code uses >= 18, let's be consistent here.
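
A sketch of how the chain might look with both nits applied is shown below. `SKETCH_CLANG_MAJOR` is a stand-in for `__clang_major__` so the example builds with any compiler, and `sketch_profile_version` is a helper invented for this example. The mapping of Clang 19 to profile version 10 is inferred from the PR description rather than from the hunk itself, and `LLVM_PROFILE_NUM_KINDS` is left out for brevity.

```c
#include <assert.h>

/* Stand-in for __clang_major__ so the sketch builds with any compiler. */
#ifndef SKETCH_CLANG_MAJOR
#define SKETCH_CLANG_MAJOR 18
#endif

/* Every branch uses >=, ordered from newest to oldest, so no upper bound
 * is needed: earlier branches have already claimed the newer versions. */
#if SKETCH_CLANG_MAJOR >= 19
#define LLVM_PROFILE_VERSION 10 /* assumed mapping, see the note above */
#elif SKETCH_CLANG_MAJOR >= 18
#define LLVM_PROFILE_VERSION 9
#elif SKETCH_CLANG_MAJOR >= 14
#define LLVM_PROFILE_VERSION 8
#else
#error "Unsupported Clang version"
#endif

static int sketch_profile_version(void)
{
    /* Sanity check on the range this sketch knows about. */
    assert(LLVM_PROFILE_VERSION >= 8 && LLVM_PROFILE_VERSION <= 10);
    return LLVM_PROFILE_VERSION;
}
```

With the default stand-in value of 18, the chain selects profile version 9.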

* __llvm_profile_runtime must be defined according to the docs at:
*
* https://clang.llvm.org/docs/SourceBasedCodeCoverage.html
* https://clang.llvm.org/docs/SourceBasedCodeCoverage.html
Collaborator

please avoid whitespace changes (even though we should blame upstream for inappropriate coding style and lack of code linting :))

- (intptr_t)START_COUNTERS) / sizeof(uint64_t),
.padding_bytes_after_counters = 0,
.names_size = (END_NAMES - START_NAMES) * sizeof(char),
.counters_delta = (uintptr_t)START_COUNTERS - (uintptr_t)START_DATA,
Collaborator

can you also sprinkle ifdef's to this initialization list
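
One way to read this request, sketched below under assumptions: guard version-specific members of the designated initializer with the same preprocessor condition that guards the matching struct fields. The struct `sketch_header`, the helpers `make_header` and `check_header`, and the placement of `num_bitmap_bytes` behind version 9 are illustrative choices made for this example, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#ifndef LLVM_PROFILE_VERSION
#define LLVM_PROFILE_VERSION 9 /* fixed here only so the sketch compiles */
#endif

/* Hypothetical, heavily trimmed header layout. */
struct sketch_header {
    uint64_t version;
    uint64_t num_data;
    uint64_t num_counters;
#if LLVM_PROFILE_VERSION >= 9
    uint64_t num_bitmap_bytes; /* assumed to exist only from version 9 on */
#endif
    uint64_t names_size;
};

static struct sketch_header make_header(uint64_t num_data,
                                        uint64_t num_counters,
                                        uint64_t names_size)
{
    struct sketch_header hdr = {
        .version = LLVM_PROFILE_VERSION,
        .num_data = num_data,
        .num_counters = num_counters,
#if LLVM_PROFILE_VERSION >= 9
        .num_bitmap_bytes = 0, /* same guard as the field declaration */
#endif
        .names_size = names_size,
    };

    return hdr;
}

static void check_header(void)
{
    struct sketch_header h = make_header(3, 12, 40);

    assert(h.version == LLVM_PROFILE_VERSION);
    assert(h.num_data == 3 && h.num_counters == 12 && h.names_size == 40);
}
```

Keeping the guard on the initializer identical to the guard on the field means a version without that field never references it.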

Collaborator

The num calculation looks a bit defensive to me. Do you have evidence that

        .num_data = (END_DATA - START_DATA) / sizeof(struct llvm_profile_data),
        .num_counters = (END_COUNTERS - START_COUNTERS) / sizeof(uint64_t),

won't work?

Owner Author

can you also sprinkle ifdef's to this initialization

Could you please explain more?

Owner Author

won't work?

The linker script does not guarantee that the data section boundaries are aligned to a multiple of sizeof(struct llvm_profile_data).
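
The concern here is that the byte span between the two boundaries may not be an exact multiple of the element size, in which case integer division silently drops the remainder. A minimal sketch of how one might detect that is shown below; `struct span_count`, `count_elements`, and the buffer `sketch_counters` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Element count plus the bytes that did not fit into a whole element. */
struct span_count {
    size_t count;
    size_t leftover;
};

/* Stand-in for a counters section. */
static uint64_t sketch_counters[5];

static struct span_count count_elements(const void *start, const void *stop,
                                        size_t elem_size)
{
    struct span_count r;
    size_t bytes;

    assert(elem_size != 0);
    /* Cast to char pointers so the difference is measured in bytes. */
    bytes = (size_t)((const char *)stop - (const char *)start);
    r.count = bytes / elem_size;
    r.leftover = bytes % elem_size; /* non-zero means a misaligned boundary */
    return r;
}
```

For the full buffer this gives a count of 5 with no leftover. For a span that ends 20 bytes in, it gives a count of 2 with 4 leftover bytes, which a plain division would hide.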

@whentojump
Collaborator

Followed up in old conversations as well.

Saman can you give @matthew-l-weber and me write access to this repo so we can also resolve/reopen conversations? Thanks

@enum-class
Owner Author

Followed up in old conversations as well.

Saman can you give @matthew-l-weber and me write access to this repo so we can also resolve/reopen conversations? Thanks

I tried to fix what you suggested, but I may have missed some comments (it became messy :)).
I don't know how to give you access to reopen conversations; I could not find it in the settings.

#endif

#define LLVM_PROFILE_VERSION 4
#if __clang_major__ >= 19
Collaborator

Do we need to update the clang version conversation in the docs? https://github.com/enum-class/xen/blob/saman/docs/hypervisor-guide/code-coverage.rst#compiling-xen

#define LLVM_PROFILE_VERSION 8
#define LLVM_PROFILE_NUM_KINDS 2
#else
#error "Unsupported Clang version"
Collaborator

Could we add a subsection about the profile format and the challenges of keeping it in sync with newer tool revisions? Maybe link to the LLVM/Clang profile definition, so readers understand what to update in Xen, and then add a link in the docs to this file?

https://github.com/enum-class/xen/blob/saman/docs/hypervisor-guide/code-coverage.rst#clang-coverage

Collaborator

@whentojump whentojump Oct 6, 2025

Definitely. Here are some links to consider:

We will also put up some general guideline text on how to update it in the future.

@matthew-l-weber
Collaborator

If we're close, it would be nice to see the commits squashed into a single commit for the patch, so we can review the commit description and any additional details to add under a ---

@whentojump
Collaborator

@matthew-l-weber Thanks for your suggestions!

Saman is refining and testing his v2, focusing on removing unnecessary type changes as pointed out both internally and by Andrew Cooper.

We are breaking down the patches into these parts:

  1. General llvm-cov support with newer toolchains (Header definition changes; initializer changes; dumper changes?)
  2. MC/DC (Makefile changes; dumper changes)
  3. Docs etc

We will let you know when the squashed commit with proper message becomes ready. Thanks!
(I believe Saman has already given you write access to the repo, so feel free to resolve/reopen discussions.)
