diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst index 2e658557b0e31..9d64195ee338e 100644 --- a/clang/docs/UsersManual.rst +++ b/clang/docs/UsersManual.rst @@ -2348,9 +2348,10 @@ differences between the two: 1. Profile data generated with one cannot be used by the other, and there is no conversion tool that can convert one to the other. So, a profile generated - via ``-fprofile-instr-generate`` must be used with ``-fprofile-instr-use``. - Similarly, sampling profiles generated by external profilers must be - converted and used with ``-fprofile-sample-use``. + via ``-fprofile-generate`` or ``-fprofile-instr-generate`` must be used with + ``-fprofile-use`` or ``-fprofile-instr-use``. Similarly, sampling profiles + generated by external profilers must be converted and used with ``-fprofile-sample-use`` + or ``-fauto-profile``. 2. Instrumentation profile data can be used for code coverage analysis and optimization. @@ -2598,6 +2599,8 @@ Of those, 31,977 were spent inside the body of ``bar``. The last line of the profile (``2: 0``) corresponds to line 2 inside ``main``. No samples were collected there. +.. _prof_instr: + Profiling with Instrumentation ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -2607,11 +2610,25 @@ overhead during the profiling, but it provides more detailed results than a sampling profiler. It also provides reproducible results, at least to the extent that the code behaves consistently across runs. +Clang supports two types of instrumentation: frontend-based and IR-based. +Frontend-based instrumentation can be enabled with the option ``-fprofile-instr-generate``, +and IR-based instrumentation can be enabled with the option ``-fprofile-generate``. +For best performance with PGO, IR-based instrumentation should be used. It has +the benefits of lower instrumentation overhead, smaller raw profile size, and +better runtime performance. 
Frontend-based instrumentation, on the other hand, +has better source correlation, so it should be used with source line-based +coverage testing. + +The flag ``-fcs-profile-generate`` also instruments programs using the same +instrumentation method as ``-fprofile-generate``. However, it performs a +post-inline late instrumentation and can produce context-sensitive profiles. + + Here are the steps for using profile guided optimization with instrumentation: 1. Build an instrumented version of the code by compiling and linking with the - ``-fprofile-instr-generate`` option. + ``-fprofile-generate`` or ``-fprofile-instr-generate`` option. .. code-block:: console @@ -2674,8 +2691,8 @@ instrumentation: Note that this step is necessary even when there is only one "raw" profile, since the merge operation also changes the file format. -4. Build the code again using the ``-fprofile-instr-use`` option to specify the - collected profile data. +4. Build the code again using the ``-fprofile-use`` or ``-fprofile-instr-use`` + option to specify the collected profile data. .. code-block:: console @@ -2685,13 +2702,10 @@ instrumentation: profile. As you make changes to your code, clang may no longer be able to use the profile data. It will warn you when this happens. -Profile generation using an alternative instrumentation method can be -controlled by the GCC-compatible flags ``-fprofile-generate`` and -``-fprofile-use``. Although these flags are semantically equivalent to -their GCC counterparts, they *do not* handle GCC-compatible profiles. -They are only meant to implement GCC's semantics with respect to -profile creation and use. Flag ``-fcs-profile-generate`` also instruments -programs using the same instrumentation method as ``-fprofile-generate``. +Note that although the ``-fprofile-use`` option is semantically equivalent to +its GCC counterpart, it *does not* handle profile formats produced by GCC.
+Both ``-fprofile-use`` and ``-fprofile-instr-use`` accept profiles in the +indexed format, regardless of whether they were produced by the frontend or the IR pass. .. option:: -fprofile-generate[=] @@ -4401,13 +4415,21 @@ Execute ``clang-cl /?`` to see a list of supported options: Instrument only functions from files where names don't match all the regexes separated by a semi-colon -fprofile-filter-files= Instrument only functions from files where names match any regex separated by a semi-colon - -fprofile-instr-generate= - Generate instrumented code to collect execution counts into + -fprofile-generate= + Generate instrumented code to collect execution counts into a raw profile file in the directory specified by the argument. The filename uses default_%m.profraw pattern + (overridden by LLVM_PROFILE_FILE env var) + -fprofile-generate + Generate instrumented code to collect execution counts into default_%m.profraw file + (overridden by '=' form of option or LLVM_PROFILE_FILE env var) + -fprofile-instr-generate= + Generate instrumented code to collect execution counts into the file whose name pattern is specified as the argument (overridden by LLVM_PROFILE_FILE env var) -fprofile-instr-generate Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var) -fprofile-instr-use= + Use instrumentation data for coverage testing or profile-guided optimization + -fprofile-use= Use instrumentation data for profile-guided optimization -fprofile-remapping-file= Use the remappings described in to match the profile data against names in the program @@ -4569,7 +4591,7 @@ clang-cl supports several features that require runtime library support: - Address Sanitizer (ASan): ``-fsanitize=address`` - Undefined Behavior Sanitizer (UBSan): ``-fsanitize=undefined`` - Code coverage: ``-fprofile-instr-generate -fcoverage-mapping`` -- Profile Guided 
Optimization (PGO): ``-fprofile-generate`` - Certain math operations (int128 division) require the builtins library In order to use these features, the user must link the right runtime libraries