Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BOLT] Fix boltedcollection and no_lbr order in fdata writer and parser #86501

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

zct
Copy link

@zct zct commented Mar 25, 2024

when BAT and no_lbr are set at the same time, the perf2bolt writes boltedcollection first and then no_lbr

if (BAT)
OutFile << "boltedcollection\n";
if (opts::BasicAggregation) {
OutFile << "no_lbr";
for (const StringMapEntry<std::nullopt_t> &Entry : EventNames)
OutFile << " " << Entry.getKey();
OutFile << "\n";

However, when used, llvm-bolt reads the no_lbr first, and then the boltedcollection

ErrorOr<bool> FlagOrErr = maybeParseNoLBRFlag();
if (!FlagOrErr)
return FlagOrErr.getError();
NoLBRMode = *FlagOrErr;
ErrorOr<bool> BATFlagOrErr = maybeParseBATFlag();
if (!BATFlagOrErr)
return BATFlagOrErr.getError();
BATMode = *BATFlagOrErr;

This causes an error when reading profile data,because the profile first line is boltedcollection,the maybeParseNoLBRFlag return false despite no_lbr was use

BOLT-INFO: pre-processing profile using branch profile reader
ERROR: no valid profile data found
BOLT-ERROR: 'cannot pre-process profile': Input/output error.

So we adjusted the order of the reads

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@aaupov
Copy link
Contributor

aaupov commented Mar 25, 2024

Thank you for spotting an issue. Can you please add a trivial test case? Profile containing both lines in the wrong order.

@aaupov aaupov changed the title [BOLT] solve the profile data processing problem when BAT and no_lbr are set at the same time [BOLT] Fix boltedcollection and no_lbr order in fdata writer and parser Mar 25, 2024
Copy link

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link

✅ With the latest revision this PR passed the Python code formatter.

@zct
Copy link
Author

zct commented Mar 26, 2024

@aaupov Here are the test cases
We can use any source code to reproduce, and I've constructed a simple use case here

https://pastebin.com/5uJ5kg8k

First we get a BOLT optimized binary(bolt_1), profile use no_lbr and enable BAT model

perf record -e cycles:up ./a.out
perf2bolt --nl -p perf.data -o perf.fdata ./a.out
llvm-bolt ./a.out -o bolt_1 -data perf.fdata  -reorder-blocks=ext-tsp -reorder-functions=hfsort+  -update-debug-sections -enable-bat

Then we use the profile collected from the bolt_1 and perform the second optimization, the error has occurred

perf record -e cycles:up ./bolt_1
perf2bolt --nl -p perf.data -o perf.fdata_2 ./bolt_1
llvm-bolt ./a.out -o bolt_2 -data perf.fdata_2  -reorder-blocks=ext-tsp -reorder-functions=hfsort+  -update-debug-sections -enable-bat

The contents of file perf.fdata_2

boltedcollection
no_lbr cycles:up:
1 _ZN9__gnu_cxxneIPiSt6vectorIiSaIiEEEEbRKNS_17__normal_iteratorIT_T0_EESA_ 36 1
1 _ZNK9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEmiEl 39 1
1 _ZNSt6vectorIiSaIiEE4backEv 15 2
1 _ZNSt6vectorIiSaIiEE4backEv 4 1v

llvm-bolt errors

BOLT-INFO: shared object or position-independent executable detected
BOLT-INFO: Target architecture: x86_64
BOLT-INFO: BOLT version: e19b7dc36bc047b9eb72078d034596be766da350
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x200000, offset 0x200000
BOLT-INFO: enabling relocation mode
BOLT-INFO: enabling lite mode
BOLT-INFO: pre-processing profile using branch profile reader
ERROR: no valid profile data found
BOLT-ERROR: 'cannot pre-process profile': Input/output error.

The reason is what I said at the beginning about the order problem

@zct
Copy link
Author

zct commented Apr 8, 2024

@aaupov ping

@aaupov
Copy link
Contributor

aaupov commented May 6, 2024

Sorry, I meant a simple test with fdata profile containing the lines in the right order, and that BOLT succeeds reading it. Please check similar tests under bolt/test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants