Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PGO]add bitcode-use-sample-profile #66178

Closed
wants to merge 1 commit into from

Conversation

lifengxiang1025
Copy link
Contributor

@lifengxiang1025 lifengxiang1025 commented Sep 13, 2023

When use bitcode file as input and use "-fprofile-sample-use" first. Function don't has "use-sample-profile" attribute and then "buildFunctionOrder" return empty vector.
When "bitcode-use-sample-profile" enable, add "use-sample-profile" to function and make it can do optimation about pgo.

Copy link
Contributor

@teresajohnson teresajohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really follow the description of this from the conversation (I think the commit itself also needs a description which seems to be missing here). The change also needs a test case, which would help clarify what problem this is trying to solve.

I've added another reviewer who is more familiar with the sample profiler than I
(@huangjd).

@@ -1886,6 +1891,10 @@ SampleProfileLoader::buildFunctionOrder(Module &M, LazyCallGraph &CG) {
errs() << "WARNING: -use-profiled-call-graph ignored, should be used "
"together with -sample-profile-top-down-load.\n";

for (Function &F : M)
if (!F.isDeclaration() && BitCodeUseSampleProfile)
F.addFnAttr("use-sample-profile");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the starting place the attribute will be used. MatchingManager->runOnModule() also needs to see the attribute.

@htyu
Copy link
Contributor

htyu commented Sep 13, 2023

I guess you are adding module-level switch to indicate all functions in this module should use profile or not. This may lead to undefined/unexpected behavior for functions imported from other modules when thinLTO mode. In your use case, where does the bitcode come from? Can you set the attribute when the bitcode is generated?

Copy link
Member

@WenleiHe WenleiHe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the use case either. It seems hacky to bypass the attribute in IR - why is the attribute not present in IR if sample profile is used?

@huangjd
Copy link
Contributor

huangjd commented Sep 14, 2023

I think this is an issue with LLVM pipeline. Functions are marked as "use-sample-profile" at front end when compiling from C++ source code (At clang::ParseAST -> ... -> CodeGenModule::EmitGlobal), but when compiling LLVM bitcode, it will not go through the parser frontend, instead go directly to optimization passes, so no function attributes will be added to the existing ones from the bitcode. It is a design decision whether it should be allowed to add/override this (or any) attribute in the bitcode

@htyu
Copy link
Contributor

htyu commented Sep 14, 2023

but when compiling LLVM bitcode, it will not go through the parser frontend, ...

How was the LLVM bitcode generated? Is it not from a Clang compilation?

@lifengxiang1025
Copy link
Contributor Author

lifengxiang1025 commented Sep 14, 2023

@teresajohnson @WenleiHe @htyu @huangjd
Sorry for implicit description.
For example,
I have foo.cpp and bar.cpp and steps as follow:

1. clang++ -c foo.cpp -flto=thin -o foo.o
   clang++ -c bar.cpp -flto=thin -o bar.o
2. ar libtest.a foo.o bar.o
3. ar x libtest.a
4. clang++ -x ir foo.o -flto=thin -fprofile-sample-use=xxx -o foo.o.o
    clang++ -x ir bar.o -flto=thin -fprofile-sample-use=xxx -o bar.o.o
5. thinlink->thinbackend->final link

In step 4. Because function doesn't have attribute "use-sample-profile". It can't do pgo's optimation.
Only if in step1, add flags "-fprofile-sample-use=xxx". Function could have attribute "use-sample-profile".
This example happens when project use some third party libraries. I want to compile those third party libraries to bitcode archive and then do some optimation like pgo and thinlto.

@lifengxiang1025
Copy link
Contributor Author

I think this is an issue with LLVM pipeline. Functions are marked as "use-sample-profile" at front end when compiling from C++ source code (At clang::ParseAST -> ... -> CodeGenModule::EmitGlobal), but when compiling LLVM bitcode, it will not go through the parser frontend, instead go directly to optimization passes, so no function attributes will be added to the existing ones from the bitcode. It is a design decision whether it should be allowed to add/override this (or any) attribute in the bitcode

Yeah. Another way is changing F.hasFnAttribute("use-sample-profile") to F.hasFnAttribute("use-sample-profile") || BitCodeUseSampleProfile. Add attribute may break design. How about it?

@teresajohnson
Copy link
Contributor

@teresajohnson @WenleiHe @htyu @huangjd Sorry for implicit description. For example, I have foo.cpp and bar.cpp and steps as follow:

1. clang++ -c foo.cpp -flto=thin -o foo.o
   clang++ -c bar.cpp -flto=thin -o bar.o
2. ar libtest.a foo.o bar.o
3. ar x libtest.a
4. clang++ -x ir foo.o -flto=thin -fprofile-sample-use=xxx -o foo.o.o
    clang++ -x ir bar.o -flto=thin -fprofile-sample-use=xxx -o bar.o.o
5. thinlink->thinbackend->final link

In step 4. Because function doesn't have attribute "use-sample-profile". It can't do pgo's optimation. Only if in step1, add flags "-fprofile-sample-use=xxx". Function could have attribute "use-sample-profile". This example happens when project use some third party libraries. I want to compile those third party libraries to bitcode archive and then do some optimation like pgo and thinlto.

To confirm my understanding, are you saying that you are not doing steps 1 and 2 yourself, but are using an archive of ThinLTO bitcode distributed by a third party?

@teresajohnson
Copy link
Contributor

Also, if you are going to use an internal attribute, how about using the existing -mllvm -force-attribute=use-sample-profile. The ForceFunctionAttrs pass runs early in the ThinLTO pre-link pipeline, before sample pgo.

@huangjd
Copy link
Contributor

huangjd commented Sep 14, 2023

-fprofile-sample-use should also be specified in step 1, otherwise it's too late because IR are not being annotated

@huangjd
Copy link
Contributor

huangjd commented Sep 14, 2023

Also, if you are going to use an internal attribute, how about using the existing -mllvm -force-attribute=use-sample-profile. The ForceFunctionAttrs pass runs early in the ThinLTO pre-link pipeline, before sample pgo.

-force-attribute needs a function name, like -force-attribute=foo:use-sample-profile, so it is unfeasible to apply it to every function in the IR. Although it is quite easy to add support for wildcard like -force-attribute=*:use-sample-profile and I think this sounds like an elegant solution

@teresajohnson
Copy link
Contributor

Also, if you are going to use an internal attribute, how about using the existing -mllvm -force-attribute=use-sample-profile. The ForceFunctionAttrs pass runs early in the ThinLTO pre-link pipeline, before sample pgo.

-force-attribute needs a function name, like -force-attribute=foo:use-sample-profile, so it is unfeasible to apply it to every function in the IR. Although it is quite easy to add support for wildcard like -force-attribute=*:use-sample-profile and I think this sounds like an elegant solution

I haven't personally tried this option, but the documentation of the option indicates that if no function name is given the attribute gets applied to all functions in the module:

static cl::liststd::string ForceAttributes(
"force-attribute", cl::Hidden,
cl::desc(
"Add an attribute to a function. This can be a "
"pair of 'function-name:attribute-name', to apply an attribute to a "
"specific function. For "
"example -force-attribute=foo:noinline. Specifying only an attribute "
"will apply the attribute to every function in the module. This "
"option can be specified multiple times."));

@huangjd
Copy link
Contributor

huangjd commented Sep 14, 2023

Oh this is a recent change, I didn't noticed it. In this case then just use the force attribute option would resolve the problem.

@lifengxiang1025
Copy link
Contributor Author

lifengxiang1025 commented Sep 15, 2023

@teresajohnson @WenleiHe @htyu @huangjd Sorry for implicit description. For example, I have foo.cpp and bar.cpp and steps as follow:

1. clang++ -c foo.cpp -flto=thin -o foo.o
   clang++ -c bar.cpp -flto=thin -o bar.o
2. ar libtest.a foo.o bar.o
3. ar x libtest.a
4. clang++ -x ir foo.o -flto=thin -fprofile-sample-use=xxx -o foo.o.o
    clang++ -x ir bar.o -flto=thin -fprofile-sample-use=xxx -o bar.o.o
5. thinlink->thinbackend->final link

In step 4. Because function doesn't have attribute "use-sample-profile". It can't do pgo's optimation. Only if in step1, add flags "-fprofile-sample-use=xxx". Function could have attribute "use-sample-profile". This example happens when project use some third party libraries. I want to compile those third party libraries to bitcode archive and then do some optimation like pgo and thinlto.

To confirm my understanding, are you saying that you are not doing steps 1 and 2 yourself, but are using an archive of ThinLTO bitcode distributed by a third party?

I do steps 1,2 and other steps separately. Many third party projects use build tool different from mine, such as cmake,make,autoreconf. It's difficult to migrate third party projects in my project. So I want to do 1,2 steps as apt(advanced package tool) and do 3,4,5 steps in my build system.
I think that processing thinlto bitcode iteratively is accepted, right?

@lifengxiang1025
Copy link
Contributor Author

lifengxiang1025 commented Sep 15, 2023

-fprofile-sample-use should also be specified in step 1, otherwise it's too late because IR are not being annotated

Could you explain why it's too late?
I tried this patch and I can see some new inline behavior and hotness information.
Do you mean step 1 do inline pass before step 4's ProfileSampleLoader pass? Normally, IR do inline and update PSI,BFI in ProfileSampleLoader pass which head of IPO pass. How about step 1 use -O0 or -fno-inline?

@Enna1
Copy link
Contributor

Enna1 commented Sep 20, 2023

@lifengxiang1025
Did you try -force-attribute=noinline as suggested, did this work in your use case ?

@lifengxiang1025
Copy link
Contributor Author

@lifengxiang1025 Did you try -force-attribute=noinline as suggested, did this work in your use case ?

Yes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants