[PowerPC] Optimize allocation of Conditional Register #69299

Open: wants to merge 14 commits into base: main


bzEq
Collaborator

@bzEq bzEq commented Oct 17, 2023

Within a single basic block (BB), allocating different physical condition registers (CRs) to different virtual register definitions gives a measurable performance improvement on some workloads.

@nemanjai
Member

Can you outline the reason for the performance improvement? I wonder if perhaps it would be better to implement a flag to tell the register allocator to use a circular allocation order for a particular register class (or at least for a subset of a register class - i.e. volatile registers, etc.).
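To make the suggestion concrete, here is a toy model of a circular allocation order, not LLVM code. It assumes the live ranges never overlap, so a fixed order always reuses the first register, while a circular order resumes where the previous request stopped. The function names and the choice of `cr0`, `cr1`, `cr5`, `cr6`, `cr7` as the candidate subset are illustrative assumptions.

```python
from itertools import cycle

# Toy model only: hypothetical names, not LLVM's register allocator API.
# Assumed candidate subset of CR fields (for illustration).
CR_FIELDS = ["cr0", "cr1", "cr5", "cr6", "cr7"]


def allocate_fixed(defs):
    """Fixed order: every request starts from the first register.

    With non-overlapping live ranges the first register is always free,
    so every definition lands on it.
    """
    return {d: CR_FIELDS[0] for d in defs}


def allocate_circular(defs):
    """Circular order: each request starts where the last one stopped."""
    order = cycle(CR_FIELDS)
    return {d: next(order) for d in defs}


# Three virtual registers, listed in definition order.
defs = ["v1", "v2", "v3"]
fixed = allocate_fixed(defs)
circular = allocate_circular(defs)
print(fixed)
print(circular)
```

In this sketch the fixed order puts all three definitions on `cr0`, while the circular order spreads them across `cr0`, `cr1`, and `cr5`.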

@bzEq
Collaborator Author

bzEq commented Nov 15, 2023

> Can you outline the reason for the performance improvement?

This was found in an internal workload with a short loop. Inside the loop body, two CR-logic instructions that sit very close together write the same CR register (though different fields). When we modified the allocation of CR registers manually, we got a significant improvement in IPC and runtime, though this PR has little effect on SPEC2017.

> I wonder if perhaps it would be better to implement a flag to tell the register allocator to use a circular allocation order for a particular register class

This is a good idea that is worth a try. I'll investigate it.

@bzEq
Collaborator Author

bzEq commented Nov 21, 2023

> tell the register allocator to use a circular allocation order for a particular register class

I've gone through the GreedyRA code, and this might not help much, since the order in which virtual registers are allocated is determined by priority, not by definition order inside an MBB. That is, adjacent CR bit definitions could still be allocated the same CR even with a circular allocation order.
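The concern above can be illustrated with a small toy simulation. This is only a sketch, not GreedyRA's actual data structures: the `(name, def_position, priority)` tuples, the priority values, and the four-register pool are all made up for illustration, and interference between live ranges is ignored. Registers are handed out circularly, but in priority order rather than definition order.

```python
# Toy model only: hypothetical names, not GreedyRA's actual data structures.
CR_FIELDS = ["cr0", "cr1", "cr5", "cr6"]  # assumed register pool


def allocate_by_priority(vregs):
    """Assign registers circularly, visiting vregs in priority order.

    vregs is a list of (name, def_position, priority) tuples.
    Higher priority is allocated first.
    """
    assignment = {}
    by_priority = sorted(vregs, key=lambda v: -v[2])
    for idx, (name, _pos, _prio) in enumerate(by_priority):
        assignment[name] = CR_FIELDS[idx % len(CR_FIELDS)]
    return assignment


def adjacent_collisions(vregs, assignment):
    """Return pairs of neighbouring definitions that share a register."""
    by_pos = sorted(vregs, key=lambda v: v[1])
    return [
        (a[0], b[0])
        for a, b in zip(by_pos, by_pos[1:])
        if assignment[a[0]] == assignment[b[0]]
    ]


# Six definitions in one block; "a" and "b" are adjacent in definition
# order but far apart in priority order (made-up priorities).
vregs = [
    ("a", 0, 60),
    ("b", 1, 20),
    ("c", 2, 50),
    ("d", 3, 40),
    ("e", 4, 30),
    ("f", 5, 10),
]
assignment = allocate_by_priority(vregs)
collisions = adjacent_collisions(vregs, assignment)
print(assignment)
print(collisions)
```

Here the priority order is a, c, d, e, b, f, so the circular walk wraps around before it reaches `b`, and the adjacent definitions `a` and `b` both end up on `cr0`. Walking the same pool in definition order would have given them different registers.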
