Make Descriptor Offsets Relocatable #457
Can one of the admins verify this patch?
This branch won't pass CI because it requires GPUOpen-Drivers/llvm-project#1 for llvm-project.
Thanks for giving us a preview. I already have some detail comments.
More importantly, on a high level, I don't think this patch can go in until we've properly discussed how all of this is going to work in detail. Some things that come to mind:
- Who will be responsible for doing the relocation? PAL is quite likely to be preferable over LLPC.
- What will the relocation types look like (the ELF part of it)?
- How do we actually represent the relocatable offsets in LLVM IR? On that one, I think the intrinsic may be misplaced, and using a special global value would be a better solution.
Force-pushed from 8b5e957 to a9506b6.
Force-pushed from 392ccf5 to 1bf58c6.
Force-pushed from c061fb9 to 0ca554d.
Force-pushed from 5d6bea0 to e44d5e6.
Considering that you have multiple forms of relocation (samplers, descriptor resource, other), you might want to add a test for each of them.
Otherwise this looks good to me with a few nits.
Force-pushed from e44d5e6 to 4a25fbe.
@nhaehnle @trenouf This PR is now ready for review. These changes implement only the first step of userDataNode relocation: the offsets of descriptors in a descriptor table. They do not address the other issues brought up by @trenouf in #474; those will come in future PRs. The AMDGPU changes have been submitted to Phabricator as https://reviews.llvm.org/D76440
Force-pushed from 4a25fbe to bd31a4b.
retest this please
Force-pushed from 7bc698d to 8ab05b6.
retest this please
@csyonghe All test agents are hanging because of this pull request. Please make sure the hang does not reproduce in your local environment before your next force-push.
How do I set up my local environment to run the tests?
@csyonghe, do you mean how to fetch the CTS source code and compile a debug-version driver for CTS?
I am not sure how to reproduce the exact environment of the testing server, including building the driver and running it. I can try building the Stadia version and running it on a Stadia server, but that might differ from the testing server. I am just wondering if there are instructions I can follow to set up the testing environment.
I think you can try the Stadia environment if that is more convenient for you. If you want to try a local environment, you need to set up the amdgpu-pro driver and the CTS binary.
Do you need instructions on building the open-source driver or on building the CTS binary?
I can build the driver. I also have the instructions for building the CTS binary. I am just not sure how to deploy and test the driver in a local environment.
You need to install an open-source driver from the repos listed in the README, and then replace /usr/lib/x86_64-linux-gnu/amdvlk64.so on Ubuntu with the driver you built:
sudo wget -qO - http://repo.radeon.com/amdvlk/apt/debian/amdvlk.gpg.key | sudo apt-key add -
I see. Thanks!
@csyonghe I still see some coding issues with the new coding style. Please check.
descPtr = builder.CreateBitCast(descPtr, builder.getInt32Ty()->getPointerTo(ADDR_SPACE_CONST));
descPtr = builder.CreateGEP(builder.getInt32Ty(), descPtr, offset);

// The LLVM's internal handling of GEP instruction results in a lot of junk code and prevented selection
Re Jax's report that this change causes test agents to hang: I'm a bit suspicious of this edit here, if only because nothing else looks like it should affect normal pipeline compilation. You could try making it conditional on !isa<ConstantInt>(descPtr), falling back to the old GEP code when it is a ConstantInt, which it will always be for pipeline compilation.
You are right. This is actually wrong when relocation is not used. In normal mode, the offset is in dwords, not bytes, so it cannot be just an add. I have corrected this issue and am testing the fix before pushing. Thanks so much for pointing this out.
Ah right, that would explain the crash. :-)
I don't like the offset being in dwords for one path and bytes for the other path. I'll have a look at changing the normal path to bytes in a separate change.
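To make the unit mismatch discussed in this thread concrete, here is a minimal sketch in plain C++ rather than LLVM IR builder calls. The helper names and the constant are hypothetical and are not LLPC code; the alignment check is an assumption of the sketch, not something stated in the PR. It only models the address arithmetic: a GEP over i32 elements scales a dword offset by 4, whereas a byte offset can be added directly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not LLPC code: they model the address arithmetic only.
constexpr std::uint64_t kDwordSize = 4; // bytes per 32-bit dword

// Normal path: the offset counts dwords, as in a GEP over i32 elements,
// so it has to be scaled to bytes before it is added to the table base.
std::uint64_t addressFromDwordOffset(std::uint64_t tableBase, std::uint64_t dwordOffset) {
  return tableBase + dwordOffset * kDwordSize;
}

// Relocatable path: the offset is already in bytes, so a plain add suffices.
std::uint64_t addressFromByteOffset(std::uint64_t tableBase, std::uint64_t byteOffset) {
  // Assumption of this sketch: descriptor offsets are dword-aligned.
  assert(byteOffset % kDwordSize == 0 && "byte offset is not dword-aligned");
  return tableBase + byteOffset;
}
```

Passing a dword offset to the byte-offset path yields a different address, which is the kind of mix-up described above.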
Force-pushed from d25b863 to be15f43.
This commit consists of the original changes made by s-perron that emit a relocatable entry for each descriptor load, instead of baking in a constant offset during initial compilation. The compiler patches up the compiled binary with the actual descriptor offset at a later time, when that info is available.
Force-pushed from be15f43 to 20f036f.
The Jenkins tests still haven't returned any results. Can we restart them?
There is some problem with the Jenkins tests. I've already asked the infrastructure team to have a look.
retest this please
Can you post the details on which case is failing?
It looks to me like it passed CTS on two agents and hung waiting for the third, so I would take that as a pass.
Let me re-trigger the test.
retest this please
The merge process has started; please do not force-push.
This PR is based on PR #440.
The commit is based on the original changes made by s-perron
that emit a relocatable entry for each descriptor load, instead
of baking in a constant offset during initial compilation. The
compiler patches up the compiled binary with actual descriptor
offset at a later time when that info is available.
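
The patch-up step described above can be illustrated with a rough sketch. Everything here is hypothetical: the DescriptorReloc record, the OffsetLookup callback, and the use of a set/binding pair as the key are assumptions made for illustration, and the sketch does not reflect the actual ELF relocation types or the LLPC/PAL interfaces. It shows only the general idea of leaving a placeholder in the compiled code and overwriting it once the real descriptor offset is known.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical record: where in the binary a descriptor offset must be written.
struct DescriptorReloc {
  std::size_t patchPos;  // byte position of the 32-bit placeholder in the code
  std::uint32_t descSet; // descriptor set the load refers to
  std::uint32_t binding; // binding within that set
};

// Hypothetical lookup, available only once the descriptor layout is known.
using OffsetLookup = std::uint32_t (*)(std::uint32_t descSet, std::uint32_t binding);

// Overwrite each placeholder with the actual descriptor offset.
void applyDescriptorRelocs(std::vector<std::uint8_t> &code,
                           const std::vector<DescriptorReloc> &relocs,
                           OffsetLookup lookup) {
  for (const DescriptorReloc &reloc : relocs) {
    // The placeholder must lie entirely inside the code buffer.
    assert(reloc.patchPos + sizeof(std::uint32_t) <= code.size());
    const std::uint32_t offset = lookup(reloc.descSet, reloc.binding);
    std::memcpy(code.data() + reloc.patchPos, &offset, sizeof(offset));
  }
}
```

In this sketch the initial compilation would produce the code bytes and the list of DescriptorReloc entries, and applyDescriptorRelocs would run later with a lookup built from the final layout.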