Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Descriptor Offsets Relocatable #457

Merged
merged 1 commit into from
Apr 21, 2020

Conversation

csyonghe
Copy link

@csyonghe csyonghe commented Feb 10, 2020

This PR is based on PR #440.

The commit is based on the original changes made by s-perron
that emits a relocatable entry for each descriptor load, instead
of baking in a constant offset during initial compilation. The
compiler patches up the compiled binary with actual descriptor
offset at a later time when that info is available.

@amdvlk-admin
Copy link
Collaborator

Can one of the admins verify this patch?

@csyonghe
Copy link
Author

This branch won't pass CI because it requires GPUOpen-Drivers/llvm-project#1 for llvm-project.

Copy link
Member

@nhaehnle nhaehnle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for giving us a preview. I already have some detail comments.

More importantly, on a high level, I don't think this patch can go in until we've properly discussed how all of this is going to work in detail. Some things that come to mind:

  1. Who will be responsible for doing the relocation. PAL is quite likely to be preferable over LLPC.
  2. What will the relocation types look like (the ELF part of it).
  3. How do we actually represent the relocatable offsets in LLVM IR. On that one, I think the intrinsic may be misplaced, and using a special global value would be a better solution.

llpc/context/llpcCompiler.cpp Outdated Show resolved Hide resolved
llpc/context/llpcCompiler.cpp Outdated Show resolved Hide resolved
llpc/context/llpcCompiler.cpp Outdated Show resolved Hide resolved
llpc/context/llpcCompiler.cpp Outdated Show resolved Hide resolved
llpc/context/llpcCompiler.cpp Outdated Show resolved Hide resolved
llpc/patch/llpcPatchDescriptorLoad.cpp Outdated Show resolved Hide resolved
llpc/patch/llpcPatchDescriptorLoad.cpp Outdated Show resolved Hide resolved
@csyonghe csyonghe force-pushed the reloc-desc-offset branch 9 times, most recently from 392ccf5 to 1bf58c6 Compare March 3, 2020 01:49
@csyonghe csyonghe force-pushed the reloc-desc-offset branch 5 times, most recently from c061fb9 to 0ca554d Compare March 11, 2020 23:29
@csyonghe csyonghe force-pushed the reloc-desc-offset branch 2 times, most recently from 5d6bea0 to e44d5e6 Compare March 17, 2020 22:38
Copy link
Contributor

@s-perron s-perron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering that you have multiple forms of relocation (samplers, descriptor resource, other), you might want to add a test for each of them.

Otherwise this looks good to me with a few nits.

llpc/context/llpcCompiler.cpp Outdated Show resolved Hide resolved
llpc/patch/llpcPatchDescriptorLoad.cpp Outdated Show resolved Hide resolved
llpc/patch/llpcPatchDescriptorLoad.cpp Outdated Show resolved Hide resolved
llpc/test/shaderdb/PipelineCs_RelocConst.pipe Outdated Show resolved Hide resolved
llpc/test/shaderdb/PipelineVsFs_RelocConst.pipe Outdated Show resolved Hide resolved
llpc/util/llpcElfWriter.cpp Outdated Show resolved Hide resolved
@csyonghe csyonghe changed the title [WIP] Make Descriptor Offsets Relocatable Make Descriptor Offsets Relocatable Mar 20, 2020
@csyonghe
Copy link
Author

csyonghe commented Mar 20, 2020

@nhaehnle @trenouf This PR is now ready for review. These changes only implement the first step of userDataNode relocation: the offsets of descriptors in a descriptor table. It does not address other issues brought up by @trenouf in #474. Those will come in future PRs.

AMDGPU changes are submitted to Phabricator as in https://reviews.llvm.org/D76440
Those changes are required for this PR to pass CI checks.

@JaxLinAMD
Copy link
Contributor

retest this please

@csyonghe csyonghe force-pushed the reloc-desc-offset branch 2 times, most recently from 7bc698d to 8ab05b6 Compare April 13, 2020 20:05
@JaxLinAMD
Copy link
Contributor

retest this please

@JaxLinAMD
Copy link
Contributor

@csyonghe all agents are hanged because of this pull request, please make sure hang is not observed on your local environment when do next force-push.

@csyonghe
Copy link
Author

@csyonghe all agents are hanged because of this pull request, please make sure hang is not observed on your local environment when do next force-push.

How do I setup my local environment to run the tests?

@JaxLinAMD
Copy link
Contributor

@csyonghe , do you mean how to fetch cts source code and compile cts debug version driver?

@csyonghe
Copy link
Author

I am not sure how to produce the exact same environment as the testing server, including buildling the driver and run it. I can try build the Stadia version and run it on stadia server, but that might differ from the server. I am just wondering if there are some instructions I can follow to setup the testing environment.

@JaxLinAMD
Copy link
Contributor

I think you can try on stadia environment if it's more convenient for you. if you want try on local enviroment, you need to setup amdgpu-pro driver and cts binary.

@JaxLinAMD
Copy link
Contributor

do you need instructions on building open-source driver or building cts binary?

@csyonghe
Copy link
Author

I can build the driver. I also have the instructions on building the cts binary. I am just not sure how to deploy and test the driver on a local environment.

@JaxLinAMD
Copy link
Contributor

you need to install a opensource driver from repos in README, and replace /usr/lib/x86_64-linux-gnu/amdvlk64.so on ubuntu with drivers you built.

sudo wget -qO - http://repo.radeon.com/amdvlk/apt/debian/amdvlk.gpg.key | sudo apt-key add -
sudo sh -c 'echo deb [arch=amd64] http://repo.radeon.com/amdvlk/apt/debian/ bionic main > /etc/apt/sources.list.d/amdvlk.list'
sudo apt-get remove amdvlk /* If old version is installed on the machine, remove it first */
sudo apt update
sudo apt-get install amdvlk

@csyonghe
Copy link
Author

I see. Thanks!

@amdrexu
Copy link
Contributor

amdrexu commented Apr 14, 2020

@csyonghe I still see some coding issues with new coding style. Please have a check.
Such as:
pMDNode -> metaNode (we don't use p prefix any more and names are camelcase, avoid acronym)
A* -> A * (this could be done by clang-format, positions of * and & follow LLVM style now)

descPtr = builder.CreateBitCast(descPtr, builder.getInt32Ty()->getPointerTo(ADDR_SPACE_CONST));
descPtr = builder.CreateGEP(builder.getInt32Ty(), descPtr, offset);

// The LLVM's internal handling of GEP instruction results in a lot of junk code and prevented selection
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re Jax's report that this change causes test agents to hang: I'm a bit suspicious of this edit here, if only because nothing else looks like it should affect normal pipeline compilation. You could try making it conditional on !isa(descPtr), falling back to the old GEP code when it is a ConstantInt, which it will always be for pipeline compilation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. This is actually wrong when relocation is not used. In normal mode, offset is in dwords not bytes, so it cannot be just an add. I have corrected this issue and trying to test this before making a push. Thanks so much for pointing this out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, that would explain the crash. :-)

I don't like the offset being in dwords for one path and bytes for the other path. I'll have a look at changing the normal path to bytes in a separate change.

@csyonghe csyonghe force-pushed the reloc-desc-offset branch 3 times, most recently from d25b863 to be15f43 Compare April 17, 2020 00:23
This commit consists of the original changes made by s-perron
that emits a relocatable entry for each descriptor load, instead
of baking in a constant offset during initial compilation. The
compiler patches up the compiled binary with actual descriptor
offset at a later time when that info is available.
@csyonghe
Copy link
Author

The Jenkins test still haven't returned any result. Can we restart it?

@JacobHeAMD
Copy link
Contributor

There is some problem with Jenkins tests. I've already asked infrastructure team to have a look.

@JacobHeAMD
Copy link
Contributor

retest this please

@csyonghe
Copy link
Author

Can you post the details on which case is failing?

@trenouf
Copy link
Member

trenouf commented Apr 20, 2020

It looks to me like it passed CTS on two agents and hung waiting for the third, so I would take that as a pass.

@JaxLinAMD
Copy link
Contributor

let me re-trigger the test again.

@JaxLinAMD
Copy link
Contributor

retest this please

@JaxLinAMD
Copy link
Contributor

merge process is started, please do not force push.

@JaxLinAMD JaxLinAMD merged commit 84123da into GPUOpen-Drivers:dev Apr 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants