Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document __global_pointer$ dynamic symbol requirement #165

Closed
jim-wilson opened this issue Dec 2, 2020 · 28 comments · Fixed by #215
Closed

Document __global_pointer$ dynamic symbol requirement #165

jim-wilson opened this issue Dec 2, 2020 · 28 comments · Fixed by #215
Milestone

Comments

@jim-wilson
Copy link
Collaborator

SiFive is proposing to define a dynamic tag DT_RISCV_GP to hold the GP value. This is to solve a problem that turned up during glibc ifunc testing, where an ifunc resolver can be called before _start is called, and an ifunc resolver in the main applicatiotn may require gp, which isn't set until _start is called. The proposed solution is a new dynamic tag that tells ld.so to load gp with its value.

The binutils patch
https://sourceware.org/pipermail/binutils/2020-December/114346.html
the glibc patch thread
https://sourceware.org/pipermail/libc-alpha/2020-December/120236.html
the second part of the glibc patch that actually uses DT_RISCV_GP
https://sourceware.org/pipermail/libc-alpha/2020-December/120238.html

@jrtc27
Copy link
Collaborator

jrtc27 commented Dec 2, 2020

Hm. So FreeBSD's rtld faces this same issue for calling init/fini and init_array functions and gets round it by setting gp to __global_pointer$'s value if it exists in the dynamic symbol table. Can glibc not do the same? Compared to the actual overhead of doing all the dynamic relocations, a single symbol lookup should be lost in the noise.

@jrtc27
Copy link
Collaborator

jrtc27 commented Dec 2, 2020

The patch also looks wrong to me; you need to be very careful to set gp just before you jump to the object and restore it just after you return. I'm concerned the former isn't being done quite properly, and I see nothing about the latter there.

@aswaterman
Copy link
Contributor

There's at most one gp in this ABI, so such care shouldn't be necessary. At worst, the same gp value will be redundantly loaded later on.

@jim-wilson
Copy link
Collaborator Author

ld.so is a shared library, and shared libraries don't use gp. We don't have any multi-got solution, so there can be only one gp value in an application.

@jrtc27
Copy link
Collaborator

jrtc27 commented Dec 2, 2020

Oh, hm, yes, FreeBSD's is just being overly cautious there. Probably stolen from IA-64 or something where it did matter.

@jrtc27
Copy link
Collaborator

jrtc27 commented Dec 2, 2020

Anyway, the symbol is lying around ready to be looked up (even with a GNU toolchain for Linux, not specific to FreeBSD or LLVM), no need for the dynamic tag:

burn:~ jrtc4% readelf -W --dyn-syms /var/cache/pbuilder/base-riscv64-sid.cow/bin/bash | grep __global_pointer
  1326: 00000000000d41c8     0 NOTYPE  GLOBAL DEFAULT   23 __global_pointer$

@jim-wilson
Copy link
Collaborator Author

Looking at glibc, I see that there is a .preinit_array section that contains the address of the load_gp function, so the gp will be set before init_array functions are called. And then will be set again in _start which also calls load_gp.

@jrtc27
Copy link
Collaborator

jrtc27 commented Dec 2, 2020

I guess that's one way to deal with that problem... Anyway I'd suggest adopting FreeBSD's approach (simplified to only set it once and not bother saving/restoring), it's stupidly simple and should just work in every case.

@aswaterman
Copy link
Contributor

Not needing to extend the ABI is a nice upside.

@jim-wilson
Copy link
Collaborator Author

We don't control glibc so the just set it once approach may not work for us.

The preinit_array is to ensure that gp is set before calling shared library init_array functions which might call back into functions in the main application. So this gets called from ld.so. But statically linked applications don't use ld.so or shared libraries, so they still need to set gp in _start for statically linked binaries. And if you have a dynamically linked binary that doesn't need to load any shared libraries before it starts then it will need to set gp in _start too. Shared libraries loaded after _start was don't need preinit_array to set gp but will still run it anyways. It sounds simpler to just set gp multiple times and not worry about it. Otherwise we have to change how ld.so works to ensure gp is always set even if no shared libraries are loaded before _start, and then _start only needs to set gp for statically linked binaries. I'm not a glibc expert, so I don't know if that is feasible.

@aswaterman
Copy link
Contributor

aswaterman commented Dec 2, 2020

To be clear, I don't think that only setting it once is a viable option at this point - we'd continue having to set it up to 3 times because of the existing conventions (and for static links). I was just thinking that the dynamic symbol lookup vs. the DT seems attractive because it avoids extending the ABI.

@jrtc27
Copy link
Collaborator

jrtc27 commented Dec 2, 2020

Yes you would need to set it in _start to handle static linking correctly but you would only need to set it once in rtld if you set it early enough; I am imagining that you set it at the same time (before or after doesn’t really matter) as you process any possible relocations for the main object. Maybe there’s an even better place in glibc, but those two cases should cover everything, no? Old binaries can continue to set things in preinit_array if they want (and maybe that needs to stay around for a while even for new binaries, possibly forever, as dropping that would force a dependency on a new glibc), but that’s purely for backwards-compatibility, not because it’s conceptually needed.

@jrtc27
Copy link
Collaborator

jrtc27 commented Dec 2, 2020

Which is probably what you meant @aswaterman?

@aswaterman
Copy link
Contributor

Which is probably what you meant @aswaterman?

Indeed.

@jim-wilson
Copy link
Collaborator Author

On IRC, Fangrui Song the lld maintainer said that he would prefer the DT_SYMTAB approach that Jessica suggested.

@Nelson1225
Copy link
Collaborator

Just to follow up on this issue, the patches for the second approach, search DT_SYMTAB to get gp value, are as follows,

@jrtc27
Copy link
Collaborator

jrtc27 commented Dec 16, 2020

The principle looks good to me, though on the binutils patch front do PIEs not also need the symbol exported?

@jrtc27
Copy link
Collaborator

jrtc27 commented Dec 16, 2020

(and I don't know what should happen for relocatable objects, though maybe that's not relevant in this function)

@jrtc27
Copy link
Collaborator

jrtc27 commented Dec 16, 2020

i.e. I'd expect to see bfd_link_executable not !bfd_link_pic.

@jim-wilson
Copy link
Collaborator Author

Code compiled PIC doesn't use gp. Only non-pic executables set and use gp, so it only makes sense for a position dependent executable (pde) can export it.

@jrtc27
Copy link
Collaborator

jrtc27 commented Dec 16, 2020

R_RISCV_PCREL_HI20/LO12_[IS] can be relaxed just fine for PIEs? GP-relative and PC-relative addressing are both position-independent.

@jrtc27
Copy link
Collaborator

jrtc27 commented Dec 16, 2020

I see bfd guards that specific relaxation with !bfd_link_pic but that looks wrong to me, I see no reason why that shouldn't be !bfd_link_dll?

@jim-wilson
Copy link
Collaborator Author

It looks like gp relaxation in PIEs is an oversight. I think that should be handled as a separate project. There are likely a number of issues, and it will need testing. The compiler won't put small variables in sdata when pic, but it probably could for pie. That will seriously limit the usefulness of gp relaxations in the linker for pie. There could also be start file problems if libraries aren't initializing gp for PIE because it wasn't needed before. It looks like glibc handles this correctly, but I haven't checked other libraries.

@jrtc27
Copy link
Collaborator

jrtc27 commented Dec 17, 2020

Yeah it’s related but separate. I think LLVM will get sdata right, it has clearer distinctions internally between PIC and PIE than some of the GNU world, but I wouldn’t like to say for sure. I know FreeBSD’s CSU code initialises gp regardless of whether it’s PIC or PIE, it’s the same file with no ifdef, and I imagine newlib and glibc share their heritage.

@jrtc27
Copy link
Collaborator

jrtc27 commented Dec 17, 2020

(the number of times I’ve seen code in bfd get confused about what bfd_link_pic means... :()

@kito-cheng
Copy link
Collaborator

I believe that we don't need a new dynamic tag DT_RISCV_GP anymore, the binutils and glibc are landed another solution which @Nelson1225 mentioned, @jrtc27 did you think there something need to clarify? if no I think we could close this issue.

@jrtc27 jrtc27 changed the title new RISC-V dynamic tag DT_RISCV_GP Document __global_pointer$ requirement Mar 12, 2021
@jrtc27 jrtc27 changed the title Document __global_pointer$ requirement Document __global_pointer$ dynamic symbol requirement Mar 12, 2021
@jrtc27
Copy link
Collaborator

jrtc27 commented Mar 12, 2021

I believe that we don't need a new dynamic tag DT_RISCV_GP anymore, the binutils and glibc are landed another solution which @Nelson1225 mentioned, @jrtc27 did you think there something need to clarify? if no I think we could close this issue.

We need to document that __global_pointer$ needs to be in the dynamic symbol table for the run-time linker, i.e. its visibility needs to be global (I think that's it?).

@jrtc27 jrtc27 added this to the First Release DoD milestone Jul 9, 2021
@kito-cheng
Copy link
Collaborator

#215

@jrtc27 jrtc27 linked a pull request Sep 27, 2021 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants