cmd/link: allow asking linker to *not* eliminate a "dead" function #35055
Comments
How are you referencing You should probably use a reference to We do this in the runtime a fair amount. See runtime/proc.go:funcPC. You would do (with your own copy of funcPC):
At the moment, we use a hack: strings placed at the beginning and the end of the assembly file, so the whole segment can be given to the kernel: https://github.com/u-root/u-root/blob/ac7ae682c648c26a7c90ac73a2a8b4597bdbeb78/pkg/multiboot/internal/trampoline/trampoline_linux_amd64.go#L67 That's what the "begin" and "end" symbols in the .s file are. This makes another assumption that I'm not sure will remain true: that all of the assembly will be laid out contiguously. Given that, would you still recommend something like funcPC?
I should add: this is because we don't only want start to be passed; we need start, boot, farjump{32,64}, and gdt. Two other strings are in there to mark places where some values need to be inserted. (Those could probably be replaced by funcPC easily.)
Yes, that's not something you should rely on. Although we've not done anything like this yet, we've been contemplating feedback-directed optimization which would (among other things) reorder function layout.
Yes, you could make that work. It doesn't require any hacks except for Note that a
When that happens, what do you recommend? What is a better way to solve this problem for us?
I'll replace all the string hacks with funcPC. That should at least keep the linker from getting rid of all that stuff. That just leaves the other problem (contiguous addressing).
I'm not entirely sure what you are asking. Why do you need them to be contiguous? You should pass the values of funcPC(start), funcPC(boot), funcPC(farjump64), etc. to whoever needs them. Then they don't need to be contiguous.
Contiguous is nice, because it allows us to allocate one segment of physical memory to the functions. They are currently written to be relocatable, so it'd be a bit of a bummer if In other words, it's desirable to keep the byte slice returned from here (which contains the entire relocatable trampoline) small.
If they are relocatable, you can just copy them out into a new slice.
I'm sorry, I don't fully understand. Copy them out into a new slice? Multiple slices are not the problem, but the fact that we have to take up more address space in a sparse fashion. And yeah, to do what you suggest I also need to know the length of each function, rather than the length of the whole segment.
You're going to have your Go binary mapped into the address space regardless, so you're not wasting any address space. Or, you're wasting the same space whether Maybe you'd have to allocate
Oh, the copying out and/or the userspace allocation don't matter. We already move it around / copy it like that. But the bootloader also allocates physical ring 0 address space to lay out the trampoline, the kernel being loaded, and other data structures to be passed to the kernel, within the constraints of the memory available on the physical system. For the trampoline, right now, we only have to allocate 1-2 pages, and we copy that memory from userspace to kernel space into those 1-2 pages. But if it's not all contiguous, and the pieces of code address each other relatively, then I have to have several scattered allocations across the physical address space, exactly /as far apart/ as the userspace layout of those functions, which will make the kernel harder to lay out. If the userspace binary is 20M, and start and boot are at opposite ends, that'd probably lead to us allocating 20M of physical address space just for the trampoline. In practice that'll probably not matter for a while, unless we end up running on really low memory systems or unless our bootloader binaries get really big.
Still, we'd have to know the length of each function. For the purposes of this, we can probably just assume that they're less than a page and just call the length 4096. (It just results in unnecessary code copied into ring 0, but alas...)
Rather than opening /proc/self/exe and finding a string within the binary, we use the Go runtime to tell us where the trampoline functions are located. (Must already be mapped in address space, since it's part of our own executable.) Based on discussion in golang/go#35055 Signed-off-by: Chris Koch <chrisko@google.com>
Any movement on this one, or am I still stuck with the hack suggested by hugelgupf?
I can confirm I am experiencing the same bug while using gollvm.
We have some assembly code in our Go binary that is never directly called by any Go code in userspace. (It's intended to be handed to the kernel to execute during a kexec in ring 0. The kernel hands execution to this piece of Go assembly, and the Go assembly hands it to the kernel we're booting.)
Normally, this code would be optimized out of the final binary by the linker. So far, we can get around that with this: https://github.com/u-root/u-root/blob/ac7ae682c648c26a7c90ac73a2a8b4597bdbeb78/pkg/multiboot/internal/trampoline/trampoline_linux_amd64.go#L39
Arguably a hack that will eventually be optimized away by the compiler.
Can we have a directive that tells the toolchain not to eliminate a piece of code, for example "//go:nodelete"? This is the assembly in question that shouldn't be optimized away: https://github.com/u-root/u-root/blob/ac7ae682c648c26a7c90ac73a2a8b4597bdbeb78/pkg/multiboot/internal/trampoline/trampoline_linux_amd64.s#L29