stage2: code signing in self-hosted MachO linker (arm64) #7103

Closed
kubkon opened this issue Nov 13, 2020 · 13 comments · Fixed by #7231

@kubkon
Member

kubkon commented Nov 13, 2020

This is a meta issue for tracking progress on code signing of binaries generated with the self-hosted MachO linker targeting arm64 Macs. I also hope to disseminate knowledge about the signing process used and enforced by Apple on the latest arm64 platform.

Background

With the release of Apple Silicon and macOS 11 Big Sur, Apple now enforces code-signed binaries even at the debugging stage. In essence, a user who wants to build an app or binary for local use or testing is expected to make sure the MachO binary passes strict validation and to apply an adhoc code signature to it. There is a pretty good Stack Exchange answer on that topic as well.

My understanding here is that an adhoc-signed binary can only be used on the machine it was originally built on; however, this requires further investigation.

What does all of this mean for Zig?

Cross-compilation to arm64 Macs will become tricky due to the additional code signing requirement, and so will distribution of Zig binaries. For the latter, however, I assume the standard code signing process required by Apple on other platforms (iOS, watchOS, tvOS) will be used here; that is, obtaining a developer identity and certificate, and signing the binary with it. @jedisct1 pointed me to some nice, existing OSS solutions that do this and were not developed by Apple (see for instance gon), which hints at the possibility of having a similar solution written in Zig but hosted as a separate, preferably community-driven project. (Any takers? 😁)

What about local debug builds? @andrewrk suggested we investigate whether there is perhaps an exception in the kernel that at least permits unsigned binaries to run under Apple's debugger; however, this is not the case. No code signature means an immediate SIGKILL (signal 9), even for binaries built from source by the user on that very Mac.

With this in mind, what I've been trying hard to understand and explore is how we should tweak the self-hosted MachO linker so that it can generate a valid adhoc code signature. My idea was for this research effort to proceed in the following 3 steps:

  1. Get the generated MachO binary code signed with codesign -s - binary; this will generate an adhoc signature and allow the binary to run locally.
  2. Figure out if we want the necessary changes to the linker to be included in the codebase by default, or
  3. Be lucky enough and figure out how to replicate the output of codesign -s - in the self-hosted linker.

Depending on progress, we'd either go immediately from 1 -> 3, or 1 -> 2 and optionally -> 3.

The story so far...

Easy stuff first. The code signature is stored inside the __LINKEDIT segment at an offset pointed to by the LC_CODE_SIGNATURE load command. The load command itself is a linkedit_data_command.
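As a rough illustration of that load command, here is a minimal Zig sketch of the struct declared in Apple's <mach-o/loader.h> (named linkedit_data_command there). The Zig type name, the helper codeSignatureCommand, and the offsets in main are hypothetical and only serve the example; the sketch assumes a recent Zig release.

```zig
const std = @import("std");

// Value of LC_CODE_SIGNATURE in Apple's <mach-o/loader.h>.
const lc_code_signature: u32 = 0x1d;

// Field-for-field mirror of `struct linkedit_data_command`.
// The Zig type name is made up for this sketch.
const LinkeditDataCommand = extern struct {
    cmd: u32, // load command type, LC_CODE_SIGNATURE here
    cmdsize: u32, // size of this command in bytes
    dataoff: u32, // file offset of the signature data in __LINKEDIT
    datasize: u32, // size of the signature data in bytes
};

// Hypothetical helper: builds the load command for a signature blob
// located at `dataoff` and spanning `datasize` bytes.
fn codeSignatureCommand(dataoff: u32, datasize: u32) LinkeditDataCommand {
    return LinkeditDataCommand{
        .cmd = lc_code_signature,
        .cmdsize = @sizeOf(LinkeditDataCommand),
        .dataoff = dataoff,
        .datasize = datasize,
    };
}

pub fn main() void {
    // The offset and size below are invented values.
    const cmd = codeSignatureCommand(0x8050, 0x120);
    std.debug.print("cmd=0x{x} cmdsize={d} dataoff=0x{x} datasize=0x{x}\n", .{
        cmd.cmd, cmd.cmdsize, cmd.dataoff, cmd.datasize,
    });
}
```

The command only records where the signature lives; the signature bytes themselves sit at dataoff inside __LINKEDIT.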

The structure that's embedded into the binary is a little bit more cryptic and complicated. I couldn't find much info about it; however, from browsing Apple's sources, I managed to come up with a partial parser and included it in the draft helper project Zacho#64f0a0d. This is based largely on SecurityTool/codesign.c. In fact, this tool pointed out that an LC_VERSION_MIN_MACOSX load command might not be optional after all, since its presence or absence has a direct effect on the code signature version generated and embedded within the binary by the codesign utility.
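To give a feel for what such a parser has to start with, here is a minimal sketch that reads the header of the embedded signature. It assumes the layout from Apple's open-source cs_blobs.h (a "super blob" whose header is three big-endian u32 fields: magic, length, count) and is not taken from ZachO; the names SuperBlobHeader, readU32Be and parseSuperBlobHeader are hypothetical, as are the sample bytes.

```zig
const std = @import("std");

// CSMAGIC_EMBEDDED_SIGNATURE from Apple's cs_blobs.h.
const csmagic_embedded_signature: u32 = 0xfade0cc0;

// Hypothetical Zig view of the super blob header.
const SuperBlobHeader = struct {
    magic: u32,
    length: u32, // total length of the super blob in bytes
    count: u32, // number of index entries that follow the header
};

// Code signature fields are stored big-endian, regardless of the CPU.
fn readU32Be(bytes: []const u8, offset: usize) u32 {
    return (@as(u32, bytes[offset]) << 24) |
        (@as(u32, bytes[offset + 1]) << 16) |
        (@as(u32, bytes[offset + 2]) << 8) |
        @as(u32, bytes[offset + 3]);
}

// Returns null when the buffer is too short or the magic does not match.
fn parseSuperBlobHeader(bytes: []const u8) ?SuperBlobHeader {
    if (bytes.len < 12) return null;
    const magic = readU32Be(bytes, 0);
    if (magic != csmagic_embedded_signature) return null;
    return SuperBlobHeader{
        .magic = magic,
        .length = readU32Be(bytes, 4),
        .count = readU32Be(bytes, 8),
    };
}

pub fn main() void {
    // Invented header bytes: magic, length = 0x120, count = 3.
    const sample = [_]u8{ 0xfa, 0xde, 0x0c, 0xc0, 0, 0, 0x01, 0x20, 0, 0, 0, 0x03 };
    if (parseSuperBlobHeader(&sample)) |header| {
        std.debug.print("length=0x{x} count={d}\n", .{ header.length, header.count });
    } else {
        std.debug.print("not an embedded signature\n", .{});
    }
}
```

The bytes to feed into such a parser are the ones found at the offset recorded in the LC_CODE_SIGNATURE load command.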

Now onto the trickier stuff. It turns out that codesign performs strict validation of the MachO binary before signing it, and if the binary doesn't conform, it will either refuse to sign or generate some garbage. I have yet to work out the full set of rules; however, I already know that the sections within the __LINKEDIT segment cannot have "holes" between them, i.e., the offset of one should equal the offset + size of the preceding section. This complicates things for us, since we specifically want to leave some gaps for easier management of the incremental linking process in the self-hosted linker.

With that out of the way, I also think arm64 macOS requires binaries to be position-independent executables (PIEs), which is a big setback with respect to the self-hosted linker, since we rely on absolute addresses stored within the __DATA,__got section. We can work this out; however, since this is somewhat tangential to the code signature issue, I believe it is of interest to get a simple "exit syscall" binary working first. Such a binary will not contain any function calls (or debugging info), and thus will not require any GOT indirection:

export fn _start() noreturn {
    // Darwin syscall: x16 holds the syscall number (1 = exit), x0 the exit code.
    asm volatile ("svc #0x80"
        :
        : [number] "{x16}" (1),
          [arg1] "{x0}" (0)
        : "memory"
    );
    unreachable;
}

And this is what I'm currently working on. You can track the progress in kubkon/zig.git#stage2-arm-macos.
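The "no holes" rule for __LINKEDIT mentioned above can be sketched as a simple contiguity check. This is only an illustration of the constraint as described in this post, not the validation that codesign actually runs; LinkeditPart, firstGap and all offsets are hypothetical.

```zig
const std = @import("std");

// Hypothetical description of one chunk of data stored in __LINKEDIT.
const LinkeditPart = struct {
    name: []const u8,
    offset: u64, // file offset where the chunk starts
    size: u64, // size of the chunk in bytes
};

// Returns the index of the first part that does not start exactly where
// the previous one ends, or null when the layout has no holes.
fn firstGap(parts: []const LinkeditPart) ?usize {
    var i: usize = 1;
    while (i < parts.len) : (i += 1) {
        const prev = parts[i - 1];
        if (parts[i].offset != prev.offset + prev.size) return i;
    }
    return null;
}

pub fn main() void {
    // Invented layout with padding left after the symbol table, the kind
    // of gap an incremental linker would like to keep.
    const layout = [_]LinkeditPart{
        .{ .name = "symbol table", .offset = 0x8000, .size = 0x30 },
        .{ .name = "string table", .offset = 0x8040, .size = 0x20 },
        .{ .name = "code signature", .offset = 0x8060, .size = 0x120 },
    };
    if (firstGap(&layout)) |index| {
        std.debug.print("hole before '{s}'\n", .{layout[index].name});
    } else {
        std.debug.print("layout is contiguous\n", .{});
    }
}
```

A linker that wants padding for incremental updates would have to close such gaps, or account for them in the preceding part's size, before the binary is signed.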

@kubkon kubkon added os-macos frontend Tokenization, parsing, AstGen, Sema, and Liveness. arch-aarch64 64-bit ARM labels Nov 13, 2020
@kubkon kubkon added this to the 0.8.0 milestone Nov 13, 2020
@kubkon
Member Author

kubkon commented Nov 17, 2020

I thought I'd post an update. Yesterday I finally managed to get the snippet from the issue description to generate a valid MachO executable, which was in turn successfully code signed using Apple's codesign utility and loaded by dyld 🎉

From what I see, there are two critical bits that need to be satisfied for MachO to be code signed and accepted by the kernel on Apple Silicon:

  • Unlike on x86_64, where pages are 4KB, each VM page needs to be 16KB aligned.
  • The MachO executable needs to be position-independent (or PIE).
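The page alignment requirement in the first bullet boils down to rounding addresses and sizes up to a multiple of 0x4000. Here is a minimal sketch of that arithmetic; the function and constant names are hypothetical, and the 4KB figure is the usual x86_64 page size rather than something stated in this thread.

```zig
const std = @import("std");

const page_size_arm64: u64 = 0x4000; // 16KB
const page_size_x86_64: u64 = 0x1000; // 4KB

// Rounds `value` up to the next multiple of `alignment`.
// `alignment` must be a power of two.
fn alignUp(value: u64, alignment: u64) u64 {
    return (value + alignment - 1) & ~(alignment - 1);
}

fn isAligned(value: u64, alignment: u64) bool {
    return (value & (alignment - 1)) == 0;
}

pub fn main() void {
    // Invented segment size that is not page aligned.
    const raw_size: u64 = 0x1234;
    std.debug.print("x86_64: 0x{x}\n", .{alignUp(raw_size, page_size_x86_64)}); // prints 0x2000
    std.debug.print("arm64:  0x{x}\n", .{alignUp(raw_size, page_size_arm64)}); // prints 0x4000
}
```

A layout that is page aligned for x86_64 is therefore not automatically valid on arm64: 0x2000 is a multiple of 4KB but not of 16KB.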

The cleaned-up changes required to run the snippet above can be found in my local branch kubkon/zig.git#stage2-arm-macos. I won't be pushing these to the master branch until I work out how to handle the PIE requirement given our goal of incremental linking in stage2 (the two don't really go hand-in-hand).

@komuw

komuw commented Nov 18, 2020

My understanding here is that adhoc signed binary can only be used on the machine it was originally built on; however, this requires further investigation.

"I was able to apply an ad-hoc signature to a binary on my Intel mac and transfer it to my M1 mac and it ran just fine. " - golang/go#42684 (comment)

@kubkon
Member Author

kubkon commented Nov 18, 2020

That’s some really good news @komuw!

@Mouvedia

The MachO executable needs to be position-independent (or PIE).

Is #4503 a blocker?

@komuw

komuw commented Nov 19, 2020

@kubkon
also;

These new signatures are not bound to the specific machine that was used to build the executable, they can be verified on any other system- https://developer.apple.com/documentation/macos-release-notes/macos-big-sur-11_0_1-universal-apps-release-notes

@komuw

komuw commented Nov 19, 2020

There's also a tool used in NixOS to inject adhoc signatures: https://github.com/thefloweringash/sigtool

@kubkon
Member Author

kubkon commented Nov 19, 2020

There's also a tool used in nixOS to inject adhoc signatures; https://github.com/thefloweringash/sigtool

You’re the best @komuw, thanks! If this indeed generates a valid signature, we’re one step closer to home!

@kubkon
Member Author

kubkon commented Nov 19, 2020

The MachO executable needs to be position-independent (or PIE).

Is #4503 a blocker?

It’s got to be done, yep, but my thinking was that initially we’ll support it only on macOS, since there’s no running from it now.

@kubkon
Member Author

kubkon commented Nov 20, 2020

OK y'all, got some really good news. The first draft of the in-house adhoc codesigning mechanism, which makes up the very last stage of flushing the output binary from the self-hosted linker, is alive. I repeat, is alive. 😎

Many thanks to @komuw for finding the sigtool sources, which have definitely helped me in driving this home. You can find the sources for adhoc codesigning in the MachO linker here: src/link/MachO/CodeSignature.zig. It is currently in my local branch, and will probably stay there until I work out all the details required to make the self-hosted linker work for any assembly on arm64 Macs. In the meantime, however, if anyone wants to experiment with the self-hosted linker on arm64 Macs, feel free to rebase onto my branch.
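For readers wondering what the signing step computes, the core of an adhoc signature is a list of per-page hashes of the binary. The sketch below assumes SHA-256 and a 4KB signature page (separate from the 16KB VM page), based on my reading of Apple's cs_blobs.h. It is not a copy of CodeSignature.zig, and pageCount, hashPage and the input buffer are hypothetical.

```zig
const std = @import("std");
const Sha256 = std.crypto.hash.sha2.Sha256;

// Assumed page size used for hashing inside the signature.
const page_size: usize = 0x1000;

// Number of pages needed to cover `code_limit` bytes; the last page
// may be shorter than `page_size`.
fn pageCount(code_limit: usize) usize {
    return (code_limit + page_size - 1) / page_size;
}

// Hashes the page with the given index.
fn hashPage(data: []const u8, page_index: usize) [Sha256.digest_length]u8 {
    const start = page_index * page_size;
    const end = if (start + page_size < data.len) start + page_size else data.len;
    var digest: [Sha256.digest_length]u8 = undefined;
    Sha256.hash(data[start..end], &digest, .{});
    return digest;
}

pub fn main() void {
    // Stand-in for the bytes of a binary that precede the signature.
    const data = [_]u8{0} ** 10000;
    const total = pageCount(data.len);
    var i: usize = 0;
    while (i < total) : (i += 1) {
        const digest = hashPage(&data, i);
        std.debug.print("page {d}: digest starts with 0x{x}\n", .{ i, digest[0] });
    }
}
```

In a real signature these digests would be stored one after another in the code directory blob, alongside metadata such as the identifier and the hash type.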

Lastly, if you feel like prodding what's inside the code signature data section, I've added a flag for dissecting that section to ZachO, and it goes like this:

> zacho -c /path/to/binary

@Mouvedia

It’s got to be done, yep, but my thinking was that initially we’ll support it only on macOS, since there’s no running from it now.

#4503 was closed by c7170e4; does it cover macOS?

@kubkon
Member Author

kubkon commented Nov 23, 2020

@Mouvedia no, it doesn’t. For the PIE, which I’m working on right now, we need to tweak the GOT in stage2. Currently, we use absolute addresses to indirect via the GOT, which obviously will not work with PIE.
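To see why absolute addresses break under PIE, consider what happens when the loader slides the whole image by some amount: every absolute address baked into the code becomes stale, while the distance between an instruction and a GOT slot stays the same. The sketch below demonstrates only that invariant; the addresses and slide are invented, and it makes no claim about how the linker ends up encoding the GOT access.

```zig
const std = @import("std");

// Distance from `from` to `to`, computed with wrapping arithmetic so the
// sketch needs no signed casts.
fn displacement(from: u64, to: u64) u64 {
    return to -% from;
}

pub fn main() void {
    const code_addr: u64 = 0x100004000; // invented address of an instruction
    const got_slot_addr: u64 = 0x100008010; // invented address of a GOT slot
    const slide: u64 = 0x2a000; // invented load-time slide

    // Absolute addressing: the value baked into the code is wrong once
    // the image has been slid.
    std.debug.print("absolute: linked 0x{x}, actual 0x{x}\n", .{
        got_slot_addr, got_slot_addr + slide,
    });

    // Relative addressing: both ends move together, so the distance holds.
    const before = displacement(code_addr, got_slot_addr);
    const after = displacement(code_addr + slide, got_slot_addr + slide);
    std.debug.assert(before == after);
    std.debug.print("relative: 0x{x} before and after the slide\n", .{before});
}
```

This is why a position-independent binary has to reach the GOT relative to the current instruction instead of through a fixed address.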

@kubkon
Member Author

kubkon commented Nov 24, 2020

Quick update: I’ve got the PoC of PIE on x86_64 working (with incremental linking preserved). Next up is aarch64 PoC!

@kubkon
Member Author

kubkon commented Nov 26, 2020

FYI, #7231 brings this one home!
