-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#50: Add architecture detection, and ARMv8 support, to zwo_fixer's PLT hooking functionality #51
Conversation
@bgottula Make sure to give this a test run for me to make sure I didn't cause any runtime regressions. I did a basic "run the compiler on it" sort of "test", but that's about as far as I could go, given my lack of camera hardware here. (I suppose I could have run a stub program that loads libASICamera2.so but doesn't do actual camera operations, just to see if the fixer library inits properly and doesn't crash it, but meh.) (Also, though I do have GCC ARM cross-compilers handy, I don't know of any convenient hardware off the top of my head to test run it on, so I couldn't really actually test that the ARMv8 stuff works properly either.) |
🦾 <-- Hey look, it's an ARM. |
Didn't the OP on #49 want arm7 rather than arm8 support? |
Oh ffs...
So apparently, now that I look closer, I guess it’s typical, for some
reason, for RPi distributions to run a 32-bit ARMv7 kernel+userland on
their fancy 64-bit ARMv8 CPU.
Sigh. I guess I’ll go implement ARMv7 support then, shouldn’t be too hard I
think...
…On Sat, Jul 4, 2020 at 7:59 PM Brett Gottula ***@***.***> wrote:
Didn't the OP on #49 <#49>
want arm7 rather than arm8 support?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#51 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAGIXP2WX4WKYUXBVZH2BHLRZ7UB3ANCNFSM4OQVGP3A>
.
|
And also yeah I don’t believe the libASICamera2 deb repo I made contains
the non-x86 libraries. I may or may not look into addressing that.
Ultimately that’s more of a convenience thing I guess since it’s just a
repackaging of the exact same binaries as ZWO themselves provide for
download.
…On Sat, Jul 4, 2020 at 8:04 PM Justin Gottula ***@***.***> wrote:
Oh ffs...
So apparently, now that I look closer, I guess it’s typical, for some
reason, for RPi distributions to run a 32-bit ARMv7 kernel+userland on
their fancy 64-bit ARMv8 CPU.
Sigh. I guess I’ll go implement ARMv7 support then, shouldn’t be too hard
I think...
On Sat, Jul 4, 2020 at 7:59 PM Brett Gottula ***@***.***>
wrote:
> Didn't the OP on #49 <#49>
> want arm7 rather than arm8 support?
>
> —
> You are receiving this because you authored the thread.
> Reply to this email directly, view it on GitHub
> <#51 (comment)>,
> or unsubscribe
> <https://github.com/notifications/unsubscribe-auth/AAGIXP2WX4WKYUXBVZH2BHLRZ7UB3ANCNFSM4OQVGP3A>
> .
>
|
I can test the armv7 module on a PI 4 if you want. |
@serveurperso that would be great. |
@jgottula what do you want to do with this one? @serveurperso do you ever plan to test this, or did @jgottula do all this work for nothing? |
Actually yesterday, among other things, I was looking into expanding the apt package support to include the various ARM binaries. Didn't get around to it just yet because it was lower priority; but it is on my radar. |
Okay. Well I don't have any current plans to use ARM so I'll leave it up to you. I guess I'll leave this open then? |
fd1bc2f
to
4bceef7
Compare
Rebased ARMv7 and ARMv8 support onto SDK version 1.16.3. |
9c6793b
to
cfabec8
Compare
ifeq ($(PLATFORM), armv5) | ||
# CC = arm-none-linux-gnueabi-g++ | ||
CFLAGS += -march=armv5 | ||
endif | ||
|
||
ifeq ($(PLATFORM), armv6) | ||
# CC = arm-linux-gnueabi-g++ | ||
CFLAGS += -march=armv6 | ||
endif | ||
|
||
ifeq ($(PLATFORM), armv7) | ||
# CC = arm-linux-gnueabihf-g++ | ||
CFLAGS += -march=armv7 -mcpu=cortex-m3 -mthumb | ||
endif | ||
|
||
ifeq ($(PLATFORM), armv8) | ||
# CC = aarch64-linux-gnu-g++ | ||
endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's going on with the commented-out CC
lines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, those are based (slightly loosely) on what's in the original demo makefile here: https://github.com/seeing-things/zwo/blob/master/1.16.3/linux_sdk/demo/Makefile#L53-L85
In principle they would be necessary if you were, for example, cross-compiling the capture
program on your Intel box for your ARM box with your gcc cross-compiler for ARM. Because in that situation, the main/non-prefixed gcc is the one that builds binaries targeted for your Intel box. And the prefixed version is the one that builds binaries targeted for the ARM box.
When building for the ARM box on the ARM box itself, you don't actually really need the toolchain prefix, because the main/non-prefixed gcc in that situation would be the one targeting the ARM box itself.
(Fun exercise: I bet that on any of your AMD64 boxes, you can run regular gcc via just gcc
but also via x86_64-pc-linux-gnu-gcc
or similar. But because the host and target arch are the same, you normally just use gcc
to build stuff.)
So... yeah, these prefixes are only really needed for cross-compiling... but somebody might want them... I dunno... in any event, I halfheartedly left them in, in commented-out form, as a sort of "whatever" compromise.
d92d3c5
to
e6ba4c2
Compare
I can build from this branch, but the program crashes:
Here's a stack trace:
|
Oh I know what went wrong. This is one of those cases where git's fancy automatic merging did me a disservice by assuming some things were unrelated when they actually are interrelated and should have been a conflict of sorts. Will fix it up. |
ARM Thumb mode is the most evil thing ever conceived from a reverse engineering point of view. Not only do you (and your disassembler) never have a goddamn clue when the code is in Thumb mode or not (leading to confusion and wrongly-disassembled instructions); but then if you want to hack around with this code and actually interface with it, good luck with that, because probably half the stuff will be in regular ARM ISA and the other half will be in Thumb and it's just a giant inconsistent mess. You have no idea how happy I am that ARM made the highly intelligent decision to NOT support the Thumb instruction set(s) in AArch64. Dropping that shit was the best decision they ever made.
This Makefile is frankly an abomination. It's still originally based on the crusty ugly Makefile provided with the demo programs in the ASK SDK. The necessity to manually adjustment of the PLATFORM variable in the Makefile is idiotic and really should not need to be there, except maybe in cases of cross-compilation.
Use strlcpy instead of strncpy; and ensure full buffer zeroing.
Whoever invented the SER header was an idiot and decided that all the 4- and 8-byte fields should be 2-byte-aligned. Do these people understand how computers work? Do they think memory buses are magic? Sigh... This solution to the problem is nifty because it uses std::tie stuff. But ultimately it ended up dumber than intended, because it didn't end up being possible to do std::tie directly to the struct members for exactly the same reason that taking pointers to them generated warnings: they aren't aligned. So the troublesome part ended up just needing an intermediate temporary assignment anyway, and the niftiness factor was sadly substantially decreased.
e6ba4c2
to
c40cad9
Compare
Force-pushed a fix. |
Confirm, that fixed it. |
I had a sudden realization after thinking about issue #50 for a little while, that the timeouts may actually have been happening due to libzwo_fixer screwing things up, because apparently it was under the very incorrect presumption that everything was detected properly and that it was running on an AMD64 system, and was therefore patching in AMD64 hook code at undoubtedly incorrect/random addresses in the libASICamera2.so ARMv8 library!
So, the first thing that needed correcting was to do proper architecture detection. Previously there was fairly rigorous detection of the ASI SDK version; however, nothing actually stopped the library from blissfully doing completely wrong things if it was running on a non-AMD64 machine. So now, there are preprocessor checks to categorize the machine the library is being compiled on, and either allow compilation to continue, or annoy the user with an error/warning message telling them that they're attempting to build for an architecture that doesn't actually have the necessary code support built in (including the actual offsets in the relevant platform's variation of the libASICamera2.so binary).
After that, I took a look at how the PLT is handled by the runtime dynamic linker on AArch64 vs how it's done on AMD64, and it's actually pretty similar. And given that it was trivial to locate the necessary offsets in the ARMv8 libASICamera2.so binary, I figured I would just go ahead and implement some multi-arch support into the library. So now it supports both AMD64 and ARMv8. (But not i386, ARMv5, ARMv6, ARMv7, or the Mac or Windows versions. The former four could probably be done pretty easily if the desire ever arose.)
Making the code simpler / more unified / more architecture-generic steered me toward altering the way I had been doing the PLT hooks. Previously I was doing a frankly somewhat boneheaded thing where I wrote a custom AMD64 asm stub over the PLT entry itself. I could have taken the same approach for ARMv8, but I'd have to do a separate asm implementation, and the ARM ISA instruction encoding is a gigantic pain in the ass (particularly because something ostensibly easy like "hey let's just load up a 64-bit immediate value into a register and then jump to it" requires like 7 steps because you can't fit very many bits into a fixed-size 32-bit instruction word). And there were other downsides too: the way I was doing it required me to flip the page protection flags on the memory whenever the hook was enabled or disabled, because the PLT stub itself is a code section with R-X permissions, and modifying the code requires RW-. (Oh and on ARM, unlike x86, doing self-modifying-code-type-stuff doesn't "just work", you have to actually explicitly make sure the L1 I$ and D$ synchronize properly and so forth.) Plus, I was doing some questionable things anyway like using memcpy to overwrite code, which is potentially problematic because there's no guarantee that memcpy isn't going to copy one byte at a time and temporarily put the code into a partially-overwritten state.
So now, I just take advantage of the fact that the PLT entries are already set to just do an indirect jump via their GOT PLT slot (which is in a data section that is already RW-), and I literally just do a single 64-bit atomic exchange operation to overwrite the pointer that's used for the indirect jump. I probably should have been doing this originally, but I think there were plausible reasons why I didn't want to do it. In any case, I think the new way is much simpler and less dumb overall.