-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Does not work on 32-bit ARM (or maybe any 32-bit arch?) with raw binary data #2
Comments
I suspect this has something to do with 64bit vs 32bit. Testing 32bit PHP is not something I can do easily. Are you able to run the unit tests? Do they all fail? The ONLY thing in the code that should care about 32bit would be the |
I forgot I have an old Raspberry Pi 3 that I could test with. These tests were run on that Pi which is running PHP v7.4.33. If I take the above sample input and do some debugging I see this:
We're seeing a total of 16 bytes which we break up in 4x 32bit unsigned longs using If I run the same code on my x86_64 Ryzen box I get:
Something is funky with |
I upgraded to Debian 12 which gives me PHP v8.2.7 and I'm getting the same error with the weird third chunk. Not sure what the next steps are. Either:
|
See this comment, seems to be related: https://www.php.net/manual/en/function.unpack.php#106041 |
That gets us closer... I was able to refactor the code to not use
I need those as integers not strings, and when I try and convert that third byte with
Not to mention that we still have to do some 64 bit math with this number later to calculate the appropriate base85 character. Without a massive refactor I don't know if I can get this to work. We rely on 64bit math under the hood. |
With some help from ChatGPT I was able to figure out how to use
I'm not doing anything special in |
I went ahead and added a bunch of Please test and let me know. |
I encoded and decoded a PNG file, and it came out identical by checksum, so it works. Bad news though: it is really really slow. On a small ARMv7 system at 1 GHz (Allwinner A20) it took 5 minutes 17 seconds to encode 1 MB. Decoding the base85 file took 3 seconds, so that part is fine. |
That doesn't really surprise me. I imagine doing a ton of intensive math with The world has moved on to 64bit CPUs. After diving into all this madness, I now see why :) |
Would it be possible to get back to the initial method as shown in But overall I would agree it's not worth it to spend a ton of effort and add much complexity just to better support 32-bit systems. As is, even a slow fallback is fine, main thing is that it now gives correct results. Only if someone would see this as a cool hacking project or a challenge to solve it in a nice way. :) |
Oh that's an interesting idea... I just landed some code that only goes the slow/gmp way if it can't do the math in 32bit. Try the latest commit and see if it's faster. |
It now takes 12 seconds to encode, but the encoded result is different than before, and decoding it results in a differing PNG file, that's also broken and not viewable. Did your unit tests pass? Do they include encoding and verifying binary data, and not just ASCII (probably not)? |
Interesting... all of my unit tests are passing. Clearly I'm missing something though. Take a look at the unit tests and see if you think I'm missing anything. I'm open to PRs for new unit tests.
|
What a weird bug... I spent 45 minutes tracking it down but I think I got it with bcc9583. See if that works for you. I also added unit tests for this scenario so it won't catch us again in the future. If this isn't it then I will need a small file that recreates this problem I can test with. Smaller the better. |
On my Raspberry Pi 3 using a 1MB file as a test:
Even on a 32bit platform the algorithm takes the "fast" path 81.6% of the time. The "slow" path is only taken when the integer chunk is greater than 2^31. |
Now it works correctly for me on that 1M PNG file. Takes 13 seconds to encode, 3 seconds to decode. Thanks! |
I suggest to add a check if we are running on a 32-bit system, and then verify that the GMP PHP extension is installed (function_exists(gmp_init)) and throw an error outright about that if not. Because otherwise it might work without any error for simple cases and the user might deploy it thinking everything is fine, but will later fail in more complex ones. |
That's not a bad idea... How should it look? Just a simple |
I would put it not into the encode, but right into the PHP script itself, even outside class declaration. So any script which includes the PHP file, will run the check at include-time once, and fail to continue if it's not met. Also no need to check for "first" then. Better print it to STDERR, and terminate execution. Using Also do not expect to be running in a web context, there's a ton of utility in PHP to do "shell scripts" with a better language, not necessarily web apps. So I suggest to omit any HTML formatting. |
With a source code like this:
Proper output on amd64 looks like this:
But on ARMv7 (32-bit) results in a garbled mess (also binary data in output):
PHP version is the same on both, 7.4
The text was updated successfully, but these errors were encountered: