Failed to decode my known-good test file #2
Comments
Hmm.
I find it quite amusing that the very first test file I fed through it appears to be an edge case that I can't figure out how I created in the first place. Maybe I'll find time to set up gdbgui some time in the next couple of days and step through it to see what's going on... or maybe I'll just set up the differential fuzzing against the Python implementation that I was planning to do anyway (using PyO3 to get the Python and Rust in the same process, if you want to leapfrog me on that) and see if that removes the need for gdbgui to figure out where they diverge.
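For readers unfamiliar with the idea, here is a minimal sketch of a differential check: the same input goes to two decoders, and the first input on which they disagree is reported. It is written in plain Python rather than the PyO3/Rust setup described above, and `run_decoder`, `find_divergence`, and `random_inputs` are hypothetical names for illustration only; neither decoder argument stands for a real API of this library or of Python's `binhex` module.

```python
import random
from typing import Callable, Iterable, Iterator, Optional, Tuple

Decoder = Callable[[bytes], bytes]
Outcome = Tuple[str, object]


def run_decoder(decode: Decoder, data: bytes) -> Outcome:
    """Run one decoder and fold success or failure into a comparable value."""
    try:
        return ("ok", decode(data))
    except Exception as exc:  # any failure counts as a rejection of the input
        return ("error", type(exc).__name__)


def find_divergence(
    reference: Decoder, candidate: Decoder, inputs: Iterable[bytes]
) -> Optional[Tuple[bytes, Outcome, Outcome]]:
    """Return the first input on which the two decoders disagree, else None.

    Two outcomes agree when both decoders reject the input, or when both
    accept it and produce identical bytes. Error types are not compared,
    since two implementations rarely share exception names.
    """
    for data in inputs:
        ref = run_decoder(reference, data)
        cand = run_decoder(candidate, data)
        if ref[0] != cand[0] or (ref[0] == "ok" and ref[1] != cand[1]):
            return data, ref, cand
    return None


def random_inputs(seed: int, count: int, max_len: int = 64) -> Iterator[bytes]:
    """Yield reproducible pseudo-random byte strings (a stand-in for a real fuzzer)."""
    rng = random.Random(seed)
    for _ in range(count):
        length = rng.randrange(max_len)
        yield bytes(rng.randrange(256) for _ in range(length))
```

A real fuzzer would generate inputs guided by coverage rather than blindly; the comparison step is the part this sketch is meant to show.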
Thanks for bringing up this issue! I think I know the issue here,
where BINHEX_PROMPT_PREFIX is "(This file must be converted with BinHex" and DATA is any number of lines of binhex data. I interpreted the spec you posted to mean that this string prefix is a hard requirement, but if, as in this case, many files don't comply with it, I could remove the requirement when parsing.
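As an illustration of what dropping that requirement could look like, here is a minimal Python sketch (not the library's actual Rust code) that ignores everything before the first line starting with `:` and returns the characters up to the closing `:`. The function name `extract_binhex_payload` is hypothetical, and the sketch assumes the payload is framed by a leading and a trailing colon.

```python
def extract_binhex_payload(text: str) -> str:
    """Return the encoded characters between the opening and closing ':'.

    Lenient by design: any preamble, including a missing or altered
    "(This file must be converted with BinHex ..." line, is skipped.
    """
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if line.startswith(":"):
            break
    else:
        raise ValueError("no line starting with ':' found")

    # Join the remaining lines, dropping line breaks and stray whitespace.
    body = "".join(part.strip() for part in lines[index:])

    # The payload ends at the next ':' after the opening one.
    end = body.find(":", 1)
    if end == -1:
        raise ValueError("missing closing ':'")
    return body[1:end]
```

One trade-off of this leniency: a human-written preamble line that happens to start with `:` would be mistaken for the start of the data, so a stricter variant might also check the candidate line against the BinHex alphabet.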
I can verify that your file correctly decodes to
when I change it to
They don't appear to be common, but they exist (a quick search through the
BinHex was basically a Macintosh counterpart to things like uuencode and, depending on the nature of your early Usenet client, it'd be easy for someone to assume that message isn't significant to the parsing. You might literally be copy-pasting the non-human-readable portion of the BinHex file into a message after you wrote something like "Here's the .hqx". (It's reminiscent of how early file transfers over serial lines and modems worked, with the receiver running something like XModem and the sender then running it, and, as far as the link was concerned, the human on one end just started typing really fast, ending with the program on the receiving end exiting on its own. Very ad-hoc, lenient, and lacking formal framing.) EDIT: Watch Cathode Ray Dude's AT&T's '60s Modem That Won't Die if you want an interesting exploration of how ad-hoc it began.
I'd be interested in seeing the result of the fuzzing! If you have a Python fuzzing implementation set up, I could look at integrating it with this package, or if you wanted to do that yourself, that would be appreciated. As for the parsing, thanks for the context. I will remove the requirement and just read it like
Barring unanticipated show-stoppers, it'll be a Rust fuzzing system which binds to
I may get to it today or I may not. I'm trying to get my sleep cycle in order and it'll depend on how much I get to before I feel sleepy.
Sounds good, no rush. I'll close this issue for now, since I've just fixed it and updated the library to v0.1.2. Let me know if you run into any other issues with the library.
This minimal test program...
...reports `BadFormat`...
...when fed this minimal "known good case" test file that Python's standard library `binhex` module has no problem with and `uudeview` produces expected output from:
(Yeah, you don't need to binhex text files, but I built the test file from `testfile.txt` because it makes it easier to quickly check for problems using `less`, and because text files have no internal checksums that a tool might leverage to cover up its lack of support for checking BinHex4's own CRC.)
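For context on the CRC mentioned above: Python's `binascii.crc_hqx` documents itself as the CRC used by the binhex4 format, a 16-bit CRC over the CCITT polynomial 0x1021. The sketch below is a bit-by-bit Python version of that checksum, cross-checked against the standard-library function. The names `crc_ccitt` and `matches_stdlib` are hypothetical, and the sketch says nothing about how this library or `uudeview` applies the CRC to the header and fork sections.

```python
import binascii


def crc_ccitt(data: bytes, crc: int = 0) -> int:
    """Compute a 16-bit CRC with polynomial 0x1021, most significant bit first.

    `crc` is the starting value, so a long stream can be processed in chunks
    by passing the previous result back in.
    """
    for byte in data:
        crc ^= byte << 8  # fold the next byte into the high bits
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def matches_stdlib(data: bytes) -> bool:
    """Check the sketch against binascii.crc_hqx with an initial value of 0."""
    return crc_ccitt(data) == binascii.crc_hqx(data, 0)
```

A decoder that verifies this value on plain-text payloads is exercising its own CRC support, which is the point of using a text file as the test case.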