How to handle two sets of bytes for matching improvements? #46
Comments
What we could do there, instead of matching FORM at offset 0, is change to the offset where the more accurate info lives and match there. I don't currently have a way to do wildcards, so I can't be as accurate as matching both FORM and ACBM. Thanks for the info, I can work on that when I have time. If you know a source of sample files for that, please share!
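To illustrate the single-offset idea, here is a minimal sketch, not puremagic's actual code. It relies on the standard IFF layout (FORM at bytes 0-3, a 4-byte big-endian size at bytes 4-7, the form type such as ILBM or ACBM at bytes 8-11). The helper name `matches_at` and the sample header are made up for the example.

```python
def matches_at(data: bytes, signature: bytes, offset: int) -> bool:
    """Return True if `signature` appears in `data` starting at `offset`."""
    return data[offset:offset + len(signature)] == signature


# Hypothetical minimal IFF header: "FORM", 4-byte big-endian size, form type.
header = b"FORM" + (4).to_bytes(4, "big") + b"ACBM"

# Match the form type at offset 8 instead of the generic FORM at offset 0.
print(matches_at(header, b"ACBM", 8))  # True
print(matches_at(header, b"ILBM", 8))  # False
```

The trade-off is that matching only at offset 8 no longer checks that the file starts with FORM at all.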
Instead of, or in addition to, wildcards, another option could be a dual match. Take our .iff sample, we could look to do...
If your code sees a list instead of a string, process both hex matches, each with the corresponding offset from the next list. If both match, we get pretty much 100% confidence it's what we think it is. The logic is a little weirder than wildcarding, but it's another possible way. Aminet is pretty much the internet's oldest resource for all things Amiga, so we should be able to find pretty much everything there. 7zip will happily unpack most of the .lha and other formats you'll find there. If you get stuck on any, let me know and I'm sure I can unearth samples from somewhere.
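A rough sketch of how the dual match could work, assuming each signature is stored as a list of (hex string, offset) pairs. That table layout, the `SIGNATURES` name, and the `identify` function are all hypothetical and not how puremagic stores or checks its data.

```python
# Hypothetical signature table: every (hex, offset) part must match.
SIGNATURES = {
    ".iff (ILBM image)": [("464f524d", 0), ("494c424d", 8)],  # FORM + ILBM
    ".iff (ACBM image)": [("464f524d", 0), ("4143424d", 8)],  # FORM + ACBM
}


def identify(data: bytes) -> list:
    """Return the names of all signatures whose parts all match `data`."""
    found = []
    for name, parts in SIGNATURES.items():
        if all(
            data[offset:offset + len(bytes.fromhex(hex_sig))] == bytes.fromhex(hex_sig)
            for hex_sig, offset in parts
        ):
            found.append(name)
    return found


# Hypothetical header: "FORM", 4-byte big-endian size, then the form type.
sample = b"FORM" + (4).to_bytes(4, "big") + b"ILBM"
print(identify(sample))  # ['.iff (ILBM image)']
```

A file that starts with FORM but carries some other form type at offset 8 matches nothing here, which is where the extra confidence comes from.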
Thanks for the samples! I added multi-part detection. It should be working in 1.20: https://github.com/cdgriffith/puremagic/releases/tag/1.20
Nice! I've just looked at the implementation and that's a great way to handle it, much tidier than mine. I'll test it out later on a script I have for converting images between formats. For retro uses this will be handy, as a lot of older formats, like file packers, use a two-part fingerprint.
Hi there,
I'm looking for a Python package to help identify weird and wonderful files inside various scripts. I had seen fleep, but that appears to be dead. Puremagic looks to offer the functionality I want.
One job is for handling Amiga .iff files in an image conversion script. Having a quick look, it's nice to see .iff getting some love:
puremagic/puremagic/magic_data.json
Line 1084 in ff042db
But in Amiga land that .iff FORM header is used for many things (Wikipedia: List_of_file_signatures).
Is there a way to help improve mapping and confidence by adding additional matching strings such as ILBM, ACBM, etc.? I'm happy to help with a PR if it can be done.