Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

imghdr matches in PureMagic? #68

Closed
NebularNerd opened this issue May 11, 2024 · 5 comments · Fixed by #75
Closed

imghdr matches in PureMagic? #68

NebularNerd opened this issue May 11, 2024 · 5 comments · Fixed by #75

Comments

@NebularNerd
Copy link
Contributor

NebularNerd commented May 11, 2024

I've re-titled this as it got a bit off-track from the original purpose of the issue.

@cclauss wanted to know if PureMagic could provide a 1:1 replacement of imghdr, to avoid crosstalk I've moved the original content to #69

@cclauss
Copy link
Contributor

cclauss commented May 11, 2024

Did imghdr have tests we should attempt to run here?

@NebularNerd
Copy link
Contributor Author

NebularNerd commented May 11, 2024

Looking at the imghdr source it runs a very primitive test, and can only match 0x01da / b'\001\332'.

https://github.com/python/cpython/blob/af8db2b6817e1ae25cbee98e67d1f4bac9d6af9c/Lib/imghdr.py#L73-L76

The current magic in PureMagic would not pick this up as it currently has a longer variant specific match, adding a short match like above is fine, but it can lead to low confidence scores and potential clashes if another file uses the same header. The best approach for good match scores and a definitive solid match is to use the longest header and multi-match possible.

I'll likely run up a PR using the same conventions as I did for PCX, this issue is more about a thought about naming conventions and potential better presentation of the results.

Also, as an aside, SGI would make a great example of a rule-based detection as the header is quite complicated with about 6 unique fixed fingerprints.

@cclauss
Copy link
Contributor

cclauss commented May 11, 2024

I meant all the tests in that file for all the format, not just rgb.

@NebularNerd
Copy link
Contributor Author

NebularNerd commented May 11, 2024

Looking at the source again, a quick run-down:

  • .jpg ✅
  • .png ✅
  • .gif ✅
  • .tif/.tiff ✅
  • .rgb ❌ As discussed above
  • .pbm ✅
  • .pgm ✅
  • Sun Raster ✅
  • X Bitmap ❌
  • .bmp ✅
  • .webp ✅
  • OpenEXR ✅

X Bitmap looks to not be present, SGI we know needs some love, the other formats are supported to at least the same degree the imghdr uses (there are a couple that need tidying and testing). If I do a PR then it would include the missing formats, plus improvements to the existing ones where I think PureMagic can now do a better job.

@NebularNerd
Copy link
Contributor Author

I'll close this as my PR #75 was merged but it did not close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants