fix: improve media format detection with proper ftyp-based MP4 detection #311
- Fix MP4 detection by checking for 'ftyp' at offset 4 instead of hardcoded box sizes
- Add HEIC/HEIF and AVIF image format detection
- Add GIF, BMP, TIFF image format detection
- Add MOV, 3GP, AVI video format detection
- Add MP3, WAV, FLAC, OGG, M4A audio format detection
- Add length checks to prevent index errors on short data
- Improve JPEG detection with 3-byte signature
- Merge all ftyp-based format detection (HEIC, AVIF, M4A, MOV, 3GP, MP4) into single block
- Remove duplicate unreachable M4A detection code
- Ensures M4A files are correctly identified instead of falling through to MP4
@dylanuys Can you plz check this PR and give me your feedback? Thanks.

@0xsatoshi99 sorry for leaving this one hanging -- this doesn't impact anything since c2pa-python under the hood extracts format from media bytes itself, but regardless this is much better than the original code! Hopefully you can still get incentive for this, but reach out to me on discord if you need me to vouch for you for Gittensor.
* Release 4.4.0 (#323)
* Feat/escrow addr (#322)
* get escrow addresses
* Fix/gen miner (#321)
* generator miner updates
* update dependencies
* remove liveness in db
* prompt modality addition
---------
Co-authored-by: kenobijon <kjmiyachi@gmail.com>
---------
Co-authored-by: kenobijon <kjmiyachi@gmail.com>
Co-authored-by: Kenobi <108417131+kenobijon@users.noreply.github.com>
* fix: improve media format detection with proper ftyp-based MP4 detection (#311)
* fix: improve media format detection with proper ftyp-based MP4 detection
  - Fix MP4 detection by checking for 'ftyp' at offset 4 instead of hardcoded box sizes
  - Add HEIC/HEIF and AVIF image format detection
  - Add GIF, BMP, TIFF image format detection
  - Add MOV, 3GP, AVI video format detection
  - Add MP3, WAV, FLAC, OGG, M4A audio format detection
  - Add length checks to prevent index errors on short data
  - Improve JPEG detection with 3-byte signature
* fix: consolidate ftyp-based format detection to fix unreachable M4A code
  - Merge all ftyp-based format detection (HEIC, AVIF, M4A, MOV, 3GP, MP4) into single block
  - Remove duplicate unreachable M4A detection code
  - Ensures M4A files are correctly identified instead of falling through to MP4
---------
Co-authored-by: Dylan Uys <dylan.uys@gmail.com>
* bump version
* create index after adding new col
* remove prompts after sampling
* improve gen miner error messages (#324)
* webhook status tracking (#325)
* gen miner env template update
* better self-signed check
* remove duplicate pm2 logs
* useless comments
* verify_c2pa helper and other improvements
* fix google in new truster issuers
* fix ai generated check
* use block arg instead of subtensor
---------
Co-authored-by: kenobijon <kjmiyachi@gmail.com>
Co-authored-by: Kenobi <108417131+kenobijon@users.noreply.github.com>
Co-authored-by: Satoshi Dev <162055292+0xsatoshi99@users.noreply.github.com>
* fix score norm for gen miners, switch off prompt removal for now
* bump version
---------
Co-authored-by: kenobijon <kjmiyachi@gmail.com>
Co-authored-by: Kenobi <108417131+kenobijon@users.noreply.github.com>
Co-authored-by: Satoshi Dev <162055292+0xsatoshi99@users.noreply.github.com>
Summary
Fixes incorrect MP4 format detection in C2PA verification by using proper ftyp-based detection instead of hardcoded box sizes. Also adds comprehensive support for additional media formats.
Problem
The previous implementation detected MP4 files by checking for specific ftyp box sizes:
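The original snippet is not preserved in this description. As a purely hypothetical reconstruction (the name `old_is_mp4` and the exact byte comparisons are assumptions, not the project's code), a size-based check of this kind might look like the following sketch, which also shows how a valid `ftyp` box of another size slips through:

```python
import struct


def old_is_mp4(data: bytes) -> bool:
    """Hypothetical size-based check: accepts only ftyp boxes of 28 or 32 bytes."""
    # An ISO BMFF box starts with a 4-byte big-endian size,
    # followed by the 4-byte box type.
    return data[:8] in (b"\x00\x00\x00\x1cftyp", b"\x00\x00\x00\x20ftyp")


def make_ftyp(size: int) -> bytes:
    """Build a synthetic ftyp box of the given total size, zero-padded."""
    header = struct.pack(">I", size) + b"ftyp" + b"isom"
    return header.ljust(size, b"\x00")


print(old_is_mp4(make_ftyp(32)))  # True
print(old_is_mp4(make_ftyp(24)))  # False, although the box is a valid ftyp box
```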
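This only catches MP4 files with `ftyp` box sizes of 28 or 32 bytes. Many valid MP4 files have different box sizes and would incorrectly fall back to `.bin`, causing C2PA verification to fail.
Solution
Check for the `ftyp` signature at offset 4 (the standard location per the ISO base media file format).
Changes
Bug Fix
- MP4 detection checks for `ftyp` at offset 4 instead of hardcoded box sizes
- Length checks prevent index errors on short data
- JPEG detection uses the 3-byte signature (FF D8 FF)
New Format Support
- Images: HEIC/HEIF, AVIF, GIF, BMP, TIFF
- Video: MOV, 3GP, AVI
- Audio: MP3, WAV, FLAC, OGG, M4A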
Testing
Manual verification with sample files of each format type.
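Manual checks could be complemented with synthetic headers. The sketch below is an assumption-laden illustration, not the PR's code: `sniff_suffix`, the signature table, and the returned suffix strings are hypothetical names. It covers only the formats that do not use an `ftyp` box, and for MP3 it only recognizes files that begin with an ID3 tag.

```python
# Leading-byte signatures for formats that do not use an ftyp box.
# (Hypothetical table; suffix names are assumptions.)
PREFIX_SIGNATURES = [
    (b"\xff\xd8\xff", ".jpg"),   # JPEG, 3-byte signature
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
    (b"BM", ".bmp"),
    (b"II*\x00", ".tiff"),       # little-endian TIFF
    (b"MM\x00*", ".tiff"),       # big-endian TIFF
    (b"fLaC", ".flac"),
    (b"OggS", ".ogg"),
    (b"ID3", ".mp3"),            # MP3 with a leading ID3v2 tag only
]


def sniff_suffix(data: bytes) -> str:
    """Return a suffix guess from leading bytes, or '.bin' if unknown."""
    # startswith() is safe on short buffers, so no explicit length check here.
    for magic, suffix in PREFIX_SIGNATURES:
        if data.startswith(magic):
            return suffix
    # RIFF containers carry their form type at offset 8; guard the length
    # before slicing into the header.
    if len(data) >= 12 and data[:4] == b"RIFF":
        if data[8:12] == b"WAVE":
            return ".wav"
        if data[8:12] == b"AVI ":
            return ".avi"
    return ".bin"


samples = {
    "gif": b"GIF89a" + bytes(6),
    "wav": b"RIFF" + bytes(4) + b"WAVE",
    "short": b"\xff",
}
for name, buf in samples.items():
    print(name, sniff_suffix(buf))
```

Two-byte signatures such as `BM` are prone to false positives, so a real implementation would likely order or tighten these checks accordingly.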
Contribution by Gittensor, learn more at https://gittensor.io/
Note
Low Risk
Changes are limited to magic-byte format detection used for temp file suffixes; main risk is misclassification of edge-case `ftyp` brands affecting which suffix is used for C2PA parsing.
Overview
Fixes C2PA temp-file suffix detection by replacing the brittle MP4 check (hardcoded box sizes) with ISO BMFF `ftyp`-based parsing and brand-aware handling.
Expands `_detect_format` to recognize additional image/video/audio types (e.g., GIF/BMP/TIFF, HEIC/AVIF, MOV/3GP/AVI, MP3/WAV/FLAC/OGG/M4A) and adds basic length guards to avoid short-buffer indexing issues, reducing false `.bin` fallbacks that can break verification.
Written by Cursor Bugbot for commit a7f2627.
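As a rough illustration of what brand-aware `ftyp` handling can look like, here is a minimal sketch. It is not the code from this PR: the function name `suffix_from_ftyp`, the set of brands checked, and the suffix strings are assumptions chosen for the example. It reads the major brand at bytes 8 to 12 and checks the specific brands before falling back to `.mp4`, so an M4A file is not swallowed by the generic case.

```python
def suffix_from_ftyp(data: bytes) -> str:
    """Map the major brand of a leading ftyp box to a suffix (illustrative)."""
    # Length guard: 4 bytes of box size + 4 bytes "ftyp" + 4 bytes major brand.
    if len(data) < 12 or data[4:8] != b"ftyp":
        return ".bin"
    brand = data[8:12]
    if brand in (b"heic", b"heix"):
        return ".heic"
    if brand in (b"avif", b"avis"):
        return ".avif"
    if brand == b"M4A ":          # note the trailing space in the brand
        return ".m4a"
    if brand == b"qt  ":          # QuickTime
        return ".mov"
    if brand.startswith(b"3gp"):  # 3gp4, 3gp5, ...
        return ".3gp"
    # Any other brand (isom, mp42, ...) is treated as generic MP4.
    return ".mp4"


m4a = (32).to_bytes(4, "big") + b"ftypM4A " + bytes(20)
mp4 = (24).to_bytes(4, "big") + b"ftypisom" + bytes(12)
print(suffix_from_ftyp(m4a))  # .m4a
print(suffix_from_ftyp(mp4))  # .mp4
```

In this sketch any unlisted brand falls through to `.mp4`, which is the kind of edge-case misclassification the risk note above refers to.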