Canon CR3 support. #121

Open
LebedevRI opened this issue Mar 2, 2018 · 206 comments

Member

@LebedevRI LebedevRI commented Mar 2, 2018

Though, until the production camera is actually released, and the full set of freely-licensed samples is contributed to https://raw.pixls.us/, it won't be merged.

@LebedevRI LebedevRI self-assigned this Mar 2, 2018
Member Author

@LebedevRI LebedevRI commented Mar 2, 2018

Does not appear to be TIFF.
Appears to be ISO Media / ISO/IEC 14496-12.
The actual content of the 'mdat' box appears to be JPEG-ish, but not good old lossless JPEG, because it contains M_DQT (the define-quantization-tables marker), which lossless JPEG does not use.

$ file canon_eos_m50_01.cr3  
canon_eos_m50_01.cr3: ISO Media
$ mpv canon_eos_m50_01.cr3
Playing: canon_eos_m50_01.cr3
 (+) Video --vid=1 (*) ( 6000x4000 1.000fps)
     Video --vid=2 (*) ( 1624x1080 1.000fps)
     Video --vid=3 (*) ( 6288x4056 1.000fps)
Failed to initialize a video decoder for codec ''.
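
For readers who want to reproduce the "ISO Media" observation without `file` or `mpv`, here is a minimal sketch of walking the top-level boxes of an ISO/IEC 14496-12 file. It assumes the standard box header (32-bit big-endian size, 4-byte type, optional 64-bit size when the size field is 1). The function name `iter_boxes` and the sample bytes are made up for illustration and are not taken from any CR3 file.

```python
import struct

def iter_boxes(data, start=0, end=None):
    """Yield (offset, box_type, total_size) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, raw_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" follows the type field.
            if pos + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of its container.
            size = end - pos
        if size < header or pos + size > end:
            break  # truncated or corrupt box: stop rather than guess
        yield pos, raw_type.decode("latin-1"), size
        pos += size

# Synthetic three-box buffer (hypothetical content, not a real CR3).
sample = (
    struct.pack(">I4s", 24, b"ftyp") + bytes(16)
    + struct.pack(">I4s", 8, b"moov")
    + struct.pack(">I4s", 12, b"mdat") + b"\xff\xd8\xff\xdb"
)
boxes = list(iter_boxes(sample))
for offset, box_type, size in boxes:
    print(f"@{offset:<6} | {box_type} [{size}]")
```

On a real file one would pass the file contents (for example from `open(path, "rb").read()`) instead of `sample`.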

Member Author

@LebedevRI LebedevRI commented Mar 4, 2018

Hmm, ok, this is strange. Got the ISO Media parser working, and hooked up a dummy Cr3Decoder.
The mdat chunk is NOT lossless JPEG, but it simply loads via libjpeg.
Those CR3s appear to be plain demosaiced JPEGs...
(Yes, this is an actual decoded image)
[image: 0001_]


@LibRaw LibRaw commented Mar 4, 2018

According to ffprobe, the mdat chunk contains four substreams.
The first one is exactly the JPEG preview, so from libjpeg's point of view this is a JPEG file plus ~30 MB of garbage (the real raw data + makernotes) at the tail.

Member Author

@LebedevRI LebedevRI commented Mar 4, 2018

Aha, thank you, that makes more sense. Let's see...


@LibRaw LibRaw commented Mar 4, 2018

Unfortunately, the two other data chunks (1624x1080 and 6288x4056; the first one is probably a low-res preview) are not lossless JPEGs either :(

Member Author

@LebedevRI LebedevRI commented Mar 4, 2018

(I'm still hoping they are JPEG w/arithmetic coding)


@LibRaw LibRaw commented Mar 4, 2018

It does not look like JPEG (e.g. no FF doubling).

Member Author

@LebedevRI LebedevRI commented Mar 4, 2018

There are 0xFF 0xFF sequences in files 05, 11 and 19 (looking at the second half of the file in the Okteta hex editor).
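
Instead of eyeballing a hex editor, one could tally which byte follows each 0xFF in a chunk. In JPEG entropy-coded data a literal 0xFF is followed by a stuffed 0x00, and 0xFF 0xFF can appear as fill bytes before a marker, so the distribution gives a hint (not proof) about whether a stream uses JPEG-style markers. This is only a sketch; `ff_followers` is a hypothetical helper name and the sample bytes are synthetic.

```python
from collections import Counter

def ff_followers(data):
    """Count which byte value follows each 0xFF byte in data."""
    counts = Counter()
    pos = data.find(b"\xff")
    while pos != -1 and pos + 1 < len(data):
        counts[data[pos + 1]] += 1
        pos = data.find(b"\xff", pos + 1)
    return counts

# Synthetic bytes: one stuffed 0xFF 0x00, one 0xFF 0xFF pair, one 0xFF 0xD9.
sample = bytes([0x12, 0xFF, 0x00, 0x34, 0xFF, 0xFF, 0xD9])
counts = ff_followers(sample)
for value, n in sorted(counts.items()):
    print(f"0xFF 0x{value:02X}: {n}")
```

A stream where 0xFF is followed by uniformly random values is unlikely to be JPEG entropy-coded data; a stream dominated by 0xFF 0x00 is more suggestive of it.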


@clanmills clanmills commented Mar 4, 2018

Thank you folks. I'm going to keep out of a discussion about the pixels and keep to my domain of metadata. Thanks to your comments, I've now heard of ffprobe.

822 rmills@rmillsmbp:~/gnu/github $ exiv2 -pR foo.jpg 
STRUCTURE OF JPEG FILE: foo.jpg
 address | marker       |  length | data
       0 | 0xffd8 SOI  
       2 | 0xffdb DQT   |     132 
     136 | 0xffc0 SOF0  |      17 
     155 | 0xffc4 DHT   |     418 
     575 | 0xffda SOS  
823 rmills@rmillsmbp:~/gnu/github $ ffprobe foo.jpg 
ffprobe version 3.4.1 Copyright (c) 2007-2017 the FFmpeg developers
  built with Apple LLVM version 9.0.0 (clang-900.0.39.2)
  configuration: --prefix=/opt/local --enable-swscale --enable-avfilter --enable-avresample --enable-libmp3lame --enable-libvorbis --enable-libopus --enable-librsvg --enable-libtheora --enable-libopenjpeg --enable-libmodplug --enable-libvpx --enable-libsoxr --enable-libspeex --enable-libass --enable-libbluray --enable-lzma --enable-gnutls --enable-fontconfig --enable-libfreetype --enable-libfribidi --disable-jack --disable-libopencore-amrnb --disable-libopencore-amrwb --disable-indev=jack --enable-opencl --disable-outdev=xv --enable-audiotoolbox --enable-videotoolbox --enable-sdl2 --mandir=/opt/local/share/man --enable-shared --enable-pthreads --cc=/usr/bin/clang --arch=x86_64 --enable-x86asm --enable-libx265 --enable-gpl --enable-postproc --enable-libx264 --enable-libxvid
  libavutil      55. 78.100 / 55. 78.100
  libavcodec     57.107.100 / 57.107.100
  libavformat    57. 83.100 / 57. 83.100
  libavdevice    57. 10.100 / 57. 10.100
  libavfilter     6.107.100 /  6.107.100
  libavresample   3.  7.  0 /  3.  7.  0
  libswscale      4.  8.100 /  4.  8.100
  libswresample   2.  9.100 /  2.  9.100
  libpostproc    54.  7.100 / 54.  7.100
Input #0, image2, from 'foo.jpg':
  Duration: 00:00:00.04, start: 0.000000, bitrate: 7605132 kb/s
    Stream #0:0: Video: mjpeg, yuvj422p(pc, bt470bg/unknown/unknown), 6000x4000, 25 tbr, 25 tbn, 25 tbc
824 rmills@rmillsmbp:~/gnu/github $ 

I'll work on the metadata. We've found the XMP data and I'm confident the Exif (and IPTC) data is in the Canon uuid box at the top of the file.

Later, I will investigate previews and ICC profiles.
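
The marker listing printed by `exiv2 -pR` above can be approximated in a few lines. This is a sketch, assuming the usual JPEG segment layout (a 2-byte marker followed by a 2-byte big-endian length that counts itself but not the marker); it stops at SOS just as the listing does. `list_jpeg_segments` is a hypothetical name, and the test data is synthetic, built only to match the segment lengths shown in the listing.

```python
import struct

NAMES = {0xFFD8: "SOI", 0xFFDB: "DQT", 0xFFC0: "SOF0",
         0xFFC4: "DHT", 0xFFDA: "SOS", 0xFFD9: "EOI"}

def list_jpeg_segments(data):
    """Return (address, marker, length) tuples up to and including SOS."""
    segments = []
    pos = 0
    while pos + 2 <= len(data):
        (marker,) = struct.unpack_from(">H", data, pos)
        if marker >> 8 != 0xFF:
            break  # not positioned on a marker
        if marker == 0xFFD8:
            segments.append((pos, marker, None))  # SOI carries no length
            pos += 2
            continue
        if marker in (0xFFDA, 0xFFD9) or pos + 4 > len(data):
            # Stop at SOS (entropy-coded data follows) or EOI.
            segments.append((pos, marker, None))
            break
        (length,) = struct.unpack_from(">H", data, pos + 2)
        segments.append((pos, marker, length))
        pos += 2 + length
    return segments

def segment(marker, length):
    # Synthetic segment: marker, length field, zero-filled payload.
    return struct.pack(">HH", marker, length) + bytes(length - 2)

sample = (b"\xff\xd8" + segment(0xFFDB, 132) + segment(0xFFC0, 17)
          + segment(0xFFC4, 418) + b"\xff\xda")
segments = list_jpeg_segments(sample)
for address, marker, length in segments:
    print(f"{address:8} | 0x{marker:04x} {NAMES.get(marker, '?'):5} | {length or ''}")
```

With these lengths the addresses come out as 0, 2, 136, 155 and 575, the same as in the listing, because each segment advances the position by 2 bytes of marker plus the stated length.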

Member Author

@LebedevRI LebedevRI commented Mar 4, 2018

> I'll work on the metadata. We've found the XMP data and I'm confident the Exif (and IPTC) data is in the Canon uuid box at the top of the file.
> Later, I will investigate previews and ICC profiles.

Thank you, @clanmills! Sounds awesome! :)


@lclevy lclevy commented Mar 11, 2018

Phil Harvey's ExifTool 10.82 can parse Canon CR3.
It seems there are at least 3 pictures: THMB, PRVW, and another in mdat.


@lclevy lclevy commented Mar 11, 2018

for http://img.photographyblog.com/reviews/canon_eos_m50/photos/canon_eos_m50_01.cr3
mdat contains "trak" pictures at 0x6d940, 0x37b030 and 0x546c70, and TIFF metadata at 0x2565b98

Member Author

@LebedevRI LebedevRI commented Mar 11, 2018

Yep, that much was understood. Need to continue working on that code.. Hopefully tomorrow?


@lclevy lclevy commented Mar 11, 2018

I started a CR3 format description here https://github.com/lclevy/canon_cr3 if you are interested in contributing...
I've done it for CR2 here http://lclevy.free.fr/cr2/ and https://github.com/lclevy/libcraw2


@clanmills clanmills commented Mar 12, 2018

Laurent

This is very helpful indeed. I wrote an email to you to ask if you were considering working on CR3. Then I made good progress and decided not to send the email.

I've documented my progress: Exiv2/exiv2#236

Team Exiv2 will have a meeting on May 5 at my home in England. This project will be discussed. There are many demands on the team and only a few contributors. In all honesty, I don't think we have the resources to implement support for CR3 in 2018. Exiv2/exiv2#225


@lclevy lclevy commented Mar 12, 2018

Hi,
lots of updates at https://github.com/lclevy/canon_cr3; you can subscribe to the project to follow them


@clanmills clanmills commented Mar 13, 2018

Thanks for doing this. It's looking good.

I'm puzzled by the claim (also made by @lclevy and @LibRaw) that there are 4 images in MDAT. I only see a single JPEG. I see there are 4 trak objects which are a hierarchy of items (CRAW, stsz, co64). Perhaps you could document the meaning of the objects in the trak hierarchy.

532 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ ./isobmffdump --dump stsz ~/Downloads/canon_eos_m50_01.cr3 
@0         | ftyp [24]
@24        | moov [28440]
@32        |   uuid [26216]
@26248     |   mvhd [108]
@26356     |   trak [484]
@26364     |     tkhd [92]
@26456     |     mdia [384]
@26464     |       mdhd [32]
@26496     |       hdlr [33]
@26529     |       minf [311]
@26537     |         vmhd [20]
@26557     |         dinf [36]
@26565     |           dref [28]
@26593     |         stbl [247]
@26601     |           stsd [128]
@26617     |             CRAW [112]
@26729     |           stts [24]
@26753     |           stsc [28]
@26781     |           stsz [20]
                       00000000  00 00 00 00 00 31 ea 2f 00 00 00 01                                                               .....1./....
@26801     |           free [15]
@26816     |           co64 [24]
...
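
To make the hex bytes in the `stsz` line of the dump less opaque: per ISO/IEC 14496-12, the `stsz` payload is a 4-byte version/flags field, a 32-bit default sample size, and a 32-bit sample count (followed by a per-sample table only when the default size is 0). `co64` is a version/flags field, an entry count, and that many 64-bit chunk offsets. The sketch below decodes both; the function names are hypothetical, and the `co64` payload is synthetic (the offset value is borrowed from an earlier comment purely as an illustration, not read from the file).

```python
import struct

def parse_stsz(payload):
    """Return the list of sample sizes stored in an stsz payload."""
    _version_flags, sample_size, sample_count = struct.unpack_from(">III", payload, 0)
    if sample_size != 0:
        # One default size shared by every sample.
        return [sample_size] * sample_count
    # Otherwise a table of per-sample sizes follows the fixed fields.
    return list(struct.unpack_from(f">{sample_count}I", payload, 12))

def parse_co64(payload):
    """Return the list of 64-bit chunk offsets stored in a co64 payload."""
    _version_flags, entry_count = struct.unpack_from(">II", payload, 0)
    return list(struct.unpack_from(f">{entry_count}Q", payload, 8))

# The 12 payload bytes shown in the dump above.
stsz_payload = bytes.fromhex("00000000 0031ea2f 00000001")
sizes = parse_stsz(stsz_payload)
print(sizes)

# Synthetic co64 payload with a single, illustrative offset.
co64_payload = struct.pack(">IIQ", 0, 1, 0x6D940)
offsets = parse_co64(co64_payload)
print([hex(o) for o in offsets])
```

Read this way, the dumped `stsz` describes a single sample of 0x31ea2f = 3271215 bytes, which is consistent with each trak pointing at one blob inside mdat rather than at a sequence of video frames.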


@lclevy lclevy commented Mar 13, 2018

There are 4 parts in mdat: 3 pictures (one JPEG, plus 2 with the new compression), then metadata like in TIFF


@clanmills clanmills commented Mar 13, 2018

@lclevy I do not agree that there is a tiff in the MDAT. Here's my evidence: Exiv2/exiv2#236 (comment)

We agree that there is at least one JPEG in the MDAT. I'm not convinced there are 3 pictures, however I have no proof to assert there is only 1 picture.


@clanmills clanmills commented Mar 13, 2018

@lclevy Apologies. I am mistaken. There is a valid tiff in the MDAT and ExifTool reads it.

@LebedevRI Thanks for pointing this out. By coincidence I discovered and documented this before your email arrived. Two heads thinking of the same thing! That's better than no heads thinking! Exiv2/exiv2#236 (comment)

Steady progress. Great Teamwork. Thanks everybody.


@lclevy lclevy commented Mar 13, 2018

updates: THMB and PRVW detailed. confirmed 3 jpeg and their sizes


@clanmills clanmills commented Mar 13, 2018

Well done, @lclevy. Very impressive progress. It's possible I now have sufficient information to parse all the CR3 metadata and thumbnails in Exiv2.


@lclevy lclevy commented Mar 29, 2018

please find parse_cr3.py ...


@clanmills clanmills commented Apr 3, 2018

@lclevy Happy Easter. We were on vacation last week and now I'm home and would like to look at parse_cr3.py. Where have you put it?


@lclevy lclevy commented Apr 3, 2018


@clanmills clanmills commented Apr 3, 2018

Thanks. That worked first time. I'll study that. Thanks very much.


@clanmills clanmills commented Jul 28, 2020

Please don't blame anybody. It will arrive, you'll see.

I'm very pleased with the book. I'm writing down everything I know about Metadata. In just over 2000 lines of C++ that accompanies the book, I am parsing all the major file formats and metadata categories (Exif, ICC, XMP and IPTC). The parsing strategy mirrors Exiv2. With only 2% of the code, I hope readers will find it easy to understand.

Not only do I hope that people will understand how to parse the Metadata. I hope they'll find it fascinating. I hope somebody will volunteer to maintain Exiv2 or even better to write a nice new library.

I've enjoyed my adventures in the world of metadata. And I've enjoyed working with Andreas, Luis, Dan, Niels and others. I know somebody will accept the challenge to work on this and I assure everybody that I will cooperate in every possible way with anyone who volunteers.

I hope the future maintainer receives support and encouragement from the community.


@hfiguiere hfiguiere commented Aug 6, 2020

> @hfiguiere and @lclevy If either of you would like to work with me on Exiv2 v0.27.3 to add support for CR3 and HEIF, I would be delighted to accept your help.

This week I merged my "exif" branch into the main branch for libopenraw.
What took most of my time was writing everything around getting Exif and MakerNote for the other formats; adding support for CR3 based on what I had was relatively simple in comparison, as it was already mostly done.

For CR3 the complicated part is the ISOMedia container, but for that I took Mozilla's own parser that they use in Firefox, written in Rust, and added a few patches on top for Canon's own sauce. I upstreamed what was generic enough (that was in 2018). If you can deal with that, then the heavy lifting should be done. libopenraw is still mostly C++ until I bite the bullet and rewrite it in Rust.

The CR3 MakerNote seems to be very close to the rest, and ExifTool helps here too. But that's probably the kind of thing that is subject to change.


@clanmills clanmills commented Aug 6, 2020

I've been working on HEIF and CR3 for my book this week. As everybody predicted, HEIF and CR3 can be derived from the existing JP2 support in Exiv2. So the future looks fine.

I've found the code and documents by @lclevy very useful and he is acknowledged in the book. https://clanmills.com/exiv2/book/

I released Exiv2 v0.27.3 without ISOBMFF support on 2020-06-30. Will the code from my book make it into Exiv2 v0.27.4 or v0.28? We'll see. Will the objectors modify their position? Ask them!


@lclevy lclevy commented Aug 6, 2020

> I've been working on HEIF and CR3 for my book this week. As everybody predicted, HEIF and CR3 can be derived from the existing JP2 support in Exiv2. So the future looks fine.
>
> I've found the code and documents by @lclevy very useful and he is acknowledged in the book. https://clanmills.com/exiv2/book/
>
> I released Exiv2 v0.27.3 without ISOBMFF support on 2020-06-30. Will the code from my book make it into Exiv2 v0.27.4 or v0.28? We'll see. Will the objectors modify their position? Ask them!

Hi, I think what is missing from the book is a reference list per format and container.

Laurent


@clanmills clanmills commented Aug 6, 2020

Thanks @lclevy. The book is work-in-progress.

Chapter 1 deals with file formats (TIFF, JPEG etc).

Chapter 2 deals with Metadata Standards (Exif, ICC, XMP and IPTC).

Chapter 8 discusses tvisitor.cpp which decodes the metadata in the image standards discussed in Chapter 1.

Chapter 13 discusses "Project Management" and the many challenges to maintain an open-source library.


@lclevy lclevy commented Aug 7, 2020

I mean external references (internet links)


@KfUT10yxdw KfUT10yxdw commented Aug 11, 2020

> I think the proportion between users and developers should be at least 1:100, so I'd say at least 50,000 users

darktable releases get about half the attention Lightroom releases get on news aggregators like Reddit, so we could always use the Distrowatch method and say that market research indicates that darktable has about half as many users as lightroom.


@alexvanderberkel alexvanderberkel commented Sep 6, 2020

Got feedback from Canon, however no statement regarding the license topic yet:

Dear Alex,

I apologize that it took a long time to reply.

I have spoken to my colleagues in Japan several times, but they are still hesitating to disclose the technical specifications of the file format to open source community.

Of course various IT companies e.g. Microsoft, Google or Apple are contributing to the open source world, but at the same time they are also keeping their proprietary technologies confidential. As same as them, RAW format is a crucial element for Canon so R&D team would like to keep it in the secure place.

I understood there might be various benefits for both customers and Canon thanks for your detailed explanation.
On the other hand, to support open source projects, R&D team will need to pay efforts for it, for example to make a document, to answer for questions or to provide sample codes fix problems.
Our development resources are also definitely limited and they'd like to focus on their own product development i.e. Digital Photo Professional for Windows, macOS and iOS.

From both points of view, we're not able to disclose RAW format to the open source community unfortunately.

Your voice is one of the valuable feedback from the community, we really appreciate.
From my personal opinion, they will need to change their mind for open source world in the near future.
Please wait a moment for the tipping point.

Thanks again for your suggestion and the discussion.
If you have further comments or questions, please feel free to reply to this e-mail.

Member

@johnny-bit johnny-bit commented Sep 6, 2020

Wow, you've managed to get "good" feedback. Just to make sure: please tell the person who responded to you that Open Source doesn't want to WRITE to the format or have 100% interoperability with things like Canon's Digital Photo Professional. The only thing we need, in the case of releasing READ-ONLY info, is a clear description of the exif, xmp, iptc etc. locations in the file (generally for reading exif) and the location of the raw sensor dump data (for reading raws and displaying them). Nothing more. Just "here's where the data is in the file, have fun writing your reader". That would be great.
In case there's no "read-only" non-NDA description of the format, the next best thing would obviously be a lawyer-speak statement, something like "Canon does not provide the CR3 spec to anybody, Canon does not guarantee that anything claiming to be able to read CR3 is able to take advantage of every aspect of the CR3 data format, Canon does not take any responsibility for the correctness of any CR3 reader except Canon's Digital Photo Professional, and Canon will not take any action against a party that writes a CR3 reader, or uses said unofficial CR3 reader, for the purpose of reading CR3 image files taken by a Canon camera in a legal manner".

Basically any actually official statement that Canon does not take liability for any unofficial CR3 reading and that they won't go after anybody who successfully reads CR3 data that is not encrypted.


@clanmills clanmills commented Sep 6, 2020

That's a very well written reply from Canon. It comes down to resources. Canon prioritise. They realise that they must cooperate with Adobe because without Photoshop support, the camera will fail in the market. They probably correctly think "It's very time consuming to deal with Adobe, Corel, Google and Microsoft; we don't have the people to deal with hundreds of open-source projects".

To read the CR3 (and HEIC) file, I obtained the ISOBMFF/JPEG2000 Specification w15177, which is available here: https://mpeg.chiariglione.org/standards/mpeg-4/iso-base-media-file-format/text-isoiec-14496-12-5th-edition

The Specification is a tough 246 page read and defines how information is encoded in the file. Exiv2 is interested in Exif, ICC, XMP, IPTC data and this is in the ‘meta’ box (which is a tree of other boxes). It’s all well defined. I have implemented the necessary code and documented it in my book Image Metadata and Exiv2 Architecture. https://clanmills.com/exiv2/book/ I hope to finish the book by the end of 2020.

I don't believe reading pixels from the image is defined in ISOBMFF and I know that's of great interest to darktable.

The trolls lost the patent wars in the 1990s concerning GIF and JPEG. That was 30 years ago and the world has changed since those days. Microsoft has done a 180 on open source. There is nothing in any of this correspondence that indicates Canon (or Apple) are hostile to our intent, however they aren't going to give us a "carte blanche" or any help.

From a metadata point of view, the ISOBMFF Specification is sufficient for our purposes and there is no evidence that Canon (or Apple) are hostile. Equally, it's clear that Canon will not actively support us. The obstacle to supporting CR3 and HEIC in Exiv2 is the objection of two open-source contributors.


@VioletGiraffe VioletGiraffe commented Sep 6, 2020

> this is in the ‘meta’ box (which is a tree of other boxes). It’s all well defined.

Interesting! I have tried this ISO-BMFF parser, and found out that it cannot "see" the box with CR3 metadata itself, it only sees a higher-level box but not its nested contents. Going to read the book for details, thank you!
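
One plausible reason a generic parser stops at the higher-level box is that it does not know whether an unfamiliar `uuid` box holds child boxes, so it treats the payload as opaque. Below is a minimal sketch of a recursive walker that optionally descends into `uuid` boxes after skipping the 16-byte extended type. That `uuid` payloads contain child boxes is an assumption made for this illustration; the container list, the function name `walk`, the child type `tst1`, and the sample bytes are all hypothetical.

```python
import struct

# Boxes whose payload is a plain sequence of child boxes
# (names taken from the trak hierarchy dumped earlier in the thread).
CONTAINERS = {"moov", "trak", "mdia", "minf", "dinf", "stbl"}

def walk(data, start=0, end=None, depth=0, descend_uuid=True):
    """Return a flat list of (depth, offset, box_type, size) tuples."""
    end = len(data) if end is None else end
    found = []
    pos = start
    while pos + 8 <= end:
        size, raw_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1 and pos + 16 <= end:
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            size = end - pos
        if size < header or pos + size > end:
            break  # not a plausible box: stop instead of guessing
        box_type = raw_type.decode("latin-1")
        found.append((depth, pos, box_type, size))
        body = pos + header
        if box_type == "uuid":
            body += 16  # skip the 16-byte extended type
            # Assumption: the rest of the uuid payload is a run of child boxes.
            if descend_uuid and body <= pos + size:
                found += walk(data, body, pos + size, depth + 1, descend_uuid)
        elif box_type in CONTAINERS:
            found += walk(data, body, pos + size, depth + 1, descend_uuid)
        pos += size
    return found

def box(box_type, payload):
    # Build a synthetic box with a 32-bit size header.
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

sample = box(b"moov", box(b"uuid", bytes(16) + box(b"tst1", b"\x01\x02"))
             + box(b"mvhd", bytes(4)))
tree = walk(sample)
for depth, offset, box_type, size in tree:
    print(f"@{offset:<6} | {'  ' * depth}{box_type} [{size}]")
```

Calling `walk(sample, descend_uuid=False)` shows the behaviour described in the comment above: the `uuid` box is listed but its nested contents are not.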


@clanmills clanmills commented Sep 6, 2020

I found that parser very useful in my study of the ISOBMFF Specification w15177. Happy to have a one-to-one with you on Zoom to discuss this further.


@homberghp homberghp commented Oct 2, 2020

I think some legislation is due here.
It should be something along the lines of
'Any data format that is supposed to be exchanged between devices must be documented well enough to be able to read said format.
This documentation must be accompanied by a reference implementation of a reader program, which should be open source,
so that the user of the data format can always have access, irrespective of operating system or application.'

This might be wishful thinking but the rationale is that the data belongs to the creator, which is not the device but the operator of the device.


@clanmills clanmills commented Oct 4, 2020

@homberghp I think you have a good idea here, Pieter. The standard should have an open-source reference implementation with a legal guarantee that anything it performs is patent free.

The reality is that the ISOBMFF/JPEG2000 Specification w15177_15444.pdf declares:

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.

ISO totally avoid any responsibility and push a heavy burden onto everybody else.

I don't object to Software Patents in principle. In the case of processing unencrypted metadata in an image, I have not heard a single reason how/why this could be a patent infringement.


@lclevy lclevy commented Nov 28, 2020

@alexvanderberkel could you please share your Canon contact with me privately?


@alexvanderberkel alexvanderberkel commented Nov 28, 2020

@lclevy Sure! Just let me know how you would like to get the details.


@lclevy lclevy commented Nov 28, 2020

@lorenzo2472 on twitter ?


@alexvanderberkel alexvanderberkel commented Nov 28, 2020

Any other option?


@lclevy lclevy commented Nov 28, 2020

Lclevy at free.fr (email)

Contributor

@cytrinox cytrinox commented Apr 29, 2021

I'm trying to add a simple ISOBMFF parser to rawspeed as a first step. If anyone is working in the dark on the full implementation, please let me know so I don't waste my time 😉


@lclevy lclevy commented Apr 29, 2021


@clanmills clanmills commented Apr 29, 2021


@hfiguiere hfiguiere commented Apr 29, 2021

Since we are all sharing URLs of our favourites, I have had CR3 parsing for quite a while in libopenraw. (LGPL-3.0 code)

https://gitlab.freedesktop.org/libopenraw/libopenraw/-/blob/master/lib/cr3file.cpp

  1. It uses a modified version of the Mozilla mp4 parser written in Rust (and used in Firefox). The code is in tree, MPL-2 licensed.
  2. It lacks the decompression code for the raw bitstream. I think libraw has something.
  3. However, it can extract the EXIF.
  4. It probably lacks other things.

But it's a decent starting point if licensing permits.

For the record RawSpeed took some code from libopenraw back when it was LGPL-2.0 (this was totally OK), and the reverse happened too.


@hfiguiere hfiguiere commented Apr 29, 2021

And just to add about the fear of patents: since the mp4 parser code is in mainline Mozilla Firefox, unlike h264, I assume due diligence was done. Just a personal assumption though.

IANAL though, so if you have doubts, make sure to get a legal opinion.


@hfiguiere hfiguiere commented Apr 29, 2021

And this code was written with the help of @lclevy's work mentioned previously (his documentation and tools) and no assistance from the vendor. Credit where it's due.


@lclevy lclevy commented Apr 30, 2021


@lclevy lclevy commented Apr 30, 2021
