Canon CR3 support. #236

Closed
1 task
LebedevRI opened this issue Mar 2, 2018 · 66 comments
Labels
request feature request or any other kind of wish

Comments

@LebedevRI

LebedevRI commented Mar 2, 2018

Just thought I'd file the meta-bug.

I'm investigating that new raw format from the point of view of raw image decoding in the RawSpeed library in darktable-org/rawspeed#121.
Right now exiv2 (0.25 001900) knows nothing about it.
It should eventually have the same support as for the usual CR2:

  • camera orientation
  • thumbnail
@clanmills
Collaborator

Oh! Something new to be supported. As a retired Adobe Engineer I think "Why don't they all use DNG?" One standard's enough! With my Exiv2/open-source hat on: "Why doesn't Canon contribute to Exiv2 to support their new standard?".

If you're willing to get involved, Team Exiv2 will support and mentor. Our existing commitments make CR3 very unlikely to get attention in 2018.

@LebedevRI
Author

Why don't they all use DNG?

Tell me about it :)

"Why doesn't Canon contribute to Exiv2 to support their new standard?".

Should they do that, make sure they get me/rawspeed a spec of the actual image compression algo they used :)

If you're willing to get involved

We'll see. Certainly not before the rawspeed part is done and working.
(Also, things like #214 in 2018 aren't too encouraging...)

@clanmills
Collaborator

Why are you worried about #214? I want to continue to support C++98 and add C++11/14 on top. The team want to dump C++98 support. What is your opinion?

@LebedevRI
Author

LebedevRI commented Mar 3, 2018

I'm not worried, I'm not working on that code :)
There are two different things here: usage of C++11/C++14 in the public API, and in the internals.
You might be able to keep the API usable with older standards (though I cannot comment on whether anyone will ever need that, and it may negatively affect users of newer standards).
But sticking with old standards for the internal code is a whole other matter...

@clanmills
Collaborator

So we can use any technology in the library and offer an API that can be used by C++98 (or C++11/14/17) application code.

One of the goals in v0.27 is to try to establish a "v1.0 API". Something like: "We hope the API for v0.27 will be the API for v1.0 and we will try to avoid changes to the API for v0.28". If the API for v0.28 is identical to v0.27, we will call it v1.0.

This is a very useful conversation. We're having an Exiv2 Team Meeting at my home in England on the weekend of Saturday May 5 and we will discuss this topic (and many others). You (and any of your open-source friends) are welcome. #225. It won't be all work; it's a social/team-building weekend and partners will also attend.

@clanmills
Collaborator

clanmills commented Mar 3, 2018

There are sample images available here: http://www.photographyblog.com/reviews/canon_eos_m50_review/preview_images

763 rmills@rmillsmbp:~/clanmills $ dmpf ~/Downloads/canon_eos_m50_01.cr3 | head -10
       0        0: ....ftypcrx ....  ->  00 00 00 18  f  t  y  p  c  r  x    00 00 00 01
    0x10       16: crx isom..o.moov  ->   c  r  x     i  s  o  m 00 00  o 18  m  o  o  v
    0x20       32: ..fhuuid........  ->  00 00  f  h  u  u  i  d 85 c0 b6 87 82 0f 11 e0
    0x30       48: ....F+jH...&CNCV  ->  81 11 f4 ce  F  +  j  H 00 00 00  &  C  N  C  V
    0x40       64: CanonCR3_001/00.  ->   C  a  n  o  n  C  R  3  _  0  0  1  /  0  0 2e
    0x50       80: 09.00/00.00.00..  ->   0  9 2e  0  0  /  0  0 2e  0  0 2e  0  0 00 00
    0x60       96: .\CCTP..........  ->  00  \  C  C  T  P 00 00 00 00 00 00 00 01 00 00
    0x70      112: ......CCDT......  ->  00 03 00 00 00 18  C  C  D  T 00 00 00 00 00 00
    0x80      128: ..............CC  ->  00 10 00 00 00 00 00 00 00 01 00 00 00 18  C  C
    0x90      144: DT..............  ->   D  T 00 00 00 00 00 00 00 01 00 00 00 00 00 00

I can see embedded TIFFs (Exif metadata) at 296 and 688:

764 rmills@rmillsmbp:~/clanmills $ dmpf ~/Downloads/canon_eos_m50_01.cr3 | grep 'I  I' | head -3
   0x120      288: ....CMT1II*.....  ->  00 00 01 88  C  M  T  1  I  I  * 00 08 00 00 00
   0x2b0      688: II*.....'.......  ->   I  I  * 00 08 00 00 00  ' 00 9a 82 05 00 01 00
   0x6d0     1744: ...8CMT3II*.....  ->  00 00 14  8  C  M  T  3  I  I  * 00 08 00 00 00
765 rmills@rmillsmbp:~/clanmills $ 

And can extract them:

786 rmills@rmillsmbp:~/clanmills $ dd bs=1 skip=296 if=~/Downloads/canon_eos_m50_01.cr3 count=200000 | exiv2 -pa -
200000+0 records in
200000+0 records out
200000 bytes (200 kB) copied, 0.795933 s, 251 kB/s
Exif.Image.ImageWidth                        Short       1  6000
Exif.Image.ImageLength                       Short       1  4000
Exif.Image.BitsPerSample                     Short       3  8 8 8
Exif.Image.Compression                       Short       1  JPEG (old-style)
Exif.Image.Make                              Ascii       6  Canon
Exif.Image.Model                             Ascii      14  Canon EOS M50
Exif.Image.Orientation                       Short       1  left, bottom
Exif.Image.XResolution                       Rational    1  72
Exif.Image.YResolution                       Rational    1  72
Exif.Image.ResolutionUnit                    Short       1  inch
Exif.Image.DateTime                          Ascii      20  2018:02:21 12:00:56
Exif.Image.Artist                            Ascii       1  
Exif.Image.Copyright                         Ascii       1  
787 rmills@rmillsmbp:~/clanmills $ 
768 rmills@rmillsmbp:~/clanmills $ dd bs=1 skip=688 if=~/Downloads/canon_eos_m50_01.cr3 count=200000 | exiv2 -pa -
200000+0 records in
200000+0 records out
200000 bytes (200 kB) copied, 0.792246 s, 252 kB/s
Exif.Image.ExposureTime                      Rational    1  1/80 s
Exif.Image.FNumber                           Rational    1  F6.3
Exif.Image.ExposureProgram                   Short       1  Aperture priority
Exif.Image.ISOSpeedRatings                   Short       1  10000
Exif.Image.DateTimeOriginal                  Ascii      20  2018:02:21 12:00:56
Exif.Image.ShutterSpeedValue                 SRational   1  1/83 s
Exif.Image.ApertureValue                     Rational    1  F6.4
Exif.Image.ExposureBiasValue                 SRational   1  0 EV
Exif.Image.MeteringMode                      Short       1  Multi-segment
Exif.Image.Flash                             Short       1  No flash
Exif.Image.FocalLength                       Rational    1  45.0 mm
769 rmills@rmillsmbp:~/clanmills $ 

Here's the XMP:

556 rmills@rmillsmbp:/Applications $ dd bs=1 count=300 skip=$((28480+8)) if=~/Downloads/canon_eos_m50_01.cr3 2>/dev/null | xmllint --format -
<?xml version="1.0"?>
<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/" rdf:about="">
      <xmp:Rating>0</xmp:Rating>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
557 rmills@rmillsmbp:/Applications $ 

This project doesn't look painful. We'll have to write an image handler cr3image.cpp which will:

  1. Detect the image type
  2. cr3image->readMetadata()
  3. cr3image->writeMetadata()
  4. add to the test suite

It's probably similar to src/cr2image.cpp which is 231 lines of code.
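
Step 1 (detecting the image type) could be sketched roughly as below. This is only an illustration based on the hex dump earlier in this comment, where the file starts with a 24-byte `ftyp` box whose major brand is `crx `. The function name `looksLikeCr3` is made up for this sketch and is not an existing Exiv2 function. It also assumes every CR3 file uses that brand, which has only been observed on this one sample.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of Exiv2): returns true when the buffer starts
// with an ISO BMFF 'ftyp' box whose major brand is "crx ", as seen in the
// hex dump of canon_eos_m50_01.cr3.
bool looksLikeCr3(const uint8_t* data, size_t size)
{
    // Need the 4-byte box size, the 4-byte type and the 4-byte major brand.
    if (data == nullptr || size < 12) return false;
    return std::memcmp(data + 4, "ftyp", 4) == 0
        && std::memcmp(data + 8, "crx ", 4) == 0;
}
```

Feeding it the first 16 bytes of the sample (`00 00 00 18 66 74 79 70 63 72 78 20 00 00 00 01`) should return true. A JPEG header should return false.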

@clanmills
Collaborator

I've made some progress with this. I've looked at a couple of libraries for ISO BMFF support. I've found the (Mac-only) ISOBMFF Explorer useful for understanding those files: https://imazing.com/isobmff/download

I've discovered a one-file project which dumps ISO BMFF files. This looks like a great starting point and I've invited the author to join team Exiv2: pyke369/isobmffdump#1

Here's output from the canon CR3 file:

749 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ isobmffdump  ~/Downloads/canon_eos_m50_01.cr3 
@0         | ftyp [24]
@24        | moov [28440]
@32        |   uuid [26216]
@26248     |   mvhd [108]
@26356     |   trak [484]
...
@28464     | uuid [65560]
@94024     | uuid [416007]
@510031    | mdat [38025680]
@38535711  | end
750 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ 

The uuid box at 32 is the Canon magic. I don't know how to decode it at the moment.

The uuid box at 28464 is the XMP metadata and uses the same UUID as jp2image.cpp

754 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ grep -i kJp2UuidXmp ~/gnu/github/exiv2/exiv2/src/jp2image.cpp  | head -1
const unsigned char kJp2UuidXmp[]  = "\xbe\x7a\xcf\xcb\x97\xa9\x42\xe8\x9c\x71\x99\x94\x91\xe3\xaf\xac";
755 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ 
@0         | ftyp [24]
@24        | moov [28440]
@32        |   uuid [26216]
               00000000  85 c0 b6 87 82 0f 11 e0 81 11 f4 ce 46 2b 6a 48  00 00 00 26 43 4e 43 56 43 61 6e 6f 6e 43 52 33  ............F+jH...&CNCVCanonCR3
               00000020  5f 30 30 31 2f 30 30 2e 30 39 2e 30 30 2f 30 30  2e 30 30 2e 30 30 00 00 00 5c 43 43 54 50 00 00  _001/00.09.00/00.00.00...\CCTP..
               00000040  00 00 00 00 00 01 00 00 00 03 00 00 00 18 43 43  44 54 00 00 00 00 00 00 00 10 00 00 00 00 00 00  ..............CCDT..............
               00000060  00 01 00 00 00 18 43 43 44 54 00 00 00 00 00 00  00 01 00 00 00 00 00 00 00 02 00 00 00 18 43 43  ......CCDT....................CC
               00000080  44 54 00 00 00 00 00 00 00 00 00 00 00 00 00 00  00 03 00 00 00 5c 43 54 42 4f 00 00 00 04 00 00  DT...................\CTBO......
               000000a0  00 01 00 00 00 00 00 00 6f 30 00 00 00 00 00 01  00 18 00 00 00 02 00 00 00 00 00 01 6f 48 00 00  ........o0..................oH..
...
               00006640  0e 09 e4 73 cd 5d 3c a7 2d a5 2e 7a 74 52 66 53  cd b3 1a 8a d5 2a b7 f7 1f ff d9 00 00 a1 e0 10  ...s.]<.-..ztRfS.....*..........
@26248     |   mvhd [108]
@26356     |   trak [484]
...
@26840     |   trak [584]
...
@28464     | uuid [65560]
             00000000  be 7a cf cb 97 a9 42 e8 9c 71 99 94 91 e3 af ac  3c 3f 78 70 61 63 6b 65 74 20 62 65 67 69 6e 3d  .z....B..q......<?xpacket begin=
             00000020  27 ef bb bf 27 20 69 64 3d 27 57 35 4d 30 4d 70  43 65 68 69 48 7a 72 65 53 7a 4e 54 63 7a 6b 63  '...' id='W5M0MpCehiHzreSzNTczkc
             00000040  39 64 27 3f 3e 3c 78 3a 78 6d 70 6d 65 74 61 20  78 6d 6c 6e 73 3a 78 3d 22 61 64 6f 62 65 3a 6e  9d'?><x:xmpmeta xmlns:x="adobe:n
             00000060  73 3a 6d 65 74 61 2f 22 3e 3c 72 64 66 3a 52 44  46 20 78 6d 6c 6e 73 3a 72 64 66 3d 22 68 74 74  s:meta/"><rdf:RDF xmlns:rdf="htt
             00000080  70 3a 2f 2f 77 77 77 2e 77 33 2e 6f 72 67 2f 31  39 39 39 2f 30 32 2f 32 32 2d 72 64 66 2d 73 79  p://www.w3.org/1999/02/22-rdf-sy
             000000a0  6e 74 61 78 2d 6e 73 23 22 3e 3c 72 64 66 3a 44  65 73 63 72 69 70 74 69 6f 6e 20 72 64 66 3a 61  ntax-ns#"><rdf:Description rdf:a
             000000c0  62 6f 75 74 3d 22 22 20 78 6d 6c 6e 73 3a 78 6d  70 3d 22 68 74 74 70 3a 2f 2f 6e 73 2e 61 64 6f  bout="" xmlns:xmp="http://ns.ado
             000000e0  62 65 2e 63 6f 6d 2f 78 61 70 2f 31 2e 30 2f 22  3e 3c 78 6d 70 3a 52 61 74 69 6e 67 3e 30 3c 2f  be.com/xap/1.0/"><xmp:Rating>0</
             00000100  78 6d 70 3a 52 61 74 69 6e 67 3e 3c 2f 72 64 66  3a 44 65 73 63 72 69 70 74 69 6f 6e 3e 3c 2f 72  xmp:Rating></rdf:Description></r
             00000120  64 66 3a 52 44 46 3e 3c 2f 78 3a 78 6d 70 6d 65  74 61 3e 20 20 20 20 20 20 20 20 20 20 20 20 20  df:RDF></x:xmpmeta>             
             00000140  20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20                                  
             00000160  20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20                                  
...
@510031    | mdat [38025680]
@38535711  | end

There's another uuid box at 94024. I don't know what that is yet. It's not Iptc:

760 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ isobmffdump --dump uuid ~/Downloads/canon_eos_m50_01.cr3 | grep uuid
@32        |   uuid [26216]
@28464     | uuid [65560]
@94024     | uuid [416007]
761 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ 
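
For anyone who wants to reproduce the box listing without isobmffdump, here is a minimal sketch of walking ISO BMFF box headers. It is not the isobmffdump code. The names `Box`, `readBE` and `listBoxes` are invented for this example. It assumes the standard layout: a 4-byte big-endian size, a 4-byte type, and an optional 8-byte "largesize" when the size field is 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Illustrative types and names; none of this is Exiv2 or isobmffdump API.
struct Box {
    uint64_t    offset;  // position of the box header in the buffer
    uint64_t    size;    // total box size, header included
    std::string type;    // four-character code, e.g. "moov"
};

// Reads a big-endian unsigned integer of the given width.
static uint64_t readBE(const uint8_t* p, int bytes)
{
    uint64_t v = 0;
    for (int i = 0; i < bytes; ++i) v = (v << 8) | p[i];
    return v;
}

// Walks the sibling boxes stored in data[begin, end) and returns their headers.
std::vector<Box> listBoxes(const std::vector<uint8_t>& data, uint64_t begin, uint64_t end)
{
    assert(begin <= end);
    std::vector<Box> boxes;
    if (end > data.size()) end = data.size();
    uint64_t pos = begin;
    while (pos < end && end - pos >= 8) {
        uint64_t    size   = readBE(&data[pos], 4);
        uint64_t    header = 8;
        std::string type(reinterpret_cast<const char*>(&data[pos + 4]), 4);
        if (size == 1) {                 // 64-bit "largesize" follows the type
            if (end - pos < 16) break;
            size   = readBE(&data[pos + 8], 8);
            header = 16;
        } else if (size == 0) {          // box runs to the end of the range
            size = end - pos;
        }
        if (size < header || size > end - pos) break;  // corrupt or truncated
        boxes.push_back({pos, size, type});
        pos += size;
    }
    return boxes;
}
```

Calling `listBoxes(data, 0, data.size())` gives the top-level boxes. Children of a container such as `moov` can be listed by calling it again on the range just after that box's 8-byte header. That matches the listing above, where `moov` sits at 24 and its first child `uuid` at 32.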

@clanmills
Collaborator

I've raised a discussion thread on Pixls.us to see if somebody knows the specification for those unidentified uuid box structures. https://discuss.pixls.us/t/new-canon-cr3-file-specification/6881

As I mentioned on Pixls.us, I have a (retired) Canon friend who may be able to help. However, Canon is a huge company and it's likely that my friend will have no contacts in digital camera software engineering.

@LebedevRI
Author

I too was able to make progress. It's strange, those raws are demosaiced already :/
darktable-org/rawspeed#121

@clanmills
Collaborator

clanmills commented Mar 4, 2018

Good News. I don't think this is going to be very difficult. The file is ISO BMFF format. Almost all the data in the file is in the MDAT (which I thought was intended for audio/music). I thought ISO BMFF was intended for video. Confused? (I'm easily confused).

I confirm that the MDAT "appears" to be a plain 38mb JPEG 4000x6000 pixels.

513 rmills@rmillsmbp:~/gnu/github $ dd bs=1 skip=$((510031+16)) count=$((38025680-16)) if=~/Downloads/canon_eos_m50_01.cr3 > foo.jpg
38025664+0 records in
38025664+0 records out
38025664 bytes (38 MB) copied, 418.962 s, 90.8 kB/s
514 rmills@rmillsmbp:~/gnu/github $ exiv2 foo.jpg 
File name       : foo.jpg
File size       : 38025664 Bytes
MIME type       : image/jpeg
Image size      : 6000 x 4000
foo.jpg: No Exif data found in the file
515 rmills@rmillsmbp:~/gnu/github $ 

This comment is relevant: darktable-org/rawspeed#121 (comment) From Exiv2's point of view, the MDAT previews are interesting.

The Exif (and IPTC) metadata is almost certainly in the Canon uuid box near the top of the file. We have already identified the uuid box with the XMP metadata.

So, we're making good progress.

@LebedevRI
Author

I confirm that the MDAT "appears" to be a plain 38mb JPEG 4000x6000 pixels.

As per darktable-org/rawspeed#121 (comment), what you are looking at, is the largest embedded thumbnail, which exiv2 should ideally be able to provide via the usual means.

$ ffprobe canon_eos_m50_01.cr3 
...
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'canon_eos_m50_01.cr3':
  Metadata:
    major_brand     : crx 
    minor_version   : 1
    compatible_brands: crx isom
    creation_time   : 2018-02-21T12:00:56.000000Z
  Duration: 00:00:01.00, start: 0.000000, bitrate: 308285 kb/s
    Stream #0:0(eng): Video: none (CRAW / 0x57415243), none, 6000x4000, 26169 kb/s, 1 fps, 1 tbr, 1 tbn, 1 tbc (default)      <- THIS ONE
    Metadata:
      creation_time   : 2018-02-21T12:00:56.000000Z
    Stream #0:1(eng): Video: none (CRAW / 0x57415243), none, 1624x1080, 14829 kb/s, 1 fps, 1 tbr, 1 tbn, 1 tbc (default)      <- LIKELY TOO
    Metadata:
      creation_time   : 2018-02-21T12:00:56.000000Z
    Stream #0:2(eng): Video: none (CRAW / 0x57415243), none, 6288x4056, 262877 kb/s, 1 fps, 1 tbr, 1 tbn, 1 tbc (default)   <- has to be raw, dimensions/bitrate look about right...
    Metadata:
      creation_time   : 2018-02-21T12:00:56.000000Z
    Stream #0:3(eng): Data: none (CTMD / 0x444D5443), 328 kb/s (default)
    Metadata:
      creation_time   : 2018-02-21T12:00:56.000000Z

The second one is also a thumbnail, likely.
And only the third one is the raw data chunk

@clanmills
Collaborator

clanmills commented Mar 5, 2018

Progress with decoding the Canon uuid box. It's a linked list of records:

893 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ ./isobmffdump ~/Downloads/canon_eos_m50_01.cr3 | head -5
@0         | ftyp [24]
@24        | moov [28440]
@32        |   uuid [26216]
@26248     |   mvhd [108]
@26356     |   trak [484]

I can extract the uuid payload with this (24 = 8 bytes for the box header, i.e. the 4-byte size plus the 4-byte 'uuid' type, + 16 bytes for the UUID itself):

894 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ dd skip=$((32+24)) count=$((26216-24)) bs=1 if=~/Downloads/canon_eos_m50_01.cr3 > canon.uuid
26192+0 records in
26192+0 records out
26192 bytes (26 kB) copied, 0.261663 s, 100 kB/s
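
The same arithmetic as a small sketch, in case it helps when this moves into real code. The names `uuidPayload`, `classifyUuid`, `Span` and `UuidKind` are invented for this example. The two UUID byte strings are copied from the hex dumps in this thread: the Canon box at offset 32, and the XMP box that matches kJp2UuidXmp. The sketch assumes the box uses the compact 8-byte header.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// The two UUIDs visible in the hex dumps in this thread.
static const uint8_t kCanonUuid[16] = {0x85,0xc0,0xb6,0x87,0x82,0x0f,0x11,0xe0,
                                       0x81,0x11,0xf4,0xce,0x46,0x2b,0x6a,0x48};
static const uint8_t kXmpUuid[16]   = {0xbe,0x7a,0xcf,0xcb,0x97,0xa9,0x42,0xe8,
                                       0x9c,0x71,0x99,0x94,0x91,0xe3,0xaf,0xac};

enum class UuidKind { Canon, Xmp, Unknown };

// uuid must point at the 16 bytes that follow the 'uuid' box header.
UuidKind classifyUuid(const uint8_t* uuid)
{
    if (std::memcmp(uuid, kCanonUuid, 16) == 0) return UuidKind::Canon;
    if (std::memcmp(uuid, kXmpUuid, 16) == 0)   return UuidKind::Xmp;
    return UuidKind::Unknown;
}

struct Span { uint64_t offset; uint64_t length; };

// Payload of a uuid box: skip the 8-byte box header and the 16-byte UUID.
Span uuidPayload(uint64_t boxOffset, uint64_t boxSize)
{
    const uint64_t skip = 8 + 16;
    assert(boxSize >= skip);
    return Span{boxOffset + skip, boxSize - skip};
}
```

For the Canon box, `uuidPayload(32, 26216)` gives offset 56 and length 26192, which are the `skip` and `count` values passed to dd above.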

I've written a utility dumper.cpp (code below):

970 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ ./dumper canon.uuid 
dumping canon.uuid
 offset| length| data
      4|     34| CNCVCanonCR3_001/00.09.00/00.00.00
     42|     88| CCTP................CCDT....................CCDT................ _._
    134|     88| CTBO..............o0..................oH......Y............O.... _._
    226|      6| free..
    236|    388| CMT1II*...............p......................................... _._
    628|   1060| CMT2II*.....'........................."...........'........'..0. _._
   1692|   5172| CMT3II*...../.....1...B..............................."......... _._
   6868|   1812| CMT4II*......................................................... _._
   8684|  17508| THMB.......x..DK................................................ _._
971 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ 

CMT1, 2, 3 and 4 look like TIFFs. Exif metadata, probably. Why 4? Don't know yet.
THMB is probably a thumbnail (JPEG perhaps)?

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <iostream>

static bool isBigEndianPlatform()
{
    union {
        uint32_t i;
        char c[4];
    } e = { 0x01000000 };

    return e.c[0]?true:false;
}

// https://stackoverflow.com/questions/2182002/convert-big-endian-to-little-endian-in-c-without-using-provided-func
// The swap is done on an unsigned copy: shifting a signed value into the sign bit is not portable.
static int32_t bigEndianToInt(int32_t num)
{
    uint32_t n = (uint32_t) num;
    return (int32_t) ( isBigEndianPlatform()
              ? n
              : ((n>>24)&0xff)       // move byte 3 to byte 0
              | ((n<<8)&0xff0000)    // move byte 1 to byte 2
              | ((n>>8)&0xff00)      // move byte 2 to byte 1
              | ((n<<24)&0xff000000) // byte 0 to byte 3
              );
}

void dump(void* buffer,int32_t length)
{
    bool bEarly = false;
    if ( length > 64 ) {
        bEarly = true;
        length = 64 ;
    }
    for ( int32_t i = 0 ; i < length ; i++ ) {
        int c = (int) ((char*)buffer) [i];
        printf("%c", ((c<32) || (c>=128)) ? '.' : c );
    }
    if ( bEarly ) printf(" _._");
    printf("\n");
}

int main(int argc, const char* argv[])
{
    int         result   = 0;
    FILE*       f        = NULL;
    const char* program  = argv[0];
    const char* filename = argv[1];

    if ( argc == 2 ) {
        f = fopen(filename,"rb");
        if ( !f ) {
            fprintf(stderr,"unable to open %s\n",filename);
            result = 1;
        }
    } else {
        fprintf(stderr,"syntax: %s file\n",program);
        result = 1; // report the usage error to the caller
    }

    if ( f ) {
        printf("dumping %s\n",filename);
        printf(" offset| length| data\n");
        int32_t length ;
        do {
            length=0;
            if ( fread(&length,1,4,f) == 4 ) {
                length=bigEndianToInt(length);
                if ( length > 4 ) {
                    length -= 4;
                    printf("%7ld|%7d| ",ftell(f),length);
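                    void* buffer = ::malloc(length);
                    // only dump the record if it was allocated and read completely
                    if ( buffer && fread(buffer,length,1,f) == 1 ) {
                        dump(buffer,length);
                    }
                    ::free(buffer);
                }
                // fseek(f,(length-4),SEEK_CUR);
            }
        } while (length > 0);
        fclose(f);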
    }

    return result;
}

@LebedevRI
Author

CMT1, 2, 3 and 4 look like TIFFs. Exif metadata, probably. Why 4? Don't know yet.

Well, we do have 3 actual Streams in the mdat, so one per each of those + one general IFD?

@clanmills
Collaborator

Good Morning, Roman. Hope the weather's warmer for you this week.

I'm going to leave you to investigate the MDAT because I believe it's a JPEG with no metadata. I don't see an IFD in the MDAT. If you know how to locate that, I'll investigate that after I understand the moov/uuid. And we've already identified the XMP.

I'm focused on the Canon uuid box in the moov. I've run out of time for today. I hope tomorrow morning to know enough to read the metadata.

@LebedevRI
Author

(I was talking about the fact that the CR3 ISO BMFF has exactly 4 trak boxes, so my guess would be that it's for each of those)

@clanmills
Collaborator

clanmills commented Mar 5, 2018

Thanks, Roman. We'll figure out this puzzle. 4 trak boxes. Yes - that's right.

Last discovery for Monday morning before I get on with what I should be doing! The THMB record contains a JPEG 120x160 thumbnail.

It's 20 bytes into the record. Why 20? 4 for THMB followed by 16 bytes. For extracting thumbnails, we never need to know. Exiv2 has never been able to edit thumbnails. Exiv2 can relocate THMB in the file as a binary blob.

919 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ dd bs=1 skip=$((8684+20)) count=$((17508-20)) if=canon.uuid > canon.jpg
17488+0 records in
17488+0 records out
17488 bytes (17 kB) copied, 0.177006 s, 98.8 kB/s
920 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ exiv2 canon.jpg 
File name       : canon.jpg
File size       : 17488 Bytes
MIME type       : image/jpeg
Image size      : 160 x 120
canon.jpg: No Exif data found in the file
921 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ 
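
The dd step above, as a sketch in code. It assumes the record layout observed on this one sample: the four characters `THMB`, then 16 bytes that are not decoded here, then the JPEG stream. The function name `thumbnailFromThmb` is made up for this example. The check for the JPEG start-of-image marker (FF D8) is an extra sanity test and does not come from any Canon documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// record: the bytes of one Canon record as printed by dumper.cpp, i.e. starting
// at the four-character type ("THMB") with the 4-byte length already consumed.
// Returns the embedded JPEG, or an empty vector if the layout does not match.
std::vector<uint8_t> thumbnailFromThmb(const std::vector<uint8_t>& record)
{
    const size_t kJpegOffset = 4 + 16;  // "THMB" + 16 bytes of undecoded header
    if (record.size() < kJpegOffset + 2) return {};
    if (std::memcmp(record.data(), "THMB", 4) != 0) return {};
    // A JPEG stream starts with the SOI marker FF D8.
    if (record[kJpegOffset] != 0xff || record[kJpegOffset + 1] != 0xd8) return {};
    return std::vector<uint8_t>(record.begin() + kJpegOffset, record.end());
}
```

For the sample file the record is 17508 bytes long, so the returned vector would hold 17488 bytes, the same size as canon.jpg above.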

@clanmills
Collaborator

Phil Harvey (ExifTool) knows about .CR3 files:

995 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ exiftool ~/Downloads/canon_eos_m50_01.cr3 
ExifTool Version Number         : 10.82
...
Compatible Brands               : crx , isom
Compressor Version              : CanonCR3_001/00.09.00/00.00.00
Image Width                     : 6000
Image Height                    : 4000
Bits Per Sample                 : 8 8 8
Compression                     : JPEG (old-style)
...

There is also a helpful document. See "Canon uuid Tags" on this page: https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/Canon.html

Tag ID   Values / Notes       Comment
CCTP     CanonCCTP
CMT1     EXIF Tags
CMT2     EXIF Tags
CMT3     Canon Tags           https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/Canon.html
CMT4     Canon Unknown IFD
CNCV     Compressor Version
CNTH     Canon CNTH Tags
THMB     Thumbnail Image

@clanmills
Collaborator

I think I've got a good understanding about the scope of this project. We're having a team meeting in May #225 when we'll discuss the specification and schedule for v0.27.

@clanmills
Collaborator

clanmills commented Mar 13, 2018

I've written code to search for TIFF structures in a binary file. It's "simple minded": it only searches for the byte pattern II*\0 or MM\0* and reports the location. Discovered TIFFs can be extracted with the utility dd. Although there is binary data with the II*\0 fingerprint in MDAT, none of these structures is a valid TIFF. It's just a coincidence.

615 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ tiffs ~/Downloads/canon_eos_m50_01.cr3 
searching /Users/rmills/Downloads/canon_eos_m50_01.cr3
    offset | type
       296 | II
       688 | II
      1752 | II
      6906 | II
      6928 | II
  38494795 | II
  38494825 | II
  38496517 | II
  38534235 | II
616 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $

Here's the code:

// tiffs.cpp - search for tiff structures in a binary file
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>   // memcpy, memcmp, memmove
#include <iostream>

int main(int argc, const char* argv[])
{
    int         result   = 0;
    FILE*       f        = NULL;
    const char* program  = argv[0];
    const char* filename = argv[1];

    if ( argc == 2 ) {
        f = fopen(filename,"rb");
        if ( !f ) {
            fprintf(stderr,"unable to open %s\n",filename);
            result = 1;
        }
    } else {
        fprintf(stderr,"syntax: %s file\n",program);
        result = 1; // report the usage error to the caller
    }

    if ( f ) {
        printf("searching %s\n",filename);
        printf("    offset | type\n");
        // Slide a 4-byte window over the file one byte at a time, so that a
        // partial match (e.g. "III*") can never hide a real signature.
        const char II[4]   = { 'I','I','*','\0' };
        const char MM[4]   = { 'M','M','\0','*' };
        char       buff[4] = { 0, 0, 0, 0 };
        long       pos     = 0;   // number of bytes read so far
        int        c;
        while ( (c = fgetc(f)) != EOF ) {
            memmove(buff,buff+1,3);
            buff[3] = (char) c;
            pos++;
            if ( pos >= 4 && (memcmp(II,buff,4) == 0 || memcmp(MM,buff,4) == 0) ) {
                printf("%10ld | %c%c\n",pos-4,buff[0],buff[1]);
            }
        }
        fclose(f);
        f=NULL;
    }

    return result;
}

@lclevy

lclevy commented Mar 13, 2018

In mdat, to be more accurate, there are IFDs, that are parsed by ExifTool 10.82

@clanmills
Collaborator

Here's a snippet from the -verbose output from ExifTool.

621 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ exiftool -verbose ~/Downloads/canon_eos_m50_01.cr3

CTMD (SubDirectory) -->

  • [CTMD type 1 directory, 18 bytes]
  • [CTMD type 3 directory, 10 bytes]
  • [CTMD type 4 directory, 18 bytes]
  • [CTMD type 5 directory, 34 bytes]
  • [CTMD type 7 directory, 1716 bytes]
    | ExifInfo (SubDirectory) -->
    | + [TIFF directory]
    | | ExifByteOrder = II
    | | + [IFD0 directory with 1 entries]
    | | | 0) SceneCaptureType = 0
    | MakerNoteCanon (SubDirectory) -->
629 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ tiffs ~/Downloads/canon_eos_m50_01.cr3 
searching /Users/rmills/Downloads/canon_eos_m50_01.cr3
    offset | type
       296 | II
       688 | II
      1752 | II
      6906 | II
      6928 | II
  38494795 | II  <--- this is a valid tiff
  38494825 | II
  38496517 | II
  38534235 | II
630 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ dd bs=1 skip=38494795 count=10000 if=~/Downloads/canon_eos_m50_01.cr3 > foo1.tif
10000+0 records in
10000+0 records out
10000 bytes (10 kB) copied, 0.11086 s, 90.2 kB/s
631 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ exiv2 -pS foo1.tif 
STRUCTURE OF TIFF FILE (II): foo1.tif
 address |    tag                           |      type |    count |    offset | value
      10 | 0xa406 SceneCaptureType          |     SHORT |        1 |         0 | 0
END foo1.tif
632 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ 
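
To separate a real TIFF from a coincidental II*\0 hit without running dd and exiv2 on every candidate, something like the following heuristic might do. It is only a sketch. The name `plausibleTiff` is invented. The rule it applies (the first IFD must fit in the buffer and every entry must carry a field type in the range 1 to 13) is an assumption based on the baseline TIFF field types, not a full validation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Heuristic only: checks that a TIFF header at `pos` points to an IFD whose
// entries all carry a plausible field type. Offsets are relative to `pos`.
bool plausibleTiff(const std::vector<uint8_t>& d, size_t pos)
{
    if (pos > d.size() || d.size() - pos < 8) return false;
    const uint8_t* p = &d[pos];
    const size_t   n = d.size() - pos;
    const bool little = (p[0] == 'I' && p[1] == 'I');
    const bool big    = (p[0] == 'M' && p[1] == 'M');
    if (!little && !big) return false;

    auto u16 = [&](size_t o) -> uint32_t {
        return little ? (p[o] | (p[o + 1] << 8)) : ((p[o] << 8) | p[o + 1]);
    };
    auto u32 = [&](size_t o) -> uint32_t {
        return little ? (u16(o) | (u16(o + 2) << 16)) : ((u16(o) << 16) | u16(o + 2));
    };

    if (u16(2) != 42) return false;               // TIFF magic number
    const size_t ifd = u32(4);                    // offset of the first IFD
    if (ifd < 8 || ifd > n - 2) return false;     // entry count must be readable
    const size_t count = u16(ifd);
    if (count == 0 || count * 12 > n - ifd - 2) return false;
    for (size_t i = 0; i < count; ++i) {
        const uint32_t type = u16(ifd + 2 + 12 * i + 2);  // field type of entry i
        if (type < 1 || type > 13) return false;
    }
    return true;
}
```

A buffer laid out like foo1.tif above (an II header, the IFD at offset 8, and one SHORT entry with tag 0xa406) should pass. A stray II*\0 followed by arbitrary bytes will usually fail on the IFD offset or on a field type.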

@clanmills
Collaborator

@lclevy @LebedevRI Thanks Guys. I'm mistaken. There is a valid tiff in the mdat. I still don't have a complete understanding of how all the data is organised in CR3, however we're making good progress. And I'm enjoying working with you both. Thanks.

@clanmills clanmills self-assigned this Mar 13, 2018
@boardhead
Collaborator

The CTMD type 1, 4 and 5 directories contain timestamp, focal-length and exposure information respectively. ExifTool 10.87 will decode these too. I still don't know about type 3 though.

@lclevy

lclevy commented Mar 19, 2018 via email

@clanmills
Collaborator

clanmills commented Mar 19, 2018

Could you share your code please, Laurent? I was working on my decoder (tiffs.cpp) on Friday and will publish a new version here later this week. #236

@LebedevRI
Author

I have made progress on the craw proprietary format, and found Canon code which handles it, in dpp 4.8.20. Do you know someone else working on it? So that we can work together?
I would like to create an open source decoder.

Is there a specification? As you are well aware, i'm kinda working on it in darktable-org/rawspeed#121

@clanmills
Collaborator

dpp 4.8.20 from Canon Asia = http://support-th.canon-asia.com/contents/TH/EN/0200544802.html

I haven't downloaded this. Looks like an executable binary.

@clanmills clanmills removed their assignment Oct 10, 2019
@CarVac

CarVac commented Nov 2, 2019

LibRaw released a snapshot yesterday with CR3 support, and now this is the blocker for Filmulator supporting .CR3 cameras.

Is there any way I can help out?

@mrkkrp

mrkkrp commented Nov 13, 2019

@clanmills Can you provide some guidance for those who'd like to try to implement this? A plan/outline? Note that @CarVac asks if they can help.

@clanmills
Collaborator

Mark. I have retired from Exiv2. However, I admire your enthusiasm for this task and I had hoped that I would get to this before I retired.

So I'm happy to mentor you and @CarVac. Can we discuss this by email (robin@clanmills.com)? When you're ready to submit a PR, you can summarise the implementation. That is useful for several reasons:

  1. To help Team Exiv2 with the review of your changes.
  2. To act as a 'road map' for other file types being added to Exiv2.
  3. It'll help you be quite certain and clear in your mind about what you did (and why).

When would you like to start?

@mrkkrp

mrkkrp commented Nov 13, 2019

@CarVac would you like to give it a go or should I try to dig it? If C++ is fresh in your head and most importantly if you have the time you may be a better fit for this.

@CarVac

CarVac commented Nov 13, 2019

I'm fresh on C++ but I'm in the process of moving so I can't do it right now. But I really would like to follow along in the process (in whatever email conversations you have).

@clanmills
Collaborator

I know the Exiv2/C++ code very well. I downloaded the LibRaw code a couple of months ago at the suggestion of @lclevy. However, I was pushing hard to release Exiv2 v0.27.3 at that time and didn't look at it. I'll look in the next few days. I have a gardening project going on this week and, as I'm paying people to work with me, I need to deal with that first. I should have time to work on this next week.

It would be helpful if @CarVac could let us know about his skills, knowledge of C++, LibRaw, Exiv2, .CR3 - or anything that he thinks will be useful.

@clanmills
Collaborator

@CarVac I don't think this is a task for somebody "fresh on C++", however you could help with testing. Do you know Python3?

@CarVac

CarVac commented Nov 13, 2019

I meant "fresh" as in it's fresh in my mind. I write Filmulator (https://github.com/CarVac/filmulator-gui) in C++, and it's basically the way I learned the language. By contrast, I don't really know Python at all anymore; I learned back on 2.7 and haven't used it since.

My main skill in programming is working memory: given familiarity, I can grasp a very large amount of a codebase.

I know basically nothing about the inner workings of Exiv2 or CR3, and only a slight amount about the structure of LibRaw. But I definitely want to learn; part of this is so that I can better understand how to approach handling EXIF in Filmulator, which uses the crudest possible method of "copy all the exif from the raw to the output file".

@clanmills
Collaborator

@CarVac @mrkkrp I've changed my mind. I have retired and don't want to get involved. I'm sorry I offered to help. I'm very happy to have retired from Exiv2 after 12 years and 10,000+ hours of effort. I wish you every success if you take on this challenge.

@boardhead
Collaborator

boardhead commented Nov 14, 2019

@clanmills A well-deserved break. The effort that you put into this project is impressive.

@phako
Contributor

phako commented Nov 14, 2019

I think the basic idea would be to look at exiftool, libraw and the python script mentioned above, extract the differences from CR2 and implement those, ideally deriving (either spiritually or C++-wise) from the CR2 parser.

I have no idea how similar they are, if Canon has some common sense, they hopefully are.

@boardhead
Collaborator

The overall structure of a CR3 is completely different than CR2. It is based on the QuickTime file format (MOV/MP4), not the TIFF format. So you should start from a MOV/MP4 parser.

@lclevy

lclevy commented Nov 14, 2019

@CarVac I just pushed a parse_cr3.py update that parses ... CR2 too. Maybe I should rename it to parse_cr23.py :-)

@phako
Contributor

phako commented Nov 14, 2019

The overall structure of a CR3 is completely different than CR2. It is based on the QuickTime file format (MOV/MP4), not the TIFF format. So you should start from a MOV/MP4 parser.

Oh. Awesome...maybe one could even throw a HEIF parser at it?

@boardhead
Collaborator

boardhead commented Nov 14, 2019

Exactly. The format and challenges are the same as for HEIC (although metadata is stored in different locations). That is why the ability to write CR3 and HEIC was released in the same ExifTool update.

@lclevy

lclevy commented Nov 14, 2019

And the Canon 1D X Mark III will produce 10-bit HEIC: https://www.dpreview.com/interviews/6888036612/canon-interview-eos-1d-x-mark-iii-2019

@phako
Contributor

phako commented Nov 19, 2019

For parsing arbitrary ISOMP4 containers, you might want to have a look at https://github.com/DigiDNA/ISOBMFF - It's the successor of MP4Parse, which had a really nice C++ API and is MIT licensed.

@phako
Contributor

phako commented Nov 19, 2019

gah. though its Linux build system is sh*t

@lclevy

lclevy commented Nov 19, 2019

@VioletGiraffe

@phako, thanks for the ISOBMFF link. I experimented with it, and it looks like ISOBMFF doesn't see the specific container that holds EXIF metadata inside CR3.

@lclevy

lclevy commented Jun 11, 2020

@VioletGiraffe It's normal: the containers CMT1 ... CMT4 and CTMD are specific to Canon https://github.com/lclevy/canon_cr3/blob/master/readme.md

@VioletGiraffe

VioletGiraffe commented Jun 11, 2020

Yes, I was going by that exact description.

@lclevy

lclevy commented Jun 11, 2020

Canon is second only to IBM in the number of patents filed each year. But the fact that they own the intellectual property does not mean they will sue people. They were tolerant, for example, of the Magic Lantern team, which is a "camera jailbreak". But they can

@clanmills
Collaborator

Here's the summary of Sunday's meeting on Zoom. #318 (comment)

@clanmills
Collaborator

Topic: Support for ISOBMFF Files (AVIF, HEIF, CR3)
Time: Jan 9, 2021 07:30 PM Amsterdam, Berlin, Rome, Stockholm, Vienna

Join Zoom Meeting

Meeting ID: 821 6888 5673
Passcode: L1B3Hz

Agenda:
Exiv2-ISOBMFF-2021-01-09.pdf

@clanmills
Collaborator

Shipped in v0.27.4 RC1.

9 participants