Canon CR3 support. #236

Open
LebedevRI opened this Issue Mar 2, 2018 · 33 comments

@LebedevRI

LebedevRI commented Mar 2, 2018

Just thought I'd file the meta-bug.

I'm investigating that new raw format from the point-of-view of the raw image decoding in RawSpeed library in darktable-org/rawspeed#121.
Right now exiv2 (0.25 001900) knows nothing about it.
It should eventually have the same support as for the usual CR2:

  • camera orientation
  • thumbnail
@clanmills

Collaborator

clanmills commented Mar 2, 2018

Oh! Something new to be supported. As a retired Adobe Engineer I think "Why don't they all use DNG?" One standard's enough! With my Exiv2/opensource hat, "Why doesn't Canon contribute to Exiv2 to support their new standard?".

If you're willing to get involved, Team Exiv2 will support and mentor. Our existing commitments make CR3 very unlikely to get attention in 2018.

@LebedevRI

LebedevRI commented Mar 3, 2018

> Why don't they all use DNG?

Tell me about it :)

> "Why doesn't Canon contribute to Exiv2 to support their new standard?".

Should they do that, make sure they get me/rawspeed a spec of the actual image compression algo they used :)

> If you're willing to get involved

We'll see. Certainly not before the rawspeed part is done and working.
(Also, things like #214 in 2018 aren't too encouraging...)

@clanmills

Collaborator

clanmills commented Mar 3, 2018

Why are you worried about #214? I want to continue to support C++98 and add C++11/14 on top. The team want to dump C++98 support. What is your opinion?

@LebedevRI

LebedevRI commented Mar 3, 2018

I'm not worried, I'm not working on that code :)
There are two different things here: usage of C++11/C++14 in the public API, and in the internals.
You might be able to keep the API usable with older standards (though I cannot comment on whether anyone will ever need that, and it may negatively affect users of newer standards).
But sticking with old standards for the internal code is a whole other matter...

@clanmills

Collaborator

clanmills commented Mar 3, 2018

So we can use any technology in the library and offer an API that can be used by C++98 (or C++11/14/17) application code.

One of the goals in v0.27 is to try to establish a "v1.0 API". Something like: "We hope the API for v0.27 will be the API for v1.0 and we will try to avoid changes to the API for v0.28". If the API for v0.28 is identical to v0.27, we will call it v1.0.

This is a very useful conversation. We're having an Exiv2 Team Meeting at my home in England on the weekend of Saturday May 5 and we will discuss this topic (and many others). You (and any of your open-source friends) are welcome. #225. It won't be all work, it's a social/team-building weekend and partners will also attend.

@clanmills

Collaborator

clanmills commented Mar 3, 2018

There are sample images available here: http://www.photographyblog.com/reviews/canon_eos_m50_review/preview_images

763 rmills@rmillsmbp:~/clanmills $ dmpf ~/Downloads/canon_eos_m50_01.cr3 | head -10
       0        0: ....ftypcrx ....  ->  00 00 00 18  f  t  y  p  c  r  x    00 00 00 01
    0x10       16: crx isom..o.moov  ->   c  r  x     i  s  o  m 00 00  o 18  m  o  o  v
    0x20       32: ..fhuuid........  ->  00 00  f  h  u  u  i  d 85 c0 b6 87 82 0f 11 e0
    0x30       48: ....F+jH...&CNCV  ->  81 11 f4 ce  F  +  j  H 00 00 00  &  C  N  C  V
    0x40       64: CanonCR3_001/00.  ->   C  a  n  o  n  C  R  3  _  0  0  1  /  0  0 2e
    0x50       80: 09.00/00.00.00..  ->   0  9 2e  0  0  /  0  0 2e  0  0 2e  0  0 00 00
    0x60       96: .\CCTP..........  ->  00  \  C  C  T  P 00 00 00 00 00 00 00 01 00 00
    0x70      112: ......CCDT......  ->  00 03 00 00 00 18  C  C  D  T 00 00 00 00 00 00
    0x80      128: ..............CC  ->  00 10 00 00 00 00 00 00 00 01 00 00 00 18  C  C
    0x90      144: DT..............  ->   D  T 00 00 00 00 00 00 00 01 00 00 00 00 00 00

I can see embedded TIFFs (Exif metadata) at offsets 296 and 688:

764 rmills@rmillsmbp:~/clanmills $ dmpf ~/Downloads/canon_eos_m50_01.cr3 | grep 'I  I' | head -3
   0x120      288: ....CMT1II*.....  ->  00 00 01 88  C  M  T  1  I  I  * 00 08 00 00 00
   0x2b0      688: II*.....'.......  ->   I  I  * 00 08 00 00 00  ' 00 9a 82 05 00 01 00
   0x6d0     1744: ...8CMT3II*.....  ->  00 00 14  8  C  M  T  3  I  I  * 00 08 00 00 00
765 rmills@rmillsmbp:~/clanmills $ 

And can extract them:

786 rmills@rmillsmbp:~/clanmills $ dd bs=1 skip=296 if=~/Downloads/canon_eos_m50_01.cr3 count=200000 | exiv2 -pa -
200000+0 records in
200000+0 records out
200000 bytes (200 kB) copied, 0.795933 s, 251 kB/s
Exif.Image.ImageWidth                        Short       1  6000
Exif.Image.ImageLength                       Short       1  4000
Exif.Image.BitsPerSample                     Short       3  8 8 8
Exif.Image.Compression                       Short       1  JPEG (old-style)
Exif.Image.Make                              Ascii       6  Canon
Exif.Image.Model                             Ascii      14  Canon EOS M50
Exif.Image.Orientation                       Short       1  left, bottom
Exif.Image.XResolution                       Rational    1  72
Exif.Image.YResolution                       Rational    1  72
Exif.Image.ResolutionUnit                    Short       1  inch
Exif.Image.DateTime                          Ascii      20  2018:02:21 12:00:56
Exif.Image.Artist                            Ascii       1  
Exif.Image.Copyright                         Ascii       1  
787 rmills@rmillsmbp:~/clanmills $ 
768 rmills@rmillsmbp:~/clanmills $ dd bs=1 skip=688 if=~/Downloads/canon_eos_m50_01.cr3 count=200000 | exiv2 -pa -
200000+0 records in
200000+0 records out
200000 bytes (200 kB) copied, 0.792246 s, 252 kB/s
Exif.Image.ExposureTime                      Rational    1  1/80 s
Exif.Image.FNumber                           Rational    1  F6.3
Exif.Image.ExposureProgram                   Short       1  Aperture priority
Exif.Image.ISOSpeedRatings                   Short       1  10000
Exif.Image.DateTimeOriginal                  Ascii      20  2018:02:21 12:00:56
Exif.Image.ShutterSpeedValue                 SRational   1  1/83 s
Exif.Image.ApertureValue                     Rational    1  F6.4
Exif.Image.ExposureBiasValue                 SRational   1  0 EV
Exif.Image.MeteringMode                      Short       1  Multi-segment
Exif.Image.Flash                             Short       1  No flash
Exif.Image.FocalLength                       Rational    1  45.0 mm
769 rmills@rmillsmbp:~/clanmills $ 

Here's the XMP:

556 rmills@rmillsmbp:/Applications $ dd bs=1 count=300 skip=$((28480+8)) if=~/Downloads/canon_eos_m50_01.cr3 2>/dev/null | xmllint --format -
<?xml version="1.0"?>
<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/" rdf:about="">
      <xmp:Rating>0</xmp:Rating>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
557 rmills@rmillsmbp:/Applications $ 

This project doesn't look painful. We'll have to write an image handler cr3image.cpp which will:

  1. Detect the image type
  2. cr3image->readMetadata()
  3. cr3image->writeMetadata()
  4. add to the test suite

It's probably similar to src/cr2image.cpp which is 231 lines of code.
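
Step 1 (detecting the image type) could be sketched from the hex dump earlier in this thread: the file starts with an `ftyp` box whose major brand is `crx `. The function name `looksLikeCr3` is hypothetical and is not an Exiv2 API; it only assumes the first bytes of the file are already in memory.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper (not part of Exiv2): returns true when the buffer starts
// with an ISO BMFF 'ftyp' box whose major brand is "crx ", as seen in the
// hex dump of canon_eos_m50_01.cr3 above.
bool looksLikeCr3(const uint8_t* data, size_t size)
{
    if ( data == NULL || size < 12 ) return false;  // 4 size + 4 type + 4 brand
    return memcmp(data + 4, "ftyp", 4) == 0
        && memcmp(data + 8, "crx ", 4) == 0;
}
```

A check like this only says "probably CR3"; a real handler would presumably also want to see the Canon uuid box before trusting the file.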

@clanmills

Collaborator

clanmills commented Mar 4, 2018

I've made some progress with this. I've looked at a couple of libraries for ISO BMFF support. I've found the (Mac-only) ISOBMFF Explorer useful for understanding those files: https://imazing.com/isobmff/download

I've discovered a one-file project which dumps ISO BMFF files. This looks like a great starting point and I've invited the author to join team Exiv2: pyke369/isobmffdump#1

Here's the output from the Canon CR3 file:

749 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ isobmffdump  ~/Downloads/canon_eos_m50_01.cr3 
@0         | ftyp [24]
@24        | moov [28440]
@32        |   uuid [26216]
@26248     |   mvhd [108]
@26356     |   trak [484]
...
@28464     | uuid [65560]
@94024     | uuid [416007]
@510031    | mdat [38025680]
@38535711  | end
750 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ 
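
A listing like the one above can be produced by walking the boxes: each box starts with a 4-byte big-endian size followed by a 4-character type. The following is a minimal sketch, not the code of isobmffdump; `Box`, `readBE` and `listBoxes` are names made up for illustration. It also handles the two special sizes defined by ISO BMFF (1 = a 64-bit size follows the type, 0 = the box runs to the end of the data).

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string>
#include <vector>

struct Box {
    uint64_t    offset; // offset of the box header in the buffer
    uint64_t    size;   // total size of the box, header included
    std::string type;   // four character code, e.g. "moov"
};

// Read a big-endian unsigned integer of 'bytes' bytes.
static uint64_t readBE(const uint8_t* p, int bytes)
{
    uint64_t v = 0;
    for ( int i = 0 ; i < bytes ; i++ ) v = (v << 8) | p[i];
    return v;
}

// Hypothetical helper: list the boxes stored back to back in data[0..size).
std::vector<Box> listBoxes(const uint8_t* data, size_t size)
{
    std::vector<Box> boxes;
    uint64_t pos = 0;
    while ( size - pos >= 8 ) {
        Box box;
        box.offset = pos;
        box.size   = readBE(data + pos, 4);
        box.type   = std::string((const char*)data + pos + 4, 4);
        uint64_t header = 8;
        if ( box.size == 1 ) {                  // 64 bit size follows the type
            if ( size - pos < 16 ) break;
            box.size = readBE(data + pos + 8, 8);
            header   = 16;
        } else if ( box.size == 0 ) {           // box extends to the end of the data
            box.size = size - pos;
        }
        if ( box.size < header || box.size > size - pos ) break; // corrupt or truncated
        boxes.push_back(box);
        pos += box.size;
    }
    return boxes;
}
```

Container boxes such as moov can be listed by calling the same function on their payload, which starts 8 bytes after the box offset (moov at 24, its first child at 32 in the listing above).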

The uuid box at 32 is the Canon magic. I don't know how to decode it at the moment.

The uuid box at 28464 is the XMP metadata and uses the same UUID as jp2image.cpp

754 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ grep -i kJp2UuidXmp ~/gnu/github/exiv2/exiv2/src/jp2image.cpp  | head -1
const unsigned char kJp2UuidXmp[]  = "\xbe\x7a\xcf\xcb\x97\xa9\x42\xe8\x9c\x71\x99\x94\x91\xe3\xaf\xac";
755 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ 
@0         | ftyp [24]
@24        | moov [28440]
@32        |   uuid [26216]
               00000000  85 c0 b6 87 82 0f 11 e0 81 11 f4 ce 46 2b 6a 48  00 00 00 26 43 4e 43 56 43 61 6e 6f 6e 43 52 33  ............F+jH...&CNCVCanonCR3
               00000020  5f 30 30 31 2f 30 30 2e 30 39 2e 30 30 2f 30 30  2e 30 30 2e 30 30 00 00 00 5c 43 43 54 50 00 00  _001/00.09.00/00.00.00...\CCTP..
               00000040  00 00 00 00 00 01 00 00 00 03 00 00 00 18 43 43  44 54 00 00 00 00 00 00 00 10 00 00 00 00 00 00  ..............CCDT..............
               00000060  00 01 00 00 00 18 43 43 44 54 00 00 00 00 00 00  00 01 00 00 00 00 00 00 00 02 00 00 00 18 43 43  ......CCDT....................CC
               00000080  44 54 00 00 00 00 00 00 00 00 00 00 00 00 00 00  00 03 00 00 00 5c 43 54 42 4f 00 00 00 04 00 00  DT...................\CTBO......
               000000a0  00 01 00 00 00 00 00 00 6f 30 00 00 00 00 00 01  00 18 00 00 00 02 00 00 00 00 00 01 6f 48 00 00  ........o0..................oH..
...
               00006640  0e 09 e4 73 cd 5d 3c a7 2d a5 2e 7a 74 52 66 53  cd b3 1a 8a d5 2a b7 f7 1f ff d9 00 00 a1 e0 10  ...s.]<.-..ztRfS.....*..........
@26248     |   mvhd [108]
@26356     |   trak [484]
...
@26840     |   trak [584]
...
@28464     | uuid [65560]
             00000000  be 7a cf cb 97 a9 42 e8 9c 71 99 94 91 e3 af ac  3c 3f 78 70 61 63 6b 65 74 20 62 65 67 69 6e 3d  .z....B..q......<?xpacket begin=
             00000020  27 ef bb bf 27 20 69 64 3d 27 57 35 4d 30 4d 70  43 65 68 69 48 7a 72 65 53 7a 4e 54 63 7a 6b 63  '...' id='W5M0MpCehiHzreSzNTczkc
             00000040  39 64 27 3f 3e 3c 78 3a 78 6d 70 6d 65 74 61 20  78 6d 6c 6e 73 3a 78 3d 22 61 64 6f 62 65 3a 6e  9d'?><x:xmpmeta xmlns:x="adobe:n
             00000060  73 3a 6d 65 74 61 2f 22 3e 3c 72 64 66 3a 52 44  46 20 78 6d 6c 6e 73 3a 72 64 66 3d 22 68 74 74  s:meta/"><rdf:RDF xmlns:rdf="htt
             00000080  70 3a 2f 2f 77 77 77 2e 77 33 2e 6f 72 67 2f 31  39 39 39 2f 30 32 2f 32 32 2d 72 64 66 2d 73 79  p://www.w3.org/1999/02/22-rdf-sy
             000000a0  6e 74 61 78 2d 6e 73 23 22 3e 3c 72 64 66 3a 44  65 73 63 72 69 70 74 69 6f 6e 20 72 64 66 3a 61  ntax-ns#"><rdf:Description rdf:a
             000000c0  62 6f 75 74 3d 22 22 20 78 6d 6c 6e 73 3a 78 6d  70 3d 22 68 74 74 70 3a 2f 2f 6e 73 2e 61 64 6f  bout="" xmlns:xmp="http://ns.ado
             000000e0  62 65 2e 63 6f 6d 2f 78 61 70 2f 31 2e 30 2f 22  3e 3c 78 6d 70 3a 52 61 74 69 6e 67 3e 30 3c 2f  be.com/xap/1.0/"><xmp:Rating>0</
             00000100  78 6d 70 3a 52 61 74 69 6e 67 3e 3c 2f 72 64 66  3a 44 65 73 63 72 69 70 74 69 6f 6e 3e 3c 2f 72  xmp:Rating></rdf:Description></r
             00000120  64 66 3a 52 44 46 3e 3c 2f 78 3a 78 6d 70 6d 65  74 61 3e 20 20 20 20 20 20 20 20 20 20 20 20 20  df:RDF></x:xmpmeta>             
             00000140  20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20                                  
             00000160  20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20                                  
...
@510031    | mdat [38025680]
@38535711  | end
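
Recognising the XMP box from the dump above amounts to comparing the 16 bytes after the box header with the UUID quoted from jp2image.cpp. A minimal sketch, assuming a box with an ordinary 32-bit size field; `kXmpUuid` and `xmpPayload` are illustrative names, not existing Exiv2 symbols.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Same 16 bytes as kJp2UuidXmp in src/jp2image.cpp (quoted above).
static const unsigned char kXmpUuid[] =
    "\xbe\x7a\xcf\xcb\x97\xa9\x42\xe8\x9c\x71\x99\x94\x91\xe3\xaf\xac";

// Hypothetical helper: given a whole 'uuid' box (header included), return a
// pointer to the XMP packet and store its length in *length.
// Returns NULL when the box is not the XMP uuid box.
const uint8_t* xmpPayload(const uint8_t* box, size_t boxSize, size_t* length)
{
    const size_t header = 8 + 16;   // size + type, then the 16 byte UUID
    if ( boxSize < header )                   return NULL;
    if ( memcmp(box + 4, "uuid", 4)    != 0 ) return NULL;
    if ( memcmp(box + 8, kXmpUuid, 16) != 0 ) return NULL;
    *length = boxSize - header;
    return box + header;
}
```

For the box at 28464 this puts the packet at 28464 + 24 = 28488, the same offset as the `skip=$((28480+8))` used with dd earlier. The payload is padded with spaces, so the returned length is an upper bound on the XML itself.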

There's another uuid box at 94024. I don't know what that is yet. It's not Iptc:

760 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ isobmffdump --dump uuid ~/Downloads/canon_eos_m50_01.cr3 | grep uuid
@32        |   uuid [26216]
@28464     | uuid [65560]
@94024     | uuid [416007]
761 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ 
@clanmills

Collaborator

clanmills commented Mar 4, 2018

I've raised a discussion thread on Pixls.us to see if somebody knows the specification for those unidentified uuid box structures. https://discuss.pixls.us/t/new-canon-cr3-file-specification/6881

As I mentioned on Pixls.us, I have a (retired) Canon friend who may be able to help. Canon is a huge company and it's likely that my friend will have no contacts with digital camera software engineering.

@LebedevRI

LebedevRI commented Mar 4, 2018

I too was able to make progress. It's strange, those raws are demosaiced already :/
darktable-org/rawspeed#121

@clanmills

Collaborator

clanmills commented Mar 4, 2018

Good News. I don't think this is going to be very difficult. The file is ISO BMFF format. Almost all the data in the file is in the MDAT (which I thought was intended for audio/music). I thought ISO BMFF was intended for video. Confused? (I'm easily confused).

I confirm that the MDAT "appears" to be a plain 38 MB JPEG 4000x6000 pixels.

513 rmills@rmillsmbp:~/gnu/github $ dd bs=1 skip=$((510031+16)) count=$((38025680-16)) if=~/Downloads/canon_eos_m50_01.cr3 > foo.jpg
38025664+0 records in
38025664+0 records out
38025664 bytes (38 MB) copied, 418.962 s, 90.8 kB/s
514 rmills@rmillsmbp:~/gnu/github $ exiv2 foo.jpg 
File name       : foo.jpg
File size       : 38025664 Bytes
MIME type       : image/jpeg
Image size      : 6000 x 4000
foo.jpg: No Exif data found in the file
515 rmills@rmillsmbp:~/gnu/github $ 

This comment is relevant: darktable-org/rawspeed#121 (comment) From Exiv2's point of view, the MDAT previews are interesting.

The Exif (and IPTC) metadata is almost certainly in the Canon uuid box near the top of the file. We have already identified the uuid box with the XMP metadata.

So, we're making good progress.

@LebedevRI

LebedevRI commented Mar 4, 2018

> I confirm that the MDAT "appears" to be a plain 38 MB JPEG 4000x6000 pixels.

As per darktable-org/rawspeed#121 (comment), what you are looking at is the largest embedded thumbnail, which exiv2 should ideally be able to provide via the usual means.

$ ffprobe canon_eos_m50_01.cr3 
...
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'canon_eos_m50_01.cr3':
  Metadata:
    major_brand     : crx 
    minor_version   : 1
    compatible_brands: crx isom
    creation_time   : 2018-02-21T12:00:56.000000Z
  Duration: 00:00:01.00, start: 0.000000, bitrate: 308285 kb/s
    Stream #0:0(eng): Video: none (CRAW / 0x57415243), none, 6000x4000, 26169 kb/s, 1 fps, 1 tbr, 1 tbn, 1 tbc (default)      <- THIS ONE
    Metadata:
      creation_time   : 2018-02-21T12:00:56.000000Z
    Stream #0:1(eng): Video: none (CRAW / 0x57415243), none, 1624x1080, 14829 kb/s, 1 fps, 1 tbr, 1 tbn, 1 tbc (default)      <- LIKELY TOO
    Metadata:
      creation_time   : 2018-02-21T12:00:56.000000Z
    Stream #0:2(eng): Video: none (CRAW / 0x57415243), none, 6288x4056, 262877 kb/s, 1 fps, 1 tbr, 1 tbn, 1 tbc (default)   <- has to be raw, dimensions/bitrate look about right...
    Metadata:
      creation_time   : 2018-02-21T12:00:56.000000Z
    Stream #0:3(eng): Data: none (CTMD / 0x444D5443), 328 kb/s (default)
    Metadata:
      creation_time   : 2018-02-21T12:00:56.000000Z

The second one is likely a thumbnail too.
Only the third one is the raw data chunk.
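
As an aside on reading the ffprobe output: the hex value printed next to each tag is just the four-character code packed with its first character in the low byte. A small sketch (the function name `fourccToInt` is made up here) that reproduces the two values shown above:

```cpp
#include <stdint.h>

// Pack a four character code the way ffprobe prints it:
// first character in the least significant byte.
uint32_t fourccToInt(const char* code)
{
    return  (uint32_t)(unsigned char)code[0]
         | ((uint32_t)(unsigned char)code[1] <<  8)
         | ((uint32_t)(unsigned char)code[2] << 16)
         | ((uint32_t)(unsigned char)code[3] << 24);
}
```

`fourccToInt("CRAW")` gives 0x57415243 and `fourccToInt("CTMD")` gives 0x444D5443, matching the stream lines in the listing.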

@clanmills

Collaborator

clanmills commented Mar 5, 2018

Progress with decoding the Canon uuid box. It's a linked list of records:

893 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ ./isobmffdump ~/Downloads/canon_eos_m50_01.cr3 | head -5
@0         | ftyp [24]
@24        | moov [28440]
@32        |   uuid [26216]
@26248     |   mvhd [108]
@26356     |   trak [484]

I can extract the payload of the uuid box with this (24 = 8 bytes for the box header, i.e. the size and the 'uuid' type, + 16 bytes for the UUID itself):

894 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ dd skip=$((32+24)) count=$((26216-24)) bs=1 if=~/Downloads/canon_eos_m50_01.cr3 > canon.uuid
26192+0 records in
26192+0 records out
26192 bytes (26 kB) copied, 0.261663 s, 100 kB/s

I've written a utility dumper.cpp (code below):

970 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ ./dumper canon.uuid 
dumping canon.uuid
 offset| length| data
      4|     34| CNCVCanonCR3_001/00.09.00/00.00.00
     42|     88| CCTP................CCDT....................CCDT................ _._
    134|     88| CTBO..............o0..................oH......Y............O.... _._
    226|      6| free..
    236|    388| CMT1II*...............p......................................... _._
    628|   1060| CMT2II*.....'........................."...........'........'..0. _._
   1692|   5172| CMT3II*...../.....1...B..............................."......... _._
   6868|   1812| CMT4II*......................................................... _._
   8684|  17508| THMB.......x..DK................................................ _._
971 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ 

CMT1, CMT2, CMT3 and CMT4 look like TIFFs. Exif metadata, probably. Why four? Don't know yet.
THMB is probably a thumbnail (JPEG, perhaps).

// dumper.cpp - dump the length-prefixed records inside the Canon uuid box
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

// Assemble a big-endian 32 bit value from 4 bytes (independent of host byte order).
static uint32_t bigEndianToInt(const unsigned char* b)
{
    return ((uint32_t)b[0] << 24)
         | ((uint32_t)b[1] << 16)
         | ((uint32_t)b[2] <<  8)
         |  (uint32_t)b[3]
         ;
}

// Print up to 64 bytes of a record, replacing non-printable bytes with '.'.
void dump(const void* buffer,uint32_t length)
{
    bool bEarly = false;
    if ( length > 64 ) {
        bEarly = true;
        length = 64 ;
    }
    for ( uint32_t i = 0 ; i < length ; i++ ) {
        int c = ((const unsigned char*)buffer) [i];
        printf("%c", ((c<32) || (c>=127)) ? '.' : c );
    }
    if ( bEarly ) printf(" _._");
    printf("\n");
}

int main(int argc, const char* argv[])
{
    int         result   = 0;
    FILE*       f        = NULL;
    const char* program  = argv[0];
    const char* filename = argc > 1 ? argv[1] : NULL;

    if ( argc == 2 ) {
        f = fopen(filename,"rb");
        if ( !f ) {
            fprintf(stderr,"unable to open %s\n",filename);
            result = 1;
        }
    } else {
        fprintf(stderr,"syntax: %s file\n",program);
        result = 1;
    }

    if ( f ) {
        printf("dumping %s\n",filename);
        printf(" offset| length| data\n");
        unsigned char header[4];
        uint32_t      length = 0;
        // every record starts with a 4 byte big-endian length which includes the length field itself
        while ( fread(header,1,4,f) == 4 && (length = bigEndianToInt(header)) > 0 ) {
            if ( length <= 4 ) continue; // nothing follows the length field
            length -= 4;
            printf("%7ld|%7u| ",ftell(f),(unsigned)length);
            void* buffer = ::malloc(length);
            if ( !buffer || fread(buffer,1,length,f) != length ) {
                printf("\n");
                fprintf(stderr,"truncated or oversized record\n");
                ::free(buffer);
                result = 2;
                break;
            }
            dump(buffer,length);
            ::free(buffer);
        }
        fclose(f);
    }

    return result;
}
@LebedevRI

LebedevRI commented Mar 5, 2018

> CMT1, CMT2, CMT3 and CMT4 look like TIFFs. Exif metadata, probably. Why four? Don't know yet.

Well, we do have 3 actual Streams in the mdat, so one per each of those + one general IFD?

@clanmills

Collaborator

clanmills commented Mar 5, 2018

Good Morning, Roman. Hope the weather's warmer for you this week.

I'm going to leave you to investigate the MDAT because I believe it's a JPEG with no metadata. I don't see an IFD in the MDAT. If you know how to locate that, I'll investigate that after I understand the moov/uuid. And we've already identified the XMP.

I'm focused on the Canon uuid box in the moov. I've run out of time for today. I hope tomorrow morning to know enough to read the metadata.

@LebedevRI

LebedevRI commented Mar 5, 2018

(I was talking about the fact that the CR3 ISO BMFF has exactly 4 trak boxes, so my guess would be that it's for each of those)

@clanmills

Collaborator

clanmills commented Mar 5, 2018

Thanks, Roman. We'll figure out this puzzle. 4 trak boxes. Yes, that's right.

Last discovery for Monday morning before I get on with what I should be doing! The THMB record contains a JPEG 160x120 thumbnail.

It's 20 bytes into the record. Why 20? 4 for THMB followed by 16 bytes. For extracting thumbnails, we never need to know. Exiv2 has never been able to edit thumbnails. Exiv2 can relocate THMB in the file as a binary blob.

919 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ dd bs=1 skip=$((8684+20)) count=$((17508-20)) if=canon.uuid > canon.jpg
17488+0 records in
17488+0 records out
17488 bytes (17 kB) copied, 0.177006 s, 98.8 kB/s
920 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ exiv2 canon.jpg 
File name       : canon.jpg
File size       : 17488 Bytes
MIME type       : image/jpeg
Image size      : 160 x 120
canon.jpg: No Exif data found in the file
921 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ 
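
The dd extraction above can be sketched in code as follows. This is an illustration under the assumptions stated in this comment (the record starts with the THMB tag and the JPEG begins 20 bytes in); `thumbnailJpeg` is a hypothetical name, and the only extra check is the standard JPEG start-of-image marker FF D8.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper: 'record' points at the "THMB" tag of the record
// (as in the dumper output). Returns a pointer to the embedded JPEG and
// stores its length in *jpegSize, or returns NULL if the layout is unexpected.
const uint8_t* thumbnailJpeg(const uint8_t* record, size_t recordSize, size_t* jpegSize)
{
    const size_t skip = 4 + 16;  // "THMB" + 16 bytes whose meaning is not known yet
    if ( recordSize < skip + 2 )               return NULL;
    if ( memcmp(record, "THMB", 4) != 0 )      return NULL;
    const uint8_t* jpeg = record + skip;
    if ( jpeg[0] != 0xff || jpeg[1] != 0xd8 )  return NULL; // JPEG SOI marker
    *jpegSize = recordSize - skip;
    return jpeg;
}
```

With the record at offset 8684 and length 17508 in canon.uuid, this would return the same 17488 bytes that dd wrote to canon.jpg.
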
@clanmills

Collaborator

clanmills commented Mar 5, 2018

Phil Harvey (ExifTool) knows about .CR3 files:

995 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ exiftool ~/Downloads/canon_eos_m50_01.cr3 
ExifTool Version Number         : 10.82
...
Compatible Brands               : crx , isom
Compressor Version              : CanonCR3_001/00.09.00/00.00.00
Image Width                     : 6000
Image Height                    : 4000
Bits Per Sample                 : 8 8 8
Compression                     : JPEG (old-style)
...

There is also a helpful document. See "Canon uuid Tags" on this page: https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/Canon.html

Tag ID  Values / Notes      Comment
CCTP    CanonCCTP
CMT1    EXIF Tags
CMT2    EXIF Tags
CMT3    Canon Tags          https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/Canon.html
CMT4    Canon Unknown IFD
CNCV    Compressor Version
CNTH    Canon CNTH Tags
THMB    Thumbnail Image
@clanmills

Collaborator

clanmills commented Mar 9, 2018

I think I've got a good understanding about the scope of this project. We're having a team meeting in May #225 when we'll discuss the specification and schedule for v0.27.

@clanmills

Collaborator

clanmills commented Mar 13, 2018

I've written code to search for TIFF structures in a binary file. It's "simple minded": it only searches for the byte pattern II*\0 or MM\0* and reports the location. Discovered TIFFs can be extracted with the utility dd. Although there is binary data with the II*\0 fingerprint in MDAT, none of these structures is a valid TIFF. It's just a coincidence.

615 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ tiffs ~/Downloads/canon_eos_m50_01.cr3 
searching /Users/rmills/Downloads/canon_eos_m50_01.cr3
    offset | type
       296 | II
       688 | II
      1752 | II
      6906 | II
      6928 | II
  38494795 | II
  38494825 | II
  38496517 | II
  38534235 | II
616 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $

Here's the code:

// tiffs.cpp - search for tiff structures in a binary file
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>

int main(int argc, const char* argv[])
{
    int         result   = 0;
    FILE*       f        = NULL;
    const char* program  = argv[0];
    const char* filename = argc > 1 ? argv[1] : NULL;

    if ( argc == 2 ) {
        f = fopen(filename,"rb");
        if ( !f ) {
            fprintf(stderr,"unable to open %s\n",filename);
            result = 1;
        }
    } else {
        fprintf(stderr,"syntax: %s file\n",program);
        result = 1;
    }

    if ( f ) {
        printf("searching %s\n",filename);
        printf("    offset | type\n");

        const char II[4]   = { 'I','I','*','\0' }; // little-endian TIFF signature
        const char MM[4]   = { 'M','M','\0','*' }; // big-endian TIFF signature
        char       buff[4] = { 0,0,0,0 };          // sliding window over the last 4 bytes read
        long       count   = 0;                    // number of bytes read so far
        int        c;

        while ( (c = fgetc(f)) != EOF ) {
            // slide the window along by one byte so no candidate position is skipped
            memmove(buff,buff+1,3);
            buff[3] = (char) c;
            count++;
            if ( count >= 4 && (memcmp(II,buff,4) == 0 || memcmp(MM,buff,4) == 0) ) {
                printf("%10ld | %c%c\n",count-4,buff[0],buff[1]);
            }
        }
        fclose(f);
        f=NULL;
    }

    return result;
}
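
Matching the 4-byte signature is only a first filter. A slightly stronger check, sketched here with a hypothetical `isPlausibleTiff` helper, also looks at the rest of the 8-byte TIFF header: the magic number 42 in the stated byte order and an IFD offset that falls inside the available data. It is still a necessary condition rather than proof of a valid TIFF.

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: check the 8 byte TIFF header at data[0..size) and,
// if it is plausible, store the offset of the first IFD in *ifdOffset.
bool isPlausibleTiff(const uint8_t* data, size_t size, uint32_t* ifdOffset)
{
    if ( size < 8 ) return false;
    bool little;
    if      ( data[0] == 'I' && data[1] == 'I' ) little = true;   // Intel byte order
    else if ( data[0] == 'M' && data[1] == 'M' ) little = false;  // Motorola byte order
    else return false;

    uint16_t magic = little ? (uint16_t)(data[2] | (data[3] << 8))
                            : (uint16_t)(data[3] | (data[2] << 8));
    if ( magic != 42 ) return false;                              // '*' == 42

    uint32_t offset = little
        ? ((uint32_t)data[4] | (uint32_t)data[5] << 8 | (uint32_t)data[6] << 16 | (uint32_t)data[7] << 24)
        : ((uint32_t)data[7] | (uint32_t)data[6] << 8 | (uint32_t)data[5] << 16 | (uint32_t)data[4] << 24);

    // the first IFD must lie after the header and leave room for its 2 byte entry count
    if ( offset < 8 || offset > size - 2 ) return false;
    *ifdOffset = offset;
    return true;
}
```

The headers seen in the hex dumps in this thread (`II*\0` followed by `08 00 00 00`) pass this check with an IFD offset of 8.
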
@lclevy

lclevy commented Mar 13, 2018

In mdat, to be more accurate, there are IFDs, that are parsed by ExifTool 10.82

@clanmills

Collaborator

clanmills commented Mar 13, 2018

Here's a snippet from the -verbose output of ExifTool.

621 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ exiftool -verbose ~/Downloads/canon_eos_m50_01.cr3

CTMD (SubDirectory) -->

  + [CTMD type 1 directory, 18 bytes]
  + [CTMD type 3 directory, 10 bytes]
  + [CTMD type 4 directory, 18 bytes]
  + [CTMD type 5 directory, 34 bytes]
  + [CTMD type 7 directory, 1716 bytes]
    | ExifInfo (SubDirectory) -->
    | + [TIFF directory]
    | | ExifByteOrder = II
    | | + [IFD0 directory with 1 entries]
    | | | 0) SceneCaptureType = 0
    | MakerNoteCanon (SubDirectory) -->
629 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ tiffs ~/Downloads/canon_eos_m50_01.cr3 
searching /Users/rmills/Downloads/canon_eos_m50_01.cr3
    offset | type
       296 | II
       688 | II
      1752 | II
      6906 | II
      6928 | II
  38494795 | II  <--- this is a valid tiff
  38494825 | II
  38496517 | II
  38534235 | II
630 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ dd bs=1 skip=38494795 count=10000 if=~/Downloads/canon_eos_m50_01.cr3 > foo1.tif
10000+0 records in
10000+0 records out
10000 bytes (10 kB) copied, 0.11086 s, 90.2 kB/s
631 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ exiv2 -pS foo1.tif 
STRUCTURE OF TIFF FILE (II): foo1.tif
 address |    tag                           |      type |    count |    offset | value
      10 | 0xa406 SceneCaptureType          |     SHORT |        1 |         0 | 0
END foo1.tif
632 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ 
@clanmills

Collaborator

clanmills commented Mar 13, 2018

@lclevy @LebedevRI Thanks Guys. I'm mistaken. There is a valid tiff in the mdat. I still don't have a complete understanding of how all the data is organised in CR3, however we're making good progress. And I'm enjoying working with you both. Thanks.

@clanmills clanmills self-assigned this Mar 13, 2018

@boardhead

boardhead commented Mar 19, 2018

The CTMD type 1, 4 and 5 directories contain timestamp, focal-length and exposure information respectively. ExifTool 10.87 will decode these too. I still don't know about type 3 though.

@lclevy

lclevy commented Mar 19, 2018

@clanmills

Collaborator

clanmills commented Mar 19, 2018

Could you share your code please, Laurent? I was working on my decoder (tiffs.cpp) on Friday and will publish a new version here later this week. #236

@LebedevRI

LebedevRI commented Mar 19, 2018

> I have made progress on the CRAW proprietary format, and found Canon code which handles it, in DPP 4.8.20. Do you know anyone else working on it, so that we can work together?
> I would like to create an open source decoder.

Is there a specification? As you are well aware, i'm kinda working on it in darktable-org/rawspeed#121

@clanmills

Collaborator

clanmills commented Mar 19, 2018

dpp 4.8.20 from Canon Asia = http://support-th.canon-asia.com/contents/TH/EN/0200544802.html

I haven't downloaded this. Looks like an executable binary.

@lclevy

lclevy commented Mar 19, 2018

@Exiv2 Exiv2 deleted a comment from boardhead Mar 19, 2018

@clanmills

Collaborator

clanmills commented Mar 19, 2018

Gentlemen: DPP4 is an impressive application. It does have a metadata dialog box (View/Info) with tabs. Curiously, under the tab "IPTC/XMP" it shows categories with no meaningful data. Maybe the camera firmware engineers didn't tell the DPP developers about the uuid for XMP! In the test file, the XMP only has "Rating 0". Anyway, good progress. I'll spend more time looking at this later in the week.

@clanmills

Collaborator

clanmills commented Mar 19, 2018

There is a Canon Digital Camera SDK available in SE Asia. I'd have to sign an NDA to obtain that, so as Exiv2 is licensed under GPLv2, I suspect the NDA would cause me trouble. I feel we're close to unravelling the .CR3 format. I'll spend time on this later in the week and give everybody an update on my progress.

@lclevy

lclevy commented Mar 19, 2018

updated: https://github.com/lclevy/canon_cr3
(only by examining samples)

@lclevy

lclevy commented Mar 19, 2018

the hardest part is understanding the compression algorithm. not sure it is possible

@clanmills

Collaborator

clanmills commented Mar 19, 2018

@lclevy Nice job of updating your "spec" for CR3.
