Canon CR3 support. #121

Open
LebedevRI opened this Issue Mar 2, 2018 · 74 comments

LebedevRI commented Mar 2, 2018

Though, until the production camera is actually released, and the full set of freely-licensed samples is contributed to https://raw.pixls.us/, it won't be merged.

LebedevRI self-assigned this Mar 2, 2018

LebedevRI commented Mar 2, 2018

Does not appear to be TIFF.
Appears to be ISO Media / ISO/IEC 14496-12.
The actual content of the 'mdat' box appears to be JPEG-ish, but not good old lossless JPEG, because it contains M_DQT (the define-quantization-tables marker).

$ file canon_eos_m50_01.cr3  
canon_eos_m50_01.cr3: ISO Media
$ mpv canon_eos_m50_01.cr3
Playing: canon_eos_m50_01.cr3
 (+) Video --vid=1 (*) ( 6000x4000 1.000fps)
     Video --vid=2 (*) ( 1624x1080 1.000fps)
     Video --vid=3 (*) ( 6288x4056 1.000fps)
Failed to initialize a video decoder for codec ''.
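
To see the ISO Media (ISO/IEC 14496-12) layout for yourself, the boxes can be listed with a few lines of Python. This is only a minimal sketch: `iter_boxes` is a made-up helper name, not part of any library mentioned in this thread, and it only understands the generic box header (32-bit size, 4-character type, optional 64-bit size), nothing CR3-specific.

```python
import struct


def iter_boxes(buf, start=0, end=None):
    """Yield (offset, type, size) for each box in buf[start:end], one nesting level."""
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" follows the type field.
            if pos + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", buf, pos + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the enclosing data.
            size = end - pos
        if size < header or pos + size > end:
            break  # malformed or truncated box; stop instead of guessing
        yield pos, box_type.decode("latin-1"), size
        pos += size


if __name__ == "__main__":
    # Synthetic data (not a real CR3): a 24-byte 'ftyp' box followed by a
    # 16-byte 'moov' box that contains an 8-byte 'free' box.
    demo = (struct.pack(">I4s", 24, b"ftyp") + b"\0" * 16
            + struct.pack(">I4s", 16, b"moov")
            + struct.pack(">I4s", 8, b"free"))
    for offset, box_type, size in iter_boxes(demo):
        print(f"@{offset:<6} | {box_type} [{size}]")
```

To descend into a container box such as `moov`, call `iter_boxes` again on the range that starts after that box's header and ends at its last byte.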

LebedevRI commented Mar 4, 2018

Hmm, ok, this is strange. Got the ISO Media parser working, and hooked up a dummy Cr3Decoder.
The mdat chunk is NOT lossless JPEG. But it just loads via libjpeg.
Those CR3s appear to be plain demosaiced JPEGs...
(Yes, this is an actual decoded image)
[image attachment: the decoded picture]

LibRaw commented Mar 4, 2018

According to ffprobe, the mdat chunk contains four substreams.
The first one is exactly the JPEG preview, so from libjpeg's point of view this is a JPEG file plus ~30 MB of garbage (the real raw data and makernotes) at the tail.

LebedevRI commented Mar 4, 2018

Aha, thank you, that makes more sense. Let's see...

LibRaw commented Mar 4, 2018

Unfortunately, the two data chunks (1624x1080 and 6288x4056; the first one is probably a low-res preview) are not lossless JPEGs either :(

LebedevRI commented Mar 4, 2018

(I'm still hoping they are JPEG w/arithmetic coding)

LibRaw commented Mar 4, 2018

It does not look like JPEG (e.g. no FF doubling).

LebedevRI commented Mar 4, 2018

There are 0xFF 0xFF sequences in files 05, 11 and 19 (looking at the second half of the file in the okteta hex editor).
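
One quick way to compare the two observations above is to count which byte follows each 0xFF in a chunk. In standard JPEG entropy-coded data a literal 0xFF is followed by a stuffed 0x00 (or the 0xFF starts a marker), so a chunk where 0xFF is mostly followed by arbitrary bytes is unlikely to be ordinary JPEG scan data. A rough sketch; `ff_followers` is a hypothetical helper name, and which byte range of the file to feed it is left to the reader.

```python
from collections import Counter


def ff_followers(data):
    """Count the byte values that directly follow each 0xFF byte in data."""
    counts = Counter()
    for i in range(len(data) - 1):
        if data[i] == 0xFF:
            counts[data[i + 1]] += 1
    return counts


if __name__ == "__main__":
    # Synthetic bytes, not taken from a real CR3 file.
    sample = bytes([0x12, 0xFF, 0x00, 0xFF, 0xFF, 0xD9])
    for value, count in sorted(ff_followers(sample).items()):
        print(f"FF {value:02X}: {count}")
```

This only gives a hint, not a proof: a few 0xFF 0xFF pairs can also show up in non-JPEG compressed data by chance.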

clanmills commented Mar 4, 2018

Thank you folks. I'm going to keep out of a discussion about the pixels and keep to my domain of metadata. Thanks to your comments, I've now heard of ffprobe.

822 rmills@rmillsmbp:~/gnu/github $ exiv2 -pR foo.jpg 
STRUCTURE OF JPEG FILE: foo.jpg
 address | marker       |  length | data
       0 | 0xffd8 SOI  
       2 | 0xffdb DQT   |     132 
     136 | 0xffc0 SOF0  |      17 
     155 | 0xffc4 DHT   |     418 
     575 | 0xffda SOS  
823 rmills@rmillsmbp:~/gnu/github $ ffprobe foo.jpg 
ffprobe version 3.4.1 Copyright (c) 2007-2017 the FFmpeg developers
  built with Apple LLVM version 9.0.0 (clang-900.0.39.2)
  configuration: --prefix=/opt/local --enable-swscale --enable-avfilter --enable-avresample --enable-libmp3lame --enable-libvorbis --enable-libopus --enable-librsvg --enable-libtheora --enable-libopenjpeg --enable-libmodplug --enable-libvpx --enable-libsoxr --enable-libspeex --enable-libass --enable-libbluray --enable-lzma --enable-gnutls --enable-fontconfig --enable-libfreetype --enable-libfribidi --disable-jack --disable-libopencore-amrnb --disable-libopencore-amrwb --disable-indev=jack --enable-opencl --disable-outdev=xv --enable-audiotoolbox --enable-videotoolbox --enable-sdl2 --mandir=/opt/local/share/man --enable-shared --enable-pthreads --cc=/usr/bin/clang --arch=x86_64 --enable-x86asm --enable-libx265 --enable-gpl --enable-postproc --enable-libx264 --enable-libxvid
  libavutil      55. 78.100 / 55. 78.100
  libavcodec     57.107.100 / 57.107.100
  libavformat    57. 83.100 / 57. 83.100
  libavdevice    57. 10.100 / 57. 10.100
  libavfilter     6.107.100 /  6.107.100
  libavresample   3.  7.  0 /  3.  7.  0
  libswscale      4.  8.100 /  4.  8.100
  libswresample   2.  9.100 /  2.  9.100
  libpostproc    54.  7.100 / 54.  7.100
Input #0, image2, from 'foo.jpg':
  Duration: 00:00:00.04, start: 0.000000, bitrate: 7605132 kb/s
    Stream #0:0: Video: mjpeg, yuvj422p(pc, bt470bg/unknown/unknown), 6000x4000, 25 tbr, 25 tbn, 25 tbc
824 rmills@rmillsmbp:~/gnu/github $ 
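
For reference, the marker listing that `exiv2 -pR` prints above can be approximated with a short Python sketch. It assumes the layout visible in that output: each segment is a 2-byte marker followed by a 2-byte big-endian length that counts itself, and the walk stops at SOS. `jpeg_segments` is an illustrative name, not an Exiv2 API.

```python
import struct


def jpeg_segments(data):
    """Return [(offset, marker, length_or_None)] from SOI up to and including SOS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    segments = [(0, 0xFFD8, None)]
    pos = 2
    while pos + 2 <= len(data):
        (marker,) = struct.unpack_from(">H", data, pos)
        if marker >> 8 != 0xFF:
            raise ValueError(f"no marker at offset {pos}")
        if marker == 0xFFDA:
            # Start of scan: entropy-coded data follows, so stop here.
            segments.append((pos, marker, None))
            break
        if pos + 4 > len(data):
            break  # truncated segment header
        (length,) = struct.unpack_from(">H", data, pos + 2)
        segments.append((pos, marker, length))
        pos += 2 + length  # the length field includes its own two bytes
    return segments


if __name__ == "__main__":
    def fake_segment(marker, length):
        # Synthetic segment: marker, length field, zero padding as payload.
        return struct.pack(">HH", marker, length) + b"\0" * (length - 2)

    data = (b"\xff\xd8" + fake_segment(0xFFDB, 132) + fake_segment(0xFFC0, 17)
            + fake_segment(0xFFC4, 418) + b"\xff\xda")
    for offset, marker, length in jpeg_segments(data):
        print(f"{offset:8} | 0x{marker:04x} | {'' if length is None else length}")
```

With synthetic segments of the same lengths as in the listing (132, 17, 418), the offsets come out as 0, 2, 136, 155 and 575, matching the `address` column above.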

I'll work on the metadata. We've found the XMP data and I'm confident the Exif (and IPTC) data is in the Canon uuid box at the top of the file.

Later, I will investigate previews and ICC profiles.

LebedevRI commented Mar 4, 2018

> I'll work on the metadata. We've found the XMP data and I'm confident the Exif (and IPTC) data is in the Canon uuid box at the top of the file.
> Later, I will investigate previews and ICC profiles.

Thank you, @clanmills! Sounds awesome! :)

lclevy commented Mar 11, 2018

Phil Harvey's ExifTool 10.82 can parse Canon CR3.
It seems there are at least 3 pictures: THMB, PRVW, and another in mdat.

lclevy commented Mar 11, 2018

for http://img.photographyblog.com/reviews/canon_eos_m50/photos/canon_eos_m50_01.cr3
mdat contains "trak" pictures at 0x6d940, 0x37b030 and 0x546c70, and TIFF metadata at 0x2565b98

LebedevRI commented Mar 11, 2018

Yep, that much was understood. Need to continue working on that code.. Hopefully tomorrow?

lclevy commented Mar 11, 2018

I started a CR3 format description here https://github.com/lclevy/canon_cr3 if you are interested in contributing...
I've done it for CR2 here http://lclevy.free.fr/cr2/ and https://github.com/lclevy/libcraw2

clanmills commented Mar 12, 2018

Laurent

This is very helpful indeed. I wrote an email to you to ask if you were considering working on CR3. Then I made good progress and decided not to send the email.

I've documented my progress: Exiv2/exiv2#236

Team Exiv2 will have a meeting on May 5 at my home in England. This project will be discussed. There are many demands on the team and only a few contributors. Honestly, I don't think we have the resources to implement support for CR3 in 2018. Exiv2/exiv2#225

lclevy commented Mar 12, 2018

Hi,
lots of updates at https://github.com/lclevy/canon_cr3; you can subscribe to the project to see them

clanmills commented Mar 13, 2018

Thanks for doing this. It's looking good.

I'm puzzled by the claim (also made by @lclevy and @LibRaw) that there are 4 images in MDAT. I only see a single JPEG. I see there are 4 trak objects which are a hierarchy of items (CRAW, stsz, co64). Perhaps you could document the meaning of the objects in the trak hierarchy.

532 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ ./isobmffdump --dump stsz ~/Downloads/canon_eos_m50_01.cr3 
@0         | ftyp [24]
@24        | moov [28440]
@32        |   uuid [26216]
@26248     |   mvhd [108]
@26356     |   trak [484]
@26364     |     tkhd [92]
@26456     |     mdia [384]
@26464     |       mdhd [32]
@26496     |       hdlr [33]
@26529     |       minf [311]
@26537     |         vmhd [20]
@26557     |         dinf [36]
@26565     |           dref [28]
@26593     |         stbl [247]
@26601     |           stsd [128]
@26617     |             CRAW [112]
@26729     |           stts [24]
@26753     |           stsc [28]
@26781     |           stsz [20]
                       00000000  00 00 00 00 00 31 ea 2f 00 00 00 01                                                               .....1./....
@26801     |           free [15]
@26816     |           co64 [24]
...
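
The 12 bytes dumped under `stsz` can be decoded by hand. Per ISO/IEC 14496-12, the payload of that box is a 4-byte version/flags field, a 32-bit `sample_size` and a 32-bit `sample_count`, followed by a per-sample size table only when `sample_size` is 0. A small sketch (`parse_stsz` is a name chosen for this example, not taken from isobmffdump):

```python
import struct


def parse_stsz(payload):
    """Return the list of sample sizes described by an 'stsz' box payload."""
    _version_flags, sample_size, sample_count = struct.unpack_from(">III", payload, 0)
    if sample_size != 0:
        # All samples share one size; no per-sample table follows.
        return [sample_size] * sample_count
    # Otherwise a table of sample_count 32-bit sizes follows the fixed fields.
    return list(struct.unpack_from(f">{sample_count}I", payload, 12))


if __name__ == "__main__":
    # The payload bytes shown in the dump above.
    payload = bytes.fromhex("00000000 0031ea2f 00000001")
    print(parse_stsz(payload))
```

For the bytes above this gives a single sample of 0x31ea2f = 3271215 bytes, so this first trak describes exactly one sample of roughly 3.1 MiB. Decoding the `stsz` and `co64` boxes of the other trak objects the same way would show how many samples mdat holds in total.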

lclevy commented Mar 13, 2018

There are 4 parts in mdat: 3 pictures (JPEG, plus 2 with new compression), then metadata like in tiff

clanmills commented Mar 13, 2018

@lclevy I do not agree that there is a tiff in the MDAT. Here's my evidence: Exiv2/exiv2#236 (comment)

We agree that there is at least one JPEG in the MDAT. I'm not convinced there are 3 pictures, however I have no proof to assert there is only 1 picture.

clanmills commented Mar 13, 2018

@lclevy Apologies. I am mistaken. There is a valid tiff in the MDAT and ExifTool reads it.

@LebedevRI Thanks for pointing this out. By coincidence I discovered and documented this before your email arrived. Two heads thinking of the same thing! That's better than no heads thinking! Exiv2/exiv2#236 (comment)

Steady progress. Great Teamwork. Thanks everybody.

lclevy commented Mar 13, 2018

updates: THMB and PRVW detailed. confirmed 3 jpeg and their sizes

clanmills commented Mar 13, 2018

Well done, @lclevy. Very impressive progress. It's possible I now have sufficient information to parse all the CR3 metadata and thumbnails in Exiv2.

lclevy commented Mar 29, 2018

please find parse_cr3.py ...

clanmills commented Apr 3, 2018

@lclevy Happy Easter. We were on vacation last week and now I'm home and would like to look at parse_cr3.py. Where have you put it?

lclevy commented Apr 3, 2018


clanmills commented Apr 3, 2018

Thanks. That worked first time. I'll study that. Thanks very much.

neo22s commented Jan 3, 2019

Hello,

Thanks for the job, and joining the conversation, any news on this?

I bought a Canon EOS R and I cannot use the RAW files. I work only on Ubuntu :(

Any temporary fix?

LebedevRI commented Jan 3, 2019

> Hello,
>
> Thanks for the job, and joining the conversation, any news on this?
>
> Bought a canon eos R

Aha, finally someone with that camera!
Please help, we need samples for https://raw.pixls.us/ :)

Details:
Pick some static scene (outdoor landscape, no people) and use a tripod.
There are a total of 2 * 2 * 4 = 16 samples needed (i think?), let me break it down:

  • There are two raw modes on that camera (i think?): RAW and C-RAW.
  • That camera has a dual-pixel mode, which can be disabled and enabled.
  • That camera can produce raws in 4 different aspect ratios.

We need full coverage of all that, so:
  • RAW, dual pixel off, 3:2
  • RAW, dual pixel off, 4:3
  • RAW, dual pixel off, 16:9
  • RAW, dual pixel off, 1:1
  • RAW, dual pixel on, 3:2
  • RAW, dual pixel on, 4:3
  • RAW, dual pixel on, 16:9
  • RAW, dual pixel on, 1:1
  • C-RAW, dual pixel off, 3:2
  • C-RAW, dual pixel off, 4:3
  • C-RAW, dual pixel off, 16:9
  • C-RAW, dual pixel off, 1:1
  • C-RAW, dual pixel on, 3:2
  • C-RAW, dual pixel on, 4:3
  • C-RAW, dual pixel on, 16:9
  • C-RAW, dual pixel on, 1:1
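
The 2 * 2 * 4 breakdown above is just the Cartesian product of the three settings, which can be generated rather than typed out. A small sketch; the file naming pattern in `suggested_name` is purely hypothetical (the thread only asks that files be renamed "appropriately"), so adapt it as you see fit.

```python
from itertools import product

RAW_MODES = ("RAW", "C-RAW")
DUAL_PIXEL = ("off", "on")
ASPECT_RATIOS = ("3:2", "4:3", "16:9", "1:1")


def sample_matrix():
    """Every (raw mode, dual pixel, aspect ratio) combination, in list order."""
    return list(product(RAW_MODES, DUAL_PIXEL, ASPECT_RATIOS))


def suggested_name(mode, dual_pixel, ratio, ext="CR3"):
    # Hypothetical naming pattern; ':' is replaced because it is awkward in file names.
    return f"{mode}_dualpixel-{dual_pixel}_{ratio.replace(':', 'x')}.{ext}"


if __name__ == "__main__":
    combos = sample_matrix()
    print(len(combos), "samples needed")
    for combo in combos:
        print(suggested_name(*combo))
```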

Please, do rename the files appropriately!!!
Afterwards, upload each one of these files separately(!!!) to https://raw.pixls.us/

> and I can no use the RAW files. I work only on ubuntu :(
>
> Any temporary fix?

neo22s commented Jan 3, 2019

@LebedevRI tomorrow you have it ;)

lclevy commented Jan 3, 2019

LebedevRI commented Jan 3, 2019

> @LebedevRI tomorrow you have it ;)

The upload form will say that there are samples already; do upload everything anyway.

LebedevRI commented Jan 3, 2019

> You'll have also information here: https://github.com/lclevy/canon_cr3/blob/master/readme.md#samples

Yes, i'm aware of those samples.

  1. they are not on RPU, therefore i don't/can't/won't use them in the integration testing of the library.
  2. They are not under CC0, therefore their usage is somewhat questionable
  3. We don't have CC0 sample set anyway, so we need one anyway

neo22s commented Jan 5, 2019

Hello,

All RAW files here properly named:
https://mega.nz/#!NsQRgCxT!vCh4EaU8FfPsXCl2ge4Ar4XGMhbgtjPMu8qBi6Gp2q0

I am uploading them one by one, but it gets stuck and I need to leave in 10 minutes.

LebedevRI commented Jan 5, 2019

No worries, please do finish uploading them when/if you have time.
Sadly, this isn't really a time-critical issue here..

neo22s commented Jan 5, 2019

Done! so what's next? :D

I really need this.. I am trying to make a living from photography and doing it only using Linux gets hard... ;)

LebedevRI commented Jan 5, 2019

Sadly, the best advice i can currently give is to build a chroot (e.g. with https://wiki.debian.org/Debootstrap), install wine there, and use Adobe DNG Converter to convert the raws to DNG for the time being, while keeping the originals.

neo22s commented Jan 5, 2019

LebedevRI added a commit to LebedevRI/canon_cr3 that referenced this issue Jan 7, 2019

seanmavley commented Jan 20, 2019

@neo22s
I have the M50, and I must say working with it has been a challenge. Only Adobe's software and Canon's DPP support it, and both apps work on Windows only. No Linux support.

Currently, my workflow involves running canon dpp in Windows, export all the files, then continue with darktable or gimp in Linux.

@LebedevRI I have the m50, in case samples would be needed from it. Over 5k of CRAW files to pick from in almost all forms of lighting conditions.

neo22s commented Jan 20, 2019

@seanmavley

Luckily for me I was barely using the RAW files. I do fashion photography, and I only needed the RAW files if I messed up badly (wrong white balance or really bad exposure).

If I were a wedding photographer (which I do sometimes) it would be a nightmare. I am ready to use Windows on those days if this does not go faster.

Such a pity :(

jlbee commented Feb 2, 2019

Any news on this? I use the EOS R, and can't open any of my RAW files. It kind of makes me regret trading in my previous camera since it had really good support. I submitted my pics to a repository I was linked to on Mastodon in December. I can't get Adobe or Canon's software to work on anything, and my desktop is too old to download anything to. I can do basic stuff in GIMP, but it just isn't the same.

LebedevRI commented Feb 2, 2019

I have not looked further into that yet.
If anyone wants to, feel free to work on it.

lclevy commented Feb 5, 2019

it seems Alexey is working on it
lclevy/canon_cr3#7

LebedevRI commented Feb 5, 2019

Nice!
Good spec will be better than working code (looking at you, fuji decompressor).

neo22s commented Feb 5, 2019

Alexey-Danilchenko commented Feb 5, 2019

@LebedevRI - yet working code just works and is miles better than empty statements. And if you really expect to have good specs in a world of proprietary raw formats, I think it is time to wake up and have a reality check

LebedevRI commented Feb 5, 2019

Working code (source code under an open license, not something else) is surely better than nothing.

That fuji remark was mostly in context of

lclevy/canon_cr3#7

which seemed like an improvement/contribution to the attempt to spec-ify the codec.
I omitted word "even" in "Good spec will be better", thus, having an idea of the state
of fuji decompressor, it does read as some variation on "better no code than that again".

I admit, i'm fond of the effort fuji decompressor took,
and i will freely admit that i'm not fully fond of the structure of that code.

In particular, i'm observing that the threading is per-input-tile, and there are 4..12 of them,
which results in load imbalance. So i've been wondering whether the algorithm can be
separated into stages, so that some of the decoding can be parallelized at a finer grain (per-row),
but so far i was unable to fully comb through it. Although i haven't really looked into that too deeply.
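
To put a rough number on the load-imbalance concern above, here is a back-of-the-envelope sketch. It rests on simplifying assumptions that are not from the decoder itself: all work items cost the same, there are 8 worker threads, and a per-row split would yield 4000 independent rows (in reality rows may depend on each other and tiles differ in cost). `parallel_efficiency` is a made-up helper.

```python
import math


def parallel_efficiency(work_items, workers):
    """Fraction of worker time spent busy when equal-cost items run in rounds."""
    rounds = math.ceil(work_items / workers)
    return work_items / (rounds * workers)


if __name__ == "__main__":
    workers = 8  # assumed thread count
    for items in (4, 12, 4000):  # 4..12 tiles versus an assumed 4000 rows
        eff = parallel_efficiency(items, workers)
        print(f"{items:5} items on {workers} workers: {eff:.0%} busy")
```

Under these assumptions 4 tiles keep 8 workers only 50% busy and 12 tiles 75% busy, while thousands of rows would keep them essentially fully busy, which is the motivation for wanting a finer-grained split.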

Oh well, hopefully this didn't degrade things even further :)

Thank you for looking into it!

Alexey-Danilchenko commented Feb 5, 2019

That format came from reversing just as in Fuji case header format for their data. Neither will give you the specifications of the original format or decoding algorithms. So whatever that is, it is definitely not an attempt to specify the codec, only the file format.

As with anything in reversing, unless you want to put effort into understanding the algorithms and rewriting the code, it will not improve on how it was coded by the original developers (Fuji, Canon etc.), and from what I have seen their coding leaves a lot to be desired. I certainly won't do it for you (I have spent enough spare time on it already), so perhaps it's a task you will take upon yourself (it will certainly come out better than speeches). In the meantime, I will release and license CR3 (when ready) to libraw only.

LebedevRI commented Feb 6, 2019

To be honest i'm not quite sure what prompted such harsh responses.
It was logical that fuji decompressor code was not produced from a good human-readable spec.

> That format came from reversing just as in Fuji case header format for their data.

Confirms that.

I have not (nor have i wanted to) insulted you, your coding/developing abilities, reversing abilities.
I have not demanded/requested/etc that you improve that reversed code.
I have only voiced my frustration with that current implementation, with how inefficient it is,
and voiced (false, apparently) hopes that in CR3 case, the approach would be different.

neo22s commented Feb 20, 2019

Hello!!

Any news about this? thanks :D

kznsq commented Mar 2, 2019

Hello!
The patent describes prediction coding with a K parameter (the Golomb-Rice coding base value), which is actually a power of 2. This K parameter is updated at each step. But where is the initial value?
Maybe here?
"19 bits value (3 right most bits of byte#9 + 16 bits at offset 10). Substracted to size at offset 4. Likely to compute exact meanful bits at end of the encoded stream, as size is rounded to 8 bits (observed values are 0 to 7). y"
https://github.com/lclevy/canon_cr3

lclevy commented Mar 2, 2019

if you use parse_cr3.py -v 3, the value is displayed. I see values 0, 1, 2, 3, 4, 5, 6, or 7.

A wavelet expert i discussed this with told me each level adds 2 bits.
"With this logic each stage adds two bits, so
level 1: 14-bit input -> 14+2 bit output, which is used for level 2
level 2: 16-bit input -> 16+2 bit output, which is used for level 3
level 3: 18-bit input -> 18+2 bit output".
so yes, these 19 bits might be related to the 20-bit output of subband level 3
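
The bit growth quoted above is simple enough to tabulate. This sketch just restates the expert's rule (each level adds 2 bits to a 14-bit input); the function name and its defaults are illustrative and are not taken from parse_cr3.py.

```python
def subband_bit_depths(input_bits=14, levels=3, growth=2):
    """Return [(level, input_bits, output_bits)] when each level adds `growth` bits."""
    depths = []
    bits = input_bits
    for level in range(1, levels + 1):
        depths.append((level, bits, bits + growth))
        bits += growth  # the output of one level is the input of the next
    return depths


if __name__ == "__main__":
    for level, bits_in, bits_out in subband_bit_depths():
        print(f"level {level}: {bits_in}-bit input -> {bits_out}-bit output")
```

With the defaults, level 3 ends at 20 bits, one more than the 19-bit field discussed here, so the link between the two remains a guess.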

kznsq commented Mar 2, 2019

As I can see, the encoder uses simple formula: (unaryCodedValue<<K)+KBits
"0063 Step 1: 0s equal to a number indicated by a value obtained by shifting the symbol S (binary expression) to the right by Kbits are arranged, and binary data is generated by adding 1 after these 0s. 0064 Step 2: Lower K bits of the symbol S are extracted and added after the binary data generated in step 1. "

Why are there 21 bits?
"00000000 00000000 00000000 00000000 00000000 001 = 42

00000 00100111 10100101 = 10149 (next 21th bits)"
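
The two patent steps quoted above translate almost directly into code. The sketch below works on strings of '0'/'1' for readability; `rice_encode` and `rice_decode` are illustrative names, and it deliberately leaves out the adaptive update of K and whatever mechanism produces the 21-bit field, since neither is spelled out in this thread.

```python
def rice_encode(symbol, k):
    """Step 1: (symbol >> k) zeros then a 1. Step 2: append the lower k bits."""
    bits = "0" * (symbol >> k) + "1"
    if k:
        bits += format(symbol & ((1 << k) - 1), f"0{k}b")
    return bits


def rice_decode(bits, k, pos=0):
    """Inverse of rice_encode: returns (symbol, position after the code)."""
    quotient = 0
    while bits[pos] == "0":  # count the unary zeros
        quotient += 1
        pos += 1
    pos += 1  # skip the terminating 1
    remainder = int(bits[pos:pos + k], 2) if k else 0
    return (quotient << k) + remainder, pos + k  # (unaryCodedValue << K) + KBits


if __name__ == "__main__":
    code = rice_encode(11, 2)
    print(code, rice_decode(code, 2))

    # Arithmetic check of the bit strings quoted above.
    prefix = "00000000 00000000 00000000 00000000 00000000 001".replace(" ", "")
    field = "00000 00100111 10100101".replace(" ", "")
    print(prefix.index("1"), len(field), int(field, 2))
```

The check at the end confirms the quoted numbers: the first string is 42 zeros followed by a 1, and the second is 21 bits long with value 10149. It does not explain why the field is 21 bits wide.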

lclevy commented Mar 2, 2019

please contact me lclevy at free dot fr, it will be easier
