Add support for extracting thumbnails #98

Merged
merged 11 commits into from
Jan 24, 2020

Conversation

letmaik
Owner

@letmaik letmaik commented Jan 21, 2020

Implements #96.

@letmaik
Owner Author

letmaik commented Jan 21, 2020

@harshithdwivedi @dtakeshta Could you give this a try? To download the wheels, click "Show all checks", pick the variant that matches your platform, and then click "Artifacts" on the right. For example, Windows 64-bit Python 3.7 is here: https://github.com/letmaik/rawpy/pull/98/checks?check_run_id=401958378

Usage looks like this:

            import rawpy

            with rawpy.imread('image.nef') as raw:
              thumb = raw.extract_thumb() # returns (format: enum, data: bytes) namedtuple
              assert thumb.format == rawpy.ThumbFormat.JPEG
              # binary write mode is required for the JPEG bytes
              with open('thumb.jpg', 'wb') as f:
                f.write(thumb.data)

Does this make sense?

@harshithdwivedi
Contributor

harshithdwivedi commented Jan 22, 2020

Hey @letmaik thanks for this!
I just tried this on my Windows machine. It works as expected, but I noticed some performance issues compared to the dcraw binary.

The time taken to extract the jpeg previews from 6 raw files is as follows:
rawpy : 5.94 secs
dcraw binary with -e : ~0.26 sec

I'm not sure if this overhead is added by python or if there's something else that's going on.
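To reproduce a comparison like the one above, a minimal timing sketch might look like the following. The helper names (`time_extraction`, `extract_with_rawpy`, `extract_with_dcraw`) are hypothetical and not part of rawpy, and the dcraw variant assumes a `dcraw` binary is available on the PATH.

```python
import subprocess
import time
from pathlib import Path
from typing import Callable, Iterable


def time_extraction(paths: Iterable[Path], extract: Callable[[Path], object]) -> float:
    """Return the total wall-clock seconds spent calling extract() on each path."""
    start = time.perf_counter()
    for path in paths:
        extract(path)
    return time.perf_counter() - start


def extract_with_rawpy(path: Path) -> bytes:
    # Imported lazily so the harness itself does not require rawpy (an assumption
    # made for this sketch, not something the PR prescribes).
    import rawpy

    with rawpy.imread(str(path)) as raw:
        return raw.extract_thumb().data


def extract_with_dcraw(path: Path) -> None:
    # Assumes a dcraw binary is on PATH; -e extracts the embedded thumbnail.
    subprocess.run(["dcraw", "-e", str(path)], check=True)
```

It could be called as `time_extraction(files, extract_with_rawpy)` and `time_extraction(files, extract_with_dcraw)` on the same list of files. Note that the dcraw figure includes process startup for every file, while the rawpy figure excludes the one-off import cost unless the first call triggers it.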

Also, if the embedded thumbnail doesn't exist, will extract_thumb() return None?

@harshithdwivedi
Contributor

harshithdwivedi commented Jan 22, 2020

I looked at the LibRaw API examples, and it looks like the sample code here emulates the behaviour of dcraw -e; using it might speed up the process.

https://www.libraw.org/docs/Samples-LibRaw.html#code

@letmaik
Owner Author

letmaik commented Jan 22, 2020

Thanks for trying this. I will have a look at performance. I assume it is because we unpack both the raw data and the thumbnail, whereas dcraw only does the latter.

If there's no thumbnail or the format is unsupported, an exception is raised. Currently there's no easy way to distinguish between the two cases, as with any other LibRaw error condition. If you need to handle them differently I could expose the error code via named constants in the exception object.

@harshithdwivedi
Contributor

If you need to handle them differently I could expose the error code via named constants in the exception object.

Yeah, that'd be great!
Thanks again.

@letmaik
Owner Author

letmaik commented Jan 22, 2020

I added support for bitmap thumbnails (which are returned as RGB ndarray) and expose libraw errors now. A semi-complete example from the docstring:

import imageio
import rawpy

with rawpy.imread('image.nef') as raw:
  try:
    thumb = raw.extract_thumb()
  except rawpy.LibRawNoThumbnailError:
    print('no thumbnail found')
  except rawpy.LibRawUnsupportedThumbnailError:
    print('unsupported thumbnail')
  else:
    if thumb.format == rawpy.ThumbFormat.JPEG:
      # thumb.data is already JPEG-encoded bytes, so write it in binary mode
      with open('thumb.jpg', 'wb') as f:
        f.write(thumb.data)
    elif thumb.format == rawpy.ThumbFormat.BITMAP:
      # thumb.data is an RGB ndarray
      imageio.imsave('thumb.tiff', thumb.data)

Let me know what you think.

@harshithdwivedi
Contributor

harshithdwivedi commented Jan 23, 2020

I just tried the updated wheel, and it seems to perform worse than the earlier one.
Unpacking 6 CR2 files is taking more than 10 secs now.

Here's the code I have:

    def raw_to_jpeg(self, raw_path, file_name, path_for_jpeg):

        # create the final destination
        new_file_path = pathlib.Path(path_for_jpeg).joinpath(
            "{}.jpg".format(file_name))

        try:
            # the context manager closes the raw file when done
            with rawpy.imread(raw_path) as raw:
                try:
                    thumb = raw.extract_thumb()
                    if thumb.format == rawpy.ThumbFormat.JPEG:
                        print("jpeg")
                        with open(new_file_path, 'wb') as f:
                            f.write(thumb.data)
                    elif thumb.format == rawpy.ThumbFormat.BITMAP:
                        print("bitmap")
                        # rawpy returns RGB, OpenCV expects BGR
                        cv2.imwrite(str(new_file_path),
                                    cv2.cvtColor(thumb.data, cv2.COLOR_RGB2BGR))
                except (rawpy.LibRawNoThumbnailError, rawpy.LibRawUnsupportedThumbnailError):
                    # no embedded thumb, so convert the raw to jpeg
                    print("None")
                    rgb = raw.postprocess()
                    cv2.imwrite(str(new_file_path),
                                cv2.cvtColor(rgb, cv2.COLOR_RGB2BGR))
        except rawpy._rawpy.LibRawNonFatalError:
            pass

I only see "jpeg" printed on my terminal, so the library is clearly extracting only the embedded JPEG.

@letmaik
Owner Author

letmaik commented Jan 23, 2020

I haven't looked at the performance issue yet, but my latest changes wouldn't have any impact. This was only about the bitmap support and errors.
BTW, you shouldn't use anything inside rawpy._rawpy. Just use rawpy.LibRawNonFatalError
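
Following that advice, a minimal sketch of the JPEG path that relies only on public rawpy names and closes the file through a context manager could look like this. The function name `save_preview` and its boolean return convention are hypothetical; the bitmap and postprocess() fallbacks are left out for brevity, and rawpy is imported inside the function only so that the sketch can be loaded without it installed.

```python
from pathlib import Path


def save_preview(raw_path, jpeg_path):
    """Write the embedded JPEG thumbnail of raw_path to jpeg_path.

    Returns True if a JPEG thumbnail was written, False otherwise.
    (Hypothetical helper, not part of rawpy.)
    """
    import rawpy  # lazy import: an assumption of this sketch, see lead-in

    try:
        # The context manager releases the underlying file when the block exits.
        with rawpy.imread(str(raw_path)) as raw:
            try:
                thumb = raw.extract_thumb()
            except (rawpy.LibRawNoThumbnailError,
                    rawpy.LibRawUnsupportedThumbnailError):
                return False
            if thumb.format != rawpy.ThumbFormat.JPEG:
                return False
            # thumb.data is already JPEG-encoded, so it is written unchanged.
            Path(jpeg_path).write_bytes(thumb.data)
            return True
    except rawpy.LibRawNonFatalError:
        # Public name, instead of rawpy._rawpy.LibRawNonFatalError.
        return False
```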

@harshithdwivedi
Contributor

Ok, got it!

@letmaik
Owner Author

letmaik commented Jan 23, 2020

@harshithdwivedi I fixed the performance issue. Can you try again?

@letmaik letmaik mentioned this pull request Jan 23, 2020
@harshithdwivedi
Contributor

Perfect! It works like a charm now.

@letmaik letmaik merged commit a52dd77 into master Jan 24, 2020
@letmaik letmaik deleted the thumb branch January 24, 2020 09:42