Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to read exif data unless loaded from path #745

Closed
dhensby opened this issue Jun 23, 2017 · 14 comments
Closed

Unable to read exif data unless loaded from path #745

dhensby opened this issue Jun 23, 2017 · 14 comments

Comments

@dhensby
Copy link
Contributor

dhensby commented Jun 23, 2017

At the momement it's only possible to read exif data from images loaded from a path.

exif_read_data supports the ability to load from base64 encoded data-urls. This means it is possible, in theory, to read exif data from images that are loaded through any of the init methods and not just those from file paths.

Would there be any support from the maintainers of this library to add exif reading capabilities to images loaded through means other than files?


related: #336

@dhensby
Copy link
Contributor Author

dhensby commented Jun 23, 2017

For clarity: I'm happy to work on the PR for this

@olivervogel
Copy link
Member

It would be nice if exif() would be able to read meta data from all images no matter if they were loaded from a path.

PR would be welcomed.

Be sure to follow:

  • Follow PSR-2 coding standards.
  • Write tests for new functions and added features
  • API calls should work consistently with both GD and Imagick drivers

@olivervogel
Copy link
Member

olivervogel commented Jun 24, 2017

I think the only problem is that, GD encoding does not retain EXIF data. So if you encode the image as data-url to reread it with exif_read_data all the original data will be lost.

@dhensby
Copy link
Contributor Author

dhensby commented Jun 26, 2017

The issues I can see coming are:

  1. Loading the image as a base64 encoded is going to mean loading the entire image into memory (given it's base64 it'll be 30% larger than the actual image
  2. as @olivervogel points out using the encoder to create the base64 encoded image data will have the exif already stripped. This means we'll need to do one of the following:
    1. need to store a reference to the original image data (or location if it's on the file system)
    2. pre-fetch the exif data when the image is initialised
    3. Save the image to the temporary directory on init and then load from that path when needed

Are any of these specific blockers?

@dhensby
Copy link
Contributor Author

dhensby commented Jun 27, 2017

OK - so from some testing it's possible to read the exif headers by base64 encoding just a portion of the binary data. eg:

$headerData = stream_get_contents($assetContainer->getStream(), 10240);
$exif = exif_read_data ( "data://image/jpeg;base64," . base64_encode($headerData) , 0 , true );

So it may be possible to either just hold onto a portion of the data for exif reading or we can prefetch exif data for the image but by encoding just a portion of the binary data upon initialisation.

@olivervogel
Copy link
Member

What is $assetContainer in your example?

@dhensby
Copy link
Contributor Author

dhensby commented Jun 28, 2017

Oh, sorry, that's just an abstraction layer for assets. Wherever the binary data comes from we can just read the first N bytes and pass that to exif_read_data as a base64 encoded string and extract the headers successfully.

How do you feel about extracting the exif meta data and storing it in a meta property on images when initiating the image object? I'm not sure theirs any other way of making it work without constant reference to the original image data.

@olivervogel
Copy link
Member

olivervogel commented Jun 29, 2017

The following overview shows, how to handle the different input types. There are actually five methods. Methods 1-2 can read exif data smoothly. Method 3-5 are problematic, because we need run additional resource intensive encoding, which may not be wanted. I'm not sure, if this should be the default on every init.

  1. initFromPath()

    Handles the following input:

    • Path of the image in filesystem.
    • SplFileInfo instance

    Can be read directly by exif_read_data without encoding.

  2. initFromBinary()

    Handles the following input:

    • Binary image data.
    • URL of an image
    • Stream
    • Data-URL encoded image data
    • Base64 encoded image data

    Can be read as base64 encoded string by exif_read_data.

  3. initFromGdResource()

    Handles the following input:

    • PHP resource of type gd

    Must be encoded as JPEG and base64 before read by exif_read_data.

  4. initFromImagick()

    Handles the following input:

    • Imagick instance

    Must be encoded as JPEG and base64 before read by exif_read_data.

  5. initFromInterventionImage()

    Handles the following input:

    • Intervention\Image\Image instance

    Must be encoded as JPEG and base64 before read by exif_read_data.

@dhensby
Copy link
Contributor Author

dhensby commented Jun 29, 2017

Thanks for the overview. I'd like some guidance on the likelihood of you accepting an approach of caching the exit data on initialisation before I spend time writing on an approach you don't like

@tractorcow
Copy link

I have been following and investigating this issue, and was looking to suggest something much much along these lines.

I would suggest a setExif / getExif on Image class, which would allow this data to be assigned on init. ExifCommand would instead simply return getExif(). It would be the responsibility of the Decoder to invoke setExif based on the functionality you describe.

I would accept that there are cases where exif simply isn't accessible, but we should certainly attempt to capture it in cases where it could be, but currently isn't (e.g. streams).

@mstaack
Copy link

mstaack commented Feb 28, 2018

are there any infos regarding exif data from non-file image instances?

@sb-relaxt-at
Copy link

Since PHP 7.2.0 it is possible. But from my understanding Intervention always converts the stream (in AbstractDecoder->initFromStream), therefore the ExifCommand cannot access the stream and is not able to read the EXIF data.

@tractorcow
Copy link

It sounds like a combination of using PHP 7.2 exif reading from streams, and shifting some of the responsibility for exif init to the initFrom* methods could work.

  • Add getExif / setExif to Image class
  • initFromStream reads exif data, and calls setExif on result of initFromBinary()
  • ExifCommand checks getExif() for data, failing over to reading from file path otherwise.
  • Update other initFrom* as necessary / feasible.

Caveats:

  • PHP 7.1 and below would reject streams (either bump requirements, or soft-support if >7.2)
  • non-seekable streams would not load exif
  • Additional API necessary; Will need to investigate possible breakage of API.
  • Does not guarantee exif will always be loaded, only increases likelihood.

esperecyan added a commit to esperecyan/dictionary-php that referenced this issue Jan 25, 2020
ファイルパスではなくバイナリデータを読み込むと、Exif情報が失われる問題 Intervention/image#745 への対処を含む。
olivervogel added a commit that referenced this issue Jul 5, 2021
Fixed exif data from non image path (close #343, close #745, close #764)
@tractorcow
Copy link

Nice work @olivervogel

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants