Make image metadata available to the processors. #625
Comments
There was a discussion on creating a Adding IPTC and XMP metadata would make a lot of sense. |
+1 |
This turned out to be more difficult than I thought. There is a Python XMP Toolkit that can read and write XMP metadata to and from an image file. Unfortunately, it only works with image files, not image buffers. The limitation seems to come from the underlying C library, Exempi. For now, the only way to extract XMP metadata from an image buffer is to do it by hand. I have created a prototype implementation for JPEG images and the PIL engine here: https://github.com/sbaechler/thumbor/commit/3cb53677870029ca09045f0ebfb03693a08899ee The XMP data can then be further processed by the XMP toolkit. My ultimate goal is to be able to write XMP data as well. If we used the Media object and the FileStorage, it would be possible to pass the file path along to a filter that would modify the metadata. |
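For readers following along, here is a minimal sketch of what extracting XMP "by hand" from a JPEG buffer can look like. It is not the code from the linked commit. It assumes the XMP packet sits in a single APP1 segment identified by the standard namespace header `http://ns.adobe.com/xap/1.0/` followed by a NUL byte, and it ignores Extended XMP and length-less markers. The function name `extract_xmp_packet` is made up for this illustration.

```python
import struct

# Identifier that prefixes the XMP packet inside a JPEG APP1 segment.
XMP_HEADER = b"http://ns.adobe.com/xap/1.0/\x00"


def extract_xmp_packet(buffer):
    """Return the raw XMP packet (bytes) found in a JPEG buffer, or None."""
    if buffer[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        return None
    pos = 2
    while pos + 4 <= len(buffer):
        if buffer[pos] != 0xFF:
            return None  # not positioned on a marker: malformed input
        marker = buffer[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        if marker in (0xD9, 0xDA):  # EOI or SOS: metadata segments are over
            return None
        # The two-byte big-endian length includes the length field itself.
        (length,) = struct.unpack(">H", buffer[pos + 2:pos + 4])
        payload = buffer[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(XMP_HEADER):
            return payload[len(XMP_HEADER):]
        pos += 2 + length
    return None
```

The returned bytes are the XML packet itself, which could then be handed to an XMP parser of your choice.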
If that framework only deals with files, why not dump the buffer into a temp file? |
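As a rough illustration of this suggestion, the sketch below writes the buffer to a temporary file and hands the path to whatever file-only library is in use. The helper name `buffer_as_tempfile` and the default `.jpg` suffix are assumptions made for the example, not part of Thumbor.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def buffer_as_tempfile(buffer, suffix=".jpg"):
    """Write an image buffer to a temp file and yield its path.

    The file is removed again when the with-block exits.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(buffer)
        yield path  # pass this path to the file-based metadata library
    finally:
        os.remove(path)
```

Usage would look like `with buffer_as_tempfile(buffer) as path: ...`. The cost is one extra write and read per image, which is the additional I/O mentioned elsewhere in this thread.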
@sbaechler the current filters and optimizers receive a tmp file buffer, already on the filesystem. |
@masom Maybe that temp file won't be available once the Media object is used. The best thing about Pyexiv is that it supports not only XMP but also IPTC and Exif. It can also convert all data into Python objects. https://github.com/sbaechler/thumbor/commit/060f0e6ed70e3fbbe67b2339f63b0a2e4f47bb44 I added the code to the Engine class. Maybe it is better to keep it in a separate module. That way any filter that needed to access the metadata would have to instantiate the Metadata class, but the code itself would be independent from the engine (or the Media object). |
A lot of tools don't work well with STDIN / STDOUT. ffmpeg for instance will stutter and produce weird output if the input file is STDIN. |
That's because by default ffmpeg is interactive while it runs. You can press a key to abort its processing. There's an ffmpeg option to turn that off, though. |
Actually maybe not an option, but I remember that there's a way to work around that problem. |
I don't think hard dependencies are an issue as long as this is a plugin
|
The metadata can only be extracted from the raw buffer, not the The metadata extraction is only done if the library is installed. Another option is to extract the metadata from the temp file in the filters. The downside of this is that it creates additional I/O. |
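The "only if the library is installed" behaviour can be sketched as a guarded import. This is an assumption about the general pattern, not the code that was merged. The names `load_optional` and `extract_metadata` are hypothetical, and the actual parsing call is left to a caller-supplied function because it depends on the library used.

```python
import importlib


def load_optional(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def extract_metadata(buffer, module_name, parse):
    """Run parse(module, buffer) only when the optional library exists.

    `parse` is supplied by the caller, since the real call depends on
    which metadata library is in use.
    """
    module = load_optional(module_name)
    if module is None:
        return None  # library missing: skip extraction silently
    return parse(module, buffer)
```

With this shape the metadata library stays a soft dependency: installations without it simply get `None` instead of an import error.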
Fixed in #661 |
@sbaechler Does this fix only provide metadata to the engine? Is there still more work to do to have IPTC data persist on output images? |
Metadata (not just Exif, but also IPTC and XMP data) can contain information that could be used by Thumbor.
As far as I can see, PIL can only read Exif data. This would mean a CLI tool such as exiftool would have to be used. Exif and IPTC are key-value based. XMP has a tree structure.
Possible use cases would be:
You could assign the issue to me. I'll try to implement a solution within the next few weeks.