Metadata: Embed XMP metadata in JPEG files #243

tmb80c · 2020-02-05T08:00:33Z

Hi,

I came across this exciting software. I'm not sure whether the following is by design, a missing feature, or a bug.

I uploaded some pictures (tried with and without a sidecar file). The pictures were exported from Lightroom and already included a picture name/title and some keywords. After the import I could see the EXIF data in photoprism, but not the title or keywords from the picture file. I think title and keywords are not EXIF data but other metadata, which I can see under Details in Windows File Explorer. It looks like photoprism only imports EXIF into its database, not the title and keywords, although the imported picture file still includes both.

For example, one picture was a macro of a ladybug. The picture name/title (not the filename) was "ladybug" and the picture description included two keywords, "ladybug" and "Insect". Indexing identified it correctly as a beetle with 96% confidence - cool, great work! After indexing, the title was set to "Beetle", and under labels I could only see "Beetle". I would have expected the import to use the picture name as the photoprism title and to list the imported keywords under labels as "manual" or "imported", with "Beetle" added as a result of indexing.

Another user expectation would be that photoprism uses the keywords from the imported picture to eliminate false positives. That is, if the file already includes a keyword that matches a label or category in photoprism, this information should help the indexing, in particular when the confidence is low.

If the above cannot be implemented, then reindexing should at least not override manually edited labels and titles.
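One way to keep reindexing from overwriting manual edits is to store the origin of each value next to the value itself. The following is a minimal sketch of that idea, not taken from photoprism's code: the `Source` type, its constants, and the `Field` struct are hypothetical names, and the priority order (classifier below embedded metadata below manual edit) is an assumption based on the expectations described above.

```go
package main

import "fmt"

// Source records where a metadata value came from. The names and their
// ordering are assumptions for illustration only.
type Source int

const (
	SrcAuto   Source = iota // generated by image classification
	SrcMeta                 // read from embedded metadata (Exif, XMP)
	SrcManual               // edited by the user
)

// Field is a value together with the source that set it.
type Field struct {
	Value string
	Src   Source
}

// Update replaces the value only if the new source has at least the same
// priority as the current one, so reindexing cannot override manual edits.
// It reports whether the value was changed.
func (f *Field) Update(value string, src Source) bool {
	if value == "" || src < f.Src {
		return false
	}
	f.Value, f.Src = value, src
	return true
}

func main() {
	var title Field
	title.Update("Beetle", SrcAuto)  // first indexing run
	title.Update("Ladybug", SrcMeta) // embedded title beats the classifier
	title.Update("Beetle", SrcAuto)  // reindexing: ignored
	fmt.Println(title.Value)         // prints "Ladybug"
}
```

The same check could apply to labels, with imported keywords stored under the metadata source and classifier results under the automatic one.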

This is my first contribution to GitHub.

lastzero · 2020-02-05T15:14:20Z

Thanks for your feedback! This absolutely makes sense. We import information from EXIF and, to a certain extent, also from XMP. It would be good to know which fields are actually used (could be "description", but XMP also has a DC title field that Lightroom uses). We should also add a "keywords" field to our database (need to figure out which EXIF/XMP field this is as well).

tmb80c · 2020-02-05T17:25:44Z

I loaded the picture of the ladybug into the tool "Get IPTC Photo Metadata" from the IPTC organisation. The relevant XMP fields are "title" and "keywords". If title could be loaded into photoprism's title field, that would be of great help. At the moment, reindexing overwrites the title, which is not what the user wants once the title has been manually updated.

With the keywords used as labels I assume you can improve the tagging a lot. The system could learn from imported pictures.

In another thread I saw a discussion regarding face recognition. Let's assume the keyword field already includes the name of the person...

lastzero · 2020-02-05T17:35:38Z

Not difficult to implement, let's do this. Also need the keywords field for words extracted from file names so that the user can see and edit them. Thanks for the test files!

lastzero · 2020-02-07T13:38:56Z

We currently don't index Exif.Image.XPTitle and other fields starting with XP, probably because they are not included in the base standard, @dsoprea?

Added: Subject, Keywords, Comment, CameraOwner and CameraSerial
Todo: Read values from Exif.Image.XPTitle, XPSubject, XPKeywords,...
Signed-off-by: Michael Mayer <michael@liquidbytes.net>

lastzero · 2020-02-07T13:41:45Z

Added a test for this. Hope it was OK to use the Ladybug as example image!

Signed-off-by: Michael Mayer <michael@liquidbytes.net>

lastzero · 2020-02-07T16:39:40Z

Need to get ready for our journey now, hope this will do for now. Code is prepared to index additional Exif fields once we get them from our meta package. Hope @dsoprea can help with that.

tmb80c · 2020-02-07T17:07:03Z

Added a test for this. Hope it was OK to use the Ladybug as example image!

Sure, please!

dsoprea · 2020-02-07T20:51:14Z

How does the EXIF parsing depend on the 'meta' package: "is prepared to index additional Exif fields once we get them from our meta package."? Is it some kind of dynamic binding defined in the DB?

dsoprea · 2020-02-07T20:52:39Z

Non-standard stuff can be indexed. We preload the standard tags at the top of the process, but can readily add more.

lastzero · 2020-02-08T06:54:55Z

How does the EXIF parsing depend on the 'meta' package: "is prepared to index additional Exif fields once we get them from our meta package."? Is it some kind of dynamic binding defined in the DB?

We parse it there using your Exif library so that our indexer can read from the Data struct, independent of where the data came from.

lastzero · 2020-02-08T07:11:30Z

Non-standard stuff can be indexed. We preload the standard tags at the top of the process, but can readily add more.

That's what I thought but failed to figure out how yesterday... Do you have an example or can send a PR for our meta package?

dsoprea · 2020-02-08T09:06:57Z

#243 (comment)

I was in the car at the time. I had forgotten that this was the name of the package that hosts go-exif.

#243 (comment)

It'd be here:

photoprism/internal/meta/exif.go

Line 77 in c13e39e

You'd insert something like:

// If nothing is loaded, this will be implicitly loaded at first access. 
// However, since we're about to intervene and add one, we'll become responsible 
// for loading the whole set.
exif.LoadStandardTags(ti)

it := &exif.IndexedTag{
    // The IFD that it is found in.
    IfdPath: exif.IfdPathStandardExif,

    // Its ID.
    Id: 0x1234,

    // A human-friendly name.
    Name: "SomeName",

    // The type of the data.
    Type: exifcommon.TypeShort,
}

err = ti.Add(it)
if err != nil {
    log.Errorf("exif: %s", err.Error())
    return nil
}

Let me know if you want me to help.

dsoprea · 2020-02-08T09:07:44Z

<- Note that you'll have to import github.com/dsoprea/go-exif/v2/common.

lastzero · 2020-02-08T09:25:50Z

Thank you! I'm on vacation for a week, pull requests welcome. I'll see what I can do while on the train. Already added a test image.

lastzero · 2020-02-08T11:39:22Z

Wow, looks like Adobe somehow managed to add XMP / Dublin Core data to the JPEG without using a sidecar file. That's why our Exif parser doesn't find it!

<rdf:Description rdf:about=''
  xmlns:dc='http://purl.org/dc/elements/1.1/'>
  <dc:creator>
   <rdf:Seq>
    <rdf:li>Photographer: TMB</rdf:li>
   </rdf:Seq>
  </dc:creator>
  <dc:format>image/jpeg</dc:format>
  <dc:subject>
   <rdf:Bag>
    <rdf:li>Ladybug</rdf:li>
   </rdf:Bag>
  </dc:subject>
  <dc:title>
   <rdf:Alt>
    <rdf:li xml:lang='x-default'>Ladybug</rdf:li>
   </rdf:Alt>
  </dc:title>
</rdf:Description>

So what we need here is the XMP support we started working on plus a way to extract this data from a JPEG. In Exif, ImageDescription is the right field to store the title of an image. There is no Title field.
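Once the XMP packet has been extracted, the Dublin Core fields shown above can be read with Go's standard `encoding/xml` package. This is only a sketch under assumptions: the `DublinCore` struct and `ParseDublinCore` function are hypothetical names, the sample packet wraps the snippet above in an `rdf:RDF` element so that the `rdf` prefix is declared, and the first `dc:title` alternative is used without checking `xml:lang`.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
)

const rdfNS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

// description maps the Dublin Core properties of one rdf:Description.
// Path tags match local element names only, so prefixes are not checked.
type description struct {
	Titles   []string `xml:"title>Alt>li"`
	Subjects []string `xml:"subject>Bag>li"`
	Creators []string `xml:"creator>Seq>li"`
}

// DublinCore holds the values collected from all rdf:Description elements.
type DublinCore struct {
	Title    string
	Keywords []string
	Creators []string
}

// ParseDublinCore walks the packet and decodes every rdf:Description,
// regardless of wrapper elements such as x:xmpmeta.
func ParseDublinCore(packet []byte) (DublinCore, error) {
	var dc DublinCore
	dec := xml.NewDecoder(bytes.NewReader(packet))
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return dc, nil
		}
		if err != nil {
			return dc, err
		}
		start, ok := tok.(xml.StartElement)
		if !ok || start.Name.Space != rdfNS || start.Name.Local != "Description" {
			continue
		}
		var d description
		if err := dec.DecodeElement(&d, &start); err != nil {
			return dc, err
		}
		if dc.Title == "" && len(d.Titles) > 0 {
			dc.Title = d.Titles[0] // first language alternative
		}
		dc.Keywords = append(dc.Keywords, d.Subjects...)
		dc.Creators = append(dc.Creators, d.Creators...)
	}
}

const sample = `<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<rdf:Description rdf:about='' xmlns:dc='http://purl.org/dc/elements/1.1/'>
  <dc:creator><rdf:Seq><rdf:li>Photographer: TMB</rdf:li></rdf:Seq></dc:creator>
  <dc:subject><rdf:Bag><rdf:li>Ladybug</rdf:li></rdf:Bag></dc:subject>
  <dc:title><rdf:Alt><rdf:li xml:lang='x-default'>Ladybug</rdf:li></rdf:Alt></dc:title>
</rdf:Description>
</rdf:RDF>`

func main() {
	dc, err := ParseDublinCore([]byte(sample))
	if err != nil {
		fmt.Println("xmp:", err)
		return
	}
	fmt.Printf("title=%q keywords=%q creators=%q\n", dc.Title, dc.Keywords, dc.Creators)
}
```

A dedicated XMP library would handle more of the data model (language alternatives, other namespaces); the sketch only covers the three fields discussed here.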

lastzero · 2020-02-08T11:45:09Z

For embedding XMP metadata in JPEG files, see https://wwwimages2.adobe.com/content/dam/acom/en/devnet/xmp/pdfs/XMP%20SDK%20Release%20cc-2016-08/XMPSpecificationPart3.pdf

@dsoprea Any idea how we can implement this elegantly?

lastzero · 2020-02-08T11:55:51Z

@tmb80c Try using an XMP sidecar file instead for now. Is this possible?

lastzero · 2020-02-08T12:12:42Z

Apparently go-xmp has a method xmp.ScanPackets(io.Reader) which we could try, see trimmer-io/go-xmp#1
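To illustrate what packet scanning means in principle, here is a standard-library sketch that searches a byte slice for the `<?xpacket begin=` and `<?xpacket end=` processing instructions that delimit an XMP packet. It is not go-xmp's implementation and makes no claim about how `xmp.ScanPackets` works internally; `ScanXMPPackets` is a hypothetical name, and the sketch assumes UTF-8 packets that are stored contiguously.

```go
package main

import (
	"bytes"
	"fmt"
)

var (
	packetBegin = []byte("<?xpacket begin=")
	packetEnd   = []byte("<?xpacket end=")
	piClose     = []byte("?>")
)

// ScanXMPPackets returns every byte range that starts with an
// "<?xpacket begin=" instruction and ends with the closing "?>" of the next
// "<?xpacket end=" instruction. The container format is not interpreted.
func ScanXMPPackets(data []byte) [][]byte {
	var packets [][]byte
	for {
		start := bytes.Index(data, packetBegin)
		if start < 0 {
			return packets
		}
		data = data[start:]
		end := bytes.Index(data, packetEnd)
		if end < 0 {
			return packets
		}
		closing := bytes.Index(data[end:], piClose)
		if closing < 0 {
			return packets
		}
		stop := end + closing + len(piClose)
		packets = append(packets, data[:stop])
		data = data[stop:]
	}
}

func main() {
	file := []byte("\xff\xd8...<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>" +
		"<x:xmpmeta/><?xpacket end='w'?>...\xff\xd9")
	for _, p := range ScanXMPPackets(file) {
		fmt.Printf("found packet, %d bytes\n", len(p))
	}
}
```

Scanning like this works on any file type that embeds a packet, at the cost of ignoring the file structure.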

dsoprea · 2020-02-08T14:38:11Z

@lastzero

There seems to be a lot of JPEG talk for being an XMP document. The XMP data is just in another segment? What is it that you're concerned won't be done well/elegantly? Seems like it would just be a simple enumeration of the JPEG segments (which can be done via go-jpeg-image-structure) and to just scan/grab/test the one specified in the text above, no?
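The segment enumeration described here can be sketched with the standard library alone. This is an illustration, not the go-jpeg-image-structure API: `ExtractXMP` and `segment` are hypothetical names. The sketch relies on the JPEG layout (an SOI marker followed by segments that each carry a two-byte big-endian length) and on the namespace header `http://ns.adobe.com/xap/1.0/` plus a NUL byte that the XMP specification places at the start of the APP1 payload. Extended XMP, which spreads large packets over several segments, is ignored.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	markerSOI  = 0xD8 // start of image
	markerEOI  = 0xD9 // end of image
	markerSOS  = 0xDA // start of scan: compressed image data follows
	markerAPP1 = 0xE1 // used by both Exif and XMP
)

var (
	xmpHeader = []byte("http://ns.adobe.com/xap/1.0/\x00")
	errNoXMP  = errors.New("no XMP segment found")
)

// ExtractXMP walks the segments before the image data and returns the
// payload of the first APP1 segment that starts with the XMP header.
func ExtractXMP(jpeg []byte) ([]byte, error) {
	if len(jpeg) < 2 || jpeg[0] != 0xFF || jpeg[1] != markerSOI {
		return nil, errors.New("not a JPEG: missing SOI marker")
	}
	pos := 2
	for pos+4 <= len(jpeg) {
		if jpeg[pos] != 0xFF {
			return nil, fmt.Errorf("expected marker at offset %d", pos)
		}
		marker := jpeg[pos+1]
		if marker == 0xFF { // fill byte before the actual marker
			pos++
			continue
		}
		if marker == markerSOS || marker == markerEOI {
			break
		}
		// The length field counts itself but not the two marker bytes.
		size := int(binary.BigEndian.Uint16(jpeg[pos+2:]))
		if size < 2 || pos+2+size > len(jpeg) {
			return nil, fmt.Errorf("truncated segment at offset %d", pos)
		}
		payload := jpeg[pos+4 : pos+2+size]
		if marker == markerAPP1 && bytes.HasPrefix(payload, xmpHeader) {
			return payload[len(xmpHeader):], nil
		}
		pos += 2 + size
	}
	return nil, errNoXMP
}

// segment builds a marker segment for the demo below.
func segment(marker byte, payload []byte) []byte {
	n := len(payload) + 2
	return append([]byte{0xFF, marker, byte(n >> 8), byte(n)}, payload...)
}

func main() {
	img := []byte{0xFF, markerSOI}
	img = append(img, segment(markerAPP1, []byte("Exif\x00\x00"))...)
	img = append(img, segment(markerAPP1, append(append([]byte{}, xmpHeader...), "<x:xmpmeta/>"...))...)
	img = append(img, 0xFF, markerSOS)

	packet, err := ExtractXMP(img)
	fmt.Printf("%s %v\n", packet, err)
}
```

The Exif data lives in an APP1 segment as well, distinguished only by its `Exif` header, which is why a parser that looks only for Exif never sees the XMP packet.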

tmb80c · 2020-02-08T14:43:12Z

@tmb80c Try using an XMP sidecar file instead for now. Is this possible?

I'm not using sidecar files in Lightroom.

lastzero · 2020-02-08T14:48:18Z

@lastzero

There seems to be a lot of JPEG talk for being an XMP document. The XMP data is just in another segment? What is it that you're concerned won't be done well/elegantly? Seems like it would just be a simple enumeration of the JPEG segments (which can be done via go-jpeg-image-structure) and to just scan/grab/test the one specified in the text above, no?

Possible, didn't try. I'm on vacation and on a train, that's as far as I got... By go-jpeg, do you mean the built-in JPEG lib that comes with Go? You're probably doing something similar to get the Exif data.

dsoprea · 2020-02-08T15:35:14Z

go-jpeg-image-structure is my project, which I used from Photoprism to parse JPEGs and extract EXIF (which there are convenience functions for). I'll try to do it in the next couple of days. I'm fairly highly utilized at the moment.

Signed-off-by: Michael Mayer <michael@liquidbytes.net>

lastzero self-assigned this Feb 5, 2020

lastzero added the enhancement Refactoring, improvement or maintenance task label Feb 5, 2020

lastzero added this to the MVP milestone Feb 5, 2020

lastzero added the in-progress Somebody is working on this label Feb 7, 2020

lastzero added a commit that referenced this issue Feb 7, 2020

Backend: Index Keywords, Subject and Artist #243

c583d7e

Signed-off-by: Michael Mayer <michael@liquidbytes.net>

lastzero added a commit that referenced this issue Feb 7, 2020

Frontend: Set Modified* flags #243

5fba038

Signed-off-by: Michael Mayer <michael@liquidbytes.net>

lastzero changed the title ~~Import does not read keywords nor title of an uploaded picture~~ Support for embedding XMP metadata in JPEG files May 7, 2020

lastzero added important and removed in-progress Somebody is working on this labels May 7, 2020

lastzero removed this from the MVP milestone May 7, 2020

lastzero added a commit that referenced this issue May 13, 2020

Backend: Read from JSON sidecar files (created by exiftool) #4 #243

5f408f4

Signed-off-by: Michael Mayer <michael@liquidbytes.net>

lastzero added a commit that referenced this issue May 13, 2020

Docker: Enable JSON sidecar files on demo.photoprism.org #4 #243

2ca1ff6

Signed-off-by: Michael Mayer <michael@liquidbytes.net>

lastzero added a commit that referenced this issue May 13, 2020

Docker: Add PHOTOPRISM_SIDECAR_JSON to example config #4 #243

011fda3

Signed-off-by: Michael Mayer <michael@liquidbytes.net>

lastzero added a commit that referenced this issue May 13, 2020

Backend: Read JSON from sidecar file #4 #243

55819db

Signed-off-by: Michael Mayer <michael@liquidbytes.net>

lastzero mentioned this issue May 16, 2020

Date not recognized / Date from picture name #304

Closed

graciousgrey changed the title ~~Support for embedding XMP metadata in JPEG files~~ Metadata / Embed XMP metadata in JPEG files Nov 26, 2020

graciousgrey changed the title ~~Metadata / Embed XMP metadata in JPEG files~~ Metadata: Embed XMP metadata in JPEG files Jan 5, 2021

graciousgrey added the needs-analysis Requires further investigation label Nov 3, 2021

lastzero added priority Supported by early sponsors or popular demand and removed important labels Dec 14, 2023
