Tag export: "omit hierarchy" behavior can't actually be disabled #3206

Closed
junkyardsparkle opened this issue Oct 24, 2019 · 23 comments

Comments

@junkyardsparkle
Contributor

junkyardsparkle commented Oct 24, 2019

EDIT: the actual issue has been summarized below, since I was initially confused about the new intended behavior.

Describe the bug

When exporting an image which has hierarchical tags applied, and the "omit hierarchy" checkbox is NOT selected in the export settings, all tags within the branch are still not included as individual tags.

To Reproduce

Start darktable with new profile, import an image, add a tag such as 'foo|blah|baz|bar', export image using default JPEG settings. Observe XMP Subject field in exported file contains only 'bar'. No combination of options selected in export settings would change this for me.

Expected behavior

Include all elements of hierarchical tags when "omit hierarchy" is not selected.

Platform (please complete the following information):

Linux, darktable 2.7.0+2075, exiv2 0.26 (XMP Core 4.4.0-Exiv2)

Additional context

In fact, I was having more complicated, harder-to-reproduce issues under real-world conditions when trying to export images with multiple hierarchical tags. In some cases, only the first and last element of the "path" were included. In a few cases, all were. None of these tags has been set as "private" or "category" or any other special class, as far as I can tell (and selecting those boxes didn't help). This is my attempt to provide a simple test that can be reproduced, in the hopes that it will be enough to start an investigation. ;-)

If anyone can NOT reproduce this, could you tell me which exiv2 version you built against? Thanks!

@junkyardsparkle
Contributor Author

junkyardsparkle commented Oct 25, 2019

I finally updated my system to exiv2 0.27.2, and it didn't fix the issue.

Here's a more methodical procedure I just followed on a clean, untouched profile:
Import image (with existing XMP sidecar having tags 'something' and 'foo|blah|baz|bar'); tags are displayed correctly in module view. Without changing default settings in export module, export image. The JPEG contains neither 'Subject' nor 'Hierarchical Subject' XMP tags, even though 'tags' checkbox is selected by default.

Select the 'Hierarchical tags' checkbox and export again. JPEG now has both tags, but 'Subject' tag contains only 'bar', 'exported' and 'something' keywords, even though 'omit hierarchy' is unselected by default. The 'Hierarchical Subject' tag contains all full keywords as expected.

Select the 'omit hierarchy' checkbox, export, same results as above.

Select the 'synonyms' checkbox, export, same results as above.

Select the 'private tags' checkbox, export, same results as above.
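
For anyone who wants to compare the two fields without eyeballing tool output, here is a minimal Python sketch that pulls the flat and hierarchical keyword lists out of an XMP packet. It assumes the packet has already been extracted from the JPEG as a string (for example with exiftool or exiv2), and that the keywords are stored as `rdf:Bag` lists under `dc:subject` and `lr:hierarchicalSubject`. The function name `read_keywords` and the sample packet are made up for illustration; the sample only mimics the result described above.

```python
import xml.etree.ElementTree as ET

# Namespaces used by the two keyword fields discussed in this issue.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "lr": "http://ns.adobe.com/lightroom/1.0/",
}


def read_keywords(xmp_packet):
    """Return the flat and hierarchical keyword lists found in an XMP packet."""
    root = ET.fromstring(xmp_packet)

    def bag(field):
        # Collect the text of every rdf:li item inside the field's rdf:Bag.
        return [li.text for li in root.findall(f".//{field}/rdf:Bag/rdf:li", NS)]

    return {
        "Subject": bag("dc:subject"),
        "HierarchicalSubject": bag("lr:hierarchicalSubject"),
    }


# Hypothetical packet, shaped like the export result reported above.
SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:lr="http://ns.adobe.com/lightroom/1.0/">
   <dc:subject><rdf:Bag>
    <rdf:li>bar</rdf:li><rdf:li>something</rdf:li>
   </rdf:Bag></dc:subject>
   <lr:hierarchicalSubject><rdf:Bag>
    <rdf:li>foo|blah|baz|bar</rdf:li><rdf:li>something</rdf:li>
   </rdf:Bag></lr:hierarchicalSubject>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""

keywords = read_keywords(SAMPLE)
print(keywords["Subject"])              # ['bar', 'something']
print(keywords["HierarchicalSubject"])  # ['foo|blah|baz|bar', 'something']
```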

@junkyardsparkle
Contributor Author

@phweyland Any thoughts on this? It seems like a potentially nasty bug that can bite people without them realizing it until later (such as after uploading a bunch of files somewhere, which happened to me).

@phweyland
Contributor

Just seen your issue. I'll look at this.
With dt 2.7.0+2055~g36f87d176 that still works for me (and Exiv2 0.27.2-2).

Which "target storage" are you using ? The "file on disk" one ?

Could you share an image I could import for testing ?

@phweyland
Contributor

but 'Subject' tag contains only 'bar', 'exported' and 'something' keywords

'exported' should never be exported (neither in Subject nor in Hierarchical Subject).

@phweyland
Contributor

Maybe a silly question: is your database set to UTF-8?

@junkyardsparkle
Contributor Author

With dt 2.7.0+2055~g36f87d176 that still works for me (and Exiv2 0.27.2-2).

Interesting. So, to be clear, with the test procedure described above, you end up with "foo", "blah", "baz", and "bar" all in the (non-hierarchical) Subject tag? If so, then something is weird on my end. Maybe don't worry about this until I test more or somebody else can reproduce.

Which "target storage" are you using ? The "file on disk" one ?

Yes, exactly the defaults with a new, unmodified profile as created on first run.

Could you share an image I could import for testing ?

Sure, but I can't do it right now (remind me if I forget later).

@junkyardsparkle
Contributor Author

Maybe a silly question: is your database set to UTF-8?

That doesn't sound like a silly question at all... sounds like a good guess. How do I check that?

@phweyland
Contributor

with the test procedure described above, you end up with "foo", "blah", "baz", and "bar" all in the (non-hierarchical) Subject tag?

As a matter of fact ... no. With your procedure I get this:
Subject : bar, exported
Whereas I don't have the issue with my current database...

I have something to work on now. :)

@junkyardsparkle
Contributor Author

Well, glad to know I'm not in the Twilight Zone here... thanks for doing the extra checking. :-)

@phweyland
Contributor

phweyland commented Oct 29, 2019

About "exported": that's a bug. If you launch dt again, it will disappear. However, I'll fix that.

The rest is normal for the implemented logic (but I understand this is a change compared to the past).
The logic I've followed is that if an intermediate tag, let's say 'foo|blah', doesn't exist as a tag, it is considered a 'category'. That's why you don't have 'blah' in Subject. If you create 'foo|blah' as a tag and don't set it as a category, it appears in the Subject list, even if it is not attached.

So, to be transparent with the previous behavior, I could invert that logic, i.e. if 'foo|blah' doesn't exist as a tag, it is NOT a 'category'. It is not intuitive (for me), but it would avoid breaking the former logic (where categories did not exist).

What do you think ?

PS: a category is never exported into Subject.
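
To make the rule concrete, here is a rough Python sketch of the "implicit category" logic as described in this comment. It is not darktable's actual code: the function name `subject_keywords` and the dict-based model of the tag library (path mapped to an "is category" flag) are assumptions made purely for illustration.

```python
def subject_keywords(attached, library):
    """Flat Subject keywords under the 'implicit category' logic.

    attached: hierarchical tags attached to the image,
              e.g. ["foo|blah|baz|bar"]
    library:  hypothetical model of the tag library, mapping each tag that
              exists to True if it is flagged as a category, else False
    """
    keywords = []
    for tag in attached:
        parts = tag.split("|")
        for depth in range(1, len(parts) + 1):
            path = "|".join(parts[:depth])
            # A path element that is not itself a tag in the library is
            # treated as a category, and categories are never exported.
            is_category = library.get(path, True)
            name = parts[depth - 1]
            if not is_category and name not in keywords:
                keywords.append(name)
    return keywords


attached = ["foo|blah|baz|bar"]

# Only the full path exists as a tag: every intermediate element is implicit.
print(subject_keywords(attached, {"foo|blah|baz|bar": False}))
# ['bar']

# 'foo|blah' also exists as a plain tag, so 'blah' is exported too.
print(subject_keywords(attached, {"foo|blah|baz|bar": False, "foo|blah": False}))
# ['blah', 'bar']

# 'foo|blah' exists but is explicitly set as a category: not exported.
print(subject_keywords(attached, {"foo|blah|baz|bar": False, "foo|blah": True}))
# ['bar']
```

Inverting the logic, as proposed above, would amount to changing the default in `library.get(path, True)` to `False` in this model.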

@phweyland
Contributor

I could invert that logic

It seems more complicated to implement than the current one. :(

@phweyland
Contributor

The current representation of tags (in italics) in the hierarchy is aligned with the current logic (path elements are categories by default). It should be removed too if we invert the logic.

@junkyardsparkle
Contributor Author

So, to be transparent with the previous behavior, I could invert that logic, i.e. if 'foo|blah' doesn't exist as a tag, it is NOT a 'category'. It is not intuitive (for me), but it would avoid breaking the former logic (where categories did not exist).

Ah, so the issue was that I had implicit "categories" that I didn't know were categories... I'll give the module some more use with this in mind, and see if it still feels like a "surprising" behavior or not.

It might be enough to make this very, very clear in the release notes... I'll also try to do some editing work on the tagging doc soon, since I seem to have the right "unfamiliar user" perspective... ;-)

@junkyardsparkle
Contributor Author

junkyardsparkle commented Oct 29, 2019

Out of curiosity, is the current "implicit category" logic consistent with any other software that you're familiar with? That would be an argument for retaining it... I'm just not really in touch with commercial software of the last decade or so. I'll close this, since the issue isn't what I thought it was. Might be good to get more user input about this in a forum context, I suppose? It's really hard to guess if anyone else is likely to be affected the same way as I was... anyway, thanks again for the work!

@phweyland
Contributor

Out of curiosity, is the current "implicit category" logic consistent with any other software

Actually, I have no such reference. I just followed a certain logic. For example, in your 'foo|blah|baz|bar' case you cannot assign 'blah' alone. This can mean it is not a real tag. And since it cannot get specific attributes, it was necessary to make some choice. That's why I worked that way.

But I have to admit I don't see the reverse logic as weaker than the current one. It's just a different assumption. Even if it is a bit more complex to implement, I don't think there is anything impossible there.

@TurboGit, what are your thoughts ?

@junkyardsparkle
Contributor Author

My own assumptions were probably based mostly on previous behavior of darktable with the same tags library (inertia), but also somewhat on geeqie, which is the only other software I've used which maintains a library of keywords presented in a tree view. In that case, when adding another keyword to the tree the user explicitly selects between "Active keyword" (default) and "Helper" (ie category). So... I just wasn't expecting anything to have implicit category status... for whatever that's worth - I'm not asserting that this will be the "normal" assumption. :-)

That first part about previous behavior is what concerns me... but if things need to get broken a little bit, this is probably the release to do it in!

@phweyland
Contributor

when adding another keyword to the tree the user explicitly selects between "Active keyword" (default) and "Helper" (ie category)

Here dt behaves the same way. Category must be set explicitly.
On the other hand, 'blah' in 'foo|blah|baz|bar' is not a tag, just a piece of the path of 'bar', unless 'foo|blah' is also created, in which case it is a tag by default (not a category).

@junkyardsparkle
Contributor Author

junkyardsparkle commented Oct 29, 2019

Yes, so to summarize this issue for anyone else concerned:

I had become accustomed to thinking of all elements in hierarchical "paths" as keywords themselves, based on the previous darktable behavior of always treating them that way. The new tagging module logic does not treat them this way - they must each also exist as end nodes within the library to be counted as keywords and exported. Therefore, for a library containing hierarchies such as

people|family|smith|bob
people|friends|jones|susan

but for whatever reason not also individually containing:

people|family|smith
people|family
people

then those elements will not be considered keywords along with "bob" when "bob" is attached in the hierarchical form above.
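
A quick way to audit a library for this situation is to list every path prefix that is used by some tag but does not exist as a tag itself. The Python sketch below does that on a plain list of hierarchical tag strings; `missing_path_tags` is a hypothetical helper, not something darktable provides, and getting the tag list out of the library is left to the reader.

```python
def missing_path_tags(library):
    """Return path prefixes used in the library that are not tags themselves."""
    existing = set(library)
    missing = set()
    for tag in library:
        parts = tag.split("|")
        # Check every proper prefix of the path (the full path always exists).
        for depth in range(1, len(parts)):
            prefix = "|".join(parts[:depth])
            if prefix not in existing:
                missing.add(prefix)
    return sorted(missing)


library = [
    "people|family|smith|bob",
    "people|friends|jones|susan",
]
for prefix in missing_path_tags(library):
    print(prefix)
# people
# people|family
# people|family|smith
# people|friends
# people|friends|jones
```

Under the logic summarized here, each prefix printed would have to be created as a tag in its own right before its last element is counted as a keyword.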

Is this a problem? Well, maybe, in the sense that a workflow that has worked for users previously can suddenly stop working as expected in a subtle way that is easy to not notice until later (for instance, after many images have been exported and uploaded somewhere). The nature of the problem may also not be immediately apparent (if you're as dumb as me).

So, the question is to alter the logic, or make its workings very clear in the manual and release notes, and hope people read them. Right? :-)

@junkyardsparkle
Contributor Author

It also now seems to me that the concept of the explicitly set "category" is only really needed with the old logic; as it is now, just excluding a term from the library as an end node would accomplish the same thing... although I may be missing some use case?

@junkyardsparkle
Contributor Author

junkyardsparkle commented Oct 29, 2019

Looking at this some more, I think the presence of the "omit hierarchies" checkbox in export settings is part of the source of confusion, since it implies that selecting it would enable the current logic vs. the old logic... when in fact the situation isn't quite that simple. With that and the addition of categories, I would say it's very counter-intuitive for a previous user to assume the new logic - these seem like features you would add to complement the old logic.

@phweyland
Contributor

phweyland commented Oct 30, 2019

I think the presence of the "omit hierarchies" checkbox in export settings is part of the source of confusion

Agreed. It has an effect (when both 'foo|blah|baz|bar' and 'foo|blah' exist, the second one is ignored if "omit hierarchy" is set), but it is not very useful and may be counter-intuitive.

It also now seems to me that the concept of the explicitly set "category" is only really needed with the old logic

(you mean with the current logic, right ?) Partly agreed. First, for the "helper" side I think it doesn't hurt to have both possibilities: implicit and explicit category.

Then, most importantly, when a leaf tag is itself a category, it can be used to create whatever kind of meta information you can think of and to set up XMP tags of your choice at export time. Example:
creator|John Smith, both tags set as category. Neither creator nor John Smith is exported into Subject (because they are categories), but John Smith can be used in export formulas like Xmp.dc.rights = Copyrights $(YEAR) $(CATEGORY0(creator)).

image

It also appears separately in image information:

image

to produce this:

---- XMP-dc ----
Rights                          : Copyrights 2019 John Smith
Subject                         : bar
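
As an illustration of how such a formula might resolve, here is a small Python sketch that expands `$(YEAR)` and `$(CATEGORYn(name))` against the tags attached to an image. It is only a model built from the example above, not darktable's implementation: the helper names are invented, and the reading of level 0 as "the path element directly below the category" is an assumption inferred from the output shown.

```python
import re


def category_value(attached, category, level=0):
    """Return the path element `level` steps below `category`, or "" if none."""
    for tag in attached:
        parts = tag.split("|")
        if category in parts:
            index = parts.index(category) + 1 + level
            if index < len(parts):
                return parts[index]
    return ""


def expand(formula, attached, year):
    """Expand $(YEAR) and $(CATEGORYn(name)) in a metadata formula."""
    formula = formula.replace("$(YEAR)", str(year))
    # Matches e.g. "$(CATEGORY0(creator))": group 1 is the level digit,
    # group 2 is the category name.
    pattern = re.compile(r"\$\(CATEGORY(\d)\(([^)]*)\)\)")
    return pattern.sub(
        lambda m: category_value(attached, m.group(2), int(m.group(1))),
        formula,
    )


attached = ["creator|John Smith", "foo|blah|baz|bar"]
print(expand("Copyrights $(YEAR) $(CATEGORY0(creator))", attached, 2019))
# Copyrights 2019 John Smith
```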

That said, the debate between implicit keyword and implicit category for undeclared path elements, for example 'blah' in 'foo|blah|baz|bar', is still valid.

phweyland pushed a commit to phweyland/darktable that referenced this issue Oct 30, 2019
TurboGit added a commit that referenced this issue Oct 30, 2019
fix #3203 (scroll) + #3206 (dt tags + default for hierarchical tags)
@junkyardsparkle
Contributor Author

(you mean with the current logic, right ?) Partly agreed. First, for the "helper" side I think it doesn't hurt to have both possibilities: implicit and explicit category.

I only meant that the old logic really needed the feature, because it had zero methods for excluding anything from export, while the other logic has at least one method "built in". Your example case is one (of many?) that I hadn't considered, though... I assumed there were some. Anyway, I woke up to find things already restored to "normal", so thanks for that (I was really trying not to insist too much that this was the "correct" solution, but if you're not unhappy with it then it seems good to me). I'll try to torture-test the new code later. :-)

@phweyland
Contributor

I'll try to torture-test the new code later. :-)

Yes, please !
