Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import datetime detection for copy & import and from camera #11016

Merged
merged 7 commits into from Feb 27, 2022

Conversation

phweyland
Copy link
Contributor

@phweyland phweyland commented Jan 28, 2022

Solves a pixls.us request.

If merged before #10844 this function will have to be aligned afterwards.

Principle: Save in an internal (not visible from the user) metadata ("Xmp.darktable.image_id") the orignal filename + timestamp.
At import time looks if the corresponding metadata value already exists or not.

The first commit fixes several bugs related to internal metadata which has not be used so far.

image

Limitation: the detection doesn't happen for images imported before this PR ...

@phweyland phweyland added bugfix pull request fixing a bug feature: enhancement current features to improve scope: DAM managing files, collections, archiving, metadata, etc. scope: UI user interface and interactions labels Jan 28, 2022
@quovadit
Copy link
Contributor

I support the pixls request to download only new images from your camera’s card.
But I don't think, that storing + comparing orignal filename + timestamp is a good way to achieve this.

LR does the same (using Date, File Size, File Name and an internal hash), but there are multiple complaints of users, that duplicates are not detected, and different images are classified as duplicates by mistake.

So there won't be a perfect duplicate-check.

And as I understood it, it won't work for images already imported before updating.

Finally, there might be single images on my sd-card which I did not import last time on purpose, but they will be imported next time together with the new images (without me noticing it).

A simpler (and for me: better) solution would be, to easily select all images of one day (e.g. by grouping by date).

By default all images from today (or the latest day available) could be selected.

As midnight might not be the best day-seperator, we could use the grouping of images with a configurable time separator in AutoGrouper.lua mentioned in #10996

And of course there should be a check if this import would overwrite existing files with the same name in the same directory.
But the user should have the choice what to do: skip those images, or use different filenames (e.g. resume the sequence number).

@phweyland
Copy link
Contributor Author

phweyland commented Jan 28, 2022

@quovadit, I disagree.
Nothing is automatic, that's just highlighting images which are seen as not already imported. No more.

You can already sort out the files by datetime. So that seems easy to select only images from a date to another one, or only from today. I really don't see how to improve this.

The name conflict case is another topic. Today the import doesn't happen.
Here I agree, the user could have a couple of options.
But as said for optional msec, that's another topic.

EDIT

LR does the same (using Date, File Size, File Name and an internal hash), but there are multiple complaints of users, that duplicates are not detected, and different images are classified as duplicates by mistake.

I would like to understand better the actual problems. Complaints are not always enough.
I've discarded the hash because it could be slow.
Camera filenames are quite stable and, except playing with several cameras at the same time with same counters, there shouldn't be too many issues.
Datetime is the most problematic, due to automatic system time changes (windows?) or ... user mess.

@spaceChRiS
Copy link

I've discarded the hash because it could be slow.

What about hashing only the metadata? Maybe together with additional checks of some of the likely unique metadata, to avoid false duplicates (maybe dependent on the hash algorithm). The metadata hash could be stored in xmp/db to save half of the computations in every new run. Disclaimer: I am asking just out of curiosity, I do not use this feature.

@phweyland
Copy link
Contributor Author

phweyland commented Jan 29, 2022

What about hashing only the metadata?

The fact of having to read the file is the main burden. See here an attempt to use exif datetime instead of file datetime.

@quovadit
Copy link
Contributor

Nothing is automatic, that's just highlighting images which are seen as not already imported. No more.

Of course I can change the suggested selection, but the problem is, that it's not obvious what are the criteria for the suggested selection, and it's propable that I don't realize, that the selection does not cover what I expect.

image

  • left: it seems, that all images starting 2022-01-04 are selected (which is what I want)
  • right: but after scrolling up you see, that there is also one image from 2022-01-02 which is selected, which I did not import last time on purpose

The name conflict case is another topic. Today the import doesn't happen.

That's not always true: if I use a sequence number in my pattern, it does continue the sequence, and there are situations where this is expected, but other situations where this is not expected.

there shouldn't be too many issues.

That's exactly my point. I am sure that the duplicate check will work in more than 90%, but it's nothing I would ever use if I am not 100% sure that all my images are imported before I delete my sd-card.

So I would prefer helping the user making a clear selection, instead of guessing which single images to select depending on metadata.

Grouping would be one possibility:
image

Of course I can do this manually now, but it would be much simpler (and safer) to click on the top-level to select all images in this group.

And personally I would invert the checkbox-logic: All images that will be imported should be checked, all deselected should be greyed out.

@phweyland
Copy link
Contributor Author

@quovadit, you are mixing everything. Sorry, that's tiring.

If you want to work with dates work with dates and forget the automatic selection. Don't mix both.
If you work with automatic selection, check it before applying. If there are failing cases, please report them with the related data to understand why they have failed.

@quovadit
Copy link
Contributor

I don't see, what I'm mixing...

I work with dates, but still I would like to have a way to easily select all images of my last shoot, without scrolling through hundreds of photos finding where the day starts.

If there is a function 'only new pictures' a user should be able to trust that exactly this will happen.
One example where this does not work are all images imported before updating to this new version.
And even if I can't name all other edge-cases where it might fail, I know from LR-users, that there are problems, although LR uses even more criteria than dt will use.

So instead of implementing a new feature, waiting for complaints and then trying to fix one after the other, I suggested to take a different approach to solve the same problem ('Only import images of my last shoot').

By the way: I really hope that my input is constructive and not destructive!

@phweyland
Copy link
Contributor Author

@quovadit, This PR is about "already imported image detection".
That works or not ? If not, is that fixable or not ? If not I close the PR.
Simple.

@TurboGit
Copy link
Member

@phweyland : I lost track on this. First there is a conflict to resolve, then is that ready to be merged?

@quovadit : Have you had time to test this?

@quovadit
Copy link
Contributor

sorry, I did not test the last commits yet, will do it tomorrow evening

@TurboGit TurboGit self-requested a review February 19, 2022 21:00
@phweyland
Copy link
Contributor Author

phweyland commented Feb 19, 2022

Reminder. The principle is to recognize the source filename plus its timestamp. Then:

  • this doesn't recognize files imported before this PR
  • this doesn't cover the case where files are renamed
    • but should work when the file is moved
  • this may fail if the system time changes (I've bad memory of file comparison on windows...).

However, on normal usage (at least mine 😄) that seems to work pretty well.

@quovadit
Copy link
Contributor

Still it only works with a fresh install.
When updating from 3.8, the function can't identify already imported images, which is a no-go for me, because the function is called 'select only new images' - which is not true.

@TurboGit
Copy link
Member

@quovadit : As said by @phweyland :

  • this doesn't recognize files imported before this PR

So indeed that doesn't work (and it is expected) for already imported images.

@phweyland
Copy link
Contributor Author

I gladly admit that the label select only new pictures may be misleading in that case.
It could be replaced by something like select only not recognized pictures in copy & import and import from camera dialogs.
The corresponding tooltip could give a bit of explanation as well.

@phweyland
Copy link
Contributor Author

Check box label and tooltip tweaked for copy & import and import from camera dialogs.

@quovadit
Copy link
Contributor

quovadit commented Feb 21, 2022

I don't think that a new label solves this problem.
It does not just 'fail in certain circumstances', but it won't work for all users updating from 3.8 for all images they have already imported.

As I don't want to mix things, I created a separate Pull Request (EDIT: FR), where I described my point of view: #11195

@phweyland
Copy link
Contributor Author

As I don't want to mix things, I created a separate Pull Request

Thanks !

@TurboGit TurboGit added this to the 4.0 milestone Feb 27, 2022
@phweyland phweyland added the documentation-pending a documentation work is required label Feb 27, 2022
Copy link
Member

@TurboGit TurboGit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks good to me now.

@TurboGit TurboGit merged commit 5bf0b23 into darktable-org:master Feb 27, 2022
@phweyland phweyland deleted the import-id branch February 27, 2022 17:21
@elstoc elstoc removed the documentation-pending a documentation work is required label Jun 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix pull request fixing a bug feature: enhancement current features to improve scope: DAM managing files, collections, archiving, metadata, etc. scope: UI user interface and interactions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants