Store uploaded image hashes in a LevelDB to prevent re-uploads #23

Merged Dec 18, 2018 (36 commits)


nhorvath
Contributor

Use a LevelDB key-value store to keep track of previously uploaded images.
https://github.com/syndtr/goleveldb

I'm new to programming in Go so if you have a better storage solution feel free to change it or let me know.

Improvement towards achieving #12
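
A minimal sketch of the idea described above, not the code from this PR: look up a hash of the file in a key-value store before uploading, and record it afterwards. The `KV` interface, `memKV`, and `uploadOnce` are hypothetical names invented for this example; an in-memory map stands in for LevelDB so the snippet runs with the standard library only. The key here is a SHA-256 of the content, which is also an assumption.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// KV is the small slice of a key-value store this sketch needs.
// Hypothetical interface: a thin wrapper around a LevelDB handle could
// satisfy it, but nothing here is taken from the PR's code.
type KV interface {
	Has(key []byte) (bool, error)
	Put(key, value []byte) error
}

// memKV is an in-memory stand-in so the example runs without LevelDB.
type memKV map[string][]byte

func (m memKV) Has(key []byte) (bool, error) {
	_, ok := m[string(key)]
	return ok, nil
}

func (m memKV) Put(key, value []byte) error {
	m[string(key)] = append([]byte(nil), value...)
	return nil
}

// uploadOnce calls upload only when the content hash is not yet in the
// store, and records the hash after a successful upload. The boolean
// reports whether upload was called and succeeded.
func uploadOnce(db KV, content []byte, upload func([]byte) error) (bool, error) {
	sum := sha256.Sum256(content)
	key := []byte(hex.EncodeToString(sum[:]))

	seen, err := db.Has(key)
	if err != nil {
		return false, err
	}
	if seen {
		return false, nil // already uploaded, skip
	}
	if err := upload(content); err != nil {
		return false, err
	}
	return true, db.Put(key, []byte("uploaded"))
}

func main() {
	db := memKV{}
	upload := func(b []byte) error {
		fmt.Printf("uploading %d bytes\n", len(b))
		return nil
	}
	for _, c := range [][]byte{[]byte("photo-a"), []byte("photo-a"), []byte("photo-b")} {
		uploaded, err := uploadOnce(db, c, upload)
		fmt.Println("uploaded:", uploaded, "err:", err)
	}
}
```

Keeping the store behind a small interface makes it easy to swap the map for a persistent database later without touching the upload logic.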

@nmrshll
Contributor

nmrshll commented Dec 14, 2018

Thanks a lot, that's a very good idea. And your code is every bit as good as a Go developer's, by the way.
And leveldb seems like a good, minimalist choice to me.

Just one thing about the hash: the "hash" that I put in and you reused is actually a "perceptual hash". It is calculated so that it comes out very close or identical for similar pictures (a bit like the sound signatures that Shazam and co. use, or the image signatures behind Google's search by image).
So that could be a problem here. If, say, you took several pictures of the same scene in a row and they were very similar, this system would upload only the first of them.
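
To illustrate the difference, here is a toy sketch, not the perceptual hash this project uses. `averageHash` is a simplified "average hash" written for this example, and `contentHash` is a plain SHA-256 over the raw pixels; `gradient` builds two hypothetical test images that differ only by a brightness offset of 1. All names are assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"image"
	"image/color"
)

// averageHash is a toy perceptual hash: it samples an 8x8 grid of the
// image and sets one bit per sample that is brighter than the mean.
func averageHash(img *image.Gray) uint64 {
	b := img.Bounds()
	var samples [64]uint32
	var sum uint32
	for i := 0; i < 64; i++ {
		x := b.Min.X + (i%8)*b.Dx()/8
		y := b.Min.Y + (i/8)*b.Dy()/8
		samples[i] = uint32(img.GrayAt(x, y).Y)
		sum += samples[i]
	}
	mean := sum / 64
	var h uint64
	for i, s := range samples {
		if s > mean {
			h |= uint64(1) << uint(i)
		}
	}
	return h
}

// contentHash is an exact hash: any changed pixel byte changes the result.
func contentHash(img *image.Gray) string {
	sum := sha256.Sum256(img.Pix)
	return hex.EncodeToString(sum[:])
}

// gradient builds a 64x64 horizontal gradient. The offset brightens every
// pixel slightly, standing in for a second shot of the same scene.
func gradient(offset uint8) *image.Gray {
	img := image.NewGray(image.Rect(0, 0, 64, 64))
	for y := 0; y < 64; y++ {
		for x := 0; x < 64; x++ {
			img.SetGray(x, y, color.Gray{Y: uint8(x*3) + offset})
		}
	}
	return img
}

func main() {
	a, b := gradient(0), gradient(1)
	fmt.Println("perceptual hashes equal:", averageHash(a) == averageHash(b))
	fmt.Println("content hashes equal:   ", contentHash(a) == contentHash(b))
}
```

With a perceptual hash as the de-duplication key, the second image would be treated as already uploaded; with a content hash it would not.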

I'll make a PR against your fork to change that; then, if you approve, we merge it into master here. Is that okay?

@nhorvath
Contributor Author

nhorvath commented Dec 14, 2018 via email

@nmrshll
Contributor

nmrshll commented Dec 14, 2018

Well, my bad really. I didn't document that part of the code properly; I didn't really expect the stars and PRs.

@nmrshll
Contributor

nmrshll commented Dec 14, 2018

@nhorvath Is there a particular reason you included the last modified time in the cache key?

@nhorvath
Contributor Author

nhorvath commented Dec 14, 2018 via email

@nmrshll
Contributor

nmrshll commented Dec 14, 2018

Thanks, that's the kind of thing I wasn't sure of.
Just asking to avoid going through the same steps and arriving at the same conclusion.

@nhorvath
Contributor Author

See my stream of consciousness here for explanations of tweaks to your changes: nhorvath#1

@nhorvath
Contributor Author

@nmrshll I think I'm happy with the current state of this branch now. It gets through my files quickly and now works with large movie files. Memory usage stays low (previously I was seeing around 4 GB of RAM used).

@nmrshll nmrshll merged commit 63e19fa into gphotosuploader:master Dec 18, 2018
@nhorvath nhorvath deleted the de-dupe branch December 18, 2018 15:45
@cirrusflyer

Sorry for the noob question, but am I to understand that this has been implemented, and it'll now upload only new photos instead of re-uploading everything every time?

@nhorvath
Contributor Author

yes

3 participants