Store uploaded image hashes in a LevelDB to prevent re-uploads #23
add logging for potentially corrupt images.
Thanks a lot, that's a very good idea. And your code is every bit as good as a Go developer's, by the way. Just one thing about the hash: the "hash" that I put in and you reused is actually a "perceptual hash", which is calculated so that it is very close or identical for similar pictures (a bit like the sound signatures that Shazam and similar services use, or the image signatures behind Google's search by image). I'll make a PR against your fork to change that; then, if you approve, we merge it into master here. Is that okay? |
No problem, that's probably a good idea, since the hash is actually used to
verify that the picture hasn't changed since the last upload. If you just did
something like correcting an image, it might get the same hash, given what
you said. md5/sha1 would probably be better suited to this; I just saw
"hash" and assumed I could reuse it.
…On Fri, Dec 14, 2018 at 6:27 AM Nicolas Marshall ***@***.***> wrote:
Thanks a lot, that's a very good idea. And your code is every bit as good
as a go developer's by the way.
And leveldb seems like a good, minimalist choice to me.
Just one thing about the hash, this "hash" that I put in and you reused is
actually a thing called "perceptual hash" and it's calculated so that it
would be very close or identical for similar pictures (a bit like sound
signatures that shazam & co use, or image signatures for google search by
image)
So that could be a problem here. If say you took several pictures of the
same scene in a row and they are very similar, this system would upload
only the first of them.
I'll make a PR against your fork to change that, then if you approve we
merge it into the master here, is that okay ?
|
Well, my bad really: I didn't document that part of the code properly. I didn't really expect the stars and PRs |
@nhorvath Is there a particular reason you included the last modified time in the cache key ? |
Stat is way faster than hashing. I have tens of thousands of pictures that I
tried to upload, and it takes forever to go through them. It checks mtime
first, and only if that's different does it compute the hash.
…On Fri, Dec 14, 2018 at 6:33 PM Nicolas Marshall ***@***.***> wrote:
@nhorvath <https://github.com/nhorvath> Is there a particular reason you
included the last modified time in the cache key ?
|
Thanks, that's the kind of thing I wasn't sure of |
upload cache changes from upstream
See my stream of consciousness here for explanations of tweaks to your changes: nhorvath#1 |
@nmrshll I think I'm happy with the current state of this branch now. It gets through my files quickly and now works with large movie files. Memory usage now stays low (previously I was seeing about 4 GB of RAM used). |
Sorry for the noob question, but am I to understand that this has been implemented and it'll now just upload new photos and not reupload every time? |
yes |
Use a LevelDB key-value store to keep track of previously uploaded images.
https://github.com/syndtr/goleveldb
I'm new to programming in Go so if you have a better storage solution feel free to change it or let me know.
Improvement towards achieving #12