This repository has been archived by the owner on Oct 3, 2022. It is now read-only.

Watching directories for new imports #23

Closed
Chiiruno opened this issue Jul 11, 2018 · 29 comments

@Chiiruno
Collaborator

Chiiruno commented Jul 11, 2018

Probably just run a hash check and if new/altered, import the image and remove the replaced image, if applicable.

@bakape
Owner

bakape commented Jul 11, 2018

What?

@Chiiruno
Collaborator Author

Being able to have access to new or altered images without having to reimport the folder each time would be nice.

@bakape
Owner

bakape commented Jul 11, 2018 via email

@Chiiruno
Collaborator Author

Yes, but when I change or add something in /home/okina/Pictures, I want hydron to reflect that either immediately or on every start of hydron, without having to import the entire folder each time.

@bakape
Owner

bakape commented Jul 11, 2018 via email

@Chiiruno
Collaborator Author

Chiiruno commented Jul 11, 2018

Yes, I know.
I want hydron to find out if the local copy differs from the original copy, and if it does, replace the local with the (new) "original".

@bakape
Owner

bakape commented Jul 11, 2018 via email

@Chiiruno
Collaborator Author

Chiiruno commented Jul 11, 2018

Keep a database of hashes which you already do, and check if the hash is the same.
If not, replace/add to the local copy.
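
The hash check proposed here could look roughly like the sketch below. It is only an illustration, not hydron's code: the `index` map is a hypothetical stand-in for the database, keying by path is an assumption, and `check` is an invented helper name.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// action describes what an incremental import would do with one file.
type action string

const (
	skip    action = "skip"    // stored hash matches, nothing to do
	add     action = "add"     // path has not been seen before
	replace action = "replace" // path is known but its contents changed
)

// index is a hypothetical stand-in for the database: path -> SHA1 hex digest.
type index map[string]string

// check hashes data, compares the digest with the one stored for path,
// records the new digest and reports which action applies.
func (ix index) check(path string, data []byte) action {
	sum := sha1.Sum(data)
	digest := hex.EncodeToString(sum[:])
	old, known := ix[path]
	ix[path] = digest
	switch {
	case !known:
		return add
	case old != digest:
		return replace
	default:
		return skip
	}
}

func main() {
	ix := index{}
	fmt.Println(ix.check("a.png", []byte("first version")))
	fmt.Println(ix.check("a.png", []byte("first version")))
	fmt.Println(ix.check("a.png", []byte("edited version")))
}
```

Note that a check like this still has to read and hash every file on each run; it only avoids redoing the import work for files whose hash is unchanged.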

@bakape
Owner

bakape commented Jul 11, 2018 via email

@Chiiruno
Collaborator Author

I can't say I know.
All I know is that not having to import every time I start up hydron would be nice, since I have a large image folder.

@bakape
Owner

bakape commented Jul 11, 2018 via email

@Chiiruno
Collaborator Author

I don't have to, sure.
But if I want the two images I just added to my Pictures folder, I have to re-run it, in addition to fetch_tags, which takes a while too.

@bakape
Owner

bakape commented Jul 11, 2018 via email

@Chiiruno
Collaborator Author

Having to have a separate directory for newly saved images is silly just so I could do that, waaay too much maintenance.
There has to be a smart way to, at the very least, find new images and fetch their tags without reimporting.

@bakape
Owner

bakape commented Jul 11, 2018 via email

@Chiiruno
Collaborator Author

How about this, then: a possible middle ground between performance and ease of use.
A slow import and fetch that's always running, so it doesn't bring the system to a halt or otherwise take up too much CPU.
Or a slightly faster import and fetch that happens every X amount of time.
That way, we don't have to run import to pick up new/altered images and fetch_tags for tags, since both will either always be running or be run at a fixed interval.
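
The "every X amount of time" variant could be sketched with a `time.Ticker`, as below. This is a minimal sketch under assumptions: `runEvery` is an invented helper, and the job body is a placeholder for whatever import and fetch_tags would actually do.

```go
package main

import (
	"fmt"
	"time"
)

// runEvery runs job once immediately and then again on every tick of
// interval, until stop is closed.
func runEvery(stop <-chan struct{}, interval time.Duration, job func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		job()
		select {
		case <-stop:
			return
		case <-ticker.C:
		}
	}
}

func main() {
	stop := make(chan struct{})
	runs := 0
	// A short interval keeps the demo quick; a real scan would use minutes or hours.
	runEvery(stop, 10*time.Millisecond, func() {
		runs++
		fmt.Println("scan", runs) // placeholder for the import and tag fetch
		if runs == 3 {
			close(stop)
		}
	})
}
```

The job runs on the calling goroutine, so a slow scan simply delays the next one instead of overlapping with it.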

@bakape
Owner

bakape commented Jul 11, 2018 via email

@Chiiruno
Collaborator Author

Sure, but it might take me a while to get to.

@bakape
Owner

bakape commented Jul 11, 2018

It's fine. I don't consider this core functionality anyway.

bakape changed the title from "Git-like system for checking for new images" to "Watching directories for new imports" on Jul 11, 2018
@Chiiruno
Collaborator Author

Chiiruno commented Aug 2, 2018

@bakape Could you give me collaborator permissions on this repo too so I can make branches here like meguca?
I changed up my local stuff to make branches on meguca directly, so I'd like to do it on hydron too for whenever I get to this.

@bakape
Owner

bakape commented Aug 2, 2018 via email

@Chiiruno
Collaborator Author

Watching directories with hydron may be a problem not only for hydron itself, but also for the entire OS.
Read the bottom of https://github.com/fsnotify/fsnotify under "How many files can be watched at once?"

> There are OS-specific limits as to how many watches can be created:
>
> Linux: /proc/sys/fs/inotify/max_user_watches contains the limit, reaching this limit results in a "no space left on device" error.
> BSD / OSX: sysctl variables "kern.maxfiles" and "kern.maxfilesperproc", reaching these limits results in a "too many open files" error.
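
To see what the Linux limit quoted above is on a given machine, something like the following sketch could be used. `parseLimit` is an invented helper, and the program only reports the value; it does not create any watches.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// watchLimitPath is the Linux file named in the fsnotify README.
const watchLimitPath = "/proc/sys/fs/inotify/max_user_watches"

// parseLimit converts the raw file contents (a number plus a newline) to an int.
func parseLimit(raw string) (int, error) {
	return strconv.Atoi(strings.TrimSpace(raw))
}

func main() {
	raw, err := os.ReadFile(watchLimitPath)
	if err != nil {
		// Expected on non-Linux systems, where this file does not exist.
		fmt.Println("could not read inotify watch limit:", err)
		return
	}
	limit, err := parseLimit(string(raw))
	if err != nil {
		fmt.Println("unexpected contents:", err)
		return
	}
	fmt.Println("max_user_watches:", limit)
}
```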

@Chiiruno
Collaborator Author

Chiiruno commented Aug 15, 2018

So, just importing each time you want to add new files, while trying our hardest to optimize the import and even skip files that are probably already imported, may be for the best.
Thoughts?
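
One way to "skip suspected already imported files" without hashing them is to compare file size and modification time against what was seen on the previous scan. The sketch below assumes that heuristic; it is not something hydron is stated to do, and the `seen` map, `stamp` type and `scan` helper are all invented names.

```go
package main

import (
	"fmt"
	"io/fs"
	"path/filepath"
)

// stamp holds the cheap metadata used to guess whether a file changed.
type stamp struct {
	size    int64
	modTime int64 // modification time in Unix nanoseconds
}

// seen remembers the stamp recorded for each path on the previous scan.
type seen map[string]stamp

// changed reports whether path is new or has a different stamp, and records st.
func (s seen) changed(path string, st stamp) bool {
	if old, ok := s[path]; ok && old == st {
		return false
	}
	s[path] = st
	return true
}

// scan walks root and counts the files that would need hashing and importing.
func scan(root string, s seen) (int, error) {
	candidates := 0
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() {
			return err
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		if s.changed(path, stamp{info.Size(), info.ModTime().UnixNano()}) {
			candidates++
		}
		return nil
	})
	return candidates, err
}

func main() {
	s := seen{}
	for pass := 1; pass <= 2; pass++ {
		n, err := scan(".", s)
		fmt.Println("pass", pass, "candidates:", n, "err:", err)
	}
}
```

This is only a heuristic: a file rewritten with the same size and modification time would be missed, so it trades some accuracy for not having to read every file.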

@Chiiruno
Collaborator Author

Chiiruno commented Aug 15, 2018

Also, I know you said it was dumb, but BLAKE2 might be a good way to get faster and more collision-resistant hashes, if you ever want to consider that.
https://research.kudelskisecurity.com/2017/03/06/why-replace-sha-1-with-blake2/
https://godoc.org/golang.org/x/crypto/blake2b

@bakape
Owner

bakape commented Aug 15, 2018

I have reduced memory usage for already imported files with d5ca369.

> BLAKE2

Not an option. Whatever slow gain would be offset by the overhead of still needing SHA1 hashes and storing an extra hash per image.

Basically, don't rescan your image folders all the fucking time. That is not how hydron was intended to be used.

@Chiiruno
Collaborator Author

> Not an option. Whatever slow gain would be offset by the overhead of still needing SHA1 hashes and storing an extra hash per image.

Now it makes sense why you think it's dumb. Does this mean you can't put BLAKE2 hashes into the DB the same way you can with SHA1? Why would you need an extra hash? I'm well aware it would require rewriting the thumbnailer and other stuff.

@bakape
Owner

bakape commented Aug 15, 2018

Because we still need to generate SHA1, because external services use SHA1. Same with MD5.
So currently we generate SHA1+MD5. With BLAKE2 we would need to generate BLAKE2+SHA1+MD5 and store the BLAKE2 hash as well. At the same time no external service I know of uses BLAKE2, so it's not reusable.
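
For illustration, generating SHA1 and MD5 together in one read of the file can be done with `io.MultiWriter` from the standard library. This is a sketch of the general technique, not hydron's actual import code; `hashBoth` and `hashBytes` are invented names.

```go
package main

import (
	"bytes"
	"crypto/md5"
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"io"
)

// hashBoth reads r once and feeds every chunk to both hash functions.
func hashBoth(r io.Reader) (sha1Hex, md5Hex string, err error) {
	s, m := sha1.New(), md5.New()
	if _, err = io.Copy(io.MultiWriter(s, m), r); err != nil {
		return "", "", err
	}
	return hex.EncodeToString(s.Sum(nil)), hex.EncodeToString(m.Sum(nil)), nil
}

// hashBytes is a convenience wrapper for in-memory data.
func hashBytes(data []byte) (sha1Hex, md5Hex string, err error) {
	return hashBoth(bytes.NewReader(data))
}

func main() {
	s, m, err := hashBytes([]byte("abc"))
	if err != nil {
		fmt.Println("hashing failed:", err)
		return
	}
	fmt.Println("sha1:", s)
	fmt.Println("md5: ", m)
}
```

A third hash would be one more writer in the same pass, so the extra cost is its CPU time plus the extra stored value per image, not another read of the file.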

@Chiiruno
Collaborator Author

Okay, thank you for explaining this to me.
BLAKE2 isn't an optimization option, at least not for the foreseeable future.

@Chiiruno
Collaborator Author

Since you closed #32, should we close this one too?
AFAIK using fsnotify isn't an option because of the file limit.

bakape closed this as completed on Aug 15, 2018