
bug: require restart of server to be able to train images from storage #15

Closed
erikarenhill opened this issue Mar 23, 2021 · 3 comments
Labels
bug Something isn't working

Comments

erikarenhill (Contributor) commented Mar 23, 2021

If you add some images to storage/train/TAG and then call the API /train/add/TAG, it doesn't pick up any files until the application is restarted. I have not checked the code, but it seems like it only scans that directory and adds the files to the database as untrained when the application starts up. Maybe it should also scan for new files, so we can add files while the application is running without having to restart.

jakowenko added the bug label Mar 24, 2021
jakowenko (Owner) commented

You're right; this is a known issue and something I overlooked when I worked on the feature. I wonder if it makes sense to implement some sort of file watcher like chokidar, or to just rescan the folder when the /train route is hit. Do you have any opinions?
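
Something like this is roughly what I have in mind for the watcher option (just a sketch; the markUntrained helper and the paths are placeholders, not the actual code):

```typescript
import { watch } from 'chokidar';
import * as path from 'path';

const TRAIN_DIR = './storage/train';

// Placeholder: record the file in the database as untrained.
const markUntrained = (tag: string, file: string) =>
  console.log(`untrained: ${tag} -> ${file}`);

// Watch storage/train/<TAG>/* so files added while the app is running
// are picked up, instead of only scanning at startup.
watch(TRAIN_DIR, { ignoreInitial: false, awaitWriteFinish: true }).on('add', (filePath) => {
  const tag = path.basename(path.dirname(filePath)); // parent folder name is the tag
  markUntrained(tag, filePath);
});
```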

erikarenhill (Contributor, Author) commented

I was thinking a rescan of the folder on the API hit would be enough, as it shouldn't be hit often enough to cause problems. At least that's a start?

My initial thought, from another point of view, was that the train/add endpoint could delete the files after they have been trained; that way no database/file scan is needed at all. Does that make sense? Or do you want to keep the training database so it's easy to train another service (e.g. you're running CompreFace but also want to add DeepStack and easily feed the same dataset into it)?

jakowenko (Owner) commented

Rescanning when the API is hit is what I was leaning towards for now too. That should be easy to fix and get in.

My only reasoning for keeping the training files is that it makes it easy to retrain in the future; if the user ends up using a new detector, all those images are still available on disk for the user to easily queue back up.
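
For reference, this is roughly what I mean by rescanning on the API hit (a sketch only; the route shape and the queueing step are placeholders, not the actual implementation):

```typescript
import express from 'express';
import { promises as fs } from 'fs';
import * as path from 'path';

const TRAIN_DIR = './storage/train';
const app = express();

// Rescan storage/train/<tag> on every request so files copied in after
// startup are picked up without restarting the server.
app.get('/train/add/:tag', async (req, res) => {
  const { tag } = req.params;
  const files = await fs.readdir(path.join(TRAIN_DIR, tag));
  // ...queue `files` for training with the configured detector(s)...
  res.json({ tag, found: files.length });
});

app.listen(3000);
```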
