Set up dedicated devices / Tensorflow serving #73

Open
arch-user-france1 opened this issue Aug 28, 2021 · 27 comments
Labels
enhancement New feature or request

Comments

@arch-user-france1

Make an app that can be installed on other devices, plus a setting so that a device which has this app and is configured (and online) receives the model & pictures and can process them by itself. My server doesn't have a good GPU, so I would like to run it on my faster computer with an NVIDIA GeForce GTX 1660 Super (or, if a Raspberry Pi is online, send it some data so it is a bit faster, etc.).

@marcelklehr
Member

This is a good idea, but it will take some time until I can get to it.

@marcelklehr marcelklehr added the enhancement New feature or request label Aug 28, 2021
@arch-user-france1
Author

arch-user-france1 commented Aug 28, 2021

I love this project

It is very fun: people start clicking on cats (as this is the most accurate) and look at how their cats looked over the past two years.
When my cloud works well, I may stop paying for OneDrive and pay you $1/month instead (well, I only get 50 CHF/month).

@stavros-k

One advantage is that you don't have to overload a Nextcloud container with extra packages.
It can also be scaled up a lot: in a Kubernetes cluster, for example, you can spawn multiple workers to do the processing (maybe not so much for a home lab, but in bigger installations it would be awesome).

@arch-user-france1
Author

arch-user-france1 commented Sep 20, 2021

Maybe I can do a little beta.

Maybe a bit of bash script that lets the classifier work over SSH (rsync).

@marcelklehr marcelklehr added this to Backlog in Recognize Sep 29, 2021
@arch-user-france1
Author

arch-user-france1 commented Oct 15, 2021

I'm actually thinking about it, because I don't want to use a bash script.
It sucks.

How do you use node/tensorflow? I see that you use - instead of a file path. Is it a big performance hit to spawn a TensorFlow process for every file?

I should look into the code...

@marcelklehr
Member

marcelklehr commented Oct 15, 2021

The classifier scripts accept input either via CLI args or as JSON via stdin.
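
For illustration, a minimal sketch of driving such a script from Node this way; the script path and payload shape here are assumptions, not recognize's documented interface (the `-` argument selecting stdin is what was mentioned above):

```js
// Sketch only: spawns a classifier script with "-" so it reads JSON from stdin.
// The script path and the payload shape are assumptions for illustration.
const { spawn } = require('child_process')

const classifier = spawn('node', ['src/classifier_imagenet.js', '-'])

// Hypothetical payload: a JSON array of absolute image paths.
classifier.stdin.write(JSON.stringify(['/data/photos/cat.jpg', '/data/photos/dog.jpg']))
classifier.stdin.end()

classifier.stdout.pipe(process.stdout)
classifier.stderr.pipe(process.stderr)
```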

@marcelklehr
Member

Ideally we would have a TensorFlow Serving container and allow people to connect to it with recognize.
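
Roughly, recognize could then POST images to the container's REST API. A sketch, assuming Node 18+ for fetch; the host, port, and model name are placeholders, and whether a model accepts base64-encoded image bytes depends on its serving signature:

```js
// Sketch: querying a TensorFlow Serving container over its REST :predict endpoint.
// Host, port, and model name are placeholders.
const fs = require('fs')

async function classifyRemote(imagePath) {
  // TF Serving's REST API accepts binary tensor content as {"b64": "..."}.
  const b64 = fs.readFileSync(imagePath).toString('base64')
  const res = await fetch('http://tfserving:8501/v1/models/recognize:predict', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ instances: [{ b64 }] }),
  })
  const { predictions } = await res.json()
  return predictions[0]
}

classifyRemote('/data/photos/cat.jpg').then(console.log)
```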

@marcelklehr marcelklehr changed the title Set up dedicated devices Set up dedicated devices / Tensorflow serving Oct 15, 2021
@gotagitaccnt

Any updates on this? I would really appreciate this feature being implemented, or maybe some pointers on how to start.

@guystreeter

I can run the rest of Nextcloud on a 1 GB VM.
A 4 GB VM costs almost 4 times as much per month.
I would like to batch-process recognition on another machine that I only spin up for a day once in a while (or do the recognition runs on my home machine with its NVIDIA GPU).

@ddarek2000

Great idea. Upvote.

@szaimen
Contributor

szaimen commented Jan 21, 2023

I think this would also be a good idea for AIO, for example, since libtensorflow does not seem to run in an Alpine container, but it could then run in another container using Debian as its base.

@guystreeter

Setting up GPU access for a container is complicated. Anyone running Nextcloud in a container would probably want to send image analysis requests to a service running on the native OS.

@marcelklehr
Member

> Setting up GPU access for a container is complicated.

It's not that hard, I believe.

@guystreeter

> Setting up GPU access for a container is complicated.
>
> It's not that hard, I believe

Have you tried it? The documented steps are expert-admin-level stuff.

@relink2013

I would love for this to become a reality. I'm currently running NC in an Ubuntu VM for the sole purpose of using my NVIDIA GPU with Recognize and Memories, and I absolutely hate managing it.

@tbelway

tbelway commented Mar 20, 2023

Oooh, I would love this. A decentralized recognize service would allow for a great deal of flexibility. I use a Nextcloud container (linuxserver) that is Alpine-based, and recognize stopped working relatively recently due to changes in libtensorflow; previously I had it working with some container customization scripts... now that isn't working, which is frustrating.

@marcelklehr I see that you are looking at TensorFlow Serving; does that mean you're thinking of having a Nextcloud recognize-proxy app that would interact with this new instance (probably a container...)?

> Setting up GPU access for a container is complicated.
>
> It's not that hard, I believe
>
> Have you tried it? The documented steps are expert-admin-level stuff

I don't know why you'd call it expert-admin-level stuff. It's pretty simple...
I've done this on both Debian- and RHEL-based hypervisors (Proxmox, and KVM with Cockpit), via direct LXC or Podman containerization as well as through hardware passthrough to a VM which is then containerized. As long as you are running Linux, it's trivial.
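
As a concrete data point, on a host with the NVIDIA Container Toolkit installed, exposing the GPU to a Docker container can be a one-liner (the image tag below is illustrative):

```
# Requires the NVIDIA Container Toolkit on the host; image tag is illustrative.
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```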

@Leptopoda

@marcelklehr
I currently have some time to look into this and would love to give it a shot.
What is the blocker? Or how would you imagine this being implemented?

@arch-user-france1
Author

arch-user-france1 commented Mar 24, 2023

> @marcelklehr I currently have some time to look into this and would love to give it a shot. What is the blocker? Or how would you imagine this being implemented?

IMO, a Node.js socket server could run on the dedicated device, with the images sent to it tagged with an ID; preferably 128 to 512 at once, because that's the batch size at which modern graphics cards reach 100% utilisation rather than something like 7% when images are not supplied in time or do not have enough pixels.
The dedicated device would send the results back with the appropriate IDs, and done.

To set up a socket, socket.io or ws could be used.
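
A minimal sketch of that worker, assuming the ws package; the { id, images } message shape and classifyBatch are placeholders, not anything recognize ships:

```js
// Minimal sketch of the proposed worker, using the ws package.
const { WebSocketServer } = require('ws')

async function classifyBatch(images) {
  // Placeholder: a real worker would run the TensorFlow model on the batch here.
  return images.map(() => [])
}

const wss = new WebSocketServer({ port: 8765 })

wss.on('connection', (socket) => {
  socket.on('message', async (raw) => {
    // Expected message (assumption): { id, images: [<base64>, ...] }
    const { id, images } = JSON.parse(raw.toString())
    const labels = await classifyBatch(images)
    // Send the results back tagged with the same ID so they can be matched up.
    socket.send(JSON.stringify({ id, labels }))
  })
})
```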

It would also be very handy to have a configuration file. The easiest option would be to write the variables as exports in a JavaScript file and import them in the actual program. Better, though, would be CSV files or something like them.
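
For the JavaScript variant, that could be as small as (all names made up):

```js
// config.js (sketch): settings exported for the worker to import.
module.exports = { host: '0.0.0.0', port: 8765, batchSize: 256 }
```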

Remember, this is a simple draft. Perhaps someone would like to implement it. There are many solutions to the problem, and it would be great if we could get an answer from marcelklehr so we know what he actually would like to have.
Finally, we could overcome the limitation of the Nextcloud app repository that prevents us from adding back GPU support.

@Leptopoda

Why would a Node.js socket server be needed?
TensorFlow Serving already has a RESTful API that we could interact with directly. Also, the latest version introduces a batch size option that sounds like what you wanted.

I think you mean the device running TF Serving should fetch the jobs from the server, but that's not how TF Serving is meant to be used.
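
For reference, server-side batching in TensorFlow Serving is enabled via startup flags plus a small parameters file; the model name, paths, and values below are illustrative, not tuned:

```
# Illustrative invocation; model name and paths are placeholders.
tensorflow_model_server \
  --rest_api_port=8501 \
  --model_name=recognize \
  --model_base_path=/models/recognize \
  --enable_batching=true \
  --batching_parameters_file=/config/batching.conf

# /config/batching.conf (protobuf text format), e.g.:
max_batch_size { value: 128 }
batch_timeout_micros { value: 5000 }
num_batch_threads { value: 4 }
```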

@pktiuk

pktiuk commented Mar 24, 2023

@Leptopoda I fully agree with you.

BTW, regarding how to implement it: I think you should wait for feedback from the maintainer of this repo (@marcelklehr), but he is on vacation right now, so it may take some time to hear from him.

@arch-user-france1
Author

I didn't mean using TF Serving, but building something on my own. But if you want to use it, yes, that may be better.

@617a7a

617a7a commented Dec 5, 2023

Any update on this?

@Tsaukpaetra

(Commenting to add myself to notifications.)
I am also interested in testing anything that can help this proceed. I recently updated from NC 23 (phew) and saw this nifty app. I enabled it, and then made a 😓 face when I realized the dinky little Celeron the server runs on would take years to process the current files (let alone any new ones ingested as part of the family archive project). Meanwhile, my gaming PC sits idle and I'm left pondering how to bridge the gap to that immense power for the times it is needed on demand. :)

@RudolfAchter

Same situation here. It looks to me like a huge pain to get TensorFlow running in the nextcloud-aio Alpine image. For me there would also be the benefit of using the power of my gaming PC.
Thinking of enterprises, you could offload the GPU load to separate GPU servers or even a GPU server cluster for analysis.

@marcelklehr
Member

Nextcloud GmbH is planning to move the classifiers in recognize to Docker containers as part of the External Apps Ecosystem in the coming months.

@szaimen
Contributor

szaimen commented Jan 14, 2024

> Nextcloud GmbH is planning to move the classifiers in recognize to Docker containers as part of the External Apps Ecosystem in the coming months.

Sounds great! As soon as it is available via the External Apps Ecosystem, it will also automatically be available in AIO after one enables the docker socket proxy in the AIO interface and installs the app from the Nextcloud apps page :)
