GPU is not being used #67

derritter88 · 2021-08-26T06:37:55Z

I have enabled GPU support in the admin GUI, but when I start a manual run via occ recognize:classify I can see a process being started that uses 100 % of one CPU core.
The GPU is not being used.

I have installed all the specified Nvidia applications/libraries.

marcelklehr · 2021-08-26T10:10:26Z

Hi!

Are there any messages in the nextcloud log?

derritter88 · 2021-08-26T10:25:27Z

Unfortunately not. The only "warning" I can see in my log is:
[recognize] Warning: Classifying photos of user 3A60C52D-9415-4F28-A2B7-71A8CBD7A9E3 at 2021-08-26T08:37:57+02:00

The only thing I can see in my shell is that www-data is running node-v14.17.4-linux-x64.
This process cannot be stopped or killed, and even a reboot does not solve it.
I need to reset the whole VM to get the process killed.

derritter88 · 2021-08-26T10:26:52Z

What I additionally see in the log (though it is not linked to my manual start of the classification process) is:
`[recognize] Warning: Classifier process output: 2021-08-26 07:59:08.434295: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[]

at 2021-08-26T07:59:08+02:00`

derritter88 · 2021-08-26T10:27:18Z

and:

[index] Error: Call to a member function getOwner() on null

GET /index.php/apps/recognize/admin/countMissed
from 192.168.10.2 by 3A60C52D-9415-4F28-A2B7-71A8CBD7A9E3 at 2021-08-26T08:19:11+02:00

But I am not sure if this is linked to this issue or not.

derritter88 · 2021-08-27T04:42:53Z

Okay, so during the night the new version became available for download, and I installed it this morning.
Nextcloud 22.1.1
Recognize 1.6.3

When manually starting the process I get the following error message:
Classifying photos of user ED17CAA4-EC2F-4457-95AB-A5980927C9C8
Failed to classify images
Classifier process error

My log says:
[recognize] Warning: Classifier process output: 2021-08-27 06:42:20.937775: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Error: Cannot find module '@tensorflow/tfjs-node-gpu'
Require stack:

/var/www/cloud/apps/recognize/src/efficientnet/EfficientnetModel.js
/var/www/cloud/apps/recognize/src/classifier_imagenet.js
at Function.Module._resolveFilename (internal/modules/cjs/loader.js:889:15)
at Function.Module._load (internal/modules/cjs/loader.js:745:27)
at Module.require (internal/modules/cjs/loader.js:961:19)
at require (internal/modules/cjs/helpers.js:92:18)
at Object.<anonymous> (/var/www/cloud/apps/recognize/src/efficientnet/EfficientnetModel.js:11:9)
at Module._compile (internal/modules/cjs/loader.js:1072:14)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:1101:10)
at Module.load (internal/modules/cjs/loader.js:937:32)
at Function.Module._load (internal/modules/cjs/loader.js:778:12)
at Module.require (internal/modules/cjs/loader.js:961:19) {
code: 'MODULE_NOT_FOUND',
requireStack: [
'/var/www/cloud/apps/recognize/src/efficientnet/EfficientnetModel.js',
'/var/www/cloud/apps/recognize/src/classifier_imagenet.js'
]
}
Trying js-only mode
internal/modules/cjs/loader.js:892
throw err;
^

Error: Cannot find module '@tensorflow/tfjs-backend-wasm'
Require stack:

/var/www/cloud/apps/recognize/src/efficientnet/EfficientnetModel.js
/var/www/cloud/apps/recognize/src/classifier_imagenet.js
at Function.Module._resolveFilename (internal/modules/cjs/loader.js:889:15)
at Function.Module._load (internal/modules/cjs/loader.js:745:27)
at Module.require (internal/modules/cjs/loader.js:961:19)
at require (internal/modules/cjs/helpers.js:92:18)
at Object.<anonymous> (/var/www/cloud/apps/recognize/src/efficientnet/EfficientnetModel.js:19:3)
at Module._compile (internal/modules/cjs/loader.js:1072:14)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:1101:10)
at Module.load (internal/modules/cjs/loader.js:937:32)
at Function.Module._load (internal/modules/cjs/loader.js:778:12)
at Module.require (internal/modules/cjs/loader.js:961:19) {
code: 'MODULE_NOT_FOUND',
requireStack: [
'/var/www/cloud/apps/recognize/src/efficientnet/EfficientnetModel.js',
'/var/www/cloud/apps/recognize/src/classifier_imagenet.js'
]
}

at 2021-08-27T06:42:20+02:00

derritter88 · 2021-08-27T05:19:58Z

So it looks like @tensorflow/tfjs-node-gpu and @tensorflow/tfjs-backend-wasm are not included in the NC app.
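
For anyone hitting the same MODULE_NOT_FOUND error, one way to confirm which packages are missing is to ask Node whether it can resolve them from the app directory. The snippet below is only a sketch: `checkModules` is a hypothetical helper, not part of Recognize, and the directory you pass in (for example the app's install folder) depends on your setup.

```javascript
// Hypothetical helper (not part of Recognize): reports which packages
// Node can resolve when looking them up from a given directory.
function checkModules(names, fromDir = process.cwd()) {
  const result = {};
  for (const name of names) {
    try {
      // require.resolve only locates the module file; it does not load it.
      result[name] = require.resolve(name, { paths: [fromDir] });
    } catch (err) {
      // Anything other than "not found" is unexpected, so re-throw it.
      if (err.code !== 'MODULE_NOT_FOUND') throw err;
      result[name] = null;
    }
  }
  return result;
}

// The two package names are taken from the error log above.
console.log(
  checkModules(['@tensorflow/tfjs-node-gpu', '@tensorflow/tfjs-backend-wasm'])
);
```

A `null` entry means Node could not find that package from the given directory, which would match the error in the log.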

marcelklehr · 2021-08-27T11:21:11Z

I've had to disable GPU for now, because the bundle would exceed the bundle size limit :/

derritter88 · 2021-08-27T11:47:16Z

> I've had to disable GPU for now, because the bundle would exceed the bundle size limit :/

The limitation from the Nextcloud appstore?

marcelklehr · 2021-08-27T12:12:04Z

Yeah

derritter88 · 2021-08-27T12:22:58Z

Okay, would it be possible that you create a "Github-only" version of it (e.g. xxx-RC1) so I can download and test it?

marcelklehr · 2021-08-28T14:13:29Z

I'll definitely try to make something available. Currently, my problem is that I have to develop that blindly, as I don't have a GPU machine available.

derritter88 · 2021-08-28T14:21:21Z

If you want, you can package it up for me and I will act as your alpha/beta tester?!

arch-user-france1 · 2021-09-21T15:31:37Z

I'm testing it with my NVIDIA GeForce GTX 1660 Super (CUDA is supported, even though I couldn't find it on the list).

First I have to set up another instance...
I'm using an older version where GPU support is still integrated.

arch-user-france1 · 2021-09-21T16:05:58Z

lol nextcloud apps is down :(

Now I can wait even longer

marcelklehr · 2021-10-12T20:32:23Z

GPU support has to wait until other issues are sorted out, sorry.

derritter88 · 2021-10-13T13:31:26Z

Okay so for the moment I can remove all necessary Nvidia libraries (except driver)?

marcelklehr · 2021-10-13T13:34:47Z

> Okay so for the moment I can remove all necessary Nvidia libraries (except driver)?

For the moment no NVIDIA drivers and libraries are needed, but they won't hurt either, so it's up to you.

derritter88 · 2021-10-13T13:36:52Z

It's just a bit complex to install different CUDA libraries/versions, which is why I am asking :-)
At the moment I am sticking with CUDA 11.2, as you mentioned it in a previous version.

derritter88 · 2021-11-18T06:27:45Z

@marcelklehr just in case: Windows now supports NVIDIA GPUs within WSL, which I am using.
So if you have any tests which I could do just let me know.

bugsyb · 2022-12-01T22:41:03Z

@derritter88, did you get it working?
I have NC in Docker and have been able to give containers access to the GPU, e.g. with this TensorFlow example:
docker run --gpus all -it --rm tensorflow/tensorflow:latest-gpu python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"

The NVIDIA examples give similar results:

#docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "Maxwell" with compute capability 5.0

> Compute 5.0 CUDA device: [NVIDIA GeForce GTX 960M]
5120 bodies, total time for 10 iterations: 6.155 ms
= 42.591 billion interactions per second
= 851.816 single-precision GFLOP/s at 20 flops per interaction
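
The container checks above show that Docker can reach the GPU. A comparable check can be made from Node.js, the runtime the classifier process runs under according to the logs earlier in this thread. This is a minimal sketch that assumes `nvidia-smi` is on the PATH of the user running it; `listNvidiaGpus` is a made-up helper name, and a non-empty result only shows that the driver is visible to that process, not that Recognize will use it.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper: returns the GPU names reported by nvidia-smi,
// or an empty array if the tool is missing or exits with an error.
// `run` can be replaced with a stub for testing.
function listNvidiaGpus(run = spawnSync) {
  const res = run(
    'nvidia-smi',
    ['--query-gpu=name', '--format=csv,noheader'],
    { encoding: 'utf8' }
  );
  if (res.error || res.status !== 0) return [];
  // One GPU name per line; drop blank lines and surrounding whitespace.
  return res.stdout
    .split('\n')
    .map((line) => line.trim())
    .filter(Boolean);
}

console.log(listNvidiaGpus());
```

Running it as the web server user (www-data in the earlier comments) would show whether that account can see the GPU at all.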

I have a big archive of photos to process, and running that on the CPU is too much of a load.

Thanks for any hints on how to get it working. I am not shy about customizing the NC container or whatever else is needed.

derritter88 · 2022-12-02T07:04:26Z

Hello @bugsyb ,

thanks for sharing this with me/us. It might be useful information for some people, but unfortunately I do not run Nextcloud as a Docker container.
I "just" have a regular dedicated Nextcloud VM.
I also had around 100k photos/images to classify, but my CPU handled that over the last couple of weeks.

I had some discussions with @marcelklehr about it, and the major problem would be finding an AI library like TensorFlow that can handle both Nvidia and AMD GPUs.

bugsyb · 2022-12-02T19:20:25Z

Hi @derritter88 ,

Thanks for the swift response.

I took a quick look at what gets installed as part of Recognize, and it looks like tensorflow-webgl ends up in there.
There is also a flag in the code which suggests it should be possible even today:
process.env.RECOGNIZE_GPU
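
How Recognize actually interprets this flag is not shown in this thread, so the following is only a sketch of the general pattern the log output suggests: try a GPU backend when the flag is set, then fall back to the next package. The comparison against the string 'true', the use of `@tensorflow/tfjs-node` as the CPU candidate, and both function names are assumptions made for illustration.

```javascript
// Package names: the GPU and WASM ones appear in the log earlier in the
// thread; the plain CPU package is an assumption for this sketch.
const GPU_BACKEND = '@tensorflow/tfjs-node-gpu';
const CPU_BACKEND = '@tensorflow/tfjs-node';
const WASM_BACKEND = '@tensorflow/tfjs-backend-wasm';

// Assumption: the flag counts as enabled when it equals the string 'true'.
function backendCandidates(env = process.env) {
  const wantGpu = env.RECOGNIZE_GPU === 'true';
  return wantGpu
    ? [GPU_BACKEND, CPU_BACKEND, WASM_BACKEND]
    : [CPU_BACKEND, WASM_BACKEND];
}

// Tries each candidate in order and returns the first one that loads.
// `load` defaults to require and can be swapped out for testing.
function loadFirstAvailable(candidates, load = require) {
  const errors = [];
  for (const name of candidates) {
    try {
      return { name, module: load(name) };
    } catch (err) {
      errors.push(`${name}: ${err.message}`);
    }
  }
  throw new Error(`No backend could be loaded:\n${errors.join('\n')}`);
}
```

With this shape, a missing GPU package degrades to a slower backend instead of aborting, and the collected error messages show which packages were tried.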

I was hoping that, given your earlier engagement, you would know how to get Recognize to use the GPU.

I also have a large number of photos to process and, well, hoped I could leverage a GPU that is otherwise wasted.

I run most apps as containers these days, for simplicity, dependency handling and ease of portability between systems. Happy to share knowledge on the side if you are interested.

Regarding Nvidia and AMD GPUs: TensorFlow can be run natively as well as in a container, as demonstrated for Nvidia.

Here are some explanations covering AMD:
https://community.amd.com/t5/hsa/tensorflow-with-amd-gpu/td-p/199925
https://medium.com/analytics-vidhya/install-tensorflow-2-for-amd-gpus-87e8d7aeb812
https://www.amd.com/en/technologies/infinity-hub/tensorflow
https://tealfeed.com/install-tensorflow-gpu-amd-gpus-vbs7s

There is also another implementation, DirectML, though from what I read online it targets Windows and WSL, so it would not be usable on standard Linux (I am not sure about the latter, though).

If we could get started with Nvidia, which is more popular among people who use it on Linux (not so much for gaming ;) ), that would be great, especially as TensorFlow is already available.

I can't help much with AMD as I don't have one.

derritter88 · 2022-12-02T20:48:41Z

To be honest, I gave up on this topic and passed my GPU to a Plex VM for video transcoding, but maybe @marcelklehr could improve the general logic of Recognize?

Doomsdayrs · 2022-12-03T22:48:51Z

I have an AMD GPU in a laptop that I use for Nextcloud.

arch-user-france1 · 2022-12-03T23:50:14Z

> I have an AMD gpu in a laptop that I use for nextcloud

AMD GPUs probably won't work anyways.

marcelklehr mentioned this issue Aug 28, 2021

Coral Usb Accelerator Support #69

Open

marcelklehr added this to To do in Recognize Sep 29, 2021

marcelklehr added the enhancement New feature or request label Oct 21, 2022

marcelklehr added a commit that referenced this issue Dec 4, 2022

Implement GPU mode

a0d74e6

fixes #67 Signed-off-by: Marcel Klehr <mklehr@gmx.net>

marcelklehr mentioned this issue Dec 4, 2022

Implement GPU mode #529

Merged

marcelklehr closed this as completed in #529 Dec 4, 2022

Recognize automation moved this from To do to Done Dec 4, 2022
