immich-app · mertalev · Feb 3, 2024 · Feb 3, 2024
diff --git a/docs/docs/features/hardware-transcoding.md b/docs/docs/features/hardware-transcoding.md
@@ -4,6 +4,10 @@ This feature allows you to use a GPU to accelerate transcoding and reduce CPU lo
 Note that hardware transcoding is much less efficient for file sizes.
 As this is a new feature, it is still experimental and may not work on all systems.
 
+:::info
+You do not need to redo any transcoding jobs after enabling hardware acceleration. The acceleration device will be used for any jobs that run after enabling it.
+:::
+
 ## Supported APIs
 
 - NVENC (NVIDIA)
@@ -50,29 +54,49 @@ As this is a new feature, it is still experimental and may not work on all syste
 3. Redeploy the `immich-microservices` container with these updated settings.
 4. In the Admin page under `Video transcoding settings`, change the hardware acceleration setting to the appropriate option and save.
 
-#### All-In-One - Unraid Setup
+#### Single Compose File
 
-##### NVENC - NVIDIA GPUs
-
-1. In the container app, add this environmental variable: Key=`NVIDIA_VISIBLE_DEVICES` Value=`all`
-2. While still in the container app, change the container from Basic Mode to Advanced Mode and add the following parameter to the Extra Parameters field: `--runtime=nvidia`
-3. Restart the container app.
-4. Continue to step 4 of "Basic Setup".
-
-##### Other APIs
-
-Unraid does not currently support multiple Compose files. As an alternative, you can "inline" the relevant contents of the [`hwaccel.transcoding.yml`][hw-file] file into the `immich-microservices` service directly.
+Some platforms, including Unraid and Portainer, do not support multiple Compose files at the time of writing. As an alternative, you can "inline" the relevant contents of the [`hwaccel.transcoding.yml`][hw-file] file into the `immich-microservices` service directly.
 
 For example, the `qsv` section in this file is:
 
-```
+```yaml
 devices:
   - /dev/dri:/dev/dri
 ```
 
-You can add this to the `immich-microservices` service instead of extending from `hwaccel.transcoding.yml`.
+You can add this to the `immich-microservices` service instead of extending from `hwaccel.transcoding.yml`:
+
+```yaml
+immich-microservices:
+  container_name: immich_microservices
+  image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
+  # Note the lack of an `extends` section
+  devices:
+    - /dev/dri:/dev/dri
+  command: ['start.sh', 'microservices']
+  volumes:
+    - ${UPLOAD_LOCATION}:/usr/src/app/upload
+    - /etc/localtime:/etc/localtime:ro
+  env_file:
+    - .env
+  depends_on:
+    - redis
+    - database
+  restart: always
+```
+
 Once this is done, you can continue to step 3 of "Basic Setup".
 
+#### All-In-One - Unraid Setup
+
+##### NVENC - NVIDIA GPUs
+
+1. In the container app, add this environment variable: Key=`NVIDIA_VISIBLE_DEVICES` Value=`all`
+2. While still in the container app, change the container from Basic Mode to Advanced Mode and add the following parameter to the Extra Parameters field: `--runtime=nvidia`
+3. Restart the container app.
+4. Continue to step 4 of "Basic Setup".
+
 ## Tips
 
 - You may want to choose a slower preset than for software transcoding to maintain quality and efficiency
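
As a further illustration of the single-file approach in the hunk above, here is a hedged sketch of inlining an NVIDIA device reservation in place of the `qsv` device mapping. It assumes the `nvenc` section of `hwaccel.transcoding.yml` uses the same `deploy` reservation that the second file in this diff shows for `cuda`; that is an assumption, so copy the section from your own copy of the file rather than from here.

```yaml
immich-microservices:
  container_name: immich_microservices
  image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
  # No `extends` section: the device settings are inlined below.
  # ASSUMPTION: this reservation mirrors the `cuda` block from hwaccel.ml.yml;
  # verify it against the `nvenc` section of your hwaccel.transcoding.yml.
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            count: 1
            capabilities:
              - gpu
              - compute
              - video
  command: ['start.sh', 'microservices']
  volumes:
    - ${UPLOAD_LOCATION}:/usr/src/app/upload
    - /etc/localtime:/etc/localtime:ro
  env_file:
    - .env
  depends_on:
    - redis
    - database
  restart: always
```

The rest of the service definition is unchanged from the `qsv` example; only the device block differs.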

diff --git a/docs/docs/features/ml-hardware-acceleration.md b/docs/docs/features/ml-hardware-acceleration.md
@@ -3,7 +3,11 @@
 This feature allows you to use a GPU to accelerate machine learning tasks, such as Smart Search and Facial Recognition, while reducing CPU load.
 As this is a new feature, it is still experimental and may not work on all systems.
 
-## Supported APIs
+:::info
+You do not need to redo any machine learning jobs after enabling hardware acceleration. The acceleration device will be used for any jobs that run after enabling it.
+:::
+
+## Supported Backends
 
 - ARM NN (Mali)
 - CUDA (NVIDIA)
@@ -14,7 +18,8 @@ As this is a new feature, it is still experimental and may not work on all syste
 - The instructions and configurations here are specific to Docker Compose. Other container engines may require different configuration.
 - Only Linux and Windows (through WSL2) servers are supported.
 - ARM NN is only supported on devices with Mali GPUs. Other Arm devices are not supported.
-- The OpenVINO backend has only been tested on an iGPU. ARC GPUs may not work without other changes.
+- There is currently an upstream issue with OpenVINO, so whether it works depends on the device.
+- Some models may not be compatible with certain backends. CUDA is the most reliable.
 
 ## Prerequisites
 
@@ -40,10 +45,60 @@ As this is a new feature, it is still experimental and may not work on all syste
 2. In the `docker-compose.yml` under `immich-machine-learning`, uncomment the `extends` section and change `cpu` to the appropriate backend.
 3. Redeploy the `immich-machine-learning` container with these updated settings.
 
+#### Single Compose File
+
+Some platforms, including Unraid and Portainer, do not support multiple Compose files at the time of writing. As an alternative, you can "inline" the relevant contents of the [`hwaccel.ml.yml`][hw-file] file into the `immich-machine-learning` service directly.
+
+For example, the `cuda` section in this file is:
+
+```yaml
+deploy:
+  resources:
+    reservations:
+      devices:
+        - driver: nvidia
+          count: 1
+          capabilities:
+            - gpu
+            - compute
+            - video
+```
+
+You can add this to the `immich-machine-learning` service instead of extending from `hwaccel.ml.yml`:
+
+```yaml
+immich-machine-learning:
+  container_name: immich_machine_learning
+  image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
+  # Note the lack of an `extends` section
+  deploy:
+    resources:
+      reservations:
+        devices:
+          - driver: nvidia
+            count: 1
+            capabilities:
+              - gpu
+              - compute
+              - video
+  volumes:
+    - model-cache:/cache
+  env_file:
+    - .env
+  restart: always
+```
+
+Once this is done, you can redeploy the `immich-machine-learning` container.
+
+:::info
+You can confirm that the device is recognized and in use by checking its utilization (for example, with `nvtop` for CUDA or `intel_gpu_top` for OpenVINO). You can also enable debug logging by setting `LOG_LEVEL=debug` in the `.env` file and restarting the `immich-machine-learning` container. When a Smart Search or Face Detection job begins, you should see a log entry for `Available ORT providers` that lists the relevant provider. In the case of ARM NN, the absence of a `Could not load ANN shared libraries` log entry means it loaded successfully.
+:::
+
 [hw-file]: https://github.com/immich-app/immich/releases/latest/download/hwaccel.ml.yml
 [nvcr]: https://github.com/NVIDIA/nvidia-container-runtime/
 
 ## Tips
 
+- If you encounter an error when a model is running, try a different model to see if the issue is model-specific.
 - You may want to increase concurrency past the default for higher utilization. However, keep in mind that this will also increase VRAM consumption.
 - Larger models benefit more from hardware acceleration, if you have the VRAM for them.
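
The debug-logging tip in the hunk above sets `LOG_LEVEL=debug` in the `.env` file. As a sketch of an alternative for single-file setups, the same variable could be set on the service itself through Compose's `environment` key, which takes precedence over values loaded through `env_file`. This assumes the container reads `LOG_LEVEL` from its environment, as the `.env` route implies; the docs in this diff only describe the `.env` route.

```yaml
immich-machine-learning:
  container_name: immich_machine_learning
  image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
  environment:
    # ASSUMPTION: equivalent to setting LOG_LEVEL=debug in .env.
    # Remove this entry once debugging is finished.
    - LOG_LEVEL=debug
  volumes:
    - model-cache:/cache
  env_file:
    - .env
  restart: always
```

After redeploying, the `Available ORT providers` entry mentioned above should appear in the container logs when a job starts.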