Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs(ml,server): updated hwaccel docs #6878

Merged
merged 1 commit into from
Feb 3, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
50 changes: 37 additions & 13 deletions docs/docs/features/hardware-transcoding.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,10 @@ This feature allows you to use a GPU to accelerate transcoding and reduce CPU lo
Note that hardware transcoding is much less efficient for file sizes.
As this is a new feature, it is still experimental and may not work on all systems.

:::info
You do not need to redo any transcoding jobs after enabling hardware acceleration. The acceleration device will be used for any jobs that run after enabling it.
:::

## Supported APIs

- NVENC (NVIDIA)
Expand Down Expand Up @@ -50,29 +54,49 @@ As this is a new feature, it is still experimental and may not work on all syste
3. Redeploy the `immich-microservices` container with these updated settings.
4. In the Admin page under `Video transcoding settings`, change the hardware acceleration setting to the appropriate option and save.

#### All-In-One - Unraid Setup
#### Single Compose File

##### NVENC - NVIDIA GPUs

1. In the container app, add this environmental variable: Key=`NVIDIA_VISIBLE_DEVICES` Value=`all`
2. While still in the container app, change the container from Basic Mode to Advanced Mode and add the following parameter to the Extra Parameters field: `--runtime=nvidia`
3. Restart the container app.
4. Continue to step 4 of "Basic Setup".

##### Other APIs

Unraid does not currently support multiple Compose files. As an alternative, you can "inline" the relevant contents of the [`hwaccel.transcoding.yml`][hw-file] file into the `immich-microservices` service directly.
Some platforms, including Unraid and Portainer, do not support multiple Compose files as of writing. As an alternative, you can "inline" the relevant contents of the [`hwaccel.transcoding.yml`][hw-file] file into the `immich-microservices` service directly.

For example, the `qsv` section in this file is:

```
```yaml
devices:
- /dev/dri:/dev/dri
```

You can add this to the `immich-microservices` service instead of extending from `hwaccel.transcoding.yml`.
You can add this to the `immich-microservices` service instead of extending from `hwaccel.transcoding.yml`:

```yaml
immich-microservices:
container_name: immich_microservices
image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
# Note the lack of an `extends` section
devices:
- /dev/dri:/dev/dri
command: ['start.sh', 'microservices']
volumes:
- ${UPLOAD_LOCATION}:/usr/src/app/upload
- /etc/localtime:/etc/localtime:ro
env_file:
- .env
depends_on:
- redis
- database
restart: always
```

Once this is done, you can continue to step 3 of "Basic Setup".

#### All-In-One - Unraid Setup

##### NVENC - NVIDIA GPUs

1. In the container app, add this environmental variable: Key=`NVIDIA_VISIBLE_DEVICES` Value=`all`
2. While still in the container app, change the container from Basic Mode to Advanced Mode and add the following parameter to the Extra Parameters field: `--runtime=nvidia`
3. Restart the container app.
4. Continue to step 4 of "Basic Setup".

## Tips

- You may want to choose a slower preset than for software transcoding to maintain quality and efficiency
Expand Down
59 changes: 57 additions & 2 deletions docs/docs/features/ml-hardware-acceleration.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,11 @@
This feature allows you to use a GPU to accelerate machine learning tasks, such as Smart Search and Facial Recognition, while reducing CPU load.
As this is a new feature, it is still experimental and may not work on all systems.

## Supported APIs
:::info
You do not need to redo any machine learning jobs after enabling hardware acceleration. The acceleration device will be used for any jobs that run after enabling it.
:::

## Supported Backends

- ARM NN (Mali)
- CUDA (NVIDIA)
Expand All @@ -14,7 +18,8 @@ As this is a new feature, it is still experimental and may not work on all syste
- The instructions and configurations here are specific to Docker Compose. Other container engines may require different configuration.
- Only Linux and Windows (through WSL2) servers are supported.
- ARM NN is only supported on devices with Mali GPUs. Other Arm devices are not supported.
- The OpenVINO backend has only been tested on an iGPU. ARC GPUs may not work without other changes.
- There is currently an upstream issue with OpenVINO, so whether it will work is device-dependent.
- Some models may not be compatible with certain backends. CUDA is the most reliable.

## Prerequisites

Expand All @@ -40,10 +45,60 @@ As this is a new feature, it is still experimental and may not work on all syste
2. In the `docker-compose.yml` under `immich-machine-learning`, uncomment the `extends` section and change `cpu` to the appropriate backend.
3. Redeploy the `immich-machine-learning` container with these updated settings.

#### Single Compose File

Some platforms, including Unraid and Portainer, do not support multiple Compose files as of writing. As an alternative, you can "inline" the relevant contents of the [`hwaccel.ml.yml`][hw-file] file into the `immich-machine-learning` service directly.

For example, the `cuda` section in this file is:

```yaml
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities:
- gpu
- compute
- video
```

You can add this to the `immich-machine-learning` service instead of extending from `hwaccel.ml.yml`:

```yaml
immich-machine-learning:
container_name: immich_machine_learning
image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
# Note the lack of an `extends` section
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities:
- gpu
- compute
- video
volumes:
- model-cache:/cache
env_file:
- .env
restart: always
```

Once this is done, you can redeploy the `immich-machine-learning` container.

:::info
You can confirm the device is being recognized and used by checking its utilization (via `nvtop` for CUDA, `intel_gpu_top` for OpenVINO, etc.). You can also enable debug logging by setting `LOG_LEVEL=debug` in the `.env` file and restarting the `immich-machine-learning` container. When a Smart Search or Face Detection job begins, you should see a log for `Available ORT providers` containing the relevant provider. In the case of ARM NN, the absence of a `Could not load ANN shared libraries` log entry means it loaded successfully.
:::

[hw-file]: https://github.com/immich-app/immich/releases/latest/download/hwaccel.ml.yml
[nvcr]: https://github.com/NVIDIA/nvidia-container-runtime/

## Tips

- If you encounter an error when a model is running, try a different model to see if the issue is model-specific.
- You may want to increase concurrency past the default for higher utilization. However, keep in mind that this will also increase VRAM consumption.
- Larger models benefit more from hardware acceleration, if you have the VRAM for them.