This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

CUDA 12 issue with GTX1070 #8570

Closed
1 of 3 tasks
OperKH opened this issue Apr 6, 2024 · 0 comments
Comments

OperKH (Contributor) commented Apr 6, 2024

The bug

Smart search has been broken since v1.98.0.

It might be caused by #7569

Here are the immich_machine_learning logs:

[04/06/24 16:06:30] INFO     Booting worker with pid: 40                        
[04/06/24 16:06:33] INFO     Started server process [40]                        
[04/06/24 16:06:33] INFO     Waiting for application startup.                   
[04/06/24 16:06:33] INFO     Created in-memory cache with unloading after 300s  
                             of inactivity.                                     
[04/06/24 16:06:33] INFO     Initialized request thread pool with 8 threads.    
[04/06/24 16:06:33] INFO     Application startup complete.                      
[04/06/24 16:07:36] INFO     Setting                                            
                             'XLM-Roberta-Large-ViT-H-14__frozen_laion5b_s13b_b9
                             0k' execution providers to                         
                             ['CUDAExecutionProvider', 'CPUExecutionProvider'], 
                             in descending order of preference                  
[04/06/24 16:07:36] INFO     Loading clip model                                 
                             'XLM-Roberta-Large-ViT-H-14__frozen_laion5b_s13b_b9
                             0k' to memory                                      
[04/06/24 16:07:37] ERROR    Worker (pid:40) was sent code 139!                 
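The last log line is the key detail: Gunicorn reports the dead worker's wait status, and 139 = 128 + 11 means the worker was killed by signal 11 (SIGSEGV), i.e. a native crash (likely in the CUDA/onnxruntime stack) rather than a Python exception. A minimal sketch of the decoding:

```python
import signal

exit_code = 139  # from "Worker (pid:40) was sent code 139!"
sig = exit_code - 128  # exit codes above 128 encode "killed by signal N" as 128 + N
print(signal.Signals(sig).name)  # SIGSEGV
```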

I use NVIDIA GTX1070

nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
|  0%   51C    P8    16W / 230W |      1MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

It looks like my driver supports CUDA 12.0, while as far as I can see in #7569 you are building with CUDA 12.2.2.
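As a rough sanity check (the header string below is copied from the nvidia-smi output above, and the 12.2 target is taken from #7569; this is just an illustrative version comparison, not an official compatibility test), the driver's reported CUDA version can be compared against the toolkit version the image was built with:

```python
import re

# Header line copied verbatim from the nvidia-smi output above.
header = "| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |"

# Parse the "CUDA Version: X.Y" field into a comparable tuple.
driver_cuda = tuple(int(x) for x in re.search(r"CUDA Version: (\d+)\.(\d+)", header).groups())
image_cuda = (12, 2)  # CUDA toolkit version of the image, per #7569

print(driver_cuda)                # (12, 0)
print(driver_cuda >= image_cuda)  # False: the driver reports an older CUDA than the image targets
```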

For now I have downgraded to immich-machine-learning:v1.97.0-cuda and smart search works again.
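For reference, this workaround amounts to pinning only the machine-learning service's image tag in docker-compose.yml (leaving the server on the release tag), e.g.:

```yaml
  immich-machine-learning:
    container_name: immich_machine_learning
    image: ghcr.io/immich-app/immich-machine-learning:v1.97.0-cuda
```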

The OS that Immich Server is running on

Ubuntu 22.04

Version of Immich Server

v1.101.0

Version of Immich Mobile App

v1.101.0

Platform with the issue

  • Server
  • Web
  • Mobile

Your docker-compose.yml content

name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    command: ['start.sh', 'immich']
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    networks:
      - immich
      - proxy
    env_file:
      - .env
    depends_on:
      - redis
      - database
    restart: always

  immich-microservices:
    user: 0:998
    container_name: immich_microservices
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/hardware-transcoding
      file: hwaccel.transcoding.yml 
      service: quicksync # set to one of [nvenc, quicksync, rkmpp, vaapi, vaapi-wsl] for accelerated transcoding
    command: ['start.sh', 'microservices']
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    networks:
      - immich
    env_file:
      - .env
    depends_on:
      - redis
      - database
    restart: always

  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-cuda
    extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
      file: hwaccel.ml.yml
      service: cuda # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - ${MODEL_CACHE}:/cache
    networks:
      - immich
    env_file:
      - .env
    restart: always

  redis:
    container_name: immich_redis
    image: registry.hub.docker.com/library/redis:6.2-alpine@sha256:51d6c56749a4243096327e3fb964a48ed92254357108449cb6e23999c37773c5
    networks:
      - immich
    restart: always

  database:
    container_name: immich_postgres
    image: registry.hub.docker.com/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    env_file:
      - .env
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
    volumes:
      - ${PG_DATA}:/var/lib/postgresql/data
    networks:
      - immich
    restart: always

networks:
  immich:
  proxy:
    external: true

Your .env content

# You can find documentation for all the supported env variables at https://immich.app/docs/install/environment-variables

# The location where your uploaded files are stored
UPLOAD_LOCATION=/home/immich

# Volumes
PG_DATA=/srv/appdata/immich/pgdata
MODEL_CACHE=/srv/appdata/immich/model-cache

# The Immich version to use. You can pin this to a specific version like "v1.71.0"
IMMICH_VERSION=release

# Connection secrets for postgres and typesense. You should change these to random passwords
DB_PASSWORD=

# The values below this line do not need to be changed
###################################################################################
DB_HOSTNAME=database
DB_USERNAME=postgres
DB_DATABASE_NAME=immich

REDIS_HOSTNAME=redis

Reproduction steps

Open the website and search for e.g. `car`. You will see an error, and the `immich_machine_learning` container logs will show:

[04/06/24 16:06:30] INFO     Booting worker with pid: 40                        
[04/06/24 16:06:33] INFO     Started server process [40]                        
[04/06/24 16:06:33] INFO     Waiting for application startup.                   
[04/06/24 16:06:33] INFO     Created in-memory cache with unloading after 300s  
                             of inactivity.                                     
[04/06/24 16:06:33] INFO     Initialized request thread pool with 8 threads.    
[04/06/24 16:06:33] INFO     Application startup complete.                      
[04/06/24 16:07:36] INFO     Setting                                            
                             'XLM-Roberta-Large-ViT-H-14__frozen_laion5b_s13b_b9
                             0k' execution providers to                         
                             ['CUDAExecutionProvider', 'CPUExecutionProvider'], 
                             in descending order of preference                  
[04/06/24 16:07:36] INFO     Loading clip model                                 
                             'XLM-Roberta-Large-ViT-H-14__frozen_laion5b_s13b_b9
                             0k' to memory                                      
[04/06/24 16:07:37] ERROR    Worker (pid:40) was sent code 139!                 


### Additional information

I also experimented with [localai.io](https://localai.io/): they ship both CUDA 11 and CUDA 12 images, and their CUDA 12 images initially did not work on my GTX 1070 either. However, after several localai version updates, I can now run their CUDA 12 images.
@immich-app immich-app locked and limited conversation to collaborators Apr 6, 2024
@mertalev mertalev converted this issue into discussion #8575 Apr 6, 2024

