
CUDA Dockerfile - model produces garbage output on GPU (if layers offloaded to GPU) #597

@pradhyumna85

Description


Current Behavior

If we build the CUDA Dockerfile (https://github.com/abetlen/llama-cpp-python/blob/main/docker/cuda_simple/Dockerfile) and run it with any GGML model file while offloading layers to the GPU, the model outputs garbage text.

By garbage I mean absolute garbage: symbols, special characters, Unicode characters, etc.; no English or any other language.

Environment and Context

Running on CentOS 7 with Docker and the NVIDIA Container Toolkit installed.
Instance: AWS g4dn.12xlarge (4× NVIDIA T4, 64 GB VRAM total)

The exact same issue was seen with cuBLAS in a koboldcpp Docker repo (koboldcpp is also built on llama.cpp):

issue: bartowski1182/koboldcpp-docker#1
fix commit: bartowski1182/koboldcpp-docker@331326a

Fix

I'll raise a pull request with the updated Dockerfile.
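For reference, here is a minimal sketch of the kind of change involved, assuming the root cause is that the CUDA kernels are compiled for the wrong (or a default) compute capability. The architecture value and exact flags are illustrative; the actual PR may differ. "75" is the compute capability of the T4 GPUs mentioned above.

```dockerfile
# Hypothetical sketch - force a CMake build of llama-cpp-python with cuBLAS
# enabled and an explicit CUDA architecture list, rather than relying on the
# build-time default, which can produce kernels that silently miscompute on
# other GPUs (garbage output instead of a hard error).
FROM nvidia/cuda:12.1.1-devel-ubuntu22.04

RUN apt-get update && apt-get install -y python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

# Pinning CMAKE_CUDA_ARCHITECTURES to the target GPU (T4 = sm_75) is the kind
# of fix the referenced koboldcpp commit applies; adjust for your hardware.
RUN CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=75" FORCE_CMAKE=1 \
    pip3 install llama-cpp-python
```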

Cheers! 🥂

Metadata

Assignees: no one assigned
Labels: bug (Something isn't working)
Projects: none
Milestone: none