Running on linux #64

TheMcSebi · 2021-07-21T15:44:26Z

I was trying to get this to run on Pop OS when I encountered an issue.
The installation steps all went fine, but when I first tried to start the game using play-cuda.sh, it didn't work because of this line:

KoboldAI-Client/docker-cuda/Dockerfile

Line 7 in 91efd9a

RUN apt update && apt install xorg -y

I resolved that by commenting out the line, since xorg was already installed.
Now when I try to run it, I get this error:

Error response from daemon: error gathering device information while adding custom device "/dev/kfd": no such file or directory

I found out this might have something to do with ROCm, which I don't have installed (because I'm trying to run this on a 1080 Ti).
Now I wonder whether running this on Linux is even supported, since all the instructions are written for Windows :)
I should mention, though, that it runs flawlessly on my Windows installation, with no issues at any step. I have only tried the GPT-Neo 2.7B model so far, and it runs fine on 11 GB of VRAM. Thanks for all the work that has already been put into this.

henk717 · 2021-08-02T05:37:07Z

I do not have an NVIDIA card, so my attempt at play-cuda.sh was completely blind and apparently unsuccessful. This will be quite tricky to solve over GitHub, since we will have to find out what works one on one.

Don't comment that line out. Instead, add the following line above it, then move both lines to the bottom of the Dockerfile:
USER root

This should elevate its permissions to root at the last moment and install X11 (this happens inside the Docker container, not on your real installation). You need X11 inside the container so it can draw the file selection window properly.

The missing device is most likely caused by me plainly copying my AMD version and hoping it would work. Try removing that device from the CUDA Docker files.
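
One way to locate the leftover AMD entry is to search the CUDA Docker files for `/dev/kfd`. This is a minimal sketch, assuming the files live in a `docker-cuda` directory as in the path quoted above; the function name `find_device_refs` is hypothetical and not part of KoboldAI.

```shell
#!/bin/sh
# find_device_refs DIR DEVICE
# Print every line under DIR that mentions DEVICE, prefixed with file name and line number.
# Hypothetical helper for illustration only.
find_device_refs() {
    dir=$1
    device=$2
    # -r: recurse into DIR, -n: show line numbers, -F: match DEVICE as a literal string
    grep -rnF -- "$device" "$dir"
}

# Example (run from the repository root):
#   find_device_refs docker-cuda /dev/kfd
```

Any line it prints is a candidate for removal from the CUDA setup; if it prints nothing, the device is not referenced there.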

If it still gives issues, I recommend joining the Kobold Discord at https://discord.gg/UCyXV7NssH so we can try to fix this one on one (I am Henky!! there).

ghost · 2021-10-06T23:24:54Z

Hi, on the topic of Linux: to get KoboldAI to run on Arch, you may need to modify docker-compose.yml so that it can see your NVIDIA GPU. If you don't, it may lock up on large models.

version: "3.2"
services:
  koboldai:
    build: .
    environment:
      - DISPLAY=${DISPLAY}
    network_mode: "host"
    volumes:
      - /tmp/.X11-unix:/tmp/.X11-unix
      - ../:/content/
      - $HOME/.Xauthority:/home/micromamba/.Xauthority:rw
    devices:
      - /dev/dri
      - /dev/nvidia0:/dev/nvidia0
      - /dev/nvidiactl:/dev/nvidiactl
      - /dev/nvidia-uvm:/dev/nvidia-uvm
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0']
              capabilities: [gpu]
    group_add:
      - video

henk717 · 2021-10-06T23:53:28Z

Feel free to submit that as a commit here on GitHub; I don't have all the parts for my NVIDIA GPU yet.

ghost · 2021-10-07T00:52:07Z

Done, I have made a pull request. I'm not sure it's bug-free, but it is a start.

henk717 · 2021-10-07T01:16:44Z

The old one wasn't working on GPUs in general. I'm leaving this issue open for further testing, but I expect this to work well for most people!

ghost · 2021-10-22T22:14:09Z

Hello, two things. The first is an update on NVIDIA. Sometimes, for example after a reboot, you may get this message:
"Error response from daemon: error gathering device information while adding custom device "/dev/nvidia-uvm": no such file or directory"
In this case you may have to comment out this line:
- /dev/nvidia-uvm:/dev/nvidia-uvm
Then run ./play-cuda.sh again and let it report GPU NOT FOUND, close the program, uncomment the line, and rerun the program. I don't know exactly why that happens.
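
Before starting the container, it can help to check which of the device nodes listed in the compose file actually exist on the host. This is a minimal sketch; `missing_devices` is a hypothetical helper name, and the paths in the example are the ones from the compose file earlier in this thread. A possible cause, not confirmed here, is that the `nvidia-uvm` kernel module has not been loaded yet after boot, so the node does not exist until something uses the GPU.

```shell
#!/bin/sh
# missing_devices PATH...
# Print each given path that does not exist on the host, one per line.
# Hypothetical helper for illustration only.
missing_devices() {
    for dev in "$@"; do
        # -e is true for any existing file type, including character devices
        [ -e "$dev" ] || printf '%s\n' "$dev"
    done
}

# Example:
#   missing_devices /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm
```

If the example prints `/dev/nvidia-uvm`, Docker will fail with the error quoted above until that node exists or the line is commented out.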

Issue number 2 has to do with AMD. I'm working on the docker-rocm/docker-compose.yml file. My GPU was being marked as not found, so I made this small change:

devices:
  - /dev/kfd:/dev/kfd
  - /dev/dri:/dev/dri

But now I am getting this error:

"hipErrorNoBinaryForGpu: Unable to find code object for all current devices!"

Does anyone have any ideas how to fix this? Thank you.

henk717 · 2021-10-23T04:29:14Z

Which AMD GPU do you have? Not all of them are supported, and you need the built-in driver (not the Pro driver) plus ROCm to get a working compute stack with them. Only very few cards are supported.

henk717 closed this as completed Jan 14, 2022

henk717 added a commit that referenced this issue Feb 6, 2022

Merge pull request #64 from VE-FORBRYDERNE/patch

509b9a8

Prevent tokenizer from taking extra time the first time it's used
