GPU Installation #1624
It's a good start that you checked that torch condition. We only really support CUDA 12.1 and above at this point, so maybe there's an issue with the installation because you have an old CUDA toolkit. It's easy to follow our instructions for installing CUDA toolkit 12.1, but you'll also need compatible drivers.
Ok, I was actually able to follow the instructions for the GPU version of PyTorch. But launching the actual interface takes forever. I am running this command: but the output gets stuck at this: with nothing after it. It has been like that for at least an hour. Any tips?
I recommend not using h2oai/h2ogpt-oig-oasst1-512-6_9b as a model, but instead a GGUF model at first like
Sorry, I'm getting a Python: no match error when I enter this command, and I copied it exactly as is. I'm using Python 3.10.12. I also tried this (without the ?download=true at the end of the model_llama_path) and it gets stuck at this:
Can you find the PID of the process, and when it gets stuck, in a separate terminal do
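The exact command being suggested here is elided, but the general technique (a signal-triggered stack dump) can be sketched with the stdlib faulthandler module. This is a minimal illustration, not h2oGPT's actual code: the output filename and the self-directed kill are for demonstration only; normally you would send the signal from another terminal to the hung process's PID. SIGUSR1 is POSIX-only.

```python
import faulthandler
import os
import signal

# With a handler registered, `kill -SIGUSR1 <PID>` from another terminal
# makes a hung Python process dump the tracebacks of all its threads,
# showing exactly which line it is stuck on.
with open("stack_dump.txt", "w") as f:
    faulthandler.register(signal.SIGUSR1, file=f, all_threads=True)
    os.kill(os.getpid(), signal.SIGUSR1)  # simulate the external `kill`
    faulthandler.unregister(signal.SIGUSR1)
```

After the signal is delivered, stack_dump.txt contains one traceback per thread, most recent call first.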
It says SIGUSR1: Unknown signal; kill -l lists signals. I replaced PID with the actual PID, and it said that
What platform are you using? Linux, Windows, Mac?
Linux. SIGUSR1 isn't actually an available signal on my machine.
Ok, that's very odd; not sure about that. A normal PC with Linux has that signal, so you must have something special going on. Try downloading the FILE separately and placing it in the llamacpp_path folder, then start h2oGPT with
It still stops at the same place it does above. The thing is, when I try launching it on CPU it works, but when I set the CUDA_VISIBLE_DEVICES env variable to enable the GPU, it fails and just doesn't continue after it says Using Model Llama. Edit: I did some debugging and found that it gets stuck at this line here: Line 1956 in e11d4c7
So it's stuck during getting or initializing the embedding model, which should be fairly trivial. It's not clear, and really hard to debug without SIGUSR1. Have you tried the Docker installation, to avoid concerns about installation issues?
Sorry for the late reply. I just tried the Docker version and encountered an out-of-space error running docker build -t h2ogpt . --> 39936edc8bac How much space is this expected to use?
When you build a Docker image, make sure the local path is clean. E.g. I recommend a separate clone of the repo, and building from inside it.
Ok, I have managed to build the container. I ran this example from the README:
But it seems to be hanging after this output:
Does it have anything to do with git_hash.txt? I am running this in the h2ogpt directory. Also, when I try to make a db in the container prior to running the bot, it hangs after this output:
Actually, never mind, I resolved the issue. Thank you so much for your help! One thing I'm curious about: if I have multiple doc source directories, how would I launch multiple chatbots? Do I need to specify multiple generate.py commands? Would it all be connected to one container? I'm using docker-compose, for reference.
Yes, if you made multiple collections but want each to be served separately, you can make them with make_db, then later launch a separate h2oGPT for each, using CLI options like those used here: https://github.com/h2oai/h2ogpt/blob/main/docs/README_LangChain.md#multiple-embeddings-and-sources You can use a TEI server to share the embeddings if you want to save on GPU usage and get better speed. I describe in the FAQ how gpt.h2o.ai is set up using a TEI server: https://github.com/h2oai/h2ogpt/blob/main/docs/FAQ.md#text-embedding-inference-server
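For the docker-compose case, a minimal sketch of two independent chatbots, one per collection, might look like the following. The service names, host paths, ports, and flag values here are illustrative assumptions rather than anything taken verbatim from the h2oGPT docs; adapt them to your own collections and CLI options.

```yaml
# Sketch: one h2oGPT service per document collection (names/paths illustrative).
services:
  h2ogpt-collection-a:
    image: h2ogpt:latest              # assumes a locally built image tag
    command: >
      python generate.py
      --langchain_mode=UserData_A
      --user_path=/workspace/user_path_a
    volumes:
      - ~/docs_a:/workspace/user_path_a
    ports:
      - "7860:7860"
  h2ogpt-collection-b:
    image: h2ogpt:latest
    command: >
      python generate.py
      --langchain_mode=UserData_B
      --user_path=/workspace/user_path_b
    volumes:
      - ~/docs_b:/workspace/user_path_b
    ports:
      - "7861:7860"                   # different host port per bot
```

Each service is its own container, so the bots stay isolated even though they share one compose file and one image.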
Ok, sounds good, thanks! I am also receiving this error
after running
Any workarounds?
My guess is that you didn't create the directory as your user before it was created as root by the Docker image; I've seen this error in that case. If you already have the dirs, please remove them. Then do as user:
then run docker run so that those paths are mapped, e.g.:
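The idea behind the create-the-dirs-as-user step can be sketched as below. The directory names are placeholders for whatever host paths you bind-mount into the container, not the exact ones from the h2oGPT docs.

```python
import os
from pathlib import Path

def precreate_mount_dirs(base, names=("save", "user_path", "db_dir_UserData")):
    """Create bind-mount target directories as the current user.

    If `docker run` has to create missing host directories for -v mounts,
    they end up owned by root and the app inside the container cannot write
    to them. Creating them first as a normal user avoids those permission
    errors. `names` here are illustrative placeholders.
    """
    created = []
    for name in names:
        p = Path(base) / name
        p.mkdir(parents=True, exist_ok=True)
        if not os.access(p, os.W_OK):
            raise PermissionError(f"{p} exists but is not writable; "
                                  "remove it and recreate it as your user")
        created.append(p)
    return created
```

Run this (or the equivalent mkdir commands in a shell) before the first docker run, so the -v mappings point at user-owned directories.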
Sorry for all the questions, but I can't seem to get any of those working. I'm also getting some errors with the tool not being able to ingest any documents now.
Can I get a step-by-step example from you, if possible, that uses the Docker setup with document ingestion? I followed the ones in the README docs, but every time I am met with an error. Thank you.
Hi, can you give me your startup command and an example URL you are trying to provide (or one that fails in the same way)?
That playwright line should already have been done in the Docker or local install, here: Line 63 in 5637fe4
But can you try, in expert settings, disabling playwright and forcing only unstructured to be used? i.e. related code: Lines 4526 to 4542 in 5b48852
It's also possible I broke something very recently w.r.t. name handling. I'll check soon.
A few things I just tried worked: in the UI for "Ask or Ingest" I put in these URLs and they all worked after clicking ingest: www.cnn.com Are you not giving a URL, but trying to upload a PDF? Please provide some details about what you are doing.
I am trying to upload PDFs. Actually, I just resolved the document loading issues, but I'm having a few other issues related to the ones I've been getting. First, making the database with the Docker command. I am running the following
which is straight from the documentation, and I made sure the directories I made were writable. I put my documents under ~/user_path (181 PDFs). It manages to ingest, but this is all that prints
But nothing else happens after that, so I'm assuming no db has been made, since the command did not terminate. The second issue I'm getting is still the permission issue. I'm using this one from the documentation, about running the tool using a generated db (after running src/make_db.py)
I get the following output
but nothing else after. I will see if I can reproduce the permission error, because I saw it quite a few times over the past few days.
After make_db.py does the ingestion of PDFs, it needs to embed the data into the database. I assume you have GPUs, so it would then use a GPU to do the embedding. The speed depends upon your GPUs, the embedding model, etc. This is what is meant by the line You should see intense activity on the GPU used for embedding.
These:
are ignorable. But if it's hanging at:
then there's some issue with the model. It could be downloading the model (it should show that, though), or a network issue in talking to HF, etc.
I ran your command on about 290 PDFs in ~/user_path. During ingestion you'll see output like this, using all cores efficiently, since pymupdf is the default PDF parser and uses only CPU; the backup loader is used only if pymupdf totally fails. After it becomes less busy, it means it's working on the last remaining files in some batch, so it's not as parallel if the PDFs are quite different from each other. If some PDF fails with pymupdf, it will go to the unstructured loader using tesseract, which can be very slow. You can run with
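The parse-fast-then-fall-back behavior described above can be sketched as follows. The function names, the worker count, and the use of a thread pool are illustrative; this is not h2oGPT's actual ingestion code.

```python
from concurrent.futures import ThreadPoolExecutor

def parse_with_fallback(path, fast_parser, slow_parser):
    """Try the fast CPU parser first (pymupdf's role in h2oGPT); only if it
    totally fails, fall back to the much slower OCR-based loader."""
    try:
        return fast_parser(path)
    except Exception:
        return slow_parser(path)

def ingest(paths, fast_parser, slow_parser, workers=8):
    # Files are parsed in parallel, which is why all cores stay busy until
    # only the last few files of a batch remain.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(
            lambda p: parse_with_fallback(p, fast_parser, slow_parser),
            paths))
```

This also explains the observed slowdown: one stubborn PDF that falls through to the slow path can keep the pool busy long after the rest of the batch has finished.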
Once the GPU kicks in, it'll look like the below. It takes a while with instructor-large, which is the default for the GPU case. It would be faster if one used a smaller embedding model, like
When you run, ensure you use generate.py with
May I possibly have specific instructions for the GPU installation of the tool? I have followed the installation steps, but it still says no GPU detected. I have the following GPU on my system: NVIDIA Corporation GA102GL [A10G] (rev a1), and my nvcc --version output is
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0
And I tried to install PyTorch with CUDA. But when I run
import torch
print(torch.cuda.is_available())
it still says False, and when I try to run a model, it still says no GPU detected. Any guidance would be appreciated.
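Per the maintainers' note earlier in the thread, CUDA 12.1 or newer is required, while the nvcc output above reports release 11.5, so the toolkit version itself is a likely culprit. A small illustrative check (not part of h2oGPT) of an nvcc version string against that minimum:

```python
import re

MIN_CUDA = (12, 1)  # minimum supported per the maintainers' comment above

def cuda_release(nvcc_output):
    """Extract the CUDA release as a (major, minor) tuple from `nvcc --version` text."""
    m = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if not m:
        raise ValueError("could not find a CUDA release in nvcc output")
    return (int(m.group(1)), int(m.group(2)))

sample = "Cuda compilation tools, release 11.5, V11.5.119"
print(cuda_release(sample))              # (11, 5)
print(cuda_release(sample) >= MIN_CUDA)  # False: toolkit too old
```

Note also that the PyTorch build must match: a CUDA 12.1 wheel will not see the GPU through an 11.5 toolkit/driver stack, which is consistent with torch.cuda.is_available() returning False here.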