address.jsonnet file format and CUDA error #5
Comments
Oops, it looks like those files didn't get added as they were in my .gitignore. I've added them now. Regarding OOM: There is no batching happening anywhere in the inference. It's all one instance at a time. To reduce memory usage, however, there are two things you can do.
Thank you for the quick and detailed explanation. I'm new to FastAPI, and for the jsonnet files you provided, can I use When I run the script below:
I get a message as follows:
and the message repeats.
You can ignore The
Thank you for the answer. The issue I'm having is with the retriever server.
However, when I run
I'm running your
Can you confirm Also, can you confirm that your Elasticsearch server is running fine and that you've already run the indexing scripts? You can check it by running
It seems I'm having an issue with Elasticsearch. I cannot access port 9200 on GCP. Let me resolve this and come back if the issue persists. Thank you again for your prompt responses.
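For anyone hitting the same symptom, a quick way to tell "Elasticsearch is not listening" apart from "the port is blocked" is to test plain TCP reachability from the machine that runs the retriever server. The sketch below is only an illustration using the Python standard library; it is not the check command referred to earlier in this thread, the helper name `port_is_reachable` is made up for this example, and `localhost` is a placeholder for wherever your Elasticsearch instance runs.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # "localhost" is a placeholder host; 9200 is the port mentioned in this thread.
    print(port_is_reachable("localhost", 9200))
```

If this prints `True` on the Elasticsearch host itself but `False` from another machine, the service is probably up and the problem is more likely network or firewall configuration than the retriever code.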
Hi,
I'm trying to reproduce the results, and I found that llm_server_address.jsonnet and retriever_address.jsonnet are necessary. Can you provide example scripts for these?
Also, I'm getting a torch.cuda.OutOfMemoryError: CUDA out of memory error message. If you can give me some tips to prevent the CUDA error (e.g. where to reduce the batch size), that would be appreciated.
Thank you in advance :)