This repository has been archived by the owner on Jan 24, 2024. It is now read-only.

Created RunPod template for easy deploy #58

Closed
kodxana opened this issue Mar 18, 2023 · 7 comments
Labels
question Further information is requested

Comments

@kodxana

kodxana commented Mar 18, 2023

First of all, I want to say I love Basaran: it's more OpenAI than OpenAI itself :)
I decided to give it a go and made a template to run Basaran on the RunPod GPU service, and it works well, including the UI and the API endpoints.
Hopefully this helps users without their own GPUs enjoy it too.
https://runpod.io/gsc?template=7ito7h393l&ref=vfker49t

I will be publishing a blog post soon and will share the link here.
I also have a question about where the models are stored. Right now the container saves models to temporary storage; if you let me know where they are saved, I will adjust the template to keep them on volume storage so users can avoid re-downloading models every time.

Thank you again for the amazing work :)

@kodxana
Author

kodxana commented Mar 18, 2023

Post link: https://blog.runpod.io/guide-for-running-basaran-an-open-source-alternative-to-the-openai-text-completion-api/

@peakji
Member

peakji commented Mar 18, 2023

Hi @kodxana , thanks for the blog post!

The directory for storing and caching models is specified by the MODEL_CACHE_DIR environment variable, which defaults to /models.
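To answer the volume-storage question concretely, the cache directory can be backed by persistent storage via a bind mount. A minimal sketch, assuming the hyperonym/basaran image name and a /workspace/models volume path, neither of which is confirmed in this thread:

```shell
# Persist downloaded models across container restarts by mounting
# persistent volume storage at the default cache path, /models.
# (MODEL_CACHE_DIR could instead be repointed at any mounted path.)
docker run -d \
  -v /workspace/models:/models \
  -e MODEL_CACHE_DIR=/models \
  hyperonym/basaran
```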

Here are some recommended environment variables to include in the template:

  • MODEL, MODEL_REVISION, and MODEL_TRUST_REMOTE_CODE: these are arguments for AutoModel.from_pretrained.
  • MODEL_LOAD_IN_8BIT and MODEL_HALF_PRECISION: these quantization options allow users to run much larger models.

For a complete list of environment variables, please refer to the Dockerfile.
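Putting these recommendations together, a template's environment block might look like the sketch below. The image name and model id are illustrative assumptions, and the exact variable defaults should be checked against the Dockerfile:

```shell
# Illustrative environment settings for a Basaran template.
# MODEL, MODEL_REVISION, and MODEL_TRUST_REMOTE_CODE are forwarded to
# AutoModel.from_pretrained; the two quantization flags trade precision
# for memory so larger models fit on a given GPU.
docker run -d \
  -e MODEL=bigscience/bloomz-560m \
  -e MODEL_REVISION=main \
  -e MODEL_TRUST_REMOTE_CODE=false \
  -e MODEL_LOAD_IN_8BIT=true \
  -e MODEL_HALF_PRECISION=false \
  hyperonym/basaran
```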

@kodxana
Author

kodxana commented Mar 18, 2023

Updated the template to include the recommended settings. One question: is the GPT-Nano model compatible, by chance?

@peakji
Member

peakji commented Mar 18, 2023

> is the GPT-Nano model compatible, by chance?

I haven't tried it yet, but in theory Basaran supports all 🤗 Transformers-based models. If the model is not available on the HF hub, you can load it by pointing the MODEL environment variable at a local directory, e.g. MODEL=/home/ubuntu/gpt-nano.

You may also want to set SERVER_MODEL_NAME=gpt-nano to prevent the full path from being included in responses.
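Combining the two suggestions, loading a local model might look like this. A sketch only: the image name and host path are assumptions, not taken from this thread:

```shell
# Load a model from a local directory instead of the HF hub, and
# report a short model name in API responses rather than the full path.
docker run -d \
  -v /home/ubuntu/gpt-nano:/gpt-nano \
  -e MODEL=/gpt-nano \
  -e SERVER_MODEL_NAME=gpt-nano \
  hyperonym/basaran
```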

@kodxana
Author

kodxana commented Mar 18, 2023

I got it working, though I had to remove half precision, which was breaking it. GPT-Neo is so dark :D

@kodxana
Author

kodxana commented Mar 18, 2023

Multi-GPU support also works without issues
Running on 2xA6000

@peakji
Member

peakji commented Mar 18, 2023

Great to hear it's working well for you, @kodxana !

@peakji peakji closed this as completed Mar 19, 2023
@fardeon fardeon added the question Further information is requested label Apr 24, 2023