This repository has been archived by the owner on Jan 24, 2024. It is now read-only.

Created RunPod template for easy deploy #58

Closed
kodxana opened this issue Mar 18, 2023 · 7 comments
Labels
question Further information is requested

Comments

@kodxana

kodxana commented Mar 18, 2023

First of all, I want to say I love Basaran: it's more OpenAI than OpenAI itself :)
I decided to give it a go and made a template to run Basaran on the RunPod GPU service, and it works well, including the UI and the API endpoints.
Hopefully this helps users without their own GPUs enjoy it too.
https://runpod.io/gsc?template=7ito7h393l&ref=vfker49t

I will be publishing a blog post soon and will share the link here.
I also have a question about where the models are stored. Right now the container saves models to temporary storage; if you let me know where they are saved, I will adjust the template to keep them on volume storage so users can avoid re-downloading models every time.

Thank you again for the amazing work :)

@kodxana
Author

kodxana commented Mar 18, 2023

Post link: https://blog.runpod.io/guide-for-running-basaran-an-open-source-alternative-to-the-openai-text-completion-api/

@peakji
Member

peakji commented Mar 18, 2023

Hi @kodxana , thanks for the blog post!

The directory for storing and caching models is specified by the MODEL_CACHE_DIR environment variable, which defaults to /models.
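To answer the volume-storage question concretely, the cache directory can be backed by persistent storage via a bind mount. A minimal sketch, assuming the hyperonym/basaran image name and a /workspace/models volume path, neither of which is confirmed in this thread:

```shell
# Persist downloaded models across container restarts by mounting
# persistent volume storage at the default cache path, /models.
# (MODEL_CACHE_DIR could instead be repointed at any mounted path.)
docker run -d \
  -v /workspace/models:/models \
  -e MODEL_CACHE_DIR=/models \
  hyperonym/basaran
```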

Here are some recommended environment variables to include in the template:

  • MODEL, MODEL_REVISION, and MODEL_TRUST_REMOTE_CODE: these are arguments for AutoModel.from_pretrained.
  • MODEL_LOAD_IN_8BIT and MODEL_HALF_PRECISION: these quantization options allow users to run much larger models.

For a complete list of environment variables, please refer to the Dockerfile.
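Putting these recommendations together, a template's environment block might look like the sketch below. The image name and model id are illustrative assumptions, and the exact variable defaults should be checked against the Dockerfile:

```shell
# Illustrative environment settings for a Basaran template.
# MODEL, MODEL_REVISION, and MODEL_TRUST_REMOTE_CODE are forwarded to
# AutoModel.from_pretrained; the two quantization flags trade precision
# for memory so larger models fit on a given GPU.
docker run -d \
  -e MODEL=bigscience/bloomz-560m \
  -e MODEL_REVISION=main \
  -e MODEL_TRUST_REMOTE_CODE=false \
  -e MODEL_LOAD_IN_8BIT=true \
  -e MODEL_HALF_PRECISION=false \
  hyperonym/basaran
```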

@kodxana
Author

kodxana commented Mar 18, 2023

Updated the template to include the recommended settings. One question: is the GPT-Nano model compatible, by chance?

@peakji
Member

peakji commented Mar 18, 2023

> is the GPT-Nano model compatible, by chance?

I haven't tried it yet, but in theory Basaran supports all 🤗 Transformers-based models. If the model is not available on the HF hub, you can load it by pointing the MODEL environment variable at a local directory, e.g. MODEL=/home/ubuntu/gpt-nano.

You may also want to set SERVER_MODEL_NAME=gpt-nano to prevent the full path from being included in responses.
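Combining the two suggestions, loading a local model might look like this. A sketch only: the image name and host path are assumptions, not taken from this thread:

```shell
# Load a model from a local directory instead of the HF hub, and
# report a short model name in API responses rather than the full path.
docker run -d \
  -v /home/ubuntu/gpt-nano:/gpt-nano \
  -e MODEL=/gpt-nano \
  -e SERVER_MODEL_NAME=gpt-nano \
  hyperonym/basaran
```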

@kodxana
Author

kodxana commented Mar 18, 2023

I got it working, though I had to remove half precision, which was breaking it. GPT-Neo is so dark :D

@kodxana
Author

kodxana commented Mar 18, 2023

Multi-GPU support also works without issues
Running on 2xA6000

@peakji
Member

peakji commented Mar 18, 2023

Great to hear it's working well for you, @kodxana !

@peakji peakji closed this as completed Mar 19, 2023
@fardeon fardeon added the question Further information is requested label Apr 24, 2023