diff --git a/README.md b/README.md
index f08c1fc..55b01e8 100644
--- a/README.md
+++ b/README.md
@@ -1,10 +1,10 @@
 # Serverless Tiny Language Models
 
-[TinyLlama 4-bit quantized 3 trillion token chat model](https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF) and [8-bit quantized Qwen 2 beta 0.5B Chat](https://huggingface.co/Qwen/Qwen1.5-0.5B-Chat-GGUF) running on Azure Functions consumption plan.
+[TinyLlama 1.1B 4-bit quantized 3-trillion-token chat model](https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF) and [Qwen 2 beta 0.5B 8-bit quantized chat model](https://huggingface.co/Qwen/Qwen1.5-0.5B-Chat-GGUF) running on the Azure Functions consumption plan.
 
-This project is not intended for production use. It is a technology demonstration to show that it is possible to run a large language model on a cheap and scalable serverless platform.
+This project is not intended for production use. It is a technology demonstration to show that it is possible to run large language models on a cheap and scalable serverless platform.
 
-A demo of the app is available at [https://.azurewebsites.net/](https://.azurewebsites.net/). In the demo, you can enter a prompt and the model will generate a completion.
+A demo of the app is available at [https://tiny-serverless-llms.azurewebsites.net](https://tiny-serverless-llms.azurewebsites.net). In the demo, you can enter a prompt and the model will generate a completion.
 
 **Any abuse of the service will result in the service being taken down.**
 
@@ -39,13 +39,15 @@ You'll need the following resources in Azure:
 
 You'll need to set the following application settings in Azure Functions:
 
-`POST_BUILD_SCRIPT_PATH=post_build.sh`
-`MODEL_BASE=/home/site/wwwroot/`
-`LLAMA_BASE=/home/site/wwwroot/llama.cpp`
-`AzureSignalRBase=https://{signal_r_service_name}.service.signalr.net`
-`AzureSignalRAccessKey={signal_r_service_key}`
+```bash
+POST_BUILD_SCRIPT_PATH=post_build.sh
+MODEL_BASE=/home/site/wwwroot/
+LLAMA_BASE=/home/site/wwwroot/llama.cpp/
+AzureSignalRBase=https://{signal_r_service_name}.service.signalr.net
+AzureSignalRAccessKey={signal_r_service_key}
+```
 
-This script will be executed during Oryx build. It is used to build the llama.cpp binary and add the models to the deployment package.
+The `post_build.sh` script will be executed during the Oryx build. It is used to build the llama.cpp binary and add the models to the deployment package.
 
 ## About [Softlandia](https://softlandia.fi/)
 
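The diff sets `POST_BUILD_SCRIPT_PATH=post_build.sh` but does not show the script itself. Below is a minimal sketch of what such an Oryx post-build hook could look like; everything in it beyond the setting names from the diff is an assumption, not taken from the repository — in particular the clone target, the use of `make`, the model filenames, and where Oryx runs the script from.

```shell
#!/bin/bash
# Hypothetical post_build.sh sketch (not the project's actual script).
# Runs after the Oryx build; compiles the llama.cpp binary and pulls the
# quantized GGUF models into the deployment package so they land under
# /home/site/wwwroot/ (MODEL_BASE / LLAMA_BASE in the app settings).
set -euo pipefail

# Assumption: the script runs from the app root that becomes /home/site/wwwroot/.
git clone --depth 1 https://github.com/ggerganov/llama.cpp
make -C llama.cpp

# Assumption: exact GGUF filenames on Hugging Face; check the model repos.
curl -L -o tinyllama-chat.gguf \
  "https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf"
curl -L -o qwen-chat.gguf \
  "https://huggingface.co/Qwen/Qwen1.5-0.5B-Chat-GGUF/resolve/main/qwen1_5-0_5b-chat-q8_0.gguf"
```

Since this is a network-dependent deployment fragment, it is meant as a shape to adapt rather than something to run verbatim.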