The LLM Chatbot service

Prerequisites

The application requires Python 3.10. Here's a link to download 3.10.5: https://www.python.org/downloads/release/python-3105/

Running Locally

Downloading models

On application start, the app will attempt to download the models for your current environment (defaulting to local). The models for each environment can be found within the src/config.py file. If the app determines that the requested model already exists, it will attempt to load the existing model into memory. WARNING: Laptops will only be able to run *.GGML models due to system memory constraints.

You can download different models by updating the download_link field for a given EnvironmentConfiguration. An example of how to retrieve the download_link for a given model is shown in the video below:

(Video: download_link_example)

All models will be downloaded to /src/model/downloaded_models/{folder_name}/{quantization_model}.
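
For illustration only, an environment configuration entry along these lines might look like the sketch below. The class layout is an assumption, not the actual contents of src/config.py, and the folder_name and quantization_model fields are inferred from the download path above.

```python
# Hypothetical sketch of an environment configuration entry; the real
# definitions live in src/config.py and may be structured differently.
from dataclasses import dataclass


@dataclass
class EnvironmentConfiguration:
    download_link: str       # URL the model weights are fetched from
    folder_name: str         # subfolder under src/model/downloaded_models/
    quantization_model: str  # file name of the quantized model artifact


# Example: a local (laptop) environment pointing at a GGML-quantized model.
# The URL below is a placeholder, not a real model link.
LOCAL = EnvironmentConfiguration(
    download_link="https://example.com/path/to/model.ggml.bin",
    folder_name="example-model",
    quantization_model="model.ggml.bin",
)
```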

Application Standalone

  1. Set up a virtual env
    1. pip install virtualenv
    2. python3 -m venv grace-hopper
    3. source grace-hopper/bin/activate
  2. pip install wheel
  3. pip install -r requirements.txt
  4. Run uvicorn src.main:app --reload to start the app
    • The first time you run this, it will attempt to download the model. This means it can take up to 15 minutes for the app to start up, depending on internet speeds.
  5. Chat with the app!

     curl --request POST \
       --url http://127.0.0.1:8000/api/v1/inference \
       --header 'Content-Type: application/json' \
       --data '{ "query": "Write a short story about Grace Hopper" }'
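
Equivalently, the endpoint can be called from Python. The snippet below is a minimal sketch that assumes the requests package is installed (it is not necessarily part of this project's requirements); the URL and payload mirror the curl command in step 5.

```python
# Minimal Python client for the local inference endpoint (assumes the app from
# step 4 is running and that the requests package is installed).
import requests

response = requests.post(
    "http://127.0.0.1:8000/api/v1/inference",
    json={"query": "Write a short story about Grace Hopper"},
    timeout=300,  # local model inference can be slow on a laptop
)
response.raise_for_status()
print(response.json())
```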

Troubleshooting

PIP Install Errors for hnswlib

See pypa/packaging-problems#648 for background.

Run export HNSWLIB_NO_NATIVE=1 and then run pip install -r requirements.txt

PIP Install Errors for ChromaDb

If you run into the error ERROR: Could not build wheels for chroma-hnswlib, which is required to install pyproject.toml-based projects, the fix (per a Stack Overflow answer) is the same workaround as above:

First, run export HNSWLIB_NO_NATIVE=1, then run pip install chromadb

Local Docker Build Out of Memory

This can be attributed to Docker not removing unused images. You can run docker system prune -a -f to free the necessary memory.

Conceptual Overview

This sample project leverages a collection of open source frameworks, APIs, and programming concepts which may be unfamiliar to some. Below are some explanations of the elements of the solution:

The Retrieval Augmented Generation (RAG) Pattern

Retrieval Augmented Generation is a common architectural pattern in which domain-specific context is used to populate a searchable data store; relevant entries are then retrieved on demand by a retrieval mechanism and sent, together with the user's query, to a Large Language Model.
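
To make that flow concrete, here is a deliberately simplified, self-contained sketch of the pattern. It uses naive keyword-overlap retrieval over an in-memory list instead of this project's actual vector store and model, so treat it purely as an illustration.

```python
# Simplified illustration of Retrieval Augmented Generation (RAG):
# 1) store domain-specific documents, 2) retrieve the most relevant one
# for a query, 3) build a prompt that combines context and query for an LLM.
# Retrieval here is naive keyword overlap; a real system would use
# embeddings and a vector store.

DOCUMENTS = [
    "Grace Hopper was a pioneer of computer programming and invented one of the first compilers.",
    "The application downloads quantized GGML models so they fit in laptop memory.",
]


def retrieve(query: str, documents: list[str]) -> str:
    """Return the document sharing the most words with the query."""
    query_words = set(query.lower().split())
    return max(documents, key=lambda doc: len(query_words & set(doc.lower().split())))


def build_prompt(query: str, context: str) -> str:
    """Combine retrieved context with the user's query before sending to the LLM."""
    return f"Use the following context to answer.\nContext: {context}\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    user_query = "Who was Grace Hopper?"
    context = retrieve(user_query, DOCUMENTS)
    prompt = build_prompt(user_query, context)
    # In the real application this prompt would be passed to the loaded LLM.
    print(prompt)
```

In this project, the data store role is played by a vector database (chromadb appears among the dependencies in the troubleshooting section) and the assembled prompt goes to the locally downloaded model.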

Use of Python for AI Applications

Python is the primary language of choice for developing applications that leverage Machine Learning / AI. There is a plethora of libraries, from fundamentals such as NumPy, SciPy, scikit-learn, and PyTorch to more recent arrivals such as LangChain and Hugging Face. The variety of available libraries and the power of the capabilities they expose for AI-related tasks is unparalleled (regardless of your feelings about whitespace), and thus Python is a common choice for rapidly building AI-powered apps.

This application is built in Python and uses many libraries (found in requirements.txt); several of them, such as chromadb and uvicorn, are key enablers of most LLM-based applications.
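
As one example of how such libraries fit together, below is a minimal sketch of an inference endpoint served by uvicorn, assuming FastAPI and pydantic. The route path matches the curl example above, but the handler, response shape, and module name are illustrative and will differ from the real src/main.py.

```python
# Minimal sketch of an ASGI inference endpoint, assuming FastAPI and pydantic.
# Run with: uvicorn sketch_main:app --reload  (hypothetical module name)
# The model call below is a stand-in for loading and invoking the downloaded LLM;
# the real endpoint in src/main.py will differ.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class InferenceRequest(BaseModel):
    query: str


def run_model(prompt: str) -> str:
    """Placeholder for the actual LLM invocation."""
    return f"(model output for: {prompt})"


@app.post("/api/v1/inference")
def inference(request: InferenceRequest) -> dict:
    return {"response": run_model(request.query)}
```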

In addition, the project uses many of the Python language's features to accomplish tasks. If interested, feel free to read up on core Python concepts.

Use of Docker for Application Deployment / Delivery

Docker is a framework for packaging an application and its runtime requirements and for orchestrating its deployment. Docker's "containerized app" wraps an application into a single object that can be run anywhere Docker runs, regardless of implementation language, in a way that allows the app to be replicated, load-balanced, and more easily made self-healing. Most modern apps leverage containers in their back end. Read more in Docker's documentation.
