
# simpleChat_llama2_mac_silicon

**NOTE:** I've archived this repo, as it's aged out! 🤪🤪🤪🤪🤪

A simple chat app with embeddings and a vector database, built exclusively for local execution on Mac/Apple Silicon.

This repo is a distillation of, and riff on, privateGPT and localGPT, which I created to run exclusively on Apple Silicon, more specifically my MacBook Pro (M1). My motivation was to build a minimum viable LLM chat app that runs completely locally on my device, for learning purposes.

*SimpleChat by ziligy*

## Features

## Requirements

## Create & activate a new Conda environment

```
conda create -n simpleChat python=3.11
conda activate simpleChat
```

## Clone the Repo

```
git clone https://github.com/ziligy/simpleChat_llama2_mac_silicon simpleChat
cd simpleChat
```

## Installation

### LangChain

```
conda install -c conda-forge langchain==0.0.239
```
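
A quick way to confirm the pinned version is active in the `simpleChat` environment:

```python
# verify the pinned LangChain version installed correctly
import langchain

print(langchain.__version__)  # expected: 0.0.239
```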

### Upgrade llama-cpp-python for Metal

```
CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install --upgrade --force-reinstall llama-cpp-python==0.1.77 --no-cache-dir
```
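
As a quick smoke test (a sketch; the model path below is an assumption, so point it at wherever you saved your model), loading a model with at least one layer offloaded should print `ggml_metal_init` messages to stderr if the Metal build succeeded:

```python
# rough Metal smoke test; adjust model_path to your downloaded GGML model
import os
from llama_cpp import Llama

llm = Llama(
    model_path=os.path.expanduser("~/Models/llama-2-13b-chat.ggmlv3.q4_1.bin"),
    n_gpu_layers=1,  # any value > 0 triggers Metal offload in a Metal-enabled build
)
out = llm("Q: Name the planets in the solar system. A:", max_tokens=32)
print(out["choices"][0]["text"])
```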

### requirements.txt

```
pip install -r requirements.txt
```

## Download GGML Model & Set MODEL_PATH

Download this model, or another GGML model (GGML format is required; 4-bit quantization is preferred):

`llama-2-13b-chat.ggmlv3.q4_1.bin`

Define `MODEL_PATH` in `constants.py` to set the location and name of the model you are using, e.g.:

```python
# constants.py
MODEL_PATH = os.path.expanduser('~') + "/Models/llama-2-13b-chat.ggmlv3.q4_1.bin"
```
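
Optionally, a one-line sanity check that `MODEL_PATH` actually points at the downloaded file:

```python
# confirm constants.MODEL_PATH resolves to the model file on disk
import os
from constants import MODEL_PATH

assert os.path.isfile(MODEL_PATH), f"model not found at {MODEL_PATH}"
print(f"OK: {MODEL_PATH}")
```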

## Starting the Chat on Gradio Server

```
gradio app.py
```

You should see: `Running on local URL: http://127.0.0.1:7861`

`cmd + click` the link to open the Chat UI in your browser and start chatting with the AI.

Note: chat responses may take one to two minutes, so you'll need to be patient.
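
For orientation, the core of an app like this can be sketched in a few lines. This is an illustration only, not the repo's actual `app.py`; it assumes LangChain's `LlamaCpp` wrapper and a plain Gradio `Blocks` chat, and leaves out the embeddings/vector-database retrieval step:

```python
# minimal single-turn chat sketch (illustrative, not the repo's app.py)
import gradio as gr
from langchain.llms import LlamaCpp
from constants import MODEL_PATH

llm = LlamaCpp(model_path=MODEL_PATH, n_gpu_layers=1, n_ctx=2048)

with gr.Blocks() as demo:
    chatbot = gr.Chatbot()
    msg = gr.Textbox(label="Your message")

    def respond(message, chat_history):
        answer = llm(message)  # blocking call; expect a long wait on an M1
        chat_history.append((message, answer))
        return "", chat_history

    msg.submit(respond, [msg, chatbot], [msg, chatbot])

if __name__ == "__main__":
    demo.launch()  # `gradio app.py` also picks up the module-level `demo`
```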


## Instructions for optionally ingesting your own dataset

NOTE: The `ingest.py` component is basically a fork from localGPT.

Put any and all of your .txt, .pdf, .csv, or .xlsx files into the SOURCE_DOCUMENTS directory.

The current default file types are .txt, .pdf, .csv, and .xlsx; if you want to use any other file type, you will need to convert it to one of the defaults first.

Run the following command to ingest all the data:

```
python ingest.py
```

This will create an index containing the local vector store. It will take time, depending on the size of your documents.
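
For a sense of what the ingest step does, here is a rough sketch of a typical localGPT-style pipeline. This is illustrative only; the loader, chunk sizes, and embedding model here are assumptions, not necessarily what `ingest.py` uses:

```python
# illustrative ingest pipeline: load -> split -> embed -> persist to ./DB
import os
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma

docs = []
for name in os.listdir("SOURCE_DOCUMENTS"):
    if name.endswith(".txt"):  # the real script also handles .pdf, .csv, .xlsx
        docs.extend(TextLoader(os.path.join("SOURCE_DOCUMENTS", name)).load())

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)

# HuggingFaceEmbeddings downloads a sentence-transformers model on first run
db = Chroma.from_documents(chunks, HuggingFaceEmbeddings(), persist_directory="DB")
db.persist()  # writes the vector store under ./DB
```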

If you want to start from an empty database, delete the DB folder:

```
rm -r ./DB
```

Executing the `ingest.py` program again will recreate a fresh `DB` directory.