Welcome! This project is intended to help people get started thinking about and using LLMs in their programming projects, in particular using local LLMs and vector stores to implement the Retrieval Augmented Generation (RAG) architecture that forms the backbone of many current generative AI applications. It is written in Python, with the examples presented as Jupyter notebooks.
We're using LLaMA-based LLMs here, specifically LLMs built on the LLaMA 2 foundational model. That model is licensed for commercial use and performs well at tasks like text summarization.
LangChain is a widely used framework for building LLM-based applications. It has an active community and is evolving quickly.
You need some way to run a local LLM, or you'll need to use a third-party LLM such as ChatGPT. The examples in this project use Ollama, which only works on macOS at the moment (8/31/2023).
If you are not on macOS, LangChain supports other methods of running local LLMs, such as Llama.cpp, and other LLM classes can be readily substituted in the examples. ChatGPT is quite affordable for small-scale projects.
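The reason substitution is easy is that LangChain's LLM wrappers share a common call interface, so the rest of a chain doesn't care which backend is behind it. A stdlib-only sketch of that idea (the class and method names below are illustrative stand-ins, not real LangChain classes):

```python
from typing import Protocol


class LLM(Protocol):
    """Any object with this call shape can be dropped into the examples."""

    def invoke(self, prompt: str) -> str: ...


class FakeLocalLLM:
    """Stand-in for a local backend such as Ollama or Llama.cpp."""

    def invoke(self, prompt: str) -> str:
        return f"[local] {prompt}"


class FakeHostedLLM:
    """Stand-in for a hosted backend such as ChatGPT."""

    def invoke(self, prompt: str) -> str:
        return f"[hosted] {prompt}"


def summarize(llm: LLM, text: str) -> str:
    # Application code depends only on the shared interface,
    # so either backend works here unchanged.
    return llm.invoke(f"Summarize: {text}")
```

In the real notebooks the swap is the same shape: construct a different LLM class and pass it to the same chain.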
- Clone the project using `git`
- Install Python 3.11+
- Create a virtual environment for this project
- Install the requirements
- Run `jupyter lab`
- Open a browser to localhost:8888
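The environment-setup steps above look roughly like this in a POSIX shell (the repository URL is not shown here, so the clone step is left as a comment; substitute the real one):

```shell
# Sketch of the setup steps. First: git clone <repository-url> && cd <repo>
python3 --version             # verify Python 3.11+ is installed
python3 -m venv .venv         # create a virtual environment for the project
. .venv/bin/activate          # activate it (Windows: .venv\Scripts\activate)
python -m pip install --upgrade pip
# then: python -m pip install -r requirements.txt && jupyter lab
```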
If you're new to this stuff, the suggested order is:
- LLMs
- Prompting
- Embeddings
- Vector Databases
- Retrieval Augmented Generation (RAG)
- Local LLMs with LangChain - an overview of using local LLMs with LangChain
- Local retrieval QA - more info about what we're doing here with RAG
- Long term chat memory - using a vector database to implement long term chat memory
- Ollama LangChain integration
- Code understanding with LangChain - using an LLM to analyze and work with your code
- Introduction to prompt design - from Google, but applicable to any LLM
- LangChain integrations - LangChain works with many LLMs, vector databases, and other retrieval backends, and can ingest data from many sources
- Sentient Silo demo - a demo of some more advanced capabilities in an upcoming side project
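The embeddings, vector database, and RAG topics above all rest on one operation: nearest-neighbor search over vectors by similarity. A stdlib-only sketch of that retrieval step, using hand-made toy vectors in place of a real embedding model (all names and numbers here are illustrative):

```python
import math

# Toy "embeddings": in a real RAG pipeline these vectors come from an
# embedding model; here they are hand-made 3-d vectors for illustration.
docs = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.2, 0.1],
    "stocks": [0.0, 0.1, 0.9],
}


def cosine(a, b):
    """Cosine similarity: how closely two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vec, k=1):
    # A vector database does exactly this, at scale, usually with
    # approximate-nearest-neighbor indexes instead of a full scan.
    ranked = sorted(docs, key=lambda name: cosine(query_vec, docs[name]), reverse=True)
    return ranked[:k]


# A query vector near the animal documents retrieves them first;
# in RAG, the retrieved text is then stuffed into the LLM's prompt.
print(retrieve([0.85, 0.15, 0.05]))
```

The notebooks replace the toy pieces with real ones (an embedding model produces the vectors, a vector store does the search), but the retrieval logic is the same.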