Cleaned repository and fixed issues
General adjustments:
- cleaned up the repository by removing and adjusting several files
- updated README.md to reflect changes from original repository
- added db_clear.py to easily clear the entire database

main_st.py:
- removed the cache refresh button, as the underlying issue is now fixed
- cleaned up multiple parts of the code

db_build.py:
- cleaned up code
Vlassie committed Oct 26, 2023
1 parent 1ebb5bd commit d983844
Showing 11 changed files with 239 additions and 3,398 deletions.
2 changes: 1 addition & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# GGML Models
models/*.bin
models/*

# Data
data/*
Expand Down
59 changes: 28 additions & 31 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,57 +1,54 @@
# Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A

### Clearly explained guide for running quantized open-source LLM applications on CPUs using LLama 2, C Transformers, GGML, and LangChain
## Preface
This is a fork of Kenneth Leung's original repository, which adjusts the original code in several ways:
- A Streamlit interface makes the application more user-friendly
- Follow-up questions are now possible thanks to a memory implementation
- Users can now choose between different models
- Multiple other optimisations

**Step-by-step guide on TowardsDataScience**: https://towardsdatascience.com/running-llama-2-on-cpu-inference-for-document-q-a-3d636037a3d8
___
## Context
- Third-party commercial large language model (LLM) providers like OpenAI's GPT4 have democratized LLM use via simple API calls.
- However, there are instances where teams would require self-managed or private model deployment for reasons like data privacy and residency rules.
- The proliferation of open-source LLMs has opened up a vast range of options for us, thus reducing our reliance on these third-party providers. 
- When we host open-source LLMs locally on-premise or in the cloud, the dedicated compute capacity becomes a key issue. While GPU instances may seem the obvious choice, the costs can easily skyrocket beyond budget.
- In this project, we will discover how to run quantized versions of open-source LLMs on local CPU inference for document question-and-answer (Q&A).
<br><br>
![Alt text](assets/diagram_flow.png)
___

## Quickstart
- Ensure you have downloaded the GGML binary file from https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML and placed it into the `models/` folder
- To start parsing user queries into the application, launch the terminal from the project directory and run the following command:
`poetry run python main.py "<user query>"`
- For example, `poetry run python main.py "What is the minimum guarantee payable by Adidas?"`
- Note: Omit the prepended `poetry run` if you are NOT using Poetry
<br><br>
- Ensure you have downloaded the model of your choice in GGUF format and placed it into the `models/` folder. Some examples:
- https://huggingface.co/TheBloke/Llama-2-7b-Chat-GGUF
- https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF

- Fill the `data/` folder with .pdf, .doc(x) or .txt files you want to ask questions about

- To build a FAISS database with information regarding your files, launch the terminal from the project directory and run the following command <br>
`python db_build.py`

- To start asking questions about your files, run the following command: <br>
`streamlit run main_st.py`

- Choose which model to use for Q&A and adjust parameters to your liking

![Alt text](assets/qa_output.png)

___
## Tools
- **LangChain**: Framework for developing applications powered by language models
- **C Transformers**: Python bindings for Transformer models implemented in C/C++ using the GGML library
- **LlamaCPP**: Python bindings for `llama.cpp`, which runs Transformer models implemented in C/C++
- **FAISS**: Open-source library for efficient similarity search and clustering of dense vectors.
- **Sentence-Transformers (all-MiniLM-L6-v2)**: Open-source pre-trained transformer model for embedding text to a 384-dimensional dense vector space for tasks like clustering or semantic search.
- **Llama-2-7B-Chat**: Open-source fine-tuned Llama 2 model designed for chat dialogue. Leverages publicly available instruction datasets and over 1 million human annotations.
- **Poetry**: Tool for dependency management and Python packaging
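FAISS's role in this stack is nearest-neighbour search over embedding vectors: the query is embedded, then compared against the stored document embeddings. Conceptually it boils down to the following pure-Python toy (not FAISS itself, which uses optimised indexes):

```python
import math

def cosine_similarity(a, b):
    # Similarity of two dense vectors, as used for semantic search
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest(query_vec, doc_vecs):
    # Index of the stored vector most similar to the query vector
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return max(range(len(scores)), key=scores.__getitem__)
```

In the real application the vectors are 384-dimensional all-MiniLM-L6-v2 embeddings and FAISS handles the indexing and search.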

___
## Files and Content
- `/assets`: Images relevant to the project
- `/config`: Configuration files for LLM application
- `/data`: Dataset used for this project (i.e., Manchester United FC 2022 Annual Report - 177-page PDF document)
- `/models`: Binary file of GGML quantized LLM model (i.e., Llama-2-7B-Chat)
- `/models`: Binary file of GGUF quantized LLM model (i.e., Llama-2-7B-Chat)
- `/src`: Python code for the key components of the LLM application, namely `llm.py`, `utils.py`, and `prompts.py`
- `/vectorstore`: FAISS vector store for documents
- `db_build.py`: Python script to ingest dataset and generate FAISS vector store
- `main.py`: Main Python script to launch the application and to pass user query via command line
- `pyproject.toml`: TOML file specifying the dependency versions used (Poetry)
- `db_clear.py`: Python script to clear the previously built database
- `main_st.py`: Main Python script to launch the streamlit application
- `main.py`: Python script to launch an older version of the application within the terminal, mainly used for testing purposes
- `requirements.txt`: List of Python dependencies (and version)
___

## References
- https://github.com/marella/ctransformers
- https://huggingface.co/TheBloke
- https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML
- https://python.langchain.com/en/latest/integrations/ctransformers.html
- https://python.langchain.com/en/latest/modules/models/llms/integrations/ctransformers.html
- https://python.langchain.com/docs/ecosystem/integrations/ctransformers
- https://ggml.ai
- https://github.com/rustformers/llm/blob/main/crates/ggml/README.md
- https://www.mdpi.com/2189676
- https://github.com/abetlen/llama-cpp-python
- https://python.langchain.com/docs/integrations/llms/llamacpp
Binary file removed assets/diagram_flow.png
Binary file not shown.
Binary file modified assets/qa_output.png
5 changes: 1 addition & 4 deletions config/config.yml
Original file line number Diff line number Diff line change
Expand Up @@ -3,10 +3,7 @@ VECTOR_COUNT: 2
CHUNK_SIZE: 500
CHUNK_OVERLAP: 50
DATA_PATH: 'data/'
LOG_FILE: 'log_loaded.txt'
DB_FAISS_PATH: 'vectorstore/db_faiss'
# MODEL_TYPE: 'mpt'
# MODEL_BIN_PATH: 'models/mpt-7b-instruct.ggmlv3.q8_0.bin'
MODEL_TYPE: 'llama'
MODEL_BIN_PATH: 'models/llama-2-7b-chat.ggmlv3.q8_0.bin'
MAX_NEW_TOKENS: 256
TEMPERATURE: 0.01
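`CHUNK_SIZE` and `CHUNK_OVERLAP` control how documents are split before embedding; consecutive chunks share some text so context is not lost at chunk boundaries. The effect can be illustrated with a toy character splitter (a sketch, not LangChain's `RecursiveCharacterTextSplitter`):

```python
def chunk_text(text, chunk_size, chunk_overlap):
    # Each chunk starts chunk_size - chunk_overlap characters after the
    # previous one, so adjacent chunks share chunk_overlap characters.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]
```

With the config values above (500/50), each 500-character chunk repeats the last 50 characters of the previous chunk.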
16 changes: 8 additions & 8 deletions db_build.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
import box
import yaml
from langchain.vectorstores import FAISS
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import PyPDFLoader, DirectoryLoader
from langchain.document_loaders import Docx2txtLoader
Expand All @@ -12,7 +13,6 @@
import sys
import os

from langchain.embeddings import HuggingFaceEmbeddings

# Import config vars
with open('config/config.yml', 'r', encoding='utf8') as ymlfile:
Expand All @@ -25,19 +25,19 @@ def run_db_build():
documents = []

source = cfg.DATA_PATH
output_file = 'log_loaded.txt'
output_path = os.path.join(source, output_file)
log_file = cfg.LOG_FILE
log_path = os.path.join(source, log_file)
all_items = os.listdir(source)

# Check which files are already loaded in the database (if any)
existing_files = []
if os.path.exists(output_path):
with open(output_path, 'r') as file:
if os.path.exists(log_path):
with open(log_path, 'r') as file:
existing_files = file.read().splitlines()
# Obtain files that aren't yet loaded
new_files = [name for name in all_items if name not in existing_files and name != output_file]
new_files = [name for name in all_items if name not in existing_files and name != log_file]
# Save their names to the logging file
with open(output_path, 'a') as file:
with open(log_path, 'a') as file:
for name in new_files:
file.write(name + '\n')
if new_files:
Expand All @@ -47,7 +47,7 @@ def run_db_build():
sys.exit()

for index, file in enumerate(new_files, start=1):
if not file == output_file: # skip adding the logging file to the database
if not file == log_file: # skip adding the logging file to the database
print(f"Loading... {file} - File {index}/{total_files}", end='\r')
print(end='\x1b[2K') # clear previous print so no overlap occurs

Expand Down
40 changes: 40 additions & 0 deletions db_clear.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
# =========================
# Module: Vector DB Clear
# =========================
import box
import yaml
import os

# Import config vars
with open('config/config.yml', 'r', encoding='utf8') as ymlfile:
cfg = box.Box(yaml.safe_load(ymlfile))

def delete_files_and_clear_content(folder_path, file_to_clear):
try:
# Get a list of all files in the folder
files = os.listdir(folder_path)

# Loop through the list and delete each file
for file in files:
file_path = os.path.join(folder_path, file)
if os.path.isfile(file_path):
os.remove(file_path)
print(f"{file} deleted successfully.")

print(f"All files in '{folder_path}' have been deleted.")
except FileNotFoundError:
print(f"Folder not found at path: {folder_path}")

# Clear the contents of the specified file
try:
with open(file_to_clear, 'w') as clear_file:
clear_file.truncate(0)
print(f"Contents of '{file_to_clear}' cleared successfully.")
except FileNotFoundError:
print(f"{file_to_clear} not found.")

if __name__ == "__main__":
folder_path = cfg.DB_FAISS_PATH
file_to_clear = os.path.join(cfg.DATA_PATH, cfg.LOG_FILE)

delete_files_and_clear_content(folder_path, file_to_clear)
