#Background

This is an experimental setup that uses the LLaMA3 LLM and Langchain to execute contextual queries against personal data and derive insights.

##Steps at a Glance

1. Setup LLaMA3 LLM Locally and Execute Test Queries
2. Start Jupyter Notebooks locally
3. Setup and Execute queries using Langchain
4. Setup Conversation Chat History
5. Download Cricket Data
6. Read the Cricket Data using Langchain
7. Create Vector Indexes for the Data 
8. Execute Queries
9. Experimentation
10. Reference Documentation

##1. Setup LLaMA3 LLM Locally and Execute Test Queries

Detailed Steps: [Running LLaMA3 on Mac, Windows & Linux](https://github.com/meta-llama/llama-recipes/blob/main/recipes/quickstart/Running_Llama3_Anywhere/Running_Llama_on_Mac_Windows_Linux.ipynb)

####Use Homebrew to install Ollama
`brew install ollama`

####Using Ollama, pull the llama3.1 8B LLM and run it
```
ollama pull llama3.1
ollama run llama3.1
```

####Execute Queries

Once LLaMA3 has started, queries can be typed directly at the prompt in the terminal where it is running, or sent to the local API with curl as shown below.

```
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.1",
  "messages": [
    {
      "role": "user",
      "content": "who wrote the book godfather?"
    }
  ],
  "stream": false
}'
```
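
The same request can also be sent from Python. The sketch below uses only the standard library and assumes Ollama is listening on the default port 11434 used in the curl command above. The helper names `build_chat_request` and `ask` are illustrative and not part of any library.

```python
import json
import urllib.request

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"  # same endpoint as the curl example


def build_chat_request(prompt: str, model: str = "llama3.1") -> urllib.request.Request:
    """Build a POST request that mirrors the curl command above."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one complete JSON response instead of a stream
    }
    return urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, model: str = "llama3.1") -> str:
    """Send the prompt to the local Ollama server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, model)) as resp:
        body = json.load(resp)
    # With "stream": false the reply text is found under message.content
    return body["message"]["content"]
```

With the model running, `print(ask("who wrote the book godfather?"))` should print the model's answer.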

##2. Start Jupyter Notebooks locally
```
# clone git repo: https://github.com/fastai/fastsetup
git clone https://github.com/fastai/fastsetup.git

# run the following command to setup conda
./fastsetup/setup-conda.sh

# install and run Jupyter Notebook locally
conda install jupyter
jupyter notebook --no-browser
```

Copy the URL printed in the console and paste it into a web browser to access Jupyter Notebooks.

##3. Setup and Execute queries using Langchain

Create a new file with the extension `.ipynb` and add the following code cells:

In [None]:
!pip install langchain
!pip install langchain_community # used for interaction with Ollama agent to execute queries with llama3.1
!pip install sentence-transformers # provides the embedding model used to create the vector index
!pip install faiss-cpu # the vector datastore used to store the vector index
!pip install bs4

In [None]:
from langchain_community.chat_models import ChatOllama

llm = ChatOllama(model="llama3.1", temperature=0) 
response = llm.invoke("who wrote the book godfather?") 
print(response.content)

##4. Setup Conversation Chat History

We can now execute queries with llama3.1, but the model cannot answer follow-up questions because it does not remember the context of previously asked questions. The following code sets up a session store that saves the conversation with llama and passes the conversation history back on every call, so that it can remember what we are talking about and answer questions in that context.

In [None]:
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

# store is a dictionary that maps session IDs to their corresponding chat histories.
store = {}  # memory is maintained outside the chain


# A function that returns the chat history for a given session ID.
def get_session_history(session_id: str) -> InMemoryChatMessageHistory:
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]


#  Define a RunnableConfig object, with a `configurable` key. session_id determines thread
config = {"configurable": {"session_id": "1"}}

conversation = RunnableWithMessageHistory(
    llm,
    get_session_history,
)

conversation.invoke(
    "who wrote the book godfather?",  # input or query
    config=config,
)

In [None]:
conversation.invoke(
    "tell me more",
    config=config,
)
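
To make the idea behind the session store concrete, here is a minimal standard-library sketch of the same pattern: the history is kept per session ID, and the model is given the full history on every call. This is not how `RunnableWithMessageHistory` is implemented; `chat` and `fake_model` are hypothetical names, and `fake_model` is a stand-in for the LLM.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]  # e.g. {"role": "user", "content": "..."}

# Maps a session ID to the list of messages exchanged in that session
sessions: Dict[str, List[Message]] = {}


def chat(session_id: str, user_text: str, model: Callable[[List[Message]], str]) -> str:
    """Append the user turn, call the model with the full history, store the reply."""
    history = sessions.setdefault(session_id, [])
    history.append({"role": "user", "content": user_text})
    reply = model(list(history))  # the model receives a copy of every turn so far
    history.append({"role": "assistant", "content": reply})
    return reply


def fake_model(messages: List[Message]) -> str:
    """Hypothetical stand-in for the LLM: reports how many messages it was given."""
    return f"I received {len(messages)} message(s)"
```

Calling `chat("1", "who wrote the book godfather?", fake_model)` and then `chat("1", "tell me more", fake_model)` passes one message on the first call and three on the second (question, answer, follow-up), which is what lets a real model resolve "tell me more".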

##5. Download Cricket Data

Download the data in JSON format from the following website: https://cricsheet.org/downloads/

Data Schema Information: https://cricsheet.org/format/json/#introduction-to-the-json-format

There are various examples on the web where data from PDFs and Text files are used to build vector indexes and used for querying. The following notebook has a good example of scraping a webpage and using its information to build vector indexes: https://github.com/meta-llama/llama-recipes/blob/main/recipes/quickstart/RAG/hello_llama_cloud.ipynb


Using this data makes it possible to explore how structured and semi-structured data can be used to create vector indexes, and to experiment with the different kinds of queries that can draw meaningful insights from it.

##6. Read the Cricket Data using Langchain

The downloaded data is pretty large (2.8GB), which makes experimentation slow. Create a separate folder called `cricket_sample` and copy 5-10 files from the downloaded data into it. This folder will be used for testing.
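
Copying the sample files can also be scripted. The following is a minimal sketch that copies the first few JSON files (sorted by name) into a sample folder. The function name `copy_sample` and its parameters are hypothetical, and the paths passed in are whatever locations you chose for the download and the sample.

```python
import shutil
from pathlib import Path
from typing import List


def copy_sample(source_dir: str, sample_dir: str, count: int = 10) -> List[Path]:
    """Copy the first `count` JSON files (sorted by name) into the sample folder."""
    source = Path(source_dir)
    target = Path(sample_dir)
    target.mkdir(parents=True, exist_ok=True)  # create the sample folder if needed

    copied = []
    for json_file in sorted(source.glob("*.json"))[:count]:
        copied.append(Path(shutil.copy2(json_file, target / json_file.name)))
    return copied
```

The function returns the paths of the copied files, so `len(copy_sample(src, dst))` gives a quick check of how many files ended up in the sample.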

Next, we will load the data as Langchain Documents.

In [None]:
from langchain_community.document_loaders import DirectoryLoader, TextLoader
# from langchain_community.document_loaders import JSONLoader

DRIVE_FOLDER = "/Users/srjalan/personal/data/cricket_sample"
loader = DirectoryLoader(DRIVE_FOLDER, glob='**/*.json', show_progress=True, loader_cls=TextLoader)

# loader = DirectoryLoader(DRIVE_FOLDER, glob='**/*.json', show_progress=True, loader_cls=JSONLoader, loader_kwargs = {'jq_schema':'.content'})

documents = loader.load()

print(f'document count: {len(documents)}')
print(documents[0] if len(documents) > 0 else None)

##7. Create Vector Indexes for the Data

First, we will split the data into chunks using the `RecursiveCharacterTextSplitter`. There are multiple splitters that can be used to improve the effectiveness of queries. The following article describes some of the different splitters: https://medium.com/@sushmithabhanu24/retrieval-in-langchain-part-2-text-splitters-2d8c9d595cc9

In [None]:
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter
import bs4

# Split the document into chunks with a specified chunk size
text_splitter = RecursiveCharacterTextSplitter(chunk_size=750, chunk_overlap=100)
all_splits = text_splitter.split_documents(documents)

# Store the document into a vector store with a specific embedding model
vectorstore = FAISS.from_documents(all_splits, HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2"))
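
To build intuition for `chunk_size` and `chunk_overlap`, the sketch below cuts a string into fixed-size windows that share an overlap. It is a deliberate simplification and not the library's algorithm: `RecursiveCharacterTextSplitter` additionally tries to break at separators such as newlines. The name `sliding_window_chunks` is hypothetical.

```python
from typing import List


def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into windows of at most chunk_size characters sharing chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With `chunk_size=750` and `chunk_overlap=100`, each window starts 650 characters after the previous one, so the last 100 characters of one chunk are repeated at the start of the next. The overlap reduces the chance that a fact is cut in half at a chunk boundary.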

##8. Execute Queries

In [None]:
from langchain.chains import RetrievalQA

qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=vectorstore.as_retriever()
)

question = "Which teams did Australia play against?"
result = qa_chain.invoke({"query": question})
print(result['result'])
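
Conceptually, the retriever embeds the question, compares it with the stored chunk embeddings, and hands the closest chunks to the LLM as context. The sketch below illustrates that ranking step with a toy bag-of-words embedding and cosine similarity. It is an assumption-laden illustration only: the real setup uses the `all-mpnet-base-v2` embedding model and FAISS, and the names `embed`, `cosine`, and `top_k` are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
```

If the chunks that contain the answer do not rank near the top, the LLM never sees them, which is why the choice of splitter and chunk size affects answer quality.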

##9. Experimentation

To improve the performance of the query system, I tried the `RecursiveJsonSplitter`, but it has not yet given me the result I was hoping for. I am continuing to research and experiment with configurations to find a good solution. The code below demonstrates the setup for the `RecursiveJsonSplitter`.

In [None]:
#Load the data as Json Objects into memory, instead of loading them as Langchain Documents

import os
import json

# Path to folder with JSON files
folder_path = '/Users/srjalan/personal/data/cricket_sample'

# List to store all data
data = []

# Load each JSON file in the folder
for filename in os.listdir(folder_path):
    if filename.endswith(".json"):
        with open(os.path.join(folder_path, filename), 'r') as f:
            json_data = json.load(f)
            data.append(json_data)

# Check sample data
print(data[0])

In [None]:
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.text_splitter import RecursiveJsonSplitter
import bs4

# Split the document into chunks with a specified max chunk size
json_splitter = RecursiveJsonSplitter(max_chunk_size=1000)
all_splits = json_splitter.create_documents(data)

# Store the document into a vector store with a specific embedding model
vectorstore = FAISS.from_documents(all_splits, HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2"))

In [None]:
from langchain.chains import RetrievalQA

qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=vectorstore.as_retriever()
)

question = "Which teams did Australia play against?"
result = qa_chain.invoke({"query": question})
print(result['result'])
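
When comparing splitter configurations, it helps to know the correct answer in advance. The sketch below computes it directly from the loaded JSON, so the LLM's answer can be checked against it. It assumes each match file has an `info.teams` list naming the two sides, as described in the Cricsheet schema linked in section 5; the function name `opponents_of` is hypothetical.

```python
from typing import Dict, Iterable, Set


def opponents_of(team: str, matches: Iterable[Dict]) -> Set[str]:
    """Collect every team that appears alongside `team` in a match's info.teams list."""
    opponents: Set[str] = set()
    for match in matches:
        # Assumption: the team names live under info.teams in each match
        teams = match.get("info", {}).get("teams", [])
        if team in teams:
            opponents.update(name for name in teams if name != team)
    return opponents
```

For example, `opponents_of("Australia", data)` on the `data` list loaded above returns the set of teams the sample files record Australia playing against.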

##10. Reference Documentation

1. [Local RAG Agent with LLaMA3 - Advanced Example](https://github.com/langchain-ai/langgraph/blob/main/examples/rag/langgraph_rag_agent_llama3_local.ipynb)
2. [How to load JSON](https://python.langchain.com/docs/how_to/document_loader_json/)
3. [Retrieval in LangChain, Part 2: Text Splitters](https://medium.com/@sushmithabhanu24/retrieval-in-langchain-part-2-text-splitters-2d8c9d595cc9)
4. [How to split JSON data](https://python.langchain.com/docs/how_to/recursive_json_splitter/)
5. [Cricket Data API](https://cricketdata.org)
6. [Kaggle ICC Cricket Data](https://www.kaggle.com/datasets/mahendran1/icc-cricket)
7. [Create Your Customized ChatBot with Your Data Using LangChain](https://blog.gopenai.com/create-your-customized-chatbot-with-your-data-using-langchain-a715ad50a34d)