# LangChain RAG with Local Models

This notebook is based on Pixegami's RAG tutorial ([original repo](https://github.com/pixegami/rag-tutorial-v2/)).

## Download data and folder setup

On the Docker host side, populate the `data` directory inside `jetson-containers` (available as `/data` in the container) with the documents to index.

Your directory structure should look like this:

```
└── ./data/documents/L4T-README
    ├── README-usb-dev-mode.txt
    ├── README-vnc.txt
    └── README-wifi.txt
```

Run these commands on the Docker host side (natively on Jetson):

```
cd jetson-containers
mkdir -p data/documents/L4T-README
cp /media/jetson/L4T-README/*.txt data/documents/L4T-README/
```
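
Before loading, it can help to confirm from inside the container that the files are where the notebook expects them. The following is a minimal sketch, not part of the original tutorial; `list_text_files` is a hypothetical helper, and it assumes the host's `data` directory is visible as `/data` in the container.

```python
from pathlib import Path


def list_text_files(data_path: str) -> list[str]:
    """Return the sorted names of the .txt files directly inside data_path."""
    directory = Path(data_path)
    if not directory.is_dir():
        raise FileNotFoundError(f"{data_path} does not exist or is not a directory")
    return sorted(p.name for p in directory.glob("*.txt"))


# Example call (path taken from this notebook; uncomment inside the container):
# print(list_text_files("/data/documents/L4T-README"))
```

An empty list or a `FileNotFoundError` here usually means the copy step above was skipped or the directory is not mounted.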

## Loading The Data

In [1]:
from langchain_community.document_loaders import DirectoryLoader

DATA_PATH = '/data/documents/L4T-README'

def load_documents():
    document_loader = DirectoryLoader(DATA_PATH, glob="*.txt")
    return document_loader.load()

In [2]:
documents = load_documents()
print(len(documents))
print(documents[0])

4


## Split The Documents 

In [3]:
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.schema.document import Document

def split_documents(documents: list[Document]):
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=800,
        chunk_overlap=50,
        length_function=len,
        is_separator_regex=False,
    )
    return text_splitter.split_documents(documents)

In [4]:
chunks = split_documents(documents)
print(len(chunks))
print(chunks[0])

27
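
To build intuition for `chunk_size` and `chunk_overlap`, here is a deliberately simplified sketch that cuts text into fixed-size windows. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraph and line breaks); `sliding_window_chunks` is a hypothetical helper used only for illustration.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
for chunk in sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3):
    print(chunk)
# abcdefghij
# hijklmnopq
# opqrstuvwx
# vwxyz
```

Each chunk repeats the last three characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is large enough.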


## Embedding Models

There are a couple of options for the embedding model.

### Using Cloud Embedding Model

Cloud-hosted embedding models generally produce more accurate results.

#### Option 1: OpenAI Embedding model

[API Reference: `OpenAIEmbeddings`](https://api.python.langchain.com/en/latest/embeddings/langchain_openai.embeddings.base.OpenAIEmbeddings.html)

> **Remember to put YOUR own OpenAI API key in the following cell.** 

In [None]:
OPENAI_API_KEY = ""
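
If you would rather not store the key in the notebook, one option is to read it from an environment variable instead. This is a minimal sketch; `read_api_key` is a hypothetical helper, and it assumes the variable was exported before the notebook server started.

```python
import os


def read_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in the environment variable, or "" if it is unset."""
    return os.environ.get(var_name, "").strip()


OPENAI_API_KEY = read_api_key()
if not OPENAI_API_KEY:
    print("OPENAI_API_KEY is not set; the OpenAI cells will fail until it is.")
```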

In [11]:
from langchain_openai import OpenAIEmbeddings

def get_embedding_model():
    embedding_model = OpenAIEmbeddings(
        model = "text-embedding-3-large",
        openai_api_key = OPENAI_API_KEY
    )
    return embedding_model

#### Option 2: AWS Bedrock Embedding model

[API Reference: `langchain_community.embeddings.bedrock`](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.bedrock.Bedrock.html)

In [6]:
from langchain_community.embeddings.bedrock import BedrockEmbeddings

def get_embedding_model():
    embedding_model = BedrockEmbeddings(
        credentials_profile_name="default", region_name="us-east-1"
    )
    return embedding_model

### Running **Local** Embedding Function

#### Option 3: Open embedding model `nomic-embed-text` running locally

In case you want to run the embedding function locally as well, run the following cell instead.

In [None]:
!ollama list

In [None]:
!ollama pull nomic-embed-text

In [None]:
from langchain_community.embeddings.ollama import OllamaEmbeddings

def get_embedding_model():
    embedding_model = OllamaEmbeddings(model="nomic-embed-text")
    return embedding_model

## Creating The Vector Store

We are going to create the vector store with embeddings and save it in a directory as files.
Here, the directory is defined to be "**chromadb**".

In [7]:
CHROMA_PATH = "chromadb"

Remove the directory if it has previously been created and populated.

In [8]:
%%bash -s "$CHROMA_PATH"
rm -rf $1

### Vector store to be created with embedding model specified

In [12]:
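from langchain_community.vectorstores import Chroma

def add_to_chroma(chunks: list[Document]):
    # Open (or create) the on-disk store and embed the chunks with the selected model.
    vectorstore = Chroma(
        persist_directory=CHROMA_PATH,
        embedding_function=get_embedding_model()
    )
    vectorstore.add_documents(chunks)
    # Chroma 0.4 and later persists automatically; this call is kept for older
    # versions and triggers a deprecation warning on newer ones.
    vectorstore.persist()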

In [13]:
add_to_chroma(chunks)

  warn_deprecated(


Let's check what files are saved and the size of each file.

In [14]:
%%bash -s "$CHROMA_PATH"
du -ah ./$1

4.0K	./chromadb/27ebb1e3-2205-4e67-8d79-b70273f784ee/length.bin
4.0K	./chromadb/27ebb1e3-2205-4e67-8d79-b70273f784ee/header.bin
12M	./chromadb/27ebb1e3-2205-4e67-8d79-b70273f784ee/data_level0.bin
0	./chromadb/27ebb1e3-2205-4e67-8d79-b70273f784ee/link_lists.bin
12M	./chromadb/27ebb1e3-2205-4e67-8d79-b70273f784ee
620K	./chromadb/chroma.sqlite3
13M	./chromadb


## Running RAG Query Locally

The cell below defines the template for the prompt that is eventually sent to our LLM.

In [15]:
PROMPT_TEMPLATE = """
Answer the question based only on the following context:
{context}

---
Answer the question based on the above context: {question}
"""

The actual question is supplied below.

In [16]:
query_text="What IPv4 address Jetson device gets assigned when connected to a PC with a USB cable? \
    And what file to edit in order to change the IP address to be assigned to Jetson itself in USB device mode? \
    Please state in which section you found the answer to each question."

### Load vector store from persisted files with embedding model specified

In [17]:
vectorstore = Chroma(
    persist_directory=CHROMA_PATH, 
    embedding_function=get_embedding_model()
)

### Search the vector store for retrieving relevant context

The 5 most relevant chunks are retrieved.

In [18]:
results = vectorstore.similarity_search_with_score(query_text, k=5)
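
Conceptually, the search embeds the query and ranks the stored chunks by how close their vectors are to it. The sketch below shows a brute-force version on toy two-dimensional vectors. It assumes the score is a squared L2 distance (which, to our understanding, is Chroma's default, so a lower score means a closer match); `squared_l2` and `top_k` are hypothetical helpers, and the real store uses an approximate index rather than a full scan.

```python
def squared_l2(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(query_vec: list[float], stored: list[tuple[str, list[float]]], k: int):
    """Return the k (text, distance) pairs closest to query_vec, nearest first."""
    scored = [(text, squared_l2(query_vec, vec)) for text, vec in stored]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Toy "embeddings" invented for illustration; real ones have hundreds of dimensions.
stored = [
    ("usb device mode", [1.0, 0.0]),
    ("vnc setup", [0.0, 1.0]),
    ("wifi setup", [-1.0, 0.0]),
]
for text, distance in top_k([0.9, 0.1], stored, k=2):
    print(f"{distance:.2f}  {text}")
```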

In [19]:
from langchain.prompts import ChatPromptTemplate

context_text = "\n\n---\n\n".join([doc.page_content for doc, _score in results])
prompt_template = ChatPromptTemplate.from_template(PROMPT_TEMPLATE)
prompt = prompt_template.format(context=context_text, question=query_text)
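
The cell above does two things: it joins the retrieved chunks with a `---` separator, and it substitutes the result into the template. A plain-Python sketch of the same idea, using `str.format` in place of `ChatPromptTemplate`, looks like this; `build_prompt` is a hypothetical helper. Unlike the LangChain version, it does not add the `Human: ` prefix that appears in the printed prompt.

```python
PROMPT_TEMPLATE = """
Answer the question based only on the following context:
{context}

---
Answer the question based on the above context: {question}
"""


def build_prompt(chunks: list[str], question: str) -> str:
    """Join the chunk texts with a separator and fill both template placeholders."""
    context_text = "\n\n---\n\n".join(chunks)
    return PROMPT_TEMPLATE.format(context=context_text, question=question)


print(build_prompt(["first chunk", "second chunk"], "What is in the chunks?"))
```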

The final `prompt` is generated with the context filled in.

If you decide to print out the generated prompt, run the following cell.

In [20]:
print(prompt)

Human: 
Answer the question based only on the following context:
Ethernet on Mac ---------------------------------------------------------------------- An NCM USB Ethernet device is created, and the required driver is automatically activated. Since the Mac operating system does not support the RNDIS device, the device is not activated by default.

Changing the IPv4 Address ---------------------------------------------------------------------- Edit /opt/nvidia/l4t-usb-device-mode/nv-l4t-usb-device-mode-config.sh on Jetson to change the IPv4 network parameters. The following variables must be changed, and must maintain consistent values: - net_ip - net_mask - net_net - net_dhcp_start - net_dhcp_end

---

Linux for Tegra assigns a static IPv4 address of 192.168.55.1 to Jetson, and runs a DHCP server to automatically assign an IPv4 address of 192.168.55.100 to your host machine. This provides point-to-point connectivity. If a Jetson device experiences very high CPU or disk IO load, this DH

### Define LLM

Define the local LLM using `Ollama` to be invoked with the prompt.

In [21]:
from langchain_community.llms.ollama import Ollama

model = Ollama(model="llama3")

If you have not downloaded the `llama3` model and the cell above failed, run the following cell, then come back and execute the cell above again.

In [None]:
!ollama pull llama3

#### Alternative: Using OpenAI LLM

If you want to try an OpenAI LLM instead, run the following cell.

In [None]:
from langchain_openai import OpenAI

model = OpenAI(
    model="gpt-3.5-turbo-instruct",
    openai_api_key = OPENAI_API_KEY
)

#### Running the LLM 

In [22]:
response_text = model.invoke(prompt)

Let's check the LLM output.

In [23]:
from IPython.display import display, Markdown, Latex
display(Markdown(response_text))

**Question 1: What IPv4 address does the Jetson device get assigned when connected to a PC with a USB cable?**

Answer: The Jetson device gets assigned a static IPv4 address of `192.168.55.1` (Section ---).

**Question 2: What file should I edit in order to change the IP address to be assigned to Jetson itself in USB device mode?**

Answer: You need to edit `/opt/nvidia/l4t-usb-device-mode/nv-l4t-usb-device-mode-config.sh` (Section Changing the IPv4 Address).
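
The first answer cites "Section ---", which suggests the model may have mistaken the chunk separator for a section title. To check where each retrieved chunk actually came from, you could list the score and source file of every result. The following is a sketch under assumptions: `summarize_results` is a hypothetical helper, it expects the `(document, score)` pairs returned by `similarity_search_with_score`, and it assumes each document carries a `source` entry in its metadata. The stand-in objects only mimic the two attributes used here.

```python
from types import SimpleNamespace


def summarize_results(results, preview_chars: int = 60) -> list[str]:
    """Return one line per (document, score) pair: score, source file, text preview."""
    lines = []
    for doc, score in results:
        source = doc.metadata.get("source", "unknown")
        # Collapse whitespace so the preview stays on one line.
        preview = " ".join(doc.page_content.split())[:preview_chars]
        lines.append(f"{score:.3f}  {source}  {preview}")
    return lines


# Stand-in objects for illustration only; in the notebook, pass `results` instead.
fake_results = [
    (SimpleNamespace(page_content="Changing the IPv4 Address\nEdit the config script.",
                     metadata={"source": "README-usb-dev-mode.txt"}), 0.8123),
]
for line in summarize_results(fake_results):
    print(line)
```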