<a href="https://colab.research.google.com/github/rexian/ML/blob/main/langchain/groq/langchain_groq_nvidia_llama3_RAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install -qU langchain-core "langgraph>0.2.27"
!pip install -qU "langchain[groq]" langchain_community langchain_huggingface

In [2]:
import getpass
import os

if not os.environ.get("GROQ_API_KEY"):
  os.environ["GROQ_API_KEY"] = getpass.getpass("Enter API key for Groq: ")

from langchain.chat_models import init_chat_model

model = init_chat_model("llama3-8b-8192", model_provider="groq")

Enter API key for Groq: ··········


In [3]:
!wget -O "golden_hymns_of_epictetus.txt" https://www.gutenberg.org/cache/epub/871/pg871.txt

--2025-03-26 20:51:02--  https://www.gutenberg.org/cache/epub/871/pg871.txt
Resolving www.gutenberg.org (www.gutenberg.org)... 152.19.134.47, 2610:28:3090:3000:0:bad:cafe:47
Connecting to www.gutenberg.org (www.gutenberg.org)|152.19.134.47|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 152337 (149K) [text/plain]
Saving to: ‘golden_hymns_of_epictetus.txt’


2025-03-26 20:51:02 (711 KB/s) - ‘golden_hymns_of_epictetus.txt’ saved [152337/152337]



In [4]:
filename = "/content/golden_hymns_of_epictetus.txt"

# Keep only the text between the start marker and the Project Gutenberg end marker
start_marker = "Are these the only works of Providence within us?"
end_marker = "*** END OF THE PROJECT GUTENBERG EBOOK THE GOLDEN SAYINGS OF EPICTETUS, WITH THE HYMN OF CLEANTHES ***"

start_saving = False
lines_to_save = []

with open(filename, 'r') as file:
    for line in file:
        if start_marker in line:
            start_saving = True
        if end_marker in line:
            break
        if start_saving:
            lines_to_save.append(line)

# Overwrite the file with only the selected lines
with open(filename, 'w') as file:
    file.writelines(lines_to_save)

# Count the whitespace-separated words in the trimmed file
word_count = 0
with open(filename, 'r') as file:
    for line in file:
        word_count += len(line.split())

print(f"The total number of words in the file is: {word_count}")

The total number of words in the file is: 23503


Retrieval is like gathering sources before writing an essay: it gives a language model access to up-to-date, relevant information beyond its built-in knowledge.

- **Advantages**:
   - Adds new, fresh information.
   - Makes responses more relevant and informed.

- **Document Loaders**:
   - Function as "specialized librarians."
   - Organize content from various sources for language models.

- **Text Loader Fundamentals**:
   - Simple process: Converts text files into a usable format for language models.


In [11]:
from langchain_community.document_loaders import TextLoader
loader = TextLoader("/content/golden_hymns_of_epictetus.txt")
golden_sayings = loader.load()

type(golden_sayings)

list

# **Document Loaders**

**Usage Steps**:
   1. Choose a Document Loader from LangChain.
   2. Create an instance of the Document Loader.
   3. Employ its `load()` method to convert files into LangChain documents.
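
To make the three steps concrete, here is a minimal pure-Python sketch of what a text loader does. `SimpleDocument` and `SimpleTextLoader` are hypothetical stand-ins written for illustration, not LangChain classes; they only mimic the idea of wrapping a file's contents in a document object with `page_content` and `metadata`.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for a document: text plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class SimpleTextLoader:
    """Hypothetical loader: reads one text file into a list with one document."""

    def __init__(self, path, encoding="utf-8"):
        self.path = Path(path)
        self.encoding = encoding

    def load(self):
        text = self.path.read_text(encoding=self.encoding)
        return [SimpleDocument(page_content=text, metadata={"source": str(self.path)})]


# Write a small sample file, then load it the same way the notebook loads the book
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("Know thyself.", encoding="utf-8")
    docs = SimpleTextLoader(sample).load()

print(type(docs).__name__, len(docs), docs[0].page_content)
```

As in the notebook cell above, `load()` returns a list, which is why `type(golden_sayings)` reports `list`.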

**Role of Document Transformers**

**Customization for Models**: Document transformers adjust documents to suit a model's requirements, for example by trimming lengthy texts.

**Understanding Text Splitters**

**Function**: Divide long texts into smaller, coherent segments.

**Goal**: Keep related text together, fitting within the model's capacity.

**Using `RecursiveCharacterTextSplitter`**

**Methodology**:
   - Intelligently splits texts using multiple separators.

   - Recursively adjusts if segments are too large.

   - Ensures all parts are appropriately sized.

**Key Aspects of Splitting**

   - Chooses optimal separators for division.

   - Continually splits large chunks.

   - Balances chunk size by characters or tokens.

   - Maintains some overlap for context.

   - Tracks chunk starting points if needed.
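
The recursive idea can be sketched in plain Python. This is a simplified illustration, not the library's implementation: `recursive_split` is a hypothetical helper, it measures size in characters only, and it leaves out overlap and start-index tracking.

```python
def recursive_split(text, chunk_size=100, separators=("\n\n", "\n", " ", "")):
    """Split text into chunks of at most chunk_size characters.

    Splits on the first separator, re-splits any piece that is still too
    long using the remaining separators, then merges neighbouring pieces
    back together while they still fit.
    """
    if len(text) <= chunk_size:
        return [text] if text else []

    # No separators left (or the empty separator): fall back to fixed-width slices
    if not separators or separators[0] == "":
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, remaining = separators[0], separators[1:]

    pieces = []
    for part in text.split(sep):
        if not part:
            continue
        if len(part) <= chunk_size:
            pieces.append(part)
        else:
            # Still too large: recurse with the finer-grained separators
            pieces.extend(recursive_split(part, chunk_size, remaining))

    # Greedily merge adjacent pieces so chunks are as full as possible
    chunks, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample_chunks = recursive_split("aaa bbb ccc\n\nddd eee", chunk_size=7)
print(sample_chunks)
```

The first paragraph is too long for a 7-character chunk, so it is split again on spaces, while the second paragraph fits and stays whole.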


In [19]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=50,
    length_function=len,
    add_start_index=True,
)
texts = text_splitter.split_documents(golden_sayings)
texts[0].page_content
texts[1].page_content

'What then! seeing that most of you are blinded, should there not be\nsome one to fill this place, and sing the hymn to God on behalf of all\nmen? What else can I that am old and lame do but sing to God? Were I a\nnightingale, I should do after the manner of a nightingale. Were I a\nswan, I should do after the manner of a swan. But now, since I am a\nreasonable being, I must sing to God: that is my work: I do it, nor\nwill I desert this my post, as long as it is granted me to hold it; and\nupon you too I call to join in this self-same hymn.\n\nII\n\nHow then do men act? As though one returning to his country who had\nsojourned for the night in a fair inn, should be so captivated thereby\nas to take up his abode there.\n\n“Friend, thou hast forgotten thine intention! This was not thy\ndestination, but only lay on the way thither.”\n\n“Nay, but it is a proper place.”'

**Text Embeddings Overview**

**Functionality**: Converts documents into numerical vectors in LangChain.

**Similarity Measure**: Vectors that are closer indicate more similar texts.

**Application**: Quickly identify documents with similar topics or content.
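
One common way to measure how close two vectors are is cosine similarity. The sketch below assumes that measure and uses made-up three-dimensional vectors purely for illustration; real embedding models produce vectors with hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings": two texts on a similar topic and one unrelated text
stoic_a = [0.9, 0.1, 0.0]
stoic_b = [0.8, 0.2, 0.1]
recipe = [0.0, 0.1, 0.9]

print(cosine_similarity(stoic_a, stoic_b))  # close to 1: similar direction
print(cosine_similarity(stoic_a, recipe))   # close to 0: nearly unrelated
```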


In [23]:
from langchain_huggingface import HuggingFaceEmbeddings

model_name = "sentence-transformers/all-mpnet-base-v2"
model_kwargs = {'device': 'cpu'}
encode_kwargs = {'normalize_embeddings': False}
hf = HuggingFaceEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs
)

# **Creating a Vector Store Retriever**

1. **Load Documents**: Use a document loader to read the source documents.

2. **Split Texts**: Break down documents into smaller sections with a text splitter.

3. **Embedding Conversion**: Apply an embedding model to transform text chunks into vectors.

4. **Vector Store Creation**: Compile these vectors into a vector store.

**Outcome**: Your vector store is now set up to search and retrieve texts by content.
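
The embed, store, and search steps can be sketched with a toy in-memory store. Everything here is a hypothetical stand-in: `embed` uses simple word counts instead of a real embedding model, and `ToyVectorStore` does a brute-force scan rather than using an index such as FAISS.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a word-count vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)  # Counter returns 0 for missing words
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Keeps (chunk, vector) pairs in memory and ranks them against a query."""

    def __init__(self, chunks):
        self.entries = [(chunk, embed(chunk)) for chunk in chunks]

    def search(self, query, k=2):
        query_vector = embed(query)
        ranked = sorted(self.entries, key=lambda entry: cosine(query_vector, entry[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


store = ToyVectorStore([
    "grief and fear come from desire",
    "the swan sings to God",
    "a fair inn on the road",
])
hits = store.search("what causes fear and grief", k=1)
print(hits)
```

The query shares the words "fear", "and", and "grief" with the first chunk only, so that chunk ranks highest.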

In [None]:
!pip install -qU faiss-cpu langchain-nvidia-ai-endpoints

In [27]:
from langchain_community.vectorstores import FAISS
vectorstore = FAISS.from_documents(documents=texts, embedding=hf)

In [35]:
os.environ["NVIDIA_API_KEY"] = getpass.getpass("Enter API key for NVIDIA AI Endpoints: ")

Enter API key for NVIDIA AI Endpoints: ··········


In [37]:
from langchain.chains import RetrievalQA
from langchain_core.prompts import PromptTemplate
from langchain_nvidia_ai_endpoints import ChatNVIDIA

template = """

Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say "Ah snap homie, I ain't gonna front. I don't know." and don't try to make up an answer.
Use three sentences maximum, relevant analogies, and keep the answer as concise as possible.
Use the active voice, and speak directly to the reader using concise language.
{context}
Question: {question}
Helpful Answer:

"""

QA_CHAIN_PROMPT = PromptTemplate.from_template(template)

nvidia_model = ChatNVIDIA(model="meta/llama3-70b-instruct")

qa_chain = RetrievalQA.from_chain_type(
    nvidia_model,
    retriever=vectorstore.as_retriever(),
    chain_type_kwargs={"prompt": QA_CHAIN_PROMPT}
)

query = "What do grief, fear, envy, and desire stem from?"
result = qa_chain.invoke({"query": query})

result["result"]

'Homie, grief, fear, envy, and desire stem from not having your own good, your own best interest, in that which is free from hindrance and in your power. Instead, they arise from placing your worth in things that are subject to hindrance and dependent on the will of others.'

### **LCEL for Retrieval**

1. **Integrate Context and Question**: The prompt template includes placeholders for context and question.

2. **Preliminary Setup**
   - Set up a retriever with an in-memory store for document retrieval.
   - Runnable components can be chained or run separately.

3. **RunnableParallel for Input Preparation**
   - Use `RunnableParallel` to combine document search results and the user's question.
   - `RunnablePassthrough` passes the user's question unchanged.

4. **Workflow Steps**
   - **Step 1**: Create `RunnableParallel` with two entries: 'context' (document results) and 'question' (user's original query).
   - **Step 2**: Feed the dictionary to the prompt component, which constructs a prompt using the user's question and retrieved documents.
   - **Step 3**: The model component evaluates the prompt with the Llama 3 LLM.
   - **Step 4**: `output_parser` (a `StrOutputParser`) converts the model's response into a plain Python string.

**End-to-End Process**: The chain runs document retrieval, prompt construction, model inference, and output parsing in sequence, so a single `invoke` call turns a question into a string answer.
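
To show how the pieces compose, here is a pure-Python sketch of the pipe pattern. `Step` and `parallel` are hypothetical helpers written for this illustration, not LangChain classes, and the retriever and model are fakes that return fixed or echoed text.

```python
class Step:
    """Tiny stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, value):
        return self.fn(value)

    def __or__(self, other):
        # a | b builds a new step that runs a, then feeds its output to b
        return Step(lambda value: other.invoke(self.invoke(value)))


def parallel(**branches):
    """Run every branch on the same input and collect the results in a dict."""
    return Step(lambda value: {name: step.invoke(value) for name, step in branches.items()})


# Fake components standing in for the retriever, prompt, model, and parser
retriever = Step(lambda question: ["chunk A", "chunk B"])
passthrough = Step(lambda question: question)
prompt = Step(lambda d: "\n".join(d["context"]) + "\nQuestion: " + d["question"])
model = Step(lambda text: "  echo: " + text.splitlines()[-1] + "  ")
parser = Step(str.strip)

chain = parallel(context=retriever, question=passthrough) | prompt | model | parser
answer = chain.invoke("Why?")
print(answer)
```

The same question reaches both branches of `parallel`: one branch looks up context and the other passes the question through unchanged, which mirrors the roles described for `RunnableParallel` and `RunnablePassthrough`.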

In [38]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableParallel, RunnablePassthrough

output_parser = StrOutputParser()
setup_and_retrieval = RunnableParallel(
    {"context": vectorstore.as_retriever(), "question": RunnablePassthrough()}
)
chain = setup_and_retrieval | QA_CHAIN_PROMPT | nvidia_model | output_parser

chain.invoke(query)

"Based on the provided context, it seems that grief, fear, envy, and desire stem from one's inability to cope with their own emotions and desires, leading to the development of bad habits and mental diseases. In essence, it's the lack of self-control and the giving in to negative impulses that fuels these emotions.\n\n(Ah, I didn't see a direct answer in the provided context, but I made an educated inference based on the discussions about habits, self-control, and negative emotions.)"