In [15]:
!pip install langchain-groq
!pip install langchain-chroma
!pip install langchain-text-splitters
!pip install langchain langchain-huggingface "sentence-transformers[onnx]"


In [17]:
from langchain_chroma import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_groq import ChatGroq
from langchain_huggingface.embeddings import HuggingFaceEmbeddings


In [None]:
api_key = "" #-- Add your Groq API key here
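
Pasting a key directly into a notebook makes it easy to leak when the file is shared. One alternative, sketched here under the assumption that the key is stored in an environment variable named `GROQ_API_KEY`, is to read it at runtime. The helper name `load_api_key` is hypothetical and not part of any library.

```python
import os


def load_api_key(env_var: str = "GROQ_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Raises RuntimeError with a readable message if the variable is unset or empty.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running this notebook."
        )
    return key
```

With the variable set, `api_key = load_api_key()` could replace the hardcoded assignment above.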

### Initialize the ChromaDB

In [19]:
# Get Embeddings Model
embedder = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

# Initialize ChromaDB as Vector Store
vector_store = Chroma(
    collection_name="test_collection",
    embedding_function=embedder
)



### Split the File into LangChain Documents & Save to Vector Store

In [21]:
# Read in the source text file (an essay on renewable energy)
with open("./text_rag_example.txt", encoding="utf-8") as f:
    raw_text = f.read()

# Initialize Text Splitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    length_function=len
)

# Create Documents (Chunks) From File
texts = text_splitter.create_documents([raw_text])

# Save Document Chunks to Vector Store
ids = vector_store.add_documents(texts)
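
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sliding-window chunker in plain Python. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraph and line breaks). The function name `sliding_window_chunks` is hypothetical, and the sketch only illustrates how consecutive chunks share text.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=4)
for chunk in chunks:
    print(chunk)
```

Each printed chunk begins with the last four characters of the previous one. The overlap serves the same purpose in the real splitter: a sentence that straddles a chunk boundary still appears intact in at least one chunk.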

### Semantic Similarity Check with Vector Store

In [29]:
query = "Which renewable energy sources have the potential to provide continuous and reliable electricity, and what are the main challenges associated with each?"

In [30]:
# Query the Vector Store
results = vector_store.similarity_search(query, k=2)

# Print Resulting Chunks
for res in results:
    print(f"* {res.page_content} [{res.metadata}]\n\n")

* In conclusion, renewable energy technologies have evolved from niche applications to mainstream energy sources, transforming the global energy landscape. Solar, wind, hydro, biomass, and geothermal energy each contribute unique advantages, and their integration into energy systems promotes sustainability, resilience, and climate mitigation. While technical, economic, and social challenges remain, ongoing innovation, supportive policy frameworks, and international collaboration provide a strong foundation for continued growth. As the world seeks to transition to a low-carbon future, renewable energy will undoubtedly play a central role in shaping a sustainable, equitable, and prosperous global energy system. [{}]


* The Evolution of Renewable Energy Technologies and Their Global Impact

Over the past century, the global energy landscape has undergone profound transformations. From the early reliance on coal during the Industrial Revolution to the gradual adoption of oil and natural g
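
Conceptually, `similarity_search` embeds the query and returns the `k` stored chunks whose vectors lie closest to it. The sketch below shows that ranking step using cosine similarity on made-up 3-dimensional vectors. The vectors, labels, and function names are illustrative assumptions, and the distance metric Chroma actually applies depends on how the collection is configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=2):
    """Return the k (label, score) pairs most similar to query_vec."""
    scored = [(label, cosine_similarity(query_vec, vec)) for label, vec in stored.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy "embeddings": the first two point in nearly the same direction as the query.
toy_store = {
    "geothermal": [0.9, 0.1, 0.0],
    "hydro": [0.8, 0.3, 0.1],
    "history": [0.0, 0.2, 0.9],
}
toy_query = [1.0, 0.2, 0.0]

best = top_k(toy_query, toy_store, k=2)
print([label for label, _ in best])
```

Real embeddings from `all-mpnet-base-v2` have far more dimensions, but the ranking idea is the same. A chunk can also score well simply because it shares general vocabulary with the query, which is one reason a broad conclusion paragraph can outrank a more specific passage.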

### RAG Pipeline

In [31]:
# Join the retrieved documents' text into a single context string
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

In [32]:
# Set Chroma as the Retriever
retriever = vector_store.as_retriever()

In [25]:
# Initialize the LLM instance
llm = ChatGroq(model="llama-3.1-8b-instant", api_key=api_key)

In [33]:
# Create the Prompt Template
prompt_template = """Use the context provided to answer the user's question below. If you do not know the answer based on the context provided, tell the user that you do not know the answer to their question based on the context provided and that you are sorry.
context: {context}

question: {query}

answer: """

# Create Prompt Instance from template
custom_rag_prompt = PromptTemplate.from_template(prompt_template)

In [34]:
# Create the RAG Chain
rag_chain = (
    {"context": retriever | format_docs, "query": RunnablePassthrough()}
    | custom_rag_prompt
    | llm
    | StrOutputParser()
)
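
The `|` operator pipes each step's output into the next. The leading dict sends the same input to both of its branches: the retriever branch produces `context`, while `RunnablePassthrough()` forwards the question unchanged as `query`. A plain-Python sketch of that data flow follows, with a stub retriever and a stub model standing in for Chroma and Groq. All function names here are hypothetical and are not LangChain APIs.

```python
def fake_retriever(question):
    # Stand-in for the vector store retriever: always returns the same two chunks.
    return ["Hydropower provides steady output.", "Geothermal plants run continuously."]


def join_chunks(chunks):
    # Same role as format_docs: one context string, chunks separated by blank lines.
    return "\n\n".join(chunks)


def fill_prompt(inputs):
    # Same role as the prompt template: substitute context and query.
    return "context: {context}\n\nquestion: {query}\n\nanswer: ".format(**inputs)


def fake_llm(prompt_text):
    # Stand-in for the chat model: reports what it received instead of answering.
    return f"[model saw {len(prompt_text)} characters]"


def run_chain(question):
    inputs = {
        "context": join_chunks(fake_retriever(question)),  # retriever | format_docs
        "query": question,  # RunnablePassthrough()
    }
    return fake_llm(fill_prompt(inputs))


prompt_preview = fill_prompt({"context": "C", "query": "Q"})
print(run_chain("Which sources are reliable?"))
```

Seen this way, the chain is ordinary function composition: the question goes in, retrieved text is merged into the prompt, and the model's reply comes out as a string.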

In [35]:
# Query the RAG Chain
rag_chain.invoke(query)

'Based on the provided context, hydroelectric energy is mentioned as a renewable energy source that can provide continuous and reliable electricity. \n\nAdditionally, it can be inferred that geothermal energy also has the potential to provide continuous and reliable electricity, as it is mentioned as one of the renewable energy technologies that contribute unique advantages.'

In [36]:
# Ask a question the context cannot answer to trigger the "I don't know" fallback
rag_chain.invoke("What is the purpose of life?")

"I don't know the answer to your question based on the context provided and I'm sorry. The provided context discusses emerging trends and challenges in the field of renewable energy, but it does not address philosophical or existential questions like the purpose of life."