### Initialize LLM Model

In [17]:
import os
from dotenv import load_dotenv

load_dotenv()

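# load_dotenv() has already copied the keys from .env into os.environ.
# Fail early with a clear message instead of a TypeError if one is missing.
for key in ("GROQ_API_KEY", "GOOGLE_API_KEY"):
    if not os.getenv(key):
        raise RuntimeError(f"{key} is not set; add it to your .env file")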

In [2]:
from langchain_groq import ChatGroq

llm = ChatGroq(model="llama-3.3-70b-versatile")

llm

ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x000001B5B5599010>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x000001B5B56D8B50>, model_name='llama-3.3-70b-versatile', model_kwargs={}, groq_api_key=SecretStr('**********'))

In [4]:
response = llm.invoke("Explain Quantum Material Science in very simple terms?")

print(response.content)

Quantum Material Science is a branch of physics that studies the behavior of materials at a very small scale, where the rules of classical physics don't apply. Let me break it down in simple terms:

**What are materials?**
Materials are the things that make up everything around us, like metals, plastics, and even the air we breathe. They're made up of tiny particles called atoms and molecules.

**What's quantum?**
Quantum refers to the strange and fascinating world of tiny things, like atoms and particles that are too small to see. At this scale, the rules of classical physics (like Newton's laws) don't work anymore. Instead, strange and cool things start to happen, like:

* Particles can be in two places at once (called superposition)
* Particles can be connected and affect each other even if they're really far apart (called entanglement)
* Particles can tunnel through walls (called quantum tunneling)

**What's Quantum Material Science?**
Quantum Material Science is the study of how m

In [6]:
from langchain_community.document_loaders import TextLoader

loader = TextLoader("data/be-good.txt")

documents = loader.load()

documents

[Document(metadata={'source': 'data/be-good.txt'}, page_content='Be good\n\nApril 2008(This essay is derived from a talk at the 2008 Startup School.)About a month after we started Y Combinator we came up with the\nphrase that became our motto: Make something people want.  We\'ve\nlearned a lot since then, but if I were choosing now that\'s still\nthe one I\'d pick.Another thing we tell founders is not to worry too much about the\nbusiness model, at least at first.  Not because making money is\nunimportant, but because it\'s so much easier than building something\ngreat.A couple weeks ago I realized that if you put those two ideas\ntogether, you get something surprising.  Make something people want.\nDon\'t worry too much about making money.  What you\'ve got is a\ndescription of a charity.When you get an unexpected result like this, it could either be a\nbug or a new discovery.  Either businesses aren\'t supposed to be\nlike charities, and we\'ve proven by reductio ad absurdum that one

In [7]:
documents[0].page_content

'Be good\n\nApril 2008(This essay is derived from a talk at the 2008 Startup School.)About a month after we started Y Combinator we came up with the\nphrase that became our motto: Make something people want.  We\'ve\nlearned a lot since then, but if I were choosing now that\'s still\nthe one I\'d pick.Another thing we tell founders is not to worry too much about the\nbusiness model, at least at first.  Not because making money is\nunimportant, but because it\'s so much easier than building something\ngreat.A couple weeks ago I realized that if you put those two ideas\ntogether, you get something surprising.  Make something people want.\nDon\'t worry too much about making money.  What you\'ve got is a\ndescription of a charity.When you get an unexpected result like this, it could either be a\nbug or a new discovery.  Either businesses aren\'t supposed to be\nlike charities, and we\'ve proven by reductio ad absurdum that one\nor both of the principles we began with is false.  Or we have 

In [10]:
docs = documents[0].page_content

docs

'Be good\n\nApril 2008(This essay is derived from a talk at the 2008 Startup School.)About a month after we started Y Combinator we came up with the\nphrase that became our motto: Make something people want.  We\'ve\nlearned a lot since then, but if I were choosing now that\'s still\nthe one I\'d pick.Another thing we tell founders is not to worry too much about the\nbusiness model, at least at first.  Not because making money is\nunimportant, but because it\'s so much easier than building something\ngreat.A couple weeks ago I realized that if you put those two ideas\ntogether, you get something surprising.  Make something people want.\nDon\'t worry too much about making money.  What you\'ve got is a\ndescription of a charity.When you get an unexpected result like this, it could either be a\nbug or a new discovery.  Either businesses aren\'t supposed to be\nlike charities, and we\'ve proven by reductio ad absurdum that one\nor both of the principles we began with is false.  Or we have 

### CharacterTextSplitter

In [8]:
from langchain_text_splitters import CharacterTextSplitter

# Text splitter
text_splitter = CharacterTextSplitter(
    separator="\n\n",
    chunk_size=1000,
    chunk_overlap=200,
    length_function=len,
    is_separator_regex=False,
)

In [13]:
texts = text_splitter.create_documents([docs])

texts

[Document(metadata={}, page_content='Be good'),
 Document(metadata={}, page_content='April 2008(This essay is derived from a talk at the 2008 Startup School.)About a month after we started Y Combinator we came up with the\nphrase that became our motto: Make something people want.  We\'ve\nlearned a lot since then, but if I were choosing now that\'s still\nthe one I\'d pick.Another thing we tell founders is not to worry too much about the\nbusiness model, at least at first.  Not because making money is\nunimportant, but because it\'s so much easier than building something\ngreat.A couple weeks ago I realized that if you put those two ideas\ntogether, you get something surprising.  Make something people want.\nDon\'t worry too much about making money.  What you\'ve got is a\ndescription of a charity.When you get an unexpected result like this, it could either be a\nbug or a new discovery.  Either businesses aren\'t supposed to be\nlike charities, and we\'ve proven by reductio ad absurdum

In [14]:
len(texts)

2
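
Only two chunks come back, and the second is far longer than `chunk_size=1000`, because separator-based splitting cuts the text only at the separator and then merges neighbouring pieces while they still fit. The loaded text appears to contain just one `\n\n`, so there is only one place to cut. The sketch below is a simplified, pure-Python illustration of that idea. `split_on_separator` is a hypothetical helper, it ignores `chunk_overlap`, and it is not the library's implementation.

```python
def split_on_separator(text, separator="\n\n", chunk_size=1000):
    """Cut text at the separator, then greedily merge pieces up to chunk_size."""
    chunks, current = [], ""
    for piece in text.split(separator):
        if not piece.strip():
            continue  # skip empty pieces
        candidate = current + separator + piece if current else piece
        if current and len(candidate) > chunk_size:
            # Merging would overflow: close the current chunk, start a new one.
            chunks.append(current)
            current = piece
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# A short title followed by one long paragraph with no further "\n\n".
sample = "Be good\n\n" + "x" * 1500
chunks = split_on_separator(sample)
print([len(c) for c in chunks])  # [7, 1500]
```

A piece that has no separator inside it cannot be cut further, so it stays oversized. This is the case the recursive splitter in the next section addresses.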

### RecursiveCharacterTextSplitter

In [15]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Recursive text splitter
recursive_splitter = RecursiveCharacterTextSplitter(
    chunk_size=26,
    chunk_overlap=4,
)

In [16]:
recursive_text = recursive_splitter.split_text(docs)

recursive_text

['Be good',
 'April 2008(This essay is',
 'is derived from a talk at',
 'at the 2008 Startup',
 'School.)About a month',
 'after we started Y',
 'Y Combinator we came up',
 'up with the',
 'phrase that became our',
 'our motto: Make something',
 "people want.  We've",
 'learned a lot since then,',
 'but if I were choosing',
 "now that's still",
 "the one I'd pick.Another",
 'thing we tell founders is',
 'is not to worry too much',
 'about the',
 'business model, at least',
 'at first.  Not because',
 'making money is',
 'unimportant, but because',
 "it's so much easier than",
 'building something',
 'great.A couple weeks ago',
 'ago I realized that if',
 'if you put those two',
 'two ideas',
 'together, you get',
 'get something surprising.',
 'Make something people',
 'want.',
 "Don't worry too much",
 'about making money.  What',
 "you've got is a",
 'description of a',
 'a charity.When you get an',
 'an unexpected result like',
 'this, it could either be',
 'be a',
 'bug or a new di
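
The output above stays within 26 characters per chunk because the recursive splitter falls back to finer separators whenever a piece is still too long. Its default separator list is `["\n\n", "\n", " ", ""]`. The following is a minimal sketch of that fallback idea in plain Python. `recursive_split` is a hypothetical helper, and it omits `chunk_overlap`, so its chunks do not repeat trailing words the way the real output does.

```python
def recursive_split(text, chunk_size=26, separators=("\n\n", "\n", " ", "")):
    """Split on the coarsest separator first; recurse on pieces that are too long."""
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators or separators[0] == "":
        # Last resort: hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = current + sep + piece if current else piece
        if len(candidate) <= chunk_size:
            current = candidate  # still fits, keep merging
            continue
        if current:
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: retry with a finer separator.
            chunks.extend(recursive_split(piece, chunk_size, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks


sample = ("Be good\n\nApril 2008(This essay is derived from a talk "
          "at the 2008 Startup School.)")
for chunk in recursive_split(sample):
    print(repr(chunk))
```

Every chunk the sketch returns is at most `chunk_size` characters long, and word boundaries are preserved whenever a space is available to split on.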

### Embeddings:
- Embeddings transform small chunks of text into lists of numbers (vectors) that vector databases can store and search efficiently.
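
To make "search" concrete: once texts are vectors, closeness in meaning can be estimated with a numeric comparison. The sketch below uses cosine similarity, which is one common choice; the metric a particular vector store uses may differ. The three-dimensional vectors are made-up toy values for illustration, not output from a real embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors (assumed values, purely illustrative).
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.0, 0.1, 0.9]

print(cosine_similarity(cat, kitten))  # close to 1: similar direction
print(cosine_similarity(cat, car))     # close to 0: nearly orthogonal
```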

In [20]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

embeddings

GoogleGenerativeAIEmbeddings(client=<google.ai.generativelanguage_v1beta.services.generative_service.client.GenerativeServiceClient object at 0x000001B5B8290690>, async_client=<google.ai.generativelanguage_v1beta.services.generative_service.async_client.GenerativeServiceAsyncClient object at 0x000001B5B9808F50>, model='models/embedding-001', task_type=None, google_api_key=SecretStr('**********'), credentials=None, client_options=None, transport=None, request_options=None)

In [21]:
embeddings.embed_query("Hello world!")

[0.058863308280706406,
 0.003392943413928151,
 -0.0728108361363411,
 -0.022689837962388992,
 0.05758946016430855,
 0.021888138726353645,
 0.005913455504924059,
 -0.02974560670554638,
 0.0070821382105350494,
 0.03830326721072197,
 -0.02002689428627491,
 0.03193406015634537,
 -0.030567705631256104,
 0.0023819657508283854,
 -0.006789948791265488,
 -0.03710676357150078,
 0.0201618243008852,
 0.012596654705703259,
 0.02667921595275402,
 0.004417411983013153,
 0.005690323654562235,
 0.016588488593697548,
 -0.02287888526916504,
 -0.011648036539554596,
 0.03213648498058319,
 0.0006601402419619262,
 0.012787614949047565,
 -0.04911084100604057,
 -0.0149785615503788,
 0.0074206204153597355,
 -0.03921062871813774,
 0.0013768513454124331,
 -0.027875307947397232,
 0.034823205322027206,
 0.0272233746945858,
 -0.05635960027575493,
 0.00749538978561759,
 0.012933867052197456,
 -0.007740380242466927,
 -0.013834776356816292,
 0.015063080005347729,
 -0.07006464153528214,
 -0.04136952757835388,
 0.01399935

In [26]:
print(embeddings.embed_query("Hello world!")[:5])

[0.058863308280706406, 0.003392943413928151, -0.0728108361363411, -0.022689837962388992, 0.05758946016430855]


In [24]:
len(embeddings.embed_query("Hello world!"))

768

### Vector Store (aka Vector Database)

- Vector stores are databases that hold the embeddings of document chunks and index them so that the chunks most similar to a query can be found quickly.
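
As a rough mental model, a vector store keeps `(text, vector)` pairs and ranks them against the embedded query. The sketch below is a toy, in-memory version in plain Python. `ToyVectorStore` and `toy_embed` are hypothetical names, the word-count "embedding" is a stand-in for a real model, and real stores use indexes rather than a full scan.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    def __init__(self):
        self._items = []  # list of (text, vector) pairs

    def add_texts(self, texts):
        for text in texts:
            self._items.append((text, toy_embed(text)))

    def similarity_search(self, query, k=4):
        """Return the k stored texts most similar to the query."""
        q = toy_embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]),
                        reverse=True)
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore()
store.add_texts([
    "cats purr when they are happy",
    "dogs bark at strangers",
    "the stock market fell today",
])
print(store.similarity_search("why do cats purr", k=1))
```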

In [27]:
from langchain_community.document_loaders import TextLoader
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter
from langchain_chroma import Chroma

In [28]:
## Load the document, split it into chunks, embed each chunk and load it into the vector store.
loaded_document = TextLoader('data/state_of_the_union.txt').load()

loaded_document

[Document(metadata={'source': 'data/state_of_the_union.txt'}, page_content='Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans.  \n\nLast year COVID-19 kept us apart. This year we are finally together again. \n\nTonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. \n\nWith a duty to one another to the American people to the Constitution. \n\nAnd with an unwavering resolve that freedom will always triumph over tyranny. \n\nSix days ago, Russia’s Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. \n\nHe thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. \n\nHe met the Ukrainian people. \n\nFrom President Zelenskyy to every Ukrainian, their fearlessness, their courage, their det

In [29]:
# Split the document into chunks.
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)

splitted_documents = text_splitter.split_documents(loaded_document)

splitted_documents

[Document(metadata={'source': 'data/state_of_the_union.txt'}, page_content='Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans.  \n\nLast year COVID-19 kept us apart. This year we are finally together again. \n\nTonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. \n\nWith a duty to one another to the American people to the Constitution. \n\nAnd with an unwavering resolve that freedom will always triumph over tyranny. \n\nSix days ago, Russia’s Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. \n\nHe thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. \n\nHe met the Ukrainian people. \n\nFrom President Zelenskyy to every Ukrainian, their fearlessness, their courage, their det

In [30]:
len(splitted_documents)

42

In [31]:
# Create the embedding model; the chunks are embedded when the vector store is built.
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

In [32]:
# Create a vector store.
vector_store = Chroma.from_documents(splitted_documents, embeddings)

In [33]:
vector_store

<langchain_chroma.vectorstores.Chroma at 0x1b5c0304bd0>

In [34]:
question = "What did the president say about the John Lewis Voting Rights Act?"

response = vector_store.similarity_search(question)

print(response[0].page_content)

And I will keep doing everything in my power to crack down on gun trafficking and ghost guns you can buy online and make at home—they have no serial numbers and can’t be traced. 

And I ask Congress to pass proven measures to reduce gun violence. Pass universal background checks. Why should anyone on a terrorist list be able to purchase a weapon? 

Ban assault weapons and high-capacity magazines. 

Repeal the liability shield that makes gun manufacturers the only industry in America that can’t be sued. 

These laws don’t infringe on the Second Amendment. They save lives. 

The most fundamental right in America is the right to vote – and to have it counted. And it’s under assault. 

In state after state, new laws have been passed, not only to suppress the vote, but to subvert entire elections. 

We cannot let this happen.


### Vector Store as Retriever
- A retriever wraps the vector store and returns the `k` chunks whose embeddings are most similar to the query.
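
Conceptually, a retriever is a thin wrapper that fixes the search settings once and exposes a single `invoke(query)` call. The sketch below illustrates that shape in plain Python. `ToyRetriever` and `keyword_search` are hypothetical, and the word-overlap ranking is a stand-in for embedding similarity. With `k=1` only the top-ranked chunk is returned, so raising `k` is a simple way to inspect more candidates.

```python
class ToyRetriever:
    """Binds a search function to fixed settings, like search_kwargs={"k": 1}."""

    def __init__(self, search_fn, k=4):
        self._search_fn = search_fn
        self.k = k

    def invoke(self, query):
        return self._search_fn(query, k=self.k)


# Toy corpus (made-up sentences for illustration).
CORPUS = [
    "how to split text into chunks",
    "how to embed text as vectors",
    "how to store vectors in a database",
]


def keyword_search(query, k=4):
    """Rank documents by the number of words shared with the query."""
    words = set(query.lower().split())
    ranked = sorted(CORPUS,
                    key=lambda doc: len(words & set(doc.lower().split())),
                    reverse=True)
    return ranked[:k]


retriever = ToyRetriever(keyword_search, k=1)
print(retriever.invoke("store vectors"))
```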

In [35]:
retriever = vector_store.as_retriever(search_kwargs={"k": 1})

retriever

VectorStoreRetriever(tags=['Chroma', 'GoogleGenerativeAIEmbeddings'], vectorstore=<langchain_chroma.vectorstores.Chroma object at 0x000001B5C0304BD0>, search_kwargs={'k': 1})

In [None]:
response = retriever.invoke("what did he say about ketanji brown jackson?")

response

[Document(id='2bf37c5b-bd2e-416e-9289-0eb3d9ec3413', metadata={'source': 'data/state_of_the_union.txt'}, page_content='And my report is this: the State of the Union is strong—because you, the American people, are strong. \n\nWe are stronger today than we were a year ago. \n\nAnd we will be stronger a year from now than we are today. \n\nNow is our moment to meet and overcome the challenges of our time. \n\nAnd we will, as one people. \n\nOne America. \n\nThe United States of America. \n\nMay God bless you all. May God protect our troops.')]

In [38]:
print(response[0].page_content)

And my report is this: the State of the Union is strong—because you, the American people, are strong. 

We are stronger today than we were a year ago. 

And we will be stronger a year from now than we are today. 

Now is our moment to meet and overcome the challenges of our time. 

And we will, as one people. 

One America. 

The United States of America. 

May God bless you all. May God protect our troops.
