In [14]:

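from dotenv import load_dotenv
from langchain_groq import ChatGroq

# Load GROQ_API_KEY (read by ChatGroq) from a local .env file.
load_dotenv()
llm_model = ChatGroq(model="llama-3.3-70b-versatile")
res = llm_model.invoke("Explain why the sky is blue.")
# Printing the message object shows its full repr; use res.content for the text only.
print(res)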

content="The sky appears blue because of a phenomenon called Rayleigh scattering, which is the scattering of light by small particles or molecules in the atmosphere.\n\nHere's what happens:\n\n1. **Sunlight enters the Earth's atmosphere**: When the sun shines, it emits a broad spectrum of light, including all the colors of the visible spectrum (red, orange, yellow, green, blue, indigo, and violet).\n2. **Light encounters tiny molecules**: As sunlight travels through the atmosphere, it encounters tiny molecules of gases such as nitrogen (N2) and oxygen (O2). These molecules are much smaller than the wavelength of light.\n3. **Scattering occurs**: When light hits these small molecules, it is scattered in all directions. The amount of scattering that occurs depends on the wavelength of the light. Shorter wavelengths (like blue and violet) are scattered more than longer wavelengths (like red and orange).\n4. **Blue light is scattered the most**: Because of its shorter wavelength, blue ligh

Streaming

In [15]:
# Print each chunk as it arrives instead of waiting for the full reply.
for chunk in llm_model.stream("Why is the sky blue?"):
    print(chunk.text, end='', flush=True)

The sky appears blue because of a phenomenon called Rayleigh scattering, which is the scattering of light by small particles or molecules in the atmosphere. Here's a simplified explanation:

1. **Sunlight enters the Earth's atmosphere**: When sunlight enters the Earth's atmosphere, it consists of a broad spectrum of colors, including all the colors of the visible spectrum (red, orange, yellow, green, blue, indigo, and violet).
2. **Light encounters tiny molecules**: The sunlight encounters tiny molecules of gases such as nitrogen (N2) and oxygen (O2) in the atmosphere. These molecules are much smaller than the wavelength of light.
3. **Scattering occurs**: When light hits these tiny molecules, it scatters in all directions. The amount of scattering that occurs depends on the wavelength of the light. Shorter wavelengths (like blue and violet) are scattered more than longer wavelengths (like red and orange).
4. **Blue light is scattered more**: Because blue light has a shorter wavelength
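
The streaming loop above can be tried without a model. The sketch below is only an illustration: `fake_stream` is a hypothetical stand-in that yields a fixed reply in small pieces, and the piece size of 8 characters is an arbitrary assumption.

```python
from typing import Iterator


def fake_stream(reply: str, size: int = 8) -> Iterator[str]:
    """Hypothetical stand-in for a model stream: yield the reply in small pieces."""
    for start in range(0, len(reply), size):
        yield reply[start:start + size]


collected = []
for piece in fake_stream("The sky appears blue because of Rayleigh scattering."):
    print(piece, end="", flush=True)  # show each piece as soon as it arrives
    collected.append(piece)
print()

# Joining the pieces gives back the complete reply.
full_text = "".join(collected)
```

Collecting the pieces while printing them is a common pattern when the full text is needed afterwards, for example for logging.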

In [16]:
messages = [
    ("system", "You are a helpful assistant that answers any question."),
    ("human", "Why is the sky blue?"),
]
res = llm_model.invoke(messages)
# res.content holds only the reply text, without the metadata.
print(res.content)

The sky appears blue because of a phenomenon called Rayleigh scattering, which is the scattering of light by small particles or molecules in the atmosphere.

Here's a simplified explanation:

1. **Sunlight enters the Earth's atmosphere**: When sunlight enters the Earth's atmosphere, it contains all the colors of the visible spectrum, including red, orange, yellow, green, blue, indigo, and violet.
2. **Light encounters tiny molecules**: The sunlight encounters tiny molecules of gases such as nitrogen (N2) and oxygen (O2) in the atmosphere. These molecules are much smaller than the wavelength of light.
3. **Scattering occurs**: When light hits these small molecules, it scatters in all directions. The amount of scattering that occurs depends on the wavelength of the light.
4. **Blue light scatters more**: It turns out that shorter (blue) wavelengths are scattered more than longer (red) wavelengths. This is known as Rayleigh scattering, named after the British physicist Lord Rayleigh, who 

Prompt Templates

In [17]:
from langchain_core.prompts import PromptTemplate

prompt_template = PromptTemplate.from_template("Tell me a {storyname} story about topic {topic}")
# The template already supplies the article "a", so storyname should not repeat it.
llm_prompt = prompt_template.format(storyname="little girl", topic="horror")
res = llm_model.invoke(llm_prompt)
print(res)

content='Once upon a time, in a small, spooky town, there lived a little girl named Lily. She was a curious and brave girl, with a heart full of wonder and a mind full of questions. One dark and stormy night, Lily decided to explore the abandoned mansion on the hill, a place that was rumored to be haunted by ghosts and monsters.\n\nAs she walked up the creaky steps and pushed open the creaky door, a chill ran down her spine. The wind howled and the trees creaked, making it sound like someone was whispering her name. Lily stepped inside, her flashlight casting eerie shadows on the walls.\n\nSuddenly, she heard a strange noise, like footsteps coming from the floor above. Lily\'s heart started racing, but she didn\'t want to be scared. She took a deep breath and began to climb the stairs, her eyes fixed on the door at the top.\n\nAs she reached the top, she saw a figure standing in the doorway. It was a ghostly girl, with eyes that glowed like embers and hair that flowed like the wind. Li
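
What formatting a template amounts to can be sketched in plain Python. This is an illustration built on `str.format`, not LangChain's implementation, and `fill_template` is a hypothetical helper name.

```python
from string import Formatter


def fill_template(template: str, **values: str) -> str:
    """Fill {placeholders} in template, failing loudly if one is missing."""
    # Collect the placeholder names that appear in the template.
    needed = {name for _, name, _, _ in Formatter().parse(template) if name}
    missing = needed - set(values)
    if missing:
        raise KeyError(f"missing template variables: {sorted(missing)}")
    return template.format(**values)


prompt_text = fill_template(
    "Tell me a {storyname} story about topic {topic}",
    storyname="little girl",
    topic="horror",
)
print(prompt_text)
```

Printing the filled prompt before sending it to the model is a cheap way to catch wording problems such as a doubled article.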

In [18]:
# chat prompt template
from langchain_core.prompts import ChatPromptTemplate

chat_template = ChatPromptTemplate.from_messages(
    [
        ("system","You are a {profession} expert on {topic}"),
        ("human", "Hi {profession} expert, can you please answer a question?"),
        ("ai","sure"),
        ("human","{userinput}")
    ]
    )

messages = chat_template.format_messages(
    profession="Historian",
    topic="the Kennedy family",
    userinput="How many grandchildren does Kennedy Jr. have?"
)

res = llm_model.invoke(messages)
print(res)

content="You're referring to John F. Kennedy Jr., the son of President John F. Kennedy. Unfortunately, John F. Kennedy Jr. passed away in a plane crash on July 18, 1999, at the age of 38. He was married to Carolyn Bessette-Kennedy, but they did not have any children together. Therefore, John F. Kennedy Jr. did not have any grandchildren." additional_kwargs={} response_metadata={'token_usage': {'completion_tokens': 85, 'prompt_tokens': 75, 'total_tokens': 160, 'completion_time': 0.216285512, 'prompt_time': 0.003620448, 'queue_time': 0.054197381, 'total_time': 0.21990596}, 'model_name': 'llama-3.3-70b-versatile', 'system_fingerprint': 'fp_fb4860a75b', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None, 'model_provider': 'groq'} id='lc_run--853ec626-4f20-4181-ab1a-878e0dfb5fe6-0' usage_metadata={'input_tokens': 75, 'output_tokens': 85, 'total_tokens': 160}


Few-shot prompting

In [20]:
from langchain_core.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate

example = [
    {"input": "hi", "output": "hola"},
    {"input": "bye", "output": "adios"},
]

example_prompt = ChatPromptTemplate.from_messages(
    [
        ("human","{input}"),
        ("ai","{output}")
    ]
)

few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=example
)

# System message first, then the worked examples, then the real question.
final_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a professional English-to-Spanish translator"),
        few_shot_prompt,
        ("human", "{input}")
    ]
)
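
The message list that a few-shot prompt like this one expands into can be sketched without the library. The helper name `build_few_shot_messages` is hypothetical, and the (role, text) tuples only mirror the shape used in the cells above.

```python
examples = [
    {"input": "hi", "output": "hola"},
    {"input": "bye", "output": "adios"},
]


def build_few_shot_messages(
    system: str, examples: list[dict], user_input: str
) -> list[tuple[str, str]]:
    """Return (role, text) pairs: system, one human/ai pair per example, then the question."""
    messages = [("system", system)]
    for ex in examples:
        messages.append(("human", ex["input"]))
        messages.append(("ai", ex["output"]))
    messages.append(("human", user_input))
    return messages


few_shot_messages = build_few_shot_messages(
    "You are a professional English-to-Spanish translator", examples, "How are you?"
)
for role, text in few_shot_messages:
    print(f"{role}: {text}")
```

The examples appear to the model as earlier turns of the conversation, which is why the final human message tends to be answered in the same style.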



Chains

In [22]:
# The | operator pipes the formatted prompt into the model.
chain = final_prompt | llm_model
res = chain.invoke({"input": "How are you?"})
print(res.content)

Estoy bien, gracias. ¿Y tú? (I'm fine, thank you. And you?)
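
How a `|` pipeline can be composed is easy to sketch with Python's `__or__` hook. The `Step` class below is a toy written for illustration under that assumption, not LangChain's runnable implementation, and `fake_prompt` and `fake_model` are hypothetical stand-ins.

```python
class Step:
    """Wrap a one-argument function so steps can be chained with `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # Run this step first, then feed its result to the next one.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical stand-ins for a prompt and a model.
fake_prompt = Step(lambda inputs: f"Translate to Spanish: {inputs['input']}")
fake_model = Step(lambda prompt_text: f"[model reply to: {prompt_text}]")

toy_chain = fake_prompt | fake_model
print(toy_chain.invoke({"input": "How are you?"}))
```

Because the composed object is itself a `Step`, further steps such as an output parser can be appended with another `|`.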


Output parsers

In [23]:
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import SimpleJsonOutputParser

json_prompt = PromptTemplate.from_template(
    "Return a JSON object with an 'answer' key for the user question: {question}"
)

json_parser = SimpleJsonOutputParser()

# The parser turns the model's JSON text into a Python dict.
chain = json_prompt | llm_model | json_parser

res = chain.invoke({"question": "What is the biggest country?"})
print(res)

{'answer': 'Russia'}
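
The job a JSON output parser has to do can be sketched with the standard library. This is a rough illustration, not the `SimpleJsonOutputParser` implementation; `parse_json_reply` is a hypothetical helper, and the assumption is that a model may wrap its JSON in a fenced block.

```python
import json
import re


def parse_json_reply(text: str) -> dict:
    """Parse a JSON object from model text, tolerating a surrounding code fence."""
    # If the reply is wrapped in a fenced block (optionally tagged json), keep only its body.
    fenced = re.search(r"`{3}(?:json)?\s*(.*?)`{3}", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    return json.loads(text.strip())


fence = "`" * 3
print(parse_json_reply('{"answer": "Russia"}'))
print(parse_json_reply(f'{fence}json\n{{"answer": "Russia"}}\n{fence}'))
```

Text that is not valid JSON raises `json.JSONDecodeError`, which a caller may want to catch and retry.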


Parser using Pydantic

In [24]:
from pydantic import BaseModel, Field
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate

class Joke(BaseModel):
    question: str = Field(description="question to set up the joke")
    answer: str = Field(description="answer to resolve the joke")

# The Pydantic model supplies the schema for the format instructions.
parser = JsonOutputParser(pydantic_object=Joke)

prompt = PromptTemplate(
    template="Answer the user query.\n{query}\n{format_instructions}",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()}
)

chain = prompt | llm_model | parser
res = chain.invoke({"query": "tell me a joke"})
print(res)

{'question': 'Why was the math book sad?', 'answer': 'Because it had too many problems'}
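
The printed result above is a plain dict. If a typed object is wanted, the dict can be checked before use. The sketch below uses a standard-library dataclass as a stand-in for the Pydantic model so that it runs without extra packages; `JokeRecord` and `to_joke` are hypothetical names.

```python
from dataclasses import dataclass, fields


@dataclass
class JokeRecord:
    question: str
    answer: str


def to_joke(parsed: dict) -> JokeRecord:
    """Check that parsed has exactly the expected string fields, then build a JokeRecord."""
    expected = {f.name for f in fields(JokeRecord)}
    if set(parsed) != expected:
        raise ValueError(f"expected keys {sorted(expected)}, got {sorted(parsed)}")
    for key, value in parsed.items():
        if not isinstance(value, str):
            raise TypeError(f"{key!r} must be a string")
    return JokeRecord(**parsed)


joke = to_joke(
    {"question": "Why was the math book sad?", "answer": "Because it had too many problems"}
)
print(joke.answer)
```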


Document Loaders

In [26]:
from langchain_community.document_loaders import TextLoader

loader = TextLoader("files/test.txt")
loaded_data = loader.load()
print(loaded_data)


[Document(metadata={'source': 'files/test.txt'}, page_content='2025-04-04T09:53:48-04:00 (Debug) /cgi-bin/chewy.cfg/scripts/custom/cron/splunk_logger.php@2 4104AA2C-F56B-02F5-7DF6-DF93E6EA009B:\nLogging Initialized for /cgi-bin/chewy.cfg/scripts/custom/cron/splunk_logger.php  \n\n2025-04-04T09:53:48-04:00 (Debug) /cgi-bin/chewy.cfg/scripts/custom/cron/splunk_logger.php@2 4104AA2C-F56B-02F5-7DF6-DF93E6EA009B:\nJob configuration settings \nSERVER_NAME=chewy--tst1.custhelp.com, (1000012)=https://hec-stg.chewy.net/services/collector/event, (1000013)=18f1b8e2-97b1-4423-b238-c035217d4e10, (1000014)=oraclecrm:dev, (1000015)=oracrm, (1000077)=105689, (1000075)=0, (1000076)=2000, (1000074)=55 \n\n2025-04-04T09:53:48-04:00 (Debug) /cgi-bin/chewy.cfg/scripts/custom/cron/splunk_logger.php@2 4104AA2C-F56B-02F5-7DF6-DF93E6EA009B:\nStarting triggerJob@splunk_logger_model (Line: 2)  \n\n2025-04-04T09:53:48-04:00 (Debug) /cgi-bin/chewy.cfg/scripts/custom/cron/splunk_logger.php@2 4104AA2C-F56B-02F5-7DF6

In [30]:
from langchain_community.document_loaders import PyPDFLoader

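pdf_loader = PyPDFLoader("files/test.pdf")
# load_and_split() loads the pages and then runs a text splitter over them
# (a RecursiveCharacterTextSplitter when none is passed).
data = pdf_loader.load_and_split()
print(data)
print(data[0].page_content)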

[Document(metadata={'producer': 'PyPDF', 'creator': 'Microsoft Word', 'creationdate': '2022-02-15T19:32:13+00:00', 'author': 'Katie McCray', 'moddate': '2022-02-15T19:32:13+00:00', 'source': 'files/test.pdf', 'total_pages': 3, 'page': 0, 'page_label': '1'}, page_content='Getting Started with Data Modules Workshop \nThis document provides the preparations to get the lab environment set up and to access the \nfiles needed to proceed.  \n \n1.1 Log into the Classroom Trial Environment of Cognos Analytics on Cloud \n \nThe instructor has invited you to use an IBM Cloud “On Demand” instance of Cognos \nAnalytics. This is a real cloud instance of Cognos Analytics, the same that you can purchase \nfor your company. The other students will also be accessing this same instance as if you \neach worked for the same company. \n \nFirst, Get your student ID and Password: \n \n1. Check the private chat area of the web session. \n2. Find the student ID and Password that was issued to you by the instr

In [34]:
from langchain_community.document_loaders import WikipediaLoader

wiki = WikipediaLoader(query="Tesla", load_max_docs=1)
data = wiki.load()
print(data)

[Document(metadata={'title': 'Tesla Cybertruck', 'summary': 'The Tesla Cybertruck is a battery-electric full-size pickup truck manufactured by Tesla, Inc. since 2023. It was first unveiled as a prototype in November 2019, featuring a distinctive angular design composed of flat, unpainted stainless steel body panels, drawing comparisons to low-polygon computer models.\nOriginally scheduled for production in late 2021, the vehicle faced multiple delays before entering limited production at Gigafactory Texas in November 2023, with initial customer deliveries occurring later that month. As of 2025, three variants are available: a tri-motor all-wheel drive (AWD) model marketed as the "Cyberbeast", a dual-motor AWD model, and a single-motor rear-wheel drive (RWD) "Long Range" model. EPA range estimates vary by configuration, from 320 to 350 miles (515 to 565 km). \nAs of 2025, the Cybertruck is sold in the United States, Mexico, Canada and South Korea. The Cybertruck has been criticized for 

In [40]:
from langchain_core.prompts import ChatPromptTemplate

chat_prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "Answer the question using the given context.\nQuestion: {question}\nContext: {context}")
    ]
)

messages = chat_prompt.format_messages(
    question="what is Tesla?",
    context=data
)

res = llm_model.invoke(messages)
print(res)

content='In the context of the provided text, Tesla refers to Tesla, Inc., an American electric vehicle and clean energy company founded by Elon Musk. Specifically, it is the manufacturer of the Tesla Cybertruck, a battery-electric full-size pickup truck that is the main subject of the document.' additional_kwargs={} response_metadata={'token_usage': {'completion_tokens': 58, 'prompt_tokens': 1227, 'total_tokens': 1285, 'completion_time': 0.166485451, 'prompt_time': 0.0624144, 'queue_time': 0.058286379, 'total_time': 0.228899851}, 'model_name': 'llama-3.3-70b-versatile', 'system_fingerprint': 'fp_fb4860a75b', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None, 'model_provider': 'groq'} id='lc_run--a3aa803f-a8ef-4fb7-8df1-e59d7a73c578-0' usage_metadata={'input_tokens': 1227, 'output_tokens': 58, 'total_tokens': 1285}
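
In the cell above, `context=data` puts the string form of the whole document list, metadata included, into the prompt. Passing only the page text usually keeps the prompt shorter. A minimal sketch of that idea follows; `Doc` is a hypothetical stand-in for a loaded document, `build_context` is a hypothetical helper, and the 4000-character cap is an arbitrary assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a loaded document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def build_context(docs: list[Doc], max_chars: int = 4000) -> str:
    """Join document texts with blank lines and cap the total length."""
    joined = "\n\n".join(doc.page_content.strip() for doc in docs)
    return joined[:max_chars]


docs = [
    Doc("The Tesla Cybertruck is a battery-electric pickup truck.", {"title": "Tesla Cybertruck"}),
    Doc("It was first unveiled as a prototype in November 2019."),
]
context = build_context(docs)
print(context)
```

With real loader output, the same join would be applied to `data` and the result passed as `context`.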


CharacterTextSplitter

In [52]:
from langchain_text_splitters import CharacterTextSplitter

splitter = CharacterTextSplitter(
    separator=".",
    chunk_size=1000,
    chunk_overlap=200,
    length_function=len,
    is_separator_regex=False
)

texts = splitter.create_documents([data[0].metadata['summary']])
print(texts)
print(texts[0])
print(texts[1])


[Document(metadata={}, page_content='The Tesla Cybertruck is a battery-electric full-size pickup truck manufactured by Tesla, Inc. since 2023. It was first unveiled as a prototype in November 2019, featuring a distinctive angular design composed of flat, unpainted stainless steel body panels, drawing comparisons to low-polygon computer models.\nOriginally scheduled for production in late 2021, the vehicle faced multiple delays before entering limited production at Gigafactory Texas in November 2023, with initial customer deliveries occurring later that month. As of 2025, three variants are available: a tri-motor all-wheel drive (AWD) model marketed as the "Cyberbeast", a dual-motor AWD model, and a single-motor rear-wheel drive (RWD) "Long Range" model. EPA range estimates vary by configuration, from 320 to 350 miles (515 to 565 km). \nAs of 2025, the Cybertruck is sold in the United States, Mexico, Canada and South Korea'), Document(metadata={}, page_content='EPA range estimates vary 
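
The effect of `chunk_size` and `chunk_overlap` can be seen in a small plain-Python sketch. It is a simplified greedy version written for illustration and will not reproduce `CharacterTextSplitter` output exactly; `split_with_overlap` is a hypothetical helper, and re-joining pieces with the separator plus a space is a choice made only for this sketch.

```python
def split_with_overlap(
    text: str, separator: str, chunk_size: int, chunk_overlap: int
) -> list[str]:
    """Merge separator-delimited pieces into chunks of at most chunk_size
    characters, repeating trailing pieces up to chunk_overlap characters."""
    pieces = [p.strip() for p in text.split(separator) if p.strip()]
    chunks: list[str] = []
    current: list[str] = []

    def joined(parts: list[str]) -> str:
        return f"{separator} ".join(parts)

    for piece in pieces:
        if current and len(joined(current + [piece])) > chunk_size:
            chunks.append(joined(current))
            # Drop leading pieces until the carried-over tail fits the overlap
            # budget and leaves room for the next piece.
            while current and (
                len(joined(current)) > chunk_overlap
                or len(joined(current + [piece])) > chunk_size
            ):
                current.pop(0)
        current.append(piece)
    if current:
        chunks.append(joined(current))
    return chunks


sample = "Alpha one. Beta two. Gamma three. Delta four"
chunks = split_with_overlap(sample, separator=".", chunk_size=22, chunk_overlap=10)
for chunk in chunks:
    print(repr(chunk))
```

In this run "Beta two" closes the first chunk and opens the second, which is the overlap at work. A single piece longer than `chunk_size` is emitted as an oversized chunk rather than cut.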

RecursiveCharacterTextSplitter

In [54]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

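# No separator is given: the splitter tries paragraph breaks, then line breaks,
# then spaces, then single characters.
splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)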

texts = splitter.create_documents([data[0].metadata['summary']])
print(texts)
print(len(texts))

[Document(metadata={}, page_content='The Tesla Cybertruck is a battery-electric full-size pickup truck manufactured by Tesla, Inc. since 2023. It was first unveiled as a prototype in November 2019, featuring a distinctive angular design composed of flat, unpainted stainless steel body panels, drawing comparisons to low-polygon computer models.\nOriginally scheduled for production in late 2021, the vehicle faced multiple delays before entering limited production at Gigafactory Texas in November 2023, with initial customer deliveries occurring later that month. As of 2025, three variants are available: a tri-motor all-wheel drive (AWD) model marketed as the "Cyberbeast", a dual-motor AWD model, and a single-motor rear-wheel drive (RWD) "Long Range" model. EPA range estimates vary by configuration, from 320 to 350 miles (515 to 565 km).'), Document(metadata={}, page_content='As of 2025, the Cybertruck is sold in the United States, Mexico, Canada and South Korea. The Cybertruck has been cr
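
The recursive idea, trying coarse separators first and falling back to finer ones only for pieces that are still too long, can be sketched as below. This is a simplified illustration: unlike the library splitter it neither merges small pieces back together nor adds overlap, and `recursive_split` is a hypothetical helper.

```python
def recursive_split(
    text: str, chunk_size: int, separators: tuple[str, ...] = ("\n\n", "\n", " ")
) -> list[str]:
    """Split text into pieces of at most chunk_size characters, trying coarse
    separators first and finer ones only where a piece is still too long."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separator left: cut at fixed character positions.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    head, *rest = separators
    pieces: list[str] = []
    for part in text.split(head):
        pieces.extend(recursive_split(part, chunk_size, tuple(rest)))
    return pieces


sample = "first paragraph\n\nsecond one is longer\nwith two lines"
parts = recursive_split(sample, chunk_size=16)
print(parts)
```

The first paragraph and the last line fit within 16 characters and stay whole; only the line that is too long is broken down to words.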