# Chapter 1 - LLM Fundamentals with LangChain

#### Imports

In [46]:
import os
from dotenv import load_dotenv
import warnings
warnings.filterwarnings('ignore')

from pydantic import BaseModel
import openai

from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.prompts import PromptTemplate, ChatPromptTemplate
from langchain_core.runnables import chain
from langchain_openai.llms import OpenAI
from langchain_openai import ChatOpenAI






#### Load OpenAI API

In [3]:
# Load the OpenAI API key from a .env file
load_dotenv(dotenv_path='./api.env')  # returns a bool, not the key itself
api_openai = os.getenv('OPENAI_API_KEY')

#### Simple chat completion

In [4]:

# Initialize a chat-completion LLM (GPT-3.5-turbo) with default parameters
model = ChatOpenAI(model='gpt-3.5-turbo')
print(model)


client=<openai.resources.chat.completions.completions.Completions object at 0x10b82efe0> async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x1162da2f0> root_client=<openai.OpenAI object at 0x10b82c100> root_async_client=<openai.AsyncOpenAI object at 0x10b82c070> model_kwargs={} openai_api_key=SecretStr('**********')


In [5]:
# Generate with model
prompt = "It was a dark and stormy night"
response = model.invoke(prompt)
print(type(response))

# Response as dictionary (model_dump replaces the deprecated Pydantic v1 `dict` method)
response = response.model_dump()
response


<class 'langchain_core.messages.ai.AIMessage'>


{'content': ', the rain pelting against the window panes in a relentless drumbeat. Thunder rumbled in the distance, the flashes of lightning illuminating the sky in an eerie yet beautiful display. The wind howled through the trees, bending them to its will.\n\nIn the midst of the storm, a lone figure trudged through the muddy streets, their coat pulled tightly around them to shield from the chill. Lightning cracked overhead, casting their shadow in sharp relief against the buildings. They quickened their pace, eager to reach their destination before the storm grew fiercer.\n\nAs they reached the old, abandoned mansion at the edge of town, a chill ran down their spine. The house loomed ominously in the dark, its windows boarded up and its gardens overgrown. But despite its eerie appearance, the figure pushed open the creaking gate and stepped inside.\n\nInside, the air was heavy with the scent of decay and neglect. Dust motes danced in the dim light of their lantern, casting eerie shado

In [6]:
# View generated text
text = prompt + response['content']
print(text)


It was a dark and stormy night, the rain pelting against the window panes in a relentless drumbeat. Thunder rumbled in the distance, the flashes of lightning illuminating the sky in an eerie yet beautiful display. The wind howled through the trees, bending them to its will.

In the midst of the storm, a lone figure trudged through the muddy streets, their coat pulled tightly around them to shield from the chill. Lightning cracked overhead, casting their shadow in sharp relief against the buildings. They quickened their pace, eager to reach their destination before the storm grew fiercer.

As they reached the old, abandoned mansion at the edge of town, a chill ran down their spine. The house loomed ominously in the dark, its windows boarded up and its gardens overgrown. But despite its eerie appearance, the figure pushed open the creaking gate and stepped inside.

Inside, the air was heavy with the scent of decay and neglect. Dust motes danced in the dim light of their lantern, casting 

#### Chat completion with specific roles

* Messages sent to LLMs are classified into roles, based on the function of the message

* System role: instructions the model should use to answer a question

* User role: User's query and content provided by the user

* Assistant role: content generated by the model

* Roles are expressed with chat message classes rather than a plain string prompt: HumanMessage for the user role, AIMessage for the assistant role, SystemMessage for the system role, and ChatMessage for an arbitrary role.

In [None]:
# User role example (HumanMessage)
prompt_text = 'What is the capital of France?'
prompt = [HumanMessage(prompt_text)]
response = model.invoke(prompt)
print(response.content)

The capital of France is Paris.


In [52]:
# System role example (SystemMessage)
system_msg = SystemMessage(
    "You are a helpful assistant that responds to questions with three exclamation marks."
)
human_msg = HumanMessage("What is the capital of France?")
response = model.invoke([system_msg, human_msg])
print(response.content)

The capital of France is Paris!!!


#### Chat completions with a standard prompt template

* We can specify a prompt template and use LangChain's PromptTemplate to construct prompts with dynamic inputs

In [8]:
# Specifying a prompt template
prompt_template =  """Answer the question based on the context below. If the question cannot be answered using the information provided, answer with "I don't know".

Context: {context}

Question: {question}

Answer: """

# Dynamic template inputs
template = PromptTemplate.from_template(prompt_template)
prompt = template.invoke(
    {
        "context": "The most recent advancements in NLP are being driven by Large Language Models (LLMs). These models outperform their smaller counterparts and have become invaluable for developers who are creating applications with NLP capabilities. Developers can tap into these models through Hugging Face's `transformers` library, or by utilizing OpenAI and Cohere's offerings through the `openai` and `cohere` libraries, respectively.",
        "question": "Which model providers offer LLMs?",
    }
)

print(type(prompt))
print(prompt.text)

<class 'langchain_core.prompt_values.StringPromptValue'>
Answer the question based on the context below. If the question cannot be answered using the information provided, answer with "I don't know".

Context: The most recent advancements in NLP are being driven by Large Language Models (LLMs). These models outperform their smaller counterparts and have become invaluable for developers who are creating applications with NLP capabilities. Developers can tap into these models through Hugging Face's `transformers` library, or by utilizing OpenAI and Cohere's offerings through the `openai` and `cohere` libraries, respectively.

Question: Which model providers offer LLMs?

Answer: 


* Then we can feed the template-constructed prompt to the model

In [9]:
# Feed prompt to model
response = model.invoke(prompt)
print(response.content)

Hugging Face, OpenAI, and Cohere offer LLMs.


#### Chat completions with a role-based prompt template

In [10]:

# Construct the prompt template with a role for each message (here: system and human)
template = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            'Answer the question based on the context below. If the question cannot be answered using the information provided, answer with "I don\'t know".',
        ),
        ("human", "Context: {context}"),
        ("human", "Question: {question}"),
    ]
)

prompt = template.invoke(
    {
        "context": "The most recent advancements in NLP are being driven by Large Language Models (LLMs). These models outperform their smaller counterparts and have become invaluable for developers who are creating applications with NLP capabilities. Developers can tap into these models through Hugging Face's `transformers` library, or by utilizing OpenAI and Cohere's offerings through the `openai` and `cohere` libraries, respectively.",
        "question": "Which model providers offer LLMs?",
    }
)
print(type(prompt))
print(prompt)

<class 'langchain_core.prompt_values.ChatPromptValue'>
messages=[SystemMessage(content='Answer the question based on the context below. If the question cannot be answered using the information provided, answer with "I don\'t know".', additional_kwargs={}, response_metadata={}), HumanMessage(content="Context: The most recent advancements in NLP are being driven by Large Language Models (LLMs). These models outperform their smaller counterparts and have become invaluable for developers who are creating applications with NLP capabilities. Developers can tap into these models through Hugging Face's `transformers` library, or by utilizing OpenAI and Cohere's offerings through the `openai` and `cohere` libraries, respectively.", additional_kwargs={}, response_metadata={}), HumanMessage(content='Question: Which model providers offer LLMs?', additional_kwargs={}, response_metadata={})]


In [11]:
# Pass the prompt to the model to generate a response
response = model.invoke(prompt)
print(response.content)

OpenAI and Cohere offer LLMs through their libraries.


#### Getting specific format out of the LLM


* Use BaseModel from pydantic to define the schema for the format that you want the LLM to respect

* Then you pass the schema to the model together with the prompt; with_structured_output does this for you.

* Pydantic is a Python library for data validation and settings management based on Python type annotations.
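
To see what "schema" and "validation" mean here without calling a model, the sketch below builds a schema class and validates two dictionaries against it. It assumes Pydantic v2. Using `Field(description=...)` is a choice made for this sketch (the cells below use bare strings under each field instead), and the sample dictionaries are made up.

```python
from pydantic import BaseModel, Field, ValidationError


class AnswerSchemaSketch(BaseModel):
    """An answer to the user's question along with justification for the answer."""

    answer: str = Field(description="The answer to the user's question")
    justification: str = Field(description="Justification for the answer")


# The JSON schema is the machine-readable description of the expected format
schema = AnswerSchemaSketch.model_json_schema()
print(schema["properties"]["answer"]["description"])

# A complete dictionary validates and becomes a typed object
valid = AnswerSchemaSketch.model_validate(
    {"answer": "They weigh the same", "justification": "Both weigh one pound."}
)
print(valid.answer)

# A dictionary with a missing field is rejected
try:
    AnswerSchemaSketch.model_validate({"answer": "They weigh the same"})
    missing = []
except ValidationError as err:
    missing = [error["loc"][0] for error in err.errors()]
print(missing)
```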

In [35]:
# Define Schema for JSON format
class AnswerWithJustification(BaseModel):
    """An answer to the user's question along with justification for the answer."""

    answer: str
    """The answer to the user's question"""
    justification: str
    """Justification for the answer"""


# Initialize a model with the defined schema
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
structured_llm = llm.with_structured_output(AnswerWithJustification)

# Generate the response
response = structured_llm.invoke(
    "What weighs more, a pound of bricks or a pound of feathers")
print(type(response)) # A pydantic object
response.model_dump()

<class '__main__.AnswerWithJustification'>


{'answer': 'They weigh the same',
 'justification': 'Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the two substances is different.'}

In [36]:
# Expand the schema with alternative explanation

class AnswerWithJustificationAndExplanation(BaseModel):
    """An answer to the user's question along with justification for the answer."""

    answer: str
    """The answer to the user's question"""
    justification: str
    """Justification for the answer"""
    alternative_explanation: str
    """An alternative explanation for the answer"""


# Initialize a model with the defined schema
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
structured_llm = llm.with_structured_output(AnswerWithJustificationAndExplanation)

# Generate the response
response = structured_llm.invoke(
    "What weighs more, a pound of bricks or a pound of feathers")
response.model_dump()

{'answer': 'They weigh the same',
 'justification': 'A pound is a unit of weight, so a pound of bricks and a pound of feathers both weigh the same amount, which is one pound.',
 'alternative_explanation': 'Although bricks are denser and heavier than feathers, a pound of bricks and a pound of feathers still weigh the same because they are both measured in pounds.'}

#### Using the runnable interface

* .invoke -> transforms single input to output

* .batch -> efficiently transforms multiple inputs to multiple outputs

* .stream -> streams output from a single input as it is produced

In [38]:
# Invoke example
model = ChatOpenAI(model="gpt-3.5-turbo")
completion = model.invoke("Hi there!")
print(completion.content)

Hello! How can I assist you today?


In [41]:
# Batch example
completions = model.batch(["Hi there!", "Bye!"])
print(type(completions)) # list
print([item.content for item in completions])

<class 'list'>
['Hello! How can I assist you today?', 'Goodbye! Have a great day!']


In [44]:
# Stream example
# Returns a generator that yields tokens or chunks of output one at a time as the model generates them
for token in model.stream("Bye!"):
    print(token.content)


Good
bye
!
 Have
 a
 great
 day
!



#### Imperative and declarative composition

* There are two ways of composing model calls:
    
    1. Imperative composition -> you write explicit code in functions/classes and use them as LangChain runnables
    
    2. Declarative composition -> you combine runnables with LCEL (LangChain Expression Language), using the | operator

In [48]:

# Imperative composition example
template = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant."),
        ("human", "{question}"),
    ]
)
model = ChatOpenAI(model="gpt-3.5-turbo")

@chain
def chatbot(values):
    # Combine the prompt and model into a single function.
    # The @chain decorator wraps the function so that it behaves like any other
    # LangChain runnable, with the same interface (.invoke, .batch, and so on).
    prompt = template.invoke(values)
    return model.invoke(prompt)

# Run the function
response = chatbot.invoke({"question": "Which model providers offer LLMs (language models)?"})
print(response.content)

Some popular model providers that offer LLMs (large language models) are:

1. OpenAI (GPT-3)
2. Google (BERT, T5)
3. Facebook AI Research (RoBERTa)
4. Hugging Face (Transformer, BERT, GPT-2)
5. Microsoft (Turing-NLG, MT-DNN)
6. Salesforce (CTRL)
7. EleutherAI (GPT-Neo) 

These providers offer a range of LLMs with different capabilities and use cases.


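* The function above returns the complete response in one piece, so it is suited to .invoke and .batch only

* To stream the response token by token with .stream, turn the function into a generator that yields each chunk: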

In [52]:
# Imperative composition with stream

@chain
def chatbot(values):
    prompt = template.invoke(values)
    for token in model.stream(prompt):
        yield token


for part in chatbot.stream({"question": "List the three most popular providers of LLMs (language models)?"}):
    print(part.content)


The
 three
 most
 popular
 providers
 of
 L
LM
s
 (
language
 models
)
 are
:


1
.
 Open
AI
's
 G
PT
 (
Gener
ative
 Pre
-trained
 Transformer
)

2
.
 Google
's
 B
ERT
 (
Bid
irectional
 Encoder
 Represent
ations
 from
 Transformers
)

3
.
 Facebook
's
 Ro
BERT
a



* For asynchronous execution, use the Python keywords async and await

* Asynchronous means that generation can run in the background without blocking the main flow of the program

* Asynchronous execution pays off for high-latency, I/O-bound tasks such as API calls.

* With async execution, you are telling Python: "Start this I/O-bound operation (the API call to the LLM), and while you're waiting for the response, you're allowed to handle other tasks (other coroutines)." The LLM server still generates one token at a time.

* The async call just allows your Python process to do other things while the LLM is busy.
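
The effect of not blocking can be seen with the standard library alone. In this sketch the hypothetical `fake_llm_call` uses `asyncio.sleep` as a stand-in for waiting on an API response; two calls started together finish in roughly the time of one.

```python
import asyncio
import time


async def fake_llm_call(name, delay):
    # asyncio.sleep stands in for waiting on a network response
    await asyncio.sleep(delay)
    return f"{name} done"


async def run_both():
    start = time.perf_counter()
    # Both coroutines wait at the same time instead of one after the other
    results = await asyncio.gather(
        fake_llm_call("first", 0.2),
        fake_llm_call("second", 0.2),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(run_both())
print(results, f"{elapsed:.2f}s")
```

Run as a script, the elapsed time should be close to 0.2 seconds rather than 0.4. In a notebook, where an event loop is already running, `await run_both()` takes the place of `asyncio.run(run_both())`.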

In [None]:

# Asynchronous imperative composition
@chain
async def chatbot(values):
    prompt = await template.ainvoke(values)
    return await model.ainvoke(prompt)


async def main():
    return await chatbot.ainvoke({"question": "Which model providers offer LLMs?"})

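if __name__ == "__main__":
    import asyncio
    # asyncio.run starts a new event loop, so this form is meant for a script.
    # Inside Jupyter a loop is already running: use `await main()` instead.
    print(asyncio.run(main()))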




#### Declarative composition

* With declarative composition, you use LCEL (LangChain Expression Language)

* LCEL is a declarative, chainable syntax that lets you compose complex pipelines of components in a structured and modular way (similar to | in bash pipelines)

In [55]:
# Declarative composition

template = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant."),
        ("human", "{question}"),
    ]
)
model = ChatOpenAI()


# Combine model and template with the LCEL pipeline operator ( | )
chatbot = template | model

# Execute the generation with invoke/batch
response = chatbot.invoke({"question": "Which model providers offer LLMs (language models)?"})
print(response.content)

Several companies offer language models as part of their services. Some popular options for language models include:

1. OpenAI, which offers the GPT series of language models, including GPT-2 and GPT-3.
2. Google Cloud's Natural Language API provides access to a powerful pretrained language model.
3. Microsoft Azure has language models available through its Text Analytics API.
4. IBM Watson offers language models as part of its Natural Language Understanding service.
5. Amazon Web Services (AWS) provides language models through its Amazon Comprehend service.

These are just a few options, and there are many other companies and research organizations that also offer language models for various purposes.


In [58]:
# Execute with streaming
for part in chatbot.stream({"question": "Which model providers offer LLMs (language models)?"}):
    print(part.content)
