# Deep Dive into LangChain
## LLMs, Prompt Templates, Caching, Streaming, Chains

This notebook uses the latest versions of the OpenAI and LangChain libraries.

In [None]:
pip install -r ./requirements.txt -q

Download [requirements.txt](https://drive.google.com/file/d/1UpURYL9kqjXfe9J8o-_Dq5KJTbQpzMef/view?usp=sharing)

In [None]:
pip install --upgrade  -q langchain langchain-community

In [None]:
pip install --upgrade -q openai

In [None]:
# pip show openai

In [None]:
pip show langchain

### Python-dotenv

In [1]:
import os
from dotenv import load_dotenv, find_dotenv

# loading the API Keys from .env
load_dotenv(find_dotenv(), override=True)

# os.environ.get('OPENAI_API_KEY')

True

## Chat Models: GPT-3.5 Turbo and GPT-4

In [2]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()  

# invoking the llm (running the prompt)
output = llm.invoke('Explain quantum mechanics in one sentence.', model='gpt-3.5-turbo', temperature=0.1)
print(output.content)

Quantum mechanics is the branch of physics that describes the behavior of particles at the smallest scales, where traditional laws of physics no longer apply and instead, phenomena such as superposition and entanglement occur.


In [3]:
# help(ChatOpenAI)  # see the llm constructor arguments with its defaults

In [4]:
# using Chat Completions API Messages: System, Assistant and Human
from langchain_core.messages import (
    SystemMessage,
    AIMessage,
    HumanMessage
)
messages = [
    SystemMessage(content='You are a physicist and respond only in German.'),
    HumanMessage(content='Explain quantum mechanics in one sentence.')
]

output = llm.invoke(messages)
print(output.content)

Quantenmechanik beschreibt das Verhalten von Teilchen auf atomarer und subatomarer Ebene durch Wellenfunktionen und Wahrscheinlichkeitsverteilungen.


## Caching LLM Responses

### 1. In-Memory Cache

In [5]:
from langchain.globals import set_llm_cache
from langchain_openai import OpenAI
llm = OpenAI(model_name='gpt-3.5-turbo-instruct')

In [6]:
from langchain.cache import InMemoryCache
set_llm_cache(InMemoryCache())

In [7]:
%%time
prompt = 'Tell me a joke that a toddler can understand.'
llm.invoke(prompt)

CPU times: user 20.9 ms, sys: 947 μs, total: 21.8 ms
Wall time: 1.11 s


'\n\nWhy did the cookie go to the doctor? Because it was feeling crumbly!'

In [8]:
%%time
llm.invoke(prompt)

CPU times: user 308 μs, sys: 0 ns, total: 308 μs
Wall time: 312 μs


'\n\nWhy did the cookie go to the doctor? Because it was feeling crumbly!'
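The second call returns in microseconds because the response is served from memory instead of the API. As a rough illustration of the idea (not LangChain's actual implementation), an exact-match cache can be sketched as a dictionary keyed on the prompt plus a string describing the model settings. `slow_model`, `TinyCache`, and `settings` are hypothetical names used only for this sketch.

```python
import time


def slow_model(prompt: str) -> str:
    """Hypothetical stand-in for a remote LLM call."""
    time.sleep(0.05)  # simulate network latency
    return f"response to: {prompt}"


class TinyCache:
    """Illustrative exact-match cache keyed on (prompt, settings)."""

    def __init__(self):
        self._store = {}
        self.hits = 0

    def call(self, prompt, settings, model):
        key = (prompt, settings)
        if key in self._store:
            # Cache hit: return the stored response without calling the model
            self.hits += 1
            return self._store[key]
        # Cache miss: call the model and remember the result
        result = model(prompt)
        self._store[key] = result
        return result


cache = TinyCache()
settings = "model=demo;temperature=0"  # assumed format, for illustration only

first = cache.call("Tell me a joke", settings, slow_model)   # miss, slow
second = cache.call("Tell me a joke", settings, slow_model)  # hit, fast
print(first == second, cache.hits)
```

Because the key includes the settings, changing the model or temperature would produce a miss even for the same prompt. The cache also lives only as long as the Python process.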

### 2. SQLite Caching

In [9]:
from langchain.cache import SQLiteCache
set_llm_cache(SQLiteCache(database_path=".langchain.db"))

In [10]:
%%time
# First request (not in cache, takes longer)
llm.invoke("Tell me a joke")

CPU times: user 50.4 ms, sys: 33.4 ms, total: 83.8 ms
Wall time: 83 ms


'\n\nWhy did the tomato turn red?\n\nBecause it saw the salad dressing!'

In [11]:
%%time
# Second request (cached, faster)
llm.invoke("Tell me a joke")

CPU times: user 2.12 ms, sys: 193 μs, total: 2.31 ms
Wall time: 1.75 ms


'\n\nWhy did the tomato turn red?\n\nBecause it saw the salad dressing!'
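Unlike the in-memory cache, a SQLite-backed cache survives kernel restarts because the responses are written to a file. The sketch below shows the general idea with Python's standard `sqlite3` module. The table layout and the helper names (`open_cache`, `lookup`, `update`) are assumptions made for illustration and do not describe the schema LangChain uses internally.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) a cache database; pass a file path to persist it."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache ("
        "prompt TEXT, settings TEXT, response TEXT, "
        "PRIMARY KEY (prompt, settings))"
    )
    return conn


def lookup(conn, prompt, settings):
    """Return the cached response, or None on a miss."""
    row = conn.execute(
        "SELECT response FROM cache WHERE prompt = ? AND settings = ?",
        (prompt, settings),
    ).fetchone()
    return row[0] if row else None


def update(conn, prompt, settings, response):
    """Store a response, replacing any earlier entry for the same key."""
    conn.execute(
        "INSERT OR REPLACE INTO cache VALUES (?, ?, ?)",
        (prompt, settings, response),
    )
    conn.commit()


conn = open_cache()  # in-memory here so the example leaves no file behind
miss = lookup(conn, "Tell me a joke", "demo")
update(conn, "Tell me a joke", "demo", "Why did the tomato turn red?")
hit = lookup(conn, "Tell me a joke", "demo")
print(miss, hit)
```

A persistent cache like this can return a hit on the very first request of a session if an earlier session already stored the same prompt.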

## LLM Streaming

In [None]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()
prompt = 'Write a rock song about the Moon and a Raven.'
print(llm.invoke(prompt).content)

In [None]:
for chunk in llm.stream(prompt):
    print(chunk.content, end='', flush=True)
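Streaming yields the response piece by piece instead of waiting for the full text. The loop pattern can be tried without an API key by using a plain generator. In this sketch `fake_stream` is a hypothetical stand-in for `llm.stream`, and it yields plain strings rather than message chunks with a `.content` attribute.

```python
from typing import Iterator


def fake_stream(text: str, size: int = 8) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model: yields small text pieces."""
    for i in range(0, len(text), size):
        yield text[i:i + size]


pieces = []
for chunk in fake_stream("The raven flies across the silver moon."):
    # Print each piece as soon as it arrives, without a trailing newline
    print(chunk, end="", flush=True)
    pieces.append(chunk)
print()

# Joining the pieces recovers the complete response
full_text = "".join(pieces)
```

Collecting the chunks while printing them is useful when the complete text is needed afterwards, for example to store it or pass it to another step.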

## PromptTemplates

In [12]:
from langchain.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

# Define a template for the prompt
template = '''You are an experienced virologist.
Write a few sentences about the following virus "{virus}" in {language}.'''

# Create a PromptTemplate object from the template
prompt_template = PromptTemplate.from_template(template=template)

# Fill in the variables: virus and language
prompt = prompt_template.format(virus='hiv', language='german')
prompt  # Returns the generated prompt


'You are an experienced virologist.\nWrite a few sentences about the following virus "hiv" in german.'
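For a simple template like this one, the formatting step behaves much like Python's built-in `str.format`. The following sketch uses only the standard library to list the placeholders and fill them in. It illustrates the idea and is not a replacement for `PromptTemplate`, which also handles validation and other template formats.

```python
from string import Formatter

template = '''You are an experienced virologist.
Write a few sentences about the following virus "{virus}" in {language}.'''

# Extract the placeholder names found in the template
variables = [name for _, name, _, _ in Formatter().parse(template) if name]
print(variables)

# Fill in the placeholders with plain str.format
prompt = template.format(virus='hiv', language='german')
print(prompt)
```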

In [13]:
llm = ChatOpenAI(model_name='gpt-3.5-turbo', temperature=0)
output = llm.invoke(prompt)
print(output.content)

HIV, das humane Immundefizienzvirus, ist ein Retrovirus, das das menschliche Immunsystem schwächt und zu AIDS führen kann. Es wird hauptsächlich durch ungeschützten Geschlechtsverkehr, den Austausch von infizierten Nadeln oder von der Mutter auf das Kind während der Schwangerschaft übertragen. Es gibt keine Heilung für HIV, aber mit antiretroviralen Medikamenten kann die Krankheit gut kontrolliert werden. Prävention und frühzeitige Diagnose sind entscheidend im Kampf gegen HIV.


## ChatPromptTemplates

In [14]:
from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate
from langchain_core.messages import SystemMessage

# Create a chat template with system and human messages
chat_template = ChatPromptTemplate.from_messages(
    [
        SystemMessage(content='You respond only in the JSON format.'),
        HumanMessagePromptTemplate.from_template('Top {n} countries in {area} by population.')
    ]
)

# Fill in the specific values for n and area
messages = chat_template.format_messages(n='5', area='World')
print(messages)  # Outputs the formatted chat messages


[SystemMessage(content='You respond only in the JSON format.'), HumanMessage(content='Top 5 countries in World by population.')]


In [15]:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI()
output = llm.invoke(messages)
print(output.content)

{
    "1": "China",
    "2": "India",
    "3": "United States",
    "4": "Indonesia",
    "5": "Pakistan"
}


## Simple Chains

In [16]:
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

llm = ChatOpenAI()
template = '''You are an experienced virologist.
Write a few sentences about the following virus "{virus}" in {language}.'''
prompt_template = PromptTemplate.from_template(template=template)

chain = LLMChain(
    llm=llm,
    prompt=prompt_template,
    verbose=True
)
output = chain.invoke({'virus': 'HSV', 'language': 'Spanish'})
print(output)


> Entering new LLMChain chain...
Prompt after formatting:
You are an experienced virologist.
Write a few sentences about the following virus "HSV" in Spanish.

> Finished chain.
{'virus': 'HSV', 'language': 'Spanish', 'text': 'El virus del herpes simple, o HSV por sus siglas en inglés, es un virus de doble cadena de ADN que pertenece a la familia de los herpesvirus. Este virus se clasifica en dos tipos: HSV-1, que suele causar herpes labial, y HSV-2, que es la principal causa de herpes genital. Ambos tipos de HSV pueden causar infecciones recurrentes y se transmiten principalmente a través del contacto directo con una persona infectada.'}



In [18]:
# If the cell above raises an LLMChain deprecation warning, replace:
# chain = LLMChain(
#     llm=llm,
#     prompt=prompt_template,
#     verbose=True
# )

# with the pipe (LCEL) syntax below:

from langchain_core.output_parsers import StrOutputParser

chain = prompt_template | llm | StrOutputParser()
output = chain.invoke({'virus': 'HSV', 'language': 'Spanish'})
print(output)

El virus del herpes simple, o HSV por sus siglas en inglés, es un virus de doble cadena de ADN que pertenece a la familia de los herpesvirus. Este virus se clasifica en dos tipos: HSV-1, que suele causar herpes labial, y HSV-2, que es la principal causa de herpes genital. Ambos tipos de HSV pueden causar infecciones recurrentes y se transmiten principalmente a través del contacto directo con una persona infectada.
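The `|` syntax chains steps so that the output of each one becomes the input of the next. A rough way to picture this is a small class that overloads Python's `__or__` operator. The `Step` class and the fake model below are hypothetical and only mimic the composition pattern, not LangChain's actual runnable classes.

```python
class Step:
    """Illustrative wrapper that lets plain functions be composed with `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # a | b returns a new Step that runs a first, then feeds b
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Hypothetical stand-ins for the prompt template, the model, and the parser
fill_prompt = Step(lambda d: 'Write about "{virus}" in {language}.'.format(**d))
fake_llm = Step(lambda prompt: {"content": prompt.upper()})
parse = Step(lambda message: message["content"])

chain = fill_prompt | fake_llm | parse
result = chain.invoke({'virus': 'HSV', 'language': 'Spanish'})
print(result)
```

In this picture the final parsing step plays the role of `StrOutputParser`: it turns the model's message object into a plain string, which is why the pipe version prints text rather than a dictionary.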


In [19]:
template = 'What is the capital of {country}? List the top 3 places to visit in that city. Use bullet points'
prompt_template = PromptTemplate.from_template(template=template)

# Initialize an LLMChain with the ChatOpenAI model and the prompt template
chain = LLMChain(
    llm=llm,
    prompt=prompt_template,
    verbose=True
)

country = input('Enter Country: ')

# Invoke the chain with the country entered by the user
output = chain.invoke(country)
print(output['text'])

Enter Country:  Italy


> Entering new LLMChain chain...
Prompt after formatting:
What is the capital of Italy? List the top 3 places to visit in that city. Use bullet points

> Finished chain.
The capital of Italy is Rome.

Top 3 places to visit in Rome:
- The Colosseum
- The Vatican City (including St. Peter's Basilica and the Sistine Chapel)
- The Roman Forum


## Sequential Chains

In [20]:
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, SimpleSequentialChain

# Initialize the first ChatOpenAI model (gpt-3.5-turbo) with specific temperature
llm1 = ChatOpenAI(model_name='gpt-3.5-turbo', temperature=0.5)

# Define the first prompt template
prompt_template1 = PromptTemplate.from_template(
    template='You are an experienced scientist and Python programmer. Write a function that implements the concept of {concept}.'
)
# Create an LLMChain using the first model and the prompt template
chain1 = LLMChain(llm=llm1, prompt=prompt_template1)

# Initialize the second ChatOpenAI model (gpt-4-turbo) with specific temperature
llm2 = ChatOpenAI(model_name='gpt-4-turbo-preview', temperature=1.2)

# Define the second prompt template
prompt_template2 = PromptTemplate.from_template(
    template='Given the Python function {function}, describe it in as much detail as possible.'
)
# Create another LLMChain using the second model and the prompt template
chain2 = LLMChain(llm=llm2, prompt=prompt_template2)

# Combine both chains into a SimpleSequentialChain
overall_chain = SimpleSequentialChain(chains=[chain1, chain2], verbose=True)

# Invoke the overall chain with the concept "linear regression"
output = overall_chain.invoke('linear regression')


> Entering new SimpleSequentialChain chain...
Sure! Here is an example of a function that implements linear regression in Python:

```python
import numpy as np

def linear_regression(X, y):
    X = np.array(X)
    y = np.array(y)
    
    # Calculate the slope (m) and intercept (b) using the least squares method
    X_mean = np.mean(X)
    y_mean = np.mean(y)
    
    m = np.sum((X - X_mean) * (y - y_mean)) / np.sum((X - X_mean)**2)
    b = y_mean - m * X_mean
    
    return m, b

# Example usage
X = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

m, b = linear_regression(X, y)
print(f"Slope: {m}, Intercept: {b}")
```

This function takes two arrays `X` and `y` as input, where `X` represents the independent variable and `y` represents the dependent variable. It calculates the slope `m` and intercept `b` of the linear regression line using the least squares method and returns them.
The Python function `linear_regression` is designed to perform a linear regr

In [21]:
print(output['output'])

The Python function `linear_regression` is designed to perform a linear regression analysis, which is a basic statistical approach for modeling the relationship between a dependent variable (`y`) and one independent variable (`X`). The goal is to find the best-fitting straight line through the points represented by the `X` and `y` data sets.

Let's deconstruct this function step by step:

### Importing Necessary Library

- **`import numpy as np`** : This line imports the NumPy library with the alias `np`. NumPy is a powerful library for numerical computations in Python, offering support for large, multi-dimensional arrays and matrices, along with a collection of high-level mathematical functions to perform operations on these arrays.

### Function Definition

- **`def linear_regression(X, y):`** : This line defines the function `linear_regression` with two parameters, `X` and `y`. `X` represents the array of independent variable(s), and `y` is the array of corresponding dependent varia

In [22]:
pip install -q --upgrade langchain_experimental

Note: you may need to restart the kernel to use updated packages.


In [23]:
from langchain_experimental.utilities import PythonREPL
python_repl = PythonREPL()
python_repl.run('print([n for n in range(1, 100) if n % 13 == 0])')

Python REPL can execute arbitrary code. Use with caution.


'[13, 26, 39, 52, 65, 78, 91]\n'
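The warning above matters because the REPL tool runs whatever code it is given. Conceptually, it executes a string of Python and returns whatever was printed. The sketch below approximates that behavior with the standard library. `run_snippet` is a hypothetical helper, not the tool's real implementation, and it should be used with trusted code only.

```python
import contextlib
import io


def run_snippet(code: str) -> str:
    """Execute a code string and return everything it printed to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # exec runs arbitrary code: never pass untrusted input here
        exec(code, {})
    return buffer.getvalue()


output = run_snippet('print([n for n in range(1, 100) if n % 13 == 0])')
print(repr(output))
```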

In [24]:
from langchain_experimental.agents.agent_toolkits import create_python_agent
from langchain_experimental.tools.python.tool import PythonREPLTool
from langchain_openai import ChatOpenAI

# Initialize the ChatOpenAI model with gpt-4-turbo and a temperature of 0
llm = ChatOpenAI(model='gpt-4-turbo-preview', temperature=0)

# Create a Python agent using the ChatOpenAI model and a PythonREPLTool
agent_executor = create_python_agent(
    llm=llm,
    tool=PythonREPLTool(),
    verbose=True
)

# Invoke the agent
prompt = 'Calculate the square root of the factorial of 12 and display it with 4 decimal points'
agent_executor.invoke(prompt)


> Entering new AgentExecutor chain...
To solve this, I will first calculate the factorial of 12 using the math module's factorial function. Then, I will calculate the square root of that result using the math module's sqrt function. Finally, I will format the result to display it with 4 decimal points using the format function.

Action: Python_REPL
Action Input: import math
Observation: 
Thought:
Action: Python_REPL
Action Input: factorial_12 = math.factorial(12)
Observation: 
Thought:
Action: Python_REPL
Action Input: sqrt_factorial_12 = math.sqrt(factorial_12)
Observation: 
Thought:
Action: Python_REPL
Action Input: print("{:.4f}".format(sqrt_factorial_12))
Observation: 21886.1052

Thought: I now know the final answer
Final Answer: 21886.1052

> Finished chain.


{'input': 'Calculate the square root of the factorial of 12 and display it with 4 decimal points',
 'output': '21886.1052'}
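Since the agent writes and runs its own code, it can be worth checking its answer independently. The same calculation takes a few lines of plain Python and does not involve the agent at all:

```python
import math

factorial_12 = math.factorial(12)   # 12! = 479001600
value = math.sqrt(factorial_12)     # square root of 12!
formatted = f"{value:.4f}"          # round to 4 decimal places
print(formatted)
```

The printed value matches the agent's final answer of `21886.1052`.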

In [25]:
response = agent_executor.invoke('What is the answer to 5.1 ** 7.3?')


> Entering new AgentExecutor chain...
I need to calculate 5.1 raised to the power of 7.3 to get the answer.
Action: Python_REPL
Action Input: print(5.1 ** 7.3)
Observation: 146306.05007233328

Thought: I now know the final answer
Final Answer: 146306.05007233328

> Finished chain.


In [26]:
response

{'input': 'What is the answer to 5.1 ** 7.3?', 'output': '146306.05007233328'}

In [27]:
print(response['input'])

What is the answer to 5.1 ** 7.3?


In [28]:
print(response['output'])


146306.05007233328
