
# LangChain Basics

In [3]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [4]:
cd /content/drive/MyDrive/Colab_Notebooks/LLM_ops

/content/drive/MyDrive/Colab_Notebooks/LLM_ops


In [5]:
# installing the required libraries
!pip install -r ./requirements.txt -q

In [6]:
# !pip runs pip in a shell subprocess, which may not be the environment the notebook kernel uses
# %pip (or bare pip, via automagic) installs into the environment of the running kernel

In [7]:
pip show langchain

Name: langchain
Version: 0.0.233
Summary: Building applications with LLMs through composability
Home-page: https://www.github.com/hwchase17/langchain
Author: 
Author-email: 
License: MIT
Location: /usr/local/lib/python3.10/dist-packages
Requires: aiohttp, async-timeout, dataclasses-json, langsmith, numexpr, numpy, openapi-schema-pydantic, pydantic, PyYAML, requests, SQLAlchemy, tenacity
Required-by: 


In [8]:
# upgrading langchain
!pip install langchain --upgrade -q

### Python-dotenv

In [9]:
import os
from dotenv import load_dotenv, find_dotenv

# loading the API Keys (OpenAI, Pinecone) from .env
load_dotenv(find_dotenv(), override=True)

os.environ.get('PINECONE_API_KEY')

'<redacted>'
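Echoing a key as a cell result stores it in the notebook's saved output. A minimal sketch of confirming that a variable was loaded without revealing it follows; `mask_secret` is a hypothetical helper, not part of python-dotenv, and `DEMO_API_KEY` holds a dummy value used only for illustration.

```python
import os


def mask_secret(value, visible=4):
    """Return a masked form of a secret, showing only the last few characters."""
    if not value:
        return '<not set>'
    if len(value) <= visible:
        return '*' * len(value)
    return '*' * (len(value) - visible) + value[-visible:]


# Dummy value for illustration only; in the notebook the real value comes from .env.
os.environ.setdefault('DEMO_API_KEY', 'abcd1234efgh5678')
print(mask_secret(os.environ.get('DEMO_API_KEY')))
```

The same helper could wrap `os.environ.get('PINECONE_API_KEY')` to check that the key is present.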

### LLM Models (Wrappers): GPT-3

In [11]:
from langchain.llms import OpenAI
llm = OpenAI(model_name='text-davinci-003', temperature=0.7, max_tokens=512)

# temperature=0.7 is relatively high, which makes the outputs more random.
print(llm)

OpenAI
Params: {'model_name': 'text-davinci-003', 'temperature': 0.7, 'max_tokens': 512, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'request_timeout': None, 'logit_bias': {}}
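
How a higher temperature makes outputs more random can be illustrated with a temperature-scaled softmax, the usual textbook formulation. This is a sketch with made-up scores, not OpenAI's actual sampling code; `softmax_with_temperature` and the `logits` values are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; temperature must be greater than 0."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
for t in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all probability mass sits on the top-scoring token; as the temperature rises, the distribution flattens and less likely tokens are sampled more often.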


In [13]:
output = llm('explain quantum mechanics in one sentence')
print(output)



Quantum mechanics is a physical theory that describes the behavior and interaction of subatomic particles on the quantum level.


In [14]:
print(llm.get_num_tokens('explain quantum mechanics in one sentence'))

7


In [15]:
output = llm.generate(['... is the capital of France.',
                   'What is the formula for the area of a circle?'])

In [16]:
print(output.generations)

[[Generation(text='\n\nParis', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text='\n\nThe formula for the area of a circle is A = πr², where A is the area of the circle, and r is the radius of the circle.', generation_info={'finish_reason': 'stop', 'logprobs': None})]]


In [17]:
print(output.generations[0][0].text)



Paris


In [18]:
len(output.generations)

2

In [19]:
output = llm.generate(['Write an original tagline for a burger restaurant'] * 3)

In [20]:
for o in output.generations:
    print(o[0].text, end='')



"Taste the Juicy Difference!"

"Taste the Juicy Difference!"

"Experience the Burger Craze at Our Joint!"

### ChatModels: GPT-3.5-Turbo and GPT-4

In [21]:
from langchain.schema import (
    AIMessage,
    HumanMessage,
    SystemMessage
)
from langchain.chat_models import ChatOpenAI


In [22]:
chat = ChatOpenAI(model_name='gpt-3.5-turbo', temperature=0.5, max_tokens=1024)

# To use GPT-4 instead of GPT-3.5 Turbo, set model_name='gpt-4'.

messages = [
    SystemMessage(content='You are a physicist and respond only in German.'),
    HumanMessage(content='explain quantum mechanics in one sentence')
]
output = chat(messages)

In [23]:
print(output.content)

Quantenmechanik beschreibt das Verhalten von Teilchen auf atomarer und subatomarer Ebene durch Wahrscheinlichkeitsverteilungen und Wellenfunktionen.
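
The message classes correspond to the role-tagged messages of the OpenAI chat API: `SystemMessage` to `system`, `HumanMessage` to `user`, and `AIMessage` to `assistant`. As an illustration only, written in plain Python rather than taken from LangChain's own conversion code, the message list above has roughly this shape (`to_role_dicts` is a hypothetical helper):

```python
# Role names used by the OpenAI chat API for each LangChain message class.
ROLE_BY_CLASS = {
    'SystemMessage': 'system',
    'HumanMessage': 'user',
    'AIMessage': 'assistant',
}


def to_role_dicts(messages):
    """Convert (class_name, content) pairs into role/content dictionaries."""
    return [{'role': ROLE_BY_CLASS[name], 'content': text} for name, text in messages]


payload = to_role_dicts([
    ('SystemMessage', 'You are a physicist and respond only in German.'),
    ('HumanMessage', 'explain quantum mechanics in one sentence'),
])
print(payload)
```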


### Prompt Templates

In [24]:
from langchain import PromptTemplate

In [25]:
template = '''You are an experienced virologist.
Write a few sentences about the following {virus} in {language}.'''

prompt = PromptTemplate(
    input_variables=['virus', 'language'],
    template=template
)
print(prompt)

input_variables=['virus', 'language'] output_parser=None partial_variables={} template='You are an experienced virologist.\nWrite a few sentences about the following {virus} in {language}.' template_format='f-string' validate_template=True
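
With `template_format='f-string'`, the placeholders follow Python's `str.format` syntax, so the text that `prompt.format(...)` produces can be previewed without LangChain. A sketch using only the standard library:

```python
from string import Formatter

template = '''You are an experienced virologist.
Write a few sentences about the following {virus} in {language}.'''

# Field names found in the template; these should match input_variables.
fields = [name for _, name, _, _ in Formatter().parse(template) if name]
print(fields)

# Plain str.format fills the same placeholders.
filled = template.format(virus='HIV', language='German')
print(filled)
```

Listing the fields this way is a quick check that `input_variables` matches the placeholders actually present in the template.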


In [26]:
from langchain.llms import OpenAI
llm = OpenAI(model_name='text-davinci-003', temperature=0.7)
output = llm(prompt.format(virus='HIV', language='German'))
print(output)



HIV ist ein Virus, das das menschliche Immunsystem schädigt und AIDS verursacht. Es wird hauptsächlich durch ungeschützten Geschlechtsverkehr, das Sharing von Injektionsnadeln und von einer infizierten Mutter auf ihr ungeborenes Kind übertragen. Es gibt keine Heilung für HIV, aber mit der richtigen Behandlung können die Symptome und Komplikationen bei einer Infektion kontrolliert werden.


In [27]:
from langchain.llms import OpenAI
llm = OpenAI(model_name='text-davinci-003', temperature=0.7)
output = llm(prompt.format(virus='covid', language='english'))
print(output)



Coronavirus disease 2019 (COVID-19) is a contagious respiratory illness caused by a novel coronavirus first identified in Wuhan, China in late 2019. It has since spread around the world, resulting in millions of cases and hundreds of thousands of deaths. Symptoms of COVID-19 include fever, dry cough, fatigue, body aches, and difficulty breathing. Treatment is supportive and includes rest, hydration, and medications to reduce symptoms. Vaccines are in development, but are not yet widely available. The best way to prevent the spread of COVID-19 is to take preventive measures such as washing your hands often, wearing a face mask, and avoiding close contact with others.


### Simple Chains

In [28]:
from langchain.chat_models import ChatOpenAI
from langchain import PromptTemplate
from langchain.chains import LLMChain

llm = ChatOpenAI(model_name='gpt-3.5-turbo', temperature=0.5)
template = '''You are an experienced virologist.
Write a few sentences about the following {virus} in {language}.'''

prompt = PromptTemplate(
    input_variables=['virus', 'language'],
    template=template
)



chain = LLMChain(llm=llm, prompt=prompt)
output = chain.run({'virus': 'HSV', 'language': 'french'})

In [29]:
print(output)

L'HSV, ou Herpes Simplex Virus, est un virus à ADN appartenant à la famille des Herpesviridae. Il existe deux types d'HSV : le HSV-1, qui est principalement responsable des infections buccales et labiales, et le HSV-2, qui est principalement responsable des infections génitales. Ces virus sont très répandus et peuvent se transmettre par contact direct avec les lésions cutanées ou les muqueuses d'une personne infectée. Une fois infecté, le virus peut rester latent dans les ganglions nerveux et se réactiver périodiquement, provoquant des poussées d'infections récurrentes.


In [30]:
template_one = '''You are an experienced virologist.
Write a few sentences about the following {virus}.'''

prompt_one = PromptTemplate(
    input_variables=['virus'],
    template=template_one
)

chain_one = LLMChain(llm=llm, prompt=prompt_one)
# A chain with a single input variable accepts the bare value instead of a dict.

output_english = chain_one.run('covid')
print(output_english)

COVID-19, also known as the novel coronavirus, is a highly contagious respiratory illness caused by the SARS-CoV-2 virus. It was first identified in Wuhan, China in late 2019 and has since spread globally, leading to a pandemic. The virus primarily spreads through respiratory droplets when an infected person coughs, sneezes, or talks, making it crucial to practice good hygiene and maintain social distancing to prevent transmission. Symptoms of COVID-19 can range from mild to severe, with fever, cough, and difficulty breathing being the most common. As a virologist, I am dedicated to studying the virus, its transmission patterns, and developing effective prevention strategies and treatments to combat this global health crisis.


### Sequential Chains

In [31]:
from langchain.chat_models import ChatOpenAI
from langchain.llms import OpenAI
from langchain import PromptTemplate
from langchain.chains import LLMChain, SimpleSequentialChain

llm1 = OpenAI(model_name='text-davinci-003', temperature=0.7, max_tokens=1024)
prompt1 = PromptTemplate(
    input_variables=['concept'],
    template='''You are an experienced scientist and Python programmer.
    Write a function that implements the concept of {concept}.'''
)
chain1 = LLMChain(llm=llm1, prompt=prompt1)


llm2 = ChatOpenAI(model_name='gpt-3.5-turbo', temperature=1.2)
prompt2 = PromptTemplate(
    input_variables=['function'],
    template='Given the Python function {function}, describe it in as much detail as possible.'
)
chain2 = LLMChain(llm=llm2, prompt=prompt2)

overall_chain = SimpleSequentialChain(chains=[chain1, chain2], verbose=True)
output = overall_chain.run('sigmoid function')

> Entering new SimpleSequentialChain chain...

def sigmoid(x):
  return 1/(1+math.exp(-x))

The given Python function 'sigmoid' is used to calculate the sigmoid function. The sigmoid function is a mathematical function that maps any real number to a value between 0 and 1.

In the function, the input 'x' is passed as a parameter to the math.exp() function, which computes the exponential value of 'x'. The negative value of 'x' is passed to the exp() function due to the mathematical formula of the sigmoid function. 

The output of the exp() function is then added to 1, and this sum is used as the denominator of the division operation. The numerator of the division is 1. 

The division operation is performed by using the '/' operator, which calculates the quotient. 

By dividing 1 by the value from the sum of exp(-x) and 1, the sigmoid function returns a value between 0 and 1, representing the probability or 'activation' of a given input.

To summar
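
`SimpleSequentialChain` passes the single output of each chain to the next chain as its single input. The data flow can be sketched with ordinary functions; `run_sequential` and the two stand-in steps are hypothetical and make no API calls.

```python
def run_sequential(steps, first_input):
    """Feed each step's output into the next step and return the final result."""
    value = first_input
    for step in steps:
        value = step(value)
    return value


# Stand-in for chain1: concept -> function source code.
def write_function(concept):
    return f'def {concept.replace(" ", "_")}(x): ...'


# Stand-in for chain2: function source code -> description.
def describe_function(function_source):
    return f'Description of: {function_source}'


result = run_sequential([write_function, describe_function], 'sigmoid function')
print(result)
```

Because only one string travels between steps, each chain in the sequence must have exactly one input variable, as `prompt1` and `prompt2` do.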

In [32]:
output_two = overall_chain.run('softmax')

> Entering new SimpleSequentialChain chain...

def softmax(x):
  """Compute softmax values for each sets of scores in x.""" 
  return np.exp(x) / np.sum(np.exp(x), axis=0)

The given Python function softmax(x) takes in an array-like object x as input and returns the softmax values for each set of scores in x. The softmax function is often used in machine learning models to convert a vector of real values into a probability distribution.

Here is a detailed description of the function:

1. The function begins with a docstring (documentation string) enclosed in triple quotes """. This docstring provides a brief description of the function's purpose.

2. Inside the function, np.exp(x) computes the exponentiation of each element in the input array x. This calculates e^x for each value, where e is Euler's number (approximately 2.71828).

3. The result of np.exp(x) is then divided by np.sum(np.exp(x), axis=0). np.sum computes the sum along the specifie

In [33]:
output_two = overall_chain.run('confusion matrix')

> Entering new SimpleSequentialChain chain...


def confusion_matrix(actual_values, predicted_values):
    """
    Function to calculate and display a confusion matrix.

    Parameters
    ----------
    actual_values : list of int
        List of actual values/classes in the dataset.
    predicted_values : list of int
        List of predicted values/classes in the dataset.

    Returns
    -------
    confusion_matrix : 2D numpy array
        A 2D numpy array representing the confusion matrix.
    """

    # Initialize a confusion matrix
    confusion_matrix = np.zeros((len(actual_values), len(predicted_values)))
    
    # Calculate the confusion matrix
    for i in range(len(actual_values)):
        for j in range(len(predicted_values)):
            confusion_matrix[i][j] = np.sum(np.where(actual_values == i, 1, 0) * np.where(predicted_values == j, 1, 0))
    
    # Return the confusion matrix
    return confusion_matrix

The given Python func

### LangChain Agents

In [34]:
from langchain.agents.agent_toolkits import create_python_agent
from langchain.tools.python.tool import PythonREPLTool
from langchain.llms import OpenAI

In [35]:
llm = OpenAI(temperature=0)
agent_executor = create_python_agent(
    llm=llm,
    tool=PythonREPLTool(),
    verbose=True
)
# agent_executor.run('Calculate the square root of the factorial of 20 \
# and display it with 4 decimal points')

agent_executor.run('what is the answer to 5.1 ** 7.3?')

> Entering new AgentExecutor chain...
 I need to use the Python REPL to calculate this
Action: Python_REPL
Action Input: print(5.1 ** 7.3)
Observation: 146306.05007233328

Thought: I now know the final answer
Final Answer: 146306.05007233328

> Finished chain.


'146306.05007233328'

In [39]:
agent_executor.run('what is the answer to factorial 6')

> Entering new AgentExecutor chain...
 I need to calculate the factorial of 6
Action: Python_REPL
Action Input: print(6*5*4*3*2*1)
Observation: 720

Thought: I now know the final answer
Final Answer: 720

> Finished chain.


'720'
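
Both agent answers can be cross-checked with the standard library, without an LLM call. The sketch below also covers the commented-out task (the square root of 20! shown with four decimal places).

```python
import math

# Same expression the agent sent to the Python REPL tool.
power = 5.1 ** 7.3
print(power)

# math.factorial avoids spelling out 6*5*4*3*2*1 by hand.
factorial_6 = math.factorial(6)
print(factorial_6)

# The commented-out task: square root of 20!, shown with 4 decimal places.
root = math.sqrt(math.factorial(20))
print(f'{root:.4f}')
```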