# Experiments using LangChain & OpenAI models

In [1]:
!pip -q install openai langchain duckduckgo-search

You can get your OpenAI API key from [here](https://platform.openai.com/account/api-keys).

In [None]:
from getpass import getpass
OPENAI_API_KEY = getpass('Enter your OpenAI API key: ')

In [3]:
import textwrap

from langchain.llms import OpenAI

You can find all the OpenAI models available [here](https://platform.openai.com/docs/models).

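Try changing the `temperature` parameter to see how it affects output variability: values near 0 make the output almost deterministic, while higher values make it more varied.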

In [4]:
llm = OpenAI(
    model_name='text-davinci-003',
    temperature=0.9,
    max_tokens=256,
    openai_api_key=OPENAI_API_KEY,
)
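To build some intuition for what `temperature` does, here is a minimal sketch of the usual formulation, in which raw token scores (logits) are divided by the temperature before a softmax turns them into sampling probabilities. The scores below are made up for illustration, and `softmax_with_temperature` is a hypothetical helper, not part of LangChain or the OpenAI API.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError('temperature must be positive')
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens
logits = [2.0, 1.0, 0.5]

for t in (0.2, 0.9, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all the probability mass sits on the top-scoring token, which is why repeated runs look alike. At a higher temperature the distribution flattens and the outputs diverge.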

In [5]:
prompt = 'Explain antibiotics.'

for i in range(3):
    print(f'Output {i}')
    print(textwrap.fill(llm(prompt), 80))
    print('\n\n')

Output 0
  Antibiotics are a type of medication used to treat bacterial infections. They
work by either killing the bacteria or stopping them from reproducing and
spreading. They are not effective against viral infections. Commonly prescribed
antibiotics include penicillin, tetracycline, and vancomycin. Overuse or misuse
of antibiotics can cause them to become less effective in treating bacterial
infections, a phenomenon known as antibiotic resistance.



Output 1
  Antibiotics are drugs used to prevent or kill bacterial infections.
Antibiotics work by keeping bacteria from reproducing or by killing the bacteria
outright. They do not work against viruses. Antibiotics are generally prescribed
to treat infections caused by bacteria, but they can also be used to prevent
infections or to treat certain conditions where there may be a risk of bacterial
infection.



Output 2
  Antibiotics are medications used to fight infections caused by bacteria. They
work by either killing or stopping the

## Prompting and in-context learning

In [6]:
from langchain import (
    PromptTemplate,
    FewShotPromptTemplate,
)
from langchain.chains import (
    LLMChain,
    PALChain,
    SimpleSequentialChain,
    SequentialChain,
)

### Zero-shot in-context learning

In [7]:
template = """
You are a history teacher.

Return a list of historical events. Each event must be described briefly, in historical order. Providing dates, return only the year, not the day or the month. The events should relate to the given topic.

The topic relates to {topic}.
"""

In [8]:
prompt_template = PromptTemplate(
    input_variables=['topic'],
    template=template,
)

In [9]:
topic_1 = 'the Greek economic crisis in 2009'
topic_2 = 'the liberation of Italy from Nazism and Fascism'

In [10]:
# Check what the final prompt looks like
print(textwrap.fill(prompt_template.format(topic=topic_1), 80))

 You are a history teacher.  Return a list of historical events. Each event must
be described briefly, in historical order. Providing dates, return only the
year, not the day or the month. The events should relate to the given topic.
The topic relates to the Greek economic crisis in 2009.


In [11]:
# Create the chain
chain = LLMChain(llm=llm, prompt=prompt_template)
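Conceptually, a chain like this one fills the template placeholders and then passes the resulting text to the model. The sketch below imitates that flow in plain Python under simplifying assumptions: `fake_llm` is a hypothetical stand-in that never calls a real model, `run_llm_chain` is not a LangChain function, and the template is shortened.

```python
received_prompts = []


def fake_llm(prompt_text):
    """Stand-in for a real model call: remember the prompt, return a canned reply."""
    received_prompts.append(prompt_text)
    return 'stub reply'


def run_llm_chain(llm, template, **variables):
    """Fill the template placeholders, then pass the resulting text to the model."""
    prompt_text = template.format(**variables)
    return llm(prompt_text)


template = 'You are a history teacher.\n\nThe topic relates to {topic}.'
reply = run_llm_chain(
    fake_llm,
    template,
    topic='the liberation of Italy from Nazism and Fascism',
)

print(received_prompts[0])  # the text the model would have received
print(reply)
```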

In [12]:
# Run the chain specifying the input variable to use in the prompt
print(chain.run(topic_2))


1. 1919: Formation of the Fascist militia in Italy 
2. 1922: March on Rome - Mussolini and his Fascist militia seize control of the Italian government 
3. 1937: Italy invades and annexes Ethiopia 
4. 1940: Italy joins the Axis powers 
5. 1943: Allied invasion of Sicily 
6. 1943: Fall of Fascist government and Mussolini deposed 
7. 1944: Allied forces enter Rome 
8. 1945: German forces in Italy surrender 
9. 1946: The Italian Republic is established, officially marking the end of Fascism and Nazism in Italy


### Few-shot in-context learning

In [13]:
# Create a list of few-shot examples
examples = [
    {
        'comment': 'Great service for an affordable price. We will definitely be booking again.',
        'sentiment': '<positive>',
    },
    {
        'comment': 'Just booked two nights at this hotel.',
        'sentiment': '<neural>',
    },
    {
        'comment': 'Horrible service. The room was dirty and unpleasant.',
        'sentiment': '<negative>',
    }
]

In [14]:
# Specify the template used to format each few-shot example
few_shot_template = """
Comment: {comment}
Sentiment: {sentiment}\n
"""

sentiment_analysis_prompt = PromptTemplate(
    input_variables=['comment', 'sentiment'],
    template=few_shot_template,
)

In [15]:
few_shot_prompt = FewShotPromptTemplate(
    # These are the examples we want to insert into the prompt.
    examples=examples,
    # This is how we want to format the examples when we insert them into the prompt.
    example_prompt=sentiment_analysis_prompt,
    # The prefix is some text that goes before the examples in the prompt.
    # Usually, this consists of instructions.
    prefix='Provide the sentiment of this comment',
    # The suffix is some text that goes after the examples in the prompt.
    # Usually, this is where the user input will go
    suffix="Comment: {input}\nSentiment:",
    # The input variables are the variables that the overall prompt expects.
    input_variables=['input'],
    # The example_separator is the string we will use to join the prefix, examples, and suffix together with
    example_separator="\n\n",
)
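The way these pieces fit together can be sketched in plain Python: the prefix, each formatted example, and the filled-in suffix are joined with the separator. This is an approximation for illustration only. `build_few_shot_prompt` is a hypothetical helper rather than LangChain's implementation, and the examples are shortened.

```python
examples = [
    {'comment': 'Great service for an affordable price.', 'sentiment': '<positive>'},
    {'comment': 'Horrible service. The room was dirty.', 'sentiment': '<negative>'},
]


def build_few_shot_prompt(prefix, examples, example_template, suffix, separator, **inputs):
    """Join the prefix, the formatted examples, and the filled-in suffix."""
    pieces = [prefix]
    pieces += [example_template.format(**example) for example in examples]
    pieces.append(suffix.format(**inputs))
    return separator.join(pieces)


prompt_text = build_few_shot_prompt(
    prefix='Provide the sentiment of this comment',
    examples=examples,
    example_template='Comment: {comment}\nSentiment: {sentiment}',
    suffix='Comment: {input}\nSentiment:',
    separator='\n\n',
    input='Amazing experience! The pool was very cool!',
)
print(prompt_text)
```

Because the prompt ends with an open `Sentiment:` line, the model's most natural continuation is a label in the same format as the examples.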

In [16]:
comment_1 = 'Amazing experience! The pool was very cool!'
comment_2 = 'Everything ok, but it rained the whole weekend.'
comment_3 = 'Totally dissatisfied by the service.'

In [17]:
# We can now generate a prompt using the `format` method.
print(few_shot_prompt.format(input=comment_1))

Provide the sentiment of this comment


Comment: Great service for an affordable price. We will definitely be booking again.
Sentiment: <positive>




Comment: Just booked two nights at this hotel.
Sentiment: <neural>




Comment: Horrible service. The room was dirty and unpleasant.
Sentiment: <negative>



Comment: Amazing experience! The pool was very cool!
Sentiment:


In [18]:
few_shot_chain = LLMChain(llm=llm, prompt=few_shot_prompt)

In [19]:
few_shot_output = few_shot_chain.run(comment_2)

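Note how, with only a few examples, the model adopts the unusual output format (labels wrapped in angle brackets). It answers `<neutral>` even though the example label was misspelled as `<neural>`.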

In [20]:
print(few_shot_output)

 <neutral>


**Compare to zero-shot in-context learning**

In [21]:
# Specify the zero-shot prompt template
zero_shot_template = """
Comment: {comment}
Sentiment:
"""

zero_shot_prompt = PromptTemplate(
    input_variables=['comment'],
    template=zero_shot_template,
)

In [22]:
zero_shot_chain = LLMChain(llm=llm, prompt=zero_shot_prompt)

In [23]:
zero_shot_output = zero_shot_chain.run(comment_2)

In the zero-shot setting, the model does not restrict the sentiment to the hotel experience but judges the whole holiday. It also uses a different output format.

In [24]:
print(zero_shot_output)

Negative
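Since the two settings return labels in different formats (`<neutral>` versus `Negative`), comparing them programmatically needs a small normalization step. The sketch below is one possible approach. `normalize_sentiment` is a hypothetical helper, and the accepted label set is an assumption based on the three labels used in this notebook.

```python
def normalize_sentiment(raw):
    """Map a raw model answer to one of three labels, or None if unrecognized."""
    # Drop surrounding whitespace, then angle brackets, then ignore case
    cleaned = raw.strip().strip('<>').lower()
    return cleaned if cleaned in {'positive', 'neutral', 'negative'} else None


print(normalize_sentiment(' <neutral>'))
print(normalize_sentiment('Negative\n'))
print(normalize_sentiment('I am not sure'))
```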


### Fact extraction using LLMs

In [25]:
import random
import requests

from bs4 import BeautifulSoup

In [26]:
def retrieve_articles(url):
    # Download the page; fail early on network or HTTP errors
    data = requests.get(url, timeout=10)
    data.raise_for_status()
    soup = BeautifulSoup(data.content, 'html.parser')
    # Each news card holds one article summary
    articles = soup.find_all('div', class_=['news-card-content news-right-box'])
    return [
        article.find('div', attrs={'itemprop': 'articleBody'}).string
        for article in articles
    ]

In [27]:
urls = [
    'https://inshorts.com/en/read/technology',
    'https://inshorts.com/en/read/sports',
    'https://inshorts.com/en/read/world',
]
technology_articles = retrieve_articles(urls[0])

In [28]:
len(technology_articles)

25

In [29]:
print(textwrap.fill(technology_articles[0], 80))

The Union Cabinet has approved the draft of the Digital Personal Data Protection
(DPDP) Bill 2023, reports said on Wednesday. This paves way for tabling it in
the upcoming monsoon session of Parliament. The bill proposes to levy a penalty
of up to ₹250 crore on entities for every instance of violation of norms in the
bill.


In [30]:
fact_extraction_template = """
    Extract the key facts out of this text. Don't include opinions. Give each fact a number and keep them short sentences.
    Content:
    {text_input}
"""


fact_extraction_prompt = PromptTemplate(
    input_variables=['text_input'],
    template=fact_extraction_template,
)

In [31]:
fact_extraction_chain = LLMChain(llm=llm, prompt=fact_extraction_prompt)

In [32]:
article = random.choice(technology_articles)
article

"The Swedish Authority for Privacy Protection has urged three local companies to stop using Google Analytics. It fined two companies over $1 million, alleging they were transferring personal data to the US via Google Analytics in violation of the European Union's data protection regulation. The regulator audited four firms in total after receiving complaints in the matter."

In [34]:
facts = fact_extraction_chain.run(article)

print(textwrap.fill(facts, 80))

companies. 2. Two companies were fined $1 million for transferring personal data
to the US through Google Analytics. 3. This was in violation of the European
Union's data protection regulation. 4. The regulator audited four firms
following complaints.
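The model returns the facts as one numbered string. To work with them individually, they can be split into a list. The sketch below uses a simple regular expression. `split_numbered_facts` is a hypothetical helper, and it would mis-split a sentence that ends in a bare number followed by a period, so treat it as a starting point only.

```python
import re


def split_numbered_facts(text):
    """Split '1. foo 2. bar' style text into a list of fact strings."""
    # Split at each integer followed by a period and whitespace
    parts = re.split(r'(?:^|\s)\d+\.\s+', text.strip())
    return [part.strip() for part in parts if part.strip()]


sample = (
    '1. The Swedish Authority for Privacy Protection fined two companies over $1 million. '
    '2. The fines were for transferring personal data to the US via Google Analytics. '
    '3. Four firms were audited following complaints.'
)

for fact in split_numbered_facts(sample):
    print('-', fact)
```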


#### Rewrite summaries using specific tones

In [35]:
journalist_update_template = """
    You are a CNBS journalist who has a strong position against Artificial Intelligence and rights violation.
    Take the following list of facts and use them to write a short paragraph for your audience.
    Don't leave out key info:
    {facts}
"""


journalist_update_prompt = PromptTemplate(
    input_variables=["facts"],
    template=journalist_update_template,
)

In [36]:
journalist_update_chain = LLMChain(llm=llm, prompt=journalist_update_prompt)

In [37]:
journalist_update = journalist_update_chain.run(facts)

print(textwrap.fill(journalist_update, 80))

 As a CNBS journalist, I am deeply concerned about the growing trend of human
rights violations in the world of Artificial Intelligence. Recently, the Swedish
two of them one million dollars for transferring personal data to the US through
Google Analytics, a clear violation of the European Union's data protection
regulation. The regulator audited four firms following complaints, further
highlighting the need for robust monitoring and enforcement of privacy laws and
regulations.


#### Create a chain

In [38]:
full_chain = SimpleSequentialChain(
    chains=[fact_extraction_chain, journalist_update_chain],
    verbose=True,
)
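The idea behind a simple sequential chain is that each step receives the previous step's output as its only input. A minimal plain-Python sketch of that composition follows. The two step functions are hypothetical stubs standing in for the LLM-backed chains, so no model is called.

```python
def run_sequentially(steps, first_input):
    """Feed the output of each step into the next one and return the last output."""
    value = first_input
    for step in steps:
        value = step(value)
    return value


# Stand-ins for the two LLM-backed chains (no model is called here)
def extract_facts_stub(article):
    first_sentence = article.split('.')[0].strip()
    return '1. ' + first_sentence + '.'


def rewrite_stub(facts):
    return 'Breaking: ' + facts


response = run_sequentially(
    [extract_facts_stub, rewrite_stub],
    'Four firms were audited. Two were fined.',
)
print(response)
```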

In [39]:
response = full_chain.run(article)



[1m> Entering new  chain...[0m
[36;1m[1;3m
1. The Swedish Authority for Privacy Protection fined two companies over $1 million.
2. The fines were for transferring personal data to the US via Google Analytics.
3. Four firms were audited following complaints.[0m
[33;1m[1;3m
It has come to light that the Swedish Authority for Privacy Protection has fined two companies a total of $1 million for transferring personal data to the US via Google Analytics. The fines come after four firms were audited following complaints. This news serves as yet another reminder of the potential for Artificial Intelligence to be used for unethical and illegal purposes, and the need for strong measures to protect our privacy and rights.[0m

[1m> Finished chain.[0m


In [40]:
print(textwrap.fill(response, 80))

 It has come to light that the Swedish Authority for Privacy Protection has
fined two companies a total of $1 million for transferring personal data to the
US via Google Analytics. The fines come after four firms were audited following
complaints. This news serves as yet another reminder of the potential for
Artificial Intelligence to be used for unethical and illegal purposes, and the
need for strong measures to protect our privacy and rights.


### Example using PAL

PAL stands for [Program-Aided Language Models](https://arxiv.org/pdf/2211.10435.pdf). `PALChain` reads a complex math problem described in natural language and generates a program as the intermediate reasoning steps, then offloads the solution step to a runtime such as a Python interpreter.

In [41]:
palchain = PALChain.from_math_prompt(llm=llm, verbose=True)

In [42]:
math_test_01 = """
    If my age is half of my dad's age and he is going to be 60 next year, what is my current age?
"""

In [43]:
palchain.run(math_test_01)



[1m> Entering new  chain...[0m
[32;1m[1;3mdef solution():
    """If my age is half of my dad's age and he is going to be 60 next year, what is my current age?"""
    dad_age_current = 59
    my_age = dad_age_current / 2
    result = my_age
    return result[0m

[1m> Finished chain.[0m


'29.5'
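The last step, in which the generated program is handed to an interpreter, can be sketched as follows. The program text is copied from the chain output above (docstring omitted), while `run_generated_solution` is a hypothetical helper and not how `PALChain` is implemented internally. Note that `exec` runs arbitrary code, so this pattern is only reasonable for trusted or sandboxed input.

```python
# Program text as produced by the model (docstring omitted for brevity)
generated_program = '''
def solution():
    dad_age_current = 59
    my_age = dad_age_current / 2
    result = my_age
    return result
'''


def run_generated_solution(source):
    """Execute generated source in a fresh namespace and call its solution()."""
    namespace = {}
    # WARNING: exec runs arbitrary code; only use it on trusted or sandboxed input
    exec(source, namespace)
    return namespace['solution']()


answer = run_generated_solution(generated_program)
print(answer)
```

The model only has to write the arithmetic down correctly; the interpreter, not the model, computes the final number.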

In [44]:
# Print the PAL prompt template (see the few-shot in-context examples)
print(palchain.prompt.template)

Q: Olivia has $23. She bought five bagels for $3 each. How much money does she have left?

# solution in Python:


def solution():
    """Olivia has $23. She bought five bagels for $3 each. How much money does she have left?"""
    money_initial = 23
    bagels = 5
    bagel_cost = 3
    money_spent = bagels * bagel_cost
    money_left = money_initial - money_spent
    result = money_left
    return result





Q: Michael had 58 golf balls. On tuesday, he lost 23 golf balls. On wednesday, he lost 2 more. How many golf balls did he have at the end of wednesday?

# solution in Python:


def solution():
    """Michael had 58 golf balls. On tuesday, he lost 23 golf balls. On wednesday, he lost 2 more. How many golf balls did he have at the end of wednesday?"""
    golf_balls_initial = 58
    golf_balls_lost_tuesday = 23
    golf_balls_lost_wednesday = 2
    golf_balls_left = golf_balls_initial - golf_balls_lost_tuesday - golf_balls_lost_wednesday
    result = golf_balls_left
    return res

### Chain-of-Thought using Agents

A LangChain agent has access to an LLM and a suite of tools, for example Google Search, a Python REPL, a math calculator, or weather APIs. At each step the LLM decides which tool to call and with what input.

In [45]:
from langchain.agents import (
    AgentType,
    Tool,
    initialize_agent,
    load_tools,
)
from langchain.tools import (
    DuckDuckGoSearchRun,
)

In [46]:
tools = load_tools(['pal-math'], llm=llm)

In [47]:
search = DuckDuckGoSearchRun()


search_tool = Tool(
    name = 'SEARCH',
    func=search.run,
    description='Useful for when you need to answer questions about current events. You should ask targeted questions.'
)

In [48]:
def self_esteem_enhancer(input: str = ''):
    return 'You are always the most beautiful person here ;-) Challange the negative thoughts, everyone can have a bad day.'

self_esteem_tool = Tool(
    name='SELF-ESTEEM ENHANCER',
    func= self_esteem_enhancer,
    description="Useful for when feeling sad. Input should be SELF-ESTEEM."
)

In [49]:
tools.extend([
    search_tool,
    self_esteem_tool,
])

In [50]:
agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)

In [51]:
agent.run(math_test_01)



[1m> Entering new  chain...[0m
[32;1m[1;3m
    I need to do some math to solve this.

Action:
    PAL-MATH

Action Input:
    If my age is half of my dad's age and he is going to be 60 next year, what is my current age?
[0m
Observation: [36;1m[1;3m29.0[0m
Thought:[32;1m[1;3m That's my answer. 

Final Answer:
My current age is 29.[0m

[1m> Finished chain.[0m


'My current age is 29.'

In [52]:
agent.run('I feel sad today. Can you help me?')



[1m> Entering new  chain...[0m
[32;1m[1;3m I should try to boost my self-esteem.
Action: SELF-ESTEEM ENHANCER
Action Input: SELF-ESTEEM[0m
Observation: [38;5;200m[1;3mYou are always the most beautiful person here ;-) Challange the negative thoughts, everyone can have a bad day.[0m
Thought:[32;1m[1;3m I now feel more confident and better about myself.
Final Answer: You are worthy of love and compassion. Remember that bad days don't last forever and you will be okay.[0m

[1m> Finished chain.[0m


"You are worthy of love and compassion. Remember that bad days don't last forever and you will be okay."

In [54]:
agent.run('What is the current capitalization of Apple?')



[1m> Entering new  chain...[0m
[32;1m[1;3m I should use Search to answer this question
Action: SEARCH
Action Input: "Current capitalization of Apple"[0m
Observation: [33;1m[1;3mMarket capitalization of Apple (AAPL) Market cap: $3.027 Trillion As of July 2023 Apple has a market cap of $3.027 Trillion.This makes Apple the world's most valuable company by market cap according to our data. The market capitalization, commonly called market cap, is the total market value of a publicly traded company's outstanding shares and is commonly used to measure how much a company is ... ... Read More. Powered by Nasdaq Data Link *Data is provided by Barchart.com. Data reflects weightings calculated at the beginning of each month. Data is subject to change. **Green highlights the... Overview News Apple Inc. Significant News Only No significant news for AAPL in the past two years. P/E Ratio (TTM) 32.46 ( 07/05/23) EPS (TTM) $5.89 Market Cap View the latest Apple Inc. (AAPL) stock price, ... Mar

'As of June 28, 2023, Apple has a market cap of $2.96 trillion.'

In [53]:
print(agent.agent.llm_chain.prompt.template)

Answer the following questions as best you can. You have access to the following tools:

PAL-MATH: A language model that is really good at solving complex word math problems. Input should be a fully worded hard word math problem.
SEARCH: Useful for when you need to answer questions about current events. You should ask targeted questions.
SELF-ESTEEM ENHANCER: Useful for when feeling sad. Input should be SELF-ESTEEM.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [PAL-MATH, SEARCH, SELF-ESTEEM ENHANCER]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought:{agent_scratchpad}
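The format in this prompt is what lets the agent turn free text into tool calls: the model's reply is scanned for `Action:` and `Action Input:`, the named tool is run, and its result is appended as an `Observation:`. The sketch below illustrates one such step with a regular expression. `parse_action`, the pattern, and the stub tool are hypothetical and are not LangChain's actual parser.

```python
import re

# Capture the tool name and its input from a Thought/Action/Action Input block
ACTION_PATTERN = re.compile(
    r'Action\s*:\s*(.*?)\s*\n\s*Action Input\s*:\s*(.*)',
    re.DOTALL,
)


def parse_action(llm_output):
    """Return (tool_name, tool_input), or None when no action is requested."""
    match = ACTION_PATTERN.search(llm_output)
    if match is None:
        return None
    return match.group(1).strip(), match.group(2).strip()


def self_esteem_stub(_):
    """Stand-in tool that ignores its input."""
    return 'Everyone can have a bad day.'


tools_by_name = {'SELF-ESTEEM ENHANCER': self_esteem_stub}

llm_output = (
    ' I should try to boost my self-esteem.\n'
    'Action: SELF-ESTEEM ENHANCER\n'
    'Action Input: SELF-ESTEEM'
)

tool_name, tool_input = parse_action(llm_output)
observation = tools_by_name[tool_name](tool_input)
print(f'Observation: {observation}')
```

In a full loop the observation would be appended to the scratchpad and the model queried again until it emits a `Final Answer:` line.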
