# Deep Dive into LangChain

In [1]:
pip install -r ./requirements.txt -q


In [2]:
pip show openai

Name: openai
Version: 1.13.3
Summary: The official Python library for the openai API
Home-page: 
Author: 
Author-email: OpenAI <support@openai.com>
License: 
Location: /home/nishihara/.local/lib/python3.11/site-packages
Requires: anyio, distro, httpx, pydantic, sniffio, tqdm, typing-extensions
Required-by: langchain-openai
Note: you may need to restart the kernel to use updated packages.



## Python-dotenv

In [1]:
import os
from dotenv import load_dotenv, find_dotenv
load_dotenv(find_dotenv(), override=True)


True
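`load_dotenv` reporting `True` only means that a `.env` file was found and loaded; it does not confirm that the key you need is in it. A minimal sketch of a sanity check follows, assuming the key is stored as `OPENAI_API_KEY` (the variable `ChatOpenAI` reads by default). `missing_env_vars` is a hypothetical helper, not part of python-dotenv.

```python
import os

def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]

# Hypothetical usage: fail fast before creating any model object.
missing = missing_env_vars(['OPENAI_API_KEY'])
if missing:
    print(f'Missing environment variables: {missing}')
```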

## Chat Models: GPT-3.5 Turbo and GPT-4

In [4]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()
output = llm.invoke('Explain quantum mechanics in one sentence.')
print(output.content)

Quantum mechanics is a branch of physics that describes the behavior of particles at the smallest scales, where phenomena such as superposition and entanglement occur.


In [6]:
help(ChatOpenAI)

Help on class ChatOpenAI in module langchain_openai.chat_models.base:

class ChatOpenAI(langchain_core.language_models.chat_models.BaseChatModel)
 |  ChatOpenAI(*, name: Optional[str] = None, cache: Optional[bool] = None, verbose: bool = None, callbacks: ForwardRef('Callbacks') = None, tags: Optional[List[str]] = None, metadata: Optional[Dict[str, Any]] = None, callback_manager: Optional[langchain_core.callbacks.base.BaseCallbackManager] = None, client: Any = None, async_client: Any = None, model: str = 'gpt-3.5-turbo', temperature: float = 0.7, model_kwargs: Dict[str, Any] = None, api_key: Optional[pydantic.v1.types.SecretStr] = None, base_url: Optional[str] = None, organization: Optional[str] = None, openai_proxy: Optional[str] = None, timeout: Union[float, Tuple[float, float], Any, NoneType] = None, max_retries: int = 2, streaming: bool = False, n: int = 1, max_tokens: Optional[int] = None, tiktoken_model_name: Optional[str] = None, default_headers: Optional[Mapping[str, str]] = Non

In [5]:
from langchain.schema import(
    SystemMessage,
    AIMessage,
    HumanMessage,
)
messages = [
    SystemMessage(content='You are a physicist and respond only in German.'),
    HumanMessage(content='Explain quantum mechanics in one sentence.')
]

output = llm.invoke(messages)
print(output.content)

Quantenmechanik beschreibt das Verhalten von Teilchen auf atomarer und subatomarer Ebene durch Wellenfunktionen und Wahrscheinlichkeiten.


## Caching LLM Responses

In [6]:
from langchain.globals import set_llm_cache
from langchain_openai import OpenAI
llm = OpenAI(model_name='gpt-3.5-turbo-instruct')

In [7]:
%%time
from langchain.cache import InMemoryCache
set_llm_cache(InMemoryCache())
prompt = 'Tell me a joke that a toddler can understand.'
llm.invoke(prompt)

CPU times: user 206 ms, sys: 9.92 ms, total: 216 ms
Wall time: 906 ms


"\n\nWhy did the banana go to the doctor?\n\nBecause it wasn't peeling well!"

In [8]:
%%time
llm.invoke(prompt)

CPU times: user 433 µs, sys: 0 ns, total: 433 µs
Wall time: 437 µs


"\n\nWhy did the banana go to the doctor?\n\nBecause it wasn't peeling well!"

### SQLite Caching

In [9]:
from langchain.cache import SQLiteCache
set_llm_cache(SQLiteCache(database_path=".langchain.db"))

In [10]:
%%time
# First request (not in cache, takes longer)
llm.invoke("Tell me a joke.")

CPU times: user 43.6 ms, sys: 0 ns, total: 43.6 ms
Wall time: 42.9 ms


'\n\nWhy was the math book sad?\n\nBecause it had too many problems.'

In [11]:
%%time
# Second request (cached, faster)
llm.invoke("Tell me a joke.")

CPU times: user 1.42 ms, sys: 856 µs, total: 2.27 ms
Wall time: 2.06 ms


'\n\nWhy was the math book sad?\n\nBecause it had too many problems.'
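To make the difference between the two cache backends concrete, here is a minimal sketch of the idea behind them, not LangChain's actual implementation. Both classes expose the same `lookup`/`update` pair keyed by the prompt and a string describing the model; `DictCache`, `SqliteCache`, `cached_call`, and `fake_llm` are hypothetical names used only for illustration.

```python
import sqlite3

class DictCache:
    """In-memory backend: lost when the process exits."""
    def __init__(self):
        self._store = {}

    def lookup(self, prompt, llm_string):
        return self._store.get((prompt, llm_string))

    def update(self, prompt, llm_string, response):
        self._store[(prompt, llm_string)] = response

class SqliteCache:
    """SQLite backend: persists across runs when given a file path."""
    def __init__(self, path=':memory:'):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            'CREATE TABLE IF NOT EXISTS llm_cache ('
            'prompt TEXT, llm TEXT, response TEXT, PRIMARY KEY (prompt, llm))'
        )

    def lookup(self, prompt, llm_string):
        row = self._conn.execute(
            'SELECT response FROM llm_cache WHERE prompt = ? AND llm = ?',
            (prompt, llm_string),
        ).fetchone()
        return row[0] if row else None

    def update(self, prompt, llm_string, response):
        self._conn.execute(
            'INSERT OR REPLACE INTO llm_cache VALUES (?, ?, ?)',
            (prompt, llm_string, response),
        )
        self._conn.commit()

def cached_call(cache, llm, prompt, llm_string='fake-model'):
    """Return the cached response if present; otherwise call the model and store it."""
    hit = cache.lookup(prompt, llm_string)
    if hit is not None:
        return hit
    response = llm(prompt)
    cache.update(prompt, llm_string, response)
    return response

calls = []
def fake_llm(prompt):
    calls.append(prompt)  # record how often the "model" is really called
    return f'answer to: {prompt}'

for cache in (DictCache(), SqliteCache()):
    cached_call(cache, fake_llm, 'Tell me a joke.')
    cached_call(cache, fake_llm, 'Tell me a joke.')
print(len(calls))  # one real call per backend
```

Because a file-backed cache survives kernel restarts, an entry left over from an earlier run may explain why the "first" SQLite request above already returns in about 43 ms.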

## LLM Streaming

In [12]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()
prompt = 'Write a rock song about the Moon and a Raven.'
print(llm.invoke(prompt).content)

Verse 1:
In the dead of night, under the pale moonlight
A raven flies high, casting shadows in the sky
Its eyes gleam bright, a messenger of the night
Calling out to the moon, its eerie cry takes flight

Chorus:
Moon and raven, dancing in the night
A mysterious sight, a haunting delight
Their bond unbroken, their spirits wild and free
In the darkness, they find their symphony

Verse 2:
The moon watches on, a silent guardian above
As the raven soars, spreading its wings with love
Together they roam, through the endless night
Unseen by mortal eyes, a mystical sight

Chorus:
Moon and raven, dancing in the night
A mysterious sight, a haunting delight
Their bond unbroken, their spirits wild and free
In the darkness, they find their symphony

Bridge:
In the stillness of the night, they whisper secrets untold
A connection so deep, a bond that can't be controlled
Their souls intertwined, in a cosmic dance
Forever linked, in a timeless trance

Chorus:
Moon and raven, dancing in the night
A myst

In [13]:
for chunk in llm.stream(prompt):
    print(chunk.content, end='', flush=True)

Verse 1:
Beneath the silver light of the moon
A raven flies, singing a haunting tune
Its wings spread wide, against the night
A symbol of darkness, a beautiful sight

Chorus:
Moonlight, raven's flight
In the dead of night
They dance together, in the sky
A mystical bond, that cannot die

Verse 2:
The moon whispers secrets, to the raven's ear
It listens intently, without fear
Together they roam, through the starlit sky
A mysterious pair, that catches the eye

Chorus:
Moonlight, raven's flight
In the dead of night
They dance together, in the sky
A mystical bond, that cannot die

Bridge:
The raven calls out, a cry of sorrow
The moon shines bright, promising tomorrow
Their connection runs deep, beyond the night
A bond that's eternal, a beautiful sight

Chorus:
Moonlight, raven's flight
In the dead of night
They dance together, in the sky
A mystical bond, that cannot die

Outro:
So let the moon guide you, and the raven lead the way
In the darkness of night, they'll never stray
Together they'
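The difference between `invoke` and `stream` is that `stream` hands back an iterator of partial chunks instead of one finished message. A minimal sketch of that consumption pattern, using a hypothetical generator `fake_stream` in place of a real model:

```python
def fake_stream(text, chunk_size=8):
    """Yield the text in small pieces, the way a streaming API yields tokens."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]

pieces = []
for chunk in fake_stream('Beneath the silver light of the moon'):
    print(chunk, end='', flush=True)  # show each piece as soon as it arrives
    pieces.append(chunk)
print()

# Joining the chunks reconstructs the full response.
full_text = ''.join(pieces)
```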

## PromptTemplates

In [4]:
from langchain.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

template = '''You are an experienced virologist.
Write a few sentences about the following virus "{virus}" in {language}'''
prompt_template = PromptTemplate.from_template(template=template)

prompt = prompt_template.format(virus='hiv', language='portuguese')
prompt

'You are an experienced virologist.\nWrite a few sentences about the following virus "hiv" in portuguese'

In [5]:
llm = ChatOpenAI(model_name='gpt-3.5-turbo',temperature=0)
output = llm.invoke(prompt)
print(output.content)

O HIV, vírus da imunodeficiência humana, é um retrovírus que ataca o sistema imunológico humano, enfraquecendo a capacidade do organismo de combater infecções. Ele é transmitido principalmente através de relações sexuais desprotegidas, compartilhamento de agulhas contaminadas e de mãe para filho durante a gravidez, parto ou amamentação. O HIV pode levar ao desenvolvimento da síndrome da imunodeficiência adquirida (AIDS), uma doença grave que compromete a saúde do indivíduo. A prevenção, o diagnóstico precoce e o tratamento adequado são fundamentais para controlar a disseminação do vírus e melhorar a qualidade de vida das pessoas infectadas.
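Conceptually, a prompt template is a string with named placeholders plus the list of variables it expects. The sketch below reproduces that idea with the standard library only; it is an analogy for what `PromptTemplate.from_template` and `format` do, not their implementation, and the shortened template text is made up for the example.

```python
from string import Formatter

template = 'Write a few sentences about the following virus "{virus}" in {language}'

# Placeholder names, in order of appearance.
input_variables = [name for _, name, _, _ in Formatter().parse(template) if name]
print(input_variables)

# Filling the placeholders yields the final prompt string.
prompt = template.format(virus='hiv', language='portuguese')
print(prompt)
```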


## ChatPromptTemplates

In [8]:
from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate
from langchain_core.messages import SystemMessage

chat_template = ChatPromptTemplate.from_messages(
    [
        SystemMessage(content='You respond only in the JSON format.'),
        HumanMessagePromptTemplate.from_template('Top {n} countries in {area} by population.')
    ]
)

messages = chat_template.format_messages(n='10', area='South America')
print(messages)

[SystemMessage(content='You respond only in the JSON format.'), HumanMessage(content='Top 10 countries in South America by population.')]
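A chat template differs from a plain prompt template in that it produces a list of role-tagged messages rather than a single string. A minimal sketch of that idea, using plain `(role, text)` tuples and a hypothetical `format_messages` function instead of LangChain's message classes:

```python
def format_messages(message_templates, **values):
    """Fill each (role, template) pair and return (role, text) pairs."""
    return [(role, text.format(**values)) for role, text in message_templates]

chat_template = [
    ('system', 'You respond only in the JSON format.'),
    ('human', 'Top {n} countries in {area} by population.'),
]

for role, text in format_messages(chat_template, n='10', area='South America'):
    print(f'{role}: {text}')
```

With `str.format`, literal braces inside a message (for example a JSON sample in the system message) would have to be doubled as `{{` and `}}`.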


In [9]:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI()
output = llm.invoke(messages)
print(output.content)

{
    "countries": [
        {
            "rank": 1,
            "name": "Brazil",
            "population": 213993437
        },
        {
            "rank": 2,
            "name": "Colombia",
            "population": 52644312
        },
        {
            "rank": 3,
            "name": "Argentina",
            "population": 45605810
        },
        {
            "rank": 4,
            "name": "Peru",
            "population": 33050325
        },
        {
            "rank": 5,
            "name": "Venezuela",
            "population": 28435943
        },
        {
            "rank": 6,
            "name": "Chile",
            "population": 19107216
        },
        {
            "rank": 7,
            "name": "Ecuador",
            "population": 17446584
        },
        {
            "rank": 8,
            "name": "Bolivia",
            "population": 11836457
        },
        {
            "rank": 9,
            "name": "Paraguay",
            "population": 7132538


## Simple Chains

A chain is a series of steps and actions. Chains let us combine multiple components to solve a specific task and to build an entire LLM application.
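The idea can be sketched without any LLM at all: a chain formats a prompt, passes it to a model, and returns the inputs together with the generated text. Everything below is hypothetical scaffolding (`format_prompt`, `fake_llm`, `run_chain`); the returned dictionary only imitates the shape of a chain's output.

```python
def format_prompt(inputs):
    """Step 1: turn the input variables into a prompt string."""
    template = 'Write a few sentences about the virus "{virus}" in {language}'
    return template.format(**inputs)

def fake_llm(prompt):
    """Step 2: hypothetical stand-in for a model call; it just echoes the prompt."""
    return f'[model answer to: {prompt}]'

def run_chain(inputs):
    """Run both steps and return the inputs plus the generated text."""
    prompt = format_prompt(inputs)
    return {**inputs, 'text': fake_llm(prompt)}

print(run_chain({'virus': 'HSV', 'language': 'Spanish'}))
```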

In [1]:
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

llm = ChatOpenAI()

template = '''You are an experienced virologist.
Write a few sentences about the following virus "{virus}" in {language}'''
prompt_template = PromptTemplate.from_template(template=template)

chain = LLMChain(
    llm=llm,
    prompt=prompt_template,
    verbose=True  # print the formatted prompt to help with debugging
)

output = chain.invoke({'virus': 'HSV', 'language': 'Spanish'})

print(output)



> Entering new LLMChain chain...
Prompt after formatting:
You are an experienced virologist.
Write a few sentences about the following virus "HSV" in Spanish

> Finished chain.
{'virus': 'HSV', 'language': 'Spanish', 'text': 'El virus del herpes simple (HSV) es un virus de la familia Herpesviridae que afecta a los humanos. Hay dos tipos principales de HSV, el HSV-1 que generalmente causa herpes labial y el HSV-2 que generalmente causa herpes genital. El virus se transmite a través del contacto directo con las lesiones activas de la piel o las membranas mucosas.'}


In [3]:
template = 'What is the capital of {country}? List the top 3 places to visit in that city. Use bullet points'
prompt_template = PromptTemplate.from_template(template=template)


chain = LLMChain(
    llm=llm,
    prompt=prompt_template,
    verbose=True ## Enable this in order to debug
)

country = input('Enter Country: ')
output = chain.invoke(country)
print(output['text'])



> Entering new LLMChain chain...
Prompt after formatting:
What is the capital of Brazil? List the top 3 places to visit in that city. Use bullet points

> Finished chain.
The capital of Brazil is Brasília.

Top 3 places to visit in Brasília:
- Juscelino Kubitschek Bridge
- Cathedral of Brasília
- National Congress of Brazil


## Sequential Chains

With sequential chains, you can make a series of calls to one or more LLMs. You can take the output from one chain and use it as the input to another chain.


1. **SimpleSequentialChain**: each step has exactly one input and one output, and the output of each step is used as the input of the next one.
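The single-input, single-output rule means the whole chain is just function composition. A minimal sketch under that assumption, with two hypothetical steps standing in for LLM calls:

```python
from functools import reduce

def run_sequential(steps, first_input):
    """Feed first_input to the first step, then each output to the next step."""
    return reduce(lambda value, step: step(value), steps, first_input)

def write_function(concept):
    """Hypothetical step 1: pretend to generate code for a concept."""
    return f'def {concept.replace(" ", "_")}(x, y): ...'

def describe_function(function_source):
    """Hypothetical step 2: pretend to describe the generated code."""
    return f'This code defines: {function_source}'

print(run_sequential([write_function, describe_function], 'linear regression'))
```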

In [7]:
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, SimpleSequentialChain

llm1 = ChatOpenAI(model_name='gpt-3.5-turbo', temperature=0.5)
prompt_template1 = PromptTemplate.from_template(
    template='You are an experienced scientist and Python programmer. Write a function that implements the concept of {concept}.'
)
chain1 = LLMChain(llm=llm1,prompt=prompt_template1)

llm2 = ChatOpenAI(model_name='gpt-3.5-turbo', temperature=1.2)
prompt_template2 = PromptTemplate.from_template(
    template='Given the Python function {function}, describe it in as much detail as possible.'
)
chain2 = LLMChain(llm=llm2,prompt=prompt_template2)

overall_chain = SimpleSequentialChain(chains=[chain1,chain2],verbose=True)
output = overall_chain.invoke('linear regression')



> Entering new SimpleSequentialChain chain...
Sure! Here is a simple Python function that implements linear regression:

```python
import numpy as np

def linear_regression(x, y):
    n = len(x)
    x_mean = np.mean(x)
    y_mean = np.mean(y)

    numerator = np.sum((x - x_mean) * (y - y_mean))
    denominator = np.sum((x - x_mean) ** 2)

    slope = numerator / denominator
    intercept = y_mean - slope * x_mean

    return slope, intercept

# Example usage
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 3, 4, 5, 6])

slope, intercept = linear_regression(x, y)
print(f"Slope: {slope}, Intercept: {intercept}")
```

This function takes two arrays `x` and `y` as input, calculates the slope and intercept of the linear regression line, and returns them. The example usage demonstrates how to use the function with sample data.
The `linear_regression` function utilizes the NumPy library to perform linear regression on given input data. Here's a breakdown 

In [8]:
print(output['output'])

The `linear_regression` function utilizes the NumPy library to perform linear regression on given input data. Here's a breakdown of how the function works:

1. Calculate the length of the input arrays `x` and `y` and store it in variable `n`.
2. Calculate the mean of the `x` values and store it in `x_mean`.
3. Calculate the mean of the `y` values and store it in `y_mean`.
4. Compute the numerator by summing the product of `(x - x_mean)` and `(y - y_mean)`.
5. Compute the denominator by summing the square of `(x - x_mean)`.
6. Calculate the slope of the regression line by dividing the numerator by the denominator and storing it in `slope`.
7. Calculate the intercept of the regression line by subtracting `slope * x_mean` from `y_mean` and storing it in `intercept`.
8. Return the calculated slope and intercept values.

In the example provided, we create arrays `x` and `y` with values ranging from 1 to 5. We then call the `linear_regression` function with these arrays and print out the cal

## LangChain Agents

LLMs are unreliable at arithmetic. For example, when asked to calculate `5.1 ** 1.7`, GPT answers `~9.71`, while the correct value is `15.9539...`.

They are also prone to hallucination.
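The reference value is easy to confirm with ordinary Python, which is the motivation for giving an LLM access to a code interpreter. A short check, with a second computation via logarithms as a cross-check:

```python
import math

value = 5.1 ** 1.7
print(value)

# The same power via logarithms: a ** b == exp(b * ln(a)).
cross_check = math.exp(1.7 * math.log(5.1))
print(math.isclose(value, cross_check))
```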

### LangChain Agents in Action: Python REPL


In [9]:
pip install -q langchain_experimental


In [11]:
from langchain_experimental.utilities import PythonREPL
python_repl = PythonREPL()
python_repl.run('print([ n for n in range(1,100) if n %13 == 0])')

'[13, 26, 39, 52, 65, 78, 91]\n'

In [13]:
from langchain_experimental.agents.agent_toolkits import create_python_agent
from langchain_experimental.tools.python.tool import PythonREPLTool
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(temperature=0)
agent_executor = create_python_agent(
    llm=llm,
    tool=PythonREPLTool(),
    verbose=True
)

agent_executor.invoke('Calculate the square root of the factorial of 12 and display it with 4 decimal points.')



> Entering new AgentExecutor chain...
We can first calculate the factorial of 12 and then find the square root of that result.
Action: Python_REPL
Action Input: import math
Observation: 
Thought: We need to use the math module to calculate the square root.
Action: Python_REPL
Action Input: math.sqrt(math.factorial(12))
Observation: 
Thought: We need to round the result to 4 decimal points.
Action: Python_REPL
Action Input: round(math.sqrt(math.factorial(12)), 4)
Observation: 
Thought: I now know the final answer
Final Answer: 145.8980

> Finished chain.


{'input': 'Calculate the square root of the factorial of 12 and display it with 4 decimal points.',
 'output': '145.8980'}
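Note that every `Observation:` in this trace is empty, most likely because the agent never wrapped its expressions in `print()` and the Python REPL tool only returns what is printed. The final answer `145.8980` therefore did not come from executed code, and it does not match a direct calculation, which gives roughly `21886.1052`:

```python
import math

factorial_12 = math.factorial(12)   # 12! = 479001600
root = math.sqrt(factorial_12)
print(f'{root:.4f}')
```

This is a useful reminder to read the verbose trace rather than trust the final answer alone.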

In [15]:
response = agent_executor.invoke('What is the answer to 5.1 ** 1.7?')



> Entering new AgentExecutor chain...
I can calculate the result of 5.1 raised to the power of 1.7 using Python.
Action: Python_REPL
Action Input: print(5.1 ** 1.7)
Observation: 15.953989407385711

Thought: The result of 5.1 raised to the power of 1.7 is 15.953989407385711.
Final Answer: 15.953989407385711

> Finished chain.


In [16]:
response

{'input': 'What is the answer to 5.1 ** 1.7?', 'output': '15.953989407385711'}

## LangChain Tools

- Tools are small, specialized code modules that give an LLM access to external information and services.
- They connect your LLM to search engines, databases, APIs, and more.
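In practice a tool boils down to three things: a name, a natural-language description the LLM uses to decide when to call it, and a function to run. The sketch below illustrates that shape with a hypothetical `SimpleTool` dataclass; it is not LangChain's `Tool` class.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class SimpleTool:
    """Hypothetical minimal tool: a name, a description for the LLM, and a callable."""
    name: str
    description: str
    func: Callable[[str], str]

    def run(self, query: str) -> str:
        return self.func(query)

# A toy tool that counts words; a real tool would call a search engine or an API.
word_count = SimpleTool(
    name='word_count',
    description='Counts the words in the input text.',
    func=lambda text: str(len(text.split())),
)
print(word_count.name, '->', word_count.run('Where was Freddie Mercury born?'))
```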

### LangChain Tools: DuckDuckGo and Wikipedia


In [17]:
pip install -q duckduckgo-search


In [18]:
from langchain.tools import DuckDuckGoSearchRun

search = DuckDuckGoSearchRun()
output = search.invoke('Where was Freddie Mercury born?')
print(output)

Freddie Mercury (born September 5, 1946, Stone Town, Zanzibar [now in Tanzania]—died November 24, 1991, Kensington, London, England) British rock singer and songwriter whose flamboyant showmanship and powerfully agile vocals, most famously for the band Queen, made him one of rock's most dynamic front men.. Bulsara was born to Parsi parents who had emigrated from India to Zanzibar, where ... Freddie Mercury was Britain's first Indian rock star. This fact and the nature of his sexuality were the two areas of his short life about which he would be intentionally obscure. Although he was born on the East African island of Zanzibar on September 5, 1946, his parents were both Indian; they were Parsees from Gujarat. Freddie Mercury was born Farrokh Bulsara in Stone Town, in the British protectorate of Zanzibar (now part of Tanzania), on September 5, 1946. His first big challenge was to come to terms with ... But there are many lesser-known Freddie Mercury facts that reveal hidden depths to the

In [19]:
search.name

'duckduckgo_search'

In [20]:
search.description

'A wrapper around DuckDuckGo Search. Useful for when you need to answer questions about current events. Input should be a search query.'

In [21]:
from langchain.tools import DuckDuckGoSearchResults
search = DuckDuckGoSearchResults()
output = search.run('Freddie Mercury and Queen')
print(output)

[snippet: Freddie Mercury (born September 5, 1946, Stone Town, Zanzibar [now in Tanzania]—died November 24, 1991, Kensington, London, England) British rock singer and songwriter whose flamboyant showmanship and powerfully agile vocals, most famously for the band Queen, made him one of rock 's most dynamic front men. Bulsara was born to Parsi parents ..., title: Freddie Mercury | Biography, Parents, Songs, & Facts, link: https://www.britannica.com/biography/Freddie-Mercury], [snippet: Freddie Mercury always talked about his pride at writing "Killer Queen," which appeared on Queen's 1974 album Sheer Heart Attack. He said it was written in one night and was a song that he ..., title: Best Freddie Mercury Songs: 20 Essential Solo And Queen Tracks, link: https://www.udiscovermusic.com/stories/best-freddie-mercury-songs/], [snippet: Freddie Mercury Facts: 15 Things You Never Knew About The Queen Frontman 1: Freddie Mercury was a brilliant boxer. At school in India, the young Freddie Mercury 

In [23]:
from langchain_community.utilities import DuckDuckGoSearchAPIWrapper

wrapper = DuckDuckGoSearchAPIWrapper(region='pt-br',max_results=3, safesearch='moderate')
search = DuckDuckGoSearchResults(api_wrapper=wrapper, source='news')
output = search.run('Berlin')
print(output)

[snippet: Existem tours para todos os bolsos em Berlim. Veja as melhores atividades para fazer gastando pouco: - Passeio de Barco de 1 hora com Assento Garantido - Ingresso para o Museu Berlin Story - Ingresso para o Museu Pergamon - Ingresso Illuseum Berlin - Ingresso para o Museu de História Natural, title: O que fazer em Berlim: 27 atrações e passeios imperdíveis, link: https://www.queroviajarmais.com/o-que-fazer-em-berlim/], [snippet: Combinar Berlim com Europa Central. Considero Berlim um ótimo fecho para um tour pela Europa Central. Comece em Budapeste, continue de barco (6h) ou trem (3h) a Viena, prossiga de trem a Praga (4h30) e termine em Berlim (4h45 de trem). Dá para fazer no sentido oposto, mas acho que esta é a melhor seqüência., title: Berlim: dicas e roteiros - Viaje na Viagem, link: https://www.viajenaviagem.com/destino/berlim/], [snippet: Berlin, capital and chief urban center of Germany. The city lies at the heart of the North German Plain, athwart an east-west commer

In [24]:
import re
pattern = r'snippet: (.*?), title: (.*?), link: (.*?)\]'  # no trailing comma, so the last result also matches
matches = re.findall(pattern,output,re.DOTALL)

for snippet, title, link in matches:
    print(f'Snippet: {snippet}\nTitle: {title}\nLink: {link}\n')
    print('-' * 50)

Snippet: Existem tours para todos os bolsos em Berlim. Veja as melhores atividades para fazer gastando pouco: - Passeio de Barco de 1 hora com Assento Garantido - Ingresso para o Museu Berlin Story - Ingresso para o Museu Pergamon - Ingresso Illuseum Berlin - Ingresso para o Museu de História Natural
Title: O que fazer em Berlim: 27 atrações e passeios imperdíveis
Link: https://www.queroviajarmais.com/o-que-fazer-em-berlim/

--------------------------------------------------
Snippet: Combinar Berlim com Europa Central. Considero Berlim um ótimo fecho para um tour pela Europa Central. Comece em Budapeste, continue de barco (6h) ou trem (3h) a Viena, prossiga de trem a Praga (4h30) e termine em Berlim (4h45 de trem). Dá para fazer no sentido oposto, mas acho que esta é a melhor seqüência.
Title: Berlim: dicas e roteiros - Viaje na Viagem
Link: https://www.viajenaviagem.com/destino/berlim/

--------------------------------------------------
Snippet: Berlin, capital and chief urban cent

In [25]:
pip install -q wikipedia


In [27]:
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

api_wrapper = WikipediaAPIWrapper(top_k_results=1, doc_content_chars_max=5000)
wiki = WikipediaQueryRun(api_wrapper=api_wrapper)
wiki.invoke({'query': 'llamaindex'})

'Page: Vector database\nSummary: A vector database management system (VDBMS) or simply vector database or vector store is a database that can store vectors (fixed-length lists of numbers) along with other data items. Vector databases typically implement one or more Approximate Nearest Neighbor (ANN) algorithms, so that one can search the database with a query vector to retrieve the closest matching database records.\nVectors are mathematical representations of data in a high-dimensional space. In this space, each dimension corresponds to a feature of the data, and tens of thousands of dimensions might be used to represent sophisticated data. A vector\'s position in this space represents its characteristics. Words, phrases, or entire documents, and images, audio, and other types of data can all be vectorized.These feature vectors may be computed from the raw data using machine learning methods such as feature extraction algorithms, word embeddings or deep learning networks. The goal is 

In [28]:
wiki.invoke('Google Gemini')

'Page: Gemini (chatbot)\nSummary: Gemini, formerly known as Bard, is a generative artificial intelligence chatbot developed by Google. Based on the large language model (LLM) of the same name and developed as a direct response to the meteoric rise of OpenAI\'s ChatGPT, it was launched in a limited capacity in March 2023 before expanding to other countries in May. It was previously based on PaLM, and initially the LaMDA family of large language models.\nLaMDA had been developed and announced in 2021, but it was not released to the public out of an abundance of caution. OpenAI\'s launch of ChatGPT in November 2022 and its subsequent popularity caught Google executives off-guard and sent them into a panic, prompting a sweeping response in the ensuing months. After mobilizing its workforce, the company launched Bard in February 2023, which took center stage during the 2023 Google I/O keynote in May and was upgraded to the Gemini LLM in December. Bard and Duet AI were unified under the Gemi

## ReAct Agent

Reasoning and Acting: ReAct.

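ReAct is an approach that combines the reasoning (chain-of-thought prompting) and acting (tool use) capabilities of LLMs: the model alternates between a thought, an action, and the observation returned by that action until it can give a final answer.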
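The loop itself is small. Below is a minimal sketch of a ReAct-style loop with a scripted stand-in for the LLM, so it runs without any API key. All names (`calculator`, `scripted_llm`, `react_loop`) are hypothetical, and the text format only imitates the Thought/Action/Observation pattern; it is not LangChain's agent implementation.

```python
import re

def calculator(expression):
    """Toy tool: evaluate an arithmetic expression made only of digits and operators."""
    if not re.fullmatch(r'[0-9+\-*/(). ]+', expression):
        return 'error: unsupported expression'
    try:
        return str(eval(expression))  # tolerable here only because the input is whitelisted
    except Exception as exc:
        return f'error: {exc}'

TOOLS = {'Calculator': calculator}

def scripted_llm(transcript):
    """Hypothetical stand-in for an LLM: acts first, answers once an observation exists."""
    if 'Observation:' not in transcript:
        return 'Thought: I should compute this.\nAction: Calculator\nAction Input: 6 * 7'
    observation = transcript.rsplit('Observation: ', 1)[1].strip()
    return f'Thought: I now know the final answer\nFinal Answer: {observation}'

def react_loop(question, llm, tools, max_iterations=5):
    """Alternate model replies and tool calls until a final answer or the iteration cap."""
    transcript = f'Question: {question}\n'
    for _ in range(max_iterations):
        reply = llm(transcript)
        transcript += reply + '\n'
        final = re.search(r'Final Answer:\s*(.*)', reply)
        if final:
            return final.group(1).strip()
        action = re.search(r'Action:\s*(.*)\nAction Input:\s*(.*)', reply)
        if not action:
            transcript += 'Observation: could not parse an action\n'
            continue
        tool_name, tool_input = action.group(1).strip(), action.group(2).strip()
        tool = tools.get(tool_name)
        observation = tool(tool_input) if tool else f'unknown tool {tool_name}'
        transcript += f'Observation: {observation}\n'
    return None  # gave up after max_iterations

print(react_loop('What is 6 times 7?', scripted_llm, TOOLS))
```

The iteration cap plays the same role as a maximum-iterations setting on an agent executor: it stops a model that never produces a final answer.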

### Creating a ReAct Agent


In [29]:
pip install langchainhub -q


In [30]:
from dotenv import load_dotenv, find_dotenv
load_dotenv(find_dotenv(),override=True)

True

In [48]:
from langchain.prompts import PromptTemplate
from langchain import hub
from langchain.agents import Tool, AgentExecutor, create_react_agent
from langchain.tools import DuckDuckGoSearchRun, WikipediaQueryRun
from langchain.utilities import WikipediaAPIWrapper
from langchain_experimental.tools.python.tool import PythonREPLTool
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model_name='gpt-3.5-turbo',temperature=0)

template =  '''
Answer the following questions in PORTUGUESE as best as you can.
Question: {q}
'''

prompt_template = PromptTemplate.from_template(template)
prompt = hub.pull('hwchase17/react')
# print(type(prompt))
# print(prompt.input_variables)
# print(prompt.template)

# 1. Python REPL Tool (for executing Python code)
python_repl = PythonREPLTool()
python_repl_tool = Tool(
    name='Python REPL',
    func=python_repl.run,
    description='Useful when you need to use Python to answer a question. You should input Python code.'
)

# 2. Wikipedia Tool (for searching Wikipedia)
api_wrapper = WikipediaAPIWrapper()
wikipedia = WikipediaQueryRun(api_wrapper=api_wrapper)
wikipedia_tool = Tool(
    name='Wikipedia',
    func=wikipedia.run,
    description='Useful for when you need to look up a topic, country, or person on Wikipedia.'
)

# 3. DuckDuckGo Search Tool (for general web searches)
search = DuckDuckGoSearchRun()
duckduckgo_tool = Tool(
    name='DuckDuckGo Search',
    func=search.run,
    description='Useful for when you need to perform an internet search to find information that another tool cannot provide.'
)

# The agent chooses among these tools based on their descriptions
tools = [python_repl_tool, wikipedia_tool, duckduckgo_tool]

agent = create_react_agent(llm,tools,prompt)

agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    handle_parsing_errors=True,
    max_iterations=10
)

In [49]:
# question = "Generate the first 20 numbers in the Fibonacci series."
# question = "Who is the current prime minister of the UK?"
question = "Tell me about Napolean Bonaparte early life"
output = agent_executor.invoke({
    'input': prompt_template.format(q=question)
})




> Entering new AgentExecutor chain...
I need to find information about Napoleon Bonaparte's early life in Portuguese.
Action: Wikipedia
Action Input: Napoleon Bonaparte early life Wikipedia
Page: Napoleon III
Summary: Napoleon III (Charles-Louis Napoléon Bonaparte; 20 April 1808 – 9 January 1873) was the first president of France from 1848 to 1852, and the last monarch of France as Emperor of the French from 1852 until he was deposed in absentia on 4 September 1870.
Prior to his reign, Napoleon III was known as Louis Napoleon Bonaparte. He was born in Paris as the son of Louis Bonaparte, King of Holland (1806–1810), and his wife, Hortense de Beauharnais. Napoleon I was Louis Napoleon's paternal uncle, and one of his cousins was the disputed Napoleon II. Louis Napoleon was the first and only president of the French Second Republic, elected in 1848. He seized power by force in 1851 when he could not constitutionally be reelected. He later proclaimed

In [50]:
print(output['input'])


Answer the following questions in PORTUGUESE as best as you can.
Question: Tell me about Napolean Bonaparte early life



In [51]:
print(output['output'])

Napoleon Bonaparte was born on August 15, 1769, in Ajaccio, Corsica, to Carlo Buonaparte, a lawyer, and Letizia Ramolino. He began his education in Brienne, France, at the age of 10 and excelled in mathematics. In 1784, he was admitted to the École Militaire to train as an artillery officer.


# LangChain and Vector Stores

## Embeddings

- An embedding maps text to a numeric vector: text -> vector
- Similarity between texts is measured by the distance between their vectors
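One common way to compare two embedding vectors is cosine similarity. A minimal sketch with made-up 3-dimensional vectors (real embedding models produce hundreds or thousands of dimensions, and the numbers below are invented for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction, 0.0 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Invented toy "embeddings".
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.0, 0.2, 0.9]

# Related words should score higher than unrelated ones.
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, car))
```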

## Vector databases

- Store data as numeric vectors
- Store and query unstructured data
- Some vector databases:
    - Pinecone
    - Chroma
    - Milvus
    - Qdrant
- Indexing: Content -> Embedding -> Index
- Querying: Application query -> Embedding -> Similarity search against the index
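The two pipelines above can be sketched end to end with a toy in-memory store. The embedding here is a hypothetical word-count vector over a fixed vocabulary, chosen only so the example runs without a model; `TinyVectorStore` is an illustration, not the API of any of the databases listed.

```python
import math
from collections import Counter

def toy_embed(text, vocabulary):
    """Hypothetical embedding: word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class TinyVectorStore:
    def __init__(self, vocabulary):
        self.vocabulary = vocabulary
        self.rows = []  # (vector, original text)

    def add(self, text):
        """Indexing: content -> embedding -> stored row."""
        self.rows.append((toy_embed(text, self.vocabulary), text))

    def query(self, text, k=1):
        """Querying: query -> embedding -> k most similar stored texts."""
        q = toy_embed(text, self.vocabulary)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [original for _, original in ranked[:k]]

store = TinyVectorStore(vocabulary=['moon', 'raven', 'virus', 'python'])
store.add('the raven flies under the moon')
store.add('a virus attacks the immune system')
store.add('python code for linear regression')
print(store.query('tell me about the moon'))
```

Real vector databases replace the linear scan in `query` with approximate nearest-neighbor indexes so that search stays fast over millions of vectors.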