In [1]:
import os

from dotenv import load_dotenv, find_dotenv

In [2]:
load_dotenv(find_dotenv(), override=True)

True

In [3]:
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")
PINECONE_API_KEY = os.environ.get("PINECONE_API_KEY")

In [4]:
from langchain_openai.chat_models import ChatOpenAI
from langchain.schema import AIMessage, HumanMessage, SystemMessage

In [5]:
text = """
Mojo is designed to solve a variety of AI development challenges that no other language can, 
because Mojo is the first programming language built from the ground-up with MLIR 
(a compiler infrastructure that's ideal for heterogeneous hardware, from CPUs and GPUs, to various AI ASICs). 
We also designed Mojo as a superset of Python because we love Python and its community, 
but we couldn't realistically enhance Python to do all the things we wanted. 
For a longer discussion on this topic, read Why Mojo.
"""

In [6]:
messages = [
    SystemMessage(content="You are an expert in summarizing text."),
    HumanMessage(content=f"Please provide a summary of the following text:\n{text}"),
]

In [7]:
llm = ChatOpenAI(
    model="gpt-3.5-turbo",
    api_key=OPENAI_API_KEY,
    temperature=0
)

In [8]:
llm.get_num_tokens(text)

105

In [9]:
summary = llm.invoke(messages)

In [12]:
print(summary.content)

Mojo is a programming language created to address AI development challenges that other languages cannot handle. It is unique in that it is built with MLIR, a compiler infrastructure suitable for diverse hardware types. Mojo is based on Python to leverage its community support, but it offers additional capabilities that could not be easily integrated into Python. For more information, refer to the article "Why Mojo."


In [13]:
from langchain_core.prompts import PromptTemplate
from langchain.chains import LLMChain

In [21]:
template = '''
Write a concise and short summary in {language} of the following text:

TEXT: {text}
'''

prompt = PromptTemplate.from_template(template)

In [22]:
llm.get_num_tokens(prompt.format(text=text, language="French"))

121
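
For a template like this one, which only contains plain `{name}` placeholders, `prompt.format(...)` should produce the same string as Python's built-in `str.format`. A minimal sketch of that substitution follows; `sample_text` is a hypothetical short stand-in for the Mojo passage, not the notebook's `text` variable.

```python
# Same template string as in the cell above.
template = '''
Write a concise and short summary in {language} of the following text:

TEXT: {text}
'''

# Hypothetical stand-in for the longer Mojo passage.
sample_text = "Mojo is a programming language built on MLIR."

# Plain str.format fills the two named placeholders.
formatted = template.format(text=sample_text, language="French")
print(formatted)
```

The formatted string is what gets sent to the model, so counting its tokens (as the cell above does) gives the prompt size for a given `text` and `language`.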

In [23]:
chain = LLMChain(
    llm=llm,
    prompt=prompt
)

In [28]:
summary = chain.run({"text": text, "language": "Portuguese"})

In [30]:
summary

'Mojo é uma linguagem de programação projetada para resolver desafios de desenvolvimento de IA que nenhuma outra linguagem pode, pois é a primeira linguagem construída do zero com MLIR. Além disso, é um superset de Python, mantendo compatibilidade com a comunidade Python. Para mais informações, leia o artigo "Why Mojo".'

# `load_summarize_chain`

In [32]:
from langchain_core.prompts import PromptTemplate
from langchain_openai.chat_models import ChatOpenAI
from langchain.chains.summarize import load_summarize_chain
from langchain_community.document_loaders import TextLoader

In [43]:
docs = TextLoader("./files/steve.txt").load()
text = docs[0].page_content

In [58]:
llm = ChatOpenAI(
    model="gpt-3.5-turbo",
    api_key=OPENAI_API_KEY,
    temperature=0
)

prompt = '''Write a concise and short summary of the following text:
```
{text}
```
'''

prompt_template = PromptTemplate.from_template(prompt)

chain = load_summarize_chain(
    llm=llm,
    chain_type="stuff",
    prompt=prompt_template,
    verbose=True
)

In [59]:
prompt_template.format(text=text)

'Write a concise and short summary of the following text:\n```\nI am honored to be with you today at your commencement from one of the finest universities in the world. I never graduated from college. Truth be told, this is the closest I’ve ever gotten to a college graduation. Today I want to tell you three stories from my life. That’s it. No big deal. Just three stories.\n\nThe first story is about connecting the dots.\n\nI dropped out of Reed College after the first 6 months, but then stayed around as a drop-in for another 18 months or so before I really quit. So why did I drop out?\n\nIt started before I was born. My biological mother was a young, unwed college graduate student, and she decided to put me up for adoption. She felt very strongly that I should be adopted by college graduates, so everything was all set for me to be adopted at birth by a lawyer and his wife. Except that when I popped out they decided at the last minute that they really wanted a girl. So my parents, who w

In [60]:
chain.run(docs)



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise and short summary of the following text:
```
I am honored to be with you today at your commencement from one of the finest universities in the world. I never graduated from college. Truth be told, this is the closest I’ve ever gotten to a college graduation. Today I want to tell you three stories from my life. That’s it. No big deal. Just three stories.

The first story is about connecting the dots.

I dropped out of Reed College after the first 6 months, but then stayed around as a drop-in for another 18 months or so before I really quit. So why did I drop out?

It started before I was born. My biological mother was a young, unwed college graduate student, and she decided to put me up for adoption. She felt very strongly that I should be adopted by college graduates, so everything was all set for me to be adopted at birth by a lawyer an

'The speaker shares three stories from his life during a commencement speech. The first story is about dropping out of college, the second is about being fired from his own company and starting over, and the third is about facing a cancer diagnosis. He emphasizes the importance of following your heart, staying true to yourself, and making the most of life as it is limited. The speech concludes with the message "Stay Hungry. Stay Foolish."'
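
As the verbose trace suggests, the `stuff` strategy places all documents into a single prompt and calls the model once. A minimal sketch of that idea, not LangChain's actual implementation: `stuff_summarize` and `fake_llm` are hypothetical names, and `fake_llm` is a stub standing in for a real model call.

```python
from typing import Callable, List


def stuff_summarize(docs: List[str], llm: Callable[[str], str]) -> str:
    """Join every document into one prompt and make a single model call."""
    joined = "\n\n".join(docs)
    prompt = f"Write a concise and short summary of the following text:\n{joined}\n"
    return llm(prompt)


prompts_seen: List[str] = []


def fake_llm(prompt: str) -> str:
    """Stub model (assumption): records the prompt and returns a fixed string."""
    prompts_seen.append(prompt)
    return "stub summary"


result = stuff_summarize(["Page one.", "Page two."], fake_llm)
print(len(prompts_seen))  # number of model calls made
```

Because everything goes into one prompt, this approach only works while the combined text fits in the model's context window.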

# `map_reduce`

In [61]:
from langchain_core.prompts import PromptTemplate
from langchain_openai.chat_models import ChatOpenAI
from langchain.chains.summarize import load_summarize_chain
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [62]:
docs = TextLoader("./files/steve.txt").load()
text = docs[0].page_content

llm = ChatOpenAI(
    model="gpt-3.5-turbo",
    api_key=OPENAI_API_KEY,
    temperature=0
)

llm.get_num_tokens(text)

2653

In [69]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=10000,
    chunk_overlap=50
)

chunks = text_splitter.create_documents([text])

In [70]:
len(chunks)

2
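
Note that `chunk_size` and `chunk_overlap` are measured in characters by default, not tokens, which is why a 2653-token text ends up in two chunks of at most 10000 characters. The sketch below illustrates the overlap idea with a plain fixed-size sliding window. It is a simplification: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraph and line boundaries, which this hypothetical `split_with_overlap` helper does not do.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghij" * 3  # 30 characters
pieces = split_with_overlap(sample, chunk_size=12, chunk_overlap=2)
print([len(p) for p in pieces])
```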

In [73]:
chunks

[Document(page_content='I am honored to be with you today at your commencement from one of the finest universities in the world. I never graduated from college. Truth be told, this is the closest I’ve ever gotten to a college graduation. Today I want to tell you three stories from my life. That’s it. No big deal. Just three stories.\n\nThe first story is about connecting the dots.\n\nI dropped out of Reed College after the first 6 months, but then stayed around as a drop-in for another 18 months or so before I really quit. So why did I drop out?\n\nIt started before I was born. My biological mother was a young, unwed college graduate student, and she decided to put me up for adoption. She felt very strongly that I should be adopted by college graduates, so everything was all set for me to be adopted at birth by a lawyer and his wife. Except that when I popped out they decided at the last minute that they really wanted a girl. So my parents, who were on a waiting list, got a call in the

In [71]:
chain = load_summarize_chain(
    llm=llm,
    chain_type="map_reduce",
    verbose=True
)

In [74]:
chain.run(chunks)



> Entering new MapReduceDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following:


"I am honored to be with you today at your commencement from one of the finest universities in the world. I never graduated from college. Truth be told, this is the closest I’ve ever gotten to a college graduation. Today I want to tell you three stories from my life. That’s it. No big deal. Just three stories.

The first story is about connecting the dots.

I dropped out of Reed College after the first 6 months, but then stayed around as a drop-in for another 18 months or so before I really quit. So why did I drop out?

It started before I was born. My biological mother was a young, unwed college graduate student, and she decided to put me up for adoption. She felt very strongly that I should be adopted by college graduates, so everything was all set for me to be adopted at birth by a lawyer and his wife. 

"Steve Jobs shares three personal stories during a commencement speech, highlighting the importance of following passion, embracing change, and living each day to the fullest. He encourages the audience to trust their intuition, find what they love, and not settle for anything less. Jobs reflects on the inevitability of death and the importance of following one's own path, urging listeners to stay hungry, foolish, curious, and open-minded."

In [76]:
map_prompt = '''Write a concise summary of the following: {text}'''

combine_prompt = '''Write a concise and short summary that covers the key points.
Add a capitalized title to the summary.
Start your summary with an INTRODUCTION PARAGRAPH that gives an overview of the text FOLLOWED by BULLET POINTS that summarize the key points.

TEXT: {text}
'''

map_prompt_template = PromptTemplate.from_template(map_prompt)
combine_prompt_template = PromptTemplate.from_template(combine_prompt)

chain = load_summarize_chain(
    llm=llm,
    chain_type="map_reduce",
    map_prompt=map_prompt_template,
    combine_prompt=combine_prompt_template,
    verbose=True
)

chain.run(chunks)



> Entering new MapReduceDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following: I am honored to be with you today at your commencement from one of the finest universities in the world. I never graduated from college. Truth be told, this is the closest I’ve ever gotten to a college graduation. Today I want to tell you three stories from my life. That’s it. No big deal. Just three stories.

The first story is about connecting the dots.

I dropped out of Reed College after the first 6 months, but then stayed around as a drop-in for another 18 months or so before I really quit. So why did I drop out?

It started before I was born. My biological mother was a young, unwed college graduate student, and she decided to put me up for adoption. She felt very strongly that I should be adopted by college graduates, so everything was all set for me to be adopted at birth by a lawyer and his wife. Exc

'LIVING AUTHENTICALLY AND FOLLOWING INTUITION\n\nIn his commencement speech, Steve Jobs shares three stories from his life:\n- Connecting the dots: Emphasizes the importance of following curiosity and intuition.\n- Love and loss: Highlights how being fired from Apple led to new opportunities and personal growth.\n- Death: Discusses his battle with cancer and the importance of living each day to the fullest.\n\nJobs encourages the audience to find what they love, not settle, and to remember that life is short, so follow your heart and make the most of it. The speaker reflects on the inevitability of death and the importance of living authentically and following one\'s own intuition. They share a personal anecdote about The Whole Earth Catalog and encourage the audience to "Stay Hungry. Stay Foolish." The message is to remain curious, open-minded, and true to oneself as they embark on new beginnings.'
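
Conceptually, `map_reduce` summarizes each chunk on its own with the map prompt and then summarizes the concatenated partial summaries with the combine prompt. A minimal sketch under those assumptions: `map_reduce_summarize` and `fake_llm` are hypothetical names, the prompts are shortened versions of the ones above, and the real chain may additionally collapse the partial summaries in several rounds when they are too long, which is omitted here.

```python
from typing import Callable, List

MAP_PROMPT = "Write a concise summary of the following: {text}"
COMBINE_PROMPT = "Write a concise and short summary that covers the key points.\n\nTEXT: {text}"


def map_reduce_summarize(chunks: List[str], llm: Callable[[str], str]) -> str:
    # Map step: one independent model call per chunk.
    partial_summaries = [llm(MAP_PROMPT.format(text=chunk)) for chunk in chunks]
    # Reduce step: one final call over the joined partial summaries.
    combined = "\n\n".join(partial_summaries)
    return llm(COMBINE_PROMPT.format(text=combined))


calls: List[str] = []


def fake_llm(prompt: str) -> str:
    """Stub model (assumption): records the prompt and returns a numbered string."""
    calls.append(prompt)
    return f"summary #{len(calls)}"


result = map_reduce_summarize(["first chunk", "second chunk"], fake_llm)
print(len(calls))  # one call per chunk plus one combine call
```

Since the map calls do not depend on each other, they can run in parallel, at the cost of each chunk being summarized without knowledge of the others.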

# `refine`

In [81]:
from langchain_openai.chat_models import ChatOpenAI
from langchain_core.prompts import PromptTemplate
from langchain.chains.summarize import load_summarize_chain
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import TextLoader, PyPDFLoader

In [83]:
data = PyPDFLoader("./files/attention_is_all_you_need.pdf").load()

In [85]:
data[0].page_content

'Attention Is All You Need\nAshish Vaswani∗\nGoogle Brain\navaswani@google.comNoam Shazeer∗\nGoogle Brain\nnoam@google.comNiki Parmar∗\nGoogle Research\nnikip@google.comJakob Uszkoreit∗\nGoogle Research\nusz@google.com\nLlion Jones∗\nGoogle Research\nllion@google.comAidan N. Gomez∗†\nUniversity of Toronto\naidan@cs.toronto.eduŁukasz Kaiser∗\nGoogle Brain\nlukaszkaiser@google.com\nIllia Polosukhin∗‡\nillia.polosukhin@gmail.com\nAbstract\nThe dominant sequence transduction models are based on complex recurrent or\nconvolutional neural networks that include an encoder and a decoder. The best\nperforming models also connect the encoder and decoder through an attention\nmechanism. We propose a new simple network architecture, the Transformer,\nbased solely on attention mechanisms, dispensing with recurrence and convolutions\nentirely. Experiments on two machine translation tasks show these models to\nbe superior in quality while being more parallelizable and requiring signiﬁcantly\nless tim

In [86]:
data[1].page_content

'transduction problems such as language modeling and machine translation [ 35,2,5]. Numerous\nefforts have since continued to push the boundaries of recurrent language models and encoder-decoder\narchitectures [38, 24, 15].\nRecurrent models typically factor computation along the symbol positions of the input and output\nsequences. Aligning the positions to steps in computation time, they generate a sequence of hidden\nstatesht, as a function of the previous hidden state ht−1and the input for position t. This inherently\nsequential nature precludes parallelization within training examples, which becomes critical at longer\nsequence lengths, as memory constraints limit batching across examples. Recent work has achieved\nsigniﬁcant improvements in computational efﬁciency through factorization tricks [ 21] and conditional\ncomputation [ 32], while also improving model performance in case of the latter. The fundamental\nconstraint of sequential computation, however, remains.\nAttention mec

In [91]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=10000,
    chunk_overlap=100
)

chunks = text_splitter.split_documents(data)

In [92]:
len(chunks)

15

In [93]:
llm = ChatOpenAI(
    model="gpt-3.5-turbo",
    api_key=OPENAI_API_KEY,
    temperature=0
)

In [94]:
chain = load_summarize_chain(
    llm=llm,
    chain_type="refine",
    verbose=True
)

In [95]:
chain.invoke(chunks)



> Entering new RefineDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following:


"Attention Is All You Need
Ashish Vaswani∗
Google Brain
avaswani@google.comNoam Shazeer∗
Google Brain
noam@google.comNiki Parmar∗
Google Research
nikip@google.comJakob Uszkoreit∗
Google Research
usz@google.com
Llion Jones∗
Google Research
llion@google.comAidan N. Gomez∗†
University of Toronto
aidan@cs.toronto.eduŁukasz Kaiser∗
Google Brain
lukaszkaiser@google.com
Illia Polosukhin∗‡
illia.polosukhin@gmail.com
Abstract
The dominant sequence transduction models are based on complex recurrent or
convolutional neural networks that include an encoder and a decoder. The best
performing models also connect the encoder and decoder through an attention
mechanism. We propose a new simple network architecture, the Transformer,
based solely on attention mechanisms, dispensing with recurrence and convolutions
entirely. Exp

{'input_documents': [Document(metadata={'source': './files/attention_is_all_you_need.pdf', 'page': 0}, page_content='Attention Is All You Need\nAshish Vaswani∗\nGoogle Brain\navaswani@google.comNoam Shazeer∗\nGoogle Brain\nnoam@google.comNiki Parmar∗\nGoogle Research\nnikip@google.comJakob Uszkoreit∗\nGoogle Research\nusz@google.com\nLlion Jones∗\nGoogle Research\nllion@google.comAidan N. Gomez∗†\nUniversity of Toronto\naidan@cs.toronto.eduŁukasz Kaiser∗\nGoogle Brain\nlukaszkaiser@google.com\nIllia Polosukhin∗‡\nillia.polosukhin@gmail.com\nAbstract\nThe dominant sequence transduction models are based on complex recurrent or\nconvolutional neural networks that include an encoder and a decoder. The best\nperforming models also connect the encoder and decoder through an attention\nmechanism. We propose a new simple network architecture, the Transformer,\nbased solely on attention mechanisms, dispensing with recurrence and convolutions\nentirely. Experiments on two machine translation tas
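
The `refine` strategy works sequentially: it summarizes the first chunk, then for each later chunk asks the model to update the running summary with the new context. A minimal sketch of that loop, not LangChain's actual implementation: `refine_summarize` and `fake_llm` are hypothetical names, and the prompt wording is illustrative.

```python
from typing import Callable, List


def refine_summarize(chunks: List[str], llm: Callable[[str], str]) -> str:
    if not chunks:
        raise ValueError("need at least one chunk")
    # Initial summary from the first chunk only.
    summary = llm(f"Write a concise summary of the following: {chunks[0]}")
    # Each later chunk refines the summary produced so far.
    for chunk in chunks[1:]:
        summary = llm(
            f"Existing summary: {summary}\n"
            f"Refine it with the extra context below.\n{chunk}"
        )
    return summary


calls: List[str] = []


def fake_llm(prompt: str) -> str:
    """Stub model (assumption): records the prompt and returns a numbered draft."""
    calls.append(prompt)
    return f"draft {len(calls)}"


final = refine_summarize(["page 1", "page 2", "page 3"], fake_llm)
print(len(calls))  # one model call per chunk
```

Each call depends on the previous result, so the calls cannot be parallelized, but every step sees the summary of everything that came before it.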

In [97]:
prompt_template = '''Write a concise summary of the following: {text}'''

refine_template = '''
    Your job is to produce a final summary.
    I have provided an existing summary up to a certain point: {existing_answer}.
    Please refine the existing summary with some more context below.
    ------------
    {text}
    ------------
    Start the final summary with an INTRODUCTION PARAGRAPH that gives an overview of the topic FOLLOWED
    by BULLET POINTS if possible AND end the summary with a CONCLUSION PHRASE.
    
'''

initial_prompt = PromptTemplate.from_template(prompt_template)
refine_prompt = PromptTemplate.from_template(refine_template)

chain = load_summarize_chain(
    llm=llm,
    chain_type="refine",
    question_prompt=initial_prompt,
    refine_prompt=refine_prompt,
    return_intermediate_steps=True,
    verbose=True
)

chain.invoke(chunks)



> Entering new RefineDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following: Attention Is All You Need
Ashish Vaswani∗
Google Brain
avaswani@google.comNoam Shazeer∗
Google Brain
noam@google.comNiki Parmar∗
Google Research
nikip@google.comJakob Uszkoreit∗
Google Research
usz@google.com
Llion Jones∗
Google Research
llion@google.comAidan N. Gomez∗†
University of Toronto
aidan@cs.toronto.eduŁukasz Kaiser∗
Google Brain
lukaszkaiser@google.com
Illia Polosukhin∗‡
illia.polosukhin@gmail.com
Abstract
The dominant sequence transduction models are based on complex recurrent or
convolutional neural networks that include an encoder and a decoder. The best
performing models also connect the encoder and decoder through an attention
mechanism. We propose a new simple network architecture, the Transformer,
based solely on attention mechanisms, dispensing with recurrence and convolutions
entirely. Experi

# Using Agents