# Introduction

This notebook explores summarization templates and the default prompts supplied by LangChain.

It uses Azure OpenAI chat models.

In [2]:
import os
import textwrap

from helpers.utilities import install_if_needed, load_keys, get_env_file_keys
from dotenv import find_dotenv, load_dotenv

In [3]:
ENV_FILE_NAME = '.env_azure_openai'

In [4]:
get_env_file_keys(ENV_FILE_NAME)

['OPENAI_API_TYPE', 'OPENAI_API_VERSION', 'OPENAI_API_BASE', 'OPENAI_API_KEY']

In [5]:
load_keys(ENV_FILE_NAME)
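
Before creating a model it can help to confirm that the four keys listed above are actually set. The sketch below assumes that load_keys exports the keys into os.environ (the helper lives in helpers.utilities and is not shown here); missing_keys is a hypothetical name used only for illustration. It reports key names only, never their values.

```python
import os

# Key names taken from the get_env_file_keys output above.
REQUIRED_KEYS = ['OPENAI_API_TYPE', 'OPENAI_API_VERSION',
                 'OPENAI_API_BASE', 'OPENAI_API_KEY']


def missing_keys(required=REQUIRED_KEYS, environ=None):
    """Return the names of required keys that are unset or empty."""
    # Assumption: load_keys has already copied the keys into os.environ.
    environ = os.environ if environ is None else environ
    return [key for key in required if not environ.get(key)]


# An empty list means every required key is present.
print(missing_keys())
```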

In [6]:
install_if_needed(['langchain'])

from langchain.chat_models import AzureChatOpenAI

langchain is already installed.


## load_summarize_chain

In [8]:
from langchain.chains.summarize import load_summarize_chain
from langchain.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
docs = loader.load()

llm = AzureChatOpenAI(deployment_name="gpt-35-turbo-16k",
                      temperature=0,)

In [9]:
chain = load_summarize_chain(llm, chain_type="stuff")

In [10]:
%%time
result = chain.run(docs)

CPU times: user 63 ms, sys: 4.45 ms, total: 67.5 ms
Wall time: 47.4 s


In [11]:
print(textwrap.fill(result))

The article discusses the concept of building autonomous agents
powered by large language models (LLMs). It explores the components of
such agents, including planning, memory, and tool use. The article
provides case studies and proof-of-concept examples of LLM-powered
agents in various domains. It also highlights the challenges and
limitations of using LLMs in agent systems.


In [12]:
chain.__fields__.keys()

dict_keys(['memory', 'callbacks', 'callback_manager', 'verbose', 'tags', 'metadata', 'input_key', 'output_key', 'llm_chain', 'document_prompt', 'document_variable_name', 'document_separator'])

In [13]:
print(chain.llm_chain.prompt.template)

Write a concise summary of the following:


"{text}"


CONCISE SUMMARY:


## StuffDocumentsChain

With chain_type="stuff", load_summarize_chain returns a StuffDocumentsChain that wraps an LLMChain built from a PromptTemplate. The same chain can be assembled by hand.

In [14]:
from langchain.chains.llm import LLMChain
from langchain.prompts import PromptTemplate
from langchain.chains.combine_documents.stuff import StuffDocumentsChain

prompt_template = """Write a concise summary of the following:
"{text}"
CONCISE SUMMARY:"""
prompt = PromptTemplate.from_template(prompt_template)

# Define the LLM chain, reusing the llm created above.
llm_chain = LLMChain(llm=llm, prompt=prompt)

stuff_chain = StuffDocumentsChain(llm_chain=llm_chain, document_variable_name="text")

docs = loader.load()

In [15]:
%%time
result = stuff_chain.run(docs)

CPU times: user 57.4 ms, sys: 2.13 ms, total: 59.5 ms
Wall time: 5.1 s


In [16]:
print(textwrap.fill(result))

The article discusses the concept of building autonomous agents
powered by large language models (LLMs). It explores the components of
such agents, including planning, memory, and tool use. The article
provides case studies and examples of proof-of-concept demos,
highlighting the challenges and limitations of LLM-powered agents. It
also includes references to related research papers and provides a
citation for the article.


In [17]:
print(stuff_chain.llm_chain.prompt.template)

Write a concise summary of the following:
"{text}"
CONCISE SUMMARY:
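
To make the "stuff" strategy concrete, here is a minimal sketch in plain Python of what such a chain does before it calls the model: it joins the text of every document and places the result in the single {text} slot of the prompt. Doc and stuff_prompt are hypothetical stand-ins rather than LangChain classes, and the blank-line separator is an assumption about the chain's document_separator default.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str


TEMPLATE = 'Write a concise summary of the following:\n"{text}"\nCONCISE SUMMARY:'


def stuff_prompt(docs, template=TEMPLATE, separator="\n\n"):
    """Join the text of all documents and fill the template's {text} slot."""
    joined = separator.join(doc.page_content for doc in docs)
    return template.format(text=joined)


prompt_text = stuff_prompt([Doc("First page."), Doc("Second page.")])
print(prompt_text)
```

Because everything goes into one prompt, this approach only works while the combined text fits in the model's context window.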


## Map-Reduce

Now we explore map-reduce as a way to summarize inputs that are too large to fit in the model's context window in a single call.

In [18]:
# Switch to gpt-35-turbo, which has a smaller context window than gpt-35-turbo-16k.
llm = AzureChatOpenAI(deployment_name="gpt-35-turbo",
                      temperature=0,)

In [19]:
# Build the stuff chain around the smaller input window llm.
llm_chain = LLMChain(llm=llm, prompt=prompt)
smaller_stuff_chain = StuffDocumentsChain(llm_chain=llm_chain, document_variable_name="text")

In [20]:
# Generate the anticipated error.
try:
    smaller_stuff_chain.run(docs)
except Exception as e:
    print(f"Caught an Invalid Request Error: {e}")

Caught an Invalid Request Error: This model's maximum context length is 4096 tokens. However, your messages resulted in 9530 tokens. Please reduce the length of the messages.
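
The numbers in the error message give a rough lower bound on how many pieces the document has to be split into. The sketch below is simple arithmetic, not a LangChain feature; min_chunks is a hypothetical helper, and reserved_tokens=512 is an assumed allowance for the prompt wording and the generated summary.

```python
import math


def min_chunks(total_tokens, context_limit, reserved_tokens):
    """Lower bound on the number of pieces needed so each fits the window."""
    usable = context_limit - reserved_tokens
    if usable <= 0:
        raise ValueError("reserved_tokens must be smaller than context_limit")
    return math.ceil(total_tokens / usable)


# 9530 and 4096 come from the error message; 512 is an assumption.
needed = min_chunks(total_tokens=9530, context_limit=4096, reserved_tokens=512)
print(needed)  # 3
```

In practice more pieces are used, because chunk boundaries follow the text structure and neighbouring chunks overlap.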


In [21]:
from langchain.chains.mapreduce import MapReduceChain
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import MapReduceDocumentsChain

In [26]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=5000, chunk_overlap=200)
chunks = text_splitter.split_documents(docs)

In [27]:
len(chunks)

10
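
The chunk_size and chunk_overlap settings can be illustrated with a simplified fixed-window splitter. This is a sketch only: RecursiveCharacterTextSplitter prefers to break on separators such as paragraph and line breaks, so its chunks will not match these exactly. split_with_overlap is a hypothetical helper, and sizes are counted in characters, which I believe is also the splitter's default length measure.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghij" * 3  # 30 characters
pieces = split_with_overlap(sample, chunk_size=12, chunk_overlap=4)
print([len(piece) for piece in pieces])  # [12, 12, 12, 6]
```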

In [28]:
map_reduce_chain = load_summarize_chain(llm, chain_type="map_reduce")

In [29]:
%%time
result = map_reduce_chain.run(input_documents=chunks, return_only_outputs=True)

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')).
Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised ServiceUnavailableError: The server is overloaded or not ready yet..
Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')).
Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end close

CPU times: user 514 ms, sys: 47.8 ms, total: 562 ms
Wall time: 15min 21s


In [30]:
print(textwrap.fill(result))

This article discusses the concept of building autonomous agents
powered by large language models (LLMs) and explores the components of
such agents, including planning, memory, and tool use. It also
discusses frameworks for improving reasoning skills in AI agents and
explores different types of human memory. The article mentions various
algorithms and architectures used in similarity search and tool use in
language models. It discusses the HuggingGPT system and its
challenges, as well as case studies in the ChemCrow domain, scientific
discovery, and generative agents. The article also describes a
generative agent architecture and a proof-of-concept example called
AutoGPT. It provides information on the response format for the GPT-
Engineer project and includes a sample conversation between the user
and the assistant. The challenges of limited context length and long-
term planning in building LLM-centered agents are mentioned, as well
as the reliability issues of natural language inter

In [31]:
map_reduce_chain.__fields__.keys()

dict_keys(['memory', 'callbacks', 'callback_manager', 'verbose', 'tags', 'metadata', 'input_key', 'output_key', 'llm_chain', 'reduce_documents_chain', 'document_variable_name', 'return_intermediate_steps'])

In [32]:
print(map_reduce_chain.llm_chain.prompt.template)

Write a concise summary of the following:


"{text}"


CONCISE SUMMARY:


In [33]:
print(map_reduce_chain.reduce_documents_chain.combine_documents_chain.llm_chain.prompt.template)

Write a concise summary of the following:


"{text}"


CONCISE SUMMARY:
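
Both prompts above are the same default summary template, which suggests the overall shape of the map-reduce strategy: summarize each chunk on its own, then summarize the concatenated partial summaries. The sketch below shows that shape in plain Python. map_reduce_summarize and fake_summarize are hypothetical names, the fake summarizer merely keeps the first five words, and the real chain's extra step of collapsing partial summaries that are still too long is not modelled.

```python
def map_reduce_summarize(chunks, summarize):
    """Summarize each chunk (map), then summarize the joined summaries (reduce)."""
    partial_summaries = [summarize(chunk) for chunk in chunks]
    return summarize("\n\n".join(partial_summaries))


calls = []


def fake_summarize(text):
    """Stand-in for an LLM call: record the input and keep its first five words."""
    calls.append(text)
    return " ".join(text.split()[:5])


chunks_demo = ["alpha beta gamma delta epsilon zeta",
               "one two three four five six seven"]
final = map_reduce_summarize(chunks_demo, fake_summarize)
print(len(calls))  # 3: two map calls plus one reduce call
print(final)
```

The map calls are independent of one another, so they can be issued in parallel, while the reduce call has to wait for all of them.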


## Refine

In [34]:
from langchain.chains import RefineDocumentsChain

In [35]:
refine_chain = load_summarize_chain(llm, chain_type="refine")

In [36]:
%%time
result = refine_chain.run(input_documents=chunks, return_only_outputs=True)

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised ServiceUnavailableError: The server is overloaded or not ready yet..
Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised ServiceUnavailableError: The server is overloaded or not ready yet..
Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised ServiceUnavailableError: The server is overloaded or not ready yet..
Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised ServiceUnavailableError: The server is overloaded or not ready yet..
Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised ServiceUnavailableError: The server is overloaded or not ready yet..


CPU times: user 186 ms, sys: 21.3 ms, total: 208 ms
Wall time: 4min 2s


In [37]:
print(textwrap.fill(result))

The original summary is still relevant and does not need to be
refined.


In [38]:
refine_chain.__fields__.keys()

dict_keys(['memory', 'callbacks', 'callback_manager', 'verbose', 'tags', 'metadata', 'input_key', 'output_key', 'initial_llm_chain', 'refine_llm_chain', 'document_variable_name', 'initial_response_name', 'document_prompt', 'return_intermediate_steps'])

In [39]:
print(refine_chain.initial_llm_chain.prompt.template)

Write a concise summary of the following:


"{text}"


CONCISE SUMMARY:


In [40]:
print(refine_chain.refine_llm_chain.prompt.template)

Your job is to produce a final summary
We have provided an existing summary up to a certain point: {existing_answer}
We have the opportunity to refine the existing summary(only if needed) with some more context below.
------------
{text}
------------
Given the new context, refine the original summary
If the context isn't useful, return the original summary.
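
The two templates imply a sequential loop: the first chunk produces an initial summary, and each later chunk is sent together with the current summary, whose reply then replaces it. The sketch below reproduces that loop in plain Python with a stand-in model. refine_summarize and fake_llm are hypothetical names, and the stand-in's replies are invented to show one plausible failure mode: if the model answers a refine prompt with a comment about the summary instead of the summary itself, that comment becomes the running answer.

```python
INITIAL_TEMPLATE = (
    'Write a concise summary of the following:\n\n\n"{text}"\n\n\nCONCISE SUMMARY:'
)
REFINE_TEMPLATE = (
    "Your job is to produce a final summary\n"
    "We have provided an existing summary up to a certain point: {existing_answer}\n"
    "We have the opportunity to refine the existing summary"
    "(only if needed) with some more context below.\n"
    "------------\n{text}\n------------\n"
    "Given the new context, refine the original summary\n"
    "If the context isn't useful, return the original summary."
)


def refine_summarize(chunks, llm):
    """Summarize the first chunk, then let each later chunk replace the running answer."""
    if not chunks:
        raise ValueError("need at least one chunk")
    answer = llm(INITIAL_TEMPLATE.format(text=chunks[0]))
    for chunk in chunks[1:]:
        answer = llm(REFINE_TEMPLATE.format(existing_answer=answer, text=chunk))
    return answer


def fake_llm(prompt):
    """Stand-in model: real summary for the initial prompt, meta-reply for refine prompts."""
    if prompt.startswith("Your job is to produce a final summary"):
        return "The original summary is still relevant and does not need to be refined."
    return "Summary of the first chunk."


result_demo = refine_summarize(["chunk one", "chunk two", "chunk three"], fake_llm)
print(result_demo)
```

In this sketch the initial summary is lost as soon as one refine step returns a meta-reply, because nothing in the loop keeps earlier answers.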


The refine chain does not behave as I expected. The output appears to reflect only the model's reply to the last chunk rather than a summary accumulated across all chunks. Perhaps I am looking at the wrong part of the result. Let's rerun with return_only_outputs set to False and inspect the result object more carefully.

In [42]:
%%time
result = refine_chain.run(input_documents=chunks, return_only_outputs=False)

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')).
Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised ServiceUnavailableError: The server is overloaded or not ready yet..
Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised ServiceUnavailableError: The server is overloaded or not ready yet..
Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised ServiceUnavailableError: The server is overloaded or not ready yet..
Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as i

CPU times: user 265 ms, sys: 21.7 ms, total: 287 ms
Wall time: 8min 2s


In [43]:
result

'The original summary is still relevant and does not need to be refined.'
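
The result is still a bare string. As far as I can tell, Chain.run always returns only the single output value and passes extra keyword arguments such as return_only_outputs along as chain inputs, so the flag has no effect here. A minimal sketch of one way to see every refinement step, assuming the LangChain version used in this notebook: build the chain with return_intermediate_steps=True (a field listed in the keys above) and call the chain object directly, since run expects exactly one output key. summarize_with_steps is a hypothetical helper name.

```python
def summarize_with_steps(llm, chunks):
    """Run the refine chain and return the final text plus each intermediate answer."""
    # Imported lazily so this cell can be defined even where langchain is missing.
    from langchain.chains.summarize import load_summarize_chain

    chain = load_summarize_chain(
        llm, chain_type="refine", return_intermediate_steps=True
    )
    # Calling the chain (not .run) returns a dict containing every output key.
    outputs = chain({"input_documents": chunks}, return_only_outputs=True)
    return outputs["output_text"], outputs["intermediate_steps"]
```

Calling summarize_with_steps(llm, chunks) and printing each entry of the second return value should show at which chunk the summary was replaced by the "does not need to be refined" reply.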