# Introduction

This notebook explores summarization templates and the default prompts supplied by LangChain.

It uses the Cohere "command" model.

In [10]:
import os
import textwrap

from helpers.utilities import install_if_needed, load_keys, get_env_file_keys
from dotenv import find_dotenv, load_dotenv

In [11]:
ENV_FILE_NAME = '.env_cohere'

In [12]:
get_env_file_keys(ENV_FILE_NAME)

['COHERE_API_KEY']

In [13]:
load_keys(ENV_FILE_NAME)

In [14]:
install_if_needed(['langchain'])

from langchain.llms import Cohere

langchain is already installed.


## Load Docs and Instantiate LLM

The total number of tokens (prompt plus prediction) for the Cohere "command" model cannot exceed 4096. The example we have been using exceeds that limit, so we avoid stuff chains here, since a stuff chain places the entire document in a single prompt. We review this example with stuff chains in our OpenAI and Azure OpenAI notebooks.
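As a rough illustration of the limit above, the following sketch checks whether a text is likely to fit. The ratio of four characters per token and the 256 tokens reserved for the prediction are assumptions chosen for illustration, not values taken from Cohere; an actual tokenizer would give a more reliable count.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; the characters-per-token ratio is an assumption."""
    return int(len(text) / chars_per_token) + 1


def fits_in_context(
    text: str,
    max_total_tokens: int = 4096,
    reserved_for_output: int = 256,  # hypothetical budget for the prediction
) -> bool:
    """True if the estimated prompt size leaves room for the prediction."""
    return estimate_tokens(text) + reserved_for_output <= max_total_tokens


short_text = "word " * 100     # 500 characters
long_text = "word " * 10_000   # 50,000 characters

print(fits_in_context(short_text))
print(fits_in_context(long_text))
```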

In [15]:
from langchain.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
docs = loader.load()

llm = Cohere(model="command", temperature=0)

## Map-Reduce

Now we explore map-reduce as a way to handle inputs that are too large to fit in a single prompt.
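The idea can be sketched in plain Python, independent of LangChain. Here `toy_summarize` is a hypothetical stand-in for an LLM call, and `map_reduce_summarize` is a name introduced only for this sketch. LangChain's chain may do more than this, for example collapsing intermediate summaries when they are still too long.

```python
from typing import Callable, List


def map_reduce_summarize(chunks: List[str], summarize: Callable[[str], str]) -> str:
    # Map: summarize each chunk independently of the others.
    partial_summaries = [summarize(chunk) for chunk in chunks]
    # Reduce: summarize the concatenation of the partial summaries.
    return summarize("\n\n".join(partial_summaries))


def toy_summarize(text: str) -> str:
    """Hypothetical stand-in for an LLM call: keeps the first five words."""
    return " ".join(text.split()[:5])


toy_chunks = [
    "Agents plan by decomposing tasks into smaller steps and subgoals.",
    "Memory lets an agent retain and recall information over time.",
]
summary = map_reduce_summarize(toy_chunks, toy_summarize)
print(summary)
```

Because the map step treats each chunk on its own, it needs one model call per chunk plus at least one more for the reduce step.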

In [16]:
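from langchain.chains.mapreduce import MapReduceChain
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import MapReduceDocumentsChain
# load_summarize_chain is used in the cells below, so import it here
from langchain.chains.summarize import load_summarize_chain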

In [27]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=3000, chunk_overlap=200)
chunks = text_splitter.split_documents(docs)

In [28]:
len(chunks)

19
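To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-width splitter. It is not the algorithm used by `RecursiveCharacterTextSplitter`, which also tries to break on separators such as paragraphs and sentences; `split_with_overlap` is a hypothetical helper written only for this illustration.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    pieces = []
    for start in range(0, len(text), step):
        pieces.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return pieces


toy_pieces = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1)
print(toy_pieces)  # ['abcd', 'defg', 'ghij']
```

Each piece repeats the last character of the previous one, so content that falls on a boundary appears in both neighbors.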

In [29]:
map_reduce_chain = load_summarize_chain(llm, chain_type="map_reduce")

In [30]:
%%time
result = map_reduce_chain.run(input_documents=chunks, return_only_outputs=True)

Retrying langchain.llms.cohere.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised CohereAPIError: You are using a Trial key, which is limited to 5 API calls / minute. You can continue to use the Trial key for free or upgrade to a Production key with higher rate limits at 'https://dashboard.cohere.ai/api-keys'. Contact us on 'https://discord.gg/XW44jPfYJu' or email us at support@cohere.com with any questions.
Retrying langchain.llms.cohere.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised C

CPU times: user 1.57 s, sys: 52.1 ms, total: 1.63 s
Wall time: 3min 38s
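The wall time is dominated by the Trial key limit of 5 API calls per minute reported in the retry messages. A back-of-the-envelope lower bound can be computed as below. The fixed one-minute window model is an assumption (Cohere's actual enforcement may differ), and model latency and retry back-off are ignored.

```python
import math


def min_wall_time_seconds(n_calls: int, calls_per_minute: int) -> int:
    """Lower bound on elapsed time, assuming a fixed one-minute window
    in which up to `calls_per_minute` calls may start immediately."""
    if n_calls <= 0:
        return 0
    full_windows_before_last = math.ceil(n_calls / calls_per_minute) - 1
    return full_windows_before_last * 60


# 19 map calls (one per chunk) plus at least one reduce call.
print(min_wall_time_seconds(19 + 1, calls_per_minute=5))  # 180
```

Under these assumptions the run cannot finish in less than about three minutes, which is consistent with the measured 3min 38s.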


In [31]:
print(textwrap.fill(result))

 LLM-powered autonomous agents have the potential to perform complex
tasks by decomposing them into smaller, manageable steps. Techniques
such as CoT (Chain of Thought) and ToT (Tree of Thoughts) can be used
to break down tasks, with or without human input. Self-reflection is
important for improving the agent's performance and can be achieved
through techniques like ReAct (Reasoning and Acting). LLM+P (LLM plus
external classical planner) is another approach that outsources the
planning step to an external tool. The paper "Learning to Learn with
Chain of Hindsight and Algorithm Distillation" proposes two techniques
to improve the performance of deep learning models. The Chain of
Hindsight (CoH) technique encourages the model to improve its own
outputs by presenting it with a sequence of past outputs, each
annotated with feedback. The Algorithm Distillation (AD) technique
applies the same idea to cross-episode trajectories in reinforcement
learning tasks, where an algorithm is encapsula

In [32]:
map_reduce_chain.__fields__.keys()

dict_keys(['memory', 'callbacks', 'callback_manager', 'verbose', 'tags', 'metadata', 'input_key', 'output_key', 'llm_chain', 'reduce_documents_chain', 'document_variable_name', 'return_intermediate_steps'])

In [33]:
print(map_reduce_chain.llm_chain.prompt.template)

Write a concise summary of the following:


"{text}"


CONCISE SUMMARY:


In [34]:
print(map_reduce_chain.reduce_documents_chain.combine_documents_chain.llm_chain.prompt.template)

Write a concise summary of the following:


"{text}"


CONCISE SUMMARY:
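Both the map step and the combine step use the same default template, with `{text}` as the only placeholder. The sketch below fills that placeholder with plain `str.format`, used here as a stand-in for LangChain's own prompt formatting. The template string is copied from the printed output; `chunk_text` is a made-up example.

```python
SUMMARY_TEMPLATE = (
    'Write a concise summary of the following:\n\n\n"{text}"\n\n\nCONCISE SUMMARY:'
)

# Made-up chunk text for illustration.
chunk_text = "LLM-powered agents break large tasks into smaller subgoals."

prompt = SUMMARY_TEMPLATE.format(text=chunk_text)
print(prompt)
```

In the map step `{text}` holds one chunk; in the combine step it holds the concatenated chunk summaries.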


## Refine

In [35]:
from langchain.chains import RefineDocumentsChain

In [36]:
refine_chain = load_summarize_chain(llm, chain_type="refine")

In [37]:
%%time
result = refine_chain.run(input_documents=chunks, return_only_outputs=True)

Retrying langchain.llms.cohere.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised CohereAPIError: You are using a Trial key, which is limited to 5 API calls / minute. You can continue to use the Trial key for free or upgrade to a Production key with higher rate limits at 'https://dashboard.cohere.ai/api-keys'. Contact us on 'https://discord.gg/XW44jPfYJu' or email us at support@cohere.com with any questions.
Retrying langchain.llms.cohere.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised C

CPU times: user 1.52 s, sys: 21.7 ms, total: 1.54 s
Wall time: 3min 43s


In [38]:
print(textwrap.fill(result))

 The original summary is still accurate and does not need to be
refined based on the new context. Lil'Log is an LLM-powered autonomous
agent system that uses LLM as its core controller. It consists of
three key components: planning, memory, and tool use. The agent breaks
down large tasks into smaller subgoals, reflects on past actions, and
uses memory to retain and recall information. It also learns to call
external APIs for extra information. This allows the agent to perform
complex tasks efficiently and effectively. The planning component of
Lil'Log involves task decomposition, which breaks down large tasks
into smaller subgoals. This can be done by LLM with simple prompting,
by using task-specific instructions, or with human inputs. Another
approach, LLM+P, involves relying on an external classical planner to
do long-horizon planning. The memory component of Lil'Log allows the
agent to retain and recall information, which is important for self-
reflection and improving past action d

In [39]:
refine_chain.__fields__.keys()

dict_keys(['memory', 'callbacks', 'callback_manager', 'verbose', 'tags', 'metadata', 'input_key', 'output_key', 'initial_llm_chain', 'refine_llm_chain', 'document_variable_name', 'initial_response_name', 'document_prompt', 'return_intermediate_steps'])

In [40]:
print(refine_chain.initial_llm_chain.prompt.template)

Write a concise summary of the following:


"{text}"


CONCISE SUMMARY:


In [41]:
print(refine_chain.refine_llm_chain.prompt.template)

Your job is to produce a final summary
We have provided an existing summary up to a certain point: {existing_answer}
We have the opportunity to refine the existing summary(only if needed) with some more context below.
------------
{text}
------------
Given the new context, refine the original summary
If the context isn't useful, return the original summary.
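The two prompts suggest a sequential loop: the first chunk produces an initial summary, and each later chunk is used to refine the running summary. A minimal sketch of that loop follows. `REFINE_TEMPLATE` is an abbreviated version of the prompt printed above, and `recording_llm` is a hypothetical stand-in for the model that only records the prompts it receives.

```python
from typing import Callable, List

INITIAL_TEMPLATE = (
    'Write a concise summary of the following:\n\n\n"{text}"\n\n\nCONCISE SUMMARY:'
)
# Abbreviated version of the refine prompt, for illustration only.
REFINE_TEMPLATE = (
    "We have provided an existing summary up to a certain point: {existing_answer}\n"
    "------------\n{text}\n------------\n"
    "Given the new context, refine the original summary"
)


def refine_summarize(chunks: List[str], llm: Callable[[str], str]) -> str:
    if not chunks:
        raise ValueError("need at least one chunk")
    # First chunk: produce the initial summary.
    answer = llm(INITIAL_TEMPLATE.format(text=chunks[0]))
    # Remaining chunks: refine the running summary one chunk at a time.
    for chunk in chunks[1:]:
        answer = llm(REFINE_TEMPLATE.format(existing_answer=answer, text=chunk))
    return answer


prompts_seen: List[str] = []


def recording_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model: records the prompt, returns a label."""
    prompts_seen.append(prompt)
    return f"summary-{len(prompts_seen)}"


final = refine_summarize(["part one", "part two", "part three"], recording_llm)
print(final)              # summary-3
print(len(prompts_seen))  # 3
```

Since each call's output becomes the next `{existing_answer}`, any meta-commentary the model emits can carry forward into the final summary. That may be why the refine output above opens with "The original summary is still accurate".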


Regardless of the model, the refine summaries for this example do not seem as good as the map-reduce summaries.