In [1]:
import getpass

openai_api_key = getpass.getpass('OpenAI API Key:')

In [15]:
from langchain_openai import OpenAI

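# Pass the key collected above; otherwise the client falls back to the OPENAI_API_KEY environment variable.
llm = OpenAI(temperature=0, openai_api_key=openai_api_key)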

In [16]:
llm.invoke('hello, this is a test')

'\n\n\nHello! Thank you for letting me know. Is there anything else you would like to test or discuss?'

In [17]:
from IPython.display import Markdown

Markdown(llm.invoke("hello, this is a test"))




Hello! Thank you for letting me know. Is there anything else you would like to test or discuss?

In [18]:
Markdown(llm.invoke("What is zero-shot chain-of-thought prompting?"))



Zero-shot chain-of-thought prompting is a technique used in natural language processing (NLP) to generate coherent and relevant responses to a given prompt without any prior training or specific instructions. It involves using a pre-trained language model to generate a sequence of words or phrases that are related to the prompt, without being explicitly trained on the specific task or topic. This allows the model to generate responses that are not limited by a specific set of prompts or topics, making it more versatile and adaptable to different scenarios.

In [20]:
import arxiv

In [24]:
arxiv_client = arxiv.Client()
paper = next(arxiv_client.results(arxiv.Search(id_list=["2205.11916"])))

Markdown(paper.summary)

Pretrained large language models (LLMs) are widely used in many sub-fields of
natural language processing (NLP) and generally known as excellent few-shot
learners with task-specific exemplars. Notably, chain of thought (CoT)
prompting, a recent technique for eliciting complex multi-step reasoning
through step-by-step answer examples, achieved the state-of-the-art
performances in arithmetics and symbolic reasoning, difficult system-2 tasks
that do not follow the standard scaling laws for LLMs. While these successes
are often attributed to LLMs' ability for few-shot learning, we show that LLMs
are decent zero-shot reasoners by simply adding "Let's think step by step"
before each answer. Experimental results demonstrate that our Zero-shot-CoT,
using the same single prompt template, significantly outperforms zero-shot LLM
performances on diverse benchmark reasoning tasks including arithmetics
(MultiArith, GSM8K, AQUA-RAT, SVAMP), symbolic reasoning (Last Letter, Coin
Flip), and other logical reasoning tasks (Date Understanding, Tracking Shuffled
Objects), without any hand-crafted few-shot examples, e.g. increasing the
accuracy on MultiArith from 17.7% to 78.7% and GSM8K from 10.4% to 40.7% with
large InstructGPT model (text-davinci-002), as well as similar magnitudes of
improvements with another off-the-shelf large model, 540B parameter PaLM. The
versatility of this single prompt across very diverse reasoning tasks hints at
untapped and understudied fundamental zero-shot capabilities of LLMs,
suggesting high-level, multi-task broad cognitive capabilities may be extracted
by simple prompting. We hope our work not only serves as the minimal strongest
zero-shot baseline for the challenging reasoning benchmarks, but also
highlights the importance of carefully exploring and analyzing the enormous
zero-shot knowledge hidden inside LLMs before crafting finetuning datasets or
few-shot exemplars.

In [26]:
prompt = f"""Here's a summary of a paper:
{paper.summary}

Based on that summary, what is zero-shot chain-of-thought prompting?"""

response = llm.invoke(prompt)
Markdown(response)



Zero-shot chain-of-thought prompting is a technique that utilizes large language models (LLMs) to perform complex multi-step reasoning tasks without any hand-crafted few-shot examples. It involves adding the phrase "Let's think step by step" before each answer in a prompt, which significantly improves the zero-shot performance of LLMs on diverse reasoning tasks such as arithmetics, symbolic reasoning, and logical reasoning. This technique suggests that LLMs have untapped and understudied zero-shot capabilities, and highlights the importance of exploring and analyzing these capabilities before creating finetuning datasets or few-shot exemplars.
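
The abstract above describes the technique as adding "Let's think step by step" before each answer. A minimal sketch of what such a prompt could look like follows. The `Q:`/`A:` layout, the function name `zero_shot_cot_prompt`, and the sample question are assumptions made for illustration; only the trigger phrase comes from the abstract.

```python
def zero_shot_cot_prompt(question: str, trigger: str = "Let's think step by step.") -> str:
    """Format a question so that the model's answer starts with the reasoning trigger.

    The Q:/A: layout is an assumed template, not something the abstract specifies.
    """
    return f"Q: {question.strip()}\nA: {trigger}"


# Hypothetical example question, used only to show the resulting prompt text.
cot_prompt = zero_shot_cot_prompt(
    "A juggler has 16 balls. Half of them are golf balls. How many golf balls are there?"
)
print(cot_prompt)
```

The resulting string could be sent to the model with `llm.invoke(cot_prompt)`; no few-shot examples are included in it.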

In [27]:
import pypdf

In [28]:
paper_path = paper.download_pdf()

In [29]:
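# Document loaders live in langchain_community in releases that also ship langchain_openai.
from langchain_community.document_loaders import PyPDFLoader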

In [30]:
loader = PyPDFLoader(paper_path)
pages = loader.load_and_split()

In [31]:
len(pages)

49

In [32]:
content = "\n\n".join([page.page_content for page in pages[0:2]])

In [33]:
response = llm.invoke(f"""Here are the first two pages of a paper:
{content}

Based on that content, what is zero-shot chain-of-thought prompting?""")

Markdown(response)



Zero-shot chain-of-thought prompting is a technique for eliciting complex multi-step reasoning through step-by-step answer examples, without the need for task-specific few-shot examples. It involves adding the prompt "Let's think step by step" before each answer in order to facilitate step-by-step thinking and improve the performance of large language models on challenging reasoning tasks. This approach is shown to be effective in a variety of tasks, including arithmetic, symbolic, commonsense, and other logical reasoning tasks.
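
Only the first two pages were pasted into the prompt because a model's context window is finite, so the whole 49-page result of `load_and_split()` may not fit. One way to make that limit explicit is to keep adding pages until a size budget is reached. The sketch below assumes a character budget as a rough stand-in for a token limit; `join_within_budget` and the `max_chars` value are hypothetical names and numbers, not part of LangChain.

```python
def join_within_budget(texts, max_chars=8000, sep="\n\n"):
    """Join leading texts, stopping before the total length would exceed max_chars."""
    kept = []
    used = 0
    for text in texts:
        # Account for the separator only when something has already been kept.
        extra = len(text) + (len(sep) if kept else 0)
        if used + extra > max_chars:
            break
        kept.append(text)
        used += extra
    return sep.join(kept)


# Toy usage with short strings standing in for page contents.
print(join_within_budget(["aaaa", "bbbb", "cccc"], max_chars=10))
```

With the notebook's data this could be called as `join_within_budget([page.page_content for page in pages])`.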

In [35]:
from langchain_openai import OpenAIEmbeddings
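# Vector stores and text splitters have moved to their own packages in newer LangChain releases.
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter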

In [37]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000)
docs = text_splitter.split_documents(pages)
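
To show what chunking does to the text, here is a simplified sketch: a fixed-size character splitter with overlap. It is not the algorithm `RecursiveCharacterTextSplitter` uses, which prefers to break on separators such as paragraph and line boundaries; the function name and the `overlap` value are assumptions for illustration.

```python
def split_with_overlap(text, chunk_size=1000, overlap=200):
    """Cut text into chunk_size-character pieces that share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    # Each chunk starts `step` characters after the previous one.
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


# Toy usage with a tiny chunk size so the overlap is visible.
print(split_with_overlap("abcdefghij", chunk_size=4, overlap=2))
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of embedding some text twice.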

In [38]:
# Reuse the key collected at the top of the notebook.
embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
db = FAISS.from_documents(docs, embeddings)

In [39]:
# Use a separate name so the chunk list in `docs` is not overwritten by the search hits.
hits = db.similarity_search("What is zero-shot chain-of-thought prompting?")

In [40]:
Markdown(hits[0].page_content)

Chain of thought prompting Multi-step arithmetic and logical reasoning benchmarks have par-
ticularly challenged the scaling laws of large language models [Rae et al., 2021]. Chain of thought
(CoT) prompting [Wei et al., 2022], an instance of few-shot prompting, proposed a simple solution
by modifying the answers in few-shot examples to step-by-step answers, and achieved signiﬁcant
boosts in performance across these difﬁcult benchmarks, especially when combined with very large
language models like PaLM [Chowdhery et al., 2022]. The top row of Figure 1 shows standard
few-shot prompting against (few-shot) CoT prompting. Notably, few-shot learning was taken as a
given for tackling such difﬁcult tasks, and the zero-shot baseline performances were not even reported
in the original work [Wei et al., 2022]. To differentiate it from our method, we call Wei et al. [2022]
asFew-shot-CoT in this work.
3 Zero-shot Chain of Thought
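
The search above embeds the query and returns the chunks whose vectors lie closest to it. The sketch below illustrates that idea with plain Python and Euclidean distance. The two-dimensional vectors are made up for the example (real embeddings have far more dimensions), and `top_k` is a hypothetical helper, not a FAISS or LangChain function.

```python
import math


def top_k(query_vec, doc_vecs, k=4):
    """Return the indices of the k vectors closest to query_vec (Euclidean distance)."""
    distances = [(math.dist(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    # Sorting the (distance, index) pairs puts the nearest vectors first.
    return [idx for _, idx in sorted(distances)[:k]]


# Toy vectors standing in for chunk embeddings.
toy_doc_vecs = [[0.0, 1.0], [0.9, 0.1], [1.0, 0.0]]
print(top_k([1.0, 0.0], toy_doc_vecs, k=2))
```

A brute-force scan like this is linear in the number of chunks; a library such as FAISS exists to make the same lookup fast on large collections.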

In [41]:
from langchain.chains.qa_with_sources import load_qa_with_sources_chain

In [43]:
chain = load_qa_with_sources_chain(llm, chain_type='stuff')
query = "What is zero-shot chain-of-thought prompting?"

sources = db.similarity_search(query)
results = chain.invoke({'input_documents': sources, 'question': query}, return_only_outputs=True)

In [44]:
Markdown(results['output_text'])

 Zero-shot chain-of-thought prompting is a method for eliciting multi-step reasoning from large language models without requiring step-by-step few-shot examples. It differs from prior prompting methods and has potential biases due to the training data used for large language models. 
SOURCES: ./2205.11916v4.Large_Language_Models_are_Zero_Shot_Reasoners.pdf
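
With `chain_type='stuff'`, all retrieved chunks are placed into a single prompt together with the question. The sketch below shows that idea in plain Python. The prompt wording and the `build_stuff_prompt` helper are assumptions for illustration and do not reproduce LangChain's actual template.

```python
def build_stuff_prompt(question, documents):
    """Build one prompt from a question and a list of (content, source) pairs."""
    parts = [f"Content: {content}\nSource: {source}" for content, source in documents]
    context = "\n\n".join(parts)
    return (
        "Answer the question using only the extracts below, "
        "and list the sources you used after 'SOURCES:'.\n\n"
        f"{context}\n\n"
        f"QUESTION: {question}\n"
        "FINAL ANSWER:"
    )


# Toy documents standing in for the chunks returned by the similarity search.
toy_documents = [
    ("Chain of thought prompting uses step-by-step answers.", "paper.pdf"),
    ("Zero-shot-CoT adds a single trigger phrase.", "paper.pdf"),
]
print(build_stuff_prompt("What is zero-shot chain-of-thought prompting?", toy_documents))
```

Because every chunk goes into one prompt, this approach is simple but only works while the combined text fits in the model's context window.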