ReAct [@yao2022react] is a prompting method in which a Large Language Model (LLM) generates reasoning traces and task-specific actions in an interleaved manner. Reasoning helps the model create, track, and adjust action plans and handle exceptions. Actions let the model query knowledge bases or external environments for additional information that supports its reasoning.
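
The interleaving of reasoning and actions can be sketched as a small loop. This is a minimal illustration, not the implementation from the paper or from LangChain: `run_react`, `make_scripted_llm`, and the toy tool are hypothetical names, and the "LLM" simply replays canned steps so the example runs without a model.

```python
import re
from typing import Callable, Dict, List

# Matches action lines such as "Action: Search[some query]".
ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*)\]")


def run_react(llm: Callable[[str], str],
              tools: Dict[str, Callable[[str], str]],
              question: str,
              max_steps: int = 5) -> str:
    """Alternate between model output (thought + action) and tool observations."""
    prompt = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(prompt)  # e.g. "Thought: ...\nAction: Search[x]"
        prompt += step + "\n"
        match = ACTION_RE.search(step)
        if match is None:
            raise ValueError(f"No action found in model output: {step!r}")
        name, arg = match.group(1), match.group(2)
        if name == "Finish":  # the model signals that it has an answer
            return arg
        observation = tools[name](arg)  # act on the external environment
        prompt += f"Observation: {observation}\n"
    raise RuntimeError("No final answer within max_steps")


def make_scripted_llm(steps: List[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM: replays canned steps in order."""
    remaining = iter(steps)
    return lambda prompt: next(remaining)


toy_llm = make_scripted_llm([
    "Thought: I need to look up the capital of France.\nAction: Search[France]",
    "Thought: The observation names the capital.\nAction: Finish[Paris]",
])
toy_tools = {"Search": lambda query: "France is a country whose capital is Paris."}

answer = run_react(toy_llm, toy_tools, "What is the capital of France?")
print(answer)
```

Each observation is appended to the prompt, so the next thought can build on what the previous action returned.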

This notebook tries out ReAct with Llama 2, using a LangChain agent.

In [1]:
#| echo: False

import warnings
warnings.filterwarnings('ignore')

## Load Llama 2 13B-chat

In [2]:
#| code-fold: true
#| code-summary: Configure HuggingFace token (click to toggle the content)

from getpass import getpass

hf_token = getpass()


In [4]:
#| code-fold: show
#| code-summary: Load the model (click to toggle the content)

import torch
from transformers import pipeline, GenerationConfig


model_id = 'meta-llama/Llama-2-13b-chat-hf'

gen_config = GenerationConfig.from_pretrained(
    model_id,
    token=hf_token
)
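gen_config.max_new_tokens = 4096
# Greedy decoding is the deterministic equivalent of temperature 0;
# transformers requires a strictly positive temperature when sampling is enabled.
gen_config.do_sample = False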

pipe = pipeline(
    task="text-generation",
    model=model_id,
    return_full_text=True,
    generation_config=gen_config,
    device_map='auto',
    model_kwargs={
        'load_in_4bit': True,
        'bnb_4bit_quant_type': 'nf4',
        'bnb_4bit_use_double_quant': True,
        'bnb_4bit_compute_dtype': torch.bfloat16,
    },
    token=hf_token,  # `use_auth_token` is deprecated in favor of `token`
    repetition_penalty=1.1
)
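
As a rough sanity check on why the model is loaded in 4-bit, the weight memory can be estimated with simple arithmetic. This is a back-of-the-envelope sketch under stated assumptions: `weight_memory_gib` is a hypothetical helper, 13e9 is the nominal parameter count, and the estimate ignores quantization constants, layers kept in higher precision, activations, and the KV cache.

```python
def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


N_PARAMS = 13e9  # nominal parameter count of Llama 2 13B (assumption)

fp16_gib = weight_memory_gib(N_PARAMS, 16)  # half-precision weights
nf4_gib = weight_memory_gib(N_PARAMS, 4)    # 4-bit quantized weights

print(f"16-bit: {fp16_gib:.1f} GiB, 4-bit: {nf4_gib:.1f} GiB")
```

Under these assumptions the 4-bit weights need about a quarter of the 16-bit footprint; actual usage will be somewhat higher.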


## Set up LangChain ReAct doc store

In [5]:
#| code-fold: show
#| code-summary: Initialize an agent (click to toggle the content)

from langchain import Wikipedia
from langchain.llms import HuggingFacePipeline
from langchain.agents import initialize_agent, Tool, AgentType
from langchain.agents.react.base import DocstoreExplorer


docstore = DocstoreExplorer(Wikipedia())

tools = [
    Tool(
        name="Search",
        func=docstore.search,
        description="useful for when you need to ask with search",
    ),
    Tool(
        name="Lookup",
        func=docstore.lookup,
        description="useful for when you need to ask with lookup",
    )
]

llm = HuggingFacePipeline(pipeline=pipe)

react_agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.REACT_DOCSTORE,
    verbose=True)
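
To make the division of labor between the two tools concrete, here is a simplified in-memory imitation of a search-then-lookup explorer. It is a sketch, not LangChain's `DocstoreExplorer`: the class `ToyDocstoreExplorer`, its result strings, and the sample page are assumptions made for illustration. `search` selects a page and returns its first paragraph, while `lookup` steps through the paragraphs of that page that mention a term.

```python
from typing import Dict, List, Optional


class ToyDocstoreExplorer:
    """Simplified imitation of a search-then-lookup document explorer."""

    def __init__(self, pages: Dict[str, str]) -> None:
        self.pages = pages
        self.current: Optional[str] = None  # text of the page selected by search
        self.term: Optional[str] = None     # term of the lookup in progress
        self.hits: List[str] = []
        self.index = 0

    def search(self, title: str) -> str:
        """Select a page by exact title and return its first paragraph."""
        page = self.pages.get(title)
        self.current = page
        self.term = None  # a new search invalidates earlier lookups
        if page is None:
            return f"Could not find [{title}]."
        return page.split("\n\n")[0]

    def lookup(self, term: str) -> str:
        """Return the next paragraph of the selected page that mentions term."""
        if self.current is None:
            raise ValueError("Call search successfully before lookup.")
        if term.lower() != self.term:
            # New term: collect matching paragraphs and start from the first.
            self.term = term.lower()
            self.hits = [p for p in self.current.split("\n\n")
                         if self.term in p.lower()]
            self.index = 0
        if not self.hits:
            return "No results."
        if self.index >= len(self.hits):
            return "No more results."
        result = f"(Result {self.index + 1}/{len(self.hits)}) {self.hits[self.index]}"
        self.index += 1
        return result


pages = {
    "The Terminator": (
        "The Terminator is a 1984 science fiction film directed by James Cameron.\n\n"
        "The film stars Arnold Schwarzenegger.\n\n"
        "Cameron devised the premise from a fever dream."
    )
}
explorer = ToyDocstoreExplorer(pages)
summary = explorer.search("The Terminator")
first = explorer.lookup("Cameron")
second = explorer.lookup("Cameron")
third = explorer.lookup("Cameron")
print(summary, first, second, third, sep="\n")
```

Because `lookup` depends on the page selected by the last `search`, an agent is expected to call `Search` first and use `Lookup` only to drill into that page.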

## Ask some questions

In [7]:
react_agent.run('Who are the directors of the movies that caused the Barbenheimer phenomenon?')



> Entering new AgentExecutor chain...
Thought: I need to search Barbenheimer phenomenon, find the movies that caused it, and find the directors of those movies.
Action: Search[Barbenheimer phenomenon]
Observation: Barbenheimer ( BARB-ən-HIGH-mər) is an Internet phenomenon that began circulating on social media before the simultaneous theatrical release of two blockbuster films, Barbie and Oppenheimer, on July 21, 2023, in the United States and several other countries. The word is a portmanteau of the films' titles. The dichotomy of Barbie—a fantasy comedy by Greta Gerwig about the fashion doll Barbie—and Oppenheimer—an epic biographical thriller by Christopher Nolan about physicist J. Robert Oppenheimer, scientific director of the Manhattan Project, which developed the first nuclear weapons during World War II—prompted a comedic response from Internet users, including memes and merchandise. Polygon described the two films as "extreme opposites", 

'Greta Gerwig and Christopher Nolan'

In [8]:
react_agent.run('Who are the authors of the paper that proposed the original transformer model?')



> Entering new AgentExecutor chain...
Thought: I need to search the original transformer model, find the authors of the paper that proposed it.
Action: Search[original transformer model]
Observation: A Transformer is a deep learning architecture that relies on the parallel multi-head attention mechanism. The original Transformer was proposed in the 2017 paper titled 'Attention Is All You Need' by Ashish Vaswani et al, Google Brain team. It is notable for requiring less training time compared to previous recurrent neural architectures, such as long short-term memory (LSTM), and its later variation has been prevalently adopted for training large language models on large (language) datasets, such as the Wikipedia Corpus and Common Crawl, by virtue of the parallelized processing of input sequence.
Input text is parsed into tokens by a byte pair encoding tokenizer, and each token is converted into a vector via looking up from a word embedding table. A

'Ashish Vaswani et al'

In [9]:
react_agent.run("What is the movie Terminator about? Use your findings as your answer.")



> Entering new AgentExecutor chain...
Thought: I need to search Terminator, find what it is about, and provide my answer.
Action: Search[Terminator]
Observation: The Terminator is a 1984 American science fiction action film directed by James Cameron. It stars Arnold Schwarzenegger as the Terminator, a cyborg assassin sent back in time from 2029 to 1984 to kill Sarah Connor (Linda Hamilton), whose unborn son will one day save mankind from extinction by Skynet, a hostile artificial intelligence in a post-apocalyptic future. Kyle Reese (Michael Biehn) is a soldier sent back in time to protect Sarah. The screenplay is credited to Cameron and producer Gale Anne Hurd, while co-writer William Wisher Jr. received an "additional dialogue" credit.
Cameron devised the premise of the film from a fever dream he experienced during the release of his first film, Piranha II: The Spawning (1982), in Rome, and developed the concept in collaboration with Wisher. He

'The Terminator is a 1984 American science fiction action film directed by James Cameron, starring Arnold Schwarzenegger as the Terminator, a cyborg assassin sent back in time to kill Sarah Connor, whose unborn son will one day save mankind from extinction by Skynet, a hostile artificial intelligence in a post-apocalyptic future.'