<a href="https://colab.research.google.com/github/JapiKredi/Forward_Looking_Active_Retrieval_augmented_generation/blob/main/Forward_Looking_Active_Retrieval_augmented_generation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Retrieve as you generate with FLARE

This notebook is an implementation of Forward-Looking Active REtrieval augmented generation (FLARE).

Please see the original repo here.

The basic idea is:

- Start answering a question
- If you start generating tokens the model is uncertain about, look up relevant documents
- Use those documents to continue generating
- Repeat until finished

There is a lot of cool detail in how the lookup of relevant documents is done. Basically, the tokens that the model is uncertain about are highlighted, and then an LLM is called to generate a question that would lead to that answer. For example, if the generated text is "Joe Biden went to Harvard" and the token the model was uncertain about was "Harvard", then a good generated question would be "Where did Joe Biden go to college?". This generated question is then used in a retrieval step to fetch relevant documents.
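The loop described above can be sketched in plain Python. This is only a toy illustration, not the library's implementation: `flare_loop`, `fake_generate`, `fake_make_query`, and `fake_retrieve` are hypothetical names, and the "model" is a hard-coded stub that is uncertain about one token until it is given context.

```python
from typing import Callable, List, Tuple

# (token, probability) pairs, standing in for what an LLM with logprobs returns.
TokenProbs = List[Tuple[str, float]]


def flare_loop(
    question: str,
    generate: Callable[[str, str, str], TokenProbs],
    make_query: Callable[[str, str, List[str]], str],
    retrieve: Callable[[str], str],
    min_prob: float = 0.3,
    max_rounds: int = 5,
) -> str:
    """Toy version of the retrieve-as-you-generate loop (illustrative only)."""
    response = ""
    for _ in range(max_rounds):
        # 1. Tentatively generate the next chunk without any retrieved context.
        chunk = generate(question, response, "")
        uncertain = [tok for tok, prob in chunk if prob < min_prob]
        if uncertain:
            # 2. Turn the uncertain tokens into a question and look up documents.
            draft = "".join(tok for tok, _ in chunk)
            context = retrieve(make_query(question, draft, uncertain))
            # 3. Regenerate the same chunk, this time grounded in the context.
            chunk = generate(question, response, context)
        text = "".join(tok for tok, _ in chunk)
        # 4. Stop once the model signals that it is done.
        if "FINISHED" in text:
            response += text.replace("FINISHED", "")
            break
        response += text
    return response.strip()


def fake_generate(question: str, response_so_far: str, context: str) -> TokenProbs:
    # Hard-coded stub: guesses with low confidence unless context is supplied.
    if response_so_far:
        return [("FINISHED", 0.99)]
    school, prob = (" the University of Delaware", 0.95) if context else (" Harvard", 0.10)
    return [("Joe", 0.99), (" Biden", 0.99), (" went", 0.9), (" to", 0.9), (school, prob), (".", 0.9)]


def fake_make_query(question: str, draft: str, uncertain: List[str]) -> str:
    # A real system would ask an LLM to write this question.
    return "where did Joe Biden go to college"


def fake_retrieve(query: str) -> str:
    return "Joe Biden attended the University of Delaware."


answer = flare_loop("Where did Joe Biden study?", fake_generate, fake_make_query, fake_retrieve)
print(answer)
```

Passing the three steps in as functions keeps the control flow visible and makes it easy to swap the stubs for real components later.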

In order to set up this chain, we will need three things:

- An LLM to generate the answer
- An LLM to generate hypothetical questions to use in retrieval
- A retriever to look up relevant documents

The LLM that we use to generate the answer needs to return logprobs so we can identify uncertain tokens. For that reason, we HIGHLY recommend that you use the OpenAI wrapper (NB: not the ChatOpenAI wrapper, as that does not return logprobs).

The LLM we use to generate hypothetical questions to use in retrieval can be anything. In this notebook we will use ChatOpenAI because it is fast and cheap.

The retriever can be anything. In this notebook we will use the Serper search API, because it is cheap.

Other important parameters to understand:

- `max_generation_len`: The maximum number of tokens to generate before stopping to check if any are uncertain
- `min_prob`: Any tokens generated with probability below this will be considered uncertain
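To make the `min_prob` check concrete, here is a minimal sketch of how log probabilities could be turned into uncertain spans. The function name `uncertain_spans`, the sample tokens, and the sample logprobs are made up for illustration. Merging adjacent low-probability tokens into one span is an assumption of this sketch and not necessarily how `FlareChain` groups them.

```python
import math
from typing import List


def uncertain_spans(
    tokens: List[str], logprobs: List[float], min_prob: float = 0.3
) -> List[str]:
    """Return runs of adjacent tokens whose probability is below min_prob."""
    spans: List[str] = []
    current: List[str] = []
    for token, logprob in zip(tokens, logprobs):
        # Models report natural-log probabilities; exp() recovers the probability.
        prob = math.exp(logprob)
        if prob < min_prob:
            current.append(token)
        elif current:
            # A confident token closes the current uncertain run.
            spans.append("".join(current).strip())
            current = []
    if current:
        spans.append("".join(current).strip())
    return spans


# Illustrative values only: the last two tokens have probabilities of about 0.10 and 0.15.
tokens = ["Joe", " Biden", " went", " to", " Har", "vard"]
logprobs = [-0.01, -0.02, -0.20, -0.05, -2.30, -1.90]
print(uncertain_spans(tokens, logprobs))
```

Lowering `min_prob` flags fewer tokens and so triggers fewer retrievals. Raising it makes the chain look things up more often.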


### Imports

In [26]:
from google.colab import userdata
OPENAI_API_KEY = userdata.get('openai_key')

In [27]:
from google.colab import userdata
SERPER_API_KEY = userdata.get('SERPER_API_KEY')

In [28]:
import os

os.environ["SERPER_API_KEY"] = SERPER_API_KEY
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

In [4]:
!pip install langchain

In [6]:
!pip install langchain_openai

In [29]:
from typing import Any, List

from langchain.callbacks.manager import (
    AsyncCallbackManagerForRetrieverRun,
    CallbackManagerForRetrieverRun,
)
from langchain_community.utilities import GoogleSerperAPIWrapper
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever
from langchain_openai import ChatOpenAI, OpenAI

### Retriever

In [30]:
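class SerperSearchRetriever(BaseRetriever):
    """Retriever that wraps a Serper web search and returns the result as one document."""

    search: GoogleSerperAPIWrapper = None

    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun, **kwargs: Any
    ) -> List[Document]:
        # search.run() returns the search results as a single string
        return [Document(page_content=self.search.run(query))]

    async def _aget_relevant_documents(
        self,
        query: str,
        *,
        run_manager: AsyncCallbackManagerForRetrieverRun,
        **kwargs: Any,
    ) -> List[Document]:
        # Async retrieval is not used in this notebook
        raise NotImplementedError()


# The key is passed explicitly; it was also exported to the environment above
retriever = SerperSearchRetriever(search=GoogleSerperAPIWrapper(serper_api_key=SERPER_API_KEY))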
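Since the retriever can be anything, it can help to experiment with the retrieval step without an API key. The sketch below is a hypothetical offline stand-in that ranks a few in-memory texts by word overlap with the query. `Doc` and `KeywordRetriever` are illustrative names, not LangChain classes. To pass something like this to `FlareChain`, the same logic would presumably need to live in a `BaseRetriever` subclass like the one above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    # Hypothetical stand-in for langchain_core.documents.Document
    page_content: str


class KeywordRetriever:
    """Hypothetical offline retriever: ranks texts by word overlap with the query."""

    def __init__(self, texts: List[str], k: int = 1) -> None:
        self.texts = texts
        self.k = k

    def get_relevant_documents(self, query: str) -> List[Doc]:
        query_words = set(query.lower().split())
        # Score each text by how many query words it shares.
        scored = [(len(query_words & set(t.lower().split())), t) for t in self.texts]
        # sorted() is stable, so texts with equal scores keep their original order.
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        # Keep the k best texts, dropping any that share no words with the query.
        return [Doc(page_content=t) for score, t in ranked[: self.k] if score > 0]


offline = KeywordRetriever(
    [
        "LangChain is a framework for building applications with language models.",
        "Bitcoin is a decentralized digital currency.",
    ]
)
print(offline.get_relevant_documents("what is bitcoin")[0].page_content)
```

Word overlap is a crude relevance measure, but it is deterministic, which makes it easier to see how the retrieved context changes the generated answer.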

### FLARE Chain

In [31]:
# We set this so we can see what exactly is going on
from langchain.globals import set_verbose

set_verbose(True)

In [32]:
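from langchain.chains import FlareChain

flare = FlareChain.from_llm(
    ChatOpenAI(temperature=0),
    retriever=retriever,
    max_generation_len=164,  # tokens to generate before each uncertainty check
    min_prob=0.3,  # tokens below this probability count as uncertain
)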

In [33]:
query = "explain in great detail the difference between the langchain framework and baby agi"

In [34]:
flare.run(query)

> Entering new FlareChain chain...
Current Response: 
Prompt after formatting:
Respond to the user message using any relevant context. If context is provided, you should ground your answer in that context. Once you're done responding return FINISHED.

>>> CONTEXT: 
>>> USER INPUT: explain in great detail the difference between the langchain framework and baby agi
>>> RESPONSE: 

> Entering new QuestionGeneratorChain chain...
Prompt after formatting:
Given a user input and an existing partial response as context, ask a question to which the answer is the given term/entity/phrase:

>>> USER INPUT: explain in great detail the difference between the langchain framework and baby agi
>>> EXISTING PARTIAL RESPONSE:  
The Langchain framework and Baby AGI are two different approaches to artificial general intelligence (AGI). While both aim to create intelligent machines that can perform a wide range of tasks, they differ in their

'The Langchain framework and Baby AGI are two distinct but related concepts in the field of artificial intelligence. While both aim to enhance the capabilities of AI systems, they have different approaches and goals.\n\nThe Langchain framework is a general-purpose AI framework that can be used for various applications. It is designed to simplify the process of creating generative AI application interfaces by providing a series of automated chains that hold various AI components together. These chains are designed to facilitate a seamless connection between various linguistic elements, enabling AI systems to comprehend and process language in a more context-aware manner. The framework is open-source and supported by an active community, allowing for continuous improvement and development.\n\nOn the other hand, Baby AGI is a more advanced AI system that can be used to create more complex applications. It draws inspiration from the cognitive development of infants and aims to create an AI

In [35]:
# Baseline for comparison: the same query sent to a plain LLM, without retrieval
llm = OpenAI()
llm.invoke(query)

'\n\nThe Langchain Framework and Baby AGI are two different approaches to creating artificial general intelligence (AGI). While both aim to create an intelligent system that can perform a wide range of tasks and adapt to new situations, they differ in their underlying principles and methods.\n\n1. Definition\n\nThe Langchain Framework, also known as the Language Chain, is a theoretical framework proposed by computer scientist Ben Goertzel for creating AGI. It is based on the idea that language is the key to intelligence and that a system that can understand and use language at a human level can also exhibit general intelligence.\n\nBaby AGI, on the other hand, is a specific AGI project developed by OpenCog Foundation, led by Ben Goertzel. It aims to create an AGI system that learns and develops in a similar way to a human infant, starting with a minimal set of cognitive abilities and gradually building upon them through learning and experience.\n\n2. Approach\n\nThe Langchain Framework

In [36]:
flare.run("how are the origin stories of langchain and bitcoin similar or different?")

> Entering new FlareChain chain...
Current Response: 
Prompt after formatting:
Respond to the user message using any relevant context. If context is provided, you should ground your answer in that context. Once you're done responding return FINISHED.

>>> CONTEXT: 
>>> USER INPUT: how are the origin stories of langchain and bitcoin similar or different?
>>> RESPONSE: 

> Entering new QuestionGeneratorChain chain...
Prompt after formatting:
Given a user input and an existing partial response as context, ask a question to which the answer is the given term/entity/phrase:

>>> USER INPUT: how are the origin stories of langchain and bitcoin similar or different?
>>> EXISTING PARTIAL RESPONSE:  
The origin stories of Langchain and Bitcoin are quite different. Langchain was created as a decentralized platform for language translation, while Bitcoin was created as a decentralized digital currency. However, both were created wit

'The origin stories of Langchain and Bitcoin are quite different. While Bitcoin was created in 2009 by an unknown person or group, Langchain was created in 2019 as a platform for developers to easily create AI applications. However, both have a common goal of revolutionizing the way we conduct transactions and store value. '