## Introduction

Large Language Models (LLMs) have outstanding capabilities, but they also have limitations that can cause problems when deployed in production. One is hallucination: a model confidently answers certain questions incorrectly. Several factors contribute to this problem, one of which is that the training data has a cutoff date. As a result, these models have no knowledge of events after that date.

A workaround is to present the relevant information to the model and leverage its reasoning abilities to find or extract the answer. For example, the top-matched results returned by a search engine can be supplied as the context for a user's query.

This article explores using top-ranked articles from the Internet as the context for a chatbot so that it can give the correct answer. To extract articles from search results, we will use LangChain's integration with the Google Search API together with the Newspaper library. We then select the most relevant passages and use them in the prompt.

The same pipeline can be used with the Bing API, but we'll use the Google Search API in this project because it is used in previous lessons of this course, which avoids creating several keys for the same functionality. Please see the tutorial below (or the Bing Web Search API for direct access) to obtain a Bing subscription key and use it with LangChain.

<img src="https://github.com/pranath/blog/raw/master/images/activeloop-bing-chatbot.png" width="800"/>

Using a search engine (e.g., Bing or Google Search), the user query is used to retrieve relevant articles, which are then split into chunks. The embedding of each chunk is computed and ranked by cosine similarity to the query's embedding, and the most relevant chunks are inserted into a prompt to generate the final answer while keeping track of the sources.

In [None]:
#| include: false
!pip install -q langchain==0.0.208 openai tiktoken newspaper3k


## Ask Trending Questions

Let's begin with an example. The following code should be familiar by now. It creates an assistant that answers queries using an OpenAI model through LangChain's OpenAI wrapper. We'll ask the model to name the most recent Fast & Furious movie. Because that film was released after the model's training cutoff, the model could not have seen the answer during training.

In [None]:
from langchain import LLMChain, PromptTemplate
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)

template = """You are an assistant that answer the following question correctly and honestly: {question}\n\n"""
prompt_template = PromptTemplate(input_variables=["question"], template=template)

question_chain = LLMChain(llm=llm, prompt=prompt_template)

question_chain.run("what is the latest fast and furious movie?")

'\nThe latest Fast and Furious movie is Fast & Furious 9, which is set to be released in May 2021.'

The response shows that the model gives the previous movie's title as the answer. This is because the new film (the tenth installment) had not yet been released when the model's training data was collected. Let us now address the issue.

## Google API

Before we start, let’s set up the API key and a custom search engine. If you don’t have the keys from the previous lesson, head to the [Google Cloud console](https://console.cloud.google.com/apis/credentials?project=amazing-codex-395218) and generate a key by pressing the CREATE CREDENTIALS button at the top and choosing API KEY. Then, head to the [Programmable Search Engine](https://programmablesearchengine.google.com/controlpanel/create) dashboard and remember to select the “Search the entire web” option. The search engine ID will be visible in the details. You might also need to enable the “Custom Search API” service under “Enable APIs and services” (the API's error message will tell you if this is required). Now we can set the environment variables for both the Google and OpenAI APIs.

In [None]:
# let's set up the keys

import os

os.environ["GOOGLE_CSE_ID"] = "<Custom_Search_Engine_ID>"
os.environ["GOOGLE_API_KEY"] = "<Google_API_Key>"
os.environ["OPENAI_API_KEY"] = "<OpenAI_Key>"
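Since the values above are placeholders, it can help to confirm that real keys are in place before calling any API. The following is a minimal sketch, not part of the original course; the helper name missing_keys and the rule that a value wrapped in angle brackets counts as a placeholder are assumptions made for illustration.

```python
import os

# Variable names taken from the cell above.
REQUIRED_KEYS = ("GOOGLE_CSE_ID", "GOOGLE_API_KEY", "OPENAI_API_KEY")

def missing_keys(env=os.environ, required=REQUIRED_KEYS):
    """Return the required names that are unset, empty, or still placeholders."""
    missing = []
    for name in required:
        value = env.get(name, "")
        # Assumption: a value like "<Google_API_Key>" is an unfilled placeholder.
        if not value or (value.startswith("<") and value.endswith(">")):
            missing.append(name)
    return missing

# Example with a plain dict standing in for the environment.
example_env = {"GOOGLE_CSE_ID": "abc123", "GOOGLE_API_KEY": "<Google_API_Key>"}
print(missing_keys(example_env))  # ['GOOGLE_API_KEY', 'OPENAI_API_KEY']
```

Calling missing_keys() with no arguments checks the real environment, so it can be run right after the cell above.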

## Get Search Results

To obtain search results, this component uses LangChain's GoogleSearchAPIWrapper class. It works together with the Tool class, which wraps a function so that agents can use it to interact with the outside world. Any function can be turned into a tool, such as top_n_results in this example. The API returns the title, URL, and a brief description of each page.

In [None]:
# first, we create a tool that allows us to use Google search.
# we'll use it to retrieve the first 10 results

from langchain.tools import Tool
from langchain.utilities import GoogleSearchAPIWrapper

search = GoogleSearchAPIWrapper()
TOP_N_RESULTS = 10

def top_n_results(query):
    return search.results(query, TOP_N_RESULTS)

tool = Tool(
    name = "Google Search",
    description="Search Google for recent results.",
    func=top_n_results
)

In [None]:
# this is how we can use the tool. For each result, we have:
# 1. the result title
# 2. its URL
# 3. and the snippet that we would see if we were on the Google UI

query = "what is the latest fast and furious movie?"

results = tool.run(query)

for result in results:
    print(result["title"])
    print(result["link"])
    print(result["snippet"])
    print("-"*50)

Fast & Furious movies in order | chronological and release order ...
https://www.radiotimes.com/movies/fast-and-furious-order/
Mar 22, 2023 ... Fast & Furious Presents: Hobbs & Shaw (2019); F9 (2021); Fast and Furious 10 (2023). Tokyo Drift also marks the first appearance of Han Lue, a ...
--------------------------------------------------
FAST X | Official Trailer 2 - YouTube
https://www.youtube.com/watch?v=aOb15GVFZxU
Apr 19, 2023 ... Fast X, the tenth film in the Fast & Furious Saga, launches the final ... witnessed it all and has spent the last 12 years masterminding a ...
--------------------------------------------------
Fast & Furious 10: Release date, cast, plot and latest news on Fast X
https://www.radiotimes.com/movies/fast-and-furious-10-release-date/
Apr 17, 2023 ... Fast X is out in cinemas on 19th May 2023 – find out how to rewatch all the Fast & Furious movies in order, and read our Fast & Furious 9 review ...
--------------------------------------------------
Fast & Fur

We now use the link key of each result to download and parse the page contents. The newspaper library takes care of everything. However, in some circumstances, such as anti-bot measures or a result that points to a file rather than a web page, it may fail to retrieve the content.

In [None]:
# let's visit all the URLs from the results and use the newspaper library
# to download their texts. The library won't work on some URLs, e.g.
# if the content is a PDF file or if the website has some anti-bot mechanisms
# adopted.

import newspaper

pages_content = []

for result in results:
    try:
        article = newspaper.Article(result["link"])
        article.download()
        article.parse()
        if len(article.text) > 0:
            pages_content.append({ "url": result["link"], "text": article.text })
    except Exception:
        # skip pages that fail to download or parse
        continue

print(len(pages_content))

8
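One of the failure cases mentioned above, a result that points to a file rather than a web page, can sometimes be detected before downloading. The sketch below is a simple heuristic based on the URL's file extension; the helper name looks_like_article and the list of extensions are assumptions for illustration, and the check does nothing about anti-bot measures.

```python
import posixpath
from urllib.parse import urlparse

# Assumption: extensions that usually indicate a file instead of an article page.
NON_ARTICLE_EXTENSIONS = {".pdf", ".doc", ".docx", ".ppt", ".pptx", ".xls", ".xlsx", ".zip"}

def looks_like_article(url):
    """Return False when the URL path ends in a known file extension."""
    path = urlparse(url).path.lower()
    _, extension = posixpath.splitext(path)
    return extension not in NON_ARTICLE_EXTENSIONS

urls = [
    "https://www.radiotimes.com/movies/fast-and-furious-order/",
    "https://example.com/reports/box-office.PDF?download=1",
]
for url in urls:
    print(looks_like_article(url), url)
```

A filter like this could be applied to result["link"] before creating the Article, while the try/except block still handles everything the heuristic misses.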


## Process the Search Results

We now have the contents of the top results from the Google search. (Honestly, who looks at Google’s second page?) However, passing all of the contents to the model is not efficient, for the following reasons:

- The model’s context length is limited.
- It will significantly increase the cost if we process all the search results.
- In almost all cases, they share similar pieces of information.
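To make the first two points concrete, here is a rough sketch of a prompt budget check. The ratio of four characters per token, the 4,000-token limit, and the 500 tokens reserved for the answer are illustrative assumptions rather than values from this article, and both helper names are hypothetical.

```python
def estimate_tokens(text, chars_per_token=4):
    # Crude heuristic (assumption): roughly 4 characters per token.
    return -(-len(text) // chars_per_token)  # ceiling division

def fits_in_context(texts, context_limit=4000, reserved_for_answer=500):
    """Return (fits, estimated_total_tokens) for a list of texts."""
    total = sum(estimate_tokens(t) for t in texts)
    return total + reserved_for_answer <= context_limit, total

# Eight pages of 12,000 characters each stand in for the downloaded articles.
pages = ["x" * 12000] * 8
fits, total = fits_in_context(pages)
print(fits, total)  # False 24000
```

Under these assumptions, even a handful of full articles exceeds the budget several times over, which is why only the most relevant chunks are kept.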

So, let’s find the most relevant results.

Using an embedding model, we can identify contextually related information. Each text is converted to a high-dimensional vector that captures its meaning. The cosine similarity function can then find the chunks most relevant to the user's query.

We begin by splitting the texts with the RecursiveCharacterTextSplitter class to ensure that the chunk lengths fit within the model's input length. The Document class then builds a data structure from each chunk, which lets us store metadata such as the URL as its source. The model can later use this information to report where the content came from.

In [None]:
# we split the article texts into small chunks. While doing so, we keep track of each
# chunk metadata (i.e. the URL where it comes from). Each metadata is a dictionary and
# we need to use the "source" key for the document source so that the chain
# that we'll create later knows where to retrieve the source.

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.docstore.document import Document

text_splitter = RecursiveCharacterTextSplitter(chunk_size=4000, chunk_overlap=100)

docs = []
for d in pages_content:
    chunks = text_splitter.split_text(d["text"])
    for chunk in chunks:
        new_doc = Document(page_content=chunk, metadata={ "source": d["url"] })
        docs.append(new_doc)
len(docs)

24
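To illustrate what chunk_size and chunk_overlap mean, here is a simplified sliding-window splitter in plain Python. It is not how RecursiveCharacterTextSplitter works internally (that class first tries to split on separators such as paragraph breaks and newlines); the function name split_with_overlap is hypothetical and the tiny sizes are chosen only to keep the output readable.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size character windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks

print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

Each chunk repeats the last character of the previous one, which is the role the 100-character overlap plays in the cell above: text near a boundary appears in both neighbouring chunks.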

The next step uses LangChain's OpenAIEmbeddings class, which calls the OpenAI API, to generate embeddings: the .embed_documents() method for the search results and the .embed_query() method for the user's question.

In [None]:
# then, we embed both the chunks and the query

from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()

docs_embeddings = embeddings.embed_documents([doc.page_content for doc in docs])
query_embedding = embeddings.embed_query(query)

Lastly, the get_top_k_indices function accepts the document and query embedding vectors and returns the indices of the top K candidates with the highest cosine similarity to the user's query. We then use these indices to retrieve the best-fit documents.

In [None]:
# next, we compute the cosine similarities between the document vectors and
# the query vector using numpy and sklearn. We are interested only in the top 2
# chunks for now because we'll later put them in a prompt and the prompt size is
# limited.

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def get_top_k_indices(list_of_doc_vectors, query_vector, top_k):
    # convert the lists of vectors to numpy arrays
    list_of_doc_vectors = np.array(list_of_doc_vectors)
    query_vector = np.array(query_vector)

    # compute cosine similarities
    similarities = cosine_similarity(query_vector.reshape(1, -1), list_of_doc_vectors).flatten()

    # sort the vectors based on cosine similarity
    sorted_indices = np.argsort(similarities)[::-1]

    # retrieve the top K indices from the sorted list
    top_k_indices = sorted_indices[:top_k]

    return top_k_indices

top_k = 2
best_indexes = get_top_k_indices(docs_embeddings, query_embedding, top_k)
best_k_documents = [doc for i, doc in enumerate(docs) if i in best_indexes]
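For intuition about the ranking step, the sketch below reimplements it with the standard library only, on toy two-dimensional vectors instead of real embeddings. Cosine similarity is the dot product of two vectors divided by the product of their lengths. The function names and toy values are assumptions for illustration, and the sketch assumes no vector is all zeros.

```python
import math

def cosine(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|). Assumes non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_indices(doc_vectors, query_vector, top_k):
    """Return the indices of the top_k most similar vectors, best match first."""
    scores = [cosine(v, query_vector) for v in doc_vectors]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_k]

toy_docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_query = [1.0, 0.2]
print(top_k_indices(toy_docs, toy_query, 2))  # [0, 2]
```

Note that the list comprehension in the cell above keeps the selected documents in their original order, whereas indexing with the returned indices directly would keep them in rank order. Either works for the stuff chain, since all selected chunks end up in the same prompt.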

## Chain with Source

Finally, we place the selected chunks in the prompt (using the stuff technique) to help the model identify the correct response. LangChain provides the load_qa_with_sources_chain() function, which builds a chain that accepts a list of input_documents as the source of information and a question argument containing the user's query. The final step is to post-process the model's response to extract its answer and the sources it used.

In [None]:
# we are now ready to create a question answering chain that leverages
# sources, and we'll use the load_qa_with_sources_chain function for that

from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain.llms import OpenAI

chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="stuff")

In [None]:
# last, let's generate the response to our query
response = chain({"input_documents": best_k_documents, "question": query}, return_only_outputs=True)

response_text, response_sources = response["output_text"].split("SOURCES:")
response_text = response_text.strip()
response_sources = response_sources.strip()

print(f"Answer: {response_text}")
print(f"Sources: {response_sources}")

Answer: The latest Fast and Furious movie is Fast X, scheduled for release on May 19, 2023.
Sources: https://www.radiotimes.com/movies/fast-and-furious-10-release-date/, https://en.wikipedia.org/wiki/Fast_%26_Furious


In [None]:
response

{'output_text': ' The latest Fast and Furious movie is Fast X, scheduled for release on May 19, 2023.\nSOURCES: https://www.radiotimes.com/movies/fast-and-furious-10-release-date/, https://en.wikipedia.org/wiki/Fast_%26_Furious'}

The model was able to find the correct answer thanks to the search results, even though it had never seen this information during training. The question-answering-with-sources chain also reports which sources the model used to construct the answer.
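The split on "SOURCES:" used above raises an error during unpacking if the model's output does not contain that marker. A slightly more defensive parser might look like the sketch below; the function name split_answer_and_sources is hypothetical, and it assumes the sources are comma-separated URLs that contain no commas themselves.

```python
def split_answer_and_sources(output_text, marker="SOURCES:"):
    """Return (answer, list_of_sources); the list is empty if the marker is absent."""
    answer, found, sources = output_text.partition(marker)
    urls = [s.strip() for s in sources.split(",") if s.strip()] if found else []
    return answer.strip(), urls

example = (
    " The latest Fast and Furious movie is Fast X, scheduled for release on May 19, 2023."
    "\nSOURCES: https://www.radiotimes.com/movies/fast-and-furious-10-release-date/, "
    "https://en.wikipedia.org/wiki/Fast_%26_Furious"
)
answer, sources = split_answer_and_sources(example)
print(answer)
print(sources)
```

Returning the sources as a list also makes it easier to display them as separate links in a chatbot interface.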

## Conclusion
In this article, we covered how to use external knowledge from a search engine to build a powerful application. The context can come from a variety of sources, including PDFs, text documents, CSV files, and even the Internet! We used Google search results as the source of information, which allowed the model to accurately answer a question it had previously been unable to answer.

## Acknowledgements

I'd like to express my thanks to the wonderful [LangChain & Vector Databases in Production Course](https://learn.activeloop.ai/courses/langchain) by Activeloop, which I completed, and acknowledge the use of some images and other materials from the course in this article.