# Recreate Bing Chat

---

## Introduction

Although Large Language Models (LLMs) have impressive capabilities, they also have limitations that complicate deployment in a production environment. One of them is hallucination: the model answers some questions incorrectly but with high confidence. Several factors contribute to this, one being that the training data has a cut-off date, so the model has no knowledge of events that occurred after that date.

This lesson explores how to retrieve relevant articles from the Internet and supply them as context so that a chatbot can find the correct answer. We will use LangChain’s integration with the Google Search API and the `SeleniumURLLoader` to extract the articles from the search results. We then select the most relevant passages and include them in the prompt.

> Note: The same pipeline could be built with the Bing API, but this project uses the Google Search API because other lessons in this course already use it, which avoids creating several keys for the same functionality.

## Workflow

The user query is used to extract relevant articles using a search engine (e.g. Bing or Google Search), which are then split into chunks. We then compute the embeddings of each chunk, rank them by cosine similarity with respect to the embedding of the query, and put the most relevant chunks into a prompt to generate the final answer, while also keeping track of the sources.
<br/>
<img src="../../images/bing-chat-workflow.png" alt="State of Workflow" style="width: 55%; height: auto;"/>
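
The last step of this workflow, placing the selected chunks into one prompt while keeping track of their sources, can be sketched in plain Python. This is only an illustration: `Chunk`, `build_prompt`, the template wording, and the `example.com` URLs are assumptions made for this sketch, not the template that LangChain uses internally.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    """A piece of text plus the URL it was extracted from (hypothetical type)."""

    text: str
    source: str


def build_prompt(question: str, chunks: List[Chunk]) -> str:
    """Concatenate all chunks, each labelled with its source, ahead of the question."""
    context = "\n\n".join(
        f"Content: {chunk.text}\nSource: {chunk.source}" for chunk in chunks
    )
    return (
        "Answer the question using only the extracts below, "
        "then list the sources you used after 'SOURCES:'.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Toy chunks with placeholder URLs, for illustration only.
example_chunks = [
    Chunk("Fast X is the tenth film in the saga.", "https://example.com/a"),
    Chunk("F9 was released in 2021.", "https://example.com/b"),
]
prompt = build_prompt("What is the latest fast and furious movie?", example_chunks)
print(prompt)
```

Because every extract carries its own `Source:` line, the model can cite the URLs it relied on, which is what makes the answer traceable.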

## Setup

In [1]:
import openai
import os
from dotenv import load_dotenv, find_dotenv

_ = load_dotenv(find_dotenv())
openai.api_type = os.environ.get("OPENAI_API_TYPE")
openai.api_base = os.environ.get("OPENAI_API_BASE")
openai.api_key = os.environ.get("OPENAI_API_KEY")
openai.api_version = os.environ.get("OPENAI_API_VERSION")

## Building the system

We must set up an API key and a custom search engine to use the Google Search API. To get the key, go to the [Google Cloud console](https://console.cloud.google.com/apis/credentials) and generate one by pressing the CREATE CREDENTIALS button at the top and choosing API KEY. Then go to the [Programmable Search Engine](https://programmablesearchengine.google.com/controlpanel/create) dashboard and remember to select the “Search the entire web” option. The search engine ID is shown in the details. You might also need to enable the “Custom Search API” service under Enable APIs and services; the API returns instructions if this is required. Finally, set the environment variables `GOOGLE_CSE_ID` and `GOOGLE_API_KEY` so that the Google wrapper can connect to the API.
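
Before creating the wrapper, it can help to confirm that both variables are actually set. The following is a minimal sketch and not part of LangChain: `missing_env_vars` is a hypothetical helper, and the environment mapping is passed in explicitly so the check is easy to test.

```python
import os
from typing import List, Mapping, Sequence

# Variable names taken from the text above; the helper itself is hypothetical.
REQUIRED_VARS = ("GOOGLE_CSE_ID", "GOOGLE_API_KEY")


def missing_env_vars(
    names: Sequence[str] = REQUIRED_VARS,
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the names that are unset or blank in the given environment mapping."""
    return [name for name in names if not env.get(name, "").strip()]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("Google Search credentials found.")
```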

### 1. Get Search Results

In [5]:
from langchain.tools import Tool
from langchain.utilities import GoogleSearchAPIWrapper

search = GoogleSearchAPIWrapper()
TOP_N_RESULTS = 5


def top_n_results(query):
    return search.results(query, TOP_N_RESULTS)


tool = Tool(
    name="Google Search",
    description="Search Google for recent results.",
    func=top_n_results,
)

query = "What is the latest fast and furious movie?"

results = tool.run(query)

results

[{'title': 'Fast & Furious movies in order | chronological and release order ...',
  'link': 'https://www.radiotimes.com/movies/fast-and-furious-order/',
  'snippet': 'Mar 22, 2023 ... Fast & Furious Presents: Hobbs & Shaw (2019); F9 (2021); Fast and Furious 10 (2023). Tokyo Drift also marks the first appearance of Han Lue, a\xa0...'},
 {'title': 'FAST X | Official Trailer 2 - YouTube',
  'link': 'https://www.youtube.com/watch?v=aOb15GVFZxU',
  'snippet': 'Apr 19, 2023 ... Fast X, the tenth film in the Fast & Furious Saga, launches the final ... witnessed it all and has spent the last 12 years masterminding a\xa0...'},
 {'title': 'How to Watch Fast and Furious Movies in Chronological Order - IGN',
  'link': 'https://www.ign.com/articles/fast-and-furious-movies-in-order',
  'snippet': "Looking to go on a Fast and Furious binge? ... With the latest Fast and Furious film: Fast X out now, we've put together this handy guide on how to watch\xa0..."},
 {'title': 'Fast & Furious - Wikipedia',

### 2. Extract content from URL

In [6]:
from langchain.document_loaders import SeleniumURLLoader

urls = [result["link"] for result in results]

loader = SeleniumURLLoader(
    urls=urls,
    binary_location=os.environ.get("BROWSER_EXEC_PATH"),
)
docs = loader.load()

print("Number of pages:", len(docs))
print(docs[0])

Number of pages: 5
page_content='Home\n\nMovies\n\nHow to watch all the Fast and Furious movies in order - full chronological timeline and release order\n\nWe may earn commission from links on this page. Our editorial is always independent (learn more)\n\nHow to watch all the Fast and Furious movies in order - full chronological timeline and release order\n\nWhat’s the best way to watch Vin Diesel, Dwayne Johnson and Paul Walker in action? Here\'s a speedy explanation of the Fast & Furious movie timeline.\n\nUniversal\n\nBy \n\nThomas Ling\n\nPublished: Wednesday, 22nd March 2023 at 11:10 am\n\nSave\n\nShare on facebook\n\nShare on twitter\n\nShare on pinterest\n\nShare on reddit\n\nEmail to a friend\n\nIf you’re planning to watch the Fast and Furious movies before Fast X speeds into cinemas this May, it’s a good idea to make sure you’re watching them in the best order.\n\nAdvertisement\n\nYou might think a blockbuster franchise about cars would be pretty straightforward, but since it 

### 3. Split documents into chunks

In [8]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=4000, chunk_overlap=100)
splitted_docs = text_splitter.split_documents(docs)

print("Number of chunks:", len(splitted_docs))
splitted_docs[0]

Number of chunks: 28


Document(page_content="Home\n\nMovies\n\nHow to watch all the Fast and Furious movies in order - full chronological timeline and release order\n\nWe may earn commission from links on this page. Our editorial is always independent (learn more)\n\nHow to watch all the Fast and Furious movies in order - full chronological timeline and release order\n\nWhat’s the best way to watch Vin Diesel, Dwayne Johnson and Paul Walker in action? Here's a speedy explanation of the Fast & Furious movie timeline.\n\nUniversal\n\nBy \n\nThomas Ling\n\nPublished: Wednesday, 22nd March 2023 at 11:10 am\n\nSave\n\nShare on facebook\n\nShare on twitter\n\nShare on pinterest\n\nShare on reddit\n\nEmail to a friend\n\nIf you’re planning to watch the Fast and Furious movies before Fast X speeds into cinemas this May, it’s a good idea to make sure you’re watching them in the best order.\n\nAdvertisement\n\nYou might think a blockbuster franchise about cars would be pretty straightforward, but since it began in 20
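
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-width sketch. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraph and line breaks); `sliding_window_chunks` is a hypothetical function that only shows how consecutive chunks share `chunk_overlap` characters.

```python
from typing import List


def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


print(sliding_window_chunks("abcdefghij", chunk_size=4, chunk_overlap=1))
# → ['abcd', 'defg', 'ghij']
```

Each chunk starts with the last character of the previous one, so a sentence that straddles a boundary is still partly visible in both chunks.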

### 4. Compute embeddings

In [11]:
from langchain.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings()

# embed the chunks (not the whole pages) so that the ranking operates on chunks
docs_embeddings = embeddings.embed_documents(
    [doc.page_content for doc in splitted_docs]
)
query_embedding = embeddings.embed_query(query)

### 5. Find top k chunks

In [12]:
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from typing import List


def get_top_k_indices(
    list_of_doc_vectors: List[List[float]], query_vector: List[float], top_k: int
) -> List[int]:
    """
    Returns the indices of the top K vectors in the list of document vectors that
    are most similar to the query vector.

    :param list_of_doc_vectors: a list of document vectors
    :param query_vector: a query vector
    :param top_k: the number of top vectors to retrieve
    :return: a list of indices of the top K vectors, most similar first
    """
    # convert the lists of vectors to numpy arrays
    list_of_doc_vectors = np.array(list_of_doc_vectors)
    query_vector = np.array(query_vector)

    # compute cosine similarities
    similarities = cosine_similarity(
        query_vector.reshape(1, -1), list_of_doc_vectors
    ).flatten()

    # sort the vectors by cosine similarity, highest first
    sorted_indices = np.argsort(similarities)[::-1]

    # retrieve the top K indices from the sorted list
    top_k_indices = sorted_indices[:top_k]

    # return plain Python ints to match the annotated return type
    return top_k_indices.tolist()


top_k = 3
best_indexes = get_top_k_indices(docs_embeddings, query_embedding, top_k)
# select from the chunks that were embedded, keeping the ranked order
best_k_documents = [splitted_docs[i] for i in best_indexes]
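
The ranking relies on cosine similarity, which for two vectors a and b is their dot product divided by the product of their lengths. The dependency-free sketch below works through it on toy 2-D vectors; `cosine` and `rank_by_cosine` are hypothetical names, and real embeddings have hundreds of dimensions.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|); 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_cosine(
    doc_vectors: Sequence[Sequence[float]], query_vector: Sequence[float], top_k: int
) -> List[int]:
    """Return the indices of the top_k vectors most similar to the query, best first."""
    order = sorted(
        range(len(doc_vectors)),
        key=lambda i: cosine(doc_vectors[i], query_vector),
        reverse=True,
    )
    return order[:top_k]


toy_docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_query = [1.0, 0.2]
print(rank_by_cosine(toy_docs, toy_query, top_k=2))
# → [0, 2]
```

The query points mostly along the first axis, so the first vector scores highest (about 0.98), the diagonal vector comes second (about 0.83), and the second vector is last (about 0.20).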

### 6. Generate answer

In [18]:
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain.chat_models import AzureChatOpenAI

llm = AzureChatOpenAI(deployment_name="gpt4", temperature=0)
chain = load_qa_with_sources_chain(llm, chain_type="stuff")

response = chain(
    {"input_documents": best_k_documents, "question": query}, return_only_outputs=True
)

# partition() avoids an unpacking error if the model omits or repeats the "SOURCES:" marker
response_text, _, response_sources = response["output_text"].partition("SOURCES:")
response_text = response_text.strip()
response_sources = response_sources.strip()

print(f"Answer: {response_text}")
print(f"Sources: {response_sources}")

Answer: The latest Fast and Furious movie is "Fast X."
Sources: https://www.radiotimes.com/movies/fast-and-furious-order/, https://www.ign.com/articles/fast-and-furious-movies-in-order, https://www.comingsoon.net/guides/news/1287501-is-fast-x-the-last-movie-fast-and-furious-franchise-final-one-10
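
The sources come back as one comma-separated string. If a list of URLs is more convenient downstream, a small helper along these lines could split it. `parse_sources` is a hypothetical name, and the sketch assumes that URLs are separated by commas as in the output above and contain no commas themselves.

```python
from typing import List


def parse_sources(sources_text: str) -> List[str]:
    """Split a comma-separated sources string into unique URLs, keeping their order."""
    seen = set()
    urls = []
    for item in sources_text.split(","):
        url = item.strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


sources = parse_sources(
    "https://www.radiotimes.com/movies/fast-and-furious-order/, "
    "https://www.ign.com/articles/fast-and-furious-movies-in-order"
)
for url in sources:
    print(url)
```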
