[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/VectorInstitute/rag-bootcamp/blob/refactor/uv-migration/implementations/web_search/web_search_langchain.ipynb)

# Web Search with LangChain

This example shows how to use the Python [LangChain](https://python.langchain.com/docs/get_started/introduction) library to send a text-generation request to an LLM through the OpenAI SDK, then augment that request with results from a Google web search that are indexed using an open-source embedding model.

### 📝 Requirements

To run this notebook, you will need:

- **OpenAI API key**: sign up at [OpenAI](https://platform.openai.com/) and create an API key.

## Set up the RAG workflow environment

#### Install libraries (Only in Google Colab)

In [None]:
import os

if "COLAB_RELEASE_TAG" in os.environ:
    # This is a Google Colab environment
    # Install required dependencies
    !pip3 install faiss-cpu langchain langchain-community langchain-huggingface langchain-openai # aieng-rag-utils

#### Import libraries

In [2]:
import warnings

warnings.filterwarnings("ignore")

In [None]:
import os

from aieng.rag.utils import get_device_name
from aieng.rag.utils.search import DocumentReader, pretty_print
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import FAISS
from langchain_huggingface.embeddings import HuggingFaceEmbeddings
from langchain_openai import ChatOpenAI

#### Load OpenAI environment variables

In [4]:
OPENAI_BASE_URL = os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", "YOUR_OPENAI_API_KEY")
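Because the fallback above is a literal placeholder string, a missing key only shows up later as an authentication error from the API. A minimal sketch of an earlier check follows. The helper name `require_api_key` is hypothetical and is not part of this notebook's utilities or of any library.

```python
import os


def require_api_key(
    name: str = "OPENAI_API_KEY",
    placeholder: str = "YOUR_OPENAI_API_KEY",
) -> str:
    """Return the value of environment variable `name`, or raise if it is unset.

    Hypothetical helper for illustration only.
    """
    value = os.getenv(name, placeholder)
    # Treat an empty value or the untouched placeholder as "not configured"
    if not value or value == placeholder:
        raise RuntimeError(f"Set the {name} environment variable before running this notebook.")
    return value
```

Calling `require_api_key()` right after loading the variables would stop the notebook with a clear message instead of failing at the first request.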

#### Choose LLM and embedding model

In [5]:
GENERATOR_MODEL_NAME = "gpt-4.1"
EMBEDDING_MODEL_NAME = "BAAI/bge-base-en-v1.5"

## Start with a basic generation request without RAG augmentation

Let's start by asking the model about an event that happened after its training cutoff. As of this writing (November 2024), the model does not know who won the most recent World Series of baseball.

*Who won the 2024 World Series of baseball?*

**The correct answer: the Los Angeles Dodgers won the 2024 World Series in October 2024.**

In [6]:
query = "Who won the 2024 World Series of baseball?"

## Now send the query to the model through the OpenAI API

In [7]:
llm = ChatOpenAI(
    model=GENERATOR_MODEL_NAME,
    temperature=0,
    max_tokens=None,
    base_url=OPENAI_BASE_URL,
    api_key=OPENAI_API_KEY,
)
message = [
    ("human", query),
]
try:
    result = llm.invoke(message)
    print(f"Result: \n\n{result.content}")
except Exception as err:
    # Generic exceptions have no `.message` attribute in Python 3, so inspect str(err)
    if "Error code: 503" in str(err):
        print(f"The model {GENERATOR_MODEL_NAME} is not ready yet.")
    else:
        raise

Result: 

As of my latest update in June 2024, the 2024 World Series has not yet been played, so there is no winner to report. The World Series typically takes place in October. If you are seeking the most current results, please check a reliable sports news source for the latest updates.


The model says it does not know the answer because, from its point of view, the 2024 World Series is still a future event.

Let's see how we can use RAG to augment our question with a Google web search and get the correct answer.

## Ingestion: Do a Google web search for the query and obtain the necessary information

Parse the web pages returned by a Google search, split them into smaller chunks, then encode the chunks as vector embeddings.

In [8]:
# Do a Google web search, load the result pages as documents, and split them into text chunks
document_reader = DocumentReader(web_search_query=query, search_k=5)
docs, chunks = document_reader.load()

print(f"Number of source documents: {len(docs)}")
print(f"Number of text chunks: {len(chunks)}\n")

Number of source documents: 3
Number of text chunks: 600
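The source does not show how `DocumentReader` splits pages, so the following is only a generic sketch of the idea: a fixed-size character splitter with overlap, so that a sentence cut at a chunk boundary still appears whole in a neighbouring chunk. The function name `split_text` and its default sizes are assumptions chosen for illustration.

```python
def split_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split `text` into chunks of at most `chunk_size` characters.

    Consecutive chunks share `overlap` characters. Illustrative sketch only;
    it is not the splitter used by DocumentReader.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("Require chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


print(split_text("abcdefghij", chunk_size=4, overlap=2))
# → ['abcd', 'cdef', 'efgh', 'ghij']
```

Smaller chunks give more precise matches but carry less surrounding context, which is why a few source documents can turn into hundreds of chunks.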



#### Define the embeddings model

In [9]:
device = get_device_name()

model_kwargs = {"device": device, "trust_remote_code": True}
encode_kwargs = {"normalize_embeddings": True}  # unit-length vectors, so similarity rankings match cosine similarity

print("Setting up the embeddings model...")
embeddings = HuggingFaceEmbeddings(
    model_name=EMBEDDING_MODEL_NAME,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs,
)

Setting up the embeddings model...
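To illustrate why `normalize_embeddings` is set to `True`: once vectors are scaled to unit length, their dot product equals their cosine similarity, and their squared Euclidean distance equals `2 - 2 * cosine`, so ranking by either measure gives the same order. The sketch below checks this on two toy vectors in plain Python. The helper names are hypothetical, and real embeddings have hundreds of dimensions rather than two.

```python
import math


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale `vec` to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: list[float], b: list[float]) -> float:
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a, b = [3.0, 4.0], [4.0, 3.0]
a_unit, b_unit = l2_normalize(a), l2_normalize(b)

cos = cosine_similarity(a, b)
squared_l2 = sum((x - y) ** 2 for x, y in zip(a_unit, b_unit))

# Dot product of unit vectors equals cosine similarity of the originals
assert math.isclose(dot(a_unit, b_unit), cos)
# Squared Euclidean distance of unit vectors is a monotone function of cosine
assert math.isclose(squared_l2, 2 - 2 * cos)
```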


## Retrieval: Make the document chunks available via a retriever

The retriever identifies the document chunks that most closely match our original query. Building the vector store takes about 1-2 minutes.

In [10]:
vectorstore = FAISS.from_documents(chunks, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})

# Retrieve the most relevant context from the vector store based on the query
retrieved_docs = retriever.invoke(query)
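Conceptually, a retriever with `k=5` embeds the query, scores every stored chunk against it, and returns the five best matches. The sketch below shows that idea as a brute-force search in plain Python. It is not how FAISS is implemented (FAISS uses optimized index structures), and the function name `top_k_indices` is hypothetical.

```python
import math


def _cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_indices(query_vec: list[float], doc_vecs: list[list[float]], k: int = 5) -> list[int]:
    """Return the indices of the `k` vectors most similar to `query_vec`, best first."""
    scored = [(_cosine(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy 2-dimensional "embeddings": the second vector points the same way as the query
docs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
print(top_k_indices([1.0, 0.0], docs, k=2))
# → [1, 2]
```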

Let's see what results it found. Note that the results are listed in order of relevance, with the retriever's best match first.

In [11]:
# Print the retrieved documents
pretty_print(retrieved_docs)

Document 1:

Dodgers win World Series 2024
----------------------------------------------------------------------------------------------------
Document 2:

2024 World Series - Wikipedia
Jump to content
Main menu
Main menu
move to sidebar
hide
Navigation
Main pageContentsCurrent eventsRandom articleAbout WikipediaContact us
Contribute
HelpLearn to editCommunity portalRecent changesUpload fileSpecial pages
Search
Search
Appearance
Donate
Create account
Log in
Personal tools
Donate Create account Log in
----------------------------------------------------------------------------------------------------
Document 3:

2024 World Series - Wikipedia
Jump to content
Main menu
Main menu
move to sidebar
hide
Navigation
Main pageContentsCurrent eventsRandom articleAbout WikipediaContact us

## Now send the query to the RAG pipeline

In [12]:
rag_pipeline = RetrievalQA.from_llm(llm=llm, retriever=retriever)
result = rag_pipeline.invoke(input=query)
print(f"Result: \n\n{result['result']}")

Result: 

The Los Angeles Dodgers won the 2024 World Series of baseball.


The model provides the correct answer based on the information from the web.
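A retrieval QA chain of this kind typically works by "stuffing": the retrieved chunks are concatenated into the prompt ahead of the question, so the model can answer from that context rather than from its training data alone. The sketch below illustrates the idea in plain Python. The function name `build_rag_prompt` and the template wording are assumptions for illustration and are not LangChain's actual prompt.

```python
def build_rag_prompt(question: str, contexts: list[str]) -> str:
    """Combine retrieved text chunks and a question into a single prompt.

    Hypothetical template for illustration; not the prompt used by RetrievalQA.
    """
    context_block = "\n\n".join(contexts)
    return (
        "Use the following pieces of context to answer the question. "
        "If the answer is not in the context, say that you don't know.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_rag_prompt(
    "Who won the 2024 World Series of baseball?",
    ["Dodgers win World Series 2024", "2024 World Series - Wikipedia"],
)
print(prompt)
```

This also shows why retrieval quality matters: chunks that contain only navigation text take up space in the prompt without helping the model answer.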