# WebResearchRetriever

Given a query, this retriever will:

* Formulate a set of related Google searches
* Search for each
* Load all the resulting URLs
* Embed the consolidated page content and run a similarity search against the query
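
The four steps above can be sketched without any external services. This is a toy illustration, not the retriever's actual implementation: `PAGES`, `generate_queries`, `search`, `embed`, and `web_research` are all hypothetical stand-ins, the "web" is an in-memory dictionary, and bag-of-words counts replace a real embedding model.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory "web": URL -> page text (stands in for real pages).
PAGES = {
    "https://example.com/agents": "Autonomous agents use an LLM as a planner with memory and tools.",
    "https://example.com/cooking": "Simmer the tomato sauce for twenty minutes before serving.",
    "https://example.com/planning": "Task planning lets LLM agents split a goal into smaller steps.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def generate_queries(question):
    # Step 1 stand-in: a real retriever would ask an LLM for related searches.
    return [question, question + " architecture", question + " examples"]


def search(query):
    # Step 2 stand-in: return URLs whose text shares at least one word with the query.
    words = set(tokenize(query))
    return [url for url, text in PAGES.items() if words & set(tokenize(text))]


def embed(text):
    # Step 4 stand-in: token counts instead of a learned embedding.
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def web_research(question, k=2):
    # Steps 1 and 2: run every generated query and collect unique URLs.
    urls = []
    for query in generate_queries(question):
        for url in search(query):
            if url not in urls:
                urls.append(url)
    # Step 3: "load" each page (here, a dictionary lookup).
    docs = [(url, PAGES[url]) for url in urls]
    # Step 4: rank the loaded pages by similarity to the original question.
    q_vec = embed(question)
    ranked = sorted(docs, key=lambda d: cosine(q_vec, embed(d[1])), reverse=True)
    return ranked[:k]


results = web_research("How do LLM agents work?")
for url, text in results:
    print(url)
```

Note that the final ranking uses the original question, not the generated queries; the queries only widen the pool of pages that get loaded.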

In [3]:
from langchain.callbacks.manager import CallbackManager
from langchain.retrievers.web_research import WebResearchRetriever
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

## Run

Pass the desired LLM and vectorstore to the retriever, along with the Google search credentials.

In [5]:
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.chat_models.openai import ChatOpenAI
# Set input
llm = ChatOpenAI(temperature=0)
vectorstore = Chroma(embedding_function=OpenAIEmbeddings())
import os
# Read the Google credentials from the environment instead of hard-coding them
GOOGLE_CSE_ID = os.environ["GOOGLE_CSE_ID"]
GOOGLE_API_KEY = os.environ["GOOGLE_API_KEY"]

In [6]:
# Initialize
web_research_retriever = WebResearchRetriever(
    vectorstore=vectorstore, 
    llm=llm, 
    GOOGLE_CSE_ID=GOOGLE_CSE_ID, 
    GOOGLE_API_KEY=GOOGLE_API_KEY
)

In [7]:
# Run
import logging
logging.basicConfig()
logging.getLogger("langchain.retrievers.web_research").setLevel(logging.INFO)
user_input = "How do LLM Powered Autonomous Agents work?"
docs = web_research_retriever.get_relevant_documents(user_input)

INFO:langchain.retrievers.web_research:Generating questions for Google Search ...
INFO:langchain.retrievers.web_research:Questions for Google Search (raw): {'question': 'How do LLM Powered Autonomous Agents work?', 'text': LineList(lines=['1. What is the definition of LLM Powered Autonomous Agents?', '2. What are the key features of LLM Powered Autonomous Agents?', '3. How do LLM Powered Autonomous Agents differ from traditional autonomous agents?', '4. What are the applications of LLM Powered Autonomous Agents?', '5. Are there any case studies or examples of successful implementations of LLM Powered Autonomous Agents?'])}
INFO:langchain.retrievers.web_research:Questions for Google Search: ['1. What is the definition of LLM Powered Autonomous Agents?', '2. What are the key features of LLM Powered Autonomous Agents?', '3. How do LLM Powered Autonomous Agents differ from traditional autonomous agents?', '4. What are the applications of LLM Powered Autonomous Agents?', '5. Are there any c

In [9]:
len(docs)

6
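
To see what was actually retrieved, it can help to print each document's source and a short preview. The sketch below assumes each returned document exposes `page_content` and `metadata` attributes and that loaders store the URL under a `"source"` metadata key (an assumption; the key may differ). `preview_documents` and `sample_docs` are hypothetical names used only for illustration.

```python
from types import SimpleNamespace


def preview_documents(docs, width=80):
    """Return one summary line per document: its source plus a content preview."""
    lines = []
    for i, doc in enumerate(docs, start=1):
        # "source" is an assumed metadata key; fall back if it is missing.
        source = doc.metadata.get("source", "<unknown source>")
        # Collapse runs of whitespace so the preview fits on one line.
        snippet = " ".join(doc.page_content.split())[:width]
        lines.append(f"{i}. {source}: {snippet}")
    return lines


# Stand-in documents so the sketch runs on its own; in the notebook, pass `docs`.
sample_docs = [
    SimpleNamespace(
        page_content="Agents  combine\nplanning and memory.",
        metadata={"source": "https://example.com/a"},
    ),
    SimpleNamespace(page_content="Tool use extends the model.", metadata={}),
]

for line in preview_documents(sample_docs):
    print(line)
```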

To run locally, swap in a LlamaCpp model and GPT4All embeddings:

In [7]:
from langchain.llms import LlamaCpp
from langchain.vectorstores import Chroma
from langchain.embeddings import GPT4AllEmbeddings

n_gpu_layers = 1  # With Metal, 1 is enough.
n_batch = 512  # Should be between 1 and n_ctx; consider the amount of RAM on your Apple Silicon chip.
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])
llama = LlamaCpp(
    model_path="/Users/rlm/Desktop/Code/llama.cpp/llama-2-13b-chat.ggmlv3.q4_0.bin",
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    n_ctx=4096,  # Context window
    max_tokens=1000,  # Max tokens to generate
    f16_kv=True,  # MUST be True; otherwise you will run into problems after a couple of calls
    callback_manager=callback_manager,
    verbose=True,
)
vectorstore_llama = Chroma(embedding_function=GPT4AllEmbeddings())
import os
# Read the Google credentials from the environment instead of hard-coding them
GOOGLE_CSE_ID = os.environ["GOOGLE_CSE_ID"]
GOOGLE_API_KEY = os.environ["GOOGLE_API_KEY"]

llama.cpp: loading model from /Users/rlm/Desktop/Code/llama.cpp/llama-2-13b-chat.ggmlv3.q4_0.bin
llama_model_load_internal: format     = ggjt v3 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 4096
llama_model_load_internal: n_embd     = 5120
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 40
llama_model_load_internal: n_layer    = 40
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: freq_base  = 10000.0
llama_model_load_internal: freq_scale = 1
llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama_model_load_internal: n_ff       = 13824
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size =    0.09 MB
llama_model_load_internal: mem required  = 9132.71 MB (+ 1608.00 MB per state)
llama_new_context_with_model: kv self size  = 3200.00 MB


Found model file at  /Users/rlm/.cache/gpt4all/ggml-all-MiniLM-L6-v2-f16.bin
llama_new_context_with_model: max tensor size =    87.89 MB


ggml_metal_init: allocating
ggml_metal_init: using MPS
ggml_metal_init: loading '/Users/rlm/miniforge3/envs/llama/lib/python3.9/site-packages/llama_cpp/ggml-metal.metal'
ggml_metal_init: loaded kernel_add                            0x2a58f3710
ggml_metal_init: loaded kernel_mul                            0x2a58f4c40
ggml_metal_init: loaded kernel_mul_row                        0x2a58f5af0
ggml_metal_init: loaded kernel_scale                          0x2a58f3a60
ggml_metal_init: loaded kernel_silu                           0x2a58f3cc0
ggml_metal_init: loaded kernel_relu                           0x2a58f6260
ggml_metal_init: loaded kernel_gelu                           0x2a58f68b0
ggml_metal_init: loaded kernel_soft_max                       0x2a58f75b0
ggml_metal_init: loaded kernel_diag_mask_inf                  0x2a58f7a70
ggml_metal_init: loaded kernel_get_rows_f16                   0x2a58f8530
ggml_metal_init: loaded kernel_get_rows_q4_0                  0x2a58f8b90
ggml_metal_init:

In [8]:
# Initialize WebResearchRetriever
web_research_retriever = WebResearchRetriever(
    vectorstore=vectorstore_llama, 
    llm=llama, 
    GOOGLE_CSE_ID=GOOGLE_CSE_ID, 
    GOOGLE_API_KEY=GOOGLE_API_KEY
)

In [9]:
import logging
logging.basicConfig()
logging.getLogger("langchain.retrievers.web_research").setLevel(logging.INFO)
user_input = "How do LLM Powered Autonomous Agents work?"
docs = web_research_retriever.get_relevant_documents(user_input)

INFO:langchain.retrievers.web_research:Generating questions for Google Search ...


  Sure! Here are five search queries that could help answer the user's question about how LLM powered autonomous agents work:

1. "LLM powered autonomous agents architecture" - This search query could provide information on the overall design and structure of LLM powered autonomous agents, including the components and interfaces involved in their operation.
2. "How do LLM powered autonomous agents perceive their environment?" - This search query could provide information on the sensors and other sources of data that LLM powered autonomous agents use to understand their environment and make decisions.
3. "What algorithms and techniques are used in LLM powered autonomous agents for decision making?" - This search query could provide information on the machine learning and artificial intelligence techniques that are used in LLM powered autonomous agents to enable them to make decisions and take actions based on their environment and objectives.
4. "How do LLM powered autonomous agents lea


llama_print_timings:        load time =  7344.40 ms
llama_print_timings:      sample time =   245.79 ms /   350 runs   (    0.70 ms per token,  1423.96 tokens per second)
llama_print_timings: prompt eval time =  7344.26 ms /    99 tokens (   74.18 ms per token,    13.48 tokens per second)
llama_print_timings:        eval time = 14318.16 ms /   349 runs   (   41.03 ms per token,    24.37 tokens per second)
llama_print_timings:       total time = 22399.59 ms
INFO:langchain.retrievers.web_research:Questions for Google Search (raw): {'question': 'How do LLM Powered Autonomous Agents work?', 'text': LineList(lines=['1. "LLM powered autonomous agents architecture" - This search query could provide information on the overall design and structure of LLM powered autonomous agents, including the components and interfaces involved in their operation.\n', '2. "How do LLM powered autonomous agents perceive their environment?" - This search query could provide information on the sensors and other s

KeyError: 'link'
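
The traceback is not shown, so the cause of this `KeyError` is not confirmed here. One plausible reading, stated as an assumption, is that a search result dictionary had no `'link'` key, which could happen when a long query that includes the model's explanation returns no hits. A defensive way to gather URLs from a list of result dictionaries might look like the sketch below; `collect_links` and `sample_results` are hypothetical and not part of the retriever.

```python
def collect_links(search_results):
    """Gather unique URLs, skipping result entries that have no 'link' key."""
    links = []
    for result in search_results:
        link = result.get("link")  # returns None instead of raising KeyError
        if link and link not in links:
            links.append(link)
    return links


# Hypothetical search results, including one entry without a link.
sample_results = [
    {"title": "Post", "link": "https://example.com/post"},
    {"Result": "No good result was found"},
    {"title": "Post again", "link": "https://example.com/post"},
]

links = collect_links(sample_results)
print(links)
```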

In [6]:
len(docs)

7