<img src="https://drive.google.com/uc?export=view&id=1wYSMgJtARFdvTt5g7E20mE4NmwUFUuog" width="200">

[![Build Fast with AI](https://img.shields.io/badge/BuildFastWithAI-GenAI%20Bootcamp-blue?style=for-the-badge&logo=artificial-intelligence)](https://www.buildfastwithai.com/genai-course)


[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1cYJcJ42zuPa8O0L-pFZiZzPlehtl7UGo#scrollTo=zWo55QTF-mJ2)


# 🔍 Perplexity Clone using LangChain + Cerebras + Tavily Search

In this notebook, we will build a Perplexity-like conversational search assistant that:
- Uses the Tavily API to search the web,
- Summarizes the results using a Cerebras-hosted LLM,
- Maintains chat history using LangChain's `RunnableWithMessageHistory`.


In [None]:
# Install dependencies
!pip install langchain langchain-community langchain-cerebras tavily-python


## 🔑 Setup API Keys

We will use:
- `TAVILY_API_KEY` for Tavily Search
- `CEREBRAS_API_KEY` for the Cerebras LLM

Make sure to get both keys and add them to your environment.
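
Outside Colab, the same two keys can be read from ordinary environment variables. The helper below is a minimal sketch of that idea, not part of the original notebook; `require_key` is a hypothetical name, and it only assumes the keys are stored under the variable names listed above.

```python
import os


def require_key(name, env=None):
    """Return the named key with surrounding whitespace removed.

    `require_key` is a hypothetical helper. It raises instead of silently
    returning an empty key, so a missing secret fails early.
    """
    env = os.environ if env is None else env
    value = (env.get(name) or "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set")
    return value
```

Calling `require_key("TAVILY_API_KEY")` before creating the client surfaces a missing key immediately instead of as an authentication error on the first search.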


In [None]:
import os
from google.colab import userdata

# Read both keys from Colab's Secrets panel
# (outside Colab, load them from environment variables or a .env file instead)
api_key = userdata.get("TAVILY_API_KEY").strip()
os.environ["CEREBRAS_API_KEY"] = userdata.get("CEREBRAS_API_KEY").strip()

## 🧩 Import libraries and initialize tools
We’ll use:
- `TavilyClient` for fetching relevant snippets.
- `ChatCerebras` for generating answers.
- `RunnableWithMessageHistory` to maintain chat memory.


In [None]:
from tavily import TavilyClient
from langchain_cerebras import ChatCerebras
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.messages import HumanMessage, AIMessage


client = TavilyClient(api_key)

# Optional sanity check: uncomment to run a test search and inspect the raw response
# response = client.search(query="what is genai")
# print(response)

# Initialize the Cerebras model
llm = ChatCerebras(model="qwen-3-32b", temperature=0.4)



{'query': 'what is genai', 'follow_up_questions': None, 'answer': None, 'images': [], 'results': [{'url': 'https://libguides.acu.edu.au/ai-basics', 'title': 'What is GenAI - AI basics - Library guides', 'content': '**Generative AI (GenAI)** is a specialised form that creates new content- text, images, video, audio, or code, based on prompts and patterns in existing data. There are many GenAI tools out there to choose from, and they can generate different types of content. * **Text content:** AI tools that generate text responses are trained on a massive amount of data. * **Sound and video:** Sound and video based GenAI tools can generate content such as music compositions in a range of genres, sound effects for movies and games or human-looking avatars. * **Research discovery:** Research discovery based GenAI tools can automate parts of the research process and make long, complex texts easier to decipher by extracting key information or summa
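
The raw response printed above is a dict whose `results` list holds one entry per page. As a rough sketch, the entries could be filtered before they reach the LLM. This assumes each result also carries a numeric `score` field (the sample run at the end of this notebook mentions scores such as 0.928). The function name `top_results` and the defaults `min_score=0.5` and `limit=3` are illustrative choices, not values from the notebook.

```python
def top_results(response, min_score=0.5, limit=3):
    """Keep the highest-scoring results from a Tavily-style response dict.

    Assumes response["results"] is a list of dicts with an optional
    numeric "score"; results without a score are treated as 0.0.
    """
    results = response.get("results", [])
    kept = [r for r in results if r.get("score", 0.0) >= min_score]
    kept.sort(key=lambda r: r.get("score", 0.0), reverse=True)
    return kept[:limit]
```

Trimming the list this way keeps the prompt short, at the cost of possibly dropping a relevant low-scoring page.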

## 🧠 Define the system prompt
This defines how the model will use search results to answer user questions.


In [None]:
template = ChatPromptTemplate.from_messages([
    ("system",
     "You are an intelligent search assistant. Use the provided web results to answer the question. "
     "If the results are insufficient, say you couldn’t find enough information. Always cite sources."),
    MessagesPlaceholder(variable_name="history"),
    ("human", "User query: {query}\n\nWeb results:\n{context}\n\nAnswer:")
])


## 🌐 Define function to perform search + combine with LLM
This function:
1. Performs Tavily search
2. Builds a combined context
3. Passes it to the Cerebras LLM


In [None]:
def run_search_and_answer(query: str, history):
    # Perform the search. TavilyClient.search returns a dict whose
    # "results" key holds a list of result dicts (see the output above).
    response = client.search(query=query)

    if isinstance(response, dict):
        results = response.get("results", [])
    elif isinstance(response, list):
        results = response
    else:
        results = []

    if results:
        # Format each result as a numbered snippet followed by its source URL
        context = "\n".join(
            f"{i}. {r.get('content', '')} (Source: {r.get('url', 'N/A')})"
            for i, r in enumerate(results, start=1)
        )
    else:
        # Fall back to the raw response if no result list was found
        context = str(response)

    # Build the prompt input
    prompt_input = {"query": query, "context": context, "history": history}

    # Generate the response
    chain = template | llm
    response_message = chain.invoke(prompt_input)

    return response_message.content, context


## 💬 Add memory using RunnableWithMessageHistory
We’ll wrap the chain in LangChain’s `RunnableWithMessageHistory`, which keeps a separate in-memory message history for each session ID.


In [None]:
from langchain_core.runnables import RunnableLambda, RunnableMap

store = {}  # session-wise memory store

def get_history(session_id: str) -> InMemoryChatMessageHistory:
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

# Step 1: Create a Runnable for web search.
# client.search returns a dict; the list of result dicts is under "results".
search_runnable = RunnableLambda(lambda x: client.search(query=x["query"]).get("results", []))

# Step 2: Combine search + context + query for LLM
def prepare_input(inputs):
    results = inputs["search"]
    context = "\n".join([f"{i+1}. {r['content']} (Source: {r['url']})" for i, r in enumerate(results)])
    return {"query": inputs["query"], "context": context, "history": inputs["history"]}

# Step 3: Define the full chain
chain = (
    RunnableMap({
        "search": search_runnable,
        "query": lambda x: x["query"],
        "history": lambda x: x["history"],
    })
    | RunnableLambda(prepare_input)
    | template
    | llm
)

# Step 4: Wrap with message history
chain_with_history = RunnableWithMessageHistory(
    chain,
    get_history,
    input_messages_key="query",
    history_messages_key="history"
)


## 🚀 Test the assistant
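Now let’s interact with our Perplexity Clone in this notebook. The loop below calls `run_search_and_answer` directly and saves each turn to the session history by hand; type `exit` or `quit` to stop.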


In [None]:
session_id = "demo_session"

while True:
    user_query = input("🧑‍💻 You: ")
    if user_query.lower() in ["exit", "quit"]:
        break

    history = get_history(session_id)
    response, context = run_search_and_answer(user_query, history.messages)

    # Save to memory
    history.add_message(HumanMessage(content=user_query))
    history.add_message(AIMessage(content=response))

    print("\n🤖 AI:", response, "\n")


🧑‍💻 You: what is genai

🤖 AI: <think>
Okay, the user is asking "what is genai". Let me look at the web results provided.

First, the top result from Acu.edu.au defines GenAI as a specialized form of AI that creates new content like text, images, video, audio, or code based on prompts and existing data. It mentions different types of content it can generate, like text, sound, video, and research tools. The score here is high, 0.928, so this seems reliable.

The second result from UC Library Guides says GenAI is a type of AI that creates new content such as text, images, music, videos, code. It compares it to predictive text on phones and lists uses like brainstorming, translation, quizzes, and summarizing texts. Also notes risks like plagiarism and made-up content. Score is 0.927, so also trustworthy.

Third, the University of Michigan's page defines GenAI as creating new content by generating data samples similar to training data. A bit more general, but aligns with the other definitio

KeyboardInterrupt: Interrupted by user
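
The sample run shows the model's reasoning wrapped in a `<think>` tag ahead of the answer. One possible way to hide it is to strip such blocks before printing. This is a minimal sketch that assumes the reasoning always arrives as a closed `<think>...</think>` pair; `strip_reasoning` is a hypothetical helper, not part of LangChain or the Cerebras SDK.

```python
import re

# Non-greedy match so several separate blocks are removed independently;
# DOTALL lets "." span the newlines inside the reasoning text.
_THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)


def strip_reasoning(text: str) -> str:
    """Remove closed <think>...</think> blocks and trim the remainder."""
    return _THINK_BLOCK.sub("", text).strip()
```

In the chat loop, `print("\n🤖 AI:", strip_reasoning(response), "\n")` would then show only the final answer. Whether to store the stripped or the full text in the history is a separate design choice.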