# © Artur Czarnecki. All rights reserved.
# Intergrax framework – proprietary and confidential.
# Use, modification, or distribution without written permission is prohibited.

# 04_websearch_context_demo.ipynb

This notebook demonstrates how to use **DropInKnowledgeRuntime** with:

- session-based chat,
- optional RAG (attachments ingested into a vector store),
- **live web search** via `WebSearchExecutor`,

to achieve a "ChatGPT-like" experience with browsing.

The core configuration (LLM adapter, embeddings, vector store, runtime config)
is initialized in a single cell, while the rest of the notebook focuses on
testing and inspecting the web search behavior.


In [1]:
# 1. Imports and environment setup
# --------------------------------

import os
import sys

sys.path.insert(0, os.path.abspath(os.path.join(os.getcwd(), "..", "..")))

from dotenv import load_dotenv

# The LLM adapter (Ollama backend via LangChain) is imported and created in section 2.

# Embeddings + vector store
from intergrax.rag.embedding_manager import EmbeddingManager
from intergrax.rag.vectorstore_manager import VSConfig, VectorstoreManager

# Drop-in knowledge runtime pieces
from intergrax.runtime.drop_in_knowledge_mode.engine.runtime import DropInKnowledgeRuntime
from intergrax.runtime.drop_in_knowledge_mode.config import RuntimeConfig

# Web search integration
from intergrax.websearch.service.websearch_executor import WebSearchExecutor
from intergrax.websearch.providers.google_cse_provider import GoogleCSEProvider
# from intergrax.websearch.providers.bing_provider import BingWebProvider  # optional


# 1.1 Load environment variables (API keys, etc.)
load_dotenv()

# Normalize the web search env vars so that every key exists in os.environ
# (a missing value becomes an empty string).
# To use Bing as well, uncomment the Bing provider import above and add it to the executor.
os.environ["GOOGLE_CSE_API_KEY"] = os.getenv("GOOGLE_CSE_API_KEY") or ""
os.environ["GOOGLE_CSE_CX"] = os.getenv("GOOGLE_CSE_CX") or ""
os.environ["BING_SEARCH_V7_API_KEY"] = os.getenv("BING_SEARCH_V7_API_KEY") or ""
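
Because the assignments above turn a missing key into an empty string, a misconfigured environment will not fail at this point. A minimal sketch of an explicit check follows. The helper `missing_env_vars` and the list `REQUIRED_WEBSEARCH_VARS` are hypothetical names introduced only for illustration; they are not part of Intergrax.

```python
import os
from typing import Iterable, List, Mapping, Optional


def missing_env_vars(
    names: Iterable[str],
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the names that are unset or blank in `env` (defaults to os.environ)."""
    source = os.environ if env is None else env
    return [name for name in names if not (source.get(name) or "").strip()]


# Variable names taken from the cell above; Bing is optional, so it is not checked.
REQUIRED_WEBSEARCH_VARS = ["GOOGLE_CSE_API_KEY", "GOOGLE_CSE_CX"]

missing = missing_env_vars(REQUIRED_WEBSEARCH_VARS)
if missing:
    print(f"Web search is not fully configured; missing: {', '.join(missing)}")
else:
    print("Web search environment variables are set.")
```

Running a check like this before building the executor makes a missing key visible before any search request is sent.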







### 2. Core runtime configuration (LLM + embeddings + vector store + web search)

In [2]:
# 2. Core runtime configuration (LLM + embeddings + vector store + web search)
# -----------------------------------------------------------------------------

# 2.1 Session store – simple in-memory storage for chat messages & metadata.
from intergrax.globals.settings import GLOBAL_SETTINGS
from intergrax.llm_adapters.base import LLMAdapterRegistry, LLMProvider
from intergrax.runtime.drop_in_knowledge_mode.session.in_memory_session_storage import InMemorySessionStorage
from intergrax.runtime.drop_in_knowledge_mode.session.session_manager import SessionManager


session_manager = SessionManager(
    storage=InMemorySessionStorage()
)

# 2.2 LLM adapter – here we use Ollama through the LangChain adapter.
llm_adapter = LLMAdapterRegistry.create(LLMProvider.OLLAMA)

# 2.3 Embedding manager – MUST match the model used during ingestion.
embedding_manager = EmbeddingManager(
    provider="ollama",
    model_name=GLOBAL_SETTINGS.default_ollama_embed_model,
    assume_ollama_dim=1536,
)

# 2.4 Vector store configuration – same collection as in the RAG demo.
vectorstore_cfg = VSConfig(
    provider="chroma",
    collection_name="intergrax_docs_drop_in_demo",
)

vectorstore_manager = VectorstoreManager(
    config=vectorstore_cfg,
    verbose=True,
)

# 2.5 Web search executor – wraps one or more providers (Google CSE, Bing, etc.)
websearch_executor = WebSearchExecutor(
    providers=[
        GoogleCSEProvider(),
    ],
    max_text_chars=None,
)

# 2.6 RuntimeConfig – single source of truth for drop-in knowledge runtime.
config = RuntimeConfig(
    # LLM & embeddings & vector store
    llm_adapter=llm_adapter,
    embedding_manager=embedding_manager,
    vectorstore_manager=vectorstore_manager,

    # RAG settings (enabled to allow attachment-based context, as in 03 demo)
    enable_rag=True,

    # Web search settings – THIS is the feature under test in this notebook.
    enable_websearch=True,
    websearch_executor=websearch_executor,
    max_docs_per_query=4,  # how many docs we inject into the system prompt

    # Tools are off to isolate the web search behavior; user profile memory stays enabled.
    tools_mode="off",
    enable_user_profile_memory=True,

    # Optional tenant / workspace (can be overridden per request)
    tenant_id="demo-tenant",
    workspace_id="demo-workspace",
)

# 2.7 Drop-in knowledge runtime – chat engine with RAG + web search.
runtime = DropInKnowledgeRuntime(
    config=config,
    session_manager=session_manager,
)

runtime


[intergraxVectorstoreManager] Initialized provider=chroma, collection=intergrax_docs_drop_in_demo
[intergraxVectorstoreManager] Existing count: 100


<intergrax.runtime.drop_in_knowledge_mode.engine.runtime.DropInKnowledgeRuntime at 0x1f371685cd0>
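
To make the role of `max_docs_per_query` and `max_text_chars` more concrete, the sketch below shows one way a runtime could cap and format web results before placing them into the system prompt. This is not Intergrax's actual implementation: `WebDoc`, `build_web_context`, and the prompt layout are hypothetical, and it assumes that `max_text_chars=None` means "do not truncate".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WebDoc:
    """Hypothetical stand-in for a single web search result."""
    title: str
    url: str
    text: str


def build_web_context(
    docs: List[WebDoc],
    max_docs: int = 4,
    max_text_chars: Optional[int] = None,
) -> str:
    """Format at most `max_docs` results as a numbered context block.

    With `max_text_chars=None` each document's text is kept untruncated
    (an assumption about the intended semantics).
    """
    blocks = []
    for i, doc in enumerate(docs[:max_docs], start=1):
        text = doc.text if max_text_chars is None else doc.text[:max_text_chars]
        blocks.append(f"[{i}] {doc.title} ({doc.url})\n{text}")
    return "\n\n".join(blocks)


# Six fake results; only the first four reach the context block.
docs = [
    WebDoc(f"Result {n}", f"https://example.com/{n}", "lorem ipsum " * 3)
    for n in range(1, 7)
]
print(build_web_context(docs, max_docs=4, max_text_chars=20))
```

A small cap keeps the prompt short, at the cost of dropping lower-ranked results that might have been relevant.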

### 3. Create a fresh chat session for this web search demo

We create a brand new session in the in-memory store.
All questions in this notebook will be sent under the same `session_id`
so that the runtime can keep short-term chat history (like ChatGPT).
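
As a rough illustration of what "short-term chat history per `session_id`" can mean, the sketch below keeps a bounded message buffer for each session. `ShortTermHistory` is a hypothetical class written for this example; the real `SessionManager` and `InMemorySessionStorage` may work differently.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)


class ShortTermHistory:
    """Hypothetical per-session buffer that keeps only the last `max_messages`."""

    def __init__(self, max_messages: int = 6) -> None:
        self._max_messages = max_messages
        self._by_session: Dict[str, Deque[Message]] = defaultdict(
            lambda: deque(maxlen=self._max_messages)
        )

    def append(self, session_id: str, role: str, content: str) -> None:
        # A deque with maxlen silently drops the oldest message when full.
        self._by_session[session_id].append((role, content))

    def get(self, session_id: str) -> List[Message]:
        # .get avoids creating an empty buffer for an unknown session.
        return list(self._by_session.get(session_id, ()))


history = ShortTermHistory(max_messages=2)
history.append("s1", "user", "Compare two agent frameworks.")
history.append("s1", "assistant", "Here is a comparison.")
history.append("s1", "user", "Based on that, which one would you recommend?")
print(history.get("s1"))
```

Keeping every question under one `session_id` is what allows a follow-up such as "Based on that, ..." to be interpreted against earlier turns.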

### Helper: `ask(question: str)` for interactive testing

This helper:
1. Builds a `RuntimeRequest` bound to the current session.
2. Sends it through `DropInKnowledgeRuntime.run(...)`.
3. Prints:
   - The user question
   - The model answer (optionally truncated)
   - Routing info (was RAG used? was web search used?)
   - Basic debug info (web search, RAG).

In [3]:
import json
import uuid

from intergrax.runtime.drop_in_knowledge_mode.responses.response_schema import RuntimeAnswer, RuntimeRequest

# 3.1 Create a fresh session for this demo.
session = await session_manager.create_session(
    session_id=str(uuid.uuid4()),
    user_id="demo-user-websearch",
    tenant_id="demo-tenant",
    workspace_id="demo-workspace",
    metadata={"notebook": "04_websearch_context_demo"},
)

print("Session created:")
print(f"- id: {session.id}")
print(f"- user_id: {session.user_id}")
print(f"- tenant_id: {session.tenant_id}")
print(f"- workspace_id: {session.workspace_id}")


async def ask(question: str, *, show_full_answer: bool = True) -> RuntimeAnswer:
    """
    Convenience helper for this notebook.

    - Sends the user's question through DropInKnowledgeRuntime.
    - Prints the answer and basic routing/debug information.
    """
    print("\n" + "=" * 80)
    print(f"USER QUESTION:\n{question}")
    print("=" * 80)

    request = RuntimeRequest(
        session_id=session.id,
        user_id=session.user_id,
        tenant_id=session.tenant_id,
        workspace_id=session.workspace_id,
        message=question,
        attachments=[],          # In this notebook we focus on web search (no files)
        metadata={"source": "04_websearch_demo"},
    )

    answer = await runtime.run(request)

    # 1) Print assistant answer
    print("\nASSISTANT ANSWER:\n")
    if show_full_answer:
        print(answer.answer)
    else:
        print(answer.answer[:700] + ("..." if len(answer.answer) > 700 else ""))

    # 2) Routing information (did RAG / websearch fire?)
    print("\n--- ROUTING INFO ---")
    print(f"strategy:           {answer.route.strategy}")
    print(f"used_rag:           {answer.route.used_rag}")
    print(f"used_websearch:     {answer.route.used_websearch}")
    print(f"used_tools:         {answer.route.used_tools}")
    print(f"used_user_profile:  {answer.route.used_user_profile}")

    # 3) Debug trace: show only relevant slices (websearch / rag)
    debug = answer.debug_trace or {}

    print("\n--- DEBUG: RAG ---")
    rag_debug = debug.get("rag")
    if rag_debug:
        print(json.dumps(rag_debug, indent=2, ensure_ascii=False))
    else:
        print("(no RAG debug info)")

    print("\n--- DEBUG: WEBSEARCH ---")
    web_debug = debug.get("websearch")
    if web_debug:
        print(json.dumps(web_debug, indent=2, ensure_ascii=False))
    else:
        print("(no websearch debug info)")

    return answer


answer_2 = await ask(
    "Based on that, which framework would you recommend for a production-ready multi-agent system and why?"
)

answer_2

Session created:
- id: 66fc2d38-91ba-471a-9464-fe55b5625d7e
- user_id: demo-user-websearch
- tenant_id: demo-tenant
- workspace_id: demo-workspace

USER QUESTION:
Based on that, which framework would you recommend for a production-ready multi-agent system and why?

ASSISTANT ANSWER:

Based on the provided search results, it's difficult to pinpoint a specific framework that would be recommended for a production-ready multi-agent system. However, I can try to infer some insights from the snippets:

1. The first snippet mentions Robotec.ai, which seems to focus on developing production-ready autonomous systems. This implies that they may have experience with creating scalable and reliable multi-agent systems.
2. The second snippet is about a control system controller called EClypse, which can be expanded to support up to 20 input/output (I/O) modules. While this doesn't directly relate to a framework for multi-agent systems, it does suggest that the manufacturer has experience with design

RuntimeAnswer(answer="Based on the provided search results, it's difficult to pinpoint a specific framework that would be recommended for a production-ready multi-agent system. However, I can try to infer some insights from the snippets:\n\n1. The first snippet mentions Robotec.ai, which seems to focus on developing production-ready autonomous systems. This implies that they may have experience with creating scalable and reliable multi-agent systems.\n2. The second snippet is about a control system controller called EClypse, which can be expanded to support up to 20 input/output (I/O) modules. While this doesn't directly relate to a framework for multi-agent systems, it does suggest that the manufacturer has experience with designing and implementing scalable and adaptable systems.\n3. The third snippet is about a research paper on the circular economy in the European photovoltaic industry. This seems unrelated to frameworks for multi-agent systems.\n4. The fourth snippet mentions Bart