# File Search Store Management for Rickbot

A notebook for experimenting with `FileSearchStore` and how it can be used to manage file search in the Rickbot Agent.

The best way to run this notebook is from Google Colab.

<a target="_blank" href="https://colab.research.google.com/github/derailed-dash/rickbot-adk/blob/main/file-search-store/notebooks/file_search_store.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" height=30/></a>

## Pre-Reqs and Notes

- The File Search Store feature (`file_search_stores`) is exclusive to the Gemini Developer API.
  - It does not work with the Vertex AI API or the Gen AI SDK in Vertex AI mode.
  - Therefore: don't set env vars for `GOOGLE_CLOUD_LOCATION` or `GOOGLE_GENAI_USE_VERTEXAI` and do not initialise Vertex AI.
- Make sure you have an up-to-date version of the `google-genai` package installed. 
  - Versions older than 1.49.0 do not support the File Search Tool.
  - You can upgrade the package using `pip install --upgrade google-genai`.
  - You can also add it to your `pyproject.toml` file; since it is not explicitly needed outside of this notebook, it can go in the `[jupyter]` section.
- Add your Gemini API Key to Colab as a secret. Then you can retrieve it using `userdata.get("GEMINI_API_KEY")`
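
The checks in the list above can be sketched as a small pre-flight cell. This is a minimal sketch, not part of the original notebook: the helper names `version_tuple`, `meets_minimum` and `vertex_vars_set` are hypothetical, and the version comparison only looks at the leading numeric components, so a pre-release such as `1.49.0rc1` is treated as older than `1.49.0`.

```python
import os
from importlib import metadata

MIN_GENAI_VERSION = "1.49.0"  # minimum version stated in the notes above
VERTEX_ENV_VARS = ("GOOGLE_CLOUD_LOCATION", "GOOGLE_GENAI_USE_VERTEXAI")


def version_tuple(version: str) -> tuple[int, ...]:
    """Convert the leading numeric parts of a version string into a tuple of ints."""
    parts = []
    for part in version.split("."):
        if not part.isdigit():
            break  # stop at the first non-numeric component, e.g. "0rc1"
        parts.append(int(part))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = MIN_GENAI_VERSION) -> bool:
    """Return True if `installed` is at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


def vertex_vars_set(environ=os.environ) -> list[str]:
    """Return the names of any Vertex AI related env vars that are set."""
    return [name for name in VERTEX_ENV_VARS if environ.get(name)]


# Report on the installed package version (hypothetical pre-flight check)
try:
    installed = metadata.version("google-genai")
    if not meets_minimum(installed):
        print(f"google-genai {installed} is older than {MIN_GENAI_VERSION}; please upgrade.")
except metadata.PackageNotFoundError:
    print("google-genai is not installed.")

# Warn about env vars that the notes above say should not be set
for name in vertex_vars_set():
    print(f"Warning: {name} is set; unset it to use the Gemini Developer API.")
```

The cell only prints warnings, so it can be run safely before any client is created.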

## Setup

In [None]:
import glob
import os
import time

from dotenv import load_dotenv
from google import genai
from google.genai.types import FileSearchStore
# from IPython.display import Markdown, display



### Local Only

If running locally, set up the Google Cloud environment:

```bash
source scripts/setup-env.sh
```

Then, to install the package dependencies into the virtual environment, use the `uv` tool:

1. From your agent's root directory, run `make install` to set up the virtual environment (`.venv`).
2. In this Jupyter notebook, select the kernel from the `.venv` folder to ensure all dependencies are available.

In [None]:
if load_dotenv(".env.local"):
    print("Loaded env")
else:
    print("Warning: .env.local file not found")

GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
if not GEMINI_API_KEY:
    raise ValueError("GEMINI_API_KEY environment variable not set.")
else:
    print("Successfully loaded Google API key.")

APP_NAME = "rickbot_notebook_client"

### Or In Colab

In [None]:
%pip install -q -U "google-genai>=1.49.0"

In [None]:
from google.colab import userdata

os.environ["GEMINI_API_KEY"] = userdata.get('GEMINI_API_KEY')

### Client Initialisation

In [None]:
client = genai.Client()
DAZBO_STORE_NAME = "rickbot-dazbo-ref"

## Store Management

### Create the Store (One Time)

In [None]:
file_search_store = client.file_search_stores.create(config={'display_name': DAZBO_STORE_NAME})

### View Stores

In [None]:
for a_store in client.file_search_stores.list():
    print(a_store)

### Retrieve the Store

In [None]:
def get_store(store_name: str) -> FileSearchStore | None:
    """Retrieve a store by display name, or None if no store matches."""
    for a_store in client.file_search_stores.list():
        if a_store.display_name == store_name:
            return a_store

    return None 
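
Since the create cell is meant to be run only once, and re-running it may produce a second store with the same display name, a "get or create" helper can make the setup repeatable. The following is a minimal sketch under that assumption; `get_or_create_store` is a hypothetical name, and it takes the client as a parameter rather than using the notebook's global `client`. It relies only on the `list()` and `create()` calls already used in this notebook.

```python
def get_or_create_store(client, display_name: str):
    """Return the store with this display name, creating it if none exists."""
    for a_store in client.file_search_stores.list():
        if a_store.display_name == display_name:
            return a_store  # reuse the existing store
    # No match found, so create a new store with this display name
    return client.file_search_stores.create(config={"display_name": display_name})
```

In this notebook it would be called as `get_or_create_store(client, DAZBO_STORE_NAME)`.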

### Delete a Store

In [None]:
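# Delete a store. Nothing is deleted unless the delete call below is uncommented.
file_search_store = get_store(DAZBO_STORE_NAME)
if file_search_store:
    print(f"Found store to delete: {file_search_store}")
    # Uncomment to delete
    # client.file_search_stores.delete(name=file_search_store.name, config={'force': True})
else:
    print(f"Store {DAZBO_STORE_NAME} not found.")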

## Upload and Process Files

Now we need to place the files in a suitable local folder to upload to the store.
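
As a convenience, the folder can be created and its contents checked before uploading. This is a minimal sketch; `list_upload_candidates` is a hypothetical helper, not part of the original notebook. It returns only regular files, so any subfolders are skipped.

```python
from pathlib import Path


def list_upload_candidates(upload_dir: str) -> list[Path]:
    """Create the folder if needed and return the regular files inside it, sorted by name."""
    folder = Path(upload_dir)
    folder.mkdir(parents=True, exist_ok=True)
    return sorted(p for p in folder.iterdir() if p.is_file())


# Example, using the Colab path from this notebook:
# for path in list_upload_candidates("/content/upload-files/"):
#     print(path.name)
```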

In [None]:
UPLOAD_PATH = "/content/upload-files/"

In [None]:
def upload_doc(file_path: str, file_search_store: FileSearchStore) -> None:
    """Upload a document to the file search store and wait for indexing."""

    file_name = os.path.basename(file_path)

    # Upload the file into the file search store with custom metadata
    operation = client.file_search_stores.upload_to_file_search_store(
        file_search_store_name=file_search_store.name,
        file=file_path,
        config={
            'display_name': file_path,  # or we could determine the title
            # 'chunking_config': chunking_config["chunking_config"],
            'custom_metadata': [
                # {"key": "title", "string_value": title},
                {"key": "file_name", "string_value": file_name},
            ],
        },
    )

    # Poll the long-running operation until the upload and indexing complete
    while not operation.done:
        time.sleep(5)
        print("Waiting")
        operation = client.operations.get(operation)

    print(f"{file_name} is uploaded and indexed")
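
The polling loop above waits indefinitely if an operation never completes. One possible variation, sketched here under the assumption that a timeout is wanted, bounds the wait. `wait_for_operation` and its `poll_seconds` and `timeout_seconds` parameters are hypothetical and not part of the SDK; the only SDK call used is `client.operations.get(operation)`, as in the function above.

```python
import time


def wait_for_operation(client, operation, poll_seconds: float = 5, timeout_seconds: float = 600):
    """Poll a long-running operation until it is done or the timeout elapses."""
    deadline = time.monotonic() + timeout_seconds
    while not operation.done:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Operation not complete after {timeout_seconds} seconds")
        time.sleep(poll_seconds)
        operation = client.operations.get(operation)  # refresh the operation status
    return operation
```

Inside `upload_doc`, the `while` loop could then be replaced with `operation = wait_for_operation(client, operation)`.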

In [None]:
file_search_store = get_store(DAZBO_STORE_NAME)
if file_search_store is None:
    print(f"Store {DAZBO_STORE_NAME} not found.")
else:
    # os.path.join avoids a doubled slash, since UPLOAD_PATH already ends with "/"
    files_to_upload = glob.glob(os.path.join(UPLOAD_PATH, '*'))
    for file_path in files_to_upload:
        print(f"Uploading {file_path}")
        # upload_doc(file_path, file_search_store)
