# **RAG with the help of Gemini File Search**

File Search imports, chunks, and indexes your data so that passages relevant to a user's prompt can be retrieved quickly. The retrieved passages are then passed to the model as context, which helps it give more accurate and relevant answers.
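To make the chunk, index, and retrieve steps concrete, here is a toy sketch in plain Python. It is not how File Search is implemented: the helper names (`chunk_text`, `build_index`, `retrieve`), the word-overlap scoring, and the sample text are all assumptions made up for illustration, and the managed service handles these steps for you.

```python
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(chunks):
    """Index each chunk as a bag of words (a stand-in for a real search index)."""
    return [Counter(tokenize(chunk)) for chunk in chunks]


def retrieve(query, chunks, index, top_k=1):
    """Return up to top_k chunks ranked by how often they contain the query's words."""
    query_tokens = set(tokenize(query))
    scores = [sum(counts[token] for token in query_tokens) for counts in index]
    ranked = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)
    return [chunks[i] for i in ranked[:top_k] if scores[i] > 0]


# Hypothetical sample document, invented for this sketch.
document = (
    "Alice maintains the billing service and is on call during weekdays. "
    "Bob maintains the search service and handles deployments on Fridays."
)

chunks = chunk_text(document, max_words=11)
index = build_index(chunks)

question = "Who handles deployments?"
context = retrieve(question, chunks, index)

# The retrieved chunk is what would be handed to the model as context.
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question
print(prompt)
```

Only the chunk that shares words with the question ends up in the prompt, which is the basic idea behind retrieval-augmented generation.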

In [3]:
from google import genai
from google.genai import types
from google.colab import userdata
import time

In [4]:
api_key = userdata.get('GOOGLE_API_KEY')

In [5]:
client = genai.Client(api_key=api_key)

In [6]:
# Create the File Search store with an optional display name
file_search_store = client.file_search_stores.create(config={'display_name': 'my-file-search-store'})

In [12]:
# Upload a file and import it into the File Search store.
# The display name is what appears in citations.
operation = client.file_search_stores.upload_to_file_search_store(
    file='/content/resume.txt',
    file_search_store_name=file_search_store.name,
    config={
        'display_name': 'resume.txt',
    },
)


# Wait until import is complete
while not operation.done:
    time.sleep(5)
    operation = client.operations.get(operation)
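The loop above polls indefinitely, so a stalled import would hang the notebook. One possible variant is a small polling helper with a timeout. The `wait_for` function and its parameters are hypothetical names introduced here, not part of the `google-genai` SDK, and the fake operation at the bottom exists only so the sketch runs without an API key.

```python
import time


def wait_for(poll, is_done, interval=5.0, timeout=300.0,
             sleep=time.sleep, clock=time.monotonic):
    """Call poll() repeatedly until is_done(result) is true.

    Raises TimeoutError if the result is still not done after `timeout` seconds.
    `sleep` and `clock` are injectable so the helper can be tested without waiting.
    """
    deadline = clock() + timeout
    result = poll()
    while not is_done(result):
        if clock() >= deadline:
            raise TimeoutError(f"operation not done after {timeout} seconds")
        sleep(interval)
        result = poll()
    return result


# Stand-in for a long-running operation: reports done on the third poll.
class FakeOperation:
    def __init__(self):
        self.polls = 0
        self.done = False

    def refresh(self):
        self.polls += 1
        self.done = self.polls >= 3
        return self


fake = FakeOperation()
finished = wait_for(fake.refresh, lambda op: op.done, interval=0.0)
print(finished.done, finished.polls)

# In this notebook the call might look like (assumption, mirrors the loop above):
# operation = wait_for(lambda: client.operations.get(operation), lambda op: op.done)
```

Passing `sleep` and `clock` as parameters is a design choice that keeps the helper easy to exercise with fake timing.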

In [15]:
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Can you tell me about Tushar Wagh?",
    config=types.GenerateContentConfig(
        tools=[
            types.Tool(
                file_search=types.FileSearch(
                    file_search_store_names=[file_search_store.name]
                )
            )
        ]
    )
)

print(response.text)

Tushar Wagh is a Data Scientist and Data Engineer based in Pune, Maharashtra, India. He has strong technical skills in Data Science, including Machine Learning and Deep Learning algorithms for Computer Vision and LLM, as well as Data Engineering, focusing on Data Pipelines and Data Profiling.

His expertise includes:
*   **Python:** Deep learning, machine learning, image processing, data processing, numerical calculation, data visualization, and data quality packages.
*   **Cloud Platforms:** AWS (Data pipeline, EMR, Sagemaker, SNS, Redshift), Azure (Databricks, Azure Data Factory), and GCP (Vertex AI, Google Cloud Functions, Cloud Run, App Engine, Cloud Storage, Pub/Sub, BigQuery, Artificial Intelligence, AutoML for Vision).
*   **Big Data:** PySpark.
*   **Web Frameworks:** Flask, Streamlit, FastAPI.
*   **Databases:** MySQL, MongoDB, ElasticSearch, PostgreSQL.
*   **Orchestration:** Airflow, Rundeck.
*   **Containerization:** Docker, Kubernetes.
*   **Monitoring:** DataDog.
*   **Ge