A conversational AI agent that searches, filters, and discovers files in Google Drive using natural language. Built with LangChain tool calling and direct Drive API query generation.
- Frontend: https://dossier-drive-assistant.streamlit.app/
- Backend API: https://dossier-q74p.onrender.com
Try it now! Ask questions like:
- "Find all PDF reports"
- "Show me spreadsheets from last month"
- "Documents that mention budget"
User Query: "Find budget reports from last week"
│
▼
Streamlit Frontend (Streamlit Cloud)
│ HTTP POST /chat
▼
FastAPI Backend (Render)
│
▼
LangChain Tool-Calling Agent
│
├─ LLM generates Drive API query string:
│ "(name contains 'budget' or name contains 'report')
│ and modifiedTime > '2024-05-05T00:00:00'
│ and trashed = false
│ and mimeType != 'application/vnd.google-apps.folder'"
│
▼
DriveSearchTool
│
▼
Google Drive API (files.list)
│ - Recursive folder search
│ - Filters by q parameter
│
▼
Results → LLM formats response → User
- Direct Query Generation: LLM writes Drive API q parameter strings directly (no intermediate JSON)
- Full Drive API Power: Supports complex queries with and/or/parentheses
- Smart Search:
  - Name search: name contains 'keyword'
  - Content search: fullText contains 'keyword'
  - Type filtering: mimeType = 'application/pdf'
  - Date filtering: modifiedTime > '2024-01-01T00:00:00'
- Recursive Folder Search: Finds files in subfolders automatically
- Plural/Singular Handling: "reports" finds both "report" and "reports"
- Auto-Retry: If no results, agent automatically broadens search criteria
- Tool-Based Architecture: Uses LangChain function calling for transparency
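
The plural/singular handling listed above can be illustrated with a small helper. This is a minimal sketch, not the project's implementation: in the project the LLM writes the query itself, and the names `escape_literal`, `name_variants`, and `name_clause` are hypothetical. The singularization rule (strip a trailing "s") is a deliberately naive assumption.

```python
def escape_literal(value: str) -> str:
    """Escape backslashes and single quotes for a Drive query string literal."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def name_variants(keyword: str) -> list[str]:
    """Return the keyword, plus a naive singular form if it looks plural."""
    word = keyword.strip()
    lower = word.lower()
    # Assumption: a trailing "s" (but not "ss") on a word longer than
    # three characters marks a plural.
    if len(word) > 3 and lower.endswith("s") and not lower.endswith("ss"):
        return [word, word[:-1]]
    return [word]


def name_clause(keyword: str) -> str:
    """Build a parenthesized 'name contains' clause covering all variants."""
    parts = [f"name contains '{escape_literal(v)}'" for v in name_variants(keyword)]
    return "(" + " or ".join(parts) + ")"


print(name_clause("reports"))
```

Irregular plurals ("categories", "indices") are not handled by this rule; an LLM-generated query can cover those cases where a fixed rule cannot.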
- Python 3.11+
- A Google Cloud project with Drive API enabled
- A Groq API key (free at https://console.groq.com)
- Create Google Cloud Project
  - Go to Google Cloud Console
  - Create a new project
  - Enable Google Drive API
- Create Service Account
  - Go to IAM & Admin → Service Accounts
  - Create a service account
  - Create a JSON key → download it
- Share Drive Folder
  - Create or use an existing Google Drive folder
  - Share it with the service account's email (Viewer permission)
  - Copy the folder ID from the URL: https://drive.google.com/drive/folders/FOLDER_ID_HERE
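
If copying the folder ID by hand is error-prone, it can be pulled out of the URL programmatically. The sketch below uses only the standard library; `folder_id_from_url` is a hypothetical helper, not part of the project.

```python
from urllib.parse import urlparse


def folder_id_from_url(url: str) -> str:
    """Return the path segment that follows 'folders' in a Drive folder URL."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if "folders" not in segments:
        raise ValueError(f"not a Drive folder URL: {url}")
    index = segments.index("folders")
    if index + 1 >= len(segments):
        raise ValueError(f"no folder ID found in: {url}")
    return segments[index + 1]


print(folder_id_from_url("https://drive.google.com/drive/folders/FOLDER_ID_HERE"))
```

Query parameters such as `?usp=sharing` are ignored because only the URL path is inspected.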
Set PYTHON_VERSION=3.11 to avoid pydantic-core compilation issues under Python 3.14.
Live backend: https://dossier-q74p.onrender.com
Live frontend (Streamlit Cloud): https://dossier-drive-assistant.streamlit.app/
| User Query | Generated Drive API Query |
|---|---|
| "find budget PDFs" | (name contains 'budget') and mimeType = 'application/pdf' and trashed = false and mimeType != 'application/vnd.google-apps.folder' |
| "reports from last week" | (name contains 'reports' or name contains 'report') and modifiedTime > '2024-05-05T00:00:00' and trashed = false and mimeType != 'application/vnd.google-apps.folder' |
| "documents mentioning quarterly revenue" | fullText contains 'quarterly revenue' and trashed = false and mimeType != 'application/vnd.google-apps.folder' |
| "show me all spreadsheets" | (mimeType = 'application/vnd.google-apps.spreadsheet' or mimeType = 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet') and trashed = false and mimeType != 'application/vnd.google-apps.folder' |
| "find invoices" | (name contains 'invoices' or name contains 'invoice') and trashed = false and mimeType != 'application/vnd.google-apps.folder' |
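
Every query in the table ends with the same two filters: exclude trashed files and exclude folders. The sketch below shows one way such queries could be assembled deterministically. It is illustrative only, since the project has the LLM emit the string directly; `MIME_TYPES`, `type_clause`, and `with_default_filters` are hypothetical names.

```python
FOLDER_MIME = "application/vnd.google-apps.folder"

# Assumed mapping from a friendly type name to Drive MIME types.
MIME_TYPES = {
    "pdf": ["application/pdf"],
    "spreadsheet": [
        "application/vnd.google-apps.spreadsheet",
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ],
}


def type_clause(kind: str) -> str:
    """Build a mimeType clause, parenthesized when several types apply."""
    parts = [f"mimeType = '{mime}'" for mime in MIME_TYPES[kind]]
    joined = " or ".join(parts)
    return f"({joined})" if len(parts) > 1 else joined


def with_default_filters(*clauses: str) -> str:
    """Join clauses with 'and', then append the trashed and folder filters."""
    parts = [clause for clause in clauses if clause]
    parts.append("trashed = false")
    parts.append(f"mimeType != '{FOLDER_MIME}'")
    return " and ".join(parts)


print(with_default_filters("(name contains 'budget')", type_clause("pdf")))
```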
| Component | Technology |
|---|---|
| Backend Framework | FastAPI + Uvicorn |
| Agent Framework | LangChain (Tool Calling Agent) |
| LLM | Groq API — llama-3.3-70b-versatile |
| Drive Integration | Google Drive API v3 (files.list method) |
| Frontend | Streamlit 1.39+ |
| Backend Hosting | Render (Web Service) |
| Frontend Hosting | Streamlit Cloud |
| Language | Python 3.11+ |
dossier/
├── backend/
│   ├── main.py                  # FastAPI app entry point
│   ├── agent_tool_based.py      # LangChain tool-calling agent
│   ├── drive_search_tool.py     # DriveSearchTool implementation
│   ├── drive_client.py          # Google Drive API wrapper
│   ├── schemas.py               # Pydantic models
│   ├── requirements.txt         # Python dependencies
│   ├── runtime.txt              # Python version for Render
│   └── .env.example             # Environment variables template
│
├── frontend/
│   ├── app.py                   # Streamlit UI
│   ├── requirements.txt         # Frontend dependencies
│   └── .streamlit/
│       └── secrets.toml.example # Secrets template
│
├── .gitignore
├── README.md
└── render.yaml                  # Render deployment config
The user types a natural language query: "Find budget reports from last week"
The LLM translates the request into a Drive API query string:
(name contains 'budget' or name contains 'report')
and modifiedTime > '2024-05-05T00:00:00'
and trashed = false
and mimeType != 'application/vnd.google-apps.folder'
DriveSearchTool executes the query via the Google Drive API:
- Uses the files.list method
- Searches recursively in subfolders
- Returns matching files with metadata
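
The recursive search can be sketched as a breadth-first walk over folders. This is an assumption about how the traversal might work, not the actual DriveSearchTool code. The `list_files` callable is a hypothetical stand-in for a paginated wrapper around files.list, injected so the traversal logic can be run without credentials.

```python
from collections import deque
from typing import Callable, Iterable

FOLDER_MIME = "application/vnd.google-apps.folder"

# Hypothetical signature: takes a Drive q string, yields file metadata dicts.
ListFn = Callable[[str], Iterable[dict]]


def search_recursive(root_id: str, query: str, list_files: ListFn) -> list[dict]:
    """Run `query` in the root folder and every folder beneath it."""
    matches: list[dict] = []
    seen = {root_id}
    queue = deque([root_id])
    while queue:
        folder_id = queue.popleft()
        # Files in this folder that satisfy the user's query.
        matches.extend(list_files(f"'{folder_id}' in parents and ({query})"))
        # Direct subfolders, queued for a later pass.
        subfolders_q = (
            f"'{folder_id}' in parents and mimeType = '{FOLDER_MIME}' "
            "and trashed = false"
        )
        for folder in list_files(subfolders_q):
            if folder["id"] not in seen:
                seen.add(folder["id"])
                queue.append(folder["id"])
    return matches
```

The `seen` set guards against visiting a folder twice, which can happen when a folder is reachable through more than one path. Each folder costs two list calls in this sketch, so deep trees may be slow against the real API.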
If no results are found, the agent automatically:
- Broadens search criteria
- Removes date filters
- Tries again with relaxed query
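
The retry behavior above can be approximated in plain code. In the project the agent decides how to relax the query, so the following is only a sketch of one relaxation (dropping date filters); `drop_date_filters`, `search_with_retry`, and the `run_query` callable are hypothetical.

```python
import re
from typing import Callable

# Matches " and modifiedTime > '...'" style clauses (also createdTime).
# Limitation: a date clause that opens the query is left untouched.
DATE_FILTER = re.compile(
    r"\s+and\s+(?:modifiedTime|createdTime)\s*(?:<=|>=|<|>|=)\s*'[^']*'",
    re.IGNORECASE,
)


def drop_date_filters(query: str) -> str:
    """Remove date comparison clauses that follow an 'and'."""
    return DATE_FILTER.sub("", query)


def search_with_retry(
    query: str, run_query: Callable[[str], list]
) -> tuple[list, str]:
    """Run the query; if it returns nothing, retry once without date filters."""
    results = run_query(query)
    if results:
        return results, query
    relaxed = drop_date_filters(query)
    if relaxed == query:
        # Nothing to relax, so report the empty result as-is.
        return results, query
    return run_query(relaxed), relaxed
```

Returning the query that was actually used lets the caller tell the user that the date restriction was lifted.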
The LLM generates a friendly response: "I found 4 daily reports in PDF format. You can view them by clicking on the file cards."
✅ Name Search (partial match)
name contains 'invoice'
✅ Content Search (full-text)
fullText contains 'quarterly revenue'
✅ File Type Filtering
mimeType = 'application/pdf'
mimeType contains 'image/'
✅ Date Filtering
modifiedTime > '2024-01-01T00:00:00'
modifiedTime < '2024-12-31T23:59:59'
✅ Complex Queries
(name contains 'budget' or fullText contains 'budget')
and mimeType = 'application/pdf'
and modifiedTime > '2024-01-01T00:00:00'
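
Date filters need a concrete timestamp, so a relative phrase such as "last week" has to be resolved first. The sketch below assumes "last week" means the past 7 days and that timestamps are expressed in UTC. `modified_since` is a hypothetical helper; in the project the LLM produces the timestamp.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def modified_since(days: int, now: Optional[datetime] = None) -> str:
    """Build a modifiedTime clause for files changed in the last `days` days."""
    # `now` should be timezone-aware; it defaults to the current UTC time.
    current = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    cutoff = current - timedelta(days=days)
    stamp = cutoff.strftime("%Y-%m-%dT%H:%M:%S")
    return f"modifiedTime > '{stamp}'"


# "last week", evaluated at midnight UTC on 2024-05-12
print(modified_since(7, now=datetime(2024, 5, 12, tzinfo=timezone.utc)))
```

Passing `now` explicitly keeps the function deterministic, which is convenient for tests.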
- Documents: PDF, Google Docs, Word (.docx)
- Spreadsheets: Google Sheets, Excel (.xlsx)
- Presentations: Google Slides, PowerPoint (.pptx)
- Images: JPEG, PNG, GIF, WebP
- And more: Any file type supported by Google Drive
This project is licensed under the MIT License.
- Built with LangChain
- Powered by Groq
- Uses Google Drive API
- Frontend with Streamlit
Made with ❤️ by Magenta91