A production-ready Google Drive clone built from the ground up. This isn't just another file manager—it's a complete cloud storage solution with user authentication, file sharing, collaboration features, and everything you'd expect from a modern drive service.
After working with various cloud storage solutions and feeling frustrated by their limitations, I decided to build something better. This project implements a full-featured drive application with:
- Multi-user authentication with JWT tokens
- File and folder management with drag-and-drop
- Sharing and permissions (Viewer, Commenter, Editor)
- Search and filtering across all your files
- Storage analytics and quota management
- Activity tracking and version history
- Real-time UI built with React and TypeScript
Everything runs in a single Docker container, making deployment trivial. Pull the image, run it, and you're good to go.
We've packaged everything into a single Docker image that includes PostgreSQL, Redis, MinIO, the backend API, and the frontend. No docker-compose needed, no configuration headaches.
docker run -d \
--name gdrive-clone \
-p 3000:3000 \
-p 8000:8000 \
-p 9000:9000 \
-p 9001:9001 \
venky1701/gdrive-clone:latest
That's it. The container handles database initialization, service startup, and everything else automatically.
Access Points:
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000/api
- API Documentation: http://localhost:8000/docs
- MinIO Console: http://localhost:9001 (minioadmin/minioadmin)
Default Credentials:
- Email: demo@example.com
- Password: demo123
The application automatically initializes with mock data when you first start it, so you'll have sample users, files, and folders to work with immediately.
If you prefer running things locally for development:
# Clone the repository
git clone https://github.com/venkat1701/gdrive.git
cd gdrive
# Backend setup
cd backend
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
# You'll need PostgreSQL, Redis, and MinIO running
# Then start the backend:
uvicorn app.main:app --reload
# Frontend setup (in another terminal)
cd frontend
npm install
npm run dev
See the Setup Guide for detailed instructions.
Understanding the codebase is straightforward once you see how it's organized:
gdrive/
├── backend/ # FastAPI application
│ ├── app/
│ │ ├── api/ # REST API endpoints
│ │ ├── core/ # Configuration, database, security
│ │ ├── models/ # SQLAlchemy database models
│ │ ├── schemas/ # Pydantic validation schemas
│ │ ├── services/ # Business logic layer
│ │ ├── storage/ # MinIO/S3 storage abstraction
│ │ └── init_data.py # Mock data initialization
│ └── requirements.txt
│
├── frontend/ # React + TypeScript application
│ ├── src/
│ │ ├── api/ # API client functions
│ │ ├── components/ # Reusable UI components
│ │ ├── features/ # Feature-specific pages
│ │ ├── stores/ # Zustand state management
│ │ └── utils/ # Helper functions
│ └── package.json
│
├── docs/ # Documentation
├── docker-compose.yml # Multi-container setup
├── Dockerfile # Unified container image
└── entrypoint.sh # Container initialization script
When the Docker container starts, entrypoint.sh handles the initialization sequence:
- PostgreSQL Setup: Detects the PostgreSQL version, initializes the database if needed, creates the gdrive database, and sets up the postgres user password.
- Service Startup: Supervisor starts all services in order:
- PostgreSQL (priority 100)
- Redis (priority 200)
- MinIO (priority 300)
- Service initialization script (priority 350) - sets up MinIO buckets
- Backend API (priority 400) - waits 10 seconds for dependencies
- Nginx (priority 500) - serves frontend and proxies API requests
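The ordering above follows Supervisor's rule that lower priority values start first. The sketch below reproduces that ordering and shows one way a fixed 10-second wait could be replaced by polling a dependency's port. It is an illustration only, not the project's entrypoint logic: the helper names `startup_order` and `wait_for_port` and the short service labels are hypothetical.

```python
import socket
import time

# Priorities copied from the list above; lower values start first.
SERVICES = {
    "nginx": 500,
    "backend": 400,
    "postgresql": 100,
    "minio": 300,
    "init-services": 350,
    "redis": 200,
}


def startup_order(services: dict[str, int]) -> list[str]:
    """Return service names sorted by ascending priority."""
    return sorted(services, key=services.get)


def wait_for_port(host: str, port: int, timeout: float = 10.0, interval: float = 0.2) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not accepting connections yet, retry shortly
    return False


print(startup_order(SERVICES))
```

Calling `wait_for_port("localhost", 5432)` before starting the backend would return as soon as PostgreSQL accepts connections instead of always waiting the full interval.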
The backend (backend/app/main.py) is the application entry point. On startup, it:
- Enables PostgreSQL trigram extension for full-text search
- Creates database tables via SQLAlchemy
- Runs init_mock_data() to populate the database with sample data
- Sets up CORS middleware
- Registers all API routers
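Trigram matching, which the extension enabled at startup provides, compares strings by the three-character substrings they share. The sketch below is a rough pure-Python approximation of that idea and is not what runs inside PostgreSQL: it splits words on whitespace only, and the function names are hypothetical. The padding rule and the 0.3 threshold mirror pg_trgm's documented defaults.

```python
def trigrams(text: str) -> set[str]:
    """Collect the 3-character substrings of each padded, lowercased word."""
    grams: set[str] = set()
    for word in text.lower().split():
        padded = f"  {word} "  # two leading spaces and one trailing space per word
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams, from 0.0 to 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def search(query: str, names: list[str], threshold: float = 0.3) -> list[str]:
    """Return names scoring at or above the threshold, best match first."""
    scored = [(similarity(query, name), name) for name in names]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]
```

Because the score depends on overlapping fragments rather than exact words, a query such as `reports` still matches a file named `report`.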
The frontend entry point is frontend/src/main.tsx, which renders the React app and sets up routing.
The application includes a comprehensive mock data system that runs automatically when the database is empty. This happens in backend/app/init_data.py.
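The "only when the database is empty" check is what makes seeding safe to run on every startup. Here is a minimal sketch of that pattern, using the standard library's sqlite3 as a stand-in for PostgreSQL and SQLAlchemy. The `seed_if_empty` function and the single-column `users` table are hypothetical and far simpler than the real init_data.py.

```python
import sqlite3

# Two of the demo emails listed in this README; the table layout is hypothetical.
DEMO_EMAILS = ["demo@example.com", "john.doe@example.com"]


def seed_if_empty(conn: sqlite3.Connection) -> int:
    """Insert demo users only when the users table is empty; return rows inserted."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL)"
    )
    (count,) = conn.execute("SELECT COUNT(*) FROM users").fetchone()
    if count > 0:
        return 0  # data already present, so seeding is skipped
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO users (email) VALUES (?)", [(e,) for e in DEMO_EMAILS])
    return len(DEMO_EMAILS)
```

Calling the function a second time on the same connection inserts nothing, so restarting the container does not duplicate sample data.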
What Gets Created:
- 6 demo users with various names and email addresses
- Organized folder structures for each user (Documents, Photos, Work, Personal, Projects)
- Nested folders (e.g., Work/Reports, Work/Presentations)
- Various file types (PDFs, images, documents, spreadsheets)
- Starred items and trashed files for testing different views
- File shares between users to test collaboration
- Activity logs for file operations
- Comments on shared files
Using Mock Data:
The mock data initialization runs automatically when you start the application for the first time. You can verify it worked by checking the backend logs—you should see messages like "Created 6 users" and "Created X files/folders for user...".
If you want to reset the mock data, you can:
- Drop and recreate the database
- Or manually call init_mock_data() from a Python shell
Test Accounts:
- demo@example.com / demo123
- john.doe@example.com / password123
- jane.smith@example.com / password123
- alice.johnson@example.com / password123
- bob.wilson@example.com / password123
- emma.brown@example.com / password123
Each account has its own files, folders, and some shared items between them.
This project follows a clean, layered architecture that separates concerns and makes the codebase maintainable.
┌─────────────────────────────────────────────────────────┐
│                     React Frontend                      │
│    (TypeScript, Zustand, React Query, Tailwind CSS)     │
│                        Port 3000                        │
└────────────────────────────┬────────────────────────────┘
                             │ HTTP/REST API
                             │
┌────────────────────────────▼────────────────────────────┐
│                     FastAPI Backend                     │
│           (Python 3.11, SQLAlchemy, Pydantic)           │
│                        Port 8000                        │
└──────┬─────────────┬────────────┬──────────────┬────────┘
       │             │            │              │
┌──────▼─────┐  ┌────▼────┐  ┌────▼────┐  ┌──────▼─────┐
│ PostgreSQL │  │  Redis  │  │  MinIO  │  │ Supervisor │
│   :5432    │  │  :6379  │  │  :9000  │  │  (Process  │
│            │  │         │  │         │  │  Manager)  │
└────────────┘  └─────────┘  └─────────┘  └────────────┘
The backend follows a layered architecture pattern:
API Layer (app/api/): HTTP endpoints that handle requests and responses
- auth.py: Authentication (register, login, token refresh)
- files.py: File CRUD operations, upload, download
- shares.py: Sharing logic and permissions
- storage.py: Storage analytics and quota management
- comments.py: File comments and discussions
- links.py: Shareable link generation
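The sharing endpoints rely on the three permission levels described in this README (Viewer, Commenter, Editor). One common way to model levels where each includes the one below it is an ordered enum. The sketch below assumes that approach. The `Permission` class, the `can` helper, and the action names are hypothetical, not taken from the project's code.

```python
from enum import IntEnum
from typing import Optional


class Permission(IntEnum):
    """Ordered so that a higher value includes every lower capability."""
    VIEWER = 1      # read-only
    COMMENTER = 2   # read + comment
    EDITOR = 3      # full access


# Minimum level required per action; the action names are illustrative.
REQUIRED = {
    "read": Permission.VIEWER,
    "comment": Permission.COMMENTER,
    "write": Permission.EDITOR,
}


def can(level: Optional[Permission], action: str) -> bool:
    """True if the granted level is sufficient; None means the item is not shared with the user."""
    if level is None:
        return False
    return level >= REQUIRED[action]
```

With this layout, adding a new action only needs one entry in `REQUIRED`, and every check stays a single comparison.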
Service Layer (app/services/): Business logic separated from HTTP concerns
- file_service.py: File operations, validation, path management
- share_service.py: Permission checking, share management
- preview_service.py: File preview generation
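Path management here relates to the materialized path pattern named in the feature list: each folder stores the full chain of ancestor ids, so a subtree lookup becomes a prefix match (in SQL, roughly `path LIKE '/1/%'`). The sketch below shows the idea on an in-memory dict. The path format with a trailing slash and the helper names are assumptions, not the project's schema.

```python
def child_path(parent_path: str, folder_id: int) -> str:
    """Append a folder id to its parent's path; the trailing slash keeps '/1/' from matching '/10/'."""
    return f"{parent_path}{folder_id}/"


def descendants(paths: dict[int, str], folder_id: int) -> list[int]:
    """Ids whose path starts with the folder's path, excluding the folder itself."""
    prefix = paths[folder_id]
    return sorted(i for i, p in paths.items() if p.startswith(prefix) and i != folder_id)


def move(paths: dict[int, str], folder_id: int, new_parent_id: int) -> None:
    """Re-root a folder and its whole subtree by rewriting the shared path prefix."""
    old_prefix = paths[folder_id]
    new_parent = paths[new_parent_id]
    if new_parent.startswith(old_prefix):
        raise ValueError("cannot move a folder into itself or its own subtree")
    new_prefix = child_path(new_parent, folder_id)
    for i, p in paths.items():
        if p.startswith(old_prefix):
            paths[i] = new_prefix + p[len(old_prefix):]


# Example tree: folder 1 contains 2, which contains 3; folder 4 is a separate root.
paths = {1: "/1/", 2: "/1/2/", 3: "/1/2/3/", 4: "/4/"}
move(paths, 2, 4)  # folder 2 and its child 3 now live under folder 4
```

The trade-off is that reads need no recursion, while a move has to rewrite every path in the moved subtree.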
Data Layer (app/models/): SQLAlchemy ORM models
- user.py: User model with authentication fields
- file.py: File, FileShare, FileActivity, FileVersion models
Storage Layer (app/storage/): Abstraction over object storage
- minio_storage.py: MinIO client wrapper, handles presigned URLs
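A presigned URL carries its own expiry time and a signature, so the storage server can validate a download without a session. The sketch below illustrates that principle with a deliberately simplified HMAC scheme. It is not the AWS Signature Version 4 algorithm that S3-compatible servers such as MinIO actually use, and the `SECRET` value, function names, and query parameter names are all hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"demo-signing-key"  # placeholder; a real key would come from configuration


def _signature(path: str, expires: int) -> str:
    """HMAC-SHA256 over the path and expiry, so neither can be altered unnoticed."""
    message = f"{path}\n{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_url(path: str, expires_in: int, now: Optional[float] = None) -> str:
    """Return the path with an expiry timestamp and a signature covering both."""
    expires = int((time.time() if now is None else now) + expires_in)
    query = urlencode({"expires": expires, "signature": _signature(path, expires)})
    return f"{path}?{query}"


def verify_url(url: str, now: Optional[float] = None) -> bool:
    """Accept the URL only if it has not expired and the signature matches."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    if (time.time() if now is None else now) > expires:
        return False
    expected = _signature(parts.path, expires)
    return hmac.compare_digest(signature.encode(), expected.encode())
```

Changing either the path or the expiry invalidates the signature, which is what makes such links both secure and time-limited.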
The frontend uses a feature-based organization:
State Management: Zustand stores for global state
- authStore.ts: Authentication state and user info
- fileStore.ts: File operations and current folder state
- uploadStore.ts: Upload queue and progress tracking
- themeStore.ts: UI theme preferences
API Client (src/api/): Centralized API functions using Axios
- Abstracts HTTP calls, handles authentication tokens
- Provides type-safe interfaces for all backend endpoints
Components (src/components/): Reusable UI components
- File browsers, modals, dialogs, navigation elements
- Built with Tailwind CSS for styling
Features (src/features/): Feature-specific pages and logic
- auth/: Login and registration pages
- drive/: Drive views (My Drive, Shared, Recent, Starred, Trash, Storage)
Why FastAPI? Async support, automatic OpenAPI docs, type safety with Pydantic, and excellent performance. It's the perfect fit for a modern Python API.
Why PostgreSQL? ACID compliance, full-text search with trigram indexes, mature ecosystem, and excellent performance. Plus, it handles JSON data when needed.
Why MinIO? S3-compatible API means we can swap in AWS S3 or any S3-compatible storage later. It's self-hosted, fast, and scales well.
Why Redis? Fast caching layer, perfect for session management and token blacklisting. We can extend it for real-time features later.
Why React + TypeScript? Component reusability, type safety, huge ecosystem, and excellent developer experience. Zustand keeps state management simple.
Why a Unified Docker Image? Simplifies deployment. One image, one command, everything works. Perfect for demos, testing, and production deployments where you want minimal configuration.
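As an illustration of the token blacklisting mentioned in the Redis rationale: a revoked token only needs to be remembered until it would have expired anyway, which maps naturally onto a key with a time-to-live (in Redis, a `SET` with the `EX` option). The sketch below uses an in-memory dict as a stand-in for Redis. The `TokenBlacklist` class and its method names are hypothetical, not the project's implementation.

```python
import time
from typing import Callable


class TokenBlacklist:
    """Stores revoked token ids until the token itself would have expired."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._expiry: dict[str, float] = {}
        self._clock = clock  # injectable so tests can control time

    def revoke(self, token_id: str, expires_at: float) -> None:
        """Blacklist a token; already-expired tokens need no entry."""
        if expires_at > self._clock():
            self._expiry[token_id] = expires_at

    def is_revoked(self, token_id: str) -> bool:
        """True while a revoked token is still within its lifetime."""
        expires_at = self._expiry.get(token_id)
        if expires_at is None:
            return False
        if expires_at <= self._clock():
            del self._expiry[token_id]  # lazy cleanup, similar to a key TTL lapsing
            return False
        return True
```

On logout, the backend would call `revoke` with the token's id and expiry, and every authenticated request would check `is_revoked` before trusting the token.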
This project includes comprehensive documentation for training RL agents to interact with the application. The RL Training Guide provides:
- Complete state space definition: What the agent can observe (UI state, file data, user context)
- Action space definition: All possible actions the agent can take (navigation, file operations, UI interactions)
- Reward structure: How to design rewards for different tasks
- Environment setup: How to integrate with the application for RL training
- Example scenarios: Common tasks an agent should learn
The guide is designed for researchers and developers who want to train agents to automate file management tasks, test UI flows, or learn to navigate complex web applications.
Key sections include:
- Detailed state space breakdown (UI state, data state, user context)
- Complete action space (mouse clicks, keyboard input, navigation)
- Reward function design for various objectives
- Integration examples with popular RL frameworks
- Task-specific scenarios (file organization, sharing workflows, search tasks)
Authentication:
- JWT-based authentication with refresh tokens
- User registration and login
- Per-user storage quotas (15GB default)
- Multi-user isolation
File Management:
- Upload multiple files with drag-and-drop
- Download files with presigned URLs (secure, time-limited)
- Rename, delete, and move files/folders
- File metadata display (name, type, size, dates, owner)
- Inline preview for images, PDFs, and documents
- Upload progress tracking
Folders:
- Create nested folders
- Breadcrumb navigation
- Materialized path pattern for efficient queries
- Move files between folders
Search:
- Global search across all files
- Filter by file type, owner, date modified
- Full-text search using PostgreSQL trigram indexes
- Search suggestions
Sharing:
- Share files/folders with other users
- Three permission levels: Viewer (read-only), Commenter (read + comment), Editor (full access)
- Revoke access and manage shared users
- Permission inheritance for nested folders
- Shareable links
Views:
- Grid and list view modes
- Recent files view (last accessed)
- Starred/bookmarked files
- Trash with soft deletion
- Restore deleted files
Activity:
- Activity feed for file operations
- File version history
- Track uploads, shares, deletions, edits, comments
Storage:
- Real-time storage usage tracking
- Storage analytics by file type
- Files sorted by size
- Storage quota enforcement
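Quota enforcement and the per-type breakdown come down to simple arithmetic over file sizes. The sketch below shows one way it could look. It reads the 15GB default as 15 GiB, which is an assumption, and the names `check_quota`, `QuotaExceeded`, and `usage_by_type` are hypothetical.

```python
DEFAULT_QUOTA_BYTES = 15 * 1024**3  # 15GB default, interpreted here as GiB (an assumption)


class QuotaExceeded(Exception):
    """Raised when an upload would push a user past their quota."""


def check_quota(used_bytes: int, upload_bytes: int, quota_bytes: int = DEFAULT_QUOTA_BYTES) -> int:
    """Return the bytes remaining after the upload, or raise QuotaExceeded."""
    if upload_bytes < 0:
        raise ValueError("upload size cannot be negative")
    remaining = quota_bytes - used_bytes - upload_bytes
    if remaining < 0:
        raise QuotaExceeded(f"upload needs {-remaining} more bytes than the quota allows")
    return remaining


def usage_by_type(files: list[tuple[str, int]]) -> dict[str, int]:
    """Sum file sizes per type for a storage breakdown, largest total first."""
    totals: dict[str, int] = {}
    for file_type, size in files:
        totals[file_type] = totals.get(file_type, 0) + size
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))
```

Running the check before accepting an upload, rather than after storing the object, avoids having to clean up files that should never have been written.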
Frontend:
- React 18 with TypeScript
- Vite for fast development and builds
- Tailwind CSS for styling
- Zustand for state management
- React Query (TanStack Query) for data fetching
- React Router for navigation
- Material UI Icons
Backend:
- FastAPI (Python 3.11)
- SQLAlchemy ORM
- PostgreSQL database
- MinIO for object storage
- Redis for caching
- JWT for authentication
- Pydantic for validation
Infrastructure:
- Docker and Docker Compose
- Supervisor for process management
- Nginx for serving frontend and proxying API
Key environment variables (all set in the Docker image, but you can override them):
# Database
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/gdrive
# Storage
MINIO_ENDPOINT=localhost:9000
MINIO_ACCESS_KEY=minioadmin
MINIO_SECRET_KEY=minioadmin
MINIO_BUCKET=gdrive-files
# Security
SECRET_KEY=change-this-in-production-min-32-chars-long-secret-key
ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=30
REFRESH_TOKEN_EXPIRE_DAYS=7
# CORS
CORS_ORIGINS=http://localhost:3000,http://localhost:5173
The mock data system doubles as a test dataset. You can use the demo accounts to test all features.
For automated testing, see the Testing Documentation.
Once the backend is running, visit http://localhost:8000/docs for interactive API documentation powered by FastAPI's automatic OpenAPI generation.
The unified image is available on Docker Hub:
docker pull venky1701/gdrive-clone:latest
For production deployments, consider:
- Security: Change all default passwords and secrets
- Database: Use a managed PostgreSQL service (AWS RDS, Azure Database)
- Storage: Use S3 or another production-grade object storage
- HTTPS: Set up SSL/TLS with a reverse proxy (Nginx, Traefik)
- Scaling: Run multiple backend instances behind a load balancer
- Monitoring: Add logging, metrics, and error tracking
- Backups: Regular database backups and storage replication
See the Deployment Guide for detailed production setup instructions.
Contributions are welcome! Whether it's bug fixes, new features, or documentation improvements, we'd love to see your pull requests.
Please read the Contributing Guide before submitting changes.
- Setup Guide - Detailed setup instructions
- Architecture - System design and decisions
- Features - Complete feature documentation
- Deployment - Production deployment guide
- Troubleshooting - Common issues and solutions
- RL Training Guide - Reinforcement learning integration
This project is open source and available under the MIT License.
Built with modern web technologies and inspired by Google Drive's excellent user experience. Uses Material Design principles for consistent UI patterns.
Questions? Check the documentation, open an issue, or dive into the code. It's all here.