Google Drive Clone

A production-ready Google Drive clone built from the ground up. This isn't just another file manager; it's a complete cloud storage solution with user authentication, file sharing, collaboration features, and everything else you'd expect from a modern drive service.

What This Is

After working with various cloud storage solutions and feeling frustrated by their limitations, I decided to build something better. This project implements a full-featured drive application with:

Multi-user authentication with JWT tokens
File and folder management with drag-and-drop
Sharing and permissions (Viewer, Commenter, Editor)
Search and filtering across all your files
Storage analytics and quota management
Activity tracking and version history
Responsive single-page UI built with React and TypeScript

Everything runs in a single Docker container, making deployment trivial. Pull the image, run it, and you're good to go.

Getting Started

The Easy Way: Docker Image

We've packaged everything into a single Docker image that includes PostgreSQL, Redis, MinIO, the backend API, and the frontend. No docker-compose needed, no configuration headaches.

docker run -d \
  --name gdrive-clone \
  -p 3000:3000 \
  -p 8000:8000 \
  -p 9000:9000 \
  -p 9001:9001 \
  venky1701/gdrive-clone:latest

That's it. The container handles database initialization, service startup, and everything else automatically.

Access Points:

Frontend: http://localhost:3000
Backend API: http://localhost:8000/api
API Documentation: http://localhost:8000/docs
MinIO Console: http://localhost:9001 (minioadmin/minioadmin)
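
Once the container is running, a quick way to confirm that the access points above respond is a small probe script. This is a minimal sketch using only the Python standard library; the names ENDPOINTS, is_up, and check_all are made up for illustration, and it assumes that any HTTP response (even an error status) means the service is up.

```python
import urllib.error
import urllib.request

# URLs taken from the access points listed above.
ENDPOINTS = {
    "frontend": "http://localhost:3000",
    "api docs": "http://localhost:8000/docs",
    "minio console": "http://localhost:9001",
}


def is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with any HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded (e.g. 401 or 404), so the service is running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or DNS failure: not reachable yet.
        return False


def check_all(endpoints: dict, probe=is_up) -> dict:
    """Probe every endpoint and report which ones are reachable."""
    return {name: probe(url) for name, url in endpoints.items()}
```

Calling check_all(ENDPOINTS) returns a dictionary of service name to True/False. Services may take a few seconds to come up after docker run returns, so a False on the first try is not necessarily a failure.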

Default Credentials:

Email: demo@example.com
Password: demo123

The application automatically initializes with mock data when you first start it, so you'll have sample users, files, and folders to work with immediately.

Local Development

If you prefer running things locally for development:

# Clone the repository
git clone https://github.com/venkat1701/gdrive.git
cd gdrive

# Backend setup
cd backend
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt

# You'll need PostgreSQL, Redis, and MinIO running
# Then start the backend:
uvicorn app.main:app --reload

# Frontend setup (in another terminal, from the repository root)
cd frontend
npm install
npm run dev

See the Setup Guide for detailed instructions.

Project Structure

Understanding the codebase is straightforward once you see how it's organized:

gdrive/
├── backend/                 # FastAPI application
│   ├── app/
│   │   ├── api/            # REST API endpoints
│   │   ├── core/           # Configuration, database, security
│   │   ├── models/         # SQLAlchemy database models
│   │   ├── schemas/        # Pydantic validation schemas
│   │   ├── services/       # Business logic layer
│   │   ├── storage/        # MinIO/S3 storage abstraction
│   │   └── init_data.py    # Mock data initialization
│   └── requirements.txt
│
├── frontend/               # React + TypeScript application
│   ├── src/
│   │   ├── api/           # API client functions
│   │   ├── components/    # Reusable UI components
│   │   ├── features/      # Feature-specific pages
│   │   ├── stores/        # Zustand state management
│   │   └── utils/         # Helper functions
│   └── package.json
│
├── docs/                   # Documentation
├── docker-compose.yml      # Multi-container setup
├── Dockerfile              # Unified container image
└── entrypoint.sh          # Container initialization script

Entry Point

When the Docker container starts, entrypoint.sh handles the initialization sequence:

PostgreSQL Setup: Detects the PostgreSQL version, initializes the database if needed, creates the gdrive database, and sets up the postgres user password.
Service Startup: Supervisor starts all services in order:
- PostgreSQL (priority 100)
- Redis (priority 200)
- MinIO (priority 300)
- Service initialization script (priority 350) - sets up MinIO buckets
- Backend API (priority 400) - waits 10 seconds for dependencies
- Nginx (priority 500) - serves frontend and proxies API requests
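
To make the ordering concrete: Supervisor starts programs with lower priority values first and stops them in the reverse order. The sketch below reproduces that rule with the priorities listed above. The program names are placeholders and may not match the names used in the actual Supervisor configuration.

```python
# Priorities copied from the list above; program names are hypothetical.
PRIORITIES = {
    "nginx": 500,
    "backend": 400,
    "postgresql": 100,
    "init-services": 350,
    "redis": 200,
    "minio": 300,
}


def start_order(priorities: dict) -> list:
    """Return program names in ascending priority, i.e. startup order."""
    return sorted(priorities, key=priorities.get)


def stop_order(priorities: dict) -> list:
    """Shutdown runs in the reverse of the startup order."""
    return list(reversed(start_order(priorities)))
```

Note that priority only controls the order in which processes are launched, not whether a dependency is ready to accept connections, which is presumably why the backend also waits before starting.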

The backend (backend/app/main.py) is the application entry point. On startup, it:

Enables the PostgreSQL trigram extension (pg_trgm) for fuzzy text search
Creates database tables via SQLAlchemy
Runs init_mock_data() to populate the database with sample data
Sets up CORS middleware
Registers all API routers

The frontend entry point is frontend/src/main.tsx, which renders the React app and sets up routing.

Mock Data

The application includes a comprehensive mock data system that runs automatically when the database is empty. This happens in backend/app/init_data.py.
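
The "only when the database is empty" check is what makes it safe to run the seeding step on every startup. Here is a minimal sketch of that pattern, using an in-memory SQLite database as a stand-in for PostgreSQL. The function name seed_if_empty, the table layout, and the display names are assumptions for illustration and are not taken from init_data.py.

```python
import sqlite3

# Hypothetical seed rows; the real script creates six users.
DEMO_USERS = [
    ("demo@example.com", "Demo User"),
    ("john.doe@example.com", "John Doe"),
]


def seed_if_empty(conn: sqlite3.Connection) -> int:
    """Insert demo users only when the users table is empty.

    Returns the number of rows inserted (0 when data already exists).
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (email TEXT PRIMARY KEY, name TEXT)"
    )
    (count,) = conn.execute("SELECT COUNT(*) FROM users").fetchone()
    if count > 0:
        return 0  # Already populated: leave existing data untouched.
    conn.executemany("INSERT INTO users (email, name) VALUES (?, ?)", DEMO_USERS)
    conn.commit()
    return len(DEMO_USERS)
```

Running seed_if_empty twice against the same connection inserts rows the first time and does nothing the second time.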

What Gets Created:

6 demo users with various names and email addresses
Organized folder structures for each user (Documents, Photos, Work, Personal, Projects)
Nested folders (e.g., Work/Reports, Work/Presentations)
Various file types (PDFs, images, documents, spreadsheets)
Starred items and trashed files for testing different views
File shares between users to test collaboration
Activity logs for file operations
Comments on shared files

Using Mock Data:

The mock data initialization runs automatically the first time you start the application. To verify that it worked, check the backend logs for messages such as "Created 6 users" and "Created X files/folders for user...".

If you want to reset the mock data, you can:

Drop and recreate the database
Or manually call init_mock_data() from a Python shell

Test Accounts:

demo@example.com / demo123
john.doe@example.com / password123
jane.smith@example.com / password123
alice.johnson@example.com / password123
bob.wilson@example.com / password123
emma.brown@example.com / password123

Each account has its own files, folders, and some shared items between them.

Architecture

This project follows a clean, layered architecture that separates concerns and makes the codebase maintainable.

System Architecture

┌─────────────────────────────────────────────────────────┐
│                     React Frontend                      │
│    (TypeScript, Zustand, React Query, Tailwind CSS)     │
│                        Port 3000                        │
└────────────────────────────┬────────────────────────────┘
                             │ HTTP/REST API
                             │
┌────────────────────────────▼────────────────────────────┐
│                     FastAPI Backend                     │
│           (Python 3.11, SQLAlchemy, Pydantic)           │
│                        Port 8000                        │
└─────┬──────────────┬──────────────┬──────────────┬──────┘
      │              │              │              │
┌─────▼──────┐ ┌─────▼──────┐ ┌─────▼──────┐ ┌─────▼──────┐
│ PostgreSQL │ │   Redis    │ │   MinIO    │ │ Supervisor │
│   :5432    │ │   :6379    │ │   :9000    │ │  (Process  │
│            │ │            │ │            │ │  Manager)  │
└────────────┘ └────────────┘ └────────────┘ └────────────┘

Backend Architecture

The backend follows a layered architecture pattern:

API Layer (app/api/): HTTP endpoints that handle requests and responses

auth.py: Authentication (register, login, token refresh)
files.py: File CRUD operations, upload, download
shares.py: Sharing logic and permissions
storage.py: Storage analytics and quota management
comments.py: File comments and discussions
links.py: Shareable link generation

Service Layer (app/services/): Business logic separated from HTTP concerns

file_service.py: File operations, validation, path management
share_service.py: Permission checking, share management
preview_service.py: File preview generation

Data Layer (app/models/): SQLAlchemy ORM models

user.py: User model with authentication fields
file.py: File, FileShare, FileActivity, FileVersion models

Storage Layer (app/storage/): Abstraction over object storage

minio_storage.py: MinIO client wrapper, handles presigned URLs
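
For readers unfamiliar with presigned URLs, the idea is that the URL itself carries an expiry time and a signature, so the storage server can validate a download without a session. The sketch below illustrates the concept with a plain HMAC. It is deliberately simplified and is not the AWS Signature V4 scheme that MinIO and S3 actually use; SECRET, presign, and verify are hypothetical names.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-signing-key"  # Hypothetical key, for illustration only.


def _sign(path: str, expires: int) -> str:
    """HMAC over the path and expiry, so neither can be altered."""
    message = f"{path}\n{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def presign(path: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Build a URL that carries its own expiry time and signature."""
    issued = time.time() if now is None else now
    expires = int(issued) + ttl_seconds
    query = urlencode({"expires": expires, "signature": _sign(path, expires)})
    return f"{path}?{query}"


def verify(url: str, now: Optional[float] = None) -> bool:
    """Accept the URL only if it is unexpired and the signature matches."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    return hmac.compare_digest(signature, _sign(parsed.path, expires))
```

Because the path and expiry are both covered by the signature, changing either one in the URL invalidates it.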

Frontend Architecture

The frontend uses a feature-based organization:

State Management: Zustand stores for global state

authStore.ts: Authentication state and user info
fileStore.ts: File operations and current folder state
uploadStore.ts: Upload queue and progress tracking
themeStore.ts: UI theme preferences

API Client (src/api/): Centralized API functions using Axios

Abstracts HTTP calls, handles authentication tokens
Provides type-safe interfaces for all backend endpoints

Components (src/components/): Reusable UI components

File browsers, modals, dialogs, navigation elements
Built with Tailwind CSS for styling

Features (src/features/): Feature-specific pages and logic

auth/: Login and registration pages
drive/: Drive views (My Drive, Shared, Recent, Starred, Trash, Storage)

Key Design Decisions

Why FastAPI? Async support, automatic OpenAPI docs, type safety with Pydantic, and excellent performance. It's the perfect fit for a modern Python API.

Why PostgreSQL? ACID compliance, fuzzy text search with trigram (pg_trgm) indexes, a mature ecosystem, and strong performance. It also handles JSON data when needed.

Why MinIO? S3-compatible API means we can swap in AWS S3 or any S3-compatible storage later. It's self-hosted, fast, and scales well.

Why Redis? Fast caching layer, perfect for session management and token blacklisting. We can extend it for real-time features later.

Why React + TypeScript? Component reusability, type safety, huge ecosystem, and excellent developer experience. Zustand keeps state management simple.

Why a Unified Docker Image? It simplifies deployment: one image, one command, and everything works. It is well suited to demos, testing, and small deployments where you want minimal configuration.

Reinforcement Learning Training

This project includes comprehensive documentation for training RL agents to interact with the application. The RL Training Guide provides:

Complete state space definition: What the agent can observe (UI state, file data, user context)
Action space definition: All possible actions the agent can take (navigation, file operations, UI interactions)
Reward structure: How to design rewards for different tasks
Environment setup: How to integrate with the application for RL training
Example scenarios: Common tasks an agent should learn

The guide is designed for researchers and developers who want to train agents to automate file management tasks, test UI flows, or learn to navigate complex web applications.

Key sections include:

Detailed state space breakdown (UI state, data state, user context)
Complete action space (mouse clicks, keyboard input, navigation)
Reward function design for various objectives
Integration examples with popular RL frameworks
Task-specific scenarios (file organization, sharing workflows, search tasks)

Features

Authentication & User Management

JWT-based authentication with refresh tokens
User registration and login
Per-user storage quotas (15 GB default)
Multi-user isolation
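
To show what an access/refresh token pair looks like in practice, here is a minimal HS256 JWT sketch written with only the standard library. The backend most likely relies on a JWT library rather than hand-rolled code like this; encode_token, decode_token, and the "type" claim are assumptions for illustration. The lifetimes mirror the documented defaults of 30 minutes and 7 days.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"change-this-in-production-min-32-chars-long-secret-key"
ACCESS_TOKEN_EXPIRE_MINUTES = 30
REFRESH_TOKEN_EXPIRE_DAYS = 7


def _b64url(data: bytes) -> str:
    """Base64url without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _signature(signing_input: str) -> str:
    digest = hmac.new(SECRET_KEY, signing_input.encode(), hashlib.sha256).digest()
    return _b64url(digest)


def encode_token(subject: str, kind: str, now: float) -> str:
    """Create a signed token; kind is "access" or "refresh"."""
    if kind == "access":
        lifetime = ACCESS_TOKEN_EXPIRE_MINUTES * 60
    else:
        lifetime = REFRESH_TOKEN_EXPIRE_DAYS * 86400
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = {"sub": subject, "type": kind, "exp": int(now) + lifetime}
    payload = _b64url(json.dumps(claims).encode())
    return f"{header}.{payload}.{_signature(header + '.' + payload)}"


def decode_token(token: str, now: float) -> dict:
    """Verify signature and expiry, then return the claims."""
    header, payload, signature = token.split(".")
    if not hmac.compare_digest(signature, _signature(header + "." + payload)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload))
    if now >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

The short-lived access token is sent with each API request, while the longer-lived refresh token is only used to obtain a new access token once the old one expires.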

File Management

Upload multiple files with drag-and-drop
Download files with presigned URLs (secure, time-limited)
Rename, delete, and move files/folders
File metadata display (name, type, size, dates, owner)
Inline preview for images, PDFs, and documents
Upload progress tracking

Folder Organization

Create nested folders
Breadcrumb navigation
Materialized path pattern for efficient queries
Move files between folders
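
The materialized path pattern stores each item's full ancestor chain on the item itself, so "everything under this folder" becomes a single prefix match instead of a recursive query. The sketch below is an in-memory illustration; the Node class, the "/1/2/3/" path format, and the file name are assumptions, not the project's actual schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    id: int
    name: str
    path: str  # IDs of every ancestor plus this node, e.g. "/1/2/3/"


def make_node(node_id: int, name: str, parent: Optional[Node] = None) -> Node:
    """Build the path by appending this node's ID to the parent's path."""
    prefix = parent.path if parent else "/"
    return Node(node_id, name, f"{prefix}{node_id}/")


def descendants(nodes: List[Node], folder: Node) -> List[Node]:
    """In-memory equivalent of a SQL prefix match: path LIKE '/1/%'."""
    return [n for n in nodes if n.path.startswith(folder.path) and n.id != folder.id]


def breadcrumb_ids(node: Node) -> List[int]:
    """Ancestor IDs, root first, read straight from the stored path."""
    return [int(part) for part in node.path.strip("/").split("/")]
```

The trailing slash matters: without it, a prefix match on "/1" would also match an unrelated folder whose path starts with "/10". The trade-off is that moving a folder requires rewriting the paths of all its descendants.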

Search & Discovery

Global search across all files
Filter by file type, owner, date modified
Fuzzy text search using PostgreSQL trigram (pg_trgm) indexes
Search suggestions
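
Trigram matching is what lets a search tolerate typos and partial names. The sketch below approximates in Python how pg_trgm scores two strings: each word is lowercased, padded with two leading spaces and one trailing space, split into three-character windows, and the score is the share of trigrams the two strings have in common. It is an approximation for illustration only. In the application the scoring happens inside PostgreSQL, and the search helper here is hypothetical.

```python
import re
from typing import List, Set


def trigrams(text: str) -> Set[str]:
    """Split text into trigrams, padding each word the way pg_trgm does."""
    result = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "
        result.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return result


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def search(query: str, names: List[str], threshold: float = 0.3) -> List[str]:
    """Return names scoring at or above the threshold, best match first."""
    scored = [(similarity(query, name), name) for name in names]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]
```

The default threshold of 0.3 matches pg_trgm's default similarity threshold. A misspelled query such as "reprt" still scores well against "report" because most of their trigrams overlap.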

Sharing & Collaboration

Share files/folders with other users
Three permission levels: Viewer (read-only), Commenter (read + comment), Editor (full access)
Revoke access and manage shared users
Permission inheritance for nested folders
Shareable links
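
The three permission levels form a natural ordering, and inheritance means a grant on a folder applies to everything beneath it. The sketch below shows one way this could be resolved, assuming the nearest explicit grant on the way up the folder chain wins. That policy, the in-memory PARENTS and SHARES tables, and the function names are all assumptions for illustration, not the logic in share_service.py.

```python
from enum import IntEnum
from typing import Optional


class Permission(IntEnum):
    VIEWER = 1     # read-only
    COMMENTER = 2  # read + comment
    EDITOR = 3     # full access


# Hypothetical data: item -> parent folder, and (item, user) -> grant.
PARENTS = {"q3.pdf": "Reports", "Reports": "Work", "Work": None}
SHARES = {("Work", "jane"): Permission.COMMENTER}


def effective_permission(item: str, user: str) -> Optional[Permission]:
    """Walk up the folder chain and return the nearest explicit grant."""
    current: Optional[str] = item
    while current is not None:
        grant = SHARES.get((current, user))
        if grant is not None:
            return grant
        current = PARENTS.get(current)
    return None


def can(item: str, user: str, required: Permission) -> bool:
    """True when the user's inherited permission is at least the required one."""
    granted = effective_permission(item, user)
    return granted is not None and granted >= required
```

Here, sharing the Work folder with a user as Commenter lets that user view and comment on Work/Reports/q3.pdf, but not edit it.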

Views & Organization

Grid and list view modes
Recent files view (last accessed)
Starred/bookmarked files
Trash with soft deletion
Restore deleted files
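
Soft deletion means a trashed file is only marked as deleted, which is what makes restoring it possible. A minimal sketch of the idea follows, assuming a nullable timestamp field; FileRecord and trashed_at are hypothetical names and may differ from the actual model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class FileRecord:
    name: str
    trashed_at: Optional[datetime] = None  # None means the file is live


def move_to_trash(record: FileRecord, now: Optional[datetime] = None) -> None:
    """Soft delete: stamp the record instead of removing it."""
    record.trashed_at = now or datetime.now(timezone.utc)


def restore(record: FileRecord) -> None:
    """Undo a soft delete by clearing the timestamp."""
    record.trashed_at = None


def live_files(records: List[FileRecord]) -> List[FileRecord]:
    """What the regular views would list."""
    return [r for r in records if r.trashed_at is None]


def trashed_files(records: List[FileRecord]) -> List[FileRecord]:
    """What the Trash view would list."""
    return [r for r in records if r.trashed_at is not None]
```

Storing a timestamp rather than a boolean also records when the item was trashed, which is useful if old trash is ever purged automatically.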

Activity & History

Activity feed for file operations
File version history
Track uploads, shares, deletions, edits, comments

Storage Management

Real-time storage usage tracking
Storage analytics by file type
Files sorted by size
Storage quota enforcement
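
Quota enforcement comes down to checking an upload against the remaining space before accepting it. The sketch below shows that check along with a per-type usage breakdown. The function names are made up for illustration, and treating 15 GB as 15 × 1024³ bytes is an assumption.

```python
from typing import Dict, List, Tuple

DEFAULT_QUOTA_BYTES = 15 * 1024**3  # 15 GB default; binary units assumed


class QuotaExceeded(Exception):
    """Raised when an upload would push usage past the user's quota."""


def check_upload(used_bytes: int, upload_bytes: int,
                 quota_bytes: int = DEFAULT_QUOTA_BYTES) -> int:
    """Return the new usage total, or raise if the upload does not fit."""
    new_total = used_bytes + upload_bytes
    if new_total > quota_bytes:
        free = quota_bytes - used_bytes
        raise QuotaExceeded(f"need {upload_bytes} bytes, only {free} free")
    return new_total


def usage_by_type(files: List[Tuple[str, int]]) -> Dict[str, int]:
    """Sum file sizes per type, largest total first."""
    totals: Dict[str, int] = {}
    for file_type, size in files:
        totals[file_type] = totals.get(file_type, 0) + size
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))
```

An upload that exactly fills the quota is accepted in this sketch; only going over the limit raises QuotaExceeded.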

Technology Stack

Frontend:

React 18 with TypeScript
Vite for fast development and builds
Tailwind CSS for styling
Zustand for state management
React Query (TanStack Query) for data fetching
React Router for navigation
Material UI Icons

Backend:

FastAPI (Python 3.11)
SQLAlchemy ORM
PostgreSQL database
MinIO for object storage
Redis for caching
JWT for authentication
Pydantic for validation

Infrastructure:

Docker and Docker Compose
Supervisor for process management
Nginx for serving frontend and proxying API

Development

Environment Variables

Key environment variables (all set in the Docker image, but you can override them):

# Database
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/gdrive

# Storage
MINIO_ENDPOINT=localhost:9000
MINIO_ACCESS_KEY=minioadmin
MINIO_SECRET_KEY=minioadmin
MINIO_BUCKET=gdrive-files

# Security
SECRET_KEY=change-this-in-production-min-32-chars-long-secret-key
ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=30
REFRESH_TOKEN_EXPIRE_DAYS=7

# CORS
CORS_ORIGINS=http://localhost:3000,http://localhost:5173
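
As an illustration of how these variables might be consumed, here is a small settings loader that falls back to the documented defaults and splits CORS_ORIGINS on commas. The Settings class and load_settings function are hypothetical; the backend's actual configuration code in app/core/ may be organized differently.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    access_token_expire_minutes: int
    refresh_token_expire_days: int
    cors_origins: List[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, using the documented values as defaults."""
    origins = env.get("CORS_ORIGINS", "http://localhost:3000,http://localhost:5173")
    return Settings(
        database_url=env.get(
            "DATABASE_URL", "postgresql://postgres:postgres@localhost:5432/gdrive"
        ),
        access_token_expire_minutes=int(env.get("ACCESS_TOKEN_EXPIRE_MINUTES", "30")),
        refresh_token_expire_days=int(env.get("REFRESH_TOKEN_EXPIRE_DAYS", "7")),
        # Comma-separated list; drop blanks and surrounding spaces.
        cors_origins=[o.strip() for o in origins.split(",") if o.strip()],
    )
```

With the Docker image, the same variables can be overridden at startup using docker run's -e flag, for example -e ACCESS_TOKEN_EXPIRE_MINUTES=5.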

Running Tests

The mock data system doubles as a test dataset. You can use the demo accounts to test all features.

For automated testing, see the Testing Documentation.

API Documentation

Once the backend is running, visit http://localhost:8000/docs for interactive API documentation powered by FastAPI's automatic OpenAPI generation.

Deployment

Docker Hub

The unified image is available on Docker Hub:

docker pull venky1701/gdrive-clone:latest

Production Considerations

For production deployments, consider:

Security: Change all default passwords and secrets
Database: Use a managed PostgreSQL service (AWS RDS, Azure Database)
Storage: Use S3 or another production-grade object storage
HTTPS: Set up SSL/TLS with a reverse proxy (Nginx, Traefik)
Scaling: Run multiple backend instances behind a load balancer
Monitoring: Add logging, metrics, and error tracking
Backups: Regular database backups and storage replication

See the Deployment Guide for detailed production setup instructions.

Contributing

Contributions are welcome! Whether it's bug fixes, new features, or documentation improvements, we'd love to see your pull requests.

Please read the Contributing Guide before submitting changes.

Documentation

Setup Guide - Detailed setup instructions
Architecture - System design and decisions
Features - Complete feature documentation
Deployment - Production deployment guide
Troubleshooting - Common issues and solutions
RL Training Guide - Reinforcement learning integration

License

This project is open source and available under the MIT License.

Acknowledgments

Built with modern web technologies and inspired by Google Drive's excellent user experience. Uses Material Design principles for consistent UI patterns.

Questions? Check the documentation, open an issue, or dive into the code. It's all here.
