AetherStore is a lightweight, scalable, S3-style distributed object storage system. It runs across multiple Ubuntu nodes, supports replication, automatic node registration, and fault-tolerant object retrieval.

deepesh611/AetherStore

AetherStore — Distributed Object Storage Platform

AetherStore is a lightweight, scalable, S3-style distributed object storage system written in Go. It is designed to run across multiple nodes with replication, automatic node registration, and fault-tolerant object retrieval; the status section below lists which parts are implemented so far. Unlike Hadoop/HDFS, AetherStore treats files as immutable objects. It depends only on Go's standard library plus a SQLite driver for metadata.

🚀 Current Status

Phase 1: DataNode (Storage Service) ✅ COMPLETE

A production-grade storage node with:

  • ✅ Durable writes with fsync guarantees
  • ✅ Immutable object storage
  • ✅ REST API (PUT/GET/health endpoints)
  • ✅ Request logging
  • ✅ Dockerized deployment
  • ✅ Persistent volume support

Phase 2: Master Node (Metadata Service) ✅ COMPLETE

A metadata control plane with:

  • ✅ SQLite database with WAL mode
  • ✅ Node registration and tracking
  • ✅ Random placement algorithm
  • ✅ Object lifecycle management (init/complete/get)
  • ✅ Replica tracking
  • ✅ Dockerized deployment
  • ✅ Docker Compose orchestration
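
The init/complete/get lifecycle listed above can be pictured with a small in-memory model. This is only a sketch: the status names (`pending`, `complete`), the type and method names, and the rule that replicas are returned only for completed objects are assumptions made for illustration, and the real Master persists this state in SQLite rather than a map.

```go
package main

import (
	"errors"
	"fmt"
)

// objectStatus values are assumed names; the README does not list them.
type objectStatus string

const (
	statusPending  objectStatus = "pending"
	statusComplete objectStatus = "complete"
)

// object holds the metadata the Master is described as tracking.
type object struct {
	size     int64
	status   objectStatus
	replicas []string
}

// metadataStore is a hypothetical in-memory stand-in for the SQLite tables.
type metadataStore struct {
	objects map[string]*object
}

func newMetadataStore() *metadataStore {
	return &metadataStore{objects: make(map[string]*object)}
}

// initUpload records a new object as pending on the chosen nodes.
func (m *metadataStore) initUpload(id string, size int64, nodes []string) error {
	if _, ok := m.objects[id]; ok {
		return errors.New("object already initialised")
	}
	m.objects[id] = &object{size: size, status: statusPending, replicas: nodes}
	return nil
}

// completeUpload marks a previously initialised object as complete.
func (m *metadataStore) completeUpload(id string) error {
	obj, ok := m.objects[id]
	if !ok {
		return errors.New("unknown object")
	}
	obj.status = statusComplete
	return nil
}

// replicasOf returns replica locations, but only for completed objects.
func (m *metadataStore) replicasOf(id string) ([]string, error) {
	obj, ok := m.objects[id]
	if !ok {
		return nil, errors.New("unknown object")
	}
	if obj.status != statusComplete {
		return nil, errors.New("upload not complete")
	}
	return obj.replicas, nil
}

func main() {
	m := newMetadataStore()
	m.initUpload("demo-id", 1024, []string{"http://node1", "http://node2"})
	_, err := m.replicasOf("demo-id")
	fmt.Println("before complete:", err)
	m.completeUpload("demo-id")
	nodes, err := m.replicasOf("demo-id")
	fmt.Println("after complete:", nodes, err)
}
```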

Phase 3: Replicated Uploads 🚧 TODO


🏗️ Architecture

DataNode (Storage Worker)

  • Stores raw binary objects on disk
  • Objects identified by UUIDv4
  • Immutable (no overwrites)
  • Crash-safe durability (fsync)
  • Simple REST API

Master Node (Metadata Control Plane)

  • Tracks registered storage nodes
  • Manages object metadata (size, status, replicas)
  • Placement decisions (random selection)
  • Object-to-node mapping
  • SQLite persistence
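
Random placement can be sketched in a few lines: shuffle the registered node addresses and take the first few, which guarantees distinct nodes for each replica. The function name `pickNodes` and the `replicas` parameter are assumptions for this example; the README does not state how many replicas the Master selects or how it handles too few nodes.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// pickNodes returns `replicas` distinct node addresses chosen at random.
func pickNodes(nodes []string, replicas int) ([]string, error) {
	if replicas <= 0 {
		return nil, errors.New("replicas must be positive")
	}
	if len(nodes) < replicas {
		return nil, fmt.Errorf("need %d nodes, only %d registered", replicas, len(nodes))
	}
	// rand.Perm yields a random ordering of indices, so the first
	// `replicas` entries never repeat a node.
	chosen := make([]string, 0, replicas)
	for _, i := range rand.Perm(len(nodes))[:replicas] {
		chosen = append(chosen, nodes[i])
	}
	return chosen, nil
}

func main() {
	registered := []string{
		"http://localhost:42101",
		"http://localhost:42102",
	}
	nodes, err := pickNodes(registered, 2)
	fmt.Println(nodes, err)
}
```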

🛠️ Quick Start

Using Docker Compose (Recommended)

Start the entire system (1 Master + 2 DataNodes):

docker-compose up -d

View logs:

docker-compose logs -f

Stop the system:

docker-compose down

Manual Docker Deployment

Build images:

docker build -f Dockerfile.master -t aetherstore-master:latest .
docker build -f Dockerfile.datanode -t aetherstore-datanode:latest .

Run Master:

docker volume create aether-master-data
docker run -d -p 42100:42100 -v aether-master-data:/var/aetherstore/data --name master aetherstore-master:latest

Run DataNodes:

docker volume create aether-data-1
docker run -d -p 42101:42101 -v aether-data-1:/var/aetherstore/objects --name datanode-1 aetherstore-datanode:latest

docker volume create aether-data-2
docker run -d -p 42102:42101 -v aether-data-2:/var/aetherstore/objects --name datanode-2 aetherstore-datanode:latest

Local Development

Run Master:

go run cmd/master/main.go

Run DataNode:

go run cmd/datanode/main.go

📡 API Reference

Master Node Endpoints

POST /api/node/register

  • Register a storage node
  • Body: {"id": "node-1", "address": "http://localhost:42101"}
  • Returns: {"status": "OK"}
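
A DataNode (or a script) could call the registration endpoint from Go roughly as follows. The helper `registerNode` is a hypothetical name, and `fakeMaster` is a stand-in server included only so the example runs without a real Master. Treating any 2xx response as success is also an assumption, since the README does not state the status code.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// registerRequest mirrors the documented request body.
type registerRequest struct {
	ID      string `json:"id"`
	Address string `json:"address"`
}

// registerNode announces a storage node to the Master at masterURL.
func registerNode(masterURL, id, address string) error {
	body, err := json.Marshal(registerRequest{ID: id, Address: address})
	if err != nil {
		return err
	}
	resp, err := http.Post(masterURL+"/api/node/register", "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode/100 != 2 {
		return fmt.Errorf("register failed: %s", resp.Status)
	}
	var out struct {
		Status string `json:"status"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return err
	}
	if out.Status != "OK" {
		return fmt.Errorf("unexpected status %q", out.Status)
	}
	return nil
}

// fakeMaster is a test double that answers like the documented endpoint.
func fakeMaster() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/node/register", func(w http.ResponseWriter, r *http.Request) {
		var req registerRequest
		if r.Method != http.MethodPost || json.NewDecoder(r.Body).Decode(&req) != nil || req.ID == "" {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(map[string]string{"status": "OK"})
	})
	return httptest.NewServer(mux)
}

func main() {
	master := fakeMaster()
	defer master.Close()
	// Against a real deployment, pass "http://localhost:42100" instead.
	fmt.Println(registerNode(master.URL, "node-1", "http://localhost:42101"))
}
```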

POST /api/upload/init

  • Request placement for new object
  • Body: {"size": 1024}
  • Returns: {"objectid": "uuid", "nodes": ["http://node1", "http://node2"]}

POST /api/upload/complete

  • Mark object upload as complete
  • Body: {"objectid": "uuid"}
  • Returns: {"status": "OK"}

GET /api/objects/{object_id}

  • Get replica locations for object
  • Returns: {"replicas": ["http://node1", "http://node2"]}
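
For Go clients, the Master payloads above map onto a handful of small structs. The JSON field names come from the endpoint descriptions; the Go type names and the `decodeInit` helper are assumptions for this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// initRequest is the body of POST /api/upload/init.
type initRequest struct {
	Size int64 `json:"size"`
}

// initResponse is the placement returned by POST /api/upload/init.
type initResponse struct {
	ObjectID string   `json:"objectid"`
	Nodes    []string `json:"nodes"`
}

// completeRequest is the body of POST /api/upload/complete.
type completeRequest struct {
	ObjectID string `json:"objectid"`
}

// replicasResponse is returned by GET /api/objects/{object_id}.
type replicasResponse struct {
	Replicas []string `json:"replicas"`
}

// decodeInit parses a placement and rejects one with no ID or no nodes.
func decodeInit(body []byte) (initResponse, error) {
	var resp initResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return resp, err
	}
	if resp.ObjectID == "" || len(resp.Nodes) == 0 {
		return resp, fmt.Errorf("incomplete placement: %+v", resp)
	}
	return resp, nil
}

func main() {
	req, _ := json.Marshal(initRequest{Size: 1024})
	fmt.Println(string(req))

	placement, err := decodeInit([]byte(`{"objectid": "uuid", "nodes": ["http://node1", "http://node2"]}`))
	fmt.Println(placement.ObjectID, placement.Nodes, err)

	done, _ := json.Marshal(completeRequest{ObjectID: placement.ObjectID})
	fmt.Println(string(done))

	var lookup replicasResponse
	err = json.Unmarshal([]byte(`{"replicas": ["http://node1", "http://node2"]}`), &lookup)
	fmt.Println(lookup.Replicas, err)
}
```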

DataNode Endpoints

PUT /objects/{object_id}

  • Upload binary object
  • Returns: 201 Created, or 409 Conflict if the object already exists

GET /objects/{object_id}

  • Download object
  • Returns: 200 OK with binary data or 404 Not Found

GET /health

  • Health check
  • Returns: 200 OK with {"status": "ok"}

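🧪 Testing the Full System

The commands below call curl.exe, the name of the curl binary on Windows. On Linux or macOS, use curl instead.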

1. Register DataNodes with Master:

curl.exe -X POST http://localhost:42100/api/node/register -H "Content-Type: application/json" -d '{"id":"datanode-1","address":"http://localhost:42101"}'

curl.exe -X POST http://localhost:42100/api/node/register -H "Content-Type: application/json" -d '{"id":"datanode-2","address":"http://localhost:42102"}'

2. Request placement:

curl.exe -X POST http://localhost:42100/api/upload/init -H "Content-Type: application/json" -d '{"size":1024}'

3. Upload to assigned nodes (replace OBJECT-ID with the objectid returned in step 2):

echo "test data" > test.txt
curl.exe -X PUT --data-binary "@test.txt" http://localhost:42101/objects/OBJECT-ID
curl.exe -X PUT --data-binary "@test.txt" http://localhost:42102/objects/OBJECT-ID

4. Mark upload complete:

curl.exe -X POST http://localhost:42100/api/upload/complete -H "Content-Type: application/json" -d '{"objectid":"OBJECT-ID"}'

5. Retrieve object metadata:

curl.exe http://localhost:42100/api/objects/OBJECT-ID
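
Steps 2 to 4 can also be driven from Go. The sketch below requests placement, PUTs the data to every assigned node, then marks the upload complete. The `upload` and `postJSON` helpers are hypothetical names, and `fakeCluster` starts stand-in servers so the example runs without a deployment; point `upload` at `http://localhost:42100` to try it against a real Master with registered DataNodes.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// postJSON sends `in` as JSON to url and decodes the JSON reply into `out`.
func postJSON(url string, in, out any) error {
	body, err := json.Marshal(in)
	if err != nil {
		return err
	}
	resp, err := http.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode/100 != 2 {
		return fmt.Errorf("POST %s: %s", url, resp.Status)
	}
	return json.NewDecoder(resp.Body).Decode(out)
}

// upload runs init, a PUT to each assigned node, and complete.
func upload(master string, data []byte) (string, error) {
	var placement struct {
		ObjectID string   `json:"objectid"`
		Nodes    []string `json:"nodes"`
	}
	if err := postJSON(master+"/api/upload/init", map[string]int{"size": len(data)}, &placement); err != nil {
		return "", err
	}
	for _, node := range placement.Nodes {
		req, err := http.NewRequest(http.MethodPut, node+"/objects/"+placement.ObjectID, bytes.NewReader(data))
		if err != nil {
			return "", err
		}
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			return "", err
		}
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
		if resp.StatusCode != http.StatusCreated {
			return "", fmt.Errorf("PUT to %s: %s", node, resp.Status)
		}
	}
	var done struct {
		Status string `json:"status"`
	}
	err := postJSON(master+"/api/upload/complete", map[string]string{"objectid": placement.ObjectID}, &done)
	return placement.ObjectID, err
}

// fakeCluster starts one stand-in Master and one stand-in DataNode.
func fakeCluster() (master, node *httptest.Server) {
	node = httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusCreated) // accepts every request as a new object
	}))
	mux := http.NewServeMux()
	mux.HandleFunc("/api/upload/init", func(w http.ResponseWriter, r *http.Request) {
		json.NewEncoder(w).Encode(map[string]any{"objectid": "demo-id", "nodes": []string{node.URL}})
	})
	mux.HandleFunc("/api/upload/complete", func(w http.ResponseWriter, r *http.Request) {
		json.NewEncoder(w).Encode(map[string]string{"status": "OK"})
	})
	return httptest.NewServer(mux), node
}

func main() {
	master, node := fakeCluster()
	defer master.Close()
	defer node.Close()
	id, err := upload(master.URL, []byte("test data\n"))
	fmt.Println(id, err)
}
```

Because the Master is only told about completion after every PUT has succeeded, a failed upload in this sketch leaves the object in its initial state rather than advertising replicas that do not exist.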

🗺️ Roadmap

See the roadmap/ directory for detailed phase planning.

  • Phase 0: Foundations
  • Phase 1: DataNode MVP
  • Phase 2: Master Metadata
  • Phase 3: Replicated Uploads
  • Phase 4: Fault-Tolerant Reads
  • Phase 5: Health & Heartbeats

🔧 Tech Stack

  • Language: Go (stdlib + SQLite driver)
  • Protocol: HTTP/JSON
  • Storage: Local filesystem with fsync
  • Metadata: SQLite with WAL mode
  • Deployment: Docker + Docker Compose

📚 Documentation

