Skip to content

m-s-sat/map

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

26 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸ—ΊοΈ India Road Network Map

A production-grade routing application that handles 17 million nodes and 35 million edges of India's road network data, optimized to run under 1GB RAM.

Map Screenshot Edges RAM

🎯 Key Features

  • πŸ—ΊοΈ Interactive Map - Visualize road network with Leaflet
  • πŸ” Place Search - Search 4,000+ named locations
  • πŸ›£οΈ Shortest Path Routing - Dijkstra's algorithm in C++
  • ⚑ Memory Optimized - 17M nodes in under 1GB RAM

πŸ—οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚    Frontend     │────▢│    Backend      │────▢│   C++ Engine    β”‚
β”‚   (Next.js)     β”‚     β”‚   (Express)     β”‚     β”‚   (Dijkstra)    β”‚
β”‚    Vercel       β”‚     β”‚    Railway      β”‚     β”‚   Memory-Mapped β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                              β”‚
                              β–Ό
                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚   Binary Data Files β”‚
                    β”‚   (S3 - 700MB)      β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

🧠 The 1GB Challenge: How We Did It

The Problem

  • 17 million nodes (lat/lon coordinates) = 257 MB
  • 35 million edges (road connections) = 400 MB
  • Graph weights (distances) = 263 MB
  • Total: ~920 MB just for data!

The Solution: Memory-Mapped Files (mmap)

Instead of loading everything into RAM, we use memory-mapped files:

// Traditional approach (loads everything into RAM)
vector<Node> nodes;
nodes.resize(17000000); // πŸ’₯ Uses 257MB RAM

// Our approach (zero RAM usage)
void* data = mmap(nullptr, fileSize, PROT_READ, MAP_PRIVATE, fd, 0);
Node* nodes = reinterpret_cast<Node*>(data);
// βœ… OS loads pages on-demand, actual RAM usage: ~50MB

Memory Breakdown

Component Before After Optimization
Node data 257 MB ~10 MB (mmap)
Edge data 400 MB ~20 MB (mmap)
C++ Engine 200 MB ~50 MB
Node.js 500 MB 512 MB (capped)
Total 1.3 GB ~600 MB βœ…

πŸ“Š Data Pipeline

1. Extract from OpenStreetMap

# Download India OSM data (1.2 GB .pbf)
python scripts/extract_osm.py india-latest.osm.pbf

2. Convert to Binary Format

# Convert to optimized binary files
python scripts/convert_to_binary.py
python scripts/convert_places_to_bin.py

3. Binary File Structure

File Contents Format Size
nodes.bin Lat/Lon pairs 16 bytes per node 257 MB
graph.offset Edge offsets 4 bytes per node 64 MB
graph.targets Edge targets 4 bytes per edge 131 MB
graph.weights Edge weights 8 bytes per edge 263 MB
places.bin Named places Variable 600 KB

4. Why Binary Format? (The Speed Secret πŸš€)

Converting raw OSM/CSV data to binary format is critical for performance:

πŸ“„ CSV Format (Before):
"node_id,latitude,longitude"
"0,28.6139,77.2090"
"1,28.6140,77.2091"
...

πŸ“¦ Binary Format (After):
[8 bytes: lat][8 bytes: lon][8 bytes: lat][8 bytes: lon]...

Performance Comparison

Metric CSV/JSON Binary Improvement
File size 2.1 GB 715 MB 3x smaller
Parse time 45 seconds 0.3 seconds 150x faster
Memory during load 4+ GB ~50 MB 80x less
Random access ❌ Impossible βœ… Instant ∞

Why Binary Wins

  1. No parsing overhead - Data is stored exactly as it appears in memory
  2. Fixed-size records - Jump directly to any node: offset = nodeId Γ— 16
  3. Memory-mappable - OS can map file directly to memory without copying
  4. Cache-friendly - Sequential reads maximize CPU cache efficiency
# Converting CSV to Binary (extract from our script)
with open('nodes.bin', 'wb') as f:
    for lat, lon in nodes:
        f.write(struct.pack('<dd', lat, lon))  # 16 bytes per node

πŸ› οΈ Tech Stack

Layer Technology
Frontend Next.js 14, React, Leaflet, Redux
Backend Node.js, Express, TypeScript
Routing Engine C++17, Dijkstra's Algorithm
Data Format Custom Binary (mmap-compatible)
Hosting Vercel (Frontend), Railway (Backend)
Data Storage AWS S3

πŸš€ Quick Start

Prerequisites

  • Node.js 18+
  • GCC/G++ (for C++ engine)
  • ~4GB disk space

Local Development

# Clone the repository
git clone https://github.com/m-s-sat/map.git
cd map

# Install dependencies
cd backend && npm install && cd ..
cd frontend && npm install && cd ..

# Build C++ engine
cd cpp-engine
g++ -O3 -std=c++17 -o src/map_v2.exe src/main.cpp src/graph.cpp -I include
cd ..

# Start backend (terminal 1)
cd backend && npm run dev

# Start frontend (terminal 2)
cd frontend && npm run dev

# Open http://localhost:3000

🌐 Deployment Architecture

We use a split deployment strategy for cost efficiency and scalability:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                         PRODUCTION                               β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                   β”‚
β”‚   πŸ‘€ User                                                        β”‚
β”‚     β”‚                                                             β”‚
β”‚     β–Ό                                                             β”‚
β”‚   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    API calls    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚
β”‚   β”‚   Vercel    β”‚ ──────────────▢ β”‚       Railway           β”‚   β”‚
β”‚   β”‚  (Frontend) β”‚                 β”‚      (Backend)          β”‚   β”‚
β”‚   β”‚   Next.js   β”‚                 β”‚  Node.js + C++ Engine   β”‚   β”‚
β”‚   β”‚    FREE     β”‚                 β”‚      $5/month           β”‚   β”‚
β”‚   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                 β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚
β”‚                                               β”‚                   β”‚
β”‚                                               β–Ό                   β”‚
β”‚                                   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”        β”‚
β”‚                                   β”‚      AWS S3         β”‚        β”‚
β”‚                                   β”‚   (Data Storage)    β”‚        β”‚
β”‚                                   β”‚   715 MB binaries   β”‚        β”‚
β”‚                                   β”‚      ~$0.02/month   β”‚        β”‚
β”‚                                   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜        β”‚
β”‚                                                                   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Why Split Deployment?

Concern Solution
Frontend CDN Vercel's global edge network
Backend compute Railway's container hosting
Large data files S3 (downloaded at build time)
Cost ~$5/month total

Backend Deployment (Railway)

Railway builds and deploys our Docker container automatically:

1. Connect GitHub repo to Railway

Railway Dashboard β†’ New Project β†’ Deploy from GitHub

2. Railway auto-detects Dockerfile and builds:

# Multi-stage build
FROM node:20 AS ts-builder      # Build TypeScript
FROM gcc:12 AS cpp-builder      # Build C++ routing engine
FROM node:20-slim               # Final slim image

# Download data from S3 at build time
RUN curl -o ./data/nodes.bin "$S3_URL/nodes.bin"

3. Environment variables:

NODE_ENV=production
NODE_OPTIONS=--max-old-space-size=512

4. Resources:

  • RAM: 1GB
  • Auto-restarts on failure
  • HTTPS enabled automatically

5. Smart Rebuilds (Cost Saving):

We configured railway.toml to only rebuild when relevant code changes:

[build]
# Only continuously deploy when these folders change
watchPatterns = ["backend/**", "cpp-engine/**", "Dockerfile"]

This prevents unnecessary rebuilds (and billing) when you only change the frontend or documentation.


Frontend Deployment (Vercel)

Vercel provides zero-config Next.js deployment:

1. Connect GitHub repo to Vercel

vercel.com β†’ Add New Project β†’ Import from GitHub

2. Configure:

  • Root directory: frontend
  • Framework: Next.js (auto-detected)
  • Build command: npm run build

3. Environment variable:

NEXT_PUBLIC_API_URL=https://map-production-xxxx.up.railway.app

4. Benefits:

  • Global CDN (edge caching)
  • Automatic HTTPS
  • Preview deployments on PRs
  • 100% FREE for hobby projects

πŸ“ Project Structure

map/
β”œβ”€β”€ frontend/          # Next.js React app
β”‚   β”œβ”€β”€ src/
β”‚   β”‚   β”œβ”€β”€ components/
β”‚   β”‚   β”‚   β”œβ”€β”€ map-view.tsx      # Leaflet map
β”‚   β”‚   β”‚   └── place-search.tsx  # Search component
β”‚   β”‚   └── redux/                # State management
β”‚   └── .env.production           # Railway URL
β”‚
β”œβ”€β”€ backend/           # Express API server
β”‚   └── src/
β”‚       β”œβ”€β”€ controllers/
β”‚       β”‚   β”œβ”€β”€ node.controller.ts   # Streaming nodes
β”‚       β”‚   β”œβ”€β”€ edge.controller.ts   # Streaming edges
β”‚       β”‚   └── route.controller.ts  # Routing API
β”‚       └── services/
β”‚           └── cpp-engine.service.ts # C++ IPC
β”‚
β”œβ”€β”€ cpp-engine/        # C++ routing engine
β”‚   β”œβ”€β”€ src/
β”‚   β”‚   β”œβ”€β”€ main.cpp   # CLI interface
β”‚   β”‚   └── graph.cpp  # Dijkstra + mmap
β”‚   └── include/
β”‚       └── graph.h    # Data structures
β”‚
β”œβ”€β”€ data/              # Binary data files (not in git)
β”‚   β”œβ”€β”€ nodes.bin
β”‚   β”œβ”€β”€ graph.offset
β”‚   β”œβ”€β”€ graph.targets
β”‚   └── graph.weights
β”‚
└── scripts/           # Data processing
    β”œβ”€β”€ extract_osm.py
    └── convert_to_binary.py

πŸ”‘ Key Insights

1. Memory-Mapped Files > In-Memory Arrays

Traditional approach loads everything into RAM. With mmap, the OS handles paging automatically.

2. Binary Format > CSV/JSON

Binary files are 10x smaller and 100x faster to read than text formats.

3. Streaming > Bulk Loading

For web APIs, stream data on-demand rather than loading everything upfront.

4. C++ for Heavy Computation

Dijkstra's algorithm in C++ is 50x faster than JavaScript for graph traversal.


πŸ“ˆ Performance

Metric Value
Node count 16,867,026
Edge count 34,558,426
Data load time < 0.5s
Route query time 50-500ms
Memory usage ~600 MB
Geographic coverage All of India

πŸ“„ License

MIT License - feel free to use for your projects!


πŸ™ Acknowledgments

  • OpenStreetMap contributors for India road data
  • osmnx for OSM data extraction

Built with ❀️ by [Mrinal]

About

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors