
CSV Processing System (gRPC + Django Gateway + Next.js Frontend)

A high-performance distributed system for processing large CSV files (e.g., millions of rows) using gRPC streaming, a Django REST gateway, and a Next.js frontend.
The app demonstrates memory-efficient streaming, asynchronous background job handling, and live upload and download management.


Project Architecture

Frontend (Next.js) → REST Gateway (Django) → (async job) → gRPC CSV Processor → Storage Directory → Aggregated CSV Output

  • Frontend (Next.js) lets users upload CSV files and track processing jobs.
  • Backend (Django) receives uploads, starts a background process, and streams the CSV to the gRPC service.
  • gRPC Service (Python) streams the CSV bytes, aggregates data efficiently, and returns summary metrics.
  • Storage holds the input files and the aggregated CSV output.

Components

  • Frontend (Next.js) – Allows CSV upload, progress tracking, and downloading results.
  • Backend Gateway (Django + DRF) – Handles file uploads asynchronously, dispatches background gRPC jobs, and serves processed results.
  • gRPC Service (Python) – Streams CSV chunks and aggregates totals per department efficiently.
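The client-side half of this pipeline can be sketched as a simple chunk generator. The 64 KB chunk size comes from the algorithm description below; the function name and the in-memory file are illustrative, not the repo's actual API:

```python
import io

CHUNK_SIZE = 64 * 1024  # 64 KB per streamed message, as described below

def read_chunks(fileobj):
    """Yield fixed-size byte chunks so the whole file never sits in memory."""
    while True:
        chunk = fileobj.read(CHUNK_SIZE)
        if not chunk:
            break
        yield chunk

# Example: stream a small in-memory CSV instead of an uploaded file
data = io.BytesIO(b"department,number_of_sales\nToys,3\nBooks,5\n")
chunks = list(read_chunks(data))
```

In the real system each yielded chunk would be wrapped in a protobuf message and sent over the client-streaming RPC.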

Algorithm Explanation

Core goal: aggregate the number of sales per department from huge CSVs while keeping memory usage constant (independent of file size).

Steps:

1. The client (Django) sends CSV chunks (64 KB each) to the gRPC server via a client-streaming RPC.
2. The server's StreamingCSVAggregator processes each line as it arrives — it never loads the full file into memory:

   totals[department] += number_of_sales

3. After the last chunk, the totals dictionary is written out as a new CSV.
4. The gRPC server returns the number of departments, rows processed, a download URL, and metrics (processing time, memory usage).
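The steps above can be sketched as a streaming aggregator. The class and column names mirror the README's description but are assumptions, not the repo's actual StreamingCSVAggregator API; the key detail is buffering the partial line that a chunk boundary may split:

```python
from collections import defaultdict

class StreamingCSVAggregator:
    def __init__(self):
        self.totals = defaultdict(int)  # department -> total sales, O(D) memory
        self._buffer = b""              # holds a partial line between chunks
        self._header_seen = False

    def feed(self, chunk: bytes):
        """Process one streamed chunk; only complete lines are consumed."""
        self._buffer += chunk
        *lines, self._buffer = self._buffer.split(b"\n")
        for line in lines:
            if not line.strip():
                continue
            if not self._header_seen:   # skip the CSV header row
                self._header_seen = True
                continue
            department, sales = line.decode().rsplit(",", 1)
            self.totals[department] += int(sales)

    def finish(self):
        """Flush a trailing row that had no final newline, then return totals."""
        if self._buffer.strip():
            self.feed(b"\n")
        return dict(self.totals)

# Example: two chunks whose boundary splits the "Books,5" row
agg = StreamingCSVAggregator()
agg.feed(b"department,number_of_sales\nToys,3\nBo")
agg.feed(b"oks,5\nToys,2")
totals = agg.finish()  # {"Toys": 5, "Books": 5}
```

Because only complete lines are parsed and only the totals dictionary persists, memory use stays flat no matter how many chunks arrive.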

Memory-Efficiency Strategy

  • Large CSVs (hundreds of MB or millions of rows): processed incrementally by streaming bytes.
  • Memory spikes: avoided by storing only the aggregated totals dictionary (e.g. {Department → total_sales}).
  • Monitoring: tracemalloc measures peak memory during background processing.
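The tracemalloc monitoring mentioned above can be sketched as follows; the aggregation loop here is a stand-in, and the exact call sites in the repo's background worker are assumptions:

```python
import tracemalloc

tracemalloc.start()

# Stand-in for the streaming aggregation; only the totals dict persists.
totals = {}
for i in range(100_000):
    dept = f"dept-{i % 10}"
    totals[dept] = totals.get(dept, 0) + 1

current, peak = tracemalloc.get_traced_memory()  # both values in bytes
tracemalloc.stop()

peak_mb = peak / (1024 * 1024)  # peak memory, suitable for the metrics response
```

Reporting `peak` rather than `current` captures any transient spike during processing, which is what the returned metrics describe.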

Complexity Analysis

Operation                Complexity     Notes
Read + parse CSV         O(N) time      Each row is processed once.
Aggregation (hash map)   O(1) per row   Dictionary insert/update for department totals.
Total memory             O(D)           Only department totals are kept, not all rows.

Overall: O(N) time, O(D) space, where N is the number of rows and D the number of distinct departments.
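A tiny numerical check of the O(D)-space claim, using made-up row data in the README's department/number_of_sales shape: after aggregating N rows, the state held in memory is bounded by the D distinct departments, not by N.

```python
import random

N, D = 100_000, 25
departments = [f"dept-{i}" for i in range(D)]

totals = {}
for _ in range(N):                          # O(N) time: each row touched once
    dept = random.choice(departments)
    totals[dept] = totals.get(dept, 0) + 1  # O(1) expected dict update

# totals now has at most D entries regardless of N.
```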

How to Run the System

1. Prerequisites

  • Python 3.9+
  • Node.js 18+
  • pip install grpcio grpcio-tools djangorestframework drf-yasg python-dotenv
  • (Optional) npm install for frontend dependencies

2. Environment variables (from .env):

3. Setup gRPC Service

cd backend
python -m venv .venv
.venv\Scripts\activate   # Windows; use `source .venv/bin/activate` on macOS/Linux
pip install -r requirements.txt
cd grpc_service

# (Re)generate protobuf files
python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. csv_upload.proto

# Run the async gRPC server
python server.py

This service receives CSV chunks via streaming, uses a streaming aggregator to process rows without loading the full file into memory, and writes summarized results (totals per department) to /grpc_service/storage.

4. Setup Django Gateway

cd backend/gateway
python manage.py migrate
python manage.py runserver

5. Run Frontend (NextJS)

cd frontend
npm install
npm run dev

Then open http://localhost:3000
