Bikash Deb bikash-deb-007

Hi, I'm Bikash Deb

Data Engineer | Building scalable data pipelines with Apache Spark

About Me

I build production data pipelines that process millions of records. I'm currently focused on implementing Medallion Architecture for data lakes and revenue assurance systems in telecom.

What I work with:

Apache Spark & PySpark for distributed processing
Python for ETL development
Parquet & Delta Lake for storage
Data quality frameworks and validation

Technical Skills

Data Engineering:

Apache Spark (PySpark) | Databricks | Delta Lake
ETL/ELT Pipeline Development
Medallion Architecture (Bronze/Silver/Gold)
Data Quality & Validation

Programming & Tools:

Python | SQL | Git
Parquet | Delta | CSV
Data Modeling | Schema Design
Performance Optimization

Cloud & Infrastructure:

Distributed Computing
Data Warehousing
Version Control (Git/GitHub)

Featured Projects

TelcoStream Analytics Engine

Production-grade data engineering pipeline for telecom revenue assurance using PySpark and Medallion Architecture.

Tech Stack: PySpark, Parquet, Python, Medallion Architecture
Highlights:

Processes 5,000+ CDR records with schema-on-read validation
Identifies 33.6% of customers at bill shock risk
Implements 4-tier risk classification system
60-70% reduction in disputed charges
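
To make the highlights above concrete, here is a minimal sketch of the Bronze → Silver → Gold flow with schema-on-read validation and a 4-tier risk label. It is not the project's code: it uses plain Python instead of PySpark so it runs without a cluster, and the field names (customer_id, duration_sec, charge), the plan_limits input, the tier names, and the thresholds are all hypothetical choices made for illustration.

```python
# Hypothetical CDR schema: field name -> cast applied when a row is read.
CDR_SCHEMA = {"customer_id": str, "duration_sec": int, "charge": float}


def to_silver(bronze_rows):
    """Schema-on-read: cast each raw (Bronze) row, separating valid from rejected rows."""
    valid, rejected = [], []
    for row in bronze_rows:
        try:
            valid.append({name: cast(row[name]) for name, cast in CDR_SCHEMA.items()})
        except (KeyError, TypeError, ValueError):
            # Missing field or uncastable value: keep the raw row for inspection.
            rejected.append(row)
    return valid, rejected


def classify_risk(usage_ratio):
    """Map charges / plan limit to one of four tiers (thresholds are assumptions)."""
    if usage_ratio >= 1.5:
        return "CRITICAL"
    if usage_ratio >= 1.2:
        return "HIGH"
    if usage_ratio >= 1.0:
        return "MEDIUM"
    return "LOW"


def to_gold(silver_rows, plan_limits):
    """Aggregate charges per customer (Gold) and attach a risk tier."""
    totals = {}
    for row in silver_rows:
        cid = row["customer_id"]
        totals[cid] = totals.get(cid, 0.0) + row["charge"]
    return {cid: classify_risk(total / plan_limits[cid]) for cid, total in totals.items()}
```

Usage would look like `silver, rejected = to_silver(bronze_rows)` followed by `to_gold(silver, plan_limits)`. In a PySpark version the same steps would typically be expressed as DataFrame reads with an explicit schema and a grouped aggregation, written out as Parquet at each layer.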

View Project →

Connect With Me

"Data scientists get the glory, but data engineers build the foundation."
