Data Engineer with 4+ years of experience designing and delivering enterprise-scale data platforms across the financial services, insurance, retail, and telecom sectors. I specialize in building robust, cloud-native data architectures using modern lakehouse patterns, translating raw data into reliable, high-performance analytical systems.
I've driven significant cost and performance wins, including Snowflake compute cost reductions, query latency improvements, and large-scale cloud migration projects. My engineering philosophy: build for reliability, optimize for cost, and scale for the future.
| Area | Achievement | Impact |
|---|---|---|
| Cloud Migrations | AWS / GCP / Azure | Enterprise-scale platform modernization |
| Cost Optimization | Snowflake & Spark tuning | Significant compute cost savings delivered |
| Streaming Pipelines | Real-time data platforms | Sub-minute latency on production systems |
| Data Domains | FinServ · Insurance · Retail · Telecom | Cross-industry platform expertise |
| ML Engineering | Feature pipelines & model-ready datasets | Accelerated model deployment cycles |
| Certifications | Databricks · Azure · AWS | Triple-cloud certified practitioner |
Enterprise banking data platform work across reporting, analytics, and pipeline engineering.
Retail data platform engineering; supply chain and POS analytics pipelines.
AI/ML data infrastructure; churn prediction feature pipelines and model data workflows.
graph TB
A[Raw Sources:<br/>APIs · Kafka · Databases · Files] --> B[Ingestion Layer<br/>Spark · Glue · ADF · Custom ETL]
B --> C[Cloud Data Lake<br/>S3 · ADLS · GCS · Delta Lake]
C --> D[Transformation<br/>PySpark · Scala · dbt · SQL]
D --> E[Serving Layer<br/>Snowflake · Redshift · BigQuery]
E --> F[Consumption<br/>Power BI · Tableau · ML Models · APIs]
style A fill:#0f172a,stroke:#3b82f6,stroke-width:2px,color:#fff
style B fill:#1e40af,stroke:#60a5fa,stroke-width:2px,color:#fff
style C fill:#1d4ed8,stroke:#93c5fd,stroke-width:2px,color:#fff
style D fill:#2563eb,stroke:#bfdbfe,stroke-width:2px,color:#fff
style E fill:#3b82f6,stroke:#dbeafe,stroke-width:2px,color:#fff
style F fill:#059669,stroke:#6ee7b7,stroke-width:2px,color:#fff
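The diagram above describes a lake-to-serving flow; as a minimal, library-free sketch of the bronze → silver → serving steps (real pipelines here would use Spark, dbt, and a warehouse — the data and function names below are purely illustrative):

```python
import json

# Hypothetical raw events as they might land in the lake's bronze layer.
RAW_EVENTS = [
    '{"order_id": 1, "amount": "19.99", "region": "US"}',
    '{"order_id": 2, "amount": "5.50", "region": "EU"}',
    'not-valid-json',  # bad records are quarantined, not dropped silently
    '{"order_id": 3, "amount": "7.25", "region": "US"}',
]

def to_silver(raw_lines):
    """Parse, type-cast, and quarantine bad records (bronze -> silver)."""
    clean, quarantine = [], []
    for line in raw_lines:
        try:
            rec = json.loads(line)
            rec["amount"] = float(rec["amount"])  # enforce schema types
            clean.append(rec)
        except (ValueError, KeyError):
            quarantine.append(line)
    return clean, quarantine

def serve_by_region(clean):
    """Aggregate for the serving layer (what Snowflake/BigQuery would serve)."""
    totals = {}
    for rec in clean:
        totals[rec["region"]] = totals.get(rec["region"], 0.0) + rec["amount"]
    return {region: round(total, 2) for region, total in totals.items()}

clean, bad = to_silver(RAW_EVENTS)
print(serve_by_region(clean))  # {'US': 27.24, 'EU': 5.5}
```

The key design point mirrored from the diagram: malformed records are routed aside at the ingestion boundary rather than failing the whole batch, so downstream layers only ever see typed, contract-conformant rows.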
timeline
section Early Career
Foundation : Junior to Mid-level Data Engineering
: ETL · SQL · Warehousing basics
section Growth
Voziq AI : Data Engineer
: Feature engineering for churn models
The Hanover Insurance : Retail analytics platform
: POS and supply chain pipelines
section Senior
SMBC (2024–Present) : Enterprise banking platform
: Snowflake optimization · Cloud migration
End-to-end ML pipeline built on the UCI CKD dataset. Focused on clinical interpretability using SHAP values, odds ratios, and decision tree visualizations — designed for healthcare stakeholder communication, not just model accuracy.
Stack: Python · Scikit-learn · SHAP · Pandas · Matplotlib
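The odds-ratio view mentioned above comes from exponentiating logistic-regression coefficients (exp(β) is the multiplicative change in odds per one-unit feature increase). A minimal sketch — the coefficient values below are hypothetical, not the project's fitted model:

```python
import math

# Hypothetical fitted logistic-regression coefficients (log-odds per unit).
coefficients = {
    "serum_creatinine": 1.20,   # illustrative values only
    "hemoglobin": -0.80,
    "age": 0.05,
}

def odds_ratios(coefs):
    """exp(beta): multiplicative change in CKD odds per one-unit increase."""
    return {name: round(math.exp(beta), 3) for name, beta in coefs.items()}

for feature, ratio in odds_ratios(coefficients).items():
    direction = "raises" if ratio > 1 else "lowers"
    print(f"{feature}: OR={ratio} ({direction} odds per unit)")
```

Framing coefficients this way is what makes the model legible to clinical stakeholders: "each unit of serum creatinine multiplies the odds by ~3.3" communicates far better than a raw log-odds weight.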
Terminal-aesthetic developer tool for GitHub profile analysis and optimization. Built with a polished UI targeting engineers who want to sharpen their personal brand and portfolio positioning.
Stack: Python · GitHub API · Rich CLI
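The kind of analysis such a tool performs can be sketched as a pure function over repo metadata shaped like the GitHub REST API's `/users/{user}/repos` response — the signal names and sample payload below are hypothetical, not the actual tool's logic:

```python
def profile_signals(repos):
    """Summarize repo metadata into simple profile-health signals.

    `repos` is a list of dicts shaped like the GitHub REST API's
    /users/{user}/repos response (only the keys used here matter).
    """
    described = sum(1 for r in repos if r.get("description"))
    languages = {r["language"] for r in repos if r.get("language")}
    stars = sum(r.get("stargazers_count", 0) for r in repos)
    return {
        "repos": len(repos),
        "described_pct": round(100 * described / max(len(repos), 1)),
        "languages": sorted(languages),
        "total_stars": stars,
    }

# Example payload (hypothetical data, not fetched live):
sample = [
    {"language": "Python", "description": "ETL toolkit", "stargazers_count": 12},
    {"language": "Scala", "description": None, "stargazers_count": 3},
]
print(profile_signals(sample))
# {'repos': 2, 'described_pct': 50, 'languages': ['Python', 'Scala'], 'total_stars': 15}
```

Keeping the scoring logic as a pure function over already-fetched JSON makes it trivially testable without hitting the API.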
Multiple enterprise cloud migration projects across AWS, GCP, and Azure — moving legacy on-premise data systems to modern lakehouse architectures with measurable latency and cost improvements.
Stack: Spark · Databricks · Snowflake · Airflow · Delta Lake · dbt
| Credential | Issuer | Focus |
|---|---|---|
| 🎓 Master's — IT Project Management | Clark University | Technology leadership & delivery |
| ☁️ Databricks Certified | Databricks | Lakehouse & Spark engineering |
| ☁️ Microsoft Azure Certified | Microsoft | Cloud data engineering on Azure |
| ☁️ AWS Certified | Amazon Web Services | Cloud architecture & services |
"Great data engineering is invisible — pipelines run silently, warehouses respond instantly, and business teams make confident decisions without wondering where the numbers come from. That's the standard I build to."
Core principles I ship by:
- ⚡ **Performance-first** — Profile before optimizing; instrument everything
- 🔁 **Idempotency by default** — Every pipeline should be safely re-runnable
- 💸 **Cost as a metric** — Cloud spend is an engineering responsibility, not just finance's
- 📐 **Schema as contract** — Data contracts prevent downstream chaos
- 🧪 **Test your data** — Data quality checks are not optional in production
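The "schema as contract" and "test your data" principles above can be combined into a small quality gate; a minimal sketch with illustrative column rules (production systems would use a framework like Great Expectations or dbt tests):

```python
# Contract rules below are illustrative, not from any real pipeline.
CONTRACT = {
    "order_id": {"type": int, "nullable": False},
    "amount": {"type": float, "nullable": False, "min": 0.0},
    "region": {"type": str, "nullable": True},
}

def validate(rows, contract=CONTRACT):
    """Return (passed_rows, violations); a pipeline would fail fast on violations."""
    passed, violations = [], []
    for i, row in enumerate(rows):
        errors = []
        for col, rule in contract.items():
            val = row.get(col)
            if val is None:
                if not rule["nullable"]:
                    errors.append(f"row {i}: {col} is null")
                continue
            if not isinstance(val, rule["type"]):
                errors.append(f"row {i}: {col} has type {type(val).__name__}")
            elif "min" in rule and val < rule["min"]:
                errors.append(f"row {i}: {col} below {rule['min']}")
        if errors:
            violations.extend(errors)
        else:
            passed.append(row)
    return passed, violations

good = {"order_id": 1, "amount": 9.5, "region": "US"}
bad = {"order_id": 2, "amount": -1.0, "region": None}
rows_ok, problems = validate([good, bad])
print(len(rows_ok), problems)  # 1 ['row 1: amount below 0.0']
```

Running checks like these at the boundary of every layer is what makes failures loud and local instead of silent and downstream.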
Open to Senior Data Engineer · Data Architect · Platform Engineer roles (Remote / India)
📬 Let's connect on LinkedIn