PA110/README.md

👋 Hello, I'm Punith_A

Data Engineer | Aspiring IT Project Manager | AI Enthusiast

PA110

LinkedIn GitHub


🎯 ABOUT ME

Data Engineer with 4+ years of experience designing and delivering enterprise-scale data platforms across the financial services, insurance, retail, and telecom sectors. I specialize in building robust, cloud-native data architectures using modern lakehouse patterns, translating raw data into reliable, high-performance analytical systems.

I've driven cost and performance wins, including Snowflake compute cost reductions, query latency improvements, and large-scale cloud migration projects. My engineering philosophy: build for reliability, optimize for cost, and scale for the future.

📊 IMPACT AT A GLANCE

| Area | Achievement | Impact |
| --- | --- | --- |
| Cloud Migrations | AWS / GCP / Azure | Enterprise-scale platform modernization |
| Cost Optimization | Snowflake & Spark tuning | Significant compute cost savings delivered |
| Streaming Pipelines | Real-time data platforms | Sub-minute latency on production systems |
| Data Domains | FinServ · Insurance · Retail · Telecom | Cross-industry platform expertise |
| ML Engineering | Feature pipelines & model-ready datasets | Accelerated model deployment cycles |
| Certifications | Databricks · Azure · AWS | Triple-cloud certified practitioner |

💼 PROFESSIONAL EXPERIENCE

🏦 SMBC — Data Engineer

Enterprise banking data platform work across reporting, analytics, and pipeline engineering.

🛒 The Hanover Insurance — Data Engineer

Retail data platform engineering; supply chain and POS analytics pipelines.

🤖 Voziq AI — Data Engineer

AI/ML data infrastructure; churn prediction feature pipelines and model data workflows.


🏗️ ARCHITECTURE APPROACH

graph TB
    A[Raw Sources:<br/>APIs · Kafka · Databases · Files] --> B[Ingestion Layer<br/>Spark · Glue · ADF · Custom ETL]
    B --> C[Cloud Data Lake<br/>S3 · ADLS · GCS · Delta Lake]
    C --> D[Transformation<br/>PySpark · Scala · dbt · SQL]
    D --> E[Serving Layer<br/>Snowflake · Redshift · BigQuery]
    E --> F[Consumption<br/>Power BI · Tableau · ML Models · APIs]

    style A fill:#0f172a,stroke:#3b82f6,stroke-width:2px,color:#fff
    style B fill:#1e40af,stroke:#60a5fa,stroke-width:2px,color:#fff
    style C fill:#1d4ed8,stroke:#93c5fd,stroke-width:2px,color:#fff
    style D fill:#2563eb,stroke:#bfdbfe,stroke-width:2px,color:#fff
    style E fill:#3b82f6,stroke:#dbeafe,stroke-width:2px,color:#fff
    style F fill:#059669,stroke:#6ee7b7,stroke-width:2px,color:#fff

🛠️ TECHNOLOGY STACK

Core Languages

Scala Python SQL

Big Data & Processing

Apache Spark PySpark Databricks Kafka

Cloud Platforms

AWS GCP Azure

Data Warehousing

Snowflake Redshift BigQuery

Orchestration & DevOps

Apache Airflow dbt Docker Git

Analytics & ML

Power BI Tableau


📈 CAREER TIMELINE

timeline
    section Early Career
        Foundation : Junior to Mid-level Data Engineering
                   : ETL · SQL · Warehousing basics
    section Growth
        Voziq AI     : Data Engineer
                     : Feature engineering for churn models
        The Hanover Insurance : Retail analytics platform
                              : POS and supply chain pipelines
    section Senior
        SMBC       : Enterprise banking platform
                   : Snowflake optimization · Cloud migration
                   : 2024–Present

🚀 FEATURED PROJECTS

📊 Chronic Kidney Disease Prediction — ML + Interpretability

End-to-end ML pipeline built on the UCI CKD dataset. Focused on clinical interpretability using SHAP values, odds ratios, and decision tree visualizations — designed for healthcare stakeholder communication, not just model accuracy.

Stack: Python · Scikit-learn · SHAP · Pandas · Matplotlib


🖥️ GitOptima — GitHub Profile Optimizer

Terminal-aesthetic developer tool for GitHub profile analysis and optimization. Built with a polished UI targeting engineers who want to sharpen their personal brand and portfolio positioning.

Stack: Python · GitHub API · Rich CLI


🏭 Cloud Data Platform Migrations

Multiple enterprise cloud migration projects across AWS, GCP, and Azure — moving legacy on-premise data systems to modern lakehouse architectures with measurable latency and cost improvements.

Stack: Spark · Databricks · Snowflake · Airflow · Delta Lake · dbt


🎓 EDUCATION & CERTIFICATIONS

| Credential | Issuer | Focus |
| --- | --- | --- |
| 🎓 Master's in IT Project Management | Clark University | Technology leadership & delivery |
| ☁️ Databricks Certified | Databricks | Lakehouse & Spark engineering |
| ☁️ Microsoft Azure Certified | Microsoft | Cloud data engineering on Azure |
| ☁️ AWS Certified | Amazon Web Services | Cloud architecture & services |


💡 ENGINEERING PHILOSOPHY

"Great data engineering is invisible — pipelines run silently, warehouses respond instantly, and business teams make confident decisions without wondering where the numbers come from. That's the standard I build to."

Core principles I ship by:

Performance-first — Profile before optimizing; instrument everything
🔁 Idempotency by default — Every pipeline should be safely re-runnable
💸 Cost as a metric — Cloud spend is an engineering responsibility, not just finance's
📐 Schema as contract — Data contracts prevent downstream chaos
🧪 Test your data — Data quality checks are not optional in production
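
A minimal sketch of how three of these principles (idempotency, schema as contract, and testing data before it lands) can fit together in one load step. Everything here is an assumption for illustration: the `orders` table, the `ORDERS_CONTRACT` mapping, and the `validate`, `load`, and `row_count` helpers are hypothetical names, and SQLite stands in for a real warehouse so the example runs with the standard library only.

```python
import sqlite3

# Hypothetical data contract: column name -> required Python type.
ORDERS_CONTRACT = {"order_id": int, "amount": float, "status": str}


def validate(rows, contract):
    """Return the rows unchanged if all satisfy the contract; raise otherwise."""
    for row in rows:
        if set(row) != set(contract):
            raise ValueError(f"unexpected columns: {sorted(row)}")
        for column, expected in contract.items():
            if not isinstance(row[column], expected):
                raise ValueError(
                    f"{column}={row[column]!r} is not {expected.__name__}"
                )
    return rows


def load(conn, rows):
    """Write rows keyed on order_id so re-running the same batch is safe."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, amount REAL, status TEXT)"
    )
    # INSERT OR REPLACE overwrites the row with the same primary key
    # instead of appending a duplicate, which is what makes the load idempotent.
    conn.executemany(
        "INSERT OR REPLACE INTO orders (order_id, amount, status) "
        "VALUES (:order_id, :amount, :status)",
        validate(rows, ORDERS_CONTRACT),
    )
    conn.commit()


def row_count(conn):
    """Count the rows currently stored in the orders table."""
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


conn = sqlite3.connect(":memory:")
batch = [
    {"order_id": 1, "amount": 9.5, "status": "paid"},
    {"order_id": 2, "amount": 20.0, "status": "open"},
]
load(conn, batch)
load(conn, batch)  # re-run of the same batch: no duplicates are created
print(row_count(conn))  # prints 2
```

Validation runs before any write, so a batch that breaks the contract fails loudly instead of reaching downstream consumers. In a real platform the same idea would more likely be expressed as a `MERGE` statement or a dbt test, but the shape of the check is the same.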


Open to Senior Data Engineer · Data Architect · Platform Engineer roles (Remote / India)
📬 Let's connect on LinkedIn

Popular repositories

  1. devopsrepo (Public)

    Python

  2. AIChatbot (Public)

    Python

  3. Drishti (Public)

    JavaScript

  4. PA110 (Public)

  5. IT-Project-Governance-Framework (Public)

  6. zizy (Public)

    Zizy the forensic investigator

    Python