ChengWu-Data/README.md

Business-Centered Data Systems • Compliance • Forecasting • Analytics

Website · Email · LinkedIn


Who I Am

I build data-driven systems that turn fragmented operations into scalable workflows across supply chain compliance, demand forecasting, pricing analytics, and portfolio risk modeling.

My approach blends:

  • Business-first problem framing (consulting mindset)
  • Hands-on modeling + pipeline execution (analyst + DS)
  • Process + system design for scale (builder/operator)

The goal isn't just to run models; it's to improve decision velocity, traceability, and operational leverage.


Impact Summary

| Category | Result |
| --- | --- |
| Demand Forecasting | 12% lift in order completion rates |
| Pipeline Optimization | 40% lower latency via Spark/Dask |
| Compliance & Trade | Automated SKU audit workflows & duty validation |
| Finance & Deals | Analytics for 20+ PE/M&A transactions |
| Social Impact | Data infra supporting 15 families, +30% income |

Projects & Systems

Public versions redacted for confidentiality; code released as generalizable templates.

Core Systems (Cards View)

📊 Table View

| Project | Description | Tech |
| --- | --- | --- |
| tarte-compliance-analytics (WIP) | HTS mapping logic + 7501 audit validation pipelines | Python, SQL, PowerQuery |
| biotech-demand-forecasting | Elasticity experiments + Spark pipelines | Python, Spark/Dask, Scikit |
| portfolio-risk-dashboard | Factor models + return benchmarking | Python, SQL, Power BI |
| social-impact-metrics | Rural education + logistics dashboards | SQL, Dash, Logging Pipelines |

System Architecture (Diagrams)

1️⃣ Compliance / HTS + 7501 Workflow

flowchart LR
    A[SKU Metadata] --> B[NAV DB]
    B --> C[HTS Classification Engine]
    C --> D[7501 Entry Audit]
    D --> E[Duty Validation / Exceptions]
    E --> F[Compliance Dashboard]
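The "Duty Validation / Exceptions" stage in the diagram above could be sketched roughly as follows. This is a minimal illustration, not the production logic: the `EntryLine` fields, the `validate_duty` helper, the placeholder HTS codes, and the 5% rate are all hypothetical, and it assumes simple ad valorem duty (entered value times rate) with a one-cent tolerance.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class EntryLine:
    """One line of a hypothetical 7501 entry extract (field names are assumptions)."""
    sku: str
    hts_code: str
    entered_value: Decimal  # USD
    duty_paid: Decimal      # USD


def validate_duty(lines, rates, tolerance=Decimal("0.01")):
    """Return (sku, reason) pairs for lines whose duty cannot be confirmed."""
    exceptions = []
    for line in lines:
        rate = rates.get(line.hts_code)
        if rate is None:
            exceptions.append((line.sku, "no rate on file for HTS code"))
            continue
        # Assumed ad valorem duty, rounded to whole cents.
        expected = (line.entered_value * rate).quantize(Decimal("0.01"))
        if abs(expected - line.duty_paid) > tolerance:
            exceptions.append(
                (line.sku, f"expected {expected}, paid {line.duty_paid}")
            )
    return exceptions


# Made-up codes and rates, for illustration only.
rates = {"1111.11.1111": Decimal("0.05")}
lines = [
    EntryLine("SKU-001", "1111.11.1111", Decimal("1000.00"), Decimal("50.00")),
    EntryLine("SKU-002", "1111.11.1111", Decimal("200.00"), Decimal("12.00")),
    EntryLine("SKU-003", "2222.22.2222", Decimal("300.00"), Decimal("0.00")),
]
exceptions = validate_duty(lines, rates)
for sku, reason in exceptions:
    print(sku, reason)
```

`Decimal` is used instead of `float` so that cent-level comparisons are exact. Specific duties, compound rates, and trade program exemptions are out of scope for this sketch.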

2️⃣ Biotech Forecasting System

flowchart LR
    Data[Transactions + Seasonality] --> Seg[Clustering / Segmentation]
    Seg --> Elasticity[Price Elasticity Models]
    Elasticity --> Forecast[Forecast Engine]
    Forecast --> Dash[Product/Manager Dashboards]

3️⃣ Portfolio Analytics System

flowchart LR
    Market[Market Data + Filings + FX] --> Factors[Factor Construction]
    Factors --> Model[Regression / Scenario Models]
    Model --> Risk[Risk Dashboards / Exposure Reports]
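As a rough illustration of the "Regression / Scenario Models" step above, the sketch below estimates a single-factor exposure by ordinary least squares and applies a scenario shock. The `ols_beta` helper and the sample returns are hypothetical, and a real factor model would typically use several factors and a statistics library.

```python
def ols_beta(asset_returns, factor_returns):
    """Regress asset returns on one factor; return (beta, alpha)."""
    n = len(asset_returns)
    if n != len(factor_returns) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_a = sum(asset_returns) / n
    mean_f = sum(factor_returns) / n
    cov = sum(
        (a - mean_a) * (f - mean_f)
        for a, f in zip(asset_returns, factor_returns)
    )
    var = sum((f - mean_f) ** 2 for f in factor_returns)
    if var == 0:
        raise ValueError("factor returns have zero variance")
    beta = cov / var
    alpha = mean_a - beta * mean_f
    return beta, alpha


# Synthetic data: the asset is built as 1.5 x factor + 0.001 by construction.
factor = [0.01, -0.02, 0.03, 0.00]
asset = [1.5 * f + 0.001 for f in factor]
beta, alpha = ols_beta(asset, factor)

# Scenario: expected asset return if the factor falls by 10%.
shock = -0.10
scenario_return = alpha + beta * shock
print(round(beta, 4), round(alpha, 4), round(scenario_return, 4))
```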

4️⃣ Social Impact Data Loop

flowchart LR
    Survey[Field Surveys + Sales Data] --> SQL[SQL Pipelines]
    SQL --> Metrics[Impact Metrics]
    Metrics --> Funding[Resource Allocation Decisions]
    Funding --> Program[Program Changes]
    Program --> Survey

Methodology Library

Reusable frameworks (beyond code):

  • Commodity Code Consistency Mapping
    Framework for mapping HTS codes across SKU variants and regions

  • Elasticity Experimentation Protocol
    Systematic pricing test design (A/B + lift attribution)

  • Cross-System Reconciliation Strategy
    Align metadata across NAV, customs filings, logistics, and SKU catalogs

  • Impact Evaluation (Quasi-Experimental)
    Evaluation design for development intervention outcomes
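To make the Elasticity Experimentation Protocol item more concrete, here is a minimal sketch of the two quantities a pricing A/B test usually reports: relative lift and a point estimate of price elasticity between two arms. The function names and numbers are hypothetical, the arms are assumed to be equal-sized, and no significance testing is included.

```python
import math


def relative_lift(control_rate, treatment_rate):
    """Relative change of the treatment metric versus control."""
    if control_rate <= 0:
        raise ValueError("control rate must be positive")
    return (treatment_rate - control_rate) / control_rate


def log_elasticity(price_c, qty_c, price_t, qty_t):
    """Two-point elasticity estimate: change in ln(Q) over change in ln(P)."""
    if min(price_c, qty_c, price_t, qty_t) <= 0:
        raise ValueError("prices and quantities must be positive")
    if price_c == price_t:
        raise ValueError("arms must use different prices")
    return math.log(qty_t / qty_c) / math.log(price_t / price_c)


# Illustrative numbers only.
lift = relative_lift(control_rate=0.20, treatment_rate=0.224)
elasticity = log_elasticity(price_c=10.0, qty_c=1000, price_t=11.0, qty_t=900)
print(f"lift={lift:.2%}, elasticity={elasticity:.2f}")
```

An elasticity below -1 suggests demand is elastic at that price point, so a price increase would be expected to reduce revenue. A fuller protocol would add confidence intervals and segment-level estimates.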


Tech Stack

Core Analysis
Python · SQL · R

Infra / Pipelines
Spark · Dask · AWS · ETL Pipelines

Business BI & Ops
Power BI · Plotly · Dash · Excel/VBA

Domains
Compliance · Pricing/Forecasting · Portfolio Analytics · Social Impact Data


📍 Roadmap (2025)

🟣 Compliance Systems

Scaling trade classification with validation + observability

  • CI checks for HTS mapping drift
  • Exception reporting for SKU onboarding
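A CI check for HTS mapping drift might look something like the sketch below, which compares a committed baseline mapping against the current one. The helper names, SKU identifiers, and codes are hypothetical, and it assumes codes are stored as 10-digit HTS numbers with optional dots.

```python
import re

HTS_PATTERN = re.compile(r"^\d{10}$")


def normalize_hts(code):
    """Strip dots and spaces, then require exactly 10 digits (assumed format)."""
    digits = code.replace(".", "").replace(" ", "")
    if not HTS_PATTERN.match(digits):
        raise ValueError(f"not a 10-digit HTS code: {code!r}")
    return digits


def mapping_drift(baseline, current):
    """Compare two SKU -> HTS mappings; return added, removed, and changed SKUs."""
    drift = {"added": [], "removed": [], "changed": []}
    for sku in sorted(set(baseline) | set(current)):
        if sku not in baseline:
            drift["added"].append(sku)
        elif sku not in current:
            drift["removed"].append(sku)
        elif normalize_hts(baseline[sku]) != normalize_hts(current[sku]):
            drift["changed"].append(sku)
    return drift


# Placeholder codes, for illustration only.
baseline = {
    "SKU-001": "1111.11.1111",
    "SKU-002": "2222.22.2222",
    "SKU-003": "3333333333",
}
current = {
    "SKU-001": "1111111111",    # same code, different formatting: not drift
    "SKU-002": "2222.22.9999",  # reclassified
    "SKU-004": "4444.44.4444",  # newly onboarded
}
drift = mapping_drift(baseline, current)
has_drift = any(drift.values())
print(drift)
```

In a CI job, a truthy `has_drift` could fail the build (for example with `sys.exit(1)`) until someone reviews the change and updates the baseline.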

🟣 Forecasting Infra

Closing the loop between pricing experiments & operational decisions

  • Publish demand modeling templates
  • Add seasonal elasticity dashboards

🟣 Cross-System Alignment

Eliminating fragmentation across NAV / customs / product catalog

  • Reconciliation CLI tool
  • Metadata schema consistency rules
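The core of a reconciliation tool could be as small as the sketch below, which flags SKU fields that disagree (or are missing) across systems. The `reconcile` function, the system names, and the field names are illustrative assumptions rather than the actual schema.

```python
def reconcile(sources, fields):
    """Flag fields whose values differ across systems for the same SKU.

    sources: {system_name: {sku: {field: value}}}
    Returns a list of (sku, field, {system: value}) tuples. A record that is
    missing from a system shows up as None for that system.
    """
    issues = []
    all_skus = sorted(set().union(*(records.keys() for records in sources.values())))
    for sku in all_skus:
        for field in fields:
            seen = {
                system: records.get(sku, {}).get(field)
                for system, records in sources.items()
            }
            if len(set(seen.values())) > 1:
                issues.append((sku, field, seen))
    return issues


# Hypothetical extracts from three systems.
sources = {
    "nav": {
        "SKU-001": {"hts_code": "1111111111", "country_of_origin": "CN"},
        "SKU-002": {"hts_code": "2222222222", "country_of_origin": "KR"},
    },
    "customs": {
        "SKU-001": {"hts_code": "1111111111", "country_of_origin": "VN"},
    },
    "catalog": {
        "SKU-001": {"hts_code": "1111111111", "country_of_origin": "CN"},
        "SKU-002": {"hts_code": "2222222222", "country_of_origin": "KR"},
    },
}
issues = reconcile(sources, fields=["hts_code", "country_of_origin"])
for sku, field, seen in issues:
    print(sku, field, seen)
```

Wrapping this in a CLI would mostly be a matter of loading each system's export and printing or exporting `issues`; schema consistency rules could then be expressed as the `fields` list plus per-field validators.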

🟣 Impact & Policy

Turning data into resource allocation decisions

  • Open-source impact scoring framework
  • Quantitative evaluation case studies

📫 Contact

📧 Email: cheng.w@columbia.edu
🌐 Website: https://chengwu-data.github.io/
💼 LinkedIn: https://www.linkedin.com/in/cheng-wu-1ab27922a/

Work in progress — building systems that make data useful. 🚀

Popular repositories

  1. Housing-Price-Prediction-An-Exploratory-Analysis (Public)
     HTML

  2. chengwu-data.github.io (Public)
     Forked from RayeRen/acad-homepage.github.io
     Cheng wu
     HTML

  3. followerJP (Public)
     JavaScript

  4. ChengWu-Data (Public)
     Config files for my GitHub profile.
     TeX

  5. Exponential-Smoothing-Trend-Strategy-with-Parameter-Tuning-Exit-Rules (Public)
     End-to-end quantitative analysis of USD/CAD using exponential smoothing, signal engineering, grid search, and trade-level performance evaluation.
     Jupyter Notebook

  6. sf-crime-socioeconomic-analysis (Public)
     Panel & count models linking 900k+ SF crime incidents with ACS socioeconomic data (fixed effects, Poisson/NegBin, forecasting).
     Jupyter Notebook