mrkchoe/README.md

Hi, I'm Mark 🧠

MS Analytics (Computational Data Track) — Georgia Tech · BS Cognitive Science (ML & Neurocomputational Focus) — UCSD

Data engineering + ML for longitudinal/biomedical data. I build reproducible pipelines, research-grade analyses, and clean model evaluation.

View repos · LinkedIn · Google Scholar


Featured work


Core Capabilities

Languages
Python · SQL

Data Engineering & Infrastructure
PostgreSQL · Apache Airflow · dbt (models, tests, documentation)
Docker · GitHub Actions
Relational schema design · Data validation · Reproducible batch pipelines

Machine Learning & Evaluation
scikit-learn · Feature preprocessing
Cross-validation frameworks · Model evaluation diagnostics (ROC, PR, confusion matrices)
Structured model comparison & research-grade reporting

Visualization
D3.js (interactive statistical visualization)
Seaborn (statistical plots & model diagnostics)


Additional Experience (Non-Public / Professional Work / Academic Work)

Java / R / Bash
AWS (EC2, S3, SageMaker)
TensorFlow / PyTorch / Neural network modeling
Longitudinal biomedical dataset management

Pinned

  1. ad-mri-pet-model-comparison Public

    Structured comparison of machine learning models on curated tabular datasets.

    Jupyter Notebook

  2. wearable-data-pipeline Public

    Local-first data engineering pipeline for ingesting, validating, and transforming wearable health CSV data.

    Python

  3. sanctuary-operations-data-platform Public

    Operational data platform for intake, care, adoption, and cost reporting in a sanctuary environment, emphasizing relational data modeling and SQL-based analytics.

    Python

  4. airflow-api-to-postgres-demo Public

    Airflow demo pipeline: simple API extraction → CSV transform → Postgres load, with Docker and a basic data quality check.

    Python

  5. publications Public

    Bibliographic list of peer-reviewed neuroimaging publications with authorship context.