I have a scientific background and I'm a tinkerer at heart. Here are the projects where I apply what I learn!
Full data value chain coverage: From extraction to production.
| Domain | Tools |
|---|---|
| Languages | |
| AI & ML | |
| Engineering | |
| BI & Corp | |
I built a custom NAS (TrueNAS Scale) that evolved from family storage to a technical playground. I explore networking (VLANs, Reverse Proxy, VPN), hardware assembly, and run self-hosted services (Nextcloud, Media Servers, Pi-hole) via Docker.
Legend: ✅ = Completed (click title for repo) | 🚧 = Work In Progress
-
- Solo • From scratch
- Automates internship applications via LinkedIn parsing and GenAI, with a human-in-the-loop validation step.
- Stack: Python, Streamlit, LLM APIs, Docker, Chrome Ext
-
🚧 Smart Shopping List Generator
- Solo • From scratch
- Digitizes physical recipe cards (HelloFresh) using OCR and GenAI for ingredient normalization/entity resolution.
- Stack: Python, Tesseract, LLM, Streamlit, Docker
-
- Team of 3 • From scratch
- Comprehensive R Library for Variable Clustering (K-Means, HAC, MCA) using R6 (OOP). Includes Vignettes & Unit Tests.
- Stack: R, R6 (OOP), Shiny, Package Dev
-
✅ MedTriage-AI
- Team of 4 • From scratch
- AI Copilot for emergency medical triage based on the official FRENCH standard.
- Dockerized multi-service architecture (Frontend, Backend, MLflow) deployed on Hugging Face Spaces.
- Integrated a Pydantic-AI Agent with RAG capabilities (ChromaDB) for medical protocol analysis and structured data extraction.
- Production-Grade & Ethical Focus: Prompt injection security, real-time FinOps (API costs) and GreenOps (EcoLogits) monitoring to track carbon footprint.
- Stack: Pydantic-AI, Mistral AI, FastAPI, Streamlit, Docker, MLflow, ChromaDB, EcoLogits
-
✅ Fraud Detection with Cost-Sensitive Learning
- Team of 2 • Imbalanced Data
- Fraud detection in highly imbalanced check transaction data (IR ~165:1, 4.6M transactions).
- Dual Approach: Statistical optimization (F1-Score) vs Economic optimization (Profit maximization).
- Achieved 93.3% profit capture rate (€2.14M) using Instance-Weighted XGBoost with custom cost matrix integration.
- Stack: Python, XGBoost, Polars, Scikit-learn, Imbalanced-learn
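
The gap between the statistical and the economic objective can be illustrated with a minimal sketch. It is not the project's code, and it only covers picking a decision threshold, not instance-weighted training. The scores, labels, amounts, `review_cost`, and the profit rule (a caught fraud saves its amount, every flagged transaction costs one review) are all hypothetical assumptions.

```python
def f1_at(threshold, scores, labels):
    """F1-score when every transaction scoring >= threshold is flagged."""
    tp = sum(s >= threshold and y == 1 for s, y in zip(scores, labels))
    fp = sum(s >= threshold and y == 0 for s, y in zip(scores, labels))
    fn = sum(s < threshold and y == 1 for s, y in zip(scores, labels))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def profit_at(threshold, scores, labels, amounts, review_cost):
    """Hypothetical profit: recovered fraud amounts minus review costs."""
    profit = 0.0
    for s, y, amount in zip(scores, labels, amounts):
        if s >= threshold:
            profit -= review_cost      # every flag triggers a manual review
            if y == 1:
                profit += amount       # a caught fraud saves its amount
    return profit


# Toy data (assumed): one small fraud with a high score,
# one large fraud with a mediocre score.
scores = [0.95, 0.80, 0.60, 0.50, 0.40, 0.30, 0.10]
labels = [1, 0, 0, 0, 1, 0, 0]
amounts = [50, 20, 30, 25, 5000, 10, 15]
review_cost = 10
candidates = [0.9, 0.7, 0.55, 0.45, 0.35, 0.2]

best_f1 = max(candidates, key=lambda t: f1_at(t, scores, labels))
best_profit = max(
    candidates,
    key=lambda t: profit_at(t, scores, labels, amounts, review_cost),
)
print(best_f1, best_profit)
```

On this toy data the two objectives pick different thresholds: F1 favors the strict cut that avoids false positives, while the profit rule accepts extra reviews to catch the large fraud.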
-
✅ Electricity Load Forecasting & R Package
- Solo • Time Series
- Forecast building electricity consumption at 15-minute intervals using classical time series methods.
- Benchmarked SARIMA, ETS, NNAR and implemented the Weighted Nearest Neighbors (WNN) algorithm as a native R package.
- Stack: R, forecast, neuralnet, Package Development
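
As a rough illustration of the idea behind Weighted Nearest Neighbors forecasting, here is a minimal Python sketch (the actual package is written in R and may differ). It assumes Euclidean distance between windows and inverse squared distance weights. The function name `wnn_forecast` and its parameters are hypothetical.

```python
import math


def wnn_forecast(series, window, horizon, k):
    """Forecast `horizon` values as a weighted average of what followed
    the k past windows most similar to the latest window."""
    query = series[-window:]
    candidates = []
    # Only consider past windows that are followed by a full horizon.
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        past = series[start:start + window]
        future = series[start + window:start + window + horizon]
        candidates.append((math.dist(query, past), future))
    if not candidates:
        raise ValueError("series too short for this window and horizon")

    candidates.sort(key=lambda c: c[0])
    neighbours = candidates[:k]
    # Inverse squared distance weights (assumed); the small constant
    # avoids division by zero on exact matches.
    weights = [1.0 / (d * d + 1e-9) for d, _ in neighbours]
    total = sum(weights)
    return [
        sum(w * fut[i] for w, (_, fut) in zip(weights, neighbours)) / total
        for i in range(horizon)
    ]


# A perfectly periodic toy series: the nearest past windows are exact
# repeats, so the forecast continues the pattern.
forecast = wnn_forecast([1, 2, 3, 4] * 6, window=4, horizon=2, k=3)
print(forecast)
```

With 15-minute data, `window` and `horizon` would typically be multiples of 96 (one day), but that choice is an assumption here rather than something taken from the project.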
-
✅ Energy Performance Predictor (DPE)
- Team of 4 • From scratch
- Dual Model: Classification (Energy Class) & Regression (Consumption) served via API to a reactive frontend.
- Stack: Python, Shiny, FastAPI, Docker, Scikit-learn
-
🚧 Job Market Insights & NLP
- Team of 4 • WIP
- Insights extraction from job descriptions using Topic Modeling (LDA) and semantic clustering.
- Stack: Python, NLP (spaCy/Gensim), Streamlit

