crc10/README.md

Hi there, I'm Constantin 👋

Data Scientist / Engineer graduate

I have a scientific background and I'm a tinkerer at heart. Here are the projects where I apply what I learn!

🛠️ Tech Stack & Skills

Full data value chain coverage: From extraction to production.

| Domain | Tools |
| --- | --- |
| Languages | Python, SQL, R |
| AI & ML | Scikit-learn, TensorFlow, PyTorch |
| Engineering | Docker, Airflow, MLflow |
| BI & Corp | PowerBI, Tableau, SAS |

🏠 My Homelab & Self-Hosting

I built a custom NAS (TrueNAS Scale) that evolved from family storage to a technical playground. I explore networking (VLANs, Reverse Proxy, VPN), hardware assembly, and run self-hosted services (Nextcloud, Media Servers, Pi-hole) via Docker.


📂 Featured Projects

Legend: ✅ = Completed (click title for repo) | 🚧 = Work In Progress

🚀 Personal Projects

  • ✅ Smart Outreach CRM

    • Solo • From scratch
    • Automates internship applications via LinkedIn parsing and GenAI, with a human-in-the-loop validation step.
    • Stack: Python, Streamlit, LLM APIs, Docker, Chrome Extension
  • 🚧 Smart Shopping List Generator

    • Solo • From scratch
    • Digitizes physical recipe cards (HelloFresh) using OCR and GenAI for ingredient normalization/entity resolution.
    • Stack: Python, Tesseract, LLM, Streamlit, Docker

🎓 Academic Projects (M2 SISE)

  • ✅ clustVarACC (R Package)

    • Team of 3 • From scratch
    • Comprehensive R library for variable clustering (K-Means, HAC, MCA) built with R6 (OOP). Includes vignettes and unit tests.
    • Stack: R, R6 (OOP), Shiny, Package Dev
  • ✅ MedTriage-AI

    • Team of 4 • From scratch
    • AI copilot for emergency medical triage based on the official FRENCH standard.
    • Dockerized multi-service architecture (Frontend, Backend, MLflow) deployed on Hugging Face Spaces.
    • Integrates a Pydantic-AI agent with RAG capabilities (ChromaDB) for medical protocol analysis and structured data extraction.
    • Production-grade and ethical focus: prompt-injection security, plus real-time FinOps (API costs) and GreenOps (EcoLogits) monitoring to track carbon footprint.
    • Stack: Pydantic-AI, Mistral AI, FastAPI, Streamlit, Docker, MLflow, ChromaDB, EcoLogits
  • ✅ Fraud Detection with Cost-Sensitive Learning

    • Team of 2 • Imbalanced Data
    • Fraud detection in highly imbalanced check transaction data (imbalance ratio ~165:1, 4.6M transactions).
    • Dual approach: statistical optimization (F1-score) vs. economic optimization (profit maximization).
    • Achieved a 93.3% profit capture rate (€2.14M) using instance-weighted XGBoost with custom cost matrix integration.
    • Stack: Python, XGBoost, Polars, Scikit-learn, Imbalanced-learn
  • ✅ Electricity Load Forecasting & R Package

    • Solo • Time Series
    • Forecasts building electricity consumption at 15-minute intervals using classical time series methods.
    • Benchmarked SARIMA, ETS, and NNAR, and implemented the Weighted Nearest Neighbors (WNN) algorithm as a native R package.
    • Stack: R, forecast, neuralnet, Package Development
  • ✅ Energy Performance Predictor (DPE)

    • Team of 4 • From scratch
    • Dual model: classification (energy class) and regression (consumption), served via API to a reactive frontend.
    • Stack: Python, Shiny, FastAPI, Docker, Scikit-learn
  • 🚧 Job Market Insights & NLP

    • Team of 4 • WIP
    • Extracts insights from job descriptions using topic modeling (LDA) and semantic clustering.
    • Stack: Python, NLP (spaCy/Gensim), Streamlit
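
To make the "economic optimization" idea from the Fraud Detection project above more concrete, here is a minimal sketch of profit-driven threshold selection. It is not the project's code: `CostMatrix`, its flat `false_alarm_cost`, and every function name are hypothetical, and the cost model (blocked fraud saves the check amount, a false alarm costs a flat fee) is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostMatrix:
    """Hypothetical economics: illustrative values, not the project's real costs."""
    false_alarm_cost: float = 5.0  # flat cost of wrongly blocking a legitimate check


def instance_weights(labels, amounts, costs=CostMatrix()):
    """Weight each training row by what a mistake on it would cost."""
    return [a if y == 1 else costs.false_alarm_cost for y, a in zip(labels, amounts)]


def profit(scores, labels, amounts, threshold, costs=CostMatrix()):
    """Profit of flagging every transaction whose score is >= threshold."""
    total = 0.0
    for score, label, amount in zip(scores, labels, amounts):
        if score < threshold:
            continue  # not flagged: nothing saved, nothing spent
        total += amount if label == 1 else -costs.false_alarm_cost
    return total


def best_threshold(scores, labels, amounts, costs=CostMatrix()):
    """Scan every observed score and keep the most profitable cut-off."""
    return max(sorted(set(scores)), key=lambda t: profit(scores, labels, amounts, t, costs))


def capture_rate(scores, labels, amounts, threshold, costs=CostMatrix()):
    """Share of the maximum attainable profit (all fraud blocked, no false alarms)."""
    attainable = sum(a for y, a in zip(labels, amounts) if y == 1)
    return profit(scores, labels, amounts, threshold, costs) / attainable


# Toy data (made up): model scores, true labels (1 = fraud), check amounts.
scores = [0.9, 0.8, 0.3, 0.2, 0.6]
labels = [1, 0, 1, 0, 0]
amounts = [100, 50, 20, 30, 40]

t = best_threshold(scores, labels, amounts)
print(t, profit(scores, labels, amounts, t), capture_rate(scores, labels, amounts, t))
```

On this toy data the most profitable cut-off is 0.3: it accepts two false alarms in order to catch the second fraud, which an F1-oriented threshold would not necessarily do. A list like the one returned by `instance_weights` could plausibly be passed as `sample_weight` when fitting a gradient-boosted classifier, though how the project wires its cost matrix into XGBoost is not stated here.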

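The Weighted Nearest Neighbors idea mentioned in the Electricity Load Forecasting project can be sketched in a few lines. This is a Python illustration of the general technique only, not the R package itself: the function name `wnn_forecast`, its parameters, and the inverse-squared-distance weighting are all assumptions, and the package may weight neighbors differently.

```python
import math


def wnn_forecast(series, window, horizon, k):
    """Forecast `horizon` values after `series` from its k most similar past windows."""
    query = series[-window:]  # the most recent pattern we want to continue
    last_start = len(series) - window - horizon
    candidates = []
    # Keep every past window that is followed by a full horizon of observed values.
    for start in range(last_start + 1):
        past = series[start:start + window]
        future = series[start + window:start + window + horizon]
        candidates.append((math.dist(query, past), future))
    if not candidates:
        raise ValueError("series is too short for this window and horizon")

    neighbors = sorted(candidates, key=lambda c: c[0])[:k]
    # Assumed weighting: closer windows count more (inverse squared distance).
    weights = [1.0 / (dist + 1e-9) ** 2 for dist, _ in neighbors]
    total = sum(weights)
    return [
        sum(w * future[step] for w, (_, future) in zip(weights, neighbors)) / total
        for step in range(horizon)
    ]


# Toy periodic series: the forecast should continue the 1, 2, 3 cycle.
print(wnn_forecast([1, 2, 3] * 4, window=3, horizon=2, k=2))
```

With 15-minute data, one day is 96 observations, so `window=96` would compare whole days; that choice is an illustration rather than a setting taken from the project.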
LinkedIn • Email
