claudian37/DS_Portfolio

Welcome to my portfolio! I'm a Data Scientist / ML Engineer specializing in credit and fraud risk. In my spare time, I like to explore new tools, visualizations, and modeling techniques, and I showcase some of them here. Visit my website, www.ds-claudia.com, for more.

Contents:

  • Time series network graph visualization: An interactive plot showing how clusters grow over time, using Kaggle data on healthcare provider fraud.

  • Huffman Code Algorithm: Implementation of a text compression algorithm in Python.

  • Geospatial visualization: Animated GIF showing heatmaps of cab pickup and dropoff locations in NYC, from the Kaggle NYC Cab Ride dataset.

    • GIF file: NYC_cab_dataset > img > nyc_cab_rides_heatmap.gif
    • Notebook: NYC_cab_dataset > 01_EDA_NYC_Cab_geospatial_visualization.ipynb
  • Predictive modeling: LightGBM model to predict trip duration in NYC from the Kaggle NYC Cab Ride dataset.

    • Notebook: NYC_cab_dataset > 02_Modeling_NYC_Cab.ipynb