divakaivan/README.md

Socials

Check my LinkedIn, monthly 2025 blog, or YouTube channel to see what I'm learning these days

Monthly 2025 blog | Daily 2024 blog

LinkedIn YouTube

Top Langs

Featured YT videos

  • Exploring the benefits of Backstage (dev platform)
  • Write-Audit-Publish (WAP) Data Quality pattern with Apache Iceberg 🧊
  • The full story behind multicollinearity
  • Chat with your PDF for free in colab using huggingface, mongodb, llama_index, langchain

Projects

AI/ML/MLOps

Sample ML model API following the Open Inference Protocol

  • Follows the Open Inference Protocol, an emerging industry standard for observable and interoperable machine learning inference
  • Tech: scikit-learn, FastAPI, pytest, pre-commit, Docker, GitHub Actions, Pydantic

    View Project
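
    For readers new to the protocol, here is a minimal sketch of the JSON bodies an Open Inference Protocol (V2) server exchanges on `POST /v2/models/{model_name}/infer`. It is not taken from the project: the helper names, the tensor names `input-0` and `output-0`, and the model name `iris-classifier` are assumptions made for illustration.

    ```python
    import json


    def build_infer_request(rows, input_name="input-0"):
        """Build an inference request body for a 2-D batch of numeric features."""
        n_rows = len(rows)
        n_cols = len(rows[0]) if rows else 0
        if any(len(row) != n_cols for row in rows):
            raise ValueError("all rows must have the same number of features")
        return {
            "inputs": [
                {
                    "name": input_name,          # hypothetical tensor name
                    "shape": [n_rows, n_cols],
                    "datatype": "FP32",
                    # tensor contents flattened in row-major order
                    "data": [float(x) for row in rows for x in row],
                }
            ]
        }


    def build_infer_response(model_name, predictions, output_name="output-0"):
        """Build the matching response body for one integer prediction per row."""
        return {
            "model_name": model_name,
            "outputs": [
                {
                    "name": output_name,         # hypothetical tensor name
                    "shape": [len(predictions)],
                    "datatype": "INT64",
                    "data": [int(p) for p in predictions],
                }
            ],
        }


    request = build_infer_request([[5.1, 3.5, 1.4, 0.2], [6.2, 2.9, 4.3, 1.3]])
    print(json.dumps(request, indent=2))
    print(json.dumps(build_infer_response("iris-classifier", [0, 1]), indent=2))
    ```

    Keeping the tensor name, shape, and datatype explicit in every request is what lets different serving runtimes and clients interoperate without model-specific payload formats.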

    MLOps 101 Project for a mini-course I teach

  • After learning a tonne from great online teachers and projects, I decided to pass that knowledge on to undergraduate students who are curious about the life of a model outside the Jupyter notebook
  • An end-to-end ML system that processes taxi data, stores models in a model registry, exposes them via an API, deploys the API to Google Cloud, and keeps logs for observability
  • Tech: scikit-learn, EvidentlyAI, FastAPI, MLflow, Docker, GitHub Actions, Terraform, Google Cloud (GCS, Logging, Compute Engine, Artifact Registry, Kubernetes Engine)

    View Project

    MLOps Architecture for Real-Time Fraud Detection

  • An AI-driven solution for real-time credit card fraud detection using MLOps techniques
  • Fully orchestrated pipelines, including data ingestion, model training, and real-time prediction and monitoring
  • Detects a high share of fraud cases using Graph Convolutional Network, XGBoost, and CatBoost models
  • Tech: Neo4j graph DB, scikit-learn, PyG, MLflow, Kafka, Grafana, Mage orchestration, Docker, Streamlit

    View Project

    Voice-to-Voice Personal Finance Assistant

  • Talk to your Personal Finance Assistant AI agent to learn about and analyse your spending habits, all through speech
  • Frontend and backend communicate via a WebSocket
  • Ask follow-up questions (the language model can see the chat history)
  • Detailed observability of live and historical connections to the server via Pydantic Logfire
  • Tech: WebSockets, FastAPI, PydanticAI, Logfire, SQLite, OpenAI, Ollama, PostgreSQL, React

    View on GitHub

    MLOps Architecture for Insurance Fraud Detection

  • Building an end-to-end MLOps pipeline to detect car insurance fraud
  • Pipeline orchestration covering data storage, data preprocessing (using IV and WoE), model training, deployment, and monitoring
  • Focus on achieving high recall in fraud detection using a Balanced Random Forest Classifier
  • Tech: PostgreSQL DB, Terraform, Google Cloud Platform, MLflow, Prefect, Grafana, Evidently, Docker, FastAPI

    View Project
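
    The IV and WoE preprocessing mentioned above can be sketched as follows. This is an illustrative version, not the project's code: the function name `woe_iv`, the additive smoothing of 0.5, and the toy data are assumptions. It uses the convention WoE = ln(% non-fraud / % fraud) per category, so categories with relatively more fraud get negative values. Some references flip the sign.

    ```python
    import math
    from collections import Counter


    def woe_iv(categories, labels, smoothing=0.5):
        """Return ({category: WoE}, IV) for one categorical feature.

        labels are binary: 1 = fraud (event), 0 = non-fraud (non-event).
        smoothing is added to every count so empty cells do not produce log(0).
        """
        events = Counter(c for c, y in zip(categories, labels) if y == 1)
        non_events = Counter(c for c, y in zip(categories, labels) if y == 0)
        cats = sorted(set(categories))

        total_events = sum(events.values()) + smoothing * len(cats)
        total_non_events = sum(non_events.values()) + smoothing * len(cats)

        woe, iv = {}, 0.0
        for c in cats:
            pct_event = (events[c] + smoothing) / total_events
            pct_non_event = (non_events[c] + smoothing) / total_non_events
            woe[c] = math.log(pct_non_event / pct_event)
            # each term is non-negative, so IV grows with separation
            iv += (pct_non_event - pct_event) * woe[c]
        return woe, iv


    # Toy data: hypothetical policy types and fraud flags
    policy_type = ["basic", "basic", "basic", "premium", "premium", "premium"]
    is_fraud = [1, 1, 0, 0, 0, 0]
    woe, iv = woe_iv(policy_type, is_fraud)
    print(woe, iv)
    ```

    A feature whose IV is close to zero barely separates fraud from non-fraud and is a candidate for dropping, and the WoE values can replace the raw categories as a numeric encoding.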

    Other AI/ML


    Data Eng

    Esports Voice Data Pipeline (Zach Wilson's DE bootcamp capstone)

  • Esports team communication data is normally kept private. For the first time, a team is sharing its full voice communication records, so I built a prototype pipeline that uses the audio data to extract communication patterns
  • I also developed visualizations that uncover communication patterns and dynamics, giving the team actionable insights to improve their gameplay
  • Tech: Airflow, dbt, Google BigQuery, Google Cloud Storage, Streamlit, Terraform, GitHub Actions, Astronomer

    View Project

    Write-Audit-Publish exercise YT video

  • Recorded a YouTube tutorial on how to follow the Write-Audit-Publish (WAP) data quality practice using popular tech
  • Tech: Dremio, Apache Iceberg, Nessie, MinIO

    View Project
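
    As a rough illustration of the Write-Audit-Publish idea, the sketch below uses plain Python lists in place of tables. It does not use Dremio, Iceberg, or Nessie: in a stack like the one above, the staging area would presumably be a branch and the publish step a merge, but that mapping is an assumption here. The function name, check names, and sample rows are all hypothetical.

    ```python
    def write_audit_publish(new_rows, published, checks):
        """Stage new_rows, run every check, and publish only if all pass.

        Returns (published_ok, names_of_failed_checks).
        """
        # Write: land the batch in a staging area that readers never see
        staged = list(new_rows)

        # Audit: run data-quality checks against the staged batch only
        failures = [name for name, check in checks.items() if not check(staged)]
        if failures:
            return False, failures

        # Publish: expose the audited batch to consumers in one step
        published.extend(staged)
        return True, []


    # Hypothetical data-quality checks on an "amount" column
    checks = {
        "non_empty": lambda rows: len(rows) > 0,
        "amount_not_null": lambda rows: all(r.get("amount") is not None for r in rows),
        "amount_non_negative": lambda rows: all(
            r["amount"] >= 0 for r in rows if r.get("amount") is not None
        ),
    }

    published_table = []
    bad_batch = [{"id": 1, "amount": 9.5}, {"id": 2, "amount": None}]
    good_batch = [{"id": 1, "amount": 9.5}, {"id": 2, "amount": 3.0}]

    print(write_audit_publish(bad_batch, published_table, checks))
    print(write_audit_publish(good_batch, published_table, checks))
    print(len(published_table))
    ```

    The point of the pattern is that a failed audit leaves the published table untouched, so downstream consumers never read a batch that has not passed its checks.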

    EU AI Act Graph Modelling

  • Scraped the EU AI Act website and created conceptual, logical, and physical data models, improving my understanding of the Act’s requirements
  • Separated entities into Articles, Recitals, Annexes, Chapters, Versions, Summaries and implemented the physical model using a graph database
  • Tech: Python, Neo4j, BeautifulSoup

    View Project

    Transaction Stream Data Engineering Pipeline

  • Generate transaction data via Stripe's API
  • Stream data using Apache Kafka and process it in real time with PySpark Structured Streaming
  • Store processed data in PostgreSQL
  • Manage data transformations and modeling using dbt
  • Visualize data using Grafana
  • Tech: PostgreSQL DB, Kafka, PySpark, dbt, Grafana

    View Project

    Glaswegian Audio Dataset and ASR model

  • Co-create a 120-minute open-source Glaswegian dataset
  • Preprocess raw audio and transcriptions and upload to HuggingFace
  • Research audio AI models and fine-tune ASR and TTS models
  • Tech: HuggingFace, Python, Fine-Tuning, Audio AI

    View on HuggingFace

    Lending Club Data Engineering Pipeline

  • Build a data pipeline to process and visualize Lending Club data
  • Extract raw data from Kaggle and load it into Google Cloud Storage
  • Process data with dbt in BigQuery
  • Create visualizations using Looker
  • Manage infrastructure with Terraform
  • Orchestrate the entire process with Mage
  • Tech: Docker, Mage orchestration, Google Cloud Platform, Terraform, dbt, Looker

    View Project

    Web/API/Other

    RSS Aggregator API

  • Developed an API that allows users to authenticate, scrape RSS feeds, follow feeds of their choice, and view posts from those feeds
  • The API is fully tested, dockerized, and available on Docker Hub
  • Deployed the API in a local Kubernetes setup with dashboards for monitoring both Kubernetes and the API
  • Tech: Go, PostgreSQL, GitHub Actions, Docker, Kubernetes, Prometheus, Grafana

    View Project

    Platform Engineering with Backstage

  • Built and deployed a Python API, created CI/CD pipelines with GitHub Actions, Helm, and ArgoCD for streamlined Kubernetes deployments
  • Registered components in Backstage’s software catalog, managed team ownership, published TechDocs, and deployed Backstage in production using Docker & Kubernetes
  • Tech: GitHub Actions, Docker, Kubernetes, ArgoCD, Helm, Backstage

    View Video

    Pinned

    1. mlops-101 (Public)

      Sample project for an MLOps 101 course I teach

      Python · 44 stars · 7 forks

    2. kb_project (Public)

      Real-time fraud transaction detection system

      Python · 22 stars · 3 forks

    3. transaction-stream-data-pipeline (Public)

      Transaction processing and visualization pipeline using PySpark Streaming

      Python · 30 stars · 1 fork

    4. rssagg (Public)

      Blog Aggregator API deployed on K8s

      Go · 4 stars · 1 fork

    5. lolesports-voice-analytics (Public)

      LoL Esports Voice Analytics Capstone Project

      Python · 11 stars · 3 forks

    6. model-api-oip (Public)

      Sample ML model API following the Open Inference Protocol

      Python · 9 stars · 2 forks