hari255/de-crypto

Batch application on cryptocurrency data

This project builds a scalable and robust data pipeline: it extracts data from an API, transforms the data using PySpark, builds an ML model on the transformed data to predict the price, and finally hosts it on a website built as a Shiny app in R.
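
As a rough illustration of the extract and transform stages, here is a minimal plain-Python sketch. It is not the project's code: the payload shape, the field names (`symbol`, `prices`, `ts`, `usd`) and the choice of a per-day average are all assumptions made for the example, and the standard library stands in for PySpark so the snippet runs without a cluster.

```python
import json
from statistics import mean

# Hypothetical API payload; the real API and its response shape are not specified.
SAMPLE_PAYLOAD = json.dumps({
    "symbol": "BTC",
    "prices": [
        {"ts": "2024-01-01T00:00:00", "usd": 42000.0},
        {"ts": "2024-01-01T12:00:00", "usd": 43000.0},
        {"ts": "2024-01-02T00:00:00", "usd": 44000.0},
    ],
})


def extract(raw):
    """Parse the raw JSON response into flat rows, one per price point."""
    doc = json.loads(raw)
    return [
        {"symbol": doc["symbol"], "date": p["ts"][:10], "usd": p["usd"]}
        for p in doc["prices"]
    ]


def transform(rows):
    """Aggregate to one average price per (symbol, date) pair."""
    grouped = {}
    for row in rows:
        grouped.setdefault((row["symbol"], row["date"]), []).append(row["usd"])
    return [
        {"symbol": symbol, "date": day, "avg_usd": mean(values)}
        for (symbol, day), values in sorted(grouped.items())
    ]


daily = transform(extract(SAMPLE_PAYLOAD))
for record in daily:
    print(record)
```

In a PySpark version, the same aggregation would typically be a `groupBy` on symbol and date followed by an average over the price column.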

The data pipeline runs every day and ingests data into the database; the ML model makes its predictions on a weekly basis.
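
The daily and weekly cadence can be sketched as follows. This is a hedged example, not the repository's implementation: it assumes SQLite as the target database, and the table name `daily_price`, its columns, and the choice of Monday as the prediction day are hypothetical. In the actual project the scheduling would presumably be handled by Airflow rather than by a date check.

```python
import sqlite3
from datetime import date


def ingest(conn, rows):
    """Write one day's rows; re-running the same day overwrites instead of duplicating."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_price ("
        "symbol TEXT, day TEXT, avg_usd REAL, PRIMARY KEY (symbol, day))"
    )
    conn.executemany("INSERT OR REPLACE INTO daily_price VALUES (?, ?, ?)", rows)
    conn.commit()


def is_prediction_day(today):
    """Assumption for illustration: weekly predictions run on Mondays."""
    return today.weekday() == 0


conn = sqlite3.connect(":memory:")  # in-memory database, used only for the demo
ingest(conn, [("BTC", "2024-01-01", 42500.0)])
ingest(conn, [("BTC", "2024-01-01", 42500.0)])  # simulated retry of the same daily run
row_count = conn.execute("SELECT COUNT(*) FROM daily_price").fetchone()[0]
print(row_count, is_prediction_day(date(2024, 1, 1)))
```

Keying the table on symbol and day makes the daily run idempotent, so a retried task does not insert duplicate rows.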

Tech Stack

  • Apache Spark
  • Apache Airflow
  • Docker Compose
  • SQLite
  • Cassandra
  • Jenkins
  • Kubernetes
  • Shiny Dashboard