💱📈 ETL pipeline with monitoring dashboard for CoinCap API

Genvekt/coincap_monitor

CoinCap monitor

ETL pipeline that monitors cryptocurrency prices and builds an analytical dashboard from the data collected in a data warehouse.

System diagram

The system diagram below shows the project structure. The system is composed of 4 Docker containers with the following purposes:

  • pipeline performs the complete ETL cycle (cron job)
  • warehouse contains the main storage for cleaned data (ClickHouse)
  • stagedb serves as backup storage for raw data (MongoDB)
  • dashboard generates and displays reports from the cleaned data (Metabase)

Additionally, the dashboard uses a PostgreSQL database container as its internal storage.
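
The cycle described above (extract from the API, back up the raw payload, clean it, load it into the warehouse) can be sketched roughly as follows. This is only an illustration, not the project's actual code: the function names `transform` and `run_cycle` are hypothetical, the payload shape (`data` list plus a millisecond `timestamp`, with `id`, `symbol` and `priceUsd` fields) is an assumption about the CoinCap response, and plain Python callables stand in for the MongoDB and ClickHouse clients.

```python
from datetime import datetime, timezone
from typing import Callable


def transform(raw: dict) -> list:
    """Turn a raw API payload into flat, typed rows for the warehouse."""
    # Assumption: payload looks like {"data": [...], "timestamp": <ms since epoch>}
    collected_at = datetime.fromtimestamp(raw["timestamp"] / 1000, tz=timezone.utc)
    rows = []
    for asset in raw["data"]:
        price = asset.get("priceUsd")
        if price is None:
            # Skip records that carry no price; there is nothing to report on
            continue
        rows.append({
            "asset_id": asset["id"],
            "symbol": asset["symbol"],
            "price_usd": float(price),  # assumed to arrive as a string
            "collected_at": collected_at,
        })
    return rows


def run_cycle(
    extract: Callable[[], dict],
    stage: Callable[[dict], None],
    load: Callable[[list], None],
) -> int:
    """Run one ETL cycle and return the number of rows loaded."""
    raw = extract()        # pipeline: fetch from the API
    stage(raw)             # stagedb: keep the untouched payload as a backup
    rows = transform(raw)  # pipeline: clean and type the records
    load(rows)             # warehouse: store the cleaned rows
    return len(rows)
```

Passing the three steps in as callables keeps the cycle testable without running any container: in a test, `stage` and `load` can simply be `list.append` and `list.extend`.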

system design

Project Structure

├── docs
│
├── pipeline
│   ├── cron              # Scheduler configs
│   ├── docker            # Environment configs
│   ├── logs              # Logs for pipeline service
│   │
│   ├── src               # ETL source code
│   │   ├── config.py     # Environment parsers
│   │   ├── db.py         # Warehouse management
│   │   ├── etl.py        # ETL functions
│   │   ├── run.py        # Pipeline script
│   │   └── stagedb.py    # StageDB management
│   │
│   └── tests             # Unit tests for ETL source code
│
├── warehouse
│   ├── db                # Warehouse database files (ClickHouse)
│   └── logs              # Logs for warehouse service
│
├── stagedb
│   └── db                # StageDB database files (MongoDB)
│
└── dashboard
    ├── db                # Dashboard database files (PostgreSQL)
    ├── docker            # Environment configs
    └── logs              # Logs for dashboard service
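
As an illustration of what an environment parser such as the one in config.py might look like, here is a minimal sketch. The class `ServiceSettings` and the function `load_settings` are hypothetical names, not taken from the repository; the only thing borrowed from this README is the `<PREFIX>_HOST`, `_DB`, `_USER`, `_PASSWORD`, `_PORT` naming scheme of the variables listed in the Installation section.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ServiceSettings:
    """Connection settings for one backing service (hypothetical helper)."""
    host: str
    port: int
    database: str
    user: str
    password: str


def load_settings(prefix: str, env: Mapping[str, str] = os.environ) -> ServiceSettings:
    """Read <prefix>_HOST, _PORT, _DB, _USER and _PASSWORD from the environment."""

    def required(name: str) -> str:
        key = f"{prefix}_{name}"
        value = env.get(key)
        if not value:
            # Fail early with the exact variable name instead of a late connection error
            raise KeyError(f"Missing environment variable: {key}")
        return value

    return ServiceSettings(
        host=required("HOST"),
        port=int(required("PORT")),  # ports are stored as text in the environment
        database=required("DB"),
        user=required("USER"),
        password=required("PASSWORD"),
    )
```

With this shape, `load_settings("CLICKHOUSE")` and `load_settings("STAGEDB")` would cover the warehouse and the stage database with the same code, and a plain dictionary can replace `os.environ` in unit tests.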


Installation

To run the project, perform the following steps:

  1. Clone the repo to your machine:

    git clone https://github.com/Genvekt/coincap_monitor.git
    cd coincap_monitor

  2. Create a .env file with the following environment parameters:

  • API_KEY: key for the CoinCap API, which you must obtain from CoinCap

  • API_URL: URL of the CoinCap API

  • STAGEDB_HOST, STAGEDB_DB, STAGEDB_USER, STAGEDB_PASSWORD, STAGEDB_PORT: MongoDB access data

  • CLICKHOUSE_HOST, CLICKHOUSE_DB, CLICKHOUSE_USER, CLICKHOUSE_PASSWORD, CLICKHOUSE_PORT: ClickHouse access data

  • POSTGRES_HOST, POSTGRES_DB, POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_PORT: PostgreSQL access data

    Example .env file:

    API_KEY={YOUR_API_KEY}
    API_URL=http://api.coincap.io/v2

    STAGEDB_HOST=stagedb
    STAGEDB_DB=stagedbdb
    STAGEDB_USER=stagedbuser
    STAGEDB_PASSWORD={YOUR_MONGODB_PASSWORD}
    STAGEDB_PORT=27017

    CLICKHOUSE_HOST=warehouse
    CLICKHOUSE_DB=clickhousedb
    CLICKHOUSE_USER=clickhouseuser
    CLICKHOUSE_PASSWORD={YOUR_CLICKHOUSE_PASSWORD}
    CLICKHOUSE_PORT=9000

    POSTGRES_HOST=dashboard_db
    POSTGRES_DB=postgres
    POSTGRES_USER=postgres
    POSTGRES_PASSWORD={YOUR_POSTGRESQL_PASSWORD}
    POSTGRES_PORT=5432

  3. Run the application:

    docker network create CoinCapNet
    docker-compose up --build -d

  4. Stop the application (the -v flag also removes the volumes declared in the compose file):

    docker-compose down -v
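
Before starting the containers, it can help to check that the .env file described in the steps above is complete. The following is a small optional sketch, not part of the repository: `parse_env` and `find_problems` are hypothetical helpers, and the parser handles only simple `KEY=value` lines (no quoting or variable expansion).

```python
# Every variable named in the installation steps
REQUIRED_KEYS = ["API_KEY", "API_URL"] + [
    f"{service}_{field}"
    for service in ("STAGEDB", "CLICKHOUSE", "POSTGRES")
    for field in ("HOST", "DB", "USER", "PASSWORD", "PORT")
]


def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, ignoring blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def find_problems(text: str) -> list:
    """Return one message per required variable that is missing or unfilled."""
    values = parse_env(text)
    problems = []
    for key in REQUIRED_KEYS:
        value = values.get(key, "")
        if not value:
            problems.append(f"{key} is missing or empty")
        elif value.startswith("{") and value.endswith("}"):
            # e.g. API_KEY={YOUR_API_KEY} copied from the example unchanged
            problems.append(f"{key} still holds a placeholder")
    return problems
```

For example, calling `find_problems(open(".env").read())` from the repository root returns an empty list when all 17 variables are set to real values.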
