This repository was archived by the owner on Apr 16, 2025. It is now read-only.

newgnart/moo


MOO - CowSwap Data Pipeline

A data pipeline for extracting, transforming, and loading CowSwap DEX trading data.

Architecture

Extract   →  Transform  →  Load
   ↓             ↓           ↓
TheGraph  →   Polars    →  PostgreSQL

Components

  • Extract: Fetches swap data from CowSwap's TheGraph subgraph
  • Transform: Processes raw data with Polars and applies validation checks
  • Load: Stores transformed data in PostgreSQL with versioning support
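As a rough illustration of the Extract step, the sketch below builds a subgraph URL and a paginated GraphQL request body without sending it. The field names come from the Swaps table in this README; the entity name `swaps`, the flat field layout, the URL pattern with an API key, and the helper names `build_subgraph_url` and `build_swaps_query` are assumptions for illustration, not the project's actual code.

```python
import json

# Field names listed under "Swaps Table" in this README.
SWAP_FIELDS = [
    "blockTimestamp",
    "transactionHash",
    "tokenAmountIn",
    "tokenAmountOut",
    "tokenIn",
    "tokenInSymbol",
    "tokenOutSymbol",
    "tokenOut",
]


def build_subgraph_url(base_url: str, api_key: str, subgraph_id: str) -> str:
    """Join base URL, API key, and subgraph ID (assumed gateway URL pattern)."""
    return f"{base_url.rstrip('/')}/{api_key}/subgraphs/id/{subgraph_id}"


def build_swaps_query(first: int = 1000, skip: int = 0) -> dict:
    """Return a GraphQL request body for one page of swaps.

    The entity name "swaps" is hypothetical; `first`, `skip`, `orderBy`,
    and `orderDirection` are standard subgraph query arguments.
    """
    fields = "\n    ".join(SWAP_FIELDS)
    query = f"""{{
  swaps(first: {first}, skip: {skip}, orderBy: blockTimestamp, orderDirection: asc) {{
    {fields}
  }}
}}"""
    return {"query": query}


payload = build_swaps_query(first=2, skip=0)
body = json.dumps(payload)  # JSON string that would be POSTed to the subgraph URL
```

Keeping request construction separate from the HTTP call makes the query easy to inspect and test without network access.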

Key Features

  • Incremental data loading
  • Data validation and quality checks
  • Schema versioning
  • Audit trail tracking
  • Configurable pipeline parameters
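Incremental loading is commonly implemented with a high-water mark: only rows newer than the last loaded timestamp are kept, and the mark advances after each run. The sketch below shows that idea with plain dictionaries. It is one possible approach, not necessarily how this pipeline does it; the function names and the sample values are made up for illustration, and `blockTimestamp` is assumed to be a Unix timestamp in seconds.

```python
def select_new_rows(rows: list[dict], last_loaded_timestamp: int) -> list[dict]:
    """Keep only rows strictly newer than the stored watermark."""
    return [r for r in rows if r["blockTimestamp"] > last_loaded_timestamp]


def next_watermark(rows: list[dict], current: int) -> int:
    """Advance the watermark to the newest timestamp seen, never backwards."""
    return max([current] + [r["blockTimestamp"] for r in rows])


# Illustrative rows only; real rows would carry all Swaps table columns.
fetched = [
    {"blockTimestamp": 100, "transactionHash": "0xaaa"},
    {"blockTimestamp": 200, "transactionHash": "0xbbb"},
    {"blockTimestamp": 300, "transactionHash": "0xccc"},
]

watermark = 200
new_rows = select_new_rows(fetched, watermark)
watermark = next_watermark(new_rows, watermark)
```

A strict `>` comparison avoids reloading the last row, at the cost of skipping other rows that share the same timestamp; a pipeline that needs those would also deduplicate on `transactionHash`.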

Setup

  1. Create a .env file from the example and set the environment variables
cp .env.example .env
  2. Configure the pipeline in pipeline_config.yaml:
extract:
  subgraph_base_url: "https://gateway.thegraph.com/api"
  cowswap_subgraph_id: "..."
  raw_data_dir: "data/raw"

transform:
  decimal_places: 18
  timestamp_format: "%Y-%m-%d %H:%M:%S"

load:
  batch_size: 5000
  timeout: 300
  table_name: "swaps"
  if_exists: "append"
  3. Run the pipeline
  • Start PostgreSQL with Docker Compose
docker compose up -d
  • After cloning the repo, set up the Python environment with Poetry from the root directory
poetry install
  • Alternatively, install the package from the Git repository with pip
pip install git+https://github.com/gnart33/moo.git
  • Run the pipeline orchestrator
poetry run python src/moo/pipeline/orchestrator.py
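To make the transform and load settings in pipeline_config.yaml more concrete, here is a minimal sketch of how such values might be applied. It assumes that `decimal_places` is the number of token decimals used to scale raw integer amounts, that timestamps arrive as Unix seconds, and that `batch_size` controls how many rows go into each insert. Those interpretations, and all function names, are assumptions rather than the project's documented behavior. The dictionary mirrors the YAML by hand so the example needs only the standard library.

```python
from datetime import datetime, timezone
from decimal import Context, Decimal

# Hand-written mirror of the relevant pipeline_config.yaml values.
config = {
    "transform": {"decimal_places": 18, "timestamp_format": "%Y-%m-%d %H:%M:%S"},
    "load": {"batch_size": 5000},
}

# High precision so large on-chain integers are not rounded while scaling.
_CTX = Context(prec=80)


def scale_amount(raw: str, decimal_places: int) -> Decimal:
    """Convert a raw integer token amount into a human-readable decimal."""
    return Decimal(raw).scaleb(-decimal_places, context=_CTX)


def format_timestamp(unix_seconds: int, fmt: str) -> str:
    """Render a Unix timestamp (assumed seconds, UTC) with the configured format."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc).strftime(fmt)


def batches(rows: list, batch_size: int):
    """Yield consecutive slices of at most batch_size rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


amount = scale_amount("1500000000000000000", config["transform"]["decimal_places"])
stamp = format_timestamp(0, config["transform"]["timestamp_format"])
sizes = [len(b) for b in batches(list(range(12)), 5)]
```

Using `Decimal` rather than floats keeps token amounts exact, which matters when amounts carry 18 decimal places.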

Data Model

Swaps Table

  • blockTimestamp
  • transactionHash
  • tokenAmountIn
  • tokenAmountOut
  • tokenIn
  • tokenInSymbol
  • tokenOutSymbol
  • tokenOut
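The columns above could be modeled as a typed record with a few basic quality checks, as sketched below. The README does not state column types or which validations the pipeline runs, so the types, the `Swap` class, the `validate` function, and the sample values are all hypothetical. The only fixed fact used is that an Ethereum transaction hash is `0x` followed by 64 hexadecimal characters.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Swap:
    # Column names from the Swaps table; the types are assumptions.
    blockTimestamp: int
    transactionHash: str
    tokenAmountIn: Decimal
    tokenAmountOut: Decimal
    tokenIn: str
    tokenInSymbol: str
    tokenOutSymbol: str
    tokenOut: str


def validate(swap: Swap) -> list[str]:
    """Return a list of human-readable problems; empty means the row passed."""
    problems = []
    tx = swap.transactionHash
    is_hex = all(c in "0123456789abcdefABCDEF" for c in tx[2:])
    if not (tx.startswith("0x") and len(tx) == 66 and is_hex):
        problems.append("transactionHash is not a 32-byte hex string")
    if swap.blockTimestamp <= 0:
        problems.append("blockTimestamp must be positive")
    if swap.tokenAmountIn <= 0 or swap.tokenAmountOut <= 0:
        problems.append("token amounts must be positive")
    return problems


# Made-up sample row for illustration only.
good = Swap(1700000000, "0x" + "ab" * 32, Decimal("1.5"), Decimal("3000"),
            "0x" + "11" * 20, "WETH", "USDC", "0x" + "22" * 20)
bad = Swap(0, "0x1234", Decimal("0"), Decimal("1"),
           "0x" + "11" * 20, "WETH", "USDC", "0x" + "22" * 20)
```

Returning a list of problems instead of raising on the first failure lets a pipeline log every issue with a row in one pass.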
