# Neutron Demo: Comprehensive Data Pipeline

This notebook walks through an end-to-end **Neutron** workflow:
1.  **Download** four market data types with the `Downloader`: **OHLCV**, **Aggregated Trades**, **Tick Trades**, and **Order Book Tickers**.
2.  **Retrieve** the stored data using the `DataCrawler`.
3.  **Visualize** each dataset: close price, buy vs. sell volume, bid-ask spread, and trade-size distribution.

### Prerequisites
Install the project dependencies, then add the plotting libraries used in this notebook:
```bash
uv sync
uv add matplotlib seaborn
```

In [None]:
import sys
import os
import logging
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta

# Add src to path so we can import neutron
sys.path.append(os.path.abspath('../src'))

from neutron.core.downloader import Downloader
from neutron.core.crawler import DataCrawler
from neutron.core.config import NeutronConfig, StorageConfig, TaskConfig

# Set up logging
# This controls what you see in the notebook output.
# Detailed download logs are also written to a file ('data.log' by default;
# this notebook overrides it with 'neutron_demo.log' when creating the Downloader).
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
sns.set_theme(style="darkgrid")

## 1. Configure and Run Downloader

We configure the downloader to fetch **1 day** of hourly OHLCV candles, but only **1 hour** of each high-frequency data type (aggregated trades, tick trades, and book ticker) to keep this demo quick. All four tasks target `BTC/USDT` on Binance spot.
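
The `start_date` and `end_date` strings used in the next cell can also be derived instead of typed by hand. The following is a minimal sketch using only the standard library; `backfill_window` is a hypothetical helper written for illustration, not part of Neutron, and it assumes the tasks accept ISO 8601 strings without a timezone suffix, as the configuration in this notebook does.

```python
from datetime import datetime, timedelta


def backfill_window(start: datetime, duration: timedelta) -> dict:
    """Return start/end ISO 8601 strings for a window of the given length.

    Hypothetical helper for illustration only; not a Neutron API.
    """
    end = start + duration
    return {
        "start_date": start.isoformat(timespec="seconds"),
        "end_date": end.isoformat(timespec="seconds"),
    }


demo_start = datetime(2024, 1, 1)

# One day for OHLCV, one hour for the high-frequency data types.
ohlcv_window = backfill_window(demo_start, timedelta(days=1))
hf_window = backfill_window(demo_start, timedelta(hours=1))

print(ohlcv_window)
print(hf_window)
```

The resulting dictionaries could be merged into a task's `params` alongside keys such as `timeframe` and `rewrite`.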

In [None]:
# Define Configuration
config = NeutronConfig(
    # Storage Configuration
    # By default, it uses the DATABASE_URL env var or localhost postgres.
    # You can override it here, e.g., for a local SQLite file:
    # storage=StorageConfig(type='database', database_url='sqlite:///neutron_demo.db'),
    storage=StorageConfig(type='database'), 
    
    # State File Paths (Optional)
    # You can specify where to store the download state and exchange metadata.
    data_state_path='data_state.json',
    exchange_state_path='exchange_state.json',
    
    tasks=[
        # 1. OHLCV (1 Day)
        TaskConfig(
            type='backfill_ohlcv',
            params={
                'timeframe': '1h',
                'start_date': '2024-01-01T00:00:00',
                'end_date': '2024-01-02T00:00:00',
                'rewrite': True
            },
            exchanges={'binance': {'spot': {'symbols': ['BTC/USDT']}}}
        ),
        # 2. Aggregated Trades (1 Hour)
        TaskConfig(
            type='backfill_agg_trades',
            params={
                'start_date': '2024-01-01T00:00:00',
                'end_date': '2024-01-01T01:00:00', # 1 hour only
                'rewrite': True
            },
            exchanges={'binance': {'spot': {'symbols': ['BTC/USDT']}}}
        ),
        # 3. Tick Trades (1 Hour)
        TaskConfig(
            type='backfill_tick_data',
            params={
                'start_date': '2024-01-01T00:00:00',
                'end_date': '2024-01-01T01:00:00',
                'rewrite': True
            },
            exchanges={'binance': {'spot': {'symbols': ['BTC/USDT']}}}
        ),
        # 4. Order Book Ticker (1 Hour)
        TaskConfig(
            type='backfill_book_ticker',
            params={
                'start_date': '2024-01-01T00:00:00',
                'end_date': '2024-01-01T01:00:00',
                'rewrite': True
            },
            exchanges={'binance': {'spot': {'symbols': ['BTC/USDT']}}}
        )
    ]
)

# Initialize and run
# Optionally pass a custom log file; the default is 'data.log'.
downloader = Downloader(config=config, log_file='neutron_demo.log')

print("Starting Multi-Task Download... (Check neutron_demo.log for details)")
downloader.run()
print("Download Complete!")

## 2. Retrieve and Visualize Data

Initialize the crawler to read the downloaded data back from the database.

In [None]:
crawler = DataCrawler(storage_type='database')

### A. OHLCV Data
Standard candlestick data: hourly open, high, low, close, and volume for the full day.
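
To make the candlestick idea concrete, the sketch below builds hourly candles from a handful of made-up trades with pandas. The data, the `price` and `amount` column names, and the 60-minute bucket are assumptions for illustration; this is not how Neutron or the exchange produces its candles.

```python
import pandas as pd

# Synthetic trades indexed by execution time (illustrative values only).
ticks = pd.DataFrame(
    {
        "price": [100.0, 101.0, 99.5, 100.5, 102.0, 101.5],
        "amount": [1.0, 0.5, 2.0, 1.5, 0.2, 0.8],
    },
    index=pd.to_datetime(
        [
            "2024-01-01 00:05", "2024-01-01 00:20", "2024-01-01 00:55",
            "2024-01-01 01:10", "2024-01-01 01:30", "2024-01-01 01:59",
        ]
    ),
)

# Each candle records the first, highest, lowest, and last price in its bucket.
bars = ticks["price"].resample("60min").ohlc()
# Volume is the total traded amount in the same bucket.
bars["volume"] = ticks["amount"].resample("60min").sum()

print(bars)
```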

In [None]:
df_ohlcv = crawler.get_ohlcv(
    exchange='binance', symbol='BTC/USDT', timeframe='1h',
    start_date='2024-01-01', end_date='2024-01-02'
)

if not df_ohlcv.empty:
    plt.figure(figsize=(12, 6))
    plt.plot(df_ohlcv['time'], df_ohlcv['close'], label='Close Price', color='blue')
    plt.title('BTC/USDT Hourly Close Price')
    plt.xlabel('Time')
    plt.ylabel('Price (USDT)')
    plt.legend()
    plt.show()
    display(df_ohlcv.head())

### B. Aggregated Trades
Visualizing buying vs selling pressure using aggregated trade volume.
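
The buy/sell split relies on the `is_buyer_maker` flag. On Binance, a value of `True` means the buyer was the resting (maker) order, so the aggressor was a seller; `False` means a taker buy. The sketch below applies that rule to a few made-up trades and also derives a simple volume imbalance. The data and the `notional` and `imbalance` names are illustrative assumptions; the `price`, `qty`, and `is_buyer_maker` columns mirror the ones used in this notebook.

```python
import pandas as pd

# Synthetic aggregated trades (illustrative values only).
trades = pd.DataFrame(
    {
        "price": [42000.0, 42010.0, 41990.0, 42005.0],
        "qty": [0.5, 0.1, 0.2, 1.0],
        "is_buyer_maker": [False, True, True, False],
    }
)

# Quote-currency value of each trade.
trades["notional"] = trades["price"] * trades["qty"]

# Buyer is maker -> the taker sold; buyer is taker -> the taker bought.
taker_side = trades["is_buyer_maker"].map({True: "sell", False: "buy"})
volume_by_side = trades.groupby(taker_side)["notional"].sum()

# Imbalance in [-1, 1]: positive means taker buying dominated.
imbalance = (volume_by_side["buy"] - volume_by_side["sell"]) / volume_by_side.sum()

print(volume_by_side)
print(f"Imbalance: {imbalance:.3f}")
```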

In [None]:
df_agg = crawler.get_data(
    data_type='aggTrades', exchange='binance', symbol='BTC/USDT',
    start_date='2024-01-01T00:00:00', end_date='2024-01-01T01:00:00'
)

if not df_agg.empty:
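    # Calculate buy/sell volume as price * quantity (quoted in USDT).
    df_agg['volume_usd'] = df_agg['price'] * df_agg['qty']
    # is_buyer_maker=True means the buyer was the resting (maker) order,
    # so the aggressor was a seller; False means a taker buy.
    taker_sell = df_agg['is_buyer_maker'].astype(bool)
    buy_vol = df_agg.loc[~taker_sell, 'volume_usd'].sum()
    sell_vol = df_agg.loc[taker_sell, 'volume_usd'].sum()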

    plt.figure(figsize=(8, 5))
    plt.bar(['Buy Volume', 'Sell Volume'], [buy_vol, sell_vol], color=['green', 'red'])
    plt.title(f'Buy vs Sell Volume (1 Hour) - Total: ${buy_vol+sell_vol:,.0f}')
    plt.ylabel('Volume (USD)')
    plt.show()
    
    print(f"Total AggTrades: {len(df_agg)}")
    display(df_agg.head())

### C. Order Book Ticker (Spread Analysis)
Visualizing the Bid-Ask Spread over time.
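
An absolute spread in USDT is hard to compare across price levels, so it is often expressed relative to the mid price in basis points. The sketch below shows that calculation on a few made-up quotes. The values and the `mid` and `spread_bps` columns are assumptions for illustration; `best_bid_price` and `best_ask_price` mirror the column names used in this notebook.

```python
import pandas as pd

# Synthetic top-of-book quotes (illustrative values only).
quotes = pd.DataFrame(
    {
        "best_bid_price": [42000.00, 42000.50, 42001.00],
        "best_ask_price": [42000.10, 42000.70, 42001.05],
    }
)

# Absolute spread in quote currency.
quotes["spread"] = quotes["best_ask_price"] - quotes["best_bid_price"]
# Mid price: the midpoint between best bid and best ask.
quotes["mid"] = (quotes["best_ask_price"] + quotes["best_bid_price"]) / 2
# Relative spread in basis points (1 bp = 0.01%).
quotes["spread_bps"] = quotes["spread"] / quotes["mid"] * 1e4

print(quotes)
```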

In [None]:
df_ticker = crawler.get_data(
    data_type='bookTicker', exchange='binance', symbol='BTC/USDT',
    start_date='2024-01-01T00:00:00', end_date='2024-01-01T01:00:00'
)

if not df_ticker.empty:
    # Calculate Spread
    df_ticker['spread'] = df_ticker['best_ask_price'] - df_ticker['best_bid_price']
    
    plt.figure(figsize=(12, 4))
    plt.plot(df_ticker['time'], df_ticker['spread'], color='purple', lw=0.5)
    plt.title('Bid-Ask Spread (BTC/USDT)')
    plt.ylabel('Spread (USDT)')
    plt.xlabel('Time')
    plt.show()
    
    print(f"Average Spread: ${df_ticker['spread'].mean():.4f}")
    display(df_ticker.head())

### D. Tick Trades (Granular)
Inspecting individual trade executions and the distribution of their sizes.
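
Trade sizes typically span several orders of magnitude, which is why the histogram in the next cell uses a logarithmic x-axis. The sketch below illustrates the idea with NumPy: bins of equal width are computed in log10 space and then mapped back to the original units. The sample sizes and the bin count are made up, and this is only an illustration of log-spaced binning, not a description of seaborn's internals.

```python
import numpy as np

# Synthetic trade sizes in BTC (illustrative values only).
sizes = np.array([0.0001, 0.001, 0.002, 0.01, 0.05, 0.1, 0.5, 1.0, 5.0])

# Bin in log10 space so every order of magnitude gets comparable resolution.
counts, log_edges = np.histogram(np.log10(sizes), bins=5)

# Convert the bin edges back to BTC for labeling.
edges = 10.0 ** log_edges

for lo, hi, n in zip(edges[:-1], edges[1:], counts):
    print(f"{lo:.4g} to {hi:.4g} BTC: {n} trades")
```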

In [None]:
df_trades = crawler.get_data(
    data_type='trades', exchange='binance', symbol='BTC/USDT',
    start_date='2024-01-01T00:00:00', end_date='2024-01-01T01:00:00'
)

if not df_trades.empty:
    print(f"Retrieved {len(df_trades)} individual trades.")
    # Histogram of Trade Sizes
    plt.figure(figsize=(10, 5))
    sns.histplot(df_trades['amount'], bins=50, log_scale=(True, False))
    plt.title('Distribution of Trade Sizes (Log Scale)')
    plt.xlabel('Trade Size (BTC)')
    plt.show()
    display(df_trades.head())