Binance Datatool is an open-source project for cryptocurrency quantitative trading research, featuring BHDS (Binance Historical Data Service) as its core service.
BHDS downloads and maintains historical market data from Binance's AWS repository, using Aria2 for parallel downloads. The data is processed with Polars and stored in Parquet format for use in quantitative research workflows.
The project uses a src layout with two main packages:
- bhds: CLI and core services
- bdt_common: shared utilities
See docs/ARCHITECTURE.md for detailed architecture.
This project is released under the MIT License.
- Python ≥ 3.12 (required for modern type hints and performance optimizations)
- uv for fast Python package management
- aria2 for efficient parallel downloads from Binance AWS
# Setup project environment
uv sync && source .venv/bin/activate
# Install aria2
sudo apt install aria2 # Ubuntu/Debian
# brew install aria2 # macOS
Optional configurations:
# BHDS main data storage directory (default: ~/crypto_data/bhds)
export BHDS_HOME="/path/to/your/bhds/data"
# HTTP proxy if needed
export HTTP_PROXY="http://127.0.0.1:7893"
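As a rough illustration of how these two optional variables might be consumed, the following sketch resolves the data directory and proxy from the environment. The helper name `resolve_settings` is hypothetical and is not part of BHDS. The default path and the `HTTP_PROXY` / `http_proxy` lookup order follow what this README states, but the actual resolution logic inside BHDS may differ.

```python
import os
from pathlib import Path

# Default documented in this README: ~/crypto_data/bhds
DEFAULT_BHDS_HOME = Path.home() / "crypto_data" / "bhds"


def resolve_settings(env=None):
    """Hypothetical helper (not a BHDS API): return (data_dir, http_proxy)."""
    env = os.environ if env is None else env

    # Use BHDS_HOME when it is set and non-empty, else fall back to the default.
    home = env.get("BHDS_HOME")
    bhds_home = Path(home).expanduser() if home else DEFAULT_BHDS_HOME

    # Accept either spelling of the proxy variable; None means "no proxy".
    http_proxy = env.get("HTTP_PROXY") or env.get("http_proxy")
    return bhds_home, http_proxy


bhds_home, http_proxy = resolve_settings()
print("data directory:", bhds_home)
print("proxy:", http_proxy)
```

Passing a plain dict instead of relying on `os.environ` makes the behaviour easy to check without touching the real environment.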
BHDS provides two interfaces for quantitative traders: a CLI and a Python library. The CLI is recommended for most use cases.
Built with Typer, the CLI provides a streamlined workflow using YAML configurations:
# Show available commands
uv run bhds --help
# Download historical data
uv run bhds aws-download configs/download/spot_kline.yaml
# Parse downloaded CSV to Parquet
uv run bhds parse-aws-data configs/parsing/spot_kline.yaml
# Generate holistic 1m klines
uv run bhds holo-1m-kline configs/holo_1m/spot.yaml
# Resample to higher timeframes
uv run bhds resample configs/resample/spot.yaml
See configs/ for YAML configuration templates.
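To make the resampling step more concrete, here is a dependency-free sketch of the usual OHLCV aggregation rule (first open, max high, min low, last close, summed volume). It is not how BHDS implements `resample`, which processes data with Polars. The function name `resample_klines`, the field names, and the epoch-aligned buckets are all assumptions made for illustration, and the real Parquet schema may differ.

```python
def resample_klines(rows, minutes):
    """Aggregate 1m klines into `minutes`-minute klines.

    `rows` is a list of dicts sorted by "open_time" (epoch milliseconds),
    each with "open", "high", "low", "close" and "volume" keys.
    Field names are illustrative, not the BHDS schema.
    """
    bucket_ms = minutes * 60_000
    buckets = {}  # bucket start time -> aggregated bar (insertion ordered)
    for row in rows:
        # Align each kline to the start of its bucket.
        start = row["open_time"] - row["open_time"] % bucket_ms
        bar = buckets.get(start)
        if bar is None:
            # First kline in the bucket supplies the open price.
            buckets[start] = {
                "open_time": start,
                "open": row["open"],
                "high": row["high"],
                "low": row["low"],
                "close": row["close"],
                "volume": row["volume"],
            }
        else:
            bar["high"] = max(bar["high"], row["high"])
            bar["low"] = min(bar["low"], row["low"])
            bar["close"] = row["close"]  # last kline seen supplies the close
            bar["volume"] += row["volume"]
    return list(buckets.values())


klines_1m = [
    {"open_time": 0, "open": 100.0, "high": 101.0, "low": 99.0, "close": 100.5, "volume": 1.0},
    {"open_time": 60_000, "open": 100.5, "high": 102.0, "low": 100.0, "close": 101.5, "volume": 2.0},
    {"open_time": 300_000, "open": 101.5, "high": 103.0, "low": 101.0, "close": 102.0, "volume": 0.5},
]
klines_5m = resample_klines(klines_1m, 5)
print(klines_5m)
```

In this example the first two klines fall into the bucket starting at 0 ms and the third starts a new bucket at 300000 ms, so the result contains two 5m bars.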
For advanced users who need programmatic access and custom workflows, the packages can also be used as a library:
import asyncio
import os

from bdt_common.enums import DataFrequency, TradeType
from bdt_common.network import create_aiohttp_session
from bhds.aws.client import AwsClient
from bhds.aws.path_builder import AwsKlinePathBuilder


async def main():
    async with create_aiohttp_session(5) as session:
        http_proxy = os.getenv("HTTP_PROXY") or os.getenv("http_proxy")
        path_builder = AwsKlinePathBuilder(
            trade_type=TradeType.spot,
            data_freq=DataFrequency.daily,
            time_interval="1m",
        )
        client = AwsClient(path_builder=path_builder, session=session, http_proxy=http_proxy)
        symbols = await client.list_symbols()
        print('First 5 symbols of spot daily 1m kline:', symbols[:5])


asyncio.run(main())
See examples/ for complete usage patterns including data download and processing workflows.
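For orientation, the sketch below shows the kind of archive location that the spot, daily, 1m parameters in the library example correspond to. It assumes the layout of Binance's public data archive at data.binance.vision. The function `spot_daily_kline_url` is hypothetical and is not part of `bhds`, and the paths produced by `AwsKlinePathBuilder` may be structured differently.

```python
from datetime import date

# Assumed base URL of Binance's public historical data archive.
BASE_URL = "https://data.binance.vision"


def spot_daily_kline_url(symbol, interval, day):
    """Hypothetical helper: URL of one daily spot kline archive."""
    # Assumed file naming: <SYMBOL>-<interval>-<YYYY-MM-DD>.zip
    file_name = f"{symbol}-{interval}-{day.isoformat()}.zip"
    return f"{BASE_URL}/data/spot/daily/klines/{symbol}/{interval}/{file_name}"


print(spot_daily_kline_url("BTCUSDT", "1m", date(2024, 1, 1)))
```

A list of such URLs, one per symbol and day, is the sort of input a parallel downloader like aria2 can then fetch.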