A robust backtesting system for cryptocurrency trading strategies based on machine learning models with time series cross-validation.
This project provides a complete pipeline for:
- Data Processing: Standardize and prepare cryptocurrency data for model training and simulation
- Model Training: Train machine learning models with hyperparameter optimization and time series cross-validation
- Backtesting: Simulate trading strategies with realistic assumptions and track performance
- Data Standardization: Ensures consistent datetime formats between training and simulation data
- Feature Engineering: Automatically generates technical indicators and other useful features
- Time Series Cross-Validation: Prevents data leakage when evaluating model performance
- Hyperparameter Optimization: Finds optimal model parameters with grid search
- Realistic Backtesting: Includes transaction costs, position sizing, and risk management
- Performance Metrics: Calculates comprehensive trading metrics including Sharpe ratio, win rate, drawdown, etc.
- Visualization: Generates plots of equity curves, drawdowns, and trade entry/exit points
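To illustrate why time series cross-validation avoids leakage, here is a minimal sketch of an expanding-window split in plain Python. It is not the code in src/models.py; the function name and parameters are hypothetical. scikit-learn's TimeSeriesSplit provides comparable behaviour, but this README does not state which splitter the project uses.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Every test fold sits strictly after its training window, so a model
    is never fitted on observations that come later than the ones it is
    evaluated on.
    """
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested folds")
    for fold in range(n_splits):
        test_start = first_test_start + fold * test_size
        train_idx = list(range(0, test_start))  # everything before the fold
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx


splits = list(expanding_window_splits(n_samples=10, n_splits=3, test_size=2))
for train_idx, test_idx in splits:
    print(f"train=0..{train_idx[-1]}  test={test_idx}")
```

The training window grows with each fold while the test window moves forward, unlike shuffled k-fold, where future rows can end up in the training set.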
# Clone the repository
git clone <repository-url>
cd crypto_forecasting
# Install dependencies
pip install -r requirements.txt
To run a complete example with ADA cryptocurrency:
python run_ada_backtest.py
This will:
- Process the ADA data
- Train Random Forest, XGBoost, and Ridge regression models
- Backtest the models on the simulation data
- Save results in the results/ directory
python -m src.data_processor
This will process all available cryptocurrencies and prepare the data for model training and backtesting.
python -m src.main --coins ada eth btc solana --model-types rf xgb ridge --n-rows 1000
Command-line arguments:
- --coins: List of coins to process and backtest
- --model-types: List of model types to train (rf, xgb, ridge, lasso, bayesian, gbm)
- --n-rows: Number of rows for reduced dataset
- --skip-processing: Skip data processing step
- --skip-training: Skip model training step
- --skip-backtesting: Skip backtesting step
- --initial-balance: Initial balance for backtesting (default: 10000.0)
- --enable-shorting: Enable short positions in backtesting
- --signal-threshold: Signal threshold for entering positions (default: 0.0001)
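The interaction between --signal-threshold and --enable-shorting can be pictured with a small sketch. It assumes the model output is a predicted return and that the threshold is applied symmetrically for shorts. Both are assumptions, and the function name is hypothetical; the actual entry rule lives in src/backtest.py and may differ.

```python
def signal_to_position(predicted_return, signal_threshold=0.0001,
                       enable_shorting=False):
    """Return +1 (long), -1 (short) or 0 (flat) for a single prediction."""
    if predicted_return > signal_threshold:
        return 1
    # Assumption: shorts use the same threshold, mirrored around zero.
    if enable_shorting and predicted_return < -signal_threshold:
        return -1
    return 0


predictions = [0.0005, 0.00005, -0.0003]
long_only = [signal_to_position(p) for p in predictions]
long_short = [signal_to_position(p, enable_shorting=True) for p in predictions]
print(long_only)   # [1, 0, 0]
print(long_short)  # [1, 0, -1]
```

A higher threshold filters out weak signals and reduces the number of trades, which matters once transaction costs are charged on every entry and exit.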
- /data/train/: Contains training data files
- /data/train/simulation/: Contains simulation data files
- /data/processed/: Contains processed data ready for model training
- /models/: Saved trained models and performance metrics
- /results/: Backtest results, plots, and performance metrics
- src/data_processor.py: Data preprocessing and standardization
- src/datasets.py: Dataset loading and feature engineering
- src/models.py: Model training with time series cross-validation
- src/backtest.py: Backtesting engine for trading strategies
- src/main.py: Main script to run the full pipeline
- run_ada_backtest.py: Example script for ADA cryptocurrency
You can customize the backtesting parameters in src/config.py:
- TRANSACTION_COSTS: Transaction costs for each cryptocurrency
- INITIAL_BALANCE: Initial account balance for backtesting
- SIGNAL_THRESHOLD: Minimum signal threshold to enter a position
- ENABLE_SHORTING: Whether to allow short positions
- RISK_PER_TRADE: Risk per trade as a fraction of account balance
- MAX_POSITION_SIZE: Maximum position size as a fraction of account balance
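RISK_PER_TRADE and MAX_POSITION_SIZE are both fractions of the account balance, so one plausible way they combine is risk-based sizing with a hard cap. The sketch below is an illustration under that assumption, not the project's sizing logic. The stop_distance parameter is hypothetical, since the README does not say how stops are defined, and the numeric values are examples rather than defaults from src/config.py.

```python
def position_notional(balance, risk_per_trade, max_position_size, stop_distance):
    """Size a position so that hitting the stop loses about
    risk_per_trade * balance, capped at max_position_size * balance.

    stop_distance is the fractional adverse price move at which the
    trade would be closed (hypothetical parameter).
    """
    if stop_distance <= 0:
        raise ValueError("stop_distance must be positive")
    risk_budget = balance * risk_per_trade    # money we accept losing
    uncapped = risk_budget / stop_distance    # notional implied by the stop
    cap = balance * max_position_size         # hard ceiling on exposure
    return min(uncapped, cap)


# Wide stop (5%): the risk budget decides the size, about 2,000.
size_wide_stop = position_notional(10_000.0, 0.01, 0.5, 0.05)
# Tight stop (1%): the cap decides the size, about 5,000.
size_tight_stop = position_notional(10_000.0, 0.01, 0.5, 0.01)
print(size_wide_stop, size_tight_stop)
```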
After running the backtester, you'll find results in the results/ directory:
- <coin>/<model_type>_backtest_results.csv: Detailed backtest results
- <coin>/<model_type>_trades.csv: Individual trade details
- <coin>/<model_type>_performance.json: Performance metrics
- <coin>/<model_type>_backtest_plot.png: Visualization of the backtest
Transaction costs are modeled specifically for each coin:
- BTC: 5-10 bps
- ETH: 6-12 bps
- SOL: 12-20 bps
- ADA: 14-25 bps
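One basis point (bp) is 0.01% of the traded notional. The sketch below turns the ranges above into a per-trade cost. How the project picks a value inside each range is not stated here, so the choice between the lower and upper bound is an assumption, and the names COST_RANGE_BPS and trade_cost are hypothetical rather than taken from src/config.py.

```python
# Ranges copied from the list above, in basis points (1 bp = 0.01%).
COST_RANGE_BPS = {
    "BTC": (5, 10),
    "ETH": (6, 12),
    "SOL": (12, 20),
    "ADA": (14, 25),
}


def trade_cost(coin, notional, pessimistic=True):
    """Cost, in account currency, of trading `notional` worth of `coin` once."""
    low_bps, high_bps = COST_RANGE_BPS[coin]
    bps = high_bps if pessimistic else low_bps  # assumption: use a bound
    return notional * bps / 10_000


ada_cost = trade_cost("ADA", 10_000.0)                     # 25 bps of 10,000
btc_cost = trade_cost("BTC", 10_000.0, pessimistic=False)  # 5 bps of 10,000
print(ada_cost, btc_cost)  # 25.0 5.0
```

A round trip (entry plus exit) pays this cost twice, so a strategy on ADA needs a noticeably larger edge per trade than the same strategy on BTC.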
from src.runner import run_backtest
# Run backtest with default models (Random Forest, XGBoost, Logistic Regression)
results = run_backtest()
# Or specify custom model types
results = run_backtest(model_types=['rf', 'xgb'])
The backtesting engine generates two key output files in the results/ directory:
- backtest_results.csv: Performance summary across strategies and coins
- detailed_backtest_results.csv: Comprehensive trade-level results
- Total Return
- Annualized Return
- Annualized Volatility
- Sharpe Ratio
- Maximum Drawdown
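For reference, the metrics listed above can be computed from an equity curve roughly as follows. This is a sketch under stated assumptions, not the project's implementation: it assumes evenly spaced daily bars, 365 trading periods per year (crypto markets trade every day), and a zero risk-free rate for the Sharpe ratio. The function name and the sample equity values are made up for illustration.

```python
import math


def performance_metrics(equity, periods_per_year=365):
    """Summary metrics for an equity curve sampled at regular intervals."""
    if len(equity) < 3:
        raise ValueError("need at least three equity points")
    returns = [equity[i] / equity[i - 1] - 1 for i in range(1, len(equity))]
    n = len(returns)

    total_return = equity[-1] / equity[0] - 1
    annualized_return = (1 + total_return) ** (periods_per_year / n) - 1

    mean = sum(returns) / n
    variance = sum((r - mean) ** 2 for r in returns) / (n - 1)  # sample variance
    annualized_vol = math.sqrt(variance * periods_per_year)

    # Assumption: risk-free rate of zero, arithmetic annualization.
    sharpe = mean * periods_per_year / annualized_vol if annualized_vol > 0 else float("nan")

    # Maximum drawdown: largest fractional drop from a running peak.
    peak, max_drawdown = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        max_drawdown = max(max_drawdown, 1 - value / peak)

    return {
        "total_return": total_return,
        "annualized_return": annualized_return,
        "annualized_volatility": annualized_vol,
        "sharpe_ratio": sharpe,
        "max_drawdown": max_drawdown,
    }


metrics = performance_metrics([10_000, 10_200, 9_900, 10_500, 10_300])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With only a handful of points the annualized figures are not meaningful; the example is there to show the arithmetic, not realistic values.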
- Fork the repository
- Create your feature branch
- Commit your changes
- Push to the branch
- Create a new Pull Request
[Specify your license here]