MarketTensor is a research-first framework for reproducible directional forecasting experiments in crypto perpetuals and futures markets. The repository is designed as benchmark infrastructure, not a trading product, and emphasizes leak-free preprocessing, strict time-series evaluation, and paper-ready experiment management.
The project is inspired by Rahul Gupta's December 2025 arXiv submission, "S&P 500 Stock's Movement Prediction using CNN", but it deliberately raises the scientific bar. MarketTensor focuses on chronological evaluation, explicit leakage controls, richer crypto-perpetual market signals, stronger baseline models, and experiment structures that can support a publishable benchmark study.
Many financial forecasting repositories combine exploratory code with loosely defined preprocessing and suffer from time-series leakage. MarketTensor exists to provide a more defensible alternative:
- No random shuffling across temporal splits.
- Train, validation, and test segmentation by time.
- Train-only scaling and deterministic preprocessing artifacts.
- Config-driven experiments and ablations.
- Clear separation between prediction metrics and trading metrics.
- Open-source research structure that can evolve into a paper and later into lightweight browser inference.
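The first three principles can be illustrated with a minimal, dependency-free sketch. It assumes a single time-ordered feature series; the function names `chronological_split`, `fit_scaler`, and `apply_scaler` are hypothetical and are not taken from the MarketTensor code base.

```python
from statistics import mean, pstdev


def chronological_split(rows, n_val, n_test):
    """Split time-ordered rows into train/val/test without shuffling."""
    n_train = len(rows) - n_val - n_test
    if n_train <= 0:
        raise ValueError("not enough rows for the requested split sizes")
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]


def fit_scaler(train_values):
    """Estimate mean and standard deviation from the training segment only."""
    mu = mean(train_values)
    sigma = pstdev(train_values) or 1.0  # guard against a constant series
    return mu, sigma


def apply_scaler(values, mu, sigma):
    """Standardise any segment with statistics fitted on the training data."""
    return [(v - mu) / sigma for v in values]


# Hypothetical close prices, oldest first.
closes = [100.0, 101.0, 99.0, 102.0, 103.0, 101.5, 104.0, 105.0, 103.5, 106.0]
train, val, test = chronological_split(closes, n_val=2, n_test=2)

mu, sigma = fit_scaler(train)  # validation and test data never touch the scaler
train_scaled = apply_scaler(train, mu, sigma)
val_scaled = apply_scaler(val, mu, sigma)
test_scaled = apply_scaler(test, mu, sigma)
```

Because the scaler parameters are a pure function of the training segment, they can be stored alongside a run and reapplied deterministically.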
Initial data families:
- OHLCV
- Funding rate
- Open-interest-style exchange metrics
- Liquidation feature hooks, which raise explicit source errors until a reproducible historical source is added
Initial target markets:
- BTCUSDT perpetual
- ETHUSDT perpetual
- SOLUSDT perpetual
Implemented or scaffolded experiment dimensions:
- Feature sets: ohlcv, ohlcv_funding, ohlcv_open_interest, ohlcv_liquidation, combined variants
- Labels: next-bar direction, k-bar direction, return-threshold classification
- Horizons: 1, 4, 12, 24 bars
- Models: logistic regression, histogram gradient boosting, MLP, 1D CNN, TCN, LSTM
- Evaluation: chronological split, walk-forward evaluation, expanding and rolling windows, pooled and per-symbol analysis
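The label types listed above can be sketched as follows. This is an illustration under stated assumptions, not the repository's implementation: the tie handling (a flat bar counts as "down"), the three-class encoding, and the threshold value are all hypothetical choices.

```python
def k_bar_direction(closes, k=1):
    """Label 1 if the close k bars ahead is higher than the current close, else 0.

    The last k bars have no future close, so they are labelled None.
    """
    labels = []
    for i, price in enumerate(closes):
        if i + k < len(closes):
            labels.append(1 if closes[i + k] > price else 0)
        else:
            labels.append(None)
    return labels


def return_threshold_label(closes, k=1, threshold=0.002):
    """Three-class label from the k-bar return: 1 (up), -1 (down), 0 (flat)."""
    labels = []
    for i, price in enumerate(closes):
        if i + k >= len(closes):
            labels.append(None)
            continue
        ret = closes[i + k] / price - 1.0
        if ret > threshold:
            labels.append(1)
        elif ret < -threshold:
            labels.append(-1)
        else:
            labels.append(0)
    return labels


# Hypothetical close prices, oldest first.
closes = [100.0, 101.0, 100.8, 99.5, 99.6]
direction_1 = k_bar_direction(closes, k=1)  # next-bar direction
direction_2 = k_bar_direction(closes, k=2)  # k-bar direction with k=2
thresholded = return_threshold_label(closes, k=1, threshold=0.005)
```

Rows labelled `None` would be dropped before training, since their outcome lies beyond the end of the available data.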
MarketTensor/
├── configs/
├── data/
├── docs/
├── notebooks/exploration/
├── scripts/
├── src/markettensor/
└── tests/
The .lab/ directory is reserved for gitignored daily research notes and experiment journal entries.
- Create a virtual environment and install the project:
    python3 -m venv .venv
    source .venv/bin/activate
    pip install -e ".[dev]"
    pre-commit install
- Download raw futures archives:
    python scripts/download_data.py --symbol BTCUSDT --interval 1h
- Build a processed dataset:
    python scripts/build_dataset.py --config-name cnn_ohlcv_funding
- Train a model:
    python scripts/train.py --config-name cnn_ohlcv_funding
- Evaluate a run:
    python scripts/evaluate.py --run-id latest
    python scripts/backtest.py --run-id latest
    python scripts/export_onnx.py --run-id latest
Experiment configs:
- cnn_ohlcv
- cnn_ohlcv_funding
- cnn_ohlcv_funding_open_interest
- cnn_ohlcv_liquidation
- cnn_combined_all
- lstm_ohlcv
- logistic_ohlcv
- mlp_ohlcv_open_interest
Each experiment stores the resolved Hydra configuration, feature manifest, scaler parameters, metrics, predictions, and model artifacts for reproducibility.
The repository now includes a longer-history walk-forward benchmark over raw Binance USD-M futures archives downloaded for 2023-01-01 through 2025-12-31, using pooled BTCUSDT, ETHUSDT, and SOLUSDT data. The current README-facing results use the corrected trading evaluator:
- positions are taken from saved binary predictions, not re-thresholded probabilities
- turnover is computed per symbol rather than across pooled assets
- multi-bar trading metrics use non-overlapping holding periods
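The three corrections above can be illustrated with a small sketch. The long/short mapping (prediction 1 is long, prediction 0 is short) and all function names are assumptions made for this example; the actual evaluator may differ.

```python
def positions_from_predictions(preds):
    """Map saved binary predictions directly to positions (no re-thresholding).

    Assumed mapping: 1 -> long (+1), 0 -> short (-1).
    """
    return [1 if p == 1 else -1 for p in preds]


def non_overlapping(seq, horizon):
    """Keep every horizon-th entry so consecutive holding periods do not overlap."""
    return list(seq[::horizon])


def turnover(positions):
    """Sum of absolute position changes within a single series."""
    return sum(abs(b - a) for a, b in zip(positions, positions[1:]))


def per_symbol_turnover(preds_by_symbol, horizon=1):
    """Compute turnover separately for each symbol."""
    result = {}
    for symbol, preds in preds_by_symbol.items():
        held = non_overlapping(positions_from_predictions(preds), horizon)
        result[symbol] = turnover(held)
    return result


# Hypothetical saved predictions, oldest first.
preds_by_symbol = {
    "BTCUSDT": [1, 1, 0, 0, 1, 0, 1, 1],
    "ETHUSDT": [0, 0, 0, 1, 1, 1, 0, 0],
}
turnover_1 = per_symbol_turnover(preds_by_symbol, horizon=1)
turnover_4 = per_symbol_turnover(preds_by_symbol, horizon=4)

# Naively concatenating symbols adds a spurious position change at the boundary.
pooled = positions_from_predictions(
    preds_by_symbol["BTCUSDT"] + preds_by_symbol["ETHUSDT"]
)
pooled_turnover = turnover(pooled)
```

In this toy data the per-symbol turnovers sum to 12, while the pooled series reports 14, because the jump from the last BTCUSDT position to the first ETHUSDT position is counted as a trade that never happened.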
A raw-archive integrity sweep after download verified zero missing checksum files and zero checksum mismatches across:
- 165 kline archives per symbol
- 36 funding-rate archives per symbol
- 1,096 metrics archives per symbol
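An integrity sweep of this kind might look like the sketch below. It assumes each archive `X.zip` sits next to a sidecar file `X.zip.CHECKSUM` whose first whitespace-separated token is the SHA-256 hex digest; the function names and the demo file names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_sweep(archive_dir, pattern="*.zip"):
    """Return (missing, mismatched) archive names found under archive_dir."""
    missing, mismatched = [], []
    for archive in sorted(Path(archive_dir).glob(pattern)):
        sidecar = archive.with_name(archive.name + ".CHECKSUM")
        if not sidecar.exists():
            missing.append(archive.name)
            continue
        tokens = sidecar.read_text().split()
        expected = tokens[0].lower() if tokens else ""
        if sha256_of(archive) != expected:
            mismatched.append(archive.name)
    return missing, mismatched


# Demo on throwaway files: one valid, one corrupted, one without a sidecar.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    good = root / "BTCUSDT-1h-2023-01.zip"
    good.write_bytes(b"good archive bytes")
    (root / (good.name + ".CHECKSUM")).write_text(f"{sha256_of(good)}  {good.name}\n")

    bad = root / "BTCUSDT-1h-2023-02.zip"
    bad.write_bytes(b"corrupted archive bytes")
    (root / (bad.name + ".CHECKSUM")).write_text("0" * 64 + f"  {bad.name}\n")

    orphan = root / "BTCUSDT-1h-2023-03.zip"
    orphan.write_bytes(b"no checksum file")

    missing, mismatched = checksum_sweep(root)
```

A clean sweep corresponds to both returned lists being empty.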
The main 1h comparison in docs/results/long_history_naive_vs_best_summary.csv uses four expanding folds and compares naïve baselines against the current strongest learned candidates.
Headline results:
- Best mean accuracy: LogReg at 0.5123
- Best mean ROC-AUC: LogReg at 0.5163
- Best learned trading result: 1D CNN (OHLCV + Open Interest) with cumulative return 0.0838 and Sharpe 0.0828
- Naïve Majority is still very close, with cumulative return 0.0728 and Sharpe 0.0822
This is the current honest takeaway for 1h: the learned models are only marginally better than trivial baselines, and most of the apparent edge comes from small trading improvements rather than strong classification separation.
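To make the distinction between classification quality and trading outcome concrete, here is a minimal sketch that computes both from the same predictions. The long/short position mapping, the compounding convention, and the unannualised Sharpe definition are assumptions for illustration; they are not confirmed to match the repository's evaluator.

```python
from statistics import mean, stdev


def accuracy(y_true, y_pred):
    """Fraction of bars whose direction was predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def strategy_returns(y_pred, bar_returns):
    """Per-bar strategy return: long on prediction 1, short on prediction 0."""
    return [(1 if p == 1 else -1) * r for p, r in zip(y_pred, bar_returns)]


def cumulative_return(returns):
    """Compound per-period returns into a total return."""
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total - 1.0


def sharpe(returns):
    """Unannualised Sharpe ratio: mean over sample standard deviation."""
    sd = stdev(returns)
    return mean(returns) / sd if sd > 0 else 0.0


# Hypothetical data: three of four directions are right, but the one miss is large.
bar_returns = [0.01, -0.02, 0.005, -0.001]
y_true = [1, 0, 1, 0]
y_pred = [1, 1, 1, 0]

acc = accuracy(y_true, y_pred)
rets = strategy_returns(y_pred, bar_returns)
cum = cumulative_return(rets)
sr = sharpe(rets)
```

In this toy case the accuracy is 0.75 while the cumulative return and Sharpe are both negative, which is why the two families of metrics are reported separately.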
Per-symbol and multi-horizon tables are also available:
- docs/results/long_history_naive_vs_best_per_symbol_summary.csv
- docs/results/long_history_naive_vs_best_horizons_summary.csv
The same pooled four-fold protocol was then run across 15m, 1h, and 4h bars for Majority, LogReg, LSTM, and 1D CNN (OHLCV + Open Interest). The combined result table is in docs/results/long_history_timeframe_models_summary.csv.
Current timeframe takeaways:
- 15m: no useful learned edge; LSTM has the best classification (0.5150 accuracy, 0.5202 ROC-AUC) but negative trading metrics, while Majority is the best trading baseline with Sharpe 0.0441
- 1h: LSTM remains the best classifier, but 1D CNN (OHLCV + Open Interest) is the best trading model with Sharpe 0.0828
- 4h: this is the strongest setup so far; LSTM has the best classification (0.5219 accuracy), while 1D CNN (OHLCV + Open Interest) has the best trading result with cumulative return 0.3140 and Sharpe 0.8068
The current research interpretation is therefore cautious but useful: lower timeframes are still close to noise, 1h contains only weak signal, and 4h is the first interval where the learned models separate more meaningfully from the trivial baselines under the corrected backtest.
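For readers unfamiliar with the walk-forward protocol referenced in these results, the sketch below generates expanding or rolling folds over a sample index. The fold-sizing rule (equal test blocks taken from the end of the series) is an assumption borrowed from common practice; `walk_forward_folds` is a hypothetical name, not a MarketTensor function.

```python
def walk_forward_folds(n_samples, n_splits=4, mode="expanding", window=None):
    """Return a list of (train_range, test_range) pairs in chronological order.

    expanding: training always starts at index 0 and grows with each fold.
    rolling:   training is a fixed-length window ending where the test block starts.
    """
    if mode not in ("expanding", "rolling"):
        raise ValueError("mode must be 'expanding' or 'rolling'")
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("too few samples for the requested number of splits")
    first_test_start = n_samples - n_splits * test_size
    if window is None:
        window = first_test_start  # default rolling window: the first training span

    folds = []
    for k in range(n_splits):
        test_start = first_test_start + k * test_size
        test_end = test_start + test_size
        train_start = 0 if mode == "expanding" else max(0, test_start - window)
        folds.append((range(train_start, test_start), range(test_start, test_end)))
    return folds


expanding = walk_forward_folds(100, n_splits=4, mode="expanding")
rolling = walk_forward_folds(100, n_splits=4, mode="rolling", window=20)
```

Every test block starts exactly where its training range ends, so no fold trains on data from its own evaluation period or later.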
python scripts/run_walk_forward.py \
--suite-name long_history_naive_vs_best \
--config-name majority_ohlcv \
--config-name random_ohlcv \
--config-name persistence_ohlcv \
--config-name logistic_ohlcv \
--config-name cnn_ohlcv_open_interest \
--symbols BTCUSDT ETHUSDT SOLUSDT \
--override experiment.eval.walk_forward.n_splits=4

python scripts/generate_walk_forward_figures.py \
--summary-csv docs/results/long_history_naive_vs_best_summary.csv \
--folds-csv docs/results/long_history_naive_vs_best_folds.csv \
--prefix long_history_naive_vs_best \
--title "Long-History Walk-Forward Naive Baselines vs Best Learned Models"

python scripts/run_walk_forward.py \
--suite-name long_history_tf_4h_models \
--config-name majority_ohlcv \
--config-name logistic_ohlcv \
--config-name lstm_ohlcv \
--config-name cnn_ohlcv_open_interest \
--symbols BTCUSDT ETHUSDT SOLUSDT \
--override experiment.eval.walk_forward.n_splits=4 \
--override experiment.data.interval=4h

python scripts/run_walk_forward.py \
--suite-name long_history_tf_1h_models \
--config-name majority_ohlcv \
--config-name logistic_ohlcv \
--config-name lstm_ohlcv \
--config-name cnn_ohlcv_open_interest \
--symbols BTCUSDT ETHUSDT SOLUSDT \
--override experiment.eval.walk_forward.n_splits=4 \
--override experiment.data.interval=1h

python scripts/run_walk_forward.py \
--suite-name long_history_tf_15m_models \
--config-name majority_ohlcv \
--config-name logistic_ohlcv \
--config-name lstm_ohlcv \
--config-name cnn_ohlcv_open_interest \
--symbols BTCUSDT ETHUSDT SOLUSDT \
--override experiment.eval.walk_forward.n_splits=4 \
--override experiment.data.interval=15m

python scripts/generate_timeframe_tables.py \
--suite long_history_tf_15m_models:15m:docs/results/long_history_tf_15m_models_summary.csv \
--suite long_history_tf_1h_models:1h:docs/results/long_history_tf_1h_models_summary.csv \
--suite long_history_tf_4h_models:4h:docs/results/long_history_tf_4h_models_summary.csv \
--output-prefix docs/results/long_history_timeframe_models_summary

python scripts/generate_timeframe_figures.py \
--summary-csv docs/results/long_history_timeframe_models_summary.csv \
--prefix long_history_timeframe_models_summary \
--title "Long-History Walk-Forward Comparison Across Timeframes"

- Chronological train/validation/test splits only
- Optional purge and embargo support for overlapping labels
- Lagged feature alignment to avoid contemporaneous leakage
- Deterministic seeds
- Clear distinction between statistical performance and simulated trading outcomes
- Reproducible experiment registry and artifact layout
See docs/methodology.md, docs/reproducibility.md, and docs/paper_notes.md for details.
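As a rough illustration of the purge, embargo, and lagged-alignment controls listed above, consider the sketch below. It assumes that the label of sample `i` depends on prices up to index `i + horizon`; the function names and the exact boundary conventions are hypothetical and may differ from those in docs/methodology.md.

```python
def purged_train_indices(train_idx, test_start, test_end, horizon, embargo=0):
    """Filter training indices around the test block [test_start, test_end).

    Purge:   drop samples before the block whose label window (i + horizon)
             reaches into the block.
    Embargo: drop samples in the first `embargo` bars after the block.
    """
    kept = []
    for i in train_idx:
        label_ends_before_test = i + horizon < test_start
        starts_after_embargo = i >= test_end + embargo
        if label_ends_before_test or starts_after_embargo:
            kept.append(i)
    return kept


def lag_features(values, lag=1):
    """Shift a feature series so that row t only sees the value from t - lag."""
    if not 0 <= lag <= len(values):
        raise ValueError("lag must be between 0 and len(values)")
    return [None] * lag + list(values[:len(values) - lag])


# Training data on both sides of a test block covering indices 20..29.
train_idx = list(range(0, 20)) + list(range(30, 40))
kept = purged_train_indices(train_idx, test_start=20, test_end=30, horizon=4, embargo=3)

# Hypothetical funding-rate values, aligned so that each bar uses the previous one.
funding = [0.0001, 0.0002, -0.0001, 0.0003]
funding_lagged = lag_features(funding, lag=1)
```

With a 4-bar horizon, indices 16 to 19 are purged because their labels depend on prices inside the test block, and indices 30 to 32 fall inside the 3-bar embargo.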
- Add reproducible liquidation data integration
- Extend exchange adapters beyond Binance USD-M futures
- Add richer microstructure features such as basis and mark-index spread
- Expand experiment registry outputs into paper tables and publication figures
- Export lightweight models for browser-based inference and chart-indicator workflows
MarketTensor is research infrastructure for forecasting experiments. It is not investment advice, not a trading recommendation system, and not a production execution stack.


