A trading system comparing traditional ML (Random Forest) against a biological neural network (the fruit fly brain connectome) for Bitcoin price prediction.
The task is next-bar return prediction on Bitcoin/USDT using high-frequency data from Binance. We use 7 days of 1-second bars (~12,400 samples) with full market data including OHLC prices, volume, VWAP, and trade counts. The dataset is split 80/20 for training and testing, representing approximately 5.6 days of training and 1.4 days of held-out testing. Each model predicts the next 1-second log return and generates trading signals when predictions exceed model-specific thresholds. Performance is evaluated on out-of-sample data using a realistic backtest with 0.1% taker fees, 60-second cooldown between trades, and $1M starting capital. All models are trained once on the training set and evaluated on the same test period to ensure fair comparison.
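A minimal sketch of the label construction and chronological split described above; the names `bars`, `make_labels`, and `chronological_split` are illustrative, not the repo's actual code, and the constants just restate the backtest settings from the description:

```python
import numpy as np
import pandas as pd

# Backtest settings from the description above
TAKER_FEE = 0.001          # 0.1% per trade
COOLDOWN_S = 60            # minimum seconds between trades
START_CAPITAL = 1_000_000  # $1M starting capital

# bars: DataFrame of 1-second OHLCV bars indexed by timestamp (illustrative)
def make_labels(bars: pd.DataFrame) -> pd.Series:
    """Next 1-second log return: the quantity every model predicts."""
    return np.log(bars['close'].shift(-1) / bars['close'])

def chronological_split(bars: pd.DataFrame, train_frac: float = 0.8):
    """80/20 split in time order - no shuffling, so the test window is strictly out-of-sample."""
    cut = int(len(bars) * train_frac)
    return bars.iloc[:cut], bars.iloc[cut:]
```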
| Model | Description | Return |
|---|---|---|
| Baseline | Random Forest with 23 engineered features | +3.62% |
| ESN | Echo State Network from fruit fly brain (7500 neurons) | +4.32% |
| ESN Rewired | Control with randomized connections | +3.78% |
Traditional machine learning approach using Random Forest trained on 23 engineered features including VWAP deviation, momentum indicators, volume metrics, and price patterns. Trades when predicted returns exceed a dynamic threshold based on signal strength (top 15% of predictions). Uses adaptive position sizing (30-85% of capital) based on prediction confidence, with stop-loss (-0.6%) and take-profit (+1.2%) rules. Generated 488 trades with +3.62% return and -1.06% max drawdown.
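The feature/training snippet below does not show the entry threshold or the exit rules, so here is a minimal sketch of how they could work; `recent_predictions`, `entry_price`, `current_price`, and `close_position` are illustrative names and are not taken from the repo:

```python
import numpy as np

# Dynamic threshold: only act on the strongest ~15% of predictions by magnitude
dynamic_threshold = np.quantile(np.abs(recent_predictions), 0.85)

# Exit rules checked while a position is open (long position shown)
unrealized = (current_price - entry_price) / entry_price
if unrealized <= -0.006:    # stop-loss at -0.6%
    close_position()
elif unrealized >= 0.012:   # take-profit at +1.2%
    close_position()
```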
```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Feature engineering - 23 technical indicators
df['ret_1s'] = np.log(bars['close'] / bars['close'].shift(1))
df['ret_5s'] = np.log(bars['close'] / bars['close'].shift(5))
df['vwap_dev_10s'] = (bars['close'] - bars['vwap'].rolling(10).mean()) / bars['vwap']
df['volume_ma_ratio'] = bars['volume'] / bars['volume'].rolling(20).mean()
# ... 19 more features (momentum, volatility, spreads, etc.)

# Model training
model = RandomForestRegressor(n_estimators=100, max_depth=10, random_state=42)
X = df[FEATURE_COLS].fillna(0).values
y = df['label'].values  # Future 1-period return
model.fit(X_train, y_train)

# Trading signal
prediction = model.predict(X_current)[0]  # single-row prediction
if abs(prediction) > dynamic_threshold:
    # signal_strength: prediction confidence relative to the threshold (see description above)
    position_size = min(0.85, 0.3 + signal_strength * 0.55)  # 30-85% of capital
    trade(direction=np.sign(prediction), size=position_size)
```

Biological neural network using the actual connectome structure from a fruit fly brain (7500 neurons, 323K connections). The reservoir of recurrently connected biological neurons processes only raw return data through echo state dynamics, with a simple linear readout trained via ridge regression. It trades when the ESN's predicted return exceeds a 0.02% threshold, using a fixed 70% position size without stop-losses. The biological structure provides implicit feature extraction and temporal memory. Generated 422 trades with +4.32% return and -0.88% max drawdown, the best performer of the three.
```python
import numpy as np
import scipy.sparse

# Load biological connectome (fruit fly brain)
A = scipy.sparse.load_npz('adjacency_directed_csr.npz')  # ~138K neurons
degrees = np.array(A.sum(axis=1)).ravel()
top_nodes = np.argsort(degrees)[-7500:]           # Select the 7500 most connected neurons
W = A[top_nodes][:, top_nodes].toarray()          # ~323K biological connections

# Scale to spectral radius rho = 0.9 (echo state stability)
eigenvalues = np.linalg.eigvals(W)
W = W * (0.9 / np.max(np.abs(eigenvalues)))

# Generate reservoir states from raw returns only
mean, std = train_returns.mean(), train_returns.std()
X_reservoir = []
x = np.zeros(7500)                                # Initial reservoir state
for return_t in train_returns:
    u_t = (return_t - mean) / std                 # Normalize input
    x = (1 - 0.7) * x + 0.7 * np.tanh(W @ x + 0.12 * u_t)  # Leaky echo state update
    X_reservoir.append(x)

# Train simple linear readout via ridge regression (discard the 50-step washout)
X_res = np.vstack(X_reservoir)[50:]
W_out = np.linalg.solve(X_res.T @ X_res + 1e-5 * np.eye(X_res.shape[1]),
                        X_res.T @ y_train[50:])

# Trading signal - just multiply the reservoir state by the learned weights
prediction = W_out @ x_current
if abs(prediction) > 0.0002:                      # 0.02% threshold
    trade(direction=np.sign(prediction), size=0.70)
```

Control experiment using the same ESN architecture but with randomized connections that preserve the network's degree distribution while destroying the biological wiring. It uses identical hyperparameters and trading rules to the biological ESN. Generated 434 trades with +3.78% return and -0.86% max drawdown. The 0.54-percentage-point underperformance versus the biological network demonstrates that the evolved neural structure provides measurable value for time series prediction.
```python
import numpy as np

# Rewire: shuffle connection targets while preserving the degree distribution
rows, cols = W.nonzero()
weights = W[rows, cols]                     # Original edge weights
new_cols = np.random.permutation(cols)      # Randomize targets, keep each source's out-degree
W_rewired = np.zeros_like(W)
W_rewired[rows, new_cols] = weights         # Preserve edge weights

# Scale to the same spectral radius as the biological network
eigenvalues = np.linalg.eigvals(W_rewired)
W_rewired = W_rewired * (0.9 / np.max(np.abs(eigenvalues)))

# Identical training and trading as the biological ESN
# ... same echo state dynamics, same readout, same rules
# Result: +3.78% vs +4.32% - biology wins by 0.54%
```

```bash
python main.py
```

This runs all 3 models automatically and generates:
- Individual validation plots
- Backtest timeline plots
- Feature importance plots
- Portfolio comparison plot
```python
ESN_CONFIG = {
    'n_nodes': 7500,    # Biological neurons (optimal: 7500)
    'alpha': 0.7,       # Leak rate
    'rho': 0.9,         # Spectral radius
    'in_scale': 0.12,   # Input scaling
    'ridge': 1e-5,      # Regularization
    'washout': 50,      # Initial timesteps to discard
    'seed': 2024,       # Random seed
}
```

Biology wins: The biological connectome structure (+4.32%) outperforms both the baseline (+3.62%) and the random network (+3.78%).
The real fruit fly connectome provides measurable value: a 0.54-percentage-point advantage over the randomized structure.
| Model | Return | Trades | Max Drawdown |
|---|---|---|---|
| Baseline | +3.62% | 488 | -1.06% |
| ESN (Bio) | +4.32% | 422 | -0.88% |
| ESN (Rewired) | +3.78% | 434 | -0.86% |
- `main.py` - Entry point (set MODEL_TYPE to "all", "baseline", "esn", or "esn_rewired")
- `esn_model.py` - Biological ESN implementation
- `data.py` - Data fetching from Binance
- `plot.py` - Visualization utilities
- `compare_and_plot.py` - Comparison runner
Uses 7 days of 1-second Bitcoin bars from Binance (~12K samples, 80/20 train/test split).
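`data.py` is not shown here; below is a minimal sketch of paging 1-second BTC/USDT klines from Binance's public spot REST endpoint. The `1s` interval, the 1000-bar page limit, and the kline field order are my assumptions about the public API, not code from this repo:

```python
import requests
import pandas as pd

def fetch_1s_bars(start_ms: int, end_ms: int, symbol: str = "BTCUSDT") -> pd.DataFrame:
    """Page through Binance spot klines at 1-second resolution (max 1000 bars per request)."""
    url = "https://api.binance.com/api/v3/klines"
    rows = []
    while start_ms < end_ms:
        resp = requests.get(url, params={
            "symbol": symbol, "interval": "1s",
            "startTime": start_ms, "endTime": end_ms, "limit": 1000,
        })
        batch = resp.json()
        if not batch:
            break
        rows.extend(batch)
        start_ms = batch[-1][6] + 1   # next page starts after the last bar's close time
    cols = ["open_time", "open", "high", "low", "close", "volume",
            "close_time", "quote_volume", "trades", "taker_base", "taker_quote", "ignore"]
    # Numeric fields come back as strings; cast before computing features
    return pd.DataFrame(rows, columns=cols)
```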
See requirements.txt:
- numpy
- pandas
- scikit-learn
- matplotlib
- seaborn
- scipy
- joblib
- ESN uses only 1 feature (returns) vs baseline's 23 features
- Optimal network size: 7500 nodes (tested 500-10000; a sketch of such a sweep follows below)
- Biology provides structure that helps with time series prediction
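A minimal sketch of how a reservoir-size sweep could be run, mirroring the ESN code above. The specific candidate sizes, the held-out variables `test_returns`/`y_test`, and the use of test MSE as the score are illustrative assumptions; the repo's actual sweep code is not shown here:

```python
import numpy as np

def run_esn(n_nodes, A, train_u, train_y, test_u, test_y):
    """Build a reservoir from the n_nodes most-connected neurons and score it on held-out data."""
    degrees = np.array(A.sum(axis=1)).ravel()
    top = np.argsort(degrees)[-n_nodes:]
    W = A[top][:, top].toarray()
    W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))            # rho = 0.9
    mean, std = train_u.mean(), train_u.std()

    def states(u_seq):
        x, out = np.zeros(n_nodes), []
        for u in u_seq:
            x = 0.3 * x + 0.7 * np.tanh(W @ x + 0.12 * (u - mean) / std)  # alpha = 0.7
            out.append(x)
        return np.vstack(out)

    X_tr, X_te = states(train_u)[50:], states(test_u)          # 50-step washout on training states
    W_out = np.linalg.solve(X_tr.T @ X_tr + 1e-5 * np.eye(n_nodes), X_tr.T @ train_y[50:])
    return np.mean((X_te @ W_out - test_y) ** 2)               # test MSE (lower is better)

scores = {n: run_esn(n, A, train_returns, y_train, test_returns, y_test)
          for n in [500, 1000, 2500, 5000, 7500, 10000]}       # illustrative grid over 500-10000
```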
