# Data_v1_training&evaluation.ipynb

This notebook trains and evaluates three models (exp1 to exp3) on the version 1 dataset produced by convert_sample_to_cryptoMamba_format.ipynb, which has noticeable data distribution issues. The experiments test how batch size and normalization affect training and generalization. All models are evaluated on the same validation and test sets.

## Experiments Overview:
- exp1: Small batch size without normalization
- exp2: Large batch size without normalization
- exp3: Large batch size with normalization
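
The three runs form a controlled comparison: exp1 and exp2 differ only in batch size, and exp2 and exp3 differ only in normalization. A minimal sketch of that design, using the hyperparameter values listed in the experiment sections of this notebook. The `ExperimentConfig` class and its field names are hypothetical and are not the keys used by the CryptoMamba config files.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ExperimentConfig:
    # Hypothetical container; field names are illustrative only.
    learning_rate: float
    normalization: bool
    window_size: int
    batch_size: int


# Values copied from the "Hyperparameter setting" lists in this notebook.
EXPERIMENTS = {
    "exp1": ExperimentConfig(0.01, False, 14, 32),
    "exp2": ExperimentConfig(0.01, False, 14, 512),
    "exp3": ExperimentConfig(0.01, True, 14, 512),
}


def changed_fields(a, b):
    """Return the names of the fields whose values differ between two configs."""
    da, db = asdict(a), asdict(b)
    return sorted(key for key in da if da[key] != db[key])


print(changed_fields(EXPERIMENTS["exp1"], EXPERIMENTS["exp2"]))
print(changed_fields(EXPERIMENTS["exp2"], EXPERIMENTS["exp3"]))
```

Because exp1 and exp3 differ in two settings at once, a difference between those two runs cannot be attributed to either setting alone.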

## Purpose:
Compare the effects of batch size and normalization on training convergence and model generalization, and check model performance on a dataset with an inconsistent data distribution.

In [None]:
# Connect google drive.

from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
# Clone repo to current session.

!git clone https://github.com/MShahabSepehri/CryptoMamba.git
%cd CryptoMamba/

Cloning into 'CryptoMamba'...
remote: Enumerating objects: 187, done.
remote: Counting objects: 100% (55/55), done.
remote: Compressing objects: 100% (21/21), done.
remote: Total 187 (delta 43), reused 34 (delta 34), pack-reused 132 (from 1)
Receiving objects: 100% (187/187), 1.37 MiB | 3.57 MiB/s, done.
Resolving deltas: 100% (89/89), done.
/content/CryptoMamba


In [None]:
%%capture
# Install required dependencies. The %%capture cell magic must be the first line of the cell.
!pip install mamba-ssm[causal-conv1d]
!pip install -r requirements.txt

# Exp1: Based on our version1 dataset.
## Dataset info:
1. test_interval:
- `2018-29-04`
- `2019-28-04`
2. train_interval:
- `2013-29-04`
- `2017-28-04`
3. val_interval:
- `2017-28-04`
- `2018-28-04`

## Hyperparameter setting:
1. learning_rate: 0.01
2. normalization: False
3. window_size: 14
4. batch_size: 32

In [4]:
# @title Training
!python scripts/training.py --config cmamba_v_exp1

Seed set to 23
['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
2025-04-19 12:23:46.746388: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1745065426.992303    4379 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1745065427.063188    4379 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-04-19 12:23:47.565326: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other op

# Evaluation of exp1’s Model

## Result Analysis:
1. The model converged on the training split (train RMSE 23.862) using a small batch size of 32 and no normalization.
2. However, it performed poorly on both the validation set (RMSE 5947.951, MAPE 0.5126) and the test set (RMSE 3008.746, MAPE 0.47199).

In [6]:
!python scripts/evaluation.py --config cmamba_v_exp1 --ckpt_path ./logs/CMamba_exp1/version_0/checkpoints/epoch66-val-rmse5787.5752.ckpt

Seed set to 23
['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
1446 data points loaded as train split.
350 data points loaded as val split.
350 data points loaded as test split.
 Split        MSE          RMSE     MAPE      MAE    
 Train      569.397       23.862   0.03672   15.609  
  Val     35378124.0     5947.951  0.5126   4464.943 
 Test      9052551.0     3008.746  0.47199  2753.768 
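
The four columns in the table above are standard regression metrics. A minimal sketch of how they are usually defined, assuming MAPE is reported as a fraction rather than a percentage. The `regression_metrics` helper is hypothetical, and `scripts/evaluation.py` may compute the values differently. The toy prices are illustrative and not taken from the dataset.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAPE (as a fraction), and MAE for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mse = float(np.mean(err ** 2))
    return {
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),
        # Undefined when y_true contains zeros; prices are assumed positive.
        "MAPE": float(np.mean(np.abs(err) / np.abs(y_true))),
        "MAE": float(np.mean(np.abs(err))),
    }


# Toy values for illustration only.
metrics = regression_metrics([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The reported numbers are consistent with these definitions in at least one respect: the square root of the train MSE (569.397) is approximately the train RMSE (23.862).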


# Exp2: Based on our version1 dataset.
## Dataset info:
1. test_interval:
- `2018-29-04`
- `2019-28-04`
2. train_interval:
- `2013-29-04`
- `2017-28-04`
3. val_interval:
- `2017-28-04`
- `2018-28-04`

## Hyperparameter setting for training:
1. learning_rate: 0.01
2. normalization: False
3. window_size: 14
4. batch_size: 512

In [9]:
!python scripts/training.py --config cmamba_v_exp2

Seed set to 23
['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
2025-04-19 12:47:57.434240: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1745066877.454487   17646 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1745066877.460711   17646 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-04-19 12:47:57.482118: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other op

# Evaluation of exp2’s Model

## Result Analysis:
1. This model also converged, using a large batch size of 512 and no normalization.
2. Despite convergence, its performance on the validation and test sets was even worse than that of exp1 (validation RMSE 7901.223 vs. 5947.951; test RMSE 5326.434 vs. 3008.746).

In [10]:
!python scripts/evaluation.py --config cmamba_v_exp2 --ckpt_path ./logs/CMamba_exp2/version_1/checkpoints/last.ckpt

Seed set to 23
['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
1446 data points loaded as train split.
350 data points loaded as val split.
350 data points loaded as test split.
 Split        MSE          RMSE     MAPE      MAE    
 Train     67435.078     259.683   0.40233  182.897  
  Val     62429332.0     7901.223  0.90999  6645.025 
 Test     28370898.0     5326.434  0.91381  5107.485 
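
One practical consequence of the larger batch size is the number of optimizer updates per epoch. A rough calculation, assuming each of the 1446 reported training data points is one sample (windowing with `window_size` 14 may reduce this slightly) and that the last partial batch is kept. The `steps_per_epoch` helper is hypothetical and not part of the repository.

```python
import math


def steps_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of optimizer updates in one pass over the training data."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


TRAIN_POINTS = 1446  # from "1446 data points loaded as train split."

for batch_size in (32, 512):
    print(f"batch_size={batch_size}: {steps_per_epoch(TRAIN_POINTS, batch_size)} updates per epoch")
# batch_size=32: 46 updates per epoch
# batch_size=512: 3 updates per epoch
```

With the same learning rate, exp2 therefore makes far fewer parameter updates per epoch than exp1, which may be one reason its training error is also higher.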


# Exp3: Based on our version1 dataset.
## Dataset info:
1. test_interval:
- `2018-29-04`
- `2019-28-04`
2. train_interval:
- `2013-29-04`
- `2017-28-04`
3. val_interval:
- `2017-28-04`
- `2018-28-04`

## Hyperparameter setting for training:
1. learning_rate: 0.01
2. normalization: True
3. window_size: 14
4. batch_size: 512

In [12]:
!python scripts/training.py --config cmamba_v_exp3

Seed set to 23
['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
2025-04-19 13:00:22.830106: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1745067622.850348   24012 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1745067622.856552   24012 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-04-19 13:00:22.877448: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other op

# Evaluation of exp3’s Model

## Result Analysis:
1. The model converged more quickly than exp2, benefiting from the use of normalization.
2. Nonetheless, its performance remained poor on both the validation set (RMSE 7674.229) and the test set (RMSE 4325.674), likely due to data distribution issues.
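
To illustrate why normalization alone may not help under a distribution shift, the sketch below fits min-max scaling on a training split and applies it to a validation split whose values lie outside the training range. It assumes min-max scaling fitted on the training data only, and the normalization used by the CryptoMamba code may differ. The prices are synthetic and are not the notebook's data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic prices for illustration only: validation values lie above the training range.
train_close = rng.uniform(100.0, 1000.0, size=1446)
val_close = rng.uniform(2000.0, 10000.0, size=350)


def fit_min_max(x):
    """Return the (min, max) statistics of the fitting split."""
    return float(x.min()), float(x.max())


def apply_min_max(x, lo, hi):
    """Scale x with statistics taken from the fitting split."""
    return (x - lo) / (hi - lo)


lo, hi = fit_min_max(train_close)
train_scaled = apply_min_max(train_close, lo, hi)
val_scaled = apply_min_max(val_close, lo, hi)

# Share of validation inputs the model never saw an equivalent of during training.
outside = float(np.mean((val_scaled < 0.0) | (val_scaled > 1.0)))

print(f"train scaled range: [{train_scaled.min():.2f}, {train_scaled.max():.2f}]")
print(f"val scaled range:   [{val_scaled.min():.2f}, {val_scaled.max():.2f}]")
print(f"share of validation values outside [0, 1]: {outside:.2f}")
```

In this synthetic setup every scaled validation value falls outside the interval seen during training, so the model has to extrapolate regardless of batch size or normalization.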


In [13]:
!python scripts/evaluation.py --config cmamba_v_exp3 --ckpt_path ./logs/CMamba_exp3/version_0/checkpoints/last.ckpt

Seed set to 23
['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
1446 data points loaded as train split.
350 data points loaded as val split.
350 data points loaded as test split.
 Split        MSE          RMSE     MAPE      MAE    
 Train     8373.418       91.506   0.24683   86.914  
  Val     58893792.0     7674.229  0.68339  5872.741 
 Test     18711456.0     4325.674  0.68217  3974.228 


# Conclusion

Across all three experiments in this notebook, the models converged on the training split yet consistently failed to generalize to the validation and test sets. Convergence alone is therefore not a sufficient indicator of model performance. Because the results stayed poor across different batch sizes and with normalization enabled in exp3, the most likely cause is an underlying problem with the data distribution.
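
The gap between training and held-out error can be quantified directly from the RMSE values reported in the evaluation cells above. A small sketch, where the `gap_ratio` helper is hypothetical and the numbers are copied from the evaluation outputs:

```python
# RMSE values copied from the evaluation outputs in this notebook.
RMSE = {
    "exp1": {"train": 23.862, "val": 5947.951, "test": 3008.746},
    "exp2": {"train": 259.683, "val": 7901.223, "test": 5326.434},
    "exp3": {"train": 91.506, "val": 7674.229, "test": 4325.674},
}


def gap_ratio(scores, split):
    """Held-out RMSE divided by training RMSE; values far above 1 mean poor generalization."""
    return scores[split] / scores["train"]


for name, scores in RMSE.items():
    print(
        f"{name}: val/train = {gap_ratio(scores, 'val'):.1f}, "
        f"test/train = {gap_ratio(scores, 'test'):.1f}"
    )

# Experiment with the lowest validation RMSE.
best_by_val = min(RMSE, key=lambda name: RMSE[name]["val"])
print("lowest validation RMSE:", best_by_val)
```

In every experiment the held-out RMSE is more than an order of magnitude larger than the training RMSE, which matches the conclusion that none of the three settings generalizes on this dataset.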

To address this, we ran further experiments on data version 2 in data_v2_training&evaluation.ipynb, where the training, validation, and test sets were resampled to mitigate the major data distribution problems.