# CA-SDGNN Training, Validation, and Testing

This notebook demonstrates how to train, validate, and test the CA-SDGNN framework on the Bitcoin Alpha dataset. The workflow includes:
- Mounting Google Drive (if using Colab; covered in the last section)
- Pretraining the model
- Fine-tuning the model
- Running validation and testing
- Analyzing results
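
Every cell below invokes `main.py` with the same architecture flags, so a mismatch between the pretraining and fine-tuning invocations is easy to introduce by hand. As a minimal sketch, the shared settings can be kept in one place. The `build_command` helper and the `SHARED_ARGS` dictionary are hypothetical and not part of the repository; the flags and values are copied from the cells in this notebook.

```python
import shlex

# Flags that are identical in every cell of this notebook.
SHARED_ARGS = {
    "dataset": "bitcoin_alpha",
    "train_file": "experiment-data/bitcoin_alpha-train-1.edgelist",
    "device": "cuda",
    "node_feat_dim": 16,
    "embed_dim": 16,
    "num_heads": 4,
    "num_layers": 2,
    "dropout_rate": 0.1,
    "pretrain_path": "embeddings/bitcoin_alpha_ca_sdgnn_pretrained.pth",
    "output_dir": "output",
}


def build_command(mode, **overrides):
    """Return the argv list for one `main.py` run (hypothetical helper)."""
    args = {**SHARED_ARGS, "mode": mode, **overrides}
    argv = ["python", "main.py"]
    for flag, value in args.items():
        argv += [f"--{flag}", str(value)]
    return argv


pretrain_cmd = build_command("pretrain", epochs=500, lr=0.001, weight_decay=0.0001)
print(shlex.join(pretrain_cmd))
```

The resulting list can be passed to `subprocess.run(pretrain_cmd, check=True)` in place of a `!python` cell, so only the mode-specific flags differ between runs.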

In [11]:
!python main.py \
  --dataset bitcoin_alpha \
  --mode pretrain \
  --train_file experiment-data/bitcoin_alpha-train-1.edgelist \
  --device cuda \
  --epochs 500 \
  --node_feat_dim 16 \
  --embed_dim 16 \
  --num_heads 4 \
  --num_layers 2 \
  --dropout_rate 0.1 \
  --lr 0.001 \
  --weight_decay 0.0001 \
  --pretrain_path embeddings/bitcoin_alpha_ca_sdgnn_pretrained.pth \
  --output_dir output

Epoch [1/500], Pretraining Loss: 31.4786
Epoch [2/500], Pretraining Loss: 30.4657
Epoch [3/500], Pretraining Loss: 29.4441
Epoch [4/500], Pretraining Loss: 28.9800
Epoch [5/500], Pretraining Loss: 28.3410
Epoch [6/500], Pretraining Loss: 27.4996
Epoch [7/500], Pretraining Loss: 27.0407
Epoch [8/500], Pretraining Loss: 26.1458
Epoch [9/500], Pretraining Loss: 25.6498
Epoch [10/500], Pretraining Loss: 25.1788
Epoch [11/500], Pretraining Loss: 24.9631
Epoch [12/500], Pretraining Loss: 24.1118
Epoch [13/500], Pretraining Loss: 23.7725
Epoch [14/500], Pretraining Loss: 23.1672
Epoch [15/500], Pretraining Loss: 23.0386
Epoch [16/500], Pretraining Loss: 22.8957
Epoch [17/500], Pretraining Loss: 22.2958
Epoch [18/500], Pretraining Loss: 22.1936
Epoch [19/500], Pretraining Loss: 21.9401
Epoch [20/500], Pretraining Loss: 21.7944
Epoch [21/500], Pretraining Loss: 21.2566
Epoch [22/500], Pretraining Loss: 21.1918
Epoch [23/500], Pretraining Loss: 21.0761
Epoch [24/500], Pretraining Loss: 20.7190
E

  signed_adj_matrix = torch.sparse.FloatTensor(indices, values, (num_nodes, num_nodes))

Pretraining Epochs:   0%|          | 0/500 [00:00<?, ?it/s]
Pretraining Epochs:   0%|          | 1/500 [00:03<26:10,  3.15s/it]
Pretraining Epochs:   0%|          | 2/500 [00:03<11:28,  1.38s/it]
Pretraining Epochs:   1%|          | 3/500 [00:03<06:44,  1.23it/s]
Pretraining Epochs:   1%|          | 4/500 [00:03<04:32,  1.82it/s]
Pretraining Epochs:   1%|          | 5/500 [00:03<03:23,  2.43it/s]
Pretraining Epochs:   1%|          | 6/500 [00:03<02:37,  3.13it/s]
Pretraining Epochs:   1%|▏         | 7/500 [00:04<02:07,  3.88it/s]
Pretraining Epochs:   2%|▏         | 8/500 [00:04<01:50,  4.47it/s]
Pretraining Epochs:   2%|▏         | 9/500 [00:04<01:36,  5.10it/s]
Pretraining Epochs:   2%|▏         | 10/500 [00:04<01:27,  5.61it/s]
Pretraining Epochs:   2%|▏         | 11/500 [00:04<01:24,  5.81it/s]
Pretraining Epochs:   2%|▏         | 12/500 [00:04<01:24,  5.80it/s]
Pretraining Epochs:   3%|▎      

In [None]:
!python main.py \
  --dataset bitcoin_alpha \
  --mode finetune \
  --train_file experiment-data/bitcoin_alpha-train-1.edgelist \
  --device cuda \
  --epochs 50 \
  --node_feat_dim 16 \
  --embed_dim 16 \
  --num_heads 4 \
  --num_layers 2 \
  --dropout_rate 0.1 \
  --lr 0.0005 \
  --weight_decay 0.0001 \
  --pretrain_path embeddings/bitcoin_alpha_ca_sdgnn_pretrained.pth \
  --finetune_path embeddings/bitcoin_alpha_ca_sdgnn_finetuned.pth \
  --output_dir output

^C


  signed_adj_matrix = torch.sparse.FloatTensor(indices, values, (num_nodes, num_nodes))

Fine-tuning Epochs:   0%|          | 0/50 [00:00<?, ?it/s]
Fine-tuning Epochs:   2%|▏         | 1/50 [00:36<29:55, 36.64s/it]
Fine-tuning Epochs:   4%|▍         | 2/50 [01:11<28:33, 35.70s/it]
Fine-tuning Epochs:   6%|▌         | 3/50 [01:45<27:13, 34.76s/it]
Fine-tuning Epochs:   8%|▊         | 4/50 [02:25<28:10, 36.76s/it]
Fine-tuning Epochs:  10%|█         | 5/50 [03:01<27:32, 36.73s/it]
Fine-tuning Epochs:  12%|█▏        | 6/50 [03:38<26:55, 36.72s/it]
Fine-tuning Epochs:  14%|█▍        | 7/50 [04:15<26:19, 36.73s/it]
Fine-tuning Epochs:  16%|█▌        | 8/50 [04:46<24:28, 34.96s/it]
Fine-tuning Epochs:  18%|█▊        | 9/50 [05:17<23:03, 33.75s/it]
Fine-tuning Epochs:  20%|██        | 10/50 [05:50<22:23, 33.58s/it]
Fine-tuning Epochs:  22%|██▏       | 11/50 [06:19<20:55, 32.19s/it]
Fine-tuning Epochs:  24%|██▍       | 12/50 [06:48<19:45, 31.19s/it]
Fine-tuning Epochs:  26%|██▌       | 13/50 [0

Using device: NVIDIA GeForce GTX 1650 with Max-Q Design (torch.cuda.is_available()=True)
[LOG] Parsing arguments...
[LOG] Setting random seeds...
[LOG] Loading training edge list...
[LOG] Number of nodes: 3650
[LOG] Creating adjacency lists...
[LOG] Computing centrality features and node sign influence...
[LOG] Creating adjacency matrices...
[LOG] Initializing node features...
[LOG] Initializing model...
[LOG] Setting up optimizer...
[LOG] Starting fine-tuning phase...
[LOG] Loading pretrained model from embeddings/bitcoin_alpha_ca_sdgnn_pretrained.pth...
[LOG] Loaded pretrained model from embeddings/bitcoin_alpha_ca_sdgnn_pretrained.pth
[LOG] Preparing edge list and labels for fine-tuning...
[LOG] Computing weight_dict for fine-tuning loss...
[LOG] Calling finetune()...
Epoch [1/50], Fine-tuning Loss: 25340.4004
Epoch [2/50], Fine-tuning Loss: 24021.7344
Epoch [3/50], Fine-tuning Loss: 22774.5430
Epoch [4/50], Fine-tuning Loss: 21233.8887
Epoch [5/50], Fine-tuning Loss: 19989.4844
Epo

## Validation and Testing

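After pretraining and fine-tuning, we evaluate the fine-tuned model on held-out edges. The first cell below runs `--mode infer` on the test edge list and prints the evaluation metrics; the second runs `--mode test` with the same files. A `--mode validate` example that takes a `--val_file` appears in the Colab section. Adjust the file paths and parameters as needed for your experiment.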

In [21]:
!python main.py \
  --dataset bitcoin_alpha \
  --mode infer \
  --train_file experiment-data/bitcoin_alpha-train-1.edgelist \
  --test_file experiment-data/bitcoin_alpha-test-1.edgelist \
  --device cuda \
  --node_feat_dim 16 \
  --embed_dim 16 \
  --num_heads 4 \
  --num_layers 2 \
  --dropout_rate 0.1 \
  --pretrain_path embeddings/bitcoin_alpha_ca_sdgnn_pretrained.pth \
  --finetune_path embeddings/bitcoin_alpha_ca_sdgnn_finetuned.pth \
  --output_dir output

Using device: NVIDIA GeForce GTX 1650 with Max-Q Design (torch.cuda.is_available()=True)
[LOG] Parsing arguments...
[LOG] Setting random seeds...
[LOG] Loading training edge list...
[LOG] Number of nodes: 3650
[LOG] Creating adjacency lists...
[LOG] Computing centrality features and node sign influence...
[LOG] Creating adjacency matrices...
[LOG] Initializing node features...
[LOG] Initializing model...
[LOG] Setting up optimizer...
[LOG] Starting inference phase...
[LOG] Loading fine-tuned model from embeddings/bitcoin_alpha_ca_sdgnn_finetuned.pth...
[LOG] Loaded fine-tuned model from embeddings/bitcoin_alpha_ca_sdgnn_finetuned.pth
[LOG] Loading and preparing test edge list...
[LOG] Running inference...
Optimal threshold: 0.10
Accuracy: 0.9394
Precision: 0.9394
Recall: 1.0000
Binary F1 Score: 0.9688
Micro F1 Score: 0.9394
Macro F1 Score: 0.4844
AUC: 0.5133
Predictions saved to output\predictions.txt
Evaluation Metrics: {'accuracy': 0.9394377842083506, 'precision': 0.9394377842083506,

  signed_adj_matrix = torch.sparse.FloatTensor(indices, values, (num_nodes, num_nodes))
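
The metrics above follow a recognizable pattern: precision equals accuracy, recall is exactly 1.0, and AUC is close to 0.5. This is consistent with a classifier that labels every test edge as positive, which a low threshold such as 0.10 can produce when most edges are positive. The sketch below is not part of the repository. It assumes a hypothetical all-positive predictor, takes the positive-edge fraction to be the reported accuracy, and derives the remaining metrics for comparison with the log.

```python
# Fraction of positive edges, assumed equal to the accuracy reported in the log.
p = 0.9394377842083506

# A predictor that labels every edge positive makes no negative predictions,
# so as fractions of the test set: TP = p, FP = 1 - p, TN = 0, FN = 0.
tp, fp, tn, fn = p, 1.0 - p, 0.0, 0.0

accuracy = tp + tn
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1_positive = 2 * precision * recall / (precision + recall)
f1_negative = 0.0  # the negative class is never predicted, so its F1 is 0
macro_f1 = (f1_positive + f1_negative) / 2

print(f"Accuracy:        {accuracy:.4f}")
print(f"Precision:       {precision:.4f}")
print(f"Recall:          {recall:.4f}")
print(f"Binary F1 Score: {f1_positive:.4f}")
print(f"Macro F1 Score:  {macro_f1:.4f}")
```

If the derived values match the log, the reported accuracy mostly reflects class imbalance rather than learned structure, and AUC and macro F1 are the more informative numbers for comparing runs.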


In [None]:
!python main.py \
  --dataset bitcoin_alpha \
  --mode test \
  --train_file experiment-data/bitcoin_alpha-train-1.edgelist \
  --test_file experiment-data/bitcoin_alpha-test-1.edgelist \
  --device cuda \
  --node_feat_dim 16 \
  --embed_dim 16 \
  --num_heads 4 \
  --num_layers 2 \
  --dropout_rate 0.1 \
  --pretrain_path embeddings/bitcoin_alpha_ca_sdgnn_pretrained.pth \
  --finetune_path embeddings/bitcoin_alpha_ca_sdgnn_finetuned.pth \
  --output_dir output

## Analyzing Results

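Check the `output` directory (the path passed via `--output_dir`) for logs, metrics, and model outputs; the inference run above reports saving its predictions to `predictions.txt` there. You can visualize or further process these results as needed.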
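
As one way to process the results further, the per-epoch loss lines printed during training can be parsed into numbers for plotting or convergence checks. This is a minimal sketch: the `parse_losses` helper is hypothetical, the regular expression assumes the `Epoch [i/N], <Phase> Loss: <value>` format seen in the cell outputs above, and the sample text is copied from the pretraining log.

```python
import re

# Matches lines such as "Epoch [3/500], Pretraining Loss: 29.4441".
LOSS_LINE = re.compile(r"Epoch \[(\d+)/(\d+)\], ([\w-]+) Loss: ([0-9.]+)")


def parse_losses(log_text):
    """Return a list of (epoch, loss) tuples found in the log text."""
    return [
        (int(m.group(1)), float(m.group(4)))
        for m in LOSS_LINE.finditer(log_text)
    ]


sample_log = """\
Epoch [1/500], Pretraining Loss: 31.4786
Epoch [2/500], Pretraining Loss: 30.4657
Epoch [3/500], Pretraining Loss: 29.4441
"""

losses = parse_losses(sample_log)
for epoch, loss in losses:
    print(epoch, loss)
```

The same pattern also matches the `Fine-tuning Loss` lines, so one parser can serve both phases when the captured output of each cell is processed separately.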

## Running in Google Colab

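If you are running this notebook in Google Colab, run the first cell below to mount Google Drive and switch to the project directory before running any `main.py` command. The remaining cells use paths relative to that directory (`experiment-data/`, `embeddings/`, `output`), so update `project_root` to match where the project lives in your Drive.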

In [None]:
from google.colab import drive
import os

drive.mount('/content/drive')

# Set your project root directory (update this path as needed)
project_root = '/content/drive/MyDrive/Capstone/casdgnn'
os.chdir(project_root)
print('Current working directory:', os.getcwd())
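
Because every command uses paths relative to the project root, a quick check after `os.chdir` can catch a wrong `project_root` before a long training run starts. The sketch below is an assumption about what is worth checking, not part of the repository: the `missing_paths` helper is hypothetical, and the file names are taken from the commands in this notebook.

```python
from pathlib import Path

# Inputs that the commands in this notebook expect relative to the project root.
REQUIRED = [
    "main.py",
    "experiment-data/bitcoin_alpha-train-1.edgelist",
    "experiment-data/bitcoin_alpha-test-1.edgelist",
]


def missing_paths(root, required=REQUIRED):
    """Return the required paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).exists()]


missing = missing_paths(".")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All required inputs found.")
```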

In [None]:
# Example: Pretraining in Colab
!python main.py \
  --dataset bitcoin_alpha \
  --mode pretrain \
  --train_file experiment-data/bitcoin_alpha-train-1.edgelist \
  --device cuda \
  --epochs 500 \
  --node_feat_dim 16 \
  --embed_dim 16 \
  --num_heads 4 \
  --num_layers 2 \
  --dropout_rate 0.1 \
  --lr 0.001 \
  --weight_decay 0.0001 \
  --pretrain_path embeddings/bitcoin_alpha_ca_sdgnn_pretrained.pth \
  --output_dir output

In [None]:
# Example: Fine-tuning in Colab
!python main.py \
  --dataset bitcoin_alpha \
  --mode finetune \
  --train_file experiment-data/bitcoin_alpha-train-1.edgelist \
  --device cuda \
  --epochs 50 \
  --node_feat_dim 16 \
  --embed_dim 16 \
  --num_heads 4 \
  --num_layers 2 \
  --dropout_rate 0.1 \
  --lr 0.0005 \
  --weight_decay 0.0001 \
  --pretrain_path embeddings/bitcoin_alpha_ca_sdgnn_pretrained.pth \
  --finetune_path embeddings/bitcoin_alpha_ca_sdgnn_finetuned.pth \
  --output_dir output

In [None]:
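# Example: Validation in Colab
# Note: --val_file points at the test split here; substitute a separate
# validation edge list if you have one, so the test set stays held out.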
!python main.py \
  --dataset bitcoin_alpha \
  --mode validate \
  --train_file experiment-data/bitcoin_alpha-train-1.edgelist \
  --val_file experiment-data/bitcoin_alpha-test-1.edgelist \
  --device cuda \
  --node_feat_dim 16 \
  --embed_dim 16 \
  --num_heads 4 \
  --num_layers 2 \
  --dropout_rate 0.1 \
  --pretrain_path embeddings/bitcoin_alpha_ca_sdgnn_pretrained.pth \
  --finetune_path embeddings/bitcoin_alpha_ca_sdgnn_finetuned.pth \
  --output_dir output

In [None]:
# Example: Testing in Colab
!python main.py \
  --dataset bitcoin_alpha \
  --mode test \
  --train_file experiment-data/bitcoin_alpha-train-1.edgelist \
  --test_file experiment-data/bitcoin_alpha-test-1.edgelist \
  --device cuda \
  --node_feat_dim 16 \
  --embed_dim 16 \
  --num_heads 4 \
  --num_layers 2 \
  --dropout_rate 0.1 \
  --pretrain_path embeddings/bitcoin_alpha_ca_sdgnn_pretrained.pth \
  --finetune_path embeddings/bitcoin_alpha_ca_sdgnn_finetuned.pth \
  --output_dir output