## Steps to retrain MaSIF-Neosurf

### 1 - Run masif_ppi_search inference to generate descriptors for training

Go to the masif_ppi_search folder:

`cd masif/data/masif_ppi_search`

1. Prepare training data:

`sbatch data_prepare.slurm`

2. Compute descriptors:

Run inference with masif_ppi_search. The resulting descriptors are the input to the masif_seed_search neural network:

`sbatch compute_descriptors.slurm`

Expected output per protein complex:

```
descriptors/sc05/all_feat/PDBID_CHAIN1_CHAIN2/
├── p1_desc_straight.npy     # Chain 1 descriptors (normal)
├── p1_desc_flipped.npy      # Chain 1 descriptors (flipped for complementarity)
├── p2_desc_straight.npy     # Chain 2 descriptors (normal)
└── p2_desc_flipped.npy      # Chain 2 descriptors (flipped)
```
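Before moving on to training, it can help to confirm that every complex directory actually contains all four files. The following is a minimal sketch, assuming the directory layout shown above; the helper name `find_incomplete_complexes` and the root path used in the `__main__` block are illustrative and not part of MaSIF-Neosurf.

```python
from pathlib import Path

# File names taken from the expected output listing above.
EXPECTED_FILES = (
    "p1_desc_straight.npy",
    "p1_desc_flipped.npy",
    "p2_desc_straight.npy",
    "p2_desc_flipped.npy",
)


def find_incomplete_complexes(root):
    """Map each complex directory name to the descriptor files it is missing.

    Hypothetical helper: `root` is assumed to hold one
    PDBID_CHAIN1_CHAIN2 subdirectory per complex.
    """
    incomplete = {}
    for complex_dir in sorted(Path(root).iterdir()):
        if not complex_dir.is_dir():
            continue
        missing = [name for name in EXPECTED_FILES
                   if not (complex_dir / name).is_file()]
        if missing:
            incomplete[complex_dir.name] = missing
    return incomplete


if __name__ == "__main__":
    # Assumed location, relative to the current working directory.
    root = Path("descriptors/sc05/all_feat")
    if root.is_dir():
        for name, missing in find_incomplete_complexes(root).items():
            print(f"{name}: missing {', '.join(missing)}")
    else:
        print(f"{root} not found")
```

An empty result means every complex has its straight and flipped descriptors for both chains. The check only looks at file presence, not at array contents.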

### 2 - Train masif_seed_search

Go to the masif_seed_search folder:

`cd masif_seed_search/data/scoring_nn`

1. Generate alignment training data:

`sbatch make_transformations_12A.slurm`

2. Precompute features:

`sbatch prepare_features_12A.slurm`

3. Train the neural network:

`sbatch train_nn.slurm`
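The three jobs above have to run in order, so submitting them by hand means waiting for each one to finish. One possible way to queue them all at once is Slurm's job dependencies. The sketch below assumes that each `.slurm` script submits a single job (or job array) and that `sbatch` is on the `PATH`; the helper names `build_sbatch_command` and `submit_chain` are illustrative and not part of the repository.

```python
import subprocess

# Script names from the three steps above, in the order they must run.
SCRIPTS = (
    "make_transformations_12A.slurm",
    "prepare_features_12A.slurm",
    "train_nn.slurm",
)


def build_sbatch_command(script, after_job_id=None):
    """Build an sbatch command that optionally waits for an earlier job."""
    # --parsable makes sbatch print only the job ID (optionally ";cluster").
    cmd = ["sbatch", "--parsable"]
    if after_job_id is not None:
        # afterok: start only once the earlier job finished with exit code 0.
        cmd.append(f"--dependency=afterok:{after_job_id}")
    cmd.append(script)
    return cmd


def submit_chain(scripts=SCRIPTS):
    """Submit the scripts in order, each depending on the previous one."""
    job_ids = []
    previous = None
    for script in scripts:
        result = subprocess.run(
            build_sbatch_command(script, previous),
            check=True, capture_output=True, text=True,
        )
        previous = result.stdout.strip().split(";")[0]
        job_ids.append(previous)
    return job_ids
```

Calling `submit_chain()` from `masif_seed_search/data/scoring_nn` would queue all three steps; if an earlier job fails, the later ones never start, since their `afterok` dependency cannot be satisfied.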


Overview of the full pipeline:

```
┌─────────────────────────────────────────────────────────┐
│ STEP 1: Data Preparation (masif/data/masif_ppi_search) │
└────────────────────┬────────────────────────────────────┘
                     │
                     ▼
        sbatch data_prepare.slurm
                     │
         ┌───────────┴────────────┐
         │ data_prepare_one.sh    │
         │ - Download PDB         │
         │ - Triangulate surface  │
         │ - Precompute features  │
         └───────────┬────────────┘
                     │
                     ▼
        OUTPUTS: .ply, .pdb, precomputation/*.npy

┌─────────────────────────────────────────────────────────┐
│ STEP 2: Compute Descriptors (masif/data/masif_ppi_search)│
└────────────────────┬────────────────────────────────────┘
                     │
                     ▼
     sbatch compute_descriptors.slurm
                     │
         ┌───────────┴────────────────┐
         │ compute_descriptors.sh     │
         │                            │
         │ masif_ppi_search_comp_desc.py
         │ - Load trained MaSIF model │
         │ - Run inference            │
         │ - Generate descriptors     │
         └───────────┬────────────────┘
                     │
                     ▼
        OUTPUTS: descriptors/*/p1_desc_*.npy
                descriptors/*/p2_desc_*.npy

┌─────────────────────────────────────────────────────────┐
│ STEP 3-5: Train masif_seed_search (masif_seed_search)  │
└────────────────────┬────────────────────────────────────┘
                     │
         ┌───────────┴────────────┐
         │ Step 3: Generate data  │
         │ make_transformations_  │
         │   12A.slurm            │
         └───────────┬────────────┘
                     │
         ┌───────────┴────────────┐
         │ Step 4: Extract feats  │
         │ prepare_features_      │
         │   12A.slurm            │
         └───────────┬────────────┘
                     │
         ┌───────────┴────────────┐
         │ Step 5: Train NN       │
         │ train_nn.slurm         │
         └───────────┬────────────┘
                     │
                     ▼
        OUTPUT: models/weights_12A_0123.*
```