<a href="https://colab.research.google.com/github/MDIL-SNU/SevenNet/blob/tutorial/tutorial/SevenNet_simple_tutorial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# SevenNet tutorial: simple
---
This notebook is a simple tutorial for using SevenNet with the CLI.

We recommend using a GPU. If you're using Colab, go to:\
[Runtime] → [Change runtime type] → [T4 GPU] → [Save]\
A CPU also works, but training is very slow.

If you're using Colab, it may crash occasionally. If it does, try starting from the beginning or from the cell where it crashed. If that doesn’t work, go to: \
[Runtime] → [Disconnect and delete runtime] → restart the tutorial

If you want to learn more about SevenNet, please check the link below:\
[paper](https://pubs.acs.org/doi/10.1021/acs.jctc.4c00190)\
[code](https://github.com/MDIL-SNU/SevenNet) 

## 0. Installation
First, let's install SevenNet.

In [1]:
# Install SevenNet
!pip install sevenn

# If you want to check the SevenNet code, clone the repo.
# !git clone https://github.com/MDIL-SNU/SevenNet.git 

import os
working_dir = os.getcwd() # save current path



In [2]:
# Check that sevenn is installed correctly
!sevenn -h

usage: sevenn [-h] [-m {train_v1,train_v2}] [-w [WORKING_DIR]] [-l LOG] [-s]
              [-d] [--distributed_backend {nccl,mpi}]
              input_yaml

sevenn version=0.10.0, train model based on the input.yaml

positional arguments:
  input_yaml            input.yaml for training

optional arguments:
  -h, --help            show this help message and exit
  -m {train_v1,train_v2}, --mode {train_v1,train_v2}
                        main training script to run. Default is train.
  -w [WORKING_DIR], --working_dir [WORKING_DIR]
                        path to write output. Default is cwd.
  -l LOG, --log LOG     name of logfile, default is log.sevenn
  -s, --screen          print log to stdout
  -d, --distributed     set this flag if it is distributed training
  --distributed_backend {nccl,mpi}
                        backend for distributed training. Supported: nccl, mpi
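
The same check can also be made from Python. The following is a minimal sketch, assuming the package is distributed under the name `sevenn` (as in the `pip install` command above) and that PyTorch is importable as `torch`. The helper names `installed_version` and `pick_device` are hypothetical and not part of SevenNet.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def pick_device():
    """Return 'cuda' if PyTorch reports a usable GPU, otherwise 'cpu'."""
    try:
        import torch  # assumption: PyTorch is present in this environment
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print("sevenn version:", installed_version("sevenn"))
print("suggested device:", pick_device())
```

The string returned by `pick_device` corresponds to the `device` field of the training configuration in section 2.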


## 1. Dataset


In [3]:
# download the dataset
!pip install gdown
!gdown https://drive.google.com/uc?id=1TZPJzJaaBPZIiD5gaHC7E0oDngN14rJg



In [None]:
!unzip seven_tuto_data.zip

In [None]:
# Write the structure_list file: each [label] header is followed by an OUTCAR path and a frame slice.
# To use more training data, reduce the slice step, e.g. 150:1000:1.

with open('structure_list', 'w') as f:
    f.write('[MD_1200K]\n')
    f.write(f'{working_dir}/train/1200K/OUTCAR 150:1000:10')
    f.write('\n')
    f.write('[MD_600K]\n')
    f.write(f'{working_dir}/train/600K/OUTCAR 150:1000:10')
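
To see how the slice affects the amount of training data, the sketch below counts the frames selected by `150:1000:10` and by `150:1000:1`. It assumes the slice follows Python `start:stop:step` semantics and, purely for illustration, that each OUTCAR holds 1000 frames. `count_frames` is a hypothetical helper, not a SevenNet function.

```python
def count_frames(slice_text, n_frames):
    """Count the frames picked by a 'start:stop:step' slice from n_frames frames."""
    parts = [int(p) if p else None for p in slice_text.split(":")]
    return len(range(n_frames)[slice(*parts)])


# Assumption for illustration: the trajectory contains 1000 frames.
for text in ("150:1000:10", "150:1000:1"):
    print(text, "->", count_frames(text, 1000), "frames")
```

Under these assumptions, a step of 10 selects 85 frames per OUTCAR, while a step of 1 selects 850.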


## 2. Training with the CLI

In [6]:
%%writefile input.yaml  
model:
  chemical_species: auto
  cutoff: 5.0

train:
  device: cuda   # use 'cpu' if no GPU is available
  is_train_stress: False
  epoch: 100
  optim_param:
    lr: 0.005
  scheduler_param:
    gamma: 0.99

data:
  batch_size: 10
  load_dataset_path: './structure_list' # path to the training data list


Overwriting input.yaml
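
To get a feel for the `lr` and `gamma` values, the sketch below computes the learning rate at a few epochs. It assumes the scheduler multiplies the rate by `gamma` once per epoch (exponential decay); this tutorial does not confirm which scheduler SevenNet applies by default, so treat the rule as an assumption and check the SevenNet documentation. `decayed_lr` is a hypothetical helper.

```python
def decayed_lr(lr0, gamma, epoch):
    """Learning rate after `epoch` decay steps, assuming lr = lr0 * gamma**epoch."""
    return lr0 * gamma ** epoch


# lr and gamma are taken from input.yaml; the decay rule itself is an assumption.
for epoch in (0, 50, 100):
    print(f"epoch {epoch:3d}: lr = {decayed_lr(0.005, 0.99, epoch):.5f}")
```

Under this assumption, the rate falls to roughly 37% of its initial value after 100 epochs.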


Let's start the first training run! It takes about 2 minutes to initialize and 6 minutes to train for 100 epochs.
To finish sooner, reduce `epoch` in input.yaml.

In [7]:
!sevenn input.yaml -s

SevenNet: Scalable EquVariance-Enabled Neural Network
version 0.10.0, Wed Nov  6 00:20:13 2024
this file: /home/hexagonrose/SevenNet/tutorial/log.sevenn
reading yaml config...
                ****
              ********                                   .
              *//////,  ..                               .            ,*.
               ,,***.         ..                        ,          ********.                                  ./,
             .      .                ..   /////.       .,       . */////////                               /////////.
        .&@&/        .                  .(((((((..     /           *//////*.  ...                         *((((((((((.
     @@@@@@@@@@*    @@@@@@@@@@  @@@@@    *((@@@@@     (     %@@@@@@@@@@  .@@@@@@     ..@@@@.   @@@@@@*    .(@@@@@(((*
    @@@@@.          @@@@         @@@@@ .   @@@@@      #     %@@@@         @@@@@@@@     @@@@(,  @@@@@@@@.    @@@@@(*.
    %@@@@@@@&       @@@@@@@@@@    @@@@@   @@@@@      #  ., .%@@@@@@@@@    @@@@@@@@@@

The best model is saved as 'checkpoint_best.pth'. You can also check the training log in log.sevenn.
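
One way to inspect the end of the training log without leaving the notebook is sketched below. It uses only the Python standard library and makes no assumption about the log's format. `tail` is a hypothetical helper, and the file name `log.sevenn` is the default mentioned in the CLI help.

```python
from collections import deque
from pathlib import Path


def tail(path, n=20):
    """Return the last n lines of a text file as a list of strings."""
    with open(path, "r", errors="replace") as f:
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]


log_path = Path("log.sevenn")  # default log file name
if log_path.exists():
    print("\n".join(tail(log_path)))
else:
    print(f"{log_path} not found; run the training cell first")
```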

In [8]:
!ls

SevenNet_python_tutorial.ipynb	checkpoint_45.pth  checkpoint_95.pth
SevenNet_simple_tutorial.ipynb	checkpoint_5.pth   evaluation
checkpoint			checkpoint_50.pth  input.yaml
checkpoint_10.pth		checkpoint_55.pth  lc.csv
checkpoint_100.pth		checkpoint_60.pth  log.sevenn
checkpoint_15.pth		checkpoint_65.pth  seven_tuto_data.zip
checkpoint_20.pth		checkpoint_70.pth  sevenn_data
checkpoint_25.pth		checkpoint_75.pth  structure_list
checkpoint_30.pth		checkpoint_80.pth  train
checkpoint_35.pth		checkpoint_85.pth
checkpoint_40.pth		checkpoint_90.pth
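
The listing is sorted alphabetically, so `checkpoint_100.pth` appears before `checkpoint_15.pth`. The sketch below picks the numbered checkpoint with the highest epoch instead. It assumes files follow the `checkpoint_<epoch>.pth` pattern seen in the listing; `latest_checkpoint` is a hypothetical helper, not part of SevenNet.

```python
import re
from pathlib import Path

CHECKPOINT_RE = re.compile(r"checkpoint_(\d+)\.pth$")


def latest_checkpoint(names):
    """Return the name with the largest epoch number, or None if none match."""
    numbered = []
    for name in names:
        match = CHECKPOINT_RE.search(str(name))
        if match:
            numbered.append((int(match.group(1)), str(name)))
    return max(numbered)[1] if numbered else None


# Look in the current directory, as in the `ls` cell above.
print(latest_checkpoint(p.name for p in Path(".").iterdir()))
```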


## 3. Model test
LPSC relax?