# Bulk submission of VASP jobs to Laikapack

This notebook runs one VASP calculation for each subdirectory of a parent directory. The expected file structure is:

```
path
├─ calculation_1
│  ├─ INCAR
│  ├─ POSCAR
│  ├─ POTCAR
│  └─ KPOINTS
├─ calculation_2
│  ├─ INCAR
│  ├─ POSCAR
│  ├─ POTCAR
│  └─ KPOINTS
└─ ...
```

I usually generate these files using ASE directly on Laikapack. This has to be done on one of the Docker images that includes VASP in order to properly set up the POTCAR (for most functionals). Alternatively, you can upload these files. As long as they are good VASP inputs, this script will do the rest of the work.
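
Before submitting, it can help to confirm that every calculation directory holds the four input files shown in the layout above. The following is a minimal sketch of such a check; `find_incomplete_dirs` and `REQUIRED_INPUTS` are hypothetical names introduced here, not part of the submission script, and the check only tests that the files exist, not that their contents are valid.

```python
from pathlib import Path

# The four input files listed in the directory layout above
REQUIRED_INPUTS = ("INCAR", "POSCAR", "POTCAR", "KPOINTS")


def find_incomplete_dirs(parent):
    """Map each calculation directory under `parent` to its missing input files.

    Directories that contain all four files are left out of the result.
    """
    incomplete = {}
    for calc_dir in sorted(Path(parent).iterdir()):
        if not calc_dir.is_dir():
            continue  # ignore stray files next to the calculation directories
        missing = [name for name in REQUIRED_INPUTS if not (calc_dir / name).is_file()]
        if missing:
            incomplete[calc_dir.name] = missing
    return incomplete
```

An empty dictionary means every directory has all four files; otherwise the keys name the directories to fix before submitting.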

In [1]:
import os
import yaml
from tqdm import tqdm
import pickle
import pandas as pd
import re
import numpy as np
from os import path as os_path

## The script

In [2]:
def run_VASP_batch(
    path,
    base_params,
    exp_tag,
    cpu_cores = 4,
    memory = '8Gi',
    select_only = None,
):
    '''
    Submit one Kubernetes job per VASP directory found in `path`.

    args:
        path (str): the directory that contains your VASP directories
        base_params (dict): job parameters loaded from your YAML template file
        exp_tag (str): a string to identify this experiment/run/batch
        cpu_cores (int): how many CPU cores for each individual VASP job
        memory (str): how much memory for each individual VASP job, e.g. '8Gi'
        select_only (list of str): if given, submit only these subdirectories
            of `path` instead of everything in it

    If you submit more than your quota on Laikapack
    (e.g. 100 8-core calculations), your quota will be filled to capacity
    and the remainder of the jobs will wait for resources.
    This scheduling is handled by Kubernetes/Laikapack and is outside
    the scope of this script.
    '''
    
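    if select_only is not None:
        paths = select_only
    else:
        paths = os.listdir(path)

    # Deep-copy the template: a shallow .copy() shares the nested dicts, so the
    # edits below would leak into base_params and break a second call.
    import copy  # standard library
    params = copy.deepcopy(base_params)
    params['metadata']['namespace'] = 'kabdelma'
    # Kubernetes expects a list of containers; the template stores a single mapping
    pod_spec = params['spec']['template']['spec']
    if not isinstance(pod_spec['containers'], list):
        pod_spec['containers'] = [pod_spec['containers']]
    container = pod_spec['containers'][0]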
    
    # Set the resource usage, which is assumed to be the same for all jobs
    if cpu_cores is not None:
        container['resources']['limits']['cpu'] = cpu_cores
        container['resources']['requests']['cpu'] = cpu_cores
        # Keep the MPI process count in the launch command in sync with the CPU request
        container['args'][0] = re.sub(r'-np \d+', '-np ' + str(cpu_cores), container['args'][0])

    if memory is not None:
        container['resources']['limits']['memory'] = memory
        container['resources']['requests']['memory'] = memory

    for directory in tqdm(paths):
        # Only submit directories that have no OUTCAR yet, i.e. have not been run already
        if not os_path.exists(os_path.join(path, directory, 'OUTCAR')):
            # Set the necessary parameters for each job.
            # Kubernetes names cannot contain underscores, so replace them with hyphens.
            job_name = exp_tag + '-' + directory.replace('_', '-')
            params['metadata']['name'] = job_name
            container['workingDir'] = os_path.join(path, directory)
            container['name'] = job_name

            # Write the job specification file
            with open('job.yaml', 'w') as config_file:
                yaml.dump(params, config_file, default_flow_style = False)

            # Submit the job
            os.system('kubectl apply -f job.yaml > /dev/null')
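
The OUTCAR check in `run_VASP_batch` skips any directory that already contains an OUTCAR, whether or not that calculation finished. The sketch below is one possible way to tell the cases apart. It assumes that VASP creates OUTCAR soon after starting and that a normally finished run ends with a timing summary containing the phrase "General timing and accounting"; both are assumptions about VASP's output that you should check against your own files. `calculation_status` is a hypothetical helper, not part of the script above.

```python
import os

# Assumption: a VASP run that finished normally writes a timing summary
# containing this phrase near the end of OUTCAR.
FINISHED_MARKER = "General timing and accounting"


def calculation_status(calc_dir, tail_bytes=20000):
    """Classify a directory as 'not started', 'incomplete', or 'finished'."""
    outcar = os.path.join(calc_dir, "OUTCAR")
    if not os.path.exists(outcar):
        return "not started"
    with open(outcar, "rb") as handle:
        # Only read the end of the file; OUTCAR can be large
        handle.seek(0, os.SEEK_END)
        size = handle.tell()
        handle.seek(max(0, size - tail_bytes))
        tail = handle.read().decode("utf-8", errors="replace")
    return "finished" if FINISHED_MARKER in tail else "incomplete"
```

Note that "incomplete" covers both crashed jobs and jobs that are still running, so it is not safe to resubmit those directories without checking the cluster first.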

## Running some jobs

1. Specify a Kubernetes template file. If you don't have one, you can copy and modify the one at `/shared-scratch/ethan/vasp-job-submit/base.yaml`

In [3]:
template_file = 'base.yaml'

Verify the parameters obtained from your template file:

In [4]:
with open(template_file, 'r') as stream:
    base_params = yaml.safe_load(stream)
params = base_params.copy()
params

{'apiVersion': 'batch/v1',
 'kind': 'Job',
 'metadata': {'name': 'CHANGE_THIS', 'namespace': 'kabdelma'},
 'spec': {'ttlSecondsAfterFinished': 7200,
  'backoffLimit': 0,
  'template': {'spec': {'affinity': {'nodeAffinity': {'requiredDuringSchedulingIgnoredDuringExecution': {'nodeSelectorTerms': [{'matchExpressions': [{'key': 'kubernetes.io/hostname',
           'operator': 'NotIn',
           'values': ['major-gator']}]}]}}},
    'containers': {'name': 'CHANGE_THIS',
     'image': 'ulissigroup/kubeflow_vasp:amptorch',
     'imagePullPolicy': 'Always',
     'resources': {'limits': {'cpu': 16,
       'memory': '16Gi',
       'nvidia.com/gpu': '0'},
      'requests': {'cpu': 16, 'memory': '16Gi'}},
     'volumeMounts': [{'mountPath': '/home/jovyan/shared-datasets/',
       'name': 'shared-datasets'},
      {'mountPath': '/home/jovyan/shared-scratch/', 'name': 'shared-scratch'},
      {'mountPath': '/dev/shm', 'name': 'dshm'},
      {'mountPath': '/home/jovyan/', 'name': 'workspace-cop-mod

In [5]:
base_params

{'apiVersion': 'batch/v1',
 'kind': 'Job',
 'metadata': {'name': 'CHANGE_THIS', 'namespace': 'kabdelma'},
 'spec': {'ttlSecondsAfterFinished': 7200,
  'backoffLimit': 0,
  'template': {'spec': {'affinity': {'nodeAffinity': {'requiredDuringSchedulingIgnoredDuringExecution': {'nodeSelectorTerms': [{'matchExpressions': [{'key': 'kubernetes.io/hostname',
           'operator': 'NotIn',
           'values': ['major-gator']}]}]}}},
    'containers': {'name': 'CHANGE_THIS',
     'image': 'ulissigroup/kubeflow_vasp:amptorch',
     'imagePullPolicy': 'Always',
     'resources': {'limits': {'cpu': 16,
       'memory': '16Gi',
       'nvidia.com/gpu': '0'},
      'requests': {'cpu': 16, 'memory': '16Gi'}},
     'volumeMounts': [{'mountPath': '/home/jovyan/shared-datasets/',
       'name': 'shared-datasets'},
      {'mountPath': '/home/jovyan/shared-scratch/', 'name': 'shared-scratch'},
      {'mountPath': '/dev/shm', 'name': 'dshm'},
      {'mountPath': '/home/jovyan/', 'name': 'workspace-cop-mod

In [6]:
base_params['metadata']['namespace'] = 'kabdelma'
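
Reading the printed dictionary by eye is error-prone, so the verification step can also be done programmatically. The sketch below checks for the entries that `run_VASP_batch` modifies (`metadata`, the container's `resources.limits` and `resources.requests`, and `args`). `get_nested` and `missing_template_keys` are hypothetical helpers introduced for illustration, and the list of checked keys is inferred from the script above rather than from any Kubernetes schema.

```python
def get_nested(mapping, keys):
    """Follow `keys` through nested dicts; return None if any key is absent."""
    node = mapping
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def missing_template_keys(params):
    """List the entries that run_VASP_batch modifies but the template lacks."""
    missing = []
    if get_nested(params, ["metadata"]) is None:
        missing.append("metadata")
    containers = get_nested(params, ["spec", "template", "spec", "containers"])
    if not containers:
        missing.append("spec.template.spec.containers")
        return missing
    # The template may hold a single container mapping or a list of them
    container = containers[0] if isinstance(containers, list) else containers
    for keys in (["resources", "limits"], ["resources", "requests"], ["args"]):
        if get_nested(container, keys) is None:
            missing.append("containers." + ".".join(keys))
    return missing
```

Calling `missing_template_keys(base_params)` should return an empty list for a usable template.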

2. Set up metadata for your jobs:

In [7]:
# The path that contains your VASP directories
# e.g. '/home/jovyan/shared-scratch/ethan/experiment_1/'
path = '/home/jovyan/shared-scratch/kabdelma/high_miller_idx/vasp/slabs'

# A string to identify your experiment. No underscores or capital letters!
# e.g. 'ems-vasp-exp1'
exp_tag = 'o2'
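
The restriction on underscores and capital letters exists because the job name is built from `exp_tag` and the directory name, and Kubernetes only accepts lowercase letters, digits, and hyphens there. `run_VASP_batch` replaces underscores in directory names but nothing else. As a sketch of a stricter approach, the hypothetical helper `to_k8s_name` below normalizes both parts; the 63-character default follows the Kubernetes DNS label rules and is an assumption about what your cluster enforces.

```python
import re


def to_k8s_name(exp_tag, directory, max_length=63):
    """Build a job name limited to lowercase letters, digits, and hyphens."""
    raw = (exp_tag + "-" + directory).lower()
    # Replace every run of disallowed characters with a single hyphen
    name = re.sub(r"[^a-z0-9]+", "-", raw)
    # Names must start and end with an alphanumeric character
    return name[:max_length].strip("-")
```

Two different directory names can map to the same job name after this normalization (for example `Pt_111` and `pt-111`), so distinct lowercase directory names remain the safer choice.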

3. Run the script to create and submit the jobs:

In [8]:
run_VASP_batch(
    path = path,
    base_params = base_params,
    exp_tag = exp_tag,
    cpu_cores = 16,
    memory = '32Gi'
)

100%|██████████| 36/36 [00:07<00:00,  4.57it/s]
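
The progress bar only shows that the submission commands were issued, not that the jobs ran. One way to follow up is to parse the output of `kubectl get jobs -o json`. The sketch below is a rough illustration: `summarize_jobs` and `fetch_jobs` are hypothetical helpers, and the interpretation of the `succeeded`, `failed`, and `active` counters in each job's `status` is an assumption you should confirm on your cluster.

```python
import json
import subprocess
from collections import Counter


def summarize_jobs(job_list, exp_tag):
    """Count jobs whose name starts with `exp_tag-` by coarse state."""
    counts = Counter()
    for job in job_list.get("items", []):
        if not job["metadata"]["name"].startswith(exp_tag + "-"):
            continue
        status = job.get("status", {})
        if status.get("succeeded"):
            counts["succeeded"] += 1
        elif status.get("failed"):
            counts["failed"] += 1
        elif status.get("active"):
            # Assumption: "active" covers pods that are pending or running
            counts["active"] += 1
        else:
            counts["no_pods"] += 1
    return dict(counts)


def fetch_jobs(namespace):
    """Return the parsed output of `kubectl get jobs -o json` for a namespace."""
    result = subprocess.run(
        ["kubectl", "get", "jobs", "-n", namespace, "-o", "json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)
```

For the run above this would be called as `summarize_jobs(fetch_jobs('kabdelma'), exp_tag)`. Because the template sets `ttlSecondsAfterFinished` to 7200, finished jobs disappear from this listing about two hours after they complete.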
