# Inference demo

This notebook illustrates two ways of performing inference:
- in a single Jupyter Notebook, with resources on a single node
- via a Slurm command, with GPUs from multiple nodes

Examples for OG, DA, GA, PA and PC are included.

# 1. Sampling

Problem-specific inputs:
- path to folder containing the training run
- name of the checkpoint to sample from
- a config file in the `sampling_config` folder, which specifies the configuration to use for sampling.

Note that `batch_size` is the number of MCMC chains run in parallel during sampling.
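
To make the role of `batch_size` concrete, here is a minimal sketch of independent Metropolis chains advanced together, written in plain Python rather than with the repository's sampler. The target density (a standard normal), the step size and the function name `metropolis_step` are assumptions for illustration only; the walkers here are scalars, whereas the actual sampler moves electron configurations.

```python
import math
import random


def metropolis_step(positions, log_prob, step_size, rng):
    """Advance every chain by one Metropolis step and return the new positions."""
    updated = []
    for x in positions:
        proposal = x + rng.gauss(0.0, step_size)
        # Accept with probability min(1, p(proposal) / p(x)).
        accept_prob = math.exp(min(0.0, log_prob(proposal) - log_prob(x)))
        updated.append(proposal if rng.random() < accept_prob else x)
    return updated


def log_standard_normal(x):
    """Unnormalised log-density of a standard normal (stand-in target)."""
    return -0.5 * x * x


rng = random.Random(0)
batch_size = 1000  # number of chains, one walker per chain (illustrative value)
chains = [rng.uniform(-1.0, 1.0) for _ in range(batch_size)]

for _ in range(200):
    chains = metropolis_step(chains, log_standard_normal, 1.0, rng)

# Every step yields one new position per chain, i.e. batch_size positions in total.
print(len(chains))
```

In this picture a larger `batch_size` produces more positions per MCMC step, at the cost of more memory per step.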

The config object contains the full specification of the physical system and the training setup. As detailed in `mpatch_load_cfg` in `utils/loader.py`, the training config is initialized from `utils/base_config.py` and then updated, in order, by the user-specified `config`, `optim_config` and finally `sampling_config`.
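
The layering order described above can be pictured with nested dictionaries. This is only a sketch: the real logic lives in `mpatch_load_cfg`, whose implementation is not reproduced here, and the helper `merge_config` as well as all keys and values below are hypothetical.

```python
import copy


def merge_config(base, override):
    """Return a copy of `base` with the values from `override` applied recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical stand-ins for the four config sources, applied in order.
base_config = {"batch_size": 4096, "mcmc": {"steps": 10, "move_width": 0.02}}
config = {"system": "graphene"}
optim_config = {"optim": {"lr": 1e-3}}
sampling_config = {"batch_size": 1000, "mcmc": {"steps": 30000}}

cfg = base_config
for layer in (config, optim_config, sampling_config):
    cfg = merge_config(cfg, layer)

print(cfg["batch_size"], cfg["mcmc"])
```

Because `sampling_config` is applied last, its values take precedence over anything set by the earlier layers, while keys it does not mention keep their earlier values.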

## 1.1 Single node

In [None]:
from utils.sampler import DeepSolidSampler

sampler = DeepSolidSampler(
    log_dir='_log_graphene_OG_test/',
    sampling_cfg_str='OG_batch1000_mcmc3e4.py',
    libcu_lib_path='/opt/conda/envs/deepsolid/lib/',
    ckpt_restore_filename='qmcjax_ckpt_000000_process0.npz',
    x64=True
)

sampler.draw_samples(required_samples=2000, save_freq=1)

In [None]:
from utils.sampler import DeepSolidSampler

sampler = DeepSolidSampler(
    log_dir='_log_graphene_DA_test/',
    sampling_cfg_str='PA_batch1000_mcmc3e4.py',
    libcu_lib_path='/opt/conda/envs/deepsolid/lib/',
    ckpt_restore_filename='qmcjax_ckpt_000000_process0.npz',
    x64=True
)

sampler.draw_samples(required_samples=2000, save_freq=1)

In [None]:
from utils.sampler import DeepSolidSampler

sampler = DeepSolidSampler(
    log_dir='_log_graphene_GA_test/',
    sampling_cfg_str='PA_batch1000_mcmc3e4.py',
    libcu_lib_path='/opt/conda/envs/deepsolid/lib/',
    ckpt_restore_filename='qmcjax_ckpt_000000_process0.npz',
    x64=True
)

sampler.draw_samples(required_samples=2000, save_freq=1)

In [None]:
# Before running this cell, copy the contents of the OG training folder
# into a new folder named _log_graphene_PA_test.

from utils.sampler import DeepSolidSampler

sampler = DeepSolidSampler(
    log_dir='_log_graphene_PA_test/',
    sampling_cfg_str='PA_batch1000_mcmc3e4.py',
    libcu_lib_path='/opt/conda/envs/deepsolid/lib/',
    ckpt_restore_filename='qmcjax_ckpt_000000_process0.npz',
    x64=True
)

sampler.draw_samples(required_samples=2000, save_freq=1)

In [None]:
# Before running this cell, copy the contents of the OG training folder
# into a new folder named _log_graphene_PC_test.

from utils.sampler import DeepSolidSampler

sampler = DeepSolidSampler(
    log_dir='_log_graphene_PC_test/',
    sampling_cfg_str='PC_batch20_mcmc3e4.py',
    libcu_lib_path='/opt/conda/envs/deepsolid/lib/',
    ckpt_restore_filename='qmcjax_ckpt_000000_process0.npz',
    x64=True
)

sampler.draw_samples(required_samples=2000, save_freq=1)

## 1.2 Multiple nodes using a Singularity container

Some comments on the commands below:
- <span style='color:red'>**IMPORTANT**</span>: Use `export SCRATCH=/YOUR/SCRATCH/FOLDER` first to specify the folder containing your singularity image.
- `SINGULARITY_CMD` holds the `singularity exec` prefix that runs the quoted command inside the container.
- Use `./slurm_dist.sh --help` to see the Slurm options.
- Use `python sampling.py --help` to see the sampling script options.
- Certain flags need to be set according to your Slurm setup, e.g. `-A`, `--partition`, `--mail-user`.
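
The commands in the cells below differ only in the port, the log folder, the job name and the sampling config. As a convenience, a small Python helper can assemble the string. The function name `build_slurm_command` and its parameters are hypothetical, and the helper only reproduces the flags that already appear in this notebook; the placeholders `YOUR_ACCOUNT`, `YOUR_PARTITION` and `YOUR_EMAIL` still need to be replaced by hand.

```python
def build_slurm_command(tag, sampling_cfg, port, num_nodes=2):
    """Assemble the multi-node sampling command for one training run.

    `tag` is the run label used in the log folder and job name, e.g. "OG".
    """
    singularity = (
        'export SINGULARITY_CMD="singularity exec --no-home --nv '
        "--bind .:/home/invariant-schrodinger --pwd /home/invariant-schrodinger "
        '$SCRATCH/inv-ds.sif /bin/bash -c "'
    )
    python_cmd = (
        "source /opt/conda/bin/activate deepsolid && python sampling.py --dist "
        "--ckpt_restore_filename=qmcjax_ckpt_000000_process0.npz --x64 "
        "--required_samples=2000 --save_freq=1 "
        f"--sampling_cfg={sampling_cfg} "
        "--libcu_lib_path=/opt/conda/envs/deepsolid/lib/"
    )
    slurm = (
        f"./slurm_dist.sh --mem=10G --num-nodes={num_nodes} --port={port} "
        '--timeout=1000 -A YOUR_ACCOUNT --partition="YOUR_PARTITION" '
        '--gres="gpu:1" '
        '--extra="-t 2-00:00:00 --mail-type=END,FAIL --mail-user=YOUR_EMAIL" '
        f"--log='_log_graphene_{tag}_test_multi' --name=\"{tag}graphene\" "
        f"--py-cmd=\"$SINGULARITY_CMD '{python_cmd}'\""
    )
    return f"{singularity} && {slurm}"


# Build the command for the OG run; paste the printed string into a shell.
cmd = build_slurm_command("OG", "OG_batch1000_mcmc3e4.py", port=8001)
print(cmd)
```

Giving each job its own port, as the cells below do, keeps jobs submitted at the same time from sharing a port.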

In [None]:
export SINGULARITY_CMD="singularity exec --no-home --nv --bind .:/home/invariant-schrodinger --pwd /home/invariant-schrodinger $SCRATCH/inv-ds.sif /bin/bash -c " && ./slurm_dist.sh --mem=10G --num-nodes=2 --port=8001 --timeout=1000 -A YOUR_ACCOUNT --partition="YOUR_PARTITION" --gres="gpu:1" --extra="-t 2-00:00:00 --mail-type=END,FAIL --mail-user=YOUR_EMAIL" --log='_log_graphene_OG_test_multi' --name="OGgraphene" --py-cmd="$SINGULARITY_CMD 'source /opt/conda/bin/activate deepsolid && python sampling.py --dist --ckpt_restore_filename=qmcjax_ckpt_000000_process0.npz --x64 --required_samples=2000 --save_freq=1 --sampling_cfg=OG_batch1000_mcmc3e4.py --libcu_lib_path=/opt/conda/envs/deepsolid/lib/'"

In [None]:
export SINGULARITY_CMD="singularity exec --no-home --nv --bind .:/home/invariant-schrodinger --pwd /home/invariant-schrodinger $SCRATCH/inv-ds.sif /bin/bash -c " && ./slurm_dist.sh --mem=10G --num-nodes=2 --port=8002 --timeout=1000 -A YOUR_ACCOUNT --partition="YOUR_PARTITION" --gres="gpu:1" --extra="-t 2-00:00:00 --mail-type=END,FAIL --mail-user=YOUR_EMAIL" --log='_log_graphene_DA_test_multi' --name="DAgraphene" --py-cmd="$SINGULARITY_CMD 'source /opt/conda/bin/activate deepsolid && python sampling.py --dist --ckpt_restore_filename=qmcjax_ckpt_000000_process0.npz --x64 --required_samples=2000 --save_freq=1 --sampling_cfg=PA_batch1000_mcmc3e4.py --libcu_lib_path=/opt/conda/envs/deepsolid/lib/'"

In [None]:
export SINGULARITY_CMD="singularity exec --no-home --nv --bind .:/home/invariant-schrodinger --pwd /home/invariant-schrodinger $SCRATCH/inv-ds.sif /bin/bash -c " && ./slurm_dist.sh --mem=10G --num-nodes=2 --port=8003 --timeout=1000 -A YOUR_ACCOUNT --partition="YOUR_PARTITION" --gres="gpu:1" --extra="-t 2-00:00:00 --mail-type=END,FAIL --mail-user=YOUR_EMAIL" --log='_log_graphene_GA_test_multi' --name="GAgraphene" --py-cmd="$SINGULARITY_CMD 'source /opt/conda/bin/activate deepsolid && python sampling.py --dist --ckpt_restore_filename=qmcjax_ckpt_000000_process0.npz --x64 --required_samples=2000 --save_freq=1 --sampling_cfg=PA_batch1000_mcmc3e4.py --libcu_lib_path=/opt/conda/envs/deepsolid/lib/'"

In [None]:
export SINGULARITY_CMD="singularity exec --no-home --nv --bind .:/home/invariant-schrodinger --pwd /home/invariant-schrodinger $SCRATCH/inv-ds.sif /bin/bash -c " && ./slurm_dist.sh --mem=10G --num-nodes=2 --port=8004 --timeout=1000 -A YOUR_ACCOUNT --partition="YOUR_PARTITION" --gres="gpu:1" --extra="-t 2-00:00:00 --mail-type=END,FAIL --mail-user=YOUR_EMAIL" --log='_log_graphene_PA_test_multi' --name="PAgraphene" --py-cmd="$SINGULARITY_CMD 'source /opt/conda/bin/activate deepsolid && python sampling.py --dist --ckpt_restore_filename=qmcjax_ckpt_000000_process0.npz --x64 --required_samples=2000 --save_freq=1 --sampling_cfg=PA_batch1000_mcmc3e4.py --libcu_lib_path=/opt/conda/envs/deepsolid/lib/'"

In [None]:
export SINGULARITY_CMD="singularity exec --no-home --nv --bind .:/home/invariant-schrodinger --pwd /home/invariant-schrodinger $SCRATCH/inv-ds.sif /bin/bash -c " && ./slurm_dist.sh --mem=10G --num-nodes=2 --port=8005 --timeout=1000 -A YOUR_ACCOUNT --partition="YOUR_PARTITION" --gres="gpu:1" --extra="-t 2-00:00:00 --mail-type=END,FAIL --mail-user=YOUR_EMAIL" --log='_log_graphene_PC_test_multi' --name="PCgraphene" --py-cmd="$SINGULARITY_CMD 'source /opt/conda/bin/activate deepsolid && python sampling.py --dist --ckpt_restore_filename=qmcjax_ckpt_000000_process0.npz --x64 --required_samples=2000 --save_freq=1 --sampling_cfg=PC_batch1000_mcmc3e4.py --libcu_lib_path=/opt/conda/envs/deepsolid/lib/'"

# 2. Evaluate statistics on samples drawn from the wavefunction

Use the `compute_stats` method of the `DeepSolidSampler` class to evaluate statistics. The cells below illustrate this for OG only; the recipe is the same for the other cases.

In [None]:
from utils.sampler import DeepSolidSampler

'''
    num_processes below is the number of processes that were used for sampling.
        e.g. if two nodes were used, the samples are saved in two files,
             and num_processes=2 is needed to retrieve all of them.
'''

sampler = DeepSolidSampler(
    log_dir='_log_graphene_OG_test/',
    sampling_cfg_str='OG_batch1000_mcmc3e4.py',
    libcu_lib_path='/opt/conda/envs/deepsolid/lib/',
    ckpt_restore_filename='qmcjax_ckpt_000000_process0.npz',
    num_processes=1  
)
sampler.load_samples()
samples_all_processes = sampler.get_all_samples()

# prints the array shape (m, 3n), where m is the number of samples and 3n is the
# length of each flattened configuration of n electrons (3 coordinates per electron)
print(samples_all_processes.shape)

In [None]:
'''
    Computes estimates of
    - loss: mean of the local energy
    - var: variance of the local energy
    - imag: imaginary part of the energy
    - kinetic: kinetic part of the energy
    - ewald: Ewald part of the energy

    - symm_ratio_mean: mean[ averaged wavefunction / wavefunction ]
    - symm_ratio_var: Var[ averaged wavefunction / wavefunction ]

    n_for_each_est is the number of samples used for each estimate, so

    number of estimates produced = total number of samples / n_for_each_est
'''

sampler.load_stats()
if len(sampler.stats_list) == 0:
    sampler.compute_stats(n_for_each_est=1) 
sampler.stats_list
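
To illustrate how `n_for_each_est` partitions the samples, here is a sketch of block-wise estimates on synthetic numbers. It does not call `compute_stats`; the helper `block_estimates`, the fake local energies, the use of the population variance and the choice to drop any incomplete final block are all assumptions for illustration.

```python
import random
import statistics


def block_estimates(local_energies, n_for_each_est):
    """Split the values into consecutive blocks and return (mean, variance) per block."""
    # Assumption: an incomplete final block is dropped.
    n_estimates = len(local_energies) // n_for_each_est
    estimates = []
    for i in range(n_estimates):
        block = local_energies[i * n_for_each_est:(i + 1) * n_for_each_est]
        estimates.append((statistics.fmean(block), statistics.pvariance(block)))
    return estimates


rng = random.Random(0)
# Synthetic stand-in for per-sample local energies (not real output).
energies = [rng.gauss(-1.0, 0.1) for _ in range(2000)]

estimates = block_estimates(energies, n_for_each_est=500)
print(len(estimates))  # 2000 / 500 = 4 estimates
```

With `n_for_each_est=1`, as in the cell above, every sample gives its own estimate, so in this sketch the per-block variance is zero and the number of estimates equals the number of samples.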