# Introduction

This notebook is a modified version of the [MobileNetV3 Supernet NAS](https://github.com/intel/neural-compressor/blob/master/examples/notebook/dynas/MobileNetV3_Supernet_NAS.ipynb) notebook. Its main goal is to showcase the distributed search functionality.


## Prerequisites

### Install Intel® MPI or OpenMPI

#### Intel® MPI

Please refer to [Intel® MPI Library](https://www.intel.com/content/www/us/en/developer/tools/oneapi/mpi-library.html#gs.1t0vm0) for detailed steps on how to install Intel® MPI.

#### OpenMPI

1. You can download OpenMPI from https://www.open-mpi.org/ or use this link to directly download version 4.1.5:

    ```bash
    wget https://download.open-mpi.org/release/open-mpi/v4.1/openmpi-4.1.5.tar.gz
    ```

1. Unpack the OpenMPI source code and change to the source directory:

    ```bash
    tar -xzf openmpi-4.1.5.tar.gz
    cd openmpi-4.1.5
    ```

1. Configure, compile, and install by executing the following commands (change the installation prefix if needed; installing to `/opt` typically requires root privileges):

    ```bash
    ./configure --prefix=/opt/openmpi
    make -j $(($(nproc)/2)) all
    make install
    ```

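1. To use OpenMPI, add its directories to your `PATH` and `LD_LIBRARY_PATH` environment variables (adjust the paths if you used a different prefix). Open a new shell or run `source $HOME/.bashrc` for the change to take effect:

    ```bash
    # Single quotes keep $PATH and $LD_LIBRARY_PATH unexpanded, so they are resolved each time .bashrc is sourced.
    echo 'export PATH=$PATH:/opt/openmpi/bin' >> $HOME/.bashrc
    echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/openmpi/lib' >> $HOME/.bashrc
    ```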

1. Clean up the downloaded archive and the source directory:

    ```bash
    cd ../
    rm openmpi-4.1.5.tar.gz
    rm -Rf openmpi-4.1.5
    ```
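
Once the steps above are done, it can help to confirm that an MPI launcher is actually visible on `PATH` before running the distributed example. The snippet below is a minimal sketch using only the Python standard library; the helper name `find_mpi_launcher` is hypothetical and not part of Neural Compressor or DyNAS-T.

```python
import shutil


def find_mpi_launcher(candidates=("mpirun", "mpiexec")):
    """Return (name, path) of the first MPI launcher found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return name, path
    return None


launcher = find_mpi_launcher()
if launcher is None:
    print("No MPI launcher found on PATH; check the PATH export in your shell profile.")
else:
    name, path = launcher
    print(f"Found {name} at {path}")
```

If the launcher is not found right after installation, the most likely cause is that the current shell has not yet picked up the updated `PATH`.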

### Install Neural Compressor and DyNAS-T

In [None]:
!pip -q install neural_compressor dynast==1.1.0

Alternatively, if you have a local copy of https://github.com/intel/neural-compressor, you can uncomment and run the code below:

In [None]:
# import sys
# sys.path.insert(0,'<path to neural compressor>')
# !pip install -qr <path to neural compressor>/requirements.txt
# !pip install -q dynast==1.1.0

## Example

A simple script `distributed_example.py` demonstrating how to use distributed search functionality is located in the same directory as this notebook. The distributed functionality can be used with both `MPI` and `torchrun`.

> Note: When run with `torchrun`, `torch.distributed` defaults to `OMP_NUM_THREADS=1` unless the variable is set explicitly ([link](https://github.com/pytorch/pytorch/commit/1c0309a9a924e34803bf7e8975f7ce88fb845131)), which may result in slow evaluation. It is good practice to explicitly set `OMP_NUM_THREADS` to `(total_core_count)/(num_workers)` (optional for MPI).
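
The `(total_core_count)/(num_workers)` rule from the note can be sketched as follows. This is an illustrative helper, not part of Neural Compressor or DyNAS-T; the function name `omp_threads_per_worker` is hypothetical. Note that `os.cpu_count()` reports logical CPUs, so on machines with hyper-threading you may prefer to pass the physical core count explicitly.

```python
import os


def omp_threads_per_worker(num_workers, total_cores=None):
    """Split the available cores evenly across workers (at least 1 thread each)."""
    if num_workers < 1:
        raise ValueError("num_workers must be >= 1")
    if total_cores is None:
        # os.cpu_count() counts logical CPUs and may return None on some platforms.
        total_cores = os.cpu_count() or 1
    # Integer division: leftover cores are simply left unused.
    return max(1, total_cores // num_workers)


# Example: value to export as OMP_NUM_THREADS for 2 workers on this machine.
threads = omp_threads_per_worker(num_workers=2)
print(f"OMP_NUM_THREADS={threads}")
```

The printed value can then be exported in the shell before launching the workers.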

To run distributed NAS within Neural Compressor/DyNAS-T with `MPI`/`torchrun`, please add the following line to your configuration:

```python
config.dynas.distributed = True
```

### `mpirun`

In [None]:
%%bash

# If path to Neural Compressor was specified in the cell above, please modify the line below accordingly and uncomment it before running this cell.
# export PYTHONPATH=<path to neural compressor>neural-compressor

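# NOTE: the path below appears to be specific to the original author's environment; replace it with your own or remove this line.
export PYTHONPATH=/nfs/pdx/home/mszankin/store/code/opensource/neural-compressor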

time mpirun \
    --report-bindings \
    -x MASTER_ADDR=127.0.0.1 \
    -x MASTER_PORT=1238 \
    -np 2 \
    -bind-to socket \
    -map-by socket \
        python distributed_example.py
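
In the command above, `-np 2` starts two processes and `-x` exports `MASTER_ADDR` and `MASTER_PORT` to each of them. Each process then needs to know its own rank and the total number of workers. The sketch below shows one way a script could read these from the environment variables set by common launchers (`RANK`/`WORLD_SIZE` for `torchrun`, `OMPI_COMM_WORLD_*` for OpenMPI, `PMI_*` for Intel MPI). It is an assumption-laden illustration with a hypothetical helper name, and not necessarily how `distributed_example.py` or DyNAS-T resolves these values.

```python
import os

# (rank variable, world-size variable) pairs, checked in order.
_RANK_ENV_VARS = (
    ("RANK", "WORLD_SIZE"),                            # set by torchrun
    ("OMPI_COMM_WORLD_RANK", "OMPI_COMM_WORLD_SIZE"),  # set by OpenMPI's mpirun
    ("PMI_RANK", "PMI_SIZE"),                          # set by Intel MPI
)


def resolve_rank_and_world_size(env=None):
    """Return (rank, world_size) from launcher-provided environment variables.

    Falls back to (0, 1), i.e. a single non-distributed process, when no
    known launcher variables are present.
    """
    env = os.environ if env is None else env
    for rank_key, size_key in _RANK_ENV_VARS:
        if rank_key in env and size_key in env:
            return int(env[rank_key]), int(env[size_key])
    return 0, 1


rank, world_size = resolve_rank_and_world_size()
print(f"rank {rank} of {world_size}")
```

Run directly with `python`, the snippet reports a single process; launched through `mpirun` or `torchrun`, each worker should report a distinct rank.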