SAN
====

**Side Adapter Network for Open-Vocabulary Semantic Segmentation**

 * Paper: https://arxiv.org/pdf/2302.12242

![SAN Overview](../assets/san_overview.jpg)

![SAN Architecture](../assets/san_arch.jpg)

![SAN using CLIP-attn to predict mask](../assets/san_attn-mask.jpg)

## Installation

```bash
sudo apt update
sudo apt install -y build-essential git pkg-config \
  libjpeg-dev libpng-dev libglib2.0-0 libsm6 libxext6 libxrender-dev

conda create -n san python=3.9 -y
conda activate san

pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 \
  torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113

python -c "import torch; print('CUDA available:', torch.cuda.is_available())"


git clone https://github.com/MendelXu/SAN.git SAN_repo
cd SAN_repo

pip install -r requirements.txt

# Install nvcc matching PyTorch's CUDA version (11.3, i.e. the +cu113 build)
conda install -y -c nvidia/label/cuda-11.3.1 cuda-nvcc

# Alternatively, install the full CUDA 11.3 toolkit:
# conda install -y -c nvidia/label/cuda-11.3.1 cuda
nvcc -V  # should now report CUDA release 11.3

# Next, install detectron2. Building it with the default compiler failed with:
#   error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
# so install gcc-8 and g++-8 first.
sudo apt update
sudo apt install -y gcc-8 g++-8

# Set CUDA_HOME to the location of the conda-installed toolkit
export CUDA_HOME=$(dirname $(dirname $(which nvcc)))
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:$LD_LIBRARY_PATH"
# Build from source with the correct compiler
CC=gcc-8 CXX=g++-8 python -m pip install \
    'git+https://github.com/facebookresearch/detectron2.git@v0.6'

```
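
The steps above depend on `nvcc` and the installed PyTorch wheel agreeing on the CUDA release (11.3). A minimal sketch of a check for that is shown here. The helper names `parse_nvcc_release`, `cuda_versions_match`, and `check_environment` are hypothetical and not part of SAN or detectron2. The sketch assumes `nvcc -V` prints a line containing `release <major>.<minor>`.

```python
import re
from typing import Optional


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Return the 'major.minor' CUDA release found in `nvcc -V` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def cuda_versions_match(nvcc_output: str, torch_cuda: Optional[str]) -> bool:
    """True if the nvcc release equals the CUDA version PyTorch was built with."""
    release = parse_nvcc_release(nvcc_output)
    return release is not None and release == torch_cuda


def check_environment() -> bool:
    """Compare the local nvcc against the installed PyTorch build.

    Imports are local so the parsing helpers above stay usable
    on machines without torch or nvcc.
    """
    import subprocess

    import torch

    output = subprocess.run(
        ["nvcc", "-V"], capture_output=True, text=True, check=True
    ).stdout
    return cuda_versions_match(output, torch.version.cuda)
```

Calling `check_environment()` inside the `san` environment before building detectron2 should return `True` if the two versions line up; a `False` result suggests the wrong `nvcc` is first on `PATH`.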

## Downgrade NumPy to avoid the `Numpy is not available` error

```bash
pip install numpy==1.26.4

python -c "import numpy; print('NumPy version:', numpy.__version__)"
```
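
This error is commonly reported when a PyTorch build compiled against NumPy 1.x runs with NumPy 2.x installed; assuming that is the cause here, the pin above works because 1.26.4 is a 1.x release. The sketch below checks the major version from a version string. The function names are hypothetical helpers, not part of NumPy or SAN.

```python
def numpy_major(version: str) -> int:
    """Extract the major version number from a string such as '1.26.4'."""
    return int(version.split(".")[0])


def is_numpy_1x(version: str) -> bool:
    """True for NumPy 1.x releases (assumed compatible with torch 1.12.1)."""
    return numpy_major(version) < 2


def installed_numpy_is_1x() -> bool:
    """Check the NumPy version installed in the current environment."""
    import numpy  # imported lazily so the helpers above work without NumPy

    return is_numpy_1x(numpy.__version__)
```

Running `installed_numpy_is_1x()` in the `san` environment is a quick way to confirm the downgrade took effect before loading the model.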

## Loading the predictor

```python
import os
import sys
import matplotlib.pyplot as plt

sys.path.append("SAN_repo")
from predict import Predictor, model_cfg

# Choose the ViT-B/16 model; the config and weight names come from predict.py.
model_key = "san_vit_b_16"
config_file = model_cfg[model_key]["config_file"]
model_path = model_cfg[model_key]["model_path"]
config_file = "SAN_repo/" + config_file

# Build the predictor. This will download the weights from HuggingFace on first use.
predictor = Predictor(config_file=config_file, model_path=model_path)
```

Output:

```text
Loading model from:  /home/pyml/.cache/huggingface/hub/models--Mendel192--san/snapshots/0d24a51312b517e86044f845185d09002d032443/san_vit_b_16.pth
Loaded model from:  /home/pyml/.cache/huggingface/hub/models--Mendel192--san/snapshots/0d24a51312b517e86044f845185d09002d032443/san_vit_b_16.pth
```
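
The loading code builds the config path with `"SAN_repo/" + config_file`, which only resolves when Python is started from the directory that contains `SAN_repo`. As a hedged alternative, the sketch below resolves the path against an explicit repository directory and fails early if the file is missing. `resolve_in_repo` is a hypothetical helper, not something provided by `predict.py`.

```python
from pathlib import Path
from typing import Union


def resolve_in_repo(repo_dir: Union[str, Path], relative_path: str) -> str:
    """Return the absolute path of `relative_path` inside `repo_dir`.

    Raises FileNotFoundError if the file does not exist, so a wrong
    working directory is reported before the model is built.
    """
    candidate = Path(repo_dir).resolve() / relative_path
    if not candidate.is_file():
        raise FileNotFoundError(f"Config not found: {candidate}")
    return str(candidate)
```

With this helper, the concatenation line could be written as `config_file = resolve_in_repo("SAN_repo", model_cfg[model_key]["config_file"])`, assuming the clone lives in `SAN_repo` as in the installation steps.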
