GLIP
=====

**Grounded Language-Image Pre-training**

 * Paper: https://arxiv.org/abs/2112.03857

![GLIP Overview](../assets/glip_overview.jpg)

## Using the Docker image

```bash
# 1) Pull the GLIP docker image (CUDA 10.2 with PyTorch 1.9)
sudo docker pull pengchuanzhang/maskrcnn:ubuntu18-py3.7-cuda10.2-pytorch1.9

# 2) Clone GLIP and create a directory for models on your host
git clone https://github.com/microsoft/GLIP.git GLIP_repo
cd GLIP_repo
mkdir MODEL

# 3) Download the Swin backbone weights into the MODEL folder (see the wget commands further below)

# 4) Run the container and mount the repo
sudo docker run --gpus all -it --rm \
  -v "$(pwd)":/workspace/GLIP_repo \
  pengchuanzhang/maskrcnn:ubuntu18-py3.7-cuda10.2-pytorch1.9 bash

```

```bash
# Inside the container: switch to the mounted repository
cd /workspace/GLIP_repo
# Install the Python dependencies
pip install einops shapely timm yacs tensorboardX ftfy prettytable pymongo
pip install transformers
# Build maskrcnn_benchmark
python setup.py build develop --user

# Run zero-shot detection evaluation with the GLIP-T checkpoint.
# Note: this script evaluates on a dataset (the one named in the config must be prepared), not on a single image.
python -m torch.distributed.launch --nproc_per_node=1 tools/test_grounding_net.py \
  --config-file configs/pretrain/glip_Swin_T_O365_GoldG.yaml \
  --weight MODEL/glip_tiny_model_o365_goldg.pth \
  TEST.IMS_PER_BATCH 1 \
  MODEL.DYHEAD.SCORE_AGG "MEAN" \
  TEST.EVAL_TASK detection \
  MODEL.DYHEAD.FUSE_CONFIG.MLM_LOSS False \
  OUTPUT_DIR outputs/demo

```

## Download the Swin backbone weights

```bash
# Download Swin-Tiny weights (ImageNet-1K)
wget https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth -P MODEL

# Download Swin-Large weights (ImageNet-22K, 384×384)
wget https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_large_patch4_window12_384_22k.pth -P MODEL
```
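
Before launching anything, it can help to confirm that the backbone files actually landed in `MODEL`. The following is a minimal sketch using only the standard library; the helper name `missing_weights` is hypothetical (not part of GLIP), and the file names are taken from the `wget` commands above.

```python
from pathlib import Path

# File names taken from the wget commands above.
SWIN_WEIGHTS = [
    "swin_tiny_patch4_window7_224.pth",
    "swin_large_patch4_window12_384_22k.pth",
]


def missing_weights(model_dir, filenames):
    """Return the names in `filenames` that are absent or empty in `model_dir`.

    Hypothetical helper for illustration; it is not part of the GLIP repo.
    """
    root = Path(model_dir)
    missing = []
    for name in filenames:
        path = root / name
        # Treat an empty file as missing, since a failed download can leave one behind.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_weights("MODEL", SWIN_WEIGHTS)
    if absent:
        print("Missing or empty weight files:", ", ".join(absent))
    else:
        print("All Swin weight files found in MODEL/")
```

Run it from the repository root so that the relative `MODEL` path resolves to the folder created earlier.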

## Alternative: build outside Docker

```bash
conda create -n glip python=3.7 -y
conda activate glip
conda install pytorch==1.9.0 torchvision==0.10.0 cudatoolkit=11.1 -c pytorch -c nvidia -y
pip install einops shapely timm yacs tensorboardX ftfy prettytable pymongo transformers
# Clone and build GLIP
git clone https://github.com/microsoft/GLIP.git
cd GLIP
python setup.py build develop
```
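
After setting up the conda environment, a quick check that the dependencies are importable can catch a failed install early. This is a rough sketch, assuming the import names match the package names used in the commands above; `missing_packages` is a hypothetical helper, not something GLIP provides.

```python
import importlib.util

# Import names for the packages installed above; torch and torchvision
# come from the conda install step, the rest from pip.
REQUIRED = [
    "torch", "torchvision", "einops", "shapely", "timm", "yacs",
    "tensorboardX", "ftfy", "prettytable", "pymongo", "transformers",
]


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    absent = missing_packages(REQUIRED)
    print("Missing packages:", ", ".join(absent) if absent else "none")
```

The check only looks the modules up without importing them, so it does not verify that PyTorch can see a GPU or that `maskrcnn_benchmark` compiled correctly.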

## Download the model

```python
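import os
from huggingface_hub import hf_hub_download

# Run from the repository root so the checkpoint lands in ./MODEL,
# which is where the evaluation command looks for it.
os.makedirs("MODEL", exist_ok=True)

repo_id = "GLIPModel/GLIP"
# Checkpoint referenced by the evaluation command (Swin-T, O365 + GoldG).
filename = "glip_tiny_model_o365_goldg.pth"

# local_dir places the file at MODEL/<filename>; cache_dir would instead
# store it inside a nested Hugging Face cache layout.
model_path = hf_hub_download(
    repo_id=repo_id,
    filename=filename,
    local_dir="MODEL",
    repo_type="model",  # download from a model repo (the default)
)
print(f"Downloaded {filename} → {model_path}")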
```