# Process images in `./imgs/`

## Object Detection Models
- [Rex-Omni](https://github.com/IDEA-Research/Rex-Omni)
- DETR [[resnet-101](https://huggingface.co/facebook/detr-resnet-101)] (trained on COCO classes: includes `person`, but has no tree class)
- OWLv2 [[base](https://huggingface.co/google/owlv2-base-patch16-ensemble)], [[large](https://huggingface.co/google/owlv2-large-patch14-ensemble)]
- Grounding DINO [[base](https://huggingface.co/IDEA-Research/grounding-dino-base)]
- YOLO [[v8](https://huggingface.co/Ultralytics/YOLOv8)], [[11](https://huggingface.co/Ultralytics/YOLO11)] (trained on [COCO classes](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco.yaml): includes `person`, but has no tree class [[1](https://docs.ultralytics.com/tasks/detect/#models)])

## Object Segmentation Models
- [SAM3](https://ai.meta.com/blog/segment-anything-model-3/)
- SegFormer [[b5-ade](https://huggingface.co/nvidia/segformer-b5-finetuned-ade-640-640)] (segments persons and trees)
- DETR [[resnet-50-panoptic](https://huggingface.co/facebook/detr-resnet-50-panoptic)] (segments persons and trees)
- MaskFormer [[swin-base-ade](https://huggingface.co/facebook/maskformer-swin-base-ade)] (segments persons and trees)

## TODO:

### Detect shade pixels (mask)

### Detect actual bus stop

In [None]:
!pip install ultralytics

In [None]:
!wget -q https://raw.githubusercontent.com/direito-a-sombra/bus-view/refs/heads/main/models/detect_utils.py
!wget -q https://raw.githubusercontent.com/direito-a-sombra/bus-view/refs/heads/main/models/Detr.py
!wget -q https://raw.githubusercontent.com/direito-a-sombra/bus-view/refs/heads/main/models/Dino.py
!wget -q https://raw.githubusercontent.com/direito-a-sombra/bus-view/refs/heads/main/models/Owlv2.py
!wget -q https://raw.githubusercontent.com/direito-a-sombra/bus-view/refs/heads/main/models/Yolo.py
!mkdir -p ./models && mv *.py ./models

!wget -q https://raw.githubusercontent.com/direito-a-sombra/bus-view/refs/heads/main/imgs/address/11.jpg
!wget -q https://raw.githubusercontent.com/direito-a-sombra/bus-view/refs/heads/main/imgs/address/12.jpg
!wget -q https://raw.githubusercontent.com/direito-a-sombra/bus-view/refs/heads/main/imgs/address/13.jpg
!wget -q https://raw.githubusercontent.com/direito-a-sombra/bus-view/refs/heads/main/imgs/address/14.jpg
!wget -q https://raw.githubusercontent.com/direito-a-sombra/bus-view/refs/heads/main/imgs/address/15.jpg
!wget -q https://raw.githubusercontent.com/direito-a-sombra/bus-view/refs/heads/main/imgs/address/18.jpg
!mkdir -p ./imgs/address && mv 1*.jpg ./imgs/address

In [None]:
from os import listdir

from PIL import Image as PImage, ImageDraw as PImageDraw

from models.detect_utils import OBJECT_THRESHOLDS
from models.Detr import Detr
from models.Dino import Dino
from models.Owlv2 import Owlv2
from models.Yolo import Yolo

In [None]:
IMG_DIR = "./imgs/address"
# Load every JPEG in IMG_DIR, sorted by file name so the order is reproducible.
imgs = [PImage.open(f"{IMG_DIR}/{f}") for f in sorted(listdir(IMG_DIR)) if f.endswith(".jpg")]

In [None]:
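from IPython.display import display

def visualize_objs(img, objs):
  """Draw each detection's box on a copy of img, then show the image and the label scores."""
  dimg = img.copy()
  iw,ih = dimg.size
  draw = PImageDraw.Draw(dimg)

  for o in objs:
    # Boxes are normalized (x0, y0, x1, y1); scale them to pixel coordinates.
    x0,y0,x1,y1 = o["box"]
    draw.rectangle([x0*iw,y0*ih,x1*iw,y1*ih], outline=(0,255,0), width=4)

  # Show a 256x256 thumbnail (the aspect ratio is not preserved).
  display(dimg.resize((256, 256)))
  print([f'{o["label"]}: {o["score"]}' for o in objs], '\n')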
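
`visualize_objs` multiplies each `box` by the image width and height, which suggests the models return boxes as normalized `(x0, y0, x1, y1)` values between 0 and 1. A minimal sketch of that conversion as a standalone helper follows; the name `to_pixel_box` and the clamping step are assumptions for illustration and are not part of `detect_utils`.

```python
def to_pixel_box(box, img_size):
    """Scale a normalized (x0, y0, x1, y1) box to integer pixel coordinates.

    Hypothetical helper: assumes box values are fractions of the image size.
    """
    x0, y0, x1, y1 = box
    iw, ih = img_size

    def clamp(v):
        # Keep slightly out-of-range predictions inside the image.
        return min(max(v, 0.0), 1.0)

    return (
        round(clamp(x0) * iw),
        round(clamp(y0) * ih),
        round(clamp(x1) * iw),
        round(clamp(y1) * ih),
    )


print(to_pixel_box((0.25, 0.5, 0.75, 1.0), (640, 480)))  # (160, 240, 480, 480)
```

Pixel boxes in this form can be passed directly to `PImage.crop` if a detection needs to be inspected on its own.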

In [None]:
model = Yolo()

for img in imgs:
  objs = model.all_objects(img, OBJECT_THRESHOLDS)
  visualize_objs(img, objs)
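
The cell above passes `OBJECT_THRESHOLDS` to `all_objects`, but its structure is not shown here. The sketch below assumes it maps each label to a minimum score and that detections are dicts with `label` and `score` keys, as printed by `visualize_objs`. Both `EXAMPLE_THRESHOLDS` and `filter_by_threshold` are hypothetical names, and the real `detect_utils` may work differently.

```python
# Hypothetical thresholds: assumed to map label -> minimum score.
EXAMPLE_THRESHOLDS = {"person": 0.5, "tree": 0.3}


def filter_by_threshold(objs, thresholds):
    """Keep detections whose label is listed and whose score meets its threshold."""
    return [
        o for o in objs
        if o["label"] in thresholds and o["score"] >= thresholds[o["label"]]
    ]


# Made-up detections, only to exercise the helper.
detections = [
    {"label": "person", "score": 0.62, "box": (0.1, 0.2, 0.3, 0.9)},
    {"label": "person", "score": 0.41, "box": (0.5, 0.2, 0.6, 0.8)},
    {"label": "tree", "score": 0.35, "box": (0.0, 0.0, 0.4, 0.7)},
    {"label": "car", "score": 0.99, "box": (0.6, 0.5, 0.9, 0.8)},
]

kept = filter_by_threshold(detections, EXAMPLE_THRESHOLDS)
print([o["label"] for o in kept])
```

Per-label thresholds let a class that models detect less confidently, such as trees, use a lower cutoff than persons.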

In [None]:
model = Owlv2()

for img in imgs:
  objs = model.iou_objects(img, OBJECT_THRESHOLDS)
  visualize_objs(img, objs)
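
The name `iou_objects` suggests that overlapping detections are compared by intersection over union (IoU). The model wrappers are not shown here, so that is an assumption. A minimal sketch of IoU for two boxes in the same `(x0, y0, x1, y1)` format follows; `box_iou` is a hypothetical helper, not the repository's implementation.

```python
def box_iou(a, b):
    """Return intersection over union of two (x0, y0, x1, y1) boxes."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b

    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h

    area_a = (ax1 - ax0) * (ay1 - ay0)
    area_b = (bx1 - bx0) * (by1 - by0)
    union = area_a + area_b - inter

    # Degenerate (zero-area) boxes have no meaningful overlap.
    return inter / union if union > 0 else 0.0


print(box_iou((0.0, 0.0, 0.5, 0.5), (0.0, 0.0, 0.5, 0.5)))  # 1.0
print(box_iou((0.0, 0.0, 0.5, 0.5), (0.5, 0.5, 1.0, 1.0)))  # 0.0
```

An IoU close to 1 means two boxes cover nearly the same region, so one of them can be dropped as a duplicate. Open-vocabulary detectors such as OWLv2 and Grounding DINO often return several boxes for the same object.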

In [None]:
model = Detr()

for img in imgs:
  objs = model.all_objects(img, OBJECT_THRESHOLDS)
  visualize_objs(img, objs)

In [None]:
model = Dino()

for img in imgs:
  objs = model.iou_objects(img, OBJECT_THRESHOLDS)
  visualize_objs(img, objs)