This repository uses GroundingDINO to generate bounding boxes from natural-language prompts. The boxes are then fed into Meta AI's Segment Anything Model (SAM), and the results are visualized with Rerun.
It works on a single image, a set of images, or a video.
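The hand-off between the two models involves a box-format conversion, which can be sketched as follows. This is a minimal illustration, assuming GroundingDINO returns boxes as normalized (cx, cy, w, h) and SAM's box prompt expects pixel (x0, y0, x1, y1). The function name is hypothetical and is not taken from this repository's code.

```python
def cxcywh_norm_to_xyxy_pixels(box, image_width, image_height):
    """Convert one normalized (cx, cy, w, h) box to pixel (x0, y0, x1, y1).

    Hypothetical helper for illustration; not part of this repository.
    """
    cx, cy, w, h = box
    # Scale normalized values to pixel units.
    cx, w = cx * image_width, w * image_width
    cy, h = cy * image_height, h * image_height
    # Move from center/size to corner coordinates.
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# Example: a box centered in a 640x480 image covering half of each dimension.
print(cxcywh_norm_to_xyxy_pixels((0.5, 0.5, 0.5, 0.5), 640, 480))
```

The actual pipeline may do this conversion on tensors for many boxes at once; the arithmetic per box is the same.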
First, install the main repository's requirements:
pip install -r requirements.txt
Then install GroundingDINO by running:
git submodule update --init --recursive
cd GroundingDINO
pip install -e .
Make sure you are back in the main directory of the repository (not GroundingDINO), then run:
python main.py
Use --help to see the available command-line arguments.
To use video input (an example video is available to download), run the following command:
python main.py --video-path <PATH TO YOUR VIDEO FILE> --prompt "<YOUR CHOSEN PROMPT>"
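For readers unfamiliar with argparse, the two flags above could be declared roughly as in this sketch. Only the flag names --video-path and --prompt come from this README; the defaults, help text, and the build_parser name are assumptions, and the real main.py may define more options or different defaults.

```python
import argparse


def build_parser():
    # Hypothetical parser for illustration; only the two flag names
    # (--video-path and --prompt) are taken from this README.
    parser = argparse.ArgumentParser(description="Text-prompted segmentation (sketch)")
    parser.add_argument("--video-path", type=str, default=None,
                        help="Path to an input video file (assumed optional)")
    parser.add_argument("--prompt", type=str, default=None,
                        help="Natural-language description of what to segment")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--video-path", "clip.mp4", "--prompt", "a dog"])
print(args.video_path, args.prompt)
```

Note that argparse turns the hyphen in --video-path into an underscore, so the value is read as args.video_path.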