vilota-dev/Object_Detection_Models

Object Detection Pipeline with Gemini & DINO Models

This project implements a complete object detection workflow to identify short roadside bollards using two AI models:

  • Google's Gemini Vision API
  • Grounding DINO (Hugging Face)

Features

Gemini-based Detection (gemini_api.py)

  • Sends images to Gemini 2.0 Flash via API.
  • Extracts bounding boxes for bollards.
  • Saves:
    • JSON predictions per image
    • Visualizations with red boxes around detected bollards
  • Supports tuning of the sampling parameters (temperature, top_p, top_k).
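As a rough illustration of the box-extraction step, the sketch below assumes the model was asked to reply with a JSON list whose items carry a `box_2d` field in `[ymin, xmin, ymax, xmax]` order, normalised to 0-1000 (the convention Google's documentation describes for Gemini object detection). The function name `boxes_to_pixels` and the sample reply are hypothetical and are not taken from gemini_api.py.

```python
import json

def boxes_to_pixels(reply_text, width, height):
    """Convert normalised [ymin, xmin, ymax, xmax] boxes (0-1000) to pixel [x0, y0, x1, y1]."""
    detections = json.loads(reply_text)
    pixel_boxes = []
    for det in detections:
        ymin, xmin, ymax, xmax = det["box_2d"]
        pixel_boxes.append({
            "label": det.get("label", "bollard"),
            "box": [
                round(xmin / 1000 * width),
                round(ymin / 1000 * height),
                round(xmax / 1000 * width),
                round(ymax / 1000 * height),
            ],
        })
    return pixel_boxes

# Hypothetical model reply for a 1280x720 image.
reply = '[{"box_2d": [500, 250, 750, 300], "label": "bollard"}]'
print(boxes_to_pixels(reply, width=1280, height=720))
```

The resulting pixel boxes can be written out with `json.dump` and, if Pillow is used for the visualizations, drawn with `ImageDraw.Draw(image).rectangle(box, outline="red")`.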

Grounding DINO Detection (dino.py)

  • Loads the Hugging Face GroundingDINO model locally.
  • Detects similar bollard objects using text prompts.
  • Saves:
    • Annotated images with class names and bounding boxes
    • Corresponding JSON files
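The following is a minimal sketch of turning detector output into JSON-ready records. It assumes dino.py relies on the inference helper from the IDEA-Research repository, which returns boxes as normalised (cx, cy, w, h). If the script instead uses the post-processing from the Hugging Face transformers port, boxes already arrive as pixel corner coordinates and the conversion is unnecessary. The function names and sample values are hypothetical.

```python
import json

def cxcywh_to_xyxy(box, width, height):
    """Convert a normalised (cx, cy, w, h) box to pixel (x0, y0, x1, y1)."""
    cx, cy, w, h = box
    return [
        (cx - w / 2) * width,
        (cy - h / 2) * height,
        (cx + w / 2) * width,
        (cy + h / 2) * height,
    ]

def to_records(boxes, scores, phrases, width, height):
    """Bundle boxes, confidence scores and class names into JSON-serialisable dicts."""
    return [
        {
            "label": phrase,
            "score": round(float(score), 3),
            "box_xyxy": [round(v, 1) for v in cxcywh_to_xyxy(box, width, height)],
        }
        for box, score, phrase in zip(boxes, scores, phrases)
    ]

# Hypothetical single detection on a 640x480 image.
records = to_records([(0.5, 0.5, 0.25, 0.5)], [0.62], ["bollard"], 640, 480)
print(json.dumps(records, indent=2))
```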

Frame Extraction (extract_frames_from_mcap.py)

  • Extracts image frames from .mcap recordings.
  • Helps generate a dataset from video input.

Setup Instructions

1. Clone the Repository

2. Install Python Requirements

pip install -r requirements.txt

3a. Set up Google Gemini API

Get your API key from Google AI Studio.

In gemini_api.py, uncomment the following line and insert your key:

client = genai.Client(api_key='YOUR_KEY_HERE')
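To keep the key out of version control, one option is to read it from an environment variable instead of pasting it into the script. This is only a sketch: the helper `load_api_key` and the variable name `GEMINI_API_KEY` are assumptions, not something gemini_api.py is known to use.

```python
import os

def load_api_key(var_name="GEMINI_API_KEY"):
    """Return the API key stored in the given environment variable, or fail loudly."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable before running.")
    return key

# In gemini_api.py this could replace the hard-coded string, for example:
# client = genai.Client(api_key=load_api_key())
```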

3b. Set up DINO

Clone the GroundingDINO repository: https://github.com/IDEA-Research/GroundingDINO

4. Run Inference

Gemini

python gemini_api.py

DINO

python dino.py

Notes

Make sure the image filenames in chosen_dataset/ end with .png or .jpg.
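A quick way to check which files would qualify is sketched below. The helper `list_images` is hypothetical, and it lower-cases the suffix before comparing, so it also accepts names such as `frame.PNG`; the scripts themselves may be stricter about case.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg"}

def list_images(folder="chosen_dataset"):
    """Return the sorted image paths in `folder` that end with .png or .jpg."""
    root = Path(folder)
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )

# Only list the dataset if the folder exists in the current directory.
if Path("chosen_dataset").is_dir():
    for path in list_images():
        print(path.name)
```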

Tune DINO detection thresholds (BOX_THRESHOLD, TEXT_THRESHOLD) in dino.py.
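To show what the two thresholds control, here is a simplified pure-Python sketch of the filtering logic as it works in the IDEA-Research inference code: a box is kept when its best per-token score exceeds BOX_THRESHOLD, and the label is built from the prompt tokens whose score exceeds TEXT_THRESHOLD. The values 0.35 and 0.25 are only example settings, and the scores, tokens and function name are made up for illustration.

```python
BOX_THRESHOLD = 0.35   # example value; minimum best-token score for a box to be kept
TEXT_THRESHOLD = 0.25  # example value; minimum score for a token to join the label

def filter_detections(token_scores, tokens, box_threshold, text_threshold):
    """token_scores[i][j] is the score of box i for prompt token j."""
    kept = []
    for index, row in enumerate(token_scores):
        best = max(row)
        if best <= box_threshold:
            continue  # box is not confident enough for any token
        phrase = " ".join(t for t, s in zip(tokens, row) if s > text_threshold)
        kept.append((index, best, phrase))
    return kept

tokens = ["short", "bollard"]
scores = [[0.30, 0.60], [0.10, 0.20], [0.05, 0.40]]
print(filter_detections(scores, tokens, BOX_THRESHOLD, TEXT_THRESHOLD))
```

Raising BOX_THRESHOLD discards more low-confidence boxes, while raising TEXT_THRESHOLD shortens the labels attached to the boxes that remain.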

Tune Gemini hyperparameters via the configs list in gemini_api.py.
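The exact shape of the configs list is not shown here, so the sketch below simply assumes a list of dictionaries, one per combination of temperature, top_p and top_k. The helper `build_configs` and the values passed to it are hypothetical.

```python
from itertools import product

def build_configs(temperatures, top_ps, top_ks):
    """Build one settings dict for every combination of the given values."""
    return [
        {"temperature": t, "top_p": p, "top_k": k}
        for t, p, k in product(temperatures, top_ps, top_ks)
    ]

configs = build_configs([0.0, 0.5], [0.9], [20, 40])
for cfg in configs:
    # Each dict could be handed to the SDK, e.g. types.GenerateContentConfig(**cfg)
    # from google.genai (not called here, so the sketch runs without an API key).
    print(cfg)
```

Running each image through every entry and comparing the saved predictions is one simple way to see which settings give the most stable boxes.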

Extra Models

Personal trial projects exploring other models (SLIME, YOLOE, OWLv2, OWL-ViT, YOLO-World and YOLO-World-V2) are stored in the home folder of the Lenovo desktop with serial number PF-5M3JNS. To view one, open its folder as a project in Visual Studio Code.
