This project combines Large Language Models (LLMs) and YOLOv8 for object detection and identification in images. It uses Ollama for vision-based part identification and Ultralytics YOLO for object detection and annotation.
- Identifies visible parts in an image using Ollama LLM (vision model)
- Detects and annotates objects in the image using YOLOv8
- Saves the annotated image with bounding boxes for matched parts
- Python 3.8+
- ollama
- ultralytics
- opencv-python
- Clone this repository or download the code.
- Install dependencies:
pip install -r requirements.txt
- Ensure you have the YOLOv8 model weights (e.g.,
yolov8m.pt
). - Place your input image at the specified path in
main.py
.
Edit the image_path
and output_path
in main.py
as needed, then run:
python main.py
- Ollama must be running and accessible for the LLM vision step.
- The list of parts to identify can be customized in the prompt in
main.py
. - The YOLO model and weights can be changed as needed.
- The script prints the detected parts and saves an annotated image to the specified output path.