This repository contains a Python script that navigates an Anki Vector robot along a predefined path, capturing images at each waypoint and generating descriptive captions for what the robot sees. The script integrates a pre-trained Vision-Text model for image captioning with the Anki Vector SDK for robot control.
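As a rough illustration of the captioning half of the pipeline, the sketch below loads a pre-trained vision-text checkpoint from the Hugging Face Hub and captions a single PIL image. The checkpoint name and the `caption_image` helper are assumptions for demonstration only, not necessarily what `main.py` uses.

```python
# Captioning sketch. Assumption: a ViT-GPT2 captioning checkpoint; main.py may use a different model.
import torch
from PIL import Image
from transformers import VisionEncoderDecoderModel, ViTImageProcessor, AutoTokenizer

MODEL_ID = "nlpconnect/vit-gpt2-image-captioning"  # assumed checkpoint

model = VisionEncoderDecoderModel.from_pretrained(MODEL_ID)
processor = ViTImageProcessor.from_pretrained(MODEL_ID)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model.eval()

def caption_image(image):
    """Return a short descriptive caption for a PIL image."""
    pixel_values = processor(images=image.convert("RGB"), return_tensors="pt").pixel_values
    with torch.no_grad():
        output_ids = model.generate(pixel_values, max_length=16, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True).strip()

if __name__ == "__main__":
    print(caption_image(Image.open("example.jpg")))
```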
- Anki Vector Robot with SDK installed
- Python 3.6 or higher
- PyTorch
- NumPy
- Pillow (PIL, the Python Imaging Library)
- Hugging Face Transformers Library
- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/anki-vector-navigation-captioning.git
  cd anki-vector-navigation-captioning
  ```
- Install the required packages:

  ```bash
  pip install -r requirements.txt
  ```
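If you would rather install the dependencies by hand, the prerequisites above map roughly to the packages below; treat `requirements.txt` as the authoritative list (no exact versions are pinned here).

```bash
pip install torch numpy Pillow transformers anki_vector
```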
- Ensure that the Anki Vector Robot is properly configured and connected to your development environment.
- The robot’s SDK should be installed and configured as per the official Anki Vector SDK documentation.
- Update the `data.json` file with the predefined poses for navigation. Each pose should contain X and Y coordinates and a direction angle; an illustrative layout is sketched below.
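The exact schema of `data.json` is defined by `main.py`; the layout, key names, and units in the sketch below (`x`, `y`, `angle_deg`, millimetres and degrees) are assumptions made only to illustrate the idea of a pose list.

```python
# Hypothetical data.json layout (key names and units are assumptions, not the repo's actual schema):
# {
#   "poses": [
#     {"x": 150.0, "y": 0.0,   "angle_deg": 0},
#     {"x": 150.0, "y": 200.0, "angle_deg": 90}
#   ]
# }
import json

def load_poses(path="data.json"):
    """Read the predefined navigation poses from disk."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)["poses"]

if __name__ == "__main__":
    for pose in load_poses():
        print("x={x} mm, y={y} mm, angle={angle_deg} deg".format(**pose))
```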
- Run the script:

  ```bash
  python main.py
  ```
- The script will initialize the Anki Vector robot, then capture an image and generate a caption at each point on the predefined path.
- Captions will be vocalized using the robot’s built-in text-to-speech functionality.
- Movement details, including distances and directions, will be logged to the console for real-time monitoring; a rough sketch of this loop appears below.
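Put together, one run of the script behaves roughly like the sketch below. The pose keys, `load_poses`, and `caption_image` are the assumed helpers from the earlier sketches, not functions exported by this repo; `go_to_pose`, `capture_single_image`, and `say_text` are standard Anki Vector SDK calls.

```python
# Navigation-and-captioning loop sketch; helper functions come from the sketches above.
import anki_vector
from anki_vector.util import Pose, degrees

def run(poses):
    with anki_vector.Robot() as robot:                              # connects to the configured robot
        for pose in poses:
            target = Pose(x=pose["x"], y=pose["y"], z=0.0,
                          angle_z=degrees(pose["angle_deg"]))
            print("Driving to x={x}, y={y}, angle={angle_deg}".format(**pose))
            robot.behavior.go_to_pose(target)

            image = robot.camera.capture_single_image().raw_image   # PIL image from Vector's camera
            caption = caption_image(image)                          # assumed captioning helper (above)
            print("Caption:", caption)
            robot.behavior.say_text(caption)                        # vocalize via text-to-speech

if __name__ == "__main__":
    run(load_poses())  # load_poses() is the assumed helper shown earlier
```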
- Robot Navigation: Navigate through a predefined path based on the poses specified in `data.json`.
- Image Captioning: Capture images using the robot’s camera and generate descriptive captions using a pre-trained Vision-Text model.
- Voice Feedback: Vocalize generated image captions for enhanced user interaction.
- Error Handling: Robust handling of `VectorTimeoutException` to ensure uninterrupted operation (a retry sketch appears after this list).
- Real-time Logging: Console output of movement details and operational status.
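A minimal sketch of the timeout handling mentioned above, assuming a simple retry-with-delay policy; the retry count, delay, and wrapper itself are illustrative, not the actual strategy in `main.py`.

```python
# Retry wrapper sketch for transient SDK timeouts; attempts and delay are arbitrary assumptions.
import time
from anki_vector.exceptions import VectorTimeoutException

def with_retries(action, attempts=3, delay=1.0):
    """Run an SDK call, retrying on timeout instead of aborting the whole run."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except VectorTimeoutException:
            print("Timeout on attempt {}/{}; retrying...".format(attempt, attempts))
            time.sleep(delay)
    raise RuntimeError("Robot kept timing out; giving up.")
```

Individual SDK calls can then be wrapped, e.g. `with_retries(lambda: robot.behavior.go_to_pose(target))`.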
This project is licensed under the MIT License - see the LICENSE.md file for details.