Release Initial Release · renan-siqueira/image-to-text-tool

Image-to-Text Tool - Release Notes

Version: 1.0.0

This initial release introduces a tool that processes images and generates textual descriptions of them using machine learning models.

Features:

Multiple Model Support: Supports the BLIP and UForm models, so users can select the model that best fits their image processing needs.
Docker Integration: Includes a Dockerfile for a consistent environment across different systems. The Docker image is built on the NVIDIA CUDA base image to provide GPU support.
Flexible Execution: Users can choose to process images with a specific model or use all available models by simply adjusting the script execution flags.
Input and Output Management: Images can be placed in an input folder, and the output descriptions are saved in JSON format in designated files for each model.
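
The input and output handling described above could look roughly like the sketch below. It is an illustration only, not the tool's actual code: the folder names, the accepted extensions, the `<model>.json` file naming, the JSON layout, and the `describe` callback (standing in for a real BLIP or UForm call) are all assumptions.

```python
import json
from pathlib import Path
from typing import Callable, List

# Hypothetical set of accepted extensions; the real tool may accept others.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_images(input_dir: Path) -> List[Path]:
    """Return the image files found directly in input_dir, sorted by name."""
    return sorted(
        p for p in input_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def save_descriptions(
    model_name: str,
    input_dir: Path,
    output_dir: Path,
    describe: Callable[[Path], str],
) -> Path:
    """Describe every image and write the results to <output_dir>/<model_name>.json.

    `describe` is a placeholder for the model call; it receives an image path
    and returns a text description.
    """
    descriptions = {img.name: describe(img) for img in list_images(input_dir)}
    output_dir.mkdir(parents=True, exist_ok=True)
    out_file = output_dir / f"{model_name}.json"
    out_file.write_text(json.dumps(descriptions, indent=2), encoding="utf-8")
    return out_file
```

Calling `save_descriptions("blip", Path("input"), Path("output"), my_caption_fn)` once per selected model would produce one JSON file per model, each mapping image file names to descriptions.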

Installation:

The tool can be set up using the provided installation scripts for Unix-based (Linux, macOS) and Windows systems.
For Docker users, a Dockerfile is provided for building and running the tool in a containerized environment, complete with CUDA support for GPU acceleration.

Usage:

Place the images in the input folder and run the run.py script with the desired model flags.
For Docker, use the provided commands to build and run the tool in a Docker container.
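
As a rough illustration of how model selection through flags might work, here is a minimal sketch. These notes do not list the actual flags of run.py, so the flag names `--blip`, `--uform`, and `--all` and the function `parse_models` are hypothetical; check the repository's README for the real options.

```python
import argparse
from typing import List, Optional

# Model names come from the release notes; the flag names are hypothetical.
AVAILABLE_MODELS = ("blip", "uform")


def parse_models(argv: Optional[List[str]] = None) -> List[str]:
    """Return the list of models selected on the command line."""
    parser = argparse.ArgumentParser(
        description="Generate text descriptions for images."
    )
    parser.add_argument("--blip", action="store_true", help="use the BLIP model")
    parser.add_argument("--uform", action="store_true", help="use the UForm model")
    parser.add_argument("--all", action="store_true", help="use every available model")
    args = parser.parse_args(argv)

    if args.all:
        return list(AVAILABLE_MODELS)

    selected = [name for name in AVAILABLE_MODELS if getattr(args, name)]
    if not selected:
        # Exits with a usage message when no model was chosen.
        parser.error("select at least one model, or pass --all")
    return selected
```

Under these assumptions, `parse_models(["--blip"])` selects only BLIP, while `parse_models(["--all"])` selects both models.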
