Initial Release

@renan-siqueira renan-siqueira released this 06 Jan 02:24
· 2 commits to main since this release
e6cc808

Image-to-Text Tool - Release Notes

Version: 1.0.0

This release introduces a tool that processes images and generates textual descriptions of them using the BLIP and UForm machine learning models.

Features:

  • Multiple Model Support: Supports the BLIP and UForm models, so users can choose the one that best fits their image processing needs.

  • Docker Integration: Includes a Dockerfile for a consistent environment across different systems. The Docker image is built on the NVIDIA CUDA base image, which provides GPU support inside the container.

  • Flexible Execution: Users can process images with a specific model or with all available models by adjusting the script execution flags.

  • Input and Output Management: Images can be placed in an input folder, and the output descriptions are saved in JSON format in designated files for each model.
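
The input and output flow described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the function name `describe_folder`, the `Captioner` type, the accepted file extensions, and the `<model>.json` output naming are all assumptions made for the example.

```python
import json
from pathlib import Path
from typing import Callable, Dict

# Assumed set of accepted image extensions (not taken from the tool).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}

# Hypothetical captioner type: takes an image path, returns a description.
Captioner = Callable[[Path], str]


def describe_folder(input_dir: Path, output_dir: Path,
                    captioners: Dict[str, Captioner]) -> Dict[str, Path]:
    """Run every captioner over every image; write one JSON file per model."""
    output_dir.mkdir(parents=True, exist_ok=True)
    # Only files with an image extension are picked up; others are ignored.
    images = sorted(p for p in input_dir.iterdir()
                    if p.suffix.lower() in IMAGE_SUFFIXES)
    written = {}
    for model_name, caption in captioners.items():
        # Map each image file name to the description that model produced.
        results = {img.name: caption(img) for img in images}
        out_path = output_dir / f"{model_name}.json"  # assumed naming scheme
        out_path.write_text(json.dumps(results, indent=2), encoding="utf-8")
        written[model_name] = out_path
    return written
```

In a real run, the values in `captioners` would wrap the BLIP and UForm models; for a quick check, any function that maps a path to a string will do, for example `{"blip": lambda p: "placeholder"}`.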

Installation:

  • The tool can be set up using the provided installation scripts for Unix-based (Linux, macOS) and Windows systems.

  • For Docker users, a Dockerfile is provided for building and running the tool in a containerized environment, complete with CUDA support for GPU acceleration.

Usage:

  • Place the images in the input folder and run the run.py script with the desired model flags.

  • For Docker, use the provided commands to build and run the tool in a Docker container.
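
These notes do not list the actual flags accepted by run.py, so the following is only a sketch of how a "one model or all models" choice could be parsed. The `--model` flag, its values, and the `parse_models` function are hypothetical and may not match the real script; check the repository README for the real options.

```python
import argparse
from typing import List, Optional, Sequence

# The two models named in these release notes.
AVAILABLE_MODELS = ["blip", "uform"]


def parse_models(argv: Optional[Sequence[str]] = None) -> List[str]:
    """Return the list of models to run, based on a hypothetical --model flag."""
    parser = argparse.ArgumentParser(description="Select captioning models")
    # Hypothetical flag: NOT taken from the real run.py.
    parser.add_argument("--model", choices=AVAILABLE_MODELS + ["all"],
                        default="all",
                        help="model to run, or 'all' for every model")
    args = parser.parse_args(argv)
    if args.model == "all":
        return list(AVAILABLE_MODELS)
    return [args.model]
```

With this sketch, `parse_models(["--model", "blip"])` selects only BLIP, and calling it with no flag falls back to every available model.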