Skip to content

A utility that uses Gemini API to rename image files according to their content"

License

Notifications You must be signed in to change notification settings

myudak/Vision-Label

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

17 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Vision-Label


simply describe what is in the image, then rename the file to something understandable to a human.


Languages-Python OpenSource PRs

Github - myudak


vision-labelHD.mp4

VisionLabel is a powerful Python-based command-line utility tool designed to rename image files using the capabilities of the Gemini API. Descriptive filenames are generated based on the content of the image.

Features

  • Automatically rename image files with descriptive names generated by the Gemini API
  • Right-click context menu integration for quick and easy file renaming
  • Customizable name generation based on user-defined templates
  • Create and utilize a mapping file for batch renaming operations
  • Validate generated names with the Gemini API before applying changes
  • Retry attempts for name generation and validation in case of errors
  • Supports various image file types: .png, .jpeg, .jpg, .webp, .bmp,.gif (non-animated)

Dependencies

  • Python 3.x
  • Google-generativeai
  • PIL Library

Setup

1. Get a Gemini API Key for Free

Sign up and get your API key from Google AI Studio.

2. Add Environment Variables

Create a .env file in the root directory of your project and replace GOOGLE_API_KEY with your actual API key:

GOOGLE_API_KEY=your_api_key_here

3. Install Dependencies

Run the following command to install all required dependencies:

pip install -r requirements.txt

Usage

From the Command Line

You can rename an image by specifying the path to the image:

python main.py --file path_to_image

You can also rename all the image in a directory by specifying the path to the directory:

python main.py --directory path_to_directory

Example:

python main.py --file ./img/example.jpg

From the Windows Context Menu

Generate the add_context_menu.reg File

Run the following command to generate the registry file:

python main.py --reg

Initialize the Registry File

Add the context menu entry by running:

regedit add_context_menu.reg

note: you may need to run regedit as an administrator. Alternatively you can doube click the file to run it

Now, you can right-click on any image and select Rename with Vision-Label to rename the image using the Gemini API.

Remove the Context Menu Entry

If you want to remove the context menu entry, run:

python main.py --remove

How It Was Built

  1. Find the image in the given directory.
  2. Images are sent to the Gemini API, where they are processed, and a caption is returned.
  3. Rename the files with a new name from the Gemini API.

Contributing

Contributions are welcome! If you have any suggestions or improvements, feel free to open an issue or submit a pull request.

License

This project is licensed under the GNU GENERAL PUBLIC LICENSE - see the LICENSE file for details.

Disclaimer

This application uploads images to Google servers for processing using the Gemini API. Use caution and avoid using personal images. This tool is not affiliated with Google. The functionality is subject to change based on updates to the Gemini API or the terms of service of Google.

Credits

  • Gemini API
  • Google AI Studio

About

A utility that uses Gemini API to rename image files according to their content"

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages