# Project: Renaming Images using AI

## 01 - Project Introduction

This project started as something small I needed for myself. See, I take and receive a ton of photos, and they end up in these folders of files with meaningless names. Searching for one specific picture? It's a nightmare.

What I really needed was an app that uses fancy AI to take a peek at each image, come up with a good name based on what's *in* the picture, and then rename the file. Finally, images with names that actually make sense!

Okay, in this section, I'll walk you through how to build this project step by step. We'll use the Gemini AI model, and I'll even throw in some cool Python tips that are useful for all sorts of stuff, like generators and working with file paths.


## 02 - Getting Images Using a Generator

In this video, I'll show you an efficient way to find all images within a directory and its subdirectories. This isn't just about images: you can reuse this Python code in tons of applications.

We'll also explore some cool advanced Python concepts like working with paths, files and generators. 
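
Since generators are central to this section, here is a small sketch of what "lazy" means in practice. The names `numbered_items`, `produced`, and `first_three` are made up for this illustration and are not part of the project code.

```python
from itertools import islice

def numbered_items(limit, log):
    """Yield 0..limit-1, recording each value in `log` as it is produced."""
    for i in range(limit):
        log.append(i)   # runs only when the consumer asks for the next value
        yield i

produced = []
gen = numbered_items(1_000_000, produced)

# Nothing has run yet: calling a generator function only creates the generator
print(produced)                  # []

# Take just the first three values; the remaining ones are never computed
first_three = list(islice(gen, 3))
print(first_three)               # [0, 1, 2]
print(produced)                  # [0, 1, 2]
```

The same `islice()` trick should work on any generator, so something like `islice(get_images('./images'), 3)` is a cheap way to preview a few images from a huge folder.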


In [111]:
# I'm importing the 'os' module for working with files and directories
import os 

# I'm importing the 'Path' class for easier file path handling
from pathlib import Path 

# I'm also importing the 'Image' class from PIL because Gemini Pro Vision works with PIL images
from PIL import Image  

# You are probably thinking that a function that returns all images from a directory is appropriate.
# However this is not efficient.

# Imagine you have a huge folder filled with thousands of images. 
# Using a regular function, you'd first need to find all the images, load them into memory, and then start processing them.
# That'd use a ton of space!

# Generators are way smarter. They find one image, hand it to you, then go find the next.
# It's like streaming – you only deal with one image at a time, which saves a bunch of memory.


# As a quick refresher on functions vs. generators: a regular function runs to completion and returns all its results at once.
# A generator, on the other hand, is like a "lazy worker": it finds one result, hands it to you, then pauses until you ask for the next one.
# Perfect for handling big datasets or when you don't need everything right away.

def get_images(directory):
    # Because it uses 'yield', this is a generator function, not a regular function.

    # I'm defining a tuple of supported image file extensions
    supported_extensions = ('.png', '.jpg', '.jpeg', '.gif')

    # Let's loop through the directory structure using os.walk().
    # os.walk() is a generator that goes through each directory and gives us
    # the current directory (root), its subdirectories (which I don't use, so I'll call them _),
    # and a list of files in the current directory (filenames).
    for root, _, filenames in os.walk(directory):

        # Let's loop over the filenames
        for filename in filenames:
            # I'll check if the filename ends with a supported extension,
            # and I'll make the check case-insensitive.
            if filename.lower().endswith(supported_extensions):

                # Let's build the full path to the image file.
                # (It is only truly absolute if 'directory' is an absolute path.)
                absolute_path = os.path.join(root, filename)

                # I'm opening the image as a PIL image object, ready for Gemini.
                img = Image.open(absolute_path)

                # Since this is a generator, I'm using 'yield' to return the image
                # object and the full image path.
                yield img, absolute_path
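
The cell above imports `Path` but does the walking with `os.walk()`. As a rough sketch of the same idea written with `pathlib`, here is a variant that lazily yields only the paths and does not open the images. The name `iter_image_paths` and the constant `SUPPORTED_EXTENSIONS` are hypothetical, not part of the original notebook.

```python
from pathlib import Path

# Same extensions as in get_images()
SUPPORTED_EXTENSIONS = {'.png', '.jpg', '.jpeg', '.gif'}

def iter_image_paths(directory):
    """Lazily yield the path of every supported image under `directory`.

    Hypothetical helper for illustration only.
    """
    # Path.rglob('*') walks the directory tree recursively and is itself lazy
    for path in Path(directory).rglob('*'):
        # suffix is the final extension including the dot, e.g. '.JPG'
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS:
            yield path
```

You would iterate over it the same way, for example `for path in iter_image_paths('./images'): print(path)`, and open each image only when you actually need it.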


In [103]:
# Let's check the generator.
# I will iterate over the values it yields.
for img, absolute_path in get_images('./images'):
    print(img, absolute_path)
    print()

<PIL.PngImagePlugin.PngImageFile image mode=RGBA size=370x582 at 0x73E27614DBB0> ./images/open_fridge_full_of_food.png

<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=695x760 at 0x73E27614C890> ./images/shakshuka_eggs_tomatoes_peppers_cilantro.jpg

<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=793x412 at 0x73E27614F4D0> ./images/toy_poodle_puppy_eating_cucumber.jpg

<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1920x866 at 0x73E27614FC50> ./images/salad_bowl_with_greens_and_fruits.jpg

<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=418x927 at 0x73E27614FCE0> ./images/sagrada_familia_barcelona_gaudi_architecture_cathedral_facade.jpg

<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=793x412 at 0x73E27614F7D0> ./images/cal/puppy_eating_cucumber.jpg

<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=695x760 at 0x73E27614FEF0> ./images/cal/shakshuka_in_a_pan.jpg



## 03 - Renaming Images Using Gemini Pro Vision Suggestions

In [None]:
# I'm importing the necessary libraries and performing the authentication to Gemini.
import google.generativeai as genai
import os
from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv(), override=True) 
genai.configure(api_key=os.environ.get('GOOGLE_API_KEY'))
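
If the `.env` file is missing, `os.environ.get('GOOGLE_API_KEY')` quietly returns `None` and the problem only shows up later, at the first API call. A minimal sketch of failing early instead, assuming a hypothetical helper named `require_api_key` (not part of any library):

```python
import os

def require_api_key(name='GOOGLE_API_KEY'):
    """Return the environment variable `name`, or raise if it is missing or empty.

    Hypothetical helper for illustration only.
    """
    key = os.environ.get(name, '').strip()
    if not key:
        raise RuntimeError(f'{name} is not set; add it to your .env file or export it')
    return key
```

With this in place, the configuration line could become `genai.configure(api_key=require_api_key())`.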

In [114]:
# I'm creating the model object. The LLM will use both text and an image as inputs, so Gemini Pro Vision is the right model for this.
model = genai.GenerativeModel('gemini-pro-vision')

# I'm creating the prompt for the image analysis and renaming task
prompt = '''Analyze this image in detail. 
Generate a descriptive image filename using only these rules:
* Relevant keywords describing the image, separated by underscores.
* Lowercase letters only.
* No special characters.
* Keep it short and accurate.
Respond ONLY with the image filename (no extension).

Example: child_running_in_the_rain
''' 

# I used prompt engineering to get the best results. This prompt is effective because:
# * I give clear instructions, one per line.
# * I specify the exact output format I want. 
# * I avoid spaces in filenames, as they can cause issues on some systems.
# * I include a helpful example.

# I'm setting the directory containing the images
my_directory = "./images"  

# Now, let's process each image in the directory using an iteration
for img, absolute_path in get_images(my_directory):

    # I'm making the API call, sending the prompt and the image to Gemini for analysis:
    response = model.generate_content([prompt, img])
    
    # Gemini's answer will not preserve the image extension. It will respond only with a description of the image that will be the new name.
    # I will need the original file extension and I'm getting it by calling os.path.splitext().
    # This function takes a path as an argument and splits it into two components: root and extension.
    root, ext = os.path.splitext(absolute_path)

    # I will show you an example in another cell to see how it works.
    # I'm using Linux or Mac paths but it works the same with Windows paths.

    # Root is everything before the last period, including the directories leading up to the filename.
    # Ext is the extension, starting with that last period (for example '.jpg').
    
    # Next I will build the new filename, keeping the original extension.
    # I'm calling strip() to remove any whitespace before or after the response.
    # In practice I noticed that Gemini sometimes adds trailing whitespace.
    new_filename = response.text.strip() + ext

    # I'll need the full path for the new image, which is composed of a base dir and the new filename.
    # os.path.join() inserts the right separator for the current operating system.
    base_dir = os.path.dirname(absolute_path)
    new_filepath = os.path.join(base_dir, new_filename)


    # I will call os.rename() to rename the file, but first I want to check that everything works as expected:
    # all images are processed correctly by Gemini, the path and the extension are preserved, and so on.
    # I add this line last, once those checks pass.
    os.rename(absolute_path, new_filepath)

    # I'm displaying a message for the user
    print(f'{absolute_path} was renamed to {new_filepath}')

    print('-' * 50)


# Okay, let's run the code! You'll see Gemini analyze each image and come up with a descriptive new filename 
# – just like we asked in the prompt.

# I'm adding the renaming part now!

# I'm running the code again, noticing how the application is renaming all the files in this directory and its subdirectories.

# In this example, I renamed the files according to Gemini's response. However, you can add a prefix or a suffix to your images if you wish.
# For example, I'm adding 'vacation_paris_2024_' at the beginning:
# new_filename = 'vacation_paris_2024_' + response.text.strip() + ext

# That's it! You learned how to rename your meaningless image names in a smart way using AI.


./images/sagrada_familia_barcelona_gaudi.jpg was renamed to ./images/vacation_paris_2024_sagrada_familia_barcelona_gaudi.jpg
--------------------------------------------------
./images/salad_bowl_with_greens_and_fruit.jpg was renamed to ./images/vacation_paris_2024_colorful_salad_bowl.jpg
--------------------------------------------------
./images/open_fridge_full_of_food.png was renamed to ./images/vacation_paris_2024_open_fridge_full_of_food.png
--------------------------------------------------


InternalServerError: 500 An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting
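
The run above stopped on a 500 error, and the message itself suggests retrying. Here is a minimal sketch of a retry wrapper with exponential backoff. The name `call_with_retries` is hypothetical, and catching every `Exception` is an assumption made for brevity; in real code you would probably narrow it to the errors you consider temporary.

```python
import time

def call_with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func() and return its result, retrying when it raises.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... seconds between
    attempts and re-raises the last exception once `attempts` is used up.
    Hypothetical helper for illustration only.
    """
    for attempt in range(attempts):
        try:
            return func()
        except Exception:          # assumption: every failure is worth retrying
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Inside the loop, the API call could then be wrapped as `response = call_with_retries(lambda: model.generate_content([prompt, img]))`.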

### -----
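
The prompt asks for lowercase keywords separated by underscores, but nothing guarantees the model always follows the rules. As a sketch, the response could be cleaned up before it is used as a filename. The function `sanitize_name` is hypothetical, and keeping digits (which the prompt does not mention) is my own choice here.

```python
import re

def sanitize_name(text, fallback='unnamed_image'):
    """Reduce `text` to lowercase letters, digits, and single underscores.

    Hypothetical helper for illustration only.
    """
    name = text.strip().lower()
    # Replace every run of characters other than a-z and 0-9 with one underscore
    name = re.sub(r'[^a-z0-9]+', '_', name)
    # Drop underscores left over at either end
    name = name.strip('_')
    # If nothing usable is left, fall back to a placeholder name
    return name or fallback

print(sanitize_name(' Child Running in the Rain!\n'))   # child_running_in_the_rain
```

In the renaming loop, this would replace the plain `response.text.strip()` call.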

In [109]:
root, ext = os.path.splitext('/home/user/documents/project/image.jpg')
print(root, ext)

/home/user/documents/project/image .jpg


In [110]:
absolute_path = '/home/user/documents/project/image.jpg'
base_dir = os.path.dirname(absolute_path)
base_dir

'/home/user/documents/project'
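
Putting `os.path.dirname()` and `os.path.splitext()` together, here is a sketch of a helper that builds the new path, optionally adds a prefix, and avoids names that already exist. That last part matters because two images can get the same suggested name, and on Linux and Mac `os.rename()` silently replaces an existing file. The name `build_new_path` and its parameters are hypothetical.

```python
import os

def build_new_path(absolute_path, new_name, prefix=''):
    """Return a path next to `absolute_path`, named prefix + new_name + the
    original extension, that does not exist yet.

    Hypothetical helper for illustration only.
    """
    base_dir = os.path.dirname(absolute_path)
    _, ext = os.path.splitext(absolute_path)
    candidate = os.path.join(base_dir, f'{prefix}{new_name}{ext}')
    counter = 2
    # Append _2, _3, ... until the name is free, so nothing gets overwritten
    while os.path.exists(candidate):
        candidate = os.path.join(base_dir, f'{prefix}{new_name}_{counter}{ext}')
        counter += 1
    return candidate
```

A call such as `build_new_path(absolute_path, response.text.strip(), prefix='vacation_paris_2024_')` would then produce the path to pass to `os.rename()`. One caveat of this sketch: a file that already has the suggested name also gets a numeric suffix.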

In [107]:
base, ext = os.path.splitext(absolute_path)
base

'/home/user/documents/project/image'
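
For comparison, the same pieces are available through `pathlib`. This sketch uses `PurePosixPath` so the output looks the same on every operating system; with real files you would normally use `Path`. The filename `sunset_over_beach` is just an invented example.

```python
from pathlib import PurePosixPath

path = PurePosixPath('/home/user/documents/project/image.jpg')

print(path.parent)   # /home/user/documents/project
print(path.stem)     # image
print(path.suffix)   # .jpg

# with_name() swaps the filename and keeps the directory
new_path = path.with_name('sunset_over_beach' + path.suffix)
print(new_path)      # /home/user/documents/project/sunset_over_beach.jpg
```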