## Concatenate `.caption`s with `.tags` to `.txt`
----

This Python script walks a directory tree and processes each image file (`.jpeg`, `.jpg`, `.png`) it finds. If an image has both a matching tag file (`.tags`) and a matching caption file (`.caption`), the script writes a single text file (`.txt`) with the same base name, containing the tags followed by a comma and then the caption. Images missing either file are skipped with a warning.
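As a small illustration of the file pairing described above, the sketch below derives the three companion paths from an image path. The helper names `sidecar_paths` and `is_image` and the `training_dir` folder are hypothetical and used only for this example; the script itself builds the same paths with `os.path.splitext` and `os.path.join`.

```python
from pathlib import Path

# Image extensions the script looks for (compared case-insensitively).
IMAGE_SUFFIXES = {'.jpeg', '.jpg', '.png'}


def is_image(path):
    """Return True if the path has one of the supported image extensions."""
    return Path(path).suffix.lower() in IMAGE_SUFFIXES


def sidecar_paths(image_path):
    """Return the .tags, .caption and .txt paths sharing the image's base name."""
    image_path = Path(image_path)
    return tuple(image_path.with_suffix(ext) for ext in ('.tags', '.caption', '.txt'))


# Hypothetical example path; only the base name matters here.
tags_path, caption_path, txt_path = sidecar_paths('training_dir/Aspekt.png')
print(tags_path.name, caption_path.name, txt_path.name)
```

Only the final extension is swapped, so an image named `a.b.png` would pair with `a.b.tags`, `a.b.caption`, and `a.b.txt`.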

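The script also modifies the content before writing it:

- Commas followed by a space are removed from the caption, which strips the commas inside sentences.
- Every period in the caption is followed by a comma, so each sentence ends with `.,`; the trailing comma at the very end of the caption is removed.
- Any parentheses in the tags are escaped with a backslash.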
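To make the rules above concrete, here is a minimal sketch that applies the same string operations to in-memory text instead of files. The function name `merge_tags_and_caption` and the sample tags and caption are invented for illustration.

```python
def merge_tags_and_caption(tags, caption):
    """Hypothetical helper mirroring the string handling described above."""
    # Escape parentheses in the tags with a backslash.
    tags = tags.strip().replace('(', '\\(').replace(')', '\\)')
    # Remove commas that are followed by a space (commas inside sentences).
    caption = caption.strip().replace(', ', ' ')
    # Follow every period with a comma, then drop the trailing comma.
    caption = caption.replace('.', '.,').rstrip(',')
    return tags + ', ' + caption


# Invented sample input, not taken from the real dataset.
example = merge_tags_and_caption(
    "wolf, solo, smile (happy)",
    "A grey wolf, smiling, sits on a rock. The sky is blue.",
)
print(example)
# wolf, solo, smile \(happy\), A grey wolf smiling sits on a rock., The sky is blue.
```

Because the period rule is a plain string replacement, it also touches periods that do not end a sentence: a caption containing `3.5` would come out as `3.,5`.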

```python
import os

IMAGE_EXTENSIONS = ('.jpeg', '.jpg', '.png')


def process_image_files(directory):
    """
    Walk `directory` recursively. For each image file (.jpeg, .jpg, .png) that has
    both a matching .tags file and a matching .caption file, write a .txt file
    containing the tags, a comma, and then the caption.

    Modifications applied before writing:
        - Tags: parentheses are escaped with a backslash.
        - Caption: commas followed by a space are removed.
        - Caption: every period is followed by a comma, except at the very end.

    Parameters:
        directory (str): The directory path containing image files and associated tags
                         and caption files.
    """
    for root, _dirs, files in os.walk(directory):
        for file in files:
            if not file.lower().endswith(IMAGE_EXTENSIONS):
                continue

            image_name, _ = os.path.splitext(file)
            tags_file = os.path.join(root, image_name + '.tags')
            caption_file = os.path.join(root, image_name + '.caption')
            txt_file = os.path.join(root, image_name + '.txt')

            has_tags = os.path.exists(tags_file)
            has_caption = os.path.exists(caption_file)
            if not has_tags:
                print(f"Warning: Tags file missing for {file}")
            if not has_caption:
                print(f"Warning: Caption file missing for {file}")
            if not (has_tags and has_caption):
                continue

            # UTF-8 is assumed here; without an explicit encoding, open() falls
            # back to a platform-dependent default.
            with open(tags_file, 'r', encoding='utf-8') as f:
                tags = f.read().strip()
            # Escape parentheses in the tags.
            tags = tags.replace('(', '\\(').replace(')', '\\)')

            with open(caption_file, 'r', encoding='utf-8') as f:
                caption = f.read().strip()
            caption = caption.replace(', ', ' ')   # drop commas inside sentences
            caption = caption.replace('.', '.,')   # end each sentence with ".,"
            caption = caption.rstrip(',')          # no trailing comma at the end

            with open(txt_file, 'w', encoding='utf-8') as f:
                f.write(tags + ', ' + caption)
            print(f"Processed {file} successfully.")


directory = r'C:\Users\kade\Desktop\training_dir_staging'
process_image_files(directory)
```

Output:

```
Processed Aspekt.png successfully.
Processed BeckAndArco.png successfully.
Processed CAACAgEAAxUAAWDcC-RAMyhgCSdjHXALY3-AW04jAAJJAQACUzihRiT7YFrWoWOgIAQ.png successfully.
Processed CAACAgEAAxUAAWDcH6ECBkkGen9vu0nfz3nOhi4KAAKqAQACdX3pR00bYhmqsDSOIAQ.png successfully.
Processed CAACAgEAAxUAAWP-kzWlMHAtZpEPPPuEGOI2DuFiAAJCAgAC2tWhRusq-Cq3l32HLgQ.png successfully.
Processed Sevrah.png successfully.
Processed Soukuugo.png successfully.
```
