# Watermark Robustness Pipeline

This notebook implements a pipeline to:
- Download raw images from Azure Blob Storage
- Add watermarks to the raw images
- Apply a variety of image transformations to those watermarked images
- Check whether watermarks are still detectable in both original and transformed images
- Calculate similarity and other metrics
- Save results to CSV files for further analysis
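
The detection step comes down to comparing the bit string recovered from a transformed image with the one recovered from the original. Below is a minimal sketch of one such comparison. `bit_accuracy` is a hypothetical helper, not a function from this repo, and it assumes both messages are equal-length strings made of '0' and '1' characters.

```python
def bit_accuracy(reference_bits: str, decoded_bits: str) -> float:
    """Return the fraction of positions where two bit strings agree."""
    if len(reference_bits) != len(decoded_bits):
        raise ValueError("Bit strings must have the same length.")
    if not reference_bits:
        raise ValueError("Bit strings must not be empty.")
    # Count the positions where the two messages carry the same bit.
    matches = sum(r == d for r, d in zip(reference_bits, decoded_bits))
    return matches / len(reference_bits)


# 7 of the 8 bits agree, so this prints 0.875.
print(bit_accuracy("10110010", "10110110"))
```

A value near 1.0 suggests the watermark survived the transformation; a value near 0.5 is what random guessing would give.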

This Jupyter notebook uses the azureml_py38 environment (from the requirements_watermark.txt file).
It is also designed to be run in an "Azure AI" environment (https://ml.azure.com/).

We recommend reading the comments before running each section. Do not just click "Run all" - not all of the code sections are needed.

In particular we have 'hard-coded' some of the file locations.
These should be similar when you install the repo to your file system inside Azure AI. However you will have to change "David.Fletcher" to your name.
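
One way to keep the user name in a single place is to derive every project directory from it. The sketch below is illustrative only: `build_project_paths` is a hypothetical helper, and the sub-directory names are the ones this notebook uses for its root, home and raw-image directories.

```python
import posixpath


def build_project_paths(user_name: str,
                        azure_root_dir: str = "/home/azureuser/cloudfiles/code/Users/") -> dict:
    """Derive the project directories from a single user name."""
    # posixpath.join avoids doubled or missing slashes between components.
    root_dir = posixpath.join(azure_root_dir, user_name)
    return {
        "root_dir": root_dir,
        "home_dir": posixpath.join(root_dir, "ost-embedding-research"),
        "raw_images_dir": posixpath.join(root_dir, "embedding_data", "raw_watermarked_images"),
    }


paths = build_project_paths("Your.Name")
# Prints /home/azureuser/cloudfiles/code/Users/Your.Name/ost-embedding-research
print(paths["home_dir"])
```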

### Scratch notes

In [163]:
## There is a collection of scripts in the repo which may be useful for the reader.
#
#
#                "StripBits" creates the BitMasks in the 'Output/Mask' folders.
#                "StabSig" (an ipynb file!) creates the transforms in the 
#                                       'OutputTransformationsStableSig' folder.
#                "scratchfile" creates the two missing transforms from StabSig.
#                "test" creates the transforms in the 'OutputTransformations' folder.

### Control Variables

This is a list of control variables referenced elsewhere in the Pipeline.
The Pipeline is currently set up to target an Azure AI environment.

We strongly recommend changing at least the user_name control variable.

In [164]:
##### CHOOSE WATERMARK DETECTION METHOD #####

## The following code overwrites the "detect_watermark" function with the relevant detector from one of the three methods.
## This means that we have to change the WatermarkMethod variable to switch between methods (and have to run the pipeline multiple times).
## Methods available: Stable Signature, Watermark-anything, TrustMark

WatermarkMethod = "Watermark_Anything" # Methods are Stable_Signature, Watermark_Anything, TrustMark  TESTED WITH HARDCODED
#WatermarkMethod = "TrustMark" # Methods are Stable_Signature, Watermark_Anything, TrustMark           TESTED WITH HARDCODED
#WatermarkMethod = "Stable_Signature" # Methods are Stable_Signature, Watermark_Anything, TrustMark     Broken - "from hidden import attenuation" error.
                                                                                                        #May need StableSignature/hidden to be on the path. Or Detector. Ask Jireh to verify

In [None]:
import os

# Inside Ofcom Azure AI you will generally need to change 'David.Fletcher' to your username.
# When running locally or in a different environment, modify the directory structure accordingly.
user_name = 'David.Fletcher' # Used to create the home directory and various child directories.
azure_root_dir = '/home/azureuser/cloudfiles/code/Users/' # Azure AI shared root directory. Used to create the home directory.
root_dir = azure_root_dir + user_name # Root directory. Data directory and code directory are subdirectories of this.






################################################################################


## File locations
# We are currently storing the raw and transformed images at several locations spread throughout the Azure AI file system.
# (This is because several team members worked on different parts of the project, and moving files inside Azure AI is non-trivial).
# As such we will include the current file locations in Azure AI, and control variables to set up a proper directory structure.

Use_Current_File_Locations = True # Set to False to use the new directory structure.

if Use_Current_File_Locations:
    # Current file locations in Azure AI
    home_dir = '/home/azureuser/cloudfiles/code/Users/David.Fletcher/ost-embedding-research/' # Home directory for the project.
    raw_images_dir = '/home/azureuser/cloudfiles/code/Users/David.Fletcher/embedding_data/raw_watermarked_images/' #Location of the images downloaded from BLOB storage
    
    


    trustmark_test_location = '/home/azureuser/cloudfiles/code/Users/David.Fletcher/embedding_data/trustmark/' # Location used for the TrustMark example run: not part of the main pipeline.


# The raw images are currently stored in the Azure AI file system at '/home/azureuser/cloudfiles/code/Users/David.Fletcher/embedding_data/raw_watermarked_images/'
# The code below builds the equivalent locations under root_dir when Use_Current_File_Locations is False.

if not Use_Current_File_Locations:
    home_dir = root_dir + '/ost-embedding-research/'
    raw_images_dir = root_dir + '/embedding_data/raw_watermarked_images/'

    # We may be using the Watermark-Anything library. The code will swap the working directory at that point.
    wk_dir_wa = root_dir + '/ost-embedding-research/watermarkanything/watermark-anything'
    wk_dir_wa_checkpoints =  wk_dir_wa + "/checkpoints"








    trustmark_test_location = root_dir + '/embedding_data/trustmark/'

# TO DO
################################################################################





##

##


# We have included code to watermark and detect a watermark for the TrustMark method.
# This is not part of the pipeline, but included to help the reader.
# The following variable controls whether that code is run.
# This variable is only used if WatermarkMethod is set to "TrustMark".
TrustMark_Example_Run = False


# We have included code to download the raw images from BLOB storage.
# This should not be done unless you want to download the images again, and is disabled by default.
DOWNLOAD_IMAGES = False # Set to True to download the images from BLOB storage.


    





################################################################################


## Global variable to set the number of images to process
images_to_process = 2 # 700 is the maximum value.
images_to_ = images_to_process


In [166]:
# Set working directory to where the configs are expected.
wk_dir = azure_root_dir + user_name + '/ost-embedding-research/PotentialTransforms' 
os.chdir(wk_dir)
print("CWD set to:", os.getcwd())

CWD set to: /mnt/batch/tasks/shared/LS_root/mounts/clusters/df-cpu/code/Users/David.Fletcher/ost-embedding-research/PotentialTransforms


This notebook is designed to run three different watermark detection methods.
Uncomment the relevant WatermarkMethod line in the Control Variables cell above.
This notebook does not support running multiple methods at once - please run the notebook multiple times for that.
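
Because only one method is active per run, a misspelt `WatermarkMethod` value would silently skip every method-specific cell. A small guard such as the sketch below can catch that early. `SUPPORTED_METHODS` and `validate_watermark_method` are hypothetical names introduced for illustration; the three method strings are the ones used in this notebook.

```python
SUPPORTED_METHODS = ("Stable_Signature", "Watermark_Anything", "TrustMark")


def validate_watermark_method(method: str) -> str:
    """Return the method name unchanged, or raise if it is not supported."""
    if method not in SUPPORTED_METHODS:
        raise ValueError(
            f"Unknown watermark method {method!r}; expected one of {SUPPORTED_METHODS}."
        )
    return method


# Passes silently for a supported value.
print(validate_watermark_method("Watermark_Anything"))
```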

The process of downloading raw images from our shared BLOB storage is not strictly part of this pipeline. The code for that is in the "access_blob_storage_subfolder.py" script in this folder (/PotentialTransforms). For testing purposes we have included a stripped down version of it in this notebook.

The actual creation of watermarked images is not part of this pipeline, as it was handled manually.

The code to create the TrustMark-watermarked images is in the "trustmark_test.py" script, at /ost-embedding-research/trustmark/trustmark_test.py
The code to create the WatermarkAnything-watermarked images is at /ost-embedding-research/watermarkanything, in the inferencehacking copy.ipynb and associated scripts.

## Jireh - please verify the following Stable Signature statement;
The code to create Stable Signature-watermarked images is in the /utils/ folder, in the "generate_watermark_imgs.py" file.

The files stored in Azure BLOB storage are watermarked. [If not, please flag where the watermarked files are stored]






In [None]:
if Use_Current_File_Locations:

    if WatermarkMethod == "Watermark_Anything":
        ## Watermark-anything watermarks and image locations
        download_path = "/home/azureuser/cloudfiles/code/Users/David.Fletcher/embedding_data/watermarked_transforms_wa/"
        raw_images_path = "/home/azureuser/cloudfiles/code/Users/Jireh.Jam/Deepfake-Embedding-Mitigation-Measures/watermark-anything/outputs/watermarked/"
        sub_folder = "watermarked/"   
        wk_dir_wa = '/home/azureuser/cloudfiles/code/Users/David.Fletcher/ost-embedding-research/watermarkanything/watermark-anything'
        wk_dir_wa_checkpoints = wk_dir_wa + "/checkpoints"
        
    if WatermarkMethod == "TrustMark":
        ## Trustmark watermarks and image locations
        download_path = "/home/azureuser/cloudfiles/code/Users/H.Kristjansdottir/watermarked_transforms_tm/"
        raw_images_path = "/home/azureuser/cloudfiles/code/Users/H.Kristjansdottir/trustmark_watermarked/"
        sub_folder = "trustmark_watermarked/"

    if WatermarkMethod == "Stable_Signature":
        ## Stable Signature watermarks and image locations??
        download_path = "/home/azureuser/cloudfiles/code/Users/David.Fletcher/embedding_data/"
        sub_folder = "raw_watermarked_images/"   
        raw_images_path = download_path + sub_folder #Must be a subdirectory for download dependency reasons.

if not Use_Current_File_Locations:
    home_dir = root_dir + '/ost-embedding-research/'

    if WatermarkMethod == "Watermark_Anything":
        ## Watermark-anything watermarks and image locations
        download_path = root_dir + "/embedding_data/watermarked_transforms_wa/"
        raw_images_path = home_dir + "watermark-anything/outputs/watermarked/" #Sub-optimal, but matches the current directory structure. home_dir already ends with '/'.
        sub_folder = "watermarked/"   
        wk_dir_wa = home_dir + 'watermarkanything/watermark-anything'
        wk_dir_wa_checkpoints = wk_dir_wa + "/checkpoints"
        
    if WatermarkMethod == "TrustMark":
        ## Trustmark watermarks and image locations
        download_path = root_dir + "/watermarked_transforms_tm/"
        raw_images_path = root_dir + "/trustmark_watermarked/" #Sub-optimal, but matches the current directory structure.
        sub_folder = "trustmark_watermarked/"

    if WatermarkMethod == "Stable_Signature":
        ## Stable Signature watermarks and image locations??
        download_path = root_dir + "/embedding_data/"
        sub_folder = "raw_watermarked_images/"   #Sub-optimal, but matches the current directory structure.
        raw_images_path = download_path + sub_folder #Must be a subdirectory for download dependency reasons.
    

## Download Images from Azure Blob Storage

Set up parameters and download a sample of watermarked images to the local directory.
Note that this only needs to be run once, and downloads a large number of files into your Azure AI file system (needing a lot of storage).
To avoid accidental rerunning, we have set the variable "DOWNLOAD_IMAGES" to False. Switch it to True to run the cell.
Similarly switch "num_images" to a larger number if you want to download the entire dataset.
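
The download cell leaves the storage account name and key blank. One option is to read them from environment variables rather than typing them into the notebook. This is only a sketch: `load_blob_credentials` is a hypothetical helper, and the variable names `AZURE_STORAGE_ACCOUNT` and `AZURE_STORAGE_KEY` are an assumption about how your environment is configured.

```python
import os


def load_blob_credentials(env=None):
    """Return (account_name, account_key) read from an environment-style mapping."""
    env = os.environ if env is None else env
    # Assumed variable names; adjust them to match your own configuration.
    account_name = env.get("AZURE_STORAGE_ACCOUNT", "")
    account_key = env.get("AZURE_STORAGE_KEY", "")
    if not account_name or not account_key:
        raise RuntimeError("Storage credentials are not set in the environment.")
    return account_name, account_key


# Demonstration with a stand-in mapping instead of the real environment.
demo_env = {"AZURE_STORAGE_ACCOUNT": "demo-account", "AZURE_STORAGE_KEY": "demo-key"}
account_name, account_key = load_blob_credentials(demo_env)
print(account_name)
```

Failing early on missing credentials avoids starting a long download that cannot authenticate.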

In [None]:
from access_blob_storage_subfolder import download_blobs_from_subfolder


## Step 1:
# Download sample images to the 'embedding/inputfiles' folder
if DOWNLOAD_IMAGES and WatermarkMethod == "Stable_Signature":
    # Define parameters
    account_name = "" # Azure Blob Storage account name
    account_key = ""
    container_name = "embedding" # Blob container name
    download_path = "./embedding_data/" # Local download directory
    # Define the sub-folder you want to access
    sub_folder = "raw_watermarked_images/"  # Replace with your sub-folder name
    num_images = 2   # Number of images to download

In [169]:
# ACTION: Download the images.
# Set DOWNLOAD_IMAGES to True in the Control Variables cell to download images from Azure Blob Storage.
# The download parameters are only defined for the Stable_Signature method, so the same guard is applied here.
if DOWNLOAD_IMAGES and WatermarkMethod == "Stable_Signature":
    print("Downloading images from Azure Blob Storage...")
    download_blobs_from_subfolder(account_name, account_key, container_name, download_path, sub_folder, num_images)

## Setup and Imports

Install required packages and import necessary libraries and custom modules.
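
The install cells check for each package with `importlib.util.find_spec` before installing it. The same check can be applied to a list of packages in one go, as in the sketch below. `missing_packages` and `install_packages` are hypothetical helpers, and the sketch assumes the import name matches the pip package name, which is not true of every package.

```python
import importlib.util
import subprocess
import sys


def missing_packages(module_names):
    """Return the top-level module names that cannot be imported."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


def install_packages(package_names):
    """Install packages into the interpreter that is running this notebook."""
    if package_names:
        subprocess.check_call([sys.executable, "-m", "pip", "install", *package_names])


# Report what would need installing; call install_packages(...) to act on it.
print(missing_packages(["timm", "trustmark"]))
```

Using `sys.executable -m pip` targets the kernel's own environment, which matters when the kernel and the shell point at different Python installations.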

In [170]:
# Install the packages needed to detect watermarks. The default environment is azureml_py38; the scripts tend to run on Python 3.10.14 (the "watermark" venv).
import importlib.util
if importlib.util.find_spec("timm") is None:
    !pip install timm

In [171]:
import importlib.util
if importlib.util.find_spec("trustmark") is None:
    !pip install trustmark

In [172]:
# If you are using the Watermark_Anything method, you will need to change the working directory to that repo.

if (WatermarkMethod == "Watermark_Anything"):
    import os

    print("Old CWD is:", os.getcwd())
    # Set working directory to where the configs are expected
    os.chdir(wk_dir_wa)
    print("CWD set to:", os.getcwd())
    

Old CWD is: /mnt/batch/tasks/shared/LS_root/mounts/clusters/df-cpu/code/Users/David.Fletcher/ost-embedding-research/PotentialTransforms
CWD set to: /mnt/batch/tasks/shared/LS_root/mounts/clusters/df-cpu/code/Users/David.Fletcher/ost-embedding-research/watermarkanything/watermark-anything


In [173]:
## Load the Watermark-Anything model, to enable Watermark_Anything's detect_watermark function.

if (WatermarkMethod == "Watermark_Anything"):
    import os
    import torch
    from PIL import Image
    from torchvision.transforms.functional import to_tensor

    import sys
    sys.path.append(wk_dir_wa)

    from watermark_anything.data.metrics import msg_predict_inference
    from notebooks.inference_utils import (
        load_model_from_checkpoint, 
        default_transform, 
        msg2str
    )

    from huggingface_hub import hf_hub_download
    ckpt_path = hf_hub_download(
        repo_id="facebook/watermark-anything",
        filename="checkpoint.pth"
    )

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")



    # Load the model
    
    exp_dir = wk_dir_wa_checkpoints
    json_path = os.path.join(exp_dir, "params.json")

    wam = load_model_from_checkpoint(json_path, ckpt_path).to(device).eval()

making attention of type 'vanilla' with 64 in_channels
Working with z of shape (1, 68, 32, 32) = 69632 dimensions.
making attention of type 'vanilla' with 64 in_channels
Model loaded successfully from /home/azureuser/.cache/huggingface/hub/models--facebook--watermark-anything/snapshots/88ffb2cd1079a76b01d6beda8bf2a433c25fa881/checkpoint.pth
{'embedder_config': 'configs/embedder.yaml', 'augmentation_config': 'configs/all_augs_multi_wm.yaml', 'extractor_config': 'configs/extractor.yaml', 'attenuation_config': 'configs/attenuation.yaml', 'embedder_model': 'vae_small', 'extractor_model': 'sam_base', 'nbits': 32, 'img_size': 256, 'img_size_extractor': 256, 'attenuation': 'jnd_1_3_blue', 'scaling_w': 2.0, 'scaling_w_schedule': None, 'scaling_i': 1.0, 'roll_probability': 0.2, 'multiple_w': 1.0, 'nb_wm_eval': 5, 'optimizer': 'AdamW,lr=1e-4', 'optimizer_d': None, 'scheduler': 'CosineLRScheduler,lr_min=1e-6,t_initial=100,warmup_lr_init=1e-6,warmup_t=5', 'epochs': 200, 'batch_size': 8, 'batch_siz

In [174]:
## Now that the model is set up, define a detect_watermark function for Watermark-Anything.

if (WatermarkMethod == "Watermark_Anything"):
    
    def detect_watermark(image_path, ckpt_path=None):
        """Detects and returns the watermark bits hidden in the image.
            
        Args:        image_path (str): Path to the watermarked image.
                     ckpt_path (str, optional): Unused variable for compatibility.
        
        """
        # Load the watermarked image
        img = Image.open(image_path).convert("RGB")
        img_tensor = default_transform(img).unsqueeze(0).to(device)
        
        # Detect watermark
        preds = wam.detect(img_tensor)["preds"]
        mask_preds = torch.sigmoid(preds[:, 0, :, :])
        bit_preds = preds[:, 1:, :, :]
        
        # Predict message
        pred_message = msg_predict_inference(bit_preds, mask_preds).cpu().float()
        confidence = torch.max(mask_preds).item()
        
        wm_file = "File"
        # Print result
        #print(f"{wm_file}: {msg2str(pred_message[0])} (confidence: {confidence:.4f})")
        
        return msg2str(pred_message[0])  
    


In [175]:
## If we are using the TrustMark method, set up this version of the detect_watermark function

if (WatermarkMethod == "TrustMark"):
    from trustmark import TrustMark
    from PIL import Image

    # Initialize TrustMark
    tm = TrustMark(verbose=True, model_type='Q')
    ckpt_path = 0 # Unused variable for compatibility, as TrustMark does not require a checkpoint path.

    # --- Detection function ---

    def detect_watermark(image_path, ckpt_path=None):
        """Detects and returns the watermark bits hidden in the image.
            
        Args:        image_path (str): Path to the watermarked image.
                     ckpt_path (str, optional): Unused variable for compatibility.
        
        """
        # Load the watermarked image
        watermarked_image = Image.open(image_path).convert('RGB')

        # Decode the secret message
        wm_secret, wm_present, wm_schema = tm.decode(watermarked_image)
        
        if wm_present:
            decoded_str = wm_secret
        else:
            decoded_str = None
        return decoded_str

In [176]:
## This is an example of using TrustMark's encoding and decoding functions.
## The code to create the TrustMark-watermarked images is in the "trustmark_test.py" script,
## at /ost-embedding-research/trustmark/trustmark_test.py

if TrustMark_Example_Run:
    from trustmark import TrustMark
    from PIL import Image

    # Initialize TrustMark
    tm = TrustMark(verbose=True, model_type='Q')

    # --- Encoding Example ---
    # Load an image
    try:
        sample_image_location = raw_images_dir + '/0_1ab8ea5ecaf859f061e099597d72b5ee_watermarked.png'
        cover = Image.open(sample_image_location).convert('RGB')
    except FileNotFoundError:
        print("Error: file not found. Please replace with the path to your image.")
        raise  # Stop the cell here; exit() is intended for interactive shells, not notebook cells.
        
    # Encode a secret message and save the watermarked image
    tm_save_location = trustmark_test_location + '/watermarked_image.png'
    tm.encode(cover, 'eightlet').save(tm_save_location)
    print("Image watermarked and saved as ", tm_save_location)


    # --- Decoding Example ---
    # Load the watermarked image
    watermarked_image = Image.open(tm_save_location).convert('RGB')

    # Decode the secret message
    wm_secret, wm_present, wm_schema = tm.decode(watermarked_image)

    if wm_present:
        print(f'Extracted secret: {wm_secret}')
    else:
        print('No watermark detected.')

    # --- Removal Example ---
    # Remove the watermark and save the cleaned image
    stego = Image.open(tm_save_location).convert('RGB')  # Read the file from where it was saved above.
    im_recover = tm.remove_watermark(stego)
    tm_recover_location = trustmark_test_location + '/recovered_image.png'
    im_recover.save(tm_recover_location)
    print("Watermark removed and saved as", tm_recover_location)  

In [177]:
## If we are using the Stable_Signature method, set up this version of the detect_watermark function


if (WatermarkMethod == "Stable_Signature"):
    ## NOTE: We are defining the ckpt_path here, to be used in the functions later.
    import torch
    from torchvision import transforms
    from PIL import Image
    import argparse
    import matplotlib.pyplot as plt
    import numpy as np
    from skimage.metrics import peak_signal_noise_ratio
    import sys
    import os

    # Add the parent directory of attenuations.py and models.py to the Python path
    detector_dir = home_dir + "/Detector"
    sys.path.append(detector_dir)

    from models import HiddenEncoder, HiddenDecoder, EncoderWithJND, EncoderDecoder
    from attenuations import JND

    # Define variable for location of the ckpt file
    ckpt_path = detector_dir + "/ckpt/hidden_replicate.pth"


    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    # Helper functions to convert between boolean message arrays and bit strings
    def msg2str(msg):
        return "".join(['1' if el else '0' for el in msg])

    def str2msg(s):
        return [el == '1' for el in s]

    # Parameters class
    class Params():
        def __init__(self, encoder_depth: int, encoder_channels: int, decoder_depth: int, decoder_channels: int, num_bits: int,
                    attenuation: str, scale_channels: bool, scaling_i: float, scaling_w: float):
            # Encoder and decoder parameters
            self.encoder_depth = encoder_depth
            self.encoder_channels = encoder_channels
            self.decoder_depth = decoder_depth
            self.decoder_channels = decoder_channels
            self.num_bits = num_bits
            # Attenuation parameters
            self.attenuation = attenuation
            self.scale_channels = scale_channels
            self.scaling_i = scaling_i
            self.scaling_w = scaling_w

    # Define image transforms (using ImageNet normalization)
    NORMALIZE_IMAGENET = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                                std=[0.229, 0.224, 0.225])
    UNNORMALIZE_IMAGENET = transforms.Normalize(mean=[-0.485/0.229, -0.456/0.224, -0.406/0.225],
                                                std=[1/0.229, 1/0.224, 1/0.225])
    default_transform = transforms.Compose([
        transforms.ToTensor(),
        NORMALIZE_IMAGENET
    ])

    # Set up parameters
    params = Params(
        encoder_depth=4, encoder_channels=64, decoder_depth=8, decoder_channels=64, num_bits=48,
        attenuation="jnd", scale_channels=False, scaling_i=1, scaling_w=1.5
    )

    # Create encoder and decoder models
    decoder = HiddenDecoder(
        num_blocks=params.decoder_depth, 
        num_bits=params.num_bits, 
        channels=params.decoder_channels
    )
    encoder = HiddenEncoder(
        num_blocks=params.encoder_depth, 
        num_bits=params.num_bits, 
        channels=params.encoder_channels
    )
    attenuation = JND(preprocess=UNNORMALIZE_IMAGENET) if params.attenuation == "jnd" else None
    encoder_with_jnd = EncoderWithJND(
        encoder, attenuation, params.scale_channels, params.scaling_i, params.scaling_w
    )

    # Move encoder and decoder to device
    encoder_with_jnd = encoder_with_jnd.to(device).eval()
    decoder = decoder.to(device).eval()

    # Function to load a decoder from a checkpoint
    def load_decoder(ckpt_path, decoder_depth, num_bits, decoder_channels, device):
        """Loads the HiddenDecoder model with weights from a checkpoint."""
        decoder_model = HiddenDecoder(num_blocks=decoder_depth, num_bits=num_bits, channels=decoder_channels)
        state_dict = torch.load(ckpt_path, map_location=device)['encoder_decoder']
        # Remove any "module." prefixes if present
        encoder_decoder_state_dict = {k.replace('module.', ''): v for k, v in state_dict.items()}
        decoder_state_dict = {k.replace('decoder.', ''): v 
                            for k, v in encoder_decoder_state_dict.items() if 'decoder' in k}
        decoder_model.load_state_dict(decoder_state_dict)
        decoder_model = decoder_model.to(device).eval()
        return decoder_model

    # Function to detect watermark from an image using the decoder model
    def detect_watermark(image_path, ckpt_path, decoder_depth=8, num_bits=48, decoder_channels=64):
        """Detects and returns the watermark bits hidden in the image."""
        # Load watermarked image and resize to expected dimensions.
        img = Image.open(image_path).convert('RGB')
        img = img.resize((512, 512), Image.BICUBIC)
        # Apply the default transform.
        img_tensor = default_transform(img).unsqueeze(0).to(device)
        # Load the decoder model.
        decoder_model = load_decoder(ckpt_path, decoder_depth, num_bits, decoder_channels, device)
        # Decode the watermark.
        ft = decoder_model(img_tensor)
        decoded_msg = ft > 0  # Threshold to obtain binary message
        decoded_str = msg2str(decoded_msg.squeeze(0).cpu().numpy())
        return decoded_str

In [178]:
## Set the current working directory back to the original location

import os
import sys
sys.path.append(wk_dir)  # Ensure the current working directory is in the path

# Print the current working directory
print(f"Current working directory: {os.getcwd()}")


# Set the working directory to the notebook's directory (if needed)
os.chdir(azure_root_dir + user_name)
print(f"Updated working directory: {os.getcwd()}")

Current working directory: /mnt/batch/tasks/shared/LS_root/mounts/clusters/df-cpu/code/Users/David.Fletcher/ost-embedding-research/watermarkanything/watermark-anything
Updated working directory: /mnt/batch/tasks/shared/LS_root/mounts/clusters/df-cpu/code/Users/David.Fletcher


## Apply Image Transformations

For each downloaded image, apply a suite of transformations (resize, flip, rotate, crop, blur, color jitter, etc.) and save the results to organized subfolders.
These cells take a very long time to run. We suggest testing on a very small number of images at first (<10).
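
The transformation cell repeats the same call pattern for every transform. A table-driven loop is one way to reduce that repetition. The sketch below is an illustration under assumptions: `apply_transforms` is a hypothetical helper, and `copy_unchanged` is a stand-in that only copies the file so the example runs without the repo's `combined_transforms` module.

```python
import os
import shutil
import tempfile


def apply_transforms(image_path, image_name, output_root, transforms):
    """Run each transform once and write its result to its own sub-folder.

    `transforms` maps a folder name to a (function, keyword-arguments) pair.
    Each function is called as function(input_path, output_path, **kwargs).
    """
    written = []
    for folder_name, (transform, kwargs) in transforms.items():
        output_dir = os.path.join(output_root, folder_name)
        os.makedirs(output_dir, exist_ok=True)
        output_path = os.path.join(output_dir, image_name)
        transform(image_path, output_path, **kwargs)
        written.append(output_path)
    return written


def copy_unchanged(input_path, output_path, note=""):
    """Hypothetical stand-in transform: copies the file without altering it."""
    shutil.copyfile(input_path, output_path)


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "example.png")
    with open(source, "wb") as handle:
        handle.write(b"placeholder bytes, not a real image")
    outputs = apply_transforms(
        source, "example.png", os.path.join(tmp, "out"),
        {"copied_images": (copy_unchanged, {}),
         "copied_again_images": (copy_unchanged, {"note": "demo"})},
    )
    print(len(outputs))  # Two transforms were applied, so this prints 2.
```

In the notebook, the table entries would instead pair folder names with the real functions, for example `"compressed_images": (combined_transforms.jpeg_compress, {"quality": 85})`.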

In [179]:
import os
from PIL import Image

#from scratchfile import save_compressed_image
import combined_transforms


#### Step 2:  Apply the transforms to the images
# For each image in the above folder, apply the transforms 
#  and save the images to sub directories of the local embedding_data folder.
####

# Counter for the number of images processed
image_counter = 0

# Loop through all images in the raw images folder
for image_name in os.listdir(raw_images_path):
    image_path = os.path.join(raw_images_path, image_name)
    
    # Check if the file is an image
    if os.path.isfile(image_path) and image_name.lower().endswith(('.png', '.jpg', '.jpeg')):
        
        # Stop once the requested number of images has been processed.
        if image_counter >= images_to_process:
            print(f"Processed {images_to_process} images, stopping further processing.")
            break  
        print(f"Processing image: {image_name}")
        image_counter = image_counter + 1

        ## Compress image and save it to the output directory
        #output_images_path = os.path.join(download_path, "compressed_image/", image_name)
        #input_image = Image.open(image_path)
        #save_compressed_image(input_image, output_images_path, quality=1)

        ## Add other transformations from combined_transforms.py here
        # Define output directories for each transformation
        transform_output_dirs = {
            "resized": os.path.join(download_path, "resized_images/"),
            "flipped": os.path.join(download_path, "flipped_images/"),
            "rotated": os.path.join(download_path, "rotated_images/"),
            "jittered": os.path.join(download_path, "jittered_images/"),
            "normalized": os.path.join(download_path, "normalized_images/"),
            "blurred": os.path.join(download_path, "blurred_images/"),
            "cropped": os.path.join(download_path, "cropped_images/"),
            "perspective": os.path.join(download_path, "perspective_images/"),
            "erased": os.path.join(download_path, "erased_images/"),
            "grayscale": os.path.join(download_path, "grayscale_images/"),
            "text_overlay": os.path.join(download_path, "text_overlay_images/"),
            "compressed": os.path.join(download_path, "compressed_images/"),
            "brightness_adjusted": os.path.join(download_path, "brightness_adjusted_images/"),
            "contrast_adjusted": os.path.join(download_path, "contrast_adjusted_images/"),
            "saturation_adjusted": os.path.join(download_path, "saturation_adjusted_images/"),
            "hue_adjusted": os.path.join(download_path, "hue_adjusted_images/"),
            "gamma_adjusted": os.path.join(download_path, "gamma_adjusted_images/"),
            "sharpness_adjusted": os.path.join(download_path, "sharpness_adjusted_images/")
        }

        # Add bitmask directories for values between 0 and 8
        for bitmask_value in range(9):
            transform_output_dirs[f"bitmask_{bitmask_value}"] = os.path.join(download_path, f"bitmask_{bitmask_value}_images/")

        # Create directories if they don't exist
        for transform_dir in transform_output_dirs.values():
            os.makedirs(transform_dir, exist_ok=True)

        # Apply transformations and save images. Low priority TODO: Stop unneeded repetition. Breaking 'DRY' principle.
        combined_transforms.resize_image(image_path, os.path.join(transform_output_dirs["resized"], image_name))
        combined_transforms.random_horizontal_flip(image_path, os.path.join(transform_output_dirs["flipped"], image_name))
        combined_transforms.fixed_rotation(image_path, os.path.join(transform_output_dirs["rotated"], image_name))
        combined_transforms.color_jitter(image_path, os.path.join(transform_output_dirs["jittered"], image_name))
        combined_transforms.normalize_image(image_path, os.path.join(transform_output_dirs["normalized"], image_name))
        combined_transforms.gaussian_blur(image_path, os.path.join(transform_output_dirs["blurred"], image_name))
        combined_transforms.centre_crop(image_path, os.path.join(transform_output_dirs["cropped"], image_name))
        combined_transforms.random_perspective(image_path, os.path.join(transform_output_dirs["perspective"], image_name))
        combined_transforms.random_erasing(image_path, os.path.join(transform_output_dirs["erased"], image_name))
        combined_transforms.grayscale(image_path, os.path.join(transform_output_dirs["grayscale"], image_name))
        combined_transforms.overlay_text(image_path, os.path.join(transform_output_dirs["text_overlay"], image_name), text="Sample Text")
        combined_transforms.jpeg_compress(image_path, os.path.join(transform_output_dirs["compressed"], image_name), quality=85)
        combined_transforms.adjust_brightness(image_path, os.path.join(transform_output_dirs["brightness_adjusted"], image_name), brightness_factor=1.2)
        combined_transforms.adjust_contrast(image_path, os.path.join(transform_output_dirs["contrast_adjusted"], image_name), contrast_factor=1.2)
        combined_transforms.adjust_saturation(image_path, os.path.join(transform_output_dirs["saturation_adjusted"], image_name), saturation_factor=1.2)
        combined_transforms.adjust_hue(image_path, os.path.join(transform_output_dirs["hue_adjusted"], image_name), hue_factor=0.1)
        combined_transforms.adjust_gamma(image_path, os.path.join(transform_output_dirs["gamma_adjusted"], image_name), gamma=1.5)
        combined_transforms.adjust_sharpness(image_path, os.path.join(transform_output_dirs["sharpness_adjusted"], image_name), sharpness_factor=2.0)

        # Apply bitmask transformations for values between 0 and 8
        for bitmask_value in range(9):
            combined_transforms.bitmask_image(image_path, os.path.join(transform_output_dirs[f"bitmask_{bitmask_value}"], image_name), bitmask_value)









FileNotFoundError: [Errno 2] No such file or directory: '/home/azureuser/cloudfiles/code/Users/David.Fletcher/ost-embedding-research//watermark-anything/outputs/watermarked/'

In [None]:
from PIL import Image
import os

#from scratchfile import save_compressed_image
import combined_transforms



def scaled_centre_crop(image_path, output_path, scaling_factor=0.8):
    """
    Wrapper function for centre_crop that calculates the size of the image,
    scales it by a given factor, and crops the center of the image.

    Args:
        image_path (str): Path to the input image.
        output_path (str): Path to save the cropped image.
        scaling_factor (float): Scaling factor between 0 and 1 to determine the crop size.
                                Default is 0.8 (80% of the original size).
    """
    # Ensure the scaling factor is valid
    if not (0 < scaling_factor <= 1):
        raise ValueError("Scaling factor must be greater than 0 and at most 1.")

    # Load the image to calculate its size; the context manager closes the file afterwards.
    with Image.open(image_path) as image:
        original_width, original_height = image.size

    # Calculate the new crop size based on the scaling factor
    new_width = int(original_width * scaling_factor)
    new_height = int(original_height * scaling_factor)

    # Call the centre_crop function with the calculated size
    combined_transforms.centre_crop(image_path, output_path, size=(new_height, new_width))
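
The crop geometry behind `scaled_centre_crop` can be sketched without any image library. The helper below, `centre_crop_box`, is hypothetical and is not part of `combined_transforms`. It assumes the crop is centred, with any odd leftover pixel going to the right and bottom edges; the real `centre_crop` may round differently.

```python
def centre_crop_box(width, height, scaling_factor=0.8):
    """Return (left, upper, right, lower) for a centred crop.

    Hypothetical helper for illustration only; not part of combined_transforms.
    """
    if not (0 < scaling_factor <= 1):
        raise ValueError("Scaling factor must be greater than 0 and at most 1.")

    # Same truncating arithmetic as scaled_centre_crop
    new_width = int(width * scaling_factor)
    new_height = int(height * scaling_factor)

    # Split the removed margin evenly; an odd leftover pixel goes right/bottom
    left = (width - new_width) // 2
    upper = (height - new_height) // 2
    return (left, upper, left + new_width, upper + new_height)


# An 80% crop of a 1000x500 image removes 100 px on each side and 50 px top and bottom
box = centre_crop_box(1000, 500, scaling_factor=0.8)
```

The returned tuple uses the `(left, upper, right, lower)` ordering that PIL's `Image.crop` accepts, so it could be used to check the output sizes of the cropped folders.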

In [None]:
##### Semi-scratch. Code to generate pictures with various percentage crops
##### (100%, 99%, 90%, 80%, ..., 10%)

import os
from PIL import Image

#from scratchfile import save_compressed_image
import combined_transforms


#### Step 2:  Apply the transforms to the images
# For each image in the above folder, apply the transforms 
#  and save the images to sub directories of the local embedding_data folder.
####

# Counter for the number of images processed
image_counter = 0

# Loop through all images in the raw images folder
for image_name in os.listdir(raw_images_path):
    image_path = os.path.join(raw_images_path, image_name)
    
    # Check if the file is an image
    if os.path.isfile(image_path) and image_name.lower().endswith(('.png', '.jpg', '.jpeg')):
        
        # Stop once the requested number of images has been processed
        if image_counter >= images_to_process:
            print(f"Processed {images_to_process} images, stopping further processing.")
            break  
        image_counter = image_counter + 1

        ## Compress image and save it to the output directory
        #output_images_path = os.path.join(download_path, "compressed_image/", image_name)
        #input_image = Image.open(image_path)
        #save_compressed_image(input_image, output_images_path, quality=1)

        ## Add other transformations from combined_transforms.py here
        # Crop percentages to generate; each one gets its own output directory
        crop_percentages = [100, 99, 90, 80, 70, 60, 50, 40, 30, 20, 10]

        for crop_percent in crop_percentages:
            # Create the output directory if it doesn't exist
            crop_output_dir = os.path.join(download_path, f"cropped_images{crop_percent}/")
            os.makedirs(crop_output_dir, exist_ok=True)

            # Centre-crop to crop_percent% of the original width and height, then save
            scaled_centre_crop(image_path, os.path.join(crop_output_dir, image_name), scaling_factor=crop_percent / 100)






Processed 2 images, stopping further processing.


In [None]:
#####
## Now we repeat that process for 'blur', i.e. the gaussian_blur function
from PIL import Image
import os

#from scratchfile import save_compressed_image
import combined_transforms


def apply_gaussian_blur(image_path, output_path, kernel_size=51):
    """
    Wrapper function for gaussian_blur that validates the kernel size
    and applies Gaussian blur to the image.

    Args:
        image_path (str): Path to the input image.
        output_path (str): Path to save the blurred image.
        kernel_size (int): Size of the Gaussian kernel. Must be an odd number.
                           Default is 51.
    """
    # Ensure the kernel size is a positive odd number
    if kernel_size <= 0 or kernel_size % 2 == 0:
        raise ValueError("Kernel size must be a positive odd number.")

    # Call the gaussian_blur function
    combined_transforms.gaussian_blur(image_path, output_path, kernel_size=kernel_size)

In [None]:
##### Semi-scratch. Code to generate pictures with various blurs
##### (kernel sizes 1, 3, 7, ..., 51, ..., 501)

import os
from PIL import Image

#from scratchfile import save_compressed_image
import combined_transforms


#### Step 2:  Apply the transforms to the images
# For each image in the above folder, apply the transforms 
#  and save the images to sub directories of the local embedding_data folder.
####

# Image counter (temporary variable)
image_counter = 0

# Loop through all images in the raw images folder
for image_name in os.listdir(raw_images_path):
    image_path = os.path.join(raw_images_path, image_name)
    
    # Check if the file is an image
    if os.path.isfile(image_path) and image_name.lower().endswith(('.png', '.jpg', '.jpeg')):
        
        # Log the file name, then stop once the requested number of images has been processed
        print(f"Processing image: {image_name}")
        if image_counter >= images_to_process:
            print(f"Processed {images_to_process} images, stopping further processing.")
            break  
        image_counter = image_counter + 1


        # Gaussian kernel sizes to apply; each one gets its own output directory.
        # Every size must be a positive odd number (checked in apply_gaussian_blur).
        blur_kernel_sizes = [1, 3, 7, 15, 31, 51, 75, 101, 301, 501]

        for kernel_size in blur_kernel_sizes:
            # Create the output directory if it doesn't exist
            blur_output_dir = os.path.join(download_path, f"blurred_images{kernel_size}/")
            os.makedirs(blur_output_dir, exist_ok=True)

            # Blur the image with this kernel size and save it
            apply_gaussian_blur(image_path, os.path.join(blur_output_dir, image_name), kernel_size=kernel_size)





Processing image: 0_08cc23c13b79af4d3852e78a8af8ced_original_trustmark_watermarked.png
Processing image: 0_126b1334283b521949e0684339c5389b_original_trustmark_watermarked.png
Processing image: 0_1416aabf6d59fd4e348473adc87838f_original_trustmark_watermarked.png
Processed 2 images, stopping further processing.


In [None]:



import os
from PIL import Image

#from scratchfile import save_compressed_image
import combined_transforms



# Counter for the number of images processed
image_counter = 0
# Loop through all images in the raw images folder
for image_name in os.listdir(raw_images_path):
    image_path = os.path.join(raw_images_path, image_name)
    # Check if the file is an image
    if os.path.isfile(image_path) and image_name.lower().endswith(('.png', '.jpg', '.jpeg')):

        # Stop once the requested number of images has been processed.
        # Only image files are counted, matching the crop and blur cells.
        if image_counter >= images_to_process:
            print(f"Processed {images_to_process} images, stopping further processing.")
            break
        image_counter += 1


        # Define output directories for each transformation. Done via tab completion.
        transform_output_dirs = {
            "brightness0.25": os.path.join(download_path, "brightened_images0-25/"),
            "brightness0.5": os.path.join(download_path, "brightened_images0-5/"),
            "brightness0.7": os.path.join(download_path, "brightened_images0-7/"),
            "brightness0.8": os.path.join(download_path, "brightened_images0-8/"),
            "brightness0.9": os.path.join(download_path, "brightened_images0-9/"),
            "brightness1": os.path.join(download_path, "brightened_images1/"),
            "brightness1.1": os.path.join(download_path, "brightened_images1-1/"),
            "brightness1.2": os.path.join(download_path, "brightened_images1-2/"),
            "brightness1.3": os.path.join(download_path, "brightened_images1-3/"),
            "brightness1.5": os.path.join(download_path, "brightened_images1-5/"),
            "brightness1.75": os.path.join(download_path, "brightened_images1-75/"),
            "contrast0.25": os.path.join(download_path, "contrast_images0-25/"),
            "contrast0.5": os.path.join(download_path, "contrast_images0-5/"),
            "contrast0.7": os.path.join(download_path, "contrast_images0-7/"),
            "contrast0.8": os.path.join(download_path, "contrast_images0-8/"),
            "contrast0.9": os.path.join(download_path, "contrast_images0-9/"),
            "contrast1": os.path.join(download_path, "contrast_images1/"),
            "contrast1.1": os.path.join(download_path, "contrast_images1-1/"),
            "contrast1.2": os.path.join(download_path, "contrast_images1-2/"),
            "contrast1.3": os.path.join(download_path, "contrast_images1-3/"),
            "contrast1.5": os.path.join(download_path, "contrast_images1-5/"),
            "contrast1.75": os.path.join(download_path, "contrast_images1-75/"),
            "gamma0.25": os.path.join(download_path, "gamma_images0-25/"),
            "gamma0.5": os.path.join(download_path, "gamma_images0-5/"),
            "gamma0.7": os.path.join(download_path, "gamma_images0-7/"),
            "gamma0.8": os.path.join(download_path, "gamma_images0-8/"),
            "gamma0.9": os.path.join(download_path, "gamma_images0-9/"),
            "gamma1": os.path.join(download_path, "gamma_images1/"),
            "gamma1.1": os.path.join(download_path, "gamma_images1-1/"),
            "gamma1.2": os.path.join(download_path, "gamma_images1-2/"),
            "gamma1.3": os.path.join(download_path, "gamma_images1-3/"),
            "gamma1.5": os.path.join(download_path, "gamma_images1-5/"),
            "gamma1.75": os.path.join(download_path, "gamma_images1-75/"),
            "hueminus0.5": os.path.join(download_path, "hue_imagesminus0-5/"),
            "hueminus0.3": os.path.join(download_path, "hue_imagesminus0-3/"),
            "hueminus0.2": os.path.join(download_path, "hue_imagesminus0-2/"),
            "hueminus0.1": os.path.join(download_path, "hue_imagesminus0-1/"),
            "hue0": os.path.join(download_path, "hue_images0/"),
            "hue0.1": os.path.join(download_path, "hue_images0-1/"),
            "hue0.2": os.path.join(download_path, "hue_images0-2/"),
            "hue0.3": os.path.join(download_path, "hue_images0-3/"),
            "hue0.5": os.path.join(download_path, "hue_images0-5/"),
            "saturation0.25": os.path.join(download_path, "saturation_images0-25/"),
            "saturation0.5": os.path.join(download_path, "saturation_images0-5/"),
            "saturation0.7": os.path.join(download_path, "saturation_images0-7/"),
            "saturation0.8": os.path.join(download_path, "saturation_images0-8/"),
            "saturation0.9": os.path.join(download_path, "saturation_images0-9/"),
            "saturation1": os.path.join(download_path, "saturation_images1/"),
            "saturation1.1": os.path.join(download_path, "saturation_images1-1/"),
            "saturation1.2": os.path.join(download_path, "saturation_images1-2/"),
            "saturation1.3": os.path.join(download_path, "saturation_images1-3/"),
            "saturation1.5": os.path.join(download_path, "saturation_images1-5/"),
            "saturation1.75": os.path.join(download_path, "saturation_images1-75/"),
            "sharpness0.25": os.path.join(download_path, "sharpness_images0-25/"),
            "sharpness0.5": os.path.join(download_path, "sharpness_images0-5/"),
            "sharpness0.7": os.path.join(download_path, "sharpness_images0-7/"),
            "sharpness0.8": os.path.join(download_path, "sharpness_images0-8/"),
            "sharpness0.9": os.path.join(download_path, "sharpness_images0-9/"),
            "sharpness1": os.path.join(download_path, "sharpness_images1/"),
            "sharpness1.1": os.path.join(download_path, "sharpness_images1-1/"),
            "sharpness1.2": os.path.join(download_path, "sharpness_images1-2/"),
            "sharpness1.3": os.path.join(download_path, "sharpness_images1-3/"),
            "sharpness1.5": os.path.join(download_path, "sharpness_images1-5/"),
            "sharpness1.75": os.path.join(download_path, "sharpness_images1-75/"),
        }

        # Create directories if they don't exist
        for transform_dir in transform_output_dirs.values():
            os.makedirs(transform_dir, exist_ok=True)

        # Apply transformations and save images. Low priority TODO: Stop unneeded repetition. Breaking 'DRY' principle.
        combined_transforms.adjust_brightness(image_path, os.path.join(transform_output_dirs["brightness0.25"], image_name), brightness_factor=0.25)
        combined_transforms.adjust_brightness(image_path, os.path.join(transform_output_dirs["brightness0.5"], image_name), brightness_factor=0.5)
        combined_transforms.adjust_brightness(image_path, os.path.join(transform_output_dirs["brightness0.7"], image_name), brightness_factor=0.7)
        combined_transforms.adjust_brightness(image_path, os.path.join(transform_output_dirs["brightness0.8"], image_name), brightness_factor=0.8)
        combined_transforms.adjust_brightness(image_path, os.path.join(transform_output_dirs["brightness0.9"], image_name), brightness_factor=0.9)
        combined_transforms.adjust_brightness(image_path, os.path.join(transform_output_dirs["brightness1"], image_name), brightness_factor=1)
        combined_transforms.adjust_brightness(image_path, os.path.join(transform_output_dirs["brightness1.1"], image_name), brightness_factor=1.1)
        combined_transforms.adjust_brightness(image_path, os.path.join(transform_output_dirs["brightness1.2"], image_name), brightness_factor=1.2)
        combined_transforms.adjust_brightness(image_path, os.path.join(transform_output_dirs["brightness1.3"], image_name), brightness_factor=1.3)
        combined_transforms.adjust_brightness(image_path, os.path.join(transform_output_dirs["brightness1.5"], image_name), brightness_factor=1.5)
        combined_transforms.adjust_brightness(image_path, os.path.join(transform_output_dirs["brightness1.75"], image_name), brightness_factor=1.75)
        combined_transforms.adjust_contrast(image_path, os.path.join(transform_output_dirs["contrast0.25"], image_name), contrast_factor=0.25)
        combined_transforms.adjust_contrast(image_path, os.path.join(transform_output_dirs["contrast0.5"], image_name), contrast_factor=0.5)
        combined_transforms.adjust_contrast(image_path, os.path.join(transform_output_dirs["contrast0.7"], image_name), contrast_factor=0.7)
        combined_transforms.adjust_contrast(image_path, os.path.join(transform_output_dirs["contrast0.8"], image_name), contrast_factor=0.8)
        combined_transforms.adjust_contrast(image_path, os.path.join(transform_output_dirs["contrast0.9"], image_name), contrast_factor=0.9)
        combined_transforms.adjust_contrast(image_path, os.path.join(transform_output_dirs["contrast1"], image_name), contrast_factor=1)
        combined_transforms.adjust_contrast(image_path, os.path.join(transform_output_dirs["contrast1.1"], image_name), contrast_factor=1.1)
        combined_transforms.adjust_contrast(image_path, os.path.join(transform_output_dirs["contrast1.2"], image_name), contrast_factor=1.2)
        combined_transforms.adjust_contrast(image_path, os.path.join(transform_output_dirs["contrast1.3"], image_name), contrast_factor=1.3)
        combined_transforms.adjust_contrast(image_path, os.path.join(transform_output_dirs["contrast1.5"], image_name), contrast_factor=1.5)
        combined_transforms.adjust_contrast(image_path, os.path.join(transform_output_dirs["contrast1.75"], image_name), contrast_factor=1.75)
        combined_transforms.adjust_gamma(image_path, os.path.join(transform_output_dirs["gamma0.25"], image_name), gamma=0.25)
        combined_transforms.adjust_gamma(image_path, os.path.join(transform_output_dirs["gamma0.5"], image_name), gamma=0.5)
        combined_transforms.adjust_gamma(image_path, os.path.join(transform_output_dirs["gamma0.7"], image_name), gamma=0.7)
        combined_transforms.adjust_gamma(image_path, os.path.join(transform_output_dirs["gamma0.8"], image_name), gamma=0.8)
        combined_transforms.adjust_gamma(image_path, os.path.join(transform_output_dirs["gamma0.9"], image_name), gamma=0.9)
        combined_transforms.adjust_gamma(image_path, os.path.join(transform_output_dirs["gamma1"], image_name), gamma=1)
        combined_transforms.adjust_gamma(image_path, os.path.join(transform_output_dirs["gamma1.1"], image_name), gamma=1.1)
        combined_transforms.adjust_gamma(image_path, os.path.join(transform_output_dirs["gamma1.2"], image_name), gamma=1.2)
        combined_transforms.adjust_gamma(image_path, os.path.join(transform_output_dirs["gamma1.3"], image_name), gamma=1.3)
        combined_transforms.adjust_gamma(image_path, os.path.join(transform_output_dirs["gamma1.5"], image_name), gamma=1.5)
        combined_transforms.adjust_gamma(image_path, os.path.join(transform_output_dirs["gamma1.75"], image_name), gamma=1.75)
        combined_transforms.adjust_hue(image_path, os.path.join(transform_output_dirs["hueminus0.5"], image_name), hue_factor=-0.5)
        combined_transforms.adjust_hue(image_path, os.path.join(transform_output_dirs["hueminus0.3"], image_name), hue_factor=-0.3)
        combined_transforms.adjust_hue(image_path, os.path.join(transform_output_dirs["hueminus0.2"], image_name), hue_factor=-0.2)
        combined_transforms.adjust_hue(image_path, os.path.join(transform_output_dirs["hueminus0.1"], image_name), hue_factor=-0.1)
        combined_transforms.adjust_hue(image_path, os.path.join(transform_output_dirs["hue0"], image_name), hue_factor=0)
        combined_transforms.adjust_hue(image_path, os.path.join(transform_output_dirs["hue0.1"], image_name), hue_factor=0.1)
        combined_transforms.adjust_hue(image_path, os.path.join(transform_output_dirs["hue0.2"], image_name), hue_factor=0.2)
        combined_transforms.adjust_hue(image_path, os.path.join(transform_output_dirs["hue0.3"], image_name), hue_factor=0.3)
        combined_transforms.adjust_hue(image_path, os.path.join(transform_output_dirs["hue0.5"], image_name), hue_factor=0.5)
        combined_transforms.adjust_saturation(image_path, os.path.join(transform_output_dirs["saturation0.25"], image_name), saturation_factor=0.25)
        combined_transforms.adjust_saturation(image_path, os.path.join(transform_output_dirs["saturation0.5"], image_name), saturation_factor=0.5)
        combined_transforms.adjust_saturation(image_path, os.path.join(transform_output_dirs["saturation0.7"], image_name), saturation_factor=0.7)
        combined_transforms.adjust_saturation(image_path, os.path.join(transform_output_dirs["saturation0.8"], image_name), saturation_factor=0.8)
        combined_transforms.adjust_saturation(image_path, os.path.join(transform_output_dirs["saturation0.9"], image_name), saturation_factor=0.9)
        combined_transforms.adjust_saturation(image_path, os.path.join(transform_output_dirs["saturation1"], image_name), saturation_factor=1)
        combined_transforms.adjust_saturation(image_path, os.path.join(transform_output_dirs["saturation1.1"], image_name), saturation_factor=1.1)
        combined_transforms.adjust_saturation(image_path, os.path.join(transform_output_dirs["saturation1.2"], image_name), saturation_factor=1.2)
        combined_transforms.adjust_saturation(image_path, os.path.join(transform_output_dirs["saturation1.3"], image_name), saturation_factor=1.3)
        combined_transforms.adjust_saturation(image_path, os.path.join(transform_output_dirs["saturation1.5"], image_name), saturation_factor=1.5)
        combined_transforms.adjust_saturation(image_path, os.path.join(transform_output_dirs["saturation1.75"], image_name), saturation_factor=1.75)
        combined_transforms.adjust_sharpness(image_path, os.path.join(transform_output_dirs["sharpness0.25"], image_name), sharpness_factor=0.25)
        combined_transforms.adjust_sharpness(image_path, os.path.join(transform_output_dirs["sharpness0.5"], image_name), sharpness_factor=0.5)
        combined_transforms.adjust_sharpness(image_path, os.path.join(transform_output_dirs["sharpness0.7"], image_name), sharpness_factor=0.7)
        combined_transforms.adjust_sharpness(image_path, os.path.join(transform_output_dirs["sharpness0.8"], image_name), sharpness_factor=0.8)
        combined_transforms.adjust_sharpness(image_path, os.path.join(transform_output_dirs["sharpness0.9"], image_name), sharpness_factor=0.9)
        combined_transforms.adjust_sharpness(image_path, os.path.join(transform_output_dirs["sharpness1"], image_name), sharpness_factor=1)
        combined_transforms.adjust_sharpness(image_path, os.path.join(transform_output_dirs["sharpness1.1"], image_name), sharpness_factor=1.1)
        combined_transforms.adjust_sharpness(image_path, os.path.join(transform_output_dirs["sharpness1.2"], image_name), sharpness_factor=1.2)
        combined_transforms.adjust_sharpness(image_path, os.path.join(transform_output_dirs["sharpness1.3"], image_name), sharpness_factor=1.3)
        combined_transforms.adjust_sharpness(image_path, os.path.join(transform_output_dirs["sharpness1.5"], image_name), sharpness_factor=1.5)
        combined_transforms.adjust_sharpness(image_path, os.path.join(transform_output_dirs["sharpness1.75"], image_name), sharpness_factor=1.75)
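
The cell above spells out every output directory and every call by hand. One possible way to address the "DRY" TODO is to derive each directory name from the factor itself. The sketch below is only an illustration: `factor_suffix` and `sweep_dirs` are hypothetical names, not functions from the repo, and whole-number factors are assumed to be passed as ints (`1`, not `1.0`) so that the names match the folders used above.

```python
def factor_suffix(factor):
    """Format a factor like the folder names above: 0.25 -> "0-25", -0.5 -> "minus0-5".

    Hypothetical helper for illustration only.
    """
    text = str(abs(factor)).replace(".", "-")
    return "minus" + text if factor < 0 else text


def sweep_dirs(prefix, factors):
    """Map each factor to its relative output folder name, e.g. "hue_images0-1/"."""
    return {factor: f"{prefix}{factor_suffix(factor)}/" for factor in factors}


# Folder names for part of the hue sweep and part of the brightness sweep
hue_dirs = sweep_dirs("hue_images", [-0.5, -0.1, 0, 0.1, 0.5])
brightness_dirs = sweep_dirs("brightened_images", [0.25, 1, 1.75])
```

Each folder name could then be joined to `download_path` and passed to the matching `combined_transforms.adjust_*` call inside a loop, replacing the long dictionary and call list.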








## Reporting and Analysis

Outline of next steps:
- Calculate scores for each image pair
- Test watermark presence in each transformed image
- Collect and save results to CSV
- (Optional) Upload results to blob storage
- Generate reports and visualizations

## Calculate Metrics and Detect Watermarks

For each transformed image, compare it to the original using various metrics (PSNR, SSIM, LBP, GLCM, etc.), detect watermarks, and save the results to a CSV file.
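
As an illustration of one of these metrics, PSNR is defined as 10 * log10(MAX^2 / MSE), where MSE is the mean squared pixel difference. The sketch below assumes 8-bit images of equal shape. `psnr_sketch` is a hypothetical name; the values written to the CSV come from `compare_images.calculate_image_metrics` and `metrics.py`, which may differ in detail.

```python
import numpy as np


def psnr_sketch(original, transformed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Illustrative only; not the implementation used by the pipeline.
    """
    original = np.asarray(original, dtype=np.float64)
    transformed = np.asarray(transformed, dtype=np.float64)
    if original.shape != transformed.shape:
        raise ValueError("Images must have the same shape.")

    # Mean squared error over all pixels and channels
    mse = np.mean((original - transformed) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


# Two flat 4x4 test images that differ by 10 grey levels everywhere (MSE = 100)
flat = np.full((4, 4), 100, dtype=np.uint8)
shifted = np.full((4, 4), 110, dtype=np.uint8)
psnr_value = psnr_sketch(flat, shifted)
```

Higher values mean the transformed image is closer to the original. Because the definition requires equal shapes, the cell below writes "NA" for transforms that change the image size.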

In [None]:
# Imported CoPilot code. Needs bug testing and clarification.

import os
import csv
from PIL import Image
import numpy as np
from metrics import calculate_texture_features, calculate_similarity_metrics ,calculate_image_metrics
import cv2

import importlib.util

## The block below imports compare_images.py from the specified path.
## os.path.join avoids doubled or missing slashes when home_dir ends with "/".
module_path = os.path.join(home_dir, "utils", "compare_images.py")
spec = importlib.util.spec_from_file_location("compare_images", module_path)
compare_images = importlib.util.module_from_spec(spec)
spec.loader.exec_module(compare_images)


image_counter = 0

# Define the paths
## TODO: StableSig was being run with
##       "raw_images_path = os.path.join(download_path, raw_watermarked_images)" here.
##       The implications of that are unclear.

# The folder names are hardcoded. Looping through all folders would be simpler,
# but could cause issues if extra folders are accidentally added.
# That is an option for the final version.
transformed_folders = [
                        "bitmask_0_images",
                        "bitmask_1_images",
                        "bitmask_2_images",
                        "bitmask_3_images",
                        "bitmask_4_images",
                        "bitmask_5_images",
                        "bitmask_6_images",
                        "bitmask_7_images",
                        "bitmask_8_images",
                        "blurred_images",
                        "brightness_adjusted_images",
                        "compressed_images",
                        "contrast_adjusted_images",
                        "cropped_images",
                        "erased_images",
                        "flipped_images",
                        "gamma_adjusted_images",
                        "grayscale_images",
                        "hue_adjusted_images",
                        "jittered_images",
                        "normalized_images",
                        "perspective_images",
                        "resized_images",
                        "rotated_images",
                        "saturation_adjusted_images",
                        "sharpness_adjusted_images",
                        "text_overlay_images",
                        ]    

transformed_folders_paths = [os.path.join(download_path, folder) for folder in transformed_folders]

# Define the output CSV file
output_csv_path = os.path.join(download_path, "comparison_results.csv")

# Open the CSV file for writing
with open(output_csv_path, mode="w", newline="") as csv_file:
    csv_writer = csv.writer(csv_file, escapechar='\\', quoting=csv.QUOTE_MINIMAL)
    
    # Write the header row
    header = ["Image Name", "Transform", "PSNR", "SSIM_value", 
                "LBP_sim", "glcm_sim_contrast", "glcm_sim_dissimilarity",
                "glcm_sim_homogeneity", "glcm_sim_energy",
                "glcm_sim_correlation", "psnr2", "ssim_value2",
                "Watermark_raw", "Watermark_transformed", "Watermark_present"]  # Add more columns for additional placeholder functions
    csv_writer.writerow(header)
    
    row_counter = 1
    image_counter = 0
    # Loop through all images in the raw images folder
    for image_name in os.listdir(raw_images_path):
        temp_image_path = os.path.join(raw_images_path, image_name)
        if row_counter % 10 == 0:
            print(f"Processing image {row_counter}")
        row_counter += 1
        if image_counter >= images_to_process:
            print(f"Processed {images_to_process} images, stopping further processing.")
            break  
        image_counter = image_counter + 1
        
        # Check if the file is an image
        if os.path.isfile(temp_image_path) and image_name.lower().endswith(('.png', '.jpg', '.jpeg')):
            # Open the raw image
            with Image.open(temp_image_path) as raw_image:
                # Loop through each transformed folder
                for folder, folder_path in zip(transformed_folders, transformed_folders_paths):
                    transformed_image_path = os.path.join(folder_path, image_name)
                    
                    # Check if the transformed image exists
                    if os.path.isfile(transformed_image_path):
                        with Image.open(transformed_image_path) as transformed_image:
                            
                            
                            # Detect watermark in raw image
                            detected_bits_raw = detect_watermark(temp_image_path, ckpt_path)

                            # Detect watermark in transformed image
                            detected_bits_transformed = detect_watermark(transformed_image_path, ckpt_path)

                            if (detected_bits_raw == detected_bits_transformed):
                                watermark_present = 1
                            else:
                                watermark_present = 0

                            # Check if the images are the same size
                            if raw_image.size != transformed_image.size:
                                # The metrics need equal sizes, so record "NA" for all of them
                                psnr = "NA"
                                ssim_value = "NA"
                                LBP_sim = "NA"
                                glcm_sim_contrast = "NA"
                                glcm_sim_dissimilarity = "NA"
                                glcm_sim_homogeneity = "NA"
                                glcm_sim_energy = "NA"
                                glcm_sim_correlation = "NA"
                                psnr2 = "NA"
                                ssim_value2 = "NA"

                            else:
                                # Convert PIL images to NumPy arrays
                                raw_image_array = np.array(raw_image)
                                transformed_image_array = np.array(transformed_image)
                                
                                # Apply the comparison function
                                psnr, ssim_value = compare_images.calculate_image_metrics(raw_image_array, transformed_image_array)
                                psnr = float(psnr)
                                ssim_value = float(ssim_value)

                                # Calculate metrics from metrics.py, reading each image once with OpenCV
                                raw_image_cv = cv2.imread(temp_image_path)
                                transformed_image_cv = cv2.imread(transformed_image_path)
                                texture_metrics_original = calculate_texture_features(image=raw_image_cv)
                                texture_metrics_transformed = calculate_texture_features(image=transformed_image_cv)
                                similarity_metrics = calculate_similarity_metrics(texture_metrics_original, texture_metrics_transformed)
                                image_metrics = calculate_image_metrics(raw_image_cv, transformed_image_cv)

                                LBP_sim = similarity_metrics['lbp_similarity']
                                glcm_sim_contrast = similarity_metrics['glcm_similarities']['contrast']
                                glcm_sim_dissimilarity = similarity_metrics['glcm_similarities']['dissimilarity']
                                glcm_sim_homogeneity = similarity_metrics['glcm_similarities']['homogeneity']
                                glcm_sim_energy = similarity_metrics['glcm_similarities']['energy']
                                glcm_sim_correlation = similarity_metrics['glcm_similarities']['correlation']
                                psnr2 = image_metrics[0]
                                ssim_value2 = image_metrics[1]


                            # Write the result to the CSV file
                            csv_writer.writerow([image_name, folder, psnr, ssim_value, 
                                                LBP_sim, glcm_sim_contrast, glcm_sim_dissimilarity,
                                                glcm_sim_homogeneity, glcm_sim_energy,
                                                glcm_sim_correlation, psnr2, ssim_value2,
                                                detected_bits_raw, detected_bits_transformed, watermark_present])

Processed 2 images, stopping further processing.
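
Each row written above ends with `detected_bits_raw`, `detected_bits_transformed` and `watermark_present`. In the cells of this section, `watermark_present` is 1 only when the two detection results are exactly equal. If the detector returns a bit sequence, the fraction of matching bits gives a graded view of the same comparison. The following is a minimal sketch under that assumption: `bit_agreement` is a hypothetical helper that is not part of this repo, and the return type of `detect_watermark` is not confirmed here.

```python
def bit_agreement(bits_a, bits_b):
    """Return the fraction of positions where two bit sequences match.

    Hypothetical helper: assumes both inputs are sequences of 0/1 values
    (lists, tuples, or strings of '0'/'1'). Returns None when the lengths
    differ or the sequences are empty, because no fair comparison exists.
    """
    a = [int(bit) for bit in bits_a]
    b = [int(bit) for bit in bits_b]
    if len(a) != len(b) or not a:
        return None
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


# Example: three of the four bits agree.
example_agreement = bit_agreement("1011", "1001")
```

A value of 1.0 corresponds to the exact-match case that the notebook records as `watermark_present = 1`.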


In [None]:
# Code imported from Copilot. Needs bug testing and clarification.

import os
import csv
from PIL import Image
import numpy as np
from metrics import calculate_texture_features, calculate_similarity_metrics, calculate_image_metrics
import cv2

import importlib.util

## Load compare_images.py as a module directly from its file path
module_path = home_dir + '/utils/compare_images.py'
spec = importlib.util.spec_from_file_location("compare_images", module_path)
compare_images = importlib.util.module_from_spec(spec)
spec.loader.exec_module(compare_images)

# Define a placeholder function
def compare(image1, image2):
    """Placeholder function to compare two images and return a number."""
    # Replace this with your actual comparison logic
    return 0.0

# Define the paths
transformed_folders = [
                        "cropped_images100",
                        "cropped_images99",
                        "cropped_images90",
                        "cropped_images80",
                        "cropped_images70",
                        "cropped_images60",
                        "cropped_images50",
                        "cropped_images40",
                        "cropped_images30",
                        "cropped_images20",
                        "cropped_images10",
                        ]    

transformed_folders_paths = [os.path.join(download_path, folder) for folder in transformed_folders]

# Define the output CSV file
output_csv_path = os.path.join(download_path, "comparison_results_cropped_deepdive.csv")

# Open the CSV file for writing
with open(output_csv_path, mode="w", newline="") as csv_file:
    csv_writer = csv.writer(csv_file, escapechar='\\', quoting=csv.QUOTE_MINIMAL)
    
    # Write the header row
    header = ["Image Name", "Transform", "PSNR", "SSIM_value", 
                "LBP_sim", "glcm_sim_contrast", "glcm_sim_dissimilarity",
                "glcm_sim_homogeneity", "glcm_sim_energy",
                "glcm_sim_correlation", "psnr2", "ssim_value2",
                "Watermark_raw", "Watermark_transformed", "Watermark_present"]  # Add more columns for additional placeholder functions
    csv_writer.writerow(header)
    
    row_counter = 1
    image_counter = 0
    # Loop through all images in the raw images folder
    for image_name in os.listdir(raw_images_path):
        temp_image_path = os.path.join(raw_images_path, image_name)
        if row_counter % 10 == 0:
            print(f"Processing image {row_counter}")
        row_counter += 1
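        # NOTE: image_counter counts every directory entry (not only image files),
        # and the "+ 2" offset lets the loop visit up to images_to_process + 2 entries.
        if image_counter >= images_to_process + 2:
            print(f"Processed {images_to_process} images, stopping further processing.")
            break
        image_counter += 1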
        
        # Check if the file is an image
        if os.path.isfile(temp_image_path) and image_name.lower().endswith(('.png', '.jpg', '.jpeg')):
            # Open the raw image
            with Image.open(temp_image_path) as raw_image:
                # Loop through each transformed folder
                for folder, folder_path in zip(transformed_folders, transformed_folders_paths):
                    transformed_image_path = os.path.join(folder_path, image_name)
                    
                    # Check if the transformed image exists
                    if os.path.isfile(transformed_image_path):
                        with Image.open(transformed_image_path) as transformed_image:
                            
                            # Detect if there is a watermark in the raw image
                            detected_bits_raw = detect_watermark(temp_image_path, ckpt_path)

                            # Detect if there is a watermark in the transformed image
                            detected_bits_transformed = detect_watermark(transformed_image_path, ckpt_path)

                            # Flag is 1 when the transformed image yields the same detection result as the raw image
                            watermark_present = 1 if detected_bits_raw == detected_bits_transformed else 0

                            # Check if the images are the same size
                            if raw_image.size != transformed_image.size:
                                psnr = "NA"  # Return 'NA' if sizes are different
                                ssim_value = "NA"  # Return 'NA' if sizes are different
                                LBP_sim = "NA"
                                glcm_sim_contrast = "NA"
                                glcm_sim_dissimilarity = "NA"
                                glcm_sim_homogeneity = "NA"
                                glcm_sim_energy = "NA"
                                glcm_sim_correlation = "NA"
                                psnr2 = "NA"
                                ssim_value2 = "NA"

                            else:
                                # Convert PIL images to NumPy arrays
                                raw_image_array = np.array(raw_image)
                                transformed_image_array = np.array(transformed_image)
                                
                                # Apply the comparison function
                                psnr, ssim_value = compare_images.calculate_image_metrics(raw_image_array, transformed_image_array)
                                psnr = float(psnr)
                                ssim_value = float(ssim_value)

                                # Calculate metrics from metrics.py
                                texture_metrics_original = calculate_texture_features(image = cv2.imread(temp_image_path))
                                texture_metrics_transformed = calculate_texture_features(image = cv2.imread(transformed_image_path))
                                similarity_metrics = calculate_similarity_metrics(texture_metrics_original, texture_metrics_transformed)
                                image_metrics = calculate_image_metrics(cv2.imread(temp_image_path), cv2.imread(transformed_image_path))

                                LBP_sim = similarity_metrics['lbp_similarity']
                                glcm_sim_contrast = similarity_metrics['glcm_similarities']['contrast']
                                glcm_sim_dissimilarity = similarity_metrics['glcm_similarities']['dissimilarity']
                                glcm_sim_homogeneity = similarity_metrics['glcm_similarities']['homogeneity']
                                glcm_sim_energy = similarity_metrics['glcm_similarities']['energy']
                                glcm_sim_correlation = similarity_metrics['glcm_similarities']['correlation']
                                psnr2 = image_metrics[0]
                                ssim_value2 = image_metrics[1]


                            # Write the result to the CSV file
                            csv_writer.writerow([image_name, folder, psnr, ssim_value, 
                                                LBP_sim, glcm_sim_contrast, glcm_sim_dissimilarity,
                                                glcm_sim_homogeneity, glcm_sim_energy,
                                                glcm_sim_correlation, psnr2, ssim_value2,
                                                detected_bits_raw, detected_bits_transformed, watermark_present])

Processed 2 images, stopping further processing.
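
The `image_counter` check in the cell above counts every entry returned by `os.listdir`, including files that are not images, so the number of images actually compared can differ from `images_to_process`. A minimal sketch of one alternative follows: filter by extension first, then cap the count with `itertools.islice`. `iter_image_names` and `IMAGE_EXTENSIONS` are hypothetical names, and sorting the names is an assumption made so that repeated runs pick the same images.

```python
import os
from itertools import islice

# Same extensions that the notebook checks with endswith().
IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg')


def iter_image_names(folder, limit=None):
    """Yield at most `limit` image file names from `folder`, in sorted order.

    Hypothetical helper: non-image entries and sub-directories are skipped
    before the limit is applied, so `limit` counts images only.
    """
    names = sorted(
        name for name in os.listdir(folder)
        if name.lower().endswith(IMAGE_EXTENSIONS)
        and os.path.isfile(os.path.join(folder, name))
    )
    # islice with limit=None returns every name.
    return islice(names, limit)
```

With this helper the outer loop could read `for image_name in iter_image_names(raw_images_path, images_to_process):`, and the separate counter and `break` would no longer be needed.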


In [None]:
# Code imported from Copilot. Needs bug testing and clarification.

import os
import csv
from PIL import Image
import numpy as np
from metrics import calculate_texture_features, calculate_similarity_metrics, calculate_image_metrics
import cv2

import importlib.util

## Load compare_images.py as a module directly from its file path
module_path = home_dir + '/utils/compare_images.py'
spec = importlib.util.spec_from_file_location("compare_images", module_path)
compare_images = importlib.util.module_from_spec(spec)
spec.loader.exec_module(compare_images)

# Define the paths
transformed_folders = [
                        "blurred_images1",
                        "blurred_images3",
                        "blurred_images7",
                        "blurred_images15",
                        "blurred_images31",
                        "blurred_images51",
                        "blurred_images75",
                        "blurred_images101",
                        "blurred_images301",
                        "blurred_images501",
                        ]    

transformed_folders_paths = [os.path.join(download_path, folder) for folder in transformed_folders]

# Define the output CSV file
output_csv_path = os.path.join(download_path, "comparison_results_blurred_deepdivefullsuite.csv")

# Open the CSV file for writing
with open(output_csv_path, mode="w", newline="") as csv_file:
    csv_writer = csv.writer(csv_file, escapechar='\\', quoting=csv.QUOTE_MINIMAL)
    
    # Write the header row
    header = ["Image Name", "Transform", "PSNR", "SSIM_value", 
                "LBP_sim", "glcm_sim_contrast", "glcm_sim_dissimilarity",
                "glcm_sim_homogeneity", "glcm_sim_energy",
                "glcm_sim_correlation", "psnr2", "ssim_value2",
                "Watermark_raw", "Watermark_transformed", "Watermark_present"]  # Add more columns for additional placeholder functions
    csv_writer.writerow(header)
    
    row_counter = 1
    image_counter = 0
    # Loop through all images in the raw images folder
    for image_name in os.listdir(raw_images_path):
        temp_image_path = os.path.join(raw_images_path, image_name)
        if row_counter % 10 == 0:
            print(f"Processing image {row_counter}")
        row_counter += 1
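        # NOTE: image_counter counts every directory entry (not only image files),
        # and the "+ 2" offset lets the loop visit up to images_to_process + 2 entries.
        if image_counter >= images_to_process + 2:
            print(f"Processed {images_to_process} images, stopping further processing.")
            break
        image_counter += 1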
        
        #print("a", row_counter)
        # Check if the file is an image
        if os.path.isfile(temp_image_path) and image_name.lower().endswith(('.png', '.jpg', '.jpeg')):
            # Open the raw image
            #print("b", row_counter)
            with Image.open(temp_image_path) as raw_image:
                # Loop through each transformed folder
                for folder, folder_path in zip(transformed_folders, transformed_folders_paths):
                    transformed_image_path = os.path.join(folder_path, image_name)
                    #print("in transformed loop", row_counter)
                    # Check if the transformed image exists
                    if os.path.isfile(transformed_image_path):
                        with Image.open(transformed_image_path) as transformed_image:
                            #print("in transformed image", row_counter)
                            # Detect if there is a watermark in the raw image
                            detected_bits_raw = detect_watermark(temp_image_path, ckpt_path)

                            # Detect if there is a watermark in the transformed image
                            detected_bits_transformed = detect_watermark(transformed_image_path, ckpt_path)

                            # Flag is 1 when the transformed image yields the same detection result as the raw image
                            watermark_present = 1 if detected_bits_raw == detected_bits_transformed else 0

                            # Check if the images are the same size
                            if raw_image.size != transformed_image.size:
                                psnr = "NA"  # Return 'NA' if sizes are different
                                ssim_value = "NA"  # Return 'NA' if sizes are different
                                LBP_sim = "NA"
                                glcm_sim_contrast = "NA"
                                glcm_sim_dissimilarity = "NA"
                                glcm_sim_homogeneity = "NA"
                                glcm_sim_energy = "NA"
                                glcm_sim_correlation = "NA"
                                psnr2 = "NA"
                                ssim_value2 = "NA"

                            else:
                                # Convert PIL images to NumPy arrays
                                raw_image_array = np.array(raw_image)
                                transformed_image_array = np.array(transformed_image)
                                
                                # Apply the comparison function
                                psnr, ssim_value = compare_images.calculate_image_metrics(raw_image_array, transformed_image_array)
                                psnr = float(psnr)
                                ssim_value = float(ssim_value)

                                # Calculate metrics from metrics.py
                                texture_metrics_original = calculate_texture_features(image = cv2.imread(temp_image_path))
                                texture_metrics_transformed = calculate_texture_features(image = cv2.imread(transformed_image_path))
                                similarity_metrics = calculate_similarity_metrics(texture_metrics_original, texture_metrics_transformed)
                                image_metrics = calculate_image_metrics(cv2.imread(temp_image_path), cv2.imread(transformed_image_path))

                                LBP_sim = similarity_metrics['lbp_similarity']
                                glcm_sim_contrast = similarity_metrics['glcm_similarities']['contrast']
                                glcm_sim_dissimilarity = similarity_metrics['glcm_similarities']['dissimilarity']
                                glcm_sim_homogeneity = similarity_metrics['glcm_similarities']['homogeneity']
                                glcm_sim_energy = similarity_metrics['glcm_similarities']['energy']
                                glcm_sim_correlation = similarity_metrics['glcm_similarities']['correlation']
                                psnr2 = image_metrics[0]
                                ssim_value2 = image_metrics[1]


                            # Write the result to the CSV file
                            csv_writer.writerow([image_name, folder, psnr, ssim_value, 
                                                LBP_sim, glcm_sim_contrast, glcm_sim_dissimilarity,
                                                glcm_sim_homogeneity, glcm_sim_energy,
                                                glcm_sim_correlation, psnr2, ssim_value2,
                                                detected_bits_raw, detected_bits_transformed, watermark_present])

Processed 2 images, stopping further processing.
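
When the raw and transformed images differ in size, the cell above assigns "NA" to each of the ten metric variables one by one. A minimal sketch of a more compact form keeps the metric column names in one list and builds the CSV row from a dict. `METRIC_COLUMNS`, `na_metrics` and `build_row` are hypothetical names; the column order is taken from the CSV header used in this notebook.

```python
# Metric columns in the same order as the CSV header written by the notebook.
METRIC_COLUMNS = [
    "PSNR", "SSIM_value", "LBP_sim",
    "glcm_sim_contrast", "glcm_sim_dissimilarity",
    "glcm_sim_homogeneity", "glcm_sim_energy",
    "glcm_sim_correlation", "psnr2", "ssim_value2",
]


def na_metrics():
    """Return a dict with every metric set to "NA" (used when sizes differ)."""
    return {name: "NA" for name in METRIC_COLUMNS}


def build_row(image_name, folder, metrics, bits_raw, bits_transformed, watermark_present):
    """Assemble one CSV row: identifiers, the ten metrics, then detection results."""
    return (
        [image_name, folder]
        + [metrics[name] for name in METRIC_COLUMNS]
        + [bits_raw, bits_transformed, watermark_present]
    )


# Example row for a size mismatch (file and folder names are illustrative only).
example_row = build_row("example.png", "blurred_images3", na_metrics(), "101", "101", 1)
```

Keeping the names in one list means a metric added later only has to be registered in one place, rather than in the header, the "NA" branch and the `writerow` call.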


In [None]:


import os
import csv
from PIL import Image
import numpy as np
from metrics import calculate_texture_features, calculate_similarity_metrics, calculate_image_metrics
import cv2

import importlib.util

## Load compare_images.py as a module directly from its file path
module_path = home_dir + '/utils/compare_images.py'
spec = importlib.util.spec_from_file_location("compare_images", module_path)
compare_images = importlib.util.module_from_spec(spec)
spec.loader.exec_module(compare_images)

# Define the paths

transformed_folders = [
                        "brightened_images0-5",
                        "brightened_images0-7",
                        "brightened_images0-8",
                        "brightened_images0-9",
                        "brightened_images1",
                        "brightened_images1-1",
                        "brightened_images1-2",
                        "brightened_images1-3",
                        "brightened_images1-5",
                        "brightened_images1-75",
                        "contrast_images0-5",
                        "contrast_images0-7",
                        "contrast_images0-8",
                        "contrast_images0-9",
                        "contrast_images1",
                        "contrast_images1-1",
                        "contrast_images1-2",
                        "contrast_images1-3",
                        "contrast_images1-5",
                        "contrast_images1-75",
                        "gamma_images0-5",
                        "gamma_images0-7",
                        "gamma_images0-8",
                        "gamma_images0-9",
                        "gamma_images1",
                        "gamma_images1-1",
                        "gamma_images1-2",
                        "gamma_images1-3",
                        "gamma_images1-5",
                        "gamma_images1-75",
                        "hue_imagesminus0-5",
                        "hue_imagesminus0-3",
                        "hue_imagesminus0-2",
                        "hue_imagesminus0-1",
                        "hue_images0",
                        "hue_images0-1",
                        "hue_images0-2",
                        "hue_images0-3",
                        "hue_images0-5",
                        "saturation_images0-25",
                        "saturation_images0-5",
                        "saturation_images0-7",
                        "saturation_images0-8",
                        "saturation_images0-9",
                        "saturation_images1",
                        "saturation_images1-1",
                        "saturation_images1-2",
                        "saturation_images1-3",
                        "saturation_images1-5",
                        "saturation_images1-75",
                        "sharpness_images0-25",
                        "sharpness_images0-5",
                        "sharpness_images0-7",
                        "sharpness_images0-8",
                        "sharpness_images0-9",
                        "sharpness_images1",
                        "sharpness_images1-1",
                        "sharpness_images1-2",
                        "sharpness_images1-3",
                        "sharpness_images1-5",
                        "sharpness_images1-75",
                      ]  

transformed_folders_paths = [os.path.join(download_path, folder) for folder in transformed_folders]

# Define the output CSV file
output_csv_path = os.path.join(download_path, "comparison_results_brightcongamma_deepdivefullsuite.csv")

# Open the CSV file for writing
with open(output_csv_path, mode="w", newline="") as csv_file:
    csv_writer = csv.writer(csv_file, escapechar='\\', quoting=csv.QUOTE_MINIMAL)
    
    # Write the header row
    header = ["Image Name", "Transform", "PSNR", "SSIM_value", 
                "LBP_sim", "glcm_sim_contrast", "glcm_sim_dissimilarity",
                "glcm_sim_homogeneity", "glcm_sim_energy",
                "glcm_sim_correlation", "psnr2", "ssim_value2",
                "Watermark_raw", "Watermark_transformed", "Watermark_present"]  # Add more columns for additional placeholder functions
    csv_writer.writerow(header)
    
    row_counter = 1
    image_counter = 0
    # Loop through all images in the raw images folder
    for image_name in os.listdir(raw_images_path):
        temp_image_path = os.path.join(raw_images_path, image_name)
        if row_counter % 10 == 0:
            print(f"Processing image {row_counter}")
        row_counter += 1
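        # NOTE: image_counter counts every directory entry (not only image files),
        # and the "+ 2" offset lets the loop visit up to images_to_process + 2 entries.
        if image_counter >= images_to_process + 2:
            print(f"Processed {images_to_process} images, stopping further processing.")
            break
        image_counter += 1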
        
        #print("a", row_counter)
        # Check if the file is an image
        if os.path.isfile(temp_image_path) and image_name.lower().endswith(('.png', '.jpg', '.jpeg')):
            # Open the raw image
            #print("b", row_counter)
            with Image.open(temp_image_path) as raw_image:
                # Loop through each transformed folder
                for folder, folder_path in zip(transformed_folders, transformed_folders_paths):
                    transformed_image_path = os.path.join(folder_path, image_name)
                    #print("in transformed loop", row_counter, transformed_image_path)
                    # Check if the transformed image exists
                    if os.path.isfile(transformed_image_path):
                        #print("in transformed i exists")
                        with Image.open(transformed_image_path) as transformed_image:
                            #print("in transformed image", row_counter)
                            # Detect if there is a watermark in the raw image
                            detected_bits_raw = detect_watermark(temp_image_path, ckpt_path)

                            # Detect if there is a watermark in the transformed image
                            detected_bits_transformed = detect_watermark(transformed_image_path, ckpt_path)

                            # Flag is 1 when the transformed image yields the same detection result as the raw image
                            watermark_present = 1 if detected_bits_raw == detected_bits_transformed else 0

                            # Check if the images are the same size
                            if raw_image.size != transformed_image.size:
                                psnr = "NA"  # Return 'NA' if sizes are different
                                ssim_value = "NA"  # Return 'NA' if sizes are different
                                LBP_sim = "NA"
                                glcm_sim_contrast = "NA"
                                glcm_sim_dissimilarity = "NA"
                                glcm_sim_homogeneity = "NA"
                                glcm_sim_energy = "NA"
                                glcm_sim_correlation = "NA"
                                psnr2 = "NA"
                                ssim_value2 = "NA"

                            else:
                                # Convert PIL images to NumPy arrays
                                raw_image_array = np.array(raw_image)
                                transformed_image_array = np.array(transformed_image)
                                
                                # Apply the comparison function
                                psnr, ssim_value = compare_images.calculate_image_metrics(raw_image_array, transformed_image_array)
                                psnr = float(psnr)
                                ssim_value = float(ssim_value)

                                # Calculate metrics from metrics.py
                                texture_metrics_original = calculate_texture_features(image = cv2.imread(temp_image_path))
                                texture_metrics_transformed = calculate_texture_features(image = cv2.imread(transformed_image_path))
                                similarity_metrics = calculate_similarity_metrics(texture_metrics_original, texture_metrics_transformed)
                                image_metrics = calculate_image_metrics(cv2.imread(temp_image_path), cv2.imread(transformed_image_path))

                                LBP_sim = similarity_metrics['lbp_similarity']
                                glcm_sim_contrast = similarity_metrics['glcm_similarities']['contrast']
                                glcm_sim_dissimilarity = similarity_metrics['glcm_similarities']['dissimilarity']
                                glcm_sim_homogeneity = similarity_metrics['glcm_similarities']['homogeneity']
                                glcm_sim_energy = similarity_metrics['glcm_similarities']['energy']
                                glcm_sim_correlation = similarity_metrics['glcm_similarities']['correlation']
                                psnr2 = image_metrics[0]
                                ssim_value2 = image_metrics[1]
                                #print("in metric", row_counter)


                            # Write the result to the CSV file
                            print(image_name)
                            csv_writer.writerow([image_name, folder, psnr, ssim_value, 
                                                LBP_sim, glcm_sim_contrast, glcm_sim_dissimilarity,
                                                glcm_sim_homogeneity, glcm_sim_energy,
                                                glcm_sim_correlation, psnr2, ssim_value2,
                                                detected_bits_raw, detected_bits_transformed, watermark_present])

0_08cc23c13b79af4d3852e78a8af8ced_original_trustmark_watermarked.png

### Scratch code
This section contains 'scratch' code that is not intended for production.
The most important part is the cell for testing screenshots.
Before running it, you need to manually take screenshots of ~10 images in the dataset and save them to the relevant location (the "screenshotted" folder under download_path).
To stop the code from running automatically, each cell is wrapped in an "if (0==1):" statement. Change that to "if (1==1):" before running.
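
The guard described above can also be written with a named flag, which is easier to spot than an edited comparison. This is a minimal sketch only: `RUN_SCRATCH`, `run_screenshot_comparison` and `maybe_run` are hypothetical names that do not exist in the notebook.

```python
RUN_SCRATCH = False  # set to True to enable the scratch cells


def run_screenshot_comparison():
    """Stand-in for the body of the scratch cell (hypothetical)."""
    return "screenshot comparison ran"


def maybe_run(flag):
    """Run the scratch work only when the flag is set; otherwise do nothing."""
    if flag:
        return run_screenshot_comparison()
    return None


# With the default flag the scratch work is skipped.
result = maybe_run(RUN_SCRATCH)
```

A single flag defined near the control variables would let all scratch cells be switched on or off together.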

In [None]:
if (0==1):

    print("LBP_similarity", similarity_metrics['lbp_similarity'])
    print("glcm_similarities_contrast", similarity_metrics['glcm_similarities']['contrast'])

    print("glcm_similarities_dissimilarity", similarity_metrics['glcm_similarities']['dissimilarity'])
    print("glcm_similarities_homogeneity", similarity_metrics['glcm_similarities']['homogeneity'])
    print("glcm_similarities_energy", similarity_metrics['glcm_similarities']['energy'])
    print("glcm_similarities_correlation", similarity_metrics['glcm_similarities']['correlation'])

    print("image metrics", image_metrics)
    print("psnr", image_metrics[0])
    print("ssim_value", image_metrics[1])

    '''
    LBP_sim
    glcm_sim_contrast
    glcm_sim_dissimilarity
    glcm_sim_homogeneity
    glcm_sim_energy
    glcm_sim_correlation
    psnr2
    ssim_value2


    similarity_metrics['lbp_similarity']
    similarity_metrics['glcm_similarities']['contrast']
    similarity_metrics['glcm_similarities']['dissimilarity']
    similarity_metrics['glcm_similarities']['homogeneity']
    similarity_metrics['glcm_similarities']['energy']
    similarity_metrics['glcm_similarities']['correlation']
    image_metrics[0]
    image_metrics[1]

    '''



In [None]:
##### Scratch code
##
##
## This is a one-off script to compare Halla's screenshots with the original images

if (0==1):

    import os
    import csv
    from PIL import Image
    import numpy as np
    from metrics import calculate_texture_features, calculate_similarity_metrics, calculate_image_metrics
    import cv2

    import importlib.util

    ## Load compare_images.py as a module directly from its file path
    module_path = home_dir + '/utils/compare_images.py'
    spec = importlib.util.spec_from_file_location("compare_images", module_path)
    compare_images = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(compare_images)

    # Define a placeholder function
    def compare(image1, image2):
        """Placeholder function to compare two images and return a number."""
        # Replace this with your actual comparison logic
        return 0.0

    # Define the paths
    raw_images_path = os.path.join(download_path, "raw_watermarked_images")
    transformed_folders = [
                            "screenshotted"
                            ]    

    transformed_folders_paths = [os.path.join(download_path, folder) for folder in transformed_folders]

    # Define the output CSV file
    output_csv_path = os.path.join(download_path, "comparison_results_screenshotted.csv")

    # Open the CSV file for writing
    with open(output_csv_path, mode="w", newline="") as csv_file:
        csv_writer = csv.writer(csv_file)
        
        # Write the header row
        header = ["Image Name", "Transform", "PSNR", "SSIM_value", 
                    "LBP_sim", "glcm_sim_contrast", "glcm_sim_dissimilarity",
                    "glcm_sim_homogeneity", "glcm_sim_energy",
                    "glcm_sim_correlation", "psnr2", "ssim_value2",
                    "Watermark_raw", "Watermark_transformed", "Watermark_present"]  # Add more columns for additional placeholder functions
        csv_writer.writerow(header)
        
        row_counter = 1
        # Loop through all images in the raw images folder
        for image_name in os.listdir(raw_images_path):
            raw_image_path = os.path.join(raw_images_path, image_name)
            if row_counter % 10 == 0:
                print(f"Processing image {row_counter}")
            row_counter += 1
            # Check if the file is an image
            if os.path.isfile(raw_image_path) and image_name.lower().endswith(('.png', '.jpg', '.jpeg')):
                # Open the raw image
                with Image.open(raw_image_path) as raw_image:
                    # Loop through each transformed folder
                    for folder, folder_path in zip(transformed_folders, transformed_folders_paths):
                        transformed_image_path = os.path.join(folder_path, image_name)
                        
                        # Check if the transformed image exists
                        if os.path.isfile(transformed_image_path):
                            with Image.open(transformed_image_path) as transformed_image:
                                
                                
                                # Detect if there is a watermark in the raw image
                                detected_bits_raw = detect_watermark(raw_image_path, ckpt_path)

                                # Detect if there is a watermark in the transformed image
                                detected_bits_transformed = detect_watermark(transformed_image_path, ckpt_path)

                                # Flag is 1 when the transformed image yields the same detection result as the raw image
                                watermark_present = 1 if detected_bits_raw == detected_bits_transformed else 0

                                # Check if the images are the same size
                                if raw_image.size != transformed_image.size:
                                    psnr = "NA"  # Return 'NA' if sizes are different
                                    ssim_value = "NA"  # Return 'NA' if sizes are different
                                    LBP_sim = "NA"
                                    glcm_sim_contrast = "NA"
                                    glcm_sim_dissimilarity = "NA"
                                    glcm_sim_homogeneity = "NA"
                                    glcm_sim_energy = "NA"
                                    glcm_sim_correlation = "NA"
                                    psnr2 = "NA"
                                    ssim_value2 = "NA"

                                else:
                                    # Convert PIL images to NumPy arrays
                                    raw_image_array = np.array(raw_image)
                                    transformed_image_array = np.array(transformed_image)
                                    
                                    # Apply the comparison function
                                    psnr, ssim_value = compare_images.calculate_image_metrics(raw_image_array, transformed_image_array)
                                    psnr = float(psnr)
                                    ssim_value = float(ssim_value)

                                    # Calculate metrics from metrics.py, reading each file with OpenCV only once
                                    raw_image_cv = cv2.imread(raw_image_path)
                                    transformed_image_cv = cv2.imread(transformed_image_path)
                                    texture_metrics_original = calculate_texture_features(image=raw_image_cv)
                                    texture_metrics_transformed = calculate_texture_features(image=transformed_image_cv)
                                    similarity_metrics = calculate_similarity_metrics(texture_metrics_original, texture_metrics_transformed)
                                    image_metrics = calculate_image_metrics(raw_image_cv, transformed_image_cv)

                                    LBP_sim = similarity_metrics['lbp_similarity']
                                    glcm_sim_contrast = similarity_metrics['glcm_similarities']['contrast']
                                    glcm_sim_dissimilarity = similarity_metrics['glcm_similarities']['dissimilarity']
                                    glcm_sim_homogeneity = similarity_metrics['glcm_similarities']['homogeneity']
                                    glcm_sim_energy = similarity_metrics['glcm_similarities']['energy']
                                    glcm_sim_correlation = similarity_metrics['glcm_similarities']['correlation']
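                                    # PSNR and SSIM as computed by metrics.py, kept alongside
                                    # the compare_images values (psnr, ssim_value) above
                                    psnr2 = image_metrics[0]
                                    ssim_value2 = image_metrics[1]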


                                # Write the result to the CSV file
                                csv_writer.writerow([image_name, folder, psnr, ssim_value, 
                                                    LBP_sim, glcm_sim_contrast, glcm_sim_dissimilarity,
                                                    glcm_sim_homogeneity, glcm_sim_energy,
                                                    glcm_sim_correlation, psnr2, ssim_value2,
                                                    detected_bits_raw, detected_bits_transformed, watermark_present])