<a href="https://colab.research.google.com/github/HLCV-23/Inpainting-Detection/blob/Nicola/inpainting_pipeline.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Related Work
## Inpainting
- https://arxiv.org/pdf/2102.12092.pdf  
  DALL-E
- https://arxiv.org/pdf/2112.10752.pdf  
  Stable Diffusion

## GAN-generated images detection
- https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8397040  
  detection of images generated by GANs, no inpainting
- https://arxiv.org/pdf/2202.07145.pdf  
  review of several GAN detection algorithms

## Diffusion-generated images detection
- https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10095167  
  studies the performance of GAN-detection models on images generated by diffusion models

## Inpainting Detection
- https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9410590  
- https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9506778&tag=1  
  similar to our detection approach, but with random(?) masks. It compares detection performance on deep-learning-based and traditional inpainting techniques, which might be of interest to us (both for training and evaluation). We could also use the network architecture proposed here. Additional open-source dataset we might want to use.

## Image/Network Watermarking
- https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=650120  
  not directly related, but in watermarking it is common to test the robustness of image watermarking techniques against common image preprocessing (rescaling, compression, etc.). We might want to do that in our experiments too.
- https://www.usenix.org/system/files/conference/usenixsecurity18/sec18-adi.pdf
- https://tianweiz07.github.io/Papers/21-aamas.pdf  
- https://arxiv.org/abs/2305.20030 (Fourier Transform based technique to robustly watermark diffusion model outputs)
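
The robustness check mentioned in the first watermarking entry above (rescaling, compression) could be sketched roughly as follows. This is only a minimal illustration, assuming Pillow is installed; the helper names `jpeg_roundtrip` and `rescale_roundtrip`, the JPEG quality, and the scale factor are hypothetical choices, not taken from the cited papers.

```python
import io

from PIL import Image


def jpeg_roundtrip(img, quality=50):
    """Compress an image to JPEG in memory and decode it again (hypothetical helper)."""
    buf = io.BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    out = Image.open(buf)
    out.load()  # force decoding while the buffer is still alive
    return out


def rescale_roundtrip(img, factor=0.5):
    """Downscale by `factor`, then upscale back to the original size (hypothetical helper)."""
    w, h = img.size
    small = img.resize((max(1, int(w * factor)), max(1, int(h * factor))), Image.BILINEAR)
    return small.resize((w, h), Image.BILINEAR)


# Placeholder image; in our experiments this would be an inpainted sample.
img = Image.new("RGB", (512, 512), color=(120, 60, 200))
perturbed = {"jpeg": jpeg_roundtrip(img), "rescale": rescale_roundtrip(img)}
```

A detector could then be evaluated on each entry of `perturbed` in addition to the unmodified image.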

In [1]:
!pip install transformers
!pip install diffusers
!pip install accelerate
!pip install xformers
import numpy as np
from diffusers import StableDiffusionInpaintPipeline
from transformers import pipeline
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
from PIL import Image, ImageFilter

Collecting transformers
  Downloading transformers-4.30.2-py3-none-any.whl (7.2 MB)
Collecting huggingface-hub<1.0,>=0.14.1 (from transformers)
  Downloading huggingface_hub-0.16.4-py3-none-any.whl (268 kB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1 (from transformers)
  Downloading tokenizers-0.13.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)
Collecting safetensors>=0.3.1 (from transformers)
  Downloading safetensors-0.3.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)

In [4]:
device = torch.device("cuda:0")

mask_generator = pipeline("mask-generation", model="facebook/sam-vit-huge", device=device)

#caption_generator = pipeline("image-to-text", model="nlpconnect/vit-gpt2-image-captioning")
caption_generator = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")  # Is BLIP fast enough?

tokenizer = AutoTokenizer.from_pretrained("Gustavosta/MagicPrompt-Stable-Diffusion")
model = AutoModelForCausalLM.from_pretrained("Gustavosta/MagicPrompt-Stable-Diffusion")
prompt_generator = pipeline("text-generation", model=model, tokenizer=tokenizer, device=device)

inpainting_generator = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    torch_dtype=torch.float16,
    safety_checker=None,
    low_cpu_mem_usage=False,
).to(device)  # the float16 pipeline must be moved to the GPU before use

unet/diffusion_pytorch_model.safetensors not found
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion_inpaint.StableDiffusionInpaintPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .


In [None]:
# A Hugging Face access token is required to load the dataset:
# go to https://huggingface.co/datasets/imagenet-1k/viewer/default/train
# and create a token in your account settings
!pip install datasets
!huggingface-cli login

In [None]:
# Download imagenet-1k or scene_parse_150
# Note: do not import `Image` from `datasets` here; it would shadow PIL's `Image` used below.
from datasets import load_dataset
import pandas as pd
import os

dataset = load_dataset("scene_parse_150", use_auth_token=True, streaming=False, split="train")  # set streaming to False for scenes!
dataset = iter(dataset.shuffle())

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
#"""
!mkdir -p drive/MyDrive/Prototype_Dataset3/Images/Modified
!mkdir -p drive/MyDrive/Prototype_Dataset3/Images/Unmodified
!mkdir -p drive/MyDrive/Prototype_Dataset3/Masks
!mkdir -p drive/MyDrive/Prototype_Dataset3/Originals
!touch drive/MyDrive/Prototype_Dataset3/Prompts.csv
!pwd
#"""

In [None]:
# TODO: Save original image, save label ImageFolder style
def generate_datapoint_single_mask(data, id, csv_data, path):
    """Inpaint one randomly chosen SAM mask of data["image"] and save the results under `path`."""
    original_image = data["image"].resize((512, 512))
    image_area = 512 * 512
    masks = mask_generator(original_image)["masks"]

    # Only masks covering between 2.5% and 50% of the image are eligible candidates for inpainting
    valid_masks = [mask for mask in masks if image_area * 0.025 <= mask.sum() <= image_area * 0.5]

    if not valid_masks:
        print(f"No valid mask could be generated. Skipping image {id}.")
        return

    # Pick one eligible mask uniformly at random
    mask = valid_masks[np.random.randint(len(valid_masks))]

    mask_img = Image.fromarray((mask * 255).astype(np.uint8).squeeze())
    mask_img = mask_img.convert("RGB")

    # Generate a prompt for the inpainting model: caption the image, expand the
    # caption with MagicPrompt, and keep only the part before the first comma
    caption = caption_generator(original_image)[0]["generated_text"]
    prompts = prompt_generator(caption)[0]["generated_text"]
    prompt = prompts.split(",")[0]

    # Inpaint the masked region
    inpainted_img = inpainting_generator(prompt=prompt, image=original_image, mask_image=mask_img, height=512, width=512).images[0]

    # Get the edges of the mask and thicken them into a narrow band
    edges = mask_img.filter(ImageFilter.FIND_EDGES).filter(ImageFilter.MaxFilter(7)).convert("1")

    # Blur the inpainted image
    inpainted_img_blur = inpainted_img.filter(ImageFilter.GaussianBlur(radius=1))

    # Use the blurred pixels only along the border of the inpainted object to smooth the transition
    inpainted_img_smooth = Image.composite(inpainted_img_blur, inpainted_img, edges).convert("RGB")

    inpainted_img_smooth.save(path + f"/Images/Modified/{id}.png")
    original_image.save(path + f"/Originals/{id}.png")
    mask_img.save(path + f"/Masks/{id}.png")
    csv_data.at[id, "Prompt"] = prompt
    print(f"Saved processed image {id}.png")
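
The mask eligibility rule used in `generate_datapoint_single_mask` can be checked in isolation on synthetic masks. The following is a minimal sketch, assuming the masks are boolean NumPy arrays; the helper name `filter_masks_by_area` and the toy masks are hypothetical, while the 2.5% and 50% thresholds are the ones from the function above.

```python
import numpy as np


def filter_masks_by_area(masks, min_frac=0.025, max_frac=0.5):
    """Keep masks whose foreground covers between min_frac and max_frac of the image."""
    kept = []
    for mask in masks:
        mask = np.asarray(mask, dtype=bool)
        frac = mask.sum() / mask.size  # fraction of pixels inside the mask
        if min_frac <= frac <= max_frac:
            kept.append(mask)
    return kept


# Three synthetic 512x512 masks: too small, eligible, too large
tiny = np.zeros((512, 512), dtype=bool)
tiny[:10, :10] = True        # 100 pixels, about 0.04% of the image
medium = np.zeros((512, 512), dtype=bool)
medium[:200, :200] = True    # 40000 pixels, about 15.3% of the image
huge = np.ones((512, 512), dtype=bool)  # 100% of the image

valid = filter_masks_by_area([tiny, medium, huge])
```

Only `medium` survives the filter, so very small segments and near-full-image segments are never selected for inpainting.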

In [None]:
path = "drive/MyDrive/Imagenet_Inpainted"
csv_path = path + "/Prompts.csv"

try:
    csv_data = pd.read_csv(csv_path, index_col="idx")
except (FileNotFoundError, pd.errors.EmptyDataError):
    # Start a fresh prompt table if the file is missing or empty
    csv_data = pd.DataFrame(columns=["Prompt"])
    csv_data.index.name = "idx"


# TODO: INIT ID SUCH THAT NO IMAGES ARE OVERWRITTEN!!!

for id in range(0, 100):
    generate_datapoint_single_mask(next(dataset), id, csv_data, path)

csv_data.to_csv(csv_path)
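
One possible way to address the TODO about overwriting images is to derive the starting id from the files already on disk. This is a sketch only, assuming images are stored as `<id>.png` as in the function above; the helper name `next_free_id` is hypothetical, and the temporary directory merely stands in for the real output folder.

```python
import os
import re
import tempfile


def next_free_id(image_dir):
    """Return one more than the largest numeric '<id>.png' name in image_dir (0 if none)."""
    pattern = re.compile(r"^(\d+)\.png$")
    ids = []
    if os.path.isdir(image_dir):
        for name in os.listdir(image_dir):
            match = pattern.match(name)
            if match:
                ids.append(int(match.group(1)))
    return max(ids) + 1 if ids else 0


# Demo on a temporary directory that already contains a few saved images
with tempfile.TemporaryDirectory() as tmp:
    for name in ("0.png", "1.png", "7.png", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    start_id = next_free_id(tmp)
```

With such a helper, the generation loop could iterate over `range(start_id, start_id + 100)`, where `start_id` is computed from `path + "/Images/Modified"`.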