<a href="https://www.kaggle.com/code/aisuko/super-resolution?scriptVersionId=164773906" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Overview

An image-to-image task is one where an application receives an image and outputs another image. It covers various subtasks, including:

- image enhancement (super resolution, low-light enhancement, deraining, etc.)
- image inpainting
- and more

Here, we use an image-to-image pipeline for the super-resolution task, and then run the same model for the same task without a pipeline.

In [None]:
%%capture
!pip install transformers==4.35.2

In [None]:
from transformers import pipeline

# Load the Swin2SR x2 super-resolution model on the GPU
pipe=pipeline(task="image-to-image", model="caidas/swin2SR-lightweight-x2-64", device="cuda")
print(pipe)

In [None]:
from PIL import Image
import requests

url="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/cat.jpg"
image=Image.open(requests.get(url, stream=True).raw)
print(image.size)

image

# Upscaling the Image

We can now run inference with the pipeline, which returns an upscaled version of the cat image.

In [None]:
upscaled=pipe(image)
print(upscaled.size)

upscaled
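
The printed size may be slightly larger than exactly twice the input size. A minimal sketch of the arithmetic, assuming the image processor pads each side up to the next multiple of 8 before the x2 model runs (the helper name `expected_upscaled_size`, its `pad_multiple` parameter, and the 532x432 input size are hypothetical, chosen for illustration):

```python
def expected_upscaled_size(width, height, scale=2, pad_multiple=8):
    """Estimate the output (width, height) of a x`scale` model after padding.

    Assumption: each side is padded up to the next multiple of
    `pad_multiple` (one full extra block when it is already aligned).
    """
    padded_width = (width // pad_multiple + 1) * pad_multiple
    padded_height = (height // pad_multiple + 1) * pad_multiple
    return padded_width * scale, padded_height * scale


# 532x432 is a hypothetical input size used only for illustration.
print(expected_upscaled_size(532, 432))  # (1072, 880)
```

Under these assumptions, comparing the helper's result with `upscaled.size` is a quick sanity check that the pipeline ran the x2 model as expected.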

# Without Pipeline

In [None]:
from transformers import Swin2SRForImageSuperResolution, Swin2SRImageProcessor

model=Swin2SRForImageSuperResolution.from_pretrained("caidas/swin2SR-lightweight-x2-64").to("cuda")
# Load the processor configuration that matches the checkpoint
processor = Swin2SRImageProcessor.from_pretrained("caidas/swin2SR-lightweight-x2-64")

The pipeline abstracts away the preprocessing and postprocessing steps, which we now have to do ourselves. We will pass the image to the processor and then move the pixel values to the GPU.

In [None]:
pixel_values=processor(image, return_tensors="pt").pixel_values
print(pixel_values.shape)

pixel_values=pixel_values.to("cuda")
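
To make the preprocessing step less opaque, here is a rough NumPy sketch of what the processor is assumed to do: rescale to [0, 1], pad the bottom and right edges to a multiple of 8, move channels first, and add a batch axis. The function `preprocess_sketch` is hypothetical and only approximates the library's behavior; it is not the processor's actual implementation.

```python
import numpy as np


def preprocess_sketch(image_array, pad_multiple=8):
    """Approximate the preprocessing on an (H, W, C) uint8 array.

    Assumptions: pixel values are rescaled by 1/255, and the bottom and
    right edges are padded by reflection up to the next multiple of
    `pad_multiple`.
    """
    scaled = image_array.astype(np.float32) / 255.0
    height, width = scaled.shape[:2]
    pad_h = (height // pad_multiple + 1) * pad_multiple - height
    pad_w = (width // pad_multiple + 1) * pad_multiple - width
    padded = np.pad(scaled, ((0, pad_h), (0, pad_w), (0, 0)), mode="symmetric")
    # Channels-last (H, W, C) -> channels-first (C, H, W), then add a batch axis.
    channels_first = np.moveaxis(padded, -1, 0)
    return channels_first[np.newaxis, ...]


# Synthetic 30x20 RGB image standing in for a real photo.
dummy = np.random.randint(0, 256, size=(30, 20, 3), dtype=np.uint8)
batch = preprocess_sketch(dummy)
print(batch.shape)  # (1, 3, 32, 24)
```

The resulting layout, batch x channels x height x width, is the same layout printed by `pixel_values.shape`.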

We can now run inference by passing the pixel values to the model.

In [None]:
import torch

with torch.no_grad():
    outputs=model(pixel_values)

outputs

# Visualizing the Image

We need to extract the reconstruction from the model output and post-process it for visualization.

In [None]:
outputs.reconstruction.data.shape

We need to squeeze the output to drop axis 0 (the batch axis), clip the values to [0, 1], and convert the result to a NumPy float array. Then we move the channel axis to the end so the array describes the [1072, 880] image in channels-last order, and finally scale the values back to the range [0, 255].

In [None]:
import numpy as np

# squeeze, take to CPU and clip the values
output=outputs.reconstruction.data.squeeze().cpu().clamp_(0,1).numpy()
# rearrange the axes
output=np.moveaxis(output, source=0, destination=-1)
# bring values back to pixel values range
output=(output*255.0).round().astype(np.uint8)
Image.fromarray(output)
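
If the processor padded the input on the bottom and right edges (an assumption about its default behavior), the reconstruction contains a thin border that does not belong to the original picture. A minimal sketch of cropping it away, using a synthetic array in place of `output`; the helper `crop_to_scaled_size` and the 532x432 original size are hypothetical:

```python
import numpy as np


def crop_to_scaled_size(output, orig_width, orig_height, scale=2):
    """Keep only the top-left region that corresponds to the unpadded input."""
    return output[: orig_height * scale, : orig_width * scale]


# Synthetic stand-in for the post-processed (height, width, channels) array.
fake_output = np.zeros((880, 1072, 3), dtype=np.uint8)
cropped = crop_to_scaled_size(fake_output, orig_width=532, orig_height=432)
print(cropped.shape)  # (864, 1064, 3)
```

With the real data, the original width and height would come from `image.size`, and the cropped array can be passed to `Image.fromarray` in the same way.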