SRCNN (Super-Resolution CNN)
```
input  : low-resolution image --> Y
logic  : primarily a CNN; Y is typically X after downsampling, with noise mixed in
output : high-resolution image --> X
```
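The relationship between Y and X described above (Y is X downsampled, with noise) can be sketched as follows. This is only an illustration, not the pipeline that produced this dataset: the function name `degrade`, the scale factor of 4, the box-average downsampling, and the Gaussian noise level are all assumptions chosen for the example.

```python
import numpy as np


def degrade(hr: np.ndarray, scale: int = 4, noise_std: float = 0.01, seed: int = 0) -> np.ndarray:
    """Build a low-resolution image Y from a high-resolution image X (values in [0, 1]).

    Hypothetical degradation model: box-average downsampling plus Gaussian noise.
    """
    h, w, c = hr.shape
    # Crop so that both sides are divisible by the scale factor.
    h, w = h - h % scale, w - w % scale
    hr = hr[:h, :w]
    # Average each (scale x scale) block into a single pixel.
    lr = hr.reshape(h // scale, scale, w // scale, scale, c).mean(axis=(1, 3))
    # Add zero-mean Gaussian noise and keep values in the valid range.
    rng = np.random.default_rng(seed)
    noisy = lr + rng.normal(0.0, noise_std, size=lr.shape)
    return np.clip(noisy, 0.0, 1.0)


if __name__ == "__main__":
    x = np.random.default_rng(1).random((64, 64, 3))  # stand-in for an HR image
    y = degrade(x)
    print(x.shape, "->", y.shape)
```

SRCNN then learns the inverse direction, from Y back to X.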
Structure
```
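Three convolutional layers
1. Patch Extraction and Representation
  Extract small patches from the input image and represent each patch by its features
2. Non-Linear Mapping
  Map the extracted patch features into a more complex high-resolution patch space
3. Reconstruction
  Convert the mapped high-dimensional features back into high-resolution image space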
```
Training process
```
Loss function : mean squared error (MSE)
SRCNN : end-to-end; the high-resolution output is generated directly from the low-resolution input
```
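As a worked example of the loss above, the sketch below computes MSE over pixel values in [0, 1]. It also derives PSNR from the MSE; PSNR is not mentioned in this notebook and is included only as the metric commonly reported alongside MSE in super-resolution work. The function names and the flat-list pixel representation are assumptions for the illustration.

```python
import math
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equally sized pixel sequences."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def psnr(pred: Sequence[float], target: Sequence[float], max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    err = mse(pred, target)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


if __name__ == "__main__":
    predicted = [0.1, 0.1]
    reference = [0.0, 0.2]
    print(f"MSE : {mse(predicted, reference):.4f}")
    print(f"PSNR: {psnr(predicted, reference):.2f} dB")
```

Lower MSE corresponds to higher PSNR, so minimizing the MSE loss also maximizes PSNR.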


In [1]:
# Download via a shared link
!pip install gdown

Collecting gdown
  Downloading gdown-5.2.0-py3-none-any.whl.metadata (5.8 kB)
Collecting tqdm (from gdown)
  Downloading tqdm-4.67.1-py3-none-any.whl.metadata (57 kB)
Collecting PySocks!=1.5.7,>=1.5.6 (from requests[socks]->gdown)
  Downloading PySocks-1.7.1-py3-none-any.whl.metadata (13 kB)
Downloading gdown-5.2.0-py3-none-any.whl (18 kB)
Downloading tqdm-4.67.1-py3-none-any.whl (78 kB)
Downloading PySocks-1.7.1-py3-none-any.whl (16 kB)
Installing collected packages: tqdm, PySocks, gdown
Successfully installed PySocks-1.7.1 gdown-5.2.0 tqdm-4.67.1


In [2]:
# Shared link: https://drive.google.com/file/d/17c7kMwVpIm6PqPE3GqJ2Pp7C07Cbg_TL/view?usp=sharing
# Direct download
import gdown
file_id = '17c7kMwVpIm6PqPE3GqJ2Pp7C07Cbg_TL'
gdown.download(f'http://drive.google.com/uc?id={file_id}',quiet=False)

FileURLRetrievalError: Failed to retrieve file url:

	Too many users have viewed or downloaded this file recently. Please
	try accessing the file again later. If the file you are trying to
	access is particularly large or is shared with many people, it may
	take up to 24 hours to be able to view or download the file. If you
	still can't access a file after 24 hours, contact your domain
	administrator.

You may still be able to access the file from the browser:

	http://drive.google.com/uc?id=17c7kMwVpIm6PqPE3GqJ2Pp7C07Cbg_TL

but Gdown can't. Please check connections and permissions.

In [9]:
# Extract the archive
import zipfile
import os
zip_path = './SRCNN_SS.zip'
extract_path = './SRCNN'
with zipfile.ZipFile(zip_path, 'r') as z:
    z.extractall(extract_path)
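
Because the gdown download above failed, the archive has to be placed at `zip_path` by some other route (for example, a manual browser download), and the extraction cell raises if the file is missing. A minimal sketch of a guard around the same `zipfile` call follows; the helper name `extract_archive` is hypothetical.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path: str, extract_path: str) -> list[str]:
    """Extract zip_path into extract_path and return the archive's member names."""
    archive = Path(zip_path)
    if not archive.is_file():
        raise FileNotFoundError(f"{archive} not found; download it manually and retry")
    if not zipfile.is_zipfile(archive):
        raise ValueError(f"{archive} is not a valid zip file (incomplete download?)")
    with zipfile.ZipFile(archive, "r") as z:
        z.extractall(extract_path)
        return z.namelist()


if __name__ == "__main__":
    try:
        members = extract_archive("./SRCNN_SS.zip", "./SRCNN")
        print(f"extracted {len(members)} entries")
    except (FileNotFoundError, ValueError) as err:
        print(err)
```

Failing early with a clear message makes it obvious that the problem is the missing download rather than the later data-loading cells.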

In [10]:
!pip install --upgrade pip

In [11]:
pip install --upgrade --force-reinstall numpy pandas opencv-python-headless torch torchvision torchaudio albumentations tqdm

Collecting numpy
  Using cached numpy-2.2.6-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (62 kB)
Collecting pandas
  Using cached pandas-2.2.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (89 kB)
Collecting opencv-python-headless
  Using cached opencv_python_headless-4.11.0.86-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (20 kB)
Collecting torch
  Using cached torch-2.7.0-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (29 kB)
Collecting torchvision
  Using cached torchvision-0.22.0-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (6.1 kB)
Collecting torchaudio
  Using cached torchaudio-2.7.0-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (6.6 kB)
Collecting albumentations
  Using cached albumentations-2.0.8-py3-none-any.whl.metadata (43 kB)
Collecting tqdm
  Using cached tqdm-4.67.1-py3-none-any.whl.metadata (57 kB)
Collecting python-dateutil>=2.8.2 (from pandas)
  Using cached python_dateutil-2.9.0.post0-py2.py3-none-any

In [18]:
!python --version

Python 3.11.11


In [19]:
import random
import pandas as pd
import numpy as np
import os
import cv2
import math
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset ,DataLoader
from tqdm.auto import tqdm
import albumentations as A  # data augmentation library
from albumentations.pytorch.transforms import ToTensorV2  # converts image arrays to tensors
# Pretrained models such as ResNet, EfficientNet, etc.
import torchvision.models as models
from torchvision import transforms

In [20]:
import warnings
warnings.filterwarnings(action='ignore')

In [21]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'

In [22]:
IMG_SIZE = 2048
EPOCHS=30
LEARING_RATE=1e-4  #0.0001
BATCH_SIZE=12
SEED = 42
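
`SEED` is defined here but is not applied anywhere in the cells shown. One common way to use it is a small helper that seeds every random number generator involved. The sketch below is an assumption about how the seed was meant to be used; the name `seed_everything` is hypothetical, and the cuDNN flags trade some speed for reproducibility.

```python
import os
import random

import numpy as np
import torch


def seed_everything(seed: int) -> None:
    """Seed Python, NumPy, and PyTorch RNGs for more reproducible runs."""
    random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # safe to call without a GPU
    # Prefer deterministic cuDNN kernels over auto-tuned ones.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


if __name__ == "__main__":
    seed_everything(42)
    first = torch.rand(2)
    seed_everything(42)
    second = torch.rand(2)
    print(torch.equal(first, second))
```

Calling it once before building the datasets and the model is usually enough.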

In [24]:
# Load the data
train_df = pd.read_csv('./SRCNN/SRCNN/train.csv')
test_df = pd.read_csv('./SRCNN/SRCNN/test.csv')

In [25]:
from glob import glob
filenames = [files.split('/')[-1] for files in glob('./SRCNN/SRCNN/train/hr/*.*',recursive=True)]
filenames = sorted(filenames)
filenames[0], filenames[-1]

('0000.png', '0010.png')

In [26]:
# Only images 0000.png through 0010.png are present on disk here, so keep the first 11 rows
train_df = train_df.iloc[:11]
test_df = test_df.iloc[:11]

In [27]:
'./SRCNN/SRCNN'+train_df['LR'][0][1:]

'./SRCNN/SRCNN/train/lr/0000.png'

In [28]:
# Map the file paths in the CSV to the actual paths on disk
import os
train_df['LR'] = train_df['LR'].apply(lambda x: './SRCNN/SRCNN' + x[1:])
train_df['HR'] = train_df['HR'].apply(lambda x: './SRCNN/SRCNN' + x[1:])

In [29]:
train_df.head(2)

Unnamed: 0,LR,HR
0,./SRCNN/SRCNN/train/lr/0000.png,./SRCNN/SRCNN/train/hr/0000.png
1,./SRCNN/SRCNN/train/lr/0001.png,./SRCNN/SRCNN/train/hr/0001.png


In [30]:
# Dataset
# (with type hints)
class SRDataset(Dataset):
  def __init__(self,df:pd.DataFrame,transforms=None,train_mode:bool=True):
    self.df = df
    self.transforms = transforms
    self.train_mode = train_mode

  def __len__(self):
    return len(self.df)

  def __getitem__(self,idx):
    lr_path = self.df['LR'].iloc[idx]
    lr_img = cv2.imread(lr_path)
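    # Resizing an image requires an interpolation method
    # INTER_CUBIC : bicubic interpolation; uses the 16 surrounding pixels to produce a smooth, natural-looking enlargement
    # Alternatives : INTER_NEAREST (nearest neighbor), INTER_LINEAR (bilinear)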
    try:
      lr_img = cv2.resize(lr_img, (IMG_SIZE,IMG_SIZE),interpolation=cv2.INTER_CUBIC)
    except Exception as e:
      print(e)
      print(lr_path)
    if self.train_mode:
      hr_path = self.df['HR'].iloc[idx]
      hr_img = cv2.imread(hr_path)
      if self.transforms:
        transformed = self.transforms(image=lr_img,label=hr_img)
        lr_img = transformed['image'] / 255.
        hr_img = transformed['label'] / 255.
      # Always return the pair, with or without transforms
      return lr_img, hr_img
    else:
      file_name = lr_path.split('/')[-1]
      if self.transforms:
        transformed = self.transforms(image=lr_img)
        lr_img = transformed['image'] / 255.
      return lr_img, file_name

In [31]:
# Transform pipelines used by the datasets
def get_train_transforms():
  return A.Compose([
      ToTensorV2(p=1.0)],
      additional_targets={'image':'image','label':'image'}
  )

def get_test_transforms():
  return A.Compose([
      ToTensorV2(p=1.0)],
      additional_targets={'image':'image','label':'image'}
  )

In [32]:
train_dataset = SRDataset(train_df,get_train_transforms(),True)
train_loader = DataLoader(train_dataset,batch_size=BATCH_SIZE,shuffle=True)

test_dataset = SRDataset(test_df,get_train_transforms(),False)  # inference mode: yields (lr_img, file_name)
test_loader = DataLoader(test_dataset,batch_size=BATCH_SIZE,shuffle=False)

In [33]:
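# e.g. input shape: (3, H, W)
class SRCNN(nn.Module):
  def __init__(self,num_channels=3,feature_dim = 64,map_dim=32):
    '''
    feature_dim   number of output channels of the first layer
    map_dim       number of output channels of the second layer
    '''
    super(SRCNN,self).__init__()
    # Feature extraction
    # With stride 1 and padding = kernel_size // 2, the spatial resolution is preserved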
    self.features = nn.Sequential(
        nn.Conv2d(num_channels,feature_dim,kernel_size=9,stride=1,padding=4),
        nn.ReLU(inplace=True),
    )
    # Non-linear mapping: compress the feature maps further, keeping only the important information
    self.map = nn.Sequential(
        nn.Conv2d(feature_dim,map_dim,kernel_size=5,stride=1,padding=2),
        nn.ReLU(inplace=True),
    )
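    # Reconstruct the high-resolution image
    self.reconstruction = nn.Conv2d(map_dim,num_channels,kernel_size=5,stride=1,padding=2)
    # Apply the custom initialization defined below
    self._initialize_weights()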
  def forward(self,x):
    x = self.features(x)
    x = self.map(x)
    x = self.reconstruction(x)
    return x
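  # Weight initialization function based on the SRCNN paper
  # Various schemes exist for giving the weights a suitable distribution
  # 1. Common initialization for all conv layers
  # 2. Separate initialization for the last layer: its output is the reconstructed image, so it reacts to even small changes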
  def _initialize_weights(self)->None:
    #1
    for module in self.modules():
      if isinstance(module,nn.Conv2d):
        nn.init.normal_(module.weight.data,
                        0.0,
                        # std chosen so that each layer's output variance stays in a reasonable range
                        math.sqrt(2./(module.out_channels*module.weight.data[0][0].numel())))
        if module.bias is not None:
          nn.init.zeros_(module.bias)
    #2
    nn.init.normal_(self.reconstruction.weight.data,0.0,0.001)
    nn.init.zeros_(self.reconstruction.bias.data)
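
The model's comment about stride 1 with padding = kernel_size // 2 can be checked with the standard convolution output-size formula, `out = (in + 2 * padding - kernel_size) // stride + 1`. The sketch below applies it to the three layers defined above and also counts their parameters. The helper names are hypothetical, and the 2048 input size is taken from `IMG_SIZE`.

```python
def conv2d_out_size(size: int, kernel_size: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a 2D convolution along one dimension."""
    return (size + 2 * padding - kernel_size) // stride + 1


def conv2d_params(in_ch: int, out_ch: int, kernel_size: int, bias: bool = True) -> int:
    """Number of learnable parameters in a Conv2d layer."""
    return in_ch * out_ch * kernel_size * kernel_size + (out_ch if bias else 0)


# (in_channels, out_channels, kernel_size, padding) for the three SRCNN layers above
LAYERS = [(3, 64, 9, 4), (64, 32, 5, 2), (32, 3, 5, 2)]

if __name__ == "__main__":
    size = 2048
    total = 0
    for in_ch, out_ch, k, p in LAYERS:
        size = conv2d_out_size(size, k, stride=1, padding=p)
        n = conv2d_params(in_ch, out_ch, k)
        total += n
        print(f"{in_ch:>2} -> {out_ch:>2}, k={k}: output {size}x{size}, {n} params")
    print("total params:", total)
```

Since the spatial size never changes, the network cannot upscale by itself, which is why the dataset resizes the low-resolution input to `IMG_SIZE` with bicubic interpolation before it reaches the model.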

In [34]:
# Free cached GPU memory
import torch
torch.cuda.empty_cache()

In [35]:
# Training function
import copy

def train(model,optimizer,train_loader,scheduler,device):
  model.to(device)
  model.train()
  criterion = nn.MSELoss().to(device)
  best_model = None
  best_loss = float('inf')
  for epoch in range(1,EPOCHS+1):
    train_loss = []
    for lr_img, hr_img in tqdm(train_loader):
      lr_img, hr_img = lr_img.float().to(device), hr_img.float().to(device)
      optimizer.zero_grad()
      # Forward pass
      pred_hr_img = model(lr_img)
      loss = criterion(pred_hr_img,hr_img)
      # Backpropagation
      loss.backward()
      optimizer.step()
      train_loss.append(loss.item())
    if scheduler is not None:
      scheduler.step()
    _train_loss = np.mean(train_loss)
    print(f'epoch:{epoch} train_loss:{_train_loss:.5f}')
    # Keep a snapshot of the model with the lowest training loss so far
    if best_loss > _train_loss:
      best_loss = _train_loss
      best_model = copy.deepcopy(model)
  return best_model
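
The training loop calls `scheduler.step()` once per epoch. With the `StepLR(step_size=5, gamma=0.5)` scheduler this notebook constructs, the learning rate is multiplied by 0.5 every 5 epochs, which is a multiplicative decay rather than a subtraction. The sketch below reproduces that schedule in plain Python, assuming the closed form `base_lr * gamma ** (steps_done // step_size)`; the function name is hypothetical.

```python
def step_lr(base_lr: float, epoch: int, step_size: int = 5, gamma: float = 0.5) -> float:
    """Learning rate used during a 1-indexed epoch under a StepLR-style schedule.

    scheduler.step() runs at the end of each epoch, so epoch e has seen e - 1 steps.
    """
    steps_done = epoch - 1
    return base_lr * gamma ** (steps_done // step_size)


if __name__ == "__main__":
    for epoch in (1, 5, 6, 11, 30):
        print(f"epoch {epoch:>2}: lr = {step_lr(1e-4, epoch):.3e}")
```

Over 30 epochs the rate is halved five times, so the final epochs run at 1/32 of the initial learning rate.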

In [36]:
# Create the model
model = nn.DataParallel(SRCNN())
optimizer = torch.optim.Adam(params=model.parameters(),lr=LEARING_RATE)
# Halve the learning rate every 5 epochs (gamma=0.5)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer,step_size=5,gamma=0.5)
trained_model = train(model,optimizer,train_loader,scheduler,device)

  0%|          | 0/1 [00:00<?, ?it/s]

OutOfMemoryError: CUDA out of memory. Tried to allocate 5.50 GiB. GPU 0 has a total capacity of 19.70 GiB of which 412.88 MiB is free. Process 3160497 has 19.29 GiB memory in use. Of the allocated memory 19.08 GiB is allocated by PyTorch, and 1.73 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
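
The failed 5.50 GiB allocation is consistent with a single activation tensor from the second convolution: 11 images (the training frame has 11 rows, so the one batch holds 11) × 32 channels × 2048 × 2048 pixels × 4 bytes. The sketch below does that arithmetic, assuming float32 activations, and compares it with a hypothetical 128 × 128 training patch. The patch size and helper name are illustrative assumptions, not something this notebook uses.

```python
def activation_gib(batch: int, channels: int, height: int, width: int,
                   bytes_per_element: int = 4) -> float:
    """Memory of one activation tensor in GiB (float32 by default)."""
    return batch * channels * height * width * bytes_per_element / 2 ** 30


if __name__ == "__main__":
    batch = 11  # 11 training rows with BATCH_SIZE=12 -> a single batch of 11
    for name, channels in (("features (64 ch)", 64), ("map (32 ch)", 32)):
        full = activation_gib(batch, channels, 2048, 2048)
        patch = activation_gib(batch, channels, 128, 128)
        print(f"{name}: {full:.2f} GiB at 2048x2048, {patch * 1024:.0f} MiB at 128x128")
```

The same activations must also be kept for the backward pass, so options such as a smaller `BATCH_SIZE`, a smaller `IMG_SIZE`, or training on cropped patches all reduce the footprint roughly in proportion to the number of pixels per batch.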