# Compression with Constraints: Steganography

The natural images are compressible, usually a full size image of several megabytes can be reduced to kilobytes level easily without lossing too much important information. Such property has been used widely to denoising, deblur, etc. techniques. 

The [``steganography``](https://en.wikipedia.org/wiki/Steganography) is a topic lying in the cryptography. It is concealing a file, message, image, or video within another file, message, image, or video. 

The advantage of steganography over cryptography alone is that the intended secret message ***does not attract attention to itself as an object of scrutiny***. Plainly visible encrypted messages, no matter how unbreakable they are, arouse interest and may in themselves be incriminating in countries in which encryption is illegal. 

Whereas cryptography is the practice of protecting the contents of a message alone, steganography is concerned both with concealing the fact that a secret message is being sent and its contents.

Steganography includes the concealment of information within computer files. In digital steganography, electronic communications may include steganographic coding inside of a transport layer, such as a document file, image file, program or protocol. Media files are ideal for steganographic transmission because of their large size. For example, a sender might start with an innocuous image file and adjust the color of every hundredth pixel to correspond to a letter in the alphabet. The change is so subtle that someone who is not specifically looking for it is unlikely to notice the change. 

In this project, we deal with a special case: stegranography with images only. So unlike many practical scenes, for instance, encrypt text, document in images, this task might not be able to produce perfect recovery of information.

## Purpose of the project

The project is not meant to create super powerful technique to conceal information, it is more concerned to get used to imaging processing libraries and optimization techniques. On the other hand, it also provides a challenge to think about how to detect the ``steganographic`` images without the original images.

## Mathematical aspects

The stegranography has two important components: encryption and decryption. 

Suppose you have an original image and a secret image.


1.   Encryption: As a sender, your task is to make the original image and secret image merge into one image. The purpose is two fold. Firstly, your outcome must be an image, if this image stays far away from the original image, then it will attract other people's attention. That will count as a failure. Secondly, your outcome image must also convey the information of the secret image. It is not that simple, since any blending of information will change each other. The problem is how much we can afford.
2.   Decryption: After the encryption part, your outcome image will have two parts of information coming from original image and the secret image. As the receiver, your task will be inverting the encryption process, to recover the secret image as much as possible (Caution,  the receiver do not care about original image). 

If we mathematically represent such process, let $x$ be the original image, $y$ is the secret image, then $z = E(x, y)$ is the encrypted image, $E$ is the encryption function. You will try to minimize 
$$\|z - x\|$$
The above norm is in certain sense, we will discuss that later. However, above minimization will subject to another constraint, which is the decryption function $D$ can recover sufficient information of the secret image. That is 
$$\|D(z) - y\|$$
should be as small as possible.

It is possible to construct a unified objective function:
$$\min_{E, D} \|x - E(x,y)\| + \gamma \|y - D(z)\|$$
where $E$ and $D$ are the parameters to find. $\gamma$ is a parameter chosen at your choice.

Of course, there are other constraints from the images, because images are pixels, each pixel contains 3 channels: R,G,B, each one is a 8-bit integer, goes from 0 to 255. If the image has 4 channels RGBA, then it will provide more information. Therefore above optimization problem also has constraints that $E(x,y)$ and $D(z)$ must be images. 



## Algorithm 101, LSB

The LSB is called least significant bits, which means you will replace the least significant bits of the original image with the secret image's most significant bits. This method will kill some information from both images, but the performance seems OK for general cases.

Here are a few references on this simple algorithm: 

0.   https://towardsdatascience.com/steganography-hiding-an-image-inside-another-77ca66b2acb1, the code is [here](https://github.com/kelvins/steganography)
1.   https://github.com/RobinDavid/LSB-Steganography 
2.   https://pdfs.semanticscholar.org/3dce/b6307cee042b687b7f377ec1d5de91ce20b0.pdf
3.   https://hackernoon.com/simple-image-steganography-in-python-18c7b534854f

The basic idea is (suppose you have a code to turn int8 into binary string),  inside each channel, say R, your original image's pixel, say represented as ``1001,0011``, and your secret image's that pixel is ``1110,1101``, then replace the last 4 bits in original image's pixel with the first 4 bits of secret image's corresponding pixel, the resulting number will be ``1001,1110``. In this way, the change in the original image could be small (on average).  There are other ways to alter the LSB, like treating the secret image as a binary string , and evenly distribute to each pixel. ***In our case, for simplicity, we only consider the images with the same size.***

## Shortcomings 

The shortcomings of algorithms/methods are mainly on the detection, which is, we can easily detect your outcome image is not feeling right. 

In practice, if you look at the altered image, say from LSB, you do not feel anything. But the LSB has a very obvious drawback: it alters the last bit, which may distory the statistics of the last bit. In theory, the last bit 0 and 1 should obey certain heuristic distribution in the image, but now it will be changed.

The detection code is here: https://github.com/b3dk7/StegExpose

In https://dl.acm.org/citation.cfm?id=1929317, the paper introduced the method to preserve the statistics. 

In https://pdfs.semanticscholar.org/80a5/fcbeda7697d9641bc80460593c2f8f305a65.pdf, it introduced the detection of LSB. 


In http://futuremedia.szu.edu.cn/assets/files/CF_What%20makes%20the%20stego%20imageundetectable.pdf, the authors considered choosing the best original image to hide the given secret image 

Again,  currently, we are not supposed to consider this far, but it will be a future work if you find this interesting.

## Other ways

The review paper (maybe old) is found here: https://www.sciencedirect.com/science/article/pii/B9780123855107000023

(some other reviews are found: [here](https://pdfs.semanticscholar.org/57a1/d15dcbf946f093a59db55f8828699fef7826.pdf) and [here](https://www.cscjournals.org/manuscript/Journals/IJCSS/Volume6/Issue3/IJCSS-670.pdf))


1.   https://arxiv.org/pdf/1606.05294.pdf. In this paper, it introduces the method to use NN to replace (learn) the LSB process. 
2.   https://papers.nips.cc/paper/6802-hiding-images-in-plain-sight-deep-steganography.pdf, it introduced a NN to  approximate $D$ and $E$. 
3. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.178.7157&rep=rep1&type=pdf, it uses the DCT (discrete-cosine-transform) and LSB. 
4. https://arxiv.org/pdf/1806.06357.pdf and  [code](https://github.com/adamcavendish/Deep-Image-Steganography)
5. https://ieeexplore.ieee.org/document/8403208/all-figures
6. https://eccv2018.org/openaccess/content_ECCV_2018/papers/Jiren_Zhu_HiDDeN_Hiding_Data_ECCV_2018_paper.pdf
7. https://arxiv.org/pdf/1904.01444.pdf
8. https://link.springer.com/article/10.1007/s00521-014-1702-1

## First task
Implement LSB, the images (orignal and secret) are of the same sizes. If you do not want to implement any, at least go through the code  [here](https://github.com/kelvins/steganography). 

## Second task
Try to use neural network to approximate $D$ and $E$, the parameters are up to you, the structure is up to you.  This paper provides a good insight: https://papers.nips.cc/paper/6802-hiding-images-in-plain-sight-deep-steganography.pdf,  an implementation is found here: https://github.com/fpingham/DeepSteg/blob/master/DeepSteganography.ipynb, https://github.com/Ankit-Dhankhar/deep-steg/blob/master/steg%20net.py, and https://github.com/mr3coi/deepsteg and https://github.com/alexandremuzio/deep-steg and https://github.com/harveyslash/Deep-Steganography....., a blog https://buzzrobot.com/hiding-images-using-ai-deep-steganography-b7726bd58b06

For the network structure, you can borrow the idea from autoencoder for the $E$ part, in that paper, the authors claimed the $E$ part uses 5 layers of convolutional neural networks with 3x3, 4x4, 5x5 patches. The idea is only to approximate the mappings $D$ and $E$, the fully connected network should also work, but convolutional type is cheaper. 

A good way to combine DCT (discrete-cosine-transform) to reduce the information first on secret images (bypassing the prep network in the paper). References are easy to find by searching google with DCT keywords. 

## Optional Task

What if you have two secret images to encrpt, what if there are more. Does LSB work? Does the NN work? 

## Data set 

https://www.kaggle.com/gaz3ll3/optimization-ii-project-3

In order to efficiency, we only consider small pictures, 256x256. If you have problem dealing with 256x256, you can resize them to 128x128 or 64x64.  If you feel the images are too many, you can sample a portion from them as well.  

Training and Validation sets are chosen at random (say, 80% and 20%). Each input data will be two images from the training set. 

If you are more comfortable with other data sets, it is up to you. Say you can use https://tiny-imagenet.herokuapp.com/ for 64x64 small images.

## Metric

In your trainging process for $D$ and $E$, the norm to compare images is the RMSE (root mean squared error), the images are of dimension $N\times N\times 3$, BTW.

## Your final result


1.   A writeup on your work, including performance, your work, issues, how do you solve the issues, etc. 
2.   Test your codes (LSB and NN) against the data set http://r0k.us/graphics/kodak/, each image will be downsize to 256x256 or 128x128 or 64x64 if you trained an NN on smaller images. Report your result in your writeup. 
3. Code, again, host on github. Submission will be a link. 
4. If you also tried the optional task, please also report that in your writeup.




In [63]:
print('Good luck!')

Good luck!


# First task

In [64]:
from PIL import Image

In [65]:
class Steganography(object):

  @staticmethod
  def __int_to_bin(rgb):
    """Convert an integer tuple to a binary (string) tuple.
    :param rgb: An integer tuple (e.g. (220, 110, 96))
    :return: A string tuple (e.g. ("00101010", "11101011", "00010110"))
    """
    r, g, b = rgb
    return ('{0:08b}'.format(r),
        '{0:08b}'.format(g),
        '{0:08b}'.format(b))

  @staticmethod
  def __bin_to_int(rgb):
    """Convert a binary (string) tuple to an integer tuple.
    :param rgb: A string tuple (e.g. ("00101010", "11101011", "00010110"))
    :return: Return an int tuple (e.g. (220, 110, 96))
    """
    r, g, b = rgb
    return (int(r, 2),
        int(g, 2),
        int(b, 2))

  @staticmethod
  def __merge_rgb(rgb1, rgb2):
    """Merge two RGB tuples.
    :param rgb1: A string tuple (e.g. ("00101010", "11101011", "00010110"))
    :param rgb2: Another string tuple
    (e.g. ("00101010", "11101011", "00010110"))
    :return: An integer tuple with the two RGB values merged.
    """
    r1, g1, b1 = rgb1
    r2, g2, b2 = rgb2
    rgb = (r1[:4] + r2[:4],
        g1[:4] + g2[:4],
        b1[:4] + b2[:4])
    return rgb

  @staticmethod
  def merge(img1, img2):
    """Merge two images. The second one will be merged into the first one.
    :param img1: First image
    :param img2: Second image
    :return: A new merged image.
    """

    # Check the images dimensions
    if img2.size[0] > img1.size[0] or img2.size[1] > img1.size[1]:
      raise ValueError('Image 2 should not be larger than Image 1!')

    # Get the pixel map of the two images
    pixel_map1 = img1.load()
    pixel_map2 = img2.load()

    # Create a new image that will be outputted
    new_image = Image.new(img1.mode, img1.size)
    pixels_new = new_image.load()

    for i in range(img1.size[0]):
      for j in range(img1.size[1]):
        rgb1 = Steganography.__int_to_bin(pixel_map1[i, j])

        # Use a black pixel as default
        rgb2 = Steganography.__int_to_bin((0, 0, 0))

        # Check if the pixel map position is valid for the second image
        if i < img2.size[0] and j < img2.size[1]:
          rgb2 = Steganography.__int_to_bin(pixel_map2[i, j])

        # Merge the two pixels and convert it to a integer tuple
        rgb = Steganography.__merge_rgb(rgb1, rgb2)

        pixels_new[i, j] = Steganography.__bin_to_int(rgb)

    return new_image

  @staticmethod
  def unmerge(img):
    """Unmerge an image.
    :param img: The input image.
    :return: The unmerged/extracted image.
    """

    # Load the pixel map
    pixel_map = img.load()

    # Create the new image and load the pixel map
    new_image = Image.new(img.mode, img.size)
    pixels_new = new_image.load()

    # Tuple used to store the image original size
    original_size = img.size

    for i in range(img.size[0]):
      for j in range(img.size[1]):
        # Get the RGB (as a string tuple) from the current pixel
        r, g, b = Steganography.__int_to_bin(pixel_map[i, j])

        # Extract the last 4 bits (corresponding to the hidden image)
        # Concatenate 4 zero bits because we are working with 8 bit
        rgb = (r[4:] + '0000',
            g[4:] + '0000',
            b[4:] + '0000')

        # Convert it to an integer tuple
        pixels_new[i, j] = Steganography.__bin_to_int(rgb)

        # If this is a 'valid' position, store it
        # as the last valid position
        if pixels_new[i, j] != (0, 0, 0):
          original_size = (i + 1, j + 1)

    # Crop the image based on the 'valid' pixels
    new_image = new_image.crop((0, 0, original_size[0], original_size[1]))

    return new_image

In [66]:
def merge(img1, img2, output):
  merged_image = Steganography.merge(Image.open(img1), Image.open(img2))
  merged_image.save(output)

def unmerge(img, output):
  unmerged_image = Steganography.unmerge(Image.open(img))
  unmerged_image.save(output)


# Second task

In [67]:
from google.colab import drive
drive.mount('/content/gdrive')
!unzip gdrive/MyDrive/data.zip

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).
Archive:  gdrive/MyDrive/data.zip
replace data/coast/arnat59.jpg? [y]es, [n]o, [A]ll, [N]one, [r]ename: A
  inflating: data/coast/arnat59.jpg  
  inflating: data/coast/art1130.jpg  
  inflating: data/coast/art294.jpg   
  inflating: data/coast/art487.jpg   
  inflating: data/coast/bea1.jpg     
  inflating: data/coast/bea10.jpg    
  inflating: data/coast/bea14.jpg    
  inflating: data/coast/bea2.jpg     
  inflating: data/coast/bea20.jpg    
  inflating: data/coast/bea24.jpg    
  inflating: data/coast/bea26.jpg    
  inflating: data/coast/bea27.jpg    
  inflating: data/coast/bea29.jpg    
  inflating: data/coast/bea3.jpg     
  inflating: data/coast/bea30.jpg    
  inflating: data/coast/bea33.jpg    
  inflating: data/coast/bea38.jpg    
  inflating: data/coast/bea39.jpg    
  inflating: data/coast/bea4.jpg     
  inflating: data/coast/bea5.jpg     
  i

In [68]:
import io
import os
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import IPython
from scipy import fftpack
from itertools import islice
import matplotlib.pyplot as plt
import numpy as np
import torch
from torch.autograd import Variable
from torch import utils
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import pickle
from torchvision import datasets, utils
from torch.utils.data import DataLoader, Dataset
import torchvision.transforms as transforms
from torchvision.transforms import ToPILImage
from random import shuffle

In [126]:
# Hyper Parameters
num_epochs = 110 # 设置模型训练的周期数
batch_size = 32 # 设置模型训练的batch size
learning_rate = 0.001 # 学习率
beta = 1

data_path = "/content/data" # 数据集的路径

# Mean and std deviation of imagenet dataset. Source: http://cs231n.stanford.edu/reports/2017/pdfs/101.pdf
std = [0.229, 0.224, 0.225]
mean = [0.485, 0.456, 0.406]

checkpoints_path = "models" # 模型保存的路径
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu') # 判断是否支持GPU训练
if not os.path.exists(checkpoints_path): os.mkdir(checkpoints_path) # 判断模型保存的文件夹是否存在，如果不，那就创建一个

In [70]:
# 定义损失函数
def customized_loss(S_prime1, S_prime2, C_prime, S1, S2, C, B):
  ''' Calculates loss specified on the paper.'''
  
  loss_cover = torch.nn.functional.mse_loss(C_prime, C) # 实际覆盖图像和生成的覆盖图像之间的标准 MSE
  loss_secret1 = torch.nn.functional.mse_loss(S_prime1, S1) # 实际加密图像和生成解密图像之间的 MSE
  loss_secret2 = torch.nn.functional.mse_loss(S_prime2, S2) # 实际加密图像和生成解密图像之间的 MSE
  loss_all = loss_cover + B * loss_secret1 + B*loss_secret2 # B就是前面那个beta是一个超参数，用来控制应该重建多少信息
  return loss_all, loss_cover, loss_secret1+loss_secret2

  # 这个添加高斯噪声的函数
def gaussian(tensor, mean=0, stddev=0.1):
    '''Adds random noise to a tensor.'''
    
    # noise = torch.nn.init.normal(torch.Tensor(tensor.size()), 0, 0.1)
    noise = torch.nn.init.normal_(torch.Tensor(tensor.size()), 0, 0.1).cuda()
    return (tensor + noise).clone().detach().requires_grad_(True)

In [71]:
# 预备网络用于准备隐藏加密图像，主要用途当加密图像（尺寸MM）小于载体图像（NN）的情况下，预备网络将加密图像的大小逐步增加到覆盖图像的大小，从而将保密图像的比特位分布到整个NN像素上。其次
# 与所有尺寸的隐藏图像相关的更重要的目的是将基于颜色的像素转换为更有用的特征，以便简洁地编码图像
# 这里主要是三个分别3x3,4x4, 5x5不同大小的卷积核, 并且每个卷积层有50个filters
class PrepNet(nn.Module):
    def __init__(self):
        super(PrepNet, self).__init__()
        
        # Preparation Network
        # 由于输入图像的是彩色图片，也就是图像是3维的，因此p1到p3的第一个ConvNet layer是输入通道数是3，输出是50，然后接下来第二个layer的输入和输出通道数都是50，每个layer后面都是使用relu激活函数，并padding补齐保持feature map的w和h与输入的图像w和h一致
        self.p1 = nn.Sequential(
            nn.Conv2d(3, 50, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(50, 50, kernel_size=3, padding=1),
            nn.ReLU()
        )
        
        self.p2 = nn.Sequential(
            nn.Conv2d(3, 50, kernel_size=4, padding=1),
            nn.ReLU(),
            nn.Conv2d(50, 50, kernel_size=4, padding=2),
            nn.ReLU()
        )

        self.p3 = nn.Sequential(
            nn.Conv2d(3, 50, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv2d(50, 50, kernel_size=5, padding=2),
            nn.ReLU()
        )
        # p4到p6的第一个layer的输入通道数是150，输出是50，主要是因为我们将p1到p3的特征拼接成一个具有150个通道的feature maps，然后第二个layer的输入和输出通道数均是50，也使用了relu激活函数，并padding补齐保持feature map的w和h与输入的图像w和h一致
        self.p4 = nn.Sequential(
            nn.Conv2d(150, 50, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(50, 50, kernel_size=3, padding=1),
            nn.ReLU()
        )
        
        self.p5 = nn.Sequential(
            nn.Conv2d(150, 50, kernel_size=4, padding=1),
            nn.ReLU(),
            nn.Conv2d(50, 50, kernel_size=4, padding=2),
            nn.ReLU()
        )

        self.p6 = nn.Sequential(
            nn.Conv2d(150, 50, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv2d(50, 50, kernel_size=5, padding=2),
            nn.ReLU()
        )

    def forward(self, S):
        
        p1 = self.p1(S)
        p2 = self.p2(S)
        p3 = self.p3(S)
        x = torch.cat((p1, p2, p3), 1)
        p4 = self.p4(x)
        p5 = self.p5(x)
        p6 = self.p6(x)
        y = torch.cat((p4, p5, p6), 1) # 拼接p4到p6的feature map并输出
        return y

In [120]:
# 解码网络： 将预备网络的输出和载体图像作为输入，并创建容器图像
# 与准备网络类似，这里同样使用了三个分别3x3,4x4, 5x5不同大小的卷积核, 并且每个卷积层有50个filters，但唯一不同的是该网络是将预备网络的输出和载体图像拼接起来作为输入
# 去除载体图像，显示加密图像
class EncoderNet(nn.Module):

    def __init__(self):
        super(EncoderNet, self).__init__()
        self.e1 = nn.Sequential(
            nn.Conv2d(303, 50, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(50, 50, kernel_size=3, padding=1),
            nn.ReLU(),
        )

        self.e2 = nn.Sequential(
            nn.Conv2d(303, 50, kernel_size=4, padding=1),
            nn.ReLU(),
            nn.Conv2d(50, 50, kernel_size=4, padding=2),
            nn.ReLU(),
        )

        self.e3 = nn.Sequential(
            nn.Conv2d(303, 50, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv2d(50, 50, kernel_size=5, padding=2),
            nn.ReLU(),
        )

        self.e4 = nn.Sequential(
            nn.Conv2d(150, 50, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv2d(50, 50, kernel_size=5, padding=2),
            nn.ReLU()
        )

        self.e5 = nn.Sequential(
            nn.Conv2d(150, 50, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv2d(50, 50, kernel_size=5, padding=2),
            nn.ReLU()
        )

        self.e6 = nn.Sequential(
            nn.Conv2d(150, 50, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv2d(50, 50, kernel_size=5, padding=2),
            nn.ReLU()
        )

        self.out1 = nn.Sequential(
            nn.Conv2d(150, 3, kernel_size=1, padding=0)
        )

        self.out2 = nn.Sequential(
            nn.Conv2d(150, 3, kernel_size=1, padding=0)
        )

    # 编码网络的模型结构与准备网络的模型结构类型，主要的不同有两点，第一个由于这个同时将预备网络的输出和载体图像作为输入，所以这个地方的e1到e3的第一个layer的输入通道数是153，以及最后使用一个卷积核为1的卷积层，完成最后的嵌入，也就是生成被隐藏信息的图像
    # 同时生成与被隐藏信息图像一样的高斯分布的噪声，其均值为0，标准差为0.1
    def forward(self, x):
        e1 = self.e1(x)
        e2 = self.e2(x)
        e3 = self.e3(x)
        x = torch.cat((e1, e2, e3), 1)
        e4 = self.e4(x)
        e5 = self.e5(x)
        e6 = self.e6(x)
        x = torch.cat((e4, e5, e6), 1)
        y1 = self.out1(x)
        y2 = self.out2(x)
        y_noise1 = gaussian(y1.data, 0, 0.1)
        y_noise2 = gaussian(y2.data, 0, 0.1)
        return y1, y_noise1, y_noise2

In [121]:
# 接收载密容器图像，恢复载体图像
class DecoderNet(nn.Module):
    def __init__(self):
        super(DecoderNet, self).__init__()

        self.d1 = nn.Sequential(
            nn.Conv2d(3, 50, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(50, 50, kernel_size=3, padding=1),
            nn.ReLU()
        )
        
        self.d2 = nn.Sequential(
            nn.Conv2d(3, 50, kernel_size=4, padding=1),
            nn.ReLU(),
            nn.Conv2d(50, 50, kernel_size=4, padding=2),
            nn.ReLU()
        )

        self.d3 = nn.Sequential(
            nn.Conv2d(3, 50, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv2d(50, 50, kernel_size=5, padding=2),
            nn.ReLU()
        )

        self.d4 = nn.Sequential(
            nn.Conv2d(150, 50, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(50, 50, kernel_size=3, padding=1),
            nn.ReLU()
        )
        
        self.d5 = nn.Sequential(
            nn.Conv2d(150, 50, kernel_size=4, padding=1),
            nn.ReLU(),
            nn.Conv2d(50, 50, kernel_size=4, padding=2),
            nn.ReLU()
        )

        self.d6 = nn.Sequential(
            nn.Conv2d(150, 50, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv2d(50, 50, kernel_size=5, padding=2),
            nn.ReLU()
        )
        self.final = nn.Sequential(
            nn.Conv2d(150, 3, kernel_size=1, padding=0)
        )
    # 解码网络与编码网络类似，主要是用来解密图像
    def forward(self, x):
        d1 = self.d1(x)
        d2 = self.d2(x)
        d3 = self.d3(x)
        x = torch.cat((d1, d2, d3), 1)
        d4 = self.d4(x)
        d5 = self.d5(x)
        d6 = self.d6(x)
        x = torch.cat((d4, d5, d6), 1)
        y = self.final(x)
        return y

In [122]:
# 这个模型就是以上三个模块实现
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.p1 = PrepNet()
        self.p2 = PrepNet()
        self.e = EncoderNet()
        self.d = DecoderNet()

    def forward(self, S1, S2, C):
        S1 = self.p1(S1)
        S2 = self.p1(S2)
        x = torch.cat((S1, S2, C), 1)
        x1, x_noise1, x_noise2 = self.e(x)
        y1 = self.d(x_noise1)
        y2 = self.d(x_noise2)
        return x, y1, y2

In [123]:
model = Model().cuda() # 将模型挂载在GPU上

In [107]:
# 加载图片，主要是将文件的路径保存在一个list
def load_data(data_path):
    data_list = []
    for root, dirs, files in os.walk(data_path):
        for file in files:
            if file.endswith(".jpg"):
                data_list.append(os.path.join(root, file))
    return data_list

In [109]:
data_list = load_data(data_path)
shuffle(data_list) # 将以上返回的图片路径的list打乱
data_list = np.array(data_list)
# 将数据进行切分，80%：20% 训练集和测试集
train_data = data_list[0:int(0.8*len(data_list))]
test_data = data_list[int(0.8*len(data_list)):]

In [127]:
# 对数据集进行转换，主要有，由于图像的大小是256x256x3，但是为了计算效率所以将图像resize成128，这样一次性就可以加载更多的图片
train_transforms = transforms.Compose([
        transforms.Resize(64),
        transforms.RandomCrop(64),
        transforms.ToTensor(),
        transforms.Normalize(mean=mean,
        std=std)
        ])

test_transforms = transforms.Compose([
        transforms.Resize(64),
        transforms.RandomCrop(64),
        transforms.ToTensor(),
        transforms.Normalize(mean=mean,
        std=std)
        ])

In [128]:
# 这个地方就是加载图片保存在内存中
class MyDataset(Dataset):

    def __init__(self, imgs, transform=None):
        super(MyDataset, self).__init__()   #对继承自父类的属性进行初始化
        self.imgs = imgs
        self.transform = transform

    def __getitem__(self, index):
        img_path = self.imgs[index]
        img = Image.open(img_path)
        if self.transform is not None:
            data = self.transform(img)
        else:
            img = np.asarray(img)
            data = torch.from_numpy(img)
        return data
    
    def __len__(self):
        '''返回数据集的长度'''
        return len(self.imgs)

In [129]:
# 数据集加载以及设置数据加载的方式
train_dataset = MyDataset(train_data, train_transforms)
test_dataset = MyDataset(test_data, test_transforms)
#Creates training set
train_loader = torch.utils.data.DataLoader(
        train_dataset, batch_size=batch_size, num_workers=0, 
        pin_memory=True, shuffle=True, drop_last=True)

# Creates test set
test_loader = torch.utils.data.DataLoader(
        test_dataset, batch_size=2, num_workers=0, 
        pin_memory=True, shuffle=False, drop_last=True)

In [130]:
# 开始训练网络
def train_model(train_loader, beta, learning_rate):
    
    # Save optimizer
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)
    
    loss_history = []
    # Iterate over batches performing forward and backward passes
    for epoch in range(num_epochs):

        # Train mode
        model.train()
        
        train_losses = []
        # Train one epoch
        for idx, train_batch in enumerate(train_loader):

            data = train_batch
            data = data.cuda()

            # Saves secret images and secret covers
            train_covers = data[:21]
            train_secrets1 = data[21:42]
            train_secrets2 = data[42:63]
            
            # Creates variable from secret and cover images
            train_secrets1 = train_secrets1.clone().detach().requires_grad_(True)
            train_secrets2 = train_secrets2.clone().detach().requires_grad_(True)
            train_covers = train_covers.clone().detach().requires_grad_(True)

            # Forward + Backward + Optimize
            optimizer.zero_grad()
            train_hidden, train_output1, train_output2 = model(train_secrets1, train_secrets2, train_covers)

            # Calculate loss and perform backprop
            train_loss, train_loss_cover, train_loss_secret = customized_loss(train_output1, train_output2, train_hidden, train_secrets1, train_secrets2, train_covers, beta)
            train_loss.backward()
            optimizer.step()
            
            # Saves training loss
            # print(train_loss.data)
            train_losses.append(torch.Tensor.item(train_loss.data))
            loss_history.append(torch.Tensor.item(train_loss.data))
            
            # Prints mini-batch losses
            print('Training: Batch {0}/{1}. Loss of {2:.4f}, cover loss of {3:.4f}, secret loss of {4:.4f}'.format(idx+1, len(train_loader), (torch.Tensor.item(train_loss.data)), torch.Tensor.item(train_loss_cover.data), torch.Tensor.item(train_loss_secret.data)))
        
        if epoch+1 % 10 == 0:
            torch.save(model.state_dict(), checkpoints_path+'Epoch N{}.pkl'.format(epoch+1))
        
        mean_train_loss = np.mean(train_losses)
    
        # Prints epoch average loss
        print ('Epoch [{0}/{1}], Average_loss: {2:.4f}'.format(
                epoch+1, num_epochs, mean_train_loss))
    
    return model, mean_train_loss, loss_history

In [131]:
model, mean_train_loss, loss_history = train_model(train_loader, beta, learning_rate)


RuntimeError: ignored

In [None]:
model.load_state_dict(torch.load(checkpoints_path+'Epoch N110.pkl'))

In [None]:
# 反归一化只要是用来显示图片内容
def denormalize(image, std, mean):
    ''' Denormalizes a tensor of images.'''

    for t in range(3):
        image[t, :, :] = (image[t, :, :] * std[t]) + mean[t]
    return image
# 显示图片的函数
def imshow(img, idx, learning_rate, beta):
    '''Prints out an image given in tensor format.'''
    
    img = denormalize(img, std, mean)
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.title('Example '+str(idx)+', lr='+str(learning_rate)+', B='+str(beta))
    plt.show()
    return

In [None]:
# Switch to evaluate mode
model.eval()

test_losses = []
# Show images
for idx, test_batch in enumerate(test_loader):
     # Saves images
    data = test_batch.cuda()
    # Saves secret images and secret covers
    test_secret = data[:len(data)//2]
    test_cover = data[len(data)//2:]

    # Creates variable from secret and cover images

    test_secret = test_secret.clone().detach().requires_grad_(False)
    test_cover = test_cover.clone().detach().requires_grad_(False)
    # Compute output
    test_hidden, test_output = model(test_secret, test_cover)
    
    # Calculate loss
    test_loss, loss_cover, loss_secret = customized_loss(test_output, test_hidden, test_secret, test_cover, beta)
    
    test_secret = test_secret.to('cpu')
    test_cover = test_cover.to('cpu')
    test_hidden, test_output = test_hidden.cpu(), test_output.cpu() 
    
#     diff_S, diff_C = np.abs(np.array(test_output.data[0]) - np.array(test_secret.data[0])), np.abs(np.array(test_hidden.data[0]) - np.array(test_cover.data[0]))
    
#     print (diff_S, diff_C)
#     test_secret.data, test_output.data, test_cover.data, test_hidden.data,test_output.data = 
    if idx % 100 == 0:
        print ('Total loss: {:.2f} \nLoss on secret: {:.2f} \nLoss on cover: {:.2f}'.format(torch.Tensor.item(test_loss.data), torch.Tensor.item(loss_secret.data), torch.Tensor.item(loss_cover.data)))

        # Creates img tensor
        imgs = [test_secret.data, test_output.data, test_cover.data, test_hidden.data,]
        imgs_tsor = torch.cat(imgs, 0)

        # Prints Images
        
        imshow(utils.make_grid(imgs_tsor), idx+1, learning_rate=learning_rate, beta=beta)
        
    test_losses.append(torch.Tensor.item(test_loss.data))
        
mean_test_loss = np.mean(test_losses)

print ('Average loss on test set: {:.2f}'.format(mean_test_loss))