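## About this document
**Contents:** This assignment replaces the encoder of the original [U-Net](https://arxiv.org/abs/1505.04597) with the Bottleneck blocks of [ResNet](https://arxiv.org/abs/1512.03385).
  - **Part 1**: understanding U-Net and implementing the network
  - **Part 2**: implementing a ResNet version of U-Net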

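### Understanding and summarizing U-Net
Any discussion of U-Net usually starts with FCN, which holds an important place in the history of neural-network-based semantic segmentation. Building on it, the U-Net authors proposed a symmetric architecture with an encoder and a decoder as its backbone. The network segments accurately and needs less training data. Beyond the structural differences, how else do U-Net and FCN differ? The points below summarize this.
  #### Comparison with FCN
  - U-Net's encoder uses a classic five-stage design, whereas FCN builds its convolutional layers on VGG.
  - FCN pads the first convolution by 100 and every other layer by 1, while U-Net uses a padding of 0. My understanding is that the later "copy and crop" step needs cropping because the unpadded convolutions leave the encoder and decoder feature maps at different sizes. The paper says the same: "The cropping is necessary due to the loss of border pixels in every convolution."
  - For skip connections, U-Net concatenates the feature map from the matching encoder layer, whereas FCN upsamples the output of the final 1x1 convolution (by 2x or 4x, depending on the variant) and adds it pixel by pixel to the feature map of the matching size.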
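The padding and skip-connection points above can be checked with a few tensor operations. This is a minimal sketch: the channel counts and tensor sizes are arbitrary values chosen for illustration, not numbers taken from either paper.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 8, 32, 32)

# An unpadded 3x3 convolution loses one border pixel on each side.
valid_conv = nn.Conv2d(8, 8, kernel_size=3, padding=0)
# padding=1 keeps the spatial size unchanged.
same_conv = nn.Conv2d(8, 8, kernel_size=3, padding=1)
valid_shape = tuple(valid_conv(x).shape)
same_shape = tuple(same_conv(x).shape)

# U-Net style fusion: concatenate along the channel axis (channels add up).
a = torch.randn(1, 8, 16, 16)
b = torch.randn(1, 8, 16, 16)
concat_shape = tuple(torch.cat([a, b], dim=1).shape)
# FCN style fusion: element-wise sum (channel count stays the same).
sum_shape = tuple((a + b).shape)

print(valid_shape, same_shape, concat_shape, sum_shape)
```

Concatenation keeps both sources intact and lets the following convolution decide how to mix them, at the cost of a wider input. Addition keeps the width fixed but requires both tensors to have the same shape.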
  
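#### Summary of U-Net's advantages
  - Good results from a small training set
  - Higher segmentation accuracy, which makes it valuable in practice
  - U-Net is an influential architecture in its own right; the second part of this document combines it with a strong classification network to see whether the results improve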

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F

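##### U-Net architecture analysis
  **First, an analysis of the U-Net architecture**
   - The U-Net encoder consists of 5 stages, so after 4 downsampling steps the feature map is smaller than 1/16 of the input size per side
   - The middle of U-Net applies two convolution layers and then connects to the decoder
   - The decoder upsamples the feature maps; either transposed convolution or bilinear interpolation works, although bilinear interpolation is reported to perform slightly better
   - A skip connection first crops the encoder feature map and then concatenates it with the matching decoder feature map using torch.cat(), which joins tensors along an existing dimension and does not add a new one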
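The crop-then-concatenate step in the last bullet can be sketched as follows. The helper name `center_crop` and the tensor sizes are assumptions made for this example. The contrast with `torch.stack` shows what "does not add a new dimension" means in practice.

```python
import torch

def center_crop(feature, target_hw):
    """Crop the central target_hw region from an NCHW tensor."""
    _, _, h, w = feature.shape
    th, tw = target_hw
    top = (h - th) // 2
    left = (w - tw) // 2
    return feature[:, :, top:top + th, left:left + tw]

encoder_map = torch.randn(1, 64, 64, 64)  # larger map coming from the encoder
decoder_map = torch.randn(1, 64, 56, 56)  # upsampled map in the decoder

cropped = center_crop(encoder_map, decoder_map.shape[2:])
merged = torch.cat([decoder_map, cropped], dim=1)     # still 4-D, channels doubled
stacked = torch.stack([decoder_map, cropped], dim=1)  # 5-D, a new axis is created

print(merged.shape, stacked.shape)
```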

### U-Net implementation
   **1.** The U-Net encoder consists of 5 identical stages. Each stage applies two 3x3 convolutions, each followed by **ReLU** and **BN**
  

In [None]:
class UnetDownBlock(nn.Module):
    def __init__(self, in_channels, out_channels,norm_layer=True):
        super(UnetDownBlock, self).__init__()
        block_list = []
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=0)
        self.relu1 = nn.ReLU(inplace=True)
        self.bn1 = nn.BatchNorm2d(out_channels)  # bn1 follows conv1, so it normalizes out_channels maps
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=0)
        self.relu2 = nn.ReLU(inplace=True)
        self.bn2 = nn.BatchNorm2d(out_channels)
        block_list.append(self.conv1)
        block_list.append(self.relu1)
        if norm_layer:
            block_list.append(self.bn1)
        block_list.append(self.conv2)
        block_list.append(self.relu2)
        if norm_layer:
            block_list.append(self.bn2)
        self.block = nn.Sequential(*block_list)
    
    def forward(self, x):
        out = self.block(x)
        return out

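**2.** Define the upsampling block. Two upsampling methods are available: bilinear interpolation and transposed convolution, although the latter is less common now. After upsampling, the feature map again passes through two 3x3 convolutions.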
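As a rough comparison of the two upsampling options, the sketch below applies both to the same tensor. The channel counts (128 to 64) and the 28x28 input are assumed values for illustration only.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 128, 28, 28)

# Option 1: a transposed convolution doubles H and W and halves the channels.
up_conv = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2)

# Option 2: bilinear upsampling has no learnable weights, so a 1x1
# convolution follows it to change the channel count.
up_sample = nn.Sequential(
    nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
    nn.Conv2d(128, 64, kernel_size=1),
)

y_conv = up_conv(x)
y_sample = up_sample(x)
print(y_conv.shape, y_sample.shape)
```

Both paths produce the same output shape, so they are interchangeable from the point of view of the rest of the decoder.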

In [None]:
class UnetUpBlock(nn.Module):
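    def __init__(self, in_channels, out_channels, up_method, norm_layer):
        super(UnetUpBlock, self).__init__()
        if up_method == "upconv":
            # Transposed convolution: learnable 2x upsampling
            self.up_func = nn.ConvTranspose2d(in_channels, out_channels, kernel_size=2, stride=2)
        elif up_method == "upsample":
            # Bilinear 2x upsampling, then a 1x1 convolution to reduce the channels
            self.up_func = nn.Sequential(
                nn.Upsample(mode="bilinear", scale_factor=2),
                nn.Conv2d(in_channels, out_channels, kernel_size=1),)
        else:
            raise ValueError("up_method must be 'upconv' or 'upsample'")
        # Input is the concatenation of the upsampled map and the cropped skip map
        self.conv_block = UnetDownBlock(in_channels, out_channels, norm_layer)
    
    # Center-crop the encoder feature map (bridge) to the decoder's spatial size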
    def copy_crop(self, bridge, target_size):
        _, _, h, w = bridge.size()
        diff_y = (h - target_size[0])//2
        diff_x = (w - target_size[1])//2  
        return bridge[:, :, diff_y:(diff_y+target_size[0]), 
                      diff_x:(diff_x+target_size[1])]

    def forward(self, x, bridge):
        up = self.up_func(x)
        add_map = self.copy_crop(bridge, up.shape[2:])
        out = torch.cat([up, add_map], 1)
        out = self.conv_block(out)
        return out 

**3.** Define the encoder. It runs the 5 convolution stages and uses max pooling to downsample between them

In [None]:
class UnetEncode(nn.Module):
    def __init__(
        self,
        in_channels=1,
        depth=5,
        wf=6,
        padding=False,
        norm_layer=False):
        super(UnetEncode, self).__init__()
        self.padding = padding
        self.depth = depth
        prev_channels = in_channels
        self.down_path = nn.ModuleList()
        
        # Downsampling path
        for i in range(depth):
            self.down_path.append(
                UnetDownBlock(prev_channels, 2 ** (wf + i),norm_layer)
            )
            prev_channels = 2 ** (wf + i)
    def forward(self, x):
        blocks = []
        
        
        for i, down in enumerate(self.down_path):
            x = down(x)
            if i != len(self.down_path) - 1:
                blocks.append(x)
                x = F.max_pool2d(x, 2)
        blocks.append(x)

        return blocks

**4.** Define the complete U-Net

In [None]:
class UNet(nn.Module):
    def __init__(
        self,
        n_classes=2,
        depth=5,
        wf=6,
        padding=False,
        batch_norm=False,
        up_mode='upconv',
    ):
        super(UNet, self).__init__()
        assert up_mode in ('upconv', 'upsample')
        self.padding = padding
        self.depth = depth
        prev_channels = 2 ** (wf + depth-1)
        
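        # Pass the settings through so the encoder matches the decoder built below
        self.encode = UnetEncode(depth=depth, wf=wf, padding=padding, norm_layer=batch_norm)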
        self.up_path = nn.ModuleList()
        for i in reversed(range(depth - 1)):
            self.up_path.append(
                UnetUpBlock(prev_channels, 2 ** (wf + i), up_mode, batch_norm)
            )
            prev_channels = 2 ** (wf + i)

        self.last = nn.Conv2d(prev_channels, n_classes, kernel_size=1)

    def forward(self, x):
        
        blocks = self.encode(x)
        x = blocks[-1]
        for i, up in enumerate(self.up_path):
            x = up(x, blocks[-i - 2])

        return self.last(x)

In [None]:
x = torch.randn((1,1, 572,572))
unet = UNet()
unet.eval()
y_unet = unet(x)

In [None]:
y_unet.size()
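
For the 572x572 input used above, the expected spatial size of the output can be worked out by hand. The sketch below assumes the settings of the cells above (depth of 5, unpadded 3x3 convolutions, 2x2 max pooling, 2x upsampling). The function name `unet_output_size` is illustrative.

```python
def unet_output_size(size, depth=5):
    """Return (bottleneck size, output size) for an unpadded U-Net."""
    # Encoder: two unpadded 3x3 convolutions per stage remove 4 pixels,
    # and a 2x2 max pooling halves the size between stages.
    for stage in range(depth):
        size -= 4
        if stage != depth - 1:
            size //= 2
    bottom = size
    # Decoder: 2x upsampling, then two unpadded 3x3 convolutions per stage.
    for _ in range(depth - 1):
        size = size * 2 - 4
    return bottom, size

bottom, out = unet_output_size(572)
print(bottom, out)  # 28 388
```

The bottleneck size of 28 is below 572 / 16, which is why the encoder output is described as smaller than 1/16 of the input.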

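## Replacing the U-Net encoder with ResNet
The earlier study of ResNet covered its overall architecture and the subtle differences between the variants with fewer than 50 layers and those with 50 or more. Networks with fewer than 50 layers mainly use the BasicBlock (two 3x3 convolutions plus a residual connection), while networks with 50 or more layers mainly use the BottleNeck (1x1, 3x3, 1x1), where the channel dimension changes within a single block. This is where 1x1 convolutions show their advantage.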
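
One way to see the advantage of the 1x1 convolutions is to count parameters. The sketch below compares the convolutional body of a BasicBlock-style block with that of a bottleneck-style block at the same width. The width of 256 channels and the helper `count_params` are assumptions for this example. Biases and normalization layers are left out to keep the count simple.

```python
import torch.nn as nn

def count_params(module):
    """Total number of learnable parameters in a module."""
    return sum(p.numel() for p in module.parameters())

channels = 256  # assumed block width, chosen for illustration

# BasicBlock-style body: two 3x3 convolutions at full width.
basic = nn.Sequential(
    nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
    nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
)

# Bottleneck-style body: 1x1 squeeze, 3x3 at reduced width, 1x1 restore.
mid = channels // 4
bottleneck = nn.Sequential(
    nn.Conv2d(channels, mid, kernel_size=1, bias=False),
    nn.Conv2d(mid, mid, kernel_size=3, padding=1, bias=False),
    nn.Conv2d(mid, channels, kernel_size=1, bias=False),
)

basic_params = count_params(basic)
bottleneck_params = count_params(bottleneck)
print(basic_params, bottleneck_params)  # 1179648 69632
```

At this width the bottleneck body needs roughly 17 times fewer weights, which is what makes very deep networks affordable.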

##### Define a CBR layer for reuse, since many 1x1 and 3x3 CBR layers follow

In [None]:

class CBR_Layer(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=7, padding=3, stride=2):
        super(CBR_Layer, self).__init__()
        block = []
        block.append(nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size, 
                              padding = padding, stride = stride))
        block.append(nn.ReLU(inplace=True))
        block.append(nn.BatchNorm2d(out_channels))
        self.block = nn.Sequential(*block)
    def forward(self, x):
        out = self.block(x)
        return out

**1. First, define the BasicBlock**

In [None]:
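class BasicBlock(nn.Module):
    def __init__(self, inplanes, planes, norm_layer=True):
        super(BasicBlock, self).__init__()
        # Two padded 3x3 convolutions, so the output keeps the input's spatial size
        layers = []
        for in_ch in (inplanes, planes):
            layers.append(nn.Conv2d(in_ch, planes, kernel_size=3, padding=1))
            layers.append(nn.ReLU(inplace=True))
            if norm_layer:
                layers.append(nn.BatchNorm2d(planes))
        self.block = nn.Sequential(*layers)
        # 1x1 projection on the shortcut when the channel count changes
        self.shortcut = nn.Conv2d(inplanes, planes, kernel_size=1) if inplanes != planes else None

    def forward(self, x):
        identity = x if self.shortcut is None else self.shortcut(x)
        out = self.block(x)
        out = out + identity  # residual connection
        return out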

**2. Define the BottleNeck**

In [None]:
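class BottleNeck(nn.Module):
    expansion = 4  # output channels : middle channels = 4 : 1

    def __init__(self, in_chans, out_chans):
        super(BottleNeck, self).__init__()
        assert out_chans % self.expansion == 0
        # The identity shortcut in forward() needs matching channel counts
        assert in_chans == out_chans
        mid_chans = out_chans // self.expansion

        # CBR_Layer defaults to a strided 7x7 convolution, so kernel size,
        # padding and stride are set explicitly here
        self.block1 = CBR_Layer(in_chans, mid_chans, kernel_size=1, padding=0, stride=1)  # squeeze
        self.block2 = CBR_Layer(mid_chans, mid_chans, kernel_size=3, padding=1, stride=1)  # extract features
        self.block3 = CBR_Layer(mid_chans, out_chans, kernel_size=1, padding=0, stride=1)  # restore

    def forward(self, x):
        identity = x

        out = self.block1(x)
        out = self.block2(out)
        out = self.block3(out)

        out += identity  # residual connection
        return out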

class DownBottleNeck(nn.Module):
    expansion = 4  # channel ratio within one bottleneck block is 1:1:4

    def __init__(self, inplanes, planes, stride=1, downsample=None, norm_layer=None):
        super(DownBottleNeck, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d

        self.cbr1 = CBR_Layer(inplanes, planes, kernel_size=3, stride=1,padding=1)
        self.cbr2 = CBR_Layer(planes, planes, kernel_size=3, stride=1, padding=1)
        # Shortcut projection: applied to the block input, so it takes inplanes channels
        self.conv1 = nn.Conv2d(inplanes, planes*self.expansion, kernel_size=1,padding=0,stride=stride)
        # Main path: 1x1 convolution that expands to planes * expansion channels
        self.conv3 = nn.Conv2d(planes, planes*self.expansion,kernel_size=1, stride=stride)
        self.bn3 = norm_layer(planes*self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        identity = x

        out = self.cbr1(x)
        out = self.cbr2(out)

        out = self.conv3(out)
        out = self.bn3(out)

        identity = self.conv1(x)
        out += identity
        out = self.relu(out)
        return out  

In [None]:
def make_layers(in_channels, layer_list,name="vgg"):
    layers = []
    if name=="vgg":
        for v in layer_list:
            layers += [CBR_Layer(in_channels, v)]
            in_channels = v
    
    elif name=="resnet":
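        # The first block downsamples (stride 2). DownBottleNeck outputs
        # planes * expansion channels, so planes is layer_list[0] // expansion.
        planes = layer_list[0] // DownBottleNeck.expansion
        layers += [DownBottleNeck(in_channels, planes, stride=2)]
        in_channels = layer_list[0]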
        
        for v in layer_list[1:]:
            layers += [BottleNeck(in_channels, v)]
            in_channels = v
    return nn.Sequential(*layers)
            

class Layer(nn.Module):
    def __init__(self, in_channels, layer_list ,net_name):
        super(Layer, self).__init__()
        
        self.layer = make_layers(in_channels, layer_list, name=net_name)
    def forward(self, x):
        out = self.layer(x)
        return out

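**3. Define the encoder of Res_U-Net**
  A few notes:    
    1) U-Net is still divided into 5 stages  
    2) In the original U-Net each stage has a single block; with ResNet a stage can contain several blocks  
    3) In the original U-Net every block uses 3x3 convolutions; with ResNet the blocks use a 1x1-3x3-1x1 structure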
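
To see what the five stages do to the resolution, the sketch below traces the feature-map size for a 256x256 input using plain arithmetic. It assumes a 7x7 stem convolution with stride 2 and padding 3, a 3x3 max pooling with stride 2 and `ceil_mode=True`, and a stride-2 first block in stages 3 to 5, as in the standard ResNet layout. The helper names are illustrative.

```python
import math

def conv_out(size, kernel, stride, padding):
    """Output size of a convolution (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out_ceil(size, kernel, stride):
    """Output size of an unpadded max pooling with ceil_mode=True."""
    return math.ceil((size - kernel) / stride) + 1

size = 256
size = conv_out(size, kernel=7, stride=2, padding=3)  # stem convolution
stem = size
size = pool_out_ceil(size, kernel=3, stride=2)        # max pooling
stages = []
for index, channels in enumerate([256, 512, 1024, 2048]):
    if index > 0:
        # Assumed: the first block of stages 3 to 5 halves the resolution
        size = conv_out(size, kernel=1, stride=2, padding=0)
    stages.append((channels, size))

print(stem, stages)
```

With floor rounding in the pooling layer the size after the stem would be 63 instead of 64, and the later stages would no longer halve cleanly.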

In [None]:
class ResNet101(nn.Module):
    '''
    ResNet101 model 
    '''
    def __init__(self):
        super(ResNet101,self).__init__()
        # CBR_Layer defaults: 7x7 convolution, padding 3, stride 2
        self.conv1 = CBR_Layer(3,64)
        # ceil_mode must be True, otherwise the resolution comes out wrong
        self.pool1 = nn.MaxPool2d(kernel_size=3,stride=2,ceil_mode=True)
        
        self.conv2_1 = DownBottleNeck(64,64)
        self.conv2_2 = BottleNeck(256,256)
        self.conv2_3 = BottleNeck(256,256)
        
        # ResNet-101 stacks 3, 4, 23 and 3 blocks in its four stages
        self.layer3 = Layer(256,[512]*4,'resnet')
        self.layer4 = Layer(512,[1024]*23,'resnet')
        self.layer5 = Layer(1024,[2048]*3,'resnet')
    
    def forward(self,x):
        
        f1 = self.conv1(x)
        f2 = self.conv2_3(self.conv2_2(self.conv2_1(self.pool1(f1))))
        
        f3 = self.layer3(f2)
        f4 = self.layer4(f3)
        f5 = self.layer5(f4)
        return [f2,f3,f4,f5]

In [None]:
class ResNetUNet(nn.Module):
    def __init__(
        self,
        n_classes=2,
        depth=5,
        wf=6,
        padding=1,
        batch_norm=False,
        up_mode='upconv',
    ):
        super(ResNetUNet, self).__init__()
        assert up_mode in ('upconv', 'upsample')
        self.padding = padding
        self.depth = depth
        prev_channels = 2 ** (wf + depth)
        
        """"""
        self.encode = ResNet101()
        
        self.up_path = nn.ModuleList()
        for i in reversed(range(2,depth)):
            self.up_path.append(
                UnetUpBlock(prev_channels, 2 ** (wf + i), up_mode, batch_norm)
            )
            prev_channels = 2 ** (wf + i)

        self.last = nn.Conv2d(prev_channels, n_classes, kernel_size=1)

    def forward(self, x):
        blocks = self.encode(x)
        x = blocks[-1]
        for i, up in enumerate(self.up_path):
            x = up(x, blocks[-i - 2])

        return self.last(x)
        

In [None]:
x = torch.randn((1,3, 256,256))
unet = ResNetUNet()
unet.eval()
y_unet = unet(x)