## Part A — Calculations (20 pts)

### A1. [6 pts] 1D Convolution (Stride 1, No Padding)

**Given**  
Input: \(x=[2,\,1,\,2,\,3,\,1,\,0]\) (length \(N=6\))  
Kernel: \(w=[1,\,-1,\,2]\) (length \(F=3\))  
Stride \(S=1\), Padding \(P=0\)

**Output length**
$$
L_{\text{out}}=\frac{N-F+2P}{S}+1=\frac{6-3+0}{1}+1=\mathbf{4}.
$$

**Cross-correlation dot products** (window \(\cdot\) kernel; the kernel is not flipped, which is the deep-learning convention for "convolution"):
$$[2,1,2]\cdot[1,-1,2]=2-1+4=\mathbf{5}$$
$$[1,2,3]\cdot[1,-1,2]=1-2+6=\mathbf{5}$$
$$[2,3,1]\cdot[1,-1,2]=2-3+2=\mathbf{1}$$
$$[3,1,0]\cdot[1,-1,2]=3-1+0=\mathbf{2}$$

**Answer:** \(\boxed{[5,\,5,\,1,\,2]}\)
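The four dot products above can be checked with a few lines of plain Python. This is a minimal sketch, not a library routine: the function name `cross_correlate_1d` is made up for illustration, and it assumes no padding and no kernel flip, as in the hand calculation.

```python
def cross_correlate_1d(x, w, stride=1):
    """Slide w over x (no padding, no flip) and return the dot products."""
    n_out = (len(x) - len(w)) // stride + 1  # number of full windows
    return [
        sum(x[i * stride + k] * w[k] for k in range(len(w)))
        for i in range(n_out)
    ]

x = [2, 1, 2, 3, 1, 0]
w = [1, -1, 2]
print(cross_correlate_1d(x, w))  # [5, 5, 1, 2]
```

The `stride` argument only changes which window starts are visited; each window's dot product is computed the same way.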


### A2. [6 pts] 1D Convolution (Stride 2, No Padding)

Same \(x\) and \(w\); now \(S=2\), \(P=0\).

**Output length**
$$
L_{\text{out}}=\left\lfloor\frac{6-3}{2}+1\right\rfloor
=\left\lfloor1.5+1\right\rfloor=\mathbf{2}.
$$

**Sampled starts:** \(i=0,\,2\)
$$[2,1,2]\cdot[1,-1,2]=\mathbf{5}\qquad
[2,3,1]\cdot[1,-1,2]=\mathbf{1}$$

**Answer:** \(\boxed{[5,\,1]}\)
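The floor in the output-length formula means the last input sample may never be used. The sketch below lists the window starts for \(N=6\), \(F=3\), \(S=2\) and the input indices they cover; `window_starts` is a hypothetical helper name, not part of any library.

```python
def window_starts(n, f, stride):
    """Start index of every full window of length f over n samples (no padding)."""
    n_out = (n - f) // stride + 1  # integer division plays the role of the floor
    return [i * stride for i in range(n_out)]

starts = window_starts(n=6, f=3, stride=2)
covered = sorted({j for s in starts for j in range(s, s + 3)})
print(starts)   # [0, 2]
print(covered)  # [0, 1, 2, 3, 4]
```

Index 5 is missing from `covered`: a third window would have to start at \(i=4\) and read past the end of \(x\), so it is dropped.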


### A3. [8 pts] 2D Convolution (Stride 1, No Padding)

**Input**
$$
X=\begin{bmatrix}
1&2&0&1\\
0&1&3&2\\
2&1&0&1\\
1&0&2&3
\end{bmatrix} \quad (4\times4)
$$

**Kernel**
$$
K=\begin{bmatrix}
1&-1\\
0&2
\end{bmatrix} \quad (2\times2)
$$

**Output size** (\(S=1,\,P=0\))
$$
H_{\text{out}}=W_{\text{out}}=\frac{4-2+0}{1}+1=3
$$
Final feature map: **\(3\times3\)**.

**Requested values (cross-correlation, no flip)**

Top-left \((0,0)\): use \(X[0{:}2,0{:}2]=\begin{bmatrix}1&2\\0&1\end{bmatrix}\)
$$
1\cdot1+2\cdot(-1)+0\cdot0+1\cdot2=\boxed{1}
$$

Top-right \((0,2)\): use \(X[0{:}2,2{:}4]=\begin{bmatrix}0&1\\3&2\end{bmatrix}\)
$$
0\cdot1+1\cdot(-1)+3\cdot0+2\cdot2=\boxed{3}
$$

Bottom-left \((2,0)\): use \(X[2{:}4,0{:}2]=\begin{bmatrix}2&1\\1&0\end{bmatrix}\)
$$
2\cdot1+1\cdot(-1)+1\cdot0+0\cdot2=\boxed{1}
$$

**Final output size:** \(\boxed{3\times3}\).
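Only three entries were requested, but the full \(3\times3\) map is a useful cross-check on the hand calculations. The following is a plain-Python sketch under the same assumptions (stride 1, no padding, no kernel flip); `cross_correlate_2d` is an illustrative name.

```python
def cross_correlate_2d(X, K):
    """Valid (no padding), stride-1 cross-correlation of matrix X with kernel K."""
    kh, kw = len(K), len(K[0])
    out_h = len(X) - kh + 1
    out_w = len(X[0]) - kw + 1
    return [
        [
            sum(X[i + a][j + b] * K[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

X = [
    [1, 2, 0, 1],
    [0, 1, 3, 2],
    [2, 1, 0, 1],
    [1, 0, 2, 3],
]
K = [
    [1, -1],
    [0, 2],
]
for row in cross_correlate_2d(X, K):
    print(row)
# [1, 8, 3]
# [1, -2, 3]
# [1, 5, 5]
```

The entries at \((0,0)\), \((0,2)\), and \((2,0)\) are 1, 3, and 1, matching the boxed values.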


## Part B — Single-Layer CNNs (20 pts)

### B1. [10 pts] Conv Layer

**Network**
1) Input: \((1, 1, 28, 28)\)  
2) Conv: in_channels=1, out_channels=4, kernel_size=3, stride=1, padding=1

#### Manual verification
For PyTorch `Conv2d` (dilation = 1), each spatial dim uses:
$$
H_{\text{out}}=\left\lfloor \frac{H + 2p - (k-1) - 1}{s} + 1 \right\rfloor,\quad
W_{\text{out}}=\left\lfloor \frac{W + 2p - (k-1) - 1}{s} + 1 \right\rfloor.
$$
Here \(H=W=28,\;k=3,\;s=1,\;p=1\):
$$
H_{\text{out}}=W_{\text{out}}=\left\lfloor \frac{28 + 2\cdot 1 - (3-1) - 1}{1} + 1 \right\rfloor
= \left\lfloor 27 \right\rfloor + 1 = 28.
$$

**Expected shape after conv:** \(\boxed{(1, 4, 28, 28)}\).
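The size rule can also be written as a small function, which makes it easy to try other settings. This is a sketch of the formula only, not a call into PyTorch; the name `conv_out_size` is an assumption, and the `dilation` argument generalizes the \((k-1)\) term to \(d\,(k-1)\).

```python
def conv_out_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a Conv2d-style layer."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

print(conv_out_size(28, kernel=3, stride=1, padding=1))  # 28
```

With stride 1 and an odd kernel, choosing \(p=(k-1)/2\) keeps the spatial size unchanged, which is why \(k=3,\;p=1\) maps 28 to 28.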


In [1]:
# B1: Conv layer (nn.Module), print input & output shapes
import torch
import torch.nn as nn

class SingleConv(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(in_channels=1, out_channels=4, kernel_size=3, stride=1, padding=1)

    def forward(self, x):
        print("B1 Input:", x.shape)
        y = self.conv(x)
        print("B1 After Conv:", y.shape)
        return y

# Test
x = torch.randn(1, 1, 28, 28)
_ = SingleConv()(x)


B1 Input: torch.Size([1, 1, 28, 28])
B1 After Conv: torch.Size([1, 4, 28, 28])


### B2. [10 pts] Conv + MaxPool

**Network**
1) Input: \((1, 1, 32, 32)\)  
2) Conv: \(1 \to 6\), kernel=5, stride=1, padding=0  
3) Pool: MaxPool2d(kernel_size=2, stride=2)

#### Manual verification
**After Conv** (dilation = 1):
$$
H_{\text{out}}=W_{\text{out}}=\left\lfloor \frac{32 + 2\cdot 0 - (5-1) - 1}{1} + 1 \right\rfloor
= \left\lfloor 27 \right\rfloor + 1 = 28.
$$
So the tensor becomes \((1, 6, 28, 28)\).

**After MaxPool2d** (kernel=2, stride=2, padding=0; dilation=1):
$$
H_{\text{out}}=W_{\text{out}}=\left\lfloor \frac{28 + 0 - (2-1) - 1}{2} + 1 \right\rfloor
= \left\lfloor \frac{26}{2} + 1 \right\rfloor = 14.
$$
Final shape: \(\boxed{(1, 6, 14, 14)}\).
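Because conv and pool share the same size rule at dilation 1, the two steps can be traced in one loop. The sketch below is plain Python; the layer spec tuples `(kind, out_channels, kernel, stride, padding)` are a made-up format for illustration, not a PyTorch structure.

```python
def out_size(size, kernel, stride, padding=0):
    """Conv2d / MaxPool2d size rule for dilation 1."""
    return (size + 2 * padding - (kernel - 1) - 1) // stride + 1

def trace(shape, layers):
    """Return the (C, H, W) shape after each layer.

    Pooling keeps the channel count, so its out_channels entry is None.
    """
    c, h, w = shape
    shapes = []
    for kind, out_c, k, s, p in layers:
        c = c if out_c is None else out_c
        h, w = out_size(h, k, s, p), out_size(w, k, s, p)
        shapes.append((kind, (c, h, w)))
    return shapes

b2 = trace((1, 32, 32), [("conv", 6, 5, 1, 0), ("pool", None, 2, 2, 0)])
print(b2)  # [('conv', (6, 28, 28)), ('pool', (6, 14, 14))]
```

The batch dimension is left out because neither layer changes it.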


In [2]:
# B2: Conv + MaxPool (nn.Module), print shapes after each layer
import torch
import torch.nn as nn

class ConvMaxPoolNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5, stride=1, padding=0)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)

    def forward(self, x):
        print("B2 Input:", x.shape)
        x = self.conv(x)
        print("B2 After Conv:", x.shape)
        x = self.pool(x)
        print("B2 After Pool:", x.shape)
        return x

# Test
x = torch.randn(1, 1, 32, 32)
_ = ConvMaxPoolNet()(x)


B2 Input: torch.Size([1, 1, 32, 32])
B2 After Conv: torch.Size([1, 6, 28, 28])
B2 After Pool: torch.Size([1, 6, 14, 14])


## Part C — Multi-Layer CNNs (20 pts)

### C1. [10 pts] Two Conv Layers

**Network**  
- Input: \((1, 1, 32, 32)\)  
- Conv1: \(1 \rightarrow 6\), kernel \(k=5\), stride \(s=1\), padding \(p=0\)  
- Conv2: \(6 \rightarrow 16\), kernel \(k=5\), stride \(s=1\), padding \(p=0\)

**Manual size verification**  
For PyTorch `Conv2d` (with dilation \(d=1\)), each spatial dimension uses:
$$
H_{\text{out}}
= \left\lfloor \frac{H + 2p - (k-1) - 1}{s} + 1 \right\rfloor,
\quad
W_{\text{out}}
= \left\lfloor \frac{W + 2p - (k-1) - 1}{s} + 1 \right\rfloor.
$$

**After Conv1** (start \(H=W=32\), \(k=5, s=1, p=0\)):
$$
H_{\text{out}}=W_{\text{out}}
= \left\lfloor \frac{32 + 0 - (5-1) - 1}{1} + 1 \right\rfloor
= \left\lfloor 27 \right\rfloor + 1
= 28.
$$
Shape: \(\boxed{(1,\,6,\,28,\,28)}\).

**After Conv2** (start \(28\times28\), \(k=5, s=1, p=0\)):
$$
H_{\text{out}}=W_{\text{out}}
= \left\lfloor \frac{28 + 0 - (5-1) - 1}{1} + 1 \right\rfloor
= \left\lfloor 23 \right\rfloor + 1
= 24.
$$
Shape: \(\boxed{(1,\,16,\,24,\,24)}\).

**Expected shapes:** after Conv1 → \((1,6,28,28)\); after Conv2 → \((1,16,24,24)\).

---

### C2. [10 pts] Conv + Pool + Linear

**Network**  
- Input: \((1, 1, 28, 28)\)  
- Conv: \(1 \rightarrow 8\), kernel \(k=3\), stride \(s=1\), padding \(p=1\)  
- Pool: `MaxPool2d(kernel_size=2, stride=2)`  
- Flatten  
- Linear: \(\text{in\_features} \rightarrow 10\)

**Manual size verification**

**After Conv** (\(H=W=28, k=3, s=1, p=1\)):
$$
H_{\text{out}}=W_{\text{out}}
= \left\lfloor \frac{28 + 2\cdot 1 - (3-1) - 1}{1} + 1 \right\rfloor
= \left\lfloor 27 \right\rfloor + 1
= 28.
$$
Shape: \(\boxed{(1,\,8,\,28,\,28)}\).

**After MaxPool(2,2)** (pooling uses the same shape rule with \(d=1\)):
$$
H_{\text{out}}=W_{\text{out}}
= \left\lfloor \frac{28 + 0 - (2-1) - 1}{2} + 1 \right\rfloor
= \left\lfloor \frac{26}{2} + 1 \right\rfloor
= 14.
$$
Shape: \(\boxed{(1,\,8,\,14,\,14)}\).

**Flatten → Linear in\_features**
$$
8 \times 14 \times 14 = \boxed{1568}
$$
So the classifier is `Linear(1568, 10)`.

**Expected shapes:** after Conv → \((1,8,28,28)\); after Pool → \((1,8,14,14)\); after Flatten → \((1,1568)\); after Linear → \((1,10)\).
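The value 1568 depends on the input size, so it can help to derive `in_features` from the size rule instead of treating it as a fixed constant. A minimal sketch, assuming a square input and the C2 layer settings; `flat_features` is a hypothetical helper name.

```python
import math

def out_size(size, kernel, stride, padding=0):
    """Conv2d / MaxPool2d size rule for dilation 1."""
    return (size + 2 * padding - (kernel - 1) - 1) // stride + 1

def flat_features(side, channels=8):
    """Features entering the Linear layer for a side x side input to the C2 stack."""
    side = out_size(side, kernel=3, stride=1, padding=1)  # conv keeps the size
    side = out_size(side, kernel=2, stride=2)             # pool halves it
    return math.prod((channels, side, side))

print(flat_features(28))  # 1568
print(flat_features(32))  # 2048
```

A \(32\times32\) input would need `Linear(2048, 10)` rather than `Linear(1568, 10)`; feeding it to the 1568-feature layer raises a shape mismatch error at the Linear step.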




In [3]:
# === C1. Two Conv Layers ===
import torch
import torch.nn as nn

class TwoConvNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 6, kernel_size=5, stride=1, padding=0)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5, stride=1, padding=0)

    def forward(self, x):
        print("C1 Input:", x.shape)           # (1, 1, 32, 32)
        x = self.conv1(x)
        print("C1 After Conv1:", x.shape)     # (1, 6, 28, 28)
        x = self.conv2(x)
        print("C1 After Conv2:", x.shape)     # (1, 16, 24, 24)
        return x

# quick test
_ = TwoConvNet()(torch.randn(1, 1, 32, 32))


C1 Input: torch.Size([1, 1, 32, 32])
C1 After Conv1: torch.Size([1, 6, 28, 28])
C1 After Conv2: torch.Size([1, 16, 24, 24])


In [4]:
# === C2. Conv + Pool + Flatten + Linear ===
import torch
import torch.nn as nn

class ConvPoolLinearNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, kernel_size=3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = nn.Flatten()           # flattens from dim 1 .. -1
        self.fc = nn.Linear(1568, 10)         # 8 * 14 * 14 = 1568

    def forward(self, x):
        print("C2 Input:", x.shape)           # (1, 1, 28, 28)
        x = self.conv(x)
        print("C2 After Conv:", x.shape)      # (1, 8, 28, 28)
        x = self.pool(x)
        print("C2 After Pool:", x.shape)      # (1, 8, 14, 14)
        x = self.flatten(x)
        print("C2 After Flatten:", x.shape)   # (1, 1568)
        x = self.fc(x)
        print("C2 After Linear:", x.shape)    # (1, 10)
        return x

# quick test
_ = ConvPoolLinearNet()(torch.randn(1, 1, 28, 28))


C2 Input: torch.Size([1, 1, 28, 28])
C2 After Conv: torch.Size([1, 8, 28, 28])
C2 After Pool: torch.Size([1, 8, 14, 14])
C2 After Flatten: torch.Size([1, 1568])
C2 After Linear: torch.Size([1, 10])


## Part D — Advanced Structures (40 pts)

### D1 — CNN with Mixed Pooling (15 pts)
**Network**
- Input: (1, 1, 28, 28)
- Conv1: 1→8, k=3, s=1, p=1 → MaxPool(2,2)
- Conv2: 8→16, k=3, s=1, p=1 → AvgPool(2,2)
- Flatten → Linear: 16×7×7 → 10

**Manual size checks** (per spatial dim; dilation d=1):
$$
H_{\text{out}}=\left\lfloor \frac{H+2p-(k-1)-1}{s}+1\right\rfloor\quad(\text{same for }W)
$$
- After Conv1: 28→28 → (1,8,28,28)
- MaxPool(2,2): 28→14 → (1,8,14,14)
- After Conv2: 14→14 → (1,16,14,14)
- AvgPool(2,2): 14→7 → (1,16,7,7)
- Flatten → Linear in_features = 16×7×7 = **784** → Linear(784, 10)

**Linear params** = 784×10 + 10 = **7,850**.
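The same weights-plus-biases counting extends to the two conv layers. The sketch below assumes every layer has a bias term (the PyTorch default) and uses illustrative helper names; the pooling layers have no learnable parameters and are omitted.

```python
def conv_params(in_c, out_c, k, bias=True):
    """Weights (out_c * in_c * k * k) plus one bias per output channel."""
    return out_c * in_c * k * k + (out_c if bias else 0)

def linear_params(in_f, out_f, bias=True):
    """Weights (in_f * out_f) plus one bias per output feature."""
    return in_f * out_f + (out_f if bias else 0)

counts = {
    "conv1": conv_params(1, 8, 3),
    "conv2": conv_params(8, 16, 3),
    "fc": linear_params(16 * 7 * 7, 10),
}
print(counts)                # {'conv1': 80, 'conv2': 1168, 'fc': 7850}
print(sum(counts.values()))  # 9098
```

The Linear layer accounts for most of the parameters in this network, even though it is only one of three learnable layers.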

---

### D2 — CNN with Strides (10 pts)
**Network**
- Input: (1, 1, 64, 64)
- Conv1: 1→8, k=3, s=2, p=1
- Conv2: 8→16, k=3, s=2, p=1

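**Manual check** (with \(k=3\), \(p=1\), \(s=2\), each conv maps \(H \to \lfloor (H-1)/2 \rfloor + 1\), which halves an even \(H\); same for \(W\)):
- After Conv1: 64 → 32 → (1,8,32,32)
- After Conv2: 32 → 16 → (1,16,16,16)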
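"Halves" is exact only for even sizes. Evaluating the size rule for a few inputs shows that \(k=3\), \(p=1\), \(s=2\) rounds up for odd sizes. This is a plain-Python sketch of the formula with those settings as defaults, not a PyTorch call.

```python
def out_size(size, kernel=3, stride=2, padding=1):
    """Conv2d size rule for dilation 1, defaulting to the D2 settings."""
    return (size + 2 * padding - (kernel - 1) - 1) // stride + 1

for n in (64, 32, 65, 7):
    print(n, "->", out_size(n))
# 64 -> 32
# 32 -> 16
# 65 -> 33
# 7 -> 4
```

With these settings the output size equals \(\lceil H/2 \rceil\), so 64 → 32 → 16 as in the manual check.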

---

### D3 — Two-Branch CNN (15 pts)
**Network**
- Input: (1,1,32,32)
- Branch A: Conv 1→4 (k=3,p=1,s=1) → MaxPool(2,2) → (1,4,16,16)
- Branch B: Conv 1→4 (k=5,p=2,s=1) → AvgPool(2,2) → (1,4,16,16)
- Fusion: concatenate along channels: (1,8,16,16)
- Flatten → Linear: in_features = 8×16×16 = **2048** → Linear(2048, 10)
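Channel-wise concatenation only works when the branches agree on batch size, height, and width, which is why Branch B uses \(p=2\) with \(k=5\). The sketch below checks that condition on shape tuples alone; `cat_channels` is a hypothetical helper that mirrors the shape behavior of concatenating along `dim=1`, not a PyTorch function.

```python
def cat_channels(*shapes):
    """Shape of concatenating (N, C, H, W) tensors along the channel axis."""
    n, _, h, w = shapes[0]
    for s in shapes:
        if (s[0], s[2], s[3]) != (n, h, w):
            raise ValueError(f"cannot concatenate {s} with {shapes[0]} along channels")
    return (n, sum(s[1] for s in shapes), h, w)

branch_a = (1, 4, 16, 16)
branch_b = (1, 4, 16, 16)
fused = cat_channels(branch_a, branch_b)
print(fused)                          # (1, 8, 16, 16)
print(fused[1] * fused[2] * fused[3])  # 2048
```

If Branch B used \(p=0\) instead, its conv would output \(28\times28\) and its pool \(14\times14\), and the concatenation would fail because \(14 \ne 16\).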


In [5]:
# Part D — PyTorch implementations with shape prints
import torch
import torch.nn as nn

# D1
class MixedPoolingNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 8, 3, 1, 1)
        self.maxp  = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(8, 16, 3, 1, 1)
        self.avgp  = nn.AvgPool2d(2, 2)
        self.flatten = nn.Flatten()
        self.fc = nn.Linear(16*7*7, 10)

    def forward(self, x):
        print("D1 Input:", x.shape)           # (1,1,28,28)
        x = self.conv1(x); print("D1 Conv1:", x.shape)    # (1,8,28,28)
        x = self.maxp(x);  print("D1 MaxPool:", x.shape)  # (1,8,14,14)
        x = self.conv2(x); print("D1 Conv2:", x.shape)    # (1,16,14,14)
        x = self.avgp(x);  print("D1 AvgPool:", x.shape)  # (1,16,7,7)
        x = self.flatten(x); print("D1 Flatten:", x.shape) # (1,784)
        x = self.fc(x);    print("D1 Linear:", x.shape)   # (1,10)
        return x

# D2
class StrideCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 8, 3, 2, 1)  # 64->32
        self.conv2 = nn.Conv2d(8,16, 3, 2, 1)  # 32->16
    def forward(self, x):
        print("D2 Input:", x.shape)            # (1,1,64,64)
        x = self.conv1(x); print("D2 Conv1:", x.shape)    # (1,8,32,32)
        x = self.conv2(x); print("D2 Conv2:", x.shape)    # (1,16,16,16)
        return x

# D3
class TwoBranchCNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Branch A
        self.convA = nn.Conv2d(1, 4, 3, 1, 1)
        self.maxpA = nn.MaxPool2d(2, 2)
        # Branch B
        self.convB = nn.Conv2d(1, 4, 5, 1, 2)
        self.avgpB = nn.AvgPool2d(2, 2)
        # Head
        self.flatten = nn.Flatten()
        self.fc = nn.Linear(8*16*16, 10)  # 2048 -> 10

    def forward(self, x):
        print("D3 Input:", x.shape)           # (1,1,32,32)
        A = self.maxpA(self.convA(x)); print("D3 Branch A:", A.shape)  # (1,4,16,16)
        B = self.avgpB(self.convB(x)); print("D3 Branch B:", B.shape)  # (1,4,16,16)
        x = torch.cat([A,B], dim=1);  print("D3 Concat:", x.shape)     # (1,8,16,16)
        x = self.flatten(x);          print("D3 Flatten:", x.shape)    # (1,2048)
        x = self.fc(x);               print("D3 Linear:", x.shape)     # (1,10)
        return x

# quick sanity runs (comment out when submitting)
_ = MixedPoolingNet()(torch.randn(1,1,28,28))
_ = StrideCNN()(torch.randn(1,1,64,64))
_ = TwoBranchCNN()(torch.randn(1,1,32,32))

# Linear param count for D1
mp = MixedPoolingNet()
print("D1 Linear params:", mp.fc.weight.numel() + mp.fc.bias.numel())  # expect 7850


D1 Input: torch.Size([1, 1, 28, 28])
D1 Conv1: torch.Size([1, 8, 28, 28])
D1 MaxPool: torch.Size([1, 8, 14, 14])
D1 Conv2: torch.Size([1, 16, 14, 14])
D1 AvgPool: torch.Size([1, 16, 7, 7])
D1 Flatten: torch.Size([1, 784])
D1 Linear: torch.Size([1, 10])
D2 Input: torch.Size([1, 1, 64, 64])
D2 Conv1: torch.Size([1, 8, 32, 32])
D2 Conv2: torch.Size([1, 16, 16, 16])
D3 Input: torch.Size([1, 1, 32, 32])
D3 Branch A: torch.Size([1, 4, 16, 16])
D3 Branch B: torch.Size([1, 4, 16, 16])
D3 Concat: torch.Size([1, 8, 16, 16])
D3 Flatten: torch.Size([1, 2048])
D3 Linear: torch.Size([1, 10])
D1 Linear params: 7850
