# **Homework Assignment: Understanding Deconvolution in Autoencoders**
---------------

In class, we worked with autoencoders built from multilayer perceptrons (MLPs). However, encoders are often constructed using convolutional architectures to better capture spatial patterns. In this assignment, you'll explore how the decoder can use deconvolutional (transposed convolution) layers to mirror the operations performed by the convolutional encoder and restore the spatial resolution it reduced.

While convolutional encoders are relatively well understood, **decoding (or upsampling) the compressed representation** using **deconvolutional layers** (also known as **transposed convolutions**) often raises questions.

This assignment is particularly relevant because deconvolution is a core component of the U-Net architecture, a prominent neural network used extensively in image segmentation tasks.

Your main objective is to deeply understand **how transposed convolution layers work**, and explain them in both words and visuals.


## **The Objective**

Understand and clearly explain how **transposed convolutions** work. Use 2D transposed convolutions and a small grid of 2D points as a working example.

You may need to do some additional reading to complete this assignment.

## **Tasks & Deliverables**

### 1. **Theory Exploration**

Using markdown cells in your Colab notebook, answer the following:

- What is a **transposed convolution**?
- How does it differ from a regular convolution?
- How does it upsample feature maps?
- What are **stride**, **padding**, and **kernel size**, and how do they influence the result in a transposed convolution?
- To earn the full two points, your explanation must be detailed enough for a reader to reproduce the upsampling process step by step.


### 2. **Manual Diagram (by your hand, not a generated image)**

Carefully plan and draw **by hand** a diagram or a set of diagrams that:

- Explain the process of using **transposed convolution**.
- Use an example of a **small input grid of 2D points** which gets expanded into a larger output grid.
- Explain how stride, padding, and the kernel shape affect the result.
- Show intermediate steps of the operation, not just input and output.

**Scan or photograph your diagram(s)**, and upload it to your **GitHub repository** for this course.

Then embed it in your Colab notebook using markdown (you can find examples on *how to do it* in previous notebooks related to this class, e.g. the one on linear regression or the one on the MLP network).


### 3. **Publish on GitHub**  
   - Place the Colab notebook in your **GitHub repository** for this course.
   - In your repository’s **README**, add a **link** to the notebook and also include an **“Open in Colab”** badge at the top of the notebook so it can be launched directly from GitHub.


# THEORY EXPLORATION ANSWER

- **What is a transposed convolution?**
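A transposed convolution is an upsampling operation: it produces a higher-resolution output from a smaller input by multiplying each input value by a filter (kernel) and writing the scaled copy of the filter into the output. Moving the filter across the output in this way generates more values than the input had; how many depends on the size of the filter, the stride, and the padding.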

- **How does it differ from a regular convolution?**
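The direction of the operation is reversed, although a transposed convolution restores only the spatial size of the input, not its original values. In a regular convolution, a larger image is typically reduced to a smaller one: each patch of values is multiplied elementwise by the filter and summed into a single value. That value is a score (not a probability) of how strongly the patch matches the pattern encoded by the filter. A transposed convolution goes the other way: each single input value is spread over a filter-sized patch of the output.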

- **How does it upsample feature maps?**
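For each value in the input image (or any tensor), we multiply the whole filter (kernel) by that value, which gives a patch the size of the kernel. Each patch is written into the output at a position determined by the stride, and where patches overlap, their values are added together.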
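
The steps above can be sketched in plain Python. This is a minimal, single-channel illustration, not a library implementation: the function name `transposed_conv2d` and its parameters are assumptions made for this example, and `padding` is treated as cropping the output on each side, following the convention used by PyTorch's `ConvTranspose2d`.

```python
def transposed_conv2d(x, kernel, stride=1, padding=0):
    """Scatter-add transposed convolution on nested lists (single channel)."""
    n_rows, n_cols = len(x), len(x[0])
    k_rows, k_cols = len(kernel), len(kernel[0])

    # Output size before cropping: (n - 1) * stride + k
    out_rows = (n_rows - 1) * stride + k_rows
    out_cols = (n_cols - 1) * stride + k_cols
    out = [[0] * out_cols for _ in range(out_rows)]

    for i in range(n_rows):
        for j in range(n_cols):
            # Scale the whole kernel by one input value and add it to the
            # output patch whose top-left corner is (i * stride, j * stride).
            for a in range(k_rows):
                for b in range(k_cols):
                    out[i * stride + a][j * stride + b] += x[i][j] * kernel[a][b]

    # Assumed convention: padding removes that many rows/columns per side.
    if padding > 0:
        out = [row[padding:out_cols - padding]
               for row in out[padding:out_rows - padding]]
    return out


x = [[1, 2],
     [3, 4]]
ones = [[1, 1],
        [1, 1]]

print(transposed_conv2d(x, ones, stride=1))
# [[1, 3, 2], [4, 10, 6], [3, 7, 4]]   (patches overlap and are summed)
print(transposed_conv2d(x, ones, stride=2))
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]   (no overlap)
```

With stride 1 the four scaled kernels overlap, so the centre value is 1 + 2 + 3 + 4 = 10; with stride 2 they sit side by side and nothing is summed.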

- What are **stride**, **padding**, and **kernel size**, and how do they influence the result in a transposed convolution?

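**Stride** is the number of pixels (index steps) by which the filter moves across the output image from one input value to the next, both left to right and top to bottom. A larger stride places the filter copies further apart and therefore produces a larger output.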

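**Padding** is a border of zeros added around the data tensor, so that the filter can be placed over edge values without reaching outside the data. In the example below, a filter can be positioned at the value 1 and still lie entirely inside the padded tensor. In a transposed convolution, the `padding` parameter (in the convention used by PyTorch's `ConvTranspose2d`) works in the opposite direction: a padding of $p$ removes $p$ rows and columns from each side of the output, so a larger padding gives a smaller output.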
$$
\begin{array}{ccc}
\begin{pmatrix}
1 & 2 \\
3 & 4
\end{pmatrix}
&
\xrightarrow{\text{padded}}
&
\begin{pmatrix}
0 & 0 & 0 & 0\\
0 & 1 & 2 & 0\\
0 & 3 & 4 & 0\\
0 & 0 & 0 & 0
\end{pmatrix}
\end{array}
$$
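
The padded matrix above also appears in an equivalent way of computing a transposed convolution. For stride 1 and a `padding` parameter of 0, the input is padded with $k - 1$ zeros on each side and the kernel, flipped in both directions, is slid over it as a regular convolution. For a $2 \times 2$ kernel, $k - 1 = 1$, which is exactly the border shown. The sketch below illustrates this under those assumptions; the helper names `pad2d`, `flip2d`, and `conv2d_valid` are made up for this example, and the kernel values are arbitrary.

```python
def pad2d(x, p):
    """Surround a 2D nested list with p rows/columns of zeros."""
    width = len(x[0]) + 2 * p
    top = [[0] * width for _ in range(p)]
    middle = [[0] * p + list(row) + [0] * p for row in x]
    bottom = [[0] * width for _ in range(p)]
    return top + middle + bottom


def flip2d(kernel):
    """Reverse the kernel along both axes (rotate by 180 degrees)."""
    return [row[::-1] for row in kernel[::-1]]


def conv2d_valid(x, kernel):
    """Regular stride-1 sliding-window sum with no extra padding."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(x) - k_rows + 1
    out_cols = len(x[0]) - k_cols + 1
    return [[sum(x[r + a][c + b] * kernel[a][b]
                 for a in range(k_rows) for b in range(k_cols))
             for c in range(out_cols)]
            for r in range(out_rows)]


x = [[1, 2],
     [3, 4]]
kernel = [[1, 2],
          [3, 4]]  # arbitrary, asymmetric on purpose so the flip matters

padded = pad2d(x, len(kernel) - 1)  # k - 1 = 1, square kernel assumed
print(padded)
# [[0, 0, 0, 0], [0, 1, 2, 0], [0, 3, 4, 0], [0, 0, 0, 0]]

print(conv2d_valid(padded, flip2d(kernel)))
# [[1, 4, 4], [6, 20, 16], [9, 24, 16]]
```

The $3 \times 3$ result is the same one obtained by scaling the kernel by each input value and summing the overlapping patches, for example the centre value is $1 \cdot 4 + 2 \cdot 3 + 3 \cdot 2 + 4 \cdot 1 = 20$.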

**Kernel size** is the dimension of the kernel (filter). In a transposed convolution, a larger kernel produces a larger output tensor and a smaller kernel a smaller one. Filters usually have an odd size, so that they have a central element and are symmetric around it.
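
The combined effect of the three parameters on the output size can be written, per spatial dimension, as $(n - 1) \cdot s - 2p + k$ for input size $n$, stride $s$, padding $p$, and kernel size $k$. This matches the formula documented for PyTorch's `ConvTranspose2d` when dilation is 1 and output padding is 0; other frameworks may define padding differently. The helper name below is an assumption made for this sketch.

```python
def transposed_conv_output_size(n, kernel_size, stride=1, padding=0):
    """Output length along one dimension (dilation 1, no output padding)."""
    return (n - 1) * stride - 2 * padding + kernel_size


# A 2-wide input with stride 1: a bigger kernel gives a bigger output.
for k in (2, 3, 5):
    print(k, transposed_conv_output_size(2, kernel_size=k))
# 2 3
# 3 4
# 5 6

# A bigger stride enlarges the output; padding shrinks it.
print(transposed_conv_output_size(2, kernel_size=2, stride=2))             # 4
print(transposed_conv_output_size(2, kernel_size=3, stride=2, padding=1))  # 3
```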


- To earn full two points, your explanation must be detailed enough for a reader to reproduce the upsampling process step by step.