# Exercises: Convolutions and CNNs

## Problems

1. Implementing Convolutions
2. Interpreting Convolutions
3. CIFAR-10 CNN
4. Subword Tokenization


In [None]:
import numpy as np

## 1. Implementing Convolutions

Implement the function `conv()` below, which takes two arguments:
* `data` is a one-dimensional array that contains the data to convolve.
* `filter` is a one-dimensional array that contains the filter. This array always has an odd length.

It returns the convolution of the two. Some additional notes:
* You do not need to reverse the filter. We won't worry about that.
* Make sure to pad your input array so that the output is the same length as the input!

*You should not use any imported libraries, beyond basic array operations from numpy. The goal of this problem is to implement convolutions from basic coding.*

In [None]:
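def conv(data, filter):
  data_len = len(data)
  filter_len = len(filter)

  # Zero-pad both ends so the output has the same length as the input.
  # list(data) keeps `+` as concatenation even if data is a numpy array.
  pad_size = filter_len // 2
  pad_data = [0] * pad_size + list(data) + [0] * pad_size

  result = []

  for i in range(data_len):
    # Slide the (unreversed) filter over the padded data and take the dot product.
    window = pad_data[i:i + filter_len]
    value = sum(window[j] * filter[j] for j in range(filter_len))

    result.append(value)
  return result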

(a) Here are some test cases for your code:

In [None]:
print(conv([0,1,3,4,2,1], [0,1,0]))      # should output [0, 1, 3, 4, 2, 1]
print(conv([1,2,3,4,5,6,7,8], [-1,1,0])) # [1, 1, 1, 1, 1, 1, 1, 1]
print(conv([0,1,2,0,4,2,1], [-1,1,0]))   # [0, 1, 1, -2, 4, -2, -1]
print(conv([0,1,0,4,3,0,2], [-1,3,-1]))  # [-1, 3, -5, 9, 5, -5, 6]

[0, 1, 3, 4, 2, 1]
[1, 1, 1, 1, 1, 1, 1, 1]
[0, 1, 1, -2, 4, -2, -1]
[-1, 3, -5, 9, 5, -5, 6]


(b) Here are more test cases, but the filters may be any odd size, not just 3 entries. You should have only one function that passes all of the test cases for (a) and (b).

In [None]:
print(conv([0,1,3,4,2,1], [0,0,1,0,0]))     # should output [0, 1, 3, 4, 2, 1]
print(conv([1,2,3,4,5,6,7,8], [1,1,1,1,1])) # [6, 10, 15, 20, 25, 30, 26, 21]
print(conv([0,1,2,0,4,2,1], [-1,1,-1]))     # [-1, -1, 1, -6, 2, -3, -1]
print(conv([0,1,2,3,4,5,6], [0,1,2,3,4,5,6]))  # [32, 50, 70, 91, 70, 50, 32]

[0, 1, 3, 4, 2, 1]
[6, 10, 15, 20, 25, 30, 26, 21]
[-1, -1, 1, -6, 2, -3, -1]
[32, 50, 70, 91, 70, 50, 32]
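Because the filter is not reversed, `conv()` computes what NumPy calls a cross-correlation. As a rough sanity check (not part of the exercise, which asks for a from-scratch implementation), the sketch below compares a compact re-statement of `conv()` against `np.correlate(..., mode="same")` on some of the test cases above. It assumes the filter is no longer than the data, which holds for every test case here.

```python
import numpy as np

def conv(data, filter):
    # Same idea as the exercise solution: zero-pad, then slide the
    # unreversed filter across the padded data.
    pad = len(filter) // 2
    padded = [0] * pad + list(data) + [0] * pad
    return [sum(padded[i + j] * filter[j] for j in range(len(filter)))
            for i in range(len(data))]

# A few of the test cases from parts (a) and (b).
cases = [
    ([0, 1, 3, 4, 2, 1], [0, 1, 0]),
    ([1, 2, 3, 4, 5, 6, 7, 8], [-1, 1, 0]),
    ([0, 1, 0, 4, 3, 0, 2], [-1, 3, -1]),
    ([1, 2, 3, 4, 5, 6, 7, 8], [1, 1, 1, 1, 1]),
    ([0, 1, 2, 3, 4, 5, 6], [0, 1, 2, 3, 4, 5, 6]),
]

for data, filt in cases:
    reference = np.correlate(data, filt, mode="same")
    print(conv(data, filt), np.array_equal(conv(data, filt), reference))
```

If the two ever disagree, the usual culprits are the padding width or an accidental filter reversal.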


## 2. Interpreting Convolutions

Consider the following data and the two filters. For this problem, assume that the datasets have an arbitrary number of 0s at either end, to avoid the edge effects of convolutions.

In [None]:
data = [0,1,3,4,2,1,0]
filter1 = [0.3333, 0.3333, 0.3333]
filter2 = [0, -1, 1]

In [None]:
filter3 = [0.25, 0.5, 0.25]
z = np.convolve(data, filter3, mode="same")
z

array([0.25, 1.25, 2.75, 3.25, 2.25, 1.  , 0.25])

(a) Notice that when we convolve `data` with `filter1`, the original data and the convolution have the same sum (up to rounding errors).
* Will this be the case for all possible lists of data? Explain.
* Describe what kind of filters will produce this result. Explain.

In [None]:
z = np.convolve(data, filter2, mode="same")
z

array([ 0, -1, -2, -1,  2,  1,  1])

In [None]:
print(np.sum(data))
print(np.sum(conv(data, filter1)))

# Yes, this will be the case for all possible lists of data, as long as the data are
# zero-padded as assumed above. This filter averages the data: it looks at each value
# and averages it with its neighbors. Every data value is multiplied by each filter
# weight exactly once, so the sum of the convolved data equals the sum of the original
# data times the sum of the filter weights. Because the weights sum to 1 (0.9999 here,
# hence the rounding error), the two sums match.

# The filters that produce this result are those whose weights sum to 1. Smoothing
# filters such as [0.25, 0.5, 0.25] are a common example. Convolving this data with
# [0.25, 0.5, 0.25] gives [0.25, 1.25, 2.75, 3.25, 2.25, 1.0, 0.25], which sums to 11,
# the same as the original data. Like filter1, it takes a weighted average of each
# value and its left and right neighbors.

11
10.998899999999999
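The argument in the comments can be written out as a short derivation. The notation is an assumption made for illustration: $x$ is the zero-padded data, $w$ is a filter of odd length $K$, $p = (K-1)/2$, and $y$ is the unreversed convolution that `conv()` computes.

```latex
y[n] = \sum_{k=0}^{K-1} w[k]\, x[n + k - p]

\sum_{n} y[n]
  = \sum_{n} \sum_{k=0}^{K-1} w[k]\, x[n + k - p]
  = \sum_{k=0}^{K-1} w[k] \sum_{n} x[n + k - p]
  = \Bigl(\sum_{k=0}^{K-1} w[k]\Bigr) \Bigl(\sum_{n} x[n]\Bigr)
```

The last step relies on the zero padding: shifting $x$ by $k - p$ does not change its total as long as no nonzero entry falls off the end. For `filter1` the weights sum to 0.9999, so the output sums to $11 \times 0.9999 = 10.9989$, matching the printed value.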


(b) When we convolve `data` with `filter2`, the convolution adds up to 0.
* Will this be the case for all possible lists of data? Explain.
* Describe what kind of filters will produce this result. Explain.

In [None]:
print(np.sum(data))
print(np.sum(conv(data, filter2)))

# Yes, this will always be the case, because the filter itself sums to 0. With this
# filter, each value in the dataset contributes positively at one output position
# (through the 1) and negatively at another (through the -1), while the 0 contributes
# nothing. The positive and negative contributions therefore cancel. This holds as
# long as there is zero padding on either side. Filters that produce this result are
# those whose weights sum to 0, such as the edge detector [1, 0, -1]. Such filters
# redistribute the data so that the positive and negative contributions cancel
# exactly, giving a convolution that sums to 0.

11
0
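Both answers come down to one relationship: with enough zero padding, the output sum equals the data sum times the filter sum. The sketch below checks this numerically for the filters discussed in this problem. The helper name `sum_check` is made up for illustration, and `mode="full"` is used so that every partial overlap is kept, which stands in for the unlimited zero padding the problem assumes.

```python
import numpy as np

def sum_check(data, weights):
    # Return (sum of the convolution, sum(data) * sum(weights)).
    out = np.convolve(data, weights, mode="full")
    return out.sum(), np.sum(data) * np.sum(weights)

data = [0, 1, 3, 4, 2, 1, 0]
filters = {
    "filter1": [0.3333, 0.3333, 0.3333],  # sums to ~1: preserves the sum
    "filter2": [0, -1, 1],                # sums to 0: zeroes the sum
    "filter3": [0.25, 0.5, 0.25],         # sums to 1: preserves the sum
    "edge": [1, 0, -1],                   # sums to 0: zeroes the sum
}

for name, weights in filters.items():
    conv_sum, predicted = sum_check(data, weights)
    print(name, conv_sum, predicted)
```

A filter whose weights sum to some other value, say 2, would scale the total by that factor instead.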


(c) Discuss the following:
* From the perspective of a human understanding the output of a convolution, are sum-preserving and sum-zeroing filters useful?
* In what sorts of circumstances might you wish to use them?


> Both kinds of filter are useful to a human reading the output of a convolution. Sum-preserving filters are typically used to smooth or blur an image without changing its overall brightness or intensity. Photography is one example: in portrait retouching, editors blur out blemishes and pores while keeping the brightness of the skin the same. Sum-zeroing filters are useful because they detect edges and emphasize contrast in an image. One example is medical imaging, such as X-rays, where emphasizing the contrast along boundaries helps reveal important features like fractures or tumors.
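To make the contrast concrete, here is a small one-dimensional sketch of the two behaviors. The step-shaped `signal` is made up for illustration: the smoothing filter softens the jumps while keeping the total, and the edge filter is nonzero only where the signal changes.

```python
import numpy as np

# A flat signal with one raised plateau (hypothetical data).
signal = np.array([0, 0, 0, 4, 4, 4, 4, 0, 0, 0], dtype=float)

# Sum-preserving: weights sum to 1, so the plateau is blurred but the total is kept.
smooth = np.convolve(signal, [0.25, 0.5, 0.25], mode="same")

# Sum-zeroing: weights sum to 0, so flat regions map to 0 and only the jumps remain.
edges = np.convolve(signal, [1, 0, -1], mode="same")

print(smooth, smooth.sum())  # [0. 0. 1. 3. 4. 4. 3. 1. 0. 0.] 16.0
print(edges, edges.sum())    # [ 0.  0.  4.  4.  0.  0. -4. -4.  0.  0.] 0.0
```

The positive values in `edges` mark the rising edge and the negative values mark the falling edge, which is the kind of output a person can read directly.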

(d) Discuss the following:
* For the purposes of machine computation in a neural network, are sum-preserving and sum-zeroing filters useful? Explain.
* If they are useful, give an example of when one might be learned.

> For machine computation in a neural network, sum-preserving and sum-zeroing filters are both useful, especially in CNN layers. Sum-preserving filters smooth and average information, which reduces noise in the data and helps the network focus on general features. One might be learned when smoothing out variations in handwriting styles, as in the MNIST dataset: such a filter keeps the overall structure of each digit while smoothing out those variations. Sum-zeroing filters are essential for emphasizing changes in the data, such as edges or gradients. One might be learned in a CNN like YOLO, where these filters help the model identify the shapes of objects.

## 3. CIFAR

The [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) dataset contains 60,000 32x32 color images in 10 different classes. Using PyTorch, build a CNN to classify the CIFAR-10 dataset. You can use the link above, in which case the data are already loaded into numpy arrays, but you may want to instead use [this repository](https://github.com/YoongiKim/CIFAR-10-images) or [this kaggle reupload](https://www.kaggle.com/datasets/swaroopkml/cifar10-pngs-in-folders), where they are image files.

You should be able to achieve an accuracy >50% on this.

Some hints:
* This dataset is color images. We used black and white in class.
* Use batch normalization and a one-cycle fitting schedule to (hopefully) speed up the fitting. See the textbook.


In [None]:
from fastai.vision.all import *

In [None]:
%env KAGGLE_USERNAME=<your-kaggle-username>
%env KAGGLE_KEY=<your-kaggle-api-key>

!kaggle datasets download swaroopkml/cifar10-pngs-in-folders
!unzip cifar10-pngs-in-folders.zip

In [None]:
path = Path(".")

In [None]:
cifar = DataBlock((ImageBlock(cls=PILImageBW), CategoryBlock),
                  get_items=get_image_files,
                  splitter=RandomSplitter(seed=24601),
                  get_y=parent_label)

cifar_dls = cifar.dataloaders(path/'/content/cifar10/cifar10/train', bs=256)

In [None]:
cifar_dls.show_batch(max_n=9, figsize=(4,4))

In [None]:
cnn = sequential(
    # Input is 1-channel (grayscale) 32x32; each stride-2 conv halves the spatial size.
    nn.Conv2d(1, 8, kernel_size=3, stride=2, padding=1),    # 16x16
    nn.ReLU(),
    nn.BatchNorm2d(8),
    nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=1),   # 8x8
    nn.ReLU(),
    nn.BatchNorm2d(16),
    nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),  # 4x4
    nn.ReLU(),
    nn.BatchNorm2d(32),
    nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),  # 2x2
    nn.ReLU(),
    nn.BatchNorm2d(64),
    # Final layer: one output channel per CIFAR-10 class (10), at 1x1 resolution.
    nn.Conv2d(64, 10, kernel_size=3, stride=2, padding=1),
    Flatten()
)
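The network above relies on each stride-2 convolution halving the image until a single 1x1 position is left for `Flatten()`. A quick way to check this without running the model is the standard output-size formula for a convolution without dilation, sketched below in plain Python. The helper name `conv_out_size` is hypothetical.

```python
def conv_out_size(n, kernel_size=3, stride=2, padding=1):
    # Spatial output size of a convolution (no dilation):
    # floor((n + 2*padding - kernel_size) / stride) + 1
    return (n + 2 * padding - kernel_size) // stride + 1

# Trace a 32x32 CIFAR-10 image through the five stride-2 conv layers.
sizes = [32]
for _ in range(5):
    sizes.append(conv_out_size(sizes[-1]))

print(sizes)  # [32, 16, 8, 4, 2, 1]
```

If the input size or the number of layers changes, rerunning this trace shows whether the final feature map is still 1x1.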

In [None]:
learn = Learner(cifar_dls, cnn, loss_func=F.cross_entropy, metrics=accuracy)

In [None]:
learn.fit_one_cycle(10, 1e-3)

## 4. Subword Tokenization

**Note: This entire problem can be done in base Python. If you want to use a library, only add Pandas**.

Subword tokenization follows this method:
1. Begin with individual characters as your subword vocabulary. In English, if we make everything lowercase, then there should be just the letters "a" through "z" as the starting vocabulary.
2. Look at every possible two-token concatenation of the vocabulary, and see how many times each appears. Add the most common one to the vocabulary of subwords. You might find that "th" is the most common combination.
3. Repeat step 2 until you hit the maximum size of your vocabulary.
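The three steps above can be sketched in base Python as follows. This is one possible approach, not the only one, and it makes two assumptions that the problem statement does not spell out: the text is lowercased and split into runs of the letters a-z, so subwords never span spaces or punctuation, and each distinct word is stored once with its frequency so that every round scans the distinct words rather than the whole text. The function name `train_subwords` is made up for illustration.

```python
import re
from collections import Counter

def train_subwords(text, max_vocab):
    # Assumption: lowercase everything and treat each run of a-z as a word.
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    # Each distinct word is kept as a tuple of its current tokens.
    corpus = {tuple(word): freq for word, freq in words.items()}
    vocab = [chr(i) for i in range(97, 97 + 26)]

    while len(vocab) < max_vocab:
        # Step 2: count adjacent token pairs, weighted by word frequency.
        pair_counts = Counter()
        for tokens, freq in corpus.items():
            for a, b in zip(tokens, tokens[1:]):
                pair_counts[(a, b)] += freq
        if not pair_counts:
            break  # every word is already a single token
        (a, b), _ = pair_counts.most_common(1)[0]
        if a + b not in vocab:
            vocab.append(a + b)

        # Rewrite every word with the chosen pair merged into one token.
        merged = {}
        for tokens, freq in corpus.items():
            out, i = [], 0
            while i < len(tokens):
                if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
                    out.append(a + b)
                    i += 2
                else:
                    out.append(tokens[i])
                    i += 1
            key = tuple(out)
            merged[key] = merged.get(key, 0) + freq
        corpus = merged
    return vocab

# Tiny example: "th" is the most common pair, then "th" + "e".
print(train_subwords("the the the then other with", 28)[26:])  # ['th', 'the']
```

With the novel loaded into `text` as in part (a), the call would be `train_subwords(text, 500)`. Working on distinct words with frequencies is what keeps the 500-entry run tractable.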

(a) The code below loads in the text of the novel *Pride and Prejudice* as plaintext, and then gives you a starting vocabulary of all the lowercase letters. Train a vocabulary on this text, with a maximum size of 500.

(This can take a very long time if you're not clever about how you approach this. I'll give you partial credit if you only train to a vocabulary size of 100.)

In [None]:
import urllib.request
with urllib.request.urlopen('https://www.gutenberg.org/cache/epub/1342/pg1342.txt') as response:
   text = response.read().decode("ascii", "ignore")

In [None]:
subwords = [chr(i) for i in range(97,97+26)]
print(subwords)

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']


In [None]:
pairs = []
for i in range(0,26):
  for j in range(0,26):
    two_letter = subwords[i] + subwords[j]
    pairs.append(two_letter)

In [None]:
pair_counts = {}

for i in pairs:
  count = text.count(i)

  if count > 3000:
    pair_counts[i] = count

print(pair_counts)

{'al': 3029, 'an': 8296, 'ar': 4439, 'as': 5015, 'at': 6177, 'be': 4038, 'co': 3028, 'ea': 3303, 'ed': 5708, 'en': 6681, 'er': 11978, 'es': 3977, 'ha': 6507, 'he': 15327, 'hi': 4644, 'in': 10618, 'is': 5183, 'it': 4954, 'le': 3803, 'li': 3220, 'll': 3452, 'me': 3294, 'nd': 5966, 'ne': 4130, 'ng': 5077, 'no': 3421, 'nt': 3658, 'of': 4298, 'on': 6593, 'or': 4302, 'ou': 6551, 're': 7820, 'se': 4042, 'st': 3921, 'te': 4855, 'th': 14078, 'ti': 3602, 'to': 5418, 've': 4499}


(b) What is the most common subword in the text? Explain why this is.

In [None]:
max_pair = max(pair_counts, key = pair_counts.get)
max_count = pair_counts[max_pair]

print(max_pair)
print(max_count)

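# The most common subword in the text is "he", which appears 15,327 times. It is this
# common because "he" is contained in very frequent letter sequences such as "the"
# and "her".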

he
15327


(c) What are the three longest subwords in your vocabulary?

In [None]:
pairs = []

for i in range(0,26):
  for j in range(0,26):
    for k in range(0,26):
      for l in range(0,26):
        four_letter = subwords[i] + subwords[j] + subwords[k] + subwords[l]
        pairs.append(four_letter)

In [None]:
pair_counts = {}

for i in pairs:
  count = text.count(i)

  if count > 1000:
    pair_counts[i] = count

print(pair_counts)

# The three longest subwords in my vocabulary are "tion", "ther", and "that"

{'ould': 1280, 'that': 1609, 'ther': 1693, 'tion': 1744, 'with': 1342}


(d) What are the three most common subwords of at least 3 letters in length?

In [None]:
pairs = []

for i in range(0,26):
  for j in range(0,26):
    for k in range(0,26):
      three_letter = subwords[i] + subwords[j] + subwords[k]
      pairs.append(three_letter)

In [None]:
pair_counts = {}

for i in pairs:
  count = text.count(i)

  if count > 3000:
    pair_counts[i] = count

print(pair_counts)

# The three most common subwords of at least 3 letters in length are "the", "her", and "ing"

{'and': 3994, 'her': 4430, 'ing': 4163, 'the': 7870}
