# Chapter 2 - Exercises

## [2.1.8] Exercises (Data manipulation)

### question 1

1) Run the code in this section. Change the conditional statement X == Y in this section to X < Y or X > Y, and then see what kind of tensor you can get.

As expected, the comparison is applied elementwise: each of `X < Y` and `X > Y` returns a boolean tensor with the same shape as the inputs.
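A minimal sketch of the three comparisons. The values of `X` and `Y` below are assumptions chosen for illustration; any two tensors of the same shape should behave the same way.

```python
import torch

# Illustrative tensors (assumed values, both of shape (3, 4))
X = torch.arange(12, dtype=torch.float32).reshape(3, 4)
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

less = X < Y      # True where the element of X is smaller
greater = X > Y   # True where the element of X is larger
equal = X == Y    # True where the two elements coincide

print(less)
print(greater)
```

For every position exactly one of the three masks is `True`, which is a quick way to check that the comparison really is elementwise.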

### question 2

2) Replace the two tensors that operate by element in the broadcasting mechanism with other shapes, e.g., 3-dimensional tensors. Is the result the same as expected?

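As seen below, broadcasting works the same way in higher dimensions: any axis of size 1 is (conceptually) replicated to match the size of the other tensor along that axis. Here `x` has shape (2, 3, 1) and `y` has shape (2, 3, 3), so `z` has shape (2, 3, 3).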

In [39]:
import torch

x = torch.tensor([[[1],[43],[2]],[[2],[3],[5]]])
y = torch.tensor([[[9,10,3],[1,2,3],[1,2,3]],[[9,10,3],[1,2,3],[1,2,3]]])
z = x + y

# print(x.shape, y.shape)
# print(x, y)
# print(z.shape)
# print(z)
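The shape rule behind this can be sketched in plain Python. `broadcast_shape` is a hypothetical helper written for this note, not a PyTorch function; it assumes the usual rule that shapes are aligned from the trailing axis and that two sizes are compatible when they are equal or one of them is 1.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Hypothetical helper: shape obtained by broadcasting shapes a and b."""
    result = []
    # Walk both shapes from the trailing axis; missing leading axes count as 1.
    for da, db in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if da != db and 1 not in (da, db):
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
        # A size-1 axis is stretched to the size of the other axis.
        result.append(db if da == 1 else da)
    return tuple(reversed(result))


# Shapes of x and y from the cell above
print(broadcast_shape((2, 3, 1), (2, 3, 3)))
```

The same helper raises `ValueError` for shapes such as (5, 4) and (5,), where the trailing sizes 4 and 5 differ and neither is 1.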

## [2.2.5] Exercises (Data pre-processing)

### Create a raw dataset with more rows and columns.

In [87]:
import random
from datetime import datetime

import numpy as np
import pandas as pd
from dateutil.relativedelta import relativedelta

# First we create example dataframe
r = random.Random(12)

def generate_random_date_in_last_year():
    return (datetime.now() - relativedelta(days=365 * random.random()))


dataframe = pd.DataFrame({
    "date_time": [generate_random_date_in_last_year() for _ in range(10)],
    "animal": ['zebra', 'NA', 'zebra', 'NA', 'lion', 'lion', 'lion',
               'lion', 'rhino', 'rhino', ],
    "category": ['stripy'] * 4 + ['dangerous'] * 6,
    "name": ['Walter', 'NA', 'Gyles', 'NA', 'Bartholomew', 'Frederyk',
             'Raulf', 'Symond', 'Carlos', 'Arthur'],
    "weight": [80 + 40 * r.random() for _ in range(10)],
    "favourite_integer": [r.randint(0, 100) for _ in range(10)],
    "bad_column": ['', 3, '', 1, None, 2, None, 2, 'NA', 3],
    'employed': [bool(r.randint(0, 1)) for i in range(10)]
})

dataframe.to_csv('../data/animals.csv')

In [88]:
animals = pd.read_csv('../data/animals.csv',index_col=0)
animals.head()

| | date_time | animal | category | name | weight | favourite_integer | bad_column | employed |
|---|---|---|---|---|---|---|---|---|
| 0 | 2019-11-10 00:43:27.030039 | zebra | stripy | Walter | 98.982827 | 71 | NaN | False |
| 1 | 2020-08-07 06:50:57.189155 | NaN | stripy | NaN | 106.2989 | 0 | 3.0 | False |
| 2 | 2019-09-24 00:15:38.189297 | zebra | stripy | Gyles | 106.656419 | 84 | NaN | False |
| 3 | 2020-02-01 16:53:51.739058 | NaN | stripy | NaN | 85.704014 | 79 | 1.0 | True |
| 4 | 2020-02-16 09:15:32.273208 | lion | dangerous | Bartholomew | 80.434418 | 18 | NaN | True |


### question 1

Delete the column with the most missing values.

In [89]:
# Count missing values per column and take the label of the largest count
column_to_delete = animals.isna().sum().idxmax()

In [90]:
column_to_delete

'bad_column'

In [91]:
no_bad_col = animals.drop(column_to_delete, axis=1)
no_bad_col.head()

| | date_time | animal | category | name | weight | favourite_integer | employed |
|---|---|---|---|---|---|---|---|
| 0 | 2019-11-10 00:43:27.030039 | zebra | stripy | Walter | 98.982827 | 71 | False |
| 1 | 2020-08-07 06:50:57.189155 | NaN | stripy | NaN | 106.2989 | 0 | False |
| 2 | 2019-09-24 00:15:38.189297 | zebra | stripy | Gyles | 106.656419 | 84 | False |
| 3 | 2020-02-01 16:53:51.739058 | NaN | stripy | NaN | 85.704014 | 79 | True |
| 4 | 2020-02-16 09:15:32.273208 | lion | dangerous | Bartholomew | 80.434418 | 18 | True |
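The same steps can be tried on a tiny frame where the answer is easy to check by eye. This is a sketch only; the frame `df` and its column names are made up for illustration.

```python
import pandas as pd

# Toy frame (hypothetical data): column "b" has the most missing values
df = pd.DataFrame({
    "a": [1.0, None, 3.0],
    "b": [None, None, 1.0],
    "c": [1.0, 2.0, 3.0],
})

missing_per_column = df.isna().sum()        # number of NaN entries per column
worst_column = missing_per_column.idxmax()  # label of the column with the most NaN
cleaned = df.drop(columns=worst_column)

print(missing_per_column)
print(worst_column)
```

When several columns tie for the largest count, `idxmax` returns the first of them, so only one column is dropped.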


### question 2

Convert the preprocessed dataset to the tensor format.

In [96]:
no_bad_col.date_time = pd.to_datetime(no_bad_col.date_time).astype(np.int64)
no_bad_col.employed = no_bad_col.employed.astype(int)

In [97]:
inputs = pd.get_dummies(no_bad_col, dummy_na=True)

In [98]:
inputs.dtypes

date_time               int64
weight                float64
favourite_integer       int64
employed                int64
animal_lion             uint8
animal_rhino            uint8
animal_zebra            uint8
animal_nan              uint8
category_dangerous      uint8
category_stripy         uint8
category_nan            uint8
name_Arthur             uint8
name_Bartholomew        uint8
name_Carlos             uint8
name_Frederyk           uint8
name_Gyles              uint8
name_Raulf              uint8
name_Symond             uint8
name_Walter             uint8
name_nan                uint8
dtype: object

In [99]:
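# Cast to float explicitly so the conversion does not depend on the dtype
# that get_dummies uses for the indicator columns (uint8 here, bool in newer pandas)
X = torch.tensor(inputs.to_numpy(dtype=float))
X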

## [2.3.13.] Exercises (Linear algebra)

### question 1

Prove that the transpose of a matrix 𝐀's transpose is 𝐀: (𝐀⊤)⊤ = 𝐀.

Let $\mathbf{A} \in \mathbb{R}^{m\times n}$, and let $i \in \{1,\dots,m\}$ and $j \in \{1,\dots,n\}$. Then, applying the definition of the transpose twice,

$$
\begin{aligned}
\left[(\mathbf{A}^T)^T\right]_{i,j} &= \left[\mathbf{A}^T\right]_{j,i} \\
&= \left[\mathbf{A}\right]_{i,j} \\
\implies (\mathbf{A}^T)^T &= \mathbf{A}
\end{aligned}
$$


### question 2

Given two matrices 𝐀 and 𝐁, show that the sum of transposes is equal to the transpose of a sum: 𝐀⊤ + 𝐁⊤ = (𝐀 + 𝐁)⊤.

Let $\mathbf{A},\mathbf{B} \in \mathbb{R}^{m\times n}$, so that $\mathbf{A}^T$ and $\mathbf{B}^T$ are $n\times m$, and let $i \in \{1,\dots,n\}$ and $j \in \{1,\dots,m\}$. Then

$$
\begin{aligned}
\left[\mathbf{A}^T+\mathbf{B}^T\right]_{i,j}
&= \left[\mathbf{A}^T\right]_{i,j}+\left[\mathbf{B}^T\right]_{i,j} \\
&= \left[\mathbf{A}\right]_{j,i}+\left[\mathbf{B}\right]_{j,i} \\
&= \left[\mathbf{A}+\mathbf{B}\right]_{j,i} \\
&= \left[(\mathbf{A}+\mathbf{B})^T\right]_{i,j} \\
\implies
\mathbf{A}^T + \mathbf{B}^T &= (\mathbf{A}+\mathbf{B})^T
\end{aligned}
$$

### question 3

Given any square matrix 𝐀, is 𝐀 + 𝐀⊤ always symmetric? Why?

Yes. Here is a proof.

Let $\mathbf{A} \in \mathbb{R}^{n\times n}$. Using the previous two questions,

$$
\begin{aligned}
(\mathbf{A}+\mathbf{A}^T)^T
&= \mathbf{A}^T + (\mathbf{A}^T)^T \quad\text{from question 2}\\
&= \mathbf{A}^T + \mathbf{A} \quad\text{from question 1}\\
&= \mathbf{A} + \mathbf{A}^T
\end{aligned}
$$

A matrix that equals its own transpose is symmetric by definition, so $\mathbf{A}+\mathbf{A}^T$ is symmetric.
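The three identities from questions 1 to 3 can also be checked numerically. This is a sanity check on particular matrices, not a proof; the matrices `A`, `B`, and `S` are arbitrary integer examples chosen so that the comparisons are exact.

```python
import torch

# Arbitrary example matrices (assumed values, integer so equality is exact)
A = torch.arange(6).reshape(2, 3)
B = torch.tensor([[5, -1, 2], [0, 7, 3]])
S = torch.arange(9).reshape(3, 3)   # square matrix for question 3

double_transpose_ok = torch.equal(A.T.T, A)            # question 1
sum_transpose_ok = torch.equal(A.T + B.T, (A + B).T)   # question 2
symmetric_ok = torch.equal(S + S.T, (S + S.T).T)       # question 3

print(double_transpose_ok, sum_transpose_ok, symmetric_ok)
```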

### question 4

We defined the tensor X of shape (2, 3, 4) in this section. What is the output of len(X)?

In [8]:
import torch
X = torch.ones(size=(2,3,4))
len(X)

2

### question 5

For a tensor X of arbitrary shape, does len(X) always correspond to the length of a certain axis of X? What is that axis?

Yes. For any tensor with at least one axis, `len(X)` returns the length of the first axis (axis 0), i.e. `X.shape[0]`.
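A short check over a few shapes, plus the one edge case worth knowing: a 0-dimensional tensor has no first axis, and calling `len` on it raises `TypeError`. The shapes below are arbitrary examples.

```python
import torch

# len(X) matches the size of axis 0 for each of these (arbitrary) shapes
for shape in [(5,), (2, 3, 4), (7, 1, 2, 2)]:
    X = torch.zeros(shape)
    assert len(X) == X.shape[0]

# A 0-d tensor has no axes, so len() is undefined for it
scalar = torch.tensor(3.0)
try:
    len(scalar)
    scalar_has_len = True
except TypeError:
    scalar_has_len = False

print(scalar_has_len)
```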

### question 6

Run A / A.sum(axis=1) and see what happens. Can you analyze the reason?

In [10]:
# A = torch.arange(20).reshape(5, 4)
# A / A.sum(axis=1)

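The error arises because `A` has shape (5, 4) while `A.sum(axis=1)` has shape (5,). Broadcasting aligns shapes from the trailing axis, so the sizes 4 and 5 are compared and do not match. Passing `keepdims=True` to `sum` keeps the reduced axis with size 1, giving shape (5, 1), which broadcasts against (5, 4).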
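A sketch of both the failure and the fix. `A` is created as a float tensor here (an assumption made for this example) so that the division behaves the same across PyTorch versions.

```python
import torch

A = torch.arange(20, dtype=torch.float32).reshape(5, 4)

row_sums = A.sum(axis=1)   # shape (5,): the summed axis is dropped
try:
    A / row_sums           # trailing sizes 4 and 5 cannot be broadcast
    failed = False
except RuntimeError:
    failed = True

row_sums_kept = A.sum(axis=1, keepdims=True)   # shape (5, 1)
normalized = A / row_sums_kept                 # each row is divided by its own sum

print(failed, row_sums.shape, row_sums_kept.shape)
```

After the fix every row of `normalized` sums to 1, which is usually the intent of this kind of expression.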

### question 7

When traveling between two points in Manhattan, what is the distance that you need to cover in terms of the coordinates, i.e., in terms of avenues and streets? Can you travel diagonally?

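You travel along the streets and up the avenues, with no diagonals, so the distance is the number of avenues crossed plus the number of streets crossed. This is the $L_1$ norm of the coordinate difference, also known as the Manhattan distance.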
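As a small illustration in plain Python, with coordinates given as hypothetical (avenue, street) pairs and `manhattan_distance` being a helper name invented for this note:

```python
import math


def manhattan_distance(p, q):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


start = (3, 10)   # hypothetical corner: 3rd Avenue and 10th Street
end = (7, 42)     # hypothetical corner: 7th Avenue and 42nd Street

blocks = manhattan_distance(start, end)   # 4 avenues + 32 streets
straight_line = math.dist(start, end)     # L2 distance, not walkable on a grid

print(blocks, straight_line)
```

The straight-line ($L_2$) distance is never larger than the $L_1$ distance, which is why a diagonal would be shorter if it were possible.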

### question 8

Consider a tensor with shape (2, 3, 4). What are the shapes of the summation outputs along axis 0, 1, and 2?

In [14]:
(3,4),(2,4),(2,3)

((3, 4), (2, 4), (2, 3))
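These shapes can be confirmed directly: summing along an axis removes that axis from the shape. The tensor of ones below is just a convenient example of shape (2, 3, 4).

```python
import torch

X = torch.ones(2, 3, 4)

# Shape of the result when summing along each axis in turn
sum_shapes = [tuple(X.sum(axis=k).shape) for k in range(X.ndim)]

print(sum_shapes)
```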

### question 9

Feed a tensor with 3 or more axes to the linalg.norm function and observe its output. What does this function compute for tensors of arbitrary shape?

In [16]:
torch.linalg.norm(X)

tensor(4.8990)

In [23]:
# For a tensor of any shape this is the square root of the sum of the squares of all elements
(X ** 2).sum() ** 0.5

tensor(4.8990)
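Put differently, with default arguments the norm of a tensor with 3 or more axes should equal the $L_2$ norm of the same data flattened into a vector. A sketch comparing the three ways of computing it, using an arbitrary example tensor and assuming a PyTorch version that provides `torch.linalg.norm`:

```python
import torch

# Arbitrary example tensor with three axes
X = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4)

norm_direct = torch.linalg.norm(X)          # default arguments
norm_flat = torch.linalg.norm(X.flatten())  # L2 norm of the flattened vector
norm_manual = (X ** 2).sum().sqrt()         # root of the sum of squares

print(norm_direct, norm_flat, norm_manual)
```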

## (Calculus)