## 4.1 Working with images
### Loading an image file

In [1]:
import imageio

img_arr = imageio.imread('bobby.jpg')
img_arr.shape

(720, 1280, 3)

Any library that outputs a NumPy array will suffice to obtain a PyTorch tensor. PyTorch modules dealing with image data require tensors to be laid out as C × H × W: channels, height, and width, respectively. The array loaded above is H × W × C, so its dimensions must be reordered first.
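As a minimal sketch of that layout change, the snippet below uses a small synthetic array in place of a real image (the 4 × 5 size is an assumption for illustration) and moves the channel dimension to the front with `permute`.

```python
import numpy as np
import torch

# Synthetic stand-in for a decoded image: height 4, width 5, 3 channels (H x W x C).
img_arr = np.arange(4 * 5 * 3, dtype=np.uint8).reshape(4, 5, 3)

img_t = torch.from_numpy(img_arr)  # shares memory with img_arr, no copy is made
chw = img_t.permute(2, 0, 1)       # move channels first: C x H x W

print(img_t.shape)  # torch.Size([4, 5, 3])
print(chw.shape)    # torch.Size([3, 4, 5])
```

Note that `permute` returns a view over the same storage, so the pixel at row 2, column 3, channel 1 of the original is found at `chw[1, 2, 3]`.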

As a slightly more efficient alternative to using $stack$ to build up the tensor, we can pre-allocate a tensor of appropriate size and fill it with images loaded from a directory.

In [3]:
import torch

batch_size = 3
batch = torch.zeros(batch_size, 3, 256, 256, dtype = torch.uint8)

This indicates that our batch will consist of three RGB images 256 pixels in height and 256 pixels in width.

We can now load all PNG images from an input directory and store them in the tensor.

In [3]:
import os

data_dir = 'image-cats/'
filenames = [name for name in os.listdir(data_dir)
             if os.path.splitext(name)[-1] == '.png']
for i, filename in enumerate(filenames):
    img_arr = imageio.imread(os.path.join(data_dir, filename))
    img_t = torch.from_numpy(img_arr)
    img_t = img_t.permute(2, 0, 1)  # reorder H x W x C to C x H x W
    img_t = img_t[:3]  # keep only the first three channels (drops alpha, if present)
    batch[i] = img_t
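For comparison, here is a rough sketch of the two ways of building a batch, using synthetic 8 × 8 tensors rather than files from disk (the sizes and fill values are assumptions for illustration). Both routes should produce the same N × C × H × W tensor.

```python
import torch

# Three synthetic C x H x W "images" standing in for files loaded from disk.
images = [torch.full((3, 8, 8), fill_value=i, dtype=torch.uint8) for i in range(3)]

# Approach 1: stack allocates a new tensor with a leading batch dimension.
stacked = torch.stack(images)

# Approach 2: pre-allocate once, then copy each image into its slot.
prealloc = torch.zeros(len(images), 3, 8, 8, dtype=torch.uint8)
for i, img in enumerate(images):
    prealloc[i] = img

print(stacked.shape)
print(torch.equal(stacked, prealloc))
```

Pre-allocation avoids holding every image in a Python list before building the batch, but it requires all images to already have the same size as the allocated slots.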

### 4.1.4 Normalizing the data
Neural networks exhibit the best performance when the input data ranges roughly from 0 to 1, or from -1 to 1. So a typical thing we'll want to do is cast a tensor to floating-point and normalize the values of the pixels. Casting to floating-point is easy, but normalization is trickier, as it depends on what range of the input we decide should lie between 0 and 1 (or -1 and 1).
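Two common choices can be sketched as follows, on a random `uint8` batch standing in for real images (the batch contents and the 16 × 16 size are assumptions). The first divides by 255, the maximum value of an 8-bit pixel; the second standardizes each channel to zero mean and unit standard deviation.

```python
import torch

torch.manual_seed(0)
# Random 8-bit batch standing in for real images: N x C x H x W.
batch = torch.randint(0, 256, (3, 3, 16, 16), dtype=torch.uint8)

batch_f = batch.float()  # cast to floating-point before any arithmetic

# Option 1: map the 8-bit range [0, 255] onto [0, 1].
scaled = batch_f / 255.0

# Option 2: standardize each channel across the whole batch.
standardized = batch_f.clone()
for c in range(standardized.shape[1]):
    mean = torch.mean(standardized[:, c])
    std = torch.std(standardized[:, c])
    standardized[:, c] = (standardized[:, c] - mean) / std

print(scaled.min().item(), scaled.max().item())
print(standardized[:, 0].mean().item(), standardized[:, 0].std().item())
```

Which option is appropriate depends on the model and the data; the per-channel statistics would normally be computed over the training set rather than a single batch.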

## 4.3 Representing tabular data
We are going to assume there's no meaning to the order in which samples appear in the table.

In [5]:
import numpy as np
import csv

wine_path = 'winequality-white.csv'
wineq_numpy = np.loadtxt(wine_path, dtype = np.float32, delimiter=";", skiprows=1)
wineq_numpy

array([[ 7.  ,  0.27,  0.36, ...,  0.45,  8.8 ,  6.  ],
       [ 6.3 ,  0.3 ,  0.34, ...,  0.49,  9.5 ,  6.  ],
       [ 8.1 ,  0.28,  0.4 , ...,  0.44, 10.1 ,  6.  ],
       ...,
       [ 6.5 ,  0.24,  0.19, ...,  0.46,  9.4 ,  6.  ],
       [ 5.5 ,  0.29,  0.3 , ...,  0.38, 12.8 ,  7.  ],
       [ 6.  ,  0.21,  0.38, ...,  0.32, 11.8 ,  6.  ]], dtype=float32)

Here, we just prescribe what the type of the 2D array should be, the delimiter used to separate values in each row, and the fact that the first line should not be read since it contains the column names.

In [7]:
col_list = next(csv.reader(open(wine_path), delimiter=';'))
wineq_numpy.shape, col_list

((4898, 12),
 ['fixed acidity',
  'volatile acidity',
  'citric acid',
  'residual sugar',
  'chlorides',
  'free sulfur dioxide',
  'total sulfur dioxide',
  'density',
  'pH',
  'sulphates',
  'alcohol',
  'quality'])

We then convert the NumPy array to a PyTorch tensor:

In [8]:
wineq = torch.from_numpy(wineq_numpy)
wineq.shape, wineq.dtype

(torch.Size([4898, 12]), torch.float32)

At this point, we have a floating-point torch.Tensor containing all the columns, including the last, which refers to the quality score.

### 4.3.3 Representing scores
We could treat the score as a continuous variable, keep it as a real number, and perform a regression task, or treat it as a label and try to guess the label from the chemical analysis in a classification task. In both approaches, we will typically remove the score from the tensor of input data and keep it in a separate tensor.

In [10]:
data = wineq[:, :-1]
data, data.shape

(tensor([[ 7.0000,  0.2700,  0.3600,  ...,  3.0000,  0.4500,  8.8000],
         [ 6.3000,  0.3000,  0.3400,  ...,  3.3000,  0.4900,  9.5000],
         [ 8.1000,  0.2800,  0.4000,  ...,  3.2600,  0.4400, 10.1000],
         ...,
         [ 6.5000,  0.2400,  0.1900,  ...,  2.9900,  0.4600,  9.4000],
         [ 5.5000,  0.2900,  0.3000,  ...,  3.3400,  0.3800, 12.8000],
         [ 6.0000,  0.2100,  0.3800,  ...,  3.2600,  0.3200, 11.8000]]),
 torch.Size([4898, 11]))

In [16]:
target = wineq[:, -1].long()
target, target.shape

(tensor([6, 6, 6,  ..., 6, 7, 6]), torch.Size([4898]))

We can achieve one-hot encoding using the scatter_ method, which fills the tensor with values from a source tensor along the indices provided as arguments.

In [17]:
target_onehot = torch.zeros(target.shape[0], 10)
target_onehot.scatter_(1, target.unsqueeze(1), 1.0)

tensor([[0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        ...,
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 1., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.]])

The arguments for scatter_ are as follows:
1. The dimension along which the following two arguments are specified
2. A column tensor indicating the indices of the elements to scatter
3. A tensor containing the elements to scatter or a single scalar to scatter
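The three arguments are easier to follow on a tiny example. The sketch below uses four made-up labels drawn from three classes (both assumptions for illustration) and one-hot encodes them the same way as the wine scores.

```python
import torch

# Four hypothetical class labels (int64), each in {0, 1, 2}.
labels = torch.tensor([0, 2, 1, 2])

# One row per sample, one column per class, initially all zeros.
onehot = torch.zeros(labels.shape[0], 3)

# Along dim 1, write 1.0 at the column given by each row's label.
# The index tensor must have as many dimensions as onehot, hence unsqueeze(1).
onehot.scatter_(1, labels.unsqueeze(1), 1.0)

print(onehot)
# tensor([[1., 0., 0.],
#         [0., 0., 1.],
#         [0., 1., 0.],
#         [0., 0., 1.]])
```

Taking `onehot.argmax(dim=1)` recovers the original labels, which is a quick way to check the encoding.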

In [19]:
target_unsqueezed = target.unsqueeze(1)
target_unsqueezed

tensor([[6],
        [6],
        [6],
        ...,
        [6],
        [7],
        [6]])

The call to $unsqueeze$ adds a $singleton$ dimension, going from a 1D tensor of 4898 elements to a 2D tensor of size 4898 × 1, without changing its contents; no extra elements are added.

In [22]:
target[0], target_unsqueezed[0,0]

(tensor(6), tensor(6))

Let's go back to our $data$ tensor, containing 11 variables associated with the chemical analysis. We can use the functions in the PyTorch API to manipulate our data in tensor form. Let's first obtain the mean and variance of each column; the standard deviation is the square root of the variance.

In [24]:
data_mean = torch.mean(data, dim=0)
data_mean

tensor([6.8548e+00, 2.7824e-01, 3.3419e-01, 6.3914e+00, 4.5772e-02, 3.5308e+01,
        1.3836e+02, 9.9403e-01, 3.1883e+00, 4.8985e-01, 1.0514e+01])

In [26]:
data_var = torch.var(data, dim=0)
data_var

tensor([7.1211e-01, 1.0160e-02, 1.4646e-02, 2.5726e+01, 4.7733e-04, 2.8924e+02,
        1.8061e+03, 8.9455e-06, 2.2801e-02, 1.3025e-02, 1.5144e+00])

In this case, dim=0 indicates that the reduction is performed along dimension 0, so we get one value per column. At this point, we can normalize the data by subtracting the mean and dividing by the standard deviation (the square root of the variance).

In [27]:
data_normalized = (data-data_mean)/torch.sqrt(data_var)
data_normalized

tensor([[ 1.7208e-01, -8.1761e-02,  2.1326e-01,  ..., -1.2468e+00,
         -3.4915e-01, -1.3930e+00],
        [-6.5743e-01,  2.1587e-01,  4.7996e-02,  ...,  7.3995e-01,
          1.3422e-03, -8.2419e-01],
        [ 1.4756e+00,  1.7450e-02,  5.4378e-01,  ...,  4.7505e-01,
         -4.3677e-01, -3.3663e-01],
        ...,
        [-4.2043e-01, -3.7940e-01, -1.1915e+00,  ..., -1.3130e+00,
         -2.6153e-01, -9.0545e-01],
        [-1.6054e+00,  1.1666e-01, -2.8253e-01,  ...,  1.0049e+00,
         -9.6251e-01,  1.8574e+00],
        [-1.0129e+00, -6.7703e-01,  3.7852e-01,  ...,  4.7505e-01,
         -1.4882e+00,  1.0448e+00]])
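As a sanity check on this recipe, the sketch below applies the same three steps to a small synthetic table (the column scales and offsets are made up) and confirms that each column ends up with mean close to 0 and variance close to 1. It relies on broadcasting: the per-column statistics have shape (4,) and are applied to every row of the (100, 4) tensor.

```python
import torch

torch.manual_seed(0)
# Synthetic table: 100 rows, 4 columns with very different scales and offsets.
scales = torch.tensor([1.0, 10.0, 0.1, 5.0])
offsets = torch.tensor([0.0, 50.0, -3.0, 7.0])
data = torch.randn(100, 4) * scales + offsets

data_mean = torch.mean(data, dim=0)  # reduce over rows: one mean per column
data_var = torch.var(data, dim=0)    # one variance per column

# Shapes (100, 4) and (4,) broadcast, so every row uses the same column statistics.
data_normalized = (data - data_mean) / torch.sqrt(data_var)

print(data_mean.shape, data_normalized.shape)
print(torch.mean(data_normalized, dim=0))
print(torch.var(data_normalized, dim=0))
```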

### 4.3.6 Finding thresholds
Next, we're going to determine which rows in $target$ correspond to a score less than or equal to 3:

In [28]:
bad_indexes = target <= 3
bad_indexes.shape, bad_indexes.dtype, bad_indexes.sum()

(torch.Size([4898]), torch.bool, tensor(20))

The bad_indexes tensor has the same shape as $target$, with values of False or True depending on the outcome of the comparison between our threshold and each element in the original $target$ tensor. Used as an index, it filters $data$ down to the rows where it is True:

In [29]:
bad_data = data[bad_indexes]
bad_data.shape

torch.Size([20, 11])

We can start to get information about wines grouped into good, middling, and bad categories. Let's take the .mean() of each column:

In [31]:
bad_data = data[target <= 3]
mid_data = data[(target > 3) & (target < 7)]
good_data = data[target >= 7]

bad_mean = torch.mean(bad_data, dim = 0)
mid_mean = torch.mean(mid_data, dim = 0)
good_mean = torch.mean(good_data, dim = 0)

for i, args in enumerate(zip(col_list, bad_mean, mid_mean, good_mean)):
    print('{:2} {:20} {:6.2f} {:6.2f} {:6.2f}'.format(i, *args))

 0 fixed acidity          7.60   6.89   6.73
 1 volatile acidity       0.33   0.28   0.27
 2 citric acid            0.34   0.34   0.33
 3 residual sugar         6.39   6.71   5.26
 4 chlorides              0.05   0.05   0.04
 5 free sulfur dioxide   53.33  35.42  34.55
 6 total sulfur dioxide 170.60 141.83 125.25
 7 density                0.99   0.99   0.99
 8 pH                     3.19   3.18   3.22
 9 sulphates              0.47   0.49   0.50
10 alcohol               10.34  10.26  11.42


We could use a threshold on total sulfur dioxide as a crude criterion for discriminating good wines from bad ones. Let's get the indexes where the total sulfur dioxide column is below the midpoint we calculated earlier (141.83, the mean for the middling wines).

In [35]:
total_sulfur_threshold = 141.83
total_sulfur_data = data[:, 6]
predicted_indexes = torch.lt(total_sulfur_data, total_sulfur_threshold)

predicted_indexes.shape, predicted_indexes.dtype, predicted_indexes.sum()

(torch.Size([4898]), torch.bool, tensor(2727))

Next, we'll need to get the indexes of the actually good wines:

In [34]:
actual_indexes = target > 5
actual_indexes.shape, actual_indexes.dtype, actual_indexes.sum()

(torch.Size([4898]), torch.bool, tensor(3258))

We'll perform a logical "and" between our prediction indexes and the actual good indexes and use that intersection of wines-in-agreement to determine how well we did:

In [36]:
n_matches = torch.sum(actual_indexes & predicted_indexes).item()
n_predicted = torch.sum(predicted_indexes).item()
n_actual = torch.sum(actual_indexes).item()

n_matches, n_matches / n_predicted, n_matches / n_actual

(2018, 0.74000733406674, 0.6193984039287906)

We got around 2,000 wines right! Since we predicted 2,700 wines to be good, this gives us a 74% chance that if we predict a wine to be high quality, it actually is. Unfortunately, there are about 3,260 good wines, and we identified only about 62% of them.
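These two ratios are usually called precision (matches over predicted) and recall (matches over actual). A minimal sketch on six made-up wines, with hypothetical boolean masks in place of the real ones, shows the arithmetic:

```python
import torch

# Hypothetical masks for six wines: which are actually good, which we predicted good.
actual = torch.tensor([True, True, False, True, False, False])
predicted = torch.tensor([True, False, False, True, True, False])

n_matches = torch.sum(actual & predicted).item()  # predicted good and actually good
n_predicted = torch.sum(predicted).item()         # everything we predicted good
n_actual = torch.sum(actual).item()               # everything that is actually good

precision = n_matches / n_predicted  # how often a "good" prediction is right
recall = n_matches / n_actual        # what fraction of good wines we found

print(n_matches, n_predicted, n_actual)  # 2 3 3
print(precision, recall)
```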
## 4.4 Working with timeseries
### 4.4.1 Adding a time dimension

In [8]:
import numpy as np

bikes_numpy = np.loadtxt('hour-fixed.csv', dtype=np.float32,
                        delimiter=",", skiprows=1,
                        converters={1: lambda x: float(x[8:10])})
bikes = torch.from_numpy(bikes_numpy)
bikes.shape

torch.Size([17520, 17])

The 17 columns are, in order: instant, day, season, year, month, hour, holiday, weekday, workingday, weathersit, temp, atemp, hum, windspeed, casual, registered, and cnt (the count of rental bikes).

The existence of an ordering gives us the opportunity to exploit causal relationships across time. We're going to focus on learning how to turn our bike-sharing dataset into something that our neural network will be able to ingest in fixed-size chunks.
### 4.4.2 Shaping the data by time period

In [9]:
bikes.stride()

(17, 1)

That's 17520 hours, 17 columns. Now let's reshape the data to have 3 axes: day, hour, and then our 17 columns.

In [10]:
daily_bikes = bikes.view(-1, 24, bikes.shape[1])
daily_bikes.shape, daily_bikes.stride()

(torch.Size([730, 24, 17]), (408, 17, 1))

Our call to view requires us to provide the new shape for the returned tensor. We use -1 as a placeholder for "however many indexes are left, given the other dimensions and the original number of elements."
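The effect of the -1 placeholder can be sketched on a much smaller table: 48 hypothetical hourly rows with 3 columns (both sizes are assumptions). With 48 × 3 elements and the last two dimensions fixed at 24 and 3, the only consistent size for the first dimension is 2.

```python
import torch

# 48 hourly rows with 3 columns, standing in for the bike-sharing table.
hourly = torch.arange(48 * 3, dtype=torch.float32).reshape(48, 3)

# -1 is inferred as 48 * 3 / (24 * 3) = 2 days.
daily = hourly.view(-1, 24, hourly.shape[1])

print(daily.shape)     # torch.Size([2, 24, 3])
print(daily.stride())  # (72, 3, 1)

# view does not copy: both tensors look at the same storage.
print(daily.data_ptr() == hourly.data_ptr())  # True
```

The first hour of the second day, `daily[1, 0]`, is the same row as `hourly[24]`.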

In [11]:
daily_bikes = daily_bikes.transpose(1, 2)
daily_bikes.shape, daily_bikes.stride()

(torch.Size([730, 17, 24]), (408, 1, 17))

The call to view gave us N sequences of L hours in a day, for C channels, laid out as N × L × C. To get to our desired N × C × L ordering, we transposed dimensions 1 and 2.
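A small sketch, again with made-up sizes (2 days, 24 hours, 3 channels), shows that `transpose` only swaps strides and leaves the storage untouched, so the result is no longer contiguous. Calling `contiguous()` is one way to get a freshly laid-out copy if a later operation needs one.

```python
import torch

# N x L x C tensor: 2 days, 24 hours, 3 channels.
daily = torch.arange(2 * 24 * 3).view(2, 24, 3)

ncl = daily.transpose(1, 2)  # N x C x L, same storage with swapped strides

print(ncl.shape)            # torch.Size([2, 3, 24])
print(ncl.stride())         # (72, 1, 3)
print(ncl.is_contiguous())  # False

ncl_copy = ncl.contiguous()  # new storage laid out in N x C x L order
print(ncl_copy.stride())     # (72, 24, 1)
```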
### 4.4.3 Ready for training
The 'weather situation' variable is ordinal. It has four levels: 1 for good weather, and 4 for really bad. We could treat this variable as categorical, with levels interpreted as labels, or as a continuous variable.
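Both treatments can be sketched on a handful of made-up weather values (the values below are hypothetical, not taken from the dataset). For the categorical option the levels 1 to 4 are shifted to 0-based indices before one-hot encoding with scatter_; for the continuous option they are rescaled to the range 0 to 1.

```python
import torch

# Hypothetical weather-situation values for six hours, levels 1 (good) to 4 (really bad).
weathersit = torch.tensor([1, 1, 2, 4, 3, 1])

# Categorical treatment: one column per level, indices shifted from 1..4 to 0..3.
weather_onehot = torch.zeros(weathersit.shape[0], 4)
weather_onehot.scatter_(1, weathersit.unsqueeze(1) - 1, 1.0)

# Continuous treatment: map 1..4 linearly onto 0.0..1.0.
weather_scaled = (weathersit.float() - 1.0) / 3.0

print(weather_onehot)
print(weather_scaled)
```

The one-hot version makes no assumption about the spacing between levels, while the scaled version keeps their order but treats the gaps between levels as equal.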
## 4.5 Representing text