# Real-world data representation using tensors

In [6]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


Tensors are the building blocks for data in PyTorch. Neural networks take tensors as input and produce tensors as outputs. In fact, all operations within a neural network and during optimization are operations between tensors, and all parameters (for example, weights and biases) in a neural network are tensors.

## Working with images

An image is represented as a collection of scalars arranged in a regular grid with a
height and a width (in pixels).

### Adding color channels

### Loading an image
Images come in several different file formats, but luckily there are plenty of ways to
load images in Python.

In [1]:
%pip install imageio



In [4]:
import imageio.v2 as imageio

img_arr = imageio.imread('/content/drive/MyDrive/dataset/p1ch4/image-dog/bobby.jpg')
img_arr.shape

#This code loads a .jpg image using the imageio library and checks its shape. The resulting image is a NumPy array with height, width, and RGB channels.

(720, 1280, 3)

**Converting Image to PyTorch Tensor and Permuting Dimensions**

In [8]:
import torch

In [10]:
img = torch.from_numpy(img_arr)
out = img.permute(2, 0, 1)
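
The following is a minimal sketch of what `permute` does to the layout, using a small synthetic array in place of a decoded image (the 4 × 5 size and the values are assumptions made for illustration). It suggests that `permute` only reorders the axes from H × W × C to C × H × W and returns a view on the same storage rather than a copy.

```python
import numpy as np
import torch

# Synthetic stand-in for a decoded image: height 4, width 5, 3 channels (H x W x C).
img_arr = np.arange(4 * 5 * 3, dtype=np.uint8).reshape(4, 5, 3)

img = torch.from_numpy(img_arr)
out = img.permute(2, 0, 1)  # reorder axes to C x H x W

print(img.shape, out.shape)

# permute returns a view, so a write through `img` is visible through `out`.
img[0, 0, 0] = 255
print(out[0, 0, 0].item())
print(out.data_ptr() == img.data_ptr())
```

Because no data is copied, the operation is cheap even for large images, but in-place changes to one tensor affect the other.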


Preallocating a Tensor for a Batch of Images

In [12]:
batch_size = 3
batch = torch.zeros(batch_size, 3, 256, 256, dtype=torch.uint8)
#Before loading multiple images, a tensor of shape (batch_size, channels, height, width) is created to store them efficiently.

Loading Multiple PNG Images into a Batch

In [13]:
import os

data_dir = '/content/drive/MyDrive/dataset/p1ch4/image-cats'
filenames = [name for name in os.listdir(data_dir) if name.endswith('.png')]

for i, filename in enumerate(filenames):
    img_arr = imageio.imread(os.path.join(data_dir, filename))
    img_t = torch.from_numpy(img_arr).permute(2, 0, 1)[:3]  # Keep only RGB
    batch[i] = img_t


Normalizing Pixel Values to [0, 1]

In [14]:
batch = batch.float()
batch /= 255.0
#Convert the pixel values from uint8 to float and scale them to the range [0, 1] by dividing by 255.

Standardizing Each Channel (Zero Mean, Unit Std)

In [15]:
n_channels = batch.shape[1]

for c in range(n_channels):
    mean = torch.mean(batch[:, c])
    std = torch.std(batch[:, c])
    batch[:, c] = (batch[:, c] - mean) / std
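
As an alternative to the loop over channels, the same standardization can probably be written with broadcasting. This is a sketch on a random stand-in batch (the batch size, image size, and seed are assumptions); it compares the vectorized result against the explicit loop.

```python
import torch

torch.manual_seed(0)
# Hypothetical stand-in batch: 3 images, 3 channels, 8 x 8 pixels, values in [0, 1).
batch = torch.rand(3, 3, 8, 8)

# Per-channel statistics over the batch, height, and width axes.
# keepdim=True keeps shape (1, 3, 1, 1) so the results broadcast against the batch.
mean = batch.mean(dim=(0, 2, 3), keepdim=True)
std = batch.std(dim=(0, 2, 3), keepdim=True)
standardized = (batch - mean) / std

# Reference: the explicit loop over channels.
looped = batch.clone()
for c in range(looped.shape[1]):
    channel = looped[:, c]
    looped[:, c] = (channel - channel.mean()) / channel.std()

print(torch.allclose(standardized, looped, atol=1e-5))
```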


* We can perform several other operations on inputs, such as geometric transformations
like rotations, scaling, and cropping.
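
A few of these transformations can be sketched with plain tensor operations. The example below uses a hypothetical 3 × 8 × 8 image; the crop window, rotation angle, and scale factor are arbitrary choices, and dedicated image libraries offer richer versions of each operation.

```python
import torch
import torch.nn.functional as F

# Hypothetical C x H x W image with recognizable values.
img = torch.arange(3 * 8 * 8, dtype=torch.float32).reshape(3, 8, 8)

# Cropping: slice the spatial axes (rows 2..5, columns 2..5).
cropped = img[:, 2:6, 2:6]

# Rotation by 90 degrees in the H-W plane.
rotated = torch.rot90(img, k=1, dims=(1, 2))

# Horizontal flip: reverse the width axis.
flipped = torch.flip(img, dims=(2,))

# Scaling: interpolate expects a batch axis, so add one and remove it afterwards.
halved = F.interpolate(img.unsqueeze(0), scale_factor=0.5, mode='nearest').squeeze(0)

print(cropped.shape, rotated.shape, flipped.shape, halved.shape)
```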

## 3D images: Volumetric data

Volumetric data such as CT scans consists of a stack of 2D slices, each corresponding to a cross-section of the subject's body. CT scans typically contain only a single intensity channel (grayscale). Stacking the slices gives a 3D array of shape [D, H, W]; adding a channel dimension gives a tensor of shape [C, D, H, W].

 Loading a Volumetric CT Scan Using imageio.volread

In [16]:
import imageio
dir_path = "/content/drive/MyDrive/dataset/p1ch4/volumetric-dicom/2-LUNG 3.0  B70f-04083"
vol_arr = imageio.volread(dir_path, 'DICOM')
print(vol_arr.shape)
# CT scans are stored as multiple DICOM files. Using imageio.volread, we can load these into a 3D NumPy array of shape [depth, height, width].

Reading DICOM (examining files): 1/99 files (1.0%) ... 26/99 files (26.3%) [progress output truncated]

Adding a Channel Dimension

In [17]:
import torch
vol = torch.from_numpy(vol_arr).float()
vol = torch.unsqueeze(vol, 0)
print(vol.shape)


torch.Size([1, 99, 512, 512])
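
The shape bookkeeping can be sketched without any DICOM files. The example below uses five hypothetical 16 × 16 slices as a stand-in for a scan, and additionally stacks two volumes along a new leading axis to illustrate how a batch dimension could be added.

```python
import torch

# Hypothetical stand-in for a scan: 5 grayscale slices of 16 x 16 pixels.
slices = [torch.full((16, 16), float(i)) for i in range(5)]

vol = torch.stack(slices, dim=0)        # D x H x W
vol = vol.unsqueeze(0)                  # C x D x H x W, with C = 1
batch = torch.stack([vol, vol], dim=0)  # N x C x D x H x W for two identical scans

print(vol.shape, batch.shape)
```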


## Representing tabular data

Tabular data, such as CSV files or spreadsheets, is one of the most common data forms in real-world ML problems. Each row represents one sample; columns store features (numerical or categorical). However, unlike tables, PyTorch tensors must be homogeneous (same data type), typically float32, to work with neural networks.

Loading Wine Data as a Tensor

In [19]:
import numpy as np

wine_path = "/content/drive/MyDrive/dataset/p1ch4/tabular-wine/winequality-white.csv"
wineq_numpy = np.loadtxt(wine_path, dtype=np.float32, delimiter=";", skiprows=1)
# Load the CSV file with NumPy's loadtxt; it is converted to a PyTorch tensor in a later cell. This avoids depending on extra libraries such as pandas.

In [20]:
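import csv

# Read only the header row to get the column names, and close the file afterwards.
with open(wine_path, newline='') as f:
    col_list = next(csv.reader(f, delimiter=';'))
print(wineq_numpy.shape)  # Output: (4898, 12)
print(col_list)  # List of column names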


(4898, 12)
['fixed acidity', 'volatile acidity', 'citric acid', 'residual sugar', 'chlorides', 'free sulfur dioxide', 'total sulfur dioxide', 'density', 'pH', 'sulphates', 'alcohol', 'quality']


In [21]:
# Convert the NumPy array to a PyTorch tensor
import torch

wineq = torch.from_numpy(wineq_numpy)
print(wineq.shape, wineq.dtype)


torch.Size([4898, 12]) torch.float32


### Representing scores

In [23]:
data = wineq[:, :-1]
data, data.shape

(tensor([[ 7.0000,  0.2700,  0.3600,  ...,  3.0000,  0.4500,  8.8000],
         [ 6.3000,  0.3000,  0.3400,  ...,  3.3000,  0.4900,  9.5000],
         [ 8.1000,  0.2800,  0.4000,  ...,  3.2600,  0.4400, 10.1000],
         ...,
         [ 6.5000,  0.2400,  0.1900,  ...,  2.9900,  0.4600,  9.4000],
         [ 5.5000,  0.2900,  0.3000,  ...,  3.3400,  0.3800, 12.8000],
         [ 6.0000,  0.2100,  0.3800,  ...,  3.2600,  0.3200, 11.8000]]),
 torch.Size([4898, 11]))

In [24]:
target = wineq[:, -1]
target, target.shape

(tensor([6., 6., 6.,  ..., 6., 7., 6.]), torch.Size([4898]))

If we want to transform the target tensor into a tensor of labels, we have two options,
depending on the strategy or what we use the categorical data for.
One is simply to treat labels as an integer vector of scores:

In [None]:
target = wineq[:, -1].long()
target

tensor([6, 6, 6,  ..., 6, 7, 6])

### One-hot encoding


For classification, target scores (like 0–10) can be:

- stored as integer class labels, or
- one-hot encoded: a vector of all zeros with a single “1” at the label index.

One-hot encoding is useful for models that expect categorical input/output in vector form.

In [25]:
target = wineq[:, -1].long()  # Convert float target to integer class labels
print(target[:5])


tensor([6, 6, 6, 6, 6])


In [26]:
# Create zero tensor with shape (4898, 10) — 10 classes (scores 0 to 9)
target_onehot = torch.zeros(target.shape[0], 10)
# One-hot encode using scatter_
target_onehot.scatter_(1, target.unsqueeze(1), 1.0)
print(target_onehot.shape)


torch.Size([4898, 10])
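
To make the effect of `scatter_` easier to see, here is a small sketch on five hypothetical scores. It also compares the result with `torch.nn.functional.one_hot`, which should produce the same matrix (as an integer tensor) when given the number of classes.

```python
import torch
import torch.nn.functional as F

# Hypothetical scores for five samples, each an integer class in 0..9.
target = torch.tensor([6, 6, 3, 9, 0])

# scatter_ writes 1.0 into column target[i] of row i.
onehot_scatter = torch.zeros(target.shape[0], 10)
onehot_scatter.scatter_(1, target.unsqueeze(1), 1.0)

# one_hot returns an integer tensor; convert to float for comparison.
onehot_builtin = F.one_hot(target, num_classes=10).float()

print(torch.equal(onehot_scatter, onehot_builtin))
# argmax along the class axis recovers the original labels.
print(onehot_scatter.argmax(dim=1))
```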


### When to categorize

To help the model learn better, we normalize the features:
subtract the mean and divide by the standard deviation for each column.

In [27]:
data_mean = torch.mean(data, dim=0)
data_var = torch.var(data, dim=0)

print(data_mean)  # Mean for each of the 11 features
print(data_var)   # Variance for each of the 11 features


tensor([6.8548e+00, 2.7824e-01, 3.3419e-01, 6.3914e+00, 4.5772e-02, 3.5308e+01,
        1.3836e+02, 9.9403e-01, 3.1883e+00, 4.8985e-01, 1.0514e+01])
tensor([7.1211e-01, 1.0160e-02, 1.4646e-02, 2.5726e+01, 4.7733e-04, 2.8924e+02,
        1.8061e+03, 8.9455e-06, 2.2801e-02, 1.3025e-02, 1.5144e+00])


In [28]:
data_var = torch.var(data, dim=0)
data_var

tensor([7.1211e-01, 1.0160e-02, 1.4646e-02, 2.5726e+01, 4.7733e-04, 2.8924e+02,
        1.8061e+03, 8.9455e-06, 2.2801e-02, 1.3025e-02, 1.5144e+00])

Normalize Data

In [29]:
data_normalized = (data - data_mean) / torch.sqrt(data_var)

print(data_normalized[:3])


tensor([[ 1.7208e-01, -8.1761e-02,  2.1326e-01,  2.8211e+00, -3.5351e-02,
          5.6987e-01,  7.4449e-01,  2.3313e+00, -1.2468e+00, -3.4915e-01,
         -1.3930e+00],
        [-6.5743e-01,  2.1587e-01,  4.7996e-02, -9.4467e-01,  1.4773e-01,
         -1.2529e+00, -1.4967e-01, -9.1472e-03,  7.3995e-01,  1.3422e-03,
         -8.2419e-01],
        [ 1.4756e+00,  1.7450e-02,  5.4378e-01,  1.0027e-01,  1.9350e-01,
         -3.1211e-01, -9.7324e-01,  3.5864e-01,  4.7505e-01, -4.3677e-01,
         -3.3663e-01]])
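
A quick way to check the normalization is to confirm that each column ends up with mean close to 0 and variance close to 1. The sketch below does this on a synthetic table (the sample count, feature scales, and seed are assumptions chosen to mimic columns with very different ranges).

```python
import torch

torch.manual_seed(0)
# Hypothetical table: 1000 samples, 3 features on very different scales.
scales = torch.tensor([0.01, 1.0, 50.0])
offsets = torch.tensor([1.0, 7.0, 140.0])
data = torch.randn(1000, 3) * scales + offsets

data_mean = data.mean(dim=0)
data_var = data.var(dim=0)
data_normalized = (data - data_mean) / torch.sqrt(data_var)

print(data_normalized.mean(dim=0))  # each entry should be close to 0
print(data_normalized.var(dim=0))   # each entry should be close to 1
```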


### Finding thresholds

Find a simple rule (threshold) to distinguish good from bad wines using just one feature. We'll use total sulfur dioxide as a test feature.

In [30]:
bad_indexes = target <= 3
print(bad_indexes.shape, bad_indexes.dtype, bad_indexes.sum())



torch.Size([4898]) torch.bool tensor(20)


 Use Boolean indexing to extract bad wines

In [None]:
bad_data = data[bad_indexes]
print(bad_data.shape)


torch.Size([20, 11])

Split wine data into three quality levels

In [31]:
bad_data = data[target <= 3]
mid_data = data[(target > 3) & (target < 7)]
good_data = data[target >= 7]


Compute average feature values for each group

In [32]:
bad_mean = torch.mean(bad_data, dim=0)
mid_mean = torch.mean(mid_data, dim=0)
good_mean = torch.mean(good_data, dim=0)

for i, args in enumerate(zip(col_list, bad_mean, mid_mean, good_mean)):
    print('{:2} {:20} {:6.2f} {:6.2f} {:6.2f}'.format(i, *args))


 0 fixed acidity          7.60   6.89   6.73
 1 volatile acidity       0.33   0.28   0.27
 2 citric acid            0.34   0.34   0.33
 3 residual sugar         6.39   6.71   5.26
 4 chlorides              0.05   0.05   0.04
 5 free sulfur dioxide   53.33  35.42  34.55
 6 total sulfur dioxide 170.60 141.83 125.25
 7 density                0.99   0.99   0.99
 8 pH                     3.19   3.18   3.22
 9 sulphates              0.47   0.49   0.50
10 alcohol               10.34  10.26  11.42


Use a sulfur dioxide threshold to predict good wines

In [34]:
total_sulfur_threshold = 141.83
total_sulfur_data = data[:, 6]

predicted_indexes = total_sulfur_data < total_sulfur_threshold
print(predicted_indexes.shape, predicted_indexes.dtype, predicted_indexes.sum())



torch.Size([4898]) torch.bool tensor(2727)


Get the actual good wine indexes


In [35]:
actual_indexes = target > 5
print(actual_indexes.shape, actual_indexes.dtype, actual_indexes.sum())



torch.Size([4898]) torch.bool tensor(3258)


Evaluate prediction quality

In [36]:
n_matches = torch.sum(actual_indexes & predicted_indexes).item()
n_predicted = torch.sum(predicted_indexes).item()
n_actual = torch.sum(actual_indexes).item()

print(n_matches, n_matches / n_predicted, n_matches / n_actual)


2018 0.74000733406674 0.6193984039287906
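
The two ratios printed above are commonly called precision (matches divided by predictions) and recall (matches divided by actual positives). A minimal sketch of the same computation as a reusable function follows; the function name `precision_recall` and the toy masks are illustrative, not part of the original notebook.

```python
import torch

def precision_recall(predicted, actual):
    """Return (precision, recall) for two boolean masks of equal shape.

    Assumes both masks contain at least one True value; otherwise the
    divisions below would fail.
    """
    n_matches = torch.sum(predicted & actual).item()
    n_predicted = torch.sum(predicted).item()
    n_actual = torch.sum(actual).item()
    return n_matches / n_predicted, n_matches / n_actual

# Toy masks: 4 predicted positives, 4 actual positives, 3 in common.
predicted = torch.tensor([True, True, True, False, False, True])
actual = torch.tensor([True, False, True, True, False, True])

precision, recall = precision_recall(predicted, actual)
print(precision, recall)
```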


## Working with time series

### Adding a time dimension

For every hour, the dataset reports the following variables:
- Index of record: instant
- Day of month: day
- Season: season (1: spring, 2: summer, 3: fall, 4: winter)
- Year: yr (0: 2011, 1: 2012)
- Month: mnth (1 to 12)
- Hour: hr (0 to 23)
- Holiday status: holiday
- Day of the week: weekday
- Working day status: workingday
- Weather situation: weathersit (1: clear, 2: mist, 3: light rain/snow, 4: heavy rain/snow)
- Temperature in °C: temp
- Perceived temperature in °C: atemp
- Humidity: hum
- Wind speed: windspeed
- Number of casual users: casual
- Number of registered users: registered
- Count of rental bikes: cnt

In [37]:
bikes_numpy = np.loadtxt(
"/content/drive/MyDrive/dataset/p1ch4/bike-sharing-dataset/hour-fixed.csv",
    dtype=np.float32,
    delimiter=",",
    skiprows=1,
    converters={1: lambda x: float(x[8:10])})
bikes = torch.from_numpy(bikes_numpy)
bikes

tensor([[1.0000e+00, 1.0000e+00, 1.0000e+00,  ..., 3.0000e+00, 1.3000e+01,
         1.6000e+01],
        [2.0000e+00, 1.0000e+00, 1.0000e+00,  ..., 8.0000e+00, 3.2000e+01,
         4.0000e+01],
        [3.0000e+00, 1.0000e+00, 1.0000e+00,  ..., 5.0000e+00, 2.7000e+01,
         3.2000e+01],
        ...,
        [1.7377e+04, 3.1000e+01, 1.0000e+00,  ..., 7.0000e+00, 8.3000e+01,
         9.0000e+01],
        [1.7378e+04, 3.1000e+01, 1.0000e+00,  ..., 1.3000e+01, 4.8000e+01,
         6.1000e+01],
        [1.7379e+04, 3.1000e+01, 1.0000e+00,  ..., 1.2000e+01, 3.7000e+01,
         4.9000e+01]])

### Shaping the data by time period

In [38]:
bikes.shape, bikes.stride()

(torch.Size([17520, 17]), (17, 1))

That’s 17,520 hours, 17 columns. Now let’s reshape the data to have 3 axes—day, hour,
and then our 17 columns:

In [39]:
daily_bikes = bikes.view(-1, 24, bikes.shape[1])
daily_bikes.shape, daily_bikes.stride()

(torch.Size([730, 24, 17]), (408, 17, 1))

We see that the rightmost dimension is the number of columns in the original
dataset. Then, in the middle dimension, we have time, split into chunks of 24 sequential
hours. In other words, we now have N sequences of L hours in a day, for C channels.
To get to our desired N × C × L ordering, we need to transpose the tensor:

In [40]:
daily_bikes = daily_bikes.transpose(1, 2)
daily_bikes.shape, daily_bikes.stride()

(torch.Size([730, 17, 24]), (408, 1, 17))
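
The stride changes above indicate that neither `view` nor `transpose` copies data. The sketch below reproduces the pattern on a small hypothetical tensor (48 hourly rows, 3 columns) and shows how `contiguous()` can be used when a layout matching the new axis order is required.

```python
import torch

# Hypothetical stand-in: 48 hourly rows (2 days) with 3 columns.
hours = torch.arange(48 * 3, dtype=torch.float32).reshape(48, 3)

daily = hours.view(-1, 24, hours.shape[1])  # N x L x C
daily_t = daily.transpose(1, 2)             # N x C x L

print(daily.shape, daily.stride())
print(daily_t.shape, daily_t.stride())

# Both are views on the same storage; only the strides differ.
print(daily_t.data_ptr() == hours.data_ptr())
print(daily_t.is_contiguous())

# contiguous() makes a copy laid out in N x C x L order.
daily_c = daily_t.contiguous()
print(daily_c.stride())
```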

In order to make it easier to render our data, we’re going to limit ourselves to the
first day for a moment. We initialize a zero-filled matrix with a number of rows equal
to the number of hours in the day and number of columns equal to the number of
weather levels:

In [41]:
first_day = bikes[:24].long()
weather_onehot = torch.zeros(first_day.shape[0], 4)
first_day[:,9]

tensor([1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 2, 2, 2, 2])

In [43]:
weather_onehot.scatter_(
    dim=1,
    index=first_day[:,9].unsqueeze(1).long() - 1,
    value=1.0)

# Subtract 1 from the values because the weather situation ranges from 1 to 4, while indices are 0-based

tensor([[1., 0., 0., 0.],
        [1., 0., 0., 0.],
        [1., 0., 0., 0.],
        [1., 0., 0., 0.],
        [1., 0., 0., 0.],
        [0., 1., 0., 0.],
        [1., 0., 0., 0.],
        [1., 0., 0., 0.],
        [1., 0., 0., 0.],
        [1., 0., 0., 0.],
        [1., 0., 0., 0.],
        [1., 0., 0., 0.],
        [1., 0., 0., 0.],
        [0., 1., 0., 0.],
        [0., 1., 0., 0.],
        [0., 1., 0., 0.],
        [0., 1., 0., 0.],
        [0., 1., 0., 0.],
        [0., 0., 1., 0.],
        [0., 0., 1., 0.],
        [0., 1., 0., 0.],
        [0., 1., 0., 0.],
        [0., 1., 0., 0.],
        [0., 1., 0., 0.]])

Last, we concatenate our matrix to our original dataset using the cat function.
Let’s look at the first of our results:

In [44]:
torch.cat((bikes[:24], weather_onehot), 1)[:1]

tensor([[ 1.0000,  1.0000,  1.0000,  0.0000,  1.0000,  0.0000,  0.0000,  6.0000,
          0.0000,  1.0000,  0.2400,  0.2879,  0.8100,  0.0000,  3.0000, 13.0000,
         16.0000,  1.0000,  0.0000,  0.0000,  0.0000]])

In [45]:
daily_weather_onehot = torch.zeros(daily_bikes.shape[0], 4,
daily_bikes.shape[2])
daily_weather_onehot.shape

torch.Size([730, 4, 24])

In [46]:
daily_weather_onehot.scatter_(
1, daily_bikes[:,9,:].long().unsqueeze(1) - 1, 1.0)
daily_weather_onehot.shape

torch.Size([730, 4, 24])
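
The batched `scatter_` call can be hard to read at full size, so here is a sketch on a tiny hypothetical input: weather codes for 2 days of 3 hours each. The checks at the end are one way to confirm that every (day, hour) position received exactly one channel.

```python
import torch

# Hypothetical weather codes (1 to 4) for 2 days of 3 hours each: N x L.
weather = torch.tensor([[1, 2, 2],
                        [4, 3, 1]])

n_days, n_hours = weather.shape
onehot = torch.zeros(n_days, 4, n_hours)  # N x C x L

# The index needs the same number of axes as the target, hence unsqueeze(1).
# Subtracting 1 maps codes 1..4 to channel indices 0..3.
onehot.scatter_(1, weather.unsqueeze(1) - 1, 1.0)

# Each (day, hour) position has exactly one channel set.
print(onehot.sum(dim=1))
# argmax over the channel axis, plus 1, recovers the codes.
print(onehot.argmax(dim=1) + 1)
```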

In [47]:
daily_bikes = torch.cat((daily_bikes, daily_weather_onehot), dim=1)

In [48]:
# Alternatively, treat the weather situation (1 to 4) as ordinal and rescale it to [0.0, 1.0]
daily_bikes[:, 9, :] = (daily_bikes[:, 9, :] - 1.0) / 3.0

There are multiple possibilities for rescaling variables. We can either map their
range to [0.0, 1.0]:

In [49]:
temp = daily_bikes[:, 10, :]
temp_min = torch.min(temp)
temp_max = torch.max(temp)
daily_bikes[:, 10, :] = ((daily_bikes[:, 10, :] - temp_min)
/ (temp_max - temp_min))

or subtract the mean and divide by the standard deviation:

In [50]:
temp = daily_bikes[:, 10, :]
daily_bikes[:, 10, :] = ((daily_bikes[:, 10,:] - torch.mean(temp))
/ torch.std(temp))

In the latter case, our variable will have zero mean and unit standard deviation. If our
variable were drawn from a Gaussian distribution, about 68% of the samples would sit in the
[-1.0, 1.0] interval.
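
The 68% figure can be checked empirically. This sketch draws samples from a Gaussian with an arbitrary mean and spread (the values 15.0 and 8.0, the sample count, and the seed are assumptions), standardizes them, and measures the fraction that falls within one standard deviation of the mean.

```python
import torch

torch.manual_seed(0)
# Hypothetical Gaussian variable with arbitrary mean and spread.
x = torch.randn(100_000) * 8.0 + 15.0

# Standardize: subtract the mean and divide by the standard deviation.
x_std = (x - x.mean()) / x.std()

# Fraction of samples inside [-1.0, 1.0]; expected to be roughly 0.68.
fraction = ((x_std >= -1.0) & (x_std <= 1.0)).float().mean().item()
print(round(fraction, 3))
```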

## Representing text

Deep learning models process text by converting it into tensors. This section introduces one-hot encoding at the character and word level, and explains how to turn raw text into tensor data suitable for models like RNNs or Transformers.

Our goal in this section is to turn text into something a neural network can process:
a tensor of numbers, just like our previous cases. If we can do that and later
choose the right architecture for our text-processing job, we’ll be in the position of
doing NLP with PyTorch.

### Converting text to numbers

Goal: Turn raw text into a tensor format.

Two main levels of processing:

- Character-level (each character encoded)
- Word-level (each word encoded)


In [51]:
with open('/content/drive/MyDrive/dataset/p1ch4/jane-austen/1342-0.txt', encoding='utf8') as f:
    text = f.read()

### One-hot-encoding characters

One-hot encoding represents each character or word as a vector in which exactly one element, at the item's position in the vocabulary, is set to one and all others are zero. The resulting vectors are sparse and high-dimensional, and they carry no notion of similarity between items.

Pick a line of text:

In [52]:
lines = text.split('\n')
line = lines[200]


Create one-hot encoding tensor:

In [53]:
letter_t = torch.zeros(len(line), 128)


Fill tensor using ASCII codes:

In [54]:
for i, letter in enumerate(line.lower().strip()):
    letter_index = ord(letter) if ord(letter) < 128 else 0
    letter_t[i][letter_index] = 1
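
One way to sanity-check a character-level encoding is to decode it again. The sketch below encodes a short hypothetical line (not taken from the book file) and recovers the text with `argmax`; it assumes all characters are ASCII, since non-ASCII characters are mapped to index 0 and cannot be recovered.

```python
import torch

line = "Impossible, Mr. Bennet"  # hypothetical sample line
chars = line.lower().strip()

# One row per character, one column per ASCII code.
letter_t = torch.zeros(len(chars), 128)
for i, letter in enumerate(chars):
    letter_index = ord(letter) if ord(letter) < 128 else 0
    letter_t[i][letter_index] = 1

# Each row holds a single 1, so argmax recovers the ASCII code.
decoded = ''.join(chr(idx) for idx in letter_t.argmax(dim=1).tolist())
print(decoded)
```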

### One-hot encoding whole words

Clean the words:

In [55]:
def clean_words(input_str):
    punctuation = '.,;:"!?”“_-'
    word_list = input_str.lower().replace('\n',' ').split()
    word_list = [word.strip(punctuation) for word in word_list]
    return word_list
words_in_line = clean_words(line)
line, words_in_line

('“Impossible, Mr. Bennet, impossible, when I am not acquainted with him',
 ['impossible',
  'mr',
  'bennet',
  'impossible',
  'when',
  'i',
  'am',
  'not',
  'acquainted',
  'with',
  'him'])

Create vocabulary:

In [56]:
word_list = sorted(set(clean_words(text)))
word2index_dict = {word: i for (i, word) in enumerate(word_list)}
len(word2index_dict), word2index_dict['impossible']

(7261, 3394)

In [57]:
word_t = torch.zeros(len(words_in_line), len(word2index_dict))
for i, word in enumerate(words_in_line):
    word_index = word2index_dict[word]
    word_t[i][word_index] = 1
    print('{:2} {:4} {}'.format(i, word_index, word))
print(word_t.shape)
# At this point, word_t represents one sentence of length 11 in an encoding space of size 7,261, the number of words in our dictionary.

 0 3394 impossible
 1 4305 mr
 2  813 bennet
 3 3394 impossible
 4 7078 when
 5 3315 i
 6  415 am
 7 4436 not
 8  239 acquainted
 9 7148 with
10 3215 him
torch.Size([11, 7261])
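
The same word-level encoding can be built without a Python loop by indexing rows and columns together. This is a sketch on a toy sentence standing in for the book text (the sentence and the variable names `word2index` and `indices` are illustrative); it also decodes the result to confirm the mapping is reversible.

```python
import torch

text = "the quick brown fox jumps over the lazy dog"  # hypothetical toy corpus
words = text.split()

# Vocabulary: sorted unique words mapped to integer indices.
word_list = sorted(set(words))
word2index = {word: i for i, word in enumerate(word_list)}

# One integer index per word in the sentence.
indices = torch.tensor([word2index[w] for w in words])

# Set element (row i, column indices[i]) to 1 for every row at once.
word_t = torch.zeros(len(words), len(word2index))
word_t[torch.arange(len(words)), indices] = 1

print(word_t.shape)

# Decoding: argmax over the vocabulary axis gives back the word indices.
decoded = [word_list[i] for i in word_t.argmax(dim=1).tolist()]
print(decoded == words)
```

Note that the integer `indices` tensor carries the same information as the one-hot matrix in far less memory, which matters once the vocabulary has thousands of entries.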
