In this notebook, you will:

* Create tensors from different data sources like Python lists and NumPy arrays.

* Reshape and manipulate tensor dimensions to prepare data for model inputs.

* Use indexing and slicing techniques to access and filter specific parts of your data.

* Perform the mathematical and logical operations that form the basis of all neural network computations.


In [1]:
import torch
import pandas as pd
import numpy as np

In [2]:
x = torch.tensor([1, 2, 3])  # build a tensor from a Python list
print("From Python list:", x)
print("dtype:", x.dtype)

From Python list: tensor([1, 2, 3])
dtype: torch.int64
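
The `int64` result above comes from dtype inference: `torch.tensor` picks a dtype from the Python values it receives. The sketch below shows a few common cases, assuming PyTorch's default float dtype has not been changed; the variable names are illustrative only.

```python
import torch

int_tensor = torch.tensor([1, 2, 3])                         # Python ints   -> torch.int64
float_tensor = torch.tensor([1.0, 2.0, 3.0])                 # Python floats -> torch.float32 (default float dtype)
bool_tensor = torch.tensor([True, False])                    # Python bools  -> torch.bool
forced_float = torch.tensor([1, 2, 3], dtype=torch.float32)  # explicit dtype overrides inference

for name, t in [("int", int_tensor), ("float", float_tensor),
                ("bool", bool_tensor), ("forced", forced_float)]:
    print(name, t.dtype)
```

Passing `dtype=` explicitly is often the safer choice when the tensor will later be combined with model weights, which are `float32` by default.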


In [4]:
x_numpy = np.array([[1,2,3],[4,5,6]])
x_tensor_from_numpy = torch.from_numpy(x_numpy)
print("tensor from numpy :\n\n",x_tensor_from_numpy)

tensor from numpy :

 tensor([[1, 2, 3],
        [4, 5, 6]])
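
One detail worth knowing about `torch.from_numpy`: the resulting tensor shares memory with the source array, whereas `torch.tensor` makes a copy. A minimal sketch of the difference (the names `arr`, `shared`, and `copied` are illustrative):

```python
import numpy as np
import torch

arr = np.array([[1, 2, 3], [4, 5, 6]])

shared = torch.from_numpy(arr)  # shares the underlying buffer with arr
copied = torch.tensor(arr)      # independent copy of the data

arr[0, 0] = 100                 # modify the NumPy array in place

print(shared[0, 0].item())      # 100: the change is visible through the shared tensor
print(copied[0, 0].item())      # 1: the copy is unaffected
```

If the NumPy array will be reused or mutated later, `torch.tensor(arr)` avoids surprises at the cost of one copy.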


In [6]:
# Next, build a tensor from a pandas DataFrame. pandas is a data-manipulation library whose DataFrame stores tabular data, such as the contents of a CSV file or spreadsheet.
# The usual route is to convert the DataFrame to a NumPy array first, using the .values attribute (it is not a method, so no parentheses),
# and then create the tensor from that NumPy array.
!gdown --fuzzy https://drive.google.com/file/d/1sDyPpdUUSiE1wIWajkBApkNZLwVwf4hi/view?usp=drive_link

Downloading...
From: https://drive.google.com/uc?id=1sDyPpdUUSiE1wIWajkBApkNZLwVwf4hi
To: /content/data.csv
  0% 0.00/69.0 [00:00<?, ?B/s]100% 69.0/69.0 [00:00<00:00, 210kB/s]


In [9]:
df = pd.read_csv("data.csv")

numpy_df = df.values
#convert this to tensor
tensor_df = torch.tensor(numpy_df)
print("pandas dataframe :\n", df)
print("tensor:\n",tensor_df)
print("tensor type:\n",tensor_df.dtype)

pandas dataframe :
    distance_miles  delivery_time_minutes
0            1.60                   7.22
1           13.09                  32.41
2            6.97                  17.47
tensor:
 tensor([[ 1.6000,  7.2200],
        [13.0900, 32.4100],
        [ 6.9700, 17.4700]], dtype=torch.float64)
tensor type:
 torch.float64
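
The tensor above is `float64` because it inherits NumPy's default floating-point type. Since PyTorch layers use `float32` parameters by default, it is common to cast during conversion. The sketch below rebuilds the three rows shown above inline (rather than reading `data.csv`) so that it runs on its own; `df.to_numpy()` is used here as an equivalent of `df.values`.

```python
import pandas as pd
import torch

# Same values as the data.csv preview above, rebuilt inline for a self-contained example.
df = pd.DataFrame({
    "distance_miles": [1.60, 13.09, 6.97],
    "delivery_time_minutes": [7.22, 32.41, 17.47],
})

tensor_f64 = torch.tensor(df.to_numpy())                       # inherits float64 from NumPy
tensor_f32 = torch.tensor(df.to_numpy(), dtype=torch.float32)  # cast while converting

print(tensor_f64.dtype)
print(tensor_f32.dtype)
```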


In [13]:
ones = torch.ones(2, 3)  # a 2x3 tensor filled with 1.0
print(ones)

tensor([[1., 1., 1.],
        [1., 1., 1.]])


In [16]:
range_val = torch.arange(0,10,step=2)
print(range_val)

tensor([0, 2, 4, 6, 8])
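
As a small companion to the creation functions above, this sketch contrasts `torch.arange` (step-based, end value excluded) with `torch.linspace` (count-based, end value included) and adds `torch.zeros` for completeness. The variable names are illustrative.

```python
import torch

zeros = torch.zeros(2, 3)                # 2x3 tensor of 0.0 (float32)
by_step = torch.arange(0, 10, step=2)    # start, end (exclusive), step -> integer tensor
by_count = torch.linspace(0, 8, steps=5) # start, end (inclusive), number of points -> float tensor

print(zeros)
print(by_step)   # tensor([0, 2, 4, 6, 8])
print(by_count)  # tensor([0., 2., 4., 6., 8.])
```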


In [25]:
y = torch.tensor([[1.,2.,3.],[4.,5.,6]])
print("Original Tensor\n",y)
print("shape :",y.shape)
y = y.unsqueeze(0)
print("shape :",y.shape)
print("\nTENSOR WITH ADDED DIMENSION AT INDEX 0:\n\n", y)

y = y.squeeze(0)
print("back to original :\n",y)

Original Tensor
 tensor([[1., 2., 3.],
        [4., 5., 6.]])
shape : torch.Size([2, 3])
shape : torch.Size([1, 2, 3])

TENSOR WITH ADDED DIMENSION AT INDEX 0:

 tensor([[[1., 2., 3.],
         [4., 5., 6.]]])
back to original :
 tensor([[1., 2., 3.],
        [4., 5., 6.]])
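
A few further `unsqueeze` / `squeeze` behaviors that follow from the cell above, shown as a minimal sketch: negative indices count from the end, `squeeze()` with no argument removes every size-1 dimension, and `squeeze(dim)` leaves the tensor unchanged when that dimension is not of size 1.

```python
import torch

y = torch.tensor([[1., 2., 3.], [4., 5., 6.]])   # shape [2, 3]

trailing = y.unsqueeze(-1)                # new axis at the end      -> [2, 3, 1]
double = y.unsqueeze(0).unsqueeze(0)      # two leading axes         -> [1, 1, 2, 3]
all_removed = double.squeeze()            # drops all size-1 axes    -> [2, 3]
untouched = y.squeeze(0)                  # dim 0 has size 2, no-op  -> [2, 3]

print(trailing.shape, double.shape, all_removed.shape, untouched.shape)
```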


In [33]:
z = torch.ones(1,2,3)
print(z.shape)
z = z.transpose(1,2)
print(z.shape)

torch.Size([1, 2, 3])
torch.Size([1, 3, 2])
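
It may help to know that `transpose` returns a view rather than a copy: the result shares storage with the original and is generally no longer contiguous in memory. The sketch below illustrates this with a small tensor of distinct values (chosen here for illustration instead of `torch.ones`) and shows that `permute` expresses the same reordering.

```python
import torch

z = torch.arange(6).reshape(1, 2, 3)   # values 0..5, shape [1, 2, 3]

zt = z.transpose(1, 2)                 # swap the last two axes -> [1, 3, 2]
zp = z.permute(0, 2, 1)                # same reordering written as a full permutation

print(zt.shape)
print(torch.equal(zt, zp))             # True
print(zt.is_contiguous())              # False: only the strides changed

zt[0, 0, 1] = 99                       # writing through the view...
print(z[0, 1, 0].item())               # ...changes the original: 99
```

Call `.contiguous()` on the result if a later operation (for example `.view`) requires contiguous memory.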


In [40]:
z1 = torch.zeros(1,2,3)
z1 = z1.transpose(1,2)
z_final = torch.cat((z,z1),dim=2)
print(z_final)

tensor([[[1., 1., 0., 0.],
         [1., 1., 0., 0.],
         [1., 1., 0., 0.]]])
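
The cell above joins two `[1, 3, 2]` tensors along an existing dimension with `torch.cat`. As a sketch of how the choice of `dim` affects the result, and how `torch.stack` differs by creating a new dimension:

```python
import torch

a = torch.ones(1, 3, 2)
b = torch.zeros(1, 3, 2)

cat_last = torch.cat((a, b), dim=2)     # grows the last axis     -> [1, 3, 4]
cat_first = torch.cat((a, b), dim=0)    # grows the first axis    -> [2, 3, 2]
stacked = torch.stack((a, b), dim=0)    # inserts a brand-new axis -> [2, 1, 3, 2]

print(cat_last.shape, cat_first.shape, stacked.shape)
```

With `cat`, all dimensions except the one being joined must match; with `stack`, every dimension must match.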


In [56]:
x = torch.tensor([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]
])
print("original tensor: \n",x)
print("-"*55)
element_r1c2 = x[1][2]
print("row 1 and col 2 : ",element_r1c2)
print("-"*55)
row_2 = x[1]
print("row 2 : ",row_2)
print("-"*55)
row_1_2 = x[0:2]
print("1st two rows :",row_1_2)
print("-"*55)
col_3_row_all = x[:,2]
print("third col , all rows :",col_3_row_all)
every_other_col = x[:,::2]
print(every_other_col)

original tensor: 
 tensor([[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]])
-------------------------------------------------------
row 1 and col 2 :  tensor(7)
-------------------------------------------------------
row 2 :  tensor([5, 6, 7, 8])
-------------------------------------------------------
1st two rows : tensor([[1, 2, 3, 4],
        [5, 6, 7, 8]])
-------------------------------------------------------
third col , all rows : tensor([ 3,  7, 11])
tensor([[ 1,  3],
        [ 5,  7],
        [ 9, 11]])


In [57]:
#practice block
base_tensor = torch.tensor([
    [1,2,3,4],
    [5,6,7,8],
    [9,10,11,12]
])

In [58]:
#first row
base_tensor[0]

tensor([1, 2, 3, 4])

In [59]:
base_tensor[1:,:]

tensor([[ 5,  6,  7,  8],
        [ 9, 10, 11, 12]])

In [60]:
base_tensor[:2,1:]

tensor([[2, 3, 4],
        [6, 7, 8]])

In [61]:
base_tensor[:,::2]

tensor([[ 1,  3],
        [ 5,  7],
        [ 9, 11]])

In [67]:
base_tensor[:,2:3].shape

torch.Size([3, 1])

In [70]:
base_tensor[1,:]

tensor([5, 6, 7, 8])
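
Two points from the practice cells above, collected in one short sketch: an integer index drops the indexed dimension while a length-1 slice keeps it, and a boolean mask selects matching elements into a flat 1D tensor. The threshold `6` is an arbitrary value chosen for illustration.

```python
import torch

base_tensor = torch.tensor([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
])

col_1d = base_tensor[:, 2]       # integer index drops the column axis -> shape [3]
col_2d = base_tensor[:, 2:3]     # slice keeps it as a column vector   -> shape [3, 1]

mask = base_tensor > 6           # boolean tensor with the same shape as base_tensor
selected = base_tensor[mask]     # flat 1D tensor of the matching elements

print(col_1d.shape, col_2d.shape)
print(selected)                  # tensor([ 7,  8,  9, 10, 11, 12])
```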

## 5 - Optional Exercises

You've now covered the essential tools for working with tensors in PyTorch. Theory provides the map, but hands-on practice is what builds true confidence and skill. The following optional exercises are your opportunity to apply what you have learned to practical scenarios, from analyzing sales data to engineering new features for a machine learning model. This is where the concepts truly come to life, so dive in and put your new knowledge to the test!

### Exercise 1: Analyzing Monthly Sales Data

You're a data analyst at an e-commerce company. You've been given a tensor representing the monthly sales of three different products over a period of four months. Your task is to extract meaningful insights from this data.

The tensor `sales_data` is structured as follows:

* **Rows** represent the **products** (Product A, Product B, Product C).

* **Columns** represent the **months** (Jan, Feb, Mar, Apr).

**Your goals are**:

1. Calculate the total sales for **Product B** (the second row).
2. Identify which months had sales **greater than 130** for **Product C** (the third row) using boolean masking.
3. Extract the sales data for all products for the months of **Feb and Mar** (the middle two columns).

<br>

<details>
<summary><span style="color:green;"><strong>Solution (Click here to expand)</strong></span></summary>

```python
### START CODE HERE ###

# 1. Calculate total sales for Product B.
total_sales_product_b = sales_data[1].sum()

# 2. Find months where sales for Product C were > 130.
high_sales_mask_product_c = sales_data[2] > 130

# 3. Get sales for Feb and Mar for all products.
sales_feb_mar = sales_data[:, 1:3]

### END CODE HERE ###
```
</details>

In [71]:
# Sales data for 3 products over 4 months
sales_data = torch.tensor([[100, 120, 130, 110],   # Product A
                           [ 90,  95, 105, 125],   # Product B
                           [140, 115, 120, 150]    # Product C
                          ], dtype=torch.float32)

print("ORIGINAL SALES DATA:\n\n", sales_data)
print("-" * 45)

### START CODE HERE ###

# 1. Calculate total sales for Product B.
# Select the second row (Product B) and sum its values across all months.

total_sales_product_b =  sales_data[1,:].sum()

# 2. Find months where sales for Product C were > 130.
high_sales_mask_product_c = sales_data[2,:] > 130

# 3. Get sales for Feb and Mar for all products.
sales_feb_mar = sales_data[:,1:3]

### END CODE HERE ###

print("\nTotal Sales for Product B:                   ", total_sales_product_b)
print("\nMonths with >130 Sales for Product C (Mask): ", high_sales_mask_product_c)
print("\nSales for Feb & Mar:\n\n", sales_feb_mar)

ORIGINAL SALES DATA:

 tensor([[100., 120., 130., 110.],
        [ 90.,  95., 105., 125.],
        [140., 115., 120., 150.]])
---------------------------------------------

Total Sales for Product B:                    tensor(415.)

Months with >130 Sales for Product C (Mask):  tensor([ True, False, False,  True])

Sales for Feb & Mar:

 tensor([[120., 130.],
        [ 95., 105.],
        [115., 120.]])


#### Expected Output:

```
Total Sales for Product B:			 tensor(415.)

Months with >130 Sales for Product C (Mask):	 tensor([ True, False, False,  True])

Sales for Feb & Mar:

 tensor([[120., 130.],
        [ 95., 105.],
        [115., 120.]])
```
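
As an optional follow-up to this exercise, the boolean mask can be turned into something more readable. The sketch below maps the `True` positions back to month names and also computes per-product totals in one call; the `months` list and the variable names are assumptions added for illustration, not part of the exercise.

```python
import torch

sales_data = torch.tensor([[100, 120, 130, 110],   # Product A
                           [ 90,  95, 105, 125],   # Product B
                           [140, 115, 120, 150]],  # Product C
                          dtype=torch.float32)
months = ["Jan", "Feb", "Mar", "Apr"]               # hypothetical labels for the columns

mask_c = sales_data[2] > 130                        # True where Product C exceeded 130
high_idx = torch.nonzero(mask_c).flatten().tolist() # positions of the True entries
high_months = [months[i] for i in high_idx]
high_values = sales_data[2][mask_c]                 # the matching sales figures

totals = sales_data.sum(dim=1)                      # one total per product (sum over months)

print(high_months)   # ['Jan', 'Apr']
print(high_values)   # tensor([140., 150.])
print(totals)        # tensor([460., 415., 525.])
```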

### Exercise 2: Image Batch Transformation

You're working on a computer vision model and have a batch of 4 grayscale images, each of size 3x3 pixels. The data is currently in a tensor with the shape `[4, 3, 3]`, which represents `[batch_size, height, width]`.

For processing with certain deep learning frameworks, you need to transform this data into the `[batch_size, channels, height, width]` format. Since the images are grayscale, **you'll need to**:

1. Add a new dimension of size 1 at index 1 to represent the color channel.
2. After adding the channel, you realize the model expects the shape `[batch_size, height, width, channels]`. Transpose the tensor to swap the channel dimension with the last dimension.

<br>

<details>
<summary><span style="color:green;"><strong>Solution (Click here to expand)</strong></span></summary>

```python
### START CODE HERE ###

# 1. Add a channel dimension at index 1.
image_batch_with_channel = image_batch.unsqueeze(1)

# 2. Transpose the tensor to move the channel dimension to the end.
# Swap dimension 1 (channels) with dimension 3 (the last one).
image_batch_transposed = image_batch_with_channel.transpose(1, 3)

### END CODE HERE ###
```
</details>

In [72]:
# A batch of 4 grayscale images, each 3x3
image_batch = torch.rand(4, 3, 3)

print("ORIGINAL BATCH SHAPE:", image_batch.shape)
print("-" * 45)

### START CODE HERE ###

# 1. Add a channel dimension at index 1.
image_batch_with_channel = image_batch.unsqueeze(1)

# 2. Transpose the tensor to move the channel dimension to the end.
# Swap dimension 1 (channels) with dimension 3 (the last one).
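# Two successive transposes (instead of the single swap described above) move the channel axis to the end while keeping height and width in their original order.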
temp = image_batch_with_channel.transpose(1,2)
print(temp.shape)
image_batch_transposed = temp.transpose(2,3)

### END CODE HERE ###


print("\nSHAPE AFTER UNSQUEEZE:", image_batch_with_channel.shape)
print("SHAPE AFTER TRANSPOSE:", image_batch_transposed.shape)

ORIGINAL BATCH SHAPE: torch.Size([4, 3, 3])
---------------------------------------------
torch.Size([4, 3, 1, 3])

SHAPE AFTER UNSQUEEZE: torch.Size([4, 1, 3, 3])
SHAPE AFTER TRANSPOSE: torch.Size([4, 3, 3, 1])


#### Expected Output:

```
SHAPE AFTER UNSQUEEZE: torch.Size([4, 1, 3, 3])
SHAPE AFTER TRANSPOSE: torch.Size([4, 3, 3, 1])
```
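
Because the images in this exercise are square (3x3), a single `transpose(1, 3)` and a true "move channels to the end" produce the same shape, which hides a difference in the data layout. The sketch below uses non-square images (height 2, width 3, an assumption made purely to expose the difference) to compare a single swap, two successive transposes, and `permute`.

```python
import torch

# Hypothetical batch of 4 non-square grayscale images: height 2, width 3.
image_batch = torch.arange(4 * 2 * 3, dtype=torch.float32).reshape(4, 2, 3)
with_channel = image_batch.unsqueeze(1)                  # [4, 1, 2, 3] = [B, C, H, W]

swapped = with_channel.transpose(1, 3)                   # [4, 3, 2, 1] = [B, W, H, C]
two_step = with_channel.transpose(1, 2).transpose(2, 3)  # [4, 2, 3, 1] = [B, H, W, C]
moved = with_channel.permute(0, 2, 3, 1)                 # [4, 2, 3, 1] = [B, H, W, C]

print(swapped.shape)
print(two_step.shape)
print(torch.equal(two_step, moved))                      # True
```

A single swap exchanges the channel and width axes, so height and width end up in reversed order; `permute(0, 2, 3, 1)` (or two transposes) keeps each image's rows and columns intact.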

### Exercise 3: Combining and Weighting Sensor Data

You're building an environment monitoring system that uses two sensors: one for temperature and one for humidity. You receive data from these sensors as two separate 1D tensors.

**Your task is to**:

1. **Concatenate** the two tensors into a single `2x5` tensor, where the first row is temperature data and the second is humidity data.
2. Create a `weights` tensor `torch.tensor([0.6, 0.4])`.
3. Use **broadcasting and element-wise multiplication** to apply these weights to the combined sensor data. The temperature data should be multiplied by 0.6 and the humidity data by 0.4.
4. Finally, calculate the **weighted average** for each time step by **summing** the weighted values along `dim=0` and **dividing** by the sum of the weights.

<br>

<details>
<summary><span style="color:green;"><strong>Solution (Click here to expand)</strong></span></summary>

```python
### START CODE HERE ###

# 1. Concatenate the two tensors.
# Note: You need to unsqueeze them first to stack them vertically.
combined_data = torch.cat((temperature.unsqueeze(0), humidity.unsqueeze(0)), dim=0)

# 2. Create the weights tensor.
weights = torch.tensor([0.6, 0.4])

# 3. Apply weights using broadcasting.
# You need to reshape weights to [2, 1] to broadcast across columns.
weighted_data = combined_data * weights.unsqueeze(1)

# 4. Calculate the weighted average for each time step.
#    (A true average = weighted sum / sum of weights)
weighted_sum = torch.sum(weighted_data, dim=0)
weighted_average = weighted_sum / torch.sum(weights)

### END CODE HERE ###
```
</details>

In [73]:
# Sensor readings (5 time steps)
temperature = torch.tensor([22.5, 23.1, 21.9, 22.8, 23.5])
humidity = torch.tensor([55.2, 56.4, 54.8, 57.1, 56.8])

print("TEMPERATURE DATA: ", temperature)
print("HUMIDITY DATA:    ", humidity)
print("-" * 45)

### START CODE HERE ###

# 1. Concatenate the two tensors.
# Note: You need to unsqueeze them first to stack them vertically.
combined_data = torch.cat((temperature.unsqueeze(0),humidity.unsqueeze(0)),dim = 0)

# 2. Create the weights tensor.
weights = torch.tensor([0.6,0.4])

# 3. Apply weights using broadcasting.
# You need to reshape weights to [2, 1] to broadcast across columns.
weighted_data = combined_data * weights.unsqueeze(1)

# 4. Calculate the weighted average for each time step.
#    (A true average = weighted sum / sum of weights)
weighted_sum = torch.sum(weighted_data,dim=0)
weighted_average = weighted_sum / torch.sum(weights)

### END CODE HERE ###

print("\nCOMBINED DATA (2x5):\n\n", combined_data)
print("\nWEIGHTED DATA:\n\n", weighted_data)
print("\nWEIGHTED AVERAGE:", weighted_average)

TEMPERATURE DATA:  tensor([22.5000, 23.1000, 21.9000, 22.8000, 23.5000])
HUMIDITY DATA:     tensor([55.2000, 56.4000, 54.8000, 57.1000, 56.8000])
---------------------------------------------

COMBINED DATA (2x5):

 tensor([[22.5000, 23.1000, 21.9000, 22.8000, 23.5000],
        [55.2000, 56.4000, 54.8000, 57.1000, 56.8000]])

WEIGHTED DATA:

 tensor([[13.5000, 13.8600, 13.1400, 13.6800, 14.1000],
        [22.0800, 22.5600, 21.9200, 22.8400, 22.7200]])

WEIGHTED AVERAGE: tensor([35.5800, 36.4200, 35.0600, 36.5200, 36.8200])


#### Expected Output:

```
COMBINED DATA (2x5):

 tensor([[22.5000, 23.1000, 21.9000, 22.8000, 23.5000],
        [55.2000, 56.4000, 54.8000, 57.1000, 56.8000]])

WEIGHTED DATA:

 tensor([[13.5000, 13.8600, 13.1400, 13.6800, 14.1000],
        [22.0800, 22.5600, 21.9200, 22.8400, 22.7200]])

WEIGHTED AVERAGE: tensor([35.5800, 36.4200, 35.0600, 36.5200, 36.8200])
```
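
For comparison, the same weighted average can be written more compactly. This sketch is an alternative formulation, not the exercise's intended solution: `torch.stack` replaces the `unsqueeze` + `cat` pair, and a vector-matrix product (`@`) replaces the broadcast multiply followed by a sum.

```python
import torch

temperature = torch.tensor([22.5, 23.1, 21.9, 22.8, 23.5])
humidity = torch.tensor([55.2, 56.4, 54.8, 57.1, 56.8])
weights = torch.tensor([0.6, 0.4])

combined = torch.stack((temperature, humidity), dim=0)   # shape [2, 5]

# Broadcasting version, as in the exercise: weights reshaped to [2, 1].
broadcast_avg = (combined * weights.unsqueeze(1)).sum(dim=0) / weights.sum()

# Matrix-product version: [2] @ [2, 5] -> [5], i.e. the weighted sum per time step.
matmul_avg = (weights @ combined) / weights.sum()

print(combined.shape)
print(torch.allclose(broadcast_avg, matmul_avg))          # True
```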

### Exercise 4: Feature Engineering for Taxi Fares

You are working with a dataset of taxi trips. You have a tensor, `trip_data`, where each row is a trip and the columns represent **[distance (km), hour_of_day (24h)]**.

**Your goal** is to engineer a new binary feature called `is_rush_hour_long_trip`. This feature should be `True` (or `1`) only if a trip meets **both** of the following criteria:

* It's a **long trip** (distance > 10 km).
* It occurs during a **rush hour** (8-10 AM or 5-7 PM, i.e., `[8, 10)` or `[17, 19)`).

To achieve this, you will need to:

1. **Slice** the `trip_data` tensor to isolate the `distance` and `hour` columns.
2. Use **logical and comparison operators** to create boolean masks for each condition (long trip, morning rush, evening rush).
3. Combine these masks to create the final `is_rush_hour_long_trip` feature.
4. **Reshape** this new 1D feature tensor into a 2D column vector and convert its data type to float so it can be combined with the original data.

<br>

<details>
<summary><span style="color:green;"><strong>Solution (Click here to expand)</strong></span></summary>

```python
### START CODE HERE ###

# 1. Slice the main tensor to get 1D tensors for each feature.
distances = trip_data[:, 0]
hours = trip_data[:, 1]

# 2. Create boolean masks for each condition.
is_long_trip = distances > 10.0
is_morning_rush = (hours >= 8.0) & (hours < 10.0)
is_evening_rush = (hours >= 17.0) & (hours < 19.0)

# 3. Combine masks to identify rush hour long trips.
# A trip is a rush hour long trip if it's (a morning OR evening rush) AND a long trip.
is_rush_hour_long_trip_mask = (is_morning_rush | is_evening_rush) & is_long_trip

# 4. Reshape the new feature into a column vector and cast to float.
new_feature_col = is_rush_hour_long_trip_mask.float().unsqueeze(1)

### END CODE HERE ###
```
</details>

In [75]:
# Data for 8 taxi trips: [distance, hour_of_day]
trip_data = torch.tensor([
    [5.3, 7],   # Not rush hour, not long
    [12.1, 9],  # Morning rush, long trip -> RUSH HOUR LONG
    [15.5, 13], # Not rush hour, long trip
    [6.7, 18],  # Evening rush, not long
    [2.4, 20],  # Not rush hour, not long
    [11.8, 17], # Evening rush, long trip -> RUSH HOUR LONG
    [9.0, 9],   # Morning rush, not long
    [14.2, 8]   # Morning rush, long trip -> RUSH HOUR LONG
], dtype=torch.float32)


print("ORIGINAL TRIP DATA (Distance, Hour):\n\n", trip_data)
print("-" * 55)


### START CODE HERE ###

# 1. Slice the main tensor to get 1D tensors for each feature.
distances = trip_data[:, 0]
hours = trip_data[:, 1]

# 2. Create boolean masks for each condition.
is_long_trip = distances > 10.0
is_morning_rush = (hours >= 8.0) & (hours < 10.0)
is_evening_rush = (hours >= 17.0) & (hours < 19.0)

# 3. Combine masks to identify rush hour long trips.
# A trip is a rush hour long trip if it's (a morning OR evening rush) AND a long trip.
is_rush_hour_long_trip_mask = (is_morning_rush | is_evening_rush) & is_long_trip

# 4. Reshape the new feature into a column vector and cast to float.
new_feature_col = is_rush_hour_long_trip_mask.float().unsqueeze(1)

### END CODE HERE ###

print("\n'IS RUSH HOUR LONG TRIP' MASK: ", is_rush_hour_long_trip_mask)
print("\nNEW FEATURE COLUMN (Reshaped):\n\n", new_feature_col)

# You can now concatenate this new feature to the original data
enhanced_trip_data = torch.cat((trip_data, new_feature_col), dim=1)
print("\nENHANCED DATA (with new feature at the end):\n\n", enhanced_trip_data)

ORIGINAL TRIP DATA (Distance, Hour):

 tensor([[ 5.3000,  7.0000],
        [12.1000,  9.0000],
        [15.5000, 13.0000],
        [ 6.7000, 18.0000],
        [ 2.4000, 20.0000],
        [11.8000, 17.0000],
        [ 9.0000,  9.0000],
        [14.2000,  8.0000]])
-------------------------------------------------------

'IS RUSH HOUR LONG TRIP' MASK:  tensor([False,  True, False, False, False,  True, False,  True])

NEW FEATURE COLUMN (Reshaped):

 tensor([[0.],
        [1.],
        [0.],
        [0.],
        [0.],
        [1.],
        [0.],
        [1.]])

ENHANCED DATA (with new feature at the end):

 tensor([[ 5.3000,  7.0000,  0.0000],
        [12.1000,  9.0000,  1.0000],
        [15.5000, 13.0000,  0.0000],
        [ 6.7000, 18.0000,  0.0000],
        [ 2.4000, 20.0000,  0.0000],
        [11.8000, 17.0000,  1.0000],
        [ 9.0000,  9.0000,  0.0000],
        [14.2000,  8.0000,  1.0000]])


#### Expected Output:

```
'IS RUSH HOUR LONG TRIP' MASK:  tensor([False,  True, False, False, False,  True, False,  True])

NEW FEATURE COLUMN (Reshaped):

 tensor([[0.],
        [1.],
        [0.],
        [0.],
        [0.],
        [1.],
        [0.],
        [1.]])

ENHANCED DATA (with new feature at the end):

 tensor([[ 5.3000,  7.0000,  0.0000],
        [12.1000,  9.0000,  1.0000],
        [15.5000, 13.0000,  0.0000],
        [ 6.7000, 18.0000,  0.0000],
        [ 2.4000, 20.0000,  0.0000],
        [11.8000, 17.0000,  1.0000],
        [ 9.0000,  9.0000,  0.0000],
        [14.2000,  8.0000,  1.0000]])
```        
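
Beyond serving as a new feature column, the same boolean mask can filter rows or be summarized directly. A minimal sketch, reusing the exercise's data and conditions (the names `rush_long_trips`, `count`, and `fraction` are illustrative):

```python
import torch

trip_data = torch.tensor([
    [5.3, 7], [12.1, 9], [15.5, 13], [6.7, 18],
    [2.4, 20], [11.8, 17], [9.0, 9], [14.2, 8],
], dtype=torch.float32)

distances, hours = trip_data[:, 0], trip_data[:, 1]
is_rush = ((hours >= 8) & (hours < 10)) | ((hours >= 17) & (hours < 19))
mask = is_rush & (distances > 10)

rush_long_trips = trip_data[mask]        # keep only the rows where the mask is True
count = mask.sum().item()                # True counts as 1 when summed
fraction = mask.float().mean().item()    # share of trips that match

print(rush_long_trips)
print(count)      # 3
print(fraction)   # 0.375
```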