In this project, you'll look at a data science competition helping scientists track animals in a wildlife preserve. The goal is to take images from camera traps and classify which animal, if any, is present. To complete the competition, you'll expand your machine learning skills by creating more powerful neural network models that can take images as inputs and classify them into one of multiple categories.

Some of the things you'll learn are:

- How to read image files and prepare them for machine learning
- How to use PyTorch to manipulate tensors and build a neural network model
- How to build a Convolutional Neural Network that works well with images
- How to use that model to make predictions on new images
- How to turn those predictions into a submission to the competition

In [2]:
!pip install torch torchvision torchaudio


Collecting torch
  Downloading torch-2.7.0-cp313-cp313-win_amd64.whl.metadata (29 kB)
Collecting torchvision
  Downloading torchvision-0.22.0-cp313-cp313-win_amd64.whl.metadata (6.3 kB)
Collecting torchaudio
  Downloading torchaudio-2.7.0-cp313-cp313-win_amd64.whl.metadata (6.7 kB)
Collecting filelock (from torch)
  Downloading filelock-3.18.0-py3-none-any.whl.metadata (2.9 kB)
Collecting sympy>=1.13.3 (from torch)
  Downloading sympy-1.14.0-py3-none-any.whl.metadata (12 kB)
Collecting networkx (from torch)
  Downloading networkx-3.5-py3-none-any.whl.metadata (6.3 kB)
Collecting fsspec (from torch)
  Downloading fsspec-2025.5.1-py3-none-any.whl.metadata (11 kB)
Collecting mpmath<1.4,>=1.1.0 (from sympy>=1.13.3->torch)
  Downloading mpmath-1.3.0-py3-none-any.whl.metadata (8.6 kB)
Downloading torch-2.7.0-cp313-cp313-win_amd64.whl (212.5 MB)


In [3]:
import os
import sys

import matplotlib
import matplotlib.pyplot as plt
import pandas as pd
import PIL
import torch
import torchvision
from PIL import Image
from torchvision import transforms

In [4]:
print("Platform:", sys.platform)
print("Python version:", sys.version)
print("---")
print("matplotlib version:", matplotlib.__version__)
print("pandas version:", pd.__version__)
print("PIL version:", PIL.__version__)
print("torch version:", torch.__version__)
print("torchvision version:", torchvision.__version__)

Platform: win32
Python version: 3.13.2 | packaged by Anaconda, Inc. | (main, Feb  6 2025, 18:49:14) [MSC v.1929 64 bit (AMD64)]
---
matplotlib version: 3.10.0
pandas version: 2.2.3
PIL version: 11.1.0
torch version: 2.7.0+cpu
torchvision version: 0.22.0+cpu


# Working with Tensors in PyTorch

In [5]:
my_values = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
my_tensor = torch.Tensor(my_values)

print("my_tensor class:", type(my_tensor))
print(my_tensor)

my_tensor class: <class 'torch.Tensor'>
tensor([[ 1.,  2.,  3.],
        [ 4.,  5.,  6.],
        [ 7.,  8.,  9.],
        [10., 11., 12.]])


# Tensor Attributes

In [6]:
print("my_tensor shape:", my_tensor.shape)

print("my_tensor dtype:", my_tensor.dtype)

my_tensor shape: torch.Size([4, 3])
my_tensor dtype: torch.float32
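
The float32 dtype above comes from the capitalized `torch.Tensor` constructor, which always uses PyTorch's default floating-point type. The lowercase `torch.tensor` factory instead infers the dtype from the data. The following is a minimal sketch of that difference; the variable names are illustrative and not part of the original notebook.

```python
import torch

values = [[1, 2, 3], [4, 5, 6]]

# torch.Tensor(...) uses the default dtype (float32 unless it has been changed).
float_tensor = torch.Tensor(values)

# torch.tensor(...) infers the dtype from the Python values (integers here).
int_tensor = torch.tensor(values)

# Passing dtype explicitly, or converting later with .to(), avoids surprises.
explicit_tensor = torch.tensor(values, dtype=torch.float32)
converted_tensor = int_tensor.to(torch.float32)

print("float_tensor dtype:", float_tensor.dtype)
print("int_tensor dtype:", int_tensor.dtype)
print("explicit_tensor dtype:", explicit_tensor.dtype)
print("converted_tensor dtype:", converted_tensor.dtype)
```

Neural network layers generally expect floating-point inputs, so it can help to check the dtype before passing data to a model.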


# Print the device of my_tensor.

Tensors also have a .device attribute, which specifies the hardware on which each tensor is stored. By default, tensors are created on the computer's CPU. Let's check whether that's the case for my_tensor.

In [7]:
print("my_tensor device:", my_tensor.device)

my_tensor device: cpu


Some computers come with GPUs, which allow for bigger and faster model building. In PyTorch, the cuda package is used to access GPUs on Linux and Windows machines; mps is used on Macs. Let's check what's available in our notebook environment.

In [8]:
# Check if GPUs available via `cuda`
cuda_gpus_available = torch.cuda.is_available()

# Check if GPUs available via `mps`
mps_gpus_available = torch.backends.mps.is_available()

print("cuda GPUs available:", cuda_gpus_available)
print("mps GPUs available:", mps_gpus_available)

cuda GPUs available: False
mps GPUs available: False
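
Once we know which backends are available, a common pattern is to choose a device once and move tensors to it with `.to()`. The sketch below assumes the same availability checks as the cell above; `device`, `example_tensor`, and `example_on_device` are illustrative names, not part of the original notebook.

```python
import torch

# Pick the best available device, falling back to the CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

example_tensor = torch.Tensor([[1, 2, 3], [4, 5, 6]])

# .to() returns a tensor stored on the target device.
# If the tensor is already there, the same tensor is returned.
example_on_device = example_tensor.to(device)

print("selected device:", device)
print("example_on_device device:", example_on_device.device)
```

On a machine like the one used here, with neither backend available, the tensor simply stays on the CPU.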


# Tensor slicing

Slice my_tensor, assigning its top two rows to left_tensor and its bottom two rows to right_tensor.

In [9]:
left_tensor = my_tensor[:2, :]
right_tensor = my_tensor[2:, :]

print("left_tensor class:", type(left_tensor))
print("left_tensor shape:", left_tensor.shape)
print("left_tensor data type:", left_tensor.dtype)
print("left_tensor device:", left_tensor.device)
print(left_tensor)
print()
print("right_tensor class:", type(right_tensor))
print("right_tensor shape:", right_tensor.shape)
print("right_tensor data type:", right_tensor.dtype)
print("right_tensor device:", right_tensor.device)
print(right_tensor)

left_tensor class: <class 'torch.Tensor'>
left_tensor shape: torch.Size([2, 3])
left_tensor data type: torch.float32
left_tensor device: cpu
tensor([[1., 2., 3.],
        [4., 5., 6.]])

right_tensor class: <class 'torch.Tensor'>
right_tensor shape: torch.Size([2, 3])
right_tensor data type: torch.float32
right_tensor device: cpu
tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])
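
The same `[rows, columns]` indexing also selects columns. One detail worth knowing is that basic slices are views that share memory with the original tensor. The following sketch uses a fresh tensor (`example`, an illustrative name) so that `my_tensor` is left unchanged.

```python
import torch

example = torch.Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])

# All rows, first column only -> shape [4]
first_column = example[:, 0]
# All rows, columns 1 and 2 -> shape [4, 2]
last_two_columns = example[:, 1:]

# A basic slice is a view: it shares storage with `example`.
top_rows_view = example[:2, :]
# .clone() makes an independent copy.
top_rows_copy = example[:2, :].clone()

# Changing the original is visible through the view but not the copy.
example[0, 0] = 100.0

print("first_column shape:", first_column.shape)
print("last_two_columns shape:", last_two_columns.shape)
print("view sees:", top_rows_view[0, 0].item())
print("copy sees:", top_rows_copy[0, 0].item())
```

If a slice needs to be modified without affecting the source tensor, calling `.clone()` first is the safer choice.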


# Tensor Math

Use both the mathematical operator and the tensor's .add() method to add left_tensor to right_tensor. Assign the results to summed_tensor_operator and summed_tensor_method, respectively.

In [10]:
summed_tensor_operator = right_tensor + left_tensor
summed_tensor_method = right_tensor.add(left_tensor)

print("summed_tensor_operator class:", type(summed_tensor_operator))
print("summed_tensor_operator shape:", summed_tensor_operator.shape)
print("summed_tensor_operator data type:", summed_tensor_operator.dtype)
print("summed_tensor_operator device:", summed_tensor_operator.device)
print(summed_tensor_operator)
print()
print("summed_tensor_method class:", type(summed_tensor_method))
print("summed_tensor_method shape:", summed_tensor_method.shape)
print("summed_tensor_method data type:", summed_tensor_method.dtype)
print("summed_tensor_method device:", summed_tensor_method.device)
print(summed_tensor_method)

summed_tensor_operator class: <class 'torch.Tensor'>
summed_tensor_operator shape: torch.Size([2, 3])
summed_tensor_operator data type: torch.float32
summed_tensor_operator device: cpu
tensor([[ 8., 10., 12.],
        [14., 16., 18.]])

summed_tensor_method class: <class 'torch.Tensor'>
summed_tensor_method shape: torch.Size([2, 3])
summed_tensor_method data type: torch.float32
summed_tensor_method device: cpu
tensor([[ 8., 10., 12.],
        [14., 16., 18.]])


In [11]:
ew_tensor_operator = left_tensor * right_tensor
ew_tensor_method = right_tensor.mul(left_tensor)

print("ew_tensor_operator class:", type(ew_tensor_operator))
print("ew_tensor_operator shape:", ew_tensor_operator.shape)
print("ew_tensor_operator data type:", ew_tensor_operator.dtype)
print("ew_tensor_operator device:", ew_tensor_operator.device)
print(ew_tensor_operator)
print()
print("ew_tensor_method class:", type(ew_tensor_method))
print("ew_tensor_method shape:", ew_tensor_method.shape)
print("ew_tensor_method data type:", ew_tensor_method.dtype)
print("ew_tensor_method device:", ew_tensor_method.device)
print(ew_tensor_method)

ew_tensor_operator class: <class 'torch.Tensor'>
ew_tensor_operator shape: torch.Size([2, 3])
ew_tensor_operator data type: torch.float32
ew_tensor_operator device: cpu
tensor([[ 7., 16., 27.],
        [40., 55., 72.]])

ew_tensor_method class: <class 'torch.Tensor'>
ew_tensor_method shape: torch.Size([2, 3])
ew_tensor_method data type: torch.float32
ew_tensor_method device: cpu
tensor([[ 7., 16., 27.],
        [40., 55., 72.]])


In [12]:
left_tensor * right_tensor == right_tensor * left_tensor

tensor([[True, True, True],
        [True, True, True]])
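
The `==` comparison above returns one boolean per element. When a single yes/no answer is more convenient, the result can be reduced with `.all()`, or the tensors can be compared with `torch.equal` (exact match) or `torch.allclose` (match within a floating-point tolerance). A minimal sketch, with illustrative tensor names `a` and `b`:

```python
import torch

a = torch.Tensor([[1, 2, 3], [4, 5, 6]])
b = torch.Tensor([[7, 8, 9], [10, 11, 12]])

# Element-wise comparison: a boolean tensor with the same shape as the inputs.
elementwise = (a * b) == (b * a)

# Reduce to one Python bool: True only if every element matched.
all_match = elementwise.all().item()

# torch.equal checks shape and values exactly and returns a Python bool.
exactly_equal = torch.equal(a * b, b * a)

# torch.allclose tolerates small floating-point differences.
approximately_equal = torch.allclose(a * b, b * a)

print("all_match:", all_match)
print("exactly_equal:", exactly_equal)
print("approximately_equal:", approximately_equal)
```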

# Matrix multiplication methods

There's also matrix multiplication, which combines the rows of one tensor with the columns of another to generate a new one. We can use the @ operator or the .matmul() method.

To see how this works, let's create two new tensors with different shapes.

In [13]:
new_left_tensor = torch.Tensor([[2, 5], [7, 3]])
new_right_tensor = torch.Tensor([[8], [9]])

print("new_left_tensor class:", type(new_left_tensor))
print("new_left_tensor shape:", new_left_tensor.shape)
print("new_left_tensor data type:", new_left_tensor.dtype)
print("new_left_tensor device:", new_left_tensor.device)
print(new_left_tensor)
print()
print("new_right_tensor class:", type(new_right_tensor))
print("new_right_tensor shape:", new_right_tensor.shape)
print("new_right_tensor data type:", new_right_tensor.dtype)
print("new_right_tensor device:", new_right_tensor.device)
print(new_right_tensor)

new_left_tensor class: <class 'torch.Tensor'>
new_left_tensor shape: torch.Size([2, 2])
new_left_tensor data type: torch.float32
new_left_tensor device: cpu
tensor([[2., 5.],
        [7., 3.]])

new_right_tensor class: <class 'torch.Tensor'>
new_right_tensor shape: torch.Size([2, 1])
new_right_tensor data type: torch.float32
new_right_tensor device: cpu
tensor([[8.],
        [9.]])


In [14]:
mm_tensor_operator = new_left_tensor @ new_right_tensor
mm_tensor_method = new_left_tensor.matmul(new_right_tensor)

print("mm_tensor_operator class:", type(mm_tensor_operator))
print("mm_tensor_operator shape:", mm_tensor_operator.shape)
print("mm_tensor_operator data type:", mm_tensor_operator.dtype)
print("mm_tensor_operator device:", mm_tensor_operator.device)
print(mm_tensor_operator)
print()
print("mm_tensor_method class:", type(mm_tensor_method))
print("mm_tensor_method shape:", mm_tensor_method.shape)
print("mm_tensor_method data type:", mm_tensor_method.dtype)
print("mm_tensor_method device:", mm_tensor_method.device)
print(mm_tensor_method)

mm_tensor_operator class: <class 'torch.Tensor'>
mm_tensor_operator shape: torch.Size([2, 1])
mm_tensor_operator data type: torch.float32
mm_tensor_operator device: cpu
tensor([[61.],
        [83.]])

mm_tensor_method class: <class 'torch.Tensor'>
mm_tensor_method shape: torch.Size([2, 1])
mm_tensor_method data type: torch.float32
mm_tensor_method device: cpu
tensor([[61.],
        [83.]])
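
To see where 61 and 83 come from, each output entry is the sum of products of one row of the left tensor with one column of the right tensor: 2×8 + 5×9 = 61 and 7×8 + 3×9 = 83. The sketch below spells this out with explicit loops, purely for illustration; `left`, `right`, and `manual` are illustrative names, and in practice `@` or `.matmul()` should be used.

```python
import torch

left = torch.Tensor([[2, 5], [7, 3]])   # shape [2, 2]
right = torch.Tensor([[8], [9]])        # shape [2, 1]

rows, inner = left.shape
inner_right, cols = right.shape
# The inner dimensions must match for matrix multiplication to be defined.
assert inner == inner_right

# Result has one row per row of `left` and one column per column of `right`.
manual = torch.zeros(rows, cols)
for i in range(rows):
    for j in range(cols):
        for k in range(inner):
            manual[i, j] += left[i, k] * right[k, j]

print(manual)
print("matches @ operator:", torch.equal(manual, left @ right))
```

Unlike element-wise multiplication, the order matters here: `right @ left` would fail, because a [2, 1] tensor cannot be matrix-multiplied by a [2, 2] tensor.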


# Tensor mean calculation

In [15]:
my_tensor_mean = my_tensor.mean()

print("my_tensor_mean class:", type(my_tensor_mean))
print("my_tensor_mean shape:", my_tensor_mean.shape)
print("my_tensor_mean data type:", my_tensor_mean.dtype)
print("my_tensor_mean device:", my_tensor_mean.device)
print("my_tensor mean:", my_tensor_mean)

my_tensor_mean class: <class 'torch.Tensor'>
my_tensor_mean shape: torch.Size([])
my_tensor_mean data type: torch.float32
my_tensor_mean device: cpu
my_tensor mean: tensor(6.5000)


# Calculate the mean for each column in my_tensor.

In [16]:
my_tensor_column_means = my_tensor.mean(dim=[0])

print("my_tensor_column_means class:", type(my_tensor_column_means))
print("my_tensor_column_means shape:", my_tensor_column_means.shape)
print("my_tensor_column_means data type:", my_tensor_column_means.dtype)
print("my_tensor_column_means device:", my_tensor_column_means.device)
print("my_tensor column means:", my_tensor_column_means)

my_tensor_column_means class: <class 'torch.Tensor'>
my_tensor_column_means shape: torch.Size([3])
my_tensor_column_means data type: torch.float32
my_tensor_column_means device: cpu
my_tensor column means: tensor([5.5000, 6.5000, 7.5000])
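
The `dim` argument names the dimension that gets collapsed: `dim=0` averages down the rows to give one value per column, while `dim=1` averages across the columns to give one value per row. The sketch below shows the row-wise version, along with `keepdim=True`, which keeps the reduced dimension so the result can be combined with the original tensor. The names are illustrative.

```python
import torch

example = torch.Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])

# Collapse dimension 1 (columns): one mean per row -> shape [4]
row_means = example.mean(dim=1)

# keepdim=True keeps the collapsed dimension with size 1 -> shape [4, 1]
row_means_keepdim = example.mean(dim=1, keepdim=True)

# The [4, 1] shape broadcasts against [4, 3], so each row is centered on zero.
centered = example - row_means_keepdim

print("row_means:", row_means)
print("row_means_keepdim shape:", row_means_keepdim.shape)
print(centered)
```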


# Exploring files

In [17]:
!gcloud storage ls gs://wqu-cv-course-datasets

'gcloud' is not recognized as an internal or external command,
operable program or batch file.
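
The error above means the `gcloud` command-line tool is not installed, or not on the PATH, on this machine. One way to check for a command before running it is the standard library's `shutil.which`. This is a minimal sketch; `tool_available` is a hypothetical helper, not part of the original notebook.

```python
import shutil


def tool_available(name):
    """Return True if an executable called `name` can be found on the PATH."""
    return shutil.which(name) is not None


if tool_available("gcloud"):
    print("gcloud found at:", shutil.which("gcloud"))
else:
    print("gcloud was not found on the PATH; install it or add it to the PATH first.")
```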
