In [1]:
# For tips on running notebooks in Google Colab, see
# https://pytorch.org/tutorials/beginner/colab
%matplotlib inline



# Learn the Basics

Authors:
[Suraj Subramanian](https://github.com/suraj813),
[Seth Juarez](https://github.com/sethjuarez/),
[Cassie Breviu](https://github.com/cassieview/),
[Dmitry Soshnikov](https://soshnikov.com/),
[Ari Bornstein](https://github.com/aribornstein/)

Most machine learning workflows involve working with data, creating models, optimizing model
parameters, and saving the trained models. This tutorial introduces you to a complete ML workflow
implemented in PyTorch, with links to learn more about each of these concepts.

We'll use the FashionMNIST dataset to train a neural network that predicts if an input image belongs
to one of the following classes: T-shirt/top, Trouser, Pullover, Dress, Coat, Sandal, Shirt, Sneaker,
Bag, or Ankle boot.

`This tutorial assumes a basic familiarity with Python and Deep Learning concepts.`


## Running the Tutorial Code
You can run this tutorial in a couple of ways:

- **In the cloud**: This is the easiest way to get started! Each section has a "Run in Microsoft Learn" and "Run in Google Colab" link at the top, which opens an integrated notebook in Microsoft Learn or Google Colab, respectively, with the code in a fully-hosted environment.
- **Locally**: This option requires you to set up PyTorch and TorchVision on your local machine first ([installation instructions](https://pytorch.org/get-started/locally/)). Download the notebook or copy the code into your favorite IDE.


## How to Use this Guide
If you're familiar with other deep learning frameworks, check out the [0. Quickstart](quickstart_tutorial.html) first
to quickly familiarize yourself with PyTorch's API.

If you're new to deep learning frameworks, head right into the first section of our step-by-step guide: [1. Tensors](tensor_tutorial.html).




Imports

In [2]:
import torch
import numpy as np

## Initializing a Tensor

Directly from data

In [3]:
data = [
    [1, 2],
    [3, 4]
]
x_data = torch.tensor(data)

x_data

tensor([[1, 2],
        [3, 4]])

From a NumPy array

In [4]:
np_array = np.array(data)
x_np = torch.from_numpy(np_array)

x_np

tensor([[1, 2],
        [3, 4]])

From another tensor:

Note that the shape and dtype of the source tensor are preserved in these operations unless explicitly overridden.

In [5]:
x_ones = torch.ones_like(x_data)
print(f"x_ones: {x_ones}")

# Explicitly override the dtype, which would otherwise be preserved
x_rand = torch.rand_like(x_data, dtype=torch.float)

x_ones: tensor([[1, 1],
        [1, 1]])
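To check the note about preserved properties, here is a minimal sketch that inspects shape and dtype directly. It assumes the same `x_data` as above; the override on `rand_like` is probably required rather than optional here, since `torch.rand` samples floating-point values and `x_data` holds integers.

```python
import torch

x_data = torch.tensor([[1, 2], [3, 4]])  # integer data, so dtype is torch.int64

# ones_like keeps both the shape and the dtype of its argument
x_ones = torch.ones_like(x_data)
print(x_ones.shape, x_ones.dtype)

# rand_like keeps the shape, but the dtype is overridden to float32 here
x_rand = torch.rand_like(x_data, dtype=torch.float)
print(x_rand.shape, x_rand.dtype)
```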


With random or constant values:

In [6]:
shape = (2, 3,)

rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(f"rand_tensor: {rand_tensor}")
print(f"ones_tensor: {ones_tensor}")
print(f"zeros_tensor: {zeros_tensor}")

# Note the default assumed dtype is torch.float32

rand_tensor: tensor([[0.9189, 0.1128, 0.5910],
        [0.0225, 0.9473, 0.4438]])
ones_tensor: tensor([[1., 1., 1.],
        [1., 1., 1.]])
zeros_tensor: tensor([[0., 0., 0.],
        [0., 0., 0.]])
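The factory functions also accept a `dtype` argument, and the random values can be made repeatable by seeding the generator. The following is only a sketch of those two points; the seed value `0` is an arbitrary choice for illustration.

```python
import torch

shape = (2, 3)

# Same factory function, different dtypes
zeros_default = torch.zeros(shape)                 # torch.float32 unless the default was changed
zeros_int = torch.zeros(shape, dtype=torch.int64)  # explicit integer dtype
print(zeros_default.dtype, zeros_int.dtype)

# Re-seeding the generator makes torch.rand reproducible
torch.manual_seed(0)
first = torch.rand(shape)
torch.manual_seed(0)
second = torch.rand(shape)
print(torch.equal(first, second))  # True
```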


## Attributes of a Tensor


In [7]:
tensor = torch.rand(3, 4)

print("Tensor attributes: ")
print(f"tensor.shape: {tensor.shape}")
print(f"tensor.dtype: {tensor.dtype}")
print(f"tensor.device: {tensor.device}")

Tensor attributes: 
tensor.shape: torch.Size([3, 4])
tensor.dtype: torch.float32
tensor.device: cpu


## Operations on Tensors

Copying a tensor from the CPU to the GPU (CUDA)

In [11]:
if torch.cuda.is_available():
    print("cuda is available!")
    tensor = tensor.to("cuda")

cuda is available!
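A common way to write code that runs with or without a GPU is to pick the device once and pass it to `.to()`. This is a sketch of that pattern, not part of the original notebook; the names `device` and `tensor_on_device` are illustrative.

```python
import torch

# Pick the GPU when one is available, otherwise stay on the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

tensor = torch.rand(3, 4)             # created on the CPU by default
tensor_on_device = tensor.to(device)  # moved (copied) only if device differs

print(f"original device: {tensor.device}")
print(f"target device:   {tensor_on_device.device}")
```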


Standard NumPy-like indexing and slicing:


In [17]:
tensor = torch.ones(4, 4)

print(f"First row: {tensor[0]}")
print(f"First column: {tensor[:,0]}")
print(f"Last column: {tensor[:, -1]}")

tensor[:, 1] = 0

tensor

First row: tensor([1., 1., 1., 1.])
First column: tensor([1., 1., 1., 1.])
Last column: tensor([1., 1., 1., 1.])


tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
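One detail worth knowing, as in NumPy: basic slicing returns a view that shares storage with the original tensor, which is why assigning to `tensor[:, 1]` above changes `tensor` itself. A small sketch, with `clone()` shown as one way to get an independent copy:

```python
import torch

tensor = torch.ones(4, 4)

# Basic slicing returns a view that shares storage with the original tensor
first_row = tensor[0]
first_row[0] = 7          # also changes tensor[0, 0]
print(tensor[0, 0])       # tensor(7.)

# clone() makes an independent copy
row_copy = tensor[1].clone()
row_copy[0] = -1          # tensor[1, 0] is unchanged
print(tensor[1, 0])       # tensor(1.)
```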

Joining tensors

In [20]:
t1 = torch.cat([tensor, tensor, tensor], dim=1)

t1

tensor([[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.]])

`dim=0` joins the tensors row-wise, stacking them vertically (the first subscript selects the row).<br>
`dim=1` joins the tensors column-wise, placing them side by side (the second subscript selects the column), as in the example above.
...
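The effect of `dim` is easiest to see in the resulting shapes. The sketch below uses two small non-square tensors (chosen for illustration) and also shows `torch.stack`, which adds a new dimension instead of extending an existing one.

```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(2, 3)

rows = torch.cat([a, b], dim=0)       # more rows: shape (4, 3)
cols = torch.cat([a, b], dim=1)       # more columns: shape (2, 6)
stacked = torch.stack([a, b], dim=0)  # new leading dimension: shape (2, 2, 3)

print(rows.shape, cols.shape, stacked.shape)
```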

Arithmetic operations

In [27]:
y1 = tensor @ tensor.T
y2 = tensor.matmul(tensor.T)

print(f"y1 : {y1}\n")
print(f"y2: {y2}\n")

y3 = torch.rand_like(y1)
torch.matmul(tensor, tensor.T, out=y3)

print(f"y3: {y3}\n")

z1 = tensor * tensor
z2 = tensor.mul(tensor)

print(f"z1: {z1}")
print(f"z2: {z2}")

z3 = torch.rand_like(tensor)
torch.mul(tensor, tensor, out=z3)

print(f"z3: {z3}")

y1 : tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]])

y2: tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]])

y3: tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]])

z1: tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
z2: tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
z3: tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
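Since the tensor above is mostly ones, the difference between the matrix product (`@`, `matmul`) and the element-wise product (`*`, `mul`) is easy to miss. A sketch with two small, hand-picked matrices makes it clearer:

```python
import torch

a = torch.tensor([[1., 2.], [3., 4.]])
b = torch.tensor([[0., 1.], [1., 0.]])

matrix_product = a @ b        # rows of a combined with columns of b
elementwise_product = a * b   # entry-by-entry product

print(matrix_product)         # tensor([[2., 1.], [4., 3.]])
print(elementwise_product)    # tensor([[0., 2.], [3., 0.]])
```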


Single-element tensors

In [33]:
agg = tensor.sum()

print(f"agg: {agg}")
print(f"type(agg): {type(agg)}")

agg: 12.0
type(agg): <class 'torch.Tensor'>


In [34]:
agg_item = agg.item()

print(f"agg_item: {agg_item}")
print(f"type(agg_item): {type(agg_item)}")

agg_item: 12.0
type(agg_item): <class 'float'>
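`item()` only works on tensors with exactly one element (of any shape). For larger tensors, `tolist()` is one alternative. A minimal sketch; the exact exception type raised by `item()` may differ between PyTorch versions, so both likely candidates are caught here:

```python
import torch

single = torch.tensor([[3.5]])        # one element, shape (1, 1)
print(single.item())                  # 3.5

many = torch.tensor([1.0, 2.0, 3.0])
print(many.tolist())                  # [1.0, 2.0, 3.0]

try:
    many.item()                       # more than one element: not allowed
except (ValueError, RuntimeError) as err:
    print(f"item() failed: {err}")
```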


In-place operations<br>
Their names end with an underscore (`_`), for example `add_`.

In [35]:
print(f"tensor: {tensor}")

tensor.add_(5)

print(f"tensor: {tensor}")

tensor: tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
tensor: tensor([[6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.]])
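For comparison, the version without the underscore leaves the original untouched and returns a new tensor. A short sketch contrasting the two, using `data_ptr()` to check whether the results share memory:

```python
import torch

tensor = torch.ones(2, 2)

# Out-of-place: returns a new tensor, the original is unchanged
result = tensor.add(5)
print(tensor)   # still all ones
print(result)   # all sixes

# In-place: modifies the original tensor's memory
returned = tensor.add_(5)
print(tensor)   # now all sixes
print(returned.data_ptr() == tensor.data_ptr())  # True
```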


## Bridge with NumPy

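Tensors on the CPU and NumPy arrays can share their underlying memory locations, and changing one will change the other.

This does not hold for tensors on the GPU: a CUDA tensor has to be copied to the CPU before it can be converted to a NumPy array, so the resulting array is independent of the GPU tensor.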

Tensor to NumPy array

In [45]:
t = torch.ones(5)
print(f"t: {t}")
print(f"type(t): {type(t)}")

print("")

n = t.numpy()
print(f"n: {n}")
print(f"type(n): {type(n)}")

t: tensor([1., 1., 1., 1., 1.])
type(t): <class 'torch.Tensor'>

n: [1. 1. 1. 1. 1.]
type(n): <class 'numpy.ndarray'>


NumPy array to Tensor

In [46]:
n = np.ones(5)
print(f"n: {n}")
print(f"type(n): {type(n)}")

print("")

t = torch.from_numpy(n)
print(f"t: {t}")
print(f"type(t): {type(t)}")

n: [1. 1. 1. 1. 1.]
type(n): <class 'numpy.ndarray'>

t: tensor([1., 1., 1., 1., 1.], dtype=torch.float64)
type(t): <class 'torch.Tensor'>


Changing a CPU tensor changes the NumPy array and vice versa, as they share the same memory:

In [47]:
t.add_(5)

print(f"t: {t}\n")
print(f"n: {n}")

t: tensor([6., 6., 6., 6., 6.], dtype=torch.float64)

n: [6. 6. 6. 6. 6.]


In [48]:
n += 5

print(f"t: {t}\n")
print(f"n: {n}")

t: tensor([11., 11., 11., 11., 11.], dtype=torch.float64)

n: [11. 11. 11. 11. 11.]
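Whether memory is shared depends on how the tensor was created. As a sketch of the distinction: `torch.from_numpy` shares the array's memory, while `torch.tensor` copies the data. The variable names `shared` and `copied` are illustrative.

```python
import numpy as np
import torch

n = np.ones(3)

shared = torch.from_numpy(n)   # shares memory with n
copied = torch.tensor(n)       # copies the data

n += 1                         # in-place change to the NumPy array

print(shared)  # tensor([2., 2., 2.], dtype=torch.float64)
print(copied)  # tensor([1., 1., 1.], dtype=torch.float64)
print(np.shares_memory(n, shared.numpy()))  # True
```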


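The same does not hold for tensors on the GPU: `.to("cuda")` copies the data into GPU memory, so later changes to the GPU tensor do not affect the NumPy array.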

In [49]:
t_cuda = t.to("cuda")

t_cuda.add_(5)

print(f"t_cuda: {t_cuda}\n")
print(f"n: {n}")

t_cuda: tensor([16., 16., 16., 16., 16.], device='cuda:0', dtype=torch.float64)

n: [11. 11. 11. 11. 11.]
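Going the other way, a GPU tensor cannot be converted with `.numpy()` directly; it has to be brought back to the CPU first. The helper below is a hypothetical convenience function (not a PyTorch API) sketching a common idiom, and it falls back to the CPU when no GPU is present.

```python
import numpy as np
import torch


def to_numpy(t: torch.Tensor) -> np.ndarray:
    """Hypothetical helper: convert a tensor on any device to a NumPy array."""
    # detach() drops autograd tracking; cpu() copies to host memory only when
    # the tensor lives on another device (CPU tensors are returned as-is).
    return t.detach().cpu().numpy()


device = "cuda" if torch.cuda.is_available() else "cpu"
t = torch.ones(5, device=device)

n = to_numpy(t)
print(type(n), n)
```

For a tensor that is already on the CPU, the returned array still shares memory with it; only the GPU case involves a copy.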
