A PyTorch tensor is essentially the same as a NumPy array: a generic n-dimensional array that knows nothing about deep learning.

From a programmer's point of view, a tensor is a multidimensional array.

In [1]:
# First, import PyTorch
import torch

In [2]:
x = torch.zeros(3,4)
print(x)

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])


In [3]:
x = torch.empty(3, 4) # Allocates memory without initializing it, so the values are arbitrary.
print(type(x))
print(x)

<class 'torch.Tensor'>
tensor([[1.9287e+31, 1.7743e+28, 2.0535e-19, 2.7909e+23],
        [6.2608e+22, 4.7428e+30, 4.4656e+30, 3.4731e-12],
        [7.5555e+31, 1.6930e+22, 1.8465e+25, 1.9863e+08]])


In [4]:
zeros = torch.zeros(2, 3)

ones = torch.ones(2, 3)

torch.manual_seed(1729)
random = torch.rand(2, 3)

print(zeros)
print(ones)
print(random)

tensor([[0., 0., 0.],
        [0., 0., 0.]])
tensor([[1., 1., 1.],
        [1., 1., 1.]])
tensor([[0.3126, 0.3791, 0.3087],
        [0.0736, 0.4216, 0.0691]])


In [5]:
import numpy as np
my_numbers = np.array([[1, 2],[3, 4]])

print(my_numbers.shape)

my_numbers = torch.tensor([[1, 2],[3, 4]])
print(my_numbers)
print(my_numbers.shape)

(2, 2)
tensor([[1, 2],
        [3, 4]])
torch.Size([2, 2])


In [6]:
print(my_numbers.dtype)

t = my_numbers.to(torch.float32)

print(t.dtype)

print(torch.tensor([[1, 2],[3, 4]], dtype = torch.float32))

torch.int64
torch.float32
tensor([[1., 2.],
        [3., 4.]])


In [7]:
t = (torch.ones(2, 2) * 7 - 1) / 2
print(t)

print(t**2)

print(torch.std(t))


tensor([[3., 3.],
        [3., 3.]])
tensor([[9., 9.],
        [9., 9.]])
tensor(0.)


In [8]:
v1 = torch.tensor([1., 0., 0.])         # x unit vector
v2 = torch.tensor([0., 1., 0.])         # y unit vector

print(torch.cross(v2, v1)) # cross product y x x, which points along -z
print(torch.dot(v1,v2))    # dot product of two 1D tensors; 0 because they are orthogonal

tensor([ 0.,  0., -1.])
tensor(0.)


# Compute $y=\sigma(w^Tx + b)$

In [9]:
def activation(x):
    """ Sigmoid activation function 
    
        Arguments
        ---------
        x: torch.Tensor - input tensor; the sigmoid is applied element-wise
    """
    return 1/(1+torch.exp(-x))

In [10]:
### Generate some data
torch.manual_seed(7) # Set the random seed so things are predictable

# Features: a fixed vector of 5 values.
# Alternative: a random vector whose elements are each drawn from a normal distribution:
# features = torch.randn((1, 5))
features = torch.tensor([1,2,3,4,5], dtype = torch.float32)

# True weights for our data, random normal variables again
weights = torch.randn_like(features)
# and a true bias term
bias = torch.randn((1, 1))

> **Exercise**: Calculate the output of the network with input features `features`, weights `weights`, and bias `bias`. Similar to NumPy, PyTorch has a [`torch.sum()`](https://pytorch.org/docs/stable/torch.html#torch.sum) function, as well as a `.sum()` method on tensors, for taking sums. Use the function `activation` defined above as the activation function.

In [11]:
### Solution
# Two equivalent ways to take the weighted sum: the torch.sum() function and the .sum() method
y = activation(torch.sum(features * weights) + bias)
y = activation((features * weights).sum() + bias)

In [12]:
y

tensor([[0.9994]])

In [13]:
y = activation(torch.dot(features, weights)+ bias)
print(y)

tensor([[0.9994]])


A better way is to use matrix multiplication, which performs the multiply-and-sum in a single operation.

Use [`torch.mm()`](https://pytorch.org/docs/stable/torch.html#torch.mm) or [`torch.matmul()`](https://pytorch.org/docs/stable/torch.html#torch.matmul) for matrix multiplication.

`torch.mm`: performs only *matrix* multiplication, i.e., the input sizes must be $n \times m$ and $m \times p$. It does not support broadcasting.<br>
`torch.matmul`: more general than `torch.mm`. It multiplies two *tensors* that need not be 2-D, and it supports broadcasting.
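
The difference between the two functions can be sketched as follows. The tensor names and sizes here are illustrative assumptions, not part of the exercise data.

```python
import torch

a = torch.ones(2, 3)
b = torch.ones(3, 4)

# For plain 2-D inputs the two functions give the same result.
assert torch.equal(torch.mm(a, b), torch.matmul(a, b))

# matmul broadcasts over leading (batch) dimensions:
# a stack of 10 matrices of size 2x3, each multiplied by the same 3x4 matrix.
batch = torch.ones(10, 2, 3)
batched_product = torch.matmul(batch, b)
print(batched_product.shape)

# mm only accepts 2-D inputs, so the same call raises an error.
try:
    torch.mm(batch, b)
    mm_accepts_batch = True
except RuntimeError as err:
    mm_accepts_batch = False
    print('torch.mm failed:', err)

# matmul also accepts a 1-D tensor as the right-hand operand.
v = torch.ones(3)
matrix_vector = torch.matmul(a, v)
print(matrix_vector.shape)
```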

To change the shape of a tensor, use one of:
- `weights.reshape(a, b)`: returns a tensor of size `(a, b)` with the same data as `weights`. It is a view of the original tensor when possible and a copy otherwise.
- `weights.resize_(a, b)`: resizes `weights` itself, in place, to the specified size.
- `weights.view(a, b)`: returns a new tensor of size `(a, b)` that shares the underlying data with `weights`.

Examples:

In [14]:
test = torch.ones(2,2)
print(test)
print(test.view(1,4)) 
print(test) # Note that calling view does not modify the original tensor

tensor([[1., 1.],
        [1., 1.]])
tensor([[1., 1., 1., 1.]])
tensor([[1., 1.],
        [1., 1.]])


In [15]:
# Let's change the underlying data with an in-place multiplication
print(test.view(1,4).mul_(4))
print(test)

tensor([[4., 4., 4., 4.]])
tensor([[4., 4.],
        [4., 4.]])


**Practical Tip:**
To reshape a tensor, use `view`; to copy a tensor, use `clone`. For the difference between `reshape` and `view`, see [this Stack Overflow discussion](https://stackoverflow.com/questions/49643225/whats-the-difference-between-reshape-and-view-in-pytorch).
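
A minimal sketch of the tip, using a small illustrative tensor (the names `original`, `shared`, `independent`, and `transposed` are assumptions made for this example). It also shows a case where `view` is not available: a non-contiguous tensor, for which `reshape` falls back to copying.

```python
import torch

original = torch.arange(6.).view(2, 3)

# view shares storage: writing through the view changes the original.
shared = original.view(3, 2)
shared[0, 0] = 100.
print(original[0, 0])

# clone makes an independent copy: the original is unaffected.
independent = original.clone()
independent[0, 0] = -1.
print(original[0, 0])

# A transposed tensor is not contiguous in memory, so view raises an error here,
# while reshape silently returns a copy.
transposed = original.t()
try:
    transposed.view(6)
    view_worked = True
except RuntimeError:
    view_worked = False
flat = transposed.reshape(6)
print(view_worked, flat.shape)
```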


> **Exercise**: Calculate the output of our little network using matrix multiplication.

In [16]:
features.shape

torch.Size([5])

In [17]:
y = activation(torch.mm(features.view(1,5), weights.view(5,1)) + bias) # Reshape to (1, 5) and (5, 1) with view, then multiply with mm.

print(y)

tensor([[0.9994]])


# Compute $y = f_2 \! \left(\, f_1 \! \left(\vec{x} \, \mathbf{W_1} + \mathbf{B_1}\right) \mathbf{W_2} + \mathbf{B_2} \right)$

> **Exercise:** Calculate the output for this multi-layer network using the weights `W1` & `W2`, and the biases, `B1` & `B2`. 

In [18]:
### Generate some data
torch.manual_seed(7) # Set the random seed so things are predictable

# Features are 3 random normal variables
features = torch.randn((1, 3))

# Define the size of each layer in our network
n_input = features.shape[1]     # Number of input units, must match number of input features
n_hidden = 2                    # Number of hidden units 
n_output = 1                    # Number of output units

# Weights for inputs to hidden layer
W1 = torch.randn(n_input, n_hidden)
# Weights for hidden layer to output layer
W2 = torch.randn(n_hidden, n_output)

# and bias terms for hidden and output layers
B1 = torch.randn((1, n_hidden))
B2 = torch.randn((1, n_output))

In [19]:
h = activation(torch.mm(features, W1) + B1)
output = activation(torch.mm(h, W2) + B2)
print(output)

tensor([[0.3171]])
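
As a cross-check, the same network can be written with PyTorch's built-in `torch.sigmoid` and the `@` matrix-multiplication operator. Neither is used elsewhere in this notebook; they are swapped in here only to confirm the hand-written `activation` and `torch.mm` version. The sketch regenerates the data with the same seed and layer sizes so that it runs on its own.

```python
import torch

def activation(x):
    """Sigmoid activation function, as defined earlier in the notebook."""
    return 1 / (1 + torch.exp(-x))

# Regenerate the data in the same order as the cell above.
torch.manual_seed(7)
features = torch.randn((1, 3))
W1 = torch.randn(3, 2)
W2 = torch.randn(2, 1)
B1 = torch.randn((1, 2))
B2 = torch.randn((1, 1))

# Hand-written version: torch.mm plus the custom sigmoid.
manual = activation(torch.mm(activation(torch.mm(features, W1) + B1), W2) + B2)

# Built-in version: the @ operator plus torch.sigmoid.
builtin = torch.sigmoid(torch.sigmoid(features @ W1 + B1) @ W2 + B2)

print(manual, builtin)
print(torch.allclose(manual, builtin, atol=1e-6))
```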


# NumPy to Torch and back

To create a tensor from a NumPy array, use `torch.from_numpy()`.

To convert a tensor to a NumPy array, use the `.numpy()` method.

In [20]:
import numpy as np
a = np.random.rand(4,3)
a

array([[0.91958705, 0.10415664, 0.56259259],
       [0.78646912, 0.55737481, 0.32562772],
       [0.42913771, 0.2366727 , 0.31890369],
       [0.81015436, 0.97251678, 0.17452685]])

In [21]:
b = torch.from_numpy(a)
b

tensor([[0.9196, 0.1042, 0.5626],
        [0.7865, 0.5574, 0.3256],
        [0.4291, 0.2367, 0.3189],
        [0.8102, 0.9725, 0.1745]], dtype=torch.float64)

In [22]:
b.numpy()

array([[0.91958705, 0.10415664, 0.56259259],
       [0.78646912, 0.55737481, 0.32562772],
       [0.42913771, 0.2366727 , 0.31890369],
       [0.81015436, 0.97251678, 0.17452685]])

The NumPy array and the Torch tensor share the same memory, so changing the values of one object in place changes the other as well.

In [23]:
# Multiply PyTorch Tensor by 2, in place
b.mul_(2)

tensor([[1.8392, 0.2083, 1.1252],
        [1.5729, 1.1147, 0.6513],
        [0.8583, 0.4733, 0.6378],
        [1.6203, 1.9450, 0.3491]], dtype=torch.float64)

In [24]:
# Numpy array matches new values from Tensor
a

array([[1.8391741 , 0.20831327, 1.12518519],
       [1.57293825, 1.11474963, 0.65125543],
       [0.85827542, 0.4733454 , 0.63780737],
       [1.62030873, 1.94503356, 0.3490537 ]])

In [25]:
# Without the trailing underscore, mul returns a new tensor and leaves b (and a) unchanged
b.mul(3)

tensor([[5.5175, 0.6249, 3.3756],
        [4.7188, 3.3442, 1.9538],
        [2.5748, 1.4200, 1.9134],
        [4.8609, 5.8351, 1.0472]], dtype=torch.float64)

In [26]:
b

tensor([[1.8392, 0.2083, 1.1252],
        [1.5729, 1.1147, 0.6513],
        [0.8583, 0.4733, 0.6378],
        [1.6203, 1.9450, 0.3491]], dtype=torch.float64)
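
When the sharing is not wanted, the data can be copied instead. The sketch below contrasts `torch.from_numpy` with `torch.tensor`, which copies its input, and uses `clone` before converting back to NumPy. The array and variable names are illustrative assumptions.

```python
import numpy as np
import torch

a = np.ones((2, 2))

shared = torch.from_numpy(a)  # shares memory with a
copied = torch.tensor(a)      # copies the data out of a

shared.mul_(2)  # in place on the shared tensor: a changes too
copied.mul_(5)  # in place on the copy: a is unaffected

print(a)
print(copied)

# clone() before .numpy() gives a NumPy array that no longer shares memory.
detached = shared.clone().numpy()
detached[0, 0] = -1
print(a[0, 0])
```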

# GPU

PyTorch tensors can live on either the CPU or a GPU, and operations on them run on the device that holds them.

One of the major advantages of PyTorch is its robust acceleration on CUDA-compatible Nvidia GPUs. (“CUDA” stands for Compute Unified Device Architecture, which is Nvidia’s platform for parallel computing.)

In [27]:
if torch.cuda.is_available():
    print('We have a GPU!')
else:
    print('Sorry, CPU only.')

We have a GPU!


In [28]:
if torch.cuda.is_available():
    my_device = torch.device('cuda')
else:
    my_device = torch.device('cpu')
print('Device: {}'.format(my_device))

x = torch.rand(2, 2, device=my_device)
print(x)

Device: cuda
tensor([[0.6718, 0.1996],
        [0.5633, 0.2009]], device='cuda:0')


In [29]:
my_numbers = torch.tensor([[1, 2],[3, 4]])

# Returns a copy on the GPU; my_numbers itself stays on the CPU
my_numbers.cuda()

tensor([[1, 2],
        [3, 4]], device='cuda:0')

In [30]:
my_numbers_2 = torch.tensor([[1, 2],[3, 4]])

# .to('cuda') does the same: it returns a copy on the GPU
my_numbers_2.to('cuda')

tensor([[1, 2],
        [3, 4]], device='cuda:0')
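
The two cells above assume a GPU is present. A device-agnostic variant, sketched below with illustrative variable names, picks the device once and passes it around, so the same code also runs on a CPU-only machine. It also shows that `.to()` returns a tensor rather than moving the original, and that the operands of an operation must be on the same device.

```python
import torch

# Pick the device once; fall back to the CPU when no GPU is available.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

cpu_tensor = torch.tensor([[1, 2], [3, 4]])

# .to() returns a tensor on the target device (the same object if it is already there).
# cpu_tensor itself is not moved.
moved = cpu_tensor.to(device)
print(cpu_tensor.device)
print(moved.device)

# Create the second operand directly on the same device before combining them.
ones = torch.ones(2, 2, dtype=moved.dtype, device=device)
result = moved + ones

# Bring the result back to the CPU, e.g. before converting it to NumPy.
back_on_cpu = result.cpu()
print(back_on_cpu)
```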

In [31]:
torch.cuda.current_device()

0

In [32]:
torch.cuda.device_count()

1

In [33]:
torch.cuda.get_device_name() # defaults to the current device; here the same as torch.cuda.get_device_name(0)

'GeForce GTX 1080'