# A Random Sampling of Functions for Random Sampling in PyTorch

PyTorch is a scientific computing package for Python that is similar to NumPy. However, PyTorch is better suited to deep learning research because it can run tensor operations on the computer's GPU, which is often much faster than running them on the CPU. For my functions, I've selected items that generate tensors filled with numbers chosen in different ways (evenly spaced or random), along with one that breaks large tensors into more manageable pieces. I've explored some of the differences between these functions to determine their best use cases.

- function 1 - torch.arange()
- function 2 - torch.rand()
- function 3 - torch.chunk()
- function 4 - torch.randint_like()
- function 5 - torch.randperm()

In [6]:
# Uncomment and run the appropriate command for your operating system, if required

# Linux / Binder
# !pip install numpy torch==1.7.0+cpu torchvision==0.8.1+cpu torchaudio==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html

# Windows
!pip install numpy torch==1.7.0+cpu torchvision==0.8.1+cpu torchaudio==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html

# MacOS
# !pip install numpy torch torchvision torchaudio

Looking in links: https://download.pytorch.org/whl/torch_stable.html


In [7]:
# Import torch and other required modules
import torch

## Function 1 - torch.arange()

The torch.arange function creates a one-dimensional tensor populated with evenly spaced values from a specified range: it starts at start, advances by step, and stops before reaching end. This is useful whenever you need a tensor of a specific size whose values share the same step value, and the result can be reshaped afterwards. I will show this in the examples below.

In [100]:
x= torch.tensor([[2,5,1],
               [4,2,5],
               [9,3,6]], dtype = torch.float32)
s= x.shape
y= torch.arange(2,6.5,.5, dtype= torch.float32)
y=y.reshape(s)
print('x= ',x)
print('y= ',y)

x=  tensor([[2., 5., 1.],
        [4., 2., 5.],
        [9., 3., 6.]])
y=  tensor([[2.0000, 2.5000, 3.0000],
        [3.5000, 4.0000, 4.5000],
        [5.0000, 5.5000, 6.0000]])
torch.Size([3, 3])


Here is an example where I've created a tensor (x) by specifying its values, which also fixes its shape. I then used .arange to create a second tensor (y) with the same number of elements as the first, and reshaped it to match the shape of x. Now these two tensors can be compared and manipulated as needed.

In [56]:
x= 1
y=10
v= torch.arange(x,y,2)
c= torch.arange(x*2,y+1,2)
print(v)
print(c)

tensor([1, 3, 5, 7, 9])
tensor([ 2,  4,  6,  8, 10])


Here I've created two tensors using the .arange function that reference pre-assigned variables. This can be useful when creating tensors from variables that need to have different mathematical operations run before being put into the tensor.

In [59]:

x=torch.arange(1,10,[1,3],[2,5])
y=torch.arange([1,3],[2,5])

TypeError: arange() received an invalid combination of arguments - got (int, int, list, list), but expected one of:
 * (Number end, *, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)
 * (Number start, Number end, Number step, *, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)


Here I've shown that you cannot pre-define the shape of the tensor when calling .arange. This is because .arange does not take a shape or list argument in its parameters. Instead, the tensor first needs to be created with the required number of elements, and then shaped using the .reshape method.

torch.arange is a useful function to call whenever you need to create a tensor that contains a set of numbers in a specific range, where the values need to have an equal distance between them. It takes the parameters start, end, and step, and accepts the keyword arguments out, dtype, layout, device, pin_memory, and requires_grad. It is limited by certain factors: the number of elements is always (end - start) / step, rounded up, and the shaping must be done after the tensor is created.
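
As a quick check of the length rule, here is a minimal sketch that reuses the start, end, and step values from the first example. The variable names are my own and are only for illustration.

```python
import math
import torch

# Same values as the first example: start=2, end=6.5, step=0.5
start, end, step = 2, 6.5, 0.5
t = torch.arange(start, end, step)

# The number of elements is (end - start) / step, rounded up
expected_len = math.ceil((end - start) / step)
print(t.numel(), expected_len)  # 9 9

# Shaping happens afterwards, and the element count must match the shape
grid = t.reshape(3, 3)
print(grid.shape)  # torch.Size([3, 3])
```

Because the end value is excluded, the last element here is 6.0 rather than 6.5.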

Let's save our work using Jovian before continuing.

In [22]:
!pip install jovian --upgrade --quiet

In [23]:
import jovian

In [15]:
jovian.commit(project='01-tensor-operations')

<IPython.core.display.Javascript object>

[jovian] Attempting to save notebook..
[jovian] Updating notebook "aakashns/01-tensor-operations" on https://jovian.ai/
[jovian] Uploading notebook..
[jovian] Capturing environment..
[jovian] Committed successfully! https://jovian.ai/aakashns/01-tensor-operations


'https://jovian.ai/aakashns/01-tensor-operations'

## Function 2 - torch.rand()

The torch.rand function is similar to the torch.arange function. However, it has some key differences that I felt made it a good subject for my second function. I will explain how the two functions are used differently in the examples below.

In [60]:
x= torch.tensor([[2,5,1],
               [4,2,5],
               [9,3,6]], dtype = torch.float32)
s= x.shape
y= torch.rand(s)
print(x)
print(y)

tensor([[2., 5., 1.],
        [4., 2., 5.],
        [9., 3., 6.]])
tensor([[0.1100, 0.6432, 0.7507],
        [0.5848, 0.0851, 0.2927],
        [0.6896, 0.1217, 0.5761]])


In this example I show that using the .rand function can quickly and easily generate a tensor with the same shape as x that is populated with random floating-point numbers. It is important to note, however, that while the .arange function produces evenly spaced values in a range you choose, the .rand function draws values from a uniform distribution on the interval [0,1).

In [81]:
x=torch.rand(5, dtype=torch.float32)
y= torch.arange(2,6.5,.5, dtype= torch.float32)
x1=torch.rand(10, dtype= torch.float32)
y1= torch.arange(0,1,.1, dtype=torch.float32)
print(x)
print(y,'\n')
print(x1)
print(y1)

tensor([0.4486, 0.3708, 0.6304, 0.6606, 0.3303])
tensor([2.0000, 2.5000, 3.0000, 3.5000, 4.0000, 4.5000, 5.0000, 5.5000, 6.0000]) 

tensor([0.4823, 0.7674, 0.8437, 0.6836, 0.6946, 0.5435, 0.1046, 0.8030, 0.7254,
        0.9517])
tensor([0.0000, 0.1000, 0.2000, 0.3000, 0.4000, 0.5000, 0.6000, 0.7000, 0.8000,
        0.9000])


Here I've shown that while both functions can be used to create tensors of any size, .rand draws random values uniformly from [0,1), and .arange produces values that are evenly separated between the start and end values. Thus, which one is appropriate depends on whether you need random or ordered data, and over what range.

In [92]:
torch.rand(1, dtype= int)

RuntimeError: "check_uniform_bounds" not implemented for 'Long'

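Here I show that the dtype for .rand cannot be an int. This is because the function draws floating-point values from the interval [0,1), which includes 0 but excludes 1, so the only whole number it could ever produce is 0. As the error message indicates, the sampling is simply not implemented for the integer ('Long') dtype.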

In summary, the .rand function is useful for creating a tensor that fits the shape of another tensor and contains values drawn from a uniform distribution. This is useful, for example, for selecting initial weights as a starting point when beginning to train a model.
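
Here is a small sketch of that idea. The seed value and the target range of -0.5 to 0.5 are my own assumptions for illustration, not a recommended way to initialize weights.

```python
import torch

torch.manual_seed(0)  # fix the seed so the random values are repeatable

x = torch.tensor([[2., 5., 1.],
                  [4., 2., 5.],
                  [9., 3., 6.]])

# Random values with the same shape as x, all inside [0, 1)
w = torch.rand(x.shape)
in_unit_interval = bool(((w >= 0) & (w < 1)).all())
print(in_unit_interval)  # True

# Hypothetical target range: stretch and shift [0, 1) to [low, high)
low, high = -0.5, 0.5
scaled = low + (high - low) * w
print(scaled.shape)  # torch.Size([3, 3])
```

Scaling like this is one way to get uniform random floats outside [0,1) while still using .rand.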

## Function 3 - torch.chunk()
The torch.chunk function is useful for splitting large tensors into smaller chunks of the original, which are equal in size whenever the tensor divides evenly. This can be used to process data one piece at a time, or whenever a section of data should be analyzed separately from the larger tensor.

In [147]:
x=torch.arange(0,20, dtype= int)
x=x.reshape(2,2,5)
y= torch.chunk(x,2,1)
y1= y[0]
y2= y[1]

print('x= ',x)
print('y1= ',y1)
print('y2= ',y2)




x=  tensor([[[ 0,  1,  2,  3,  4],
         [ 5,  6,  7,  8,  9]],

        [[10, 11, 12, 13, 14],
         [15, 16, 17, 18, 19]]])
y1=  tensor([[[ 0,  1,  2,  3,  4]],

        [[10, 11, 12, 13, 14]]])
y2=  tensor([[[ 5,  6,  7,  8,  9]],

        [[15, 16, 17, 18, 19]]])


Here is an example of a chunk taken from the original tensor (x). The .chunk function takes the parameters: input (the original tensor), chunks (the number of pieces to split it into), and dim (the dimension along which to split the original). Here I split the original into two chunks along dimension 1 (the second dimension, since dimensions are counted from 0), and printed them out as separate tensors for better illustration of what the parameters do.

In [160]:
z= torch.chunk( y1,2,2)
z1=z[0]
z2=z[1]
print('z1= ',z1)
print('z2= ',z2)

z1=  tensor([[[ 0,  1,  2]],

        [[10, 11, 12]]])
z2=  tensor([[[ 3,  4]],

        [[13, 14]]])


Here I am showing two things that I felt were important: 1. Each chunk is itself a tensor, so it can be passed back to .chunk, 2. It is possible to split into chunks of unequal size. The first is useful if it is necessary to split a specific chunk into even smaller pieces. The second shows that the chunks may not be equal if the size along the dimension is not divisible by the number of chunks selected. In this case every chunk except the last has the rounded-up size (here 3 of the 5 columns), and the LAST chunk contains whatever remains (here 2).
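
To make the sizing rule concrete, here is a minimal sketch on a small one-dimensional tensor of my own choosing. It also shows two details from the PyTorch documentation that are easy to miss: fewer chunks than requested may be returned, and each chunk is a view of the input.

```python
import torch

t = torch.arange(10)

# 10 elements into 3 chunks: each chunk holds ceil(10 / 3) = 4 elements,
# and the last chunk holds the 2 that remain.
pieces = torch.chunk(t, 3)
sizes = [p.numel() for p in pieces]
print(sizes)  # [4, 4, 2]

# Fewer chunks than requested can come back: 5 elements with chunks=4
# gives a chunk size of ceil(5 / 4) = 2, so only 3 chunks are needed.
few = torch.chunk(torch.arange(5), 4)
print(len(few))  # 3

# Chunks are views, so writing into one changes the original tensor.
pieces[0][0] = 99
print(t[0].item())  # 99
```

If an independent copy is needed, calling .clone() on a chunk avoids changing the original.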

In [161]:
a= torch.chunk(y,2,2)

TypeError: chunk(): argument 'input' (position 1) must be Tensor, not tuple

I mentioned in the previous example that each chunk is a tensor. Here I am illustrating that the result of the chunk function as a whole is a tuple of tensors, not a tensor, so it cannot be passed directly back to .chunk. The result from the first example (y) is a tuple whose elements must be accessed by indexing (y[0], y[1]) or unpacking before they can be used as tensors.
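
A short sketch of working with that tuple, rebuilding x and y the same way as in the first example. The names first, second, and halves are mine.

```python
import torch

x = torch.arange(0, 20).reshape(2, 2, 5)
y = torch.chunk(x, 2, dim=1)

print(type(y))  # <class 'tuple'>

# Unpack the tuple to get the individual tensors
first, second = y

# To chunk again, apply .chunk to each tensor inside the tuple
halves = [torch.chunk(part, 2, dim=2) for part in y]
print(halves[0][0].shape)  # torch.Size([2, 1, 3])
print(halves[0][1].shape)  # torch.Size([2, 1, 2])
```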

In summary, the chunk function is useful to call when a tensor is too large to be used effectively in its initial state. Chunking it into smaller tensors lets you work on one piece at a time, and is useful for breaking data into specific pieces without changing the original tensor.

## Function 4 - torch.randint_like()

The .randint_like function generates a new tensor that has the same shape as a given tensor, but filled with random integers drawn from a given range that includes the low number and excludes the high number. The randint_like function accepts parameters for input, low, and high.

In [164]:
y=torch.randint_like(x,20,30)
print('x= ',x)
print('y= ',y)

x=  tensor([[[ 0,  1,  2,  3,  4],
         [ 5,  6,  7,  8,  9]],

        [[10, 11, 12, 13, 14],
         [15, 16, 17, 18, 19]]])
y=  tensor([[[25, 21, 23, 25, 25],
         [20, 27, 21, 20, 23]],

        [[21, 25, 29, 20, 24],
         [25, 29, 28, 24, 25]]])


Here I've generated a tensor that has the same shape as x and is populated with integers selected randomly from 20 to 29 (the upper bound of 30 is excluded). I'll also point out that the number 25 appears 6 times in this example, while 22 and 26 were not selected at all. This is not a quirk of the function: each element is drawn independently and uniformly, so values can repeat, and with only 20 draws from 10 possible values an uneven spread like this is ordinary chance. Over a much larger sample the counts even out.
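
To see how the counts behave with more draws, here is a rough sketch that samples 100,000 values from the same range and tallies them. The sample size, the seed, and the template tensor are my own choices for illustration.

```python
import torch

torch.manual_seed(0)  # fix the seed so the run is repeatable

# A template tensor whose only job is to supply the shape and dtype
template = torch.zeros(100_000, dtype=torch.int64)
draws = torch.randint_like(template, 20, 30)

# Shift values to 0..9 and count how often each one appears
counts = torch.bincount(draws - 20, minlength=10)
print(counts)

# Every value from 20 to 29 should appear roughly 10,000 times
print(draws.min().item(), draws.max().item())  # 20 29
```

With a sample this large each count should land close to one tenth of the total, unlike the 20-element example above.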

In [176]:
z= torch.randint_like(x,10,11,dtype=torch.float32)
print('x= ',x)
print('z= ',z)

x=  tensor([[[ 0,  1,  2,  3,  4],
         [ 5,  6,  7,  8,  9]],

        [[10, 11, 12, 13, 14],
         [15, 16, 17, 18, 19]]])
z=  tensor([[[10., 10., 10., 10., 10.],
         [10., 10., 10., 10., 10.]],

        [[10., 10., 10., 10., 10.],
         [10., 10., 10., 10., 10.]]])


This example shows that the size of the range of numbers does not have an effect on its ability to generate a tensor of the same shape as the input, as long as the range is at least 1. It's worth noting, however, that using this function with a small range such as 1 or 2 will not yield a very diverse sample.

In [178]:
z= torch.randint_like(x, 1., 5.5,dtype=torch.float32)

TypeError: randint_like() received an invalid combination of arguments - got (Tensor, float, float, dtype=torch.dtype), but expected one of:
 * (Tensor input, int high, *, torch.memory_format memory_format, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)
 * (Tensor input, int low, int high, *, torch.memory_format memory_format, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)


This example illustrates that while the function does accept the dtype argument, the low and high parameters must be entered as integers, with high greater than low. Entering these values as floats returns an error even when the dtype is explicitly set to float.

The randint_like function works differently from the other functions I've looked at for generating tensors with random values. It can generate a tensor with a specific shape, like the .rand function, and it can also accept any whole number range, like the .arange function. This makes it a useful function in cases where you need values from a range larger than [0,1) that are also randomly selected instead of stepped.

## Function 5 - torch.randperm()

Finally, the .randperm function. This function generates a tensor with a random permutation of the integers from 0 to n-1. It is similar in nature to the .arange function since the output will only have one dimension. The key difference, however, is that it only takes a parameter for the upper bound of the range (n), and the values come back in random order.

In [184]:
torch.randperm(10,dtype=torch.float32)

tensor([0., 2., 9., 4., 1., 6., 3., 5., 7., 8.])

In this example, you can see that for n=10 the tensor generated contains each number in the given range; however, the order in which they appear is selected randomly, and each integer is only selected once. This means that when using this function, the length of the output will always be equal to n, and it will always contain the numbers from 0 to n-1.
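
One way to confirm this property is to sort the output and compare it with .arange, as in this minimal sketch (the variable names are mine):

```python
import torch

n = 10
perm = torch.randperm(n)

# Sorting a permutation of 0..n-1 must give back exactly 0..n-1
sorted_perm, _ = torch.sort(perm)
is_permutation = torch.equal(sorted_perm, torch.arange(n))
print(is_permutation)  # True

# The length always equals n
print(perm.numel())  # 10
```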

In [198]:
n=torch.numel(x)
print(n)
rp= torch.randperm(n)
print(rp)
rp= rp.reshape(x.shape)
print(rp)

20
torch.Size([2, 2, 5])
tensor([ 2, 11,  6, 19, 18, 16,  5, 17, 15,  0, 12,  9,  7,  1, 10,  4,  3, 14,
         8, 13])
tensor([[[ 2, 11,  6, 19, 18],
         [16,  5, 17, 15,  0]],

        [[12,  9,  7,  1, 10],
         [ 4,  3, 14,  8, 13]]])


In this example I show how the .randperm function can be used to generate a tensor that is of the same shape as an existing tensor.

In [205]:
torch.randperm(-3)

RuntimeError: Trying to create tensor with negative dimension -3: [-3]

This example shows the limits of this function. It only accepts a parameter for the upper limit of the range (n), and n must be a non-negative whole number; a negative value raises the error above.

This function is useful when creating a tensor that requires non-repeating values, randomly ordered, from the range 0 to n-1.
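
A common place where this comes up is shuffling: because the output contains every index exactly once, it can be used to reorder the rows of another tensor without losing or repeating any of them. The sketch below uses a small made-up tensor named data.

```python
import torch

torch.manual_seed(0)  # fix the seed so the run is repeatable

# A made-up tensor with 4 rows and 3 columns
data = torch.arange(12).reshape(4, 3)

# One random index per row, each used exactly once
order = torch.randperm(data.shape[0])

# Indexing with the permutation reorders the rows
shuffled = data[order]
print(order)
print(shuffled)
```

Each row of data still appears exactly once in shuffled, only in a different position.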

## Conclusion

In summary, the PyTorch computing package contains a vast array of functions, including many different functions for generating random samples of data. Each of these has unique parameters and methods that allow random samples to be generated in specific ways, and I feel as though I've only scratched the surface with my examples. Having now completed this assignment, and spent significant time looking at the subtle differences in these functions, it is my conclusion that PyTorch is a valuable tool for deep learning programming. There is no universal tool for generating random samples of data. Instead we are provided with a wide variety of tools that make the task of programming much more efficient than having to hard code everything individually.

## Reference Links
* Official documentation for tensor operations: https://pytorch.org/docs/stable/torch.html
* Pytorch: https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html
* torch.arange(): https://pytorch.org/docs/stable/generated/torch.arange.html#torch.arange
* torch.rand(): https://pytorch.org/docs/stable/generated/torch.rand.html#torch.rand
* torch.chunk(): https://pytorch.org/docs/stable/generated/torch.chunk.html#torch.chunk
* torch.randint_like(): https://pytorch.org/docs/stable/generated/torch.randint_like.html#torch.randint_like
* torch.randperm(): https://pytorch.org/docs/stable/generated/torch.randperm.html#torch.randperm

