# Special Arrays in Numpy

Numpy has a huge number of functions and methods, and several of them make it convenient to create arrays that are frequently encountered in Machine Learning/Deep Learning problems.

So, without any further ado, let's dive straight into discussing these special arrays, how to create them and where they're useful. Please note that I will be using the terms array and matrix interchangeably in the subsequent part of the post.


## The all-zeros array
Just as the name suggests, this array contains nothing but zeros. It can be of any shape the user provides, and the datatype can optionally be specified as well.

```python
import numpy as np
x = np.zeros(shape = (1, 2), dtype = np.int64)
```

## The all-ones array
This is an array where each and every element is one. Similar to the all-zeros array, this could be built using the np.ones function by specifying the shape and optionally the datatype of the elements in the array.

```python
import numpy as np
x = np.ones(shape = (4, 4), dtype = np.float16)
```

Both the all-ones and all-zeros arrays are particularly useful for setting flags, for one-hot encoding the label/target variable in Machine Learning applications, and as placeholders to be filled in with values computed by later operations.
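As a small illustration of the placeholder and one-hot encoding uses, here is a minimal sketch. The labels and the number of classes are made-up values chosen for the example, not something taken from a real dataset.

```python
import numpy as np

# Hypothetical integer class labels for five samples and three classes
labels = np.array([0, 2, 1, 2, 0])
num_classes = 3

# Start from an all-zeros placeholder with one row per sample
one_hot = np.zeros(shape=(labels.size, num_classes), dtype=np.int64)

# Put a single 1 in each row, in the column given by that sample's label
one_hot[np.arange(labels.size), labels] = 1

print(one_hot)
```

Each row ends up with exactly one 1, and the position of that 1 recovers the original label.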

## The Identity Matrix
In the world of matrices, the identity matrix plays a big role. It is a matrix having ones on the diagonal and zeros elsewhere. It's handy in computing inverses, in representing the identity mapping in ResNets and in quite a lot of other places.

Strictly speaking, the identity matrix is square. Numpy's np.eye function can also build a rectangular matrix with ones on the main diagonal (the diagonal elements are the ones accessed using the same index across all the dimensions). Unlike np.ones and np.zeros, np.eye does not take a shape tuple: you pass the number of rows N, optionally the number of columns M (which defaults to N), and optionally the dtype.

```python
import numpy as np
x = np.eye(3, dtype = np.int16)  # 3 x 3 identity matrix
```
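To make the role of the identity matrix a little more concrete, the sketch below checks two of its basic properties and builds a rectangular variant. The matrix `a` is an arbitrary invertible example picked for illustration.

```python
import numpy as np

# Hypothetical 3 x 3 matrix, chosen so that it is invertible
a = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 3.0],
              [1.0, 0.0, 1.0]])

identity = np.eye(3)

# Multiplying by the identity leaves the matrix unchanged
print(np.allclose(a @ identity, a))

# A matrix multiplied by its inverse gives back the identity
print(np.allclose(a @ np.linalg.inv(a), identity))

# Rectangular variant: 2 rows, 4 columns, ones on the main diagonal
rect = np.eye(2, 4, dtype=np.int16)
print(rect)
```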

## The Diagonal Matrix
A matrix whose off-diagonal elements are all zero is called a diagonal matrix. In other words, only the diagonal elements are permitted to be non-zero. These matrices are quite common in techniques like Singular Value Decomposition and eigenvalue computation, which can be used in Machine Learning for building recommendation systems. For that reason, the good people at numpy have provided us with a function, np.diag, which can convert a matrix into a vector of its diagonal elements and vice versa. Let's see how.

```python
import numpy as np
x = np.eye(3, dtype = np.int16)

# Going from diagonal matrix to a vector representation
y = np.diag(x)

# Going from vector representation to a diagonal matrix
z = np.diag(y)
```

In the first step, a 3 x 3 array was converted into a 1-D array of length 3, i.e. shape (3,). In the second step, that 1-D array was converted back into a 3 x 3 array.

Please note that we can use np.diag to dig out the diagonal elements irrespective of whether the source matrix is a diagonal matrix.

```python
import numpy as np
x = np.array([[1, 2, 3],
              [3, 2, 1],
              [2, 1, 2]], dtype = np.int16)
# x is not a diagonal matrix, yet np.diag still returns its diagonal: [1, 2, 2]
z = np.diag(x)
```
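Since Singular Value Decomposition was mentioned as a place where diagonal matrices show up, here is a minimal sketch of how np.diag fits in. The 2 x 2 matrix is an arbitrary example, and np.linalg.svd is used here only to produce a vector of singular values to work with.

```python
import numpy as np

# Hypothetical small matrix to decompose
a = np.array([[3.0, 1.0],
              [1.0, 3.0]])

# np.linalg.svd returns the singular values as a 1-D vector, not a matrix
u, s, vt = np.linalg.svd(a)

# np.diag turns that vector into the diagonal matrix used in the factorization
sigma = np.diag(s)

# Multiplying the three factors back together recovers the original matrix
reconstructed = u @ sigma @ vt
print(np.allclose(reconstructed, a))
```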

## The linearly spaced array
In many computer-science and machine-learning applications we want an array that runs incrementally from a particular start value to a stop value. A loop that is supposed to run for a particular number of steps is one example.

Numpy provides a function to do just that. All you have to do is specify the start, the end and how many items you want in total, with both endpoints included in that count.

```python
import numpy as np
x = np.linspace(1, 100, num = 100, dtype = np.int16)
```

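On the other hand, if you know the step size along with the start and end, you can work out the number of points and use the same function. Since linspace includes both endpoints, the count is the number of steps plus one.

```python
import numpy as np
start, end, step_size = 0, 10, 2  # example values, pick your own
# Both endpoints are included, so the count is the number of steps plus one
num = int(round((end - start) / step_size)) + 1
x = np.linspace(start, end, num = num, dtype = np.float64)  # any numpy dtype works here
```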

This is typically used for looping over a range of values, for example over every odd or even number or over values spaced at some fixed interval. It comes in handy during the data preprocessing stage of an ML solution cycle.
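When the step size is what you know, np.arange is another option, since it takes the step directly. The sketch below compares the two functions on made-up values; note that np.arange leaves out the stop value while np.linspace includes it by default.

```python
import numpy as np

start, stop, step = 0, 10, 2  # hypothetical values

# np.arange takes the step directly and excludes the stop value
a = np.arange(start, stop, step)

# np.linspace takes the number of points and includes the stop value by default
num = (stop - start) // step + 1
b = np.linspace(start, stop, num=num)

# Every odd number below 10, using a start of 1 and a step of 2
odds = np.arange(1, 10, 2)

print(a)
print(b)
print(odds)
```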

## The logspaced array

Just as we have the linearly spaced array, we can also have arrays that are spaced on a log scale. This is particularly useful when dealing with quantities that vary on a logarithmic scale. As with linspace, logspace takes a start, an end, the number of elements and optionally a datatype. The difference is that start and end are exponents: with the default base of 10, the array runs from 10^start to 10^end.

```python
import numpy as np
# Values run from 0.001 to 1000; an integer dtype would truncate everything below 1 to 0
x = np.logspace(-3, 3, num = 7, dtype = np.float64)
```

We can make use of this feature when we are doing hyperparameter tuning for learning rate in deep neural networks, or when we have to deal with features that grow in a geometric progression and so on. 
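As a sketch of the hyperparameter tuning use case, the snippet below builds a small grid of candidate learning rates. The range of 1e-5 to 1e-1 is an assumed example, not a recommendation. It also shows that logspace is equivalent to raising 10 to a linearly spaced set of exponents.

```python
import numpy as np

# Five candidate learning rates; the first two arguments are exponents of 10
learning_rates = np.logspace(-5, -1, num=5)

# The same grid written as 10 raised to linearly spaced exponents
same_grid = 10.0 ** np.linspace(-5, -1, num=5)

print(learning_rates)
print(np.allclose(learning_rates, same_grid))
```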

## Random Arrays
Machine learning involves statistics to a considerable extent, so the need for random numbers arises inevitably. Although a deterministic program cannot generate truly random numbers, we can simulate them using a pseudo-random number generator, and numpy gives us one in the subpackage random. We will cover some aspects of this subpackage pertaining to arrays here.

### Reproducibility in randomness
Since we're simulating randomness, we can make sure that the random numbers I generate match the ones you generate when you run the code provided here (well, that's why it's pseudo-random and not random). In order to do so, you just have to set a seed. You can do it as follows:

```python
np.random.seed(10)
```

The argument in the above function could be any integer. As long as it's the same and the numpy and python versions are the same, running any of the following commands after setting the seed will ensure that you get the same random numbers. 
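A quick way to convince yourself of this is to draw numbers twice with the same seed and compare them. This is a minimal sketch; the seed value 10 and the array length are arbitrary.

```python
import numpy as np

np.random.seed(10)
first = np.random.randn(3)

# Resetting the same seed restarts the sequence from the beginning
np.random.seed(10)
second = np.random.randn(3)

# Without resetting the seed, the generator simply continues its sequence
third = np.random.randn(3)

print(np.array_equal(first, second))
print(np.array_equal(second, third))
```

The first comparison shows identical arrays, while the third draw differs because the seed was not reset before it.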

### Random Normal Array
The normal distribution, colloquially known as the bell curve, is a distribution which is naturally encountered in a lot of problems, places and situations. Given that, simulation of this distribution becomes extremely important. Numpy provides a function, np.random.randn, which draws samples from the standard normal distribution (mean 0, standard deviation 1).

```python
import numpy as np
np.random.seed(10)
y = np.random.randn(5, 5)
```

You can specify any number of dimensions to build an array of numbers sampled from a normal distribution. A very common use is initializing the weights of a neural network. It can also be used in simulations which depend on the generation of random numbers, such as Monte Carlo simulation.
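The sketch below shows two common ways of using these draws: scaling them down for a small weight matrix, and shifting and scaling them to get a normal distribution with a chosen mean and standard deviation. The layer sizes, the 0.01 scale, and the values of `mu` and `sigma` are all assumptions made for the example.

```python
import numpy as np

np.random.seed(10)

# Hypothetical layer sizes for a small weight matrix
n_in, n_out = 4, 3

# Standard normal draws scaled down so the initial weights start small
weights = 0.01 * np.random.randn(n_in, n_out)

# Shifting and scaling gives a normal distribution with a chosen mean and std
mu, sigma = 5.0, 2.0
samples = mu + sigma * np.random.randn(100_000)

print(weights.shape)
print(samples.mean(), samples.std())
```

With this many samples, the printed mean and standard deviation should land close to the chosen `mu` and `sigma`.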

### Random Uniform Array
Another commonly used distribution is the uniform distribution. It is a distribution which weighs every outcome equally, so it comes in handy for cases like a fair coin-flip where every outcome has the same probability. The np.random.random function draws samples uniformly from the interval [0, 1).

```python
import numpy as np
np.random.seed(10)
np.random.random((5, 5))
```

Here you specify the shape of the array as a tuple. Note that this is not the same as the previous function: in np.random.randn each dimension is passed as a separate argument, whereas np.random.random takes the whole shape as a single tuple.

This could be used to simulate coin-flips or other events which have known probabilities.
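A coin-flip simulation along these lines can be sketched by comparing uniform draws against a threshold. The probability of heads and the number of flips below are made-up values.

```python
import numpy as np

np.random.seed(10)

p_heads = 0.5     # hypothetical probability of heads
n_flips = 10_000  # hypothetical number of flips

# Uniform draws in [0, 1); a draw below p_heads counts as heads
draws = np.random.random(n_flips)
heads = draws < p_heads

# The fraction of heads should be close to p_heads
print(heads.mean())
```

Changing `p_heads` to another value between 0 and 1 turns this into a biased coin.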

## Examples with output

```python
import numpy as np
```

```python
# Matrix of all zeros
np.zeros(shape = (2,2), dtype = np.int64)
```

```
array([[0, 0],
       [0, 0]])
```

```python
# Matrix of all ones
np.ones(shape = (3,3), dtype = np.float32)
```

```
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]], dtype=float32)
```

```python
# Identity Matrix
np.eye(N = 3, M = 5)
```

```
array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.]])
```

```python
# Diagonal Matrix
np.diag([[1,2,3], [1,2,3], [3,4,5]])
```

```
array([1, 2, 5])
```

```python
# Random numbers matrix
np.random.randn(4, 4)
```

```
array([[-0.29845894,  1.73889537, -0.28038579,  0.79339141],
       [-1.48163109, -0.32963367,  0.59311172, -3.17415521],
       [-0.97776768,  2.26475331, -1.6645612 , -0.03920712],
       [-0.31110351,  0.77067367,  0.00610254,  0.31099964]])
```

```python
# Draw from a random normal distribution
# Ensure reproducibility
np.random.seed(10)
np.random.randn(4, 4)
```

```
array([[ 1.3315865 ,  0.71527897, -1.54540029, -0.00838385],
       [ 0.62133597, -0.72008556,  0.26551159,  0.10854853],
       [ 0.00429143, -0.17460021,  0.43302619,  1.20303737],
       [-0.96506567,  1.02827408,  0.22863013,  0.44513761]])
```

```python
# Draw from a random uniform distribution
np.random.seed(10)
np.random.random((5, 4))
```

```
array([[0.77132064, 0.02075195, 0.63364823, 0.74880388],
       [0.49850701, 0.22479665, 0.19806286, 0.76053071],
       [0.16911084, 0.08833981, 0.68535982, 0.95339335],
       [0.00394827, 0.51219226, 0.81262096, 0.61252607],
       [0.72175532, 0.29187607, 0.91777412, 0.71457578]])
```

```python
# Linearly spaced vector
np.linspace(start = 10, stop = 100, num = 91, dtype = np.int64)
```

```
array([ 10,  11,  12,  13,  14,  15,  16,  17,  18,  19,  20,  21,  22,
        23,  24,  25,  26,  27,  28,  29,  30,  31,  32,  33,  34,  35,
        36,  37,  38,  39,  40,  41,  42,  43,  44,  45,  46,  47,  48,
        49,  50,  51,  52,  53,  54,  55,  56,  57,  58,  59,  60,  61,
        62,  63,  64,  65,  66,  67,  68,  69,  70,  71,  72,  73,  74,
        75,  76,  77,  78,  79,  80,  81,  82,  83,  84,  85,  86,  87,
        88,  89,  90,  91,  92,  93,  94,  95,  96,  97,  98,  99, 100])
```

```python
# Logarithmically spaced vector
np.logspace(start = -9, stop = 10, num = 20, dtype = np.float32)
```

```
array([1.e-09, 1.e-08, 1.e-07, 1.e-06, 1.e-05, 1.e-04, 1.e-03, 1.e-02,
       1.e-01, 1.e+00, 1.e+01, 1.e+02, 1.e+03, 1.e+04, 1.e+05, 1.e+06,
       1.e+07, 1.e+08, 1.e+09, 1.e+10], dtype=float32)
```