In [8]:
import numpy as np

# This produces a rank 1 array: a 1-dimensional array with a
# single axis. It is neither a row vector nor a column vector,
# which makes it easy to misuse in linear algebra code.
a = np.random.randn(5)
print(a)

[ 0.42396802  0.22705757 -0.02207982 -0.89002707 -0.01692857]


In [9]:
# Its shape is (5,): one axis only, so it is neither (5, 1) nor (1, 5)
print(a.shape)

(5,)


In [11]:
# It can be useful to add assert statements to the code to
# ensure that you are not accidentally creating or operating
# on rank 1 arrays. This assertion fails because a has shape (5,).
assert a.shape == (5, 1)


AssertionError: 

In [12]:
# Ideally you avoid creating or using rank 1 arrays at all,
# but they can always be reshaped so that they behave as
# expected, here as a column vector.
a = a.reshape(5, 1)
print(a)
print(a.shape)
assert a.shape == (5, 1)

[[ 0.42396802]
 [ 0.22705757]
 [-0.02207982]
 [-0.89002707]
 [-0.01692857]]
(5, 1)


## What the heck is a rank 1 array? 

A rank 1 array is a 1-dimensional array: it has a single axis, so it is neither a row vector nor a column vector, even though it is often treated as one. Rank 1 arrays are valid NumPy objects, but they are not a recommended way to represent vectors in linear algebra code.

In the code above, np.random.randn(5) creates a rank 1 array with shape (5,). This means it has only one axis, and that axis has 5 elements. In other words, it is a 1-dimensional array with 5 elements.

On the other hand, np.random.randn(5, 1) creates a 2-dimensional array with shape (5, 1). This means it has two axes - the first axis has 5 elements and the second axis has 1 element. In other words, it is a 2-dimensional array with 5 rows and 1 column.
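
One way to see the difference is to inspect each array's `ndim` and `shape`. This is a minimal sketch; the names `rank1` and `column` are illustrative and do not appear in the cells above.

```python
import numpy as np

rank1 = np.random.randn(5)      # one axis with 5 elements
column = np.random.randn(5, 1)  # two axes: 5 rows, 1 column

print(rank1.ndim, rank1.shape)    # 1 (5,)
print(column.ndim, column.shape)  # 2 (5, 1)
```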

The main difference between the two is that rank 1 arrays behave differently in operations involving matrix multiplication or transposition. For example, transposing a rank 1 array returns an array with the same shape, (5,), whereas transposing a 2-dimensional array swaps its rows and columns, so a (5, 1) column vector becomes a (1, 5) row vector.
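
The behavior described above can be checked directly. The sketch below uses illustrative names (`rank1`, `column`, `inner`, `outer`) and builds the column vector from the same values so the two results can be compared.

```python
import numpy as np

rank1 = np.random.randn(5)    # shape (5,)
column = rank1.reshape(5, 1)  # same values as a (5, 1) column vector

# Transposing a 1-D array leaves the shape unchanged: there is only one axis.
print(rank1.T.shape)   # (5,)
# Transposing a 2-D array swaps its two axes.
print(column.T.shape)  # (1, 5)

# With rank 1 inputs, np.dot computes an inner product and returns a scalar.
inner = np.dot(rank1, rank1.T)
# With a column vector and its transpose, np.dot returns a 5x5 outer product.
outer = np.dot(column, column.T)
print(np.ndim(inner), outer.shape)  # 0 (5, 5)
```

The scalar from the rank 1 case equals the sum of the diagonal of the 5x5 matrix, which is why silently getting one when you expected the other can be hard to spot.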

It is generally recommended to avoid using rank 1 arrays and instead use 2-dimensional arrays with a shape of (n, 1) or (1, n) depending on whether you want a column or a row vector, respectively.
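
If a rank 1 array already exists, it can be converted to an explicit column or row vector. This is a small sketch with illustrative names (`v`, `col`, `row`); `reshape(-1, 1)` and `np.newaxis` are standard NumPy, but which form reads best is a matter of taste.

```python
import numpy as np

v = np.random.randn(5)  # rank 1 array, shape (5,)

col = v.reshape(-1, 1)  # column vector; -1 lets NumPy infer the length
row = v[np.newaxis, :]  # row vector; np.newaxis adds a leading axis

print(col.shape)  # (5, 1)
print(row.shape)  # (1, 5)

# The values are unchanged; only the shape differs.
assert np.array_equal(col.ravel(), v)
assert np.array_equal(row.ravel(), v)
```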

In [None]:
# Start again from a fresh rank 1 array; its transpose has the same shape
a = np.random.randn(5)
print(a.T)

[0.64413    0.82747819 0.70280757 0.37597963 1.17046154]


In [None]:
print(np.dot(a, a.T))

3.104902999059224


An easy way to avoid these errors is to explicitly create the array with a shape of (5, 1), or (1, 5) for a row vector, using the following code:

In [None]:
# instead create a column vector
a = np.random.randn(5,1)
# or a row vector
# a = np.random.randn(1,5)
print(a)
print(a.shape)
print(np.dot(a, a.T))

[[-0.87759362]
 [ 0.40464134]
 [-2.35669336]
 [ 0.48296511]
 [ 1.83561233]]
(5, 1)
[[ 0.77017056 -0.35511066  2.06821905 -0.4238471  -1.61092167]
 [-0.35511066  0.16373461 -0.95361555  0.19542765  0.74276463]
 [ 2.06821905 -0.95361555  5.55400359 -1.13820067 -4.3259754 ]
 [-0.4238471   0.19542765 -1.13820067  0.2332553   0.88653671]
 [-1.61092167  0.74276463 -4.3259754   0.88653671  3.36947264]]
