## Generating Data

In [1]:
import numpy as np

In [2]:
np.zeros((2,3))

array([[0., 0., 0.],
       [0., 0., 0.]])

In [3]:
np.ones((2,3))

array([[1., 1., 1.],
       [1., 1., 1.]])

In [4]:
10 * np.ones((2,3))

array([[10., 10., 10.],
       [10., 10., 10.]])
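
A constant-filled array like the one above can also be built in one step. The following is a minimal sketch using `np.full`; the variable names are illustrative and not part of the original notebook.

```python
import numpy as np

# Scale an array of ones, as in the cell above
scaled_ones = 10 * np.ones((2, 3))

# Build the same array directly; passing 10.0 keeps the dtype as float64
filled = np.full((2, 3), 10.0)

# Both approaches give the same values
print(np.array_equal(scaled_ones, filled))
```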

Creating an identity matrix

In [5]:
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

Generating random numbers

In [6]:
np.random.random()

0.8568936863807969

Generating a random matrix of shape (2, 3), with values drawn uniformly from [0, 1)

In [10]:
np.random.random((2,3))

array([[0.78150128, 0.2826994 , 0.73975247],
       [0.80121204, 0.29095456, 0.45674825]])
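
The values shown above change on every run. If reproducible draws are needed, one option is a seeded generator from `np.random.default_rng`. The sketch below assumes an arbitrary seed of 42; the names `rng_a`, `rng_b`, `sample_a`, and `sample_b` are illustrative.

```python
import numpy as np

# Two generators created with the same (arbitrary) seed
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

# Uniform samples in [0, 1) with shape (2, 3)
sample_a = rng_a.random((2, 3))
sample_b = rng_b.random((2, 3))

# Same seed, same sequence of draws
print(np.array_equal(sample_a, sample_b))
```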

In [8]:
np.random.randn(2,3)

array([[-0.08655322, -0.95905597,  1.96172437],
       [-1.60697744, -0.41215236,  1.78510525]])
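
`np.random.randn` draws from the standard normal distribution (mean 0, variance 1) and takes each dimension as a separate argument rather than as a tuple. To sample from a normal distribution with a different mean and standard deviation, the draws can be scaled and shifted. A minimal sketch, where `mu`, `sigma`, and the seed are illustrative values:

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only so the run is repeatable

mu, sigma = 5.0, 2.0  # target mean and standard deviation (illustrative)

# Scale by sigma, then shift by mu
samples = mu + sigma * np.random.randn(100000)

# The sample statistics should land close to mu and sigma
print(samples.mean(), samples.std())
```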

In [11]:
R = np.random.randn(10000)

In [12]:
R

array([2.2140403 , 0.38186676, 1.4190944 , ..., 0.47257351, 0.97462881,
       0.52361471])

In [13]:
R.mean()

-0.0056313749358091044

In [14]:
np.mean(R)

-0.0056313749358091044

In [15]:
R.var()

1.0227199816082317

In [16]:
R.std()

1.011296188862705
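
The standard deviation above is the square root of the variance. Both methods divide by N by default (`ddof=0`); passing `ddof=1` divides by N - 1 instead. A short sketch of both facts, using a fresh sample and an arbitrary seed:

```python
import numpy as np

np.random.seed(0)  # arbitrary seed
R = np.random.randn(10000)

# std() is the square root of var()
print(np.isclose(R.std() ** 2, R.var()))

# ddof=0 (default) divides by N, ddof=1 divides by N - 1
pop_var = R.var()
sample_var = R.var(ddof=1)
n = len(R)
print(np.isclose(sample_var, pop_var * n / (n - 1)))
```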

In [17]:
R = np.random.randn(10000, 3)

Mean of each column

In [19]:
R.mean(axis=0)

array([-0.00818425, -0.00806607,  0.01461231])

Mean of each row

In [20]:
R.mean(axis=1)

array([ 0.43866802,  0.03961058,  0.03290165, ...,  0.39272455,
       -1.02836775,  0.42267309])

In [23]:
R.mean(axis=1).shape

(10000,)
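
The `axis` argument names the dimension that is collapsed: `axis=0` averages down the rows and leaves one value per column, while `axis=1` averages across the columns and leaves one value per row. The sketch below checks the resulting shapes and shows `keepdims=True`, which keeps the collapsed dimension with length 1. Variable names are illustrative.

```python
import numpy as np

R = np.random.randn(10000, 3)

col_means = R.mean(axis=0)                    # one mean per column
row_means = R.mean(axis=1)                    # one mean per row
row_means_2d = R.mean(axis=1, keepdims=True)  # collapsed axis kept as size 1

print(col_means.shape, row_means.shape, row_means_2d.shape)

# Broadcasting subtracts each column's mean from that column
centered = R - col_means
print(np.allclose(centered.mean(axis=0), 0.0))
```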

Note: We have 10000 observations (rows) and 3 measurements per observation (columns), so each row is a vector observation. For vector-valued data, the analog of the variance is the covariance matrix.

In [25]:
np.cov(R)

array([[ 1.76984661,  1.26393198,  0.62379814, ..., -0.39276889,
         0.04103267,  1.97238393],
       [ 1.26393198,  1.09423511,  0.15746716, ..., -0.13570314,
        -0.38038624,  1.40161507],
       [ 0.62379814,  0.15746716,  0.65281373, ..., -0.35608783,
         0.63031285,  0.70564429],
       ...,
       [-0.39276889, -0.13570314, -0.35608783, ...,  0.1965828 ,
        -0.31870652, -0.44297498],
       [ 0.04103267, -0.38038624,  0.63031285, ..., -0.31870652,
         0.8769679 ,  0.06060729],
       [ 1.97238393,  1.40161507,  0.70564429, ..., -0.44297498,
         0.06060729,  2.1983519 ]])

In [26]:
np.cov(R).shape

(10000, 10000)

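Note: By default, np.cov treats each row as a variable and each column as a vector observation, which is why the result above is 10000 x 10000 rather than 3 x 3. Most of the rest of the numerical Python stack, such as scikit-learn, TensorFlow, and PyTorch, treats each row as a sample observation instead.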

Transposing R before computing the covariance

In [28]:
np.cov(R.T)

array([[ 1.01442662,  0.00159723, -0.00321485],
       [ 0.00159723,  0.98168723, -0.01032132],
       [-0.00321485, -0.01032132,  1.00903704]])

In [29]:
R

array([[-0.18536952, -0.46495274,  1.96632632],
       [-0.86789891, -0.19696429,  1.18369496],
       [ 0.50721947, -0.90001633,  0.49150181],
       ...,
       [ 0.18219137,  0.90214412,  0.09383815],
       [-0.05527781, -1.92331189, -1.10651357],
       [-0.25600448, -0.59920061,  2.12322438]])

In [30]:
np.cov(R, rowvar=False)

array([[ 1.01442662,  0.00159723, -0.00321485],
       [ 0.00159723,  0.98168723, -0.01032132],
       [-0.00321485, -0.01032132,  1.00903704]])
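
To make the two calls above concrete, the covariance matrix can also be computed by hand: center each column, then form the matrix product and divide by N - 1, which matches the default normalization of `np.cov`. This is a sketch on a fresh sample with an arbitrary seed; `centered` and `manual_cov` are illustrative names.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed
R = np.random.randn(10000, 3)
n = R.shape[0]

# Subtract each column's mean, then form the 3 x 3 covariance matrix
centered = R - R.mean(axis=0)
manual_cov = centered.T @ centered / (n - 1)

# Matches np.cov when rows are treated as observations
print(np.allclose(manual_cov, np.cov(R, rowvar=False)))

# Transposing R is equivalent to passing rowvar=False
print(np.allclose(np.cov(R.T), np.cov(R, rowvar=False)))
```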

In [31]:
np.random.randint(0,10, size=(3,3))

array([[1, 7, 9],
       [8, 2, 4],
       [0, 6, 3]])

In [32]:
np.random.choice(10, size=(3,3))

array([[6, 0, 9],
       [6, 1, 6],
       [6, 6, 9]])
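
Both cells above draw integers from 0 through 9. In `np.random.randint` the upper bound is exclusive, and `np.random.choice(10, ...)` samples from `np.arange(10)` with replacement by default. The sketch below checks the ranges and shows `replace=False` for drawing without repeats; the seed and variable names are illustrative.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed

# Integers in [0, 10): the upper bound 10 is never returned
ints = np.random.randint(0, 10, size=(3, 3))

# Samples from np.arange(10); repeats are allowed by default
picks = np.random.choice(10, size=(3, 3))

# replace=False draws without repeats, so all 9 values are distinct
unique_picks = np.random.choice(10, size=9, replace=False)

print(ints.min() >= 0, ints.max() < 10)
print(len(set(unique_picks)) == 9)
```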