# Working with numerical data

In [1]:
w1,w2,w3=0.3,0.2,0.5

In [5]:
kanto_temp=73
kanto_rainfall=67
kanto_humidity=43

yield_of_apples = w1 * temperature + w2 * rainfall + w3 * humidity 

In [7]:
kanto_yield_apples=kanto_temp*w1+kanto_rainfall*w2+kanto_humidity*w3
kanto_yield_apples

56.8

In [10]:
print("The expected yield of apples in kanto_region is {} tons".format(kanto_yield_apples))

The expected yield of apples in kanto_region is 56.8 tons


To make it slightly easier to perform the above computation for multiple regions, we can represent the climate data for each region as a vector, i.e., a list of numbers.

In [11]:
kanto=[73,67,43]
johto=[91,88,64]
hoenn=[87,134,58]
sinnoh=[102,43,37]
unova=[69,96,70]

In [12]:
weights=[w1,w2,w3]

In [15]:
kanto

[73, 67, 43]

In [16]:
weights

[0.3, 0.2, 0.5]

In [22]:
for x,w in zip(kanto,weights):
    print(x)
    print(w)

73
0.3
67
0.2
43
0.5


In [23]:
result=0 
for x,w in zip(kanto,weights):
    result=result+x*w
print(result)

56.8


In [24]:
def crop_yield(region,weights):
    result=0
    for x,w in zip(region,weights):
        result += x*w
    return result

In [25]:
crop_yield(kanto,weights)

56.8

In [26]:
crop_yield(johto,weights)

76.9

In [27]:
crop_yield(unova,weights)

74.9
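To apply `crop_yield` to every region in one go, one option is to keep the lists in a dictionary and loop over it. This is only a sketch: the `regions` dictionary and the `all_yields` name are illustrative and do not appear in the notebook.

```python
# Weights and region data copied from the cells above
weights = [0.3, 0.2, 0.5]
regions = {
    "kanto": [73, 67, 43],
    "johto": [91, 88, 64],
    "hoenn": [87, 134, 58],
    "sinnoh": [102, 43, 37],
    "unova": [69, 96, 70],
}

def crop_yield(region, weights):
    """Weighted sum of the climate measurements for one region."""
    result = 0
    for x, w in zip(region, weights):
        result += x * w
    return result

# Round to one decimal place to hide floating-point noise
all_yields = {name: round(crop_yield(data, weights), 1)
              for name, data in regions.items()}
print(all_yields)
```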

# Going from Python lists to Numpy arrays
The calculation performed by the `crop_yield` function (element-wise multiplication of two vectors, followed by a sum of the results) is also called the dot product. 

The Numpy library provides a built-in function to compute the dot product of two vectors. However, we must first convert the lists into Numpy arrays.

First, let's import the `numpy` module. It's common practice to import numpy with the alias `np`.

In [28]:
import numpy as np

In [29]:
kanto=np.array([73,67,43])

In [30]:
kanto

array([73, 67, 43])

In [31]:
weights=np.array([w1,w2,w3])

In [32]:
weights

array([0.3, 0.2, 0.5])

### Numpy arrays have the type ndarray.

In [33]:
type(kanto)

numpy.ndarray

In [34]:
type(weights)

numpy.ndarray

Just like lists, Numpy arrays support the indexing notation `[]`.

In [35]:
weights[0]

0.3

In [36]:
weights[0:]

array([0.3, 0.2, 0.5])

In [37]:
kanto[2]

43

In [40]:
kanto[:-1]

array([73, 67])

## Operating on Numpy arrays

We can now compute the dot product of the two vectors using the `np.dot` function.

In [41]:
np.dot(kanto,weights)

56.8

In [42]:
kanto*weights

array([21.9, 13.4, 21.5])

We can achieve the same result with low-level operations supported by Numpy arrays: performing an element-wise multiplication and calculating the resulting numbers' sum.

In [43]:
(kanto*weights).sum()

56.8

The `*` operator performs an element-wise multiplication of two arrays if they have the same size. The `sum` method calculates the sum of numbers in an array.

In [44]:
arr1=np.array([2,3,5])
arr2=np.array([1,3,4])

In [45]:
arr1*arr2

array([ 2,  9, 20])

In [48]:
arr_a=np.array([2,3,5])
arr_b=np.array([1,3,4,4]) # creating this array is fine; the error only occurs when multiplying it with arr_a


In [49]:
# arr_a*arr_b #ValueError: operands could not be broadcast together with shapes (3,) (4,) 
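If the shapes might not match, the multiplication can be wrapped in a `try`/`except` block instead of being commented out. A minimal sketch; the `product` variable is illustrative:

```python
import numpy as np

arr_a = np.array([2, 3, 5])
arr_b = np.array([1, 3, 4, 4])

try:
    product = arr_a * arr_b  # shapes (3,) and (4,) are incompatible
except ValueError as err:
    product = None
    print("Multiplication failed:", err)
```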


## Benefits of using Numpy arrays

Numpy arrays offer the following benefits over Python lists for operating on numerical data:

- **Ease of use**: You can write small, concise, and intuitive mathematical expressions like `(kanto * weights).sum()` rather than using loops & custom functions like `crop_yield`.
- **Performance**: Numpy operations and functions are implemented internally in compiled C code, which makes them much faster than using Python statements & loops that are interpreted at runtime.

Here's a comparison of dot products performed using Python loops vs. Numpy arrays on two vectors with 100,000 elements each.

In [50]:
#Python lists
arr1=list(range(100000))
arr2=list(range(100000,200000))

#Numpy Array
arr1_np=np.array(arr1)
arr2_np=np.array(arr2)

In [53]:
arr1[:10] # first 10 of the 100,000 elements

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


In [52]:
arr1_np # contains 100,000 elements

array([    0,     1,     2, ..., 99997, 99998, 99999])

In [54]:
len(arr2)

100000

In [55]:
arr2_np # contains 100,000 elements, from 100000 to 199999

array([100000, 100001, 100002, ..., 199997, 199998, 199999])

In [58]:
%%time
# %%time is a built-in Jupyter cell magic; it has to be the first line of the cell
result = 0
for x1 , x2 in zip(arr1,arr2):
    result += x1*x2
    
result

Wall time: 48 ms


833323333350000

In [59]:
%%time
np.dot(arr1_np,arr2_np)

Wall time: 3 ms


893678192

In [61]:
%%time

(arr1_np*arr2_np).sum()

Wall time: 2.99 ms


893678192

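## `np.dot` and `.sum()` are built into Numpy

As you can see, `np.dot` is about 16 times faster than the `for` loop in this run (3 ms vs. 48 ms). This makes Numpy especially useful while working with really large datasets with tens of thousands or millions of data points.

Note that the two Numpy results (`893678192`) do not match the loop result (`833323333350000`). Integer arrays in this notebook default to `int32` (see the `.dtype` output later on), which cannot hold a value that large, so the sum silently wraps around. Creating the arrays with `dtype=np.int64` avoids the overflow.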
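The mismatch between the Numpy results (`893678192`) and the loop result (`833323333350000`) above is consistent with 32-bit integer overflow. The sketch below repeats the computation with explicit 64-bit integers; the names `n`, `a`, and `b` are illustrative.

```python
import numpy as np

n = 100_000
# Explicit 64-bit integers, regardless of the platform default
a = np.arange(n, dtype=np.int64)
b = np.arange(n, 2 * n, dtype=np.int64)

# Python integers never overflow, so the loop gives the reference value
loop_result = sum(x * y for x, y in zip(range(n), range(n, 2 * n)))
dot_result = np.dot(a, b)

print(loop_result)
print(dot_result)
```

With 64-bit integers, both approaches should agree on the same value.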



## Multi-dimensional Numpy arrays 

We can now go one step further and represent the climate data for all the regions using a single 2-dimensional Numpy array.

In [62]:
climate_data=np.array([[73,67,43],
                      [91,88,64],
                      [87,134,58],
                      [102,43,37],
                      [69,96,70]])

In [64]:
climate_data

array([[ 73,  67,  43],
       [ 91,  88,  64],
       [ 87, 134,  58],
       [102,  43,  37],
       [ 69,  96,  70]])

If you've taken a linear algebra class in high school, you may recognize the above 2-d array as a matrix with five rows and three columns. Each row represents one region, and the columns represent temperature, rainfall, and humidity, respectively.

Numpy arrays can have any number of dimensions and different lengths along each dimension. We can inspect the length along each dimension using the `.shape` property of an array.

<img src="https://fgnt.github.io/python_crashkurs_doc/_images/numpy_array_t.png" width="420">



## .shape

In [65]:
# 2D array (matrix)
climate_data.shape

(5, 3)

In [66]:
weights

array([0.3, 0.2, 0.5])

In [67]:
# 1D array (vector)
weights.shape

(3,)

In [72]:
# 3D array 
arr3=np.array([
    [[1,3,4],
     [2,31,3]],
    [[2,5,2],
    [4,24,5.3]]
])

In [73]:
arr3

array([[[ 1. ,  3. ,  4. ],
        [ 2. , 31. ,  3. ]],

       [[ 2. ,  5. ,  2. ],
        [ 4. , 24. ,  5.3]]])

In [70]:
arr3.shape

(2, 2, 3)

All the elements in a numpy array have the same data type. You can check the data type of an array using the `.dtype` property.

In [74]:
weights

array([0.3, 0.2, 0.5])

In [75]:
weights.dtype

dtype('float64')

In [76]:
climate_data.dtype

dtype('int32')

If an array contains even a single floating point number, all the other elements are also converted to floats.

In [78]:
arr3

array([[[ 1. ,  3. ,  4. ],
        [ 2. , 31. ,  3. ]],

       [[ 2. ,  5. ,  2. ],
        [ 4. , 24. ,  5.3]]])

In [77]:
arr3.dtype

dtype('float64')
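The data type can also be set explicitly or converted after the fact. A short sketch, with illustrative variable names:

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2, 3.5])                      # one float promotes every element
as_float = ints.astype(np.float64)                 # explicit conversion, returns a new array
explicit = np.array([1, 2, 3], dtype=np.float64)   # dtype chosen at creation time

print(ints.dtype)    # an integer type; the exact width depends on the platform
print(mixed.dtype)
print(as_float.dtype, explicit.dtype)
```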

We can now compute the predicted yields of apples in all the regions, using a single matrix multiplication between `climate_data` (a 5x3 matrix) and `weights` (a vector of length 3). Here's what it looks like visually:

<img src="https://i.imgur.com/LJ2WKSI.png" width="240">



### We can use the `np.matmul` function or the `@` operator to perform matrix multiplication.

In [79]:
np.matmul(climate_data,weights)

array([56.8, 76.9, 81.9, 57.7, 74.9])

In [80]:
climate_data @ weights

array([56.8, 76.9, 81.9, 57.7, 74.9])
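One way to read this result: multiplying a matrix by a vector computes one dot product per row. The sketch below checks that claim on the five-region data; `row_by_row` and `all_at_once` are illustrative names.

```python
import numpy as np

climate_data = np.array([[73, 67, 43],
                         [91, 88, 64],
                         [87, 134, 58],
                         [102, 43, 37],
                         [69, 96, 70]])
weights = np.array([0.3, 0.2, 0.5])

# One dot product per row, collected into a vector
row_by_row = np.array([np.dot(row, weights) for row in climate_data])
# The same computation as a single matrix-vector product
all_at_once = climate_data @ weights

print(np.allclose(row_by_row, all_at_once))
```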

## Working with CSV data files

Numpy also provides helper functions for reading from & writing to files. Let's download a file `climate.txt`, which contains 10,000 climate measurements (temperature, rainfall & humidity) in the following format:


```
temperature,rainfall,humidity
25.00,76.00,99.00
39.00,65.00,70.00
59.00,45.00,77.00
84.00,63.00,38.00
66.00,50.00,52.00
41.00,94.00,77.00
91.00,57.00,96.00
49.00,96.00,99.00
67.00,20.00,28.00
...
```

This format of storing data is known as *comma-separated values* or CSV. 

> **CSVs**: A comma-separated values (CSV) file is a delimited text file that uses a comma to separate values. Each line of the file is a data record. Each record consists of one or more fields, separated by commas. A CSV file typically stores tabular data (numbers and text) in plain text, in which case each line will have the same number of fields. (Wikipedia)


To read this file into a numpy array, we can use the `genfromtxt` function.

In [82]:
import urllib.request

urllib.request.urlretrieve('https://hub.jovian.ml/wp-content/uploads/2020/08/climate.csv', 
    'climate.txt')

('climate.txt', <http.client.HTTPMessage at 0x164c4ea4d90>)

In [83]:
climate_data=np.genfromtxt('climate.txt',delimiter=',',skip_header=1)

In [89]:
# help(np.genfromtxt)

In [90]:
climate_data

array([[25., 76., 99.],
       [39., 65., 70.],
       [59., 45., 77.],
       ...,
       [99., 62., 58.],
       [70., 71., 91.],
       [92., 39., 76.]])
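To try `genfromtxt` without downloading anything, the same call can be pointed at an in-memory text buffer. This is a sketch that assumes only the first three data rows of the sample shown earlier; `csv_text` and `sample` are illustrative names.

```python
import io
import numpy as np

# A few rows in the same format as climate.txt
csv_text = """temperature,rainfall,humidity
25.00,76.00,99.00
39.00,65.00,70.00
59.00,45.00,77.00
"""

# genfromtxt accepts a file-like object as well as a filename;
# skip_header=1 drops the line with the column names
sample = np.genfromtxt(io.StringIO(csv_text), delimiter=',', skip_header=1)
print(sample.shape)
print(sample)
```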

In [93]:
weights=np.array([0.3,0.2,0.5])

In [94]:
climate_data.shape

(10000, 3)

We can now perform a matrix multiplication using the `@` operator to predict the yield of apples for the entire dataset using a given set of weights.

In [95]:
weights

array([0.3, 0.2, 0.5])

In [96]:
yields=climate_data @ weights # @ or np.matmul()

In [97]:
yields

array([72.2, 59.7, 65.2, ..., 71.1, 80.7, 73.4])

In [98]:
yields.shape

(10000,)

Let's add the `yields` to `climate_data` as a fourth column using the [`np.concatenate`](https://numpy.org/doc/stable/reference/generated/numpy.concatenate.html) function.

In [100]:
# help(np.concatenate)

In [112]:
a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6,3,1],
             [3,4,3,1]])

In [120]:
a
a.shape

(2, 2)

In [121]:
b.shape

(2, 4)

In [131]:
c=np.concatenate((a,b),axis=1) # axis=0 would raise a ValueError: a has 2 columns but b has 4
c

array([[1, 2, 5, 6, 3, 1],
       [3, 4, 3, 4, 3, 1]])

In [119]:
c.shape

(2, 6)

In [133]:
c.reshape(4,3) # or c.reshape(3,4)

array([[1, 2, 5],
       [6, 3, 1],
       [3, 4, 3],
       [4, 3, 1]])

In [124]:
yields

array([72.2, 59.7, 65.2, ..., 71.1, 80.7, 73.4])

In [125]:
yields.shape

(10000,)

In [126]:
yields.reshape(10000,1)

array([[72.2],
       [59.7],
       [65.2],
       ...,
       [71.1],
       [80.7],
       [73.4]])

In [128]:
climate_result=np.concatenate((climate_data,yields.reshape(10000,1)),axis=1)

In [129]:
climate_result

array([[25. , 76. , 99. , 72.2],
       [39. , 65. , 70. , 59.7],
       [59. , 45. , 77. , 65.2],
       ...,
       [99. , 62. , 58. , 71.1],
       [70. , 71. , 91. , 80.7],
       [92. , 39. , 76. , 73.4]])

There are a couple of subtleties here:

* Since we wish to add new columns, we pass the argument `axis=1` to `np.concatenate`. The `axis` argument specifies the dimension for concatenation.

*  The arrays should have the same number of dimensions, and the same length along each except the dimension used for concatenation. We use the [`np.reshape`](https://numpy.org/doc/stable/reference/generated/numpy.reshape.html) function to change the shape of `yields` from `(10000,)` to `(10000,1)`.

Here's a visual explanation of `np.concatenate` along `axis=1` (can you guess what `axis=0` results in?):

<img src="https://www.w3resource.com/w3r_images/python-numpy-image-exercise-58.png" width="300">

The best way to understand what a Numpy function does is to experiment with it and read the documentation to learn about its arguments & return values. Use the cells below to experiment with `np.concatenate` and `np.reshape`.
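As a starting point for such experiments, here is a small sketch that joins two arrays along each axis and appends a reshaped vector as a new column. All names are illustrative.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

stacked_rows = np.concatenate((a, b), axis=0)   # b's rows go below a's: shape (4, 2)
stacked_cols = np.concatenate((a, b), axis=1)   # b's columns go beside a's: shape (2, 4)

# A 1-D vector must be reshaped into a column before joining along axis=1
v = np.array([9, 10])
with_column = np.concatenate((a, v.reshape(2, 1)), axis=1)   # shape (2, 3)

print(stacked_rows.shape, stacked_cols.shape, with_column.shape)
```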

Let's write the final results from our computation above back to a file using the `np.savetxt` function.

In [135]:
climate_result

array([[25. , 76. , 99. , 72.2],
       [39. , 65. , 70. , 59.7],
       [59. , 45. , 77. , 65.2],
       ...,
       [99. , 62. , 58. , 71.1],
       [70. , 71. , 91. , 80.7],
       [92. , 39. , 76. , 73.4]])

In [137]:
np.savetxt('climate_result.txt',
           climate_result,
          fmt='%.2f',
          delimiter=',',
          header='temperature,rainfall,humidity,yield_apples',
          comments='')

The results are written back in the CSV format to the file `climate_result.txt`. 

```
temperature,rainfall,humidity,yield_apples
25.00,76.00,99.00,72.20
39.00,65.00,70.00,59.70
59.00,45.00,77.00,65.20
84.00,63.00,38.00,56.80
...
```
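A quick way to check the written file is to read it back and compare it with the original array. The sketch below does this round trip on two rows in a temporary directory; the file name `roundtrip.txt` and the variable names are hypothetical.

```python
import os
import tempfile
import numpy as np

data = np.array([[25.0, 76.0, 99.0, 72.2],
                 [39.0, 65.0, 70.0, 59.7]])
header = 'temperature,rainfall,humidity,yield_apples'

# Write into a fresh temporary directory
path = os.path.join(tempfile.mkdtemp(), 'roundtrip.txt')
np.savetxt(path, data, fmt='%.2f', delimiter=',', header=header, comments='')

# comments='' keeps savetxt from prefixing the header line with '# '
with open(path) as f:
    first_line = f.readline().strip()

restored = np.genfromtxt(path, delimiter=',', skip_header=1)
print(first_line)
print(restored)
```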



Numpy provides hundreds of functions for performing operations on arrays. Here are some commonly used functions:


* Mathematics: `np.sum`, `np.exp`, `np.round`, arithmetic operators 
* Array manipulation: `np.reshape`, `np.stack`, `np.concatenate`, `np.split`
* Linear Algebra: `np.matmul`, `np.dot`, `np.transpose`, `np.linalg.eigvals`
* Statistics: `np.mean`, `np.median`, `np.std`, `np.max`

> **How to find the function you need?** The easiest way to find the right function for a specific operation or use-case is to do a web search. For instance, searching for "How to join numpy arrays" leads to [this tutorial on array concatenation](https://cmdlinetips.com/2018/04/how-to-concatenate-arrays-in-numpy/). 

You can find a full list of array functions here: https://numpy.org/doc/stable/reference/routines.html
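As a brief illustration of a few of these functions, the sketch below computes per-column statistics for the five-region data used earlier. The variable names are illustrative.

```python
import numpy as np

climate_data = np.array([[73, 67, 43],
                         [91, 88, 64],
                         [87, 134, 58],
                         [102, 43, 37],
                         [69, 96, 70]])

column_means = np.mean(climate_data, axis=0)              # one mean per column
max_rainfall = np.max(climate_data[:, 1])                 # largest value in the rainfall column
rounded_std = np.round(np.std(climate_data, axis=0), 2)   # per-column standard deviation

print(column_means)
print(max_rainfall)
print(rounded_std)
```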

## Arithmetic operations, broadcasting and comparison

Numpy arrays support arithmetic operators like `+`, `-`, `*`, etc. You can perform an arithmetic operation with a single number (also called scalar) or with another array of the same shape. Operators make it easy to write mathematical expressions with multi-dimensional arrays.

In [2]:
import numpy as np

In [3]:
arr2=np.array([[1,2,3,4],
               [5,6,7,8],
               [9,1,2,3]])

In [4]:
arr2.shape

(3, 4)

In [5]:
arr3=np.array([[11,12,13,14],
               [15,16,17,18],
               [19,11,12,13]])

In [8]:
arr3.shape

(3, 4)

In [10]:
arr2

array([[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 1, 2, 3]])

In [11]:
#Adding a scalar
arr2+3

array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12,  4,  5,  6]])

In [12]:
arr3

array([[11, 12, 13, 14],
       [15, 16, 17, 18],
       [19, 11, 12, 13]])

In [13]:
#element-wise subtraction
arr3-arr2

array([[10, 10, 10, 10],
       [10, 10, 10, 10],
       [10, 10, 10, 10]])

In [14]:
#division by scalar
arr2/2

array([[0.5, 1. , 1.5, 2. ],
       [2.5, 3. , 3.5, 4. ],
       [4.5, 0.5, 1. , 1.5]])

In [17]:
arr2

array([[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 1, 2, 3]])

In [15]:
#element-wise multiplication
arr2*arr3

array([[ 11,  24,  39,  56],
       [ 75,  96, 119, 144],
       [171,  11,  24,  39]])

In [16]:
#modulus with scalar
arr2%4

array([[1, 2, 3, 0],
       [1, 2, 3, 0],
       [1, 1, 2, 3]], dtype=int32)

### Array Broadcasting

Numpy arrays also support *broadcasting*, allowing arithmetic operations between two arrays with different numbers of dimensions but compatible shapes. Let's look at an example to see how it works.

In [18]:
arr2

array([[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 1, 2, 3]])

In [19]:
arr2.shape

(3, 4)

In [20]:
arr4=np.array([4,5,6,7])

In [21]:
arr4.shape

(4,)

In [22]:
arr2+arr4

array([[ 5,  7,  9, 11],
       [ 9, 11, 13, 15],
       [13,  6,  8, 10]])

In [2]:
import numpy as np

In [4]:
#2 X 3 X 2
arr1=np.array([[[1,3],
                [2,4],
                [3,2]],
              [[2,3],
               [1,2],
               [2,5]]])

In [6]:
# arr1
arr1.shape

(2, 3, 2)

In [13]:
# arr2=np.array([5,6])
# arr2=np.array([5,6,3])
arr2=np.array([[5,6],
               [1,2],
               [3,4]])


arr2.shape

(3, 2)

In [14]:
# Shapes are compared from the last dimension backwards:
# arr1 = 2 x 3 x 2
# arr2 =         2   np.array([5,6]): compatible
# arr2 =         3   np.array([5,6,3]): ValueError: operands could not be broadcast together with shapes (2,3,2) (3,) 
# arr2 =     3 x 2   np.array([[5,6],[1,2],[3,4]]): compatible (used below)

arr1+arr2

array([[[6, 9],
        [3, 6],
        [6, 6]],

       [[7, 9],
        [2, 4],
        [5, 9]]])

When the expression `arr2 + arr4` from the earlier 2-d example is evaluated, `arr4` (which has the shape `(4,)`) is replicated three times to match the shape `(3, 4)` of `arr2`. Numpy performs the replication without actually creating three copies of the lower-dimensional array, thus improving performance and using less memory.

<img src="https://jakevdp.github.io/PythonDataScienceHandbook/figures/02.05-broadcasting.png" width="360">

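Broadcasting only works if the shapes are compatible: comparing the two shapes from the last dimension backwards, each pair of lengths must either be equal or one of them must be 1 (missing leading dimensions count as 1).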

In [15]:
np.arange(3)+5

array([5, 6, 7])

In [24]:
np.ones((3,3))+np.arange(3)

array([[1., 2., 3.],
       [1., 2., 3.],
       [1., 2., 3.]])

In [23]:
np.arange(3).reshape(3,1)+np.arange(3)

array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])
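To see what broadcasting does under the hood, `np.broadcast_to` shows the stretched version of an array explicitly. The sketch below also demonstrates an incompatible pair of shapes and one way to fix it; the names `stretched`, `compatible`, and `column_added` are illustrative.

```python
import numpy as np

arr2 = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 1, 2, 3]])
arr4 = np.array([4, 5, 6, 7])

# What arr4 looks like after being stretched to arr2's shape (a read-only view)
stretched = np.broadcast_to(arr4, arr2.shape)
print(stretched)

# Shapes (3, 4) and (3,) do not line up from the right, so this fails
try:
    arr2 + np.array([1, 2, 3])
    compatible = True
except ValueError:
    compatible = False

# Reshaping to (3, 1) makes the vector a column, which is compatible with (3, 4)
column_added = arr2 + np.array([1, 2, 3]).reshape(3, 1)
print(column_added)
```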

### Array Comparison

Numpy arrays also support comparison operations like `==`, `!=`, `>` etc. The result is an array of booleans.

In [23]:
arr1=np.array([[1,2,3],
               [3,4,5]])
arr2=np.array([[2,2,3],
               [1,2,5]])

In [24]:
arr1.shape

(2, 3)

In [25]:
arr2.shape

(2, 3)

In [26]:
arr1==arr2

array([[False,  True,  True],
       [False, False,  True]])

In [27]:
arr1!=arr2

array([[ True, False, False],
       [ True,  True, False]])

In [28]:
arr1>=arr2

array([[False,  True,  True],
       [ True,  True,  True]])

In [29]:
arr1<arr2

array([[ True, False, False],
       [False, False, False]])

Array comparison is frequently used to count the number of equal elements in two arrays using the `sum` method. Remember that `True` evaluates to `1` and `False` evaluates to `0` when booleans are used in arithmetic operations.

In [30]:
(arr1!=arr2).sum()

3

In [31]:
(arr1==arr2).sum()

3
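The same comparison array can also give the share of matching positions, since the mean of a boolean array is the fraction of `True` values. A short sketch with illustrative names:

```python
import numpy as np

arr1 = np.array([[1, 2, 3],
                 [3, 4, 5]])
arr2 = np.array([[2, 2, 3],
                 [1, 2, 5]])

matches = arr1 == arr2
num_equal = matches.sum()          # True counts as 1, False as 0
fraction_equal = matches.mean()    # share of positions that match
identical = np.array_equal(arr1, arr2)   # True only if every element matches

print(num_equal, fraction_equal, identical)
```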

## Array indexing and slicing

Numpy extends Python's list indexing notation using `[]` to multiple dimensions in an intuitive fashion. You can provide a comma-separated list of indices or ranges to select a specific element or a subarray (also called a slice) from a Numpy array.

In [26]:
arr3=np.array([[[22,31,16,34],
                [43,21,32,12]],
               [[1,2,3,5],
                [5,6,7,8]],
               [[1,23,34,4],
                [5,62,7.4,85]]])

In [43]:
arr3.shape

(3, 2, 4)

In [44]:
#single element
arr3[1,1,3] # 8.0
arr3[0,1,3] # 12.0
arr3[2,1,1] # 62.0
arr3[0,0,3] # 34.0 (only the last expression in a cell is displayed)

34.0

In [27]:
arr3

array([[[22. , 31. , 16. , 34. ],
        [43. , 21. , 32. , 12. ]],

       [[ 1. ,  2. ,  3. ,  5. ],
        [ 5. ,  6. ,  7. ,  8. ]],

       [[ 1. , 23. , 34. ,  4. ],
        [ 5. , 62. ,  7.4, 85. ]]])

In [30]:
#subarray using ranges
# arr3[1:,1:,:3] 
# arr3[1:,0:1,:2]
arr3[0:,1,2:]

array([[32. , 12. ],
       [ 7. ,  8. ],
       [ 7.4, 85. ]])

In [36]:
arr3

array([[[22. , 31. , 16. , 34. ],
        [43. , 21. , 32. , 12. ]],

       [[ 1. ,  2. ,  3. ,  5. ],
        [ 5. ,  6. ,  7. ,  8. ]],

       [[ 1. , 23. , 34. ,  4. ],
        [ 5. , 62. ,  7.4, 85. ]]])

In [33]:
# Mixing indices and ranges
# arr3[1:, 1, 3]
arr3[2,0:,1:3]

array([[23. , 34. ],
       [62. ,  7.4]])

In [37]:
# Ranges along every dimension
# arr3[1:, 1, :3]
arr3[0:3,:2,0:2] 

array([[[22., 31.],
        [43., 21.]],

       [[ 1.,  2.],
        [ 5.,  6.]],

       [[ 1., 23.],
        [ 5., 62.]]])

In [38]:
arr3

array([[[22. , 31. , 16. , 34. ],
        [43. , 21. , 32. , 12. ]],

       [[ 1. ,  2. ,  3. ,  5. ],
        [ 5. ,  6. ,  7. ,  8. ]],

       [[ 1. , 23. , 34. ,  4. ],
        [ 5. , 62. ,  7.4, 85. ]]])

In [43]:
# Using fewer indices
# arr3[2]
# arr3[0]
arr3[1]

array([[1., 2., 3., 5.],
       [5., 6., 7., 8.]])

In [44]:
# Using too many indices
arr3[1,3,2,1]

IndexError: too many indices for array: array is 3-dimensional, but 4 were indexed

The notation and its results can seem confusing at first, so take your time to experiment and become comfortable with it. Use the cells below to try out some examples of array indexing and slicing, with different combinations of indices and ranges. Here are some more examples demonstrated visually:

<img src="https://scipy-lectures.org/_images/numpy_indexing.png" width="360">

In [58]:
ar=np.array([[0,1,2,3,4,5],
            [10,11,12,13,14,15],
            [20,21,22,23,24,25],
            [30,31,32,33,34,35],
            [40,41,42,43,44,45],
            [50,51,52,53,54,55]])

In [46]:
ar.shape

(6, 6)

In [47]:
ar[0,3:5]

array([3, 4])

In [57]:
ar[4:,4:]

array([[44, 45],
       [54, 55]])

In [51]:
ar[:,2]

array([ 2, 12, 22, 32, 42, 52])

In [53]:
ar[:,5]

array([ 5, 15, 25, 35, 45, 55])

In [56]:
ar[2::2,::2] # rows 2 and 4 (start at 2, step 2); columns 0, 2 and 4 (step 2)

array([[20, 22, 24],
       [40, 42, 44]])
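Stepped slices such as `ar[2::2,::2]` are easier to follow when split into two stages, rows first and then columns. The sketch below rebuilds the same 6 x 6 grid with broadcasting and applies the slice step by step; `rows` and `result` are illustrative names.

```python
import numpy as np

# Same 6 x 6 grid as `ar` above: row offset (0, 10, ..., 50) plus column index (0..5)
ar = 10 * np.arange(6).reshape(6, 1) + np.arange(6)

rows = ar[2::2]          # rows 2 and 4 (start at 2, step 2)
result = rows[:, ::2]    # then columns 0, 2 and 4 (step 2)

# The two-stage version matches the single combined slice
print(np.array_equal(result, ar[2::2, ::2]))
print(result)
```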

## Other ways of creating Numpy arrays

Numpy also provides some handy functions to create arrays of desired shapes with fixed or random values. Check out the [official documentation](https://numpy.org/doc/stable/reference/routines.array-creation.html) or use the `help` function to learn more.

In [64]:
# All zeros
np.zeros((3,2))
np.zeros((3,3))
np.zeros((3,2,4))

array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.]]])

In [67]:
# All ones
np.ones([2,3,2])
np.ones([2,2,3])

array([[[1., 1., 1.],
        [1., 1., 1.]],

       [[1., 1., 1.],
        [1., 1., 1.]]])

In [71]:
# Identity matrix
np.eye(2)
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [78]:
# Random vector
np.random.rand(9)
np.random.rand(9).reshape(3,3)

array([[0.87956324, 0.75550903, 0.14665464],
       [0.47238399, 0.20218482, 0.92400719],
       [0.56627324, 0.56161982, 0.74171216]])

In [79]:
# Random matrix
np.random.randn(2, 3) # rand vs. randn - what's the difference?

array([[-0.31072854, -1.31894195,  0.77684063],
       [-0.57800785,  0.48116786,  0.60189858]])
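On the `rand` vs. `randn` question: `np.random.rand` draws from a uniform distribution over `[0, 1)`, while `np.random.randn` draws from the standard normal distribution, which is why its output above contains negative numbers. A quick sketch to check this empirically; the seed value is arbitrary and the variable names are illustrative.

```python
import numpy as np

np.random.seed(0)   # arbitrary seed, only to make the run repeatable

uniform = np.random.rand(10_000)    # uniform samples from [0, 1)
normal = np.random.randn(10_000)    # standard normal samples (mean 0, variance 1)

print(uniform.min(), uniform.max())   # stays inside [0, 1)
print(normal.mean(), normal.std())    # close to 0 and 1 respectively
```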

In [82]:
# Fixed value
np.full([3,2], 2)
np.full([3,2,3],7)

array([[[7, 7, 7],
        [7, 7, 7]],

       [[7, 7, 7],
        [7, 7, 7]],

       [[7, 7, 7],
        [7, 7, 7]]])

In [86]:
# Range with start, end and step
np.arange(10, 90, 3)
np.arange(10,90,3).reshape(3,3,3)


array([[[10, 13, 16],
        [19, 22, 25],
        [28, 31, 34]],

       [[37, 40, 43],
        [46, 49, 52],
        [55, 58, 61]],

       [[64, 67, 70],
        [73, 76, 79],
        [82, 85, 88]]])

In [89]:
# Equally spaced numbers in a range
np.linspace(3, 27, 9)

array([ 3.,  6.,  9., 12., 15., 18., 21., 24., 27.])
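`np.linspace` and `np.arange` can describe the same sequence in two ways: by number of points or by step size. A small sketch comparing them, with illustrative names:

```python
import numpy as np

by_count = np.linspace(3, 27, 9)    # 9 points, both endpoints included
by_step = np.arange(3, 28, 3)       # step of 3, stop value excluded

# Spacing implied by linspace: (stop - start) / (num - 1)
step = (27 - 3) / (9 - 1)

print(by_count)
print(by_step)
print(step)
```

Note that `linspace` returns floats while `arange` with integer arguments returns integers, even though the values coincide here.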