# 4. NumPy

Welcome to the fantastic world of NumPy! The name NumPy is short for "Numeric Python" or "Numerical Python", and it's a key building block for scientific programming in Python. It's open-source (which means it's free and everyone can contribute to it) and it's been designed with efficiency in mind.

At the heart of NumPy is its ability to create and handle data arrays (vectors) with ease, and this includes not just simple arrays, but matrices and even higher-dimensional tensors! And the cherry on top? NumPy comes equipped with a host of functions to play around with these arrays, including some advanced linear algebra routines which are optimized for dealing with really, really big matrices.

To get started with NumPy, we first need to import it into our notebook. This is like inviting NumPy into our coding party! We typically abbreviate NumPy as `'np'` to keep things short and sweet. Remember, we only need to do this once for the entire notebook. Now, let's roll out the red carpet for NumPy!

In [1]:
# importing numpy
import numpy as np

## 4.1 Starting with the basics

Let's introduce you to NumPy arrays: they're a bit like the neat and organized cousin of Python lists. Just like a list, a NumPy array holds a sequence of values, a bit like a team huddled together. We can even create a NumPy array from a list or a nested list.

However, NumPy arrays like to keep things uniform. This means that all of the values in an array need to be of the same type, kind of like wearing the same team jersey. Moreover, all the rows and columns (and even higher dimensions) need to have the same length. It's all about staying coordinated!

You can create a NumPy array using the `np.array()` function and feeding it a list or nested list. Let's look at some examples:

In [2]:
# creating a numpy array from a list
my_array = np.array([1,2,3,4,5,6]) # constructing from a list
print('My array \n', my_array) # \n is a newline

# creating a numpy 2x6 matrix from 2D nested lists
my_matrix = np.array([[1,2,3,4,5,6],[7,8,9,10,11,12]])
print('My matrix \n', my_matrix)

My array 
 [1 2 3 4 5 6]
My matrix 
 [[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]]
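
The "same team jersey" rule described above is easy to see for yourself. The short sketch below (the variable names `mixed` and `as_text` are just made up for illustration) shows what NumPy does when a list mixes types: it quietly converts everything to one common type.

```python
import numpy as np

# mixing integers and a float: every value ends up stored as a float
mixed = np.array([1, 2.5, 3])
print(mixed, mixed.dtype)

# mixing numbers and text: every value ends up stored as a string
as_text = np.array([1, 'two', 3.0])
print(as_text, as_text.dtype)
```

So if an array's `dtype` is not what you expected, an odd value in the input list is a likely culprit.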


Numpy arrays are really quite expressive, just like an open book. Once you've created one, you can find out a lot about it using certain attributes.

- **shape:** Think of this as the array's blueprint. It tells you the length of each dimension.

- **size:** This is a quick count of all elements in the array. In other words, how many 'team members' the array has!

- **ndim:** This tells you the total number of dimensions in the array, like asking, "Is this a one-story house, or a multi-story one?".

- **dtype:** This tells you the data type of the array. It's like asking, "What language does the team speak?"

With these attributes, we can get a pretty good snapshot of what our array looks like and how it's structured.

In [3]:
# printing attributes of the array
print('array shape:', my_array.shape)
print('array size:', my_array.size)
print('array total dimensions:', my_array.ndim)
print('array data type:', my_array.dtype)

# printing attributes of the matrix
print('matrix shape:', my_matrix.shape)
print('matrix size:', my_matrix.size)
print('matrix total dimensions:', my_matrix.ndim)
print('matrix data type:', my_matrix.dtype)

array shape: (6,)
array size: 6
array total dimensions: 1
array data type: int64
matrix shape: (2, 6)
matrix size: 12
matrix total dimensions: 2
matrix data type: int64


## 4.2 Creating arrays

Did you know Numpy is also an excellent array architect? It comes equipped with a bunch of handy functions to whip up arrays for you in no time!

- **zeros:** The `zeros` function is like a blank canvas; it gives you an array filled with zeros.
- **ones:** `ones`, as the name suggests, creates an array brimming with ones.
- **eye:** The `eye` function is a master at crafting identity matrices. ('I' for identity, get it?)
- **empty:** Lastly, `empty` creates an array of a certain shape, but leaves the values uninitialized, which means it's filled with some unknown values. Make sure to initialize all the values before use!

Also, keep in mind that these NumPy constructors make arrays of type float64 by default. But if you prefer a different type, no worries! You can specify the datatype (**`dtype`**) you want during initialization. Now isn't that handy?

In [4]:
# creating a 3x3 matrix of all float zeros
zeros = np.zeros((3,3))
# creating a 3x3 matrix of all integer ones
ones = np.ones((3,3), dtype=np.int16)  # note how dtype can be specified
# creating a 3x3 identity matrix of float type
identity = np.eye(3)
# creating a 3x3 uninitialized matrix: its contents are arbitrary
# (whatever was left in memory), so they may or may not print as zeros
empty = np.empty((3,3))

# here '\n' forces a new line
print('zeros:\n', zeros)
print('ones:\n', ones)
print('identity:\n', identity)
print('empty:\n', empty)

zeros:
 [[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
ones:
 [[1 1 1]
 [1 1 1]
 [1 1 1]]
identity:
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
empty:
 [[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
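
To make the point about default types and about `empty` concrete, here is a minimal sketch. The names `default_zeros`, `int_zeros` and `scratch` are invented for this example; the only assumption is that you want to fill the uninitialized array yourself before reading from it.

```python
import numpy as np

# without a dtype argument the constructor falls back to float64
default_zeros = np.zeros((2, 3))
# with a dtype argument we get exactly the type we asked for
int_zeros = np.zeros((2, 3), dtype=np.int32)
print(default_zeros.dtype)
print(int_zeros.dtype)

# np.empty only reserves memory, so assign every value before reading it
scratch = np.empty((2, 3))
scratch[:] = 7.0
print(scratch)
```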


Think of NumPy as your friendly neighbourhood sequence maker. It's got some nifty tools (array constructors) called `arange` and `linspace` that are pretty similar to the good old `range` function we use to create sequences of numbers.

`arange` is the cool kid on the block. It's a lot like `range`, and works best when you want a sequence of integers. Call `arange` with one number N and it will happily give you an array with the integers from 0 to N-1. Note that (just as with Python's built-in `range` function) the end number specified won't be included. Want a specific start and end point or a particular step? No problem! Just pass them in and `arange` will get it done. Let's see it in action:

In [5]:
# integers from 0 to 4
range1 = np.arange(5)
# integers from 5 to 9
range2 = np.arange(5,10)
# integers from 50 to 75 in steps of 5 (note interval does not have to be an integer)
range3 = np.arange(50,80,5)

print('range1: \n', range1)
print('range2: \n', range2)
print('range3: \n', range3)

range1: 
 [0 1 2 3 4]
range2: 
 [5 6 7 8 9]
range3: 
 [50 55 60 65 70 75]
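
As the comment in the cell above hints, the step does not have to be an integer, and it can even be negative. A small sketch (variable names are illustrative only):

```python
import numpy as np

# a fractional step: the stop value 1 is still excluded
quarter_steps = np.arange(0, 1, 0.25)
print(quarter_steps)

# a negative step counts down: the stop value 0 is excluded
countdown = np.arange(10, 0, -2)
print(countdown)
```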


Did you know NumPy also has a great tool for when you're looking to find evenly spaced numbers within a fixed range? It's called `linspace`.

It's like a precise ruler that creates equally spaced markers between your start and end points. All you need to do is call `np.linspace(start, end, num)`, where `num` is the number of equally spaced samples you want. Unlike `arange`, the end point is included by default. It's super handy for when you need precision and evenly distributed data. Let's check it out:

In [6]:
# create an array with all integers from 0 to 9
integer_range = np.arange(10)
# create 9 linearly spaced numbers from 0 to 2
linearly_spaced_seq = np.linspace(0, 2, 9)

print('integer range:')
print(integer_range)

print('linearly spaced range:')
print(linearly_spaced_seq)

integer range:
[0 1 2 3 4 5 6 7 8 9]
linearly spaced range:
[0.   0.25 0.5  0.75 1.   1.25 1.5  1.75 2.  ]
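
If you are curious about the spacing `linspace` picked, you can ask for it directly. The sketch below uses the optional `retstep` and `endpoint` arguments of `np.linspace`; the variable names are made up for this example.

```python
import numpy as np

# retstep=True also returns the spacing: (end - start) / (num - 1)
samples, step = np.linspace(0, 2, 9, retstep=True)
print(samples)
print(step)

# endpoint=False leaves the end value out, much like arange does
no_end = np.linspace(0, 2, 8, endpoint=False)
print(no_end)
```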


And guess what? NumPy has yet another trick up its sleeve. It contains a dedicated submodule, `np.random`, filled with random number generators!

These generators can create arrays filled with numbers that follow specific random distributions. It's like having your very own lottery machine, but with you setting the rules! Let's see how this works:

In [7]:
# all creating matrices of shape (3,3)
# random numbers in range [0.0, 1.0)
rand1 = np.random.random((3,3))
# random integers from low (5) up to, but not including, high (20)
rand2 = np.random.randint(5,20,(3,3))
# sampling random floats from the standard normal distribution
# note how dimensions are defined as separate arguments
rand3 = np.random.randn(3,3)

print('random numbers in range [0,1):')
print(rand1)
print('random integers in range [5,20):')
print(rand2)
print('random floats drawn from standard normal distribution:')
print(rand3)

random numbers in range [0,1):
[[0.90874226 0.56720882 0.29506698]
 [0.16863048 0.34531547 0.00719233]
 [0.00934548 0.57571737 0.03706722]]
random integers in range [5,20):
[[18  9 14]
 [13  5 11]
 [ 5  7 11]]
random floats drawn from standard normal distribution:
[[ 1.26253852 -0.69250495  0.59442507]
 [ 0.02166084  0.33844564  1.79875488]
 [ 0.27900726 -0.16360805  0.5379476 ]]


Now, let's get into the nitty-gritty of how these random number generators work.

- `np.random.random`: you give it a tuple defining the array shape and, like a magician pulling numbers out of a hat, it creates a matrix filled with random values sampled from a continuous uniform distribution over the half-open interval [0.0, 1.0).

- `np.random.randint`: it's your own personal slot machine! You tell it the range (from low to high, with high itself excluded) and the shape of your array, and voilà, it spits out an array filled with random integers. If you don't define high, the range runs from 0 up to (but not including) low.

- `np.random.randn`: it's like having your own bingo ball machine, but instead of numbered balls it draws samples from the standard normal distribution. What's different here is that the shape of the array is supplied as separate arguments for the row and column dimensions, not as a single tuple (e.g. `(3,3)`).

For even more magic tricks (or rather, examples), check out this link: https://docs.scipy.org/doc/numpy/reference/routines.random.html
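
Because these numbers are random, your output will differ from the one printed above every time you run the cell. If you need repeatable results, one option is to fix the seed first with `np.random.seed`. The sketch below (seed value 42 and all variable names are arbitrary choices for illustration) also checks the range and shape rules described in the list above.

```python
import numpy as np

np.random.seed(42)                        # fix the seed so the run is repeatable
first = np.random.randint(5, 20, (3, 3))
np.random.seed(42)                        # same seed again
second = np.random.randint(5, 20, (3, 3))
print(np.array_equal(first, second))      # same seed gives the same numbers

# the upper bound of randint is excluded, so 20 never appears
many = np.random.randint(5, 20, 10000)
print(many.min(), many.max())

# randn takes the dimensions as separate arguments, not as a tuple
normal = np.random.randn(2, 4)
print(normal.shape)
```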

## 4.3 Loading from files

During our journey together in this course, we'll frequently make use of pre-generated array data that's stored in text files.

But, no worries, we won't be manually loading data line by line with Python's standard functions. That's just not our style! Instead, we'll be using numpy's `loadtxt` function.

It's like having our own personal butler who swiftly and efficiently unloads these matrices and arrays from text files for us. So, we can save our energy for the fun stuff! Let's keep going, my friend!

In [8]:
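# Download data
import os
import requests

def download_data(source, dest):
    base_url = 'https://raw.githubusercontent.com'  # no trailing slash: format() adds the separators
    owner = 'MaralAminpour'
    repo = 'ML-BME-UofA'
    branch = 'main'
    # never hard-code an access token in a notebook; if one is needed, read it from
    # an environment variable (the name GITHUB_TOKEN is an assumption)
    token = os.environ.get('GITHUB_TOKEN')
    headers = {'Authorization': 'token ' + token} if token else {}
    url = '{}/{}/{}/{}/{}'.format(base_url, owner, repo, branch, source)
    r = requests.get(url, headers=headers)
    r.raise_for_status()  # stop here if the download failed
    # the with-block closes the file even if writing fails
    with open(dest, 'wb') as f:
        f.write(r.content)

# create the output folder if it does not exist yet
os.makedirs('temp', exist_ok=True)

download_data('Week-1-Python-programming/data/matrix.txt', 'temp/matrix.txt')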

In [9]:
new_mat = np.loadtxt('temp/matrix.txt', delimiter=',') # note use of the optional argument delimiter to load files with comma-separated values
print(new_mat)

[[-0.05117679 -0.62723348 -0.04125068 -1.29827409 -0.0655964  -1.00543125
  -1.0831345   0.01155352  0.24766376  0.94264478]
 [-0.63283927 -0.03772665 -0.69827689 -0.1052899   0.61800768 -1.55363743
  -0.34921845  1.1285718  -0.7752054  -1.85557745]
 [ 0.81062212 -1.25557838 -0.88833737  0.50138339 -0.44753964  0.7569975
  -0.17064013  0.57047556 -0.32382926 -0.47187674]
 [-1.12529374 -0.23521637 -0.61800694 -0.56430634  0.4527447   0.03960641
   0.47565486 -0.80647691 -1.0366784   0.24002272]
 [ 1.57188442  0.08403489  0.66643652 -0.24020642 -0.23311341  0.01853213
  -1.46649328 -1.25634881 -0.98202555 -0.05224428]
 [ 1.84854548  1.43911413 -0.90022799  1.56078049 -0.20948503 -0.740694
   0.2036504  -0.77925612  0.20793743 -0.02193292]
 [ 0.38801326  0.15731882 -0.62093488 -1.38791008 -0.36185926 -0.1668012
   0.0717035   0.16855269  0.16238424  0.74886967]
 [ 1.02804171 -0.55283394 -0.66957722 -0.47886764  0.10912133  0.18720818
   0.45316169 -1.40592268  0.15264444  0.14787058]
 [-0

Got a matrix you need to keep for later? No problem at all! Numpy's got your back with `savetxt`. It neatly tucks away your matrices, defaulting to a format where values are separated by spaces.

But hey, maybe you're thinking, "I want something even more efficient." Well, you're in luck! Numpy also offers the save function. This little helper stores your arrays in a special `.npy` format, which makes reading and writing arrays a breeze.

And when you're ready to bring back those saved arrays, just call on `load`, and it'll retrieve them in a jiffy!
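
Here is a minimal sketch of the text-file round trip with `savetxt` and `loadtxt`. The array `small`, the file name and the use of a temporary folder are all assumptions made so the example runs on its own; in the notebook you would use your own matrix and path.

```python
import os
import tempfile
import numpy as np

small = np.array([[1.5, 2.0], [3.25, 4.0]])
out_dir = tempfile.mkdtemp()               # throwaway folder for this sketch
path = os.path.join(out_dir, 'small.txt')

# write comma-separated values instead of the default space-separated ones
np.savetxt(path, small, delimiter=',')
# read them back, telling loadtxt about the same delimiter
restored = np.loadtxt(path, delimiter=',')
print(restored)
print(np.array_equal(small, restored))
```

The delimiter used for saving and loading has to match, otherwise `loadtxt` cannot split the lines correctly.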

In [10]:
# saving in npy format
np.save('temp/matrix.npy', new_mat)
# loading in npy.format
new_mat2 = np.load('temp/matrix.npy')
print(new_mat2)

[[-0.05117679 -0.62723348 -0.04125068 -1.29827409 -0.0655964  -1.00543125
  -1.0831345   0.01155352  0.24766376  0.94264478]
 [-0.63283927 -0.03772665 -0.69827689 -0.1052899   0.61800768 -1.55363743
  -0.34921845  1.1285718  -0.7752054  -1.85557745]
 [ 0.81062212 -1.25557838 -0.88833737  0.50138339 -0.44753964  0.7569975
  -0.17064013  0.57047556 -0.32382926 -0.47187674]
 [-1.12529374 -0.23521637 -0.61800694 -0.56430634  0.4527447   0.03960641
   0.47565486 -0.80647691 -1.0366784   0.24002272]
 [ 1.57188442  0.08403489  0.66643652 -0.24020642 -0.23311341  0.01853213
  -1.46649328 -1.25634881 -0.98202555 -0.05224428]
 [ 1.84854548  1.43911413 -0.90022799  1.56078049 -0.20948503 -0.740694
   0.2036504  -0.77925612  0.20793743 -0.02193292]
 [ 0.38801326  0.15731882 -0.62093488 -1.38791008 -0.36185926 -0.1668012
   0.0717035   0.16855269  0.16238424  0.74886967]
 [ 1.02804171 -0.55283394 -0.66957722 -0.47886764  0.10912133  0.18720818
   0.45316169 -1.40592268  0.15264444  0.14787058]
 [-0

# Exercise 1: Creating numpy arrays

Let's put what we've learnt so far into practice! Here are some fun exercises for you to try out:

1) Get those fingers typing and create some arrays using the nested list notation.

- Go ahead and make a numpy array with 6 integer values using the function `np.array`.

- Then, let's get to know your array a bit better. Print out its various attributes: `shape`, `size`, `ndim`, `dtype`.

- Fancy a bit of variation? Repeat the process, but this time create an array with float values, and then another one with string values.

- Feeling adventurous? Make a 2D array with dimensions 3x2, and print out its shape and dimensions.

- And if you're up for a challenge, how about trying your hand at creating a 3D array?

2) Time to play architect! Create a neat 3x4 array filled with zeros using the `np.zeros` function.

3) Add a dash of unpredictability! Use `np.random.randint` to create a 2x4 array filled with random integers in the range of 10 to 20.

4) Lastly, let's go for a bit of order amidst the chaos. Create an array that neatly lines up every even number from 0 to 20.

Take your time, and most importantly, have fun with it! Remember, practice makes perfect, and these exercises are a great way to reinforce what you've learned. Happy coding!

In [11]:
# Exercise 1.1. To do - try creating arrays from 1D, 2D and 3D nested lists, print attributes
ex1_1a = np.array([1,2,3])
print(ex1_1a)
print(ex1_1a.shape)
print()
ex1_1b = np.array([[1,2,3],[1,2,3]])
print(ex1_1b)
print(ex1_1b.shape)
print()
ex1_1c = np.array([[[1,2,3],[1,2,3]],[[1,2,3],[1,2,3]]])
print(ex1_1c)
print(ex1_1c.shape)
print()

# Exercise 1.2. Create a 3x4 array full of zeros
ex1_2 = np.zeros((3,4))
print(ex1_2)
print()

# Exercise 1.3. Create a 2x4 array of random integers in the range 10 to 20 (note 20 itself is excluded)
ex1_3 = np.random.randint(10,20,(2,4))
print(ex1_3)
print()

# Exercise 1.4. Create an array that returns every even number from 0 to 20
ex1_4 = np.arange(0,21,2)
print(ex1_4)

[1 2 3]
(3,)

[[1 2 3]
 [1 2 3]]
(2, 3)

[[[1 2 3]
  [1 2 3]]

 [[1 2 3]
  [1 2 3]]]
(2, 2, 3)

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]

[[14 17 15 13]
 [13 16 12 10]]

[ 0  2  4  6  8 10 12 14 16 18 20]




## 4.4 Indexing, Slicing and Iterating

Just like lists, arrays are zero-indexed, which means the counting begins from 0, not 1. Negative indices count backwards from the end of the array:

In [12]:
my_array = np.array([1,2,3,4,5,5,6,7,9,10])

print('Value of first item of array:', my_array[0])
print('Value of last item of array:', my_array[-1])
print('Value of penultimate item of array:', my_array[-2])

Value of first item of array: 1
Value of last item of array: 10
Value of penultimate item of array: 9


Slicing is a super useful feature in Python that we can also use with NumPy arrays. For each dimension of our array, a slice takes the form `start:end:step`. Any of the three can be left out, in which case the slice starts at the beginning, runs to the end, or moves in steps of 1. It's just like making precise cuts on a piece of cake (or any of your favorite treats!). Let's look at some examples:

In [13]:
my_matrix = np.array([[1,2,3,4,5,6],[7,8,9,10,11,12]]) # redefined from above
print('my_matrix =\n', my_matrix)

# to return a whole row (with all columns); specifically the first row
print('the first row:', my_matrix[0,:])

# to return a whole column (with all rows); specifically the fourth column (indexed by 3)
print('the fourth column (and all rows)', my_matrix[:,3])

print('the sub-matrix given by columns with indices = 1 and 2:')
# this returns all rows with ':' and then the 2nd and 3rd columns (indices 1 and 2),
# as the slice does not include the last index in the range (similar to range and arange)
print(my_matrix[:,1:3])

my_matrix =
 [[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]]
the first row: [1 2 3 4 5 6]
the fourth column (and all rows) [ 4 10]
the sub-matrix given by columns with indices = 1 and 2:
[[2 3]
 [8 9]]
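
The examples above only use a start and an end. The sketch below adds the step and a negative start, reusing the same `my_matrix` values (redefined here so the block runs on its own; the other names are illustrative).

```python
import numpy as np

my_matrix = np.array([[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]])

# all rows, every second column (indices 0, 2 and 4)
every_other = my_matrix[:, ::2]
print(every_other)

# the first row read backwards, using a step of -1
reversed_row = my_matrix[0, ::-1]
print(reversed_row)

# all rows, the last two columns
last_two = my_matrix[:, -2:]
print(last_two)
```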


When you're slicing an array, it's good to remember that, just as with list slicing, the last index in the range isn't included. And if you want everything from a particular dimension, you can use `:` all on its own. That'll give you all the indices (rows, columns, etc.) for that dimension.

Now, let's chat about integer indexing. While slicing extracts sub-matrices in fixed ranges, integer indexing (using lists) lets you grab any values you want from your array, no matter where they are. It's like picking out all your favorite candies from a mixed bag. Just provide a list of the indices you're after, and you're good to go! Check it out:

In [14]:
# this index extracts one row (corresponding to index 0) and 3 columns (corresponding to indices 1, 3 and 5)
print('row 1 and columns with indices = 1, 3, 5:', my_matrix[0,[1,3,5]])

row 1 and columns with indices = 1, 3, 5: [2 4 6]
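
Integer lists can be used for rows and columns at the same time. In that case NumPy pairs the lists up element by element, which is worth seeing once. A small sketch, again with `my_matrix` redefined so it is self-contained (the names `pairs` and `picked_cols` are made up):

```python
import numpy as np

my_matrix = np.array([[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]])

# two index lists are paired up: this picks elements (0, 1) and (1, 3)
pairs = my_matrix[[0, 1], [1, 3]]
print(pairs)

# a slice for the rows and a list for the columns keeps every row
picked_cols = my_matrix[:, [1, 3, 5]]
print(picked_cols)
```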


It is also possible to pick out the elements of an array whose values meet certain boolean conditions.

This is one of the cool things about NumPy arrays. Imagine you're trying to find all the numbers in your array that meet a certain condition, say, all the numbers greater than 4. Comparing the array against a value gives you a boolean array of the same shape, with `True` wherever the condition holds, and using that array as an index is called 'boolean indexing'. It's like having a treasure map that only shows you the spots where the treasure is! Here's an example of how it works:

In [15]:
print('Boolean mask marking the values from my_matrix that are > 4:')
boolean_cond = (my_matrix > 4)
print(boolean_cond)

Boolean mask marking the values from my_matrix that are > 4:
[[False False False False  True  True]
 [ True  True  True  True  True  True]]


This can be useful when, for example, the goal is to mask one array using the values of another.

Think of it like using one array as a stencil to pick out just the bits you want from another array. Let's say you have two arrays of the same shape and you want to replace some values in the first one based on conditions from the second one. You can totally do that with numpy arrays! You just need to use the second array to create a mask and then apply it to the first one. It's like using a sieve to filter out the grains of sand you don't want. Let's see it in action!

In [16]:
my_matrix2 = np.array([[11,12,13,14,15,16],[17,18,19,20,21,22]])
print(my_matrix2)

# this will mask my_matrix2 to return the values at indices where boolean_cond == True,
# where boolean_cond is the boolean matrix we calculated earlier using my_matrix > 4
print(my_matrix2[boolean_cond])
# Note that the result is a 1D array; it does not keep the shape of my_matrix2

[[11 12 13 14 15 16]
 [17 18 19 20 21 22]]
[15 16 17 18 19 20 21 22]
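
The cell above uses the mask to read values. The same mask can also be used to replace them, which is the "stencil" idea from the text. This is only a sketch: the replacement values 0 and -1 are arbitrary, and `np.where` is shown as one possible way to keep the original shape.

```python
import numpy as np

my_matrix = np.array([[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]])
my_matrix2 = np.array([[11, 12, 13, 14, 15, 16], [17, 18, 19, 20, 21, 22]])
mask = my_matrix > 4

# work on a copy so my_matrix2 itself stays untouched
replaced = my_matrix2.copy()
replaced[mask] = 0            # overwrite only the positions where mask is True
print(replaced)

# np.where keeps the 2D shape: take my_matrix2 where mask is True, else -1
kept = np.where(mask, my_matrix2, -1)
print(kept)
```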


## 4.5 Changing the shape of an array

On some occasions over the course it will become necessary to reshape or flatten an array before performing operations on it. NumPy has several built-in functions for this purpose: ```flatten``` and ```ravel``` squash our arrays into a single dimension, while ```reshape``` and ```resize``` change them to whatever shape fits our needs. Now, let's dive deeper into what each of these functions can do for us!

### 4.5.1 Flattening matrices/tensors into one long vector

The functions ```ravel``` and ```flatten``` seem to do the same thing on the surface, but there's a key difference between them. When `flatten` reshapes an array, it always creates a brand new copy in your computer's memory. On the other hand, `ravel` is like a thrifty tailor: whenever it can, it returns a view that keeps referring back to the original array's data, so no extra memory is used. While this makes `ravel` faster and more memory-efficient than `flatten`, it can also lead to unexpected or undesired behaviour, as you'll see in the example below.

When it comes to unraveling an array using the `ravel` and `flatten` functions in numpy, you have the option to specify the order in which the elements are flattened. The "order" parameter allows you to choose between three options: "C" (default), "F", and "A".

- "C" represents the row-major ordering, which is the same as how it would be done in languages like C/C++. It flattens the array in a way where the row index varies slowest, and the column index varies quickest.

- "F" stands for Fortran column-major ordering. It flattens the array in a column-major fashion, where the column index varies slowest and the row index varies quickest.

- "A" is a special option that follows the memory layout of the array itself: it flattens in Fortran order if the array is stored Fortran-contiguously in memory, and in C order otherwise. The layout is recorded in the array's `flags` attribute.

By specifying the order parameter, you have the flexibility to choose how you want the array to be flattened.

**Note:** The term "unraveling" in the context of arrays refers to the process of transforming a multidimensional array into a one-dimensional array. It essentially means flattening the array, collapsing all the dimensions into a single sequence of elements. This can be useful when you want to work with the array in a linear fashion or when you need to apply certain operations that expect a one-dimensional input. Unraveling allows you to access each element of the array individually, regardless of its original shape or dimensions.

In [17]:
A = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
print(A)
Flattened_X = A.flatten()
print('the result of flattening A', Flattened_X)
print('flattening A with row major ordering', A.flatten(order="C"))
print('flattening A with column major ordering', A.flatten(order="F"))
print('flattening A preserving the C/Fortran style of the original array', A.flatten(order="A")) # in this case it will be C ordering

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
the result of flattening A [ 1  2  3  4  5  6  7  8  9 10 11 12]
flattening A with row major ordering [ 1  2  3  4  5  6  7  8  9 10 11 12]
flattening A with column major ordering [ 1  4  7 10  2  5  8 11  3  6  9 12]
flattening A preserving the C/Fortran style of the original array [ 1  2  3  4  5  6  7  8  9 10 11 12]


This ordering behaviour is the same for both ```ravel``` and ```flatten```. The important distinction is in how they handle the **memory** and the **relationship with the original array**.

When you use `flatten`, it creates a new array `Flattened_X` that is a complete copy of the original array `A`. Any modifications made to `Flattened_X` won't affect the values in `A`, as they are separate entities in memory.

On the other hand, `ravel` avoids copying whenever it can. For an array like `A` it returns a view that shares the same memory as `A`, which means that any modifications made to the output of `ravel` will also affect the values in `A`.

Therefore, it's crucial to consider this difference and choose the appropriate function based on whether you want the modifications to propagate back to the original array or not.

In [18]:
Flattened_X[0] = 100
print('We see that after flatten() any changes to the new array \n {} do not impact A \n {}'.format(Flattened_X,A))

B = A.ravel()
B[0] = 200

print('On the other hand, changing the output of ravel (B): \n {} does change A \n {} this is because they point to the same location in memory'.format(B,A))

We see that after flatten() any changes to the new array 
 [100   2   3   4   5   6   7   8   9  10  11  12] do not impact A 
 [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
On the other hand, changing the output of ravel (B): 
 [200   2   3   4   5   6   7   8   9  10  11  12] does change A 
 [[200   2   3]
 [  4   5   6]
 [  7   8   9]
 [ 10  11  12]] this is because they point to the same location in memory
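
If you are ever unsure whether two arrays share memory, you don't have to change a value to find out: `np.shares_memory` will tell you. The sketch below uses a small made-up array `M` and also shows a case where `ravel` has no choice but to copy (flattening a transposed array in C order).

```python
import numpy as np

M = np.array([[1, 2, 3], [4, 5, 6]])

# flatten always copies, so nothing is shared
flat_shared = np.shares_memory(M, M.flatten())
# ravel of a C-ordered array is a view onto the same data
ravel_shared = np.shares_memory(M, M.ravel())
# the transpose is not C-contiguous, so here ravel has to make a copy
transposed_shared = np.shares_memory(M, M.T.ravel())

print(flat_shared, ravel_shared, transposed_shared)
```

This is why it is safer not to rely on `ravel` always returning a view.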


### 4.5.2 General reshaping

Alternatively, arrays can be reshaped to other sizes using NumPy's ```reshape``` and ```resize``` functions.

The `reshape` method returns a new array object with a different shape and leaves the shape of the original array untouched. Wherever possible the result is a view, so it shares its values with the original array rather than copying them.

On the other hand, the `resize` method modifies the original array itself, in place, to the desired shape. It's like using a pair of scissors to cut a piece of fabric into a new shape.

So, depending on whether you want a new array with a different shape or want to modify the original array, you can choose between `reshape` and `resize` accordingly.

In [19]:
A = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])

new_A = A.reshape((2,6))
print('original shape: {}; new shape {} - note A retains its original shape'.format(A.shape,new_A.shape) )
print(new_A)

# now resize the original array
A.resize((2,6))
print('note following resize the original array shape is changed however', A.shape)

original shape: (4, 3); new shape (2, 6) - note A retains its original shape
[[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]]
note following resize the original array shape is changed however (2, 6)
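
Two details of `reshape` are worth a quick sketch: you can pass `-1` for one dimension and let NumPy work it out, and the reshaped array usually still shares its data with the original. The array `M` and the other names below are made up for illustration.

```python
import numpy as np

M = np.arange(1, 13).reshape(4, 3)   # the numbers 1..12 in a 4x3 matrix

# -1 asks NumPy to infer that dimension: 12 values / 2 rows = 6 columns
wide = M.reshape(2, -1)
print(wide.shape)

# the reshaped array is a view here, so writing to it also changes M
wide[0, 0] = 100
print(M[0, 0])

# the total number of elements must stay the same, otherwise reshape fails
try:
    M.reshape(5, -1)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print('cannot reshape:', err)
```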


The values are laid out in the new shape in C-style (row-major) order by default, so the last index changes fastest.

### 4.5.3 Concatenation

When concatenating arrays in NumPy, you create a new array by combining two (or more) arrays that have at least one compatible dimension. This can be done using functions like `np.concatenate`, `np.vstack`, or `np.hstack`.

The arrays to be concatenated must have matching sizes along every axis except the one you concatenate along. For example, if you have two 1D arrays, you can concatenate them to create a larger 1D array. If you have two 2D arrays, you can concatenate them along either the rows (vertical concatenation) or the columns (horizontal concatenation).

The order in which the arrays are concatenated follows the order in which they are provided as arguments to the concatenation function. It's like stacking building blocks one on top of the other or side by side to create a larger structure:

In [20]:
B = np.random.randint(100, size=(2,6))
A_and_B = np.concatenate((A,B),axis=1) # first argument is a tuple containing arrays to be concatenated in order
print(A)
print(B)
print(A_and_B)
print('shape of concatenated array', A_and_B.shape)

[[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]]
[[76 58 11 63 57 39]
 [37 84 15  5 95 65]]
[[ 1  2  3  4  5  6 76 58 11 63 57 39]
 [ 7  8  9 10 11 12 37 84 15  5 95 65]]
shape of concatenated array (2, 12)


**Exercise:** What happens when you change the axis of concatenation to 0? (Setting the axis to 0 means concatenating the arrays along the vertical axis, i.e. stacking the rows.)

Other operations for changing array shape and size can be found at https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.array-manipulation.html

In [21]:
B = np.random.randint(100, size=(2,6))
A_and_B = np.concatenate((A,B),axis=0) # first argument is a tuple containing arrays to be concatenated in order
print(A)
print(B)
print(A_and_B)
print('shape of concatenated array', A_and_B.shape)

[[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]]
[[64 15 45  9 22 40]
 [55 86  3 33 82 99]]
[[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]
 [64 15 45  9 22 40]
 [55 86  3 33 82 99]]
shape of concatenated array (4, 6)
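
The `np.vstack` and `np.hstack` functions mentioned earlier are shortcuts for these two cases. The sketch below uses two small made-up matrices, `P` and `Q`, to show that they give the same result as `np.concatenate` with `axis=0` and `axis=1`.

```python
import numpy as np

P = np.array([[1, 2, 3], [4, 5, 6]])
Q = np.array([[7, 8, 9], [10, 11, 12]])

# vstack stacks vertically: same as concatenate along axis 0
stacked_rows = np.vstack((P, Q))
print(stacked_rows.shape)
print(np.array_equal(stacked_rows, np.concatenate((P, Q), axis=0)))

# hstack stacks horizontally: same as concatenate along axis 1 for 2D arrays
stacked_cols = np.hstack((P, Q))
print(stacked_cols.shape)
print(np.array_equal(stacked_cols, np.concatenate((P, Q), axis=1)))
```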


## 4.6 Broadcasting

Broadcasting is like a magical wand that allows us to perform operations on arrays of different shapes, as long as their dimensions match or can be adjusted to match. It's like having a superhero power that brings arrays of different shapes into alignment!

Here's how it works. Let's say we have two arrays, a matrix with a shape of (3, 4) and a vector with a shape of (4,). These arrays have different numbers of dimensions, but broadcasting comes to the rescue!

First, broadcasting adds extra dimensions to the front of the smaller array's shape to match the dimensionality of the larger array. In this case, the length-4 vector is treated as a $1 \times 4$ matrix. In general, a length $n$ vector with shape $(n,)$ is treated as a $1 \times n$ matrix.

**Next, any dimension of size 1 is stretched, as if the array were copied along it as many times as needed, to match the size of the other array.** In our example, the $1 \times 4$ array is repeated 3 times along the row dimension until it has the same shape as the (3, 4) matrix. (NumPy doesn't actually make these copies in memory; it just behaves as if it had.)

Now, with both arrays having the same shape, we can perform operations on them together, for instance summing the values of the corresponding elements.

Keep in mind that broadcasting is a powerful tool, but it only works when the dimensions align: reading the two shapes from the last dimension backwards, each pair of sizes must either be equal or include a 1.

In [22]:
# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11,12]])
print(x)
v = np.array([10,20,30,40])
print(v)
y = x + v
print(y)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
[10 20 30 40]
[[11 22 33 44]
 [15 26 37 48]
 [19 30 41 52]]
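
The two broadcasting steps described above can also be written out by hand. The cell below is a minimal sketch that reuses the `x` and `v` values from the previous cell; the names `v_row` and `v_tiled` are introduced here purely for illustration. `np.broadcast_to` returns a read-only view, so nothing is actually copied.

```python
import numpy as np

x = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
v = np.array([10, 20, 30, 40])

# Step 1: add a leading axis so the vector becomes a 1x4 matrix
v_row = v[np.newaxis, :]
print(v_row.shape)    # (1, 4)

# Step 2: stretch the length-1 axis to match x (a read-only view, not a copy)
v_tiled = np.broadcast_to(v_row, x.shape)
print(v_tiled.shape)  # (3, 4)

# The explicit version gives the same result as letting NumPy broadcast
print(np.array_equal(x + v, x + v_tiled))  # True

# Shapes that cannot be aligned raise an error
try:
    x + np.array([1, 2, 3])  # (3, 4) with (3,): trailing sizes 4 and 3 differ
except ValueError as err:
    print('Broadcasting failed:', err)
```

In practice you would simply write `x + v`; the explicit steps are only there to show what NumPy does behind the scenes.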


# Exercise 2: Slicing, Reshaping and Broadcasting

Let's have some fun with the 2D array called X! Here are some exciting tasks for you to try:

1. Slice the third and sixth rows, together with the second column, from array X to return a 2 x 1 array.
2. Reshape X into a (6, 4) matrix.
3. Let your imagination run wild! What other configurations can you reshape X into?
4. Create a matrix of ones with the same size as X and concatenate it with X, first along rows and then along columns.
5. Get creative! Create a new array of size (1, 3) and add it to X using broadcasting.
6. Challenge yourself! Create a (4, 6) array and add a row vector to it using broadcasting.

Ready for the adventure? Let's go!

In [23]:
X=np.array([[0, 1, 2],[3,4,5],[6,7,8], [9,10,11], [12,13,14], [15,16,17],[18,19,20],[21,22,23]])

# To do:  print X shape
print('X shape {}'.format(X.shape))

# 2.1 slice the third and sixth rows, with the second column, to return a 2 x 1 array
print(X[[2,5],1:2].shape) # note use of the slice 1:2 (rather than the index 1) in the column index ensures return of 2x1 matrix rather than length 2 vector

# 2.2. reshape X into a (6,4) matrix
newX=X.reshape(6,4)
print('newX shape {}'.format(newX.shape))

# 2.3 What other configuration can you reshape X into?
print(X.reshape(3,8))
print(X.reshape(2,12))

# 2.4 Create a matrix of ones in the same size as X; concatenate with X first on rows then columns
my_ones=np.ones(X.shape)
my_concat=np.concatenate((my_ones,X))
print(' Shape after row concatenation {}'.format(my_concat.shape))
print(' Shape after column concatenation {}'.format(np.concatenate((my_ones,X),axis=1).shape))

# 2.5 Create a new array of size (1,3) add it to X using broadcasting
new=10*np.ones((1,3))
broadcasting_ex=new+X
print('result of broadcasted sum of \n {} \n with {} is \n {}'.format(X,new,broadcasting_ex))

# 2.6 Create an array of size (4,6) and add a row vector to it by broadcasting
A=np.random.randint(0,10,(4,6))
b=np.array([1,2,3,4,5,6])

print('A=',A)
print('A+b=',A+b)


X shape (8, 3)
(2, 1)
newX shape (6, 4)
[[ 0  1  2  3  4  5  6  7]
 [ 8  9 10 11 12 13 14 15]
 [16 17 18 19 20 21 22 23]]
[[ 0  1  2  3  4  5  6  7  8  9 10 11]
 [12 13 14 15 16 17 18 19 20 21 22 23]]
 Shape after row concatenation (16, 3)
 Shape after column concatenation (8, 6)
result of broadcasted sum of 
 [[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]
 [12 13 14]
 [15 16 17]
 [18 19 20]
 [21 22 23]] 
 with [[10. 10. 10.]] is 
 [[10. 11. 12.]
 [13. 14. 15.]
 [16. 17. 18.]
 [19. 20. 21.]
 [22. 23. 24.]
 [25. 26. 27.]
 [28. 29. 30.]
 [31. 32. 33.]]
A= [[6 2 9 6 5 0]
 [0 6 8 8 5 3]
 [7 8 6 2 0 1]
 [5 6 2 6 7 7]]
A+b= [[ 7  4 12 10 10  6]
 [ 1  8 11 12 10  9]
 [ 8 10  9  6  5  7]
 [ 6  8  5 10 12 13]]


## 4.7 Array Operations

Numpy is like a magical toolbox filled with incredible functions for operating on arrays. In this section, we'll explore some of the most commonly used numpy functions that will come in handy throughout this course.

Let's start with a few examples of elementwise operations. The familiar operators (`+, -, *, /`) all work elementwise on arrays, and numpy provides equivalent functions (`add, subtract, multiply, divide`).

Not only that, numpy also lets us perform elementwise operations between an array and a scalar, thanks to its broadcasting magic! It's like having a magic wand that extends the scalar value to match the size of the array. Let's take a look at some examples to bring this to life:

In [24]:
# define arrays as floats; ensures results of operations are also expressed as floats

scalar = 5
A = np.array([[1,2],[3,4]], dtype=np.float64)
B = np.array([[10,20],[30,40]], dtype=np.float64)

# Elementwise sum; can use + or add
print('Elementwise sum between array {} and scalar {} using + : {}'.format(A,scalar,A + scalar))
print('Elementwise sum using + :')
print(A + B)
print('Elementwise sum using np.add:')
print(np.add(A, B))

# Elementwise difference; can use - or subtract
print('Elementwise subtract using - :')
print(A - B)
print('Elementwise difference using np.subtract :')
print(np.subtract(A, B))

# Elementwise product; can use * or multiply
print('Elementwise multiply using * :')
print(A * B)
print('Elementwise multiply using np.multiply :')
print(np.multiply(A, B))

# Elementwise division; can use / or divide
print('Elementwise division using / :')
print(A/B)
print('Elementwise division using np.divide :')
print(np.divide(A,B))

Elementwise sum between array [[1. 2.]
 [3. 4.]] and scalar 5 using + : [[6. 7.]
 [8. 9.]]
Elementwise sum using + :
[[11. 22.]
 [33. 44.]]
Elementwise sum using np.add:
[[11. 22.]
 [33. 44.]]
Elementwise subtract using - :
[[ -9. -18.]
 [-27. -36.]]
Elementwise difference using np.subtract :
[[ -9. -18.]
 [-27. -36.]]
Elementwise multiply using * :
[[ 10.  40.]
 [ 90. 160.]]
Elementwise multiply using np.multiply :
[[ 10.  40.]
 [ 90. 160.]]
Elementwise division using / :
[[0.1 0.1]
 [0.1 0.1]]
Elementwise division using np.divide :
[[0.1 0.1]
 [0.1 0.1]]
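
The cell above defines its arrays as floats. The sketch below illustrates why that matters and how scalar broadcasting behaves; the integer arrays `A_int` and `B_int` and the flag `same` are names made up for this illustration.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]], dtype=np.float64)
scalar = 5

# Adding a scalar behaves like adding an array filled with that scalar
same = np.array_equal(A + scalar, A + np.full(A.shape, scalar))
print(same)  # True

# With integer arrays, + keeps an integer dtype but / always returns floats
A_int = np.array([[1, 2], [3, 4]])
B_int = np.array([[10, 20], [30, 40]])
print((A_int + B_int).dtype)  # an integer dtype (exact width is platform dependent)
print((A_int / B_int).dtype)  # float64
print(A_int // B_int)         # floor division stays integer: all zeros here
```

If you need integer results from a division, use `//` (or `np.floor_divide`); otherwise `/` is the safer default.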


In the vast world of numpy, you'll discover a treasure trove of matrix operations waiting to be explored. However, it's essential to understand how these operations differ when working with vectors, matrices, and higher dimensional tensors.

When it comes to 1D arrays, numpy provides three distinct functions for calculating the dot product: `np.inner`, `np.dot`, and `np.matmul`. These functions come into play when we want to compute the dot product of two vectors. However, as we venture into higher dimensions, their behaviors begin to diverge.

So, grab your adventurer's hat and get ready to uncover the secrets of numpy's matrix operations!

**`np.inner`**: For 1D arrays this is the ordinary dot product. For two matrices $A$ and $B$, `np.inner` computes the matrix product $AB^T$: it multiplies and sums corresponding elements from the rows of $A$ and the rows of $B$.

**`np.dot`** and **`np.matmul`**: These functions compute the usual matrix product of two matrices, $A$ and $B$. They multiply and sum elements from the rows of $A$ and the columns of $B$, so both return $AB$.

To put it simply, `np.inner` pairs the rows of $A$ with the rows of $B$, while `np.dot` and `np.matmul` pair the rows of $A$ with the columns of $B$.

It's crucial to select the appropriate function based on the desired matrix product and the dimensionality of the arrays involved. By understanding the nuances of these functions, you can navigate the world of matrix operations with confidence and precision.

So, venture forth and harness the power of numpy's matrix operations, knowing that you have the right tools at your disposal!

In [25]:
a = np.array([1,2,3])
b = np.array([0,1,0])

# perform inner product
# for vectors all three of these methods return
# the scalar a1b1 + a2b2 + a3b3
print('inner product of vectors :')
print(np.inner(a,b))

print('dot product of vectors :')
print(np.dot(a,b))

print('matrix product of vectors :')
print(np.matmul(a,b))

# For 2D arrays inner returns the sum product over rows of A and rows of B (A times B transposed)
print('For matrices \n A = {} \n B = {} \n inner product = {} '.format(A,B,np.inner(A,B)))

# whereas matmul and dot return matrix product
print('For matrices dot product =\n{} '.format(np.dot(A,B)))
print('For matrices matrix product =\n{} '.format(np.matmul(A,B)))

inner product of vectors :
2
dot product of vectors :
2
matrix product of vectors :
2
For matrices 
 A = [[1. 2.]
 [3. 4.]] 
 B = [[10. 20.]
 [30. 40.]] 
 inner product = [[ 50. 110.]
 [110. 250.]] 
For matrices dot product =
[[ 70. 100.]
 [150. 220.]] 
For matrices matrix product =
[[ 70. 100.]
 [150. 220.]] 
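
A quick way to convince yourself of the difference is to compare each function against the `@` operator, which is shorthand for `np.matmul`. This is a small sketch using the same `A` and `B` values as above:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]], dtype=np.float64)
B = np.array([[10, 20], [30, 40]], dtype=np.float64)

# np.inner pairs rows of A with rows of B, i.e. A times B transposed
print(np.allclose(np.inner(A, B), A @ B.T))  # True

# np.dot and np.matmul pair rows of A with columns of B
print(np.allclose(np.dot(A, B), A @ B))      # True
print(np.allclose(np.matmul(A, B), A @ B))   # True
```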


As we venture into higher dimensions, the behavior of `np.dot` and `np.matmul` in numpy diverges. When working with higher dimensional arrays, it's important to understand the differences between these functions.

Here's a brief overview:

**`np.dot`**: When applied to two arrays A and B, `np.dot` computes the sum product over the last axis of A and the penultimate (second-to-last) axis of B, for every combination of the remaining axes of A and B.

**`np.matmul`**: This function is specifically designed for matrix multiplication and treats higher dimensional arrays as stacks of matrices held in the last two axes. When used with two arrays A and B, `np.matmul` multiplies the matching matrices in each stack, broadcasting the remaining axes where needed.

In a nutshell, when it comes to matrix and vector multiplication operations, it's generally safest to use `np.matmul`. It's specifically designed for matrix multiplication and provides consistent results across different dimensionalities.
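
The difference is easiest to see from the output shapes. The sketch below uses two small 3D arrays, `S` and `T`, invented for this illustration:

```python
import numpy as np

# Two stacks of matrices: 2 matrices of shape 3x4 and 2 matrices of shape 4x5
S = np.arange(24).reshape(2, 3, 4)
T = np.arange(40).reshape(2, 4, 5)

# matmul multiplies the matching matrices in the two stacks
stacked = np.matmul(S, T)
print(stacked.shape)   # (2, 3, 5)

# dot combines every matrix in S with every matrix in T
combined = np.dot(S, T)
print(combined.shape)  # (2, 3, 2, 5)

# The matmul result is the "matching pairs" part of the dot result
print(np.array_equal(stacked[0], combined[0, :, 0, :]))  # True
print(np.array_equal(stacked[1], combined[1, :, 1, :]))  # True
```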

In addition to `np.matmul`, numpy offers other useful matrix product operations that you may find beneficial. Some of these include:

**`np.vdot`**: Calculates the dot product between two vectors and flattens the input arrays before performing the computation.

**`np.outer`**: Computes the outer product of two vectors, resulting in a matrix.

**`np.tensordot`**: Allows for tensor contractions along specified axes, resulting in a new array.

**`np.einsum`**: Provides a powerful mechanism for performing a variety of tensor operations, including matrix multiplication, contraction, and summation, using a compact string notation.

These additional matrix product operations offer versatility and flexibility in performing various tensor operations.

So, with **`np.matmul`** as your trusted companion and these additional functions at your disposal, you're well-equipped to conquer the world of matrix operations in numpy!
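
Here is a short sketch of `np.vdot`, `np.tensordot` and `np.einsum` in action, reusing the vectors and matrices from the earlier cells. The index strings passed to `np.einsum` are just one way of spelling out these particular products.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([0, 1, 0])
A = np.array([[1, 2], [3, 4]], dtype=np.float64)
B = np.array([[10, 20], [30, 40]], dtype=np.float64)

# vdot flattens its inputs, so for matrices it sums all elementwise products
print(np.vdot(A, B))  # 300.0

# tensordot with axes=1 contracts the last axis of A with the first axis of B,
# which for 2D arrays is the ordinary matrix product
print(np.allclose(np.tensordot(A, B, axes=1), A @ B))    # True

# einsum spells out the indices: sum over the shared index j
print(np.allclose(np.einsum('ij,jk->ik', A, B), A @ B))  # True

# einsum can also express the outer product of two vectors
print(np.array_equal(np.einsum('i,j->ij', a, b), np.outer(a, b)))  # True
```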

**In summary:**

The TL;DR ("Too Long; Didn't Read") of all this is that you are probably safest using `np.matmul()` for your matrix and vector multiplication operations.

Other potentially useful operations include the outer product, the cross product and the matrix transpose:

In [26]:
# perform outer product =
# [[a1b1, a1b2, a1b3],
#  [a2b1, a2b2, a2b3],
#  [a3b1, a3b2, a3b3]]
print('outer product :')
print(np.outer(a,b))

# perform cross product =
# [(a2b3 - a3b2), (a3b1 - a1b3), (a1b2 - a2b1)]
print('cross product :')
print(np.cross(a,b))

# computing a matrix transpose
print('matrix transpose:')
print(A.transpose()) # note here use of transpose as an array method; equally accessible as np.transpose(A) or A.T

outer product :
[[0 1 0]
 [0 2 0]
 [0 3 0]]
cross product :
[-3  0  1]
matrix transpose:
[[1. 3.]
 [2. 4.]]


Operations for matrix factorisation are also available in `np.linalg`, including eigendecomposition (`eig`) and Cholesky decomposition (`cholesky`) for square matrices, and singular value decomposition (`svd`), which also works for non-square matrices:

In [27]:
# first compute the eigenvalues and eigenvectors of a square matrix
# A = P diag(Q) P^(-1), with the eigenvalues in the 1D array Q and the eigenvectors in the columns of P
mat1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
Q,P = np.linalg.eig(mat1)
print('Eigenvalues of square matrix:',Q)
print('Eigenvectors of square matrix:')
print(P)

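# now compute the singular values and singular vectors of a non-square matrix using svd
# svd decomposition:
# A=UDV* (where V* is conjugate transpose of V ; U and V are the left and right singular vectors of A,
# D is a diagonal matrix containing singular values)
# note np.linalg.svd returns the singular values as a 1D array (d) and returns V* (stored here as v) rather than V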
mat2 = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11,12]])
u,d,v = np.linalg.svd(mat2)
print('Singular values of non-square matrix:',d)
print('left singular vectors of non-square matrix:')
print(u)
print('right singular vectors of non-square matrix:')
print(v)

Eigenvalues of square matrix: [ 1.61168440e+01 -1.11684397e+00 -8.58274334e-16]
Eigenvectors of square matrix:
[[-0.23197069 -0.78583024  0.40824829]
 [-0.52532209 -0.08675134 -0.81649658]
 [-0.8186735   0.61232756  0.40824829]]
Singular values of non-square matrix: [2.54368356e+01 1.72261225e+00 2.64839734e-16]
left singular vectors of non-square matrix:
[[ 0.20673589  0.88915331  0.40824829]
 [ 0.51828874  0.25438183 -0.81649658]
 [ 0.82984158 -0.38038964  0.40824829]]
right singular vectors of non-square matrix:
[[ 0.40361757  0.46474413  0.52587069  0.58699725]
 [-0.73286619 -0.28984978  0.15316664  0.59618305]
 [ 0.26153473 -0.70741401  0.63022384 -0.18434456]
 [-0.48124795  0.44672745  0.55028893 -0.51576844]]
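
It is good practice to check that a factorisation really reproduces the original matrix. The sketch below does this for the two decompositions above and adds a small Cholesky example; the matrix `spd` is a made-up symmetric positive-definite matrix, and `full_matrices=False` is used so that the SVD factors have shapes that multiply together directly.

```python
import numpy as np

mat1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
Q, P = np.linalg.eig(mat1)

# Each column of P is an eigenvector: mat1 @ P equals P with column j scaled by Q[j]
print(np.allclose(mat1 @ P, P * Q))  # True

mat2 = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
# full_matrices=False returns the reduced shapes (3, 3), (3,) and (3, 4)
u, d, vh = np.linalg.svd(mat2, full_matrices=False)
print(np.allclose(u @ np.diag(d) @ vh, mat2))  # True

# Cholesky needs a symmetric positive-definite matrix (hypothetical example values)
spd = np.array([[4.0, 2.0], [2.0, 3.0]])
chol = np.linalg.cholesky(spd)  # lower-triangular factor
print(np.allclose(chol @ chol.T, spd))  # True
```

Note that the tiny third eigenvalue and third singular value printed above (of order 1e-16) are zero up to floating point rounding.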


There are many more functions to explore and discover as we progress through the lecture series. If you're curious about the full range of functions available in numpy, you can always refer to the comprehensive library manual at: [numpy library manual](https://numpy.org/doc/stable/reference/index.html#reference).

This resource will be your go-to guide, providing detailed information on various mathematical functions [here](https://numpy.org/doc/stable/reference/routines.math.html) and linear algebra operations [here](https://numpy.org/doc/stable/reference/routines.linalg.html).

In addition to the library manual, you can also rely on online resources like Stack Overflow to search for specific functions or seek assistance from the vibrant programming community. With numpy being widely used, you'll find a wealth of examples and discussions readily available.

# Exercise 3: Array Operations

Numpy offers a wide range of operations that can come in handy for various tasks. Some reduce an array to a single summary value, while others are applied elementwise and return an array of the same shape. Here are some of the commonly used ones:

**`mean`**: Calculates the mean value of all elements in an array.

**`std`**: Computes the standard deviation of all elements in an array.

**`var`**: Computes the variance of all elements in an array.

**`sum`**: Calculates the sum of all elements in an array.

**`sqrt`**: Computes the square root of each element in an array.

**`fabs`**: Returns the absolute value of each element in an array.

**`exp`**: Calculates the exponential of each element in an array.

**`log`**: Computes the natural logarithm of each element in an array.

Additionally, you can find more examples and information about other mathematical functions in the numpy documentation [here](https://numpy.org/doc/stable/reference/routines.math.html).

Now let's try out some of these functions and explore their capabilities:

In [28]:
# define array with positives and negatives
C = np.array([[16,64],[-25,4]], dtype=np.float64) 

# compute mean of matrix elements (using the numpy function or the equivalent array method)
meanC=np.mean(C)
#or
meanC=C.mean()

# estimate the standard deviation and variance 
stdC=C.std()
varC=C.var()

# get absolute values 
fabsC=np.fabs(C) # fabs is not available as an array method

# estimate square root of fabsC
sqrtC=np.sqrt(fabsC) 

# estimate exponential of fabsC
expC=np.exp(fabsC)

# estimate natural log of fabsC
logC=np.log(fabsC)

# try some other functions 

# MATRIX OPERATIONS
# if you are getting confused consider comparing results against those obtained in matlab 

# create 2 3x3 matrices
A=np.random.randint(5,10,(3,3))
B=np.random.randint(1,5,(3,3))


# perform elementwise multiplication
elementwise=A*B

# perform matrix multiplication - use matmul or dot
matmul=np.dot(A,B)
matmul2=np.matmul(A,B)

print('dot product \n {} '.format(matmul))
print('matrix product \n {} '.format(matmul2))


# estimate the eigenvalues and vectors of one matrix
Q,P=np.linalg.eig(A)

dot product 
 [[99 72 81]
 [72 55 59]
 [88 64 72]] 
matrix product 
 [[99 72 81]
 [72 55 59]
 [88 64 72]] 
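
To make the distinction between reductions and elementwise functions concrete, here is a small sketch using the same `C` as above. The `axis` argument shown here is an optional extra that lets a reduction work along rows or columns instead of over the whole array.

```python
import numpy as np

C = np.array([[16, 64], [-25, 4]], dtype=np.float64)

# Reductions collapse the array: by default to a single number
print(C.sum())   # 59.0
print(C.mean())  # 14.75

# The axis argument reduces along one dimension only
col_sums = C.sum(axis=0)  # one value per column: -9 and 68
row_sums = C.sum(axis=1)  # one value per row: 80 and -21
print(col_sums, row_sums)

# Elementwise functions keep the shape of the input
roots = np.sqrt(np.fabs(C))  # values 4, 8, 5, 2 in a 2x2 array
print(roots.shape)  # (2, 2)

# The variance is the square of the standard deviation
print(np.isclose(C.var(), C.std() ** 2))  # True
```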


# Citation

Oliphant, Travis E. A Guide to NumPy. USA: Trelgol Publishing, 2006.