# An Introduction to numpy

**Requirements**: in order to execute this notebook, you will need the `numpy` package.

You can install it by either running `pip3 install --user numpy` in your terminal or executing the next cell.


N.B.: in either case, make sure that you install it for the same Python that you are using for the kernel of this notebook.

In [92]:
!pip3 install --user numpy

Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com


Now we can import the package, using `np` as shorthand for `numpy`.

In [101]:
import numpy as np

# Okay, so what is numpy?

numpy ([website](https://numpy.org/)) brands itself as "The fundamental package for scientific computing with Python". What's it good for?

1. **numpy arrays**: reach for numpy whenever you are dealing with numerical lists, matrices, or higher-dimensional arrays. numpy's arrays support vectorization, indexing, and broadcasting, which make code both efficient and easy to read.

2. **Mathematical functions**: numpy also includes comprehensive mathematical functions, random number generators, linear algebra, transforms, and much more, all of which are designed to be used over its arrays.

# Let's see some code

## numpy Arrays
Define `mylist` as a normal Python list, and `myarray` as a numpy array with identical contents.

In [104]:
mylist = [0,1,2,3,4,5]
myarray = np.array([0,1,2,3,4,5])
print(mylist)
print(myarray)

[0, 1, 2, 3, 4, 5]
[0 1 2 3 4 5]


They look nearly identical! So how is a numpy array different from a normal list?

numpy arrays are designed to hold homogeneous data (while lists can hold multiple types of objects), and are thus optimized for array operations. For example, numpy arrays support many element-wise operations that don't work with lists.

In [105]:
print(myarray + 5)            # add 5 to every element in the array
print(myarray + myarray)      # add two arrays together, element by element
print(myarray * myarray)      # multiply two arrays together, element by element

[ 5  6  7  8  9 10]
[ 0  2  4  6  8 10]
[ 0  1  4  9 16 25]


These operations don't work for normal lists. The following snippet uses a `try`/`except` block to demonstrate this: the code under `try` runs, and if it raises an error, Python executes the code under `except` instead of crashing.

In [106]:
try: 
    print(mylist * mylist)
except Exception as e:
    print(e)
    print("numpy arrays are different from lists!")

can't multiply sequence by non-int of type 'list'
numpy arrays are different from lists!
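To make the contrast concrete, here is a small sketch of what `+` and `*` do on plain lists, next to the numpy equivalents. The variable names (`joined_list`, `doubled_array`, and so on) are only illustrative.

```python
import numpy as np

mylist = [0, 1, 2]
myarray = np.array(mylist)

# On lists, + concatenates and * (with an integer) repeats.
joined_list = mylist + mylist        # [0, 1, 2, 0, 1, 2]
repeated_list = mylist * 2           # [0, 1, 2, 0, 1, 2]

# On arrays, the same operators act element by element.
summed_array = myarray + myarray     # [0 2 4]
doubled_array = myarray * 2          # [0 2 4]

# Getting element-wise behavior from a list requires an explicit loop.
doubled_list = [2 * x for x in mylist]

print(joined_list, repeated_list)
print(summed_array, doubled_array)
print(doubled_list)
```

So the list operators are not missing; they simply mean something different, which is why `mylist * mylist` has no sensible interpretation.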


We can extend these arrays to many dimensions. Here is a three-dimensional example:

In [107]:
mymatrix = np.array([[[0,1,2],[3,4,5]], [[6,7,8],[9,10,11]], [[12,13,14], [15,16,17]]])
print(mymatrix)
print(mymatrix.shape)

[[[ 0  1  2]
  [ 3  4  5]]

 [[ 6  7  8]
  [ 9 10 11]]

 [[12 13 14]
  [15 16 17]]]
(3, 2, 3)
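Besides `shape`, an array carries a few other attributes that are handy when inspecting data. The sketch below rebuilds the same values with `np.arange(...).reshape(...)` (a shortcut not used in the cell above) and also illustrates the homogeneity mentioned earlier: mixing integers and floats produces a single floating-point type.

```python
import numpy as np

# Same values as mymatrix above, built by reshaping a flat range.
mymatrix = np.arange(18).reshape(3, 2, 3)

print(mymatrix.ndim)    # number of dimensions: 3
print(mymatrix.shape)   # length along each dimension: (3, 2, 3)
print(mymatrix.size)    # total number of elements: 18

# Mixing integers and floats yields one common floating-point dtype.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)      # float64
```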


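Indexing works much the same way as for lists: single indices, slices, and negative indices all behave as expected. In addition, numpy arrays can be indexed with a list of booleans (to pick the positions marked `True`) or a list of integers (to pick those positions, in that order).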

In [108]:
mylist = [0,1,2,3,4]
myarray = np.array(mylist)
print(myarray[0])
print(myarray[1:3])
print(myarray[[True, False, False, False, True]])
print(myarray[[-1,3,2]])

0
[1 2]
[0 4]
[4 3 2]
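The same ideas carry over to multidimensional arrays, where one index or slice is given per axis, separated by commas. A minimal sketch, reusing the values of the 3D array shown earlier (the names `single`, `block`, `first_rows`, and `last_col` are illustrative):

```python
import numpy as np

mymatrix = np.arange(18).reshape(3, 2, 3)   # same values as the 3D array shown earlier

single = mymatrix[1, 0, 2]       # one element: block 1, row 0, column 2
block = mymatrix[0]              # the first 2x3 block
first_rows = mymatrix[:, 0, :]   # row 0 of every block, shape (3, 3)
last_col = mymatrix[..., -1]     # last column of every row, shape (3, 2)

print(single)
print(block)
print(first_rows)
print(last_col)
```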


## Vectorization, indexing, broadcasting, etc.

This is the real power of numpy arrays. Let's see some examples.

Let's say I want to select all positive entries in a list of numbers. This is how you would've done it in the good old days.

In [109]:
mylist = [0,-1,2,-3,4,-5,6]             # some list we want to get all positive numbers from
pos = []                                # new list to append positive numbers to
for number in mylist:
    if number > 0:
        pos.append(number)
print(pos)

[2, 4, 6]


Once you start writing more complex code, for loops like these become slow and hard to read. Let's see how numpy fixes this.

In [110]:
myarray = np.array(mylist)              # convert the list into a numpy array
pos = myarray[myarray > 0]
print(pos)

[2 4 6]
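To get a rough feel for the speed difference, the loop and the numpy version can be timed on a larger input. This is only a sketch: the helper names `positives_loop` and `positives_numpy` are made up for illustration, the input size is arbitrary, and the measured times will vary from machine to machine.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)                    # seeded generator, for reproducibility
values = rng.integers(-100, 100, size=100_000)    # random integers in [-100, 100)
values_list = values.tolist()

def positives_loop(numbers):
    """Select positive entries with an explicit Python loop."""
    pos = []
    for number in numbers:
        if number > 0:
            pos.append(number)
    return pos

def positives_numpy(array):
    """Select positive entries with a boolean mask."""
    return array[array > 0]

loop_time = timeit.timeit(lambda: positives_loop(values_list), number=10)
numpy_time = timeit.timeit(lambda: positives_numpy(values), number=10)
print(f"loop:  {loop_time:.4f} s")
print(f"numpy: {numpy_time:.4f} s")
```

Both functions return the same numbers; only the time spent differs.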


Very nice! How did it work? Let's see some details.

When we print `myarray > 0`, we see that it returns an array of booleans stating whether the **expression** that we have just written (`myarray > 0`) is True or False at each index of the array.

In [111]:
expr = myarray > 0
print(expr)

[False False  True False  True False  True]


We can then use this boolean array to index our original numpy array:

In [112]:
print(myarray[expr])

[2 4 6]
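A boolean array like `expr` is useful for more than selection. As a small sketch (the names `n_positive`, `non_positive`, and `clipped` are illustrative), it can also be counted, inverted, and used on the left-hand side of an assignment:

```python
import numpy as np

myarray = np.array([0, -1, 2, -3, 4, -5, 6])
expr = myarray > 0

n_positive = expr.sum()          # True counts as 1, so this counts the positives
non_positive = myarray[~expr]    # ~ inverts the mask

clipped = myarray.copy()         # copy so the original array stays untouched
clipped[clipped < 0] = 0         # assign to every entry selected by the mask

print(n_positive)
print(non_positive)
print(clipped)
```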


We can now build more complicated expressions, but the idea is the same.

In [113]:
myarray = np.array([0,1,2,3,4,5,6,7,8,9,10])
print(myarray[ (myarray%2==0) & (myarray>5) ])    # select all even numbers greater than 5

[ 6  8 10]
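Broadcasting, named in this section's heading, is the rule that allowed `myarray + 5` to work earlier: when shapes differ, numpy stretches axes of length 1 (or missing leading axes) to match, without copying data. A minimal sketch with made-up arrays `matrix`, `row`, and `column`:

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)      # shape (2, 3): [[0 1 2], [3 4 5]]
row = np.array([10, 20, 30])             # shape (3,)
column = np.array([[100], [200]])        # shape (2, 1)

plus_row = matrix + row          # row is added to each row of matrix
plus_column = matrix + column    # column is added to each column of matrix
outer = column + row             # shapes (2, 1) and (3,) combine to (2, 3)

print(plus_row)
print(plus_column)
print(outer)
```

Shapes that cannot be matched this way, such as `(2, 3)` and `(2,)`, raise a `ValueError` rather than guessing.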


## Some useful functions

I won't even get close to scratching the surface of all the math functions that numpy provides. numpy has a good [user's guide and tutorials](https://numpy.org/doc/stable/user/whatisnumpy.html), which you should check out after you're done with this notebook.

Initializing new arrays:

In [117]:
myarray = np.arange(0,10,1)                  # all integers from 0 to 9 (the stop value, 10, is excluded)
print("Integers from 0 to 9", myarray)
myarray = np.arange(0,10,0.5)                # all multiples of 0.5 from 0 to 9.5
print("Half integers from 0 to 9.5", myarray)
mymatrix = np.random.random((3,3))           # 3x3 matrix of random floats in [0, 1)
print("Random 3x3 matrix", mymatrix)
myarray = np.logspace(0,3,4)                 # 4 points spaced evenly on a log scale, from 10^0 to 10^3
print("Logarithmic space array", myarray)

Integers from 0 to 9 [0 1 2 3 4 5 6 7 8 9]
Half integers from 0 to 9.5 [0.  0.5 1.  1.5 2.  2.5 3.  3.5 4.  4.5 5.  5.5 6.  6.5 7.  7.5 8.  8.5
 9.  9.5]
Random 3x3 matrix [[0.10876912 0.86211295 0.1060116 ]
 [0.86764082 0.37156861 0.61320559]
 [0.87962668 0.34436341 0.49392533]]
Logarithmic space array [   1.   10.  100. 1000.]
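A few other constructors come up often enough to be worth a quick sketch. The seed value `42` below is an arbitrary choice; `np.random.default_rng` is numpy's newer random-number interface, used here instead of `np.random.random` so that the output can be reproduced.

```python
import numpy as np

zeros = np.zeros((2, 3))            # 2x3 array filled with 0.0
ones = np.ones(4, dtype=int)        # four integer ones
identity = np.eye(3)                # 3x3 identity matrix
grid = np.linspace(0, 1, 5)         # 5 evenly spaced points, both endpoints included

# A seeded generator gives the same "random" numbers on every run.
rng = np.random.default_rng(42)
sample = rng.random((3, 3))         # floats in [0, 1)

print(zeros)
print(ones)
print(identity)
print(grid)
print(sample)
```

Note the difference from `np.arange`: `np.linspace` takes the number of points rather than the step, and includes the stop value.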


Simple functions:

In [119]:
mymatrix = np.random.random((3,3))
print("Exponentiate", np.exp(mymatrix))             # exponentiate every entry
print("Logarithm", np.log(mymatrix))                # take log of every entry
print("Sum of entries", mymatrix.sum())             # sum all entries
print("Maximum", mymatrix.max())                    # get max of all entries
print("Minimum", mymatrix.min())                    # get min of all entries

Exponentiate [[1.31003387 2.68075106 1.63259118]
 [1.67069799 2.32571187 2.1829025 ]
 [2.12642699 1.01795265 1.48115539]]
Logarithm [[-1.30913706 -0.01400055 -0.71300621]
 [-0.66700878 -0.16957177 -0.24762144]
 [-0.28177542 -4.02892758 -0.93439755]]
Sum of entries 5.049300467104399
Maximum 0.986096999888426
Minimum 0.017793401664856878
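Reductions such as `sum`, `max`, and `min` act on the whole array by default, but they also accept an `axis` argument to reduce along one dimension only. A short sketch on a small fixed matrix (chosen so the results are easy to check by hand):

```python
import numpy as np

mymatrix = np.array([[1, 2, 3],
                     [4, 5, 6]])

total = mymatrix.sum()             # all entries: 21
col_sums = mymatrix.sum(axis=0)    # collapse the rows, one sum per column: [5 7 9]
row_sums = mymatrix.sum(axis=1)    # collapse the columns, one sum per row: [ 6 15]
row_max = mymatrix.max(axis=1)     # largest entry in each row: [3 6]
col_mean = mymatrix.mean(axis=0)   # mean of each column: [2.5 3.5 4.5]

print(total, col_sums, row_sums, row_max, col_mean)
```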


## Joining arrays

Simple concatenation

In [121]:
myarray1 = np.arange(0,5,1)
myarray2 = np.arange(5,10,1)
print("Original arrays:", myarray1, myarray2)
print("Concatenated:", np.concatenate((myarray1, myarray2), axis=0))

Original arrays: [0 1 2 3 4] [5 6 7 8 9]
Concatenated: [0 1 2 3 4 5 6 7 8 9]


Stacking two arrays on top of each other

In [122]:
myarray1 = np.arange(0,5,1)
myarray2 = np.arange(5,10,1)
print("Original arrays:", myarray1, myarray2)
print("Stacked:", np.vstack((myarray1, myarray2)))

Original arrays: [0 1 2 3 4] [5 6 7 8 9]
Stacked: [[0 1 2 3 4]
 [5 6 7 8 9]]


Here is another way to do it

In [123]:
myarray1 = np.arange(0,5,1)
myarray2 = np.arange(5,10,1)
print("Original arrays", myarray1, myarray2)

myarray1_new = np.expand_dims(myarray1, axis=0)
myarray2_new = np.expand_dims(myarray2, axis=0)
print("Expanded arrays", myarray1_new, myarray2_new)

mymatrix = np.concatenate((myarray1_new, myarray2_new), axis=0)
print("Resulting matrix", mymatrix)
print("Transposed", mymatrix.T)

Original arrays [0 1 2 3 4] [5 6 7 8 9]
Expanded arrays [[0 1 2 3 4]] [[5 6 7 8 9]]
Resulting matrix [[0 1 2 3 4]
 [5 6 7 8 9]]
Transposed [[0 5]
 [1 6]
 [2 7]
 [3 8]
 [4 9]]
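For completeness, `np.stack` combines the two steps above: it adds the new axis and joins the arrays in one call. A sketch of the idea (the names `as_rows` and `as_columns` are illustrative):

```python
import numpy as np

myarray1 = np.arange(0, 5, 1)
myarray2 = np.arange(5, 10, 1)

as_rows = np.stack((myarray1, myarray2), axis=0)      # shape (2, 5), same as np.vstack
as_columns = np.stack((myarray1, myarray2), axis=1)   # shape (5, 2), same as the transpose

print(as_rows)
print(as_columns)
```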


## Linear Algebra

The linear algebra package, `numpy.linalg`, is quite nice; here are a couple of examples.

Matrix multiplication

In [124]:
m1 = np.random.random((4,5))
m2 = np.random.random((5,4))
print(m1@m2)

[[1.05589368 0.76217535 0.85790371 1.04536037]
 [1.78051597 1.5987149  1.66783975 1.94923429]
 [1.63134215 1.21647504 1.33061557 1.42230855]
 [1.78877201 1.0293976  1.31419496 1.23351037]]
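It is easy to confuse `@` with `*`. The sketch below uses two small fixed matrices (chosen for easy checking by hand) to show that `@` is the matrix product while `*` stays element-wise, and that the inner dimensions must agree for `@`.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[0, 1],
              [1, 0]])

matrix_product = a @ b     # rows of a times columns of b: [[2 1], [4 3]]
elementwise = a * b        # entry-by-entry product:       [[0 2], [3 0]]

# Shapes (4, 5) @ (5, 4) give (4, 4); swapping the operands gives (5, 5).
m1 = np.ones((4, 5))
m2 = np.ones((5, 4))
print(matrix_product)
print(elementwise)
print((m1 @ m2).shape, (m2 @ m1).shape)
```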


Eigenvalues

In [125]:
m3 = np.random.random((4,4))
print(np.linalg.eigvals(m3))

[1.62456766+0.j         0.07189804+0.j         0.31273332+0.13919685j
 0.31273332-0.13919685j]
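The complex values above are expected: a general real matrix can have complex eigenvalues, which then come in conjugate pairs. As a sketch, the eigenpairs can be checked against the definition A v = λ v, and a symmetric matrix (built here as `m3 + m3.T`, an arbitrary choice for illustration) always gives real eigenvalues. The seed `0` is also arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
m3 = rng.random((4, 4))

# Eigenvalues and eigenvectors; column i of vectors pairs with values[i].
values, vectors = np.linalg.eig(m3)
residual = m3 @ vectors - vectors * values    # A v - lambda v for every pair
print(np.allclose(residual, 0))

# A symmetric matrix has real eigenvalues; eigvalsh exploits that symmetry.
symmetric = m3 + m3.T
real_values = np.linalg.eigvalsh(symmetric)   # returned in ascending order
print(real_values)
```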


Most standard linear algebra functions are already supported through this package: traces, determinants, singular value decomposition, etc. See [here](https://numpy.org/doc/stable/reference/routines.linalg.html) for the full list.
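A brief sketch of a few of these on a small fixed system (the matrix `a` and vector `b` are made up so the results can be checked by hand). Note that `np.trace` lives in the top-level `numpy` namespace.

```python
import numpy as np

a = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

trace = np.trace(a)              # sum of the diagonal: 2 + 3 = 5
det = np.linalg.det(a)           # 2*3 - 1*1 = 5, up to floating-point rounding
x = np.linalg.solve(a, b)        # solves a @ x = b
u, s, vt = np.linalg.svd(a)      # a equals u @ diag(s) @ vt

print(trace, det)
print(x)
print(np.allclose(u @ np.diag(s) @ vt, a))
```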

# Concluding Remark
numpy is an extremely powerful, well-supported, and ubiquitous package for numerical and scientific computing in the Python ecosystem. Its main purpose is the manipulation of multidimensional homogeneous arrays, for which it provides extensive functions and tools.
For more information, see the [documentation](https://numpy.org/doc/stable/index.html).

-- Luca Lavezzo, 1/5/2023