Many other libraries are built on NumPy. It is a numerical computing library with strong linear algebra support, and because its core is implemented in C it is also very fast. In this course we only work with NumPy arrays. They come as vectors (1-D arrays) or matrices (2-D arrays, which may still have only one row or one column). An array can only hold a single data type, such as a number or a character, which is what makes it so much more memory efficient than a list.

# Numpy Array Basics

Of course, you need to import NumPy to use it. This import will serve the rest of the notebook. NumPy arrays are similar to lists (or lists of lists), but they are far more efficient and have additional functionality. It follows that the most basic (though not the most efficient) way to create a NumPy array is to convert a list.

In [1]:
import numpy as np

my_list = [1,2,3]
print(my_list)
np.array(my_list)

[1, 2, 3]


array([1, 2, 3])

In [2]:
print(np.array(my_list))

[1 2 3]
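
As noted at the start, an array holds a single data type. The following is a small sketch of what that means when the list being converted mixes types; the variable names are only illustrative.

```python
import numpy as np

# Mixing ints and floats: every element is stored as a float.
mixed_numbers = np.array([1, 2, 3.5])
print(mixed_numbers.dtype)  # float64

# Mixing numbers and strings: every element is stored as a string.
mixed_text = np.array([1, "a"])
print(mixed_text.dtype.kind)  # U (a Unicode string type)
```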


In [3]:
my_matrix_list = [[1,2,3],[4,5,6],[7,8,9]]
print(my_matrix_list)
np.array(my_matrix_list)

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]


array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

Notice how multidimensional arrays have as many nested brackets as they have dimensions: the 3 by 3 matrix is 2-dimensional, so it ends with 2 brackets. You can see this especially clearly by looking at it printed:

In [4]:
print(np.array(my_matrix_list))

[[1 2 3]
 [4 5 6]
 [7 8 9]]
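
If counting brackets gets tedious, the number of dimensions can also be read off the array directly. This sketch uses the ndim and shape attributes; the variable names are made up for the example.

```python
import numpy as np

vector = np.array([1, 2, 3])
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# ndim is the number of dimensions (the number of nested brackets).
print(vector.ndim, vector.shape)  # 1 (3,)
print(matrix.ndim, matrix.shape)  # 2 (3, 3)
```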


### Built-in methods

Of course, there are built-in methods to create NumPy arrays and do other things.

The first one we will look at is arange(). Its syntax looks something like this:

np.arange(start, stop, step, dtype)

This returns a NumPy array. start defaults to 0 and is where the interval begins (inclusive). stop is required and is where the interval ends (exclusive, so stop itself is never part of the result). step is the spacing between values and defaults to 1. dtype is the data type of the elements; when it is not specified, it is inferred from the other arguments.

In [5]:
np.arange(11)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [6]:
np.arange(4,11)

array([ 4,  5,  6,  7,  8,  9, 10])

In [7]:
np.arange(4,11,2)

array([ 4,  6,  8, 10])

In [8]:
np.arange(4,11,2,"float")

array([ 4.,  6.,  8., 10.])
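
A short sketch of two details that are easy to miss with arange(): the stop value is left out, and the step may be negative. The variable names are only for illustration.

```python
import numpy as np

# stop (10) is excluded, so the last value is 8.
evens = np.arange(0, 10, 2)
print(evens)  # [0 2 4 6 8]

# A negative step counts downwards; stop (0) is still excluded.
countdown = np.arange(5, 0, -1)
print(countdown)  # [5 4 3 2 1]
```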

To generate an array of just zeros, use zeros().

You pass in either a single number (for a 1-D array) or a tuple of dimension sizes, the first entry being the outermost dimension and so on.

In [9]:
np.zeros(3)

array([0., 0., 0.])

In [10]:
np.zeros((3,3))

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

ones() works the same way as zeros(). Notice that with both, the default data type is float.

In [11]:
np.ones(3)

array([1., 1., 1.])

In [12]:
np.ones((3,3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
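
If floats are not what you want, both functions accept a dtype argument. A minimal sketch, with illustrative variable names:

```python
import numpy as np

# Ask for integers instead of the default floats.
int_zeros = np.zeros((2, 3), dtype=int)
int_ones = np.ones(4, dtype=int)

print(int_zeros.tolist())  # [[0, 0, 0], [0, 0, 0]]
print(int_ones)            # [1 1 1 1]
```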

linspace() differs from arange(). Both take a start and a stop, but with linspace() both ends are inclusive, and linspace() takes the number of points to generate instead of a step size.

In [13]:
np.linspace(0,1,3)
# Notice how the array contains a 0 AND a 1

array([0. , 0.5, 1. ])
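
To make the difference concrete, here is a sketch that builds roughly the same range both ways. The variable names are made up for the comparison.

```python
import numpy as np

# linspace: ask for 5 points; both 0 and 1 are included.
by_count = np.linspace(0, 1, 5)

# arange: ask for a step of 0.25; the stop value 1 is excluded.
by_step = np.arange(0, 1, 0.25)

print(by_count)  # [0.   0.25 0.5  0.75 1.  ]
print(by_step)   # [0.   0.25 0.5  0.75]
```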

You also have eye(), which creates an identity matrix. eye() takes an integer n and creates an n by n identity matrix.

In [14]:
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
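
What makes the identity matrix useful is that multiplying by it leaves a matrix unchanged. The sketch below checks this with the @ matrix-product operator, which is not covered elsewhere in these notes; the matrix m is an arbitrary example.

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])
identity = np.eye(2)

# Matrix product (not element-wise multiplication).
product = identity @ m

print(np.array_equal(product, m))  # True
```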

### The numpy random library

This module inside NumPy has many functions for generating random numbers. Unlike the functions above, rand() and randn() do not take a tuple to specify the dimensions; each dimension is passed as a separate argument. You will see what I mean soon.

rand() takes one or more numbers and returns an array of random numbers drawn from a uniform distribution over $[0, 1)$, with the given numbers as its dimensions.

In [15]:
np.random.rand(4)

array([0.65481018, 0.20441415, 0.68382525, 0.06512621])

In [16]:
np.random.rand(2,2)

array([[0.21912312, 0.00146718],
       [0.07441011, 0.09084023]])

randn() is similar to rand(), except that it draws from a standard normal distribution (mean 0, standard deviation 1), so values closer to zero are more likely.

In [17]:
np.random.randn(2,2)

array([[ 0.28788392, -0.04968404],
       [-0.3832798 , -0.87689244]])

randint(l, h, s, d) returns s random integers from the interval $[l, h)$ with data type d.

In [18]:
np.random.randint(1,3,5)

array([1, 2, 1, 2, 2])
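
Because the numbers are random, your outputs will differ from the ones shown here. If you need repeatable results, one option is to set a seed first. This is a sketch using np.random.seed(); the seed value 42 is arbitrary.

```python
import numpy as np

# The same seed gives the same sequence of "random" numbers.
np.random.seed(42)
first = np.random.rand(3)
np.random.seed(42)
second = np.random.rand(3)
print(np.array_equal(first, second))  # True

# randint(1, 3, ...) can only produce 1 or 2, because 3 is excluded.
draws = np.random.randint(1, 3, 1000)
print(set(draws.tolist()) <= {1, 2})  # True
```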

## Attributes and methods

In [19]:
arr = np.arange(1,26)
arr

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22, 23, 24, 25])

In [20]:
ranarr = np.random.randint(1,10,5)
ranarr

array([4, 7, 5, 1, 1])

A very useful method is reshape(), which returns the same data arranged in a different shape. Notice that reshape() only returns the reshaped array; it doesn't change the shape of the original array. It raises a ValueError if the new shape does not account for exactly all of the elements.

In [21]:
arr.reshape(5,5)

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25]])

In [22]:
arr

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22, 23, 24, 25])

In [23]:
try:
    arr.reshape(3,3)
except ValueError:
    print("what is stated above must be correct")

what is stated above must be correct
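
The rule behind the error is that the product of the new dimensions must equal the number of elements. As a sketch, you can also pass -1 for one dimension and let NumPy work it out from that rule:

```python
import numpy as np

arr = np.arange(1, 26)

# 25 elements fit a 5 x 5 shape because 5 * 5 == 25.
print(arr.size)  # 25

# -1 means "whatever size makes the total come out right".
grid = arr.reshape(5, -1)
print(grid.shape)  # (5, 5)
```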


You also have the max() method, which gives you the maximum value in an array. The min() method gives you the minimum value.

In [24]:
ranarr.max()

7

In [25]:
ranarr.min()

1

argmax() returns the index of the maximum, and argmin() returns the index of the minimum. If the value occurs more than once, both return the index of the first occurrence.

In [26]:
ranarr

array([4, 7, 5, 1, 1])

In [27]:
ranarr.argmax()

1

In [28]:
ranarr.argmin()

3
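
The sketch below reuses the same values as ranarr above (typed in by hand so the result is repeatable) to show the first-occurrence behaviour. np.flatnonzero() is one way to get every matching position instead; it is not used elsewhere in these notes.

```python
import numpy as np

values = np.array([4, 7, 5, 1, 1])

# The minimum 1 appears at indexes 3 and 4; argmin() reports the first.
idx_min = values.argmin()
print(idx_min)  # 3

# Indexing with the result gives back the minimum itself.
print(values[idx_min] == values.min())  # True

# All positions where the minimum occurs.
all_min_positions = np.flatnonzero(values == values.min())
print(all_min_positions)  # [3 4]
```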

shape gives the dimensions of the array. Note that it is an attribute, not a method, so it is used without parentheses.

In [29]:
arr.shape
# The (25,) indicates that it is one dimensional

(25,)

In [30]:
arr.reshape(5,5).shape

(5, 5)

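dtype is the data type of the array's elements. It is also an attribute. The default integer type depends on the platform, so you may see int64 here instead of int32.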

In [31]:
arr.dtype

dtype('int32')
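
The data type is not fixed forever. As a sketch, astype() returns a converted copy, and a dtype can also be requested when the array is created. The names below are only illustrative.

```python
import numpy as np

ints = np.arange(1, 6)

# astype() returns a new array with the requested type.
floats = ints.astype(float)
print(floats)  # [1. 2. 3. 4. 5.]

# A smaller integer type uses less memory per element.
small = np.arange(1, 6, dtype=np.int8)
print(small.dtype, small.itemsize)  # int8 1
```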

# Numpy Array Slicing and Indexing

You can use slicing and indexing on arrays just as you do on lists. Basic indexing and slicing are shown below.

In [32]:
arr = np.arange(11)
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [33]:
arr[5]

5

In [34]:
arr[1:8]

array([1, 2, 3, 4, 5, 6, 7])

In [35]:
arr[:4]

array([0, 1, 2, 3])

In [36]:
arr[3:]

array([ 3,  4,  5,  6,  7,  8,  9, 10])

NumPy arrays support broadcasting: a single value can be assigned to a whole slice at once. This means that you can do something like this:



In [37]:
arr[5:] = 10
arr

array([ 0,  1,  2,  3,  4, 10, 10, 10, 10, 10, 10])
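
A slice can also be assigned a sequence instead of a single value, as long as the lengths match. A small sketch of both forms side by side:

```python
import numpy as np

arr = np.arange(11)

# One value broadcast to three positions.
arr[:3] = 100

# Two values assigned to the last two positions, one each.
arr[-2:] = [7, 8]

print(arr.tolist())  # [100, 100, 100, 3, 4, 5, 6, 7, 8, 7, 8]
```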

Be careful when slicing arrays, though. A slice of an array is not a new copy; for efficiency, it is a view that refers to the data of the original array. Editing the slice therefore edits the original as well. Below, you can see how the 2 in arr changes to a 5 even though arr itself was never edited directly.

In [38]:
arr = np.arange(11)

In [39]:
connected_arr = arr[:8]
print(arr)
connected_arr[2] = 5
print(arr)

[ 0  1  2  3  4  5  6  7  8  9 10]
[ 0  1  5  3  4  5  6  7  8  9 10]


To avoid this, use copy(), which gives you an independent array.

In [40]:
connected_arr = arr[:8].copy()
connected_arr[:] = 9
arr
# Values haven't changed!

array([ 0,  1,  5,  3,  4,  5,  6,  7,  8,  9, 10])
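
If you are ever unsure whether two arrays are connected, np.shares_memory() can tell you. The sketch below compares a plain slice with a copied slice; the variable names are made up for the example.

```python
import numpy as np

arr = np.arange(11)
view = arr[:8]                # a view onto arr's data
independent = arr[:8].copy()  # a separate copy

print(np.shares_memory(arr, view))         # True
print(np.shares_memory(arr, independent))  # False

# Writing through the view changes arr, but not the copy.
view[0] = 99
print(arr[0], independent[0])  # 99 0
```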

You can also index and slice a 2-D array. There are 2 ways to index a single element: arr[row][col] or arr[row, col].

In [41]:
arr = np.arange(5,46, 5).reshape(3,3)
arr

array([[ 5, 10, 15],
       [20, 25, 30],
       [35, 40, 45]])

In [42]:
arr[1]

array([20, 25, 30])

In [43]:
arr[1, 0] + arr[1][0]

40

In [44]:
arr[1:,:2]

array([[20, 25],
       [35, 40]])

In [45]:
arr[:2,1:]

array([[10, 15],
       [25, 30]])
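
The comma form also makes it easy to pull out a whole row or column: a bare colon means "everything along this dimension". A sketch using the same 3 by 3 array:

```python
import numpy as np

arr = np.arange(5, 46, 5).reshape(3, 3)

# Every row, column 1.
middle_column = arr[:, 1]
print(middle_column)  # [10 25 40]

# Last row, every column.
last_row = arr[-1, :]
print(last_row)  # [35 40 45]
```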

There is also something called conditional selection, where you pick elements using a condition instead of positions.

In [46]:
arr = np.arange(1,11)
arr

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

Below, comparing the array with a value returns an array of booleans showing which elements satisfy the condition. If you pass this boolean array in place of an index, you get all the values that satisfy the condition.

In [47]:
bool_arr = arr > 5
print(bool_arr)
print(arr[bool_arr])


[False False False False False  True  True  True  True  True]
[ 6  7  8  9 10]


Usually though, you would skip the extra step and just do it in one line

In [48]:
arr[arr>5]

array([ 6,  7,  8,  9, 10])
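
Conditions can be combined as well. In this sketch, & means "and" and | means "or"; each condition needs its own parentheses because these operators bind more tightly than the comparisons.

```python
import numpy as np

arr = np.arange(1, 11)

# Greater than 3 AND less than 8.
middle = arr[(arr > 3) & (arr < 8)]
print(middle)  # [4 5 6 7]

# Less than 3 OR greater than 8.
edges = arr[(arr < 3) | (arr > 8)]
print(edges)  # [ 1  2  9 10]
```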

There is also something called fancy indexing, which lets you retrieve items in any order. You pass in a list of the indexes you want.

In [49]:
# I will start off by creating a special array 

In [50]:
arr = np.zeros((10, 10))

n_rows = arr.shape[0]  # number of rows
for i in range(n_rows):
    arr[i] = i  # broadcast the row index across the whole row
    
arr

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [5., 5., 5., 5., 5., 5., 5., 5., 5., 5.],
       [6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [7., 7., 7., 7., 7., 7., 7., 7., 7., 7.],
       [8., 8., 8., 8., 8., 8., 8., 8., 8., 8.],
       [9., 9., 9., 9., 9., 9., 9., 9., 9., 9.]])

In [51]:
arr[[2,3,4]]

array([[2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.]])

In [52]:
arr[[8,6,3]]

array([[8., 8., 8., 8., 8., 8., 8., 8., 8., 8.],
       [6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.]])

In [53]:
arr[1][[2,4]]

array([1., 1.])
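
Fancy indexing is not limited to rows. The sketch below uses a small 3 by 4 array (chosen so every element is distinct) to select columns, and to pick individual elements by pairing a list of rows with a list of columns.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

# All rows, columns 3 and 0, in that order.
cols = arr[:, [3, 0]]
print(cols.tolist())  # [[3, 0], [7, 4], [11, 8]]

# Two lists are paired up: elements (0, 1) and (2, 3).
pairs = arr[[0, 2], [1, 3]]
print(pairs.tolist())  # [1, 11]
```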

# Numpy Operations

When you add arrays together, the operation is applied element by element: each element is added to the element at the same position in the other array. The same goes for the other arithmetic operations.

In [54]:
arr = np.arange(0,11)


In [55]:
arr+arr

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20])

In [56]:
arr-arr

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [57]:
arr*arr

array([  0,   1,   4,   9,  16,  25,  36,  49,  64,  81, 100])

You can also add a scalar (just a number) to your array. Numpy will just broadcast the scalar to all the values of the array. The same goes for the other operations. 

In [58]:
arr + 100

array([100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110])

In [59]:
arr * 100

array([   0,  100,  200,  300,  400,  500,  600,  700,  800,  900, 1000])

When you do something invalid like dividing zero by zero, NumPy gives a warning instead of raising an error and puts nan (not a number) in that position. Interestingly, if you divide a nonzero number by zero, you get infinity (inf) instead.

In [60]:
arr/arr


array([nan,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])

In [61]:
1 / arr 


array([       inf, 1.        , 0.5       , 0.33333333, 0.25      ,
       0.2       , 0.16666667, 0.14285714, 0.125     , 0.11111111,
       0.1       ])
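
If the warnings are expected, one way to silence them for a few lines is the np.errstate() context manager, and np.isnan() and np.isinf() can then locate the special values. This is a sketch with a shorter array than the one above.

```python
import numpy as np

arr = np.arange(0, 4)

# Temporarily ignore divide-by-zero and invalid-operation warnings.
with np.errstate(divide="ignore", invalid="ignore"):
    ratio = arr / arr    # 0 / 0 gives nan
    inverse = 1 / arr    # 1 / 0 gives inf

print(np.isnan(ratio))    # [ True False False False]
print(np.isinf(inverse))  # [ True False False False]
```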

NumPy also has built-in functions for common operations. One is sqrt(), which computes the square root of every element (the same as raising each value to the power 1/2).

In [62]:
np.sqrt(arr)

array([0.        , 1.        , 1.41421356, 1.73205081, 2.        ,
       2.23606798, 2.44948974, 2.64575131, 2.82842712, 3.        ,
       3.16227766])

Another one is exp(), which calculates the exponential of each number. This means that it raises e to the power of that number ($e^x$).

In [63]:
np.exp(arr)

array([1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01,
       5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03,
       2.98095799e+03, 8.10308393e+03, 2.20264658e+04])

np.min() and np.max() give the minimum and maximum. For arrays, they are the same as the array's built-in methods.

In [64]:
np.max(arr)

10

In [65]:
np.min(arr)

0

You also have trigonometric functions and logarithms

In [66]:
np.sin(arr)

array([ 0.        ,  0.84147098,  0.90929743,  0.14112001, -0.7568025 ,
       -0.95892427, -0.2794155 ,  0.6569866 ,  0.98935825,  0.41211849,
       -0.54402111])

In [67]:
np.log(arr+1)

array([0.        , 0.69314718, 1.09861229, 1.38629436, 1.60943791,
       1.79175947, 1.94591015, 2.07944154, 2.19722458, 2.30258509,
       2.39789527])
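
All of these functions work element by element, so each one is equivalent to looping over the array with the matching function from Python's math module. The sketch below checks that for sqrt(), and also that log() undoes exp(); np.allclose() compares floats with a small tolerance.

```python
import math

import numpy as np

arr = np.arange(0, 5)

# One call on the whole array...
vectorized = np.sqrt(arr)
# ...versus an explicit Python loop over the elements.
looped = np.array([math.sqrt(x) for x in arr])
print(np.allclose(vectorized, looped))  # True

# log is the inverse of exp, element by element.
round_trip = np.log(np.exp(arr))
print(np.allclose(round_trip, arr))  # True
```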

You can always visit this link: http://docs.scipy.org/doc/numpy/reference/ufuncs.html to find out what the rest of the possible functions are.