# NumPy

NumPy is a multidimensional array library. We use it instead of lists because lists are slow for numerical work. NumPy uses 'fixed types', which means it doesn't have to store as much information per value as a list does. The example given shows that a list takes many more bytes of memory than the equivalent NumPy array. When iterating through the elements, NumPy also doesn't have to type check each one. NumPy also uses contiguous memory, meaning that all of the values are stored right next to each other, whereas the items of a list can be scattered across different places in the computer's memory. The benefit of this layout is that the CPU's SIMD vector processing units can be used to perform computations on many values at once, and the memory cache is used more effectively.
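
To make the memory claim concrete, here is a rough sketch that compares a list of 1000 integers with an `int64` array of the same values. The variable names are just for illustration, and the list figure is an estimate that counts the list container plus each int object it points to.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)

# A list stores pointers to separate int objects, so count both the
# list container and every int object it references.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An array stores the raw values back to back: 8 bytes each for int64.
array_bytes = np_array.nbytes

print(list_bytes, array_bytes)
```

The exact list figure depends on the Python version, but it should come out several times larger than the array's 8000 bytes.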

But how are they different? 

Lists 
- Insertion, deletion, appending, concatenation, etc 

NumPy 
- Insertion, deletion, appending, concatenation, etc + lots more 


Example 

**Lists** 

a = [1,3,5]
b = [1,2,3]

a*b = Error 

**NumPy** 

a = np.array([1,3,5])
b = np.array([1,2,3])

a*b = np.array([1,6,15])
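
As a runnable sketch of the comparison above: multiplying two lists raises a `TypeError`, so with lists we have to loop over the pairs ourselves, while NumPy multiplies element by element. The names `a_list`, `b_list` and `list_error` are only for illustration.

```python
import numpy as np

a_list = [1, 3, 5]
b_list = [1, 2, 3]

# Multiplying two lists is not defined, so Python raises a TypeError.
try:
    a_list * b_list
    list_error = None
except TypeError as exc:
    list_error = exc

# With plain lists we pair the items up and multiply them one by one.
list_product = [x * y for x, y in zip(a_list, b_list)]

# NumPy multiplies element by element in a single expression.
array_product = np.array(a_list) * np.array(b_list)

print(list_error)
print(list_product)
print(array_product)
```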


Applications of NumPy

- Mathematics 
- Plotting 
- Backend 
- Machine Learning


In [2]:
import numpy as np

### The Basics 

In [17]:
a = np.array([1,2,3])
a

array([1, 2, 3])

In [7]:
b = np.array([[9.0, 8.0, 7.0],[6.0, 5.0, 4.0]])
b

array([[9., 8., 7.],
       [6., 5., 4.]])

In [10]:
# How do we get the number of dimensions of an array

a.ndim
# b.ndim

1

In [13]:
# How do we get the shape of an array 
# a.shape
b.shape

(2, 3)

In [19]:
# How to get the type
a.dtype

dtype('int64')

Here the type is int64, but imagine we know we aren't going to be using large numbers in this computation. We can set a smaller type, which takes up less memory than a larger integer type.

In [20]:
a = np.array([1,2,3], dtype='int16')
a

array([1, 2, 3], dtype=int16)

In [22]:
# Getting the type 
a.dtype

dtype('int16')

As we can see, the type has been changed from int64 to int16.

In [23]:
# How to get the size of each element in bytes
a.itemsize

2
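
To tie the dtype choice back to memory, this small sketch compares `size` (number of elements), `itemsize` (bytes per element) and `nbytes` (total bytes of data) for the same values stored as `int64` and `int16`. The names `big` and `small` are made up for the example.

```python
import numpy as np

big = np.array([1, 2, 3], dtype='int64')
small = np.array([1, 2, 3], dtype='int16')

# nbytes is simply size * itemsize.
print(big.size, big.itemsize, big.nbytes)
print(small.size, small.itemsize, small.nbytes)

# The trade-off is range: int16 can only hold values up to this maximum.
int16_max = np.iinfo(np.int16).max
print(int16_max)
```

So the smaller type uses a quarter of the memory here, but it is only safe when every value fits in its range.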

### Accessing/Changing Specific Elements, Rows and Columns

#### Accessing 

In [24]:
a = np.array([[1,2,3,4,5,6,7],[8,9,10,11,12,13,14]])
print(a)

[[ 1  2  3  4  5  6  7]
 [ 8  9 10 11 12 13 14]]


In [27]:
a.shape

(2, 7)

In [29]:
# Get a specific element [row, column]
a[1, 5]

13

In [31]:
# Get a specific row 
a[0, :]

array([1, 2, 3, 4, 5, 6, 7])

In [32]:
# Get a specific column
a[:, 2]

array([ 3, 10])

In [39]:
# Getting more fancy [starting_index:end_index:step_size]

a[0, 1:6:2]

array([2, 4, 6])
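
A few more slicing variations, as a sketch on a fresh copy of the same array. The points to notice are that the end index is exclusive and that negative indices count from the end.

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13, 14]])

# The end index is exclusive: 1:6:2 visits columns 1, 3 and 5.
first = a[0, 1:6:2]

# Negative indices count from the end, so 1:-1 stops before the last column.
second = a[0, 1:-1:2]

# Leaving out start and end keeps the whole axis; here every other column.
every_other = a[:, ::2]

print(first)
print(second)
print(every_other)
```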

#### Changing

In [36]:
a[1, 5] = 20

In [37]:
print(a)

[[ 1  2  3  4  5  6  7]
 [ 8  9 10 11 12 20 14]]


The element that was originally 13 is now 20.

In [93]:
# Changing all of the elements in a column 
a[:, 2] = 5
print(a)

[[ 1  2  5  4  5  6  7]
 [ 8  9  5 11 12 20 14]]


In [96]:
# Changing a certain part of the array
a[[0,1], 2] = 5
print(a)

[[ 1  2  5  4  5  6  7]
 [ 8  9  5 11 12 20 14]]


In [44]:
# Changing all of the elements in a column to specific values
a[:, 2]= [1, 2]
print(a)

[[ 1  2  1  4  5  6  7]
 [ 8  9  2 11 12 20 14]]


In [46]:
# Working with a 3d example
b = np.array([[[1,2],[3,4]],[[5,6],[7,8]]])
print(b)

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]


In [48]:
# Getting a specific element from these arrays (the tutorial says to work outside in)
# This means we start with which set we want, then which row, then which index, so if we want the 4
b[0,1,1]

4

In [51]:
b[0,1,:]

array([3, 4])

In [54]:
# Getting number 7 from b
b[1,1,0]

7

In [56]:
# Getting number 6 from b
b[1,0,1]

6
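
One way to keep the "outside in" order straight is to look at the shape: each index position lines up with one entry of `b.shape`. The sketch below rebuilds the same 3D array and also uses `:` to take everything along an axis.

```python
import numpy as np

b = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# shape is (sets, rows, columns), the same order as the indices.
print(b.shape)

# Second row of every set.
second_rows = b[:, 1, :]
print(second_rows)

# First column of every row in the second set.
first_cols = b[1, :, 0]
print(first_cols)
```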

In [58]:
# Changing elements in a 3D array
# It is the same as a 2D array, just a little more complex
# Let's change the 4 into a 7

b[0,1,1] = 7
print(b)

[[[1 2]
  [3 7]]

 [[5 6]
  [7 8]]]


### Initializing Different Types of Arrays

In [62]:
# All 0s matrix 
np.zeros((2,3,3))

array([[[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]]])

In [63]:
# All 1s matrix 
np.ones((4,2,2))

array([[[1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.]]])

In [65]:
# Any other number

np.full((2,2), 99)

array([[99, 99],
       [99, 99]])

In [67]:
# Any other number (full_like), allows us to take a shape that's already built and make a new array that's
# the same size but with different numbers

np.full_like(a,4)

array([[4, 4, 4, 4, 4, 4, 4],
       [4, 4, 4, 4, 4, 4, 4]])

In [70]:
# Matrix of random numbers
np.random.rand(4,2)

array([[0.55251012, 0.64112547],
       [0.18518064, 0.26041403],
       [0.43751528, 0.62582207],
       [0.77711172, 0.7659309 ]])

In [75]:
# Using the same shape as another array but filling it with random values

np.random.random_sample(a.shape)

array([[0.94523031, 0.28443347, 0.82072229, 0.93969273, 0.24789853,
        0.80261356, 0.82571777],
       [0.6998935 , 0.52937231, 0.62045047, 0.96163054, 0.7816738 ,
        0.21738973, 0.43708128]])

In [77]:
# Random integer values
np.random.randint(4, size=(3,3))

array([[3, 1, 0],
       [1, 2, 3],
       [1, 2, 2]])

In [79]:
np.random.randint(1,4, size=(4,4))

array([[1, 3, 1, 1],
       [1, 2, 2, 3],
       [2, 3, 1, 3],
       [2, 3, 3, 2]])
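
Note that the upper bound in `randint` is exclusive, so `randint(1, 4, ...)` only produces 1, 2 and 3. The sketch below shows the same idea using NumPy's `Generator` interface (`np.random.default_rng`), which is not what the cells above use; the seed is an arbitrary choice to make the run repeatable.

```python
import numpy as np

# Seeded generator so the same numbers come out on every run.
rng = np.random.default_rng(seed=0)

# Uniform floats in [0, 1), similar to np.random.rand(4, 2).
floats = rng.random((4, 2))

# Integers from 1 up to, but not including, 4.
ints = rng.integers(1, 4, size=(4, 4))

print(floats)
print(ints)
```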

In [80]:
# Getting an identity matrix
np.identity(5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

In [83]:
# Repeating an array 
arr = np.array([[1,2,3]])
repeat = np.repeat(arr, 3, axis=0)
print(repeat)

[[1 2 3]
 [1 2 3]
 [1 2 3]]


### Task

Create a 5x5 matrix where all of the outside numbers are 1s, the middle numbers are 0s and the final center number is 9.

In [105]:
# Making the first array
first_array = np.ones((5, 5))

# adding in the 0s
first_array[[1], [1,2,3]] = 0
first_array[[2], [1,2,3]] = 0
first_array[[3], [1,2,3]] = 0

# adding the 9 
first_array[2, 2] = 9
print(first_array)

[[1. 1. 1. 1. 1.]
 [1. 0. 0. 0. 1.]
 [1. 0. 9. 0. 1.]
 [1. 0. 0. 0. 1.]
 [1. 1. 1. 1. 1.]]


In [107]:
# The tutorial did it this way 

output = np.ones((5,5))

z = np.zeros((3,3))
z[1,1] = 9 

output[1:4, 1:4] = z
print(output)

[[1. 1. 1. 1. 1.]
 [1. 0. 0. 0. 1.]
 [1. 0. 9. 0. 1.]
 [1. 0. 0. 0. 1.]
 [1. 1. 1. 1. 1.]]


### Be Careful When Copying Arrays

When copying arrays we need to be careful: assigning an array to a new variable doesn't make a new array, it just creates a second name that points to the same array. See below.

In [110]:
a = np.array([1,2,3])
b = a 
b[0] = 100 

print(a)
print(b)

[100   2   3]
[100   2   3]


As you can see, even though we only changed b's first element to 100, the first element in a is now 100 as well. When we "made" a new copy we didn't actually make one. To get a real copy we need to use the copy() method.


In [109]:
a = np.array([1,2,3])
b = a.copy() 
b[0] = 100 

print(a)
print(b)

[1 2 3]
[100   2   3]
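
The same trap applies to slices, which are views onto the original data rather than copies. The sketch below uses `np.shares_memory` to check which variables point at the same data; the names `alias`, `view` and `copy` are just for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])

alias = a          # same object, no new array is created
view = a[:2]       # slicing gives a view onto the same data
copy = a.copy()    # independent data

print(alias is a)
print(np.shares_memory(a, view))
print(np.shares_memory(a, copy))

view[0] = 100      # also changes a (and alias)
copy[1] = -1       # leaves a untouched
print(a)
```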


### Mathematics

In [123]:
a = np.array([1,2,3,4])

print(a)

[1 2 3 4]


In [114]:
a + 2

array([3, 4, 5, 6])

In [115]:
a - 2

array([-1,  0,  1,  2])

In [118]:
a * 2

array([2, 4, 6, 8])

In [120]:
b = np.array([0,1,0,1])

In [125]:
a + b

array([1, 3, 3, 5])

In [127]:
np.sin(a)

array([ 0.84147098,  0.90929743,  0.14112001, -0.7568025 ])

In [129]:
np.cos(a)

array([ 0.54030231, -0.41614684, -0.9899925 , -0.65364362])

#### Linear Algebra 

In [134]:
a = np.ones((2,3))
b = np.full((3,2), 2)
print(a)
print(b)


np.matmul(a,b)

[[1. 1. 1.]
 [1. 1. 1.]]
[[2 2]
 [2 2]
 [2 2]]


array([[6., 6.],
       [6., 6.]])
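
Matrix multiplication only works when the number of columns in the first matrix matches the number of rows in the second. A short sketch of that rule, using the `@` operator as a shorthand for `np.matmul`:

```python
import numpy as np

a = np.ones((2, 3))
b = np.full((3, 2), 2)

# (2, 3) @ (3, 2) gives a (2, 2) result.
product = a @ b
print(product.shape)

# (2, 3) times (2, 3) has mismatched inner dimensions and raises an error.
try:
    np.matmul(a, a)
    shape_error = None
except ValueError as exc:
    shape_error = exc
print(shape_error)
```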

In [137]:
# Find the determinant
c = np.identity(3)
np.linalg.det(c)

1.0
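
For a matrix other than the identity, the determinant is computed in floating point, so the result may be very slightly off from the exact value. A small sketch with a made-up 2x2 matrix `m`, compared using `np.isclose` rather than `==`:

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])

# By hand: 1*4 - 2*3 = -2
det = np.linalg.det(m)
print(det)

# Compare with a tolerance instead of exact equality.
matches = np.isclose(det, -2.0)
print(matches)
```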

#### Statistics

In [146]:
stats = np.array([[1,2,3],[4,5,6]])
print(stats)

print(np.min(stats))
print(np.max(stats))
print(np.sum(stats))

# we can also do it on an axis basis
print(np.min(stats, axis=1))
print(np.max(stats, axis=1))
print(np.sum(stats, axis=1))

[[1 2 3]
 [4 5 6]]
1
6
21
[1 4]
[3 6]
[ 6 15]
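
The `axis` argument is easy to mix up, so here is a side by side sketch on the same array: `axis=0` collapses the rows and gives one result per column, while `axis=1` collapses the columns and gives one result per row.

```python
import numpy as np

stats = np.array([[1, 2, 3], [4, 5, 6]])

col_sums = np.sum(stats, axis=0)   # one value per column
row_sums = np.sum(stats, axis=1)   # one value per row
overall_mean = np.mean(stats)      # no axis: uses every element

print(col_sums)
print(row_sums)
print(overall_mean)
```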


### Reorganizing Arrays

In [154]:
before = np.array([[1,2,3,4],[5,6,7,8]])
print(before)

# let's make it into an 8 by 1
after = before.reshape((8,1))
print(after)

# let's make it a 2,2,2
after_after = before.reshape((2,2,2))
print(after_after)

# we cannot make it into a 2x3, for example, as the 8 elements won't fit into 6 slots

[[1 2 3 4]
 [5 6 7 8]]
[[1]
 [2]
 [3]
 [4]
 [5]
 [6]
 [7]
 [8]]
[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
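
A reshape only works when the new shape holds exactly the same number of elements. The sketch below shows the error for a 2x3 target, and also uses `-1` to let NumPy work out one dimension for us.

```python
import numpy as np

before = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

# -1 means "work this dimension out": 8 elements / 4 rows = 2 columns.
four_rows = before.reshape((4, -1))
print(four_rows.shape)

# 8 elements cannot fill a 2x3 shape, so this raises a ValueError.
try:
    before.reshape((2, 3))
    reshape_error = None
except ValueError as exc:
    reshape_error = exc
print(reshape_error)
```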


In [157]:
# Vertically stacking vectors
v1 = np.array([1,2,3,4])
v2 = np.array([5,6,7,8])

np.vstack([v1, v2])
# we can also keep adding them
np.vstack([v1,v1,v1,v2])




array([[1, 2, 3, 4],
       [1, 2, 3, 4],
       [1, 2, 3, 4],
       [5, 6, 7, 8]])

In [160]:
# Horizontal stacking vectors

v1 = np.array([1,2,3,4])
v2 = np.array([5,6,7,8])

np.hstack([v1,v2])

array([1, 2, 3, 4, 5, 6, 7, 8])
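
Stacking also works on 2D arrays, as long as the shapes line up: `hstack` needs the same number of rows and `vstack` needs the same number of columns. A small sketch with made-up arrays `h1` and `h2`:

```python
import numpy as np

h1 = np.ones((2, 4))
h2 = np.zeros((2, 2))

# Both have 2 rows, so they can sit side by side: result is 2 x 6.
side_by_side = np.hstack((h1, h2))
print(side_by_side)

# Two 1D vectors of length 4 stack into a 2 x 4 matrix.
v1 = np.array([1, 2, 3, 4])
v2 = np.array([5, 6, 7, 8])
stacked = np.vstack([v1, v2])
print(stacked.shape)
```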

### Loading In A File

Let's say we have a file with a load of data that we want to read into a NumPy array.

In [177]:
filedata = np.genfromtxt('data.txt', delimiter=',')

filedata
# note that it is returned as floats

array([[ 1., 13., 21., 11., 15.,  3.,  2., 34., 52., 34., 23., 21., 34.,
        45., 64., 32.],
       [ 3.,  4., 56., 32., 35.,  7., 43., 46., 67., 43., 25., 67., 43.,
        22., 33., 12.],
       [22., 44., 32., 34.,  1.,  4.,  6.,  8., 33., 45., 55., 35., 67.,
        86., 43., 22.]])

In [176]:
filedata.astype('int32') 
# this returns a copy converted to integers; filedata itself is unchanged

array([[ 1, 13, 21, 11, 15,  3,  2, 34, 52, 34, 23, 21, 34, 45, 64, 32],
       [ 3,  4, 56, 32, 35,  7, 43, 46, 67, 43, 25, 67, 43, 22, 33, 12],
       [22, 44, 32, 34,  1,  4,  6,  8, 33, 45, 55, 35, 67, 86, 43, 22]],
      dtype=int32)
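
Since `data.txt` isn't included here, the sketch below stands in for it with a small in-memory string (the values are just the first three columns of the output above). `np.genfromtxt` accepts a file-like object as well as a filename, so `StringIO` lets us try it without a file on disk.

```python
from io import StringIO
import numpy as np

# Stand-in for the contents of a comma separated text file.
text = "1,13,21\n3,4,56\n22,44,32"

data = np.genfromtxt(StringIO(text), delimiter=',')
print(data.dtype)

# astype returns a new array; data itself stays as floats.
as_int = data.astype('int32')
print(as_int.dtype)
print(data.dtype)
```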

### Advanced Indexing (Boolean Masking and Indexing with Lists)

In [179]:
# where in filedata is the value greater than 50

filedata > 50 # (we can use any condition we want here really)


array([[False, False, False, False, False, False, False, False,  True,
        False, False, False, False, False,  True, False],
       [False, False,  True, False, False, False, False, False,  True,
        False, False,  True, False, False, False, False],
       [False, False, False, False, False, False, False, False, False,
        False,  True, False,  True,  True, False, False]])

In [182]:
filedata[filedata > 50]
# makes an array of all of the values that are above 50

array([52., 64., 56., 67., 67., 55., 67., 86.])

In [184]:
# We can also index with a list in NumPy

a = np.array([1,2,3,4,5,6,7,8,9])
a[[1,2,8]]
# this will index those spots in the array

array([2, 3, 9])

In [186]:
# We want to figure out if any value in the columns is greater than 50 

np.any(filedata > 50, axis=0)
# This is true if any of the values in the column is above 50
# We can also use np.all to find if all of the values in the column are above 50

array([False, False,  True, False, False, False, False, False,  True,
       False,  True,  True,  True,  True,  True, False])
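
To see `np.any` and `np.all` next to each other, here is a sketch on a small made-up array where the answers are easy to check by eye. With `axis=0` each result describes a column.

```python
import numpy as np

data = np.array([[60, 10, 70],
                 [80, 20, 30]])

# True where at least one value in the column is above 50.
any_above = np.any(data > 50, axis=0)

# True only where every value in the column is above 50.
all_above = np.all(data > 50, axis=0)

print(any_above)
print(all_above)
```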

In [187]:
# We can also check if values are more than one number but less than another

((filedata > 50) & (filedata < 100))

array([[False, False, False, False, False, False, False, False,  True,
        False, False, False, False, False,  True, False],
       [False, False,  True, False, False, False, False, False,  True,
        False, False,  True, False, False, False, False],
       [False, False, False, False, False, False, False, False, False,
        False,  True, False,  True,  True, False, False]])
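
A boolean mask can be used for more than looking things up. The sketch below, on a small made-up array, combines two conditions with `&` (each side needs its own parentheses), flips the mask with `~`, and assigns through a mask to change only the matching values.

```python
import numpy as np

data = np.array([[1, 52, 64],
                 [56, 7, 120]])

# Values strictly between 50 and 100.
in_range = (data > 50) & (data < 100)
inside = data[in_range]

# ~ flips the mask: everything that is NOT in that range.
outside = data[~in_range]

# Assigning through a mask changes only the selected values.
capped = data.copy()
capped[capped > 50] = 50

print(inside)
print(outside)
print(capped)
```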