
# Getting Started with NumPy

## Introduction

NumPy is one of the main libraries for scientific computing in Python. It provides a high-performance multi-dimensional array object, along with tools for working with these arrays.

A NumPy array can store a grid of values. All the values must be of the same type. NumPy arrays are n-dimensional, and the number of dimensions is denoted by the *rank* of the NumPy array. The shape of an array is a tuple of integers which holds the size of the array along each of the dimensions.
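
As a minimal sketch of these terms (the variable name `grid` is only illustrative), the snippet below builds a small 2-D array and inspects its rank, shape, and element type:

```python
import numpy as np

# A 2-D array (rank 2) with 2 rows and 3 columns
grid = np.array([[1, 2, 3], [4, 5, 6]])

rank = grid.ndim      # number of dimensions
shape = grid.shape    # size along each dimension

print(rank)           # 2
print(shape)          # (2, 3)
print(grid.dtype)     # one integer type shared by all elements (platform dependent)
```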

For more information on NumPy, refer to http://www.numpy.org/.


## Objectives

You will be able to:
  
- Use broadcasting to perform a math operation on an entire NumPy array
- Perform vector and matrix operations with NumPy
- Access the shape of a NumPy array
- Use indexing with NumPy arrays




## NumPy array creation and basic operations

First, remember that it is customary to import NumPy as `np`.

In [None]:
import numpy as np

One easy way to create a NumPy array is from a Python list. The two are similar in several respects, but NumPy is optimized for mathematical operations and includes many built-in functions that will be extraordinarily useful.

In [None]:
x = np.array([1, 2, 3])
print(type(x))

<class 'numpy.ndarray'>


In [5]:
import numpy as np
y=np.array([1,2,3])
print(type(y))
print(y*3)
print(y+2)

<class 'numpy.ndarray'>
[3 6 9]
[3 4 5]


In [4]:
[1,2,3] *3


[1, 2, 3, 1, 2, 3, 1, 2, 3]

In [6]:
[1,2,3] + 2

TypeError: can only concatenate list (not "int") to list

## Broadcasting Mathematical Operations

Notice right off the bat how basic mathematical operations will be applied elementwise in a NumPy array versus a literal interpretation with a Python list:

In [None]:
# Multiplies each element by 3
x * 3

array([3, 6, 9])

In [None]:
# Returns the list 3 times
[1, 2, 3] * 3

[1, 2, 3, 1, 2, 3, 1, 2, 3]

In [None]:
# Adds two to each element
x + 2

array([3, 4, 5])

In [None]:
# Returns an error; different data types
[1, 2, 3] + 2

TypeError: can only concatenate list (not "int") to list

In [8]:
y

array([1, 2, 3])

In [7]:
np.add(y,1)

array([2, 3, 4])

In [9]:
np.power(y,3)

array([ 1,  8, 27])
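
The examples above combine an array with a single number. Broadcasting also applies between arrays of different but compatible shapes. The following is a small sketch of that behavior; the names `matrix`, `row`, and `column` are illustrative and not part of the lesson:

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])                # shape (3,)

# The 1-D row is stretched across both rows of the matrix
shifted = matrix + row
print(shifted)
# [[11 22 33]
#  [14 25 36]]

# A column of shape (2, 1) is stretched across the three columns instead
column = np.array([[100], [200]])
offset = matrix + column
print(offset)
# [[101 102 103]
#  [204 205 206]]
```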

## Even more math!

### Scalar Math

| Function | Description |
|---|---|
|`np.add(arr,1)` | Add 1 to each array element  |
|`np.subtract(arr,2)`  | Subtract 2 from each array element  |
|`np.multiply(arr,3)`  | Multiply each array element by 3 |
|`np.divide(arr,4)`    | Divide each array element by 4 (dividing by zero gives `inf`, or `nan` for 0/0, along with a warning) |
|`np.power(arr,5)`     | Raise each array element to the 5th power |
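
To see what elementwise division does when the divisor is zero, here is a short sketch with floating-point input. It uses `np.errstate` only to silence the warning NumPy would otherwise print; the variable names are illustrative:

```python
import numpy as np

arr = np.array([1.0, 0.0, -1.0])

# Suppress the RuntimeWarning NumPy would otherwise emit
with np.errstate(divide="ignore", invalid="ignore"):
    result = np.divide(arr, 0)

# Positive / 0 gives inf, 0 / 0 gives nan, negative / 0 gives -inf
print(result)
```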




### Vector Math

| Function | Description |
|---|---|
|`np.add(arr1,arr2)` | Elementwise add arr2 to arr1  |
|`np.subtract(arr1,arr2)`  | Elementwise subtract arr2 from arr1  |
|`np.multiply(arr1,arr2)`  | Elementwise multiply arr1 by arr2 |
|`np.divide(arr1,arr2)`    | Elementwise divide arr1 by arr2 |
|`np.power(arr1,arr2)`     | Elementwise raise arr1 to the power of arr2 |
|`np.array_equal(arr1,arr2)`| Returns True if the arrays have the same elements and shape |
|`np.sqrt(arr)`            |  Square root of each element in the array                    |
|`np.sin(arr)`             |  Sine of each element in the array                           |
|`np.log(arr)`             |  Natural log of each element in the array                    |
|`np.abs(arr)`             |  Absolute value of each element in the array                 |
|`np.ceil(arr)`            |  Rounds each element up to the nearest integer               |
|`np.floor(arr)`           |  Rounds each element down to the nearest integer             |
|`np.round(arr)`           |  Rounds each element to the nearest integer (halves go to the nearest even value) |
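
The three rounding functions are easy to confuse, so here is a brief sketch comparing them on the same input. Note that the results are still floats, and that `np.round` sends exact halves to the nearest even value. The name `values` is illustrative:

```python
import numpy as np

values = np.array([0.5, 1.5, 2.5, -1.2])

rounded_up = np.ceil(values)
rounded_down = np.floor(values)
rounded = np.round(values)

print(rounded_up)    # [ 1.  2.  3. -1.]
print(rounded_down)  # [ 0.  1.  2. -2.]
print(rounded)       # [ 0.  2.  2. -1.]
```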

### Here are a few more examples from the list above

In [None]:
# Adding raw lists is just appending
[1, 2, 3] + [4, 5, 6]

[1, 2, 3, 4, 5, 6]

In [None]:
# Adds elements
np.array([1, 2, 3]) + np.array([4, 5, 6])

array([5, 7, 9])

In [None]:
# Same as above with built-in method
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
np.add(x, y)

array([5, 7, 9])

In [13]:
np.subtract(x,y)

array([-3, -3, -3])

## Multidimensional Arrays
NumPy arrays are also very useful for storing multidimensional data such as matrices. Notice how NumPy tries to nicely align the elements.

In [None]:
# An ordinary nested list
y = [[1, 2], [3, 4]]
print(type(y))
y

<class 'list'>


[[1, 2], [3, 4]]

In [None]:
# Reformatted as a NumPy array
y = np.array([[1, 2], [3, 4]])
print(type(y))
y

<class 'numpy.ndarray'>


array([[1, 2],
       [3, 4]])

In [15]:
y=[[1,2],[3,4,5]]
print(type(y))

<class 'list'>


In [19]:
y=np.array([[1,2],[3,4]])
print(y)
print(y.shape)

[[1 2]
 [3 4]]
(2, 2)


## The Shape Attribute
One of the most important attributes of a NumPy array is its shape.

In [None]:
y.shape

(2, 2)

In [None]:
y = np.array([[1, 2, 3],[4, 5, 6]])
print(y.shape)
y

(2, 3)


array([[1, 2, 3],
       [4, 5, 6]])

In [22]:
y = np.array([[1, 2],[3, 4],[5, 6]])
print(y.shape)
y

(3, 2)


array([[1, 2],
       [3, 4],
       [5, 6]])

### We can also work with higher-dimensional data, such as 3-dimensional arrays
<img src="https://curriculum-content.s3.amazonaws.com/data-science/images/Image_195_3D array.png" width=500>

In [None]:
y = np.array([[[1, 2],[3, 4],[5, 6]],
             [[1, 2],[3, 4],[5, 6]]
             ])
print(y.shape)
y

(2, 3, 2)


array([[[1, 2],
        [3, 4],
        [5, 6]],

       [[1, 2],
        [3, 4],
        [5, 6]]])

## Built-in Methods for Creating Arrays
NumPy also has several built-in methods for creating arrays that are useful in practice. These methods are particularly useful:
* `np.zeros(shape)`
* `np.ones(shape)`
* `np.full(shape, fill)`

In [None]:
# One dimensional; 5 elements
np.zeros(5)

array([0., 0., 0., 0., 0.])

In [None]:
# Two dimensional; 2x2 matrix
np.zeros([2, 2])

array([[0., 0.],
       [0., 0.]])

In [None]:
# 2 dimensional;  3x5 matrix
np.zeros([3, 5])

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [None]:
# 3 dimensional; 3 4x5 matrices
np.zeros([3, 4, 5])

array([[[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]]])

### Similarly the `np.ones()` method returns an array of ones

In [None]:
np.ones(5)

array([1., 1., 1., 1., 1.])

In [None]:
np.ones([3, 4])

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

### The `np.full()` method allows you to create an array of arbitrary values

In [None]:
# Create a 1d array with 5 elements, all of which are 3
np.full(5, 3)

array([3, 3, 3, 3, 3])

In [None]:
# Create a 1d array with 5 elements, filling them with the values 0 to 4
np.full(5, range(5))

array([0, 1, 2, 3, 4])

In [None]:
# Sadly this trick won't work for multidimensional arrays
np.full([2, 5], range(10))

ValueError: could not broadcast input array from shape (10) into shape (2,5)
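
One possible way to get a 2x5 array holding the values 0 to 9 is to build them as a flat array and then reshape it. This sketch uses `np.arange` and `reshape`, neither of which appears elsewhere in this lesson, so treat it as a side note rather than part of the material above:

```python
import numpy as np

# Build the values 0..9 as a flat array, then arrange them as 2 rows of 5
grid = np.arange(10).reshape(2, 5)
print(grid)
# [[0 1 2 3 4]
#  [5 6 7 8 9]]
```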

In [None]:
# NumPy also has useful built-in mathematical constants
np.full([2, 5], np.pi)

array([[3.14159265, 3.14159265, 3.14159265, 3.14159265, 3.14159265],
       [3.14159265, 3.14159265, 3.14159265, 3.14159265, 3.14159265]])

## NumPy Array Subsetting

You can subset NumPy arrays in much the same way as you slice lists in Python.

In [None]:
x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
print(x.shape)
x

(4, 3)


array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

In [28]:
x[3]

array([10, 11, 12])

In [24]:
# Retrieving the first row
x[0]

array([1, 2, 3])

In [34]:
x[:,1:3]

array([[ 2,  3],
       [ 5,  6],
       [ 8,  9],
       [11, 12]])

In [None]:
# Retrieving all rows after the first row
x[1:]

array([[ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

### This becomes particularly useful in multidimensional arrays when we can slice on multiple dimensions

In [None]:
# All rows, column 0
x[:,0]

array([ 1,  4,  7, 10])

In [35]:
x

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

In [None]:
# Rows at index 2 and 3, columns at index 1 and 2 (the stop index is excluded)
x[2:4,1:3]

array([[ 8,  9],
       [11, 12]])

### Notice that you can't slice in multiple dimensions naturally with built-in lists

In [None]:
x = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
x

[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]

In [None]:
x[0]

[1, 2, 3]

In [None]:
x[:,0]

TypeError: list indices must be integers or slices, not tuple

In [None]:
# To slice along a second dimension with lists we must verbosely use a list comprehension
[i[0] for i in x]

[1, 4, 7, 10]

In [None]:
# Doing this in multiple dimensions with lists
[i[1:3] for i in x[2:4]]

[[8, 9], [11, 12]]

### 3D Slicing

In [None]:
# With an array
x = np.array([
              [[1,2,3], [4,5,6]],
              [[7,8,9], [10,11,12]]
             ])
x

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [None]:
x.shape

(2, 2, 3)

In [None]:
x[:,:,-1]

array([[ 3,  6],
       [ 9, 12]])
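
In a 3-D array each comma-separated position in the index refers to one axis, so `x[:,:,-1]` keeps every block and every row but takes only the last element of each row. The sketch below spells out a few such slices on the same array; the result names are illustrative:

```python
import numpy as np

x = np.array([
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
])  # shape (2, 2, 3)

first_block = x[0]        # the first 2-D block, shape (2, 3)
first_rows = x[:, 0, :]   # the first row of every block, shape (2, 3)
last_items = x[:, :, -1]  # the last element of every row, shape (2, 2)

print(first_rows)
# [[1 2 3]
#  [7 8 9]]
print(last_items)
# [[ 3  6]
#  [ 9 12]]
```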

## Summary

Great! You learned about a bunch of NumPy commands. Now, let's move over to the lab to put your new skills into practice!

In [36]:
# Checking the NumPy version
import numpy as np
print(np.__version__)

1.25.2


In [40]:
arr=np.array([1,2,3,4,5])
print(type(arr))
print(arr)

<class 'numpy.ndarray'>
[1 2 3 4 5]


In [39]:
# Use a tuple to create a NumPy array:
arr=np.array((1,2,3,4,5))
print(arr)

[1 2 3 4 5]


## 0-D Arrays
0-D arrays, or scalars, are the elements in an array. Each value in an array is a 0-D array.

In [43]:
arr=np.array(42)
print(arr)

42


## 1-D arrays
An array that has 0-D arrays as its elements is called a uni-dimensional or 1-D array.
These are the most common and basic arrays.

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)

## 2-D Arrays
An array that has 1-D arrays as its elements is called a 2-D array.
These are often used to represent matrices or 2nd-order tensors.

In [45]:
#Create a 2-D array containing two arrays with the values 1,2,3 and 4,5,6:
arr=np.array([[1,2,3],[4,5,6]])
print(arr)
arr.shape

[[1 2 3]
 [4 5 6]]


(2, 3)

## 3-D Arrays
An array that has 2-D arrays (matrices) as its elements is called a 3-D array.
These are often used to represent 3rd-order tensors.

In [50]:
#Create a 3-D array with two 2-D arrays, both containing two arrays with the values 1,2,3 and 4,5,6:
arr=np.array([[[1,2,3],[4,5,6]],[[1,2,3],[4,5,6]]])
print(arr)
arr.shape

[[[1 2 3]
  [4 5 6]]

 [[1 2 3]
  [4 5 6]]]


(2, 2, 3)

## Check Number of Dimensions

In [51]:
a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)


0
1
2
3


In [53]:
# Higher Dimensional Arrays
arr=np.array([1,2,3,4,5],ndmin=5)
print(arr)
print('number of dimensions:',arr.ndim)

[[[[[1 2 3 4 5]]]]]
number of dimensions: 5


In [55]:
import numpy as np

arr = np.array([1, 2, 3, 4])
arr[0]

1

In [56]:
arr[2]+arr[3]

7

In [66]:
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
arr[0,1]

2

In [67]:
#arr[1][4]
arr[1,4]

10

In [68]:
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(arr)
#arr[1][0][2]
arr[1,0,2]


[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]


9

In [69]:
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
arr[1,-1]

10

In [71]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[1:5])

[2 3 4 5]


In [74]:
#Slice elements from index 4 to the end of the array:
print(arr[4:])

[5 6 7]


In [75]:
print(arr[:4])

[1 2 3 4]


In [76]:
print(arr[-3:-1])

[5 6]


In [77]:
print(arr[1:5:2])

[2 4]


In [79]:
arr[::2]

array([1, 3, 5, 7])
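
The slices above all follow the pattern `start:stop:step`, where the stop index is excluded and any part may be left out. As a small sketch of a few more combinations (the result names are illustrative):

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

every_other = arr[1:6:2]   # indices 1, 3 and 5
reversed_arr = arr[::-1]   # a negative step walks backwards
last_two = arr[-2:]        # negative indices count from the end

print(every_other)   # [2 4 6]
print(reversed_arr)  # [7 6 5 4 3 2 1]
print(last_two)      # [6 7]
```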

## Slicing 2-D Arrays


In [80]:
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr[1, 1:4])

[7 8 9]


In [81]:
# From both rows, return the element at index 2:
print(arr[0:2,2])

[3 8]


In [83]:
# From both rows, slice index 1 to index 4 (not included); this returns a 2-D array:
print(arr[0:2,1:4])

[[2 3 4]
 [7 8 9]]


## NumPy Data Types

In [85]:
arr=np.array([1,2,3,4,5])
print(arr.dtype)

int64


In [86]:
type(arr)

numpy.ndarray

In [92]:
arr=np.array(['apple','Cherry','banana'])
print(arr.dtype)

<U6
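
The dtype `<U6` reads as a little-endian (`<`) Unicode string (`U`) of at most 6 characters, which is the length of the longest entry. The sketch below inspects that dtype and shows one consequence of the fixed width; the name `fruits` is illustrative:

```python
import numpy as np

fruits = np.array(['apple', 'Cherry', 'banana'])

print(fruits.dtype.kind)      # U  (Unicode string)
print(fruits.dtype.itemsize)  # 24 (6 characters x 4 bytes each)

# Assigning a longer string silently truncates it to 6 characters
fruits[0] = 'pineapple'
print(fruits[0])              # pineap
```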


## Creating Arrays With a Defined Data Type

In [95]:
arr=np.array([1,2,3,4],dtype='S')
print(arr)
print(arr.dtype)

[b'1' b'2' b'3' b'4']
|S1


In [96]:
arr=np.array([1,2,3,4],dtype='i4')
print(arr)
print(arr.dtype)

[1 2 3 4]
int32


## Converting Data Type on Existing Arrays
The `astype()` method creates a copy of the array and lets you specify the new data type as a parameter.

In [99]:
arr=np.array([1.1,2.1,3.1])
newarr=arr.astype(dtype='i')
print(newarr)
print(newarr.dtype)

[1 2 3]
int32


In [100]:
#Change data type from float to integer by using int as parameter value:
newarr=arr.astype(int)
print(newarr)
print(newarr.dtype)

[1 2 3]
int64


In [101]:
#Change data type from integer to boolean:
arr = np.array([1, 0, 3])
newarr=arr.astype(bool)
print(newarr)
print(newarr.dtype)

[ True False  True]
bool
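
One detail worth noting is that converting floats to integers with `astype()` drops the fractional part instead of rounding. A minimal sketch of the difference, using illustrative names:

```python
import numpy as np

arr = np.array([-1.7, 1.7, 2.5])

truncated = arr.astype(int)          # drops the fractional part (toward zero)
rounded = np.round(arr).astype(int)  # round first if rounding is what you want

print(truncated)  # [-1  1  2]
print(rounded)    # [-2  2  2]
print(arr)        # [-1.7  1.7  2.5]  (the original array is unchanged)
```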


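## copy
A copy is a new array that owns its data, so changes made to the original array do not affect the copy.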

In [103]:
arr = np.array([1,2,3,4,5])
x=arr.copy()
arr[0] =42
print(arr)
print(x)

[42  2  3  4  5]
[1 2 3 4 5]


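## view
A view shares its data with the original array, so a change made through either one is visible in the other.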

In [104]:
arr = np.array([1, 2, 3, 4, 5])
x=arr.view()
arr[0]=42
print(arr)
print(x)

[42  2  3  4  5]
[42  2  3  4  5]


In [105]:
arr = np.array([1, 2, 3, 4, 5])
x=arr.view()
x[0]=32
print(arr)
print(x)

[32  2  3  4  5]
[32  2  3  4  5]
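
Basic slicing also returns a view, which is an easy way to modify an array by accident. The following sketch shows this and uses `np.shares_memory` to confirm it; the names `window` and `snapshot` are illustrative:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

window = arr[1:4]            # basic slicing returns a view
snapshot = arr[1:4].copy()   # an explicit copy is independent

window[0] = 99               # writes through to arr
print(arr)       # [ 1 99  3  4  5]
print(snapshot)  # [2 3 4]

print(np.shares_memory(arr, window))    # True
print(np.shares_memory(arr, snapshot))  # False
```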


## Check if Array Owns its Data
A copy owns its data, while a view does not. How can we check this?

Every NumPy array has the attribute **base**, which is **None** if the array owns its data.

Otherwise, the base attribute refers to the original object.

In [107]:
arr = np.array([1, 2, 3, 4, 5])
x=arr.copy()
y=arr.view()

print(x.base)
print(y.base)

None
[1 2 3 4 5]
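
The same check works for slices. As a small sketch, the snippet below compares `base` using `is` and also reads the `owndata` flag, which reports the same ownership information; the names `sliced` and `copied` are illustrative:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

sliced = arr[::2]          # a slice is a view of arr
copied = arr[::2].copy()   # a copy owns its data

print(sliced.base is arr)    # True
print(copied.base is None)   # True

print(sliced.flags.owndata)  # False
print(copied.flags.owndata)  # True
```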
