# Numpy Tutorial   
**Author : Ravi Mummigatti**  
**Credits : Code Academy , Cloudx , Numpy.org**  
**Pre-requisites : Basic knowledge of Python syntax , variables and operators**  

# Introduction

Sarah records her second-grade class’s grades in an online spreadsheet. Her web browser records that she visited that spreadsheet, in addition to every other site she’s visited. Those sites record her location, the time she spent on them, and where she visited next. The world is chock-full of all sorts of different datasets, and learning how to create, analyze, and manipulate these datasets can give us some insight and control over our digital surroundings.

In this lesson, we’ll be constructing and manipulating single-variable datasets.  
One way to think of a single-variable dataset is that it contains answers to a question. 
For instance, we might ask 100 people, “How tall are you?” Their heights in inches would form our dataset.
To work with our datasets, we’ll be using a powerful Python module known as NumPy, which stands for Numerical Python.

NumPy has many uses including:
*   Efficiently working with many numbers at once
*   Generating random numbers
*   Performing many different numerical functions (e.g., calculating sin, cos, tan, mean, and median)


To use NumPy with Python, import it at the top of your file using the following line: **import numpy as np**  
Writing as np allows us to use np as a shorthand for NumPy, which saves us time when calling a NumPy function (less typing = fewer errors!) 

NumPy includes a powerful data structure known as an **array**.  
A NumPy array is similar to a Python list: a data structure that organizes multiple items.   
Each item can technically be of any type (strings, numbers, or even other arrays), although in practice all elements of an array share a single data type.

Arrays are most powerful when they are used to store numbers. This is because arrays give us special ways of performing mathematical operations that are both simpler to write and more efficient computationally. We’ll get more into this later.  

A NumPy array is a multi-dimensional array whose elements are usually numbers of the same data type. You can access each element of a NumPy array by its index. Indexes in a NumPy array start at 0.

e.g. my_array[1,3] refers to the element in the second row and fourth column.
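
The indexing rule above can be tried on a small array. This is a minimal sketch; the array `grid` and its values are made up for illustration.

```python
import numpy as np

# Hypothetical 3x4 array used only for illustration
grid = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12]])

element = grid[1, 3]     # second row, fourth column (indexes start at 0)
first_row = grid[0]      # the whole first row
last_col = grid[:, -1]   # the last column of every row

print(element)    # 8
print(first_row)  # values 1, 2, 3, 4
print(last_col)   # values 4, 8, 12
```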

**Axis in NumPy array**  
For a 2-D NumPy array:
    `axis = 0` runs down the rows of the array (vertically)
    `axis = 1` runs across the columns of the array (horizontally)
For a 3-D NumPy array, there is one more axis, i.e. axis = 2.

Axis information is used heavily when manipulating NumPy matrices (multi-dimensional arrays).
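
One way to see what the axis argument does is to sum along each axis. The sketch below uses a made-up 2x3 array named `table`.

```python
import numpy as np

# Hypothetical 2x3 array (2 rows, 3 columns)
table = np.array([[1, 2, 3],
                  [4, 5, 6]])

# axis=0 collapses the rows: one total per column
col_totals = table.sum(axis=0)   # 5, 7, 9

# axis=1 collapses the columns: one total per row
row_totals = table.sum(axis=1)   # 6, 15

print(col_totals)
print(row_totals)
```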

**Rank**  
The rank of a NumPy array is the number of dimensions of the array.

e.g. if there is a NumPy array (matrix) of dimension 2x3 (i.e. 2 rows and 3 columns), then the rank of this matrix (array) is 2 since it is a 2-dimensional array.

**Shape**  
The shape of a NumPy array is the dimensions of that array.

e.g. if an array (matrix) has dimensions of 2x3, then the shape of this array (matrix) is (2,3) i.e. 2 rows and 3 columns.

**Creating a NumPy array**  
A NumPy array can be created in either of the below ways:  
    Through passing a Python list  
    Through passing a Python tuple  



**Question :** What are some differences between an array and a list?

**Answer :** In Python, an array, or NumPy Array, and list share many similarities, but they also have some important differences on how they can be used.

* Both arrays and lists can hold multiple items of any type. You can also access individual items by indexes.

* One important, and probably the main difference, between them is that you can perform operations on an array, like addition, multiplication, and subtraction, like you would a vector in mathematics.

* If you have an array of numbers, you can add a single number to every value in the array with one operation. With a list, the same operation is not applied to every element: adding a number to a list raises a TypeError, and multiplying a list by an integer repeats the list instead.  

test = np.array([1, 2, 3, 4, 5])  
test += 10

The array is now [11, 12, 13, 14, 15]
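
The difference between a list and an array can be checked directly. This is a small sketch with illustrative variable names.

```python
import numpy as np

values = [1, 2, 3, 4, 5]

# Adding a number to a list is not supported and raises TypeError
list_error = None
try:
    values + 10
except TypeError as err:
    list_error = str(err)

# Multiplying a list repeats it instead of scaling each element
doubled_list = values * 2

# The same operations on an array act on every element
arr = np.array(values)
arr_plus_10 = arr + 10
arr_doubled = arr * 2

print(list_error)
print(doubled_list)
print(arr_plus_10)
print(arr_doubled)
```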

# Creating an Array

## Create Array manually from a list

A NumPy array looks a lot like a Python list:

my_array = np.array([1, 2, 3, 4, 5, 6])

We can transform a regular list into a NumPy array by using **np.array()** and saving the value to a new variable:

my_list = [1, 2, 3, 4, 5, 6]  
my_array = np.array(my_list)


Imagine you’re a teacher and you need to keep track of your student’s test scores. On the first test, the students received the following scores:

92, 94, 88, 91, 87

Create a NumPy array with these values and save it with the name test_1.


In [1]:
my_list = [92,94,88,91,87]
test_1 = np.array(my_list)
test_1

array([92, 94, 88, 91, 87])

In [2]:
sample_list = [1,2,3] # create a list

list_array = np.array(sample_list) # create an array by using the "np.array" method

print(list_array) # print the array

[1 2 3]


## Creating an Array by passing a tuple  


In [3]:
my_tup = (1,2,3) # create a tuple

tup_array = np.array(my_tup) # create an array by using "np.array" method

print(tup_array)

[1 2 3]


## Creating an Array from a CSV  
Typically, you won’t be entering data directly into an array. Instead, you’ll be importing the data from somewhere else.

We’re able to transform CSV (comma-separated values) files into arrays using the np.genfromtxt() function:

Consider the following CSV, sample.csv, (34,9,12,11,7)

We can import this into a NumPy array using the following code: 
**csv_array = np.genfromtxt('sample.csv', delimiter=',')**  

Note that in this case, our file sample.csv has values separated by commas, so we use delimiter=',', but sometimes you’ll find files with other delimiters, the most common being tabs or colons.

Once imported, this CSV will create the array

...csv_array  
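array([34.,  9., 12., 11.,  7.])

By default, np.genfromtxt() reads the values as floating-point numbers, which is why each value is shown with a decimal point.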
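
To try this without creating a file, the sketch below replaces sample.csv with an in-memory `io.StringIO` object holding the same text. That substitution is an assumption made for convenience; np.genfromtxt() accepts file-like objects as well as file names.

```python
import io
import numpy as np

# In-memory stand-in for the contents of sample.csv
csv_text = "34,9,12,11,7"

# Default: values are read as floats
csv_array = np.genfromtxt(io.StringIO(csv_text), delimiter=',')
print(csv_array.dtype)   # float64

# Passing dtype=int asks for integers instead
csv_ints = np.genfromtxt(io.StringIO(csv_text), delimiter=',', dtype=int)
print(csv_ints)
```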

## Time comparison of a NumPy operation and a corresponding Python code

Let us create the function below (multiply_loops), which takes two arrays as input and computes their matrix product using plain Python loops.

In [4]:
def multiply_loops(A, B):
    c=np.zeros((A.shape[0], B.shape[1]))
    for i in range(A.shape[0]):
        for k in range(B.shape[1]):
            c[i,k] = 0
            for j in range(B.shape[0]):
                n = A[i,j] * B[j,k]
                c[i,k] += n
    return c

Now, let us create the function below (multiply_vector), which takes two arrays as input and computes their matrix product using NumPy's matrix multiplication operator (@).

In [5]:
def multiply_vector(A, B):
     return A @ B

Let us create two randomly generated 100x100 matrices - X and Y - to test the above functions

In [6]:
X = np.random.random((100, 100))
Y = np.random.random((100, 100))

Now we will run the %timeit magic command (available in IPython and Jupyter), which reports the time taken by each of these functions

In [7]:
%timeit multiply_loops(X, Y)
%timeit multiply_vector(X, Y)

545 ms ± 7.18 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
34.4 µs ± 759 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)


**Array multiplication is several orders of magnitude faster than standard loops (about 545 ms versus 34.4 µs in the run above)**
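
Outside a notebook, %timeit is not available, but the standard-library `timeit` module can be used instead. The sketch below is one possible way to do that, with smaller 30x30 matrices and an arbitrary seed so it runs quickly. It also checks that both approaches give the same result. Actual timings depend on the machine, so none are shown.

```python
import timeit
import numpy as np

def multiply_loops(A, B):
    """Matrix product computed with explicit Python loops."""
    C = np.zeros((A.shape[0], B.shape[1]))
    for i in range(A.shape[0]):
        for k in range(B.shape[1]):
            for j in range(B.shape[0]):
                C[i, k] += A[i, j] * B[j, k]
    return C

rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily
X = rng.random((30, 30))
Y = rng.random((30, 30))

# Both approaches should agree up to floating-point rounding
loop_result = multiply_loops(X, Y)
vector_result = X @ Y
same = np.allclose(loop_result, vector_result)

# Total time for 3 calls of each version
loop_time = timeit.timeit(lambda: multiply_loops(X, Y), number=3)
vector_time = timeit.timeit(lambda: X @ Y, number=3)

print(same)
print(f"loops: {loop_time:.4f} s, vectorized: {vector_time:.6f} s")
```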

# Creating Special Arrays

In addition to the normal NumPy arrays that we created, there are a few special NumPy arrays that you can create.

For example:  
1. A NumPy array with all elements equal to zero (using the zeros() function of NumPy)
2. A NumPy array with all elements equal to one (using the ones() function of NumPy)
3. A NumPy array with all elements equal to a specific given value (using the full() function of NumPy)
4. An identity matrix (array)


### Creating a NumPy array with all elements equal to zero (using the zeros() function of NumPy)

In [8]:
# Create a tuple indicating dimensions of the desired array
tup_dim = (3,4)

# Now, create your NumPy array by passing the above tuple (tup_dim) to zeros() function
my_zero_array = np.zeros(tup_dim)

# print the array
print(my_zero_array)

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]


### Creating a NumPy array with all elements equal to one (using the ones() function of NumPy)

In [9]:
# Create a tuple indicating dimensions of the desired array
tup_dim = (3,4)

# Now, create your NumPy array by passing the above tuple (tup_dim) to ones() function
my_one_array = np.ones(tup_dim)

# print the array
print(my_one_array)

[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]


### Creating a NumPy array with all elements equal to a specific given value (using the full() function of NumPy)

In [10]:
# Create a tuple indicating dimensions of the desired array
tup_dim = (3,4)

# Now, create your NumPy array by passing the above tuple (tup_dim) to full() function , 
# along with the desired value that you want the array to be filled with (e.g. say value 7)
my_seven_array = np.full(tup_dim , 7)

# print the array
print(my_seven_array)


[[7 7 7 7]
 [7 7 7 7]
 [7 7 7 7]]
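
The outputs above show zeros and ones as floats (0. and 1.) but the sevens as integers. The sketch below illustrates why: zeros() and ones() default to a float data type, while full() takes its type from the fill value. The optional dtype argument overrides the default.

```python
import numpy as np

float_zeros = np.zeros((2, 3))             # default dtype is float64
int_zeros = np.zeros((2, 3), dtype=int)    # request integers explicitly

sevens = np.full((2, 3), 7)                # integer fill value, integer array
float_sevens = np.full((2, 3), 7.0)        # float fill value, float array

print(float_zeros.dtype, int_zeros.dtype)
print(sevens.dtype, float_sevens.dtype)
```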


###  Creating Identity matrix or array  
Identity matrix (array) is a square matrix with all its elements as zero (0) except for the diagonal elements whose value is one (1). That is, all the main diagonal values are 1s (one).

You can create an Identity matrix by using numpy's identity() function by passing the desired dimension of the matrix (array) and the data type required.

e.g. np.identity(2, dtype=float) will create a 2x2 matrix with all its values as 0.0 except for the diagonal values which will be 1.0

In [11]:
# Create a 4X4 Identity Matrix of floats
my_identity = np.identity(4 , dtype = np.float64)

# print the array
print(my_identity)

[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
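
A quick way to see why the identity matrix matters: multiplying any matrix by it leaves the matrix unchanged. The matrix `M` below is made up for illustration.

```python
import numpy as np

# Hypothetical 2x2 matrix
M = np.array([[2.0, 3.0],
              [4.0, 5.0]])

I = np.identity(2)

# Matrix product with the identity returns the original matrix
product = M @ I
print(product)

# np.eye(n) builds the same identity matrix
same_as_eye = np.array_equal(np.eye(2), I)
print(same_as_eye)   # True
```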


### Creating special Types of Arrays with Random Values   
A NumPy array filled with random values (using the random.rand() function of NumPy)

The NumPy array will contain random float values drawn uniformly from 0.0 (inclusive) to 1.0 (exclusive).

In [12]:
# Create your NumPy array by passing the desired dimensions of the array (say 3 rows and 4 columns) to random.rand()
my_random_array = np.random.rand(3,4)

my_random_array

array([[0.01643945, 0.17882462, 0.24134019, 0.12463315],
       [0.98457927, 0.72765622, 0.03883278, 0.96251353],
       [0.04902869, 0.56146551, 0.77077972, 0.14665655]])
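
Random values differ on every run. If repeatable numbers are needed, one option is NumPy's `default_rng` generator with a fixed seed. This is a sketch of an alternative to random.rand(), not the method used above, and the seed 42 is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same numbers
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

first = rng_a.random((3, 4))    # 3x4 floats in [0.0, 1.0)
second = rng_b.random((3, 4))

print(np.array_equal(first, second))   # True
```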

### Creating a NumPy array by passing a range of values (using arange() function of NumPy)  
Suppose you want to create a NumPy array whose values lie between 10 and 30, with a gap (step) of 5 between consecutive values.

To create this NumPy array, pass the start value (10), the end value (30) and the step (5) to arange().  

The resulting NumPy array is filled with values from 10 up to, but not including, 30, with a step of 5 between consecutive values.

**Note: the start value is 10, the step is 5, and the end value 30 is not included**

In [13]:
# create an array with values between 10 and 30 with a gap of 5
my_range_array = np.arange(10 , 30 , 5)

print(my_range_array)

[10 15 20 25]
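
Because the end value is excluded, including 30 requires an end value one step further. The sketch below shows this along with a negative step, using illustrative variable names.

```python
import numpy as np

exclusive = np.arange(10, 30, 5)       # 10, 15, 20, 25 (30 is excluded)
inclusive = np.arange(10, 30 + 5, 5)   # 10, 15, 20, 25, 30
descending = np.arange(30, 10, -5)     # 30, 25, 20, 15 (10 is excluded)

print(exclusive)
print(inclusive)
print(descending)
```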


### Creating a NumPy array by passing an equally spaced range of values (using linspace() function of NumPy)  
Suppose you want to create a NumPy array whose values lie between 0 and 36, with an equal gap between consecutive values (equally spaced values).

To create this NumPy array, pass the start value (0), the end value (36), and the total number of values you want (say 4) to linspace().  

The resulting NumPy array has a total of 4 values ranging from 0 to 36 (including 36), with an equal gap of 12 between consecutive values.

In [14]:
# create an equally spaced array starting at 0 , ending at 36 , containing 4 values (3 equal gaps)
my_spaced_array = np.linspace(0 , 36 , 4)

my_spaced_array

array([ 0., 12., 24., 36.])
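
The gap between values follows from (end - start) / (number of values - 1). The sketch below asks linspace() to return that gap through its `retstep` argument, and shows the effect of excluding the end value with `endpoint=False`.

```python
import numpy as np

# retstep=True returns the array together with the spacing it used
values, step = np.linspace(0, 36, 4, retstep=True)
print(values)   # 0, 12, 24, 36
print(step)     # 12.0, i.e. (36 - 0) / (4 - 1)

# endpoint=False leaves out the end value, so the spacing becomes 36 / 4
no_end = np.linspace(0, 36, 4, endpoint=False)
print(no_end)   # 0, 9, 18, 27
```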

# Attributes of a NumPy Array  
NumPy array (ndarray class) is the most used construct of NumPy in Machine Learning and Deep Learning. Let us look into some important attributes of this NumPy array.

In [15]:
# create an array
array_A = np.array([[3,4,6], [0,8,1]])
array_A

array([[3, 4, 6],
       [0, 8, 1]])

## ndarray.ndim : dimensions  
ndim represents the number of dimensions (axes) of the ndarray.

e.g. for this 2-dimensional array [ [3,4,6], [0,8,1]], value of ndim will be 2.   
This ndarray has two dimensions (axes) - rows (axis=0) and columns (axis=1)

In [16]:
my_array = np.array([[1, 4, 5, 6], [7, 8, 9, 10], [11, 12, 14, 16]])
my_array

array([[ 1,  4,  5,  6],
       [ 7,  8,  9, 10],
       [11, 12, 14, 16]])

In [17]:
my_array.ndim

2

In [18]:
array_A.ndim

2

## ndarray.shape : size  
shape is a tuple of integers representing the size of the ndarray in each dimension.

e.g. for this 2-dimensional array [ [3,4,6], [0,8,1]], value of shape will be (2,3) because this ndarray has two dimensions - rows and columns - and the number of rows is 2 and the number of columns is 3

In [19]:
print(array_A)
array_A.shape

# array with 2 rows and 3 columns

[[3 4 6]
 [0 8 1]]


(2, 3)

In [20]:
print(my_array)
my_array.shape

# array of 3 rows and 4 columns

[[ 1  4  5  6]
 [ 7  8  9 10]
 [11 12 14 16]]


(3, 4)

## ndarray.size : total number of elements  
size is the total number of elements in the ndarray. It is equal to the product of elements of the shape.   
e.g. for this 2-dimensional array [ [3,4,6], [0,8,1]], shape is (2,3), size will be product (multiplication) of 2 and 3 i.e. (2*3) = 6. Hence, the size is 6.

In [21]:
print(array_A)

array_A.size

# array of 6 elements i.e. 2 Rows * 3 Columns

[[3 4 6]
 [0 8 1]]


6

In [22]:
print(my_array)

my_array.size

# array of 12 elements i.e. 3 Rows * 4 Columns

[[ 1  4  5  6]
 [ 7  8  9 10]
 [11 12 14 16]]


12

## ndarray.dtype : data types  
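dtype tells the data type of the elements of a NumPy array. In a NumPy array, all the elements have the same data type.

e.g. for this NumPy array [ [3,4,6], [0,8,1]], dtype will be a platform-dependent integer type: int64 on most systems, or int32 on Windows with NumPy versions before 2.0 (as in the output below).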

In [23]:
array_A.dtype

# array of integers

dtype('int32')

In [24]:
my_array.dtype

# array of integers

dtype('int32')

## ndarray.itemsize : size in bytes of every element  
itemsize returns the size (in bytes) of each element of a NumPy array.

e.g. for this NumPy array [ [3,4,6], [0,8,1]], itemsize is 4 here, because the array holds 32-bit integers (int32) and each one takes 4 bytes. With int64 elements, itemsize would be 8.

In [25]:
array_A.itemsize

4

In [26]:
my_array.itemsize

4
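
Since the default integer type depends on the platform, the sketch below sets the dtype explicitly to show how itemsize follows from it. It also uses the `nbytes` attribute, which equals size multiplied by itemsize.

```python
import numpy as np

a32 = np.array([[3, 4, 6], [0, 8, 1]], dtype=np.int32)
a64 = a32.astype(np.int64)   # same values stored as 64-bit integers

print(a32.itemsize, a32.nbytes)   # 4 bytes per element, 6 * 4 = 24 bytes in total
print(a64.itemsize, a64.nbytes)   # 8 bytes per element, 6 * 8 = 48 bytes in total
```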

# Loading Text Files 

Although Pandas provides better ways and constructs to load a dataset from various sources like files, databases, etc. we should also know the NumPy constructs for the same.

There are two ways (constructs) in NumPy to load data from a text file:

(1) using loadtxt() function

(2) using genfromtxt() function

loadtxt() function provides less flexibility, whereas genfromtxt() function provides more flexibility.

For example, the genfromtxt() function can handle missing values in the loaded dataset, whereas the loadtxt() function cannot.  
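
As a small illustration of the missing-value handling, the sketch below reads made-up CSV text from an in-memory `io.StringIO` object (an assumption, used so that no file is needed). The second row has an empty middle field.

```python
import io
import numpy as np

# Hypothetical data: the middle value of the second row is missing
text = "1,2,3\n4,,6\n"

# By default a missing numeric field becomes nan
with_nan = np.genfromtxt(io.StringIO(text), delimiter=',')
print(with_nan)

# filling_values supplies a replacement for missing fields
filled = np.genfromtxt(io.StringIO(text), delimiter=',', filling_values=0)
print(filled)
```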

Example of loadtxt()  
..........................................................................  
import numpy as np  
name_arr, address_arr, zipcode_arr = np.loadtxt('my_file.txt', dtype=str, skiprows=2, unpack=True)  
.........................................................................  

The above piece of code will load the data from the my_file.txt text file.

dtype=str means, read every value as text. Without it, loadtxt() tries to convert each value to a float, which fails for text columns such as names and addresses.

skiprows=2 means, skip the first two rows of the my_file.txt file while loading the data.

unpack=True means, unpack the columns of the dataset being loaded and return the data of each column in a separate array (name column data in the name_arr array, address column data in the address_arr array, zipcode column data in the zipcode_arr array).

unpack=False means, return only a single array as output from the loadtxt() function
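
The effect of skiprows and unpack can be seen on a small numeric example. The text below is made up and read from an in-memory `io.StringIO` object instead of a file; both are assumptions for illustration.

```python
import io
import numpy as np

# Hypothetical file contents: two header lines, then two rows of data
text = "measurements\nx,y,z\n1,2,3\n4,5,6\n"

# unpack=True: one array per column
x, y, z = np.loadtxt(io.StringIO(text), delimiter=',', skiprows=2, unpack=True)
print(x, y, z)

# unpack=False (the default): a single 2-D array with one row per line
table = np.loadtxt(io.StringIO(text), delimiter=',', skiprows=2)
print(table.shape)   # (2, 3)
```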

## Loading a text file data using NumPy's loadtxt() function

#### import required libraries

In [27]:
import numpy as np
import os

#### load using pandas  
Now we will use pandas to load data from a large csv file (California housing dataset) and create a small csv file (of housing data) by extracting only few rows of data from this large housing.csv file.

We are creating a smaller csv file of data, just for our convenience, to make it easy for us to load it using loadtxt() function

In [28]:
import pandas as pd
# defining housing.csv file path
HOUSING_PATH = 'C:/Users/rmummiga/Desktop/PYTHON_TRAINING/PYTHON_CODE_ACADEMY'
# reading the large housing.csv file using pandas
housing_raw = pd.read_csv(os.path.join(HOUSING_PATH, "housing.csv"))
# extracting only a few rows (5 rows) of data from the pandas dataframe 'housing_raw'
my_df = housing_raw.iloc[ : 5]
# creating a new small csv file - 'housing_short.csv' - containing the above extracted 5 rows of data
my_df.to_csv('housing_short.csv', index=False)

#### Load using Numpy  
Now, let us load the csv file - housing_short.csv - using NumPy's loadtxt() function

Define a variable called FILE and assign to it the string value housing_short.csv.

In [29]:
FILE = 'housing_short.csv'

#### Create a Function to load data file  
We define a function called load_housing_data(), which takes a filename (FILE) as input and loads this file using NumPy's loadtxt() function

In [30]:
# import the required libraries
import pandas as pd
import os

# define the load data function
def load_housing_data(file):
    return np.loadtxt(file, dtype={'names': ('longitude','latitude','housing_median_age','total_rooms','total_bedrooms','population','households','median_income','median_house_value','ocean_proximity'),'formats': ('f8', 'f8', 'f8', 'f8', 'f8', 'f8', 'f8', 'f8', 'f8', '|S15')}, delimiter=',', skiprows=1, unpack=True)

# specify the file path 
HOUSING_PATH = 'C:/Users/rmummiga/Desktop/PYTHON_TRAINING/PYTHON_CODE_ACADEMY'

# specify the file and join it to the path
HOUSING_FILE = os.path.join(HOUSING_PATH, "housing.csv")

# specify name of the new file to be created : short version of the data with limited rows
mfile='housing_short.csv'

# read the data
housing_raw = pd.read_csv(HOUSING_FILE)
print(housing_raw.head(5))

# select only the first 5 rows
my_df = housing_raw.iloc[ : 5]

# write back and store as new file
my_df.to_csv(mfile,index=False)

# assign individual arrays
longitude_arr,latitude_arr,housing_median_age_arr,total_rooms_arr,total_bedrooms_arr,population_arr,households_arr,median_income_arr,median_house_value_arr,ocean_proximity_arr = load_housing_data(mfile)

print(median_house_value_arr)

   longitude  latitude  housing_median_age  total_rooms  total_bedrooms  \
0    -122.23     37.88                  41          880           129.0   
1    -122.22     37.86                  21         7099          1106.0   
2    -122.24     37.85                  52         1467           190.0   
3    -122.25     37.85                  52         1274           235.0   
4    -122.25     37.85                  52         1627           280.0   

   population  households  median_income  median_house_value ocean_proximity  
0         322         126         8.3252              452600        NEAR BAY  
1        2401        1138         8.3014              358500        NEAR BAY  
2         496         177         7.2574              352100        NEAR BAY  
3         558         219         5.6431              341300        NEAR BAY  
4         565         259         3.8462              342200        NEAR BAY  
[452600. 358500. 352100. 341300. 342200.]


loadtxt() function parameters

* first parameter - file. It is the name of the file from which the data is to be loaded.

* second parameter - data type dtype of columns of the loaded csv file housing_short.csv. It is a Python dictionary with key as names of the columns, and values as the data types of these respective columns e.g. f8, |S15, etc.

* 'f8' means 64-bit floating-point number

* '|S15' means a byte string of up to 15 characters

* third parameter - delimiter. It is the character by which values in a row of our csv file are separated. For example, in our case values of a row of our csv file - housing_short.csv - are separated by ',' (comma)

* fourth parameter - skiprows. You can specify here, how many initial rows of the csv file you want to skip loading. E.g. you may want to skip the first row of this csv file, as it may contain header information in the first row, which you may not want to load.

* fifth parameter - unpack. When unpack is True, the returned array is transposed, so that arguments may be unpacked using x, y, z = loadtxt(...). When used with a structured data-type, arrays are returned for each field. The default value for unpack is False. But here we are returning the individual arrays, so we have set it to True.
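
The parameters above can be tried on a tiny in-memory example. This is a sketch under stated assumptions: the three column names and the two data rows are made up, the text is read from `io.StringIO` instead of a file, and 'U15' (a text string of up to 15 characters) is used in place of '|S15' so the values come back as ordinary Python strings. Without unpack, loadtxt() returns one structured array whose columns are accessed by name.

```python
import io
import numpy as np

# Hypothetical CSV text with a header row
text = "price,rooms,label\n452600,880,NEAR BAY\n358500,7099,INLAND\n"

# One name and one format per column
column_types = {'names': ('price', 'rooms', 'label'),
                'formats': ('f8', 'f8', 'U15')}

data = np.loadtxt(io.StringIO(text), dtype=column_types,
                  delimiter=',', skiprows=1)

print(data['price'])   # the price column as floats
print(data['label'])   # the label column as strings
```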

#### Call the Function load_housing_data

In [31]:
load_housing_data('housing_short.csv')

[array([-122.23, -122.22, -122.24, -122.25, -122.25]),
 array([37.88, 37.86, 37.85, 37.85, 37.85]),
 array([41., 21., 52., 52., 52.]),
 array([ 880., 7099., 1467., 1274., 1627.]),
 array([ 129., 1106.,  190.,  235.,  280.]),
 array([ 322., 2401.,  496.,  558.,  565.]),
 array([ 126., 1138.,  177.,  219.,  259.]),
 array([8.3252, 8.3014, 7.2574, 5.6431, 3.8462]),
 array([452600., 358500., 352100., 341300., 342200.]),
 array([b'NEAR BAY', b'NEAR BAY', b'NEAR BAY', b'NEAR BAY', b'NEAR BAY'],
       dtype='|S15')]

# Reshaping and Re-sizing Arrays  
resize() changes the size and shape of an array, so the result may hold more (or fewer) elements than the original.

NumPy offers two variants that fill the extra space differently when an array is enlarged:

* ndarray.resize(new_shape, refcheck=True) resizes the array in place. It keeps the existing values of the original array and fills the missing entries with zeros.
* np.resize(a, new_shape) returns a new array. It fills the extra entries with copies of the existing elements, repeated as many times as required.

The examples below use the in-place method ndarray.resize().

In [32]:
import numpy as np

my_arr = np.array([ [0,1], [2,3] ])

my_arr.resize( (2, 3) )  # need two rows and three columns

print(my_arr)

[[0 1 2]
 [3 0 0]]


In [33]:
my_arr = np.array([[1,2,3,4],[5,6,7,8]])

my_arr.resize((2,4))

my_arr

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])
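
The two resize variants in NumPy fill an enlarged array differently, which the sketch below compares side by side. The function np.resize() returns a new array, while the method ndarray.resize() changes the array in place. Passing refcheck=False is a choice made here so the in-place call also works in environments that hold extra references to the array.

```python
import numpy as np

original = np.array([[0, 1], [2, 3]])

# np.resize: new array, extra entries are repeated copies of the original
repeated = np.resize(original, (2, 3))
print(repeated)   # rows: 0 1 2 / 3 0 1

# ndarray.resize: in place, extra entries are filled with zeros
padded = original.copy()
padded.resize((2, 3), refcheck=False)
print(padded)     # rows: 0 1 2 / 3 0 0
```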

reshape()

reshape() function is used to create a new array of the same size (as the original array) but of different desired dimensions.

reshape() function will create an array with the same number of elements as the original array, i.e. of the same size as that of the original array. If you want to convert the original array to a bigger array, reshape() can’t add more elements(than available in the original array) to give you a bigger array.  

ndarray.reshape(new_shape)

There may be two cases, as mentioned below when we are reshaping the array:

(1) We already know the desired dimensions (say rows and columns) of the reshaped array.

(2) We know the desired number of columns, but don't know the desired number of rows of the reshaped array.  


Let us take the first case - Reshaping a numpy array when the number of rows and number of columns of the reshaped array is known.

This technique is used when you already know how many rows and columns your reshaped array will have.
.................................................................................    
my_arr = np.arange(6)  
print(my_arr)  
output will be  
[0,1,2,3,4,5]  
.................................................................................    
Now, let us reshape this array  
.................................................................................  
my_new_arr = my_arr.reshape(2,3)  
print(my_new_arr)  
output will be  
[ [0,1,2],
      [3,4,5] ]  
.................................................................................  

Now, let us take the second case - Reshaping a numpy array when the desired number of columns is known, but the desired number of rows is unknown.

This specific technique is used when you are sure about how many columns you want in the reshaped array, but you don't know how many rows there will be. The reshaped array still has the same number of elements as the original, so NumPy can work out the number of rows for you.

To achieve this, you use a value of '-1' in place of 'rows' field in the reshape() function, and use the desired value of the number of columns you want in the columns field in the reshape() function.

Example - Say you want to reshape an existing array to an array with 3 columns and an unknown number of rows, as shown in the code below:  
.....................................................  
my_arr = np.arange(9)

print(my_arr)

output will be

[0,1,2,3,4,5,6,7,8]  
.....................................................  

Now, let us reshape this array

my_new_arr = my_arr.reshape(-1,3)

print(my_new_arr)

output will be

    [ [0,1,2],
      [3,4,5],
      [6,7,8] ]  
.....................................................

When can't you use the reshape() function?

You can't use the reshape() function when the size of the original array is different from the size of your desired reshaped array. If you try to reshape() in that case, it will throw an error.
Example:  
.....................................................  
my_arr = np.arange(8)  
print(my_arr)  
output will be
[0,1,2,3,4,5,6,7]   
......................................................  
my_arr.reshape(2,3)  
the output will be an error as shown below  
ValueError: cannot reshape array of size 8 into shape (2,3)  
.....................................................  
Now, again try with some other shape as shown below:
my_arr.reshape(3,3)  
the output will be an error as shown below
ValueError: cannot reshape array of size 8 into shape (3,3)    
.......................................................
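
The error case can be handled in code by catching the ValueError. The sketch below does that and also shows a shape that does work for 8 elements, with -1 letting NumPy infer the number of rows.

```python
import numpy as np

arr = np.arange(8)

# 2 * 3 = 6 does not match the 8 elements, so reshape raises ValueError
message = None
try:
    arr.reshape(2, 3)
except ValueError as err:
    message = str(err)
print(message)

# 8 elements with 4 columns: NumPy infers 2 rows from the -1
inferred = arr.reshape(-1, 4)
print(inferred.shape)   # (2, 4)
```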


Difference between resize() and reshape() :

reshape() will create an array with the same number of elements as the original array, i.e. of the same ‘size’ as that of the original array. If you want to convert the original array to a bigger array, reshape() can’t add more elements(than available in the original array) to give you a bigger array. Hence, reshape() is used to create a new array of the same size (as the original array) but of different desired dimensions.

resize() can produce an array of a different size than the original array. When enlarging, np.resize() fills the extra entries with repeated copies of the original elements, whereas the in-place method ndarray.resize() fills them with zeros. resize() is used when the array needs a different size as well as different dimensions.

In [34]:
# create a 1-D numpy array called my_first_arr with values - 0,1,2,3,4,5,6 and 7
my_first_arr = np.arange(8)
print(my_first_arr)

[0 1 2 3 4 5 6 7]


In [35]:
# apply reshape() function on this my_first_arr array to convert this 1-D array into a (2x4) array and 
# store this newly created (2x4) array in a variable called my_new_arr
my_new_arr = my_first_arr.reshape(2,4)
print(my_new_arr)

[[0 1 2 3]
 [4 5 6 7]]


In [36]:
# create a 1-D numpy array called my_second_arr with values - 0,1,2,3,4,5,6,7 and 8,
my_second_arr = np.arange(9)
print(my_second_arr)

[0 1 2 3 4 5 6 7 8]


In [37]:
# apply reshape() function on this my_second_arr array to convert this 1-D array into an array with 
# 3 columns but unknown number of rows, and store this in a variable called my_updated_arr
my_updated_arr = my_second_arr.reshape(-1,3)
print(my_updated_arr)


[[0 1 2]
 [3 4 5]
 [6 7 8]]
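
One detail worth knowing: for a contiguous array such as the result of np.arange(), reshape() typically returns a view that shares memory with the original rather than an independent copy. The sketch below demonstrates this with illustrative names; call .copy() on the result when an independent array is needed.

```python
import numpy as np

base = np.arange(6)
view = base.reshape(2, 3)

# Both arrays use the same underlying data
shared = np.shares_memory(base, view)
print(shared)    # True

# A change made through the reshaped array is visible in the original
view[0, 0] = 99
print(base[0])   # 99

# An explicit copy is independent of the original
independent = base.reshape(2, 3).copy()
independent[0, 0] = -1
print(base[0])   # still 99
```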


# Operations with NumPy Arrays  
Generally, NumPy arrays are more efficient than lists.   
One reason is that they allow you to do element-wise operations.   
An element-wise operation allows you to quickly perform an operation, such as addition, on each element in an array.  
Common operations that are element-wise include addition, subtraction, multiplication, division, modulo and exponent

In [38]:
# With a list
l = [1, 2, 3, 4, 5] # list of values
l_plus_3 = []
for i in range(len(l)):
    l_plus_3.append(l[i] + 3)
    
print(l)
print(l_plus_3)

[1, 2, 3, 4, 5]
[4, 5, 6, 7, 8]


In [39]:
# With an array
a = np.array(l) # convert list to an array
a_plus_3 = a + 3 # add 3 to each element in the array and assign to a new variable

print(a)
print(a_plus_3)

[1 2 3 4 5]
[4 5 6 7 8]


As we can see, if we were to add 3 to every number in a list, we would have to use a for loop or a list comprehension.  

With an array, we can just add 3. The same is true for subtraction, multiplication, and division.

We can also use NumPy Arrays to find the squares or square roots of each value.

In [40]:
print(a) # array of numbers
print(a**2) # array of squares of each number in the array
print(np.sqrt(a)) # array of square roots of each number in the array

[1 2 3 4 5]
[ 1  4  9 16 25]
[1.         1.41421356 1.73205081 2.         2.23606798]


## adding a number to every element of an array

The students' grades are stored in 3 arrays.  
test_1 = np.array([92, 94, 88, 91, 87])  
test_2 = np.array([79, 100, 86, 93, 91])  
test_3 = np.array([87, 85, 72, 90, 92])  

It turns out that one of the questions on the third test had an error. Give each student two extra points and save the new array to test_3_fixed.


In [41]:
# create the arrays of test scores
test_1 = np.array([92, 94, 88, 91, 87])  
test_2 = np.array([79, 100, 86, 93, 91])  
test_3 = np.array([87, 85, 72, 90, 92]) 

In [42]:
# create the new array with modified scores
test_3_fixed = test_3 + 2
print(test_3)
print(test_3_fixed)

[87 85 72 90 92]
[89 87 74 92 94]


## elementwise addition

Arrays can also be added to or subtracted from each other in NumPy, **assuming the arrays have the same number of elements.**

When adding or subtracting arrays in NumPy, each element will be added/subtracted to its matching element.  


If we try to perform an operation, such as addition or subtraction, on two one-dimensional arrays of different lengths (neither of them of length 1), NumPy raises an error, specifically a **ValueError.**

When performing an operation on two arrays, each element of both arrays is essentially paired up from the first to the last element, and the operation is run on each pair of values. When one element is unpaired due to different array lengths, it will have no paired value, causing the error.

In [43]:
a = np.array([1, 2, 3, 4, 5])
b = np.array([6, 7, 8, 9, 10])
a + b


array([ 7,  9, 11, 13, 15])
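
The ValueError described above can be reproduced with two arrays of different lengths. This is a minimal sketch; the array `short` is made up for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
short = np.array([1, 2, 3])   # only 3 elements

# The elements cannot be paired up, so NumPy raises ValueError
message = None
try:
    a + short
except ValueError as err:
    message = str(err)

print(message)
```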

Let’s find the average of each student’s test scores to calculate their final grade for the semester.   
* Start by adding the three arrays together and save the answer to the variable total_grade.(Remember to use the fixed scores for test three.)  
* Now, divide total_grade by the number of tests taken to find the average score for each student. Save the answer to the variable final_grade

In [44]:
# create an array of total scores
total_grade = test_1 + test_2 + test_3_fixed

total_grade

array([260, 281, 248, 276, 272])

In [45]:
# create an array of final grade by dividing total scores by number of tests in this case 3
final_grade = total_grade / 3

print(final_grade)

[86.66666667 93.66666667 82.66666667 92.         90.66666667]


## elementwise multiplication

In [46]:
# Create two NumPy arrays - A_arr and B_arr
A_arr = np.array([ [ 5,9], [4, 7] ])
B_arr = np.array( [ [2, 8], [1, 6] ] )
print(A_arr)
print(B_arr)

# elementwise multiplication
M_arr = A_arr * B_arr
print(M_arr)

[[5 9]
 [4 7]]
[[2 8]
 [1 6]]
[[10 72]
 [ 4 42]]


## matrix multiplication i.e. dot product

In [47]:
# matrix multiplication / dot product
C_arr = np.array([ [ 5,9], [4, 7] ])
D_arr = np.array( [ [2, 8], [1, 6] ] )
print(C_arr)
print(D_arr)

# matrix multiplication
P_arr = np.dot(C_arr , D_arr)
print(P_arr)

[[5 9]
 [4 7]]
[[2 8]
 [1 6]]
[[19 94]
 [15 74]]
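
For two-dimensional arrays, the `@` operator gives the same result as np.dot(), and both differ from the element-wise `*` operator. The sketch below checks this using the same values as the cells above.

```python
import numpy as np

C_arr = np.array([[5, 9], [4, 7]])
D_arr = np.array([[2, 8], [1, 6]])

dot_product = np.dot(C_arr, D_arr)   # matrix multiplication
at_product = C_arr @ D_arr           # same operation, operator form
elementwise = C_arr * D_arr          # multiplies matching elements only

print(np.array_equal(dot_product, at_product))   # True
print(dot_product)    # rows: 19 94 / 15 74
print(elementwise)    # rows: 10 72 / 4 42
```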


## normal division

In [48]:
# elementwise division
R_div = np.array([ [ 25,65], [40, 14] ])
S_div = np.array([ [2, 10], [4, 5] ] )
print(R_div)
print(S_div)
T_div = R_div / S_div
print(T_div)

[[25 65]
 [40 14]]
[[ 2 10]
 [ 4  5]]
[[12.5  6.5]
 [10.   2.8]]


## integer division : floor division

In [49]:
# elementwise integer division
R_int_div = np.array([ [ 25,65], [40, 70] ])
S_int_div = np.array([ [2, 10], [8, 3] ] )
print(R_int_div)
print(S_int_div)
T_int_div = R_int_div // S_int_div
print(T_int_div)

[[25 65]
 [40 70]]
[[ 2 10]
 [ 8  3]]
[[12  6]
 [ 5 23]]


## modulo division

In [50]:
# elementwise modulo division
R_mod = np.array([ [ 20,65], [40, 70] ])
S_mod = np.array([ [2, 10], [8, 3] ] )
print(R_mod)
print(S_mod)
T_mod = R_mod % S_mod
print(T_mod)

[[20 65]
 [40 70]]
[[ 2 10]
 [ 8  3]]
[[0 5]
 [0 1]]
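
Floor division and modulo are two halves of the same calculation: the quotient times the divisor, plus the remainder, gives back the original value. A short sketch of that relationship, assuming `np.divmod` (which returns both parts at once) and using the shorter names `R` and `S` for illustration:

```python
import numpy as np

R = np.array([[20, 65], [40, 70]])
S = np.array([[2, 10], [8, 3]])

# np.divmod returns the pair (R // S, R % S)
quotient, remainder = np.divmod(R, S)

# Rebuild the original values from the two parts
rebuilt = quotient * S + remainder

print(quotient)
print(remainder)
print(np.array_equal(rebuilt, R))
```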


## conditional operators

In [84]:
# creating a new array based on conditions
U = np.array([ [ 22, 45], [90, 4 ] ] )
print(U)

#  Get all the elements of U which are less than 45
print(U[U<45])

[[22 45]
 [90  4]]
[22  4]


# Two-Dimensional Arrays  

In Python, we can create lists that are made up of other lists. Similarly, in NumPy we can create an array of arrays. 

If the arrays that make up our bigger array are all the same size, then it has a special name: a two-dimensional array.

In the previous exercises we had stored the students’ test scores in separate one-dimensional arrays for each test:

test_1 = np.array([92, 94, 88, 91, 87])  
test_2 = np.array([79, 100, 86, 93, 91])  
test_3 = np.array([87, 85, 72, 90, 92])  

But we could have also stored all of this data in a single, two-dimensional array:  

np.array([[92, 94, 88, 91, 87],   
          [79, 100, 86, 93, 91],  
          [87, 85, 72, 90, 92]])  

Here, each row represents a test, and each column represents a student. This allows us to store all of our data in a single array without losing any of its organization.

As we mentioned, a two-dimensional array is a list of lists where each list has the same number of elements. 

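This code will not create a two-dimensional array because the lists have different numbers of elements. Older NumPy versions build a one-dimensional array of list objects (with a deprecation warning); NumPy 1.24 and later raise a ValueError instead: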

np.array([[29, 49,  6],   
          [77,  1]])  

This code will not run because the outer [] around the two lists are missing, so NumPy receives the second list as a separate argument:

np.array([68, 16, 73],    
         [61, 79, 30])

In [51]:
#creating an array of test scores (Row = test scores , Column = students)
test_scores = np.array([[92,94,88,91,87], ### list 1 of 5 elements
                       [79,100,86,93,91], ### list 2 of 5 elements
                       [87,85,72,90,92]]) ### list 3 of 5 elements

# print the array
print(test_scores)

[[ 92  94  88  91  87]
 [ 79 100  86  93  91]
 [ 87  85  72  90  92]]
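
One way to confirm that an array really is two-dimensional is to inspect its `ndim` and `shape` attributes. A minimal sketch using the same scores; the names `n_dims`, `n_tests` and `n_students` are introduced here only for illustration:

```python
import numpy as np

test_scores = np.array([[92, 94, 88, 91, 87],
                        [79, 100, 86, 93, 91],
                        [87, 85, 72, 90, 92]])

n_dims = test_scores.ndim                 # number of axes
n_tests, n_students = test_scores.shape   # (rows, columns)

print(n_dims)
print(n_tests, n_students)
```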


In statistics, we often use two-dimensional arrays to represent a set of samples.   

For instance, if we flip a coin we can represent each head as a 1 and each tail as a 0.  

Create a one-dimensional array for a coin toss experiment that results in heads, tails, tails, heads, tails, and save it to the variable coin_toss.


In [52]:
# create an array for coin toss
coin_toss = np.array([1 , 0 , 0 , 1 , 0])

coin_toss

array([1, 0, 0, 1, 0])

We run the experiment again and get the following outcome: tails, tails, heads, heads, heads. 

Create a new array that represents both outcomes as a single experiment. Save the new array to coin_toss_again. 

In [53]:
# create a new array of coin toss that has both outcomes
coin_toss_again = np.array([[1 , 0 , 0 , 1 , 0], ## 1st coin toss outcome
                           [0,0,1,1,1]])  ## 2nd coin toss outcome

coin_toss_again

array([[1, 0, 0, 1, 0],
       [0, 0, 1, 1, 1]])

**Question :** Can we perform operations between 1-D and 2-D NumPy Arrays?  
**Answer :** Yes. When performing an operation between a 1-D and a 2-D NumPy array, NumPy applies the operation between the 1-D array and each row of the 2-D array individually.

As a result, it is only possible if the length of the 1-D array matches the number of elements in each row of the 2-D array (or the 1-D array has a single element).

In [54]:
# In the example below , these two arrays have 2 elements in each row, so you can perform an operation on them.

# When we perform an operation on these arrays, 
# it is essentially running the operation on each row of the 2-D list with the 1-D list.
arr1 = np.array([[1, 1], [2, 2], [3, 3]])
arr2 = np.array([10, 10])

arr1 * arr2
# The result of running the above is {1*10 , 1*10 ; 2*10 , 2*10 ; 3*10 , 3*10}
# [[10 10]
#  [20 20]
#  [30 30]]

print(arr1)
print(arr2)
print(arr1*arr2)

[[1 1]
 [2 2]
 [3 3]]
[10 10]
[[10 10]
 [20 20]
 [30 30]]
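
The matching rule can be sketched with a case that fails and one way around it. The array `per_row` is a hypothetical example holding one value per row; reshaping it into a column is one common workaround, not the only one.

```python
import numpy as np

arr1 = np.array([[1, 1], [2, 2], [3, 3]])   # shape (3, 2)
per_row = np.array([1, 2, 3])               # hypothetical: one value per row

try:
    arr1 * per_row   # length 3 does not match the 2 elements in each row
    failed = False
except ValueError:
    failed = True

# A column of shape (3, 1) pairs each value with one whole row instead
scaled_rows = arr1 * per_row.reshape(3, 1)

print(failed)
print(scaled_rows)
```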


# Selecting Elements from a 1-D Array

NumPy allows us to select elements from an array using their indices. Consider the one-dimensional array  

a = np.array([5, 2, 7, 0, 11])  

If we wanted to select the first element in this array, we would call:  

..a[0]  
5 

In typical Python fashion, the indices for an array start at 0. This is known as zero-indexed numbering. In the array above, 5 is known as the zeroth element, a[0]. It follows that 2 is the first element, a[1].

We can also select negative indices, which count from the opposite end of the array and start at -1. This is particularly useful when you want to access the last element or two of an array:  

..a[-1]  
11  
..a[-2]  
0  

If we wanted to select multiple elements in the array, we can define a range, such as a[1:3], which will select all the elements from a[1] to a[3], including a[1] but excluding a[3].  

..a[1:3]  
array([2, 7])  

Similarly, if we wanted to select all elements before a[3] we would use:  

..a[:3]  
array([5, 2, 7])  

We can also use negative indices to select multiple elements. Let’s say we want to select the last 3 elements in an array:  

..a[-3:]  
array([7, 0, 11])  

Notice that when we select multiple elements, we get an array.


Let’s return to our students’ test scores. The following table shows all three test arrays aligned to the names of the students.  
![2DARRAYEXAMPLE.GIF](attachment:2DARRAYEXAMPLE.GIF)  


Jeremy wants to know what he scored on the second test.  
Select the score from the test_2 array and save it to the variable jeremy_test_2.

In [55]:
jeremy_test_2 = test_2[3]

print(jeremy_test_2)

93


You want to compare how Manual and Adwoa did on the first test.   
Select both of their scores and save them in an array named manual_adwoa_test_1.


In [56]:
manual_adwoa_test_1 = test_1[1:3]

print(manual_adwoa_test_1)

[94 88]


**Question :** When selecting elements from a 1-D NumPy Array, is there a way to skip elements?

**Answer :** Yes, actually, selecting elements from a 1-D Array is similar to list or string slicing.

Just like when slicing a list or string, we can include a third value, which is the step value, or the number of indexes to increment for each item in the new list or substring.

In [57]:
# By default, step is 1, and selects each item 
# in the range of indexes.
test = np.array([92, 94, 88, 91, 87])

print(test[0:5])
# [92, 94, 88, 91, 87]

# With a step of 2, it will skip every other element.
print(test[0:5:2] )
# [92, 88, 87]

[92 94 88 91 87]
[92 88 87]
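
The start and stop values can be omitted when a step is given, and the step can be negative. A small sketch of both variations on the same array; the variable names are introduced here for illustration:

```python
import numpy as np

test = np.array([92, 94, 88, 91, 87])

# Start at index 1 and take every second element
every_other_from_1 = test[1::2]

# A step of -1 walks the array from the last element to the first
reversed_test = test[::-1]

print(every_other_from_1)
print(reversed_test)
```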


Replacing selected elements of a NumPy array by a specific value

In [87]:
# create an array
my_arr = np.array([2, 9, 17, 13, 1, 4, 20, 57])
print(my_arr)

[ 2  9 17 13  1  4 20 57]


In [86]:
# replace the elements at indices 2 to 4 (the slice 2:5 stops before index 5) with 6
my_arr[2:5] = 6
print(my_arr)

[ 2  9  6  6  6  4 20 57]
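
A slice can also be replaced by several different values, as long as the number of new values matches the length of the slice, and a condition can be used in place of a slice. A hedged sketch of both; the threshold 15 and the replacement values are arbitrary choices for illustration:

```python
import numpy as np

my_arr = np.array([2, 9, 17, 13, 1, 4, 20, 57])

# Replace three positions with three new values
my_arr[2:5] = [7, 8, 9]
after_slice = my_arr.copy()   # snapshot for printing

# Replace every element greater than 15 with 0
my_arr[my_arr > 15] = 0

print(after_slice)
print(my_arr)
```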


# Selecting Elements from a 2-D Array  
Selecting elements from a 2-d array is very similar to selecting them from a 1-d array; we just have two indices to select from. The syntax for selecting from a 2-d array is a[row,column] where a is the array.

It’s important to note that when we work with arrays that have more than one dimension, the relationship between the interior arrays is defined in terms of axes. A two-dimensional array has two axes: axis 0 runs vertically, down the rows, so values in the same column sit at the same position within each interior array; axis 1 runs horizontally, across the columns, so values in the same row share an interior array. This is illustrated below:  
![ARRAY.GIF](attachment:ARRAY.GIF)  

Syntax for selecting elements from 2D array [row index start : row index end , column index start : column index end ]


In [58]:
a = np.array([[32, 15, 6, 9, 14], 
              [12, 10, 5, 23, 1],
              [2, 16, 13, 40, 37]])

In [59]:
#We can select specific elements using their indices:
print(a[2,1])

# Select the row with index 2 i.e. row 3 and column with index 1 i.e. column 2

16


In [60]:
#Let’s say we wanted to select an entire column, we can insert : as the row index
print(a[:,0])

# Select all rows and elements from column index 0 i.e. 1st column

[32 12  2]


In [61]:
# Select 2nd row
print(a[1,:])

# Select the elements of the row with index 1 and all columns

[12 10  5 23  1]


In [62]:
# Select specific elements
print(a)
print(a[0,2:3])

# From the row with index 0, select the columns starting at index 2 and stopping before index 3

[[32 15  6  9 14]
 [12 10  5 23  1]
 [ 2 16 13 40 37]]
[6]
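
Slices on both indices select a rectangular block, and a plain index behaves differently from a one-element slice. A brief sketch using the same array `a`; the names `block`, `single_value` and `single_slice` are introduced here for illustration:

```python
import numpy as np

a = np.array([[32, 15, 6, 9, 14],
              [12, 10, 5, 23, 1],
              [2, 16, 13, 40, 37]])

# Rows with index 0 and 1, columns with index 1 and 2
block = a[0:2, 1:3]

single_value = a[0, 2]     # a plain index returns the value itself
single_slice = a[0, 2:3]   # a slice returns an array, even with one element

print(block)
print(single_value, single_slice)
```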


Let’s return to our students’ test scores.  
They are now stored in the 2-d array test_scores. The first row stores the scores of the first test, the second row the second test, and the third row the third test, as shown in the following table:  

![2DARRAYEXAMPLE.GIF](attachment:2DARRAYEXAMPLE.GIF)

Tanya wants to know how well she did on the third test. Select her score from the array and save it to tanya_test_3.

In [63]:
#creating an array of test scores (Row = test scores , Column = students)
test_scores = np.array([[92,94,88,91,87], ### list 1 of 5 elements
                       [79,100,86,93,91], ### list 2 of 5 elements
                       [87,85,72,90,92]]) ### list 3 of 5 elements

# print the array
print(test_scores)

[[ 92  94  88  91  87]
 [ 79 100  86  93  91]
 [ 87  85  72  90  92]]


In [64]:
# Tanya's test 3 scores
tanya_test_3 = test_scores[2,0] # select the row with index 2 and column with index 0 i.e. 1st column
tanya_test_3

87

You have a parent teacher conference with Cody’s parents coming up and would like to have all of his test scores handy.

Select all of Cody’s test scores and save them to a new array cody_test_scores

In [65]:
# Cody's test scores test 1,2,3
cody_test_scores = test_scores[:,-1] # select all rows from the column with last index
cody_test_scores

array([87, 91, 92])

## Slicing Arrays

In [90]:
# Define a NumPy array with name apple with values (1, 8, 23, 3, 18, 91, 7, 5)
apple = np.array([1,8,23,3,18,91,7,5])
print(apple)

[ 1  8 23  3 18 91  7  5]


In [89]:
# Extract a portion of this NumPy array apple from index 1 up to, but not including, index 4
apple[1:4]

array([ 8, 23,  3])

## Slicing and Replacing : changes original array contents

In [92]:
# Define a NumPy array with name apple with values (1, 8, 23, 3, 18, 91, 7, 5)
apple = np.array([1,8,23,3,18,91,7,5])
print(apple)

# Extract a portion of this NumPy array apple from index 1 to 4
apple_slice = apple[1:4]

# Replace 2nd element of array apple_slice by a value say 99999
apple_slice[1] = 99999
print(apple_slice)
print(apple)

[ 1  8 23  3 18 91  7  5]
[    8 99999     3]
[    1     8 99999     3    18    91     7     5]


**Original (parent) array from which the slice was extracted also gets modified if the slice value is modified.**

If you don't want the original (parent) array to change when its slice is modified, create the slice with the copy() method, for example apple[2:5].copy():

In [94]:
# Define a NumPy array with name apple with values (1, 8, 23, 3, 18, 91, 7, 5)
apple = np.array([1,8,23,3,18,91,7,5])
print(apple)

# Extract a portion of this NumPy array apple from index 1 to 4
apple_slice = apple[1:4]
print(apple_slice)

# slice elements 2:5 and create a copy
apple_slice_new = apple[2:5].copy()
print(apple_slice_new)

[ 1  8 23  3 18 91  7  5]
[ 8 23  3]
[23  3 18]
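
To check that a copied slice really is independent of its parent, we can modify the copy and look at the parent again. This sketch also assumes `np.shares_memory`, which reports whether two arrays use the same underlying data; the names `view_slice` and `copy_slice` are introduced here for illustration.

```python
import numpy as np

apple = np.array([1, 8, 23, 3, 18, 91, 7, 5])

view_slice = apple[1:4]          # plain slice: still tied to apple
copy_slice = apple[2:5].copy()   # copy: independent data

copy_slice[0] = 99999            # changes only the copy

shares_view = np.shares_memory(apple, view_slice)
shares_copy = np.shares_memory(apple, copy_slice)

print(copy_slice)
print(apple)                     # unchanged
print(shares_view, shares_copy)
```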


* Create a 4x5 (4 rows, 5 columns) NumPy array called my_multi_arr  
* Extract values from row index numbers 2 to 4 and from column index numbers 2 to 5, and store it in a variable called my_multi_arr_portion  
* Print the my_multi_arr_portion array using print() function to see its values

In [1]:
# Create a 4x5 (4 rows, 5 columns) NumPy array called my_multi_arr
my_multi_arr = np.arange(20).reshape(4,5)
print(my_multi_arr)  

#  Extract values from row index numbers 2 to 4 and 
# from column index numbers 2 to 5, 
# and store it in a variable called my_multi_arr_portion
my_multi_arr_portion = my_multi_arr[2:4,2:5]
print(my_multi_arr_portion)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]
[[12 13 14]
 [17 18 19]]


# Logical Operations with Arrays  
Another useful thing that arrays can do is perform element-wise logical operations.   
For instance, suppose we want to know how many elements in an array are greater than 3.   
We can easily write some code that checks to see whether this statement evaluates to True for each item in the array, without having to use a for loop

In [66]:
x = np.array([10, 2, 2, 4, 5, 3, 9, 8, 9, 7])
x > 3

array([ True, False, False,  True,  True, False,  True,  True,  True,
        True])

We can then use logical operators to evaluate and select items based on certain criteria.   
To select all elements from the previous array that are greater than 3, we’d write the following:

In [67]:
x[x>3] # from the array x select all elements that are > 3

array([10,  4,  5,  9,  8,  9,  7])

We can also combine logical statements to further specify our criteria.   
To do so, we place each statement in parentheses and use boolean operators like **"&"** (and) and **"|"** (or). 

In [68]:
x[(x>5) & (x<10)] ## select elements that are > 5 and < 10

array([9, 8, 9, 7])

In [69]:
x[(x>5) | (x<10)] ## select elements that are > 5 or < 10

array([10,  2,  2,  4,  5,  3,  9,  8,  9,  7])
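
Because True counts as 1 and False as 0, summing a logical array answers the "how many" question directly, and the `~` operator inverts a condition. A minimal sketch on the same array `x`; `count_above_3` and `not_above_3` are names introduced here for illustration:

```python
import numpy as np

x = np.array([10, 2, 2, 4, 5, 3, 9, 8, 9, 7])

# Count the elements greater than 3 by summing the True values
count_above_3 = int((x > 3).sum())

# ~ flips True and False, selecting the elements that are NOT > 3
not_above_3 = x[~(x > 3)]

print(count_above_3)
print(not_above_3)
```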

Today we’re visiting the Goldilocks Porridge Festival, sampling a selection of breakfast cereals and judging them based on their temperature (listed in Fahrenheit).  

porridge_temps = np.array([79, 65, 50, 63, 56, 90, 85, 98, 79, 51])  


1. Create a logical condition that selects samples in the porridge_temps array that are less than 60, and save them to a variable named cold.  
2. Create a logical condition that finds all the samples that are higher than 80 and save them to a variable named hot  
3. Create a logical condition that finds all the samples that are between 60 and 80 and save them to a variable named just_right

In [70]:
# create the array of temperatures
porridge_temps = np.array([79, 65, 50, 63, 56, 90, 85, 98, 79, 51])
porridge_temps

array([79, 65, 50, 63, 56, 90, 85, 98, 79, 51])

In [71]:
# select only those whose temp is < 60
cold = porridge_temps[porridge_temps < 60]
print(cold)

[50 56 51]


In [72]:
# select elements with temp > 80
hot = porridge_temps[porridge_temps>80]
print(hot)

[90 85 98]


In [73]:
# select elements that are between 60 and 80
just_right = porridge_temps[(porridge_temps >= 60) & (porridge_temps <= 80)]
print(just_right)

[79 65 63 79]
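
Instead of three separate selections, the same conditions can label every sample in place. The sketch below assumes `np.where(condition, value_if_true, value_if_false)`, which is not used elsewhere in this section, and nests it to cover three categories; the label strings are arbitrary choices for illustration.

```python
import numpy as np

porridge_temps = np.array([79, 65, 50, 63, 56, 90, 85, 98, 79, 51])

# Inner where: hot vs. just right; outer where: cold vs. everything else
labels = np.where(porridge_temps < 60, "cold",
                  np.where(porridge_temps > 80, "hot", "just right"))

n_cold = int((labels == "cold").sum())
n_hot = int((labels == "hot").sum())
n_just_right = int((labels == "just right").sum())

print(labels)
print(n_cold, n_hot, n_just_right)
```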


# Flattening a Multidimensional Array  
If you have a multi-dimensional NumPy array, say, a 3-D, 2-D, etc. array, you can convert it to a 1-dimensional (1-D) array using the NumPy function ravel().

You may need to do this, as some of the libraries need the data to be fed in 1-dimensional form.

In [2]:
import numpy as np

In [3]:
# create a 2-dimensional Numpy array (of size 4x5) called my_2d_arr
my_2d_arr = np.arange(20).reshape(4,5)
print(my_2d_arr)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]


In [4]:
# Call ravel() function on the above Numpy array (my_2d_arr) to create a 1-dimensional array, and store 
# this 1-D array in a variable called my_1d_arr
my_1d_arr = my_2d_arr.ravel()
print(my_1d_arr)

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
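
NumPy also offers flatten(), which produces the same 1-D result. The practical difference, sketched below, is that flatten() always returns a copy, while ravel() returns a view of the original data when it can (as it does for this array). The names `raveled` and `flattened` are introduced here for illustration.

```python
import numpy as np

my_2d_arr = np.arange(20).reshape(4, 5)

raveled = my_2d_arr.ravel()      # view of the same data in this case
flattened = my_2d_arr.flatten()  # always an independent copy

flattened[0] = -1                # does not touch my_2d_arr

ravel_shares = np.shares_memory(my_2d_arr, raveled)
flatten_shares = np.shares_memory(my_2d_arr, flattened)

print(my_2d_arr[0, 0])           # still 0
print(ravel_shares, flatten_shares)
```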


# Mathematical and Statistical Functions on Numpy Arrays  

## Across the complete array

There are several mathematical and statistical functions available in NumPy which can be very useful in manipulating the data of a matrix (dataset).  
* mean(): Calculates mean of all the elements of the NumPy array, irrespective of the shape of the array. 
* var() : Calculates the variance of the elements of the NumPy array.  
* std() : Calculates the standard deviation of the elements of the NumPy array.  
* min() : Returns the minimum element in the NumPy array. 
* max() : Returns the maximum element in the NumPy array.  
* sum() : Returns the sum of the elements of the NumPy array. 


In [5]:
# create a NumPy array X_stat with values [ [ 1, 2, 3], [ 4, -5, 6] ]
X_stat = np.array([[1,2,3] , [4,-5,6]])
print(X_stat)

[[ 1  2  3]
 [ 4 -5  6]]


In [6]:
# calculate and print the mean of elements of this array X_stat
print("mean = " , X_stat.mean())

mean =  1.8333333333333333


In [7]:
# calculate and print the variance of elements of this array X_stat
print("Variance = " , X_stat.var())

Variance =  11.805555555555557


In [8]:
# calculate and print the standard deviation of elements of this array X_stat
print("Standard Deviation = " , X_stat.std())

Standard Deviation =  3.435921354681384


In [9]:
# calculate and print the min of elements of this array X_stat
print("min =" , X_stat.min())

min = -5


In [10]:
# calculate and print the max of elements of this array X_stat
print("max = " , X_stat.max())

max =  6


In [11]:
# calculate and print the sum of elements of this array X_stat
print("sum = " , X_stat.sum())

sum =  11
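
By default var() and std() divide by the number of elements (the population formulas). When the data is a sample, the `ddof=1` argument switches the divisor to n - 1. A short sketch contrasting the two on the same array; the variable names are introduced here for illustration:

```python
import numpy as np

X_stat = np.array([[1, 2, 3], [4, -5, 6]])

population_var = X_stat.var()        # divides by n = 6
sample_var = X_stat.var(ddof=1)      # divides by n - 1 = 5

# The standard deviation is the square root of the variance
std_from_var = np.sqrt(population_var)

print(population_var)
print(sample_var)
print(std_from_var)
```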


## Sum across axes  
We can sum across different axes of a NumPy array by specifying the axis parameter of the sum function.  

array( [ [ [ 0, 1, 2, 3],  
           [ 4, 5, 6, 7],  
           [ 8, 9, 10, 11] ],  

         [ [ 12, 13, 14, 15],
           [ 16, 17, 18, 19],
           [ 20, 21, 22, 23] ] ] )

**axis=0 : first dimension in array (i.e. the outermost bracket '[')**   

The outermost bracket has two matrices

Matrix 1 :  
[ [ 0, 1, 2, 3],  
  [ 4, 5, 6, 7],  
  [ 8, 9, 10, 11] ]    
Matrix 2:  
[ [ 12, 13, 14, 15],  
  [ 16, 17, 18, 19],  
  [ 20, 21, 22, 23] ]   

Here the sum across the first dimension (axis=0) means the element-wise sum of these two matrices, which gives a single 3x4 matrix:  
array( [ [ 12, 14, 16, 18 ],  
         [ 20, 22, 24, 26 ],  
         [ 28, 30, 32, 34 ] ] )  

This is calculated as  

[ [ 12+0, 13+1, 14+2, 15+3 ],  
  [ 16+4, 17+5, 18+6, 19+7 ],  
  [ 20+8, 21+9, 22+10, 23+11 ] ]  


**axis = 1 means the second dimension in the array (i.e. the second bracket '[' from the outer side), which indexes the rows within each matrix**  

The rows of each matrix are

Matrix 1:

    [ 0, 1, 2, 3]
    [ 4, 5, 6, 7]
    [ 8, 9, 10, 11]

Matrix 2:

    [ 12, 13, 14, 15]
    [ 16, 17, 18, 19]
    [ 20, 21, 22, 23]

Hence, the sum across the second dimension (axis=1) adds the rows of each matrix together element-wise, which gives one row of column totals per matrix:  

array( [ [12, 15, 18, 21],  
         [48, 51, 54, 57] ] )  

This is calculated as  

array( [ [0+4+8, 1+5+9, 2+6+10, 3+7+11],  
         [12+16+20, 13+17+21, 14+18+22, 15+19+23] ] )  

**axis = 2 means the third dimension (i.e. the third bracket '[' from the outer side), which indexes the columns within each row**  

The elements along this axis are the entries of each row

Matrix 1 :  
[ 0,  1,  2,  3 ]

[ 4,  5,  6,  7 ]

[ 8,  9,  10,  11 ]

Matrix 2 :  
[ 12,  13,  14,  15 ]

[ 16,  17,  18,  19 ]

[ 20,  21,  22,  23 ]

Hence, the sum across the third dimension (axis=2) adds the entries of each row together, which gives one total per row of each matrix:  

array( [ [ 6, 22, 38],
         [54, 70, 86] ] )

This is calculated as

array( [ [0+1+2+3, 4+5+6+7, 8+9+10+11],
         [12+13+14+15, 16+17+18+19, 20+21+22+23] ] )


In [12]:
# create a NumPy array called Z, with a total of 18 elements, and shape as (2, 3, 3)

Z = np.arange(18).reshape(2,3,3)
print(Z)

[[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]]

 [[ 9 10 11]
  [12 13 14]
  [15 16 17]]]


In [14]:
# Please find the sum across axis=0,1,2 for the array Z, and print the result using the print() function
print("Sum of elements= " , Z.sum(axis=0))
print("Sum of rows = " , Z.sum(axis=1))
print("Sum of columns = " , Z.sum(axis=2))

Sum of elements=  [[ 9 11 13]
 [15 17 19]
 [21 23 25]]
Sum of rows =  [[ 9 12 15]
 [36 39 42]]
Sum of columns =  [[ 3 12 21]
 [30 39 48]]
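
The worked 2 x 3 x 4 example from the explanation can be reproduced in code as a cross-check. The array name `A` is introduced here for illustration; note how each sum removes the axis it runs along, which is why the result shapes differ.

```python
import numpy as np

# Same values as the 2 x 3 x 4 array in the explanation above
A = np.arange(24).reshape(2, 3, 4)

sum_axis0 = A.sum(axis=0)   # adds the two matrices -> shape (3, 4)
sum_axis1 = A.sum(axis=1)   # adds the rows of each matrix -> shape (2, 4)
sum_axis2 = A.sum(axis=2)   # adds the entries of each row -> shape (2, 3)

print(sum_axis0.shape, sum_axis1.shape, sum_axis2.shape)
print(sum_axis0)
print(sum_axis1)
print(sum_axis2)
```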


## Transpose of an array  
Transpose of a matrix (with rank >= 2) interchanges the rows and columns of the matrix.

We can get the transpose of a matrix (or array) by using the attribute T.

In [15]:
# create a Numpy array called N of size 6 and shape as (3,2)
N = np.arange(6).reshape(3,2)
print(N)

[[0 1]
 [2 3]
 [4 5]]


In [17]:
# Print the transpose of array N
print("Transpose of N :" , N.T)

Transpose of N : [[0 2 4]
 [1 3 5]]
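
A quick way to convince ourselves of what the transpose does is to compare shapes and individual entries: the element at row i, column j of the transpose is the element at row j, column i of the original. A minimal sketch; `N_T` is a name introduced here for illustration:

```python
import numpy as np

N = np.arange(6).reshape(3, 2)
N_T = N.T

print(N.shape, N_T.shape)     # rows and columns swap places
print(N[2, 1], N_T[1, 2])     # same element, indices swapped
```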


# Project : Betty's Bakery  
Betty has always used her grandmother’s recipe book to make cookies, cakes, pancakes, and bread for her friends and family. She’s getting ready to open a business and will need to start buying all of her milk, eggs, sugar, flour, and butter in bulk.

We will help Betty figure out how much she needs to buy using NumPy arrays describing her recipes

Betty’s assistant has compiled all of her recipes into a csv (comma-separated values) file called recipes.csv. Load this file into a variable called recipes.  
![BETTYSCUPCAKES.GIF](attachment:BETTYSCUPCAKES.GIF)

In [74]:
# import numpy
import numpy as np

# create the recipes array by importing .csv file
recipes = np.genfromtxt('recipes.csv', delimiter=',')

# print the contents of recipes
print(recipes)

[[2.    0.75  2.    1.    0.5  ]
 [1.    0.125 1.    1.    0.125]
 [2.75  1.5   1.    0.    1.   ]
 [4.    0.5   2.    2.    0.5  ]]


Create a NumPy array that represents all the contents for cupcakes and save it in a variable 'cupcakes'

In [75]:
# creating an array manually
cupcakes_manual = np.array([2 , 0.75 , 2 , 1 , 0.5])
print(cupcakes_manual)

# creating an array by selection
cupcakes = recipes[:1,] ## select all elements of the row with index 0 (the slice keeps the result 2-D)
print(cupcakes)

[2.   0.75 2.   1.   0.5 ]
[[2.   0.75 2.   1.   0.5 ]]


The 3rd column represents the number of eggs that each recipe needs.
Select all elements from the 3rd column and save them to the variable eggs

In [76]:
# creating an array of eggs
eggs = recipes[:,2]
print(eggs)

[2. 1. 1. 2.]


Which recipes require exactly 1 egg? Use a logical statement to get True or False for each value of eggs.

In [77]:
# simple way
eggs_simple = eggs == 1
print(eggs_simple)

# create a logical array for "one egg"
one_egg = recipes[:,2] == 1 # select the 3rd column and compare values to "1 egg"
print(one_egg)

[False  True  True False]
[False  True  True False]


Betty is going to make 2 batches of cupcakes (1st row) and 1 batch of cookies (3rd row).

You already have a variable for cupcakes. Create a variable for cookies with the data from the 3rd row

In [78]:
# create an array of cookies
cookies = recipes[2:3,] # select all columns starting from rindex2 upto rindex3 excluding rindex3
print(cookies)

[[2.75 1.5  1.   0.   1.  ]]


Get the number of ingredients for a double batch of cupcakes by using multiplication on cupcakes. Save your new variable to double_batch

In [79]:
# create an array of double batch
double_batch = cupcakes * 2

print(double_batch)

[[4.  1.5 4.  2.  1. ]]


Create a new variable called grocery_list by adding cookies and double_batch.

In [80]:
# create grocery list by combining cookies and double batch of cupcakes
grocery_list = cookies + double_batch

print(grocery_list)

[[6.75 3.   5.   2.   2.  ]]
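
As a self-contained cross-check of the shopping calculation (two batches of cupcakes plus one batch of cookies), the sketch below hard-codes the recipes array from the printed output earlier in this project instead of reading recipes.csv. It selects rows with a single index, so the results are 1-D rather than 2-D.

```python
import numpy as np

# Values copied from the printed recipes array (not read from recipes.csv)
recipes = np.array([[2.0, 0.75, 2.0, 1.0, 0.5],
                    [1.0, 0.125, 1.0, 1.0, 0.125],
                    [2.75, 1.5, 1.0, 0.0, 1.0],
                    [4.0, 0.5, 2.0, 2.0, 0.5]])

cupcakes = recipes[0]   # 1st row
cookies = recipes[2]    # 3rd row

double_batch = cupcakes * 2
grocery_list = cookies + double_batch

print(grocery_list)
```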


# Quiz

In [81]:
#What is the value of c in the following array operation?
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

c = a+b

c

array([5, 7, 9])

In [82]:
# Consider the array a = np.array([10, 4, 6, 9, 18, 22, 11, 13, 3, 2, 15]). 
# Which logical operation could be performed on the array to return array([ 4, 18, 22, 3, 2])?

a = np.array([10, 4, 6, 9, 18, 22, 11, 13, 3, 2, 15])

x = a[(a<5) | (a>15)]

x

array([ 4, 18, 22,  3,  2])