<a href="https://codeimmersives.com"><img src = "https://www.codeimmersives.com/wp-content/uploads/2019/09/CodeImmersives_Logo_RGB_NYC_BW.png" width = 400> </a>


<h1 align=center><font size = 5>Agenda</font></h1>

<div class="alert alert-block alert-info" style="margin-top: 20px">

1.  [Review](#0)<br>
2.  [Numpy continued](#2)<br>
3.  [Exercise](#10)<br>
4.  [Exercise](#12)<br>
</div>
<hr>

<h2>Review</h2>

Each array has attributes ``ndim`` (the number of dimensions), ``shape`` (the size of each dimension), and ``size`` (the total size of the array):

In [1]:
import numpy as np
x3 = np.random.randint(10, size=(3, 4, 5))  # Three-dimensional array
print("x3 ndim: ", x3.ndim)
print("x3 shape:", x3.shape)
print("x3 size: ", x3.size)

x3 ndim:  3
x3 shape: (3, 4, 5)
x3 size:  60


<h4> Array Arithmetic </h4>
Vectorized operations in NumPy are implemented via *ufuncs*, whose main purpose is to quickly execute repeated operations on values in NumPy arrays. We also call this element-wise arithmetic.




In [2]:
x = np.array([1,2,3,4])
y = np.array([2,3,4,5])
print(x+y)

[3 5 7 9]


In [3]:
np.arange(5) / np.arange(1,6)

array([0.        , 0.5       , 0.66666667, 0.75      , 0.8       ])

In [4]:
np.add(x, 2)

array([3, 4, 5, 6])

All the arithmetic operators implemented in NumPy:

| Operator	    | Equivalent ufunc    | Description                           |
|---------------|---------------------|---------------------------------------|
|``+``          |``np.add``           |Addition (e.g., ``1 + 1 = 2``)         |
|``-``          |``np.subtract``      |Subtraction (e.g., ``3 - 2 = 1``)      |
|``-``          |``np.negative``      |Unary negation (e.g., ``-2``)          |
|``*``          |``np.multiply``      |Multiplication (e.g., ``2 * 3 = 6``)   |
|``/``          |``np.divide``        |Division (e.g., ``3 / 2 = 1.5``)       |
|``//``         |``np.floor_divide``  |Floor division (e.g., ``3 // 2 = 1``)  |
|``**``         |``np.power``         |Exponentiation (e.g., ``2 ** 3 = 8``)  |
|``%``          |``np.mod``           |Modulus/remainder (e.g., ``9 % 4 = 1``)|


<h4> Array Indexing: Accessing Single Elements </h4>
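Single elements are accessed the same way as in Python lists: indices start at zero, and negative indices count from the end. In multi-dimensional NumPy arrays, we can pass one index per dimension separated by commas (for example, ``x2[1, -1]``), or index multiple times (``x2[1][-1]``) to access values. 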


In [5]:
x1 = np.random.randint(10, size=6)  # One-dimensional array
x1[0]
x1[-1]


1

In [6]:
x2 = np.random.randint(10, size=(3, 4))  # Two-dimensional array
print(x2)
#x2[0, 0]
#x2[1, 0]
x2[1, -1]

[[2 3 1 9]
 [8 2 3 9]
 [9 2 2 4]]


9
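
A minimal sketch of the two indexing styles, using a fixed array (here called `grid`, an illustrative name) so the results are reproducible. The last lines show one way NumPy arrays differ from lists: an integer array has a fixed type, so assigning a float to it truncates the value.

```python
import numpy as np

grid = np.array([[2, 3, 1, 9],
                 [8, 2, 3, 9],
                 [9, 2, 2, 4]])

# A comma-separated index and chained indexing reach the same element.
print(grid[1, -1])   # 9
print(grid[1][-1])   # 9

# Assigning a float into an integer array silently drops the decimal part.
grid[0, 0] = 3.99
print(grid[0, 0])    # 3
```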

## Slicing Arrays

To access subsections of a larger array, we *slice* a specified part of the array using the colon (``:``) notation.

``` python
x[start:stop:step]
```
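If any of these are unspecified, they default to the values ``start=0``, ``stop=``*``size of dimension``*, ``step=1``. When ``step`` is negative, the defaults for ``start`` and ``stop`` are swapped, which is why ``x[::-1]`` reverses the array.
We'll take a look at accessing sub-arrays in one dimension and in multiple dimensions.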

In [7]:
x = np.arange(10)
x[:3]  # first three elements
x[5:]  # elements from index 5 onward
x[4:7]  # middle sub-array
x[::-1]  # all elements, reversed

array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
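
Slicing in multiple dimensions works the same way, with one slice per dimension separated by commas. The sketch below uses an illustrative 3x4 array named `grid`; the variable names are assumptions made for this example.

```python
import numpy as np

grid = np.arange(12).reshape((3, 4))
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

top_left = grid[:2, :3]      # first two rows, first three columns
every_other = grid[:, ::2]   # all rows, every other column
flipped = grid[::-1, ::-1]   # rows and columns both reversed
first_col = grid[:, 0]       # first column, returned as a 1-D array

print(top_left)
print(first_col)   # [0 4 8]
```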

<h4>Reshaping Arrays</h4>
Call ``array.reshape(new_shape)`` on an existing array to get the same data in a new shape. The product of the new dimensions must equal the array's size: a 10-element array can be reshaped to 5x2 or 2x5, but not to 3x4.

In [8]:
grid = np.arange(1, 11).reshape((5, 2))
print(grid)

[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]]
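
The size rule can be checked directly. In this sketch (the name `values` is illustrative), a compatible shape succeeds, ``-1`` lets NumPy infer one dimension from the array's size, and an incompatible shape raises a ``ValueError``.

```python
import numpy as np

values = np.arange(1, 11)   # 10 elements

print(values.reshape((2, 5)).shape)    # (2, 5)

# -1 asks NumPy to work out that dimension from the array's size.
print(values.reshape((5, -1)).shape)   # (5, 2)

try:
    values.reshape((3, 4))             # 3 * 4 = 12, but there are only 10 elements
except ValueError as err:
    print("reshape failed:", err)
```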


<h2>Exercise</h2>
Create a numpy array with 2 dimensions, a shape of 5x5, and calculate the array's size.

<h4>Solution</h4>

In [9]:
import numpy as np
array = np.zeros([5,5])
array.size

25

<h2>Exercise</h2>


Create a 1-D array of 8 values using np.arange(8), and then reshape it into a 3-dimensional array using the array.reshape(new_shape) function. 
*HINT*: take the cube root of 8 to find the appropriate shape.

<h4>Solution</h4>

In [10]:
array = np.arange(8)
array.reshape([2,2,2])

array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]])



<h2>Numpy continued</h2>
We will explore NumPy ufuncs, short for "universal functions", which<br>
operate element by element on the ndarray object.<br>
ufuncs implement vectorization in NumPy: the loop over elements runs in compiled code, <br>
which is typically much faster than iterating over the elements in Python.<br>
<br>
They also support broadcasting and provide additional methods such as reduce and accumulate <br>
that are very helpful for computation.<br>
<br>
ufuncs also take additional arguments, <br>
like:<br>

- where - boolean array or condition defining where the operations should take place.<br>
- dtype - defining the return type of elements.<br>
- out - output array where the return value should be copied.<br>
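
A minimal sketch of these arguments and of the reduce and accumulate methods, using ``np.add`` and ``np.multiply``. The arrays and the names `as_float`, `result`, and `masked` are illustrative assumptions.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([10, 20, 30, 40])

# dtype: choose the type of the returned elements.
as_float = np.add(x, y, dtype=np.float64)

# out: write the result into an existing array instead of creating a new one.
result = np.zeros(4, dtype=np.int64)
np.multiply(x, y, out=result)

# where: only compute where the mask is True; other slots keep what `out` held.
masked = np.zeros(4, dtype=np.int64)
np.add(x, y, out=masked, where=(x % 2 == 0))

# reduce collapses the array with the ufunc; accumulate keeps the running values.
total = np.add.reduce(x)
running = np.add.accumulate(x)

print(as_float)   # [11. 22. 33. 44.]
print(result)     # [ 10  40  90 160]
print(masked)     # [ 0 22  0 44]
print(total)      # 10
print(running)    # [ 1  3  6 10]
```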

<h4>Adding two Python lists together</h4>
<br>
<code>
x = [3, 5, 7, 9]
y = [4, 6, 8, 10]
z = np.add(x, y)

print(z)  
print(type(z))
</code>
<br>
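NOTE: The inputs must have the same shape, or shapes that NumPy can broadcast together (for example, an array and a single number). The result is a NumPy array even when the inputs are Python lists.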

In [11]:
x = [3, 5, 7, 9]
y = [4, 6, 8, 10]
print(np.add(x,y))


[ 7 11 15 19]
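
As a sketch of the shape rule noted above: in practice NumPy accepts inputs whose shapes can be broadcast together, and raises a ``ValueError`` when they cannot. The `column` array is an illustrative value chosen for this example.

```python
import numpy as np

x = np.array([3, 5, 7, 9])

# A single number is broadcast across the whole array.
print(np.add(x, 10))               # [13 15 17 19]

# A (2, 1) column and a (4,) row broadcast to a (2, 4) result.
column = np.array([[0], [100]])
print(np.add(column, x).shape)     # (2, 4)

# Shapes (4,) and (3,) cannot be broadcast together.
try:
    np.add(x, np.array([1, 2, 3]))
except ValueError as err:
    print("cannot add:", err)
```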


<h4>Subtraction</h4>
<br>
<code>
x = [13, 25, -7, 39]
y = [4, 6, 8, 10]
z = np.subtract(x, y)

print(z)
print(type(z))
</code>

In [12]:
x = [3, 5, 7, 9]
y = [4, 6, 8, 10]
print(np.subtract(x,y))

[-1 -1 -1 -1]


<h4>Multiplication</h4>
<code>
x = [13, 25, -7, 39]
y = [4, 6, 8, 10]
z = np.multiply(x, y)

print(z)
print(type(z))
</code>

In [13]:
x = [3, 5, 7, 9]
y = [4, 6, 8, 10]
print(np.multiply(x,y))

[12 30 56 90]


<h4>Division</h4>
<code>
import numpy as np

arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([3, 5, 8, 13, 21, 34])

newarr = np.divide(arr1, arr2)
print(newarr) 
</code>
<br>
Powers
<code>
arr1 = np.array([2, 3, 4, 5, 6, 7])
arr2 = np.array([7, 6, 5, 4, 3, 2])

newarr = np.power(arr1, arr2)
print(newarr) 
</code>
<br>
Remainder / Modulus
<code>
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([3, 7, 6, 14, 8, 13])
newarr = np.remainder(arr1, arr2)

print(newarr) 


arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([3, 7, 9, 8, 2, 33])

newarr = np.mod(arr1, arr2)

print(newarr) 
</code>
<br>
Quotient and Mod
<code>
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([2, 5, 9, 8, 13, 43])
newarr = np.divmod(arr1, arr2)   # Returns 2 arrays !!!

print(newarr) 
</code>
<br>
Absolute Value
<code>
arr = np.array([-1, -2, 10, 2, 13, -9])
newarr = np.absolute(arr)

print(newarr) 
</code>
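
To make the "returns 2 arrays" remark concrete, this sketch unpacks the result of ``np.divmod`` and checks it against ``np.floor_divide`` and ``np.mod``. It reuses the arrays from the quotient example; the names `quotient` and `remainder` are illustrative.

```python
import numpy as np

arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([2, 5, 9, 8, 13, 43])

# divmod returns a tuple: (floor quotients, remainders).
quotient, remainder = np.divmod(arr1, arr2)
print(quotient)    # [5 4 3 5 3 1]
print(remainder)   # [ 0  0  3  0 11 17]

# The two pieces match the separate ufuncs, and rebuild the original values.
assert np.array_equal(quotient, np.floor_divide(arr1, arr2))
assert np.array_equal(remainder, np.mod(arr1, arr2))
assert np.array_equal(quotient * arr2 + remainder, arr1)

# np.remainder and np.mod give the same result.
assert np.array_equal(np.remainder(arr1, arr2), np.mod(arr1, arr2))
```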

<h2> Try them Out! </h2>

<h2>Exercise</h2>


a1 = np.array([[1,2],
               [3,4],
               [5,6]])

a2 = np.array([[7,8],
               [9,10],
               [10,11]])

Reshape the two arrays into 1-D arrays and add them, print the sum.
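
One possible solution sketch, assuming ``reshape(6)`` is used to flatten each 3x2 array (``reshape(-1)`` would work equally well). The names `flat1`, `flat2`, and `total` are illustrative.

```python
import numpy as np

a1 = np.array([[1, 2],
               [3, 4],
               [5, 6]])

a2 = np.array([[7, 8],
               [9, 10],
               [10, 11]])

# Each array holds 3 * 2 = 6 values, so the 1-D shape is 6.
flat1 = a1.reshape(6)
flat2 = a2.reshape(6)

total = np.add(flat1, flat2)
print(total)   # [ 8 10 12 14 15 17]
```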


<h2>Numpy ufunc - Rounding Decimals</h2><br>
There are several ways of rounding off decimals in NumPy; four are covered here:<br>
- truncation - removes the decimal portion of a number, rounding toward zero
<br>
<code>
import numpy as np
arr = np.trunc([-3.1666, 3.6667, 111.23334])
print(arr)    
</code><br>
- rounding - rounds to the given number of decimal places, like Python's built-in round()
<br>
<code>
arr = np.around(3.1417, 2)
print(arr)     
</code><br>
- floor - returns the largest integer less than or equal to each value (rounds down)
<br>
<code>
arr = np.floor([-3.1666, 3.6667])
print(arr)     
</code><br>
- ceil - returns the smallest integer greater than or equal to each value (rounds up)
<br>
<code>
arr = np.ceil([-3.1666, 3.6667])
print(arr)     
</code><br>
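
The difference between these functions shows up most clearly on negative inputs. The sketch below applies each one to the same illustrative array (`values`) so the results can be compared side by side; the last line shows that halfway cases round to the nearest even value.

```python
import numpy as np

values = np.array([-3.1666, 3.6667])

print(np.trunc(values))       # [-3.  3.]   toward zero
print(np.floor(values))       # [-4.  3.]   always down
print(np.ceil(values))        # [-3.  4.]   always up
print(np.around(values, 2))   # [-3.17  3.67]

# Halfway values round to the nearest even number.
print(np.around([0.5, 1.5, 2.5]))   # [0. 2. 2.]
```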

Exercise: round pi to 5 decimal places using NumPy and use it to calculate the area of a circle with radius 3. 

In [14]:
pi = np.round(np.pi,5)
area = np.multiply(pi,np.power(3,2))
print(area)

28.27431


<h2>Summing all elements</h2>
We can use the NumPy sum function (np.sum) to add up all of the values in<br>
one or more arrays<br>
<code>
import numpy as np

arr1 = np.array([3, 4, 1])
arr2 = np.array([-1,-2, 7])
newarr = np.sum([arr1, arr2])
print(newarr) 
</code>

In [15]:
import numpy as np

arr1 = np.array([3, 4, 1])
arr2 = np.array([-1,-2, 7])

newarr = np.sum([arr1, arr2])
print(newarr) 

12
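
By default ``np.sum`` adds every value, but the ``axis`` argument restricts the sum to one dimension. A minimal sketch, assuming the two arrays are first combined into one 2-D array (called `stacked` here for illustration):

```python
import numpy as np

arr1 = np.array([3, 4, 1])
arr2 = np.array([-1, -2, 7])
stacked = np.array([arr1, arr2])   # shape (2, 3)

print(np.sum(stacked))           # 12        every value
print(np.sum(stacked, axis=0))   # [2 2 8]   down each column
print(np.sum(stacked, axis=1))   # [8 4]     across each row
```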


Exercise:
There are 10 customers with account holdings given below. We want to separate customers with a total holding of more than 1,000 dollars (commercial accounts) from those with less than 1,000 dollars (personal accounts). 
<br>
1) What is the total commercial account value?
<br>
2) Using np.max(), what is the highest account holding amongst the commercial accounts? <br>

We want to aggregate all the personal and commercial account holdings into separate arrays. 
<br>
3) Find the mean and stdev of the personal account list. 
<br>
4) Find the median and mean of the commercial account list.   
<br>
We'd like to plot the personal account holdings against the commercial account holdings. <br>
5) Import matplotlib and use the plt.scatter(x,y) method to draw a graph depicting this relationship. <br>


<code>
c1 = np.array([155, 204, 312, 56])
c2 = np.array([233, 245, 333])
c3 = np.array([802, 7, 382, 9329, 58392])
c4 = np.array([222, 453, 10235, 9929])
c5 = np.array([902, 75, 29394])
c6 = np.array([65, 45])
c7 = np.array([550, 300])
c8 = np.array([200, 120])
c9 = np.array([550, 1200])
c10 = np.array([5312, 0])
</code>

In [4]:
import numpy as np 
from matplotlib import pyplot as plt

c1 = np.array([155, 204, 312, 56])
c2 = np.array([233, 245, 333])
c3 = np.array([802, 7, 382, 9329, 58392])
c4 = np.array([222, 453, 10235, 9929])
c5 = np.array([902, 75, 29394])
c6 = np.array([65, 45])
c7 = np.array([550, 300])
c8 = np.array([200, 120])
c9 = np.array([550, 1200])
c10 = np.array([5312, 0])
c11 = np.array([150, 200, 20])



customers = np.array([c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11],dtype=object)

# Pad each customer's holdings with zeros so every row has the same length
max_size = max(account.size for account in customers)
filled_accounts = np.stack([np.pad(account,(0,max_size-account.size),'constant') for account in customers])
print(filled_accounts)

# Sorting: compare each customer's total (row sum) with the 1,000 dollar threshold
totals = np.sum(filled_accounts, axis=1)
personal = filled_accounts[totals<1000]
commercial = filled_accounts[totals>1000]

# Flatten each group and drop the zeros (padding and empty holdings)
flattened_personal = personal[personal!=0]
flattened_commercial = commercial[commercial!=0]

# Personal accounts: highest single holding for each personal customer
max_personal_value = np.max(personal,axis=1)
# Personal Account stats (computed on the flattened list so padding zeros are ignored)
personal_median = np.median(flattened_personal)
personal_means = np.mean(flattened_personal)
personal_stdev = np.std(flattened_personal)

# Commercial accounts: total value, and the largest total among commercial customers
total_commercial_value = np.sum(flattened_commercial)
max_commercial_value = np.max(np.sum(commercial, axis=1))
# Commercial Account stats (computed on the flattened list so padding zeros are ignored)
commercial_median = np.median(flattened_commercial)
commercial_means = np.mean(flattened_commercial)
commercial_stdev = np.std(flattened_commercial)

# plt.scatter needs x and y of equal length: there are 16 personal holdings
# and 15 commercial ones, so pad the commercial array with a single zero
flattened_commercial = np.append(flattened_commercial, 0)
# Scatter comparing account values
plt.scatter(flattened_personal, flattened_commercial)
plt.show()

[[  155   204   312    56     0]
 [  233   245   333     0     0]
 [  802     7   382  9329 58392]
 [  222   453 10235  9929     0]
 [  902    75 29394     0     0]
 [   65    45     0     0     0]
 [  550   300     0     0     0]
 [  200   120     0     0     0]
 [  550  1200     0     0     0]
 [ 5312     0     0     0     0]
 [  150   200    20     0     0]]

This notebook is part of a course at www.codeimmersives.com called Data Science. If you accessed this notebook outside the course, you can get more information about this course online by clicking here.

<hr>

Copyright &copy; 2021  Code Immersives