        Python Programming
        Neshat Beheshti
        University of Texas Arlington
        
        This document can only be used for class studies. 
        You are not allowed to share it in any public platform.

<h1 align='center' style="color: blue;">Introduction to NumPy</h1>

<p>Numerical Python (NumPy) is a foundational package on which many common data science packages are built. It is an extension module that provides efficient operations on arrays of <b><u>homogeneous</u></b> data. NumPy provides high-performance multi-dimensional arrays that can be used as vectors or matrices.</p>

In [2]:
import numpy as np 

## 1. Arrays

<img src="https://jakevdp.github.io/PythonDataScienceHandbook/figures/array_vs_list.png" width=400 height=400 >
<p><b>Source</b>: VanderPlas, J., 2016. Python data science handbook: essential tools for working with data. " O'Reilly Media, Inc."</p>

<p> NumPy’s array class is called <b>ndarray</b> (multidimensional arrays). It is also known by the alias array. Note that numpy.array is not the same as the Standard Python Library class array.array, which only handles one-dimensional arrays and offers less functionality. </p>

<p>There are two common types of array in numpy:</p>
<ul>
    <li>Rank 1: it has <u>one</u> dimension.</li>
    <li>Rank 2: It has <u>two</u> dimensions.</li>
</ul>

<p>You can also create arrays with more than two dimensions. (Example: Images)</p>

In [7]:
an_array = np.array([1, 2, 3])  # Create a rank 1 array
an_array

array([1, 2, 3])

In [11]:
# let's check the type of created array
print(type(an_array)) 

<class 'numpy.ndarray'>


<p><b>Note:</b> Unlike Python lists, NumPy arrays contain homogeneous type of data. If types do not match, NumPy will upcast if possible (integers are upcast to floating point).</p>

In [12]:
np.array([5.7, 3, 2.2, 21])

array([ 5.7,  3. ,  2.2, 21. ])
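
The upcasting rule can be checked by inspecting the <b>dtype</b> attribute of an array. The sketch below is only illustrative: the variable names are not part of the original notes, and the exact integer dtype (for example int64 or int32) depends on the platform.

```python
import numpy as np

ints = np.array([1, 2, 3])                  # all integers: an integer dtype
mixed = np.array([5.7, 3, 2.2, 21])         # one float is enough to upcast everything
forced = np.array([1, 2, 3], dtype=float)   # request a dtype explicitly

print(ints.dtype)    # an integer dtype (platform dependent)
print(mixed.dtype)   # float64
print(forced.dtype)  # float64
```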

<b>Note:</b> The <b style='color:blue'>shape</b> property is usually used to get the current shape of an array

In [3]:
an_array = np.array([1, 2, 3, 4])
print(f'The array shape is:{an_array.shape}')

The array shape is:(4,)


In [None]:
print(an_array.shape)

<b>Note:</b> In rank 1 array, you only need one index to access elements of array.

In [None]:
print(an_array[0])

<b>Note:</b> You can also use reverse indexing.

In [None]:
print(an_array[-1])

<b>Note:</b> You can change an element of an array.

In [None]:
an_array[0] = 5
an_array

Let's create a <u>rank 2</u> array

In [3]:
new_array = np.array([[11,12,13],[21,22,23]])
print(new_array)

[[11 12 13]
 [21 22 23]]


Here is a clearer presentation

In [15]:
new_array = np.array([[11,12,13],
                      [21,22,23]])             
print(new_array)

[[11 12 13]
 [21 22 23]]


In [4]:
new_array.shape

(2, 3)

<b>Note:</b> You need one index per dimension (row, column) to access an element of a rank 2 array.

In [18]:
new_array[1,2]

23

In [None]:
new_array[0,0] = 44
new_array

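<b>Note:</b> Keep in mind that data in arrays should be homogeneous. Assigning a float to an integer array does not change the array's dtype; the value is truncated to an integer (45.6 becomes 45).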

In [22]:
new_array[0,0] = float(45.6)
new_array

array([[45, 12, 13],
       [21, 22, 23]])
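
If the fractional part matters, one option is to convert the array to a floating-point dtype before assigning. This is a minimal sketch; <b>float_array</b> is an illustrative name, not something defined elsewhere in these notes.

```python
import numpy as np

new_array = np.array([[11, 12, 13],
                      [21, 22, 23]])

new_array[0, 0] = 45.6                  # integer array: the value is truncated
print(new_array[0, 0])                  # 45

float_array = new_array.astype(float)   # astype returns a converted copy
float_array[0, 0] = 45.6                # float array: the value is kept
print(float_array[0, 0])                # 45.6
```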

<b>Note:</b> You can also use a list comprehension to build arrays (it generates a list of lists that NumPy converts to a rank 2 array)

In [4]:
np.array([range(i, i + 3) for i in [2, 4, 6]])

array([[2, 3, 4],
       [4, 5, 6],
       [6, 7, 8]])

### 1.1. Common Built-in Numpy Functions to Create Arrays

<p>Numpy has a number of built in methods which help us quickly and easily create multidimensional arrays.</p>

In [34]:
import numpy as np 

# create a 3x2 array of zeros
zeros_array = np.zeros((3,2))      
print(zeros_array)    

[[0. 0.]
 [0. 0.]
 [0. 0.]]


In [35]:
# create a 3x5 array of 1s
ones_array = np.ones((3, 5))
print(ones_array)

[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]


In [36]:
# create a 5x3 array filled with 5
full_array = np.full((5, 3), 5)
print(full_array)

[[5 5 5]
 [5 5 5]
 [5 5 5]
 [5 5 5]
 [5 5 5]]


In [37]:
# create a 4x4 matrix with the diagonal 1s and the others 0
eye_array = np.eye(4,4)
print(eye_array) 

[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]


<p><b>Example:</b> Consider the following code <br/><br/> ex_array = np.ones((1,5)) <br/><br/> Is it a rank1 array or rank2 array?</p>

In [5]:
# test your code over here
ex_array = np.ones((1,5)).shape
ex_array

(1, 5)

In [6]:
ex_array = np.ones((1,5))
ex_array

array([[1., 1., 1., 1., 1.]])

np.ones((1,5)) is a rank 2 array: its shape (1, 5) has two entries (one row, five columns). A rank 1 array with five elements has shape (5,), as the next cells show.

In [45]:
e1 = np.array([1., 1., 1., 1., 1.])
e1.shape

(5,)

In [7]:
a_list = [[1., 1., 1., 1., 1.]]
print(len(a_list))

1


In [8]:
e2 = np.array([[1., 1., 1., 1., 1.]])
e2.shape

(1, 5)
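
A quick way to answer the rank question is the <b>ndim</b> attribute, which reports the number of dimensions directly. The sketch below reuses the arrays from the cells above.

```python
import numpy as np

e1 = np.array([1., 1., 1., 1., 1.])     # rank 1
e2 = np.array([[1., 1., 1., 1., 1.]])   # rank 2, one row
ex_array = np.ones((1, 5))              # rank 2, one row

for name, arr in [("e1", e1), ("e2", e2), ("ex_array", ex_array)]:
    print(name, "ndim:", arr.ndim, "shape:", arr.shape)
```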

In [10]:
# create a random array of uniformly distributed floats in [0, 1)
# np.random.random takes only the output shape: random(size)
unirand_array = np.random.random((3,4))
print(unirand_array)   

[[0.15544485 0.33396549 0.68548202 0.52695128]
 [0.20684329 0.8946991  0.62457298 0.04737292]
 [0.20294551 0.62910903 0.72809232 0.18005965]]


In [11]:
# create a random array of normally distributed floats with mean 0 and sd 1: normal(mean, sd, size)
normalrand_array = np.random.normal(0,1,(3,4))
print(normalrand_array)

[[-0.5060083   0.17458719  0.5553117   1.15843178]
 [ 2.11184316  1.80000456  1.41846991  1.37717883]
 [-0.08633675  0.47659188  0.45068361 -0.56093758]]


In [12]:
# you can also use randn() to generate random array from normal distribution. 
# However you cannot define mean and sd (set to mean=0 and sd=1)
normalrand_array = np.random.randn(3,4)
print(normalrand_array)

[[-2.44074139 -2.45181848 -0.4917662  -0.47233234]
 [ 0.46657322 -2.25613698  0.80747686  2.22870306]
 [-0.15027593 -0.34696246  2.64351516  0.75456698]]


In [9]:
# create a random array of discrete uniform integers between [0, 10)
intrand_array = np.random.randint(0,10,(3,4))
print(intrand_array)

[[0 6 3 8]
 [2 5 0 0]
 [6 9 9 0]]
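
The random outputs above change on every run. If you need reproducible numbers, one option is NumPy's generator interface, np.random.default_rng, with a fixed seed. This is a sketch rather than part of the original notes; the seed value 42 and the variable names are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # the seed value is arbitrary

uniform = rng.random((3, 4))                          # floats in [0, 1)
normal = rng.normal(loc=5, scale=8, size=(3, 4))      # mean 5, sd 8
integers = rng.integers(low=0, high=10, size=(3, 4))  # integers in [0, 10)

# Re-creating the generator with the same seed reproduces the same draws
rng_again = np.random.default_rng(seed=42)
print(np.array_equal(uniform, rng_again.random((3, 4))))   # True
```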


### 1.2. Array Slicing (sub-array)

<p>The Numpy slicing syntax follows that of the standard Python list:</p>

<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<i>an_array[start:stop:step]</i></p>

In [10]:
import numpy as np

an_array = np.arange(10)         # arange() in numpy works similar to range() in standard python 
print("Array: ", an_array)
print("Shape: ", an_array.shape) # this array is a rank 1 array 

Array:  [0 1 2 3 4 5 6 7 8 9]
Shape:  (10,)


In [15]:
# Slicing for rank1 arrays
print("Slice 1: an_array[2:5]   is", an_array[2:5]) 
print("Slice 2: an_array[:5]    is", an_array[:5])    # first five elements
print("Slice 3: an_array[5:]    is", an_array[5:])    # elements from index 5 onward
print("Slice 4: an_array[::2]   is", an_array[::2])   # every other element
print("Slice 5: an_array[::3]   is", an_array[::3])    
print("Slice 6: an_array[3::2]  is", an_array[3::2])  # every other element, starting at index 3
print("Slice 7: an_array[3::3]  is", an_array[3::3])
print("Slice 8: an_array[::-1]  is", an_array[::-1])  # all elements, reversed
print("Slice 9: an_array[5::-2] is", an_array[5::-2]) # reversed every other from index 5

Slice 1: an_array[2:5]   is [2 3 4]
Slice 2: an_array[:5]    is [0 1 2 3 4]
Slice 3: an_array[5:]    is [5 6 7 8 9]
Slice 4: an_array[::2]   is [0 2 4 6 8]
Slice 5: an_array[::3]   is [0 3 6 9]
Slice 6: an_array[3::2]  is [3 5 7 9]
Slice 7: an_array[3::3]  is [3 6 9]
Slice 8: an_array[::-1]  is [9 8 7 6 5 4 3 2 1 0]
Slice 9: an_array[5::-2] is [5 3 1]


In [17]:
another_array = np.array([[12,  5,  2,  4],
                          [ 7,  6,  8,  8],
                          [ 1,  6,  7,  7]])

<p> The above code creates a matrix (2-dimensional array)</p> 
<table width=300>
    <tr style="background-color:white;">
        <td>&nbsp;</td>
        <td style="text-align:center">0</td>
        <td style="text-align:center">1</td>
        <td style="text-align:center">2</td>
        <td style="text-align:center">3</td>
    </tr>
    <tr style="background-color:white;">
        <td style="text-align:center">0</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">12</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">5</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">2</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">4</td>
    </tr>
    <tr style="background-color:white">
        <td style="text-align:center">1</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">7</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">6</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">8</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">8</td>
    </tr>
    <tr style="background-color:grey">
        <td style="text-align:center">2</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">1</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">6</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">7</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">7</td>
    </tr>
</table>
    

<h1>Output change</h1>
<br>

another_array[1, :] and another_array[1:2, :] select the same values (row 1), but the shapes of the results differ.
<br/>
With an integer index (1), that dimension is dropped, so the result is a rank 1 array with shape (4,).
<br>
With a slice (1:2), the dimension is kept, so the result is a rank 2 array with shape (1, 4).
<br/>


In [18]:
# Slicing for rank2 arrays
print("another_array[:2, :3]\n\n", another_array[:2, :3])
print("\n")
print("another_array[:2, ::2]\n\n", another_array[:2, ::2])
print("\n")
print("another_array[:, ::2]\n\n", another_array[:, ::2])
print("\n")
print("another_array[::-1, ::-1]\n\n", another_array[::-1, ::-1])
print("\n")
print("another_array[1, :]\n\n", another_array[1, :])
print("\n")
print("another_array[:, 2]\n\n", another_array[:, 2])

another_array[:2, :3]

 [[12  5  2]
 [ 7  6  8]]


another_array[:2, ::2]

 [[12  2]
 [ 7  8]]


another_array[:, ::2]

 [[12  2]
 [ 7  8]
 [ 1  7]]


another_array[::-1, ::-1]

 [[ 7  7  6  1]
 [ 8  8  6  7]
 [ 4  2  5 12]]


another_array[1, :]

 [7 6 8 8]


another_array[:, 2]

 [2 8 7]


In [None]:
another_array = np.array([[12,  5,  2,  4],
                          [ 7,  6,  8,  8],
                          [ 1,  6,  7,  7]])
another_array[1:2,:].shape

In [None]:
another_array[1:2,:]

<b>Important Example:</b>

In [None]:
print("another_array[:, 2] is :", another_array[:, 2])                  # rank1 array
print("another_array[:, 2].shape is :", another_array[:, 2].shape)
print("another_array[:, 2:3] is :\n", another_array[:, 2:3])            # rank2 array
print("another_array[:, 2:3].shape is :", another_array[:, 2:3].shape)
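
The cells above are left without output, so here is a small check of the shapes they should produce. The variable names are illustrative.

```python
import numpy as np

another_array = np.array([[12,  5,  2,  4],
                          [ 7,  6,  8,  8],
                          [ 1,  6,  7,  7]])

row_by_index = another_array[1, :]     # integer index drops the dimension
row_by_slice = another_array[1:2, :]   # slice keeps the dimension
col_by_index = another_array[:, 2]
col_by_slice = another_array[:, 2:3]

print(row_by_index.shape)   # (4,)
print(row_by_slice.shape)   # (1, 4)
print(col_by_index.shape)   # (3,)
print(col_by_slice.shape)   # (3, 1)
```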

<p><b>Note:</b> Array slices return <u>views</u> rather than copies of the array data. This is different from Python list slicing, in which a slice generates a new copy of the sub-list.</p>

A sub-array created by slicing is a view: changes made to the sub-array also change the original array.
<br>
A sub-array created with .copy() is independent: changes made to it do not affect the original array.
<br>


In [12]:
# Here is an example
ex_array = np.array([[11,12,13],
                     [21,22,23],
                     [31,32,33]])
sub_array = ex_array[1:3,1:3]
sub_array

array([[22, 23],
       [32, 33]])

In [29]:
# let's change the content of sub_array
# sub_array is a view, so the same position in the original array is updated
sub_array[0,0] = 25
# let's check the original array
ex_array

array([[11, 12, 13],
       [21, 25, 23],
       [31, 32, 33]])

<p><b>Note:</b> In order to create a separate copy of a sub-array, you need to use the <b style="color:blue;">copy( )</b> function.</p>

In [31]:
ex_array = np.array([[11,12,13],
                     [21,22,23],
                     [31,32,33]])

sub_array_copy = ex_array[1:3,1:3].copy()
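
To see the difference between a view and a copy side by side, the following sketch modifies both and checks the original array each time. np.shares_memory reports whether two arrays use the same underlying data; <b>sub_view</b> and <b>sub_copy</b> are illustrative names.

```python
import numpy as np

ex_array = np.array([[11, 12, 13],
                     [21, 22, 23],
                     [31, 32, 33]])

sub_view = ex_array[1:3, 1:3]          # view on the original data
sub_copy = ex_array[1:3, 1:3].copy()   # independent copy

sub_copy[0, 0] = 99    # does not touch ex_array
print(ex_array[1, 1])  # 22

sub_view[0, 0] = 25    # writes through to ex_array
print(ex_array[1, 1])  # 25

print(np.shares_memory(ex_array, sub_view))  # True
print(np.shares_memory(ex_array, sub_copy))  # False
```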

### 1.3. Boolean Indexing (Filtering)

<p>You can create filters by applying specific criteria to the arrays. Filters can be used to extract subsample of elements from an array.</p>

Indexing the array with the filter, as in exp_array[filter_array], returns a rank 1 array that contains only the selected values.
<br>
The filter itself (for example exp_array > 6) is a rank 2 boolean array with the same shape as the original array.

In [32]:
# let's run an example
exp_array = np.array([[12,  5,  2,  4],
                      [ 7,  6,  8,  8],
                      [ 1,  6,  7,  7]])

filter_array = exp_array > 6     # This generates a same size (shape) boolean array  
print(filter_array)

[[ True False False False]
 [ True False  True  True]
 [False False  True  True]]


<p>Here is the matrix presentation of the result:</p>
<p>exp_array</p> 
<table width=300>
    <tr style="background-color:white;">
        <td>&nbsp;</td>
        <td style="text-align:center">0</td>
        <td style="text-align:center">1</td>
        <td style="text-align:center">2</td>
        <td style="text-align:center">3</td>
    </tr>
    <tr style="background-color:white;">
        <td style="text-align:center">0</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">12</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">5</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">2</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">4</td>
    </tr>
    <tr style="background-color:white">
        <td style="text-align:center">1</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">7</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">6</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">8</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">8</td>
    </tr>
    <tr style="background-color:grey">
        <td style="text-align:center">2</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">1</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">6</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">7</td>
        <td style="border: 1px black solid;text-align:center; background-color:yellow">7</td>
    </tr>
</table>

<p>filter_array</p> 
<table width=300>
    <tr style="background-color:white;">
        <td>&nbsp;</td>
        <td style="text-align:center">0</td>
        <td style="text-align:center">1</td>
        <td style="text-align:center">2</td>
        <td style="text-align:center">3</td>
    </tr>
    <tr style="background-color:white;">
        <td style="text-align:center">0</td>
        <td style="border: 1px black solid;text-align:center; background-color:green">True</td>
        <td style="border: 1px black solid;text-align:center; background-color:red">False</td>
        <td style="border: 1px black solid;text-align:center; background-color:red">False</td>
        <td style="border: 1px black solid;text-align:center; background-color:red">False</td>
    </tr>
    <tr style="background-color:white">
        <td style="text-align:center">1</td>
        <td style="border: 1px black solid;text-align:center; background-color:green">True</td>
        <td style="border: 1px black solid;text-align:center; background-color:red">False</td>
        <td style="border: 1px black solid;text-align:center; background-color:green">True</td>
        <td style="border: 1px black solid;text-align:center; background-color:green">True</td>
    </tr>
    <tr style="background-color:grey">
        <td style="text-align:center">2</td>
        <td style="border: 1px black solid;text-align:center; background-color:red">False</td>
        <td style="border: 1px black solid;text-align:center; background-color:red">False</td>
        <td style="border: 1px black solid;text-align:center; background-color:green">True</td>
        <td style="border: 1px black solid;text-align:center; background-color:green">True</td>
    </tr>
</table>
<br/>
<p>Let's use the filter to extract the desired values from the array</p>

In [33]:
filter_result = exp_array[filter_array]         # result is one-dimensional rank1 array
print('Filtered array: ',filter_result)

print('Original array: ',exp_array)


Filtered array:  [12  7  8  8  7  7]
Original array:  [[12  5  2  4]
 [ 7  6  8  8]
 [ 1  6  7  7]]


<table width=300>
    <tr style="background-color:white;">
        <td>&nbsp;</td>
        <td style="text-align:center">0</td>
        <td style="text-align:center">1</td>
        <td style="text-align:center">2</td>
        <td style="text-align:center">3</td>
    </tr>
    <tr style="background-color:white;">
        <td style="text-align:center">0</td>
        <td style="border: 1px black solid;text-align:center; background-color:green">12</td>
        <td style="border: 1px black solid;text-align:center; background-color:red">5</td>
        <td style="border: 1px black solid;text-align:center; background-color:red">2</td>
        <td style="border: 1px black solid;text-align:center; background-color:red">4</td>
    </tr>
    <tr style="background-color:white">
        <td style="text-align:center">1</td>
        <td style="border: 1px black solid;text-align:center; background-color:green">7</td>
        <td style="border: 1px black solid;text-align:center; background-color:red">6</td>
        <td style="border: 1px black solid;text-align:center; background-color:green">8</td>
        <td style="border: 1px black solid;text-align:center; background-color:green">8</td>
    </tr>
    <tr style="background-color:grey">
        <td style="text-align:center">2</td>
        <td style="border: 1px black solid;text-align:center; background-color:red">1</td>
        <td style="border: 1px black solid;text-align:center; background-color:red">6</td>
        <td style="border: 1px black solid;text-align:center; background-color:green">7</td>
        <td style="border: 1px black solid;text-align:center; background-color:green">7</td>
    </tr>
</table>

<p>&nbsp;</p>

<p><b>Note:</b> We can update elements in arrays applying similar filtering.</p>  
<p><b>Example:</b> Let's add 20 to all the green elements.</p>

In [34]:
exp_array[exp_array>6] += 20
exp_array

array([[32,  5,  2,  4],
       [27,  6, 28, 28],
       [ 1,  6, 27, 27]])

<p><b>Example:</b> Use the exp_array and apply a filter to extract the even numbers. </p>

In [38]:
# write your code over here
import numpy as np

exp_array = np.array([[12,  5,  2,  4],
                      [ 7,  6,  8,  8],
                      [ 1,  6,  7,  7]])

even_filter_array = (exp_array%2 ==0)
exp_array[even_filter_array]

array([12,  2,  4,  6,  8,  8,  6])

<p><b>Example:</b> In exp_array, multiply all the even numbers by 3.</p>

In [39]:
# write your code over here
exp_array[even_filter_array] *=3
exp_array

array([[36,  5,  6, 12],
       [ 7, 18, 24, 24],
       [ 1, 18,  7,  7]])

<p><b>Note:</b> You can use the <b style="color:blue;">any( )</b> and <b style="color:blue;">all( )</b> functions to test whether any or all elements in your array meet your filtering criteria.</p>

In [46]:
exp_array = np.array([[12,  5,  2,  4],
                      [ 7,  6,  8,  8],
                      [ 1,  6,  7,  7]])

filter_array = exp_array > 6

In [50]:
# test to see whether any elements of the exp_array meet the criteria
# call any() on the boolean filter with no array argument; filter_array.any(exp_array) raises an error
filter_array.any() 

True

In [51]:
# test to see whether all the elements of the exp_array met the criteria
filter_array.all() 

False

<b>Note:</b> You can also use multiple criteria to filter your numpy array. In order to do it you need to use <b style='color:blue'>|</b> and <b style='color:blue'>&</b> operators. 

<b>Example1:</b> Consider the following array and find values that are less than 10 and more than 2.

In [None]:
exp_array = np.array([[12,  5,  2,  4],
                      [ 7,  6,  8,  8],
                      [ 1,  6,  7,  7]])

filter_array = (exp_array<10) & (exp_array>2)
exp_array[filter_array]

<b>Example2:</b> Consider the following array and find values that are more than 10 or less than 3.

In [None]:
exp_array = np.array([[12,  5,  2,  4],
                      [ 7,  6,  8,  8],
                      [ 1,  6,  7,  7]])

filter_array = (exp_array>10) | (exp_array<3)
exp_array[filter_array]
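
The two examples above are shown without output, so the sketch below prints their results. Each comparison needs its own parentheses because & and | bind more tightly than < and >. Python's <b>and</b> / <b>or</b> keywords do not work element-wise on arrays and raise a ValueError. The ~ operator, which negates a boolean filter, is an extra that is not used elsewhere in these notes.

```python
import numpy as np

exp_array = np.array([[12,  5,  2,  4],
                      [ 7,  6,  8,  8],
                      [ 1,  6,  7,  7]])

between = exp_array[(exp_array < 10) & (exp_array > 2)]   # Example 1
outside = exp_array[(exp_array > 10) | (exp_array < 3)]   # Example 2
not_even = exp_array[~(exp_array % 2 == 0)]               # ~ negates a filter

print(between)    # [5 4 7 6 8 8 6 7 7]
print(outside)    # [12  2  1]
print(not_even)   # [5 7 1 7 7]
```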

### 1.4. Reshaping of Arrays

<p>NumPy allows you to reshape your arrays using the <b style="color:blue">reshape( )</b> function. It is very useful for converting one-dimensional arrays into multi-dimensional arrays.</p>

In [41]:
# Example 1
exp_array = np.arange(16).reshape(2,8)
exp_array

array([[ 0,  1,  2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13, 14, 15]])

In [42]:
exp_array.reshape(4,4)

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [43]:
exp_array.reshape(8,2)

array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11],
       [12, 13],
       [14, 15]])

In [56]:
# Example 2
exp_array = np.arange(20).reshape(2,2,5) # three dimensional array
exp_array

array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9]],

       [[10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19]]])

In [57]:
exp_array = np.arange(5).reshape(1,-1)
exp_array.shape


(1, 5)

<p><b>Note:</b> The size (total number of elements) of the initial array must match the size of the reshaped array.</p>
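
The size rule can be tried out directly. In this sketch (the array <b>values</b> is illustrative), -1 asks NumPy to infer one dimension from the total size, and a shape whose product does not equal the number of elements raises a ValueError.

```python
import numpy as np

values = np.arange(12)               # 12 elements

print(values.reshape(3, 4).shape)    # (3, 4)
print(values.reshape(2, -1).shape)   # (2, 6): -1 is inferred as 12 / 2
print(values.reshape(-1, 1).shape)   # (12, 1): a single column

try:
    values.reshape(5, 3)             # 5 * 3 = 15 does not match 12
except ValueError as err:
    print("ValueError:", err)
```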

<p>&nbsp;</p>

<p><b>Note:</b> You can also use the <b style="color:blue;">T</b> attribute to find the transpose of an array</p>

In [None]:
exp2_array = np.array([[1,2,3],
                       [4,5,6]])
print("Original array")
print(exp2_array)
print()
print("Transpose array")
print(exp2_array.T)

### 1.5. Universal functions (ufunc)

<p>A universal function (or ufunc for short) is a function that operates on ndarrays in an element-by-element fashion. Universal functions are instances of the numpy.ufunc class. Many of the built-in functions are implemented in compiled C code which provides faster execution of operations.</p>

In [45]:
# Let's consider a one-dimensional array and compute the reciprocals of its elements in classic format
import numpy as np 
                           

def compute_reciprocals(values): 
    output = np.empty(len(values))            # generate an empty array with specified size
    for i in range(len(values)):
        output[i] = 1.0 / values[i] 
    return output
        
values = np.random.randint(1, 10, size=5)     # 5 random integers in [1, 10)
compute_reciprocals(values)

array([0.125     , 0.125     , 1.        , 0.33333333, 0.25      ])

In [64]:
# Now, we can check the required time for this procedure on a large size array
big_array = np.random.randint(1, 100, size=1000000)
%timeit compute_reciprocals(big_array)

1.09 s ± 84.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


<p><b>Note:</b> NumPy's ufuncs let you write an operation once at the array level; NumPy then applies it to every element of the array in optimized compiled code (we call this a vectorized operation).</p>

In [46]:
# Example: You can perform the reciprocal task in the following format.
values = np.random.randint(1, 10, size=5)
print("   Classic method result:", compute_reciprocals(values))
print("Vectorized method result:", 1/values)

   Classic method result: [0.2        0.14285714 0.11111111 0.14285714 0.5       ]
Vectorized method result: [0.2        0.14285714 0.11111111 0.14285714 0.5       ]


<p><b>Question:</b> As you can see, the results of the two methods are the same. The question is: which one is faster?</p>

In [47]:
big_array = np.random.randint(1, 100, size=1000000)
%timeit 1/big_array

1.04 ms ± 21.1 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


Here you can find more arithmetic array operations:

In [48]:
x = np.array([[1,2],
              [3,4]])

y = np.array([[5.5,6.6],
              [7.7,8.8]])

print(x)
print()
print(y)

[[1 2]
 [3 4]]

[[5.5 6.6]
 [7.7 8.8]]


In [66]:
# add
print(x + y)         
print()
print(np.add(x, y))  

[[ 6.5  8.6]
 [10.7 12.8]]

[[ 6.5  8.6]
 [10.7 12.8]]


In [67]:
# subtract
print(x - y)
print()
print(np.subtract(x, y))

[[-4.5 -4.6]
 [-4.7 -4.8]]

[[-4.5 -4.6]
 [-4.7 -4.8]]


In [68]:
# multiply
print(x * y)
print()
print(np.multiply(x, y))

[[ 5.5 13.2]
 [23.1 35.2]]

[[ 5.5 13.2]
 [23.1 35.2]]


In [69]:
# divide
print(x / y)
print()
print(np.divide(x, y))

[[0.18181818 0.3030303 ]
 [0.38961039 0.45454545]]

[[0.18181818 0.3030303 ]
 [0.38961039 0.45454545]]


In [49]:
# square root
print(np.sqrt(x))

[[1.         1.41421356]
 [1.73205081 2.        ]]


In [50]:
# exponent (e ** x)
print(np.exp(x))

[[ 2.71828183  7.3890561 ]
 [20.08553692 54.59815003]]


In [None]:
A = np.array([[-1,-2],
              [5,3]])

B = np.array([[4,2],
              [-1,-6]])

In [None]:
A + B 

In [None]:
A + B.T

In [None]:
A.T - B

In [None]:
A.T

In [None]:
(A - B).T

In [None]:
(A + B).T - A
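
To check your answers to the cells above, you can work a few of the results out by hand and compare. The sketch below also checks that transposing a sum gives the same result as summing the transposes.

```python
import numpy as np

A = np.array([[-1, -2],
              [ 5,  3]])
B = np.array([[ 4,  2],
              [-1, -6]])

print(A + B)            # rows: [3, 0] and [4, -3]
print(A + B.T)          # rows: [3, -3] and [7, -3]
print((A + B).T - A)    # rows: [4, 6] and [-5, -6]

# Transposing a sum equals summing the transposes
print(np.array_equal((A + B).T, A.T + B.T))   # True
```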

<b>Extra Note:</b> <b style='color:blue'>reduce( )</b> and <b style='color:blue'>accumulate( )</b> are two useful methods of universal functions that can be used for aggregation.
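
As a short illustration, reduce collapses an array to a single value by applying the ufunc repeatedly, while accumulate keeps every intermediate result. The array <b>x</b> below is an arbitrary example.

```python
import numpy as np

x = np.arange(1, 5)                   # [1 2 3 4]

print(np.add.reduce(x))               # 10, same as x.sum()
print(np.multiply.reduce(x))          # 24, same as x.prod()
print(np.add.accumulate(x))           # [ 1  3  6 10], same as np.cumsum(x)
print(np.multiply.accumulate(x))      # [ 1  2  6 24], same as np.cumprod(x)
```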

### 1.6. Broadcasting

<p>Broadcasting is simply a set of rules for applying binary ufuncs (addition, subtraction, multiplication, etc.) on arrays of different sizes.</p>
<img src="https://jakevdp.github.io/PythonDataScienceHandbook/figures/02.05-broadcasting.png" width=400 height=400>
<p><b>Source</b>: VanderPlas, J., 2016. Python data science handbook: essential tools for working with data. " O'Reilly Media, Inc."</p>

<p>Rules of broadcasting:</p>

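* If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.
* > (3,4) & (4,) --> works: (4,) is padded to (1,4) and then stretched to (3,4). A rank 1 array must match the number of columns (the last dimension).
* > (3,4) & (3,) --> error: (3,) is padded to (1,3), and 3 does not match 4 in the last dimension.
* > (3,4) & (4,1) --> error: the first dimensions are 3 and 4, and neither is 1.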

* If the shape of the two arrays does not match in any dimension, the array with shape equal to 1 in that dimension is stretched to match the other shape.

* If in any dimension the sizes disagree and neither is equal to 1, an error is raised.
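
The rules above can be tested on a few shape pairs. The helper <b>broadcast_result_shape</b> is a hypothetical function written for this illustration (it is not a NumPy function); it simply adds two arrays of zeros and reports the resulting shape, or None when NumPy refuses to broadcast.

```python
import numpy as np

def broadcast_result_shape(shape_a, shape_b):
    """Return the shape of a + b, or None if the shapes cannot be broadcast."""
    try:
        return (np.zeros(shape_a) + np.zeros(shape_b)).shape
    except ValueError:
        return None

print(broadcast_result_shape((3, 4), (4,)))     # (3, 4)
print(broadcast_result_shape((3, 4), (3, 1)))   # (3, 4)
print(broadcast_result_shape((3, 1), (1, 4)))   # (3, 4)
print(broadcast_result_shape((3, 4), (3,)))     # None
print(broadcast_result_shape((3, 4), (4, 1)))   # None
```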

In [1]:
# Example 1
# A scalar is broadcast to every element of the array.
import numpy as np

an_array = np.array([0,1,2])      # Shape (3,)
an_integer = 5
an_array + an_integer 


array([5, 6, 7])

In [3]:
# Example 2
a_matrix = np.ones((3,3))
print(a_matrix)
an_array = np.array([1,2])     # Shape (2,): 2 does not match 3, so broadcasting fails
a_matrix * an_array

[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]


ValueError: operands could not be broadcast together with shapes (3,3) (2,) 

In [73]:
# Example 3
an_array = np.array([0,1,2])     # Shape (3,)
another_array = np.array([[0],
                          [1],
                          [2]])  # Shape (3,1)
an_array + another_array

array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])

<p><b>Example:</b> (1) Create a 3x4 matrix, (2) create a 3x1 vector. Add these two arrays and work out how broadcasting applies in this case.</p>

In [7]:
# Write your code over here
A = np.random.randn(3,4)
B = np.ones((3,1)) # initialize a 3x1 matrix with 1 in each cell
print(A)
print(B)
print(A+B)
# print(B)


[[ 1.70904272  1.12109294  1.50136425 -0.86758454]
 [ 0.6209432   0.99797542  0.4687915  -0.7544783 ]
 [-0.09432427  1.4607692   0.59970884  0.4588514 ]]
[[1.]
 [1.]
 [1.]]
[[2.70904272 2.12109294 2.50136425 0.13241546]
 [1.6209432  1.99797542 1.4687915  0.2455217 ]
 [0.90567573 2.4607692  1.59970884 1.4588514 ]]


In [8]:
A+B

array([[2.70904272, 2.12109294, 2.50136425, 0.13241546],
       [1.6209432 , 1.99797542, 1.4687915 , 0.2455217 ],
       [0.90567573, 2.4607692 , 1.59970884, 1.4588514 ]])

<p><b>Example:</b> (1) Create a 3x4 matrix, (2) create a 4x1 vector. Add these two arrays and work out how broadcasting applies in this case.</p>

In [2]:
# Write your code over here
import numpy as np
A = np.random.randn(3,4)
B = np.ones((4,1))  # 4x1 column vector, as the exercise asks
print(A)
print(B)

[[-1.20846295 -0.24119338 -1.51049736 -0.16821151]
 [-0.70783845 -0.53484067 -1.09715672  1.02347983]
 [-2.72540571  1.20186728 -0.49397188 -0.11217459]]
[[1.]
 [1.]
 [1.]
 [1.]]


In [3]:
A+B  # shapes (3,4) and (4,1): the first dimensions are 3 and 4, and neither is 1

ValueError: operands could not be broadcast together with shapes (3,4) (4,1) 

### 1.7. Aggregation

<p>Sometimes you need to apply operations (sum, max, min, mean) at the aggregate level to get summary statistics about your array (mostly one-dimensional). For this purpose, you can use either the built-in Python functions or the NumPy versions of the functions. However, keep in mind that the NumPy functions run much faster on large arrays.</p>

In [56]:
# Example
x = np.random.randint(0,100,(10000000,1))
print(sum(x), min(x), max(x))
print(np.sum(x),np.max(x),np.min(x))

[495106926] [0] [99]
495106926 99 0


In [57]:
%timeit sum(x)

4.21 s ± 61.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [58]:
%timeit np.sum(x)

2.53 ms ± 87.7 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


#### 1.7.1. Multidimensional aggregates

<p>One common type of aggregation operation is an aggregate along a row or column.</p>

In [59]:
# Example
an_array = np.random.randint(1,10,(4, 5)) #matrix with 4 rows and 5 columns
print(an_array)

[[7 2 1 8 4]
 [9 4 9 3 3]
 [8 7 8 6 1]
 [2 3 9 4 7]]


In [60]:
# sum
print("Total sum:", an_array.sum()) #returns the sum of all cells
print("Column-wise sum: ", an_array.sum(axis = 0)) # axis=0 runs down the rows, so one sum per column is returned
print("Row-wise sum: ", an_array.sum(axis = 1))

Total sum: 105
Column-wise sum:  [26 16 27 21 15]
Row-wise sum:  [22 28 30 25]


In [14]:
# mean
print("Total mean:", an_array.mean())
print("Column-wise mean: ", an_array.mean(axis = 0))
print("Row-wise mean: ", an_array.mean(axis = 1))

Total mean: 5.25
Column-wise mean:  [6.5  4.   6.75 5.25 3.75]
Row-wise mean:  [4.4 5.6 6.  5. ]


In [15]:
# max
print("Overall max:", an_array.max())
print("Column-wise max: ", an_array.max(axis = 0))
print("Row-wise max: ", an_array.max(axis = 1))

Overall max: 9
Column-wise max:  [9 7 9 8 7]
Row-wise max:  [8 9 8 9]


<b>Note:</b> When you aggregate a 2D array along one axis, that axis is removed and the output is a 1D array. Let's see an example:

In [17]:
an_array = np.random.randint(1,10,(4, 5))
sum_result = an_array.sum(axis = 1)
print(sum_result.shape)

(4,)


If you want to keep the result in 2D format (the same number of dimensions as the input), you need to use the <b style="color:red">keepdims=True</b> parameter.

In [16]:
an_array = np.random.randint(1,10,(4, 5))
sum_result = an_array.sum(axis = 0, keepdims=True)
print(sum_result.shape)

(1, 5)
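
One place where keepdims=True helps is when the aggregate is combined with the original array through broadcasting. The sketch below (with an illustrative 2x3 array) divides each row by its own sum. With keepdims=True the row sums have shape (2, 1), which broadcasts against (2, 3); without it the shape would be (2,), which does not.

```python
import numpy as np

an_array = np.array([[1, 2, 3],
                     [4, 5, 6]])

row_sums = an_array.sum(axis=1, keepdims=True)   # shape (2, 1)
row_shares = an_array / row_sums                  # (2, 3) / (2, 1) broadcasts

print(row_sums.shape)           # (2, 1)
print(row_shares.sum(axis=1))   # each row now sums to 1
```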


### 1.8. Sorting in Numpy

<b>Note:</b> There are two useful sort functions in numpy:
<ol>
    <li><b>sort:</b> the array method an_array.sort( ) sorts the array in place (it does not return a new array).</li>
    <li><b>argsort:</b> returns the indices that would sort the array.</li></ol>

In [3]:
an_array = np.array([1,5,8,2,4,11,6])
an_array.sort()
an_array

array([ 1,  2,  4,  5,  6,  8, 11])
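
If you want to keep the original order, the function np.sort returns a sorted copy instead of sorting in place. The sketch below contrasts the two; <b>sorted_copy</b> and <b>result</b> are illustrative names.

```python
import numpy as np

an_array = np.array([1, 5, 8, 2, 4, 11, 6])

sorted_copy = np.sort(an_array)   # returns a new sorted array
print(an_array)                   # original order is unchanged
print(sorted_copy)

result = an_array.sort()          # sorts in place and returns None
print(result)                     # None
print(an_array)                   # now sorted
```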

<b>Note:</b> You can also find the reverse of the sorted array using slicing methods.

In [4]:
an_array[::-1]

array([11,  8,  6,  5,  4,  2,  1])

Let's see an example of <u>argsort</u> function.

In [19]:
an_array = np.array([1,5,8,2,4,11,6])
an_array.argsort() # gives the original index of the item.

array([0, 3, 4, 1, 6, 2, 5])

In [20]:
# like before you can do it in reverse
an_array.argsort()[::-1]

array([5, 2, 6, 1, 4, 3, 0])

<b>Question:</b> Consider the following two arrays:

In [21]:
names = np.array(['Jhon','Ali','Miriam','Alex','Hoda','Jennifer'])
ages = np.array([18,22,11,7,31,19])

Now, sort the names based on ages in a descending manner (high to low)  

In [11]:
# argsort returns the indices that would sort the array (original positions, in sorted order).
# 7 is the smallest age and its index is 3, so names[3] = 'Alex' comes first in ascending order.
# [::-1] reverses the order, so 'Alex' comes last.
names[ages.argsort()[::-1]]

array(['Hoda', 'Ali', 'Jennifer', 'Jhon', 'Miriam', 'Alex'], dtype='<U8')

In [22]:
for item in ages.argsort()[::-1]:
    print(names[item])

Hoda
Ali
Jennifer
Jhon
Miriam
Alex


### 1.9. Dot Product (Vectorization)

<p>You can find dot product of two matrices in the following format:</p> 

In [24]:
x = np.array([[1,2,3],
              [4,5,6]]) #2x3

y = np.array([[7 ,8 ],
              [9 ,10],
              [11,12]])#3X2

print(x.dot(y)) # valid because x has 3 columns and y has 3 rows
print()
print(np.dot(x, y))


[[ 58  64]
 [139 154]]

[[ 58  64]
 [139 154]]


In [25]:
# Example: dot product of one-dimensional arrays
vec1 = np.array([5 , 5 ])
vec2 = np.array([2, 2])

print("vec1.dot(vec2):{}".format(vec1.dot(vec2)))
print()
print("np.dot(vec1, vec2):{}".format(np.dot(vec1,vec2)))

vec1.dot(vec2):20

np.dot(vec1, vec2):20
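
For 2D arrays, the @ operator gives the same matrix product as dot. The sketch below also shows what happens when the inner dimensions do not match: multiplying a 2x3 array by another 2x3 array raises a ValueError.

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])      # 2x3
y = np.array([[ 7,  8],
              [ 9, 10],
              [11, 12]])       # 3x2

print(x @ y)                                  # same as x.dot(y)
print(np.array_equal(x @ y, np.dot(x, y)))    # True

try:
    x @ x                                     # 2x3 times 2x3: inner sizes 3 and 2 differ
except ValueError as err:
    print("ValueError:", err)
```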


### 1.10. Merging data sets

<p>You can merge arrays horizontally or vertically, considering the following rules </p>

* In order to merge arrays vertically, two arrays should have identical number of columns.
* In order to merge arrays horizontally, two arrays should have identical number of rows.

In [27]:
array_one = np.random.randint(1,10,size=(4,2)) #4x2
array_two = np.random.randint(1,10,size=(3,2)) #3x2
print(array_one)
print()
print(array_two)

[[6 5]
 [4 4]
 [7 7]
 [2 1]]

[[5 4]
 [9 1]
 [7 6]]


In [28]:
# vertical merge method 1 (both arrays must have the same number of columns)
np.vstack((array_one,array_two))
# a vertical merge increases the number of rows

array([[6, 5],
       [4, 4],
       [7, 7],
       [2, 1],
       [5, 4],
       [9, 1],
       [7, 6]])

In [30]:
# vertical merge method 2
np.concatenate([array_one,array_two], axis = 0) # axis=1 would fail here: the arrays have 4 and 3 rows

array([[6, 5],
       [4, 4],
       [7, 7],
       [2, 1],
       [5, 4],
       [9, 1],
       [7, 6]])

In [31]:
array_three = np.random.randint(1,10,size=(3,4))
array_four = np.random.randint(1,10,size=(3,6))
print(array_three)
print()
print(array_four)


[[9 6 1 2]
 [3 2 6 4]
 [1 7 5 4]]

[[9 1 3 5 3 8]
 [3 1 3 6 2 5]
 [4 9 8 9 6 4]]


In [34]:
# Horizontal merge method 1 (both arrays must have the same number of rows)
np.hstack((array_three,array_four))

array([[9, 6, 1, 2, 9, 1, 3, 5, 3, 8],
       [3, 2, 6, 4, 3, 1, 3, 6, 2, 5],
       [1, 7, 5, 4, 4, 9, 8, 9, 6, 4]])

In [None]:
# Horizontal merge method 2
np.concatenate([array_three,array_four], axis = 1)
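
The two merge rules can be checked with arrays of known shapes. In this sketch, a 4x2 and a 3x2 array stack vertically because they share the number of columns, while the horizontal merge fails because the row counts differ. The arrays are filled with ones and zeros only to keep the example deterministic.

```python
import numpy as np

array_one = np.ones((4, 2))
array_two = np.zeros((3, 2))

stacked = np.vstack((array_one, array_two))   # same number of columns: works
print(stacked.shape)                          # (7, 2)

try:
    np.hstack((array_one, array_two))         # 4 rows vs 3 rows: fails
except ValueError as err:
    print("ValueError:", err)
```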