# 5 Numpy functions you didn't know you needed

[Numpy](https://en.wikipedia.org/wiki/NumPy) is a library for the Python programming language that adds support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. It is one of the most widely used libraries for data processing in fields such as data science, data engineering and data analysis. With Numpy, developers can do the following:

- Array arithmetic and logical operations.
- Fourier transforms and routines for shape manipulation.
- Operations related to linear algebra (Numpy has built-in functions for linear algebra and random number generation).

Numpy is commonly used with SciPy (Scientific Python) and Matplotlib (a plotting library). This combination is widely used as an alternative to MATLAB, a popular technical computing platform, and many practitioners now regard Python as the more modern and complete programming language. In this article we discuss 5 interesting Numpy array functions:

- np.reshape()
- np.allclose()
- np.where()
- np.clip()
- np.einsum()



Let's begin by importing Numpy and listing out the functions covered in this notebook.

In [4]:
import numpy as np

The main idea here is to discuss 5 interesting Numpy array functions that make life easier when doing data analysis. We will not analyse every parameter of each function; instead we use the simplest possible examples to show how useful they are and how much time they can save. For each function we give two examples of successful use and a third example showing what can cause the function to break and how to fix it.

## Function 1 - np.reshape

The [reshape](https://numpy.org/doc/stable/reference/generated/numpy.reshape.html) function gives a new shape to an array without changing its data. It is one of the most used Numpy functions in fields such as machine learning and neural networks.

In [7]:
# Example-11: Define array to reshape
arr = np.arange(1,19)
print(arr)

#reshape 
print("reshape1\n",arr.reshape(3,6))
print('reshape2\n',arr.reshape(2,3,3))

[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18]
reshape1
 [[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]
 [13 14 15 16 17 18]]
reshape2
 [[[ 1  2  3]
  [ 4  5  6]
  [ 7  8  9]]

 [[10 11 12]
  [13 14 15]
  [16 17 18]]]


Example-11 shows how reshape gives a new shape to an array: it turns a 1D array of 18 elements into a 2D array (3x6) and into a 3D array (2x3x3).

In [9]:
# Example-12: reduce array dimension 
arr2 = np.array([[1,2,3,4,5,6,7,8,9],
        [10,11,12,13,14,15,16,17,18]])
#reshape
print(arr2.shape)
print(np.reshape(arr2,-1))

(2, 9)
[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18]


Example-12 shows that reshape can also go the other way and flatten a multi-dimensional array into a 1D array. Passing -1 tells Numpy to infer the length from the number of elements.

In [10]:
# Example-13: error due to shape mismatch
arr3 = np.arange(11)
print(arr3.reshape(2,5))

ValueError: cannot reshape array of size 11 into shape (2,5)

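Example-13 raises an error because we try to reshape 11 elements into a 2x5 array, which holds only 10. When applying reshape, the total number of elements must remain the same. To fix it, choose a shape whose dimensions multiply to the array size, or change the array so that it has 10 elements.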

reshape is a very useful function: we can use it to give arrays whatever shape we need. It can increase or reduce the number of dimensions while preserving the total number of elements, as in Example-11, where the 1D array with 18 elements is reshaped into a 2x3x3 array that also has 18 elements.
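
As a small sketch of the rule above (the total number of elements must stay the same), the snippet below checks the sizes before reshaping and shows two possible ways around the error in Example-13. The names `target`, `fixed` and `inferred` are ours, and dropping the last element is only one possible fix, chosen for illustration.

```python
import numpy as np

arr3 = np.arange(11)   # 11 elements, as in Example-13
target = (2, 5)        # needs 2 * 5 = 10 elements

# A reshape can only succeed when the element counts match.
print(arr3.size == np.prod(target))   # False

# One possible fix (an assumption for illustration): drop the extra element.
fixed = arr3[:10].reshape(target)
print(fixed.shape)     # (2, 5)

# Passing -1 lets Numpy infer one dimension from the array size.
inferred = np.arange(1, 19).reshape(3, -1)
print(inferred.shape)  # (3, 6)
```

Checking `arr.size` against the product of the target shape before calling reshape is a cheap way to fail early with a clearer message in your own code.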

## Function 2 - np.allclose

allclose checks whether two arrays are equal or approximately equal. It compares the two arrays element by element and returns a single boolean: True if every pair of elements is equal within a tolerance, and False if at least one pair is not.

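Note that the default tolerances are very small positive numbers (rtol=1e-05 and atol=1e-08). The tolerance passed to np.allclose in the examples below is the relative tolerance rtol, which is the third parameter of the function.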

In [12]:
# Example-21: with 1D arrays
a = np.array([0.16,0.26,0.365])
b = np.array([0.15,0.25,0.36])
tolerance1 = 0.1
tolerance2 = 0.05
# the tolerance is passed as the relative tolerance rtol
print(np.allclose(a,b,rtol=tolerance1))
print(np.allclose(a,b,rtol=tolerance2))

True
False


Example-21 shows that the two arrays are considered close with a relative tolerance of 0.1 (or above), but not with a relative tolerance of 0.05.

In [14]:
# Example-22: with 2D array
arr1 = np.array([ [0.16,0.26,0.365],
         [0.20,0.30,0.40]])

arr2 = np.array([[0.14,0.27,0.37],
         [0.21,0.29,0.38]])

tolerance1 = 0.1
tolerance2 = 0.2

print(np.allclose(arr1,arr2,rtol=tolerance1))
print(np.allclose(arr1,arr2,rtol=tolerance2))

False
True


Example-22: because the comparison is done element by element, the function can also be applied to multi-dimensional arrays.

In [15]:
# Example-23: fails due to shape mismatch
arr1 = np.array([[0.16,0.26],
         [0.20,0.30]])

arr2 = np.array([[0.14,0.27,0.37],
         [0.21,0.29,0.38]])

tolerance = 0.1
print(np.allclose(arr1,arr2,tolerance))

ValueError: operands could not be broadcast together with shapes (2,2) (2,3) 

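Example-23 fails because the two arrays have different shapes, (2,2) and (2,3), which cannot be broadcast together. To fix it, compare arrays of the same (or broadcast-compatible) shape.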

allclose is a great way to check whether two arrays are approximately equal; implementing such a check by hand can be a bit tricky.
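
To make the tolerance check less of a black box, here is a minimal sketch of the documented comparison for finite values, `|a - b| <= atol + rtol * |b|`, written by hand and compared with np.allclose. The variable names `manual` and `strict` are ours; the data comes from Example-21.

```python
import numpy as np

a = np.array([0.16, 0.26, 0.365])
b = np.array([0.15, 0.25, 0.36])
rtol, atol = 0.1, 1e-08

# Element-wise test used for finite values: |a - b| <= atol + rtol * |b|
manual = bool(np.all(np.abs(a - b) <= atol + rtol * np.abs(b)))
print(manual, np.allclose(a, b, rtol=rtol, atol=atol))   # True True

# A tighter relative tolerance makes the first pair fail (0.01 > 0.05 * 0.15).
strict = np.allclose(a, b, rtol=0.05)
print(strict)   # False
```

Because the relative term is scaled by `b` only, the test is not symmetric: swapping the two arrays can, in edge cases, change the answer.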


## Function 3 - np.where


The [where](https://numpy.org/doc/stable/reference/generated/numpy.where.html) function selects elements based on a condition. Called with only a condition, it returns the index positions of the elements that satisfy it; called with two additional arguments, it picks elements from one or the other depending on the condition.

In [19]:
# Example-31: return the indices of the elements that satisfy the condition
b = np.arange(4*5).reshape(4,5)
# where b is greater than 14, return the index positions
print(b)
print(np.where(b>14))

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]
(array([3, 3, 3, 3, 3]), array([0, 1, 2, 3, 4]))


Example-31: here the function returns the index positions of the elements that satisfy the condition. In this example it returns the (row, column) indices of all elements greater than 14, as one array of row indices and one array of column indices.

In [20]:
# Example-32: with 2 additional arguments
y = np.array([[1,2],[3,4]])

print(np.where(y<3,y,y*3))

[[ 1  2]
 [ 9 12]]


Example-32: the first argument is the condition to test.
The second argument supplies the values used where the condition is satisfied.
The third argument supplies the values used where it is not.
Here, wherever an element of y is less than 3 the output is the element itself; otherwise the output is the element times 3.

In [22]:
# Example-33: error because one argument is missing
np.where(b>3,b)

ValueError: either both or neither of x and y should be given

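Example-33: the error occurs because the second and third arguments must either both be given or both be omitted. To fix it, add the missing third argument, for example np.where(b>3, b, 0), or call np.where(b>3) with the condition alone.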

The where function in Numpy is a vectorized version of the Python ternary expression x if condition else y.
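
The comparison with the ternary expression can be sketched as follows: the same rule is applied once with np.where and once with a plain Python loop, and the results are compared. The snippet also shows two possible repairs for Example-33. The names `vectorized`, `looped`, `kept`, `rows` and `cols` are ours, and filling with 0 is just an illustrative choice.

```python
import numpy as np

y = np.array([[1, 2], [3, 4]])

# Vectorized form: keep values below 3, triple the others.
vectorized = np.where(y < 3, y, y * 3)

# Same rule written with the Python ternary expression, element by element.
looped = np.array([[v if v < 3 else v * 3 for v in row] for row in y])
print(np.array_equal(vectorized, looped))   # True

b = np.arange(4 * 5).reshape(4, 5)

# Fix 1 for Example-33: give both value arguments (0 is an arbitrary filler).
kept = np.where(b > 3, b, 0)

# Fix 2: give the condition only, then use the indices to fetch the values.
rows, cols = np.where(b > 14)
print(b[rows, cols])   # [15 16 17 18 19]
```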

## Function 4 - np.clip


The [clip](https://numpy.org/doc/stable/reference/generated/numpy.clip.html) function keeps the values of an array within an interval defined by a lower and an upper limit.

In [24]:
# Example-41: 1D array 
x = np.array([1,0,3,2,4,9,8])

print(x.clip(3, 5))

[3 3 3 3 4 5 5]


Example-41: 3 and 5 are the lower and upper limits. All elements with values less than 3 become 3, and all elements with values greater than 5 become 5.

In [62]:
# Example-42: 2D array
t = np.random.randint(1, 16,(3, 4), dtype=int)
t1 = np.arange(12).reshape(3, 4)

print(t)
np.clip(t,5,10,t1)

[[ 9  3 15  6]
 [ 9 12  5 12]
 [15  2 11  1]]


array([[ 9,  5, 10,  6],
       [ 9, 10,  5, 10],
       [10,  5, 10,  5]])

Example-42: in the matrix t, numbers less than 5 are changed to 5, numbers greater than 10 are changed to 10, and numbers in [5,10] remain unchanged. The result is stored in t1, which is passed as the out argument; t itself is not modified.

In [26]:
# Example-43: error missing argument
p = np.array([[2,3,1],[4,5,6]])
np.clip(p,3)

TypeError: _clip_dispatcher() missing 1 required positional argument: 'a_max'

Example-43: the function breaks because the upper limit a_max is not specified; in the Numpy version used here, both a_min and a_max must be passed. To clip on one side only, pass None for the other limit, for example np.clip(p, 3, None).

With clip, all numbers less than min are replaced by min and all numbers greater than max are replaced by max, while numbers in [min, max] remain unchanged. If an out array is given to store the result, it must have the right shape to hold the output, otherwise an error is raised.
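
The rule above can be checked with a short sketch: clipping to [min, max] gives the same result as taking an element-wise maximum with the lower limit followed by an element-wise minimum with the upper limit. The names `clipped`, `manual` and `lower_only` are ours; the one-sided call shows one way to avoid the error in Example-43.

```python
import numpy as np

x = np.array([1, 0, 3, 2, 4, 9, 8])

clipped = np.clip(x, 3, 5)
# Equivalent by hand: raise values below 3, then lower values above 5.
manual = np.minimum(np.maximum(x, 3), 5)
print(np.array_equal(clipped, manual))   # True

# One-sided clipping: pass None for the limit that should not apply.
p = np.array([[2, 3, 1], [4, 5, 6]])
lower_only = np.clip(p, 3, None)
print(lower_only)
```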

## Function 5 -  np.einsum

The [einsum](https://numpy.org/doc/stable/reference/generated/numpy.einsum.html) function, which implements the Einstein summation convention, is one of the most useful functions of Numpy. Thanks to its expressive power, it can sometimes outperform the usual array functions in terms of speed and memory efficiency. The difficult part is that the notation takes a while to understand, and it may take a few attempts to apply it correctly to tricky problems.
A single einsum call can express operations otherwise done with functions such as np.multiply, np.sum, np.transpose and np.diag, which can help us do our job faster and more efficiently.
Here we give only a brief introduction, since einsum is a multi-purpose function. We discuss a few simple operations that are worth knowing when starting to use einsum.

In [28]:
# Example-51: find the array diagonal and the transpose using einsum
a = np.array([[1,2,3],[4,5,6],[7,8,9]])
print("Find array diagonal")
a_diag = np.einsum('ii->i', a)
print("option1\n",a_diag)
print('option2\n',np.einsum(a,[0,0],[0]))
a_transpose = np.einsum("ij->ji",a)
print("Array transpose")
print("option1\n",a_transpose)
print("option2\n",np.einsum(a,[1,0]))

Find array diagonal
option1
 [1 5 9]
option2
 [1 5 9]
Array transpose
option1
 [[1 4 7]
 [2 5 8]
 [3 6 9]]
option2
 [[1 4 7]
 [2 5 8]
 [3 6 9]]


Example-51: we used einsum to compute the array diagonal, which gives the same result as np.diag, and to transpose the array, which gives the same result as np.transpose. For each task we listed two equivalent ways of calling einsum: with a subscript string, and with lists of axis labels.
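
To get a feel for the subscript notation, the sketch below adds a few more one-operand expressions and compares each with the regular Numpy call it corresponds to. These extra expressions are our own illustration, not part of the examples above: an index that is missing from the output side is summed over.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# 'ii' with no output index: sum of the diagonal, like np.trace.
trace = np.einsum("ii", a)
print(trace, np.trace(a))          # 15 15

# 'ij->' drops both indices: sum of all elements, like a.sum().
total = np.einsum("ij->", a)
print(total, a.sum())              # 45 45

# 'ij->j' drops i: sum over rows, like a.sum(axis=0).
col_sums = np.einsum("ij->j", a)
print(col_sums, a.sum(axis=0))     # [12 15 18] [12 15 18]
```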

In [29]:
# Example-52: matrix*matrix and matrix*vector multiplication using einsum
a1 = [[1,2,3],[4,5,6],[7,8,9]]
a2 = np.array([[0.1,0.2,0.3],[4.0,0.5,6.1],[0.7,8.0,.9]])
b = [10,11,12]

# matrix*matrix
print("matrix multiplication")
mm_multiply = np.einsum("ij,jk",a1,a2)
print("option1\n",mm_multiply)
print("option2\n",np.einsum(a1,[0,1],a2,[1,2]))
# matrix*vector
print("matrix * vector multiplication")
mv_multi = np.einsum("ij,j",a2,b)
print("option1\n",mv_multi)
print('option2\n',np.einsum(a2,[0,1],b,[1]))


matrix multiplication
option1
 [[10.2 25.2 15.2]
 [24.6 51.3 37.1]
 [39.  77.4 59. ]]
option2
 [[10.2 25.2 15.2]
 [24.6 51.3 37.1]
 [39.  77.4 59. ]]
matrix * vector multiplication
option1
 [  6.8 118.7 105.8]
option2
 [  6.8 118.7 105.8]

Example-52: here einsum computes a matrix-matrix product and a matrix-vector product. Once again, np.matmul() gives the same results.

In [30]:
# Example-53: fails because the subscripts do not match the operand's dimensions
a3 = np.array([[0.1,0.2,0.3],[4.0,0.5,6.1],[0.7,8.0,.9]])
b = [10,11,12]
np.einsum("ij,jk",a3,b) 

ValueError: einstein sum subscripts string contains too many subscripts for operand 1

Example-53 fails because the arguments do not correspond to the requested operation: the subscripts "ij,jk" describe the multiplication of two 2D arrays (matrices), but the second operand passed is a 1D vector. To solve this, either pass a matrix as the second operand or use the subscripts "ij,j" for a matrix-vector product.

einsum is a little tricky to use, as errors can easily occur when writing the subscripts for the operation. Still, it is an interesting function that is good to know. The "ij,jk" expression can be read as having an array A of size i x j and an array B of size j x k; as usual for matrix multiplication, the result is a new array of size i x k.
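
The two fixes for Example-53 can be sketched as follows, with the results checked against the `@` operator. The names `mv` and `mm` are ours, and reshaping the vector into a 3x1 column matrix is only one way to supply a 2D second operand.

```python
import numpy as np

a3 = np.array([[0.1, 0.2, 0.3], [4.0, 0.5, 6.1], [0.7, 8.0, 0.9]])
b = np.array([10, 11, 12])

# Fix 1: describe b as a 1D operand, so (i x j) times (j) gives (i).
mv = np.einsum("ij,j->i", a3, b)
print(np.allclose(mv, a3 @ b))              # True

# Fix 2: turn b into a j x k matrix with k = 1, so "ij,jk" applies.
mm = np.einsum("ij,jk->ik", a3, b.reshape(3, 1))
print(mm.shape)                             # (3, 1)
print(np.allclose(mm[:, 0], mv))            # True
```

In both cases the number of letters given for each operand equals its number of dimensions, which is the condition that Example-53 violated.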


## Conclusion

Numpy is not only an efficient Python library for array computation but also a convenient one. With Numpy we get many vector and matrix operations for free, which can save unnecessary work. Because they are efficiently implemented, reading and writing multi-dimensional data from a file also takes less time. In this article we explored five basic and very useful Numpy array functions that will be helpful in future work. There are many more in the official NumPy documentation; try to understand how they work, and share what you learn to enhance our common knowledge.

## Future Work

- Additional work on the einsum function.

- Discuss the differences between the Numpy dot, matmul and multiply functions.


## Reference Links

Here are some links to the official NumPy documentation and to interesting articles about Numpy array functions:

* Numpy official tutorial: https://numpy.org/doc/stable/user/quickstart.html
* python-numerical-computing-with-numpy: https://jovian.ai/aakashns/python-numerical-computing-with-numpy
* 12 Amazing Pandas & NumPy Functions: https://towardsdatascience.com/12-amazing-pandas-numpy-functions-22e5671a45b8
* scipy.org: https://docs.scipy.org/doc/numpy-1.15.1/reference/index.html
* 5 NumPy Functions that you Should Know: https://medium.com/@sergioalves94/5-numpy-functions-that-you-should-know-49e9fdbf2b18

