# 5 underrated numpy array functions that you should know of.
### Numpy functions with real-world examples

Numpy stands for Numerical Python. It is a scientific computing package in Python (from numpy website). 

The Numpy library helps us create and manipulate n-dimensional arrays: reshaping, sorting, selecting and many more things.

Numpy is a very powerful library, and its array operations are typically much faster than the equivalent loops over Python's built-in lists.

In this notebook I will explain 5 underrated numpy routines (functions) with some real-world examples. 

The functions that I will be covering in this notebook are:
- np.reshape() - shape manipulation
- np.where() - searching operation
- np.stack() - shape manipulation
- np.argmax() - searching operation
- np.cumsum() - statistical operation

Import libraries

In [57]:
import numpy as np
import jovian

# List of functions explained 
- function1 = np.reshape() => for changing the shape of arrays
- function2 = np.where() => for SQL-like conditional operations
- function3 = np.stack() => for stacking multiple arrays => used in image processing
- function4 = np.argmax() => for finding out the indices of maximum values in an array
- function5 = np.cumsum() => for calculating the running total => used in finance and sales applications

Let us start!!!

# Function 1 - np.reshape

Suppose that we record the room temperature of our home once every six hours (i.e. 4 times a day).

Now, as in the picture below, let's say that the temperatures collected in a day are stored as alternating values (C = celsius, F = fahrenheit):
- C,F,C,F,C,F,C,F 

We would like to convert this list (a 1 dimensional array) into a 2 dimensional array of pairs, which is more intuitive and easier to understand.

![Temperatures](https://raw.githubusercontent.com/py404/zerotopandas/master/numpy-array-operations/temperatures.png)

To do that we use the reshape() function (1st, 3rd, 5th... values = celsius; 2nd, 4th, 6th... values = fahrenheit).

### Example 1

In [16]:
temperatures = [14, 57.2, 7, 44.6, 18, 66.4, 25, 77]

new_temperatures = np.reshape(temperatures, newshape=(4,2))
new_temperatures

array([[14. , 57.2],
       [ 7. , 44.6],
       [18. , 66.4],
       [25. , 77. ]])

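Now it is easy to understand our data and to extract the temperature pairs.

We have transformed our 1 dimensional array (list) into a 2 dimensional array using "newshape=(4,2)", i.e. 4 rows and 2 columns. As far as I know, newer NumPy releases (2.1 onwards) name this keyword "shape" and deprecate "newshape"; passing the shape as the second positional argument works in every version.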

In [17]:
print('Below are the temperatures recorded today:')
for c,f in new_temperatures:
    print(f'Celsius: {c}, Fahrenheit: {f}')

Below are the temperatures recorded today:
Celsius: 14.0, Fahrenheit: 57.2
Celsius: 7.0, Fahrenheit: 44.6
Celsius: 18.0, Fahrenheit: 66.4
Celsius: 25.0, Fahrenheit: 77.0


### Explanation about example
In the example above we have successfully transformed a 1-D array into a 2-D array which is meaningful and easy to understand. 

Now an end user can extract the array elements in pairs to do a quick analysis of temperatures recorded.

### Example 2

Let's say that we are collecting the time taken by five F1 racers to finish each lap (in minutes).

The data collected is in the format of list of lists.

In [18]:
lap_times = [
    [3.1, 2.73, 2.64, 3.05, 2.99],
    [2.71, 2.99, 2.94, 2.94, 2.58],
    [3.01, 3.73, 2.74, 3.86, 3.62]
]

## Now to analyse the overall times taken by all racers, or observe the trend of lap times of all racers, we would like to combine all into one big array
## In that case we transform our (3,5) (2-D) array into a 1-D array.

all_lap_times = np.reshape(lap_times, newshape=-1)
all_lap_times

array([3.1 , 2.73, 2.64, 3.05, 2.99, 2.71, 2.99, 2.94, 2.94, 2.58, 3.01,
       3.73, 2.74, 3.86, 3.62])

In [19]:
all_lap_times.shape

(15,)

### Printing the dimension of our new array

In [20]:
all_lap_times.ndim

1

In the above example, a 2 dimensional (3,5) array is converted into a 1 dimensional array using the reshape option "-1".

In [21]:
# Example 3 - breaking (to illustrate when it breaks)
np.reshape(lap_times, newshape=(2,15))

ValueError: cannot reshape array of size 15 into shape (2,15)

# Explanation about why reshape failed above

An important note about numpy's reshape: the new shape must hold exactly as many elements as the original array. For a 2-D array reshaped to (a,b), that means a * b = n_rows * n_columns.

For example, our original array is a (3,5) array, i.e. 3 rows and 5 columns, so it has 15 elements. The reshape operation always expects the product of the new shape to be 15, i.e. (a,b) => a*b = 15. The shape (2,15) that we passed above needs 2*15 = 30 elements, so it fails. Likewise, something like (4,4) fails because 4*4 = 16, which is not equal to 15 (original array elements).

Notice that in example 2 above, a (3,5) array is transformed into (15,). The number of elements in a (3,5) array is 15 and the number of elements in the output array is also 15. The other key takeaway here is that the shape of the array in the second example is (15,), which means that the output array is 1-D, not 2-D.
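
The size rule can be checked with a small sketch. The array `laps` and the helper `try_reshape` are hypothetical names made up for this illustration; the shape is passed positionally so that it does not depend on the keyword name.

```python
import numpy as np

# Hypothetical stand-in for the (3, 5) lap-time array: 15 elements in total.
laps = np.arange(15).reshape(3, 5)

def try_reshape(arr, shape):
    """Return the resulting shape, or None if NumPy rejects the reshape."""
    try:
        return np.reshape(arr, shape).shape
    except ValueError:
        return None

print(try_reshape(laps, (5, 3)))    # (5, 3)  since 5 * 3 == 15
print(try_reshape(laps, (15,)))     # (15,)   1-D with 15 elements
print(try_reshape(laps, (5, -1)))   # (5, 3)  -1 lets NumPy infer the missing size
print(try_reshape(laps, (2, 15)))   # None    2 * 15 == 30, not 15
print(try_reshape(laps, (4, 4)))    # None    4 * 4 == 16, not 15
```

Only one entry of the shape may be -1, because NumPy can infer just a single missing size from the total element count.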

### Some closing comments about when to use this function.

Numpy's reshape is a handier and more important function than you might think. In the two examples above we were able to convert a 1-D array into a 2-D array and vice versa.

# Function 2 - np.where()

If you have some experience in SQL, then Numpy's where is similar to the WHERE clause in SQL, except that it works a little differently. 

Numpy's where works like this:
- np.where(condition, output if the condition is true, output if the condition is false)

i.e. we pass a condition that determines our output. For each element, where() returns the first output if the condition is met (true), or the other output if the condition is not met (false).

For example: np.where(x > 10, A, B) => for each element of x that is greater than 10, the output takes its value from A; otherwise it takes its value from B.
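
A minimal sketch of both call styles, using a made-up array `x` (the values and labels are assumptions chosen only for illustration):

```python
import numpy as np

x = np.array([4, 12, 10, 25])   # hypothetical values

# Three-argument form: take "A" where the condition is True, "B" elsewhere.
labels = np.where(x > 10, "A", "B")
print(labels.tolist())   # ['B', 'A', 'B', 'A']

# The two outputs can also be numbers or arrays; here values above 10 are capped.
capped = np.where(x > 10, 10, x)
print(capped.tolist())   # [4, 10, 10, 10]

# One-argument form: returns a tuple of index arrays where the condition is True.
idx = np.where(x > 10)
print(idx[0].tolist())   # [1, 3]
```

The one-argument form is the one used in the cricket examples that follow, which is why their outputs are tuples of indices.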

#### The examples below that I mention are on cricket analytics
I am a big fan of cricket. I will use an example dataset containing the number of runs scored by Sachin Tendulkar and Rahul Dravid in matches that India played.

***Note: I do not own this dataset. I have downloaded this resource from this link: https://drive.google.com/file/d/1lBEeQ9iycLmQX9LkA8gq2Tm1Utd9sKu8/view?usp=sharing***

The dataset looks something like this:

![Cricket Scores](https://raw.githubusercontent.com/py404/zerotopandas/master/numpy-array-operations/cricket_scores.png)

### A .tsv file is similar to a CSV file but is delimited by a TAB instead of a COMMA (,) which is why the file extension is .tsv

What the columns mean:
- Sachin = number of runs scored by Sachin
- Dravid = number of runs scored by Dravid
- India  = total runs India had made in that match

In [177]:
# Before we perform our numpy operations let us load our dataset in a numpy array
data = np.loadtxt("https://raw.githubusercontent.com/py404/zerotopandas/master/numpy-array-operations/cricket_data.tsv", skiprows=1)
data.shape

(225, 4)

We have a total of 225 rows and 4 columns

In [41]:
# Example 1 - find number of centuries made by Sachin and Dravid

# Sachin - column 1 (0, 1, 2)
sachin_centuries = np.where(data[:, 1] >= 100)

# Dravid - column 2 (0, 1, 2)
dravid_centuries = np.where(data[:, 2] >= 100)

print(sachin_centuries) # output returned is the index of elements where sachin's score was >= 100
print(dravid_centuries) # output returned is the index of elements where dravid's score was >= 100

(array([  0,   4,  22,  30,  37,  44,  47,  56,  57,  64,  68,  94, 115,
       116, 134, 148, 159, 166, 175, 177, 181, 182, 204, 208, 210, 221],
      dtype=int64),)
(array([  5,  11,  21,  37,  60,  88, 205, 208], dtype=int64),)


Example 1 continuation:
- Notice that each output above is a tuple containing an array of indexes where runs are 100 or more (i.e. a century)
- To access the array we use output[0] since there is only one array inside the tuple

In [29]:
sachin_centuries[0]

array([  0,   4,  22,  30,  37,  44,  47,  56,  57,  64,  68,  94, 115,
       116, 134, 148, 159, 166, 175, 177, 181, 182, 204, 208, 210, 221],
      dtype=int64)

Example 1 continuation:
- To find total centuries, we use

In [42]:
print(f'Total centuries by Sachin: {sachin_centuries[0].shape[0]} out of {data.shape[0]} matches.')
print(f'Total centuries by Dravid: {dravid_centuries[0].shape[0]} out of {data.shape[0]} matches.')

Total centuries by Sachin: 26 out of 225 matches.
Total centuries by Dravid: 8 out of 225 matches.
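
If only the count is needed, the boolean condition can also be counted directly. This is a small sketch with made-up scores (not the real dataset), shown as an alternative and not as the approach used above:

```python
import numpy as np

# Hypothetical scores, not taken from the cricket dataset.
scores = np.array([100.0, 11.0, 8.0, 71.0, 143.0, 99.0])

is_century = scores >= 100                    # boolean mask, one entry per match
print(np.count_nonzero(is_century))           # 2
print(int(is_century.sum()))                  # 2, because True counts as 1
print(np.flatnonzero(is_century).tolist())    # [0, 4], same indices as np.where(...)[0]
```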


### Example 1 - a little variation to the example above (returning custom values based on a condition)

In [58]:
# return True if condition is met otherwise False
np.where(data[:,1] >= 100, True, False)

array([ True, False, False, False,  True, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False,  True, False, False, False, False,
       False, False, False,  True, False, False, False, False, False,
       False,  True, False, False, False, False, False, False,  True,
       False, False,  True, False, False, False, False, False, False,
       False, False,  True,  True, False, False, False, False, False,
       False,  True, False, False, False,  True, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False,  True, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False,  True,  True,
       False, False, False, False, False, False, False, False, False,
       False, False,

### Explanation about example
Using where function we were able to find out the number of centuries made by Sachin and Dravid with just a couple of lines of code (can be condensed into 1 line in fact). 

This is a simple use case to get output based on a simple "YES" or "NO" condition.

### Example 2 - combining multiple conditions

In [73]:
# Example 2 - find out if both sachin and dravid made centuries in a single match
both_centuries = np.where((data[:,1] >= 100) & (data[:,2] >= 100))

print(both_centuries) # returns indices where the condition met

print('\n')

# print(f'Sachin scored: {data[37][1]}, Dravid scored {data[37][2]} and India\'s total is {data[37][3]}')  # printing row using index 37
# print(f'Sachin scored: {data[208][1]}, Dravid scored {data[208][2]} and India\'s total is {data[208][3]}') # printing row using index 208

for i in both_centuries[0]:
    print(f'Sachin scored: {data[i][1]}, Dravid scored {data[i][2]} and India\'s total is {data[i][3]}')

print('\n')
print('Or may be print if both have scored fifty runs in a single match:')
print('-'*70)

fifties = np.where( ((data[:,1] >= 50) & (data[:,1] < 100)) & ((data[:,2] >= 50) & (data[:,2] < 100)))
print(fifties)

(array([ 37, 208], dtype=int64),)


Sachin scored: 186.0, Dravid scored 153.0 and India's total is 345.0
Sachin scored: 140.0, Dravid scored 104.0 and India's total is 301.0


Or may be print if both have scored fifty runs in a single match:
----------------------------------------------------------------------
(array([  7,  72,  84, 143, 163, 164, 195, 200], dtype=int64),)


In [72]:
for i in fifties[0]:
    print(f'Sachin scored: {data[i][1]}, Dravid scored {data[i][2]} and India\'s total is {data[i][3]}')

Sachin scored: 86.0, Dravid scored 74.0 and India's total is 288.0
Sachin scored: 99.0, Dravid scored 74.0 and India's total is 229.0
Sachin scored: 50.0, Dravid scored 62.0 and India's total is 256.0
Sachin scored: 93.0, Dravid scored 79.0 and India's total is 279.0
Sachin scored: 60.0, Dravid scored 57.0 and India's total is 309.0
Sachin scored: 68.0, Dravid scored 59.0 and India's total is 202.0
Sachin scored: 62.0, Dravid scored 56.0 and India's total is 276.0
Sachin scored: 99.0, Dravid scored 92.0 and India's total is 304.0


### Explanation about example
In the example above observe that I have combined multiple conditions in the where() function. 

The conditions are evaluated element-wise over the whole columns and then combined:
- find data points where sachin scored >= 50 and < 100
- find data points where dravid scored >= 50 and < 100
- where both statements above are true, get the index of that row

Notice how I wrapped each condition in brackets. Brackets are important if you pass multiple conditions to numpy's where function. I will explain why it is important below.

### Example 3 - breaking (to illustrate when it breaks)

Below I will try to ignore the brackets and execute the condition inside where function.

In [25]:
np.where(data[:,1] >= 100 & data[:,2] >= 100)

TypeError: ufunc 'bitwise_and' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Notice the error, "TypeError: ufunc 'bitwise_and' not supported for the input types". 

It comes from Python's operator precedence: "&" is the bitwise AND operator, and Python applies it before any comparison.

In our statement above, '&' has higher precedence than '>='. 

So Python groups the arguments next to '&' first, i.e. our condition:
- data[:,1] >= 100 & data[:,2] >= 100 is equal to **data[:,1] >= (100 & data[:,2]) >= 100**

The inner expression 100 & data[:,2] asks numpy for a bitwise AND between an integer and an array of floats, which is not supported, hence the TypeError.

This stackoverflow question explains it in simple terms: https://stackoverflow.com/questions/50656307/numpy-typeerror-ufunc-bitwise-and-not-supported-for-the-input-types-when-us

Notice the difference? The expression is trying to do a whole different operation than what we needed. 

In order to safely execute our condition set, we have to wrap each condition inside brackets. That way every comparison is evaluated first and the right output is extracted.

To fix that problem we do something like this:

- np.where( (condition 1) & (condition 2) )

Observe that each of the two conditions is wrapped in brackets on either side of the bitwise AND operator.
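
Here is a small sketch of the working and the failing form side by side. The arrays `sachin` and `dravid` are made-up float scores, and np.logical_and is shown as an alternative spelling that avoids the precedence question altogether:

```python
import numpy as np

# Hypothetical float scores (the real dataset is also loaded as floats).
sachin = np.array([186.0, 140.0, 99.0, 120.0])
dravid = np.array([153.0, 104.0, 92.0, 45.0])

# Each comparison in its own brackets: works as intended.
both = np.where((sachin >= 100) & (dravid >= 100))
print(both[0].tolist())       # [0, 1]

# Function form of the same element-wise AND; no operator precedence involved.
both_alt = np.where(np.logical_and(sachin >= 100, dravid >= 100))
print(both_alt[0].tolist())   # [0, 1]

# Without brackets Python evaluates 100 & dravid first, and bitwise AND
# is not defined for float arrays, so a TypeError is raised.
try:
    np.where(sachin >= 100 & dravid >= 100)
    failed = False
except TypeError as err:
    failed = True
    print("TypeError:", err)
```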

### Some closing comments about when to use this function

Numpy's where function is really handy if you're performing a search operation based on a specific set of conditions. 

You can combine many conditions and pass them to the where function to extract your output. It is a very useful function while doing some analytics.

# Function 3 - np.stack()
Numpy's stack function is used to combine multiple arrays into a single output array.

This might sound similar to concatenate, except that stack joins the arrays along a new axis, whose position we control with the axis parameter. Let us see how it works and how it is different from np.concatenate()

### Example 1 - working (basic example)

In [36]:
arr1 = [1,4,5]
arr2 = [2,9,6]

# by default the arrays are stacked along a new first axis (axis=0)
output = np.stack((arr1, arr2))
print(output)
print('\n')
print(f'Output array dimension: {output.ndim}')
print('\n')

[[1 4 5]
 [2 9 6]]


Output array dimension: 2




### Explanation about example
Numpy's stack is an array manipulation function. The documentation says that it joins a sequence of arrays along a new axis. 

In the example above, the two 1-D arrays are joined along a new first axis (axis=0), i.e. arr1 is stacked on top of arr2 and each input becomes one row of the 2-D output.
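
To make the contrast with np.concatenate() concrete, here is a short sketch using the same two arrays. It only illustrates the difference in output shape:

```python
import numpy as np

arr1 = np.array([1, 4, 5])
arr2 = np.array([2, 9, 6])

stacked = np.stack((arr1, arr2))        # joins along a NEW axis
joined = np.concatenate((arr1, arr2))   # joins along an EXISTING axis

print(stacked.shape, stacked.ndim)   # (2, 3) 2
print(joined.shape, joined.ndim)     # (6,) 1
print(joined.tolist())               # [1, 4, 5, 2, 9, 6]
```

So stack always adds one dimension to the inputs, while concatenate keeps the number of dimensions and makes one existing axis longer.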

### Example 2 - (using axis parameter)
Joining multidimensional arrays and using axis parameter.

In [66]:
arr1 = np.array([
    [1,4,5],
    [7,3,2]
])
print(f'arr1 dimension: {arr1.ndim}')

arr2 = np.array([
    [2,9,6],
    [0,5,7]
])
print(f'arr2 dimension: {arr2.ndim}')

print('\n')

output = np.stack((arr1, arr2), axis=0)

print(f'Stacked output array looks like: \n{output}')
print('\n')

print(f'Dimension of output array is: {output.ndim}')

arr1 dimension: 2
arr2 dimension: 2


Stacked output array looks like: 
[[[1 4 5]
  [7 3 2]]

 [[2 9 6]
  [0 5 7]]]


Dimension of output array is: 3


### Explanation
In the example above, axis=0 tells numpy's stack function to put the new axis first: the first array is stacked on top of the second array along the first dimension, so the two (2,3) inputs give an output of shape (2,2,3).

### Example 2 - continuation (using axis parameter)

In [68]:
arr1 = np.array([
    [1,4,5],
    [7,3,2]
])
print(f'arr1 dimension: {arr1.ndim}')

arr2 = np.array([
    [2,9,6],
    [0,5,7]
])
print(f'arr2 dimension: {arr2.ndim}')

print('\n')

output = np.stack((arr1, arr2), axis=1)

print(f'Stacked output array using 1 as axis looks like: \n{output}')
print('\n')

print(f'Dimension of output array is: {output.ndim}')

arr1 dimension: 2
arr2 dimension: 2


Stacked output array using 1 as axis looks like: 
[[[1 4 5]
  [2 9 6]]

 [[7 3 2]
  [0 5 7]]]


Dimension of output array is: 3


### Explanation
In the example above, axis=1 tells numpy's stack function to insert the new axis at position 1, so the arrays are joined row by row instead of as whole blocks. (The last dimension would be axis=2 or axis=-1.)

i.e. notice the output above, we have stacked first row of first array on top of first row of second array, then the second row of first array is stacked on top of second row of second array.

That is the difference between axis=0 and axis=1
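
The position of the new axis is easiest to see from the shapes and from where each input ends up. A minimal sketch with the same two arrays (the names `by0` and `by1` are only for this illustration):

```python
import numpy as np

arr1 = np.array([[1, 4, 5], [7, 3, 2]])
arr2 = np.array([[2, 9, 6], [0, 5, 7]])

# Collect the output shape for every valid position of the new axis.
shapes = {axis: np.stack((arr1, arr2), axis=axis).shape for axis in (0, 1, 2)}
print(shapes)   # {0: (2, 2, 3), 1: (2, 2, 3), 2: (2, 3, 2)}

by0 = np.stack((arr1, arr2), axis=0)
by1 = np.stack((arr1, arr2), axis=1)

# With axis=0 the second input sits at by0[1]; with axis=1 it sits at by1[:, 1].
print(np.array_equal(by0[1], arr2))      # True
print(np.array_equal(by1[:, 1], arr2))   # True
```

For axis=0 and axis=1 the shapes happen to be equal here because both inputs have 2 rows, but the elements are arranged differently.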

### Example 3 - breaking (to illustrate when it breaks)

In [75]:
arr1 = np.array([1,2,3])
print(arr1)
print('\n')

arr2 = np.array([
    [1,2,3],
    [4,5,6]
])
print(arr2)
print('\n')

output = np.stack((arr1, arr2))

[1 2 3]


[[1 2 3]
 [4 5 6]]




ValueError: all input arrays must have the same shape

Notice the error. It says that the arrays must have the same shape. Numpy's stack function checks the shape of every input array before joining them; here arr1 has shape (3,) while arr2 has shape (2,3).

It means that when using the stack function, we always have to pass arrays that are of the same shape.

To fix this we have to do something like below:

In [79]:
arr1 = np.array([
    [9,8,7],
    [6,5,4]
])
print(f'arr1 dimension: {arr1.ndim}')

arr2 = np.array([
    [1,2,3],
    [4,5,6]
])
print(f'arr2 dimension: {arr2.ndim}')

arr1 dimension: 2
arr2 dimension: 2
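
With two arrays of the same shape the stack call goes through. The sketch below also shows np.vstack as one possible alternative, assuming the real goal was to put the 1-D array on top of the 2-D array as an extra row:

```python
import numpy as np

arr1 = np.array([[9, 8, 7], [6, 5, 4]])
arr2 = np.array([[1, 2, 3], [4, 5, 6]])

# Same shape (2, 3) on both inputs, so np.stack succeeds.
output = np.stack((arr1, arr2))
print(output.shape)   # (2, 2, 3)

# Alternative: append a 1-D row to a 2-D array. np.vstack treats the
# 1-D input as a single row and joins along the existing first axis.
row = np.array([1, 2, 3])
extended = np.vstack((row, arr2))
print(extended.shape)   # (3, 3)
```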


### Some closing comments about when to use this function.

Numpy's stack function is extensively used in the image processing field. If you have some knowledge of pixels, images, or RGB in general, then this article will enlighten you about the image processing field. 

Link: http://www.degeneratestate.org/posts/2016/Oct/23/image-processing-with-numpy/
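
As a rough idea of how stacking shows up with images, the sketch below builds a tiny colour image from three separate channels. The 2x2 channel values are made up, and the (height, width, channel) layout is an assumption (it is a common convention, not something the linked article prescribes):

```python
import numpy as np

# Hypothetical 2x2 colour channels with 8-bit intensities (0-255).
red = np.array([[255, 0], [0, 255]], dtype=np.uint8)
green = np.array([[0, 255], [0, 255]], dtype=np.uint8)
blue = np.array([[0, 0], [255, 255]], dtype=np.uint8)

# axis=-1 puts the new "channel" axis last: (height, width, channel).
image = np.stack((red, green, blue), axis=-1)
print(image.shape)             # (2, 2, 3)
print(image[0, 0].tolist())    # [255, 0, 0] -> top-left pixel is pure red
print(image[1, 1].tolist())    # [255, 255, 255] -> bottom-right pixel is white
```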

# Function 4 - numpy.argmax
Numpy's argmax function returns the indices of the maximum values along an axis. 

I will demonstrate the use of argmax using the cricket dataset that I used for the second function (np.where()) and its examples.

### Example 1 - working

Let us say we would like to find the position of the maximum runs scored by Sachin or Dravid.

In [176]:
data = np.loadtxt("https://raw.githubusercontent.com/py404/zerotopandas/master/numpy-array-operations/cricket_data.tsv", skiprows=1)
print(f'Printing first 4 rows of the dataset: \n {data[:4,:]}')
print('\n')
print(f'Dimension of data: {data.ndim}')
print('\n')

max_position = np.argmax(data[:, 1], axis=0) # column 1 is sachin's scores
print(f'Max position: {max_position}')
print(f'Max position row: {data[max_position]} and Sachin scored: {data[max_position, 1]} runs.')

Printing first 4 rows of the dataset: 
 [[  0. 100.  78. 342.]
 [  1.  11.  62. 191.]
 [  2.   8.  85. 252.]
 [  3.  71.  24. 307.]]


Dimension of data: 2


Max position: 37
Max position row: [ 37. 186. 153. 345.] and Sachin scored: 186.0 runs.


### Explanation about example
In the example above I have extracted the index of the row where Sachin scored the maximum runs out of all the matches in our dataset. 

Similarly for Dravid:

In [113]:
max_position = np.argmax(data[:, 2], axis=0) # column 2 is dravid's scores
print(f'Max position: {max_position}')
print(f'Max position row: {data[max_position]} and Dravid scored: {data[max_position, 2]} runs.')

Max position: 37
Max position row: [ 37. 186. 153. 345.] and Dravid scored: 153.0 runs.


In [114]:
max_position = np.argmax(data[:, 3], axis=0) # column 3 is India's scores
print(f'Max position: {max_position}')
print(f'Max position row: {data[max_position]} and India scored: {data[max_position, 3]} runs.')

Max position: 71
Max position row: [ 71.  57.   7. 499.] and India scored: 499.0 runs.


An interesting finding is that Sachin's and Dravid's highest scores both came in the same match (row 37).

### Example 2 - finding out the indices of all maximum values of Sachin, Dravid and India in one go.

In [118]:
max_position = np.argmax(data[:, 1:], axis=0)
print(max_position)

[37 37 71]


### Explanation about example
Notice in the above example that instead of passing a single column, I have passed all the columns starting from 1 (excluding column 0, because column 0 is just the index of the row). With axis=0, argmax searches down each column separately.

The output is [37, 37, 71] which is exactly what we got above. 

Sachin's maximum score was at the row 37, Dravid's maximum score was at row 37 and India's maximum score was at row 71. 
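
Once the row positions are known, the maximum values themselves can be pulled out with the same indices. A minimal sketch on a made-up array with the same column layout (the numbers are hypothetical, not the real dataset):

```python
import numpy as np

# Hypothetical rows: [match index, Sachin, Dravid, India]
data = np.array([
    [0, 100, 78, 342],
    [1, 11, 62, 191],
    [2, 186, 153, 345],
    [3, 71, 24, 499],
], dtype=float)

scores = data[:, 1:]                 # drop the index column
rows = np.argmax(scores, axis=0)     # best row for each column
print(rows.tolist())                 # [2, 2, 3]

# Pair each best row with its own column to read the maximum values.
best = scores[rows, np.arange(scores.shape[1])]
print(best.tolist())                 # [186.0, 153.0, 499.0]

# Same values as asking for the maximum directly.
print(np.array_equal(best, scores.max(axis=0)))   # True
```

If a maximum occurs more than once in a column, argmax returns the position of the first occurrence.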

### Example 3 - breaking (to illustrate when it breaks)

In [146]:
np.argmax(data, axis=2)

AxisError: axis 2 is out of bounds for array of dimension 2

### Explanation about example (why it breaks and how to fix it)
In the above example the argmax function breaks because we passed an axis that doesn't exist, i.e. our cricket dataset is a 2-D array, so it only has two axes: axis=0 (running down the rows) and axis=1 (running across the columns).

While performing argmax please make sure that you are using correct array indexing and using right axis for your desired output.
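
A quick way to choose a valid axis is to look at ndim first. The sketch below uses a small made-up array to show what each axis choice returns:

```python
import numpy as np

arr = np.array([[3, 9, 1],
                [7, 2, 8]])

print(arr.ndim)   # 2, so the valid axes are 0 and 1

down_columns = np.argmax(arr, axis=0)   # one result per column
across_rows = np.argmax(arr, axis=1)    # one result per row
print(down_columns.tolist())   # [1, 0, 1]
print(across_rows.tolist())    # [1, 2]

# Without an axis, argmax works on the flattened array.
flat = int(np.argmax(arr))
print(flat)                    # 1

# np.unravel_index converts the flat position back to (row, column).
position = tuple(int(i) for i in np.unravel_index(flat, arr.shape))
print(position)                # (0, 1)
```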

### Some closing comments about when to use this function.

Numpy's argmax is a very useful function and is extensively used in a lot of use cases, be it finding out when a stock hit its maximum value, the hottest day in a year of temperature records, or the match with the maximum runs scored by a team in a tournament.

# Function 5 - np.cumsum() - cumulative sum 

Numpy's cumsum() function calculates the cumulative sum of the values in the array and produces a new output array.

The difference between sum() and cumsum() is as follows:
- suppose you have an array [1,4,3,5,7,9] the sum() function gives an output of 1+4+3+5+7+9 = 29
- but a cumsum() function calculates the same array differently and gives an output => [1, 5, 8, 13, 20, 29]

Did you get the difference? 

Both sum() and cumsum() give the final sum of the array elements (it is the last element of the cumsum() output), but cumsum() also keeps the "running total" at every position of the array.

For example, suppose that you are tracking the savings you have made every week for about a month.
- first week you saved 100 rupees, second week 150 rupees, third week 150 rupees, fourth week 200 rupees
- cumsum() gives the running total 
- i.e. first = 100
- second = first + 150 = 100 + 150 = 250
- third = second + 150 = 250 + 150 = 400
- fourth = third + 200 = 400 + 200 = 600
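
The two points above can be checked in a few lines. This sketch reuses the numbers from the text; np.diff with prepend=0 is added only to show that the running total can be turned back into the weekly amounts:

```python
import numpy as np

values = np.array([1, 4, 3, 5, 7, 9])
running = np.cumsum(values)

print(int(values.sum()))      # 29
print(running.tolist())       # [1, 5, 8, 13, 20, 29]
print(int(running[-1]))       # 29, the last running total equals sum()

# Weekly savings from the list above.
weekly = np.array([100, 150, 150, 200])
totals = np.cumsum(weekly)
print(totals.tolist())        # [100, 250, 400, 600]

# Differences between consecutive totals give back the weekly amounts.
recovered = np.diff(totals, prepend=0)
print(recovered.tolist())     # [100, 150, 150, 200]
```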

### Example 1 - working
Suppose we have been saving money for 8 weeks and the cumsum() or running total will be

In [150]:
savings = np.array([100, 150, 150, 200, 125, 100, 75, 130])
np.cumsum(savings)

array([ 100,  250,  400,  600,  725,  825,  900, 1030], dtype=int32)

### Explanation about example
Notice the output above, the cumsum() function calculated running total like this:
- 100
- 250 (100 + 150)
- 400 (100 + 150 + 150)
- 600 (100 + 150 + 150 + 200)
- 725 (100 + 150 + 150 + 200 + 125)
- 825 (100 + 150 + 150 + 200 + 125 + 100)
- 900 (100 + 150 + 150 + 200 + 125 + 100 + 75)
- 1030 (100 + 150 + 150 + 200 + 125 + 100 + 75 + 130)

After 8 weeks our savings is 1030 rupees.

### Example 2 - cumsum() with 2 dimensional array

In [152]:
arr = np.array([
    [1,5,3],
    [6,4,9]
])

np.cumsum(arr)

array([ 1,  6,  9, 15, 19, 28], dtype=int32)

### Explanation about example
Notice that even with a 2-D array the cumsum() function has given a 1-D output array. 

When no axis is given, cumsum() flattens the array row by row and then accumulates, i.e. 1, 5, 3, 6, 4, 9 => 1, 6, 9, 15, 19, 28

### Example 2 - continuation - doing cumsum() using axis parameter 

In [155]:
arr = np.array([
    [1,5,3],
    [6,4,9]
])

np.cumsum(arr, axis=1)

array([[ 1,  6,  9],
       [ 6, 10, 19]], dtype=int32)

Notice the output above, we no longer receive a 1-D array output. With axis=1 the cumsum() function does the cumulative sum along each row separately: first across the first row and then across the second row.

i.e. it first calculates cumulative sum of [1,5,3] => [1, 6 (1+5), 9 (6+3)], then the next [6, 4, 9] => [6, 10 (6+4), 19 (10+9)]
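
For comparison, here is a short sketch of all three axis choices on the same array, including axis=0, which accumulates down each column:

```python
import numpy as np

arr = np.array([
    [1, 5, 3],
    [6, 4, 9]
])

flat_total = np.cumsum(arr)            # no axis: flatten first, then accumulate
down_cols = np.cumsum(arr, axis=0)     # accumulate down each column
along_rows = np.cumsum(arr, axis=1)    # accumulate across each row

print(flat_total.tolist())   # [1, 6, 9, 15, 19, 28]
print(down_cols.tolist())    # [[1, 5, 3], [7, 9, 12]]
print(along_rows.tolist())   # [[1, 6, 9], [6, 10, 19]]
```

With an axis given, the output keeps the shape of the input array.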

### Example 3 - breaking (to illustrate when it breaks)

In [173]:
arr = np.array([
    [[1,2,4]], 
    [[1,3], [4,7]]
], dtype=object)

np.cumsum(arr, axis=1)

AxisError: axis 1 is out of bounds for array of dimension 1

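The nested lists above have different lengths, so numpy cannot build a regular multi-dimensional array from them. Instead it creates a 1-D array holding 2 Python objects, which only has axis 0, so asking for axis=1 raises the AxisError.

When doing a cumulative sum on a multi-dimensional numpy array we always have to make sure that the axis we pass actually exists in the array (checking arr.ndim and arr.shape first helps).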
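
One way to handle rows of different lengths is to accumulate each row on its own. The data below is made up for illustration, and this is only a sketch of one option, not the only fix:

```python
import numpy as np

# Hypothetical ragged data: the rows have different lengths.
rows = [[1, 2, 4], [1, 3, 4, 7]]

# As an object array this is 1-D, so only axis 0 exists.
ragged = np.array(rows, dtype=object)
print(ragged.ndim, ragged.shape)   # 1 (2,)

# Accumulate every row separately instead of passing axis=1.
per_row = [np.cumsum(r).tolist() for r in rows]
print(per_row)   # [[1, 3, 7], [1, 4, 8, 15]]

# For a regular array, check that the axis exists before using it.
arr = np.array([[1, 5, 3], [6, 4, 9]])
axis = 1
if axis < arr.ndim:
    print(np.cumsum(arr, axis=axis).tolist())   # [[1, 6, 9], [6, 10, 19]]
```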

### Some closing comments about when to use this function.
Numpy's cumsum() function is helpful for finance-related datasets. For example, in a real-world scenario you might compute the cumulative sales a particular company made in a year, or the cumulative number of Netflix subscribers in a year, and many more.

## Conclusion

In this notebook we have learnt how to do array shape manipulation, search array elements using where() function, and even some statistical use cases. 

Numpy is a very powerful and useful tool that is used by many people around the world. If you follow some real case studies of Numpy you will be amazed to know how powerful the library is. A fun fact: Numpy was used in the computational imaging behind the first ever real picture of a black hole. That is so amazing.

## Reference Links
References and other interesting articles about Numpy arrays:
* Numpy official tutorial : https://numpy.org/doc/stable/user/quickstart.html
* https://stackoverflow.com/questions/50656307/numpy-typeerror-ufunc-bitwise-and-not-supported-for-the-input-types-when-us
* http://www.degeneratestate.org/posts/2016/Oct/23/image-processing-with-numpy/
* Case studies: go to https://numpy.org/ and scroll down to "CASE STUDIES" section. 
