<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Maximum-(or-minimum)-value-over-the-entire-matrix" data-toc-modified-id="Maximum-(or-minimum)-value-over-the-entire-matrix-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Maximum (or minimum) value over the entire matrix</a></span></li><li><span><a href="#Minimum-(or-maximum)-value-in-every-row/column" data-toc-modified-id="Minimum-(or-maximum)-value-in-every-row/column-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Minimum (or maximum) value in every row/column</a></span></li><li><span><a href="#The-average-value-of-every-column-or-row" data-toc-modified-id="The-average-value-of-every-column-or-row-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>The average value of every column or row</a></span></li><li><span><a href="#Sort-the-rows,-so-at-the-end-every-column-goes-from-low-to-high" data-toc-modified-id="Sort-the-rows,-so-at-the-end-every-column-goes-from-low-to-high-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Sort the rows, so at the end every column goes from low to high</a></span></li><li><span><a href="#Calculate-the-(cumulative)-sum-going-across-each-column" data-toc-modified-id="Calculate-the-(cumulative)-sum-going-across-each-column-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Calculate the (cumulative) sum going across each column</a></span></li></ul></div>

>All content is released under Creative Commons Attribution [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/) and all source code is released under a [BSD-3 clause license](https://en.wikipedia.org/wiki/BSD_licenses). Parts of these materials were inspired by https://github.com/engineersCode/EngComp/ (CC-BY 4.0), L.A. Barba, N.C. Clementi.
>
>Please reuse, remix, revise, and reshare this content in any way, keeping this notice.
>
><img style="float: right;" width="150px" src="images/jupyter-logo.png">**Are you viewing this on jupyter.org?** Then this notebook will be read-only. <br>
>See how you can interactively run the code in this notebook by visiting our [instruction page about Notebooks](https://yint.org/notebooks). 

# Functions on the rows or columns of a matrix

In the [prior notebook](./) you learned about elementwise operations. In other words, NumPy performed the mathematical calculation on every element (entry) in the array.

Sometimes we need calculations that work on every row, or every column, of an array. For example:
1. Find the maximum value in the entire array (over all rows and all columns)
2. Calculate the minimum value in every row (give back a column vector that has the minimum value of every row)
3. Calculate the average value of every column (give back a row vector that has the average value of every column)
4. Sort the rows, so at the end every column goes from low to high
5. Show the (cumulative) sum going across each column

In this notebook we will talk about matrices, but the operations can be applied to multi-dimensional arrays, with 3, 4, or more dimensions.

We also introduce the important term ***``axis``***, which you regularly see in the NumPy documentation.

## Maximum (or minimum) value over the entire matrix

You have just received all the data in your matrix, and now you wish to find the largest, or smallest value.


In [1]:
import numpy as np
rnd = np.array([[ 7, 3, 11, 12, 2], [10, 13, 8, 8, 2], [3, 13, 6, 2, 3], [5, 3, 9, 2, 6]])
print('The matrix is:\n {}'.format(rnd))

max_value = np.amax(rnd)
print('The maximum value is {}'.format(max_value))

min_value = np.amin(rnd)
print('The minimum value is {}'.format(min_value))

The matrix is:
 [[ 7  3 11 12  2]
 [10 13  8  8  2]
 [ 3 13  6  2  3]
 [ 5  3  9  2  6]]
The maximum value is 13
The minimum value is 2


The ``np.amax(...)`` and ``np.amin(...)`` functions, when used as above, work over the entire array: all dimensions, looking at every element.

### Enrichment:

Conceptually, the NumPy library unfolds, or flattens, the array into a single long vector. Take a look at what that looks like when you use the ``.flatten(...)`` method on the array: ``rnd.flatten()``. It works row by row: from column to column along the first row, then on to the next row:

```python
print(rnd.flatten())
```
``[ 7  3 11 12  2 10 13  8  8  2  3 13  6  2  3  5  3  9  2  6]``

For an array created as above, this is also the order in which the data is stored in the computer's memory by default (row-major order).
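
As a small check of the row-by-row order, the sketch below compares the default flattening with the column-by-column alternative. The ``order='F'`` input is an addition for illustration; it is not used elsewhere in this notebook.

```python
import numpy as np

rnd = np.array([[ 7, 3, 11, 12, 2], [10, 13, 8, 8, 2], [3, 13, 6, 2, 3], [5, 3, 9, 2, 6]])

# Default (order='C'): walk along each row, then move on to the next row
row_by_row = rnd.flatten()

# order='F': walk down each column, then move on to the next column
col_by_col = rnd.flatten(order='F')

print('First 5 entries, row by row:    {}'.format(row_by_row[:5]))  # the first row
print('First 4 entries, column by column: {}'.format(col_by_col[:4]))  # the first column
```

Both results contain the same 20 values; only the order in which the matrix is walked differs.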

The reason we point out the ``.flatten(...)`` method is that knowing what the maximum value is, is sometimes only half the work. The other half is knowing *where* that maximum value is. For that we have the ``np.argmax(...)`` function.

Try this code:

```python
max_position = np.argmax(rnd)
print('The maximum value is in position {} of the flattened array'.format(max_position))
```

Verify that this is actually the case, using the space below:

In [2]:
# Copy the above code here and run it.
# In which position is the maximum value?
# And the minimum value?
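
A position in the flattened array can be converted back to a row and a column. One possible way, sketched here, uses ``np.unravel_index``, a NumPy function that is not introduced elsewhere in this notebook.

```python
import numpy as np

rnd = np.array([[ 7, 3, 11, 12, 2], [10, 13, 8, 8, 2], [3, 13, 6, 2, 3], [5, 3, 9, 2, 6]])

# Position of the (first) maximum in the flattened array
flat_position = np.argmax(rnd)

# Convert that flat position into a (row, column) pair for this array's shape
row, col = np.unravel_index(flat_position, rnd.shape)

print('Flat position {} corresponds to row {}, column {}'.format(flat_position, row, col))
print('The value stored there is {}'.format(rnd[row, col]))
```

Remember that rows and columns are counted from 0, just like the flat position.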



## Minimum (or maximum) value in every row/column

Above we found the minimum or maximum in the entire matrix. But what if we wanted that extreme value given per row, or per column?

Think of a matrix containing the daily temperatures per city; one city per column, and every row is a day of the year. 
* What is the max or min temperature for each city (per column)?
* What is the max or min temperature each day for all cities (per row)?

For this we also use the ``np.amax(matrix, axis=...)`` or ``np.amin(matrix, axis=...)`` function.

You must specify, as a second input, along which ***``axis``*** you want that extreme value to be calculated. 
* Axis 0 is the first axis: it runs down the rows, from top to bottom. Using ``axis=0`` gives one result per column.
* Axis 1 is the next axis: it runs across the columns, from left to right. Using ``axis=1`` gives one result per row.

See the code below.

In [10]:
import numpy as np
temps = np.array([[7, 9, 12, 10], [1, 4, 5, 2], [-3, 1, -2, -3], [-2, -1, -2, -2], [-3, -1, -2, -4]])
print('The temperatures are given one column per city, each row is a daily average:\n {}'.format(temps))

max_value_0 = np.amax(temps, axis=0)
print('The maximum value along axis 0 (row-wise, per city for all days) is {}'.format(max_value_0))

max_value_1 = np.amax(temps, axis=1)
print('The maximum value along axis 1 (column-wise, per day for all cities) is {}'.format(max_value_1))

# Notice the above output is 'flattened' and returned as a row, 
# instead of a column, as you might hope for. We can use the `keepdims` input though:
max_value_1_col = np.amax(temps, axis=1, keepdims=True)
print('The maximum value along axis 1 (column-wise, per day for all cities) is\n{}'.format(max_value_1_col))



The temperatures are given one column per city, each row is a daily average:
 [[ 7  9 12 10]
 [ 1  4  5  2]
 [-3  1 -2 -3]
 [-2 -1 -2 -2]
 [-3 -1 -2 -4]]
The maximum value along axis 0 (row-wise, per city for all days) is [ 7  9 12 10]
The maximum value along axis 1 (column-wise, per day for all cities) is [12  5  1 -1 -1]
The maximum value along axis 1 (column-wise, per day for all cities) is
[[12]
 [ 5]
 [ 1]
 [-1]
 [-1]]


You can visually verify that the maximum values returned are what you expected.
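
Another way to check which axis was collapsed is to look at the shape of each result. The sketch below reuses the ``temps`` matrix from above; the variable names are only for illustration.

```python
import numpy as np

temps = np.array([[7, 9, 12, 10], [1, 4, 5, 2], [-3, 1, -2, -3], [-2, -1, -2, -2], [-3, -1, -2, -4]])
print('Shape of the matrix: {}'.format(temps.shape))  # 5 rows (days), 4 columns (cities)

# axis=0 collapses the rows: one value is left per column (city)
shape_axis_0 = np.amax(temps, axis=0).shape

# axis=1 collapses the columns: one value is left per row (day)
shape_axis_1 = np.amax(temps, axis=1).shape

# keepdims=True keeps the collapsed axis, with length 1, so the result is a column
shape_keepdims = np.amax(temps, axis=1, keepdims=True).shape

print('axis=0: {}, axis=1: {}, axis=1 with keepdims: {}'.format(
    shape_axis_0, shape_axis_1, shape_keepdims))
```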

Now try it below for the minimum values:

In [4]:
# Give the minimum temperature for all cities
# Print the minimum temperature for all days for every city


### Enrichment:

Many functions in NumPy take ***``axis``*** as an input argument, including the ``np.argmin(...)`` and ``np.argmax(...)`` functions you saw above. 

Try this in the code block above:
```python
min_position_0 = np.argmin(temps, axis=0)
print('The minimum temperature for each city occurred in position {} of each column'.format(min_position_0))
```

What position value is returned if there is more than one entry of the same minimum value (see column 3, for example, which has ``12, 5, -2, -2, -2``)?
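
If you want to check your answer to the question above, here is a minimal sketch. It relies on the documented behaviour that ``np.argmin`` reports the first occurrence when a minimum is repeated; the use of ``np.flatnonzero`` to list every tied position is an addition not covered in this notebook.

```python
import numpy as np

temps = np.array([[7, 9, 12, 10], [1, 4, 5, 2], [-3, 1, -2, -3], [-2, -1, -2, -2], [-3, -1, -2, -4]])

# The third column (index 2) contains 12, 5, -2, -2, -2
third_column = temps[:, 2]

# argmin gives a single position: the first one where the minimum occurs
first_position = np.argmin(third_column)
print('First position of the minimum: {}'.format(first_position))

# To see every position that shares the minimum, compare against the minimum
all_positions = np.flatnonzero(third_column == third_column.min())
print('All positions of the minimum: {}'.format(all_positions))
```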

## The average value of every column or row

Just like with the minimum or maximum value in the part above, you can expect to calculate averages per row and per column.

In [13]:
import numpy as np
temps = np.array([[7, 9, 12, 10], [1, 4, 5, 2], [-3, 1, -2, -3], [-2, -1, -2, -2], [-3, -1, -2, -4]])
print('The temperatures are given one column per city, each row is a daily average:\n {}'.format(temps))

mean_value_0 = np.mean(temps, axis=0)
print('The average value along axis 0 (row-wise, per city, over all days) is {}'.format(mean_value_0))

mean_value_1 = np.mean(temps, axis=1, keepdims=True) # <-- notice the extra input
print('The average value along axis 1 (column-wise, per day, over all cities) is:\n{}'.format(mean_value_1))

The temperatures are given one column per city, each row is a daily average:
 [[ 7  9 12 10]
 [ 1  4  5  2]
 [-3  1 -2 -3]
 [-2 -1 -2 -2]
 [-3 -1 -2 -4]]
The average value along axis 0 (row-wise, per city, over all days) is [0.  2.4 2.2 0.6]
The average value along axis 1 (column-wise, per day, over all cities) is:
[[ 9.5 ]
 [ 3.  ]
 [-1.75]
 [-1.75]
 [-2.5 ]]
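
One reason the ``keepdims`` input is useful: the column of averages lines up with the rows of the original matrix, so it can be subtracted directly. The sketch below is an illustrative example (it is not part of the original notebook) that expresses every temperature relative to that day's average over all cities.

```python
import numpy as np

temps = np.array([[7, 9, 12, 10], [1, 4, 5, 2], [-3, 1, -2, -3], [-2, -1, -2, -2], [-3, -1, -2, -4]])

# One average per day, kept as a column with shape (5, 1)
mean_per_day = np.mean(temps, axis=1, keepdims=True)

# The (5, 1) column is stretched across the 4 columns of the (5, 4) matrix.
# Without keepdims the averages have shape (5,), which does not line up with (5, 4).
deviation = temps - mean_per_day
print('Temperature relative to the daily average:\n{}'.format(deviation))

# Every row of the result now has an average of (essentially) zero
print(np.mean(deviation, axis=1))
```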


## Sort the rows, so at the end every column goes from low to high

In [25]:
import numpy as np
temps = np.array([[7, 9, 12, 10], [1, 4, 5, 2], [-3, 1, -2, -3], [-2, -1, -2, -2], [-3, -1, -2, -4]])
print('The temperatures are given one column per city, each row is a daily average:\n {}'.format(temps))

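# Sort in place, down each column (axis=0). Every column is sorted
# independently, so a row no longer holds the values of a single day.
temps.sort(axis=0)
print(temps)
# After sorting, the largest value of every column is in the last row.
# (The result of this line is not stored or printed.)
temps.argmax(axis=0)

# Sum across the columns of the sorted matrix: one total per row
print(temps.sum(axis=1))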

The temperatures are given one column per city, each row is a daily average:
 [[ 7  9 12 10]
 [ 1  4  5  2]
 [-3  1 -2 -3]
 [-2 -1 -2 -2]
 [-3 -1 -2 -4]]
[[-3 -1 -2 -4]
 [-3 -1 -2 -3]
 [-2  1 -2 -2]
 [ 1  4  5  2]
 [ 7  9 12 10]]
[-10  -9  -5  12  38]
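
Note that ``temps.sort(axis=0)`` changes ``temps`` itself. The sketch below shows two alternatives that are not used in the code above: ``np.sort``, which returns a sorted copy, and ``np.argsort``, which can reorder whole rows so that the values of each day stay together. Sorting by the first city is only an illustrative choice.

```python
import numpy as np

temps = np.array([[7, 9, 12, 10], [1, 4, 5, 2], [-3, 1, -2, -3], [-2, -1, -2, -2], [-3, -1, -2, -4]])

# np.sort returns a sorted copy; the original matrix is left untouched
sorted_copy = np.sort(temps, axis=0)
print('Every column sorted independently:\n{}'.format(sorted_copy))
print('The original is unchanged:\n{}'.format(temps))

# argsort gives the row order that sorts the first column from low to high;
# kind='stable' keeps tied rows in their original order
row_order = np.argsort(temps[:, 0], kind='stable')

# Indexing with that order moves complete rows
rows_by_first_city = temps[row_order]
print('Whole rows ordered by the first city:\n{}'.format(rows_by_first_city))
```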


## Calculate the (cumulative) sum going across each column


* sort
* cumulative sum
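
This section does not yet have a worked example. As a minimal sketch, assuming ``np.cumsum`` is the function intended here, the code below calculates the running total along each axis of the ``temps`` matrix used earlier.

```python
import numpy as np

temps = np.array([[7, 9, 12, 10], [1, 4, 5, 2], [-3, 1, -2, -3], [-2, -1, -2, -2], [-3, -1, -2, -4]])

# Running total down each column (axis=0): one running sum per city, over the days
cumsum_0 = np.cumsum(temps, axis=0)
print('Cumulative sum along axis 0:\n{}'.format(cumsum_0))

# Running total across each row (axis=1): one running sum per day, over the cities
cumsum_1 = np.cumsum(temps, axis=1)
print('Cumulative sum along axis 1:\n{}'.format(cumsum_1))

# The last entry of a cumulative sum equals the ordinary sum along that axis
print(np.sum(temps, axis=0))
```

Unlike ``np.amax`` or ``np.mean``, the result has the same shape as the input, because every entry is replaced by the total up to that point.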
