# Introduction to Python for Open Source Geocomputation

![python](pics/python-logo-master-v3-TM.png)

* Instructor: Dr. Wei Kang

Content:

* Numpy
* A new data type: `numpy.array`
    * How to create an array
    * Array operations

# What is Numpy?

* The fundamental package for scientific computing with Python
* Nearly every scientist working in Python draws on the power of NumPy.
* NumPy brings the **computational power** of languages like C and Fortran to Python, a language much easier to learn and use. With this power comes **simplicity: a solution in NumPy is often clear and elegant**.
* Essential in many different realms:
    * NumPy lies at the core of a rich ecosystem of **data science** libraries 
<img src="pics/ds-landscape.png" width="500"/>
    * NumPy forms the basis of powerful **machine learning** libraries like [scikit-learn](https://scikit-learn.org/stable/), [SciPy](https://scipy.org/), [TensorFlow](https://www.tensorflow.org/), and [PyTorch](https://pytorch.org/)

    * NumPy is an essential component in the burgeoning Python **visualization landscape**, which includes Matplotlib, Seaborn, Plotly, Altair, Bokeh, Holoviz, Vispy, Napari, and PyVista, to name a few.


## What makes Numpy so important?

*arrays*: A very powerful data type essential to numerical computing: 
* sequences of data all of the _same type_
* behave a lot like lists, except for the constraint in the type of their elements.
    * Knowing that **all elements of a sequence are of the same type** brings a large efficiency advantage, so equivalent operations on arrays usually execute much **faster** than those on lists.

## Numpy `Array`  (or `ndarray`)

* homogeneous multidimensional array
    * a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers 
        * For the data types accepted in Numpy, read the [docs: Data type objects](https://numpy.org/doc/stable/reference/arrays.dtypes.html).
    * dimensions are called _axes_
* An Example: points' coordinates
    * one single point: one-dimensional array: `np.array([1,2])`
    * two or more points: two-dimensional array: 
        * two points: `np.array([[1,2], [3,4]])`
        * five points: `np.array([[1,2], [3,4],[5,6], [7,8], [9,10]])`

In [1]:
import numpy as np



In [2]:
a1 = np.array([1,2])
a1

array([1, 2])

In [3]:
a2 = np.array([[1,2], [3,4],[5,6], [7,8], [9,10]])
a2

array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]])

### Motivation (1): What can a Numpy array be used for?

* An array can contain:
     * values of an experiment/simulation at discrete time steps, e.g., income, air pollution, crime rate, animal/plant occurrence
     * pixels of an image, grey-level or colour
     * signal recorded by a measurement device, e.g. sound wave
     * 3-D data measured at different X-Y-Z positions, e.g. MRI scan, digital elevation model

### Motivation (2): Efficiency of Numpy array - an example

* Problem description: Write a Python program that calculates the square of each number in a list, such that $x_i=i^2$, for $0\leq i < n$.

Two data types: 
* Python built-in type: list
* Numpy array

We use [`%timeit`](https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-timeit) to measure the execution time of a Python statement or expression.

In [4]:
L = list(range(1000)) #produce a list of integers from 0 to 999

In [5]:
L

[0,
 1,
 2,
 3,
 4,
 5,
 6,
 7,
 8,
 9,
 10,
 11,
 12,
 13,
 14,
 15,
 16,
 17,
 18,
 19,
 20,
 21,
 22,
 23,
 24,
 25,
 26,
 27,
 28,
 29,
 30,
 31,
 32,
 33,
 34,
 35,
 36,
 37,
 38,
 39,
 40,
 41,
 42,
 43,
 44,
 45,
 46,
 47,
 48,
 49,
 50,
 51,
 52,
 53,
 54,
 55,
 56,
 57,
 58,
 59,
 60,
 61,
 62,
 63,
 64,
 65,
 66,
 67,
 68,
 69,
 70,
 71,
 72,
 73,
 74,
 75,
 76,
 77,
 78,
 79,
 80,
 81,
 82,
 83,
 84,
 85,
 86,
 87,
 88,
 89,
 90,
 91,
 92,
 93,
 94,
 95,
 96,
 97,
 98,
 99,
 100,
 101,
 102,
 103,
 104,
 105,
 106,
 107,
 108,
 109,
 110,
 111,
 112,
 113,
 114,
 115,
 116,
 117,
 118,
 119,
 120,
 121,
 122,
 123,
 124,
 125,
 126,
 127,
 128,
 129,
 130,
 131,
 132,
 133,
 134,
 135,
 136,
 137,
 138,
 139,
 140,
 141,
 142,
 143,
 144,
 145,
 146,
 147,
 148,
 149,
 150,
 151,
 152,
 153,
 154,
 155,
 156,
 157,
 158,
 159,
 160,
 161,
 162,
 163,
 164,
 165,
 166,
 167,
 168,
 169,
 170,
 171,
 172,
 173,
 174,
 175,
 176,
 177,
 178,
 179,
 180,
 181,
 182,
 183,
 184,


In [7]:
L = list(range(1000))
for i in range(len(L)):
    L[i] = L[i] **2

Timing the same computation with a list comprehension:

In [8]:
%timeit -n 1000 [i**2 for i in L]  

187 µs ± 2.54 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [9]:
import numpy as np
a = np.arange(1000) #produce an array of integers from 0 to 999
# a

In [10]:
%timeit -n 1000 a**2

1.04 µs ± 248 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
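Outside IPython, where the `%timeit` magic is not available, a similar comparison can be sketched with the standard-library `timeit` module. The names `values`, `arr`, `t_list`, and `t_arr` are illustrative, and the measured times depend on the machine, so no particular numbers should be expected.

```python
import timeit

import numpy as np

n = 1000
values = list(range(n))   # plain Python list of 0..n-1
arr = np.arange(n)        # NumPy array of 0..n-1

# Both approaches compute the same squares.
squares_list = [v ** 2 for v in values]
squares_arr = arr ** 2

# timeit.timeit returns the total time in seconds for `number` runs.
runs = 1000
t_list = timeit.timeit(lambda: [v ** 2 for v in values], number=runs)
t_arr = timeit.timeit(lambda: arr ** 2, number=runs)

print(f"list comprehension: {t_list / runs * 1e6:.2f} µs per loop")
print(f"NumPy array:        {t_arr / runs * 1e6:.2f} µs per loop")
```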


# Importing Numpy

```python
import numpy as np
```

In [11]:
import numpy as np

In [12]:
dir(np) #function dir gives you the package's attributes and functions.

['ALLOW_THREADS',
 'AxisError',
 'BUFSIZE',
 'CLIP',
 'DataSource',
 'ERR_CALL',
 'ERR_DEFAULT',
 'ERR_IGNORE',
 'ERR_LOG',
 'ERR_PRINT',
 'ERR_RAISE',
 'ERR_WARN',
 'FLOATING_POINT_SUPPORT',
 'FPE_DIVIDEBYZERO',
 'FPE_INVALID',
 'FPE_OVERFLOW',
 'FPE_UNDERFLOW',
 'False_',
 'Inf',
 'Infinity',
 'MAXDIMS',
 'MAY_SHARE_BOUNDS',
 'MAY_SHARE_EXACT',
 'NAN',
 'NINF',
 'NZERO',
 'NaN',
 'PINF',
 'PZERO',
 'RAISE',
 'SHIFT_DIVIDEBYZERO',
 'SHIFT_INVALID',
 'SHIFT_OVERFLOW',
 'SHIFT_UNDERFLOW',
 'ScalarType',
 'Tester',
 'TooHardError',
 'True_',
 'UFUNC_BUFSIZE_DEFAULT',
 'UFUNC_PYVALS_NAME',
 'WRAP',
 '_CopyMode',
 '_NoValue',
 '_UFUNC_API',
 '__NUMPY_SETUP__',
 '__all__',
 '__builtins__',
 '__cached__',
 '__config__',
 '__deprecated_attrs__',
 '__dir__',
 '__doc__',
 '__expired_functions__',
 '__file__',
 '__former_attrs__',
 '__future_scalars__',
 '__getattr__',
 '__git_version__',
 '__loader__',
 '__mkl_version__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '__version__',
 '

### Creating a Numpy Array

* create an array from a regular Python list or tuple using the `array` function.
```python
np.array(list/tuple)
```
* functions from Numpy to create special arrays
    * [`np.arange()`](https://numpy.org/doc/stable/reference/generated/numpy.arange.html): create evenly spaced values within a given interval.
    * [`np.linspace(start, stop, num=50)`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.linspace.html): create evenly spaced numbers over a specified interval.
    * [`np.ones(shape)`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ones.html#numpy.ones): create a new array of given shape and type, filled with ones.
    * [`np.zeros(shape)`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.zeros.html#numpy.zeros): create a new array of given shape and type, filled with zeros.
    * [`np.eye(N)`](https://numpy.org/devdocs/reference/generated/numpy.eye.html): create a 2-D array with ones on the diagonal and zeros elsewhere.

In [13]:
a1 = np.array([1,2])
a1

array([1, 2])

In [14]:
type(a1)

numpy.ndarray

In [15]:
a1.size

2

`array.size` gives the number of items in the array.

In [16]:
len(a1)

2

For a 1-dimensional array, `len(array)` gives the same result as `array.size`.

In [17]:
a1.ndim

1

`array.ndim` gives the number of axes (dimensions) of the array.

In [18]:
a1.shape

(2,)

In [19]:
a1

array([1, 2])

`array.shape` gives the dimensions of the array. This is a tuple of integers indicating the **size** of the array in each dimension. For a matrix with n rows and m columns, shape will be (n,m). The length of the shape tuple is therefore the number of axes, ndim.
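The relationships between `shape`, `ndim`, `size`, and `len()` described so far can be checked directly. This is a small sketch on a hypothetical 2 x 3 array named `m`; `math.prod` (Python 3.8+) multiplies the entries of the shape tuple.

```python
import math

import numpy as np

m = np.array([[0, 1, 2],
              [3, 4, 5]])   # 2 rows, 3 columns

print(m.shape)  # (2, 3)
print(m.ndim)   # 2
print(m.size)   # 6
print(len(m))   # 2

# ndim is the length of the shape tuple, size is the product of its entries,
# and len() is the size of the first dimension.
print(m.ndim == len(m.shape))
print(m.size == math.prod(m.shape))
print(len(m) == m.shape[0])
```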

In [20]:
a1.dtype

dtype('int64')

`array.dtype` returns an object describing the type of the elements in the array.

In [21]:
a_str = np.array([1.0,2,"1"])
a_str

array(['1.0', '2', '1'], dtype='<U32')

In [22]:
a_str.dtype # '<U32': Unicode strings of at most 32 characters

dtype('<U32')
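Because every element of an array must share one type, NumPy converts mixed inputs to a common type, as happened with the string above. The sketch below shows the same effect with numbers and how a type can be requested explicitly through the `dtype` argument of `np.array`; the variable names are illustrative.

```python
import numpy as np

ints = np.array([1, 2, 3])                    # all integers
mixed = np.array([1, 2.5, 3])                 # integers and a float become floats
as_float = np.array([1, 2, 3], dtype=float)   # request floats explicitly

print(ints.dtype, mixed.dtype, as_float.dtype)
```

The exact integer type (for example `int64` or `int32`) can depend on the platform.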

In [23]:
a2 = np.array([[1,2], [3,4]])
a2

array([[1, 2],
       [3, 4]])

In [24]:
a2.ndim

2

In [25]:
a2.size

4

In [26]:
len(a2)

2

For a 2-dimensional array, `len(array)` gives the number of rows, that is, the size of the first dimension.

In [27]:
a2.shape

(2, 2)

In [28]:
a2.dtype

dtype('int64')

In [29]:
a3 = np.array([[1,2], [3,4],[5,6], [7,8], [9,10]])
a3

array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]])

In [30]:
a3.ndim

2

In [31]:
len(a3)

5

In [32]:
a3.size

10

In [33]:
a3

array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]])

In [34]:
a3.shape

(5, 2)

In [35]:
a3.dtype

dtype('int64')

We can create a 3-dimensional array

In [36]:
import numpy as np

In [37]:
c = np.array([[[1,1], [2,2]], 
              [[3,23], [4,5]], 
              [[5,3], [9,10]]])
c

array([[[ 1,  1],
        [ 2,  2]],

       [[ 3, 23],
        [ 4,  5]],

       [[ 5,  3],
        [ 9, 10]]])

In [38]:
c.ndim

3

In [39]:
c.shape

(3, 2, 2)

In [40]:
c.size

12

In [41]:
len(c)

3

In [42]:
a = np.array(1, 2, 3, 4)    

TypeError: array() takes from 1 to 2 positional arguments but 4 were given

The input needs to be a single sequence, such as a list or a tuple.

In [43]:
a = np.array([1, 2, 3, 4])
a

array([1, 2, 3, 4])

In [44]:
a = np.array((1, 2, 3, 4))
a

array([1, 2, 3, 4])

In [45]:
a = np.array((1, 2, 3, 4), (1, 2, 3, 4))
a

TypeError: Tuple must have size 2, but has size 4

Here the second tuple is interpreted as the `dtype` argument of `np.array`, which is why the call fails. Several rows must be wrapped in one outer sequence.

In [46]:
a = np.array(((1, 2, 3, 4), (1, 2, 3, 4)))
a

array([[1, 2, 3, 4],
       [1, 2, 3, 4]])

In [47]:
a = np.array([[1, 2, 3, 4], [1, 2, 3, 4]])
a

array([[1, 2, 3, 4],
       [1, 2, 3, 4]])

## Numpy functions to generate special arrays

### `numpy.arange()`

`numpy.arange()` gives an array of evenly spaced values in a defined interval. Similar to `range()`

*Syntax:*

`numpy.arange(start, stop, step)`

where `start` by default is zero, `stop` is not inclusive, and the default
for `step` is one. 

In [48]:
list(range(3))

[0, 1, 2]

In [49]:
np.arange(3) # 0 .. n-1 (!)

array([0, 1, 2])

In [50]:
np.arange(2, 6)

array([2, 3, 4, 5])

In [51]:
np.arange(2, 6, 2) # start, end (exclusive), step

array([2, 4])

In [52]:
np.arange(2, 6, 0.5) # start, end (exclusive), step

array([2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5])

### `numpy.linspace()`

`numpy.linspace()` is similar to `numpy.arange()`, but uses number of samples instead of a step size. It returns an array with evenly spaced numbers over the specified interval.  

*Syntax:*

`numpy.linspace(start, stop, num)`

`stop` is included by default (pass `endpoint=False` to exclude it), and `num` is 50 by default.

In [53]:
np.linspace(2.0, 3.0)

array([2.        , 2.02040816, 2.04081633, 2.06122449, 2.08163265,
       2.10204082, 2.12244898, 2.14285714, 2.16326531, 2.18367347,
       2.20408163, 2.2244898 , 2.24489796, 2.26530612, 2.28571429,
       2.30612245, 2.32653061, 2.34693878, 2.36734694, 2.3877551 ,
       2.40816327, 2.42857143, 2.44897959, 2.46938776, 2.48979592,
       2.51020408, 2.53061224, 2.55102041, 2.57142857, 2.59183673,
       2.6122449 , 2.63265306, 2.65306122, 2.67346939, 2.69387755,
       2.71428571, 2.73469388, 2.75510204, 2.7755102 , 2.79591837,
       2.81632653, 2.83673469, 2.85714286, 2.87755102, 2.89795918,
       2.91836735, 2.93877551, 2.95918367, 2.97959184, 3.        ])

In [54]:
len(np.linspace(2.0, 3.0))

50

In [55]:
np.linspace(0, 1, 6) # start, end, num of points

array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])

In [56]:
np.linspace(-1, 1, 9)

array([-1.  , -0.75, -0.5 , -0.25,  0.  ,  0.25,  0.5 ,  0.75,  1.  ])
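The effect of including or excluding `stop` can be seen in a small sketch. It assumes the `endpoint` keyword of `np.linspace`; the variable names are illustrative.

```python
import numpy as np

with_stop = np.linspace(0, 1, 5)                      # stop included
without_stop = np.linspace(0, 1, 5, endpoint=False)   # stop excluded

print(with_stop)      # spacing is (1 - 0) / 4
print(without_stop)   # spacing is (1 - 0) / 5
```

Both calls return five values; only the spacing and the last value differ.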

We can also create special arrays using Numpy functions

In [57]:
a = np.ones(3) # creating a 1-D array full of 1s

In [58]:
a

array([1., 1., 1.])

In [59]:
a = np.ones((3, 2)) # (3,2) is the shape of the array we want to create, which needs to be a tuple

In [60]:
a

array([[1., 1.],
       [1., 1.],
       [1., 1.]])

In [61]:
b = np.zeros(10) # creating a 1-D array full of 0s

In [62]:
b

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [63]:
b = np.zeros((3, 3))

In [64]:
b

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [65]:
np.eye(5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

In [66]:
np.eye(2)

array([[1., 0.],
       [0., 1.]])

In [67]:
np.empty((2, 3)) 

array([[1., 1., 1.],
       [1., 1., 1.]])
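`np.empty` allocates an array without initializing its entries, so the ones displayed above are whatever happened to be in memory and should not be relied on. A minimal sketch of two ways to get well-defined contents, using `ndarray.fill` and `np.full` (the value `7.0` is arbitrary):

```python
import numpy as np

e = np.empty((2, 3))        # contents are arbitrary until assigned
e.fill(7.0)                 # set every entry explicitly

f = np.full((2, 3), 7.0)    # allocate and fill in one step
print(f)
```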

## Arithmetic operations on arrays

* Arithmetic operators on arrays apply elementwise
* This differs from how arithmetic operators behave on lists

In [68]:
a = np.array([20, 30, 40, 50])
b = np.array([1,2,3,4])

In [69]:
c = a + b
c

array([21, 32, 43, 54])

In [70]:
d = np.array([1,2,3])

In [71]:
a + d

ValueError: operands could not be broadcast together with shapes (4,) (3,) 

In [72]:
list_a = list(a)
list_b = list(b)

In [73]:
type(list_a)

list

In [74]:
list_a

[20, 30, 40, 50]

In [75]:
list_b

[1, 2, 3, 4]

In [76]:
list_a + list_b

[20, 30, 40, 50, 1, 2, 3, 4]

In [77]:
a

array([20, 30, 40, 50])

In [78]:
b

array([1, 2, 3, 4])

In [79]:
d = a - b
d

array([19, 28, 37, 46])

In [80]:
list_d = list_a - list_b
list_d

TypeError: unsupported operand type(s) for -: 'list' and 'list'
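To get an elementwise result from plain lists, the loop has to be written out, for instance with `zip` and a list comprehension. The sketch below contrasts that with the array version; `diff_list` and `diff_arr` are illustrative names.

```python
import numpy as np

list_a = [20, 30, 40, 50]
list_b = [1, 2, 3, 4]

# Elementwise subtraction of lists needs an explicit loop or comprehension.
diff_list = [x - y for x, y in zip(list_a, list_b)]

# The array version expresses the same computation with one operator.
diff_arr = np.array(list_a) - np.array(list_b)

print(diff_list)  # [19, 28, 37, 46]
print(diff_arr)
```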

In [81]:
a

array([20, 30, 40, 50])

In [82]:
a ** 2

array([ 400,  900, 1600, 2500])

In [83]:
list_a ** 2

TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'

In [84]:
a

array([20, 30, 40, 50])

In [85]:
a * 2

array([ 40,  60,  80, 100])

In [86]:
list_a

[20, 30, 40, 50]

In [87]:
list_a * 2

[20, 30, 40, 50, 20, 30, 40, 50]

In [88]:
a

array([20, 30, 40, 50])

In [89]:
b

array([1, 2, 3, 4])

In [90]:
a < b

array([False, False, False, False])

In [91]:
list_a

[20, 30, 40, 50]

In [92]:
list_b

[1, 2, 3, 4]

In [93]:
list_a < list_b

False

In [94]:
a

array([20, 30, 40, 50])

In [95]:
b

array([1, 2, 3, 4])

In [96]:
a/b

array([20.        , 15.        , 13.33333333, 12.5       ])

In [97]:
list_a/list_b

TypeError: unsupported operand type(s) for /: 'list' and 'list'

In [98]:
a.shape

(4,)

In [99]:
a

array([20, 30, 40, 50])

In [100]:
a + 1

array([21, 31, 41, 51])

This is broadcasting with a scalar: the scalar is combined with every element of the array.

In [101]:
list_a + 1

TypeError: can only concatenate list (not "int") to list

In [102]:
a

array([20, 30, 40, 50])

In [103]:
a < 30

array([ True, False, False, False])

In [104]:
list_a < 30

TypeError: '<' not supported between instances of 'list' and 'int'

In [105]:
c = np.array([10,15,20])

In [106]:
a

array([20, 30, 40, 50])

In [107]:
a + c

ValueError: operands could not be broadcast together with shapes (4,) (3,) 

The shapes `(4,)` and `(3,)` do not match and cannot be broadcast, so the operation fails.

#### Arithmetic operation on 2-D arrays

In [108]:
X = np.array([[1, 2], [3, 4]])
print(X)

[[1 2]
 [3 4]]


In [109]:
Y = np.array([[1, -1], [0, 1]])
print(Y)

[[ 1 -1]
 [ 0  1]]


In [110]:
X + Y

array([[2, 1],
       [3, 5]])

In [111]:
X * Y

array([[ 1, -2],
       [ 0,  4]])

The multiplication using the `'*'` operator is element-wise. 

What if we want to do matrix multiplication? Use the `@` operator:

![matrix_multiplication.jpg](pics/matrix_multiplication.jpg)

In [112]:
X * Y

array([[ 1, -2],
       [ 0,  4]])

In [113]:
X @ Y

array([[1, 1],
       [3, 1]])

Or equivalently we can use `np.dot()`:

In [114]:
np.dot(X, Y)

array([[1, 1],
       [3, 1]])
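To make clear what `@` and `np.dot()` compute for 2-D arrays, the product can be written out with explicit loops. This is only an illustrative sketch (the name `product` is made up, and in practice `X @ Y` is the better choice):

```python
import numpy as np

X = np.array([[1, 2], [3, 4]])
Y = np.array([[1, -1], [0, 1]])

# Entry (i, j) of the product is the sum over k of X[i, k] * Y[k, j].
rows, inner = X.shape
cols = Y.shape[1]
product = np.zeros((rows, cols), dtype=X.dtype)
for i in range(rows):
    for j in range(cols):
        for k in range(inner):
            product[i, j] += X[i, k] * Y[k, j]

print(product)
```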

In [115]:
Z = np.arange(12)
Z

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [116]:
Z.reshape(3,4) # reshape() returns an array with the new shape

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [117]:
Z.reshape((3,4)) # the new shape can also be passed as a tuple

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [118]:
Z = Z.reshape(3,4) 
Z

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [119]:
Z.sum()

66

In [120]:
Z.max()

11

In [121]:
Z.min()

0

In [122]:
Z

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [123]:
Z.sum(axis=0) # sum of each column

array([12, 15, 18, 21])

In [124]:
Z.sum(axis=1) # sum of each row

array([ 6, 22, 38])

In [125]:
Z.mean(axis=1) # average of each row

array([1.5, 5.5, 9.5])

In [126]:
Z.mean(axis=0) # average of each column

array([4., 5., 6., 7.])

## Indexing, Slicing and Iterating

* 1-dimensional arrays can be indexed, sliced and iterated over, much like lists and other Python sequences.
* Multidimensional arrays have one index per axis. These indices are given in a tuple separated by commas.

In [127]:
a = np.arange(10)**3
a

array([  0,   1,   8,  27,  64, 125, 216, 343, 512, 729])

In [128]:
a[0]

0

In [129]:
a[3]

27

In [130]:
a[2:5]

array([ 8, 27, 64])

In [131]:
for i in a:
    print(i)

0
1
8
27
64
125
216
343
512
729


In [132]:
b = np.arange(12).reshape(3,4)
b

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [133]:
list_b = []
for i in b:
    list_b.append(list(i))
list_b

[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]

In [134]:
list_b[0][0]

0

In [135]:
b

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [136]:
b[0,0]

0

In [137]:
b

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [138]:
b[2,3]

11

In [139]:
b[:2,0]

array([0, 4])

In [140]:
b[:,0]

array([0, 4, 8])

In [141]:
b

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [142]:
b[0] 

array([0, 1, 2, 3])

In [143]:
b[0, :]

array([0, 1, 2, 3])

The missing indices are considered complete slices: `b[0]` is equivalent to `b[0,:]`
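The same rule applies to arrays with more than two axes. A short sketch on a hypothetical 3-dimensional array `c` (the names `first_block` and `same_block` are illustrative):

```python
import numpy as np

c = np.arange(24).reshape(2, 3, 4)   # 3-dimensional array

first_block = c[1]         # the two trailing axes are taken in full
same_block = c[1, :, :]    # the explicit form of the same selection

print(first_block.shape)   # (3, 4)
print(np.array_equal(first_block, same_block))
```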

#### Exercise:

```python
b = np.arange(12).reshape(3,4)
```

* Obtain every column of the second and third rows of b
* Obtain the first three rows and columns of b

In [144]:
b = np.arange(12).reshape(3,4)
b

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [145]:
b[1:3,]

array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [146]:
b[1:3,:]

array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [147]:
b[1:3]

array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [148]:
b[0:3,0:3]

array([[ 0,  1,  2],
       [ 4,  5,  6],
       [ 8,  9, 10]])

In [149]:
b[:3,:3]

array([[ 0,  1,  2],
       [ 4,  5,  6],
       [ 8,  9, 10]])

In [150]:
b

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [151]:
for i in b:
    print(i)

[0 1 2 3]
[4 5 6 7]
[ 8  9 10 11]


Iterating over multidimensional arrays is done with respect to the first axis: row by row
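If a different traversal is needed, two options that may help are iterating over the transpose (`b.T`) to visit columns, and using the `flat` attribute to visit every element. A minimal sketch, with illustrative list names:

```python
import numpy as np

b = np.arange(12).reshape(3, 4)

rows = [row.tolist() for row in b]         # default: one row per step
columns = [col.tolist() for col in b.T]    # transpose to step over columns
elements = [int(x) for x in b.flat]        # b.flat visits every element in row order

print(rows[0])      # [0, 1, 2, 3]
print(columns[0])   # [0, 4, 8]
```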

#### More flexible indexing - fancy indexing

* Indexing with Arrays of Indices
* Indexing with Boolean Arrays

In [152]:
a = np.arange(12)**2  # the first 12 square numbers
a

array([  0,   1,   4,   9,  16,  25,  36,  49,  64,  81, 100, 121])

In [153]:
np.__version__

'1.24.3'

In [154]:
i = np.array([1, 1, 3, 8, 5])  # an array of indices
i

array([1, 1, 3, 8, 5])

In [155]:
a[i]  # the elements of `a` at the positions `i`

array([ 1,  1,  9, 64, 25])

In [156]:
j = np.array([[3, 4], [9, 7]])  # a two-dimensional array of indices
j

array([[3, 4],
       [9, 7]])

In [157]:
a

array([  0,   1,   4,   9,  16,  25,  36,  49,  64,  81, 100, 121])

In [158]:
a[j] # the same shape as `j`

array([[ 9, 16],
       [81, 49]])

What if `a` is multidimensional?

In [159]:
a = a.reshape(4,3)
a

array([[  0,   1,   4],
       [  9,  16,  25],
       [ 36,  49,  64],
       [ 81, 100, 121]])

In [160]:
i = np.array([[2, 1],  # indices for the first dim of `a`
              [3, 3]])

In [161]:
j = np.array([[0, 1],  # indices for the second dim of `a`
              [1, 2]])

In [162]:
a[i,j]

array([[ 36,  16],
       [100, 121]])

In [163]:
a

array([[  0,   1,   4],
       [  9,  16,  25],
       [ 36,  49,  64],
       [ 81, 100, 121]])

In [164]:
b = a> 14
b

array([[False, False, False],
       [False,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]])

In [165]:
a[b]  # 1d array with the selected elements

array([ 16,  25,  36,  49,  64,  81, 100, 121])

Here the boolean array has the same shape as the original array, so it selects individual elements.

In [166]:
a[b] = 0  # All elements of `a` higher than 14 become 0
a

array([[0, 1, 4],
       [9, 0, 0],
       [0, 0, 0],
       [0, 0, 0]])
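Assigning through a boolean index changes `a` in place. When the original values should be kept, one alternative is `np.where`, which builds a new array from a condition. A sketch under that assumption (`masked` is an illustrative name):

```python
import numpy as np

a = (np.arange(12) ** 2).reshape(4, 3)

# Build a new array: 0 where a > 14, the original value elsewhere.
masked = np.where(a > 14, 0, a)

print(masked)
print(a[1, 1])   # `a` itself is unchanged
```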

## More on Shape manipulation

In [167]:
a = np.arange(20)
a

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])

In [168]:
a.shape

(20,)

In [169]:
a.shape = (4,5)

In [170]:
a

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [171]:
a.shape = (2,4)

ValueError: cannot reshape array of size 20 into shape (2,4)

In [172]:
a.shape = (2,10)

In [173]:
a

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])

In [174]:
a.transpose() # Transpose of the array

array([[ 0, 10],
       [ 1, 11],
       [ 2, 12],
       [ 3, 13],
       [ 4, 14],
       [ 5, 15],
       [ 6, 16],
       [ 7, 17],
       [ 8, 18],
       [ 9, 19]])

In [175]:
a.T # Transpose of the array

array([[ 0, 10],
       [ 1, 11],
       [ 2, 12],
       [ 3, 13],
       [ 4, 14],
       [ 5, 15],
       [ 6, 16],
       [ 7, 17],
       [ 8, 18],
       [ 9, 19]])

In [176]:
a.reshape(4,5)

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [177]:
a.reshape(4,-1)

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

If in a reshaping operation a dimension is given as -1, it is automatically calculated to correspond to the other dimensions.
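In other words, the inferred dimension is the total number of elements divided by the product of the dimensions that were given. A small sketch comparing the automatic and the hand-computed version (`auto` and `explicit` are illustrative names):

```python
import numpy as np

a = np.arange(20)

auto = a.reshape(4, -1)                 # NumPy infers the second dimension
explicit = a.reshape(4, a.size // 4)    # the same value computed by hand

print(auto.shape)   # (4, 5)
```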

In [178]:
a.reshape(7,-1)

ValueError: cannot reshape array of size 20 into shape (7,newaxis)

In [179]:
mean_row = a.mean(axis=1)
mean_row

array([ 4.5, 14.5])

In [180]:
mean_row.shape

(2,)

In [181]:
mean_row + a

ValueError: operands could not be broadcast together with shapes (2,) (2,10) 

In [182]:
a.shape

(2, 10)

In [183]:
mean_row = mean_row.reshape(2, -1)
mean_row

array([[ 4.5],
       [14.5]])

In [184]:
mean_row.shape

(2, 1)

In [185]:
mean_row + a 

array([[ 4.5,  5.5,  6.5,  7.5,  8.5,  9.5, 10.5, 11.5, 12.5, 13.5],
       [24.5, 25.5, 26.5, 27.5, 28.5, 29.5, 30.5, 31.5, 32.5, 33.5]])

### Broadcasting for 2-d arrays

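Broadcasting is how NumPy treats arrays with different shapes during arithmetic operations. Two 2-d arrays can be combined when, in each dimension, either:
* both arrays have the same size, or
* one of the arrays has size 1 (it is then stretched to match the other).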
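Whether two shapes are compatible can be checked without creating any arrays. The sketch below assumes `np.broadcast_shapes`, which is available in NumPy 1.20 and later; the flag `compatible` is an illustrative name.

```python
import numpy as np

# Shapes (2, 10) and (2, 1) are compatible: the size-1 axis is stretched.
result_shape = np.broadcast_shapes((2, 10), (2, 1))
print(result_shape)   # (2, 10)

# Shapes (2, 10) and (2,) are not: (2,) is aligned with the last axis (size 10).
try:
    np.broadcast_shapes((2, 10), (2,))
    compatible = True
except ValueError:
    compatible = False
print(compatible)     # False
```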

In [186]:
mean_row = a.mean(axis=1)
mean_row

array([ 4.5, 14.5])

In [187]:
mean_row.shape

(2,)

In [188]:
a

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])

In [189]:
a - mean_row

ValueError: operands could not be broadcast together with shapes (2,10) (2,) 

In [190]:
mean_row = mean_row.reshape(2, -1)
mean_row

array([[ 4.5],
       [14.5]])

In [191]:
mean_row.shape

(2, 1)

In [192]:
a - mean_row

array([[-4.5, -3.5, -2.5, -1.5, -0.5,  0.5,  1.5,  2.5,  3.5,  4.5],
       [-4.5, -3.5, -2.5, -1.5, -0.5,  0.5,  1.5,  2.5,  3.5,  4.5]])
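The reshape step can be avoided by asking the reduction to keep the reduced axis with size 1. This sketch assumes the `keepdims` argument of `mean`; `row_means` and `centered` are illustrative names.

```python
import numpy as np

a = np.arange(20).reshape(2, 10)

row_means = a.mean(axis=1, keepdims=True)   # shape (2, 1) instead of (2,)
centered = a - row_means                    # broadcasts across the columns

print(row_means.shape)   # (2, 1)
print(centered)
```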

### Concatenating two numpy arrays

In [193]:
a.flatten() # turn the array into 1-d

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])

Stacking arrays together

In [194]:
a

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])

In [195]:
a.shape

(2, 10)

In [196]:
b=np.arange(200).reshape(-1,10)
b

array([[  0,   1,   2,   3,   4,   5,   6,   7,   8,   9],
       [ 10,  11,  12,  13,  14,  15,  16,  17,  18,  19],
       [ 20,  21,  22,  23,  24,  25,  26,  27,  28,  29],
       [ 30,  31,  32,  33,  34,  35,  36,  37,  38,  39],
       [ 40,  41,  42,  43,  44,  45,  46,  47,  48,  49],
       [ 50,  51,  52,  53,  54,  55,  56,  57,  58,  59],
       [ 60,  61,  62,  63,  64,  65,  66,  67,  68,  69],
       [ 70,  71,  72,  73,  74,  75,  76,  77,  78,  79],
       [ 80,  81,  82,  83,  84,  85,  86,  87,  88,  89],
       [ 90,  91,  92,  93,  94,  95,  96,  97,  98,  99],
       [100, 101, 102, 103, 104, 105, 106, 107, 108, 109],
       [110, 111, 112, 113, 114, 115, 116, 117, 118, 119],
       [120, 121, 122, 123, 124, 125, 126, 127, 128, 129],
       [130, 131, 132, 133, 134, 135, 136, 137, 138, 139],
       [140, 141, 142, 143, 144, 145, 146, 147, 148, 149],
       [150, 151, 152, 153, 154, 155, 156, 157, 158, 159],
       [160, 161, 162, 163, 164, 165, 166, 167, 168, 169

In [197]:
b.shape

(20, 10)

In [198]:
np.vstack((a,b)) # stack arrays vertically (row-wise): the number of columns has to match

array([[  0,   1,   2,   3,   4,   5,   6,   7,   8,   9],
       [ 10,  11,  12,  13,  14,  15,  16,  17,  18,  19],
       [  0,   1,   2,   3,   4,   5,   6,   7,   8,   9],
       [ 10,  11,  12,  13,  14,  15,  16,  17,  18,  19],
       [ 20,  21,  22,  23,  24,  25,  26,  27,  28,  29],
       [ 30,  31,  32,  33,  34,  35,  36,  37,  38,  39],
       [ 40,  41,  42,  43,  44,  45,  46,  47,  48,  49],
       [ 50,  51,  52,  53,  54,  55,  56,  57,  58,  59],
       [ 60,  61,  62,  63,  64,  65,  66,  67,  68,  69],
       [ 70,  71,  72,  73,  74,  75,  76,  77,  78,  79],
       [ 80,  81,  82,  83,  84,  85,  86,  87,  88,  89],
       [ 90,  91,  92,  93,  94,  95,  96,  97,  98,  99],
       [100, 101, 102, 103, 104, 105, 106, 107, 108, 109],
       [110, 111, 112, 113, 114, 115, 116, 117, 118, 119],
       [120, 121, 122, 123, 124, 125, 126, 127, 128, 129],
       [130, 131, 132, 133, 134, 135, 136, 137, 138, 139],
       [140, 141, 142, 143, 144, 145, 146, 147, 148, 149

In [199]:
np.hstack((a,b)) # stack arrays horizontally (column-wise): the number of rows has to match

ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 2 and the array at index 1 has size 20
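Horizontal stacking works once the number of rows matches. A minimal sketch with a hypothetical second array `extra` that has the same two rows as `a`, together with the equivalent `np.concatenate` call:

```python
import numpy as np

a = np.arange(20).reshape(2, 10)
extra = np.ones((2, 3), dtype=a.dtype)       # same number of rows as `a`

wide = np.hstack((a, extra))                 # columns are appended
same = np.concatenate((a, extra), axis=1)    # equivalent call with an explicit axis

print(wide.shape)   # (2, 13)
```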

## Other array functions

In [200]:
a=np.array( [[ 7,2, 1], [4,3, 8] ])
a

array([[7, 2, 1],
       [4, 3, 8]])

In [201]:
np.sort?

In [202]:
np.sort(a)

array([[1, 2, 7],
       [3, 4, 8]])

In [203]:
np.sort(a,axis=1)

array([[1, 2, 7],
       [3, 4, 8]])

In [204]:
np.sort(a,axis=0)

array([[4, 2, 1],
       [7, 3, 8]])
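`np.sort` returns a sorted copy and leaves the input unchanged. When the positions of the sorted values are needed rather than the values themselves, `np.argsort` may be useful. A short sketch (`sorted_copy` and `order` are illustrative names):

```python
import numpy as np

a = np.array([[7, 2, 1], [4, 3, 8]])

sorted_copy = np.sort(a, axis=1)   # new array; `a` is unchanged
order = np.argsort(a, axis=1)      # indices that would sort each row

print(a)
print(order)

# Applying the indices row by row reproduces the sorted copy.
print(np.take_along_axis(a, order, axis=1))
```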

## Further reading

* Read the [NumPy tutorial](https://numpy.org/doc/stable/user/quickstart.html) to learn more about NumPy's functionality.