# NumPy fundamentals

Many thanks to: https://numpy.org/doc/stable/user/absolute_beginners.html

## Introduction

There are 6 general mechanisms for creating arrays:

1. Conversion from other Python structures (e.g. lists and tuples)

2. Intrinsic NumPy array creation functions (e.g. arange, ones, zeros, etc.)

3. Replicating, joining, or mutating existing arrays

4. Reading arrays from disk, either from standard or custom formats

5. Creating arrays from raw bytes through the use of strings or buffers

6. Use of special library functions (e.g. random)

In most cases, we use mechanism 1 or 2.
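As a minimal sketch of mechanisms 1 and 2 (the variable names are illustrative and not part of the original notebook):

```python
import numpy as np

# Mechanism 1: convert existing Python sequences
from_list = np.array([1, 2, 3])
from_tuple = np.array((4.0, 5.0, 6.0))

# Mechanism 2: use an intrinsic creation function
from_function = np.arange(3)

print(from_list, from_tuple, from_function)
```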

In [231]:
# Import numpy
import numpy as np

## How to create a basic array (=Vector)
To create a NumPy array, you can use the function `np.array()`. All you need to do to create a simple array is pass a list to it.

In [233]:
array = np.array([1, 2, 3])
print(type(array))
print(array)

<class 'numpy.ndarray'>
[1 2 3]


You can visualize your array this way:
<img src="res/np_array.png">

### How do we get the dimension, size or shape of an array?
- `ndarray.ndim` will tell you the number of axes, or dimensions, of the array.

- `ndarray.size` will tell you the total number of elements of the array. This is the product of the elements of the array’s shape.

- `ndarray.shape` will display a tuple of integers that indicate the number of elements stored along each dimension of the array. If, for example, you have a 2-D array with 2 rows and 3 columns, the shape of your array is (2, 3).
- `ndarray.dtype` will tell you the data type. All elements of an array share the same type.

In [234]:
array.ndim

1

In [235]:
array.size

3

In [236]:
print(type(array.shape))
print(array.shape)

<class 'tuple'>
(3,)


In [237]:
array.dtype

dtype('int32')
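The same attributes apply to arrays with more than one axis. A small sketch for the 2-row, 3-column case mentioned above (the name `matrix` is illustrative):

```python
import numpy as np

# A 2-D array with 2 rows and 3 columns
matrix = np.array([[1, 2, 3], [4, 5, 6]])

print(matrix.ndim)   # 2, the number of axes
print(matrix.shape)  # (2, 3), the elements along each axis
print(matrix.size)   # 6, the product of the shape entries
```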

### Intrinsic NumPy array creation functions
- `np.zeros()` creates an array filled with 0’s
- `np.ones()` creates an array filled with 1’s
- `np.arange()` creates an array with a range of elements
- `np.linspace()` creates an array with values that are spaced linearly in a specified interval 

In [238]:
# Return a new array of given shape and type
print(np.zeros(2))
print()

# with a shape tuple (note the trailing comma: (2) is just the integer 2)
print(np.zeros((2,)))
print()
print(np.zeros((2, 2)))

[0. 0.]

[0. 0.]

[[0. 0.]
 [0. 0.]]


In [239]:
# Return a new array of given shape and type
print(np.ones(2))
print()

# with a shape tuple (note the trailing comma: (2) is just the integer 2)
print(np.ones((2,)))
print()
print(np.ones((2, 2)))

[1. 1.]

[1. 1.]

[[1. 1.]
 [1. 1.]]


In [240]:
# arange([start,] stop[, step,], dtype=None, *, like=None)
print(np.arange(10))
print(np.arange(3, 10))
print(np.arange(1, 10, 2))

[0 1 2 3 4 5 6 7 8 9]
[3 4 5 6 7 8 9]
[1 3 5 7 9]


In [241]:
#np.linspace(start,stop,num=50,endpoint=True,retstep=False,dtype=None,axis=0)
print(np.linspace(1, 10))
print(np.linspace(1, 10, 5))
print(np.linspace(1, 10, 3))

[ 1.          1.18367347  1.36734694  1.55102041  1.73469388  1.91836735
  2.10204082  2.28571429  2.46938776  2.65306122  2.83673469  3.02040816
  3.20408163  3.3877551   3.57142857  3.75510204  3.93877551  4.12244898
  4.30612245  4.48979592  4.67346939  4.85714286  5.04081633  5.2244898
  5.40816327  5.59183673  5.7755102   5.95918367  6.14285714  6.32653061
  6.51020408  6.69387755  6.87755102  7.06122449  7.24489796  7.42857143
  7.6122449   7.79591837  7.97959184  8.16326531  8.34693878  8.53061224
  8.71428571  8.89795918  9.08163265  9.26530612  9.44897959  9.63265306
  9.81632653 10.        ]
[ 1.    3.25  5.5   7.75 10.  ]
[ 1.   5.5 10. ]


In [242]:
print(np.linspace(1, 10).shape) # num defaults to 50

(50,)
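The `endpoint` and `retstep` parameters from the `np.linspace()` signature above are worth a quick look. A minimal sketch with illustrative variable names:

```python
import numpy as np

# endpoint=False leaves out the stop value
half_open = np.linspace(0, 1, 5, endpoint=False)
print(half_open)  # approximately 0.0, 0.2, 0.4, 0.6, 0.8

# retstep=True also returns the spacing between samples
samples, step = np.linspace(1, 10, 5, retstep=True)
print(step)  # 2.25
```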


### Specifying your data type
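Creation functions such as `np.zeros()`, `np.ones()` and `np.linspace()` default to floating point (`np.float64`), while `np.arange()` infers the type from its arguments. You can explicitly specify the data type you want with the `dtype` keyword.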

In [243]:
# dtype passed positionally as a string
dtype_example = np.arange(1, 10, 2, 'int8')
# equivalent and clearer: dtype passed by keyword
dtype_example = np.arange(1, 10, 2, dtype=np.int8)

In [244]:
print(dtype_example)
print(dtype_example.dtype)

[1 3 5 7 9]
int8
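To see which type you get without `dtype`, and how to convert afterwards, here is a short sketch (variable names are illustrative; the exact integer width depends on your platform and NumPy version):

```python
import numpy as np

zeros_default = np.zeros(2)       # float64 by default
int_range = np.arange(3)          # integer arguments give an integer array
float_range = np.arange(3.0)      # a float argument gives float64

# astype() returns a converted copy with the requested type
as_float32 = int_range.astype(np.float32)

print(zeros_default.dtype, int_range.dtype, float_range.dtype, as_float32.dtype)
```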


### What about reshaping an array?
Yes, this is possible, but a little tricky! 

>Using `arr.reshape()` will give a new shape to an array without changing the data. Just remember that when you use the reshape method, the array you want to produce needs to have the **same number of elements** as the original array. If you start with an array with 12 elements, you’ll need to make sure that your new array also has a total of 12 elements.

In [245]:
reshape_example = np.arange(1, 7)
print(reshape_example)

[1 2 3 4 5 6]


To reshape this vector into a 2-D array, pass the new shape to `reshape`: `reshape((2, 3))` gives two rows and three columns, and `reshape((3, 2))` gives three rows and two columns.

In [270]:
reshape_example = reshape_example.reshape((2, 3))
print(reshape_example)

[[1 2 3]
 [4 5 6]]


In [272]:
reshape_example = reshape_example.reshape((3, 2))
print(reshape_example)

[[1 2]
 [3 4]
 [5 6]]


<img src="res/np_reshape.png" width="70%">

Eureka! We converted a vector into a 2D array (matrix). To convert the array back to a vector, use `reshape` again.

In [268]:
reshape_example = reshape_example.reshape((6,))
print(reshape_example)

[1 2 3 4 5 6]


In [273]:
# fast and easy method to convert back to vector
reshape_example = reshape_example.reshape(-1)
print(reshape_example)

[1 2 3 4 5 6]
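The "same number of elements" rule can be checked directly. In this sketch (names are illustrative), `-1` lets NumPy infer one dimension, and a shape that does not fit raises a `ValueError`:

```python
import numpy as np

values = np.arange(12)

# -1 asks NumPy to infer that dimension: 12 elements / 3 rows = 4 columns
inferred = values.reshape(3, -1)
print(inferred.shape)  # (3, 4)

# A shape whose product is not 12 is rejected
try:
    values.reshape(5, 3)
except ValueError as error:
    print("reshape failed:", error)
```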


### Indexing and slicing
You can index and slice NumPy arrays the same way you slice Python lists.

In [293]:
data = np.array([1, 2, 3])

In [295]:
# looks more complicated than it is (it just prints every possible slice)

for i in range(-len(data), len(data) + 1):
    other_i = True
    for j in range(-len(data), len(data) + 1):
        if i < 0 and other_i:
            print(f"data[{i}:]: {data[i:]}")
            other_i = False
        if i < j:
            if i < 0 and j >= 0: continue
            print(f"data[{i}:{j}]: {data[i:j]}")  

data[-3:]: [1 2 3]
data[-3:-2]: [1]
data[-3:-1]: [1 2]
data[-2:]: [2 3]
data[-2:-1]: [2]
data[-1:]: [3]
data[0:1]: [1]
data[0:2]: [1 2]
data[0:3]: [1 2 3]
data[1:2]: [2]
data[1:3]: [2 3]
data[2:3]: [3]


<img src="res/np_indexing.png">
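Step values work as they do for lists, but there is one difference to keep in mind: a basic NumPy slice is a view on the original data, not a copy. A small sketch with illustrative names:

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6])

print(values[::2])   # every second element: [1 3 5]
print(values[::-1])  # reversed: [6 5 4 3 2 1]

# Writing through a slice changes the original array
view = values[1:3]
view[0] = 99
print(values)        # [ 1 99  3  4  5  6]
```

If you need an independent array, call `.copy()` on the slice.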

## Basic array operations
### Broadcasting
There are times when you might want to carry out an operation between an array and a single number (also called an operation between a vector and a scalar). For example, your array (we’ll call it “data”) might contain information about distance in miles but you want to convert the information to kilometers. You can perform this operation with:

In [296]:
data = np.array([1.0, 2.0])

data * 1.6

array([1.6, 3.2])

<img src="res/np_multiply_broadcasting.png">

### Addition, subtraction, multiplication, division, and more

In [308]:
data = np.array([1, 2])
ones = np.ones(2, dtype=int)

<img src="res/np_data_plus_ones.png">

In [299]:
data + ones

array([2, 3])

In [304]:
data - ones

array([0, 1])

In [305]:
data * ones

array([1, 2])

In [306]:
data / data

array([1., 1.])

<img src="res/np_sub_mult_divide.png">

### More useful array operations

In [309]:
data = np.array([1, 2, 3])

In [310]:
data.max()

3

In [311]:
data.min()

1

In [312]:
data.sum()

6

In [319]:
data.mean()

2.0

<img src="res/np_aggregation.png">
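Note that the mean is available as the method `mean()` or as the function `np.average()`, which additionally accepts weights; arrays have no `average()` method. A brief sketch (the weights are arbitrary illustration values):

```python
import numpy as np

data = np.array([1, 2, 3])

plain_mean = data.mean()                              # 2.0
same_mean = np.average(data)                          # 2.0
weighted_mean = np.average(data, weights=[3, 1, 1])   # (3 + 2 + 3) / 5 = 1.6
product = data.prod()                                 # 6

print(plain_mean, same_mean, weighted_mean, product)
```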

## Creating 2D arrays (matrices)
You can pass a Python list of lists to `np.array()` to create a 2-D array (or “matrix”), or reshape a 1-D array as in the cell below.

In [348]:
data = np.arange(1, 7).reshape(3, 2)
print(data)

[[1 2]
 [3 4]
 [5 6]]


<img src="res/np_create_matrix.png" width="90%">

### Indexing and slicing operations 

In [341]:
# select a single value by its row and column index
print(data[0,1])

2


In [338]:
print(data[1:3])
print(data[1:3,:]) # equivalent, with the column slice written out

[[3 4]
 [5 6]]
[[3 4]
 [5 6]]


In [335]:
print(data[0:2,0])
print(data[1:,1])

[1 3]
[4 6]


<img src="res/np_matrix_indexing.png">

### Useful operations with matrices 

In [342]:
data.max()

6

In [343]:
data.min()

1

In [349]:
data.sum()

21

<img src="res/np_matrix_aggregation.png">

You can aggregate all the values in a matrix, or aggregate them across **columns** or **rows** using the `axis` parameter: `axis=0` gives one result per column, `axis=1` gives one result per row.

In [351]:
data = np.array([[1, 2],[5, 3],[4, 6]])

In [352]:
data.max(axis=0)

array([5, 6])

In [353]:
data.max(axis=1)

array([2, 5, 6])

<img src="res/np_matrix_aggregation_row.png">
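The `axis` parameter behaves the same way for other aggregations. A minimal sketch using `sum()` on the same matrix (variable names are illustrative):

```python
import numpy as np

data = np.array([[1, 2], [5, 3], [4, 6]])  # shape (3, 2)

column_sums = data.sum(axis=0)  # collapses the rows: one value per column
row_sums = data.sum(axis=1)     # collapses the columns: one value per row

print(column_sums)  # [10 11]
print(row_sums)     # [ 3  8 10]
```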

Once you’ve created your matrices, you can add and multiply them using arithmetic operators if you have **two matrices** that are the **same size**.

In [355]:
data = np.array([[1, 2], [3, 4]])
ones = np.array([[1, 1], [1, 1]])

In [356]:
data + ones

array([[2, 3],
       [4, 5]])

<img src="res/np_matrix_arithmetic.png">

You can do these arithmetic operations on matrices of different sizes, but only if one matrix has only one column or one row. In this case, NumPy will use its **broadcast rules** for the operation.

In [254]:
data = np.array([[1, 2], [3, 4], [5, 6]])
ones_row = np.array([[1, 1]])

# the single row is added to every row of data
data + ones_row

array([[2, 3],
       [4, 5],
       [6, 7]])

<img src="res/np_matrix_broadcasting.png">
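Broadcasting also works when one operand has a single column, and it fails when the shapes cannot be lined up. A minimal sketch (the `column` array is illustrative):

```python
import numpy as np

data = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)
column = np.array([[10], [20], [30]])       # shape (3, 1)

# The single column is stretched across both columns of data
stretched = data + column
print(stretched)
# [[11 12]
#  [23 24]
#  [35 36]]

# Shapes (3, 2) and (3,) do not line up, so this raises ValueError
try:
    data + np.array([1, 2, 3])
except ValueError as error:
    print("broadcast failed:", error)
```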

### Transposing a matrix
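This section has no example in the original notebook. As a minimal sketch, assuming the 3x2 matrix used earlier, the `.T` attribute (equivalent to calling `transpose()`) swaps rows and columns:

```python
import numpy as np

data = np.arange(1, 7).reshape(3, 2)

transposed = data.T      # same result as data.transpose()
print(transposed.shape)  # (2, 3)
print(transposed)
# [[1 3 5]
#  [2 4 6]]
```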

### Flattening multidimensional arrays
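This section has no example in the original notebook. As a minimal sketch with illustrative names: `flatten()` always returns a copy, while `ravel()` returns a view when the memory layout allows it, so writing to the result may change the original.

```python
import numpy as np

data = np.array([[1, 2], [3, 4]])

flat_copy = data.flatten()  # independent copy
flat_copy[0] = 99
print(data[0, 0])           # 1, the original is unchanged

flat_view = data.ravel()    # view on the same memory for this contiguous array
flat_view[0] = 99
print(data[0, 0])           # 99, the original changed
```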

### Dot product of two arrays

In [255]:
a = [1, 2, 3]
b = [4, 5, 6]
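
The original cell stops after defining the two lists. As a minimal sketch of how they could be used, `np.dot()` (or the `@` operator on arrays) multiplies matching elements and sums the results:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

dot_result = np.dot(a, b)   # 1*4 + 2*5 + 3*6 = 32
operator_result = a @ b     # same computation with the matrix-multiply operator

print(dot_result, operator_result)
```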
