# NumPy fundamentals

Many thanks to: https://numpy.org/doc/stable/user/absolute_beginners.html

## Introduction

There are 6 general mechanisms for creating NumPy arrays:

1. Conversion from other Python structures (e.g. lists and tuples)

2. Intrinsic NumPy array creation functions (e.g. arange, ones, zeros, etc.)

3. Replicating, joining, or mutating existing arrays

4. Reading arrays from disk, either from standard or custom formats

5. Creating arrays from raw bytes through the use of strings or buffers

6. Use of special library functions (e.g. random)

In most cases, arrays are created using mechanism 1 or 2.

In [2]:
import numpy as np

## How to create a basic array (Vector)
To create a NumPy array, you can use the function `np.array()`. All you need to do to create a simple array is pass a list to it.

In [3]:
array = np.array([1, 2, 3, 4, 5])

You can visualize your array this way:
<img src="res/np_array.png">

### How do we get the dimension, size or shape of an array?
- `ndarray.ndim` will tell you the number of axes, or dimensions, of the array.

- `ndarray.size` will tell you the total number of elements of the array. This is the product of the elements of the array’s shape.

- `ndarray.shape` will display a tuple of integers that indicate the number of elements stored along each dimension of the array. If, for example, you have a 2-D array with 2 rows and 3 columns, the shape of your array is (2, 3).

- `ndarray.dtype` will tell you the data type of the elements. All elements of an array share the same type.

In [4]:
array.ndim

1

In [5]:
array.size

5

In [6]:
array.shape

(5,)

In [7]:
array.dtype

dtype('int64')
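The cells above use a 1-D array. As a small sketch of the 2-D case mentioned in the description of `shape`, the following uses an illustrative array named `matrix` (a name chosen here, not taken from the notebook) with 2 rows and 3 columns:

```python
import numpy as np

# A 2-D array with 2 rows and 3 columns (illustrative values)
matrix = np.array([[1, 2, 3], [4, 5, 6]])

print(matrix.ndim)   # 2
print(matrix.shape)  # (2, 3)
print(matrix.size)   # 6, the product of the shape: 2 * 3
```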

### Intrinsic NumPy array creation functions
- `np.zeros()` creates an array filled with 0’s
- `np.ones()` creates an array filled with 1’s
- `np.arange()` creates an array with a range of elements
- `np.linspace()` creates an array with values that are spaced linearly in a specified interval 

In [9]:
np.zeros(3)

array([0., 0., 0.])

In [10]:
np.ones(3)

array([1., 1., 1.])

In [11]:
np.arange(1, 10, 2)

array([1, 3, 5, 7, 9])

In [12]:
np.linspace(0, 1, 6)

array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])
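One difference that is easy to miss: `np.arange()` excludes the stop value, while `np.linspace()` includes it by default. A minimal sketch with illustrative variable names:

```python
import numpy as np

# arange(start, stop, step): the stop value is excluded
stepped = np.arange(0, 10, 2)

# linspace(start, stop, num): the stop value is included by default
spaced = np.linspace(0, 10, 5)

print(stepped.tolist())  # [0, 2, 4, 6, 8]
print(spaced.tolist())   # [0.0, 2.5, 5.0, 7.5, 10.0]
```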

### Specifying your data type
While the default data type of functions such as `np.zeros()` and `np.ones()` is floating point (`np.float64`), you can explicitly specify which data type you want using the `dtype` keyword.

In [13]:
np.float64

numpy.float64

In [15]:
my_ones = np.ones(3, dtype=np.int64)
my_ones.dtype

dtype('int64')
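As a further sketch, the `dtype` keyword works the same way for `np.zeros()`, and `np.array()` infers a type from its input when none is given. The variable names below are illustrative only:

```python
import numpy as np

default_zeros = np.zeros(3)               # float64 by default
int_zeros = np.zeros(3, dtype=np.int32)   # explicitly requested 32-bit integers
from_floats = np.array([1.0, 2, 3])       # inferred as float64 because one value is a float

print(default_zeros.dtype)  # float64
print(int_zeros.dtype)      # int32
print(from_floats.dtype)    # float64
```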

### What about reshaping an array?
Yes, this is possible, but a little tricky! 

>Using `arr.reshape()` will give a new shape to an array without changing the data. Just remember that when you use the reshape method, the array you want to produce needs to have the **same number of elements** as the original array. If you start with an array with 12 elements, you’ll need to make sure that your new array also has a total of 12 elements.

In [17]:
data = np.array([1, 2, 3, 4, 5, 6])

To reshape this vector into an array with three rows and two columns, use `reshape(3, 2)`.

In [18]:
data.reshape(3, 2)

array([[1, 2],
       [3, 4],
       [5, 6]])

<img src="res/np_reshape.png" width="70%">

Eureka! We converted a vector into a 2D array (matrix). To convert the array back to a vector, use `reshape` again:

In [19]:
data.reshape(3, 2).reshape(6)

array([1, 2, 3, 4, 5, 6])
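The "same number of elements" rule can be checked directly. The sketch below shows two things, neither taken from the cells above: passing `-1` so that NumPy infers one dimension, and the `ValueError` raised when the element counts do not match.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6])

# -1 asks NumPy to infer that dimension from the element count (6 / 2 = 3)
inferred = data.reshape(2, -1)
print(inferred.shape)  # (2, 3)

# 6 elements cannot fill a 4 x 2 array, so NumPy raises a ValueError
try:
    data.reshape(4, 2)
    reshape_failed = False
except ValueError as error:
    reshape_failed = True
    print("reshape failed:", error)
```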

### Indexing and slicing
You can index and slice NumPy arrays in the same ways you can slice Python lists.

In [20]:
data = np.array([1, 2, 3])

In [21]:
data[0]

np.int64(1)

<img src="res/np_indexing.png">
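The cell above shows indexing only. Slicing follows the same `start:stop` convention as Python lists, as this short sketch illustrates:

```python
import numpy as np

data = np.array([1, 2, 3])

print(data[1])             # 2
print(data[0:2].tolist())  # [1, 2], the stop index is excluded
print(data[1:].tolist())   # [2, 3], from index 1 to the end
print(data[-2:].tolist())  # [2, 3], the last two elements
```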

## Basic array operations
### Broadcasting
There are times when you might want to carry out an operation between an array and a single number (also called an operation between a vector and a scalar). For example, your array (we’ll call it “data”) might contain information about distance in miles but you want to convert the information to kilometers. You can perform this operation with:

In [25]:
np.array([1.0, 2.0]) * 1.6

array([1.6, 3.2])

<img src="res/np_multiply_broadcasting.png">

### Addition, subtraction, multiplication, division, and more

In [26]:
data = np.array([1, 2])
ones = np.ones(2, dtype=int)

<img src="res/np_data_plus_ones.png">

In [27]:
data + ones

array([2, 3])

In [28]:
data-ones

array([0, 1])

In [29]:
data*ones

array([1, 2])

<img src="res/np_sub_mult_divide.png">
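Division is not shown in the cells above. A minimal sketch, reusing the same `data` and `ones` arrays; the floor-division line is an extra illustration:

```python
import numpy as np

data = np.array([1, 2])
ones = np.ones(2, dtype=int)

quotient = data / ones       # true division yields floats, even for integer inputs
floor_quotient = data // 2   # floor division keeps the integer dtype

print(quotient.tolist())        # [1.0, 2.0]
print(floor_quotient.tolist())  # [0, 1]
```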

### More useful array operations

In [30]:
np.array([1, 2, 3]).sum()

np.int64(6)

In [31]:
np.array([[1, 1], [2, 2]]).sum(axis=0)

array([3, 3])

In [32]:
np.array([[1, 1], [2, 2]]).sum(axis=1)

array([2, 4])

In [33]:
np.array([[1, 1], [2, 2]]).mean()

np.float64(1.5)

In [34]:
np.array([1, 2, 3, 4, 5]).min()

np.int64(1)

In [35]:
np.array([1, 2, 3, 4, 5]).max()

np.int64(5)

<img src="res/np_aggregation.png">

## Creating 2D arrays (matrices)
You can pass a Python list of lists to `np.array()` to create a 2-D array (or “matrix”).

In [37]:
np.array([[1, 2], [3, 4]])

array([[1, 2],
       [3, 4]])

<img src="res/np_create_matrix.png" width="90%">

### Indexing and slicing operations 

In [38]:
np.array([[1, 2], [3, 4]])[0]

array([1, 2])

In [39]:
np.array([[1, 2], [3, 4]])[0][0]

np.int64(1)

In [40]:
np.array([[1, 2, 6], [3, 4, 2]])[1:3]

array([[3, 4, 2]])

<img src="res/np_matrix_indexing.png">
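Besides chained indexing such as `[0][0]`, NumPy accepts a comma-separated index with one entry per axis, which also makes it possible to select columns. A short sketch with an illustrative array named `matrix`:

```python
import numpy as np

matrix = np.array([[1, 2, 6], [3, 4, 2]])

print(matrix[0, 0])            # 1, the same element as matrix[0][0]
print(matrix[:, 1].tolist())   # [2, 4], the second column
print(matrix[1, 1:].tolist())  # [4, 2], row 1 from column 1 onward
```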

### Useful operations with matrices 

In [42]:
data = np.array([[1, 2], [3, 4]])

In [43]:
data.sum()

np.int64(10)

In [44]:
data.max()

np.int64(4)

<img src="res/np_matrix_aggregation.png">

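You can aggregate all the values in a matrix, or aggregate along one axis using the `axis` parameter: `axis=0` collapses the rows and returns one result per **column**, while `axis=1` collapses the columns and returns one result per **row**: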

In [45]:
np.array([[1, 1], [2, 2]]).sum(axis=0)

array([3, 3])

In [46]:
np.array([[1, 1], [2, 2]]).sum(axis=1)

array([2, 4])

<img src="res/np_matrix_aggregation_row.png">

Once you’ve created your matrices, you can add and multiply them element by element using arithmetic operators, provided the **two matrices** are the **same size**.

In [48]:
data = np.array([[1, 2], [3, 4]])
ones = np.array([[1, 1], [1, 1]])
data * ones

array([[1, 2],
       [3, 4]])

<img src="res/np_matrix_arithmetic.png">
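Note that `*` multiplies matching positions and is not the matrix product from linear algebra. As a sketch of the difference, the `@` operator computes the matrix product:

```python
import numpy as np

data = np.array([[1, 2], [3, 4]])
ones = np.array([[1, 1], [1, 1]])

elementwise = data * ones     # multiplies matching positions
matrix_product = data @ ones  # rows of data times columns of ones

print(elementwise.tolist())     # [[1, 2], [3, 4]]
print(matrix_product.tolist())  # [[3, 3], [7, 7]]
```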

You can do these arithmetic operations on matrices of different sizes, but only if one matrix has only one column or one row. In this case, NumPy will use its **broadcast rules** for the operation.

In [49]:
data = np.array([[1, 2], [3, 4], [5, 6]])
ones_row = np.array([[1, 1]])
data + ones_row

array([[2, 3],
       [4, 5],
       [6, 7]])

<img src="res/np_matrix_broadcasting.png">
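The same rule applies to a single column. The sketch below, using illustrative values, broadcasts a `(3, 1)` column across a `(3, 2)` matrix and then shows the error raised for shapes that cannot be broadcast together:

```python
import numpy as np

data = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)
column = np.array([[10], [20], [30]])       # shape (3, 1)

# The single column is stretched across both columns of data
shifted = data + column
print(shifted.tolist())  # [[11, 12], [23, 24], [35, 36]]

# Shapes (3, 2) and (1, 3) are incompatible, so NumPy raises a ValueError
try:
    data + np.array([[1, 1, 1]])
    broadcast_failed = False
except ValueError as error:
    broadcast_failed = True
    print("broadcast failed:", error)
```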

### Transposing a matrix

In [50]:
data = np.array([[1, 2], [3, 4]])
data.T

array([[1, 3],
       [2, 4]])
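Transposing swaps the axes, which is easier to see on a non-square array. A small sketch with an illustrative `(2, 3)` array:

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)

print(matrix.T.shape)     # (3, 2)
print(matrix.T.tolist())  # [[1, 4], [2, 5], [3, 6]]
```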

### Flattening multidimensional arrays

In [51]:
data = np.array([[1, 2], [3, 4]])
data.flatten()

array([1, 2, 3, 4])
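`flatten()` always returns a copy, so changing the result leaves the original array untouched. `ravel()`, which is not used in the cell above, returns a view where possible. The following sketch contrasts the two on a small contiguous array:

```python
import numpy as np

data = np.array([[1, 2], [3, 4]])

flat_copy = data.flatten()  # always a new array
flat_copy[0] = 99
print(data[0, 0])  # 1, the original is untouched

flat_view = data.ravel()    # a view here, because data is contiguous in memory
flat_view[0] = 98
print(data[0, 0])  # 98, the change shows through to data
```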

### Dot product of two arrays

In [52]:
a = [1, 2, 3]
b = [4, 5, 6]
np.dot(a, b)

np.int64(32)
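For two 1-D inputs, the dot product is the sum of the pairwise products. As a quick sanity check of the result above, computed by hand in plain Python:

```python
import numpy as np

a = [1, 2, 3]
b = [4, 5, 6]

# 1*4 + 2*5 + 3*6 = 4 + 10 + 18 = 32
manual = sum(x * y for x, y in zip(a, b))

print(manual)               # 32
print(int(np.dot(a, b)))    # 32
```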

## Other showcases

In [54]:
np.identity(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [55]:
np.e

2.718281828459045

In [56]:
np.pi

3.141592653589793

In [57]:
np.linalg.cross([1, 2, 3], [4, 5, 6])

array([-3,  6, -3])

In [58]:
np.linalg.norm([1, 2, 3])

np.float64(3.7416573867739413)

In [59]:
np.random.rand(3)

array([0.61609123, 0.51317699, 0.04830059])

In [60]:
np.unique_values([1, 2, 3, 1, 2, 3])

array([1, 2, 3])

In [62]:
np.unique_counts([1, 2, 3, 1, 2, 3, 4, 3])

UniqueCountsResult(values=array([1, 2, 3, 4]), counts=array([2, 2, 3, 1]))

In [65]:
np.percentile([1, 2, 3, 4, 5], 50)

np.float64(3.0)

In [66]:
np.sort([3, 2, 1])

array([1, 2, 3])