### MEDC0106: Bioinformatics in Applied Biomedical Science

<p align="center">
  <img src="../../resources/static/Banner.png" alt="MEDC0106 Banner" width="90%"/>
  <br>
</p>

---------------------------------------------------------------

# 05 - Introduction to NumPy

*Written by:* Oliver Scott

**This notebook provides a general introduction to NumPy.**

Do not be afraid to make changes to the code cells to explore how things work!

### What is NumPy?

[**NumPy**](https://numpy.org/) is a popular Python package that provides multidimensional array and matrix data structures.

Description from the [NumPy user guide](https://numpy.org/devdocs/user/absolute_beginners.html):

> NumPy (Numerical Python) is an open source Python library that’s used in almost every field of science and engineering. It’s the universal standard for working with numerical data in Python, and it’s at the core of the scientific Python and PyData ecosystems. NumPy users include everyone from beginning coders to experienced researchers doing state-of-the-art scientific and industrial research and development. The NumPy API is used extensively in Pandas, SciPy, Matplotlib, scikit-learn, scikit-image and most other data science and scientific Python packages.

NumPy lets scientists write software that approaches the speed of C while using a much simpler API!

In this notebook we will learn the very basics of NumPy; the library could fill a whole lecture series by itself!

-----

## Contents

1. [The Basics](#The-Basics)
2. [Indexing and Slicing](#Indexing-and-Slicing)

-----

#### Extra Resources:

- [Learn NumPy](https://numpy.org/learn/) - Recommended learning material from the NumPy developers

-----

## The Basics

Importing NumPy is no different to importing any other package/module. NumPy users often use the `np` alias to keep code clean:

In [None]:
import numpy as np

array = np.array([1, 2, 3, 4])  # Create a 1D array from a Python list

print(array)

#### NumPy arrays vs Python lists 

- NumPy arrays hold a single data type (the 'dtype'), whereas Python lists can mix types
- NumPy arrays typically consume less memory than equivalent Python lists
- Element-wise operations on NumPy arrays are usually much faster than the equivalent loops over Python lists
- NumPy arrays are of a fixed size; changing the size creates a new array
- NumPy provides numerous (fast) mathematical operations that can be applied over arrays
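
Some of the differences listed above can be seen in a small sketch. The variable names here are purely illustrative and the example only uses basic array creation and arithmetic:

```python
import numpy as np

values = [1, 2, 3, 4]
array = np.array(values)

# A list can mix types; an array converts everything to a single dtype
mixed_array = np.array([1, 2.5, 3])
print(mixed_array.dtype)  # float64: the integers were promoted to floats

# Arithmetic on a list needs an explicit loop (or comprehension) ...
doubled_list = [v * 2 for v in values]

# ... whereas arithmetic on an array is applied to every element at once
doubled_array = array * 2

print(doubled_list)
print(doubled_array)

# Note: multiplying a list repeats it rather than doubling each element
repeated_list = values * 2
print(repeated_list)
```

The last line is a common source of confusion when moving between lists and arrays, so it is worth running the cell and comparing the three outputs.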

##### The structure of an array

The array is the fundamental data structure in the NumPy library. It consists of a grid of values that all share the same type, known in NumPy as the 'dtype'. This grid can be indexed in a similar way to Python lists, and also by a tuple of non-negative integers, by booleans, or by another array.
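
As a brief preview of these indexing styles (covered properly in the Indexing and Slicing section), here is a minimal sketch. The array `grid` and the other variable names are made up for illustration:

```python
import numpy as np

grid = np.array([[10, 20, 30],
                 [40, 50, 60]])

first_row = grid[0]                   # list-style index: the first row
single_value = grid[1, 2]             # tuple of integers: row 1, column 2
large_values = grid[grid > 25]        # boolean mask: elements greater than 25
picked_rows = grid[np.array([1, 0])]  # integer array: row 1 followed by row 0

print(first_row)
print(single_value)
print(large_values)
print(picked_rows)
```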

----

**Array structure:**

<p align="center">
  <img src="https://i.imgur.com/mg8O3kd.png" alt="NumPy Arrays" width="100%"/>
  <br>
</p>

[Image Source](https://www.freecodecamp.org/news/exploratory-data-analysis-with-numpy-pandas-matplotlib-seaborn/)


**Multiple Dimensions:**

NumPy arrays can also be multidimensional (1D, 2D, 3D ... ND), meaning the NumPy array structure can be used to model vectors (1D) and matrices (2D). Arrays with three or more dimensions are often referred to as tensors (see above).

Dimensions in NumPy are referred to as 'axes'. A 2D array may look something like this:

```python
[[0., 0., 0.],
 [1., 1., 1.]]
```

Here there are two axes: the first has a length of two and the second a length of three. You can access the shape of an array with the `.shape` attribute, which is a tuple giving the length of each axis. In this case it would be `(2, 3)`.
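
To check this, the 2D array above can be built and inspected directly. This sketch also prints a few related attributes (`.ndim`, `.size` and `.dtype`) that are not discussed in the text but are often useful alongside `.shape`:

```python
import numpy as np

array_2d = np.array([[0., 0., 0.],
                     [1., 1., 1.]])

print(array_2d.shape)  # (2, 3): length of each axis
print(array_2d.ndim)   # 2: number of axes
print(array_2d.size)   # 6: total number of elements
print(array_2d.dtype)  # float64: type shared by all elements
```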


**Creating NumPy arrays:**

There are numerous ways to create a NumPy array. We will cover some of the ways here:

- `np.array()`
- `np.zeros()`
- `np.ones()`
- `np.empty()`
- `np.arange()`
- `np.linspace()`

`np.array()` can be used to construct an array from a Python list (nested lists give additional dimensions):

In [None]:
array_1d = np.array([1, 2, 3, 4])                  # A 1D array
array_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])  # A 2D array

print('1D NumPy array:\n', array_1d)
print('\n2D NumPy array:\n', array_2d)

`np.zeros()` fills an array with zeros, while `np.ones()` fills an array with ones:

In [None]:
zero_array_1d = np.zeros(6)
ones_array_2d = np.ones((6, 3))

print('1D NumPy array:\n', zero_array_1d)
print('\n2D NumPy array:\n', ones_array_2d)

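`np.empty()` works in much the same way but does not initialise the values: the array contains whatever happened to be in that memory, so the contents are arbitrary rather than truly random. This can be quicker than zeros/ones, but every element should be assigned before it is read: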

In [None]:
array = np.empty(10)

print(array)
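
Because the contents of an empty array are arbitrary, a typical pattern is to allocate it and then overwrite every element before use. The following is only a sketch of that pattern; the name `buffer` and the fill value are illustrative:

```python
import numpy as np

buffer = np.empty(5)  # allocated, but the values are not meaningful yet
buffer.fill(1.5)      # overwrite every element with a known value

print(buffer)
```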

`np.arange()` creates an array containing a range of numbers, similar to the Python `range()` function (the stop value is excluded). A step size can also be provided:

In [None]:
array = np.arange(0, 11, 2)

print(array)

`np.linspace()` creates an array containing a given number of evenly spaced values within a specified interval (both endpoints are included by default):

In [None]:
array = np.linspace(0, 20, 5)  # 0 -> 20 in 5 values

print(array)
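
The difference between `np.arange()` and `np.linspace()` is easiest to see side by side. In this sketch (variable names are illustrative), `np.arange()` is given a step size and excludes the stop value, while `np.linspace()` is given a number of values and includes the stop value unless `endpoint=False` is passed:

```python
import numpy as np

stepped = np.arange(0, 20, 5)                        # step of 5: 0, 5, 10, 15
spaced = np.linspace(0, 20, 5)                       # 5 values: 0, 5, 10, 15, 20
open_ended = np.linspace(0, 20, 5, endpoint=False)   # 5 values: 0, 4, 8, 12, 16

print(stepped)
print(spaced)
print(open_ended)
```

As a rule of thumb, `np.arange()` is convenient when the step matters and `np.linspace()` when the number of points matters.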

**Specifying a dtype**

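When creating an array with functions such as `np.zeros()`, `np.ones()` or `np.empty()`, the dtype defaults to `np.float64`. `np.array()` and `np.arange()` instead infer the dtype from the values they are given. In all cases the dtype can be specified by the user.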

Learn more about datatypes [here](https://www.tutorialspoint.com/numpy/numpy_data_types.htm)

In [None]:
array = np.ones((4, 4), dtype=np.int64)  # Here we specify a 64-bit integer dtype

print(array)
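
The following sketch shows how the dtype is chosen when none is given, and how an existing array can be converted with `.astype()` (a method not covered above). The variable names are illustrative, and the exact integer dtype inferred for `ints` depends on the platform:

```python
import numpy as np

ints = np.array([1, 2, 3])          # integers in -> an integer dtype
floats = np.array([1.0, 2.0, 3.0])  # floats in -> float64
zeros = np.zeros(3)                 # no data to infer from -> float64

print(ints.dtype, floats.dtype, zeros.dtype)

# astype() returns a converted copy; float -> int conversion drops the
# fractional part (truncates toward zero) rather than rounding
truncated = np.array([1.7, 2.2, -0.5]).astype(np.int64)
print(truncated)  # values 1, 2, 0
```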