## What is NumPy?

* Python library for numerical operations
* Provides a multidimensional array data type `ndarray`, with functions to operate on it efficiently and easily
* Fast and efficient: the core is implemented in compiled C/C++ code
* To handle large arrays and matrices of numeric data, use `ndarray` instead of Python's built-in `list` type
* Overview of functions available in NumPy: https://numpy.org/doc/stable/user/quickstart.html#functions-and-methods-overview
* Other useful special values and constants for data science: `np.nan`, `np.inf`, `np.pi`, `np.e`, etc.
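
As a rough illustration of the points above, the sketch below contrasts a plain `list` with an `ndarray` and touches a few of the constants. The variable names (`values`, `arr`, `circle_area`) are made up for this example.

```python
import numpy as np

# A Python list and the equivalent ndarray
values = [1.0, 2.0, 3.0, 4.0]
arr = np.array(values)

# With a list, element-wise arithmetic needs an explicit loop
doubled_list = [v * 2 for v in values]

# With an ndarray, the same operation is vectorized
doubled_arr = arr * 2

# Special floating-point values and constants
has_nan = np.isnan(np.array([1.0, np.nan])).any()  # True: one entry is NaN
circle_area = np.pi * 2.0 ** 2                     # area of a circle with radius 2

print(doubled_arr, has_nan, np.inf > 1e308, circle_area)
```

Both approaches give the same numbers here; the `ndarray` version avoids the Python-level loop, which is where most of the speed-up on large arrays tends to come from.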

## NumPy is the foundation of the Python data science ecosystem

![data-science infrastructure](https://numpy.org/images/content_images/ds-landscape.png)

## Quick NumPy demo

In [None]:
# The standard way to import NumPy:
import numpy as np # Equivalent to library() in R

# Create a 2-D array, set every second element in
# some rows and find max per row:

x = np.arange(15, dtype=np.int64).reshape(3, 5) # Other common dtypes: np.float64, np.complex128

# # Slice syntax: [start:stop:step] (same as Python lists)
# x[1:, ::2] = -99
x

In [None]:
# Random number generation:
# Functions np.random.rand(), np.random.randint(), and np.random.randn()
# are the most commonly used.

# Functions to specify a random seed and a specific distribution are also available.
y = np.random.randint(low=-100, high=100, size=(3, 5))
y
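
The cell above mentions seeds and specific distributions without showing them. One possible way to do both is NumPy's `Generator` interface, created with `np.random.default_rng()`. This is a sketch rather than part of the original demo, and the seed value 42 is an arbitrary choice.

```python
import numpy as np

# Seeded generator: the same seed reproduces the same numbers
rng = np.random.default_rng(seed=42)  # 42 is an arbitrary, illustrative seed

ints = rng.integers(low=-100, high=100, size=(3, 5))   # high is exclusive
normals = rng.normal(loc=0.0, scale=1.0, size=(3, 5))  # draws from a Gaussian

# Re-creating the generator with the same seed gives identical draws
rng_again = np.random.default_rng(seed=42)
same = np.array_equal(ints, rng_again.integers(low=-100, high=100, size=(3, 5)))
print(same)
```

Fixing the seed is mainly useful when results need to be reproducible, for example in teaching material or tests.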

In [None]:
# Arithmetic operations are element-wise:
x * 2
# x * y

# Linear algebra operations use dedicated functions:
# Dot product:
v = np.array([1, 2, 3, 4, 5])
np.dot(x, v)

# Matrix multiplication:
np.matmul(x, y.T) # same as x @ y.T
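
To make the shapes involved in these products explicit, here is a small self-contained sketch. It rebuilds `x` and `v` as in the demo, but replaces the random `y` with an array of ones (an assumption made only so the result is predictable).

```python
import numpy as np

x = np.arange(15).reshape(3, 5)      # shape (3, 5)
v = np.array([1, 2, 3, 4, 5])        # shape (5,)
y = np.ones((3, 5), dtype=x.dtype)   # illustrative stand-in for the random y

mv = x @ v     # matrix-vector product, shape (3,)
mm = x @ y.T   # (3, 5) @ (5, 3) -> shape (3, 3)

# The @ operator and the function forms agree for these 1-D/2-D inputs
assert np.array_equal(mv, np.dot(x, v))
assert np.array_equal(mm, np.matmul(x, y.T))
print(mv.shape, mm.shape)
```

The inner dimensions must match (here 5 and 5), which is why `y` is transposed before multiplying.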


In [None]:
# Describe the array:
print(f'''
shape: {x.shape}
dtype: {x.dtype}
ndim: {x.ndim}
size: {x.size}
''')

In [None]:
# Basic aggregation along an axis:
# axis=0 aggregates down the rows (one result per column),
# axis=1 aggregates across the columns (one result per row).
x.max(axis=1)
# x.mean(axis=1)
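
The `axis` argument is easy to mix up, so the following sketch shows the result shape for each choice on the same 3x5 array used in the demo. The names `col_max`, `row_max`, and `total_max` are illustrative.

```python
import numpy as np

x = np.arange(15).reshape(3, 5)

col_max = x.max(axis=0)  # collapses the 3 rows: 5 values, one per column
row_max = x.max(axis=1)  # collapses the 5 columns: 3 values, one per row
total_max = x.max()      # no axis given: aggregate over all elements

print(col_max.shape, row_max.shape, total_max)
```

A useful rule of thumb: the axis you pass is the one that disappears from the result's shape.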

### Exercise:
1. Using NumPy, create two 2-D arrays, one with random integers and one with random floats.
2. Calculate the mean and standard deviation of each array along the rows. Use `np.mean()` and `np.std()`.
3. Calculate the correlation coefficient between the two arrays. Find the function in the [numpy documentation](https://numpy.org/doc/stable/reference/routines.html).


In [None]:
# 1. Create arrays with random integers and floats.

# 2. Calculate the mean and standard deviation of each array along the rows.

# 3. Calculate the correlation coefficient between the two arrays. Use np.corrcoef().