# 10 - NumPy: Vectors, Arrays, and Plotting

## 1 - Introduction to NumPy

NumPy, short for Numerical Python, is a fundamental package for scientific computing in Python. It provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions that operate on them. NumPy is open-source software and has become a cornerstone of the data science and scientific computing ecosystem due to its efficiency and versatility.

### 1.1 - Features of NumPy

- **Efficient Storage and Operations**: NumPy arrays are more efficient in terms of storage and operations compared to Python lists, especially for large data sets. This efficiency comes from NumPy's ability to store data in contiguous blocks of memory, facilitating fast array operations.
- **Broadcasting**: NumPy can combine arrays of different but compatible shapes in arithmetic operations by implicitly stretching the smaller array to match the larger one. This feature, known as broadcasting, enables vectorized operations without manual loops or copies, leading to more concise and readable code.
- **Universal Functions**: These are high-performance, element-wise operations over NumPy arrays. They are the key to the package's high performance, providing a flexible interface for optimized array operations.
- **Slicing and Indexing**: NumPy offers comprehensive tools for indexing and slicing, allowing for the manipulation and extraction of data from arrays.
- **Linear Algebra, Fourier Transform, and Random Number Capabilities**: Beyond basic array manipulation, NumPy provides a host of functions for more complex mathematical operations, including matrix operations, Fourier transforms, and sophisticated random number generation.
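
As a minimal sketch of the broadcasting and universal-function features listed above (the array values and variable names are illustrative assumptions, not part of the original notes):

```python
import numpy as np

# A (3, 1) column and a (4,) row broadcast to a common (3, 4) shape.
col = np.array([[0], [10], [20]])
row = np.array([1, 2, 3, 4])
table = col + row

# np.sqrt is a universal function: it is applied element-wise.
roots = np.sqrt(np.array([1.0, 4.0, 9.0]))

print(table.shape)  # (3, 4)
print(table)
print(roots)        # [1. 2. 3.]
```

Neither array is copied or looped over explicitly; NumPy works out the common shape from the two input shapes.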

### 1.2 - How to Use NumPy?

#### Creating Arrays

An array is a collection of elements that all share the same data type, and its size is fixed when it is created. We can use NumPy's `np.array` function to create an array from a Python list.

In [2]:
import numpy as np

arr = np.array([1, 2, 3, 4])
arr

array([1, 2, 3, 4])

Every NumPy array is an instance of the `ndarray` class:

In [3]:
type(arr)

numpy.ndarray
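
Because all elements share one data type, NumPy converts mixed inputs to a common type. The short sketch below (with made-up values) shows this, along with the `shape`, `ndim`, and `size` attributes that every `ndarray` carries:

```python
import numpy as np

mixed = np.array([1, 2.5, 3])   # ints and a float are upcast to float64

print(mixed.dtype)                          # float64
print(mixed.shape, mixed.ndim, mixed.size)  # (3,) 1 3
```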

#### Special Arrays

In some calculations, we need to initialize an array with all zeros or ones:

In [4]:
zero_arr = np.zeros(10)
zero_arr

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [5]:
one_arr = np.ones(10)
one_arr

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

We can also use NumPy to create a 2D array (matrix):

In [6]:
matrix = np.array([[1, 2, 3], [4, 5, 6]])
matrix

array([[1, 2, 3],
       [4, 5, 6]])

We can also create arrays with a defined data type:

In [17]:
int_arr = np.array([2, 3, 4], dtype=int)
int_arr.dtype

dtype('int32')

In [18]:
float_arr = np.array([2, 3, 4], dtype=float)
float_arr.dtype

dtype('float64')

In [19]:
bool_arr = np.array([0, 1, 1], dtype=bool)
bool_arr.dtype

dtype('bool')
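
The integer width behind `dtype=int` depends on the platform and NumPy version: the `int32` shown above is typical of Windows with older NumPy releases, while many other setups report `int64`. A hedged sketch of requesting an explicit width and converting an existing array with `astype` (the variable names are illustrative):

```python
import numpy as np

fixed = np.array([2, 3, 4], dtype=np.int64)   # explicit 64-bit integers
as_float = fixed.astype(np.float32)           # astype returns a converted copy
as_bool = np.array([0, 1, 2]).astype(bool)    # zero -> False, nonzero -> True

print(fixed.dtype, as_float.dtype)  # int64 float32
print(as_bool)                      # [False  True  True]
```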

NumPy can create an array of $n$ uniformly spaced values from $a$ to $b$, with both endpoints included:

In [21]:
a = 1
b = 5
n = 10
arr = np.linspace(a, b, n)
arr

array([1.        , 1.44444444, 1.88888889, 2.33333333, 2.77777778,
       3.22222222, 3.66666667, 4.11111111, 4.55555556, 5.        ])
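
With both endpoints included, consecutive values of `np.linspace(a, b, n)` are $(b - a)/(n - 1)$ apart. A small check of that formula, reusing the values of `a`, `b`, and `n` from the cell above; the comparison with `np.arange` is an added illustration:

```python
import numpy as np

a, b, n = 1, 5, 10
arr, step = np.linspace(a, b, n, retstep=True)  # retstep also returns the spacing

print(step)                              # 4/9, about 0.444
print(np.allclose(np.diff(arr), step))   # True

# np.arange takes a step size instead of a count and excludes the stop value.
print(np.arange(0, 1, 0.25))             # [0.   0.25 0.5  0.75]
```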

Create an array of $n$ random integers in the half-open interval $[a, b)$ (the upper bound is excluded):

In [24]:
a = 10
b = 50
n = 4
rd_int_arr = np.random.randint(a, b, n)
rd_int_arr

array([39, 32, 45, 10])

In [30]:
# For random floats, use a Uniform(a,b) distribution
rd_float_arr = np.random.uniform(a, b, n)
rd_float_arr

array([47.53384627, 12.20116381, 31.28464847, 32.54825994])

In [31]:
# The random.normal() function takes mean and standard deviation as arguments
mu = 5
s = 2
n = 10
rd_float_arr = np.random.normal(mu, s, n)
rd_float_arr

array([7.65143805, 1.61665877, 6.65579479, 3.95870947, 6.0919045 ,
       2.40302267, 4.57518371, 4.25373789, 2.56358796, 4.8299748 ])
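
The random arrays above change on every run. If reproducible results are needed, one option is NumPy's `Generator` interface with a fixed seed. The seed value `42` and the variable names below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # seeded generator gives repeatable draws

ints = rng.integers(10, 50, size=4)            # integers in [10, 50)
floats = rng.uniform(10, 50, size=4)           # floats in [10, 50)
normals = rng.normal(loc=5, scale=2, size=10)  # mean 5, standard deviation 2

# Re-creating the generator with the same seed reproduces the same values.
rng_again = np.random.default_rng(seed=42)
print(np.array_equal(ints, rng_again.integers(10, 50, size=4)))  # True
```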

#### Basic Operations

NumPy supports element-wise arithmetic between arrays as well as reductions, such as sums and means, over a whole array.

In [35]:
arr1 = np.random.randint(low=5, high=10, size=10)
arr2 = np.random.randint(low=3, high=9, size=10)

# Element-wise addition
arr = arr1 + arr2

print(arr1)
print(arr2)
print(arr)

[5 7 8 7 9 9 8 9 9 7]
[4 8 4 6 4 8 3 3 6 5]
[ 9 15 12 13 13 17 11 12 15 12]


In [37]:
# Scalar Multiplication
10 * arr

array([ 90, 150, 120, 130, 130, 170, 110, 120, 150, 120])

In [38]:
# Summing elements of an array
np.sum(arr)

129

In [39]:
# Find mean
np.mean(arr)

12.9
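
The same reductions can also run along a single axis of a 2D array. A brief sketch using the matrix values from the earlier 2D example:

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])

print(np.sum(matrix))           # 21, all elements
print(np.sum(matrix, axis=0))   # [5 7 9], column sums
print(np.mean(matrix, axis=1))  # [2. 5.], row means
print(matrix * matrix)          # element-wise product, not a matrix product
```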

#### Reshaping and Manipulating Arrays

NumPy arrays can be reshaped, allowing for flexible data manipulation.

In [40]:
arr = np.random.randint(0, 5, 10)
arr

array([0, 3, 4, 3, 3, 2, 2, 3, 3, 0])

In [43]:
# Reshape 1D array to 2D array
reshapened_arr = arr.reshape((2, 5))
reshapened_arr

array([[0, 3, 4, 3, 3],
       [2, 2, 3, 3, 0]])

In [44]:
flattened_arr = reshapened_arr.flatten()
flattened_arr

array([0, 3, 4, 3, 3, 2, 2, 3, 3, 0])

In [45]:
reshapened_arr.ravel()

array([0, 3, 4, 3, 3, 2, 2, 3, 3, 0])

Note:
- `flatten()` returns a new array and does not modify the original array. It always returns a copy of the original array.
- `ravel()` also returns a flattened array, but it returns a view of the original array whenever possible and copies only when necessary. This makes `ravel()` potentially more memory efficient, with the caveat that writing to a view also changes the original array.
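
The difference between the two can be checked with `np.shares_memory`. This is a small sketch with an illustrative array named `grid`, not one of the arrays from the cells above:

```python
import numpy as np

grid = np.arange(6).reshape((2, 3))

flat_copy = grid.flatten()   # always a copy
flat_view = grid.ravel()     # a view here, because grid is contiguous

print(np.shares_memory(grid, flat_copy))  # False
print(np.shares_memory(grid, flat_view))  # True

flat_view[0] = 99            # writing through the view changes grid
print(grid[0, 0])            # 99
print(flat_copy[0])          # 0, the copy is unaffected
```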

### 1.3 - Vector-Valued Functions

We can associate a vector-valued function $g: \mathbb{R}^n \rightarrow \mathbb{R}^n$ with any scalar function $f: \mathbb{R} \rightarrow \mathbb{R}$ by applying $f$ to each component:

$$
g(\vec{\nu}) =
\begin{pmatrix}
    f(\nu_0) \\
    f(\nu_1) \\
    \vdots \\
    f(\nu_{n-1})
\end{pmatrix}
$$

For example, if $f(x) = \sin(x)$ then $g(\vec{\nu}) = [\sin(\nu_0), \dots, \sin(\nu_{n-1})]$, where $\vec{\nu} = [\nu_0, \dots, \nu_{n-1}]$.
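
In NumPy terms, a universal function such as `np.sin` plays the role of $g$. For a scalar function that has no built-in array version, `np.vectorize` offers the same calling convention, although it is a convenience wrapper around a Python-level loop rather than a speed-up. The function `f` below is a made-up example:

```python
import math

import numpy as np

v = np.array([0.0, math.pi / 2, math.pi])

g_sin = np.sin(v)            # g(v) = [sin(v_0), ..., sin(v_{n-1})]

def f(x):
    """Hypothetical scalar function f: R -> R."""
    return x ** 2 + 1.0 if x >= 0 else 0.0

g = np.vectorize(f)          # applies f to each element in turn

print(g_sin.shape)                     # (3,)
print(g(np.array([-1.0, 0.0, 2.0])))   # [0. 1. 5.]
```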

### 1.4 - Vectorization

Vectorization, in the context of data analysis and scientific computing, refers to the process of operating on entire arrays of data without explicitly writing loop constructs in the code. This technique is not only syntactically cleaner but also significantly faster due to several underlying optimizations. Vectorization exploits the capabilities of modern CPUs and computing architectures, which are designed to perform operations on multiple data points simultaneously, a feature known as Single Instruction, Multiple Data (SIMD).

Advantages of vectorization:
- **Performance Improvement**: Vectorized operations are typically executed by optimized, compiled code behind the scenes, taking advantage of low-level optimizations and parallel execution capabilities of the CPU. This makes vectorized operations much faster than their non-vectorized counterparts, particularly for large datasets.
- **Simpler, More Readable Code**: By eliminating explicit loops, vectorized code is often more concise and easier to read, making it easier to understand and maintain.
- **Efficient Use of Memory**: Vectorized operations can use memory efficiently when results are written in place (for example with `+=` or a ufunc's `out` argument). Note that chained array expressions may still allocate temporary arrays for intermediate results.
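
To make the contrast concrete, the sketch below writes one computation twice: once with an explicit loop and once vectorized. The temperature-conversion example and its values are assumptions chosen for illustration; the last lines show an in-place update through the `out` argument:

```python
import numpy as np

celsius = np.array([0.0, 25.0, 100.0])

# Explicit loop: one element at a time in the Python interpreter.
fahrenheit_loop = np.empty_like(celsius)
for i in range(len(celsius)):
    fahrenheit_loop[i] = celsius[i] * 9 / 5 + 32

# Vectorized: one expression over the whole array.
fahrenheit_vec = celsius * 9 / 5 + 32

print(np.array_equal(fahrenheit_loop, fahrenheit_vec))  # True

# In-place variant: write the result into the existing buffer.
np.multiply(celsius, 9 / 5, out=celsius)
celsius += 32
print(celsius)
```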

### 1.5 - Vectorization Performance

Given a vector $\vec{\nu}$ of length $n = 10000$, we can compute $\sin(\vec{\nu}) = [\sin(\nu_0), \dots, \sin(\nu_{n-1})]$ and compare how long a Python loop and a vectorized NumPy call take.

In [46]:
import math
import time

ARR_SIZE = 10000
data = np.random.uniform(-3, 3, ARR_SIZE)

In [49]:
# Using a plain Python loop (written as a list comprehension)
start_time = time.time()
sin_values_py = [math.sin(val) for val in data]
end_time = time.time()

# Display time taken in seconds
python_time = end_time - start_time
python_time

0.0032558441162109375

In [50]:
# Using NumPy vectorization
start_time = time.time()
sin_values_np = np.sin(data)
end_time = time.time()

# Display time taken in seconds
numpy_time = end_time - start_time
numpy_time

0.00029087066650390625

In [51]:
python_time / numpy_time

11.193442622950819
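
A single `time.time()` measurement is noisy, so the ratio above will vary between runs and machines. As a hedged alternative, the sketch below repeats each measurement with the standard-library `timeit` module and also checks that both approaches agree. The repeat counts are arbitrary choices:

```python
import math
import timeit

import numpy as np

ARR_SIZE = 10000
data = np.random.uniform(-3, 3, ARR_SIZE)

# Best of 5 repeats, each running the statement 20 times.
python_time = min(timeit.repeat(lambda: [math.sin(val) for val in data],
                                number=20, repeat=5))
numpy_time = min(timeit.repeat(lambda: np.sin(data), number=20, repeat=5))

print(f"Python loop: {python_time:.6f} s for 20 runs")
print(f"NumPy:       {numpy_time:.6f} s for 20 runs")
print(f"Speed-up:    {python_time / numpy_time:.1f}x")

# Both approaches should give the same values up to rounding.
print(np.allclose([math.sin(val) for val in data], np.sin(data)))  # True
```

Taking the minimum over several repeats reduces the influence of background activity on the measurement.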

## 2 - Curve Plotting

### 2.1 - Features of Matplotlib

- **Versatile Plotting**: 

### 2.2 - Augmenting the Figure

### 2.3 - Plot Multiple Curves

#### Customizing Markers

### 2.4 - Combine Plot with Subplots

### 2.5 - Polar Coordinate Plotting

#### All Polar Curves

#### Plot Polar and Cartesian Curves