# NumPy 

> NumPy is a package that extends Python with vectors, matrices, and related functions.

See [documentation](https://numpy.org/doc/stable/)

## Vectors

- Elements are all of the same type.
- For a vector, the number of elements is referred to as the **_dimension_** (or, mathematically, _rank_).
- The $0^{th}$ element of the vector $\mathbf{x}$ is $x_0$.
- The number of indices indicates the number of dimensions: an array addressed with 4 indices is a 4-D (4-dimensional) array.
- The **_shape_** of an array or vector denotes the number of indices and the number of elements along each one, for example the rows and columns of a 2-D array.

```python
import numpy as np

a = np.zeros((4, 4))  # the shape is passed as a single tuple

print(f"{a}")
# [[0. 0. 0. 0.]
#  [0. 0. 0. 0.]
#  [0. 0. 0. 0.]
#  [0. 0. 0. 0.]]

print(f"{a.shape}")
# (4, 4)
```
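To make the terms above concrete, here is a minimal sketch that inspects a vector and a matrix through the `ndim`, `shape`, and `size` attributes. The variable names `v` and `m` are only illustrative.

```python
import numpy as np

v = np.arange(4)         # 1-D array (a vector) with 4 elements
m = np.zeros((2, 3))     # 2-D array with 2 rows and 3 columns

print(v.ndim, v.shape, v.size)   # 1 (4,) 4
print(m.ndim, m.shape, m.size)   # 2 (2, 3) 6
print(v[0])                      # 0, the element x_0
```

Here `ndim` is the number of indices, `shape` lists the number of elements along each index, and `size` is the total number of elements.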

## Vector Operations

### Slicing

```python
import numpy as np  # np is the conventional alias for numpy

# Slicing
a = np.arange(10)
print("Slicing Example")
print(f"a        =  {a}")

# access 5 consecutive elements (start:stop:step)
c = a[2:7:1]
print("a[2:7:1] = ", c)

# access 3 elements, taking every second one
c = a[2:7:2]
print("a[2:7:2] = ", c)

# access all elements at index 3 and above
c = a[3:]
print("a[3:]    = ", c)

# access all elements below index 3
c = a[:3]
print("a[:3]    = ", c)

# access all elements
c = a[:]
print("a[:]     = ", c)
```

Output:

```
Slicing Example
a        =  [0 1 2 3 4 5 6 7 8 9]
a[2:7:1] =  [2 3 4 5 6]
a[2:7:2] =  [2 4 6]
a[3:]    =  [3 4 5 6 7 8 9]
a[:3]    =  [0 1 2]
a[:]     =  [0 1 2 3 4 5 6 7 8 9]
```
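Two further slicing behaviours are worth a short sketch: negative indices count from the end, and a basic slice is a view onto the original data rather than a copy. The names `view` and `independent` are illustrative only.

```python
import numpy as np

a = np.arange(10)

print(a[-1])      # last element: 9
print(a[-3:])     # last three elements: [7 8 9]
print(a[::-1])    # reversed: [9 8 7 6 5 4 3 2 1 0]

# A basic slice is a view: writing through it changes the original array.
view = a[2:5]
view[0] = 100
print(a[2])       # 100

# Use .copy() when an independent array is needed.
independent = a[2:5].copy()
independent[0] = -1
print(a[2])       # still 100
```

Because slices share memory with the source array, copy explicitly before modifying a slice if the original must stay unchanged.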




### Single Vector Operations

```python
import numpy as np

a = np.array([1, 2, 3, 4])
print(f"a             : {a}")

print("\nNegate elements of a")
b = -a
print(f"b = -a        : {b}")

print("\nSum all elements of a, returns a scalar")
b = np.sum(a)
print(f"b = np.sum(a) : {b}")

# mean of all elements, returns a scalar
b = np.mean(a)
print(f"b = np.mean(a): {b}")

# square each element
b = a**2
print(f"b = a**2      : {b}")
```

Output:

```
a             : [1 2 3 4]

Negate elements of a
b = -a        : [-1 -2 -3 -4]

Sum all elements of a, returns a scalar
b = np.sum(a) : 10
b = np.mean(a): 2.5
b = a**2      : [ 1  4  9 16]
```


Most NumPy arithmetic, logical, and comparison operations apply to vectors as well. These operators work on an element-by-element basis.

```python
import numpy as np

a = np.array([ 1,  2, 3, 4])
b = np.array([-1, -2, 3, 4])

# The operands must have the same size (a scalar operand is the usual exception)
print(f"Binary operators work element wise: {a + b}")
```

Output:

```
Binary operators work element wise: [0 0 6 8]
```
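As a quick check of the same-size requirement, the sketch below adds two vectors of different lengths and catches the resulting `ValueError`. The array `c` and the flag `raised` are illustrative names, not part of the original example.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
c = np.array([1, 2])        # different size from a

raised = False
try:
    d = a + c               # sizes 4 and 2 cannot be combined element-wise
except ValueError as err:
    raised = True
    print("Mismatched sizes raise an error:", err)

# A scalar is the common exception: it is applied to every element.
print(a + 10)               # [11 12 13 14]
```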


### Dot Product

```python
import numpy as np

# Dot product
print("Dot Product Example")

# Create two vectors
vector1 = np.array([2, 2, 2])
vector2 = np.array([2, 2, 2])
print("Vector 1:", vector1)
print("Vector 2:", vector2)

# Calculate the dot product
dot_product = np.dot(vector1, vector2)

print("Dot Product:", dot_product)
```

Output:

```
Dot Product Example
Vector 1: [2 2 2]
Vector 2: [2 2 2]
Dot Product: 12
```
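The dot product of two vectors of length $n$ is the sum of their element-wise products, $\mathbf{a} \cdot \mathbf{b} = \sum_{i=0}^{n-1} a_i b_i$. The sketch below reproduces the result above step by step and also shows the `@` operator, which gives the same value for 1-D arrays. The names `elementwise` and `manual_dot` are illustrative.

```python
import numpy as np

vector1 = np.array([2, 2, 2])
vector2 = np.array([2, 2, 2])

elementwise = vector1 * vector2       # [4 4 4]
manual_dot = np.sum(elementwise)      # 12

print(manual_dot == np.dot(vector1, vector2))   # True
print(vector1 @ vector2)                         # 12
```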


We use the NumPy library because it improves both speed and memory efficiency.

Vectorization provides a large speedup because NumPy makes better use of the data parallelism available in the underlying hardware. GPUs and modern CPUs implement Single Instruction, Multiple Data (SIMD) pipelines, which allow multiple operations to be issued in parallel. This is critical in Machine Learning, where the data sets are often very large.

```python
import time

import numpy as np


def my_dot(a, b):
    """
    Compute the dot product of two vectors.

    Args:
      a (ndarray (n,)):  input vector
      b (ndarray (n,)):  input vector with same dimension as a

    Returns:
      x (scalar): dot product of a and b
    """
    x = 0
    for i in range(a.shape[0]):
        x = x + a[i] * b[i]
    return x


np.random.seed(1)
a = np.random.rand(10000000)  # very large arrays
b = np.random.rand(10000000)

tic = time.time()  # capture start time
c = np.dot(a, b)
toc = time.time()  # capture end time

print(f"np.dot(a, b) =  {c:.4f}")
print(f"Vectorized version duration: {1000*(toc-tic):.4f} ms ")

tic = time.time()  # capture start time
c = my_dot(a, b)
toc = time.time()  # capture end time

print(f"my_dot(a, b) =  {c:.4f}")
print(f"loop version duration: {1000*(toc-tic):.4f} ms ")

del a, b  # remove these big arrays from memory
```

Output (timings vary by machine):

```
np.dot(a, b) =  2501072.5817
Vectorized version duration: 22.2754 ms 
my_dot(a, b) =  2501072.5817
loop version duration: 3552.6266 ms 
```
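A variation on the same comparison, offered as a sketch rather than a benchmark: it uses `time.perf_counter()`, which is designed for measuring short intervals, and smaller arrays so it finishes quickly. The helper `loop_dot`, the array length, and the seeded generator are assumptions for illustration. No timings are quoted because they depend on the machine.

```python
import time

import numpy as np


def loop_dot(a, b):
    """Dot product with an explicit Python loop (hypothetical helper for comparison)."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


rng = np.random.default_rng(1)   # seeded generator so runs are repeatable
a = rng.random(100_000)          # smaller than above to keep the run short
b = rng.random(100_000)

start = time.perf_counter()
vectorized = np.dot(a, b)
vectorized_ms = 1000 * (time.perf_counter() - start)

start = time.perf_counter()
looped = loop_dot(a, b)
loop_ms = 1000 * (time.perf_counter() - start)

print(f"vectorized: {vectorized_ms:.4f} ms, loop: {loop_ms:.4f} ms")
print(np.isclose(vectorized, looped))   # the two results agree up to rounding
```

A single measurement is noisy, so for a more reliable comparison repeat each measurement several times and compare the best or median durations.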
