In [5]:
import numpy as np
import pandas as pd

# Numpy

`Numpy` (Numerical Python) is one of the most important foundational packages for numerical computing in Python.
<br/>
One reason NumPy matters for numerical computing in Python is that it is designed for efficiency on large arrays of data:

<ul>
    <li>
        Numpy internally stores data in a contiguous block of memory, independent of other built-in Python objects
    </li>
    <li>
        NumPy arrays use much less memory than Python's built-in sequences
    </li>
    <li>
        Numpy operations perform complex computations on entire arrays without the need for Python `for` loops
    </li>
</ul>
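
The memory points above can be checked directly. This is a minimal sketch, assuming CPython and a 64-bit integer dtype; the list figure is only a rough estimate, because `sys.getsizeof` counts the list's pointer table and each integer object separately.

```python
import sys
import numpy as np

n = 1_000

# NumPy array: one contiguous buffer of fixed-size 64-bit integers.
arr = np.arange(n, dtype=np.int64)
print(arr.flags["C_CONTIGUOUS"])  # contiguity flag of the underlying buffer
print(arr.itemsize)               # bytes per element
print(arr.nbytes)                 # itemsize * number of elements

# Python list: a table of pointers plus one separate int object per element.
py_list = list(range(n))
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)
print(list_bytes)                 # rough estimate; varies by Python build
```

On a typical CPython build the list estimate comes out several times larger than `arr.nbytes`.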


In [10]:
my_arr = np.arange(1_000_000)
%timeit my_arr2 = my_arr * 2

3.61 ms ± 74.5 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [11]:
my_list = list(range(1_000_000))
%timeit my_list2 = [x * 2 for x in my_list]

85.7 ms ± 2.91 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


The result above shows that multiplying every element of the NumPy array by 2 takes roughly `3.61 ms`, while the same operation on a list takes about `85.7 ms`, roughly 24 times longer.
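
`%timeit` is an IPython magic, so it only works inside a notebook or IPython shell. In a plain script, a similar comparison can be sketched with the standard-library `timeit` module. The array size and repetition count below are arbitrary choices, and the absolute numbers depend on the machine.

```python
import timeit
import numpy as np

my_arr = np.arange(100_000)
my_list = list(range(100_000))

# Time 20 repetitions of each approach.
numpy_seconds = timeit.timeit(lambda: my_arr * 2, number=20)
list_seconds = timeit.timeit(lambda: [x * 2 for x in my_list], number=20)

print(f"NumPy: {numpy_seconds:.4f} s, list: {list_seconds:.4f} s")
print(f"Ratio (list / NumPy): {list_seconds / numpy_seconds:.1f}")
```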


## Numpy array

One of the key features of NumPy is its n-dimensional array object, `ndarray`. An array lets you perform mathematical operations on whole blocks of data using syntax similar to the equivalent operations between scalar elements. In other words, you can apply scalar-style operations to an array that would otherwise require a loop over a built-in Python list.


In [19]:
example_data = np.array([
    [1, 0.5, 6],
    [5, -3, 5.5]
])
print(f"Data: {example_data}\n")

print(f"Data doubled: {example_data * 2}\n")

print(f"Data added with itself: {example_data + example_data}")

Data: [[ 1.   0.5  6. ]
 [ 5.  -3.   5.5]]

Data doubled: [[ 2.  1. 12.]
 [10. -6. 11.]]

Data added with itself: [[ 2.  1. 12.]
 [10. -6. 11.]]
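
For contrast, here is a sketch of the same doubling with built-in lists only. The variable names are illustrative. Note that `*` on a list repeats it rather than multiplying its elements.

```python
rows = [
    [1, 0.5, 6],
    [5, -3, 5.5],
]

# Doubling every element needs an explicit nested loop (here, a comprehension).
doubled = [[value * 2 for value in row] for row in rows]
print(doubled)

# `rows * 2` does not double the values; it repeats the outer list.
repeated = rows * 2
print(len(repeated))  # 4 rows instead of 2
```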


A NumPy array is a multidimensional container for homogeneous data: all of its elements must have the same data type.


In [21]:
# example_data = np.array([
#     [1, 0.5, 6],
#     [5, 'hello', 5.5]
# ])

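# This does not raise an error. NumPy coerces all elements to one common dtype,
# so the numbers here become strings (a Unicode string dtype).
# In an all-numeric array, integers are upcast to float.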
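
How NumPy handles mixed input can be checked with a short sketch. It picks a single dtype that can represent every element, so no exception is raised when the array is created.

```python
import numpy as np

# Integers and floats together: everything is upcast to float64.
numeric = np.array([1, 0.5, 6])
print(numeric.dtype)

# Numbers and a string together: everything becomes a Unicode string.
mixed = np.array([1, 0.5, "hello"])
print(mixed.dtype.kind)  # 'U' marks a Unicode string dtype
print(mixed)
```

Arithmetic such as `mixed * 2` would then fail, because the elements are no longer numbers.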

The easiest way to create an array is to pass a sequence, such as a list, to `np.array()`.


In [32]:
example_data = [6, 0.2, 3.6, 4.2]
example_data_array = np.array(example_data)
print(f'Array: {example_data_array}')  # printed like a plain list, but its type is different

print(f"Example data type: {type(example_data)}")
print(f"Example data array type: {type(example_data_array)}")

print(f"Example array dimensions: {example_data_array.ndim}")
print(f'Example array shape: {example_data_array.shape}')
print(f'Example array size: {example_data_array.size}')
print(f"Example array data type: {example_data_array.dtype}")

Array: [6.  0.2 3.6 4.2]
Example data type: <class 'list'>
Example data array type: <class 'numpy.ndarray'>
Example array dimensions: 1
Example array shape: (4,)
Example array size: 4
Example array data type: float64
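
The same attributes are easier to tell apart on a two-dimensional array. This is a small sketch with a hypothetical `grid` array; the default integer dtype depends on the platform, so it is printed without an expected value.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

print(grid.ndim)   # 2: number of axes
print(grid.shape)  # (2, 3): 2 rows, 3 columns
print(grid.size)   # 6: total number of elements
print(grid.dtype)  # default integer type for the platform
```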


You can explicitly cast an array from one data type to another with the ndarray method `astype()`. Casting floats to integers truncates the decimal part.


In [68]:
print(f'Example array: {example_data_array}')

print(f'Example array type: {example_data_array.dtype}\n')

int_example_array = example_data_array.astype(np.int64)

print(f"Example array after type conversion to int: {int_example_array}")

print(f'Example array type: {int_example_array.dtype}\n')

Example array: [6.  0.2 3.6 4.2]
Example array type: float64

Example array after type conversion to int: [6 0 3 4]
Example array type: int64
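
Casting to an integer type drops the fractional part instead of rounding, and `astype()` returns a new array without changing the original. Below is a short sketch with illustrative values, including one way to round before casting.

```python
import numpy as np

values = np.array([-1.7, -0.2, 0.2, 1.7])

truncated = values.astype(np.int64)          # drops the fractional part
rounded = np.round(values).astype(np.int64)  # round first, then cast

print(truncated)     # [-1  0  0  1]
print(rounded)       # [-2  0  0  2]
print(values.dtype)  # float64: the original array is unchanged
```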



If an array holds strings that represent numbers, `astype()` can convert them to floats or integers. If the cast fails (for example, a string that cannot be parsed as a number), a `ValueError` is raised.


In [84]:
string_numbers_array = np.array(['1.2', '4.4', '0.5', '-1.6'])
print("Strings array: ", string_numbers_array)
print("Strings array type: ", string_numbers_array.dtype)

float_numbers_array = string_numbers_array.astype(np.float64)
print("\nFloat numbers array: ", float_numbers_array)
print("Float numbers array type: ", float_numbers_array.dtype)

Strings array:  ['1.2' '4.4' '0.5' '-1.6']
Strings array type:  <U4

Float numbers array:  [ 1.2  4.4  0.5 -1.6]
Float numbers array type:  float64
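
The failure case can be sketched as follows. The `bad_strings` array is a made-up example in which one element is not a valid number.

```python
import numpy as np

bad_strings = np.array(["1.2", "not a number", "0.5"])

try:
    bad_strings.astype(np.float64)
    cast_failed = False
except ValueError as error:
    cast_failed = True
    print(f"Cast failed: {error}")
```

Catching the `ValueError` lets the calling code clean or skip the bad entries instead of crashing.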


## Numpy functions


NumPy also provides several other functions for creating new arrays.


In [58]:
print(f"Zero function: {np.zeros(5)}\n")  # 1-D array of five zeros

print(f"Zero matrix: {np.zeros((3,3))}\n")  # the shape is passed as a tuple

print(f"Empty array: {np.empty((2,3))}\n")  # uninitialized memory: the contents are arbitrary

print(f"Type of numpy.empty: {type(np.empty((2,2)))}\n")

print(f"Identity matrix: {np.eye(3)}\n")  # 3x3 identity matrix

print(f"Random matrix: {np.random.random((2, 2))}\n")  # uniform random values in [0, 1)

print(f"Matrix of one: {np.ones((2,3))}\n")

print(f"Matrix of one while taking an another array as input: {np.ones_like(example_data)}\n")  # ones_like returns an array of ones with the same shape and dtype as its input

print(f"Full numpy matrix: {np.full(3,5)}")  # array of shape (3,) filled with the value 5

Zero function: [0. 0. 0. 0. 0.]

Zero matrix: [[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]

Empty array: [[1. 1. 1.]
 [1. 1. 1.]]

Type of numpy.empty: <class 'numpy.ndarray'>

Identity matrix: [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

Random matrix: [[0.51772468 0.0988314 ]
 [0.00306367 0.47616796]]

Matrix of one: [[1. 1. 1.]
 [1. 1. 1.]]

Matrix of one while taking an another array as input: [1. 1. 1. 1.]

Full numpy matrix: [5 5 5]
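
Most of these creation functions also accept a shape tuple and an optional `dtype` argument. This is a brief sketch with illustrative names and values.

```python
import numpy as np

int_zeros = np.zeros((2, 3), dtype=np.int64)  # integer zeros instead of the float default
filled = np.full((2, 2), 7.5)                 # 2x2 array where every element is 7.5
template = np.array([[1, 2], [3, 4]])
same_shape = np.zeros_like(template)          # zeros with template's shape and dtype

print(int_zeros)
print(filled)
print(same_shape)
```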


## Numpy Array Arithmetic

An important feature of NumPy arrays is that you can apply arithmetic operations to them without writing loops. This is called `vectorization`. <br/> <br/>
Vectorization means performing an operation on an entire array at once rather than iterating through individual elements with a loop, which leads to faster and more efficient code.
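
To make the contrast concrete, the sketch below computes the same element-wise product twice: once with an explicit loop and once as a single vectorized expression. The `prices` and `quantities` arrays are hypothetical sample data.

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0])
quantities = np.array([1.0, 2.0, 3.0])

# Loop version: visit each pair of elements one at a time.
totals_loop = np.empty_like(prices)
for i in range(len(prices)):
    totals_loop[i] = prices[i] * quantities[i]

# Vectorized version: one expression over the whole arrays.
totals_vectorized = prices * quantities

print(totals_vectorized)  # [10. 40. 90.]
```

Both versions produce the same values. The vectorized one is shorter and runs the loop in compiled code instead of the Python interpreter.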


In [86]:
example_array = np.array([[1., 2., 3.], [4., 5., 6.]])
print(f'Example array {example_array}')

Example array [[1. 2. 3.]
 [4. 5. 6.]]


In [89]:
print(f"Example array double: {example_array * 2}\n")

print(f"Example array square: {example_array * example_array}\n")

print(f"1 divided by example array {1/example_array}\n")

Example array double: [[ 2.  4.  6.]
 [ 8. 10. 12.]]

Example array square: [[ 1.  4.  9.]
 [16. 25. 36.]]

1 divided by example array [[1.         0.5        0.33333333]
 [0.25       0.2        0.16666667]]



Two arrays of the same shape can be compared. The comparison is element-wise and returns a boolean array.


In [90]:
example_array_2 = np.array([[0., 4., 1.], [7., 2., 12.]])

print(f"First example array: {example_array}\n")
print(f"Second example array: {example_array_2}")

First example array: [[1. 2. 3.]
 [4. 5. 6.]]

Second example array: [[ 0.  4.  1.]
 [ 7.  2. 12.]]


In [91]:
example_array > example_array_2

array([[ True, False,  True],
       [False,  True, False]])

In [93]:
example_array < example_array_2

array([[False,  True, False],
       [ True, False,  True]])
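
The boolean array returned by a comparison can be summarized or used to select elements. This sketch rebuilds the two arrays under the hypothetical names `first` and `second` so that it runs on its own.

```python
import numpy as np

first = np.array([[1., 2., 3.], [4., 5., 6.]])
second = np.array([[0., 4., 1.], [7., 2., 12.]])

greater = first > second

print(greater.dtype)   # bool
print(greater.sum())   # 3: True counts as 1
print(greater.any())   # True: at least one element is greater
print(greater.all())   # False: not every element is greater
print(first[greater])  # [1. 3. 5.]: elements of first where the mask is True
```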