<a href="https://colab.research.google.com/github/MSimsDev/machine-learning-prework/blob/main/02-numpy/02.1-Intro-to-Numpy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

![NumPy logo](https://github.com/4GeeksAcademy/machine-learning-prework/blob/main/02-numpy/assets/numpy_logo.png?raw=true)

## Introduction to NumPy

**NumPy** stands for **Numerical Python**. It is an open-source library for performing numerical computations efficiently. It also introduces data structures, such as multidimensional arrays, that can be operated on at a high level without dealing with low-level details.

Specifically, the keys to this library are:

- **Multidimensional arrays**: This library provides an object called `ndarray`, which allows you to store and manipulate large data sets efficiently. Arrays can have any number of dimensions.
- **Vectorized operations**: NumPy lets you perform mathematical operations on whole arrays without writing explicit loops, which makes the code both faster and more concise.
- **Mathematical functions**: NumPy provides a wide range of mathematical functions for working with arrays, including trigonometric functions, statistics, and linear algebra, among others.
- **Efficiency**: It is much faster than the same functionality implemented in pure Python. It is also very flexible in terms of accessing and manipulating individual elements or subsets of arrays.

NumPy is a fundamental library for Machine Learning and data science in Python. It provides a wide range of tools and functions to work efficiently with numerical data in the form of arrays and matrices.
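
As a minimal sketch of what "vectorized" means in practice, the example below doubles five numbers twice: once with a plain Python loop and once with a single NumPy expression. The variable names are illustrative only.

```python
import numpy as np

values = [1, 2, 3, 4, 5]

# Pure Python: an explicit loop (here a list comprehension) visits each element
doubled_list = [v * 2 for v in values]

# NumPy: the operation is written once and applied to the whole array
vec = np.array(values)
doubled_vec = vec * 2

print(doubled_list)  # [2, 4, 6, 8, 10]
print(doubled_vec)   # [ 2  4  6  8 10]
```

Both versions give the same numbers; the NumPy version simply moves the loop out of Python code and into the library.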

### Arrays

A NumPy **array** is a data structure that allows you to store a collection of elements, usually numbers, in one or more dimensions.

#### One-dimensional Array

A one-dimensional (1D) array in NumPy is a data structure that contains a sequence of elements in a single dimension. It is similar to a list in Python, but with the performance and functionality advantages offered by NumPy.

![One dimensional array](https://github.com/4GeeksAcademy/machine-learning-prework/blob/main/02-numpy/assets/1D.png?raw=true "1D")

A 1D array can be created using the `array` function of the library with a list of elements as an argument. For example:

In [1]:
import numpy as np

array = np.array([1, 2, 3, 4, 5])
array

array([1, 2, 3, 4, 5])

This will create a 1D array with elements 1, 2, 3, 4 and 5. All elements of an array share the same data type. If the input values are of different types, NumPy converts them to a common type when possible.
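
The type conversion described above can be observed through the `dtype` attribute. The following is a small sketch with made-up values; the exact integer and string type names can vary between platforms, so they are described rather than quoted.

```python
import numpy as np

ints = np.array([1, 2, 3])           # only integers
mixed = np.array([1, 2.5, 3])        # integers and a float
with_text = np.array([1, "two", 3])  # numbers and a string

print(ints.dtype)       # an integer type (int64 on most platforms)
print(mixed.dtype)      # float64: the integers were converted to floats
print(with_text.dtype)  # a Unicode string type: everything became text
```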

In a 1D array, we can access the elements using **indexes**, modify them and perform mathematical operations on the whole array efficiently. Below are some operations that can be performed using the above array:

In [3]:
# Access the third element
print(array[2])

# Change the value of the second element
array[1] = 7
print(array)

# Add 10 to all elements
array += 10
print(array)

# Calculate the sum of the elements
sum_all = np.sum(array)
print(sum_all)

13
[11  7 13 14 15]
[21 17 23 24 25]
110


#### N-dimensional Array

A multidimensional or n-dimensional array in NumPy is a data structure that organizes elements in multiple dimensions (axes). These arrays allow you to represent more complex data structures, such as matrices (2D array, 2 axes), tensors (3D array, 3 axes) and higher-dimensional structures.

![Arrays of different dimensions](https://github.com/4GeeksAcademy/machine-learning-prework/blob/main/02-numpy/assets/3D.png?raw=true "3D")

An N-dimensional array can also be created using the `array` function of the library. For example, if we want to create a 2D array:

In [5]:
array_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
array_2d

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

To create a 3D array, we can think of it as a list of 2D arrays (matrices):

In [7]:
array_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
array_3d

array([[[1, 2],
        [3, 4]],

       [[5, 6],
        [7, 8]]])

As with 1D arrays, the elements in a multidimensional array are accessible via indexes, operations can be performed on them, and so on.
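
A short sketch of indexing in more than one dimension, reusing the 2D and 3D arrays defined above. Each axis gets its own index, separated by commas, and `:` selects everything along an axis.

```python
import numpy as np

array_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
array_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

print(array_2d[1, 2])     # 6: row 1, column 2 (counting from 0)
print(array_2d[0])        # [1 2 3]: the whole first row
print(array_2d[:, 1])     # [2 5 8]: the whole second column
print(array_3d[1, 0, 1])  # 6: second block, first row, second column

# Arithmetic applies element by element, whatever the number of dimensions
doubled = array_2d * 2
print(doubled)
```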

As we add more dimensions, the basic principle remains the same: each additional dimension can be considered an additional level of nesting. However, on a practical level, working with arrays of more than 3 or 4 dimensions can become more complex and less intuitive.
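
One way to check how many levels of nesting an array has is to inspect its `ndim` (number of axes) and `shape` (length of each axis) attributes. The sketch below applies them to arrays like the ones used in this section.

```python
import numpy as np

array_1d = np.array([1, 2, 3, 4, 5])
array_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
array_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

for name, a in [("1D", array_1d), ("2D", array_2d), ("3D", array_3d)]:
    print(name, "-> ndim:", a.ndim, "shape:", a.shape)

# 1D -> ndim: 1 shape: (5,)
# 2D -> ndim: 2 shape: (3, 3)
# 3D -> ndim: 3 shape: (2, 2, 2)
```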

N-dimensional arrays give NumPy the flexibility to represent and manipulate complex data, which is especially useful in fields such as data science, image processing and deep learning.

### Functions

NumPy provides a large number of predefined functions that can be applied directly to the data structures seen above or to Python's own data structures (lists, tuples, etc.). Some of the most commonly used in data analysis are:

In [8]:
import numpy as np

# Create an array for the example
arr = np.array([1, 2, 3, 4, 5])

# Arithmetic Operations
print("Sum:", np.add(arr, 5))
print("Product:", np.multiply(arr, 3))

# Logarithmic and Exponential
print("Natural logarithm:", np.log(arr))
print("Exponential:", np.exp(arr))

# Statistical Functions
print("Mean:", np.mean(arr))
print("Median:", np.median(arr))
print("Standard Deviation:", np.std(arr))
print("Variance:", np.var(arr))
print("Maximum value:", np.max(arr))
print("Maximum value index:", np.argmax(arr))
print("Minimum value:", np.min(arr))
print("Minimum value index:", np.argmin(arr))
print("Sum of all elements:", np.sum(arr))

# Rounding Functions
arr_decimal = np.array([1.23, 2.47, 3.56, 4.89])
print("Rounding:", np.around(arr_decimal))
print("Rounded down (floor):", np.floor(arr_decimal))
print("Rounded up (ceil):", np.ceil(arr_decimal))

Sum: [ 6  7  8  9 10]
Product: [ 3  6  9 12 15]
Natural logarithm: [0.         0.69314718 1.09861229 1.38629436 1.60943791]
Exponential: [  2.71828183   7.3890561   20.08553692  54.59815003 148.4131591 ]
Mean: 3.0
Median: 3.0
Standard Deviation: 1.4142135623730951
Variance: 2.0
Maximum value: 5
Maximum value index: 4
Minimum value: 1
Minimum value index: 0
Sum of all elements: 15
Rounding: [1. 2. 4. 5.]
Rounded down (floor): [1. 2. 3. 4.]
Rounded up (ceil): [2. 3. 4. 5.]
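
The same functions also accept plain Python lists, and on multidimensional data most of the statistical ones take an `axis` argument that chooses the direction of the calculation. A minimal sketch with an invented nested list called `scores`:

```python
import numpy as np

# A plain nested Python list, not an ndarray
scores = [[1, 2, 3], [4, 5, 6]]

print(np.mean(scores))         # 3.5: mean of all six values
print(np.sum(scores, axis=0))  # [5 7 9]: one sum per column
print(np.sum(scores, axis=1))  # [ 6 15]: one sum per row
print(np.max(scores, axis=1))  # [3 6]: maximum of each row
```

NumPy converts the list to an array internally, so the results are NumPy values even though the input was a list.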


## Exercises: Click on "open in colab" to start practicing

> Solution: https://github.com/4GeeksAcademy/machine-learning-prework/blob/main/02-numpy/02.1-Intro-to-Numpy_solutions.ipynb

### Array creation

#### Exercise 01: Create a **null vector** that contains 10 elements (★☆☆)

A null vector is a one-dimensional array composed of zeros (`0`).

> NOTE: Check the function `np.zeros` (https://numpy.org/doc/stable/reference/generated/numpy.zeros.html)

In [9]:
import numpy as np

null_vector = np.zeros(10)
print(null_vector)


[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]


#### Exercise 02: Create a vector of ones with 10 elements (★☆☆)

> NOTE: Check the function `np.ones` (https://numpy.org/doc/stable/reference/generated/numpy.ones.html)

In [10]:
import numpy as np

ones_vector = np.ones(10)
print(ones_vector)


[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]


#### Exercise 03: Investigate the `linspace` function of NumPy and create an array with 10 elements (★☆☆)

> NOTE: Check the function `np.linspace` (https://numpy.org/doc/stable/reference/generated/numpy.linspace.html)

In [11]:
import numpy as np

array = np.linspace(0, 1, 10)
print(array)


[0.         0.11111111 0.22222222 0.33333333 0.44444444 0.55555556
 0.66666667 0.77777778 0.88888889 1.        ]
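
Two optional parameters of `np.linspace` may be worth exploring here: `retstep=True` also returns the spacing between consecutive values, and `endpoint=False` leaves the stop value out. A brief sketch:

```python
import numpy as np

# Ten points from 0 to 1 inclusive, plus the spacing between them
points, step = np.linspace(0, 1, 10, retstep=True)
print(step)  # 1/9, roughly 0.111

# Ten points starting at 0 with the stop value excluded: 0.0, 0.1, ..., 0.9
open_points = np.linspace(0, 1, 10, endpoint=False)
print(open_points)
```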


#### Exercise 04: Find several ways to generate an array with random numbers and create a 1D array and two 2D arrays (★★☆)

> NOTE: Check the functions `np.random.rand` (https://numpy.org/doc/stable/reference/random/generated/numpy.random.rand.html), `np.random.randint` (https://numpy.org/doc/stable/reference/random/generated/numpy.random.randint.html) and `np.random.randn` (https://numpy.org/doc/stable/reference/random/generated/numpy.random.randn.html)

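In [14]:
import numpy as np

# Option 1: np.random.rand takes each dimension as a separate argument
arr_1d = np.random.rand(5)
arr_2d_1 = np.random.rand(3, 4)
arr_2d_2 = np.random.rand(2, 3)

# Option 2: np.random.random takes the shape as a single value or tuple
arr_1d = np.random.random(5)
arr_2d_1 = np.random.random((3, 4))
arr_2d_2 = np.random.random((2, 3))

# Option 3: the Generator API, created with np.random.default_rng
rng = np.random.default_rng()
arr_1d = rng.random(5)
arr_2d_1 = rng.random((3, 4))
arr_2d_2 = rng.random((2, 3))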
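
The three approaches above all draw uniform floats in the interval [0, 1). As a further sketch, a generator created with a seed produces repeatable results, and its `integers` and `standard_normal` methods cover the integer and normally distributed cases. The seed value 42 is an arbitrary choice for illustration.

```python
import numpy as np

# The seed value 42 is arbitrary; any fixed integer makes the results repeatable
rng = np.random.default_rng(42)

uniform_1d = rng.random(5)                      # floats in [0, 1)
integers_2d = rng.integers(0, 10, size=(3, 4))  # integers from 0 to 9
normal_2d = rng.standard_normal((2, 3))         # mean 0, standard deviation 1

# A second generator with the same seed reproduces the same numbers
rng_again = np.random.default_rng(42)
print(np.array_equal(uniform_1d, rng_again.random(5)))  # True
```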

#### Exercise 05: Create a 5x5 identity matrix (2D array) (★☆☆)


> NOTE: Check the function `np.eye` (https://numpy.org/devdocs/reference/generated/numpy.eye.html)

In [15]:
import numpy as np

identity_matrix = np.eye(5)
print(identity_matrix)


[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]


#### Exercise 06: Create a 3x2 random number matrix and calculate the minimum and maximum value (★☆☆)

> NOTE: Check the functions `np.min` (https://numpy.org/devdocs/reference/generated/numpy.min.html) and `np.max` (https://numpy.org/devdocs/reference/generated/numpy.max.html)

In [16]:
import numpy as np

matrix = np.random.rand(3, 2)

min_value = np.min(matrix)
max_value = np.max(matrix)

print("Random 3x2 matrix:")
print(matrix)
print(f"\nMinimum value: {min_value}")
print(f"Maximum value: {max_value}")


Random 3x2 matrix:
[[0.33130554 0.67676501]
 [0.68096697 0.11414014]
 [0.43083407 0.30832263]]

Minimum value: 0.11414014240958703
Maximum value: 0.6809669739350065


#### Exercise 07: Create a vector of 30 elements that are random numbers and calculate the mean. (★☆☆)

> NOTE: Check the function `np.mean` (https://numpy.org/doc/stable/reference/generated/numpy.mean.html)

In [17]:
import numpy as np

vector = np.random.rand(30)

mean_value = np.mean(vector)

print("Random vector:")
print(vector)
print(f"\nMean value: {mean_value}")


Random vector:
[0.68493179 0.56208822 0.76877087 0.19977465 0.91496974 0.41206978
 0.13037122 0.19226209 0.10458525 0.24522598 0.425296   0.77293676
 0.03171854 0.5405155  0.04990391 0.00224014 0.77943445 0.39726176
 0.08393275 0.73442219 0.15497594 0.5495857  0.5924505  0.11310187
 0.59077437 0.30218424 0.20560467 0.38730843 0.45705082 0.09293782]

Mean value: 0.38262286555103225


#### Exercise 08: Convert the list `[1, 2, 3]` and the tuple `(1, 2, 3)` to arrays (★☆☆)

In [18]:
import numpy as np

list_example = [1, 2, 3]
array_from_list = np.array(list_example)

tuple_example = (1, 2, 3)
array_from_tuple = np.array(tuple_example)

print("Array from list:", array_from_list)
print("Array from tuple:", array_from_tuple)

print("\nType of array from list:", type(array_from_list))
print("Type of array from tuple:", type(array_from_tuple))


Array from list: [1 2 3]
Array from tuple: [1 2 3]

Type of array from list: <class 'numpy.ndarray'>
Type of array from tuple: <class 'numpy.ndarray'>


### Operations between arrays

#### Exercise 09: Invert the vector of the previous exercise (★☆☆)

> NOTE: Check the function `np.flip` (https://numpy.org/doc/stable/reference/generated/numpy.flip.html)

In [19]:
import numpy as np

vector = np.random.rand(30)

inverted_vector = np.flip(vector)

print("Original vector:")
print(vector)
print("\nInverted vector:")
print(inverted_vector)


Original vector:
[0.29738822 0.19863657 0.99450454 0.38368969 0.14009722 0.45493
 0.87811287 0.15973237 0.44716086 0.60462235 0.04925706 0.23996977
 0.70257989 0.54475418 0.83590293 0.9480693  0.21920579 0.62244249
 0.74740097 0.21090847 0.09656899 0.84286549 0.58717899 0.33728259
 0.87531534 0.91025322 0.91639432 0.5792088  0.14258976 0.27883519]

Inverted vector:
[0.27883519 0.14258976 0.5792088  0.91639432 0.91025322 0.87531534
 0.33728259 0.58717899 0.84286549 0.09656899 0.21090847 0.74740097
 0.62244249 0.21920579 0.9480693  0.83590293 0.54475418 0.70257989
 0.23996977 0.04925706 0.60462235 0.44716086 0.15973237 0.87811287
 0.45493    0.14009722 0.38368969 0.99450454 0.19863657 0.29738822]


#### Exercise 10: Change the shape of a random 5x12 array to 12x5 (★☆☆)

> NOTE: Check the function `np.reshape` (https://numpy.org/doc/stable/reference/generated/numpy.reshape.html)

In [21]:
import numpy as np

original_array = np.random.rand(5, 12)

reshaped_array = np.reshape(original_array, (12, 5))

print("Original array shape:", original_array.shape)
print("Reshaped array shape:", reshaped_array.shape)


Original array shape: (5, 12)
Reshaped array shape: (12, 5)
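
Reshaping is not the same as transposing, even when the resulting shapes match. The sketch below uses a small 2x3 array (chosen only so the values are easy to follow) to show the difference: `reshape` keeps the elements in reading order, while `.T` swaps rows and columns.

```python
import numpy as np

small = np.arange(6).reshape(2, 3)
# [[0 1 2]
#  [3 4 5]]

reshaped = small.reshape(3, 2)  # same reading order, new row length
# [[0 1]
#  [2 3]
#  [4 5]]

transposed = small.T            # rows become columns
# [[0 3]
#  [1 4]
#  [2 5]]

print(reshaped.shape, transposed.shape)  # (3, 2) (3, 2)
```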


#### Exercise 11: Convert the list `[1, 2, 0, 0, 0, 4, 0]` into an array and get the indices of the non-zero elements (★★☆)

> NOTE: Check the function `np.where` (https://numpy.org/devdocs/reference/generated/numpy.where.html)

In [22]:
import numpy as np

arr = np.array([1, 2, 0, 0, 0, 4, 0])

non_zero_indices = np.where(arr != 0)[0]

print("Original array:", arr)
print("Indices of non-zero elements:", non_zero_indices)


Original array: [1 2 0 0 0 4 0]
Indices of non-zero elements: [0 1 5]
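
As an alternative sketch for the same task, NumPy also offers `np.nonzero` and `np.flatnonzero`, which look for non-zero values directly instead of going through a condition:

```python
import numpy as np

arr = np.array([1, 2, 0, 0, 0, 4, 0])

# np.nonzero returns a tuple with one index array per dimension
indices_tuple = np.nonzero(arr)
print(indices_tuple[0])  # [0 1 5]

# np.flatnonzero returns the indices of the flattened array directly
flat_indices = np.flatnonzero(arr)
print(flat_indices)      # [0 1 5]
```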


#### Exercise 12: Convert the list `[0, 5, -1, 3, 15]` into an array, multiply its values by `-2` and obtain the even elements (★★☆)

In [23]:
import numpy as np

arr = np.array([0, 5, -1, 3, 15])

arr_multiplied = arr * -2

even_elements = arr_multiplied[arr_multiplied % 2 == 0]

print("Original array:", arr)
print("Array after multiplication by -2:", arr_multiplied)
print("Even elements after multiplication:", even_elements)


Original array: [ 0  5 -1  3 15]
Array after multiplication by -2: [  0 -10   2  -6 -30]
Even elements after multiplication: [  0 -10   2  -6 -30]


#### Exercise 13: Create a random vector of 10 elements and order it from smallest to largest (★★☆)

> NOTE: Check the function `np.sort` (https://numpy.org/doc/stable/reference/generated/numpy.sort.html)

In [24]:
import numpy as np

vector = np.random.random(10)

sorted_vector = np.sort(vector)

print("Original vector:")
print(vector)
print("\nSorted vector:")
print(sorted_vector)


Original vector:
[0.82959031 0.46067039 0.34419453 0.4460009  0.07397971 0.49135102
 0.5443698  0.40869433 0.872543   0.63761168]

Sorted vector:
[0.07397971 0.34419453 0.40869433 0.4460009  0.46067039 0.49135102
 0.5443698  0.63761168 0.82959031 0.872543  ]


#### Exercise 14: Generate two random vectors of 8 elements and apply the operations of addition, subtraction and multiplication between them (★★☆)

> NOTE: Check NumPy's mathematical functions: https://numpy.org/doc/stable/reference/routines.math.html

In [27]:
import numpy as np

vector1 = np.random.rand(8)
vector2 = np.random.rand(8)

addition = np.add(vector1, vector2)
subtraction = np.subtract(vector1, vector2)
multiplication = np.multiply(vector1, vector2)

print("Vector 1:", vector1)
print("Vector 2:", vector2)
print("\nAddition:", addition)
print("Subtraction:", subtraction)
print("Multiplication:", multiplication)


Vector 1: [0.16667586 0.03938923 0.67347996 0.44758926 0.88493356 0.49662429
 0.5049288  0.91726733]
Vector 2: [0.37444367 0.67223371 0.51263063 0.64828402 0.92144681 0.27598445
 0.48133904 0.82989116]

Addition: [0.54111952 0.71162294 1.18611059 1.09587328 1.80638037 0.77260874
 0.98626785 1.74715848]
Subtraction: [-0.20776781 -0.63284449  0.16084934 -0.20069477 -0.03651325  0.22063984
  0.02358976  0.08737617]
Multiplication: [0.06241072 0.02647877 0.34524645 0.29016496 0.81541921 0.13706058
 0.24304195 0.76123204]


#### Exercise 15: Convert the list `[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]` into an array and reshape it into a matrix with 3 columns (★★★)

In [25]:
import numpy as np

original_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

arr = np.array(original_list)

matrix = arr.reshape(-1, 3)

print("Original list:", original_list)
print("\nReshaped matrix:")
print(matrix)


Original list: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

Reshaped matrix:
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
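
The `-1` in `reshape(-1, 3)` asks NumPy to work out that dimension from the total number of elements. The following sketch shows the inference in both directions, and what happens when the elements cannot be divided evenly; the array is rebuilt with `np.arange` only for brevity.

```python
import numpy as np

arr = np.arange(1, 13)  # the values 1 to 12

three_cols = arr.reshape(-1, 3)  # rows inferred: 12 / 3 = 4
four_rows = arr.reshape(4, -1)   # columns inferred: 12 / 4 = 3
print(three_cols.shape)  # (4, 3)
print(four_rows.shape)   # (4, 3)

# 12 elements cannot be split evenly into rows of 5, so NumPy raises an error
try:
    arr.reshape(-1, 5)
    reshape_failed = False
except ValueError as error:
    reshape_failed = True
    print("Cannot reshape:", error)
```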
