# Exercise 01.1 Intro to ML - Numpy Recap

## Pedagogy

This notebook contains both theoretical explanations and executable code cells.

When you see the <span style="color:red">**[TBC]**</span> (To Be Completed) sign, it means that you need to perform an action besides executing the code cells that already exist. These actions can be:
- Complete the code with proper comments
- Respond to a question
- Write an analysis
- etc.

## Part 1. NumPy ultra-quick tutorial

[NumPy](https://numpy.org/doc/stable/index.html) is a Python library for creating and manipulating matrices, the main data structure used by ML algorithms. [Matrices](https://en.wikipedia.org/wiki/Matrix_(mathematics)) are mathematical objects used to store values in rows and columns.

In plain Python, a matrix is usually stored as a (nested) *list*; in NumPy, it is stored as an *array*.
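
The two types behave differently under arithmetic. As a minimal sketch (the variable names are illustrative and not part of the course material), the same `*` operator repeats a Python list but multiplies a NumPy array element by element:

```python
import numpy as np

python_list = [1, 2, 3]
numpy_array = np.array([1, 2, 3])

# multiplying a list by an integer repeats the list
repeated_list = python_list * 2
print(repeated_list)   # [1, 2, 3, 1, 2, 3]

# multiplying an array by an integer multiplies every element
doubled_array = numpy_array * 2
print(doubled_array)   # [2 4 6]
```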

This notebook is not an exhaustive NumPy tutorial. Its purpose is to let you review what you learned about NumPy in the Python Bootcamp course and to set up Python, Anaconda, Jupyter Notebook, and the other tools needed for subsequent courses.

### Import NumPy library

Run the following code cell to import the NumPy library.

In [None]:
# import the NumPy library
import numpy as np

### Populate arrays with specific numbers

Call `np.array` to create a NumPy array with your own hand-picked values. For example, the following call to `np.array` creates an 8-element array.

In [None]:
# create an 8-element array
one_dimensional_array = np.array([1.2, 2.4, 3.5, 4.7, 6.1, 7.2, 8.3, 9.5])
print(one_dimensional_array)

You can also use `np.array` to create a two-dimensional array. To do that, specify an extra layer of square brackets. For example, the following call creates a 3$\times$2 array:

In [None]:
# create a two-dimensional array
two_dimensional_array = np.array([[6, 5], [11, 7], [4, 8]])
print(two_dimensional_array)
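
To confirm the dimensions of an array, you can inspect its `shape` and `ndim` attributes. The following sketch reuses the array from the cell above and also shows `[row, column]` indexing:

```python
import numpy as np

two_dimensional_array = np.array([[6, 5], [11, 7], [4, 8]])

# shape is (rows, columns); ndim is the number of dimensions
print(two_dimensional_array.shape)  # (3, 2)
print(two_dimensional_array.ndim)   # 2

# index with [row, column]; both indices start at 0
print(two_dimensional_array[1, 0])  # 11
```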

To populate an array with all zeros, call `np.zeros`.

In [None]:
# create an array with all zeros
all_zero_array = np.zeros(8)
print(all_zero_array)

To populate an array with all ones, call `np.ones`.

In [None]:
# create an array with all ones
all_one_array = np.ones((3,2))
print(all_one_array)
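
Both functions take the desired shape as their first argument: an integer for a one-dimensional array, or a tuple for more dimensions. As a small sketch (variable names are illustrative), the following also shows that the elements are floating-point by default and that the optional `dtype` argument changes this:

```python
import numpy as np

# a tuple requests a 2 x 3 array; elements default to float64
zeros_2d = np.zeros((2, 3))
print(zeros_2d.shape)  # (2, 3)
print(zeros_2d.dtype)  # float64

# dtype=int requests integer elements instead
ones_int = np.ones(4, dtype=int)
print(ones_int)        # [1 1 1 1]
```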

### Populate arrays with random numbers

NumPy provides various functions to populate arrays with random numbers across certain ranges. For example, `np.random.randint` generates random integers between a low and high value. The following call populates a 6-element array with random integers between 50 and 100.

In [None]:
# create a 6-element array with random integers between 50 and 100
random_integers_between_50_and_100 = np.random.randint(low=50, high=101, size=6)
print(random_integers_between_50_and_100)

Note that `high` is exclusive: the largest integer that can be generated is `high - 1`, which is why the call above passes 101 to make 100 a possible value. More explanations about `np.random.randint` can be found in the official documentation: https://numpy.org/doc/stable/reference/random/generated/numpy.random.randint.html#numpy.random.randint
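
A quick way to convince yourself of these bounds is to draw many samples and look at the extremes. This is only an illustrative check (the sample size of 10,000 is arbitrary):

```python
import numpy as np

samples = np.random.randint(low=50, high=101, size=10_000)

print(samples.min() >= 50)   # True: low is inclusive
print(samples.max() <= 100)  # True: high (101) is never generated
```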

To create random floating-point values between 0.0 and 1.0, call `np.random.random`. For example:

In [3]:
# create an array with random floating-point values between 0.0 and 1.0
random_floats_between_0_and_1 = np.random.random([6])
print(random_floats_between_0_and_1)

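
The values returned by `np.random.random` lie in the half-open interval from 0.0 (inclusive) to 1.0 (exclusive). As a small sketch with illustrative variable names, the following checks those bounds and shows that a tuple produces a multi-dimensional array:

```python
import numpy as np

random_floats = np.random.random(1000)
print(random_floats.min() >= 0.0)  # True
print(random_floats.max() < 1.0)   # True

# pass a tuple to get more than one dimension
random_matrix = np.random.random((2, 3))
print(random_matrix.shape)         # (2, 3)
```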

### Mathematical operations on NumPy operands

If you want to add or subtract two arrays, linear algebra requires that the two operands have the same dimensions. Furthermore, if you want to multiply two arrays, linear algebra imposes strict rules on the dimensional compatibility of operands. Fortunately, NumPy uses a trick called [**broadcasting**](https://developers.google.com/machine-learning/glossary/#broadcasting) to virtually expand the smaller operand to dimensions compatible for linear algebra. For example, the following operation uses broadcasting to add 2.0 to the value of every item in the array created in the previous code cell:

In [None]:
# add 2.0 to every item in the array
random_floats_between_2_and_3 = 2.0 + random_floats_between_0_and_1
print(random_floats_between_2_and_3)

The following operation also relies on broadcasting to multiply each item in an array by 3:

In [None]:
# multiply each item in the array by 3
random_integers_between_150_and_300 = 3 * random_integers_between_50_and_100
print(random_integers_between_150_and_300)
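
Broadcasting is not limited to scalars. As a minimal sketch (the array names are illustrative), NumPy can also stretch a 2-element row across every row of a 3$\times$2 array, because their trailing dimensions match:

```python
import numpy as np

matrix = np.array([[6, 5], [11, 7], [4, 8]])  # shape (3, 2)
row = np.array([10, 100])                      # shape (2,)

# row is virtually repeated for each of the 3 rows of matrix
result = matrix + row
print(result)
# [[ 16 105]
#  [ 21 107]
#  [ 14 108]]
```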

## Part 2. Hands-on exercises

Please complete the following two exercises.

### Task 1. Create a linear dataset

<span style="color:red">**[TBC]**</span>: Your goal is to create a simple dataset consisting of a single feature and a label as follows:
1. Assign a sequence of integers from 6 to 20 (inclusive) to a NumPy array named `feature`
2. Assign 15 values to a NumPy array named `label` such that: label = 3 $\times$ feature + 4
3. Print the created `feature` and `label`

For example, the first value for `label` should be:

3 $\times$ 6 + 4 = 22

In [11]:
# [TBC] complete your code here with proper comments
import numpy as np
# feature: the integers 6, 7, ..., 20 (np.arange excludes the stop value, hence 21)
feature = np.arange(6, 21)
# label: apply label = 3 * feature + 4 to every element of feature
label = 3 * feature + 4
print(feature)
print(label)

[ 6  7  8  9 10 11 12 13 14 15 16 17 18 19 20]
[22 25 28 31 34 37 40 43 46 49 52 55 58 61 64]
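
One way to sanity-check a solution to this task is to assert the properties stated in the task description. The sketch below assumes `np.arange` is used to build the sequence (the task does not prescribe a particular function); recall that its `stop` argument is exclusive.

```python
import numpy as np

feature = np.arange(6, 21)   # 6, 7, ..., 20 (stop=21 is excluded)
label = 3 * feature + 4

assert len(feature) == 15
assert feature[0] == 6 and feature[-1] == 20
assert label[0] == 22        # 3 * 6 + 4
assert label[-1] == 64       # 3 * 20 + 4
print("all checks passed")
```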


### Task 2. Add some noise to the dataset

<span style="color:red">**[TBC]**</span>: To make your dataset a little more realistic, insert a little random noise into each element of the `label` array you already created. To be more precise, modify each value assigned to `label` by adding a different random floating-point value between -2 and +2.

In [15]:
# [TBC] complete your code here with proper comments
import numpy as np
# rebuild the Task 1 dataset so that this cell runs on its own
feature = np.arange(6, 21)
label = 3 * feature + 4
# draw one random float in [-2, 2) per label and add it to that label
noise = np.random.uniform(-2, 2, size=len(label))
label_with_noise = label + noise
print("Feature:", feature)
print("Original label:", label)
print("Label with noise:", label_with_noise)
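
Because the noise is random, the printed values differ on every run, but the size of each perturbation can still be checked. A minimal sketch, assuming `np.random.uniform` as the noise source and rebuilding `feature` and `label` so that the code runs on its own:

```python
import numpy as np

feature = np.arange(6, 21)
label = 3 * feature + 4

# one independent draw per label, each in [-2, 2)
noise = np.random.uniform(-2, 2, size=label.shape)
label_with_noise = label + noise

# every noisy label stays within 2 of its original value
print(np.all(np.abs(label_with_noise - label) <= 2))  # True
```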
