# Quick Start Guide to NumPy

If you're getting started with machine learning (ML), understanding NumPy is essential. NumPy is a Python library for creating and manipulating arrays, the fundamental data structure in many ML algorithms. A two-dimensional array corresponds to a [matrix](https://en.wikipedia.org/wiki/Matrix_(mathematics)), which stores values in rows and columns; arrays with more dimensions are often called tensors.

Plain Python can represent a matrix as nested lists, using the built-in [list data type](https://docs.python.org/3/library/stdtypes.html#lists). NumPy calls its equivalent data structure an *array*, while PyTorch and TensorFlow use the term *tensor*.
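
The practical difference between a nested list and a NumPy array shows up as soon as you apply arithmetic. The following is a minimal sketch; the variable names are illustrative and do not appear elsewhere in this tutorial:

```python
import numpy as np

# A plain Python list of lists holding a 2x2 matrix (illustrative values)
nested_list = [[1, 2], [3, 4]]

# The same values stored in a NumPy array
array = np.array(nested_list)

# Multiplying a list by 2 repeats its rows; no arithmetic happens
doubled_list = nested_list * 2

# Multiplying an array by 2 doubles every element
doubled_array = array * 2

print(doubled_list)
print(doubled_array)
```

The list ends up with four rows, while the array keeps its 2x2 shape and holds the doubled values.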

This tutorial is a quick start guide to NumPy that covers the basics you need before moving on to machine learning and deep learning. It is not exhaustive; for more in-depth information, refer to the official [NumPy documentation](https://numpy.org/doc/stable/index.html).

## Importing the NumPy Module

Execute the code cell below to import the NumPy module:

In [1]:
# Importing the NumPy module with an alias 'np'
# This allows us to refer to NumPy functions and objects using the
# shorter name 'np' instead of the full 'numpy'
import numpy as np

## Creating NumPy Arrays

Use the `np.array` function to create a NumPy array from values you choose. For example, the following call to `np.array` creates an array with eight elements:

In [2]:
# Creating a one-dimensional NumPy array using the np.array function
# The array is initialized with specific float values enclosed in square brackets
one_dimensional_array = np.array([1.2, 2.4, 3.5, 4.7, 6.1, 7.2, 8.3, 9.5])

# Printing the contents of the created NumPy array
print(one_dimensional_array)


[1.2 2.4 3.5 4.7 6.1 7.2 8.3 9.5]


## Creating Two-Dimensional Arrays with `np.array`

You can also use `np.array` to create a two-dimensional array by adding an extra layer of square brackets. For example, the following call creates a 3x2 array (three rows, two columns):

In [3]:
# Creating a two-dimensional NumPy array using the np.array function
# The array is initialized with nested lists representing rows and columns
two_dimensional_array = np.array([[6, 5], [11, 7], [4, 8]])

# Printing the contents of the created two-dimensional NumPy array
print(two_dimensional_array)

[[ 6  5]
 [11  7]
 [ 4  8]]
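
To confirm the dimensions of an array and read individual values, you can inspect its `shape` and `ndim` attributes and index it with `[row, column]`. A short sketch that rebuilds the same 3x2 array so the block runs on its own:

```python
import numpy as np

# Same values as the two-dimensional array created above
two_dimensional_array = np.array([[6, 5], [11, 7], [4, 8]])

# shape is (rows, columns); ndim is the number of dimensions
print(two_dimensional_array.shape)   # (3, 2)
print(two_dimensional_array.ndim)    # 2

# Indexing is zero-based: row 1, column 0
print(two_dimensional_array[1, 0])   # 11

# A colon selects every row, so this is the whole second column
print(two_dimensional_array[:, 1])   # [5 7 8]
```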


## Populating Arrays with Zeros and Ones

To create an array filled with zeros, call `np.zeros`. To create an array filled with ones, call `np.ones`. Both take the shape of the array as an argument and return floating-point values by default.

In [4]:
# Defining the size of the array using a tuple (3 rows, 2 columns)
size = (3, 2)

# Creating a NumPy array filled with zeros using the np.zeros function
zeros_array = np.zeros(size)

# Printing the zeros array along with a formatted message
print(f"Zeros Array: \n{zeros_array}")

Zeros Array: 
[[0. 0.]
 [0. 0.]
 [0. 0.]]


In [5]:
# Creating a NumPy array filled with ones using the np.ones function
# The size of the array is determined by the previously defined
# tuple (3 rows, 2 columns)
ones_array = np.ones(size)

# Printing the ones array along with a formatted message
print(f"Ones Array: \n{ones_array}")

Ones Array: 
[[1. 1.]
 [1. 1.]
 [1. 1.]]
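
If you need integers rather than the default floats, or a fill value other than 0 or 1, the `dtype` argument and the `np.full` function cover both cases. A brief sketch; the names `integer_zeros` and `sevens` are illustrative:

```python
import numpy as np

size = (3, 2)

# dtype=int asks for integer zeros instead of the default floats
integer_zeros = np.zeros(size, dtype=int)

# np.full fills an array of the given shape with any constant value
sevens = np.full(size, 7.0)

print(integer_zeros)
print(sevens)
```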


## Creating Arrays with Sequential Numbers

You can also populate an array with a sequence of numbers by calling `np.arange`:

In [6]:
# Creating a NumPy array with a sequence of integers using the np.arange function
# The sequence starts from 5 (inclusive) and ends at 12 (exclusive)
sequence_of_integers = np.arange(5, 12)

# Printing the contents of the created array
print(sequence_of_integers)

[ 5  6  7  8  9 10 11]


## Note on `np.arange`

Note that `np.arange` generates a sequence that includes the lower bound (5) but excludes the upper bound (12).
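
`np.arange` also accepts an optional step as its third argument. When you want a fixed number of evenly spaced values with the upper bound included, `np.linspace` is a common alternative. A small sketch with illustrative names:

```python
import numpy as np

# Start at 5, stop before 12, advance by 2 each time
every_other_integer = np.arange(5, 12, 2)
print(every_other_integer)   # [ 5  7  9 11]

# Five evenly spaced values from 0.0 to 1.0, both endpoints included
evenly_spaced = np.linspace(0.0, 1.0, 5)
print(evenly_spaced)         # [0.   0.25 0.5  0.75 1.  ]
```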

## Generating Arrays with Random Numbers

NumPy provides several functions for filling arrays with random numbers in a given range. For example, `np.random.randint` generates random integers from a low value (inclusive) to a high value (exclusive). The following call creates a 6-element array of random integers from 50 to 100.




In [7]:
# Generating a NumPy array with random integers using np.random.randint
# The 'low' parameter specifies the lower bound (50), 'high' specifies the
# upper bound (101, exclusive)
# The 'size' parameter determines the number of elements in the array (6)
random_integers_between_50_and_100 = np.random.randint(low=50, high=101, size=6)

# Printing the contents of the array with random integers
print(random_integers_between_50_and_100)

[68 69 78 83 91 65]


## Important Note on `np.random.randint`

The largest integer that `np.random.randint` can generate is one less than the `high` argument, which is why the call above passes `high=101` to allow a value of 100.
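
Random values change on every run, which can make results hard to reproduce. One option, sketched here as an alternative rather than as the approach this tutorial uses, is NumPy's `Generator` interface created with `np.random.default_rng` and a fixed seed. The seed value 42 and the variable names are arbitrary choices for illustration:

```python
import numpy as np

# A generator seeded with a fixed value yields the same numbers on every run
rng = np.random.default_rng(seed=42)

# As with np.random.randint, 'high' is exclusive, so 101 allows 100
random_integers = rng.integers(low=50, high=101, size=6)
print(random_integers)

# A second generator with the same seed reproduces the same array
repeat = np.random.default_rng(seed=42).integers(low=50, high=101, size=6)
print(np.array_equal(random_integers, repeat))   # True
```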

## Generating Random Floating-Point Values

To generate random floating-point values from 0.0 (inclusive) to 1.0 (exclusive), use the `np.random.random` function. For example:

In [8]:
# Generating a NumPy array with random floating-point values using np.random.random
# The argument [6] specifies the size of the array, in this case, a 1-dimensional
# array with 6 elements
random_floats_between_0_and_1 = np.random.random([6])

# Printing the contents of the array with random floating-point values
print(random_floats_between_0_and_1)

[0.90177485 0.63351733 0.93698489 0.40922062 0.29264066 0.58904973]


## Mathematical Operations with NumPy Arrays

In linear algebra, adding or subtracting two arrays requires operands with matching dimensions, and multiplication has its own strict rules about compatible dimensions. NumPy relaxes these requirements with a technique called [**broadcasting**](https://developers.google.com/machine-learning/glossary/#broadcasting), which virtually expands the smaller operand to dimensions compatible with the larger one. For example, the following operation uses broadcasting to add 2.0 to every item in the array created in the preceding code cell:

In [9]:
# Performing a mathematical operation on a NumPy array using broadcasting
# Adding 2.0 to every element in the 'random_floats_between_0_and_1' array
random_floats_between_2_and_3 = random_floats_between_0_and_1 + 2.0

# Printing the contents of the resulting array after the addition operation
print(random_floats_between_2_and_3)

[2.90177485 2.63351733 2.93698489 2.40922062 2.29264066 2.58904973]


## Broadcasting in Array Multiplication

The following operation also uses broadcasting, this time to multiply each item in the `random_integers_between_50_and_100` array by 3:

In [10]:
# Performing a mathematical operation on a NumPy array using broadcasting
# Multiplying every element in the 'random_integers_between_50_and_100' array by 3
random_integers_between_150_and_300 = random_integers_between_50_and_100 * 3

# Printing the contents of the resulting array after the multiplication operation
print(random_integers_between_150_and_300)

[204 207 234 249 273 195]
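
Broadcasting is not limited to scalars. As a sketch of the general idea, a one-dimensional array can be combined with a two-dimensional array when its length matches the number of columns; the names `matrix` and `row` are illustrative:

```python
import numpy as np

matrix = np.array([[6, 5], [11, 7], [4, 8]])   # shape (3, 2)
row = np.array([10, 100])                       # shape (2,)

# 'row' is virtually repeated for each of the three rows of 'matrix'
result = matrix + row
print(result)
# [[ 16 105]
#  [ 21 107]
#  [ 14 108]]

# Shapes that cannot be aligned raise an error instead of guessing
try:
    matrix + np.array([1, 2, 3])   # length 3 does not match 2 columns
    shapes_compatible = True
except ValueError:
    shapes_compatible = False
print(shapes_compatible)   # False
```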


## Example 1: Generating a Linear Dataset

This example builds a simple dataset consisting of a single feature and a corresponding label, in two steps:

1. Assign a sequence of integers from 6 to 20 (inclusive) to a NumPy array named `feature`.
2. Assign 15 values to a NumPy array named `label`, computed from the equation:

   ```
   label = (3)(feature) + 4
   ```

   For example, the first value of `label` should be:

   ```
   label = (3)(6) + 4 = 22
   ```

In [11]:
# Creating a NumPy array 'feature' with a sequence of integers from
# 6 to 20 (inclusive) using np.arange
feature = np.arange(6, 21)

# Printing the contents of the 'feature' array
print(feature)

# Creating a NumPy array 'label' by applying the linear equation (3 * feature) + 4
label = (feature * 3) + 4

# Printing the contents of the 'label' array
print(label)

[ 6  7  8  9 10 11 12 13 14 15 16 17 18 19 20]
[22 25 28 31 34 37 40 43 46 49 52 55 58 61 64]


## Example 2: Introducing Noise to the Dataset

Make the dataset a little more realistic by adding random noise to each element of the existing `label` array. Specifically, modify each value in `label` by adding a different random floating-point value from -2 (inclusive) to +2 (exclusive).

Do not rely on broadcasting for the addition itself; instead, create a `noise` array with the same dimensions as `label` so that the two arrays are added element by element.

In [12]:
# Generating a NumPy array 'noise' with random floating-point values
# between -2 and +2
# The size of the 'noise' array is specified as [15], matching the dimension
# of the 'label' array
noise = (np.random.random([15]) * 4) - 2

# Printing the contents of the 'noise' array
print(noise)

# Adding the generated 'noise' array to the 'label' array, introducing variation
# to each element
label = label + noise

# Printing the contents of the 'label' array after incorporating noise
print(label)

[ 1.72585046 -0.38295165 -0.7520211   0.49260779 -1.81020396  1.94445271
 -0.25260659  0.5594518  -0.85073671  0.14252357  1.49077953  0.75943542
 -1.29923172  1.94562788 -0.13365134]
[23.72585046 24.61704835 27.2479789  31.49260779 32.18979604 38.94445271
 39.74739341 43.5594518  45.14926329 49.14252357 53.49077953 55.75943542
 56.70076828 62.94562788 63.86634866]
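
One way to check that the noisy labels still follow the original line is to fit a straight line to them. The following is a sketch under stated assumptions, not part of the original exercise: it uses a seeded generator and `rng.uniform` to draw the noise, and `np.polyfit` to estimate the slope and intercept. The seed and the names `noisy_label`, `slope`, and `intercept` are illustrative.

```python
import numpy as np

# Seeded generator so the sketch gives the same numbers on every run
rng = np.random.default_rng(seed=0)

# Rebuild the dataset from Example 1
feature = np.arange(6, 21)
label = (feature * 3) + 4

# One noise value per label, drawn uniformly from [-2, 2)
noise = rng.uniform(low=-2.0, high=2.0, size=label.shape)
noisy_label = label + noise

# Fit a degree-1 polynomial; coefficients come back highest degree first
slope, intercept = np.polyfit(feature, noisy_label, deg=1)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

The estimated slope should land close to 3. The intercept can drift further from 4 because the feature values sit well away from zero.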
