<h1 align = 'center' style = "font-size:60px; font-family:verdana ; font-weight : normal; background-color: #f6f5f5 ; color : #fe346e; text-align: center; border-radius: 100px 100px; padding: 10px"> ML.AI Week 1 Data Structures in Python Tutorial</h1>

<div class="alert alert-block alert-info" style="font-size:14px; font-family:verdana; line-height: 1.7em; background-color:#5642C5; color:white;">
📌 &nbsp; About ML.AI<br>
 MachineLearning.AI is a 7-week course that takes you from the basics of Python to detailed knowledge of Data Science and Machine Learning, as well as an introduction to Neural Networks and Deep Learning. 
<br><br>
The course offers a comprehensive blend of high-quality resources sourced from the internet, including engaging videos, informative blogs, weekly assignments, and culminating in an exciting hackathon to assess your knowledge and skills. By the end of 7 weeks, you will have a solid understanding of the key concepts and techniques of Machine Learning as well as an appreciation of their potential impact on our society.
<br><br>
    
📌 &nbsp; Structure of the Course<br><br>
Week 1:  Introduction To Python & Python Libraries<br>
Week 2: Introduction to ML and Basic ML Algorithms<br>
Week 3: Advanced ML Algorithms<br>
Week 4: Unsupervised Learning Hackathon<br>
Week 5: Ensemble Learning and Dealing with Real World Data<br>
Week 6: Introduction to Neural Networks and Deep Learning<br>
Week 7: Final Hackathon<br>
<br>
📌 &nbsp; What Next?<br>Advanced ML.AI, a guiding resource for Deep Learning Specialization
</div>

## What and Why NumPy?

NumPy is a Python library used for scientific computing and data analysis. It provides support for large, multi-dimensional arrays and matrices, as well as a large collection of mathematical functions to operate on these arrays. NumPy is often used for tasks such as linear algebra, Fourier transform, and random number generation.

There are several reasons **why NumPy** is a popular choice for scientific computing:

* **Fast**: NumPy's core is written in C and optimized for performance, so array operations are typically much faster than equivalent loops over Python's built-in data structures.
* **Efficient memory usage**: NumPy arrays store elements of a single type in a contiguous block of memory, which avoids the per-object overhead of Python lists and usually uses less memory.
* **Broadcasting**: NumPy allows you to perform operations on arrays of different but compatible shapes by automatically broadcasting the smaller array to match the larger one.
* **Interoperability**: NumPy can easily interface with other languages, such as C and Fortran.
* **Large collection of mathematical functions**: NumPy provides a large collection of mathematical functions for operations such as trigonometry, logarithms, and exponentials.

Overall, NumPy is a powerful and efficient library that is widely used in scientific computing and data analysis.

For in-depth tutorials and study material on NumPy, visit our course page at: [Insert Link]

## From lists to NumPy arrays

Python lists are very general.
* They can contain any kind of object (or a mix of objects).
* They are dynamically typed.
* They do not support mathematical operations such as matrix multiplication and dot products. Implementing such operations for Python lists would not be very efficient because of the dynamic typing.

NumPy arrays are statically typed and homogeneous: the type of the elements is determined when the array is created and does not change afterwards. Because of this static typing, mathematical operations such as multiplication and addition of NumPy arrays can be implemented efficiently in a compiled language (C and Fortran are used). To get an array of a different type, cast explicitly with the `astype` method (see also the similar `asarray` function). By default, `astype` creates a new array of the new type.
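As a small illustration of the paragraph above, the sketch below shows that `astype` produces a separate array with a new dtype, while `asarray` hands back the same object when no conversion is needed. The variable names are ours and not part of the course material.

```python
import numpy as np

# The dtype is fixed when the array is created.
ints = np.array([1, 2, 3], dtype=np.int64)

# astype returns a new array with the requested dtype.
floats = ints.astype(np.float64)
floats[1] = 2.5  # modifying the copy does not touch the original

print(ints, ints.dtype)
print(floats, floats.dtype)

# The two arrays do not share memory.
print(np.shares_memory(ints, floats))

# asarray returns the input itself when it is already an array of a suitable type.
print(np.asarray(ints) is ints)
```

The original array keeps both its values and its dtype, which is what "statically typed" means in practice here.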

Let's begin interacting with the NumPy package.
In the cell below we import the numpy package as np, then print the version of the package we just imported.

In [None]:
import numpy as np
# np.__version__ holds the version string (np.version is a module, not the version itself)
print(np.__version__)

<h1 id = 'prac' style = "font-size:30px; font-family:verdana ; font-weight : normal; background-color: #f6f5f5 ; color : #fe346e; text-align: center; border-radius: 100px 100px; padding: 10px">Numpy Practice Exercises
</h1>

<h1>Basics</h1>

## 1. Create an array of the integers from 10 to 50

In [None]:
arr_10to50 = np.arange(10,51)
print(arr_10to50)

## 2. Create an array of the even integers from 10 to 50

In [None]:
arr_10to50_even = np.arange(10,51,2)
print(arr_10to50_even)

## 3. Create a 3x3 matrix with values ranging from 1 to 9

In [None]:
np.arange(1,10).reshape(3,3)

## 4. Use NumPy to generate a random number between 0 and 1

In [None]:
np.random.rand(1)

## 5. Use NumPy to generate 25 random numbers sampled from a standard normal distribution

In [None]:
np.random.randn(25)

## 6. Create a matrix of 100 items equally spaced in steps of 0.01, ranging from 0.01 to 1

In [None]:
np.arange(1,101).reshape(10,10)/100

In [None]:
# Returns the same values as above (up to floating-point rounding)
np.linspace(0.01,1, 100).reshape(10,10)

## 7. Create an array of 20 linearly spaced points between 0 and 1

In [None]:
np.linspace(0,1, 20)
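Exercises 6 and 7 use both `np.arange` and `np.linspace`, so a short side-by-side comparison may help. The sketch below uses small sizes chosen by us for readability: `linspace` takes a number of points and includes the endpoint by default, while `arange` takes a step and excludes the stop value.

```python
import numpy as np

# linspace: 5 points from 0 to 1, endpoint included
a = np.linspace(0, 1, 5)

# arange: step of 0.25 starting at 0, stop value 1 excluded
b = np.arange(0, 1, 0.25)

print(a)  # last value is 1.0
print(b)  # last value is 0.75
```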

# Indexing and Slicing

In [None]:
## Use this 2-dimensional array for the following exercises
arr_2_d = np.arange(1,26).reshape(5,5)
print(arr_2_d)

## 8. Grab element 20 of the arr_2_d array

In [None]:
# Element 20 sits in the 4th row and 5th column, i.e. index [3, 4]
arr_2_d[3,4]

## 9. Grab items 2, 7, 12 of the arr_2_d array

In [None]:
arr_2_d[:3,1:2]

# That is to say:
#
# for rows: take everything from the beginning
#           up to (but not including) the fourth row,
#           which gives the first 3 rows;
#
# for columns: take everything from the second column
#              up to (but not including) the third column,
#              which gives just one column, the second one.
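One detail of the slice in exercise 9 is worth a quick sketch: slicing the column axis with `1:2` keeps that axis, whereas indexing it with the integer `1` drops it. The names `col_2d` and `col_1d` are ours, used only for illustration.

```python
import numpy as np

arr_2_d = np.arange(1, 26).reshape(5, 5)

col_2d = arr_2_d[:3, 1:2]  # slice on the column axis keeps it: shape (3, 1)
col_1d = arr_2_d[:3, 1]    # integer index drops the axis: shape (3,)

print(col_2d.shape, col_1d.shape)
print(col_1d)  # [ 2  7 12]
```

Both forms select the same three values; which one to use depends on whether a column matrix or a flat array is wanted downstream.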

## 10. Grab the last row of the arr_2_d array

In [None]:
arr_2_d[-1]

# That is to say:
#
# a negative index counts from the end,
# so -1 selects the last row.
#
# Equivalent: arr_2_d[4] or arr_2_d[4, :]

## 11. Get the sum of all the values, column values and row values in arr_2_d array

In [None]:
# Pass the array to NumPy's sum function
np.sum(arr_2_d)

# OR
# Call the array's own sum method directly
# arr_2_d.sum()

In [None]:
# axis=0 sums down the rows, giving one total per column
print(np.sum(arr_2_d, axis=0))

# axis=1 sums across the columns, giving one total per row
print(np.sum(arr_2_d, axis=1))
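The `axis` argument is easy to mix up, so here is a small self-contained sketch on the same 5x5 array. It is only an illustration: the `keepdims` example goes slightly beyond the exercise and is our addition.

```python
import numpy as np

arr_2_d = np.arange(1, 26).reshape(5, 5)

col_sums = arr_2_d.sum(axis=0)  # collapse the rows: one sum per column
row_sums = arr_2_d.sum(axis=1)  # collapse the columns: one sum per row

print(col_sums)  # [55 60 65 70 75]
print(row_sums)  # [ 15  40  65  90 115]

# keepdims=True keeps the collapsed axis with length 1,
# which is convenient for later broadcasting against the original array.
row_sums_2d = arr_2_d.sum(axis=1, keepdims=True)
print(row_sums_2d.shape)  # (5, 1)
```

A useful rule of thumb: the axis you name is the one that disappears from the result's shape.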

## 12. Get the standard deviation of the values in arr_2_d array

In [None]:
np.std(arr_2_d)

# OR
# arr_2_d.std()

## 13. Declare an 8x8 matrix and fill it with a checkerboard (or chess) pattern.

In [None]:
data = np.zeros((8,8), dtype=int)

for i in range(8):
    for j in range(8):
        if (i % 2) == 0:
            if (j % 2) ==0:
                data[i,j] = 1
        else:
            if (j % 2):
                data[i,j] = 1
print(data)
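The nested loops above work, but the same pattern can also be built without any Python loop. The following is a sketch of two loop-free alternatives; the names `board` and `board_alt` are ours.

```python
import numpy as np

# Alternative 1: strided slicing
board = np.zeros((8, 8), dtype=int)
board[::2, ::2] = 1    # even rows, even columns
board[1::2, 1::2] = 1  # odd rows, odd columns
print(board)

# Alternative 2: a cell is 1 where (row index + column index) is even
i, j = np.indices((8, 8))
board_alt = ((i + j) % 2 == 0).astype(int)

print(np.array_equal(board, board_alt))  # True
```

Both versions produce the same matrix as the loop, with a 1 in the top-left corner.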

Declare a 10x10 array with random values and find the minimum and maximum values.

In [None]:
data = np.random.randint(1, 5, size=(10, 10))
print(data)
print()
print('Min:', np.min(data))
print('Max:', np.max(data))

Create a random 10x2 matrix. Considering each pair represents an (x,y) co-ordinate in 2-D, obtain a matrix which contains (r,theta) in polar co-ordinate system.

In [None]:
import numpy as np

# Create a random 10x2 matrix
xy = np.random.rand(10, 2)

# Convert to polar coordinates
r = np.hypot(xy[:, 0], xy[:, 1])
theta = np.arctan2(xy[:, 1], xy[:, 0])

# Combine r and theta into a 10x2 matrix
polar = np.column_stack((r, theta))

print("Original Matrix (x, y):\n", xy)
print("\nMatrix in Polar Coordinates (r, theta):\n", polar)
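One way to sanity-check a conversion like the one above is to convert back to Cartesian coordinates with x = r cos(theta) and y = r sin(theta) and compare with the input. The sketch below does this. The seeded generator is an assumption we add for reproducibility and is not used in the original cell.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator (our assumption, for reproducibility)
xy = rng.random((10, 2))

# Cartesian -> polar
r = np.hypot(xy[:, 0], xy[:, 1])
theta = np.arctan2(xy[:, 1], xy[:, 0])

# Polar -> Cartesian
xy_back = np.column_stack((r * np.cos(theta), r * np.sin(theta)))

print(np.allclose(xy, xy_back))  # True
```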

Multiply a 5x3 matrix by a 3x2 matrix (real matrix product)

In [None]:
x = np.random.rand(5,3)
y = np.random.rand(3,2)
z = x@y
print(z)

NumPy uses a powerful idea called broadcasting, and the best part is that it is simple to use. For example, we will execute the following in the cell below.
```
x = np.arange(4)
print(x)
y = x + 3
print(y)
```
Did you notice how 3 got added to every element of x? This is referred to as broadcasting. The concept extends to adding a vector to every row (or column) of a matrix.

In [None]:
x = np.arange(4)
print(x)
y = x + 3
print(y)
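To make the matrix case concrete, here is a minimal sketch of adding a vector to every row, and another to every column, of a small matrix. The arrays `m`, `row` and `col` are illustrative names of ours.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)  # shape (2, 3): [[0, 1, 2], [3, 4, 5]]
row = np.array([10, 20, 30])    # shape (3,)
col = np.array([[100], [200]])  # shape (2, 1)

by_row = m + row  # row is added to every row of m
by_col = m + col  # col is added to every column of m

print(by_row)
print(by_col)
```

The shapes are compared from the last axis backwards: `(3,)` lines up with the columns of `m`, while `(2, 1)` lines up with its rows and is stretched across the columns.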

Can we demonstrate a computational time benefit of broadcasting? For this we will need to measure time. No worries, let's welcome the timeit package.

In [None]:
# importing the required module
import timeit

# setup snippet, executed only once before timing starts
test_setup = '''
import numpy as np

def example_function():
    return np.arange(1000)
'''

# code snippet to be timed, executed num_runs times
test_code_1 = "example_function()"
num_runs = 1000
time_elapsed = timeit.timeit(setup=test_setup, stmt=test_code_1, number=num_runs)
print('Average time per run (s): ' + str(time_elapsed/num_runs))

Loops are a common bottleneck in code, and this holds especially for Python. Compared to languages like C/C++, Python loops are relatively slow. One reason for this is the dynamically typed nature of Python.

The Python interpreter first compiles the source code into bytecode, which is then executed to run the program. Suppose the code contains a section that loops over a list. Because Python is dynamically typed, the interpreter has no idea in advance what type of objects the list holds (integers, strings, floats, and so on). This information is stored in each object itself, so Python cannot know it before actually going through the list. Hence, at every iteration Python has to perform a number of checks, such as determining the type of the variable, resolving its scope, and checking for invalid operations.

In C, by contrast, an array consists of a single data type that the compiler knows ahead of time. This opens up many optimizations that are not possible in Python. For this reason, loops in Python are often much slower than in C, and nested loops are where things can really get slow.

In the cell below, define two random vectors x and y, each of size (1000x1). Do an elementwise multiplication of these vectors to get a new vector z. Try two approaches for this: (a) writing a function which uses a loop, (b) writing a function which uses the NumPy "star" operator. Print the time for approaches (a) and (b). 

In [None]:
import numpy as np
import time

# Define the vectors x and y
x = np.random.rand(1000, 1)
y = np.random.rand(1000, 1)

# Method 1: Elementwise multiplication using a loop
def elementwise_mult_loop(x, y):
    z = np.zeros((1000, 1))
    for i in range(1000):
        z[i] = x[i] * y[i]
    return z

start_time = time.time()
z1 = elementwise_mult_loop(x, y)
end_time = time.time()
print("Method 1 (loop):", end_time - start_time, "seconds")

# Method 2: Elementwise multiplication using the "star" operator
def elementwise_mult_star(x, y):
    z = x * y
    return z

start_time = time.time()
z2 = elementwise_mult_star(x, y)
end_time = time.time()
print("Method 2 (star):", end_time - start_time, "seconds")

# Verify that the two methods produce the same result
if np.allclose(z1, z2):
    print('Results match')
else:
    print("Results don't match!")
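A single `time.time()` measurement can be noisy for code this fast, so averaging over many runs with `timeit` may give a steadier picture. The sketch below repeats the comparison that way. The function names, the 1-D vector shape and `num_runs = 200` are our choices, not part of the exercise.

```python
import timeit
import numpy as np

x = np.random.rand(1000)
y = np.random.rand(1000)

def mult_loop(x, y):
    # element-by-element multiplication in a Python loop
    z = np.empty_like(x)
    for i in range(len(x)):
        z[i] = x[i] * y[i]
    return z

def mult_vectorized(x, y):
    # the same operation done by NumPy in compiled code
    return x * y

num_runs = 200
t_loop = timeit.timeit(lambda: mult_loop(x, y), number=num_runs) / num_runs
t_vec = timeit.timeit(lambda: mult_vectorized(x, y), number=num_runs) / num_runs

print(f"loop:       {t_loop:.2e} s per call")
print(f"vectorized: {t_vec:.2e} s per call")
```

The exact numbers depend on the machine, so no expected output is given here; only the ratio between the two is of interest.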

Create a 2-D grid spanning (-10,10) on the x-axis and (-10,10) on the y-axis, with a step size of 0.01. Generate a 2-D Gaussian function on this grid, with mean = (0,0) and the identity matrix as covariance.

Plot the 2-D Gaussian using plot3D() function.

In [None]:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

# Define the range and step size of the grid
x = np.arange(-10, 10, 0.01)
y = np.arange(-10, 10, 0.01)

# Define the mean and covariance of the 2D Gaussian
mean = [0, 0]
cov = [[1, 0], [0, 1]]

# Create a 2D Gaussian function
def gaussian(x, y):
    '''
    implements the 2-D Gaussian function
    returns the values
    '''

    data = np.zeros((len(x), len(y)), dtype=float)

    mu = np.array(mean)
    sigma = np.array(cov)
    det = np.linalg.det(sigma)
    inv = np.linalg.inv(sigma)

    for i in range(len(x)):
        for j in range(len(y)):
            z = np.array([x[i], y[j]])
            data[i,j] = np.exp(-0.5 * (z - mu) @ inv @ (z-mu).T)

    return data / (2.0 * np.pi * np.sqrt(det))

# Evaluate the 2D Gaussian on the grid
Z = gaussian(x, y)

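# create the 2-D grid for plotting; indexing='ij' makes X[i, j] = x[i] and
# Y[i, j] = y[j], matching how Z[i, j] was filled above
X, Y = np.meshgrid(x, y, indexing='ij')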

# Create a 3D plot of the 2D Gaussian
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(X, Y, Z)
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
plt.show()
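With a step of 0.01 the grid has 2000 x 2000 points, so the nested Python loops above can take a long time to run. In the spirit of the vectorization discussion earlier, here is a loop-free sketch of the same density. It assumes mean (0, 0) and identity covariance as in the exercise, and uses a coarser step of 0.1 (our choice, to keep the example light).

```python
import numpy as np

# Coarser step than the exercise (0.1 instead of 0.01), an assumption to keep this quick.
step = 0.1
x = np.arange(-10, 10, step)
y = np.arange(-10, 10, step)
X, Y = np.meshgrid(x, y)

# With mean (0, 0) and identity covariance the 2-D Gaussian density reduces to
# exp(-(x^2 + y^2) / 2) / (2 * pi), which NumPy evaluates on the whole grid at once.
Z = np.exp(-0.5 * (X**2 + Y**2)) / (2.0 * np.pi)

print(Z.shape)
print(Z.max())                # close to 1 / (2 * pi)
print(Z.sum() * step * step)  # close to 1, as a density should integrate to 1
```

The resulting `X`, `Y` and `Z` can be passed to `ax.plot_surface(X, Y, Z)` exactly as in the cell above. For a general mean and covariance the exponent would need the full quadratic form, which this sketch does not cover.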
