Welcome to this NumPy tutorial. NumPy stands for Numerical Python.
It is a Python library for fast numerical computing on arrays and is widely used to perform mathematical operations on datasets.
Understanding NumPy is important for anyone who wishes to master data science, machine learning, or deep learning.
This tutorial offers a hands-on introduction to NumPy using the Python programming language.

In [1]:
import numpy as np
import sys
import matplotlib.pyplot as plt

Arrays: 

In [2]:
a = np.array([1, 2, 4, 5])
print(a)
print(type(a))
print(a.ndim)

[1 2 4 5]
<class 'numpy.ndarray'>
1

2 Dimensional Array:

In [3]:
b = np.array([[1, 2, 4, 5], [5, 6, 7, 8]])
print(b)
print(type(b))
print(b.ndim)

[[1 2 4 5]
 [5 6 7 8]]
<class 'numpy.ndarray'>
2
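Besides `ndim`, a few other attributes describe an array's layout. The following is a minimal sketch that reuses the array `b` from the cell above; the integer `dtype` that is printed depends on the platform, so no output is shown for it.

```python
import numpy as np

b = np.array([[1, 2, 4, 5], [5, 6, 7, 8]])

print(b.shape)  # (rows, columns): (2, 4)
print(b.size)   # total number of elements: 8
print(b.dtype)  # element type (a platform-dependent integer type)
```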

Working with Arrays:
Accessing elements in an array.
Note: array indexing starts at 0, and negative indices count from the end of the array.

In [4]:
array2 = np.array([1,2,3,4,5,6,7])
#accessing the first element
element1 = array2[0]
element1

1

In [None]:
#accessing the last element
last_element = array2[-1]
last_element

7

In [None]:
#For multidimensional arrays, we access elements using their row and column indices
array3 = np.array([[1,2,3,4,5,6,7],[8,9,10,11,12,13,14], [15, 16, 17, 18, 19, 20, 21]])
first_element_3 = array3[0,0] #row 0 column 0
first_element_3

1

In [None]:
element_n = array3[2, 6] #row 2 column 6 (the last element)
element_n

21
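Indexing also extends to slices, which select whole rows, columns, or sub-blocks at once. The sketch below reuses `array3` from above; the variable names `first_row`, `last_column` and `sub_block` are illustrative only.

```python
import numpy as np

array3 = np.array([[1, 2, 3, 4, 5, 6, 7],
                   [8, 9, 10, 11, 12, 13, 14],
                   [15, 16, 17, 18, 19, 20, 21]])

first_row = array3[0, :]       # every column of row 0
last_column = array3[:, -1]    # the last column of every row
sub_block = array3[0:2, 1:4]   # rows 0 and 1, columns 1 to 3 (the end index is excluded)

print(first_row)
print(last_column)
print(sub_block)
```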

Creating arrays filled with given values:

In [None]:
#an array of zeros
zero_array = np.zeros((2, 4)) #2 by 4 array of zeros
zero_array

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [None]:
#an array of ones
ones_array = np.ones((3, 4)) # 3 by 4 array of ones
ones_array

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [None]:
#a 4 by 4 array filled with the value 101
array_101 = np.full((4, 4), 101)
array_101

array([[101, 101, 101, 101],
       [101, 101, 101, 101],
       [101, 101, 101, 101],
       [101, 101, 101, 101]])

In [None]:
#a 3 by 3 array of random samples drawn from the standard normal distribution
random_array = np.random.randn(3,3)
random_array

array([[ 0.64521941,  0.01791989, -0.21122096],
       [-0.18975091,  1.29371152, -0.00447193],
       [-1.23492149,  1.02520185,  1.41260885]])
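The values above change on every run. If reproducible random numbers are needed, one option is NumPy's `Generator` interface with a fixed seed. This is a minimal sketch; the seed value 42 is an arbitrary choice and not part of the original example.

```python
import numpy as np

# A seeded generator produces the same sequence on every run
rng = np.random.default_rng(seed=42)        # the seed value is an arbitrary assumption
random_array = rng.standard_normal((3, 3))  # 3 by 3 standard normal samples

# A second generator with the same seed reproduces the same array
rng_again = np.random.default_rng(seed=42)
same_array = rng_again.standard_normal((3, 3))

print(np.array_equal(random_array, same_array))  # True
```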

Arrays are the fundamental data structure used to solve problems in data science and machine learning.
In most cases, features are converted to arrays before an algorithm is applied to them.
First, we present a simple example of how arrays can be used to solve simultaneous equations, to show their applicability in linear algebra.

Consider the system of equations:
$x+y+z = 6$
$2y+5z = -4$
$2x+5y-z=27$
The coefficients can be written as a matrix (an array):
$A = \begin{pmatrix}
  1 & 1 & 1 \\
  0 & 2 & 5 \\
  2 & 5 & -1
 \end{pmatrix}$
and the right-hand side values are given by the vector:
$C = \begin{pmatrix}
      6\\
      -4\\
      27
      \end{pmatrix}$

The right-hand sides of the three equations form the vector [6, -4, 27].
To solve for the unknowns, we multiply the inverse of the coefficient matrix by this vector.
From basic linear algebra, given $AB = C$, where A is the coefficient matrix, B is the vector of unknowns $(x, y, z)$ and C is the vector of right-hand side values, we can find the unknowns as $B = A^{-1}C$, provided that A is invertible.
Using NumPy, this can be achieved as follows:

In [None]:
A = np.array([[1, 1, 1], [0, 2, 5],[2, 5, -1]])
C = np.array([6, -4, 27])
#step one: find the inverse of the matrix A using NumPy
A_inv = np.linalg.inv(A)
#step two: solve for the unknowns by multiplying the inverse by C
unknowns = A_inv.dot(C)
unknowns

array([ 5.,  3., -2.])
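Forming the inverse explicitly works for this small system. As an alternative sketch, `np.linalg.solve` solves the system directly without computing the inverse, which is generally the preferred approach for numerical stability. The check with `np.allclose` is an added illustration and not part of the original example.

```python
import numpy as np

A = np.array([[1, 1, 1], [0, 2, 5], [2, 5, -1]])
C = np.array([6, -4, 27])

# Solve A @ unknowns = C directly, without forming the inverse of A
unknowns = np.linalg.solve(A, C)

# Verify the result by substituting it back into the system
check = np.allclose(A @ unknowns, C)

print(unknowns)
print(check)  # True
```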

The example above gives a high-level overview of some of NumPy's built-in functionality.
The linear algebra routines, such as the matrix inverse, are just one example of what the package contains.
We explore more examples below.


In [None]:
#Statistics
random_numbers = [101, 40, 54050, 320404, 5, 4, 5, 405032, 5677]
np.mean(random_numbers), np.std(random_numbers), np.max(random_numbers), np.min(random_numbers)

(87257.55555555556, 149494.83748879915, 405032, 4)
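Note that `np.std` computes the population standard deviation by default (`ddof=0`). The sketch below shows how the sample standard deviation could be obtained with `ddof=1`, using the same list of numbers; the variable names are illustrative.

```python
import numpy as np

random_numbers = [101, 40, 54050, 320404, 5, 4, 5, 405032, 5677]

population_std = np.std(random_numbers)      # divides by n (the default, ddof=0)
sample_std = np.std(random_numbers, ddof=1)  # divides by n - 1

print(population_std, sample_std)
```

The sample value is slightly larger because it divides the sum of squared deviations by n - 1 instead of n.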

In [None]:
#mathematical operations
var1 = 50; var2 = 30
total = np.add(var1, var2) #named total to avoid shadowing Python's built-in sum
diff = np.subtract(var1, var2)
prod = np.multiply(var1, var2)
total, diff, prod

(80, 20, 1500)
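The same functions also operate element by element on whole arrays, which is where NumPy is most useful. A minimal sketch, with the small arrays `x` and `y` made up for illustration:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 20, 30])

total = np.add(x, y)      # element-wise sum, equivalent to x + y
prod = np.multiply(x, y)  # element-wise product, equivalent to x * y
scaled = x * 2            # the scalar is applied to every element

print(total, prod, scaled)
```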

# Practical Example: Using NumPy to Perform Linear Regression




> Overview of Linear Regression.


 Linear regression is a statistical approach used to model the relationship between a dependent variable and one or more independent variables. The approach assumes that a linear relationship exists between the given variables. The relationship can be summarized using the formula:
$y = wX+b$
where y is the dependent variable, X is the independent variable, w is the weight (or vector of weights) of the model and b is the bias of the model.
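To make the formula concrete, the sketch below evaluates $y = wX+b$ for three samples with a single feature. The weight and bias values are hypothetical and chosen only for illustration.

```python
import numpy as np

X = np.array([[1.0], [2.0], [3.0]])  # 3 samples, 1 feature
w = np.array([2.0])                  # hypothetical weight
b = 0.5                              # hypothetical bias

# One prediction per sample: 2.0 * x + 0.5
y = np.dot(X, w) + b

print(y)  # [2.5 4.5 6.5]
```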


Linear regression is one of many machine learning algorithms used for regression problems.
We will now implement a linear regression model using only NumPy and custom-built Python functions.

In [None]:
class LinearRegression():
  def __init__(self, lr=0.001, n_iterations = 1000):
    self.lr = lr
    self.n_iters = n_iterations
    self.weights = None
    self.bias = None

  def fit(self, X, y):
    n_samples, n_features = X.shape
    self.weights = np.zeros(n_features)
    self.bias = 0

    for _ in range(self.n_iters):
      y_hat = np.dot(X, self.weights) + self.bias
      #gradients of the mean squared error with respect to the parameters
      #(the constant factor of 2 is absorbed into the learning rate)
      dw = (1/n_samples) * np.dot(X.T, (y_hat-y)) #gradient with respect to the weights
      db = (1/n_samples) * np.sum(y_hat-y) #gradient with respect to the bias

      self.weights -= self.lr*dw
      self.bias -= self.lr*db

  def predict(self, X):
    y_hat = np.dot(X, self.weights) + self.bias
    return y_hat
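
The quantities `dw` and `db` in `fit` can be read as gradients of the mean squared error. The following is a sketch of that derivation, assuming the loss is the mean squared error over $n$ samples and writing $\alpha$ for the learning rate `lr` (the symbols $L$, $n$ and $\alpha$ are notation introduced here):

$L(w, b) = \frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)^2, \quad \hat{y}_i = wX_i + b$

$\frac{\partial L}{\partial w} = \frac{2}{n}\sum_{i=1}^{n}X_i(\hat{y}_i - y_i), \quad \frac{\partial L}{\partial b} = \frac{2}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)$

The code drops the constant factor of 2, which only rescales the learning rate, and applies the updates:

$w \leftarrow w - \alpha \cdot dw, \quad b \leftarrow b - \alpha \cdot db$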





In [None]:
#mean squared error between the true and the predicted values
def mse(y_true, y_predicted):
  return np.mean((y_true-y_predicted)**2)

In [None]:
from sklearn.model_selection import train_test_split
from sklearn import datasets
#generate a synthetic regression dataset with one feature and Gaussian noise
X, y = datasets.make_regression(n_samples=1000, n_features=1, noise=20, random_state=42)
#hold out 30% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [None]:
model = LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
error = mse(y_test, predictions)
error

445.160788488899
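One way to sanity-check a gradient descent fit is to compare it against the closed-form least-squares solution, which NumPy provides through `np.linalg.lstsq`. The sketch below uses a small synthetic dataset generated with NumPy instead of the scikit-learn data above; the true weight and bias (`true_w`, `true_b`), the noise level and the seed are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility
X = rng.standard_normal((200, 1))
true_w, true_b = 3.0, -1.0       # hypothetical parameters used to generate the data
y = true_w * X[:, 0] + true_b + 0.1 * rng.standard_normal(200)

# Append a column of ones so the bias is estimated together with the weight
X_design = np.column_stack([X, np.ones(len(X))])

# Least-squares solution of X_design @ coef = y
coef, residuals, rank, singular_values = np.linalg.lstsq(X_design, y, rcond=None)
w_hat, b_hat = coef

print(w_hat, b_hat)
```

If a gradient descent model trained on the same data ends up with noticeably different parameters, increasing the learning rate or the number of iterations may help it converge.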