Ra00f1/Machine-Learning-and-Deep-Learning-notes

Metrics

Accuracy

Out of all the predictions we made, how many were correct?

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Precision

Out of all the positive predictions we made, how many were true?

Use when:

  • False positives are costly.
  • Example: Spam detection. You don’t want to classify important emails as spam.

Precision = TP / (TP + FP)

Recall

Out of all the data points that should be predicted as true, how many did we correctly predict as true?

Use when:

  • False negatives are costly.
  • Example: Disease diagnosis. You want to catch as many sick patients as possible.

Recall = TP / (TP + FN)

F1 Score

F1 Score combines recall and precision into a single number. As we have seen, there is a trade-off between precision and recall; F1 can therefore be used to measure how effectively our models make that trade-off.

F1 = 2 * (Precision * Recall) / (Precision + Recall)

Confusion matrix

  • True Positive: You predicted positive, and it’s true.
  • True Negative: You predicted negative, and it’s true.
  • False Positive (Type 1 Error): You predicted positive, and it’s false.
  • False Negative (Type 2 Error): You predicted negative, and it’s false.
  • Accuracy: the proportion of all predictions that were correct.
  • Positive Predictive Value or Precision: the proportion of predicted positive cases that were actually positive.
  • Negative Predictive Value: the proportion of predicted negative cases that were actually negative.
  • Sensitivity or Recall: the proportion of actual positive cases that were correctly identified.
  • Specificity: the proportion of actual negative cases that were correctly identified.
  • Rates: the same quantities expressed as rates; there are 4 of them: TPR, FPR, TNR, and FNR.
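All of these quantities can be computed directly from label arrays; a minimal numpy sketch with made-up labels (1 = positive, 0 = negative):

```python
import numpy as np

# Hypothetical ground-truth labels and model predictions
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))   # true positives
tn = np.sum((y_pred == 0) & (y_true == 0))   # true negatives
fp = np.sum((y_pred == 1) & (y_true == 0))   # Type 1 errors
fn = np.sum((y_pred == 0) & (y_true == 1))   # Type 2 errors

accuracy    = (tp + tn) / (tp + tn + fp + fn)
precision   = tp / (tp + fp)                 # positive predictive value
recall      = tp / (tp + fn)                 # sensitivity / true positive rate
specificity = tn / (tn + fp)                 # true negative rate
f1 = 2 * precision * recall / (precision + recall)
```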

Logarithmic Loss

Log loss penalizes confident incorrect classifications, and it extends naturally to multi-class classification.

LogLoss = -(1/N) * Σ [ y_i * log(p_i) + (1 - y_i) * log(1 - p_i) ]
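A quick way to see the penalty on confident mistakes; the probabilities below are made up:

```python
import numpy as np

def log_loss(y_true, y_prob, eps=1e-15):
    """Binary cross-entropy; confident wrong predictions are penalized heavily."""
    y_prob = np.clip(y_prob, eps, 1 - eps)   # avoid log(0)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

# A confident wrong prediction costs far more than an unsure one
confident_wrong = log_loss(np.array([1.0]), np.array([0.01]))
unsure          = log_loss(np.array([1.0]), np.array([0.5]))
```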

Area Under Curve (AUC)

It is one of the most widely used metrics for binary classification. The AUC of a classifier is the probability that it will rank a randomly chosen positive example higher than a randomly chosen negative example.
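That ranking definition can be estimated directly by comparing every positive/negative pair of scores; a small sketch with made-up scores:

```python
import numpy as np

def auc_by_ranking(y_true, scores):
    """AUC as the probability that a randomly chosen positive example
    gets a higher score than a randomly chosen negative one (ties count 0.5)."""
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    greater = np.sum(pos[:, None] > neg[None, :])
    ties = np.sum(pos[:, None] == neg[None, :])
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

y = np.array([0, 0, 1, 1])
s = np.array([0.1, 0.4, 0.35, 0.8])
auc = auc_by_ranking(y, s)   # 3 of the 4 positive/negative pairs are ranked correctly
```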

Mean Absolute Error (MAE)

Mean Absolute Error (MAE) is the average absolute difference between predicted and actual values; it tells us how far, on average, our predictions are from the actual output. However, there is one limitation: it gives no idea about the direction of the error, i.e. whether we are under-predicting or over-predicting.

MAE = (1/N) * Σ |y_i - ŷ_i|

Mean Squared Error (MSE)

The average of the squared differences between predicted and actual values; squaring penalizes large errors more heavily than small ones.

MSE = (1/N) * Σ (y_i - ŷ_i)²

Root Mean Squared Error (RMSE)

The square root of MSE, which brings the error back into the same units as the target.

RMSE = √MSE
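The three regression metrics side by side, computed on made-up values:

```python
import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5,  0.0, 2.0, 8.0])

errors = y_pred - y_true
mae  = np.mean(np.abs(errors))   # average magnitude; direction is lost
mse  = np.mean(errors ** 2)      # squaring punishes large errors more
rmse = np.sqrt(mse)              # back in the same units as the target
```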

Machine Learning Summary

Unsupervised Learning

Clustering

K-Means Clustering


Cost functions:

MSE

J = (1/m) * Σ ||x_i - μ_(c_i)||²  (the mean squared distance from each point to its assigned centroid)

WCSS

WCSS = Σ_k Σ_{x ∈ C_k} ||x - μ_k||²  (within-cluster sum of squares)

Cross-Entropy Loss

L = -Σ y_i * log(ŷ_i)

Inertia

Calculated at the end to see how compact the clusters are; it can be used with the "Elbow Method" to determine the ideal number for k. Inertia is the sum of squared distances from each point to its assigned centroid (the same quantity as WCSS).

Initializing K-Means

One of the best ways is to run many trials in which the centroids are chosen at random, train each one, and keep the run with the lowest cost.
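A sketch of that idea, assuming Euclidean K-Means with a fixed iteration count (all names here are illustrative):

```python
import numpy as np

def kmeans_once(X, k, n_iters=100, rng=None):
    """One K-Means run from a random initialization; returns centroids and cost (WCSS)."""
    if rng is None:
        rng = np.random.default_rng()
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # assign each point to its nearest centroid
        dists = np.linalg.norm(X[:, None] - centroids[None, :], axis=2)
        idx = np.argmin(dists, axis=1)
        # move each centroid to the mean of its assigned points
        for j in range(k):
            pts = X[idx == j]
            if len(pts):
                centroids[j] = pts.mean(axis=0)
    cost = np.sum(np.min(np.linalg.norm(X[:, None] - centroids[None, :], axis=2) ** 2, axis=1))
    return centroids, cost

def kmeans_restarts(X, k, n_restarts=100):
    """Run many random initializations and keep the one with the lowest cost."""
    runs = [kmeans_once(X, k) for _ in range(n_restarts)]
    return min(runs, key=lambda r: r[1])
```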

Choosing the best

While the "Elbow Method" can be used, there is another method as well, which uses the silhouette of the model to choose the ideal number for k.

For each data point i:

  • Compute Cohesion (a_i) → the average distance between i and all other points in the same cluster.

  • Compute Separation (b_i) → the average distance between i and the points of the nearest cluster (not including its own cluster).

  • Calculate the Silhouette Score for point i:

    S_i = (b_i - a_i) / max(a_i, b_i)

S_i ranges from -1 to 1:

  • Close to 1 → well-clustered (correct assignment).
  • Close to 0 → overlapping clusters.
  • Negative → misclassified point.

Overall Formula:

S = (1/N) * Σ S_i  (the mean silhouette score over all N points)

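The per-point computation above can be sketched as follows (the two tiny clusters are made-up data):

```python
import numpy as np

def silhouette_scores(X, idx):
    """Per-point silhouette: S_i = (b_i - a_i) / max(a_i, b_i)."""
    labels = np.unique(idx)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        same = X[idx == idx[i]]
        # cohesion a_i: mean distance to the other points in the same cluster
        a_i = np.linalg.norm(same - X[i], axis=1).sum() / max(len(same) - 1, 1)
        # separation b_i: mean distance to the nearest other cluster
        b_i = min(
            np.mean(np.linalg.norm(X[idx == c] - X[i], axis=1))
            for c in labels if c != idx[i]
        )
        scores[i] = (b_i - a_i) / max(a_i, b_i)
    return scores

X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
idx = np.array([0, 0, 1, 1])
overall = silhouette_scores(X, idx).mean()   # near 1: well-separated clusters
```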


Convolutional Neural Networks Summary

Notes

f: Filter Size, s: Stride, p: Padding, n: Input Size, b: bias

  • Conv Output Size: ((n + 2p - f) / s) + 1 (rounded down)
  • Total Parameters of a conv layer: (f * f * channels_in + b) * number_of_filters, with b = 1 bias per filter
  • Output after Max Pooling: ((n - f) / s) + 1
  • Padding for a "same" convolution (stride 1): p = (f - 1) / 2
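The size formulas are easy to sanity-check in code; the numbers below are the standard AlexNet first-layer example (227x227 input, 11x11 filters, stride 4):

```python
def conv_output_size(n, f, p, s):
    """Spatial output size of a convolution: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1

def pool_output_size(n, f, s):
    """Spatial output size of max pooling: floor((n - f) / s) + 1."""
    return (n - f) // s + 1

assert conv_output_size(n=227, f=11, p=0, s=4) == 55   # AlexNet's first conv layer
assert pool_output_size(n=55, f=3, s=2) == 27          # AlexNet's first max pool
assert conv_output_size(n=5, f=3, p=1, s=1) == 5       # "same" padding: p = (f - 1) / 2
```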

Classic Networks

AlexNet

VGG-16

  • Simpler and more uniform compared to other networks
  • Conv Layers: all use the same filter size (3x3), stride of 1, and "same" padding
  • Max-Pool: 2x2 with stride 2

ResNet

ResNets are like plain networks, but they add a skip connection every few layers (normally 2 or 3); each group of layers bridged by a skip connection is called a residual block.

One of the experts I talked to explained it like this: while plain networks try to find which data the input resembles, ResNets try to transform the input into data they have worked with before.

Residual Block Formula: a^[l+2] = g(z^[l+2] + a^[l])
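A minimal numpy sketch of that forward pass (fully-connected rather than convolutional, with made-up weights):

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def residual_block(x, W1, b1, W2, b2):
    """Forward pass of a 2-layer residual block: a = g(F(x) + x).
    The skip connection adds the input x back before the final activation."""
    z1 = W1 @ x + b1
    a1 = relu(z1)
    z2 = W2 @ a1 + b2      # F(x): what the two layers compute
    return relu(z2 + x)    # shortcut: add x, then activate

# With zero weights, F(x) = 0 and the block passes relu(x) through unchanged,
# which is why identity mappings are easy for residual networks to learn.
x = np.array([1.0, -2.0, 3.0])
W = np.zeros((3, 3))
b = np.zeros(3)
out = residual_block(x, W, b, W, b)   # relu(x) -> [1, 0, 3]
```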

Coding Notes

This part is mostly for translating math formulas into code when practicing, and for future use.

np.linalg.norm(X[i] - centroids[j])

  • the Euclidean distance between point X[i] and centroid j

points = X[idx == k]

  • get all the points in X that have the idx equal to k

centroids[k] = np.mean(points, axis = 0)

  • move centroid k to the mean of its assigned points
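Putting the snippets above together gives one full K-Means iteration; a minimal sketch with made-up points:

```python
import numpy as np

def kmeans_step(X, centroids):
    """One K-Means iteration: assign each point to its nearest centroid,
    then recompute each centroid as the mean of its assigned points."""
    k = len(centroids)
    idx = np.array([
        np.argmin([np.linalg.norm(X[i] - centroids[j]) for j in range(k)])
        for i in range(len(X))
    ])
    new_centroids = centroids.copy()
    for j in range(k):
        points = X[idx == j]              # all points assigned to centroid j
        if len(points):
            new_centroids[j] = np.mean(points, axis=0)
    return new_centroids, idx

X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 10.0]])
centroids = np.array([[0.0, 1.0], [10.0, 10.0]])
new_centroids, idx = kmeans_step(X, centroids)
```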

Notes

Loss function: Error for a single training example.

Cost function: Average of the loss functions of the entire training set.
