Out of all the predictions we made, how many were correct?

Out of all the positive predictions we made, how many were actually positive?
Use when:
- False positives are costly.
- Example: Spam detection. You don't want to classify important emails as spam.
Out of all the data points that are actually positive, how many did we correctly predict as positive?
Use when:
- False negatives are costly.
- Example: Disease diagnosis. You want to catch as many sick patients as possible.
F1 Score is a measure that combines recall and precision. As we have seen, there is a trade-off between precision and recall; F1 can therefore be used to measure how effectively our models make that trade-off.
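A minimal sketch of these three metrics, assuming binary labels stored in hypothetical NumPy arrays `y_true` and `y_pred`:

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1])   # actual labels
y_pred = np.array([1, 0, 0, 1, 1, 1])   # model predictions

tp = np.sum((y_pred == 1) & (y_true == 1))   # true positives
fp = np.sum((y_pred == 1) & (y_true == 0))   # false positives
fn = np.sum((y_pred == 0) & (y_true == 1))   # false negatives

precision = tp / (tp + fp)   # of all positive predictions, how many were right
recall = tp / (tp + fn)      # of all actual positives, how many did we catch
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean of the two
print(precision, recall, f1)
```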
- True Positive: You predicted positive, and it's true.
- True Negative: You predicted negative, and it's true.
- False Positive (Type 1 Error): You predicted positive, and it's false.
- False Negative (Type 2 Error): You predicted negative, and it's false.
- Accuracy: the proportion of the total number of predictions that were correct.
- Positive Predictive Value or Precision: the proportion of positive cases that were correctly identified.
- Negative Predictive Value: the proportion of negative cases that were correctly identified.
- Sensitivity or Recall: the proportion of actual positive cases which are correctly identified.
- Specificity: the proportion of actual negative cases which are correctly identified.
- Rate: a measure derived from the confusion matrix. There are 4 types: TPR, FPR, TNR, and FNR.
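A small sketch of the confusion-matrix cells and the four rates, again with hypothetical `y_true` / `y_pred` arrays of binary labels:

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 1, 1])

tp = np.sum((y_pred == 1) & (y_true == 1))
tn = np.sum((y_pred == 0) & (y_true == 0))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))

accuracy = (tp + tn) / (tp + tn + fp + fn)
tpr = tp / (tp + fn)   # sensitivity / recall
tnr = tn / (tn + fp)   # specificity
fpr = fp / (fp + tn)   # 1 - specificity
fnr = fn / (fn + tp)   # miss rate
```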

Log loss penalizes false classifications (both false positives and false negatives), and confident wrong predictions are penalized most heavily. It usually works well with multi-class classification.
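A sketch of binary log loss, assuming hypothetical arrays `y_true` (0/1 labels) and `y_prob` (predicted probabilities); the clipping just keeps `log()` away from exactly 0 or 1:

```python
import numpy as np

def log_loss(y_true, y_prob, eps=1e-15):
    # clip probabilities so log() never receives 0 or 1
    y_prob = np.clip(y_prob, eps, 1 - eps)
    # a confident wrong prediction produces a very large penalty
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

print(log_loss(np.array([1, 0, 1]), np.array([0.9, 0.1, 0.2])))
```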

It is one of the widely used metrics and is mainly used for binary classification. The AUC of a classifier is defined as the probability that the classifier will rank a randomly chosen positive example higher than a randomly chosen negative example.
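That pairwise-ranking definition can be checked directly with a brute-force sketch (the `scores` array below is a hypothetical stand-in for the classifier's outputs; ties count as half, a common convention):

```python
import numpy as np

def auc_by_ranking(y_true, scores):
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    # probability that a random positive is ranked above a random negative
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

print(auc_by_ranking(np.array([1, 1, 0, 0]), np.array([0.9, 0.4, 0.35, 0.8])))  # 0.75
```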
Mean Absolute Error (MAE) is the average distance between predicted and original values. Basically, it tells us how far our predictions are from the actual output. However, there is one limitation: it doesn't give any idea about the direction of the error, i.e. whether we are under-predicting or over-predicting our data.
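A one-line sketch, again with hypothetical `y_true` / `y_pred` arrays:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.5])
y_pred = np.array([2.5, 5.0, 4.0])

mae = np.mean(np.abs(y_pred - y_true))   # average absolute distance; the sign of each error is lost
print(mae)
```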

Calculated at the end to see how compact the groups are, and can be used with the "Elbow Method" to determine the ideal number for k.
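A sketch of that cost (the average squared distance of each point to its assigned centroid), using the same `X`, `idx`, and `centroids` names as the code section further down:

```python
import numpy as np

def compute_cost(X, idx, centroids):
    # mean squared distance between every point and the centroid it was assigned to
    return np.mean(np.sum((X - centroids[idx]) ** 2, axis=1))
```

Running this for several values of k and plotting cost against k gives the curve whose "elbow" suggests a good k.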

One of the best ways is to run the training many times (often hundreds of runs), each time with randomly chosen initial centroids, and keep the run with the lowest cost.
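A sketch of that idea using scikit-learn (assuming it is available), which does the repeated random initialisation for you via `n_init` and keeps the run with the lowest cost (`inertia_`):

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.random.rand(200, 2)          # hypothetical data

# run k-means 100 times with random initial centroids, keep the best run
km = KMeans(n_clusters=3, init="random", n_init=100).fit(X)
print(km.inertia_)                  # cost of the best (lowest-cost) run
```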
While the "Elbow Method" can be used, there is another method as well, which uses the silhouette of the model to choose the ideal number for k.
For each data point i:
- Compute Cohesion (a_i): the average distance between i and all other points in the same cluster.
- Compute Separation (b_i): the average distance between i and the points of the nearest cluster (not including its own cluster).
- Calculate the Silhouette Score for point i: s_i = (b_i - a_i) / max(a_i, b_i)

s_i ranges from -1 to 1:
- Close to 1 → Well-clustered (correct assignment).
- Close to 0 → Overlapping clusters.
- Negative → Misclassified point.
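A brute-force sketch of these three steps for a single point i, using the `X` and `idx` names from the code section below (scikit-learn's `silhouette_score` computes the same thing for the whole data set):

```python
import numpy as np

def silhouette_point(X, idx, i):
    own = idx[i]
    # cohesion a_i: average distance to the other points in the same cluster
    same = (idx == own)
    same[i] = False
    a_i = np.mean(np.linalg.norm(X[same] - X[i], axis=1)) if same.any() else 0.0
    # separation b_i: smallest average distance to the points of any other cluster
    b_i = min(
        np.mean(np.linalg.norm(X[idx == k] - X[i], axis=1))
        for k in np.unique(idx) if k != own
    )
    return (b_i - a_i) / max(a_i, b_i)
```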
Overall Formula:
f: Filter Size, s: Stride, p: Padding, n: Input Size, b: bias
- Filter Output Size: ((n + 2p - f) / s ) + 1
- Total Parameters: ((f * f * n_c) + b) per filter, where n_c is the number of input channels and b is the bias (1 per filter); multiply by the number of filters for the whole layer.
- Output after Max Pooling: ((n - f) / s) + 1
- Padding for "same" convolution (stride 1): (f - 1) / 2
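Those formulas translated into a small helper (a sketch; the function names are mine, and integer division stands in for the floor):

```python
def conv_output_size(n, f, p=0, s=1):
    # ((n + 2p - f) / s) + 1, floored
    return (n + 2 * p - f) // s + 1

def pool_output_size(n, f, s):
    # ((n - f) / s) + 1, floored
    return (n - f) // s + 1

def same_padding(f):
    # padding needed so a stride-1 convolution keeps the input size
    return (f - 1) // 2

print(conv_output_size(n=32, f=3, p=1, s=1))   # 32 ("same" convolution)
print(pool_output_size(n=32, f=2, s=2))        # 16
```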
- Simpler and more uniform in structure compared to other networks
- Conv Layers: all use the same filter size (3x3), stride of 1, and "same" padding
- Max-Pool: 2x2 and stride 2
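A sketch of one such block in Keras (assuming TensorFlow/Keras is available; the filter count and number of stacked convolutions are illustrative, not the exact VGG-16 configuration):

```python
import tensorflow as tf
from tensorflow.keras import layers

# one VGG-style block: stacked 3x3 "same" convolutions followed by 2x2 max-pooling
block = tf.keras.Sequential([
    layers.Conv2D(64, (3, 3), strides=1, padding="same", activation="relu"),
    layers.Conv2D(64, (3, 3), strides=1, padding="same", activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2), strides=2),
])
```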

ResNets are like plain networks, but they have skip connections ("jumps") every few layers, normally every 2 or 3 layers, and each group of layers with a jump is called a residual block.
One of the experts I talked to explained it like this: while plain networks try to find which data the input resembles, ResNets try to transform the input into data they have worked with before.
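A minimal NumPy sketch of one residual block (two layers plus the skip connection); the weight names and the relu helper are made up for illustration:

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def residual_block(x, W1, b1, W2, b2):
    # main path: two linear layers with a non-linearity in between
    z = relu(W1 @ x + b1)
    z = W2 @ z + b2
    # skip connection ("jump"): add the input back before the final activation
    return relu(z + x)
```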
This part is mostly for translating math formulas into code when practicing, and for future use.
np.linalg.norm(X[i] - centroids[j])
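- computes the Euclidean distance between point i and centroid j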
points = X[idx == k]
- get all the points in X whose cluster index idx equals k
centroids[k] = np.mean(points, axis = 0)
- The formula looks complicated, but it is just the mean of the points in the group
- https://numpy.org/doc/stable/reference/generated/numpy.mean.html
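Putting the three snippets above together into one small k-means loop (a sketch; `max_iters` and the random initialisation are my additions):

```python
import numpy as np

def k_means(X, K, max_iters=10):
    # initialise centroids as K randomly chosen points from X
    centroids = X[np.random.choice(len(X), K, replace=False)].astype(float)
    for _ in range(max_iters):
        # assignment step: give each point the index of its closest centroid
        idx = np.array([
            np.argmin([np.linalg.norm(X[i] - centroids[j]) for j in range(K)])
            for i in range(len(X))
        ])
        # update step: move each centroid to the mean of its assigned points
        for k in range(K):
            points = X[idx == k]
            if len(points) > 0:
                centroids[k] = np.mean(points, axis=0)
    return centroids, idx
```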
Loss function: Error for a single training example.
Cost function: Average of the loss functions of the entire training set.
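For example, with a squared-error loss the distinction looks like this (hypothetical names):

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.5])
y_pred = np.array([2.5, 5.0, 4.0])

losses = (y_pred - y_true) ** 2   # loss: one error value per training example
cost = np.mean(losses)            # cost: average of those losses over the whole set
```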