Out of all the predictions we made, how many were correct?

Out of all the positive predictions we made, how many were actually positive?
Use when:
- False positives are costly.
- Example: Spam detection. You don't want to classify important emails as spam.
Out of all the data points that are actually positive, how many did we correctly predict as positive?
Use when:
- False negatives are costly.
- Example: Disease diagnosis. You want to catch as many sick patients as possible.
F1 Score is a measure that combines recall and precision. As we have seen, there is a trade-off between precision and recall; F1 can therefore be used to measure how effectively our models make that trade-off.
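A minimal sketch of these three metrics, assuming binary labels stored in hypothetical NumPy arrays `y_true` and `y_pred`:

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1])   # actual labels
y_pred = np.array([1, 0, 0, 1, 1, 1])   # model predictions

tp = np.sum((y_pred == 1) & (y_true == 1))   # true positives
fp = np.sum((y_pred == 1) & (y_true == 0))   # false positives
fn = np.sum((y_pred == 0) & (y_true == 1))   # false negatives

precision = tp / (tp + fp)   # of all positive predictions, how many were right
recall = tp / (tp + fn)      # of all actual positives, how many did we catch
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean of the two
print(precision, recall, f1)
```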
- True Positive: You predicted positive, and it's true.
- True Negative: You predicted negative, and it's true.
- False Positive (Type 1 Error): You predicted positive, and it's false.
- False Negative (Type 2 Error): You predicted negative, and it's false.
- Accuracy: the proportion of the total number of predictions that were correct.
- Positive Predictive Value or Precision: the proportion of positive cases that were correctly identified.
- Negative Predictive Value: the proportion of negative cases that were correctly identified.
- Sensitivity or Recall: the proportion of actual positive cases which are correctly identified.
- Specificity: the proportion of actual negative cases which are correctly identified.
- Rate: a measure derived from the confusion matrix. There are 4 types: TPR, FPR, TNR, and FNR.
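A small sketch of the confusion-matrix cells and the four rates, again with hypothetical `y_true` / `y_pred` arrays of binary labels:

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 1, 1])

tp = np.sum((y_pred == 1) & (y_true == 1))
tn = np.sum((y_pred == 0) & (y_true == 0))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))

accuracy = (tp + tn) / (tp + tn + fp + fn)
tpr = tp / (tp + fn)   # sensitivity / recall
tnr = tn / (tn + fp)   # specificity
fpr = fp / (fp + tn)   # 1 - specificity
fnr = fn / (fn + tp)   # miss rate
```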

Log loss penalizes false classifications (both false positives and false negatives), and confident wrong predictions are penalized most heavily. It usually works well with multi-class classification.
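A sketch of binary log loss, assuming hypothetical arrays `y_true` (0/1 labels) and `y_prob` (predicted probabilities); the clipping just keeps `log()` away from exactly 0 or 1:

```python
import numpy as np

def log_loss(y_true, y_prob, eps=1e-15):
    # clip probabilities so log() never receives 0 or 1
    y_prob = np.clip(y_prob, eps, 1 - eps)
    # a confident wrong prediction produces a very large penalty
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

print(log_loss(np.array([1, 0, 1]), np.array([0.9, 0.1, 0.2])))
```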

It is one of the widely used metrics and is mainly used for binary classification. The AUC of a classifier is defined as the probability that the classifier will rank a randomly chosen positive example higher than a randomly chosen negative example.
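That pairwise-ranking definition can be checked directly with a brute-force sketch (the `scores` array below is a hypothetical stand-in for the classifier's outputs; ties count as half, a common convention):

```python
import numpy as np

def auc_by_ranking(y_true, scores):
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    # probability that a random positive is ranked above a random negative
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

print(auc_by_ranking(np.array([1, 1, 0, 0]), np.array([0.9, 0.4, 0.35, 0.8])))  # 0.75
```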
Mean Absolute Error (MAE) is the average distance between predicted and original values. Basically, it tells us how far our predictions are from the actual output. However, there is one limitation: it doesn't give any idea about the direction of the error, i.e. whether we are under-predicting or over-predicting our data.
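A one-line sketch, again with hypothetical `y_true` / `y_pred` arrays:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.5])
y_pred = np.array([2.5, 5.0, 4.0])

mae = np.mean(np.abs(y_pred - y_true))   # average absolute distance; the sign of each error is lost
print(mae)
```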

Calculated at the end to see how compact the groups are, and can be used with the "Elbow Method" to determine the ideal number for k.
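A sketch of that cost (the average squared distance of each point to its assigned centroid), using the same `X`, `idx`, and `centroids` names as the code section further down:

```python
import numpy as np

def compute_cost(X, idx, centroids):
    # mean squared distance between every point and the centroid it was assigned to
    return np.mean(np.sum((X - centroids[idx]) ** 2, axis=1))
```

Running this for several values of k and plotting cost against k gives the curve whose "elbow" suggests a good k.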

One of the best ways is to run the training many times (often hundreds of runs), each time with randomly chosen initial centroids, and keep the run with the lowest cost.
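A sketch of that idea using scikit-learn (assuming it is available), which does the repeated random initialisation for you via `n_init` and keeps the run with the lowest cost (`inertia_`):

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.random.rand(200, 2)          # hypothetical data

# run k-means 100 times with random initial centroids, keep the best run
km = KMeans(n_clusters=3, init="random", n_init=100).fit(X)
print(km.inertia_)                  # cost of the best (lowest-cost) run
```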
While the "Elbow Method" can be used, there is another method as well, which uses the silhouette of the model to choose the ideal number for k.
For each data point i:
- Compute Cohesion (a_i): the average distance between i and all other points in the same cluster.
- Compute Separation (b_i): the average distance between i and the points of the nearest cluster (not including its own cluster).
- Calculate the Silhouette Score for point i: s_i = (b_i - a_i) / max(a_i, b_i)

s_i ranges from -1 to 1:
- Close to 1 → Well-clustered (correct assignment).
- Close to 0 → Overlapping clusters.
- Negative → Misclassified point.
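A brute-force sketch of these three steps for a single point i, using the `X` and `idx` names from the code section below (scikit-learn's `silhouette_score` computes the same thing for the whole data set):

```python
import numpy as np

def silhouette_point(X, idx, i):
    own = idx[i]
    # cohesion a_i: average distance to the other points in the same cluster
    same = (idx == own)
    same[i] = False
    a_i = np.mean(np.linalg.norm(X[same] - X[i], axis=1)) if same.any() else 0.0
    # separation b_i: smallest average distance to the points of any other cluster
    b_i = min(
        np.mean(np.linalg.norm(X[idx == k] - X[i], axis=1))
        for k in np.unique(idx) if k != own
    )
    return (b_i - a_i) / max(a_i, b_i)
```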
Overall Formula:
f: Filter Size, s: Stride, p: Padding, n: Input Size, b: bias
- Filter Output Size: ((n + 2p - f) / s ) + 1
- Total Parameters: ((f * f * n_c) + b) per filter, where n_c is the number of input channels and b is the bias (1 per filter); multiply by the number of filters for the whole layer.
- Output after Max Pooling: ((n - f) / s) + 1
- Padding for "same" convolution (stride 1): (f - 1) / 2
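Those formulas translated into a small helper (a sketch; the function names are mine, and integer division stands in for the floor):

```python
def conv_output_size(n, f, p=0, s=1):
    # ((n + 2p - f) / s) + 1, floored
    return (n + 2 * p - f) // s + 1

def pool_output_size(n, f, s):
    # ((n - f) / s) + 1, floored
    return (n - f) // s + 1

def same_padding(f):
    # padding needed so a stride-1 convolution keeps the input size
    return (f - 1) // 2

print(conv_output_size(n=32, f=3, p=1, s=1))   # 32 ("same" convolution)
print(pool_output_size(n=32, f=2, s=2))        # 16
```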
- Simpler and more uniform in structure compared to other networks
- Conv Layers: all use the same filter size (3x3), stride of 1, and "same" padding
- Max-Pool: 2x2 and stride 2
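A sketch of one such block in Keras (assuming TensorFlow/Keras is available; the filter count and number of stacked convolutions are illustrative, not the exact VGG-16 configuration):

```python
import tensorflow as tf
from tensorflow.keras import layers

# one VGG-style block: stacked 3x3 "same" convolutions followed by 2x2 max-pooling
block = tf.keras.Sequential([
    layers.Conv2D(64, (3, 3), strides=1, padding="same", activation="relu"),
    layers.Conv2D(64, (3, 3), strides=1, padding="same", activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2), strides=2),
])
```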

ResNets are like plain networks, but they have skip connections ("jumps") every few layers, normally every 2 or 3 layers, and each group of layers with a jump is called a residual block.
One of the experts I talked to explained it like this: while plain networks try to find which data the input resembles, ResNets try to transform the input into data they have worked with before.
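A minimal NumPy sketch of one residual block (two layers plus the skip connection); the weight names and the relu helper are made up for illustration:

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def residual_block(x, W1, b1, W2, b2):
    # main path: two linear layers with a non-linearity in between
    z = relu(W1 @ x + b1)
    z = W2 @ z + b2
    # skip connection ("jump"): add the input back before the final activation
    return relu(z + x)
```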
This part is mostly for translating math formulas into code when practicing, and for future use.
np.linalg.norm(X[i] - centroids[j])
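- computes the Euclidean distance between point i and centroid j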
points = X[idx == k]
- get all the points in X whose cluster index idx equals k
centroids[k] = np.mean(points, axis = 0)
- The formula looks complicated, but it is just the mean of the points in the group
- https://numpy.org/doc/stable/reference/generated/numpy.mean.html
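Putting the three snippets above together into one small k-means loop (a sketch; `max_iters` and the random initialisation are my additions):

```python
import numpy as np

def k_means(X, K, max_iters=10):
    # initialise centroids as K randomly chosen points from X
    centroids = X[np.random.choice(len(X), K, replace=False)].astype(float)
    for _ in range(max_iters):
        # assignment step: give each point the index of its closest centroid
        idx = np.array([
            np.argmin([np.linalg.norm(X[i] - centroids[j]) for j in range(K)])
            for i in range(len(X))
        ])
        # update step: move each centroid to the mean of its assigned points
        for k in range(K):
            points = X[idx == k]
            if len(points) > 0:
                centroids[k] = np.mean(points, axis=0)
    return centroids, idx
```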
Loss function: Error for a single training example.
Cost function: Average of the loss functions of the entire training set.
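For example, with a squared-error loss the distinction looks like this (hypothetical names):

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.5])
y_pred = np.array([2.5, 5.0, 4.0])

losses = (y_pred - y_true) ** 2   # loss: one error value per training example
cost = np.mean(losses)            # cost: average of those losses over the whole set
```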