Classification evaluation #188
// get tp tn fp fn (and p/n as combinations) in one iteration
let _ =
    Seq.zip actual predictions
    |> Seq.iter (fun (truth,pred) ->
Would it be possible to avoid the mutable variables by replacing the Seq.iter with a fold that uses an anonymous record with tp, tn, fp and fn as fields for its accumulator?
BinaryConfusionMatrix is now only that: a record of TP/TN/FP/FN, which is used directly as the accumulator. I am not sure how {x with ...} compares to incrementing mutable variables, though.
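The fold discussed in this thread could look roughly like the sketch below. The record `ConfusionCounts` and the function `countOutcomes` are hypothetical names chosen for illustration; they are not taken from the PR, and the actual `BinaryConfusionMatrix` type may be shaped differently.

```fsharp
// Hypothetical record for illustration only; the field names are assumptions,
// not necessarily those of the BinaryConfusionMatrix type in this PR.
type ConfusionCounts = { TP: int; TN: int; FP: int; FN: int }

// Count all four outcomes in a single pass, using the record as fold accumulator.
let countOutcomes (actual: seq<bool>) (predictions: seq<bool>) : ConfusionCounts =
    Seq.zip actual predictions
    |> Seq.fold
        (fun acc (truth, pred) ->
            match truth, pred with
            | true, true -> { acc with TP = acc.TP + 1 }
            | false, false -> { acc with TN = acc.TN + 1 }
            | false, true -> { acc with FP = acc.FP + 1 }
            | true, false -> { acc with FN = acc.FN + 1 })
        { TP = 0; TN = 0; FP = 0; FN = 0 }

let counts =
    countOutcomes
        [ true; true; false; false; true ]
        [ true; false; false; true; true ]
// counts = { TP = 2; TN = 1; FP = 1; FN = 1 }
```

Each `{ acc with ... }` creates a new record value, so this version trades some allocation for immutability; whether that matters in practice would have to be measured.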
… for multi label comparisons
…onfusion matrices
Codecov Report
@@ Coverage Diff @@
## developer #188 +/- ##
=============================================
+ Coverage 24.57% 27.52% +2.94%
=============================================
Files 118 121 +3
Lines 10969 11531 +562
Branches 1972 2029 +57
=============================================
+ Hits 2696 3174 +478
- Misses 7782 7846 +64
- Partials 491 511 +20
Continue to review full report at Codecov.
docs/ComparisonMetrics.fsx
Outdated
Predictors can be compared by comparing the relative frequency distributions of metrics of interest for each possible (or obtained) confidence value.

Two prominent examples are the **Reciever Operating Characteristic (ROC)** or the **Precision-Recall metric**
typo 'Receiver'
done
(**
#### ROC curve example
Could you please add one or two sentences about what can be seen in the ROC plot and how to interpret the result? It is stated above that it is used to evaluate the validity of the predictor, but a short summary is missing in this chapter (at least for me).
Edit: Please also add a sentence explaining that AUC stands for 'area under the curve', and state whether it is a quality parameter that should be maximized.
done
docs/Integration.fsx
Outdated
Instead of integrating a function by sampling the function values in a set interval, we can also calculate the definite integral of (x,y) pairs with these methods.

This may be of use for example for calculating the area under the curve for prediction metrics such as the ROC(Reciever operator characteristic), which yields a distinct set of (Specificity/Fallout) pairs.
typo: Receiver
done
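As a rough illustration of integrating (x,y) observations rather than a function, the sketch below estimates the area under a ROC-style curve with the trapezoidal rule. `trapezoidArea` is a hypothetical helper, not the FSharp.Stats API, and the points are made-up (false positive rate, true positive rate) pairs that are assumed to be sorted by x already.

```fsharp
// Hypothetical helper, not part of FSharp.Stats: trapezoidal area under
// a polyline given as (x, y) observations sorted by ascending x.
let trapezoidArea (observations: (float * float) []) : float =
    observations
    |> Array.pairwise
    |> Array.sumBy (fun ((x0, y0), (x1, y1)) -> (x1 - x0) * (y0 + y1) / 2.0)

// Made-up (false positive rate, true positive rate) pairs for illustration.
let rocPoints = [| (0.0, 0.0); (0.0, 0.5); (0.5, 1.0); (1.0, 1.0) |]

let auc = trapezoidArea rocPoints
// auc = 0.875
```

An area close to 1 indicates a predictor that separates the classes well, while the diagonal from (0,0) to (1,1) gives 0.5.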
docs/Integration.fsx
Outdated
We compare the resulting values with the values of the known differential f'(x) = 3x^2, here called g(x)
## Explanation of the methods

In the following chapter, each estimation method is introduced brefly and visualized for the example of $f(x) = x^3$ in the interval $[0,1]$ using 5 partitions.
typo: briefly
done
\int_a^b f(x)\,dx \approx \frac{b - a}6 [f(a) + 4f(\frac{a+b}2) + f(b)]
$$

The integral of the whole integration interval is obtained by summing the integral of n partitions.
Is it somehow possible to visualize Simpson's rule, or to describe the strategy?
done
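To make the strategy concrete: on each partition, Simpson's rule replaces the function with the parabola through its values at the two endpoints and the midpoint, and integrates that parabola exactly. A minimal sketch for the documented example ($f(x) = x^3$ on $[0,1]$ with 5 partitions) follows; `simpsonPartition` and `simpson` are hypothetical names, not the functions added in this PR.

```fsharp
// Simpson's rule on a single partition [a, b]:
// (b - a) / 6 * (f(a) + 4 f((a + b) / 2) + f(b))
let simpsonPartition (f: float -> float) (a: float) (b: float) : float =
    (b - a) / 6.0 * (f a + 4.0 * f ((a + b) / 2.0) + f b)

// Hypothetical helper: split [a, b] into n equal partitions and sum the
// per-partition estimates.
let simpson (f: float -> float) (a: float) (b: float) (n: int) : float =
    let h = (b - a) / float n
    Seq.init n (fun i -> simpsonPartition f (a + float i * h) (a + float (i + 1) * h))
    |> Seq.sum

let estimate = simpson (fun x -> x ** 3.0) 0.0 1.0 5
// Simpson's rule is exact for cubic polynomials, so the estimate matches
// the true value 0.25 up to floating-point rounding.
```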
$$

The integral of the whole integration interval is obtained by summing the integral of n partitions.
This is a high quality integration documentation 🚀
let rectWidth = x - xVals[i-1]
(rectWidth*yVals[i])
)
| Midpoint -> fun (observations: (float*float) []) ->
I think there is a problem with the midpoint strategy:
- As you can see, the performed calculations are identical to those of the Trapezoidal method, so the result would always be the same.
- Since the midpoint rule requires a function to calculate the true value at (a+b)/2, it can only be applied if a function is given. For float*float inputs this is impossible, so the method should either be removed, or midpoint should take an (f: float -> float) input with an interval rather than a (float*float) [].
This is only the case for midpoint-rule definite integrals of observations. I added a hint in the XML docs for the midpoint rule.
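The point raised in this thread can be reproduced with a small sketch. It assumes that, for observations, the unknown value at the midpoint is estimated by linear interpolation between neighbouring points; that assumption is ours and may not match the PR's implementation exactly. All function names are hypothetical.

```fsharp
// Trapezoidal rule on observations: mean of neighbouring y values times width.
let trapezoidalObs (obs: (float * float) []) : float =
    obs
    |> Array.pairwise
    |> Array.sumBy (fun ((x0, y0), (x1, y1)) -> (x1 - x0) * (y0 + y1) / 2.0)

// Midpoint rule on observations. f((x0 + x1) / 2) was never observed, so this
// sketch assumes linear interpolation, which again yields (y0 + y1) / 2.
let midpointObs (obs: (float * float) []) : float =
    obs
    |> Array.pairwise
    |> Array.sumBy (fun ((x0, y0), (x1, y1)) ->
        let yMid = y0 + (y1 - y0) * 0.5
        (x1 - x0) * yMid)

// Midpoint rule on a function: f can be evaluated at the true midpoint.
let midpointFn (f: float -> float) (a: float) (b: float) (n: int) : float =
    let h = (b - a) / float n
    Seq.init n (fun i -> h * f (a + (float i + 0.5) * h))
    |> Seq.sum

let cube (x: float) = x ** 3.0
let samples = [| 0.0; 0.2; 0.4; 0.6; 0.8; 1.0 |] |> Array.map (fun x -> x, cube x)

let fromTrapezoid = trapezoidalObs samples       // approximately 0.26
let fromMidpointObs = midpointObs samples        // same value, up to rounding
let fromMidpointFn = midpointFn cube 0.0 1.0 5   // approximately 0.245
```

Under that assumption the two observation-based estimates coincide, while the function-based midpoint rule gives a different value because it uses the true $f$ at each midpoint.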
let expected = 0.25
Expect.floatClose Accuracy.low actual expected "LeftEndpoint did not return the correct result"
)
testCase "Midpoint x^3" (fun _ ->
see above
) =
fun (data: (float*float) []) -> NumericalIntegrationMethod.integrateObservations method data |> Seq.sum

module DefiniteIntegral =
This can be removed. Make sure to replace any other references.
done
This PR adds several modules, functions and types/classes for classification evaluation. For now I have only focused on binary classification evaluation, but all of this can (and most likely will) be generalized to multi-label classification.
This PR will consist of 3 parts:
- Binary confusion matrix
  This can be used for evaluating any kind of binary test or binary classification. See https://en.wikipedia.org/wiki/Confusion_matrix
  - BinaryConfusionMatrix type
- Multi-label confusion matrix
  A generalization of the binary confusion matrix for any number of labels.
  - MultiLabelConfusionMatrix type
- Comparison metrics
  Exhaustive collection of metrics that can be derived from confusion matrices
  - ComparisonMetrics type
- Equivalents of the integration methods already implemented in https://github.com/fslaborg/FSharp.Stats/blob/developer/src/FSharp.Stats/Integration/Integration.fs for estimating AUC from (x,y) data (instead of estimating AUC of a function)
  Previously, it was only possible to integrate functions with the existing approximation methods; this PR aims to enable integration of (x,y) observations.
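For readers unfamiliar with the metrics that can be derived from a binary confusion matrix, a few standard ones are sketched below. The record `BinaryCounts` and the function names are hypothetical and chosen for illustration; they are not the `ComparisonMetrics` API introduced by this PR, and the counts are made up.

```fsharp
// Hypothetical record for illustration; not the PR's BinaryConfusionMatrix type.
type BinaryCounts = { TP: int; TN: int; FP: int; FN: int }

// Sensitivity (recall, true positive rate): TP / (TP + FN)
let sensitivity (c: BinaryCounts) = float c.TP / float (c.TP + c.FN)
// Specificity (true negative rate): TN / (TN + FP)
let specificity (c: BinaryCounts) = float c.TN / float (c.TN + c.FP)
// Precision (positive predictive value): TP / (TP + FP)
let precision (c: BinaryCounts) = float c.TP / float (c.TP + c.FP)
// Fallout (false positive rate): FP / (FP + TN) = 1 - specificity
let fallout (c: BinaryCounts) = float c.FP / float (c.FP + c.TN)
// Accuracy: (TP + TN) / all observations
let accuracy (c: BinaryCounts) =
    float (c.TP + c.TN) / float (c.TP + c.TN + c.FP + c.FN)

// Made-up counts for illustration.
let example = { TP = 40; TN = 45; FP = 5; FN = 10 }
// sensitivity example is 0.8, specificity 0.9, fallout 0.1, accuracy 0.85,
// and precision 40/45 (about 0.889).
```

Plotting sensitivity against fallout for each confidence threshold yields the ROC curve whose area the new observation-based integration methods can estimate.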