Built-in box for ROC curve #197
Changes from all commits
7275007
a0b0d15
12a11a0
7b0d38e
a29b2bd
266d343
05ee4e9
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -10,6 +10,7 @@ __pycache__ | |
.cache/ | ||
.history/ | ||
.lib/ | ||
/.bsp | ||
/dist/* | ||
target/ | ||
/logs/ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,155 @@ | ||
boxes: | ||
- id: anchor | ||
inputs: {} | ||
operationId: Anchor | ||
parameters: | ||
description: |- | ||
Draws an ROC curve and computes the AUC | ||
for a binary classifier prediction. | ||
The two parameters are the true label (0 or 1) | ||
and the predicted score from the model (between 0 and 1). | ||
|
||
To avoid an overly detailed plot, | ||
the curve is based on a sample of vertices. | ||
parameters: >- | ||
[{"kind":"vertex attribute (number)","id":"true | ||
label","defaultValue":""},{"kind":"vertex | ||
attribute (number)","id":"predicted | ||
score","defaultValue":""},{"kind":"text","id":"sample | ||
size","defaultValue":"1000"}] | ||
parametricParameters: {} | ||
x: 0 | ||
y: 0 | ||
- id: Custom-plot_2 | ||
inputs: | ||
table: | ||
boxId: SQL1_5 | ||
id: table | ||
operationId: Custom plot | ||
parameters: | ||
plot_code: |- | ||
{ | ||
"layer": [{ | ||
"mark": "line", | ||
"encoding": { | ||
"x": { | ||
"field": "fpr", | ||
"title": "False positive rate", | ||
"type": "quantitative" | ||
}, | ||
"y": { | ||
"field": "tpr", | ||
"title": "True positive rate", | ||
"type": "quantitative" | ||
} | ||
} | ||
}, { | ||
"mark": { | ||
"type": "rule", | ||
"color": "lightgray", | ||
"strokeDash": [8, 8] | ||
}, | ||
"encoding": { | ||
"x": { "datum": 0 }, | ||
"y": { "datum": 0 }, | ||
"x2": { "datum": 1 }, | ||
"y2": { "datum": 1 } | ||
} | ||
}] | ||
} | ||
parametricParameters: {} | ||
x: 700 | ||
y: 150 | ||
- id: SQL1_4 | ||
inputs: | ||
input: | ||
boxId: input-input | ||
id: input | ||
operationId: SQL1 | ||
parameters: | ||
persist: 'yes' | ||
summary: Rename and filter | ||
parametricParameters: | ||
sql: |- | ||
select | ||
${`true label`} as label, | ||
${`predicted score`} as score | ||
from vertices | ||
where isnotnull(${`true label`}) | ||
and isnotnull(${`predicted score`}) | ||
limit ${`sample size`} | ||
Can we regard this as a true random sample because the rows are in random order in the query?
Yes. The input is a graph where this usually holds.
x: 250 | ||
y: 250 | ||
- id: input-input | ||
inputs: {} | ||
operationId: Input | ||
parameters: | ||
name: input | ||
parametricParameters: {} | ||
x: 50 | ||
y: 250 | ||
- id: SQL1_5 | ||
inputs: | ||
input: | ||
boxId: SQL1_4 | ||
id: table | ||
operationId: SQL1 | ||
parameters: | ||
persist: 'no' | ||
sql: |- | ||
select | ||
label, score, | ||
sum(label) over ( | ||
order by score desc rows between | ||
I only found a small issue here: if it's possible that the classifier assigns the same score to multiple items, for that specific score we will get a lot of different points. On the chart it can create weird blobs, I think. One possible solution is to add a second SQL to pick the last value from each score group, but it only works if we can assume that the order of the records remains the same between the first and the second SQL. Or we can pre-aggregate by score.
Nope. Looks fine: [screenshot]
unbounded preceding and current row) | ||
/ (select sum(label) from input) | ||
as tpr, | ||
|
||
sum(1 - label) over ( | ||
order by score desc rows between | ||
unbounded preceding and current row) | ||
/ (select sum(1 - label) from input) | ||
as fpr | ||
|
||
from input | ||
Comment on lines +100 to +114
@erbenpeter when you are not on vacation I'd appreciate if you could take a look at this SQL query for calculating the curve.
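For readers following the thread, the windowed query can be mirrored in a few lines of Python. This is a minimal sketch, not the box's implementation: the function names `roc_points` and `roc_points_grouped` are made up for illustration. The first function follows the query row by row (cumulative sums over rows ordered by score descending). The second is one possible reading of the reviewer's "pre-aggregate by score" suggestion: it emits a single point per distinct score, so tied rows cannot produce order-dependent intermediate points.

```python
from itertools import groupby


def roc_points(labels, scores):
    """One (tpr, fpr) point per row, rows ordered by score descending.

    Mirrors the cumulative window sums in the SQL. Python's sort is
    stable, so tied scores keep their input order.
    """
    order = sorted(range(len(labels)), key=lambda i: -scores[i])
    pos = sum(labels)           # total number of label == 1 rows
    neg = len(labels) - pos     # total number of label == 0 rows
    tp = fp = 0
    points = []
    for i in order:
        tp += labels[i]
        fp += 1 - labels[i]
        points.append((tp / pos, fp / neg))
    return points


def roc_points_grouped(labels, scores):
    """One (score, tpr, fpr) point per distinct score (hypothetical variant)."""
    rows = sorted(zip(scores, labels), key=lambda r: -r[0])
    pos = sum(labels)
    neg = len(labels) - pos
    tp = fp = 0
    points = []
    for score, group in groupby(rows, key=lambda r: r[0]):
        # Consume the whole tie group before emitting a point.
        for _, label in group:
            tp += label
            fp += 1 - label
        points.append((score, tp / pos, fp / neg))
    return points
```

With unique scores both functions trace the same curve. They differ only when several rows share a score.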
summary: Compute TPR / FPR | ||
parametricParameters: {} | ||
x: 450 | ||
y: 250 | ||
- id: SQL1_6 | ||
inputs: | ||
input: | ||
boxId: SQL1_5 | ||
id: table | ||
operationId: SQL1 | ||
parameters: | ||
persist: 'no' | ||
sql: | | ||
select sum(1 - fpr) / count(1) as AUC | ||
It's a tricky implementation (compared to the first 10 I've found googling) but it also assumes that scores are unique (which I think is not realistic: if two customers have the same ingredients the (deterministic) classifier needs to assign the same score to them, but it's possible that their labels are different in reality). Mathematically speaking this formula only works when the segments of the ROC curve are all vertical or horizontal.
To phrase my problem in a different way: if it's possible to have repeated scores with different labels, your curve is not defined. Its shape (and consequently the AUC value) depends on the order of the data points inside a group with the same score.
You need to have them in a random order, which is the case here. So you get a nearly diagonal line and an AUC close to 0.5 in the case of the above screenshot. (Random 0/1 label and random 0/1 prediction.)
Then I have great news! The segments of the ROC curve are all vertical or horizontal.
from input where label == 1 | ||
Comment on lines +128 to +129
Plus this one for calculating the AUC. I made up both queries myself instead of copying them from somewhere, so I'm not terribly confident in them. 😅 Thanks!
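The disagreement in this thread can be checked numerically. The sketch below is an illustration only, with made-up function names: `auc_from_rows` follows the "Compute AUC" query (the mean of `1 - fpr` over the positive rows, with rows ordered by score descending), and `auc_pairwise` is the usual pairwise definition of AUC, in which a tied positive/negative pair counts as one half. For a positive row, `1 - fpr` is the fraction of negatives ranked after it, so the two agree when scores are unique. With tied scores, the query's result depends on the row order inside the tie group.

```python
def auc_from_rows(labels, scores):
    """Mean of (1 - fpr) over positive rows, rows ordered by score descending.

    The sort is stable, so tied scores keep their input order.
    """
    order = sorted(range(len(labels)), key=lambda i: -scores[i])
    neg = len(labels) - sum(labels)
    fp = 0          # negatives seen so far in the ordering
    total = 0.0
    n_pos = 0
    for i in order:
        if labels[i] == 1:
            total += 1 - fp / neg
            n_pos += 1
        else:
            fp += 1
    return total / n_pos


def auc_pairwise(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Unique scores: both definitions give 0.75.
print(auc_from_rows([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
print(auc_pairwise([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))

# One tied pair: the row-based value is 1.0 or 0.0 depending on row order,
# while the pairwise definition gives 0.5 (the average over both orders).
print(auc_from_rows([1, 0], [0.5, 0.5]), auc_from_rows([0, 1], [0.5, 0.5]))
print(auc_pairwise([1, 0], [0.5, 0.5]))
```

This is consistent with both sides of the discussion: the row-based curve has only vertical and horizontal segments, and with rows in random order its AUC matches the tie-aware value in expectation, but any single run on tied data depends on the order.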
summary: Compute AUC | ||
parametricParameters: {} | ||
x: 700 | ||
y: 300 | ||
- id: output-plot | ||
inputs: | ||
output: | ||
boxId: Custom-plot_2 | ||
id: plot | ||
operationId: Output | ||
parameters: | ||
name: plot | ||
parametricParameters: {} | ||
x: 900 | ||
y: 150 | ||
- id: output-table | ||
inputs: | ||
output: | ||
boxId: SQL1_6 | ||
id: table | ||
operationId: Output | ||
parameters: | ||
name: AUC | ||
parametricParameters: {} | ||
x: 900 | ||
y: 300 |
This part I don't understand even after reading the long commit message. How is it related to the newly added built-in box?
The plot refers to the table GUID and the frontend sends this request to get the data. The box output state for a table also has the table's GUID. So you can access the same table either by looking up the GUID as a box output state ID and then taking the table from it (the old code) or looking up the GUID as a table (the new code).
Box output states are not persisted. We assume you only want to look at a box output that we have returned in this run. So if you restart LynxKite after creating a plot, and then look at the plot without looking at the box that generated it, you get an error. This is an edge case I didn't consider originally. You don't typically look at box outputs when not looking at the box. Except this happens with custom boxes!
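A toy model of the two lookup paths may make the edge case easier to see. Everything in this sketch is hypothetical: the `Server` class, its fields, and its method names are invented for illustration and are not LynxKite's actual code. The only assumptions taken from the explanation above are that tables are reachable by GUID after a restart while box output states are kept only for the current run.

```python
class Server:
    """Hypothetical stand-in for the backend, for illustration only."""

    def __init__(self, persisted_tables):
        # Tables are assumed to survive a restart.
        self.tables = dict(persisted_tables)
        # Box output states are assumed to live only in this run.
        self.box_output_states = {}

    def run_box(self, guid):
        # Returning a box output registers its state under the table's GUID.
        self.box_output_states[guid] = {"kind": "table", "table_guid": guid}

    def get_table_old(self, guid):
        # Old path: look up the box output state, then take its table.
        state = self.box_output_states[guid]  # KeyError if the box was not viewed
        return self.tables[state["table_guid"]]

    def get_table_new(self, guid):
        # New path: look up the GUID directly as a table.
        return self.tables[guid]


server = Server({"g1": [("fpr", "tpr")]})
server.run_box("g1")
assert server.get_table_old("g1") == server.get_table_new("g1")

# Simulated restart: tables are still there, box output states are gone.
restarted = Server(server.tables)
try:
    restarted.get_table_old("g1")
except KeyError:
    print("old lookup fails after restart")
print(restarted.get_table_new("g1"))
```

In this model a plot inside a custom box hits the failing path, because its inner box is never viewed after the restart.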