Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Built-in box for ROC curve #197

Merged
merged 7 commits into from Aug 23, 2021
Merged

Built-in box for ROC curve #197

merged 7 commits into from Aug 23, 2021

Conversation

darabos
Copy link
Contributor

@darabos darabos commented Aug 18, 2021

Resolves #181.

In action:

image

And inside:

image

The BoxOutputState may not exist because we may not have opened the
inside of the custom box. But the BoxOutputState is actually a table
GUID. The user is ignored in this case, but the original code
didn't offer real access control either on this endpoint.
Comment on lines +100 to +114
select
label, score,
sum(label) over (
order by score desc rows between
unbounded preceding and current row)
/ (select sum(label) from input)
as tpr,

sum(1 - label) over (
order by score desc rows between
unbounded preceding and current row)
/ (select sum(1 - label) from input)
as fpr

from input
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@erbenpeter when you are not on vacation I'd appreciate if you could take a look at this SQL query for calculating the curve.

Comment on lines +128 to +129
select sum(1 - fpr) / count(1) as AUC
from input where label == 1
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plus this one for calculating the AUC. I made up both queries myself instead of copying them from somewhere, so I'm not terribly confident in them. 😅 Thanks!

@darabos
Copy link
Contributor Author

darabos commented Aug 23, 2021

Everyone's on vacation but I want to prepare a release candidate. I'm merging this while hoping for a future review.

@darabos darabos merged commit 69018ae into main Aug 23, 2021
@darabos darabos deleted the darabos-roc branch August 23, 2021 13:35
@erbenpeter
Copy link
Contributor

I'm back from vacation, so now I'll take a look.

Copy link
Contributor

@erbenpeter erbenpeter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, nice and clean!

I almost like it. My issue is with the possibility of items have the same score but different labels (which I think is possible). See my more detailed comments inline.

@@ -392,8 +392,7 @@ class ProductionJsonServer @javax.inject.Inject() (

def downloadCSV = asyncAction(parse.anyContent) { (user: User, r: mvc.Request[mvc.AnyContent]) =>
val request = parseJson[GetTableOutputRequest](user, r)
implicit val metaManager = workspaceController.metaManager
val table = workspaceController.getOutput(user, request.id).table
val table = workspaceController.metaManager.table(java.util.UUID.fromString(request.id))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part I don't understand even after reading the long commit command. How is it related to the newly added built-in box?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part I don't understand even after reading the long commit command. How is it related to the newly added built-in box?

The plot refers to the table GUID and the frontend sends this request to get the data. The box output state for a table also has the table's GUID. So you can access the same table either by looking up the GUID as a box output state ID and then taking the table from it (the old code) or looking up the GUID as a table (the new code).

Box output states are not persisted. We assume you only want to look at a box output that we have returned in this run. So if you restart LynxKite after creating a plot, and then look at the plot without looking at the box that generated it, you get an error. This is an edge case I didn't consider originally. You don't typically look at box outputs when not looking at the box. Except this happens with custom boxes!

from vertices
where isnotnull(${`true label`})
and isnotnull(${`predicted score`})
limit ${`sample size`}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we regard this a a true random sample because the rows are in random order in the query?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we regard this a a true random sample because the rows are in random order in the query?

Yes. The input is a graph where this usually holds.

select
label, score,
sum(label) over (
order by score desc rows between
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only found a small issue here: If it's possible that the classifier assigns the same score for multiple items, for that specific score we will get a lot of different (fpr, tpr) pairs. Only the last one (order by score desc) is really meaningful.

On the chart it can create weird blobs I think.

One possible solution is to add a second sql to pick the last value from each score-group, but it only works if we can assume that the order of the records remains the same between the first and the second sql. Or we can pre-aggregate by score (num_1, num_0) and the then your SQL will work even for repeated score values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only found a small issue here: If it's possible that the classifier assigns the same score for multiple items, for that specific score we will get a lot of different (fpr, tpr) pairs. Only the last one (order by score desc) is really meaningful.

On the chart it can create weird blobs I think.

Nope. Looks fine:

image

parameters:
persist: 'no'
sql: |
select sum(1 - fpr) / count(1) as AUC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a tricky implementation (compared to the first 10 I've found googling) but it also assumes that scores are unique (which I think is not realistic: if two customers have the same ingredients the (deterministic) classifier needs to assign the same score to them, but it's possible that their labels are different in reality.)

Mathematically speaking this formula only works when the segments of the ROC curve are all vertical or horizontal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To phrase my problem in a different way: if it's possible to have repeated scores with different labels, your curve is not defined, its shape (and consequently the AUC value) )depends on the order of the data points inside a group with the same score.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To phrase my problem in a different way: if it's possible to have repeated scores with different labels, your curve is not defined, its shape (and consequently the AUC value) )depends on the order of the data points inside a group with the same score.

You need to have them in a random order, which is the case here. So you get a nearly diagonal line and an AUC close to 0.5 in the case of the above screenshot. (Random 0/1 label and random 0/1 prediction.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mathematically speaking this formula only works when the segments of the ROC curve are all vertical or horizontal.

Then I have great news! The segments of the ROC curve are all vertical or horizontal.

Copy link
Contributor

@erbenpeter erbenpeter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, nice and clean!

I almost like it. My issue is with the possibility of items have the same score but different labels (which I think is possible). See my more detailed comments inline.

@darabos
Copy link
Contributor Author

darabos commented Sep 10, 2021

I almost like it. My issue is with the possibility of items have the same score but different labels (which I think is possible). See my more detailed comments inline.

I've checked how scikit-learn does the ROC curve when scores are repeated and it does it like you say. 😓 Sorry I was confidently wrong. I've sent a fix in #202. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New built-in: ROC curve
2 participants