Performance score stratified by confound #10

Open
raamana opened this issue Jun 17, 2020 · 6 comments

Comments

@raamana
Owner

raamana commented Jun 17, 2020

utils.score_stratified_by_confound()

Helper to summarize the performance score (accuracy, MSE, MAE, etc.) for each
level or variant of a confound. This helps assess any bias towards a
particular value when confounds are categorical (such as site or gender). For
example, if the MSE (of the target) for Females is much lower than for Males,
it may indicate a potential bias of the model towards Females (perhaps due to
an imbalance in group sizes?)
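
To make the idea concrete, here is a minimal sketch of what such a helper might look like, using only the Python standard library. The issue names `utils.score_stratified_by_confound()` but does not specify it, so the signature, the `mean_squared_error` helper, the return format, and the toy data below are all assumptions for illustration.

```python
from collections import defaultdict


def mean_squared_error(y_true, y_pred):
    """Mean of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def score_stratified_by_confound(y_true, y_pred, confound, metric=mean_squared_error):
    """Return {confound level: (metric value, group size)}.

    Hypothetical signature: the issue names the helper but does not specify it.
    """
    if not (len(y_true) == len(y_pred) == len(confound)):
        raise ValueError("y_true, y_pred and confound must have the same length")

    # Collect the targets and predictions belonging to each confound level.
    groups = defaultdict(lambda: ([], []))
    for t, p, level in zip(y_true, y_pred, confound):
        groups[level][0].append(t)
        groups[level][1].append(p)

    # Score each level separately and keep the group size alongside it,
    # since a gap between levels may simply reflect an imbalance in size.
    return {level: (metric(ts, ps), len(ts)) for level, (ts, ps) in groups.items()}


# Made-up toy data, for illustration only.
y_true = [70.0, 65.0, 80.0, 75.0, 60.0]
y_pred = [71.0, 64.0, 76.0, 79.0, 60.0]
sex = ["F", "F", "M", "M", "F"]

scores = score_stratified_by_confound(y_true, y_pred, sex)
for level, (score, n) in sorted(scores.items()):
    print(f"{level}: MSE={score:.2f} (n={n})")
```

Returning the group size next to each score is a design choice in this sketch: it keeps the "imbalance in size" question visible whenever two levels are compared.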

@raamana
Owner Author

raamana commented Jun 17, 2020

@dinga92, this is related to the comparisons we discussed today (the figure in your poster).

@dinga92
Contributor

dinga92 commented Jun 17, 2020

I wouldn't call it a bias, but the function is useful. What about continuous confound variables?

@raamana
Owner Author

raamana commented Jun 17, 2020

Good question. Quantizing them is one option, but I haven’t thought about it in serious detail yet. Let’s get it done for categorical confounds first, like gender, site, etc.
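
As a rough illustration of the quantizing option, the sketch below bins a continuous confound into quantile-based groups using only the standard library (Python 3.8+ for `statistics.quantiles`). The function name `quantize_confound`, the choice of quantile bins rather than equal-width bins, and the example ages are assumptions, not something settled in this thread.

```python
import bisect
import statistics


def quantize_confound(values, n_bins=3):
    """Map each continuous value to a quantile-bin index in 0..n_bins-1.

    Hypothetical helper: quantile binning is just one possible scheme.
    """
    # statistics.quantiles returns the n_bins - 1 cut points between bins.
    cut_points = statistics.quantiles(values, n=n_bins)
    # bisect_right counts how many cut points are <= the value,
    # which is the index of the bin the value falls into.
    return [bisect.bisect_right(cut_points, v) for v in values]


# Made-up ages, for illustration only.
age = [22, 25, 31, 38, 45, 52, 60, 67, 74]
age_bin = quantize_confound(age, n_bins=3)
print(age_bin)
```

The resulting bin labels can then be treated like any other categorical confound. Quantile bins give roughly equal group sizes, which avoids very small strata, but the cut points depend on the sample and may not correspond to meaningful thresholds.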

@mnarayan

@raamana is this like a partial-dependence-type function, one covariate at a time? I would call it something like dependence on a covariate. The covariate could be a source of bias, or simply a moderator (just like COVID risk varies with age). The fact that there is some type of trend that differs from a flat line would make it important to consider.

@raamana
Owner Author

raamana commented Jun 18, 2020

Probably similar. I jotted this down many months ago, so I don't recall the particular paper or application that prompted me to think of this.

But I think even a simpler form would help: imagine a bar plot of a metric for the different levels of a categorical confounder (site, gender, etc.). In a way, this further breaks down the plot from Richard's poster into different levels of Age (young vs. old), Education (highly educated vs. not), etc.

@raamana
Owner Author

raamana commented Jun 18, 2020

Perhaps we don't need to make it too generic. Let's start with concrete applications and real datasets, and evolve it from there as needed.

Vis helpers to create this from Manjari's slides and provide the result of H0 would be helpful already:
[Image: between-within-site-H0]
