Performance score stratified by confound #10
Comments
@dinga92, this is related to the comparisons we discussed today (the figure in your poster).
I wouldn't call it a bias, but the function is useful. What about continuous confound variables?
Good question. Quantizing them is one option, but I haven't thought about it in serious detail yet. Let's get it done for categorical confounds first, like gender, site etc.
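As a rough illustration of the quantizing option mentioned above, here is a minimal sketch that bins a continuous confound into rank-based groups of roughly equal size, so it could then be treated like a categorical one. The function name `quantize_confound` and its `n_bins` parameter are hypothetical, not part of the library.

```python
def quantize_confound(values, n_bins=2):
    """Assign each value to one of n_bins rank-based bins of roughly equal size.

    Hypothetical helper for illustration only. Bins are formed from ranks,
    so tied values may end up in adjacent bins.
    """
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")

    n = len(values)
    # Indices of the samples, ordered from smallest to largest value.
    order = sorted(range(n), key=lambda i: values[i])

    labels = [0] * n
    for rank, idx in enumerate(order):
        # Map rank 0..n-1 onto bin 0..n_bins-1.
        labels[idx] = rank * n_bins // n
    return labels


age = [23, 71, 35, 64, 29, 58]
age_bin = quantize_confound(age, n_bins=2)  # 0 = younger half, 1 = older half
print(age_bin)
```

The number of bins is a judgment call: more bins give a finer picture of any trend but leave fewer samples per level, which makes each per-level score noisier.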
@raamana is this like a partial-dependence type of function, one covariate at a time? I would call it something like dependence on a covariate. The covariate could be a source of bias, or simply a moderator (just like COVID risk varies with age). The fact that there is some type of trend that differs from a flat line would make it important to consider.
Probably similar. I jotted this down many months ago, so I don't recall the particular paper or application that prompted me to think of this. But I think even a simpler form would help: imagine a bar plot of a metric for the different levels of a categorical confounder (site, gender etc.). In a way, this further breaks down the plot from Richard's poster into different levels of age (young vs. old), education (highly educated vs. not) and so on.
utils.score_stratified_by_confound()
Helper to summarize the performance score (accuracy, MSE, MAE etc.) for each
level or variant of a confound. This is helpful to assess any bias towards a
particular value when confounds are categorical (such as site or gender). For
example, if the MSE (of the target) for females is much lower than for males,
it may indicate a potential bias of the model towards females (due to imbalance
in sample size?)
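A minimal sketch of what such a helper could look like, assuming plain Python sequences as input. The issue names only the function, so the signature, the `metric` argument and the return format below are all assumptions rather than the project's actual API.

```python
from collections import defaultdict


def mean_squared_error(y_true, y_pred):
    """Mean of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def score_stratified_by_confound(y_true, y_pred, confound,
                                 metric=mean_squared_error):
    """Return {level: (score, n_samples)} for each level of a categorical confound.

    Hypothetical signature: the arguments and return type are assumptions.
    """
    if not (len(y_true) == len(y_pred) == len(confound)):
        raise ValueError("y_true, y_pred and confound must have the same length")

    # Collect the targets and predictions belonging to each confound level.
    groups = defaultdict(lambda: ([], []))
    for t, p, level in zip(y_true, y_pred, confound):
        groups[level][0].append(t)
        groups[level][1].append(p)

    # Score each level separately and keep its sample size next to the score,
    # since a gap between levels may simply reflect imbalance in group size.
    return {level: (metric(ts, ps), len(ts)) for level, (ts, ps) in groups.items()}


# Toy data, made up purely to exercise the function.
y_true = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_pred = [1.0, 2.0, 3.0, 5.0, 7.0, 6.0]
sex = ["F", "F", "F", "M", "M", "M"]

scores = score_stratified_by_confound(y_true, y_pred, sex)
for level, (score, n) in sorted(scores.items()):
    print(f"{level}: MSE={score:.3f} (n={n})")
```

Returning the per-level sample size together with the score keeps the "imbalance in size?" question visible to whoever reads the summary, and the resulting dictionary maps directly onto the bar plot suggested in the comments.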