This repository has been archived by the owner on May 9, 2024. It is now read-only.

[GSOC]: Compare TFCO with mindiff on text data. #27

Closed
bhaktipriya opened this issue Mar 14, 2022 · 8 comments

@bhaktipriya
Contributor

bhaktipriya commented Mar 14, 2022

TFCO (TensorFlow Constrained Optimization) is a library for constrained optimisation that can be used to make models fairer.

TFCO colab with Fairness Indicators: https://colab.sandbox.google.com/github/tensorflow/fairness-indicators/blob/master/g3doc/tutorials/Fairness_Indicators_TFCO_CelebA_Case_Study.ipynb#scrollTo=idY3Uuk3yvty

Write a colab for TFCO that uses the text toxicity dataset, as we did in the mindiff colab (https://colab.sandbox.google.com/github/tensorflow/model-remediation/blob/master/docs/min_diff/tutorials/min_diff_keras.ipynb). You can extend the mindiff colab and add a TFCO section if you like.

Please use the same baseline model as used in the mindiff example.

Please ensure that the train/test split is the same for mindiff and for TFCO.
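
One way to keep the split identical across both experiments is to compute it once from a fixed seed and reuse the resulting index lists. The sketch below is only an illustration: `split_indices`, the seed, and the test fraction are hypothetical and are not taken from either colab, which may already ship predefined splits.

```python
import random

def split_indices(num_examples, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) from a seeded shuffle of range(num_examples)."""
    indices = list(range(num_examples))
    # A private Random instance keeps the split independent of global RNG state.
    random.Random(seed).shuffle(indices)
    num_test = int(num_examples * test_fraction)
    test_idx = sorted(indices[:num_test])
    train_idx = sorted(indices[num_test:])
    return train_idx, test_idx

# Compute the split once, then reuse the same index lists for both models.
train_idx, test_idx = split_indices(num_examples=1000)
```

Calling the function again with the same arguments returns the same two lists, so the mindiff and TFCO runs can be evaluated on exactly the same examples.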

Compare performance of TFCO and mindiff, and understand where mindiff outperforms TFCO and vice versa.
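
A simple way to see where each method does better is to line up a per-slice metric from both models. The following is a minimal sketch that assumes lower FPR is better; `compare_slices` and the slice names are hypothetical, and the numbers are placeholders, not results from this thread.

```python
def compare_slices(mindiff_fpr, tfco_fpr, tolerance=0.0):
    """Map each shared slice to 'mindiff', 'tfco', or 'tie' by lower FPR."""
    result = {}
    for name in sorted(set(mindiff_fpr) & set(tfco_fpr)):
        diff = mindiff_fpr[name] - tfco_fpr[name]
        if abs(diff) <= tolerance:
            result[name] = "tie"
        elif diff > 0:
            result[name] = "tfco"      # TFCO has the lower FPR on this slice
        else:
            result[name] = "mindiff"   # mindiff has the lower FPR on this slice
    return result

# Placeholder numbers for illustration only.
mindiff_fpr = {"slice_a": 0.010, "slice_b": 0.006}
tfco_fpr = {"slice_a": 0.007, "slice_b": 0.009}
winners = compare_slices(mindiff_fpr, tfco_fpr)
```

FPR alone does not capture overall quality, so a comparison like this is best read alongside accuracy or FNR for the same slices.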

@RuijianSZ

Where should we submit the result? Thank you!

@bhaktipriya
Contributor Author

You can add your colabs here for review.

@YASH-GU24

Hello @bhaktipriya, I have completed the given starter task.
Please have a look at it here: https://colab.research.google.com/drive/1ParYStYt0Uu_rRdheW3Cnmt42Mwc51KP?usp=sharing
I have added the TFCO results at the end of the MinDiff notebook and compared the fairness of both.

@bhaktipriya
Contributor Author

This is excellent @YASH-GU24! Very well documented and clean. Results are presented very systematically. Thanks a ton for working on this!

As a final step, I'd like you to add a few more analyses/experiments to the colab. You can add new sections to the colab and reuse the setup you already have.

  1. Beat mindiff false positive rate: Add an explicit constraint to beat the false positive rate (FPR) that the mindiff model gives on one slice. If mindiff gives x, see if we can beat that by 2-10 points or so. Then extend that to improve FPR on multiple slices: if mindiff gives x, y, z on the three slices, can we reduce FPR for all three by some margin using TFCO?
  2. Multiple constraints: Add constraints to ensure that the FPR of one slice is equal to or better than that of the other slices. Experiment with multiple constraints at once, where the FPR of each slice is less than a desired epsilon, e.g., FPR Muslim <= epsilon, FPR Jewish <= epsilon. See if TFCO can perform better than mindiff.
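
Whether a trained model actually satisfies per-slice constraints like these can be checked after training with a small helper. The sketch below uses plain Python and toy data; `false_positive_rate`, `violated_constraints`, the slice names, and the targets are all hypothetical, and it only evaluates constraints rather than enforcing them during training as TFCO does.

```python
def false_positive_rate(labels, predictions, mask=None):
    """FPR = FP / (FP + TN) over the examples selected by mask (all if None)."""
    if mask is None:
        mask = [True] * len(labels)
    # Predictions (0 or 1) on the true negatives inside the slice.
    negatives = [p for y, p, m in zip(labels, predictions, mask) if m and y == 0]
    if not negatives:
        return 0.0  # assumption: report 0.0 when the slice has no negatives
    return sum(negatives) / len(negatives)

def violated_constraints(labels, predictions, slice_masks, targets):
    """Return {slice: fpr} for every slice whose FPR exceeds its target."""
    violations = {}
    for name, mask in slice_masks.items():
        fpr = false_positive_rate(labels, predictions, mask)
        if fpr > targets[name]:
            violations[name] = fpr
    return violations

# Toy data for illustration only.
labels      = [0, 0, 0, 0, 1, 0, 0, 1]
predictions = [1, 0, 0, 0, 1, 0, 0, 0]
slice_masks = {
    "slice_a": [True, True, True, True, False, False, False, False],
    "slice_b": [False, False, False, False, True, True, True, True],
}
targets = {"slice_a": 0.2, "slice_b": 0.2}
violations = violated_constraints(labels, predictions, slice_masks, targets)
```

To express "beat mindiff by some margin", each target can be set to the mindiff FPR of that slice minus the chosen margin.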

@YASH-GU24

Hello @bhaktipriya,
Thanks for providing the feedback. I will surely try working on these.
For the first point, I think you are asking me to apply TFCO to the MinDiff model as well, to see if we can further improve the FPR of the slices.
However, I am not quite clear about the second task. Do you need me to try different constraint values (the notebook used 0.05) and see if I can get better results than the MinDiff model?

@bhaktipriya
Contributor Author

Thanks, Yash. I meant adding new constraints to the model: a constraint that explicitly says the FPR of each slice (Muslim, Jewish, Christian, Hindu, Buddhist, Atheist) is less than 0.008. Experiment with this number (make it 0.002 etc.) if there's no solution at 0.008.

Below are the mindiff scores from your colab, where the overall FPR is 0.008 and group-wise it's either higher than 0.008 (Jewish, Muslim) or lower (for the others). Your goal is to train a model that has good overall FPR with the constraint that the FPR of every group is at least below 0.009 (to beat the mindiff score of the Jewish slice).

[Screenshot 2022-03-21 at 19 39 00: mindiff scores per slice]

Hope this helps and thanks a lot for working on it!
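
The request above can be read as a constrained optimisation problem. The formulation below is one interpretation, not something stated in the thread: the objective $L(\theta)$ stands for whatever overall loss the model trains on, and $\epsilon$ is the per-slice bound (0.008 or 0.009 in the comment).

```latex
\min_{\theta} \; L(\theta)
\quad \text{subject to} \quad
\mathrm{FPR}_g(\theta) \le \epsilon
\quad \text{for all } g \in \{\text{Muslim}, \text{Jewish}, \text{Christian}, \text{Hindu}, \text{Buddhist}, \text{Atheist}\}
```

Here $\mathrm{FPR}_g(\theta)$ is the false positive rate of the model with parameters $\theta$ on the examples in slice $g$, so there is one constraint per slice.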

@bhaktipriya
Contributor Author

@YASH-GU24 thanks for all the good work. Please send your proposals to us. Emails are in the contributor document.

@ronakkkk

ronakkkk commented May 1, 2022

Hello @bhaktipriya,
I am trying to improve the TFCO model, but with the normal TFCO model I am getting values where the FPR is zero. I also developed a robust optimization model using TFCO, and it helps to reduce the FNR compared to the unconstrained model. The following link shares the code along with the documentation for the given task. Please have a look and share some insights on where the TFCO model without robust optimization is going wrong.

https://colab.research.google.com/drive/1Di4FgK0ox8q0w97EtmK8SxUj9LOmDWO5?usp=sharing
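
One thing worth checking when the FPR comes out as exactly zero is whether the model has collapsed to never predicting the positive class. This is a guess at a possible cause, not something the thread confirms. A minimal sketch with toy data, where `rates` is a hypothetical helper, shows that such a predictor has an FPR of 0 and an FNR of 1:

```python
def rates(labels, predictions):
    """Return (FPR, FNR) for binary labels and binary predictions."""
    pairs = list(zip(labels, predictions))
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr

# Toy labels and a predictor that always outputs the negative class.
labels = [0, 0, 0, 1, 1]
always_negative = [0] * len(labels)
fpr, fnr = rates(labels, always_negative)
```

Reporting FNR or accuracy next to FPR for each slice makes this kind of degenerate solution easy to spot.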
