Curated Word Lists #73
Comments
The idea is to pass it a language backend and then run a battery of tests that indicate what types of bias may exist in the word embeddings you pass it. We could have linters for different languages, and they could be used to generate a report that demonstrates some of the potential downsides of the dataset.
Here are some tests that come to mind. We could project some professions onto the man-woman axis of the language embedding. The resulting projections would let us run some hypothesis tests.
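To make the projection idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the function name `project_on_axis` is hypothetical, and the three-dimensional vectors are made up rather than taken from any real embedding or language backend.

```python
import math


def project_on_axis(embeddings, words, start="man", end="woman"):
    """Scalar projection of each word onto the unit axis from `start` to `end`."""
    # The axis is the difference between the two anchor vectors.
    axis = [e - s for s, e in zip(embeddings[start], embeddings[end])]
    norm = math.sqrt(sum(a * a for a in axis))
    unit = [a / norm for a in axis]
    # Dot product with the unit axis gives a signed position along it.
    return {w: sum(x * u for x, u in zip(embeddings[w], unit)) for w in words}


# Made-up toy vectors, purely illustrative (not from a real model).
toy_embeddings = {
    "man": [1.0, 0.0, 0.0],
    "woman": [-1.0, 0.0, 0.0],
    "nurse": [-0.5, 1.0, 0.0],
    "engineer": [0.5, 1.0, 0.0],
}

scores = project_on_axis(toy_embeddings, ["nurse", "engineer"])
for word, score in scores.items():
    # Positive scores lean toward `end`, negative scores toward `start`.
    print(f"{word}: {score:+.2f}")
```

With real embeddings, the scores for a list of professions could then be fed into a hypothesis test of one's choosing. Which test is appropriate is left open here.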
The more I think about this, the more I wonder whether it would be better to just add word lists so that people can more easily make comparisons. There isn't really a consensus on how to measure bias.
The more I think about it, the riskier a linter seems. Even a linter will have blind spots, and we do not want to give a false impression here. Instead, it may be more appropriate to supply the user with curated word lists.
Place to discuss a linter to check for biases in word embeddings.