Curated Word Lists #73

koaning · 2020-07-03T08:41:24Z

Place to discuss a linter to check for biases in word embeddings.

koaning · 2020-07-03T08:53:28Z

The idea is to pass it a language backend and then run a battery of tests that indicate what types of bias may exist in the word embeddings it provides. We can have linters for different languages, and they can be used to generate a report of sorts that demonstrates some of the potential downsides of the embeddings.

koaning · 2020-07-03T13:43:32Z

Here are some tests that come to mind.

We could project some professions to the man-woman axis in the language embedding. This axis allows us to do some hypothesis tests.

There's a set of professions like "nurse" that technically should be gender-neutral. If this is not the case -> flag it.
There's a set of descriptions like "beautiful", "handsome" that might also suggest gender imbalance.
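
A minimal sketch of what such a projection test could look like, to make the idea concrete. Everything here is an assumption for illustration: the toy 3-dimensional vectors are made up, `axis_score` and `flag_words` are hypothetical helper names, cosine similarity against the man-woman difference vector is just one possible way to score the projection, and the threshold of 0.3 is arbitrary.

```python
import numpy as np

# Toy vectors, made up for illustration only. A real check would look
# these up in the language backend under test.
toy_embeddings = {
    "man": np.array([1.0, 0.0, 0.2]),
    "woman": np.array([-1.0, 0.0, 0.2]),
    "nurse": np.array([-0.6, 0.5, 0.1]),
    "engineer": np.array([0.5, 0.6, 0.1]),
    "teacher": np.array([0.05, 0.8, 0.1]),
}


def axis_score(word, embeddings, pos="man", neg="woman"):
    """Cosine similarity between a word vector and the pos-minus-neg axis.

    Positive values lean towards `pos`, negative values towards `neg`.
    """
    axis = embeddings[pos] - embeddings[neg]
    vec = embeddings[word]
    return float(vec @ axis / (np.linalg.norm(vec) * np.linalg.norm(axis)))


def flag_words(words, embeddings, threshold=0.3):
    """Return the words whose absolute axis score exceeds the threshold."""
    scores = {w: axis_score(w, embeddings) for w in words}
    return {w: s for w, s in scores.items() if abs(s) > threshold}


# Hypothetical threshold; what counts as "too far from neutral" is a
# judgment call, not an established standard.
flagged = flag_words(["nurse", "engineer", "teacher"], toy_embeddings)
for word, score in flagged.items():
    print(f"{word}: {score:+.2f}")
```

With these toy vectors, "nurse" and "engineer" land on opposite sides of the axis and get flagged, while "teacher" stays close to zero. The same function could be run over a list of descriptions instead of professions.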

koaning · 2020-08-06T10:33:56Z

The more I think about this, the more I wonder if it is better to just add word lists so people can more easily make comparisons. There isn't really a consensus on how to measure bias.

koaning · 2020-08-10T19:38:41Z

The more I think about it ... a linter is risky. Even a linter will have blind spots, and we do not want a clean report to falsely suggest that the embeddings are free of bias. Instead it may be more appropriate to supply the user with curated word lists.

koaning changed the title ~~Linter~~ Curated Word Lists Aug 10, 2020

koaning closed this as completed Aug 25, 2020
