# Gender Bias in Common Adjectives

Inspired by [this](https://arxiv.org/pdf/1607.06520.pdf) article linked to in the New York Times this morning, I decided to do some experiments on how gender bias influences words that are not intended to connote gender. 

## Algebra with Words

Words can be represented abstractly as vectors in a 300-dimensional space. The individual dimensions are learned by a machine learning algorithm, so they don't mean much by themselves. But this representation allows you, loosely speaking, to do algebra on words: King - Man + Woman = Queen.
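To make the arithmetic concrete, here is a minimal sketch using made-up 3-dimensional vectors instead of the real 300-dimensional ones. The vectors, the `toy_vectors` dictionary, and the `toy_analogy` helper are all hypothetical and exist only for illustration. Real libraries also normalize the vectors before combining them, which this sketch skips.

```python
import numpy as np

# Hypothetical 3-dimensional "embeddings", invented purely for illustration.
# Roughly: axis 0 = royalty, axis 1 = maleness, axis 2 = femaleness.
toy_vectors = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
    "apple": np.array([0.0, 0.3, 0.3]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def toy_analogy(a, b, c, vectors=toy_vectors):
    """Answer "a is to b as c is to ...?" by finding the word nearest b - a + c."""
    target = vectors[b] - vectors[a] + vectors[c]
    # Exclude the query words themselves, otherwise they tend to win.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

print(toy_analogy("man", "king", "woman"))  # queen
```

Here King - Man + Woman works out to the vector `[0.9, 0.0, 0.9]`, and of the remaining words "queen" points in the most similar direction.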

What we're really doing here is making an analogy. "Man is to king as woman is to...?" And we want the computer to be able to return "Queen" as the answer.

So I defined an analogy function using the Python `gensim` library, which allows you to easily make such comparisons based on a model pre-trained on 100 billion words of Google News data.


In [None]:
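import gensim.downloader as api

# Load the pre-trained Google News word2vec vectors (a large download on first use).
wv = api.load('word2vec-google-news-300')

def analogy(w1a, w1b, w2a):
    """Answer "w1a is to w1b as w2a is to ...?" as a (word, similarity) pair."""
    result = wv.most_similar(negative=[w1a], positive=[w1b, w2a])
    # Skip a top hit that is only a case variant of the query word.
    if result[0][0].lower() == w1b.lower():
        return result[1]
    return result[0]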





## Bias Experiments

The article I linked to above talks about detecting gender bias in words that shouldn't, by definition, be related to gender. It also presents a method for removing such bias.
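As a rough illustration of what "detecting" bias can mean, one simple measure is how strongly a word vector points along the direction from "he" to "she". The sketch below is a simplification, not the paper's exact procedure: it uses a single she/he pair, and the `gender_score` helper and the 2-dimensional vectors are hypothetical.

```python
import numpy as np

def gender_score(word_vec, she_vec, he_vec):
    """Cosine between a word vector and the she-minus-he direction.

    Positive values lean toward "she", negative values lean toward "he",
    and values near zero suggest the word is roughly gender-neutral.
    """
    direction = she_vec - he_vec
    return float(np.dot(word_vec, direction)
                 / (np.linalg.norm(word_vec) * np.linalg.norm(direction)))

# Hypothetical 2-dimensional vectors, invented for illustration only.
she = np.array([1.0, 0.0])
he = np.array([0.0, 1.0])
neutral_word = np.array([0.5, 0.5])
skewed_word = np.array([0.9, 0.2])

print(gender_score(neutral_word, she, he))  # 0.0
print(gender_score(skewed_word, she, he))   # a positive number
```

With the real model, the same function could be called as `gender_score(wv[word], wv["she"], wv["he"])`, assuming `wv` is the loaded set of word vectors.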

I decided to do some of my own experiments looking at the way gender bias shows up in adjectives that can be used to describe a person of any gender. For each word, `word`, in my list, I looked at the result of `analogy("he", word, "she")` and the result of `analogy("she", word, "he")`. The former is like "adding femininity" to the word, while the latter is like "adding masculinity."

Some of the words I tried produced nonsensical results, and others produced very similar results in both directions, indicating little gender bias in the word. The word list I will discuss here is: "badass", "sexy", "affectionate", "beautiful", "anxious", "overbearing", "assertive", "bossy", "rude", "fixated", "unhappy".



In [None]:
words = ["badass", "sexy", "affectionate", "beautiful", "anxious", "overbearing", "assertive", "bossy", "rude", "fixated", "unhappy"]
for word in words:
    # "Adding femininity": he is to word as she is to ...?
    female_result = analogy("he", word, "she")
    # "Adding masculinity": she is to word as he is to ...?
    male_result = analogy("she", word, "he")
    print("He:%s :: she:%s" % (word, female_result[0]))
    print("She:%s :: he:%s" % (word, male_result[0]))



He:badass :: she:superheroine
She:badass :: he:dude
He:sexy :: she:sassy
She:sexy :: he:suave
He:affectionate :: she:motherly
She:affectionate :: he:genial
He:beautiful :: she:gorgeous
She:beautiful :: he:magnificent
He:anxious :: she:fearful
She:anxious :: he:eager
He:overbearing :: she:bossy
She:overbearing :: he:arrogant
He:assertive :: she:bossy
She:assertive :: he:forceful
He:bossy :: she:bitchy
She:bossy :: he:arrogant
He:rude :: she:catty
She:rude :: he:discourteous
He:fixated :: she:obsessed
She:fixated :: he:preoccupied
He:unhappy :: she:dissatisfied
She:unhappy :: he:displeased


## Results

Below is a graphic with some of the results. 

![Gender Bias Results](https://i.imgur.com/tRnB789.jpg)

## Discussion

First, these results should be taken with some healthy grains of salt. I didn't consider the strength of the analogies at all, and I cherry-picked the results I found most interesting. These relationships are based on the Google News training dataset, which is not all-encompassing. 
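One way to account for analogy strength would be to use the similarity score that comes back alongside each word, since `analogy` returns a `(word, score)` pair. The sketch below is only an illustration: the `keep_strong` helper, the 0.5 cutoff, and the placeholder data are all hypothetical, and the scores are not actual model output.

```python
def keep_strong(results, min_score=0.5):
    """Keep only analogy results whose similarity score meets a threshold.

    `results` maps a query word to a (word, score) pair, the shape returned
    by `analogy`. The 0.5 default is an arbitrary, hypothetical cutoff.
    """
    return {query: (word, score)
            for query, (word, score) in results.items()
            if score >= min_score}

# Placeholder words and made-up scores, for illustration only.
example = {
    "query_a": ("result_a", 0.62),
    "query_b": ("result_b", 0.41),
}
print(keep_strong(example))  # {'query_a': ('result_a', 0.62)}
```

A stricter cutoff keeps fewer, stronger analogies, so the right value would depend on how much noise is acceptable.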

That being said, some observations:

* The model thinks female + badass is a superheroine, while a male + badass is just a dude, implying that the requirements for being "badass" are harder to meet as a woman. 
* If you take "affectionate" in the female direction, you get a word tied by dictionary definition to womanhood. If you take it in the male direction, you get a word with a significantly less warm connotation.
* The model seems to be using a different definition of "anxious" for the female and male versions. Women get the definition associated with a mental health disorder, while men get a positive, desirable attribute. This association of anxiety with femininity may be harmful for men who experience anxiety.
* For words like "bossy" and "rude", the female version is decidedly more insulting. For women, there are fine lines between "assertive," "bossy," and "bitchy."
* For the word "unhappy," the feminized version "dissatisfied" is more passive than the masculinized version, "displeased."

