Using Word Embeddings to Examine Gender Bias in Dutch Newspapers, 1950-1990
Contemporary debates on filter bubbles and polarization in public and social media raise the question to what extent news media of the past exhibited biases. This paper specifically examines bias related to gender in six Dutch national newspapers between 1950 and 1990. We measure bias related to gender by comparing local changes in word embedding models trained on newspapers with divergent ideological backgrounds. We demonstrate clear differences in gender bias and changes within and between newspapers over time. In relation to themes such as sexuality and leisure, we see the bias moving toward women, whereas, generally, the bias shifts in the direction of men, despite growing female employment numbers and feminist movements. Even though Dutch society became less stratified ideologically (depillarization), we found an increasing divergence in gender bias between religious and social-democratic newspapers on the one hand and liberal newspapers on the other. Methodologically, this paper illustrates how word embeddings can be used to examine historical language change. Future work will investigate how fine-tuning deep contextualized embedding models, such as ELMo, might be used for similar tasks with greater contextual information.
The repository contains the code for the article 'Using Word Embeddings to Examine Gender Bias in Dutch Newspapers, 1950-1990'. The experiment is included in the Jupyter Notebook, and the preparatory scripts can be found in the code folder. The article itself can be found in the Manuscript folder.
The models can be downloaded from Zenodo: https://zenodo.org/record/3237380#.XTQPYy2B1TZ. Place the downloaded models in the embeddings folder.
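
As a rough illustration of the kind of measurement the article describes, the sketch below scores a target word by the difference between its mean cosine similarity to a set of male words and to a set of female words. It is a minimal sketch, not the notebook's actual procedure: the toy two-dimensional vectors, the word lists, and the function names cosine and gender_bias are all assumptions made for illustration. In practice the vectors would come from the downloaded models.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def gender_bias(target, male_words, female_words, vectors):
    """Mean similarity to male words minus mean similarity to female words.

    Positive values mean the target sits closer to the male words,
    negative values mean it sits closer to the female words.
    """
    male_sim = sum(cosine(vectors[target], vectors[w]) for w in male_words) / len(male_words)
    female_sim = sum(cosine(vectors[target], vectors[w]) for w in female_words) / len(female_words)
    return male_sim - female_sim


# Hypothetical toy vectors; real vectors would be read from the embeddings folder.
toy_vectors = {
    "man": [1.0, 0.0],
    "vrouw": [0.0, 1.0],
    "werk": [0.8, 0.6],
    "huishouden": [0.6, 0.8],
}

bias_werk = gender_bias("werk", ["man"], ["vrouw"], toy_vectors)
bias_huishouden = gender_bias("huishouden", ["man"], ["vrouw"], toy_vectors)
print(round(bias_werk, 3), round(bias_huishouden, 3))
```

Computing such a score separately for models trained on different newspapers or decades would give one simple way to compare how the bias shifts, under the assumptions stated above.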