Project for DS8008 (Natural Language Processing): Debiasing word embeddings, an implementation of https://arxiv.org/abs/1607.06520
data
Debiasing_word_embeddings_project_report.pdf
GloVe Post_debiasing.png
GloVe Pre_debiasing.png
README.md Update README.md Apr 23, 2019
project.ipynb

DS8008-project

Instructions:

(1) Download glove.6B.zip from https://nlp.stanford.edu/projects/glove/ and extract glove.6B.50d.txt into the project folder

(2) Run the code in the Jupyter notebook project.ipynb

(3) The project report can be found in Debiasing_word_embeddings_project_report.pdf
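
The file extracted in step (1) is plain text, with one word per line followed by its vector components separated by spaces. A minimal sketch of reading it with NumPy is given below. The helper names `parse_glove_lines` and `load_glove` are illustrative assumptions and are not taken from project.ipynb, which may load the file differently.

```python
import numpy as np


def parse_glove_lines(lines):
    """Turn GloVe-style text lines into a dict mapping word -> vector.

    Each line is expected to look like: "word 0.1 -0.2 0.3 ...".
    """
    embeddings = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip empty or malformed lines
        word, values = parts[0], parts[1:]
        embeddings[word] = np.asarray(values, dtype=np.float32)
    return embeddings


def load_glove(path):
    """Read a GloVe text file (e.g. glove.6B.50d.txt) from disk."""
    with open(path, encoding="utf-8") as f:
        return parse_glove_lines(f)
```

With the file from step (1) in the project folder, `load_glove("glove.6B.50d.txt")` should return one 50-dimensional vector per word.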

Introduction

In this project, we perform hard gender debiasing on pre-trained GloVe embeddings. We use the 50-dimensional version of GloVe, which was trained on Wikipedia 2014 and Gigaword 5 and has a vocabulary of 400,000 words.

The method consists of two steps: neutralizing gender-neutral words and equalizing gender word pairs, so that any gender-neutral word is equidistant from both words of a gender pair such as she-he. After debiasing, a plot of the extreme she-he occupations shows that all occupations are at equal distance from the she and he axes. We also find that gender-specific words have moved closer to their respective gender axis (the corresponding she or he axis).

Conclusions

Applying the suggested debiasing algorithm gives promising results in reducing occupational stereotypes.

Figures: GloVe Pre_debiasing.png (before debiasing) and GloVe Post_debiasing.png (after debiasing).
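
The neutralize and equalize steps described in the Introduction can be sketched as follows. This is a minimal illustration based on the hard-debiasing formulas in https://arxiv.org/abs/1607.06520, not the code from project.ipynb. It assumes a single unit-length gender direction `g` (for example the normalized difference of the she and he vectors, whereas the paper derives the direction from several pairs), and the function names are hypothetical.

```python
import numpy as np


def unit(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)


def neutralize(w, g):
    """Remove the component of w along the unit gender direction g,
    then renormalize, so the result is orthogonal to g."""
    w = unit(w)
    return unit(w - np.dot(w, g) * g)


def equalize(a, b, g):
    """Make a gender pair (a, b) differ only along g.

    Both outputs have unit length and share the same component
    orthogonal to g. Assumes a and b have different projections
    onto g; otherwise the normalization below divides by zero.
    """
    a, b = unit(a), unit(b)
    mu = (a + b) / 2.0                 # midpoint of the pair
    mu_b = np.dot(mu, g) * g           # midpoint component along g
    nu = mu - mu_b                     # shared component orthogonal to g
    scale = np.sqrt(max(0.0, 1.0 - np.dot(nu, nu)))
    a_b = np.dot(a, g) * g - mu_b      # a's offset along g from the midpoint
    b_b = np.dot(b, g) * g - mu_b      # b's offset along g from the midpoint
    return nu + scale * unit(a_b), nu + scale * unit(b_b)
```

After both steps, a neutralized word has the same dot product with each word of an equalized pair, which is the "equal distance" property described above.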
