Measuring Gender Inequalities of German Professions on Wikipedia
.ipynb_checkpoints some tunnings Nov 3, 2017
de add new plots with hypothetical violin plots Nov 3, 2017
en add notebook files Jun 15, 2015
fr add notebook files Jun 15, 2015
results_crowdflower move files2 Sep 27, 2016
ru add notebook files Jun 15, 2015
.gitattributes add gitignore Jun 15, 2015
.gitignore add gitignore Jun 15, 2015
1.0 Create male-female pairs of profession names (German).ipynb rename ipynb Sep 27, 2016
1.1 NewDataset. Create male-femalepairs and match Wiki articles [not used in thesis].ipynb rename ipynb Sep 27, 2016
2.1 Match profession names with Wiki articles (male and female).ipynb rename ipynb Sep 27, 2016
2.2 Match profession names with Wiki articles (neutral prof names).ipynb rename ipynb Sep 27, 2016
2.3 Match profession names with Wiki articles using Levenstein dist and ratio .ipynb add new plots with hypothetical violin plots Nov 3, 2017
2.4.Plot tables with number of matched articles and redirections. Save redirection bias groups.ipynb rename ipynb Sep 27, 2016
3. Collect Google hits.ipynb some tunnings Nov 3, 2017
4. Match labor market statistics to professions .ipynb add new plots with hypothetical violin plots Nov 3, 2017
5. Collect text and mentioned persons using links from Wiki articles.ipynb rename ipynb Sep 27, 2016
6. Collect images from Wiki articles.ipynb rename ipynb Sep 27, 2016
6.1 Create file with image links for Crowdflower task .ipynb add new plots with hypothetical violin plots Nov 3, 2017
7. Mine persons mentioned in wiki articles.ipynb add new plots with hypothetical violin plots Nov 3, 2017
7.1 Get date of birth of mentioned people (plot ratios of men).ipynb add new plots with hypothetical violin plots Nov 3, 2017
7.2 Merge dataset of persons from links and dataset mined with polyglot.ipynb rename ipynb Sep 27, 2016
9.1.1 Logistic regresion (Google hits).ipynb add new plots with hypothetical violin plots Nov 3, 2017
9.1.2 Logistic regression (Labour market statistics) .ipynb add new plots with hypothetical violin plots Nov 3, 2017
9.1.22 Logistic regression (Labour market statistics)with correction for not balanced groups [not used in thesis].ipynb rename ipynb Sep 27, 2016
9.1.3 Logistic regresion (Google results and Labor market) [not used in thesis].ipynb rename ipynb Sep 27, 2016
9.2 Analysis of mentioned people in articles.ipynb add new plots with hypothetical violin plots Nov 3, 2017
9.2.1 Analysis of mentioned people (restricted by BirthDate).ipynb add new plots with hypothetical violin plots Nov 3, 2017
9.3 Analysis of images.ipynb add new plots with hypothetical violin plots Nov 3, 2017
9.39 Old Analysis of images.ipynb rename ipynb Sep 27, 2016
Crowdflower_image_links.xls files2 Feb 16, 2016
Crowdflower_image_links_all.xls structural changes Sep 27, 2016
Crowdflower_image_links_more.xls add name analysis Mar 22, 2016
French Wikipedia.ipynb I add new notebooks Dec 22, 2015
Hypothetical violin plots.ipynb add new plots with hypothetical violin plots Nov 3, 2017
LICENSE license fix Jan 30, 2018
README.md add link to paper Jan 30, 2018
R_Chi2_text_with_simulation_monteCarlo.txt add corrections ofp-val for multiple post hoc tests Apr 19, 2016
Russian Wikipedia.ipynb move files2 Sep 27, 2016
gender_names_40000.txt add name analysis Mar 22, 2016
name_gender_genderiz.txt add name analysis Mar 22, 2016
profession_images (2).txt files2 Feb 16, 2016
profession_images.csv files2 Feb 16, 2016
profession_images_all.csv add name analysis Mar 22, 2016


Measuring Gender Inequalities of German Professions on Wikipedia

License: MIT

Description

Master's thesis project "Measuring Gender Inequalities of German Professions on Wikipedia"

Abstract:

Wikipedia is a community-created online encyclopedia and arguably the largest and most popular knowledge resource on the Internet, so its reliability and neutrality are of high importance. Previous research [3] revealed gender bias in Google search results for many professions and occupations. Wikipedia has also been criticized for gender bias in its biographies [4] and for a gender gap in its editor community [5, 6]. One could therefore expect that gender bias related to professions and occupations is also present in Wikipedia. The term gender bias is used here in the sense of conscious or unconscious favoritism towards one gender over another [47] with respect to professions and occupations. The objective of this work is to identify and assess this bias. To this end, German Wikipedia articles about professions and occupations were analyzed along three dimensions: redirections, images, and people mentioned in the articles. This work provides evidence for a systematic overrepresentation of men in all three dimensions; female bias is present for only a few professions.
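To make the redirection dimension more concrete, the following is a minimal sketch of how a male/female pair of profession names could be classified by which title hosts the article. It is not the procedure used in the thesis notebooks: the function names, the category labels, and the toy data (`ARTICLES`, `REDIRECTS`) are all assumptions made up for illustration.

```python
from typing import Dict, Set


def resolve(title: str, redirects: Dict[str, str]) -> str:
    """Follow redirects until a non-redirect title is reached (with a cycle guard)."""
    seen: Set[str] = set()
    while title in redirects and title not in seen:
        seen.add(title)
        title = redirects[title]
    return title


def redirection_bias(male: str, female: str,
                     articles: Set[str], redirects: Dict[str, str]) -> str:
    """Classify a male/female name pair by where the two names end up.

    The returned labels are hypothetical categories, not the thesis's own.
    """
    male_target = resolve(male, redirects)
    female_target = resolve(female, redirects)
    male_exists = male_target in articles
    female_exists = female_target in articles

    if not male_exists and not female_exists:
        return "no article"
    if male_exists and female_exists:
        if male_target != female_target:
            return "separate articles"
        # Both names lead to the same article: check whose name is the title.
        if male_target == male:
            return "male title"      # the female form redirects to the male form
        if female_target == female:
            return "female title"    # the male form redirects to the female form
        return "neutral title"       # both forms redirect to a third title
    return "male only" if male_exists else "female only"


# Toy data, invented for illustration only (not taken from the project's dataset).
ARTICLES = {"Journalist", "Hebamme"}
REDIRECTS = {"Journalistin": "Journalist", "Entbindungspfleger": "Hebamme"}

if __name__ == "__main__":
    print(redirection_bias("Journalist", "Journalistin", ARTICLES, REDIRECTS))
    print(redirection_bias("Entbindungspfleger", "Hebamme", ARTICLES, REDIRECTS))
```

Under these assumptions, counting how many pairs fall into each category would give a rough picture of which gendered form tends to serve as the article title.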

Supervised by: Claudia Wagner, Fabian Flöck

Further Reading

Paper

My slides

Slides (by Claudia Wagner)

Thesis on ArXiv

How to cite

Olga Zagovora, Fabian Flöck, and Claudia Wagner. 2017. "(Weitergeleitet von Journalistin)": The Gendered Presentation of Professions on Wikipedia. In Proceedings of the 2017 ACM on Web Science Conference (WebSci '17). ACM, New York, NY, USA, 83-92. DOI: https://doi.org/10.1145/3091478.3091488 Download preprint

Contact

Olga Zagovora olga.zagovora (at) gesis (dot) org

License

This work is licensed under the MIT license. See LICENSE file in this repository.

Developed at the Computational Social Science department of GESIS - Leibniz Institute for the Social Sciences, Cologne (Germany) and the WeST Institute for Web Science and Technologies of the University of Koblenz-Landau, Koblenz (Germany).