Gender Gap in Online Developer Communities
Online developer communities boast millions of users: as of 2018, over 29 million on GitHub and over 8 million on Stack Overflow. Participation in these communities is becoming one of the primary ways software developers learn new programming languages, improve their skills, develop collaborative projects, and find new job opportunities (David and Shapiro, 2008; Ford et al., 2016; Vasilescu et al., 2015).
Developers on these sites may ask and answer coding questions to improve their skills (e.g. Stack Overflow), use those skills to contribute to open-source code (e.g. GitHub) and participate in coding challenges (e.g. HackerRank). These platforms are becoming increasingly important to hiring decisions, as recruiters look at GitHub contributions or reputation on Stack Overflow as indicators of developers' skill.
However, despite the promise of online software developer communities to support developers' professional development, there are indications of serious differences in women's and men's* participation in these communities, differences which may further exacerbate existing gender gaps in the global ICT workforce.
* See the UNU-CS EQUALS project page for more detail on the nature of the gender analyses used by the UN and the EQUALS project for the purposes of this research.
To understand the extent and nature of the gender gap in online software developer communities, we ask the following research questions:
- How do male and female developers differ in their participation in online software developer communities?
- How do male and female developers differ in their perceptions of belonging and kinship in online software developer communities?
- How do male and female developers in online software developer communities differ in their employment and prior experience with coding?
We use publicly available survey data from three major online developer communities:
- Stack Overflow survey (download latest survey results here)
- GitHub survey (download latest survey results here)
- HackerRank survey (download latest survey results here)
This repository contains:
- Statistical analyses of survey data (Chi-squares, log-linear models, etc.)
- Visualizations of descriptive statistics of survey data (bar charts, cross-tabulated heatmaps, etc.)
- Re-usable data cleaning scripts (cleaning country names, etc.)
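
As an illustration of the kind of analysis listed above, here is a minimal sketch of a chi-square test of independence on a gender-by-participation cross-tabulation. It is not the code used in this repository: the column names (`gender`, `contributes`), the helper functions, and the toy counts are all hypothetical, and it uses only the Python standard library. In practice a library routine such as `scipy.stats.chi2_contingency` would likely be more convenient.

```python
from collections import Counter
from math import erfc, sqrt


def crosstab(rows, row_key, col_key):
    """Count respondents for each (row value, column value) pair,
    skipping respondents with a missing answer to either question."""
    counts = Counter()
    for r in rows:
        a, b = r.get(row_key), r.get(col_key)
        if a and b:
            counts[(a, b)] += 1
    return counts


def chi_square(counts):
    """Return (statistic, degrees_of_freedom) for a test of independence."""
    row_labels = sorted({a for a, _ in counts})
    col_labels = sorted({b for _, b in counts})
    row_totals = {a: sum(counts[(a, b)] for b in col_labels) for a in row_labels}
    col_totals = {b: sum(counts[(a, b)] for a in row_labels) for b in col_labels}
    n = sum(row_totals.values())
    stat = 0.0
    for a in row_labels:
        for b in col_labels:
            # Expected count if the two variables were independent.
            expected = row_totals[a] * col_totals[b] / n
            stat += (counts[(a, b)] - expected) ** 2 / expected
    dof = (len(row_labels) - 1) * (len(col_labels) - 1)
    return stat, dof


def p_value_one_dof(stat):
    """Upper-tail p-value; valid only for 1 degree of freedom (a 2x2 table)."""
    return erfc(sqrt(stat / 2))


# Hypothetical toy counts for illustration only, NOT real survey results.
toy = {
    ("Woman", "Yes"): 10, ("Woman", "No"): 30,
    ("Man", "Yes"): 40, ("Man", "No"): 20,
}
rows = [{"gender": g, "contributes": c}
        for (g, c), k in toy.items() for _ in range(k)]
rows.append({"gender": "Woman", "contributes": None})  # missing answer, skipped

counts = crosstab(rows, "gender", "contributes")
stat, dof = chi_square(counts)
print(f"chi-square = {stat:.2f}, dof = {dof}, p = {p_value_one_dof(stat):.2g}")
```

The closed-form p-value only holds for a 2x2 table. Larger tables need the general chi-square distribution, which is one reason to prefer a statistics library for real analyses.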
- To view the results of the analyses, clone or download this repository using the green button, then open the .html files in your browser.
- To modify or re-run the code, some basic familiarity with Python and the Jupyter development environment may be necessary. See a tutorial here for assistance beginning to work with Jupyter.
- Re-usable data cleaning scripts are located in the
- The primary analysis results are in the
developer_survey_analyses script is the main analysis file, with additional visualizations in the
- Michael A. Madaio (email@example.com)
- Dr. Araba Sey