The Open Source Survey

We've run the largest survey of the open source community to date, and the results are an open dataset for us all to use and learn from. We hope the dataset informs some of the most pressing questions about open source software, the people who create it, their experience, and their relationship to the industry that depends on it.

Learn more about the survey design and the topics we're studying.

Why is GitHub doing this?

At GitHub our goal is to help everyone build better software. We believe open source code, communities, and principles create better software. As an industry, we know a lot about how open source software is created but very little about the people who create and use it. Are they professional developers, students, or hobbyists?

To build better software, we need a software community where anyone, regardless of what they look like or where they come from, can participate. This survey will help us see how we, as a community, are doing.

Open data

Open source is bigger than any company or community. The dataset is released under CC0-1.0 for anyone to use and learn from.


This survey is primarily designed and implemented by GitHub:

  • @franniez - Data and social scientist at GitHub. New to open source but not to studying people or movements, she's done extensive survey research in Washington, D.C., from inside the ivory tower, and within the technology sector.
  • @arfon - Program Manager for Open Source Data at GitHub. A lapsed academic with a passion for new models of scientific collaboration, he's used big telescopes to study dust in space, built sequencing technologies in Cambridge, and has engaged millions of people in online citizen science by co-founding the Zooniverse.
  • @mlinksva - Open Source Maven at GitHub. A lapsed engineer and non-lawyer with a passion for increasing the efficacy and scope of open production and policy, he is an advisor/director/volunteer for various open initiatives and was previously a manager and technologist at Creative Commons.

This isn't a solo effort for us; these awesome individuals and organizations have helped us design this survey:

Check out the contributing guidelines if you want to get involved.


The material in this repo is open data released under CC0-1.0. This means you need no copyright or database right (if any) permissions to make use of this data and survey questions. However:

  • Survey participants have not waived their privacy rights; read our Privacy Statement regarding Public Information on GitHub. In particular, do not attempt to reidentify survey participants.
  • If you use this dataset in a publication, a link to or citation of this repository would be appreciated.
  • If you extend this dataset, sharing your additions as open data would also be appreciated.
  • CC0-1.0 does not grant any trademark permissions. GitHub® and its stylized versions and the Invertocat mark are GitHub's Trademarks or registered Trademarks. When using GitHub's logos, be sure to follow the GitHub logo guidelines.

Citation info

The data is additionally published on Zenodo, which provides a DOI as well as an easy way to generate citations in a number of formats. We suggest modifying autogenerated citations to reflect the original publication source, e.g., as below.

  author       = {Zlotnick, Frances},
  title        = {GitHub Open Source Survey 2017},
  month        = jun,
  year         = 2017,
  doi          = {10.5281/zenodo.806811},
  publisher    = {GitHub, Inc.},
  howpublished = {\url{}}

Citations and Reuse

  • R. Stuart Geiger's Summary Analysis of the 2017 GitHub Open Source Survey, "presenting frequency counts, proportions, and frequency or proportion bar plots for every question asked in the survey."
  • The LibreOffice Design Team asked users what aspects of open source are important, using questions from the Open Source Survey. Their summary includes a comparison with Open Source Survey responses, and their data is also released under CC0-1.0.