DSPG-Young-Scholars-Program/dspg21oss


Defining and Measuring the Universe of Open Source Software Innovation

Data Science for Public Good summer program 2021

Research Question

  • In what ways can repositories be efficiently classified into “types” (e.g., operating systems, network services, database management, development tools, blockchain, etc.)? What information (e.g., tags, repository statistics, or READMEs) is most helpful for classifying existing repositories?

  • How does GitHub activity change based on the type of software being developed? Which types of software have the most contributors? Which types of software require more commits, additions, or deletions?

  • How do different types of software affect collaboration tendencies? How do these tendencies change across the academic, business, or government sectors?

Collect Access Token

This won't take longer than 5 minutes.

We are collecting access tokens to speed up the process of scraping GitHub repositories. One access token can scrape only 5,000 repositories per hour, and our goal is to scrape about 10 million repositories. Having more access tokens would help us tremendously.
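To make the numbers above concrete, here is a rough back-of-the-envelope sketch of how the total scraping time shrinks as more tokens are contributed. It assumes that every token runs in parallel and sustains the full 5,000 repositories per hour quoted above; the function name `estimated_hours` is hypothetical and is not part of the project's code.

```python
import math

# Figures quoted in this README.
REPOS_PER_TOKEN_PER_HOUR = 5_000
TARGET_REPOS = 10_000_000


def estimated_hours(num_tokens: int,
                    target_repos: int = TARGET_REPOS,
                    per_token_per_hour: int = REPOS_PER_TOKEN_PER_HOUR) -> int:
    """Whole hours needed to scrape target_repos repositories.

    Assumption: all tokens are used in parallel and each one reaches
    its full hourly quota for the entire run.
    """
    if num_tokens < 1:
        raise ValueError("need at least one token")
    return math.ceil(target_repos / (num_tokens * per_token_per_hour))


if __name__ == "__main__":
    for n in (1, 10, 50):
        hours = estimated_hours(n)
        print(f"{n:>3} token(s): {hours} hours (~{hours / 24:.1f} days)")
```

Under these assumptions a single token would need about 2,000 hours (roughly 83 days), while 50 tokens would bring that down to about 40 hours.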

Please refer to this document for detailed instructions on creating a personal access token (steps 1-5).

For steps 6 and 7, please refer to the following image.

Please send the team a private message containing your:

  • access token
  • username

Make sure you delete the message (not the access token) afterwards. For example, if you messaged us on Teams, you should delete that message.

We appreciate your help!

Project Sponsor

National Center for Science and Engineering Statistics (NCSES)

  • Carol Robbins, Senior Economist
  • Ledia Guci, Science Resources Analyst
