Scripts that I have used to retrieve and process data about JavaScript projects from Github, npm, StackOverflow etc. Run these scripts if you want up-to-date data. Otherwise, check out my gists for precompiled data.
I've compiled a GitHub star history chart based on the data from May 2017:
This was part of the research I was doing for my master's thesis on universal web applications. You might also want to check out the research results.
All scripts can be executed with node > 6 by running node scripts/<script-name> ...
. All data is written to process.stdout
. Some meta information, such as the remaining rate limit, is written to process.stderr
.
You can pipe that data into a dedicated file by running for example node scripts/githubStarsHistory.js jhnns/popular-javascript-projects > data/my-file.json
.
Heads up! Github limits the pagination offset for performance reasons. This means that it might be impossible to retrieve all the desired data. For instance, githubStarsHistory
will only return the first 40.000 stars.
Github support answer:
That's indeed intentional. Some lists of resources have a limit on the pagination for performance reasons -- fetching pages with a large offset is expensive to compute which introduces performance and reliability concerns. [...] There's no workaround for this since we intentionally limit this behavior.
Arguments:
- Star threshold
Example:
Get all JavaScript projects with at least 10000 stars:
node scripts/githubMostPopular.js 10000 > data/at-least-10000-stars.json
Requires a Github access token (see Config).
Arguments:
- The full repo name, including the organization
Example:
Get an array of star timestamps for a given repo:
node scripts/githubStarsHistory.js jhnns/popular-javascript-projects > data/star-history.json
Requires a Github access token (see Config).
Arguments:
- Exact URL
Example:
node scripts/removeFromHttpCache.js https://api.github.com/search/repositories
Removes all cached responses from the given URL.
I've created some gists with precompiled data. Also check out the revisions for historic data.
- Most popular JavaScript projects
- JavaScript library types
- GitHub stars history of Redux
- GitHub stars history of React
- GitHub stars history of Vue
- GitHub stars history of Angular 2+
- GitHub stars history of Angular 1
- GitHub stars history of Meteor
- GitHub stars history of Express
- GitHub stars history of Backbone
- GitHub stars history of Ember
- GitHub stars history of Knockout
- GitHub stars history of Dojo 1
- GitHub stars history of Marionette
If a non-public API or an API with rate limits is used, you need to provide access tokens. Copy the config.default.js
inside the project folder and rename it to config.js
. This file is ignored by git because it is for your local setup only.
Since there is a rate limit on the Github API, you need to create an access token.
In order to save some requests, all responses are infinitly cached inside the .http-cache
folder. Failed requests are not cached.
Just delete this folder if you want to avoid stale data. You can delete all cached responses for a specific URL by using the removeFromHttpCache
script.
Unlicense