GitHub Archive

http://www.githubarchive.org

Open-source developers all over the world are working on millions of projects: writing code & documentation, fixing & submitting bugs, and so forth. GitHub Archive is a project to record the public GitHub timeline, archive it, and make it easily accessible for further analysis.

Stats


GitHub provides 18 event types, which range from new commits and fork events, to opening new tickets, commenting, and adding members to a project. The activity is aggregated in hourly archives, which you can access with any HTTP client:

| Query | Command |
| --- | --- |
| Activity for March 11, 2012 at 3PM PST | `wget http://data.githubarchive.org/2012-03-11-15.json.gz` |
| Activity for March 11, 2012 | `wget http://data.githubarchive.org/2012-03-11-{0..23}.json.gz` |
| Activity for March 2012 | `wget http://data.githubarchive.org/2012-03-{01..31}-{0..23}.json.gz` |

Note: timeline data is available starting March 11, 2012.
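The hourly naming scheme can also be generated programmatically instead of relying on shell brace expansion. The following is a minimal sketch, not part of the project itself: `archive_urls` and `BASE_URL` are hypothetical names, and it assumes the `YYYY-MM-DD-H` file pattern shown in the table, with hours running from 0 to 23.

```ruby
require 'date'

# Assumed base URL, taken from the wget examples in this README.
BASE_URL = 'http://data.githubarchive.org'

# Hypothetical helper: returns the 24 hourly archive URLs for one day.
def archive_urls(date)
  day = date.strftime('%Y-%m-%d')   # e.g. "2012-03-11"
  (0..23).map do |hour|             # hours appear unpadded in the examples
    "#{BASE_URL}/#{day}-#{hour}.json.gz"
  end
end

urls = archive_urls(Date.new(2012, 3, 11))
puts urls.first
puts urls.last
```

Each URL can then be passed to any HTTP client. Dates before March 11, 2012 are not expected to resolve, since the timeline data starts on that day.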


Each archive contains a stream of JSON-encoded GitHub events, which you can process in any language. Ruby example:

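require 'open-uri'
require 'zlib'
require 'yajl'

# Download one hourly archive and decompress it in memory.
# URI.open is used because a bare open() no longer fetches URLs in Ruby 3.0+.
gz = URI.open('http://data.githubarchive.org/2012-03-11-12.json.gz')
js = Zlib::GzipReader.new(gz).read

# The archive is a stream of JSON documents; the block runs once per event.
Yajl::Parser.parse(js) do |event|
  puts event
end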
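
Once events are parsed, a common first step is to tally them by event type. The sketch below is illustrative only: `count_event_types` is a hypothetical helper, the sample events are hand-written rather than real archive data, and it assumes each parsed event is a hash with a `"type"` key (for example `"PushEvent"`).

```ruby
# Hypothetical helper: tallies events by their "type" field.
# `events` is any enumerable of parsed event hashes.
def count_event_types(events)
  counts = Hash.new(0)
  events.each { |event| counts[event['type']] += 1 }
  # Most frequent types first.
  counts.sort_by { |_type, n| -n }.to_h
end

# Hand-written sample events (not real archive data).
sample = [
  { 'type' => 'PushEvent' },
  { 'type' => 'ForkEvent' },
  { 'type' => 'PushEvent' }
]

count_event_types(sample).each { |type, n| puts "#{type}: #{n}" }
```

To use it with a downloaded archive, the events yielded by the parser in the example above could be collected into an array and passed to the helper.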

| Projects using GitHub Archive data | Link |
| --- | --- |
| your project here | fork and update |

License

(MIT License) - Copyright (c) 2012 Ilya Grigorik
