A command-line client and Go package for iterating over events from gharchive.
Download binaries from the latest release.
Usage: gharchive <start> [<end>]
Arguments:
<start> start time formatted as YYYY-MM-DD, or as an RFC3339 date
[<end>] end time formatted as YYYY-MM-DD, or as an RFC3339 date. default is an hour past start
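The argument rules above can be sketched in Go. This is a minimal illustration, not the tool's actual code: `parseTime`, `resolveRange`, and `hourlyFiles` are hypothetical names, bare dates are assumed to mean midnight UTC, the end is assumed to be exclusive, and the file names assume the hourly `YYYY-MM-DD-H.json.gz` pattern (hour not zero-padded) shown in the examples on gharchive.org.

```go
package main

import (
	"fmt"
	"time"
)

// parseTime accepts YYYY-MM-DD or an RFC3339 timestamp.
// Assumption: a bare date means midnight UTC.
func parseTime(s string) (time.Time, error) {
	if t, err := time.Parse("2006-01-02", s); err == nil {
		return t, nil
	}
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		return time.Time{}, fmt.Errorf("invalid time %q: want YYYY-MM-DD or RFC3339", s)
	}
	return t.UTC(), nil
}

// resolveRange parses start and end. An empty end defaults to one hour
// past start, as the usage text describes.
func resolveRange(start, end string) (time.Time, time.Time, error) {
	s, err := parseTime(start)
	if err != nil {
		return time.Time{}, time.Time{}, err
	}
	if end == "" {
		return s, s.Add(time.Hour), nil
	}
	e, err := parseTime(end)
	if err != nil {
		return time.Time{}, time.Time{}, err
	}
	return s, e, nil
}

// hourlyFiles lists the hourly archive file names that overlap [start, end).
// Assumption: names follow YYYY-MM-DD-H.json.gz with an unpadded hour.
func hourlyFiles(start, end time.Time) []string {
	var files []string
	for t := start.UTC().Truncate(time.Hour); t.Before(end); t = t.Add(time.Hour) {
		files = append(files, fmt.Sprintf("%s-%d.json.gz", t.Format("2006-01-02"), t.Hour()))
	}
	return files
}

func main() {
	start, end, err := resolveRange("2015-01-01T22:30:00Z", "2015-01-02T01:00:00Z")
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, f := range hourlyFiles(start, end) {
		fmt.Println(f)
	}
}
```

Because the archive is split by hour, a start time that falls mid-hour still pulls in the whole hourly file. That is presumably why a flag like --strict-created-at is useful.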
Flags:
-h, --help Show context-sensitive help.
--type=TYPE,... include only these event types
--not-type=NOT-TYPE,... exclude these event types
--strict-created-at only output events with a created_at between start and end
--no-empty-lines skip empty lines
--only-valid-json skip lines that aren't valid json objects
--preserve-order ensure that events are output in the same order they exist on data.gharchive.org
--concurrency=INT max number of concurrent downloads to run. Ignored if --preserve-order is set. Default is the number of cpus available.
--debug output debug logs
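To make the filtering flags concrete, here is a rough Go sketch of how a single line could be checked against --type, --not-type, --strict-created-at, --no-empty-lines, and --only-valid-json. It does not use this package's API: the `filter` type, its fields, and `mustTime` are hypothetical, and the created_at window is assumed to include start and exclude end.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"time"
)

// filter is a hypothetical stand-in for the CLI's filtering flags.
type filter struct {
	types      map[string]bool // --type: keep only these (empty means all)
	notTypes   map[string]bool // --not-type: drop these
	strict     bool            // --strict-created-at
	start, end time.Time       // window used when strict is true
}

// keep reports whether a raw line passes every filter. Empty lines and
// lines that are not JSON objects are always rejected in this sketch.
func (f filter) keep(line []byte) bool {
	line = bytes.TrimSpace(line)
	if !bytes.HasPrefix(line, []byte("{")) {
		return false
	}
	var ev struct {
		Type      string    `json:"type"`
		CreatedAt time.Time `json:"created_at"`
	}
	if err := json.Unmarshal(line, &ev); err != nil {
		return false
	}
	if len(f.types) > 0 && !f.types[ev.Type] {
		return false
	}
	if f.notTypes[ev.Type] {
		return false
	}
	// Assumption: the window is start <= created_at < end.
	if f.strict && (ev.CreatedAt.Before(f.start) || !ev.CreatedAt.Before(f.end)) {
		return false
	}
	return true
}

// mustTime parses an RFC3339 timestamp and panics on bad input.
func mustTime(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	f := filter{
		types:  map[string]bool{"PushEvent": true},
		strict: true,
		start:  mustTime("2015-01-01T15:00:00Z"),
		end:    mustTime("2015-01-01T16:00:00Z"),
	}
	line := []byte(`{"type":"PushEvent","created_at":"2015-01-01T15:30:00Z"}`)
	fmt.Println(f.keep(line))
}
```

Decoding only the two fields the filters need keeps the per-line cost low, which matters at the event rates a tool like this handles.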
I can iterate over about 200k events per second on an 8-core MacBook Pro with a cable modem. On an 80-core server in a data center, that increases to about 450k.