This Node.js app crawls the GitHub issues and comments API and saves the results in Elasticsearch.
git clone https://github.com/grafana/github-to-es.git
cd github-to-es
npm install
Copy config.sample.json to config.json.
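On Linux or macOS that is simply:

cp config.sample.json config.json

Then edit config.json with your settings; the sample config looks like this: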
{
  "github_token": "mytoken",
  "elasticsearch": {
    "host": "localhost",
    "port": 9200
  },
  "repos": [
    {
      "repo": "grafana/grafana-docker",
      "comments": true,
      "issues": true,
      "sinceDays": 2
    }
  ]
}
Specify your Elasticsearch connection details (no auth options are available at this point). For each repository entry you can specify whether comments and/or issues should be fetched. If you want a complete (from-the-start) import of all issues and comments, remove the sinceDays option or set it to 0. After the complete import is done, you can do incremental updates by setting it to 1.
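For example, a repo entry for a complete initial import could look like this (same fields as the sample config above, only sinceDays changed):

{
  "repo": "grafana/grafana-docker",
  "comments": true,
  "issues": true,
  "sinceDays": 0
}

Once that first import has finished, change sinceDays to 1 and re-run the import to pick up only the last day's changes.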
With config.json in place, run:

node app.js init | bunyan
The above command will create an Elasticsearch index named github. The | bunyan part is optional; it gives you nicer console logging instead of the default JSON logger. To use bunyan, install it first using npm install -g bunyan.
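If you want to verify that the index was created, you can query Elasticsearch directly (assuming the localhost:9200 host and port from the sample config):

curl "http://localhost:9200/_cat/indices/github?v"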
You can reset (remove and recreate) the index using:
node app.js reset | bunyan
To start the import of issues and comments, run:

node app.js start | bunyan
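For incremental updates (sinceDays set to 1), one option is to schedule the import to run periodically. This crontab entry is only an illustration and assumes the repo was cloned to /opt/github-to-es:

# run the incremental import at the top of every hour
0 * * * * cd /opt/github-to-es && node app.js start >> /var/log/github-to-es.log 2>&1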
When you add the Elasticsearch data source in Grafana, specify github as the index name and created_at as the Timestamp field.
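If you provision data sources from files instead of adding them through the Grafana UI, the same settings might look roughly like the sketch below. The field names follow Grafana's Elasticsearch provisioning format, and the data source name and esVersion value are assumptions that depend on your setup:

# e.g. /etc/grafana/provisioning/datasources/github.yaml
apiVersion: 1
datasources:
  - name: GitHub Issues
    type: elasticsearch
    access: proxy
    url: http://localhost:9200
    database: github
    jsonData:
      timeField: created_at
      esVersion: 5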
Currently the GitHub API limits the number of pages you can fetch to 400. At 100 items per page, this caps the initial complete import at 40,000 issues and 40,000 comments.
It would be nice to get stars & other repo stats as well.