Collect and make available metrics data #2

Open
jmatsushita opened this issue Aug 17, 2015 · 4 comments

@jmatsushita
Member

What is the minimum viable data structure? What is the simplest first step for collecting and publishing metrics data: versioned JSON files on GitHub, a NoSQL database, open scrapers on https://morph.io, or CKAN?
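
To make the "versioned JSON files" option concrete, here is a minimal sketch of what one record and one file could look like. The field names (`project`, `metric`, `value`, `collected_on`) and the `schema_version` convention are assumptions for illustration only; nothing in this thread has settled on a layout.

```python
import json
from datetime import date

# Hypothetical version marker; bump it whenever the record layout changes.
SCHEMA_VERSION = 1


def make_record(project, metric, value, collected_on):
    """Build one metric observation as a plain, JSON-serializable dict."""
    return {
        "schema_version": SCHEMA_VERSION,
        "project": project,
        "metric": metric,
        "value": value,
        "collected_on": collected_on.isoformat(),
    }


def dump_records(records):
    """Serialize deterministically so that Git diffs stay small and readable."""
    ordered = sorted(
        records,
        key=lambda r: (r["project"], r["metric"], r["collected_on"]),
    )
    return json.dumps(ordered, indent=2, sort_keys=True) + "\n"


if __name__ == "__main__":
    record = make_record("example/repo", "stars", 42, date(2015, 8, 17))
    print(dump_records([record]))
```

Sorting the records and the keys keeps the output stable between runs, which matters if the files are committed to a repository and reviewed as diffs.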

@jmatsushita
Member Author

Also http://dat-data.com/ ?

@andrew

andrew commented Aug 19, 2015

I was pondering this over the weekend. Elasticsearch seems like a good fit for the kind of data modeling planned: it has a very flexible schema, can easily grow to handle more data, offers replication options, and provides some very powerful analysis through "Aggregations": https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html

Also, since v1.7 you can export whole dumps (snapshots) of the data for sharing publicly: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html
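
As a rough illustration of the aggregations mentioned above, the sketch below builds a search request body that groups documents by one field (a `terms` aggregation) and averages another field within each group (an `avg` sub-aggregation). The field names `project` and `value` and the helper `build_aggregation_body` are hypothetical; the code only constructs and prints the JSON body and does not contact a cluster.

```python
import json


def build_aggregation_body(group_field, value_field, max_groups=10):
    """Return a search body: bucket by group_field, average value_field per bucket.

    Assumes group_field is indexed in a form that can be aggregated on.
    """
    return {
        # size 0: return only the aggregation results, no matching documents.
        "size": 0,
        "aggs": {
            "by_group": {
                "terms": {"field": group_field, "size": max_groups},
                "aggs": {
                    "avg_value": {"avg": {"field": value_field}},
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical field names, for illustration only.
    body = build_aggregation_body("project", "value")
    print(json.dumps(body, indent=2))
```

The printed body could then be sent to an index's `_search` endpoint with whichever HTTP or Elasticsearch client the project ends up using.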

@jmatsushita
Member Author

Elasticsearch as the main store? It would probably be the smoothest way to get started. Have you seen this about resilience?

I'm quite curious how well the MySQL / MongoDB combo is working in practice for @gousiosg with GHTorrent.

Another option is to borrow infrastructure like BigQuery, Redshift, or Cloud Dataflow.

@gousiosg

The MySQL and MongoDB combo is working quite well; scaling is only just starting to become an issue. The real issue is consistency across the two.
