Idea of missing events ratio? #113
Comments
Looking at the logs: it does happen, and when it does, it tends to happen several times in quick succession (e.g. 3-4 times) due to a burst of activity. However, it's not that frequent: a few times per week. We can and should fix that logic, though: when we detect that all events are new, we can paginate back until we find the overlap with previously seen events.
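For illustration, the paginate-back idea could be sketched roughly as follows. This is not the crawler's actual code (the crawler is written in Ruby); `fetch_page`, `seen_ids`, and `max_pages` are hypothetical names, and the sketch assumes each event carries a unique `id` and that pages are returned newest first.

```python
from typing import Callable, Dict, List, Set

Event = Dict[str, str]


def fetch_new_events(
    fetch_page: Callable[[int], List[Event]],  # hypothetical: returns one page of events, newest first
    seen_ids: Set[str],                        # ids of events stored by earlier polls
    max_pages: int = 10,                       # assumed safety cap on how far back to page
) -> List[Event]:
    """Collect unseen events, paging back until a page overlaps with seen_ids."""
    collected: List[Event] = []
    for page in range(1, max_pages + 1):
        events = fetch_page(page)
        if not events:
            break  # no more history available
        fresh = [e for e in events if e["id"] not in seen_ids]
        collected.extend(fresh)
        if len(fresh) < len(events):
            break  # overlap found: everything older was already seen
    # Pages can shift while we paginate, so drop duplicates but keep order.
    unique: Dict[str, Event] = {}
    for event in collected:
        unique.setdefault(event["id"], event)
    return list(unique.values())
```

With a single-page poll, a burst larger than one page silently drops the older events; here the loop only stops once a page contains at least one already-seen event (or the cap is reached), so the gap is closed whenever the history is still reachable.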
OK, that's weird. I was talking about algolia/instantsearch.js (56 commits, lots of discussion on some issues) and still: if it's only 3-4 times a week, I don't think that's the root cause, then. Do you know whether their API could miss some public events?
Err, recent activity? Note that we're no longer logging to github.timeline post Jan 1, 2015. See: https://www.githubarchive.org/#bigquery
Oh alright, yes recent activity. That was my issue :) Thank you for clarifying things!
Hey @igrigorik,
do you have an idea of how many events the crawler is missing (because of https://github.com/igrigorik/githubarchive.org/blob/master/crawler/crawler.rb#L77 and of the polling mechanism)? I was pretty surprised not to find some new projects that have had fairly high public activity (several dozen events per day over the last few days).
Thank you for that project, very useful 👍