I'm trying to plot a latency scatterplot from the raw data collected on May 23rd (when the new TradingAPI was released and we see the increase in measured latency).
I'm pulling my data from https://s3.amazonaws.com/cityindex.appmetrics/CiapiLatencyCollector/2012-05/*.zip
However, my plot seems to be missing data for a bunch of dates:
Is it possible that S3 is missing some source data?
Yes, it's possible.
The backup sends a file to S3 only once its last update is 7 days old; until then, none of the records from that file are present in S3 storage.
Sure, or even more frequently. But the problem is that a file won't be sent to S3 until the corresponding user session is finished, and some sessions last many days. It's done this way because it's impossible to "append" data to an S3 object; it can only be rewritten completely. But I can change this behavior.
By "session" here I mean an AppMetrics session, not a CIAPI session.
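To make the backup rule described above concrete, here is a rough sketch in Python. It is not the actual AppMetrics backup code: the function name `ready_for_backup` and its parameters are hypothetical, and it assumes the two conditions mentioned (last update at least 7 days old, AppMetrics session finished) apply together.

```python
from datetime import datetime, timedelta


def ready_for_backup(last_update, session_finished, now,
                     min_age=timedelta(days=7)):
    """Hypothetical eligibility check for sending a data file to S3.

    Assumption: a file is uploaded only when its AppMetrics session is
    finished AND its last update is at least `min_age` old. Because an
    S3 object cannot be appended to, the file is uploaded whole, once.
    """
    return session_finished and (now - last_update) >= min_age


now = datetime(2012, 5, 30)
# Session finished, last update exactly 7 days ago: eligible.
print(ready_for_backup(datetime(2012, 5, 23), True, now))
# Session still open: not eligible, however old the file is.
print(ready_for_backup(datetime(2012, 5, 23), False, now))
# Session finished but updated 2 days ago: not eligible yet.
print(ready_for_backup(datetime(2012, 5, 28), True, now))
```

Under these assumptions, records from a long-running session, or from a file updated within the last 7 days, would not show up in S3 yet, which would explain gaps in a plot built from the S3 zips.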
Getting raw data directly from AppMetrics server: