Missing data in S3 archive? #80

Closed
mrdavidlaing opened this Issue Jun 2, 2012 · 6 comments

@mrdavidlaing
Collaborator

I'm trying to plot a latency scatterplot from the raw data collected on May 23rd (when the new TradingAPI was released and we see the increase in measured latency).

I'm pulling my data from https://s3.amazonaws.com/cityindex.appmetrics/CiapiLatencyCollector/2012-05/*.zip

However, my plot seems to be missing data for a bunch of dates:

Is it possible that S3 is missing some source data?
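
One way to confirm which days are actually absent, rather than reading it off the plot, is to list the dates with no records at all. This is only a minimal sketch: `missing_dates` is a hypothetical helper, and it assumes the record timestamps have already been extracted from the zip files and reduced to calendar dates.

```python
from datetime import date, timedelta


def missing_dates(observed, start, end):
    """Return every date in [start, end] for which no record was observed."""
    seen = set(observed)
    total_days = (end - start).days + 1
    all_days = (start + timedelta(days=i) for i in range(total_days))
    return [day for day in all_days if day not in seen]


if __name__ == "__main__":
    # Hypothetical sample: dates on which at least one record was found.
    sample = [date(2012, 5, 21), date(2012, 5, 23)]
    for day in missing_dates(sample, date(2012, 5, 21), date(2012, 5, 24)):
        print(day.isoformat())
```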

@fandrei fandrei was assigned Jun 2, 2012
@fandrei
Owner
fandrei commented Jun 4, 2012

Yes, it's possible.
The backup sends a file to S3 only once its last update is 7 days old. Until then, none of the records from that file are present in S3 storage.
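
The rule above can be sketched roughly as follows. This is an illustration, not the actual AppMetrics backup code: the names `ARCHIVE_AGE` and `is_ready_for_archive` are hypothetical, and treating "7 days old" as "at least 7 days" is an assumption.

```python
from datetime import datetime, timedelta

# Assumed threshold, taken from the "7 days old" rule described above.
ARCHIVE_AGE = timedelta(days=7)


def is_ready_for_archive(last_update, now):
    """True once the file has gone unmodified for at least ARCHIVE_AGE."""
    return now - last_update >= ARCHIVE_AGE


if __name__ == "__main__":
    now = datetime(2012, 6, 4)
    # A file last written on May 30th is not yet eligible on June 4th.
    print(is_ready_for_archive(datetime(2012, 5, 30), now))
    # A file last written on May 23rd is eligible.
    print(is_ready_for_archive(datetime(2012, 5, 23), now))
```

Under this rule, any file still being written to within the last week would not show up in the archive, which would explain gaps for recent dates.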

@mrdavidlaing
Collaborator

Could we make the archiving happen daily?

@fandrei
Owner
fandrei commented Jun 7, 2012

Sure, or even more frequently. The problem is that a file won't be sent to S3 until the corresponding user session has finished, and some sessions last many days. It works this way because it's impossible to "append" data to an S3 object; you can only rewrite it completely. But I can change this behavior.
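
To illustrate the constraint: with a store that only supports whole-object reads and writes, an "append" has to be emulated by fetching the existing object and writing it back with the new data attached. The sketch below uses an in-memory stand-in (`FakeBucket`, a hypothetical class) rather than any real S3 client, so none of these names belong to an actual API.

```python
class FakeBucket:
    """In-memory stand-in for a store that only supports whole-object get/put."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        # Overwrites whatever was previously stored under this key.
        self._objects[key] = data

    def get(self, key):
        # Returns empty bytes when the key does not exist yet.
        return self._objects.get(key, b"")


def append_by_rewrite(bucket, key, new_data):
    """Emulate append: read the whole object, then rewrite it with new_data added."""
    existing = bucket.get(key)
    bucket.put(key, existing + new_data)


if __name__ == "__main__":
    bucket = FakeBucket()
    append_by_rewrite(bucket, "session-1.txt", b"record 1\n")
    append_by_rewrite(bucket, "session-1.txt", b"record 2\n")
    print(bucket.get("session-1.txt").decode())
```

Every append re-uploads the entire object, so the cost grows with the length of the session. That is presumably why the upload is deferred until the session ends.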

@mrdavidlaing
Collaborator

I think we should ensure that sessions recycle every few hours, to better
simulate real users. Also, if we have sessions running for days, we're going
to start triggering the anti-data-theft system I'm supposed to be
developing...

@fandrei
Owner
fandrei commented Jun 7, 2012

By "session" here I mean an AppMetrics session, not a CIAPI one.

@fandrei
Owner
fandrei commented Jun 19, 2012

Getting raw data directly from the AppMetrics server:
https://github.com/fandrei/AppMetrics/wiki/Getting-raw-data

@fandrei fandrei closed this Jun 19, 2012