
Backfill for continuous query should happen in chunks #186

Closed
pauldix opened this Issue Jan 17, 2014 · 8 comments

@pauldix
Member

pauldix commented Jan 17, 2014

The backfill for continuous queries could blow up if the db already contains a lot of data matching the query. Make sure the backfill happens in chunks so it doesn't explode.

Regex queries should first run a query with limit 1 to pull back all the matching series, then backfill each one like so:

Find the first group to fill by doing limit 1 order desc. Then backfill in chunks of groups. If it's group by 1d, do one day at a time. Otherwise grab somewhere around an hour or two at a time.

@ghost ghost assigned toddboom Jan 17, 2014

@pauldix


Member

pauldix commented Jan 31, 2014

After looking at the code, I saw that continuous queries weren't backfilling at all (at least for the first one created). I fixed that, but they're only backfilling for queries that have a group by interval.

We should keep it this way until the backfill in chunks is done. Then backfill for continuous queries that don't have a group by time.

@toddboom toddboom modified the milestones: 0.6.0, 0.5.0 Feb 6, 2014

@pauldix pauldix modified the milestones: 0.5.4, 0.6.0 Mar 31, 2014

@pauldix


Member

pauldix commented Mar 31, 2014

So this is really only applicable to continuous queries without a group by time() clause. These currently don't backfill, but they should. To ensure they don't crash things, they should backfill in chunks.

@jvshahid jvshahid modified the milestones: 0.5.5, 0.5.4, 0.5.6, 0.5.7 Apr 3, 2014

@pauldix pauldix modified the milestones: 0.5.9, 0.5.7, 0.5.10, 0.5.11 Apr 17, 2014

@jvshahid jvshahid modified the milestones: 0.5.12, 0.5.11, 0.5.13 Apr 25, 2014

@jvshahid jvshahid modified the milestones: 0.6.0, 0.5.13, 0.6.1, 0.6.2 May 2, 2014

@jvshahid jvshahid added this to the Next release milestone May 12, 2014

@jvshahid jvshahid removed this from the 0.6.2 milestone May 12, 2014

@freeformz


Contributor

freeformz commented Jun 11, 2014

I am now hitting the originally reported issue.

@pauldix


Member

pauldix commented Jun 11, 2014

Your query with a group by clause is filling things up? What's the group by? If it's something with a big time range and another column with high cardinality, it could be a problem.

@freeformz


Contributor

freeformz commented Jun 11, 2014

select percentile(service, 99) as service_99, percentile(service, 95) as service_95, median(service) as service_50 from router group by time(10m),id fill(0) into Percentile_10m_router.[id]

@freeformz


Contributor

freeformz commented Jun 11, 2014

id doesn't have a super big cardinality (maybe 100 or so).

@freeformz


Contributor

freeformz commented Jun 11, 2014

I think there's just a lot of data in the db already.

@jvshahid jvshahid removed this from the Next release milestone Oct 9, 2014

@pauldix pauldix removed this from the Next release milestone Oct 9, 2014

@jvshahid jvshahid added bug labels Oct 9, 2014

@toddboom toddboom removed bug labels Nov 25, 2014

@toddboom toddboom added this to the 0.9.0 milestone Nov 25, 2014

@inthecloud247


inthecloud247 commented Mar 7, 2015

+1
