ShardGroups should have durations based on the duration of the retention policy they're in #1783

Closed
pauldix opened this issue Feb 27, 2015 · 3 comments · Fixed by #1897

@pauldix
Member

pauldix commented Feb 27, 2015

When a ShardGroup is created for a block of time, we should set the start and end time of the group based on the duration of the retention policy. My initial thoughts on a ruleset for determining it would be:

// Go's time package has no Day constant, so define one locally.
const day = 24 * time.Hour

// Longer retention policies get coarser shard groups.
if rp.Duration > 365*day {
    shardGroup.Duration = 30 * day
} else if rp.Duration > 60*day {
    shardGroup.Duration = 7 * day
} else if rp.Duration > 2*day {
    shardGroup.Duration = 1 * day
} else if rp.Duration > 10*time.Hour {
    shardGroup.Duration = 2 * time.Hour
} else if rp.Duration > 2*time.Hour {
    shardGroup.Duration = 30 * time.Minute
} else {
    shardGroup.Duration = 5 * time.Minute
}
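
For anyone who wants to try the thresholds out, here is a minimal, self-contained sketch of the ruleset above. The function name `shardGroupDuration` and the local `day` constant are assumptions made for illustration (Go's `time` package has no `Day`); this is not the code that ended up in the server.

```go
package main

import (
	"fmt"
	"time"
)

// day is a local helper; Go's time package defines no Day constant.
const day = 24 * time.Hour

// shardGroupDuration is a hypothetical helper that maps a retention policy
// duration to a shard group duration using the thresholds proposed above.
// All comparisons are strict, so a policy of exactly 365 days gets 7 days.
func shardGroupDuration(rpDuration time.Duration) time.Duration {
	switch {
	case rpDuration > 365*day:
		return 30 * day
	case rpDuration > 60*day:
		return 7 * day
	case rpDuration > 2*day:
		return 1 * day
	case rpDuration > 10*time.Hour:
		return 2 * time.Hour
	case rpDuration > 2*time.Hour:
		return 30 * time.Minute
	default:
		return 5 * time.Minute
	}
}

func main() {
	// Sample retention durations chosen to hit each branch.
	samples := []time.Duration{
		time.Hour, 6 * time.Hour, day, 30 * day, 90 * day, 2 * 365 * day,
	}
	for _, rp := range samples {
		fmt.Printf("retention %v -> shard group %v\n", rp, shardGroupDuration(rp))
	}
}
```

Note that a 1-day policy falls into the 2-hour tier, since it exceeds 10 hours but not 2 days.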
@otoolep
Contributor

otoolep commented Feb 28, 2015

Well right now we set the duration to be the Retention Period, so something needs to be done:

https://github.com/influxdb/influxdb/blob/master/server.go#L771

Since the system drops data in whole "shard group" quanta, a year-long policy could hold up to 30 days of extra data, and a 2-month policy up to 1 week. Might that still be quite a lot of slop in those cases? I don't have anything to go on but first impressions, but it certainly feels like it could be a significant chunk. Perhaps a week for > 365 days? That would mean more shards in general, of course, which is a trade-off too, since it affects the number of topics on brokers (could be an issue, though most topics will be quiet) and file descriptors (hardly an issue).
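
To put rough numbers on the slop, here is a small sketch that computes the worst-case extra data as a fraction of the retention period, assuming (as described above) that expiry happens one whole shard group at a time, so up to one shard group's worth of data can outlive the policy. The name `slopFraction` and the `day` constant are made up for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// day is a local helper; Go's time package defines no Day constant.
const day = 24 * time.Hour

// slopFraction is a hypothetical helper: the worst-case extra retained data,
// expressed as a fraction of the retention period, assuming data is only
// dropped one whole shard group at a time.
func slopFraction(retention, shardGroup time.Duration) float64 {
	if retention <= 0 {
		return 0 // no finite retention period to compare against
	}
	return float64(shardGroup) / float64(retention)
}

func main() {
	cases := []struct {
		label      string
		retention  time.Duration
		shardGroup time.Duration
	}{
		{"1 year, 30-day groups", 365 * day, 30 * day},
		{"2 months, 7-day groups", 60 * day, 7 * day},
		{"1 year, 7-day groups", 365 * day, 7 * day},
	}
	for _, c := range cases {
		fmt.Printf("%s: up to %.1f%% extra\n", c.label, 100*slopFraction(c.retention, c.shardGroup))
	}
}
```

With these inputs the first two cases come out at roughly 8% and 12%, and the one-week alternative for a year-long policy at roughly 2%.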

@pauldix
Member Author

pauldix commented Feb 28, 2015

I knew the current implementation was wrong, which was one of the reasons I logged this ;)

On the part about the number of topics, we'll just have to do some testing. Having some extra amount of data isn't that big of a deal. You're talking about a 10% increase or something like that.

@pauldix
Member Author

pauldix commented Mar 6, 2015

On further reflection, I think the max shard size should be 7d, and in most cases (less than 6 months of retention) shards should be 1d or less.
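
One way the revised idea could look, as a rough sketch only. The comment gives just two constraints (a 7d cap, and 1d or less under 6 months of retention), so everything else here is an assumption: "6 months" is taken as 180 days, the sub-day tiers are carried over from the original proposal, and the name `revisedShardGroupDuration` is hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// day is a local helper; Go's time package defines no Day constant.
const day = 24 * time.Hour

// sixMonths approximates "6 months" as 180 days (an assumption).
const sixMonths = 180 * day

// revisedShardGroupDuration is a hypothetical helper: shard groups are capped
// at 7 days, and any policy shorter than six months gets 1 day or less.
// The sub-day tiers are assumed to match the original proposal.
func revisedShardGroupDuration(rpDuration time.Duration) time.Duration {
	switch {
	case rpDuration >= sixMonths:
		return 7 * day // the proposed maximum
	case rpDuration > 2*day:
		return 1 * day
	case rpDuration > 10*time.Hour:
		return 2 * time.Hour
	case rpDuration > 2*time.Hour:
		return 30 * time.Minute
	default:
		return 5 * time.Minute
	}
}

func main() {
	for _, rp := range []time.Duration{time.Hour, 30 * day, 90 * day, 2 * 365 * day} {
		fmt.Printf("retention %v -> shard group %v\n", rp, revisedShardGroupDuration(rp))
	}
}
```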
