Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deleting expired shards isn't thread safe #779

Closed
jvshahid opened this issue Jul 24, 2014 · 5 comments
Closed

Deleting expired shards isn't thread safe #779

jvshahid opened this issue Jul 24, 2014 · 5 comments
Assignees
Milestone

Comments

@jvshahid
Copy link
Contributor

Due to a bug fixed by #769 and c02cff2. Shards were getting dropped prematurely (using the duration instead of retention) while some user is still writing to them. Although the bug is fixed it turns out that a shard could be deleted and closed while some data is writing to it which will cause a nil pointer dereference causing the entire daemon to crash. The two locations that were identified to have race condition (there could be more) are https://github.com/influxdb/influxdb/blob/master/datastore/shard.go#L459 and https://github.com/influxdb/influxdb/blob/master/datastore/shard.go#L93

We should mark the shards as deletable until no one has reference to it and delete it, similar to the shard cache that we have in the datastore.

@jvshahid jvshahid added this to the 0.8.0 milestone Jul 24, 2014
@nichdiekuh
Copy link

Sounds like this issue is related to my #747 issue and will fix it as well. can't wait it! :)

@jvshahid
Copy link
Contributor Author

@nichdiekuh yes that could be the cause of the issues you're seeing. That said, as of rc3 InfluxDB incorrectly drops shards when it shouldn't be which is why you hit the bug in the first place. We will release rc4 today with the fix for #769 and #774. This version should not have any problem with the benchmark you provided. I ran the benchmark last night and it wrote 40% before the machine ran out of space.

@nichdiekuh
Copy link

That sounds promising! And btw: I've set the retention-sweep-period too "1000000m" for testing purposes, started my script and it's still running. Unfortunately the sweep-period cannot be set to anything like "30d" or so (influx doesn't start with other units, but that's another issue)

@shugo
Copy link
Contributor

shugo commented Jul 31, 2014

Thanks for your efforts for shard expiration.

Is this issue related to #767?
It seems that shards are not removed from shardsById even if they are deleted by PeriodicallyDropShardsWithRetentionPolicies().

@pauldix
Copy link
Member

pauldix commented Aug 22, 2014

Fixed by #866

@pauldix pauldix closed this as completed Aug 22, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants