Hitting open files limit causes influxdb to create shards in loop #1024

Merged
merged 1 commit into from
Nov 14, 2014

Conversation

dgnorton
Contributor

@dgnorton dgnorton commented Nov 6, 2014

Default install on Debian, 0.8.3, no tuning done except changing data dir path.

After InfluxDB hit the open files limit, it started creating new shards in a loop. It could not write any data, and the result was a few thousand new directories.
It should either:

  • exit, so that if it is monitored, the user notices the failure before wondering why there are no stats from the last shard
  • close old or least-used shards
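
The second option could look roughly like the sketch below: keep at most a fixed number of shards open and close the least recently used one before opening another. This is only an illustration of the idea, not InfluxDB's code. `shardHandle`, `shardCache`, `newShardCache` and the `open` callback are hypothetical names, and the sketch is not safe for concurrent use.

```go
package main

import (
	"container/list"
	"fmt"
)

// shardHandle is a hypothetical stand-in for an open shard. A real shard
// would hold file descriptors that Close releases.
type shardHandle struct {
	id     int
	closed bool
}

func (s *shardHandle) Close() { s.closed = true }

// shardCache keeps at most maxOpen shards open and evicts the least
// recently used one when a new shard has to be opened.
type shardCache struct {
	maxOpen int
	order   *list.List            // front = most recently used
	byID    map[int]*list.Element // shard id -> element in order
	open    func(id int) (*shardHandle, error)
}

func newShardCache(maxOpen int, open func(id int) (*shardHandle, error)) *shardCache {
	if maxOpen < 1 {
		maxOpen = 1 // always allow at least one open shard
	}
	return &shardCache{
		maxOpen: maxOpen,
		order:   list.New(),
		byID:    make(map[int]*list.Element),
		open:    open,
	}
}

// Get returns an open handle for the shard, opening it if necessary.
func (c *shardCache) Get(id int) (*shardHandle, error) {
	if el, ok := c.byID[id]; ok {
		c.order.MoveToFront(el)
		return el.Value.(*shardHandle), nil
	}
	// Make room first, so opening the new shard stays under the cap.
	for c.order.Len() >= c.maxOpen {
		h := c.order.Remove(c.order.Back()).(*shardHandle)
		h.Close()
		delete(c.byID, h.id)
	}
	h, err := c.open(id)
	if err != nil {
		return nil, err
	}
	c.byID[id] = c.order.PushFront(h)
	return h, nil
}

func main() {
	cache := newShardCache(2, func(id int) (*shardHandle, error) {
		return &shardHandle{id: id}, nil
	})
	first, _ := cache.Get(1)
	cache.Get(2)
	cache.Get(3) // the cache is full, so shard 1 is closed and evicted
	fmt.Println("shard 1 closed:", first.closed, "open shards:", len(cache.byID))
}
```

The cap would have to sit comfortably below the process's open files limit, since each real shard may hold more than one descriptor.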

Logs:

[2014/10/12 00:00:01 CEST] [INFO] (github.com/influxdb/influxdb/cluster.(*ClusterConfiguration).GetShardToWriteToBySeriesAndTime:796) No matching shards for write at time 1413064200000000u, creating...
[2014/10/12 00:00:01 CEST] [INFO] (github.com/influxdb/influxdb/cluster.(*ClusterConfiguration).createShards:827) createShards for space long_term: start: Tue Oct 7 02:00:00 +0200 CEST 2014. end: Thu Nov 6 01:00:00 +0100 CET 2014
[2014/10/12 00:00:01 CEST] [INFO] (github.com/influxdb/influxdb/datastore.(*ShardDatastore).GetOrCreateShard:162) DATASTORE: opening or creating shard /var/lib/influxdb/db/shard_db_v2/04960
[2014/10/12 00:00:01 CEST] [EROR] (github.com/influxdb/influxdb/cluster.(*ClusterConfiguration).AddShards:1029) AddShards: error setting local store: %!(EXTRA *os.PathError=open /var/lib/influxdb/db/shard_db_v2/04960/type: too many open files)
[2014/10/12 00:00:01 CEST] [EROR] (github.com/influxdb/influxdb/coordinator.(*RaftServer).doOrProxyCommandOnce:169) Cannot run command &coordinator.CreateShardsCommand{Shards:[]*cluster.NewShardData{(*cluster.NewShardData)(0xc20983ba40)}, SpaceName:""}. open /var/lib/influxdb/db/shard_db_v2/04960/type: too many open files
[2014/10/12 00:00:01 CEST] [EROR] (github.com/influxdb/influxdb/coordinator.(*RaftServer).CreateShards:749) RAFT: CreateShards: %!(EXTRA *os.PathError=open /var/lib/influxdb/db/shard_db_v2/04960/type: too many open files)
[2014/10/12 00:00:01 CEST] [EROR] (github.com/influxdb/influxdb/coordinator.(*CoordinatorImpl).InterpolateValuesAndCommit:626) Couldn't write data for continuous query: %!(EXTRA *os.PathError=open /var/lib/influxdb/db/shard_db_v2/04960/type: too many open files)
[2014/10/12 00:00:01 CEST] [INFO] (github.com/influxdb/influxdb/cluster.(*ClusterConfiguration).GetShardToWriteToBySeriesAndTime:796) No matching shards for write at time 1413064200000000u, creating...
[2014/10/12 00:00:01 CEST] [INFO] (github.com/influxdb/influxdb/cluster.(*ClusterConfiguration).createShards:827) createShards for space long_term: start: Tue Oct 7 02:00:00 +0200 CEST 2014. end: Thu Nov 6 01:00:00 +0100 CET 2014
[2014/10/12 00:00:01 CEST] [INFO] (github.com/influxdb/influxdb/datastore.(*ShardDatastore).GetOrCreateShard:162) DATASTORE: opening or creating shard /var/lib/influxdb/db/shard_db_v2/04961
[2014/10/12 00:00:01 CEST] [EROR] (github.com/influxdb/influxdb/cluster.(*ClusterConfiguration).AddShards:1029) AddShards: error setting local store: %!(EXTRA *os.PathError=open /var/lib/influxdb/db/shard_db_v2/04961/type: too many open files)
[2014/10/12 00:00:01 CEST] [EROR] (github.com/influxdb/influxdb/coordinator.(*RaftServer).doOrProxyCommandOnce:169) Cannot run command &coordinator.CreateShardsCommand{Shards:[]*cluster.NewShardData{(*cluster.NewShardData)(0xc208edaee0)}, SpaceName:""}. open /var/lib/influxdb/db/shard_db_v2/04961/type: too many open files
[2014/10/12 00:00:01 CEST] [EROR] (github.com/influxdb/influxdb/coordinator.(*RaftServer).CreateShards:749) RAFT: CreateShards: %!(EXTRA *os.PathError=open /var/lib/influxdb/db/shard_db_v2/04961/type: too many open files)
[2014/10/12 00:00:01 CEST] [EROR] (github.com/influxdb/influxdb/coordinator.(*CoordinatorImpl).InterpolateValuesAndCommit:626) Couldn't write data for continuous query: %!(EXTRA *os.PathError=open /var/lib/influxdb/db/shard_db_v2/04961/type: too many open files)
[2014/10/12 00:00:01 CEST] [INFO] (github.com/influxdb/influxdb/cluster.(*ClusterConfiguration).GetShardToWriteToBySeriesAndTime:796) No matching shards for write at time 1413064200000000u, creating...
[2014/10/12 00:00:01 CEST] [INFO] (github.com/influxdb/influxdb/cluster.(*ClusterConfiguration).createShards:827) createShards for space long_term: start: Tue Oct 7 02:00:00 +0200 CEST 2014. end: Thu Nov 6 01:00:00 +0100 CET 2014
[2014/10/12 00:00:01 CEST] [INFO] (github.com/influxdb/influxdb/datastore.(*ShardDatastore).GetOrCreateShard:162) DATASTORE: opening or creating shard /var/lib/influxdb/db/shard_db_v2/04962
[2014/10/12 00:00:01 CEST] [EROR] (github.com/influxdb/influxdb/cluster.(*ClusterConfiguration).AddShards:1029) AddShards: error setting local store: %!(EXTRA *os.PathError=open /var/lib/influxdb/db/shard_db_v2/04962/type: too many open files)
[2014/10/12 00:00:01 CEST] [EROR] (github.com/influxdb/influxdb/coordinator.(*RaftServer).doOrProxyCommandOnce:169) Cannot run command &coordinator.CreateShardsCommand{Shards:[]*cluster.NewShardData{(*cluster.NewShardData)(0xc20a3684d0)}, SpaceName:""}. open /var/lib/influxdb/db/shard_db_v2/04962/type: too many open files
[2014/10/12 00:00:01 CEST] [EROR] (github.com/influxdb/influxdb/coordinator.(*RaftServer).CreateShards:749) RAFT: CreateShards: %!(EXTRA *os.PathError=open /var/lib/influxdb/db/shard_db_v2/04962/type: too many open files)
[2014/10/12 00:00:01 CEST] [EROR] (github.com/influxdb/influxdb/coordinator.(*CoordinatorImpl).InterpolateValuesAndCommit:626) Couldn't write data for continuous query: %!(EXTRA *os.PathError=open /var/lib/influxdb/db/shard_db_v2/04962/type: too many open files)

@dgnorton
Contributor

@XANi does it create them in a loop or is a client continually writing to it and it creates a new file every time the client attempts to write? Either is bad...just want to make sure I understand what you're seeing.

@XANi
Author

XANi commented Oct 31, 2014

From what I've debugged:

  • clients are continuously writing (we pipe our monitoring data into it)
  • at midnight it tries to create a new shard
  • shard creation fails because of the FD limit
  • it tries to create the shard again
  • it gets into a loop, creating thousands of shards

toddboom added a commit that referenced this pull request Nov 14, 2014
Hitting open files limit causes influxdb to create shards in loop
@toddboom toddboom merged commit 0169fb1 into master Nov 14, 2014
@toddboom toddboom removed the review label Nov 14, 2014
@toddboom toddboom deleted the fix-1024 branch November 14, 2014 22:18

5 participants