Yesterday one of our disk cabinets lost power due to an electrical problem, which affected our InfluxDB data and corrupted one TSM file.
Once the disk was recovered, we rebooted the host and InfluxDB tried to start without success, consuming a large amount of resources, apparently because of the corrupted TSM file.
Note that the data for the following graphs is stored in another DB, so it does not come from the unresponsive host. The null values appear in the graphs because the host was unresponsive.
As the log shows, InfluxDB repeatedly tried to open the same file, consuming all memory until the host started to swap.
To try to solve this, we did the following:
Called influx_inspect dumptsm -dir /store/influxdata
The other shards were OK, and then the process stopped with the following message, without giving the name of the failing shard: mmapAccessor: invalid indexStart
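Since the tool did not name the failing file, one workaround is to check each TSM file individually. The following is only a minimal sketch, not part of influx_inspect. It assumes the TSM layout as I understand it for InfluxDB 1.x: a 4-byte magic number 0x16D116D1, a 1-byte version, and a trailing 8-byte big-endian index offset that must point inside the file. The function names check_tsm and find_bad_tsm are hypothetical.

```python
import os
import struct

TSM_MAGIC = 0x16D116D1  # assumption: TSM header magic number
HEADER_SIZE = 5         # assumption: 4-byte magic + 1-byte version
FOOTER_SIZE = 8         # assumption: big-endian uint64 index offset


def check_tsm(path):
    """Return None if the file passes the basic checks, else an error string."""
    size = os.path.getsize(path)
    if size < HEADER_SIZE + FOOTER_SIZE:
        return "file too small"
    with open(path, "rb") as f:
        (magic,) = struct.unpack(">I", f.read(4))
        if magic != TSM_MAGIC:
            return "bad magic number"
        # The last 8 bytes hold the offset where the index starts.
        f.seek(-FOOTER_SIZE, os.SEEK_END)
        (index_start,) = struct.unpack(">Q", f.read(FOOTER_SIZE))
    # An index offset at or beyond the footer cannot be valid.
    if index_start >= size - FOOTER_SIZE:
        return "invalid indexStart"
    return None


def find_bad_tsm(root):
    """Walk root and return a sorted list of (path, error) for failing .tsm files."""
    bad = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(".tsm"):
                path = os.path.join(dirpath, name)
                err = check_tsm(path)
                if err is not None:
                    bad.append((path, err))
    return sorted(bad)
```

Calling find_bad_tsm("/store/influxdata/data") would list every file that fails these checks together with the reason. It only inspects the header and footer, so a file that passes may still contain damaged blocks.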
Stopped InfluxDB and moved the TSM file
At 17:20, we decided to stop InfluxDB and move the corrupted TSM file /store/influxdata/data/db_metrics/1y/54/000001696-000000002.tsm to another directory.
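For anyone who needs to repeat this step, the move can be scripted so the file is set aside rather than deleted. This is a small sketch under assumptions: the helper name quarantine and the destination directory are hypothetical, and it should only be run while InfluxDB is stopped.

```python
import os
import shutil


def quarantine(tsm_path, quarantine_dir):
    """Move tsm_path into quarantine_dir and return the new path."""
    os.makedirs(quarantine_dir, exist_ok=True)
    dest = os.path.join(quarantine_dir, os.path.basename(tsm_path))
    # Refuse to overwrite a file that was quarantined earlier.
    if os.path.exists(dest):
        raise FileExistsError(dest)
    return shutil.move(tsm_path, dest)
```

In our case the call would look like quarantine("/store/influxdata/data/db_metrics/1y/54/000001696-000000002.tsm", "/store/quarantine"), where /store/quarantine is just an example location. Keeping the file allows a later attempt to recover data from it.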
Started InfluxDB
We started InfluxDB and it ran OK, apart from the data lost with the moved TSM file.
Expected Behaviour
InfluxDB should skip or discard corrupted TSM files and start normally, writing an error to the log.
The tool influx_inspect dumptsm should print the name of the failing TSM file.
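To illustrate the behaviour I would expect, here is a rough sketch of a loader that skips a file that fails to open and logs its name, instead of retrying or aborting. It is not InfluxDB's actual code: open_all and the opener callback are hypothetical names used only for illustration.

```python
import logging

log = logging.getLogger("tsm_loader")


def open_all(paths, opener):
    """Open every path with opener; skip and log the ones that raise.

    Returns (opened, skipped): the successfully opened objects and the
    paths that were skipped.
    """
    opened, skipped = [], []
    for path in paths:
        try:
            opened.append(opener(path))
        except Exception as exc:
            # Name the failing file so the operator can act on it.
            log.error("skipping corrupted TSM file %s: %s", path, exc)
            skipped.append(path)
    return opened, skipped
```

The key points are that one bad file does not block the rest, each file is attempted only once, and the error message always includes the file path.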
Actual Behaviour
InfluxDB becomes unresponsive and consumes massive resources, with apparently unbounded memory consumption.
The tool influx_inspect dumptsm doesn't print the name of the failing shard, only the error.
Hi,
InfluxDB: 1.5.2
OS: RHEL 7.4
TSI: enabled