vmstorage sometimes fails to start on NFSv3 disks #162

Closed
prezhdarov opened this issue Aug 26, 2019 · 10 comments
Labels: bug (Something isn't working), question (The question issue)

Comments

@prezhdarov

Deployed VM in Kubernetes yesterday, and this morning both vmstorage nodes were dead. Apparently they were looking for a file in a folder that didn't exist (no, the vmstorage nodes did not lie). I cleaned up and redeployed vmstorage (new PVCs), and a few hours later one of the nodes is experiencing the same difficulties. Here's what it says on startup:

2019-08-26T19:04:15.088+0000 info lib/logger/flag.go:11 build version: vmstorage-20190822-115633-tags-v1.26.0-cluster-0-ge9db22a5
2019-08-26T19:04:15.088+0000 info lib/logger/flag.go:12 command line flags
2019-08-26T19:04:15.088+0000 info lib/logger/flag.go:20 flag "http.disableResponseCompression" = "false"
2019-08-26T19:04:15.088+0000 info lib/logger/flag.go:20 flag "httpListenAddr" = ":8482"
2019-08-26T19:04:15.088+0000 info lib/logger/flag.go:20 flag "loggerLevel" = "INFO"
2019-08-26T19:04:15.088+0000 info lib/logger/flag.go:20 flag "memory.allowedPercent" = "60"
2019-08-26T19:04:15.088+0000 info lib/logger/flag.go:20 flag "precisionBits" = "64"
2019-08-26T19:04:15.088+0000 info lib/logger/flag.go:20 flag "retentionPeriod" = "48"
2019-08-26T19:04:15.088+0000 info lib/logger/flag.go:20 flag "rpc.disableCompression" = "false"
2019-08-26T19:04:15.088+0000 info lib/logger/flag.go:20 flag "search.maxTagKeys" = "secret"
2019-08-26T19:04:15.088+0000 info lib/logger/flag.go:20 flag "search.maxTagValues" = "10000"
2019-08-26T19:04:15.088+0000 info lib/logger/flag.go:20 flag "search.maxUniqueTimeseries" = "100000"
2019-08-26T19:04:15.088+0000 info lib/logger/flag.go:20 flag "snapshotAuthKey" = "secret"
2019-08-26T19:04:15.088+0000 info lib/logger/flag.go:20 flag "storageDataPath" = "/storage"
2019-08-26T19:04:15.088+0000 info lib/logger/flag.go:20 flag "version" = "false"
2019-08-26T19:04:15.088+0000 info lib/logger/flag.go:20 flag "vminsertAddr" = ":8400"
2019-08-26T19:04:15.088+0000 info lib/logger/flag.go:20 flag "vmselectAddr" = ":8401"
2019-08-26T19:04:15.088+0000 info app/vmstorage/main.go:34 opening storage at "/storage" with retention period 48 months
2019-08-26T19:04:15.090+0000 info lib/memory/memory.go:33 limiting caches to 2576980377 bytes of RAM according to -memory.allowedPercent=60
2019-08-26T19:04:15.090+0000 info lib/storage/storage.go:548 loading MetricName->TSID cache from "/storage/cache/metricName_tsid"...
2019-08-26T19:04:15.091+0000 info lib/storage/storage.go:553 loaded MetricName->TSID cache from "/storage/cache/metricName_tsid" in 472.738µs; entriesCount: 0; sizeBytes: 0
2019-08-26T19:04:15.091+0000 info lib/storage/storage.go:548 loading MetricID->TSID cache from "/storage/cache/metricID_tsid"...
2019-08-26T19:04:15.091+0000 info lib/storage/storage.go:553 loaded MetricID->TSID cache from "/storage/cache/metricID_tsid" in 706.625µs; entriesCount: 0; sizeBytes: 0
2019-08-26T19:04:15.091+0000 info lib/storage/storage.go:548 loading MetricID->MetricName cache from "/storage/cache/metricID_metricName"...
2019-08-26T19:04:15.092+0000 info lib/storage/storage.go:553 loaded MetricID->MetricName cache from "/storage/cache/metricID_metricName" in 303.152µs; entriesCount: 0; sizeBytes: 0
2019-08-26T19:04:15.092+0000 info lib/storage/storage.go:548 loading Date->MetricID cache from "/storage/cache/date_metricID"...
2019-08-26T19:04:15.092+0000 info lib/storage/storage.go:553 loaded Date->MetricID cache from "/storage/cache/date_metricID" in 367.411µs; entriesCount: 0; sizeBytes: 0
2019-08-26T19:04:15.092+0000 info lib/storage/storage.go:482 loading curr_hour_metric_ids from "/storage/cache/curr_hour_metric_ids"...
2019-08-26T19:04:15.092+0000 info lib/storage/storage.go:485 nothing to load from "/storage/cache/curr_hour_metric_ids"
2019-08-26T19:04:15.092+0000 info lib/storage/storage.go:482 loading prev_hour_metric_ids from "/storage/cache/prev_hour_metric_ids"...
2019-08-26T19:04:15.092+0000 info lib/storage/storage.go:485 nothing to load from "/storage/cache/prev_hour_metric_ids"
2019-08-26T19:04:15.097+0000 info lib/mergeset/table.go:128 opening table "/storage/indexdb/15BE7AFB904C1F0B"...
2019-08-26T19:04:15.123+0000 info lib/mergeset/table.go:160 table "/storage/indexdb/15BE7AFB904C1F0B" has been opened in 25.740749ms; partsCount: 4; blocksCount: 49, itemsCount: 48595; sizeBytes: 163308
2019-08-26T19:04:15.125+0000 info lib/mergeset/table.go:128 opening table "/storage/indexdb/15BE7AFB904C1F0A"...
2019-08-26T19:04:15.132+0000 info lib/mergeset/table.go:160 table "/storage/indexdb/15BE7AFB904C1F0A" has been opened in 6.545175ms; partsCount: 0; blocksCount: 0, itemsCount: 0; sizeBytes: 0
2019-08-26T19:04:15.145+0000 info lib/mergeset/table.go:170 waiting for raw items flusher to stop on "/storage/indexdb/15BE7AFB904C1F0B"...
2019-08-26T19:04:15.145+0000 info lib/mergeset/table.go:173 raw items flusher stopped in 384ns on "/storage/indexdb/15BE7AFB904C1F0B"
2019-08-26T19:04:15.145+0000 info lib/mergeset/table.go:175 waiting for part mergers to stop on "/storage/indexdb/15BE7AFB904C1F0B"...
2019-08-26T19:04:15.145+0000 info lib/mergeset/table.go:178 part mergers stopped in 395ns on "/storage/indexdb/15BE7AFB904C1F0B"
2019-08-26T19:04:15.145+0000 info lib/mergeset/table.go:180 flushing inmemory parts to files on "/storage/indexdb/15BE7AFB904C1F0B"...
2019-08-26T19:04:15.145+0000 info lib/mergeset/table.go:204 0 inmemory parts have been flushed to files in 2.118µs on "/storage/indexdb/15BE7AFB904C1F0B"
2019-08-26T19:04:15.146+0000 info lib/mergeset/table.go:170 waiting for raw items flusher to stop on "/storage/indexdb/15BE7AFB904C1F0A"...
2019-08-26T19:04:15.146+0000 info lib/mergeset/table.go:173 raw items flusher stopped in 4.566µs on "/storage/indexdb/15BE7AFB904C1F0A"
2019-08-26T19:04:15.146+0000 info lib/mergeset/table.go:175 waiting for part mergers to stop on "/storage/indexdb/15BE7AFB904C1F0A"...
2019-08-26T19:04:15.146+0000 info lib/mergeset/table.go:178 part mergers stopped in 237ns on "/storage/indexdb/15BE7AFB904C1F0A"
2019-08-26T19:04:15.146+0000 info lib/mergeset/table.go:180 flushing inmemory parts to files on "/storage/indexdb/15BE7AFB904C1F0A"...
2019-08-26T19:04:15.146+0000 info lib/mergeset/table.go:204 0 inmemory parts have been flushed to files in 1.098µs on "/storage/indexdb/15BE7AFB904C1F0A"
2019-08-26T19:04:15.148+0000 fatal app/vmstorage/main.go:38 cannot open a storage at /storage with retention period 48 months: cannot open table at "/storage/data": cannot open partitions in the table "/storage/data": cannot open partition "2019_08": cannot open small parts from "/storage/data/small/2019_08": cannot open part "/storage/data/small/2019_08/1502_1494_20190826161500.000_20190826161500.000_15BE7B0A0A234959": cannot open timestamps file: open /storage/data/small/2019_08/1502_1494_20190826161500.000_20190826161500.000_15BE7B0A0A234959/timestamps.bin: no such file or directory

The node is right again: there is no timestamps.bin in that folder, and in fact no other files either. The adjacent folders, however, do have those files, except maybe one or two.

Could this be a tiny little nuisance I've accidentally discovered, or am I holding it wrong? As you can see, it's a mostly basic setup - the persistent storage is NFSv3.

prezhdarov changed the title from "vmstorage seems to be losing its date after a while (or looking for the wrong kind if it)" to "vmstorage seems to be losing its data after a while (or looking for the wrong kind if it)" Aug 26, 2019
@prezhdarov
Author

Right, it seems this has been covered in #61 already. I might try to work out the NFS options needed to avoid such behaviour (NetApp Trident is the storage driver), but wouldn't it be easier if vmstorage simply ignored empty dirs and even tried to remove them at a later stage?

valyala changed the title from "vmstorage seems to be losing its data after a while (or looking for the wrong kind if it)" to "vmstorage sometimes fails to start on NFSv3 disks" Aug 28, 2019
@valyala
Collaborator

valyala commented Aug 28, 2019

It looks like VictoriaMetrics failed to properly flush files on the NFSv3 disk before stopping. Did the previous vmstorage container stop gracefully via a SIGINT or SIGTERM signal? Could you provide the log lines from the stopped vmstorage container before this error?

A temporary workaround is to manually remove the folder with incomplete data, /storage/data/small/2019_08/1502_1494_20190826161500.000_20190826161500.000_15BE7B0A0A234959 in this case. See the troubleshooting section for details. This can result in the loss of the data stored inside this folder.

wouldn't it be easier if vmstorage simply ignored empty dirs and even tried to remove them at a later stage?

This would hide real errors and may result in data loss, since automatic removal of broken directories could silently remove data stored inside them that could otherwise be recovered.

The proper solution is to use a more reliable filesystem for persistent volumes, such as GCP disks.

valyala added the bug and question labels Aug 29, 2019
@prezhdarov
Author

It is what we all think it is: NFS locks. Here's sample output from vmstorage crashing:

2019-09-03T14:55:22.128+0000 info app/vmstorage/transport/server.go:122 accepted vminsert conn from 172.30.7.10:39660
2019-09-03T14:55:22.130+0000 info app/vmstorage/transport/server.go:154 processing vminsert conn from 172.30.7.10:39660
2019-09-03T14:55:26.378+0000 panic lib/fs/fs.go:289 FATAL: cannot remove "/storage/indexdb/15BED237DBBFDA5E/2006_4_15C0F55E58C6ED7E" in 11 attempts: unlinkat /storage/indexdb/15BED237DBBFDA5E/2006_4_15C0F55E58C6ED7E/.nfs80e3356e00004d6c000065a9: device or resource busy
panic: FATAL: cannot remove "/storage/indexdb/15BED237DBBFDA5E/2006_4_15C0F55E58C6ED7E" in 11 attempts: unlinkat /storage/indexdb/15BED237DBBFDA5E/2006_4_15C0F55E58C6ED7E/.nfs80e3356e00004d6c000065a9: device or resource busy

goroutine 19 [running]:
github.com/VictoriaMetrics/VictoriaMetrics/lib/logger.logMessage(0xa56551, 0x5, 0xc0000da7e0, 0xd9, 0x3)
/VictoriaMetrics/lib/logger/logger.go:124 +0x53d
github.com/VictoriaMetrics/VictoriaMetrics/lib/logger.logLevel(0xa56551, 0x5, 0xa6d699, 0x2a, 0xc00e369fa0, 0x3, 0x3)
/VictoriaMetrics/lib/logger/logger.go:78 +0x113
github.com/VictoriaMetrics/VictoriaMetrics/lib/logger.Panicf(...)
/VictoriaMetrics/lib/logger/logger.go:62
github.com/VictoriaMetrics/VictoriaMetrics/lib/fs.dirRemover()
/VictoriaMetrics/lib/fs/fs.go:289 +0x1b5
created by github.com/VictoriaMetrics/VictoriaMetrics/lib/fs.init.0
/VictoriaMetrics/lib/fs/fs.go:306 +0x35

I'm not going to go to great lengths to defend NFS as a viable storage solution. It's enough to mention that this particular Kubernetes cluster uses an industry-standard storage system and the vendor's Kubernetes driver for persistent storage, and that there are InfluxDB and Prometheus instances running flawlessly on it; I could add GitLab, goharbor, Jenkins and ELK, among a few others. Also, GKE is not an option, since this cluster and a few more are locally run Kubernetes clusters. There I go... I went too far defending my beloved NFS. I can add that this vendor's NFS volumes are sought after in all public clouds for providing far superior performance and lower latency, even better than GCP disks. We will probably see more and more NFS, as it feels natural to the whole Kubernetes and cloud idea. But anyway:

Now, let's make VM happy with NFS locks. Apparently 11 attempts (or are these 11 ms?) are not enough to clear the locks, even though less than a second later the folders are empty. What about a quick and dirty workaround: a tiny in-memory KV holding all failed remove attempts; the main processes ignore all paths in it, while a sidecar process (I know Go is great at spawning those) attempts to clean them up at its leisure. At startup, a quick lookup finds all the empty ones, puts them in said KV, and off we go again... I should read through the source and find out what pile of gibberish I just wrote, but still, it's an idea that might work until something better comes to mind. I was also thinking of a sidecar container cron job that removes them all every minute; even if vmstorage dies for a minute, hopefully vminsert would keep the records, but you said this may corrupt data, so no...

@valyala
Collaborator

valyala commented Sep 4, 2019

What about a quick and dirty workaround: a tiny in-memory KV holding all failed remove attempts; the main processes ignore all paths in it, while a sidecar process (I know Go is great at spawning those) attempts to clean them up at its leisure.

This is how it works now; see the dirRemover code. The 11 attempts were performed over roughly 1s, with 100ms sleep intervals between attempts.

I'll remove the limit on the number of failed attempts in the follow-up commit.
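
For illustration, a minimal, self-contained sketch of this pattern could look like the code below. It is not the actual lib/fs/fs.go implementation; the channel, the scheduleDirRemoval helper and the isTemporaryNFSError check are assumptions made for the sketch.

```go
package main

import (
	"log"
	"os"
	"strings"
	"time"
)

// removeDirCh receives directories that could not be removed synchronously,
// e.g. because NFS left behind .nfsXXXX silly-rename files.
var removeDirCh = make(chan string, 1024)

// scheduleDirRemoval queues a directory for background removal.
func scheduleDirRemoval(path string) {
	removeDirCh <- path
}

// dirRemover retries os.RemoveAll with short sleeps between attempts.
// The bounded retry count mirrors the "11 attempts in ~1s" behaviour;
// dropping the bound (or retrying much longer) is what the follow-up change does.
func dirRemover(maxAttempts int, sleep time.Duration) {
	for path := range removeDirCh {
		var err error
		for attempt := 0; attempt < maxAttempts; attempt++ {
			err = os.RemoveAll(path)
			if err == nil || !isTemporaryNFSError(err) {
				break
			}
			time.Sleep(sleep)
		}
		if err != nil {
			log.Panicf("FATAL: cannot remove %q in %d attempts: %s", path, maxAttempts, err)
		}
	}
}

// isTemporaryNFSError reports whether the error looks like the transient
// "device or resource busy" returned while .nfsXXXX files still exist.
func isTemporaryNFSError(err error) bool {
	msg := err.Error()
	return strings.Contains(msg, "device or resource busy") || strings.Contains(msg, "directory not empty")
}

func main() {
	go dirRemover(11, 100*time.Millisecond)
	scheduleDirRemoval("/tmp/example-dir-to-remove") // illustrative path only
	time.Sleep(2 * time.Second)                      // give the background remover time to run
}
```

With 11 attempts and 100ms sleeps, a directory held busy by .nfsXXXX files for little more than a second triggers the panic shown in the stack trace above, which is why removing the attempt limit helps on laggy NFS.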

valyala added a commit that referenced this issue Sep 4, 2019
…orary lock

Do not give up after 11 attempts of directory removal on laggy NFS.

Add `vm_nfs_dir_remove_failed_attempts_total` metric for counting the number of failed attempts
on directory removal.

Log failed attempts on directory removal after long sleep times.

Updates #162
@valyala
Collaborator

valyala commented Sep 4, 2019

@prezhdarov, try building from 82bfe81 and verify whether the issue is fixed after removing the empty directories. See the build instructions.

Monitor the vmstorage logs and the vm_nfs_dir_remove_failed_attempts_total metric exported on the /metrics page by vmstorage nodes. This metric should increment on each failed attempt to remove a directory on NFS.
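If the metric is scraped into a Prometheus-compatible system, a query such as `increase(vm_nfs_dir_remove_failed_attempts_total[1m])` should show the number of failed removal attempts per minute (the exact query is just an example; any counter-rate function works).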

@prezhdarov
Author

It has been working fine for the past 5 hours:
NAME READY STATUS RESTARTS AGE
vmstorage-0 1/1 Running 2 4h52m
vmstorage-1 1/1 Running 0 4h52m

The vmstorage-0 restarts were triggered by already existing empty folders, which I removed quickly.
Removals per minute look like this:
[screenshot: "NFS dir remove" failed-attempts graph]

Roughly 60 failed attempts per minute. Now, if we could find out the delay required for removals, and if I can manage to get NFS latency per pod/container, we could easily see whether the two are related. I may also check whether this is negated by different NFS mount options, i.e. sync and hard mounts, which will increase latency and storage work but might just get rid of all the NFS locks... I should have thought about this earlier.

valyala added a commit that referenced this issue Sep 5, 2019
…are removed on graceful shutdown

This patch is mainly required for laggy NFS. See #162
@valyala
Collaborator

valyala commented Sep 5, 2019

@prezhdarov, fixes for the issue have been added to release v1.27.2.
Closing the issue as fixed.
Feel free to reopen it if similar issues appear again after v1.27.2.

valyala closed this as completed Sep 5, 2019
valyala added a commit that referenced this issue Nov 10, 2019
Continue trying to remove NFS directory on temporary errors for up to a minute.

The previous async removal process breaks in the following case during VictoriaMetrics start

- VictoriaMetrics opens index, finds incomplete merge transactions and starts replaying them.
- The transaction instructs removing old directories for parts, which were already merged into bigger part.
- VictoriaMetrics removes these directories, but their removal is delayed due to NFS errors.
- VictoriaMetrics scans partition directory after all the incomplete merge transactions are finished
  and finds directories, which should be removed, but weren't still removed due to NFS errors.
- VictoriaMetrics panics when it finds unexpected empty directory.

Updates #162
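
As an illustration of the fix direction, a rough sketch could retry removals on temporary errors up to a deadline and let the startup code wait for scheduled removals before scanning the partition directory. The names removeDirAsync and mustWaitForDirRemovals are hypothetical, not the actual lib/fs API, and the explicit wait is included only to make the race described above visible; the actual commit may solve it differently.

```go
package main

import (
	"log"
	"os"
	"sync"
	"time"
)

var pendingDirRemovals sync.WaitGroup

// removeDirAsync keeps retrying removal of a directory on errors for up to
// the given deadline, instead of giving up after a fixed number of attempts.
func removeDirAsync(path string, deadline time.Duration) {
	pendingDirRemovals.Add(1)
	go func() {
		defer pendingDirRemovals.Done()
		start := time.Now()
		for {
			err := os.RemoveAll(path)
			if err == nil {
				return
			}
			if time.Since(start) > deadline {
				log.Panicf("FATAL: cannot remove %q within %s: %s", path, deadline, err)
			}
			time.Sleep(100 * time.Millisecond)
		}
	}()
}

// mustWaitForDirRemovals blocks until all scheduled removals have finished,
// so a later scan of the partition directory cannot stumble upon
// half-removed (empty) part directories and panic.
func mustWaitForDirRemovals() {
	pendingDirRemovals.Wait()
}

func main() {
	removeDirAsync("/tmp/obsolete-part-dir", time.Minute) // illustrative path only
	mustWaitForDirRemovals()
	// ... the partition directory would be scanned here ...
}
```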
@valyala
Collaborator

valyala commented Nov 12, 2019

FYI, the issue was still present and should be finally fixed in v1.29.0.

valyala added a commit that referenced this issue Dec 2, 2019
…oned dirs are really removed

This should fix the issue on NFS when incompletely removed dirs may be left
after unclean shutdown (OOM, kill -9, hard reset, etc.), while the corresponding transaction
files are already removed.

Updates #162
@valyala
Collaborator

valyala commented Dec 2, 2019

FYI, the issue should be finally fixed by the commits referenced above.

The fix will be included in the next VictoriaMetrics release. Until then, it is possible to build VictoriaMetrics from source at the commits mentioned above according to the build instructions.

Related issue: #61.

valyala reopened this Dec 2, 2019
@valyala
Collaborator

valyala commented Dec 2, 2019

The issue should be fixed in v1.30.3.
