pkg/sorted/leveldb: writing blobpacked index on samba mount fails #1185

Closed
dcramer opened this issue Jun 6, 2018 · 21 comments

@dcramer

dcramer commented Jun 6, 2018

Not going to be super helpful here given I don't have a lot of details on the project yet (first time setting it up), but let me know if you need more details.

Details:

  • Latest version (0.10?) from the download page, prebuilt image for Linux
  • Ubuntu 16.04
  • Started ./perkeepd for the first time to generate config
  • Updated config to move blobPath and levelDB to a network mounted (samba 2.0) disk
  • Started ./perkeepd and hit panic
  • Various files exist for leveldb/blobs (confirming it was able to write)

Config:

{
    "auth": "localhost",
    "listen": ":3179",
    "camliNetIP": "",
    "identity": "XXX",
    "identitySecretRing": "/home/pax/.config/perkeep/identity-secring.gpg",
    "blobPath": "/mnt/shares/server/perkeep/blobs",
    "packRelated": true,
    "levelDB": "/mnt/shares/server/perkeep/index.leveldb"
}

Log:

2018/06/06 05:36:01 Starting perkeepd version 0.10, 2018-05-12-21e2d574e5; Go go1.10.2 (linux/amd64)
2018/06/06 05:36:01 Starting to listen on http://localhost:3179
2018/06/06 05:36:02 Caught panic installer handlers: error instantiating storage for prefix "/bs/", type "blobpacked": failed to setup blobpacked metaIndex: error from "leveldb" KeyValue: file missing [file=MANIFEST-000000]
goroutine 1 [running]:
runtime/debug.Stack(0xc4200a8190, 0x2, 0xc4209545b0)
	/usr/local/go/src/runtime/debug/stack.go:24 +0xa7
runtime/debug.PrintStack()
	/usr/local/go/src/runtime/debug/stack.go:16 +0x22
perkeep.org/pkg/serverinit.(*Config).InstallHandlers.func1(0xc420795c60)
	/gopath/src/perkeep.org/pkg/serverinit/serverinit.go:548 +0x9e
panic(0xefcc60, 0xc420d20940)
	/usr/local/go/src/runtime/panic.go:502 +0x229
perkeep.org/pkg/serverinit.(*handlerLoader).setupHandler.func1(0xc4209cfef0, 0xc420463d00, 0x114bb57, 0xe)
	/gopath/src/perkeep.org/pkg/serverinit/serverinit.go:309 +0x198
panic(0xefcc60, 0xc420d20940)
	/usr/local/go/src/runtime/panic.go:502 +0x229
perkeep.org/pkg/serverinit.(*handlerLoader).setupHandler.func1(0xc420a140f0, 0xc420463d00, 0x1140b2a, 0x4)
	/gopath/src/perkeep.org/pkg/serverinit/serverinit.go:309 +0x198
panic(0xefcc60, 0xc420d20940)
	/usr/local/go/src/runtime/panic.go:502 +0x229
perkeep.org/pkg/serverinit.exitFailure(0x117b3d8, 0x36, 0xc420795138, 0x3, 0x3)
	/gopath/src/perkeep.org/pkg/serverinit/serverinit.go:278 +0x167
perkeep.org/pkg/serverinit.(*handlerLoader).setupHandler(0xc420463d00, 0x1140b2a, 0x4)
	/gopath/src/perkeep.org/pkg/serverinit/serverinit.go:324 +0x3a0
perkeep.org/pkg/serverinit.(*handlerLoader).GetStorage(0xc420463d00, 0x1140b2a, 0x4, 0x2, 0x2, 0x0, 0x1a)
	/gopath/src/perkeep.org/pkg/serverinit/serverinit.go:252 +0x43
perkeep.org/pkg/blobserver/replica.newFromConfig(0x1799a40, 0xc420463d00, 0xc4209cf860, 0x7, 0xc4206fc870, 0x1300000001701c01, 0xc420795458)
	/gopath/src/perkeep.org/pkg/blobserver/replica/replica.go:115 +0x26c
perkeep.org/pkg/blobserver.CreateStorage(0x114dace, 0x7, 0x1799a40, 0xc420463d00, 0xc4209cf860, 0xe, 0x11, 0xc420463d00, 0x178a1e0)
	/gopath/src/perkeep.org/pkg/blobserver/registry.go:111 +0xc3
perkeep.org/pkg/serverinit.(*handlerLoader).setupHandler(0xc420463d00, 0x114bb57, 0xe)
	/gopath/src/perkeep.org/pkg/serverinit/serverinit.go:322 +0x25c
perkeep.org/pkg/serverinit.(*handlerLoader).setupAll(0xc420463d00)
	/gopath/src/perkeep.org/pkg/serverinit/serverinit.go:232 +0x92
perkeep.org/pkg/serverinit.(*Config).InstallHandlers(0xc42091af00, 0x1788a00, 0xc4209e41e0, 0xc42020bdc0, 0x18, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/gopath/src/perkeep.org/pkg/serverinit/serverinit.go:617 +0xa0e
main.Main(0x0, 0x0)
	/gopath/src/perkeep.org/server/perkeepd/camlistored.go:963 +0x4d8
main.main()
	/gopath/src/perkeep.org/server/perkeepd/camlistored.go:906 +0x29
Error parsing config: Caught panic: error instantiating storage for prefix "/bs/", type "blobpacked": failed to setup blobpacked metaIndex: error from "leveldb" KeyValue: file missing [file=MANIFEST-000000]
@mpl
Contributor

mpl commented Jun 6, 2018

@dcramer thanks.
Could you please

rm -rf /mnt/shares/server/perkeep/index.leveldb

and run perkeepd with -recovery=2 and -reindex=true?

@mpl
Contributor

mpl commented Jun 6, 2018

In any case, I think I've seen someone on IRC try and fail with samba mounts as well, so I suspect we're missing something needed to support writing there.
I'll see if I can reproduce.

mpl self-assigned this on Jun 6, 2018
@dcramer
Author

dcramer commented Jun 6, 2018

Same-ish error:

automator@pax:~$ rm -rf /mnt/shares/server/perkeep/index.leveldb
automator@pax:~$ ./perkeepd -recovery=2 -reindex=true
2018/06/06 15:58:52 Starting perkeepd version 0.10, 2018-05-12-21e2d574e5; Go go1.10.2 (linux/amd64)
2018/06/06 15:58:52 Starting to listen on http://localhost:3179
2018/06/06 15:58:53 Caught panic installer handlers: error instantiating handler for prefix "/sync/", type "sync": error from "leveldb" KeyValue: file missing [file=MANIFEST-000000]
goroutine 1 [running]:
runtime/debug.Stack(0xc4200a8190, 0x2, 0xc420909ad0)
	/usr/local/go/src/runtime/debug/stack.go:24 +0xa7
runtime/debug.PrintStack()
	/usr/local/go/src/runtime/debug/stack.go:16 +0x22
perkeep.org/pkg/serverinit.(*Config).InstallHandlers.func1(0xc4206b7c60)
	/gopath/src/perkeep.org/pkg/serverinit/serverinit.go:548 +0x9e
panic(0xefcc60, 0xc42119d3d0)
	/usr/local/go/src/runtime/panic.go:502 +0x229
perkeep.org/pkg/serverinit.(*handlerLoader).setupHandler.func1(0xc421199500, 0xc420548580, 0x114260d, 0x6)
	/gopath/src/perkeep.org/pkg/serverinit/serverinit.go:309 +0x198
panic(0xefcc60, 0xc42119d3d0)
	/usr/local/go/src/runtime/panic.go:502 +0x229
perkeep.org/pkg/serverinit.exitFailure(0x117b3a2, 0x36, 0xc4206b76f8, 0x3, 0x3)
	/gopath/src/perkeep.org/pkg/serverinit/serverinit.go:278 +0x167
perkeep.org/pkg/serverinit.(*handlerLoader).setupHandler(0xc420548580, 0x114260d, 0x6)
	/gopath/src/perkeep.org/pkg/serverinit/serverinit.go:373 +0xb5d
perkeep.org/pkg/serverinit.(*handlerLoader).setupAll(0xc420548580)
	/gopath/src/perkeep.org/pkg/serverinit/serverinit.go:232 +0x92
perkeep.org/pkg/serverinit.(*Config).InstallHandlers(0xc420083000, 0x1788a00, 0xc42118ac80, 0xc421195160, 0x18, 0x1, 0x0, 0x0, 0x0, 0x0, ...)
	/gopath/src/perkeep.org/pkg/serverinit/serverinit.go:617 +0xa0e
main.Main(0x0, 0x0)
	/gopath/src/perkeep.org/server/perkeepd/camlistored.go:963 +0x4d8
main.main()
	/gopath/src/perkeep.org/server/perkeepd/camlistored.go:906 +0x29
Error parsing config: Caught panic: error instantiating handler for prefix "/sync/", type "sync": error from "leveldb" KeyValue: file missing [file=MANIFEST-000000]

@mpl
Contributor

mpl commented Jun 6, 2018

@dcramer could you try without samba, with your blobs and index configured to be on a local disk?
If that works, then it stands to reason the samba mount is the source of the problem, and I don't need to try samba myself to know that this is the area we have to fix.

@dcramer
Author

dcramer commented Jun 6, 2018

@mpl it definitely worked before I moved the paths to the samba mount.

As an aside, here's how I have it mounted:

//nas.local/media /mnt/shares/server cifs uid=1001,gid=1003,vers=2.0,forceuid,forcegid,rw 0 0

@mpl
Contributor

mpl commented Jun 6, 2018

oh, ok then.

mpl changed the title from "Error on initial network mounted storage: error instantiating storage for prefix" to "pkg/sorted/leveldb: writing blobpacked index on samba mount fails" on Jun 6, 2018
@mpl
Contributor

mpl commented Jun 6, 2018

@dcramer btw, one thing you could try in the meantime is to store the blobs on the samba share, but keep the index on local disk.

@euank
Contributor

euank commented Jun 6, 2018

The person on IRC ran into a different error; specifically, they got an 'fsync: invalid argument', IIRC (which looked like this leveldb issue: google/leveldb#281).

Either way, it looks like upstream leveldb doesn't support putting it on cifs, so if that's what's causing the issue I don't think it's a perkeep bug.

@bradfitz
Contributor

bradfitz commented Jun 6, 2018

@dcramer, can you run perkeep under strace, like:

$ strace -f -o trace.txt perkeepd

And then post the relevant section of trace.txt, towards the end just before it fails (redacted if needed, since it might contain passwords/keys)?

I'm wondering what system calls the goleveldb code is trying to make and what the Linux CIFS/SMB implementation is actually doing.

@bradfitz
Contributor

bradfitz commented Jun 6, 2018

@euank, we don't use google/leveldb, so I doubt that bug is relevant. We use github.com/syndtr/goleveldb.

@euank
Contributor

euank commented Jun 6, 2018

Ah, thanks for clarifying, I should have looked before assuming.

@dcramer
Author

dcramer commented Jun 7, 2018

FYI, changing leveldb to be on local disk and leaving blobs on samba produces a similar issue.

@bradfitz let me know if you need more and I can probably just dump the whole file.

11402 <... fsync resumed> )             = 0
11392 <... pselect6 resumed> )          = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL <unfinished ...>
11402 close(9 <unfinished ...>
11392 <... pselect6 resumed> )          = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL <unfinished ...>
11402 <... close resumed> )             = 0
11392 <... pselect6 resumed> )          = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL <unfinished ...>
11402 lstat("/mnt/shares/server/perkeep/blobs/packed/packindex.leveldb/CURRENT",  <unfinished ...>
11392 <... pselect6 resumed> )          = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL <unfinished ...>
11402 <... lstat resumed> {st_mode=S_IFREG|0755, st_size=16, ...}) = 0
11402 renameat(AT_FDCWD, "/mnt/shares/server/perkeep/blobs/packed/packindex.leveldb/CURRENT.0", AT_FDCWD, "/mnt/shares/server/perkeep/blobs/packed/packindex.leveldb/CURRENT" <unfinished ...>
11392 <... pselect6 resumed> )          = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 40000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 80000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 160000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 320000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 640000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 1280000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 2560000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 5120000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 futex(0x1def3b0, FUTEX_WAIT, 0, {60, 0} <unfinished ...>
11402 <... renameat resumed> )          = 0
11402 futex(0x1def3b0, FUTEX_WAKE, 1)   = 1
11392 <... futex resumed> )             = 0
11392 sched_yield( <unfinished ...>
11402 openat(AT_FDCWD, "/mnt/shares/server/perkeep/blobs/packed/packindex.leveldb", O_RDONLY|O_CLOEXEC <unfinished ...>
11392 <... sched_yield resumed> )       = 0
11392 futex(0x1def2d0, FUTEX_WAKE, 1)   = 0
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 40000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 80000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 160000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 320000}, NULL) = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 640000}, NULL <unfinished ...>
11402 <... openat resumed> )            = 9
11402 epoll_ctl(4, EPOLL_CTL_ADD, 9, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=252353888, u64=139702653590880}}) = -1 EPERM (Operation not permitted)
11392 <... pselect6 resumed> )          = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 1280000}, NULL <unfinished ...>
11402 epoll_ctl(4, EPOLL_CTL_DEL, 9, 0xc42057570c) = -1 EPERM (Operation not permitted)
11402 fsync(9)                          = -1 EINVAL (Invalid argument)
11402 close(9)                          = 0
11402 write(7, "05:39:14.274269 syncDir: sync /m"..., 106 <unfinished ...>
11392 <... pselect6 resumed> )          = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 2560000}, NULL <unfinished ...>
11402 <... write resumed> )             = 106
11402 write(7, "05:39:14.275979 CURRENT: sync /m"..., 106 <unfinished ...>
11392 <... pselect6 resumed> )          = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 5120000}, NULL <unfinished ...>
11402 <... write resumed> )             = 106
11402 close(8 <unfinished ...>
11392 <... pselect6 resumed> )          = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 10000000}, NULL <unfinished ...>
11402 <... close resumed> )             = 0
11402 unlinkat(AT_FDCWD, "/mnt/shares/server/perkeep/blobs/packed/packindex.leveldb/MANIFEST-000000", 0) = 0
11402 futex(0xc420057d48, FUTEX_WAKE, 1) = 1
11401 <... futex resumed> )             = 0
11401 pselect6(0, NULL, NULL, NULL, {0, 3000}, NULL <unfinished ...>
11402 close(7 <unfinished ...>
11401 <... pselect6 resumed> )          = 0 (Timeout)
11401 futex(0x1defe08, FUTEX_WAKE, 1)   = 1
11391 <... futex resumed> )             = 0
11401 futex(0xc420057d48, FUTEX_WAIT, 0, NULL <unfinished ...>
11391 futex(0x1defe08, FUTEX_WAIT, 0, NULL <unfinished ...>
11392 <... pselect6 resumed> )          = 0 (Timeout)
11392 pselect6(0, NULL, NULL, NULL, {0, 10000000}, NULL <unfinished ...>
11402 <... close resumed> )             = 0
11402 flock(6, LOCK_NB|LOCK_UN)         = 0
11402 close(6)                          = 0
11402 write(2, "2018/06/07 05:39:14 Caught panic"..., 269) = 269
11402 write(2, "goroutine 1 [running]:\nruntime/d"..., 4475) = 4475
11402 write(2, "Error parsing config: Caught pan"..., 253) = 253
11402 exit_group(1)

@euank
Contributor

euank commented Jun 7, 2018

Looks like fsync(fd) = EINVAL is the culprit.

11402 openat(AT_FDCWD, "/mnt/shares/server/perkeep/blobs/packed/packindex.leveldb", O_RDONLY|O_CLOEXEC <unfinished ...>
...
11402 <... openat resumed> )            = 9
...
11402 fsync(9)                          = -1 EINVAL (Invalid argument)

Note that this is calling fsync on a directory fd, not a file fd. This corresponds to goleveldb's syncDir code path (judging by the preceding rename in the strace).

The isErrInvalid code in syncDir does attempt to ignore EINVAL, but it gets it wrong on modern Go. On recent Go versions the error is an *os.PathError, whereas when that code was written it was an *os.SyscallError.

Very arguably, this could be considered a break of the Go 1 compatibility promise, but it's certainly not something I'd personally want to try arguing.

The type of that error changed in the Go 1.8 timeframe with this commit.

@bradfitz
Contributor

bradfitz commented Jun 7, 2018

@euank, nice debugging.

If you update that "func isErrInvalid" in the Perkeep vendored copy of goleveldb, does it all work for you?

I can escalate with the Go team if so and figure out what to do. And/or we can fix goleveldb upstream.

euank added a commit to euank/perkeep that referenced this issue Jun 7, 2018
@euank
Contributor

euank commented Jun 7, 2018

@bradfitz

> If you update that "func isErrInvalid" in the Perkeep vendored copy of goleveldb, does it all work for you?

I PR'd a change against goleveldb (ref). If I vendor in that change, I'm able to run with the blob + leveldb storage on a fresh directory on my test cifs mount (whereas before it reliably errored out immediately).

I'd prefer waiting a bit to see if it gets accepted upstream before pointing Gopkg.toml at a fork though.

I think there is still an additional bug as well. If I re-use a dirty directory, the broken code causes corruption which can't be recovered automatically. The CURRENT file ends up holding a reference to a MANIFEST file which doesn't exist, and I don't think there's any way to fix that.
How it gets into that state in the first place still needs further investigation.

> I can escalate with the Go team if so and figure out what to do. And/or we can fix goleveldb upstream.

I suspect that relying on the underlying types and messages of errors should be outside the Go 1 promise unless the error is specifically documented or there's a helper like os.IsNotExist. That being said, the go1compat page also doesn't make that totally clear, so at the very least it could be worth having a discussion to clarify that further.

@dcramer if you want to verify that change is sufficient for your setup, you could try pulling in the commit on my branch here to see that it works.
I know cifs has a bit of a feature matrix, so there is a chance that even though it seems to work with a fresh setup for me, your client+server pair could have additional quirks.

@dcramer
Author

dcramer commented Jun 8, 2018

@euank confirmed that fixed it for me

@euank
Contributor

euank commented Jun 8, 2018

I submitted a CR to bump that dependency, since the upstream change was merged; this should be fixed by https://perkeep-review.googlesource.com/c/perkeep/+/17126

@mpl
Contributor

mpl commented Jun 8, 2018

@euank excellent, thanks

@mpl
Contributor

mpl commented Jun 8, 2018

@dcramer but wait, you were saying earlier that moving the index to local disk and leaving blobs on samba was not enough to fix the issue. By the same logic, @euank's fix is not enough either, as it will only fix the problem for the index itself, not for the blobs.

So is Perkeep as a whole working now for you (while using samba) or not?

@dcramer
Author

dcramer commented Jun 8, 2018

@mpl other than a bunch of seemingly random bugs (I think unrelated), it was working with both the index and the blobs on samba. Previously, when I tried placing only the blobs on samba, I got the same or a similar error. I'm happy to reproduce and dump similar logs, but given the branch fixed it I'm not sure it's worth the effort.

@mpl
Contributor

mpl commented Jun 8, 2018

@dcramer oh wait, this is silly. I forgot we were talking about the index for blobpacked, not the index for Perkeep as a whole. So yeah, by default the index for blobpacked goes wherever you tell the blobs to go, which means it was still on samba in all cases. Never mind.
