lizardfs becomes slow, unstable when growing and with replication #659

Open
mwaeckerlin opened this Issue Feb 5, 2018 · 72 comments

mwaeckerlin commented Feb 5, 2018

My setup is described in my blog.

Everything was fast and stable at the beginning, and I successfully migrated from GlusterFS to LizardFS. But as I copied more and more data to LizardFS, everything became progressively slower, more erratic, and less stable. The CPUs are not stressed and RAM is not full, so the problem is probably I/O or the network. Currently, 5.4 TB of disk space is used.

Entry in /etc/fstab:

mfsmount /var/volumes fuse rw,mfsmaster=universum,mfsdelayedinit,mfschunkserverwriteto=20000,mfsioretries=120 0 0
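
To make the mount options in this entry easier to compare across hosts, here is a minimal sketch that splits the fstab line into its fields and options. The function name `parse_fstab_options` is hypothetical and not part of LizardFS; the sketch only parses the text and does not interpret what each option means.

```python
def parse_fstab_options(entry):
    """Split one fstab line into device, mount point, type, and an options dict.

    Hypothetical helper for illustration only; it does not validate the entry.
    """
    fields = entry.split()
    device, mountpoint, fstype, options = fields[:4]
    parsed = {}
    for opt in options.split(","):
        key, sep, value = opt.partition("=")
        # Flags without "=" (such as rw) are stored as True.
        parsed[key] = value if sep else True
    return device, mountpoint, fstype, parsed


entry = (
    "mfsmount /var/volumes fuse "
    "rw,mfsmaster=universum,mfsdelayedinit,"
    "mfschunkserverwriteto=20000,mfsioretries=120 0 0"
)
device, mountpoint, fstype, opts = parse_fstab_options(entry)

for key, value in opts.items():
    print(key, value)
```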

A week ago I noticed that the default goal was 1, so I set the goal for everything to 2, and since then it has been much worse. This weekend I changed the goal so that everything must explicitly be on the two hosts universum (the strongest machine) and urknall, since raum is in another location and pulsar is the slowest. I also upgraded from Ubuntu's LizardFS version 3.7 to 3.12.

It still does not work.

In /var/log/syslog, I still see many problems:

Host universum (Master and Chunk):

Feb  5 11:39:17 universum mfschunkserver[7368]: Did not manage to receive packet header                                                                                                      
Feb  5 11:39:23 universum mfsmount: write file error, inode: 415346, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 4)               
Feb  5 11:39:27 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)               
Feb  5 11:39:34 universum mfsmount: write file error, inode: 415346, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 5)               
Feb  5 11:39:38 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)           
Feb  5 11:39:39 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 2)               
Feb  5 11:39:40 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)           
Feb  5 11:39:42 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 3)           
Feb  5 11:39:44 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 4)           
Feb  5 11:39:45 universum mfsmount: write file error, inode: 415346, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 6)               
Feb  5 11:39:46 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 5)           
Feb  5 11:39:48 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 6)           
Feb  5 11:39:50 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 7)           
Feb  5 11:39:51 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 3)               
Feb  5 11:39:52 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 8)           
Feb  5 11:39:54 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 9)           
Feb  5 11:39:56 universum mfsmount: write file error, inode: 415346, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 7)               
Feb  5 11:39:56 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 10)          
Feb  5 11:39:58 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 11)          
Feb  5 11:40:01 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 12)          
Feb  5 11:40:02 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 4)               
Feb  5 11:40:05 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 13)          
Feb  5 11:40:07 universum mfsmount: write file error, inode: 415346, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 8)               
Feb  5 11:40:12 universum mfschunkserver[7368]: replication error: Chunkserver communication timed out: 192.168.99.3:9422                                                                    
Feb  5 11:40:12 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007BF51 replication status: Unknown LizardFS error                                                          
Feb  5 11:40:13 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 5)               
Feb  5 11:40:13 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 14)          
Feb  5 11:40:14 universum mfschunkserver[7368]: Did not manage to receive packet header                                                                                                      
Feb  5 11:40:17 universum mfschunkserver[7368]: replication error: Chunkserver communication timed out: 192.168.99.3:9422                                                                    
Feb  5 11:40:17 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007463B replication status: Unknown LizardFS error
Feb  5 11:40:18 universum mfsmount: write file error, inode: 415346, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 9)
Feb  5 11:40:19 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:40:23 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 15)
Feb  5 11:40:24 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 6)
Feb  5 11:40:29 universum mfsmount: write file error, inode: 415346, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 10)
Feb  5 11:40:33 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 16)
Feb  5 11:40:35 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 7)
Feb  5 11:40:40 universum mfsmount: write file error, inode: 415346, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 11)
Feb  5 11:40:43 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 17)
Feb  5 11:40:46 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 8)
Feb  5 11:40:52 universum mfsmount: write file error, inode: 415346, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 12)
Feb  5 11:40:53 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 18)
Feb  5 11:40:58 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 9)
Feb  5 11:41:03 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 19)
Feb  5 11:41:05 universum mfsmount: write file error, inode: 415346, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 13)
Feb  5 11:41:10 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 10)
Feb  5 11:41:13 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 20)
Feb  5 11:41:14 universum mfschunkserver[7368]: replication error: Chunkserver communication timed out: 192.168.99.3:9422
Feb  5 11:41:14 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 0000000000078E36 replication status: Unknown LizardFS error
Feb  5 11:41:19 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:41:19 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007BAE3 replication status: Unknown LizardFS error
Feb  5 11:41:19 universum dockerd[1458]: time="2018-02-05T11:41:19.943977779+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:41:19 universum dockerd[1458]: time="2018-02-05T11:41:19.944442530+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:41:35 universum mfsmount: read file error, inode: 141719, index: 15, chunk: 506898, version: 1 - Chunkserver communication timed out: 192.168.99.137:9422 (try counter: 1)
Feb  5 11:41:37 universum mfsmount: read file error, inode: 141719, index: 15, chunk: 506898, version: 1 - Chunkserver communication timed out: 192.168.99.137:9422 (try counter: 2)
Feb  5 11:41:39 universum mfsmount: read file error, inode: 141719, index: 15, chunk: 506898, version: 1 - Chunkserver communication timed out: 192.168.99.137:9422 (try counter: 3)
Feb  5 11:41:41 universum mfsmount: read file error, inode: 141719, index: 15, chunk: 506898, version: 1 - Chunkserver communication timed out: 192.168.99.137:9422 (try counter: 4)
Feb  5 11:41:43 universum mfsmount: read file error, inode: 141719, index: 15, chunk: 506898, version: 1 - Chunkserver communication timed out: 192.168.99.137:9422 (try counter: 5)
Feb  5 11:41:43 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:41:47 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)
Feb  5 11:41:49 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)
Feb  5 11:41:51 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 3)
Feb  5 11:41:53 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 4)
Feb  5 11:41:55 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 5)
Feb  5 11:41:57 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 6)
Feb  5 11:41:59 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 7)
Feb  5 11:42:01 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 8)
Feb  5 11:42:03 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 9)
Feb  5 11:42:05 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 10)
Feb  5 11:42:07 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 11)
Feb  5 11:42:09 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 12)
Feb  5 11:42:13 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 13)
Feb  5 11:42:14 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:42:16 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007E398 replication status: Unknown LizardFS error
Feb  5 11:42:17 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:42:17 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007CA23 replication status: Unknown LizardFS error
Feb  5 11:42:19 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:42:23 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)
Feb  5 11:42:25 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)
Feb  5 11:42:27 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 3)
Feb  5 11:42:29 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 4)
Feb  5 11:42:31 universum mfsmount: read file error, inode: 141719, index: 14, chunk: 506777, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 5)
Feb  5 11:42:53 universum mfsmount: read file error, inode: 141719, index: 16, chunk: 506985, version: 1 - Chunkserver communication timed out: 192.168.99.137:9422 (try counter: 1)
Feb  5 11:42:55 universum mfsmount: read file error, inode: 141719, index: 16, chunk: 506985, version: 1 - Chunkserver communication timed out: 192.168.99.137:9422 (try counter: 2)
Feb  5 11:43:17 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:43:17 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:43:17 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007AB54 replication status: Unknown LizardFS error
Feb  5 11:43:17 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 0000000000082933 replication status: Unknown LizardFS error
Feb  5 11:43:27 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:43:29 universum dockerd[1458]: time="2018-02-05T11:43:29.434596908+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:43:29 universum dockerd[1458]: time="2018-02-05T11:43:29.434656283+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:43:29 universum dockerd[1458]: time="2018-02-05T11:43:29.434684659+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:43:29 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 11:43:31 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 11:43:33 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 4)
Feb  5 11:43:35 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 5)
Feb  5 11:43:37 universum dockerd[1458]: time="2018-02-05T11:43:37.234326051+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:43:37 universum dockerd[1458]: time="2018-02-05T11:43:37.234407024+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:43:37 universum dockerd[1458]: time="2018-02-05T11:43:37.234444493+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:43:37 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 6)
Feb  5 11:43:38 universum dhclient[1339]: DHCPREQUEST of 192.168.99.137 on eno1 to 192.168.99.1 port 67 (xid=0x5da66249)
Feb  5 11:43:38 universum dhclient[1339]: DHCPACK of 192.168.99.137 from 192.168.99.1
Feb  5 11:43:38 universum root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 11:43:39 universum dhclient[1339]: bound to 192.168.99.137 -- renewal in 237 seconds.
Feb  5 11:43:39 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 7)
Feb  5 11:43:41 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 8)
Feb  5 11:43:43 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 9)
Feb  5 11:43:45 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 10)
Feb  5 11:43:47 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 11)
Feb  5 11:43:49 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:43:49 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 12)
Feb  5 11:43:51 universum dockerd[1458]: time="2018-02-05T11:43:51.306076028+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:mh26wxh3legx189b8ummizzvg leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:43:51 universum dockerd[1458]: time="2018-02-05T11:43:51.306146867+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:npq26i6igr6n6fn1idntes46c leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:43:51 universum dockerd[1458]: time="2018-02-05T11:43:51.306185869+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:2ntxgh39t5rk8b55tapu1l8gh leaving:false netPeers:2 entries:8 Queue qLen:0 netMsg/s:0"
Feb  5 11:43:51 universum dockerd[1458]: time="2018-02-05T11:43:51.306204829+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:ksx36pb1e18i5cywyl0itazk1 leaving:false netPeers:1 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:43:51 universum dockerd[1458]: time="2018-02-05T11:43:51.306238582+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:hixi0z35nd69dxzd621dtjl0e leaving:false netPeers:2 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:43:51 universum dockerd[1458]: time="2018-02-05T11:43:51.306255195+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:o5j79tmzry4nxic6p0uymd9qr leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:43:51 universum dockerd[1458]: time="2018-02-05T11:43:51.306271580+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:sv1iurfiu8hvvlkql0955wr3x leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:43:51 universum dockerd[1458]: time="2018-02-05T11:43:51.306288554+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:fbh4j37soxs97qxvjj36b2eis leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:43:51 universum dockerd[1458]: time="2018-02-05T11:43:51.306322590+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:87vvi5rhozzl8qws2ufe7ltl2 leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:43:51 universum dockerd[1458]: time="2018-02-05T11:43:51.306339247+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:4vpfofq2v8hubh30a3td4xwgx leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:43:51 universum dockerd[1458]: time="2018-02-05T11:43:51.306373078+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:xx0m50f4cog3dik8xwqxd4g0q leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:43:51 universum dockerd[1458]: time="2018-02-05T11:43:51.306407266+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:mwursuov7sc87fxhzkmpdtyes leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:43:51 universum dockerd[1458]: time="2018-02-05T11:43:51.306440676+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:2vg30sgcyr0omhvavdcptdrgv leaving:false netPeers:2 entries:48 Queue qLen:0 netMsg/s:0"
Feb  5 11:43:51 universum dockerd[1458]: time="2018-02-05T11:43:51.306457046+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:jer1lv709js55yyve3pudm9sq leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:43:51 universum dockerd[1458]: time="2018-02-05T11:43:51.306473376+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:azbnydw3jzhx3cp6dy38fmldk leaving:false netPeers:1 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:44:04 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:44:06 universum mfsmount: read file error, inode: 141719, index: 16, chunk: 506985, version: 1 - Chunkserver communication timed out: 192.168.99.137:9422 (try counter: 1)
Feb  5 11:44:08 universum mfsmount: read file error, inode: 141719, index: 16, chunk: 506985, version: 1 - Chunkserver communication timed out: 192.168.99.137:9422 (try counter: 2)
Feb  5 11:44:09 universum mfsmaster[8676]: chunk 000000000000f127 has not enough valid parts (2) consider repairing it manually
Feb  5 11:44:09 universum mfsmaster[8676]: chunk 000000000000f127_0000000f - invalid part on (192.168.99.137 - ver:0000000e)
Feb  5 11:44:09 universum mfsmaster[8676]: chunk 000000000000f127_0000000f - invalid part on (192.168.99.3 - ver:0000000e)
Feb  5 11:44:09 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:44:13 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:44:15 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 11:44:17 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 11:44:19 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 4)
Feb  5 11:44:21 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:44:44 universum mfsmount: write file error, inode: 414011, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)
Feb  5 11:44:44 universum mfsmount: write file error, inode: 140664, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 1)
Feb  5 11:44:49 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:44:49 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:44:49 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007BF51 replication status: Unknown LizardFS error
Feb  5 11:44:49 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 0000000000078253 replication status: Unknown LizardFS error
Feb  5 11:44:51 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:44:54 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:44:55 universum mfsmount: write file error, inode: 414011, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 2)
Feb  5 11:44:56 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 11:44:58 universum mfsmount: write file error, inode: 140664, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 2)
Feb  5 11:44:58 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 11:44:59 universum kernel: [237839.493938] IPVS: __ip_vs_del_service: enter
Feb  5 11:45:00 universum dockerd[1458]: time="2018-02-05T11:45:00.240829158+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:45:00 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 4)
Feb  5 11:45:10 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:45:11 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:45:11 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 00000000000799CB replication status: Unknown LizardFS error
Feb  5 11:45:13 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:45:25 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:45:27 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 11:45:29 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 11:45:31 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 4)
Feb  5 11:45:33 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 5)
Feb  5 11:45:34 universum mfsmount: write file error, inode: 400427, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)
Feb  5 11:45:35 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 6)
Feb  5 11:45:37 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 7)
Feb  5 11:45:43 universum dockerd[1458]: sync duration of 7.384164527s, expected less than 1s
Feb  5 11:45:45 universum dockerd[1458]: sync duration of 2.717259987s, expected less than 1s
Feb  5 11:45:54 universum mfsmount: read file error, inode: 141719, index: 17, chunk: 507041, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:45:55 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:45:57 universum mfsmount: write file error, inode: 140664, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 1)
Feb  5 11:46:14 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:46:14 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007C942 replication status: Unknown LizardFS error
Feb  5 11:46:22 universum dockerd[1458]: sync duration of 5.620471903s, expected less than 1s
Feb  5 11:46:28 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:46:33 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:46:41 universum mfsmount: read file error, inode: 141719, index: 21, chunk: 507309, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:46:58 universum mfsmount: read file error, inode: 141719, index: 20, chunk: 507244, version: 1 - Chunkserver communication timed out: 192.168.99.137:9422 (try counter: 1)
Feb  5 11:47:10 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:47:10 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007BCB7 replication status: Unknown LizardFS error
Feb  5 11:47:10 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:47:10 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 00000000000791F2 replication status: Unknown LizardFS error
Feb  5 11:47:11 universum kernel: [237970.684192] IPVS: __ip_vs_del_service: enter
Feb  5 11:47:14 universum mfsmount: read file error, inode: 141719, index: 21, chunk: 507309, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:47:15 universum dockerd[1458]: time="2018-02-05T11:47:15.834626167+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:47:16 universum mfsmount: read file error, inode: 141719, index: 21, chunk: 507309, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 11:47:18 universum mfsmount: read file error, inode: 141719, index: 21, chunk: 507309, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 11:47:20 universum mfsmount: read file error, inode: 141719, index: 21, chunk: 507309, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 4)
Feb  5 11:47:22 universum mfsmount: read file error, inode: 141719, index: 21, chunk: 507309, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 5)
Feb  5 11:47:35 universum dhclient[1339]: DHCPREQUEST of 192.168.99.137 on eno1 to 192.168.99.1 port 67 (xid=0x5da66249)
Feb  5 11:47:35 universum dhclient[1339]: DHCPACK of 192.168.99.137 from 192.168.99.1
Feb  5 11:47:35 universum root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 11:47:35 universum dhclient[1339]: bound to 192.168.99.137 -- renewal in 260 seconds.
Feb  5 11:47:36 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:47:36 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:47:36 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 0000000000079821 replication status: Unknown LizardFS error
Feb  5 11:47:36 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 0000000000079A0D replication status: Unknown LizardFS error
Feb  5 11:47:37 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:48:09 universum mfschunkserver[7368]: message repeated 3 times: [ Did not manage to receive packet header]
Feb  5 11:48:30 universum mfsmount: write file error, inode: 400037, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 1)
Feb  5 11:48:40 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)
Feb  5 11:48:42 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)
Feb  5 11:48:44 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 3)
Feb  5 11:48:46 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 4)
Feb  5 11:48:48 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 5)
Feb  5 11:48:50 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 6)
Feb  5 11:48:51 universum mfsmount: write file error, inode: 400037, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 2)
Feb  5 11:48:51 universum dockerd[1458]: time="2018-02-05T11:48:51.506102982+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:87vvi5rhozzl8qws2ufe7ltl2 leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:48:51 universum dockerd[1458]: time="2018-02-05T11:48:51.506171356+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:sv1iurfiu8hvvlkql0955wr3x leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:48:51 universum dockerd[1458]: time="2018-02-05T11:48:51.506208850+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:2ntxgh39t5rk8b55tapu1l8gh leaving:false netPeers:2 entries:8 Queue qLen:0 netMsg/s:0"
Feb  5 11:48:51 universum dockerd[1458]: time="2018-02-05T11:48:51.506242855+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:mh26wxh3legx189b8ummizzvg leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:48:51 universum dockerd[1458]: time="2018-02-05T11:48:51.506260007+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:jer1lv709js55yyve3pudm9sq leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:48:51 universum dockerd[1458]: time="2018-02-05T11:48:51.506292716+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:xx0m50f4cog3dik8xwqxd4g0q leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:48:51 universum dockerd[1458]: time="2018-02-05T11:48:51.506324478+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:hixi0z35nd69dxzd621dtjl0e leaving:false netPeers:2 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:48:51 universum dockerd[1458]: time="2018-02-05T11:48:51.506356380+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:mwursuov7sc87fxhzkmpdtyes leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:48:51 universum dockerd[1458]: time="2018-02-05T11:48:51.506389027+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:2vg30sgcyr0omhvavdcptdrgv leaving:false netPeers:2 entries:46 Queue qLen:0 netMsg/s:0"
Feb  5 11:48:51 universum dockerd[1458]: time="2018-02-05T11:48:51.506405958+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:4vpfofq2v8hubh30a3td4xwgx leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:48:51 universum dockerd[1458]: time="2018-02-05T11:48:51.506422194+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:azbnydw3jzhx3cp6dy38fmldk leaving:false netPeers:1 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:48:51 universum dockerd[1458]: time="2018-02-05T11:48:51.506438579+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:ksx36pb1e18i5cywyl0itazk1 leaving:false netPeers:1 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:48:51 universum dockerd[1458]: time="2018-02-05T11:48:51.506455455+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:fbh4j37soxs97qxvjj36b2eis leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:48:51 universum dockerd[1458]: time="2018-02-05T11:48:51.506472739+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:npq26i6igr6n6fn1idntes46c leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:48:51 universum dockerd[1458]: time="2018-02-05T11:48:51.506488714+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:o5j79tmzry4nxic6p0uymd9qr leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:48:52 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 7)
Feb  5 11:48:54 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 8)
Feb  5 11:48:56 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 9)
Feb  5 11:48:57 universum mfsmount: write file error, inode: 413757, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)
Feb  5 11:48:58 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 10)
Feb  5 11:49:00 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 11)
Feb  5 11:49:02 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 12)
Feb  5 11:49:02 universum mfschunkserver[7368]: replication error: Chunkserver communication timed out: 192.168.99.3:9422
Feb  5 11:49:02 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007AB45 replication status: Unknown LizardFS error
Feb  5 11:49:05 universum mfsmount: write file error, inode: 400037, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 3)
Feb  5 11:49:05 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:49:05 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 00000000000760C9 replication status: Unknown LizardFS error
Feb  5 11:49:06 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 13)
Feb  5 11:49:14 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 14)
Feb  5 11:49:15 universum mfsmaster[8676]: chunk 000000000000f127 has not enough valid parts (2) consider repairing it manually
Feb  5 11:49:15 universum mfsmaster[8676]: chunk 000000000000f127_0000000f - invalid part on (192.168.99.137 - ver:0000000e)
Feb  5 11:49:15 universum mfsmaster[8676]: chunk 000000000000f127_0000000f - invalid part on (192.168.99.3 - ver:0000000e)
Feb  5 11:49:15 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:49:22 universum mfsmount: write file error, inode: 400037, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 4)
Feb  5 11:49:22 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:49:22 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007B638 replication status: Unknown LizardFS error
Feb  5 11:49:24 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 15)
Feb  5 11:49:34 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)
Feb  5 11:49:34 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 16)
Feb  5 11:49:44 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 17)
Feb  5 11:49:46 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 2)
Feb  5 11:49:48 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:49:48 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:49:48 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007356D replication status: Unknown LizardFS error
Feb  5 11:49:48 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 0000000000089882 replication status: Unknown LizardFS error
Feb  5 11:49:54 universum mfsmount: read file error, inode: 141719, index: 21, chunk: 507309, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:50:02 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)
Feb  5 11:50:04 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)
Feb  5 11:50:06 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 3)
Feb  5 11:50:08 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 4)
Feb  5 11:50:09 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:50:10 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 5)
Feb  5 11:50:12 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 6)
Feb  5 11:50:14 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 7)
Feb  5 11:50:20 universum mfsmount: read file error, inode: 141719, index: 23, chunk: 507500, version: 1 - Chunkserver communication timed out: 192.168.99.137:9422 (try counter: 1)
Feb  5 11:50:22 universum mfsmount: read file error, inode: 141719, index: 23, chunk: 507500, version: 1 - Chunkserver communication timed out: 192.168.99.137:9422 (try counter: 2)
Feb  5 11:50:24 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:50:27 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)
Feb  5 11:50:29 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)
Feb  5 11:50:31 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 3)
Feb  5 11:50:33 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 4)
Feb  5 11:50:35 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 5)
Feb  5 11:50:36 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:50:37 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 6)
Feb  5 11:50:39 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 7)
Feb  5 11:50:41 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 8)
Feb  5 11:50:43 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 9)
Feb  5 11:50:44 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:50:44 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007E71B replication status: Unknown LizardFS error
Feb  5 11:50:45 universum mfsmount: read file error, inode: 141719, index: 22, chunk: 507401, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 10)
Feb  5 11:50:53 universum dockerd[1458]: time="2018-02-05T11:50:53.034604217+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:50:53 universum dockerd[1458]: time="2018-02-05T11:50:53.034685478+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:50:53 universum dockerd[1458]: time="2018-02-05T11:50:53.034723577+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:50:53 universum dockerd[1458]: sync duration of 1.707560792s, expected less than 1s
Feb  5 11:50:53 universum mfsmount: read file error, inode: 141719, index: 23, chunk: 507500, version: 1 - Chunkserver communication timed out: 192.168.99.137:9422 (try counter: 1)
Feb  5 11:51:09 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:51:11 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 11:51:13 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 11:51:15 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 4)
Feb  5 11:51:31 universum mfsmount: write file error, inode: 400037, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 1)
Feb  5 11:51:33 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:51:33 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007ADE2 replication status: Unknown LizardFS error
Feb  5 11:51:33 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:51:35 universum dockerd[1458]: time="2018-02-05T11:51:35.034564344+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:51:35 universum dockerd[1458]: time="2018-02-05T11:51:35.034620249+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:51:35 universum dockerd[1458]: time="2018-02-05T11:51:35.034650220+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:51:35 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 11:51:37 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 11:51:39 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 4)
Feb  5 11:51:41 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 5)
Feb  5 11:51:43 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 6)
Feb  5 11:51:46 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:51:49 universum dockerd[1458]: time="2018-02-05T11:51:49.634597600+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:51:55 universum dhclient[1339]: DHCPREQUEST of 192.168.99.137 on eno1 to 192.168.99.1 port 67 (xid=0x5da66249)
Feb  5 11:51:55 universum dhclient[1339]: DHCPACK of 192.168.99.137 from 192.168.99.1
Feb  5 11:51:55 universum root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 11:51:55 universum dhclient[1339]: bound to 192.168.99.137 -- renewal in 274 seconds.
Feb  5 11:52:13 universum dockerd[1458]: time="2018-02-05T11:52:13.634581686+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:52:13 universum kernel: [238273.275100] IPVS: __ip_vs_del_service: enter
Feb  5 11:52:20 universum dockerd[1458]: time="2018-02-05T11:52:20.234426928+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:52:20 universum dockerd[1458]: time="2018-02-05T11:52:20.234480613+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:52:20 universum dockerd[1458]: time="2018-02-05T11:52:20.234507939+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:52:20 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:52:34 universum dockerd[1458]: sync duration of 2.323483652s, expected less than 1s
Feb  5 11:52:35 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:52:37 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 11:52:39 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 11:52:41 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 4)
Feb  5 11:52:43 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 5)
Feb  5 11:52:45 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 6)
Feb  5 11:52:47 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 7)
Feb  5 11:52:49 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 8)
Feb  5 11:52:51 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 9)
Feb  5 11:53:01 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:53:03 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 11:53:05 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 11:53:07 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 4)
Feb  5 11:53:09 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 5)
Feb  5 11:53:11 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 6)
Feb  5 11:53:13 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 7)
Feb  5 11:53:15 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 8)
Feb  5 11:53:17 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 9)
Feb  5 11:53:19 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 10)
Feb  5 11:53:21 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 11)
Feb  5 11:53:23 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 12)
Feb  5 11:53:30 universum mfsmount: write file error, inode: 400037, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 1)
Feb  5 11:53:30 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:53:32 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 11:53:34 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 11:53:36 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 4)
Feb  5 11:53:44 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:53:46 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 11:53:48 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)
Feb  5 11:53:48 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 11:53:50 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 4)
Feb  5 11:53:51 universum dockerd[1458]: time="2018-02-05T11:53:51.706060747+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:ksx36pb1e18i5cywyl0itazk1 leaving:false netPeers:1 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:53:51 universum dockerd[1458]: time="2018-02-05T11:53:51.706139321+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:sv1iurfiu8hvvlkql0955wr3x leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:53:51 universum dockerd[1458]: time="2018-02-05T11:53:51.706159755+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:jer1lv709js55yyve3pudm9sq leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:53:51 universum dockerd[1458]: time="2018-02-05T11:53:51.706177144+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:azbnydw3jzhx3cp6dy38fmldk leaving:false netPeers:1 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:53:51 universum dockerd[1458]: time="2018-02-05T11:53:51.706216333+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:hixi0z35nd69dxzd621dtjl0e leaving:false netPeers:2 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:53:51 universum dockerd[1458]: time="2018-02-05T11:53:51.706250724+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:mwursuov7sc87fxhzkmpdtyes leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:53:51 universum dockerd[1458]: time="2018-02-05T11:53:51.706284148+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:2vg30sgcyr0omhvavdcptdrgv leaving:false netPeers:2 entries:46 Queue qLen:0 netMsg/s:0"
Feb  5 11:53:51 universum dockerd[1458]: time="2018-02-05T11:53:51.706317549+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:xx0m50f4cog3dik8xwqxd4g0q leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:53:51 universum dockerd[1458]: time="2018-02-05T11:53:51.706334981+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:4vpfofq2v8hubh30a3td4xwgx leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:53:51 universum dockerd[1458]: time="2018-02-05T11:53:51.706367985+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:87vvi5rhozzl8qws2ufe7ltl2 leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:53:51 universum dockerd[1458]: time="2018-02-05T11:53:51.706385282+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:npq26i6igr6n6fn1idntes46c leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:53:51 universum dockerd[1458]: time="2018-02-05T11:53:51.706418506+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:2ntxgh39t5rk8b55tapu1l8gh leaving:false netPeers:2 entries:8 Queue qLen:0 netMsg/s:0"
Feb  5 11:53:51 universum dockerd[1458]: time="2018-02-05T11:53:51.706451868+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:mh26wxh3legx189b8ummizzvg leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:53:51 universum dockerd[1458]: time="2018-02-05T11:53:51.706468341+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:o5j79tmzry4nxic6p0uymd9qr leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:53:51 universum dockerd[1458]: time="2018-02-05T11:53:51.706507309+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:fbh4j37soxs97qxvjj36b2eis leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:53:52 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 5)
Feb  5 11:53:54 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 6)
Feb  5 11:53:56 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 7)
Feb  5 11:53:58 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 8)
Feb  5 11:54:00 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 9)
Feb  5 11:54:01 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 2)
Feb  5 11:54:01 universum mfsmaster[8676]: (192.168.99.3:9422) chunk: 0000000000079799 replication status: Unknown LizardFS error
Feb  5 11:54:02 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 10)
Feb  5 11:54:04 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 11)
Feb  5 11:54:06 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 12)
Feb  5 11:54:14 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:54:16 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 11:54:18 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 11:54:20 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 4)
Feb  5 11:54:20 universum mfsmaster[8676]: chunk 000000000000f127 has not enough valid parts (2) consider repairing it manually
Feb  5 11:54:20 universum mfsmaster[8676]: chunk 000000000000f127_0000000f - invalid part on (192.168.99.137 - ver:0000000e)
Feb  5 11:54:20 universum mfsmaster[8676]: chunk 000000000000f127_0000000f - invalid part on (192.168.99.3 - ver:0000000e)
Feb  5 11:54:27 universum dockerd[1458]: time="2018-02-05T11:54:27.234498316+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:54:27 universum dockerd[1458]: time="2018-02-05T11:54:27.234557320+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:54:27 universum dockerd[1458]: time="2018-02-05T11:54:27.234585824+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:54:30 universum kernel: [238409.871518] IPVS: __ip_vs_del_service: enter
Feb  5 11:54:30 universum dockerd[1458]: time="2018-02-05T11:54:30.435126660+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:54:44 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:54:46 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 11:54:48 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 11:54:50 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 4)
Feb  5 11:54:52 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 5)
Feb  5 11:54:54 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 6)
Feb  5 11:54:56 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 7)
Feb  5 11:55:02 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:55:07 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:55:18 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:55:20 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 11:55:29 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:55:31 universum mfsmount: write file error, inode: 400037, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 1)
Feb  5 11:55:31 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 11:55:33 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 11:55:35 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 4)
Feb  5 11:55:37 universum mfsmount: read file error, inode: 141719, index: 24, chunk: 507591, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 5)
Feb  5 11:55:39 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:55:52 universum mfsmount: write file error, inode: 400037, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 2)
Feb  5 11:55:59 universum mfsmount: write file error, inode: 63712, index: 0 - Timeout after 56320 ms (Timeout) (try counter: 1)
Feb  5 11:55:59 universum mfsmount: write file error, inode: 413360, index: 0 - Timeout after 48907 ms (Timeout) (try counter: 1)
Feb  5 11:56:07 universum mfschunkserver[7368]: replication error: Chunkserver communication timed out: 192.168.99.3:9422
Feb  5 11:56:07 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007BA75 replication status: Unknown LizardFS error
Feb  5 11:56:11 universum mfsmount: write file error, inode: 413360, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 2)
Feb  5 11:56:11 universum mfsmount: write file error, inode: 63712, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 2)
Feb  5 11:56:13 universum mfsmount: write file error, inode: 400037, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 3)
Feb  5 11:56:16 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:56:17 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:56:17 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007BAE3 replication status: Unknown LizardFS error
Feb  5 11:56:18 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:56:25 universum mfsmount: read file error, inode: 141719, index: 25, chunk: 507645, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)
Feb  5 11:56:27 universum mfsmount: read file error, inode: 141719, index: 25, chunk: 507645, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)
Feb  5 11:56:29 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)
Feb  5 11:56:29 universum mfsmount: read file error, inode: 141719, index: 25, chunk: 507645, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 3)
Feb  5 11:56:29 universum dhclient[1339]: DHCPREQUEST of 192.168.99.137 on eno1 to 192.168.99.1 port 67 (xid=0x5da66249)
Feb  5 11:56:29 universum dhclient[1339]: DHCPACK of 192.168.99.137 from 192.168.99.1
Feb  5 11:56:30 universum root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 11:56:31 universum mfsmount: read file error, inode: 141719, index: 25, chunk: 507645, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 4)
Feb  5 11:56:32 universum dhclient[1339]: bound to 192.168.99.137 -- renewal in 278 seconds.
Feb  5 11:56:33 universum mfsmount: read file error, inode: 141719, index: 25, chunk: 507645, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 5)
Feb  5 11:56:35 universum mfsmount: read file error, inode: 141719, index: 25, chunk: 507645, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 6)
Feb  5 11:56:37 universum mfsmount: read file error, inode: 141719, index: 25, chunk: 507645, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 7)
Feb  5 11:56:39 universum mfsmount: read file error, inode: 141719, index: 25, chunk: 507645, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 8)
Feb  5 11:56:41 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 2)
Feb  5 11:56:41 universum mfsmount: read file error, inode: 141719, index: 25, chunk: 507645, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 9)
Feb  5 11:56:43 universum mfsmount: read file error, inode: 141719, index: 25, chunk: 507645, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 10)
Feb  5 11:56:45 universum mfsmount: read file error, inode: 141719, index: 25, chunk: 507645, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 11)
Feb  5 11:56:47 universum mfsmount: read file error, inode: 141719, index: 25, chunk: 507645, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 12)
Feb  5 11:56:52 universum mfsmount: read file error, inode: 141719, index: 25, chunk: 507645, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 13)
Feb  5 11:56:52 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 3)
Feb  5 11:57:00 universum mfsmount: read file error, inode: 141719, index: 25, chunk: 507645, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 14)
Feb  5 11:57:04 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 4)
Feb  5 11:57:05 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:57:05 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 0000000000072F7E replication status: Unknown LizardFS error
Feb  5 11:57:05 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:57:05 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 0000000000081245 replication status: Unknown LizardFS error
Feb  5 11:57:06 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:57:06 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:57:10 universum mfsmount: read file error, inode: 141719, index: 25, chunk: 507645, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 15)
Feb  5 11:57:16 universum mfsmount: write file error, inode: 400427, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)
Feb  5 11:57:20 universum mfsmount: read file error, inode: 141719, index: 25, chunk: 507645, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 16)
Feb  5 11:57:27 universum mfsmount: write file error, inode: 400427, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 2)
Feb  5 11:57:30 universum mfsmount: read file error, inode: 141719, index: 25, chunk: 507645, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 17)
Feb  5 11:57:38 universum mfsmount: write file error, inode: 400427, index: 0 - Timeout after 10023 ms (Timeout) (try counter: 3)
Feb  5 11:57:39 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:57:39 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:57:39 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007A1C4 replication status: Unknown LizardFS error
Feb  5 11:57:39 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007A0CE replication status: Unknown LizardFS error
Feb  5 11:57:40 universum mfsmount: read file error, inode: 141719, index: 25, chunk: 507645, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 18)
Feb  5 11:57:49 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)
Feb  5 11:57:51 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)
Feb  5 11:57:53 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 3)
Feb  5 11:57:54 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:57:55 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 4)
Feb  5 11:57:58 universum mfsmount: write file error, inode: 63715, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)
Feb  5 11:57:58 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:57:58 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007A9DF replication status: Unknown LizardFS error
Feb  5 11:58:10 universum mfsmount: write file error, inode: 63715, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 2)
Feb  5 11:58:15 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)
Feb  5 11:58:17 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)
Feb  5 11:58:21 universum mfsmount: write file error, inode: 63715, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 3)
Feb  5 11:58:24 universum mfsmount: write file error, inode: 400037, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 1)
Feb  5 11:58:24 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:58:24 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007C4C4 replication status: Unknown LizardFS error
Feb  5 11:58:26 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:58:28 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)
Feb  5 11:58:30 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 3)
Feb  5 11:58:32 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 4)
Feb  5 11:58:34 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 5)
Feb  5 11:58:36 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 6)
Feb  5 11:58:38 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 7)
Feb  5 11:58:40 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 8)
Feb  5 11:58:41 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)
Feb  5 11:58:42 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 9)
Feb  5 11:58:44 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 10)
Feb  5 11:58:46 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 11)
Feb  5 11:58:48 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 12)
Feb  5 11:58:49 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:58:51 universum dockerd[1458]: time="2018-02-05T11:58:51.906059197+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:4vpfofq2v8hubh30a3td4xwgx leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:58:51 universum dockerd[1458]: time="2018-02-05T11:58:51.906166318+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:hixi0z35nd69dxzd621dtjl0e leaving:false netPeers:2 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:58:51 universum dockerd[1458]: time="2018-02-05T11:58:51.906203962+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:2ntxgh39t5rk8b55tapu1l8gh leaving:false netPeers:2 entries:8 Queue qLen:0 netMsg/s:0"
Feb  5 11:58:51 universum dockerd[1458]: time="2018-02-05T11:58:51.906221413+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:npq26i6igr6n6fn1idntes46c leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:58:51 universum dockerd[1458]: time="2018-02-05T11:58:51.906237681+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:o5j79tmzry4nxic6p0uymd9qr leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:58:51 universum dockerd[1458]: time="2018-02-05T11:58:51.906256065+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:fbh4j37soxs97qxvjj36b2eis leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:58:51 universum dockerd[1458]: time="2018-02-05T11:58:51.906289815+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:87vvi5rhozzl8qws2ufe7ltl2 leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:58:51 universum dockerd[1458]: time="2018-02-05T11:58:51.906306563+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:jer1lv709js55yyve3pudm9sq leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:58:51 universum dockerd[1458]: time="2018-02-05T11:58:51.906323036+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:sv1iurfiu8hvvlkql0955wr3x leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:58:51 universum dockerd[1458]: time="2018-02-05T11:58:51.906342618+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:azbnydw3jzhx3cp6dy38fmldk leaving:false netPeers:1 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:58:51 universum dockerd[1458]: time="2018-02-05T11:58:51.906375582+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:xx0m50f4cog3dik8xwqxd4g0q leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:58:51 universum dockerd[1458]: time="2018-02-05T11:58:51.906407643+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:mwursuov7sc87fxhzkmpdtyes leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:58:51 universum dockerd[1458]: time="2018-02-05T11:58:51.906440244+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:mh26wxh3legx189b8ummizzvg leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:58:51 universum dockerd[1458]: time="2018-02-05T11:58:51.906457171+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:ksx36pb1e18i5cywyl0itazk1 leaving:false netPeers:1 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:58:51 universum dockerd[1458]: time="2018-02-05T11:58:51.906500505+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:2vg30sgcyr0omhvavdcptdrgv leaving:false netPeers:2 entries:46 Queue qLen:0 netMsg/s:0"
Feb  5 11:58:52 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 2)
Feb  5 11:58:53 universum dockerd[1458]: time="2018-02-05T11:58:53.353304585+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:58:53 universum dockerd[1458]: time="2018-02-05T11:58:53.353378407+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:58:53 universum dockerd[1458]: time="2018-02-05T11:58:53.365181437+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:58:53 universum dockerd[1458]: time="2018-02-05T11:58:53.365243192+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:58:53 universum dockerd[1458]: time="2018-02-05T11:58:53.365278410+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:58:55 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 11:58:57 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)
Feb  5 11:58:58 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:58:59 universum mfsmount: read file error, inode: 141719, index: 26, chunk: 507728, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 3)
Feb  5 11:59:03 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 3)
Feb  5 11:59:04 universum mfsmount: write file error, inode: 358923, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 12)
Feb  5 11:59:04 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:59:04 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007CB68 replication status: Unknown LizardFS error
Feb  5 11:59:06 universum dockerd[1458]: time="2018-02-05T11:59:06.341072066+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:59:06 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:59:06 universum kernel: [238686.463527] IPVS: __ip_vs_del_service: enter
Feb  5 11:59:15 universum mfsmount: write file error, inode: 413342, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)
Feb  5 11:59:17 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 4)
Feb  5 11:59:25 universum mfsmount: write file error, inode: 358923, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 13)
Feb  5 11:59:25 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:59:25 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007C2BB replication status: Unknown LizardFS error
Feb  5 11:59:25 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:59:25 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007B736 replication status: Unknown LizardFS error
Feb  5 11:59:26 universum mfsmount: write file error, inode: 413342, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 2)
Feb  5 11:59:26 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:59:26 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:59:26 universum mfsmaster[8676]: chunk 000000000000f127 has not enough valid parts (2) consider repairing it manually
Feb  5 11:59:26 universum mfsmaster[8676]: chunk 000000000000f127_0000000f - invalid part on (192.168.99.137 - ver:0000000e)
Feb  5 11:59:26 universum mfsmaster[8676]: chunk 000000000000f127_0000000f - invalid part on (192.168.99.3 - ver:0000000e)
Feb  5 11:59:28 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 5)
Feb  5 11:59:28 universum dockerd[1458]: time="2018-02-05T11:59:28.634475035+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:59:30 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:59:38 universum dockerd[1458]: sync duration of 1.294411485s, expected less than 1s
Feb  5 11:59:41 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:59:41 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 11:59:50 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)
Feb  5 11:59:53 universum mfsmount: write file error, inode: 40710, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)
Feb  5 11:59:55 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:59:55 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 11:59:55 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 0000000000079508 replication status: Unknown LizardFS error
Feb  5 11:59:55 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007948D replication status: Unknown LizardFS error
Feb  5 12:00:00 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 12:00:02 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 12:00:04 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 12:00:06 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 4)
Feb  5 12:00:26 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)
Feb  5 12:00:29 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 12:00:31 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 12:00:33 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 12:00:35 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 4)
Feb  5 12:00:36 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 12:00:37 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 5)
Feb  5 12:00:44 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 12:00:46 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 12:00:48 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 12:00:50 universum mfsmount: write file error, inode: 400427, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)
Feb  5 12:00:50 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 12:00:52 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 12:00:52 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007E71B replication status: Unknown LizardFS error
Feb  5 12:01:01 universum mfsmount: write file error, inode: 400427, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 2)
Feb  5 12:01:10 universum dhclient[1339]: DHCPREQUEST of 192.168.99.137 on eno1 to 192.168.99.1 port 67 (xid=0x5da66249)
Feb  5 12:01:10 universum dhclient[1339]: DHCPACK of 192.168.99.137 from 192.168.99.1
Feb  5 12:01:10 universum root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 12:01:10 universum dhclient[1339]: bound to 192.168.99.137 -- renewal in 254 seconds.
Feb  5 12:01:12 universum mfsmount: write file error, inode: 400427, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 3)
Feb  5 12:01:15 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 12:01:15 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007CB37 replication status: Unknown LizardFS error
Feb  5 12:01:17 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 12:01:20 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 12:01:22 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 12:01:30 universum dockerd[1458]: time="2018-02-05T12:01:30.234696422+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 12:01:30 universum dockerd[1458]: time="2018-02-05T12:01:30.234755160+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 12:01:30 universum dockerd[1458]: time="2018-02-05T12:01:30.234781904+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 12:01:37 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 12:01:43 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 12:01:45 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 12:01:47 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 12:01:49 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 4)
Feb  5 12:01:51 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 5)
Feb  5 12:01:53 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 6)
Feb  5 12:01:55 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 7)
Feb  5 12:01:57 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 8)
Feb  5 12:01:59 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 9)
Feb  5 12:02:01 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 10)
Feb  5 12:02:03 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 11)
Feb  5 12:02:05 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 12)
Feb  5 12:02:09 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 12:02:09 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 12:02:30 universum mfsmount: write file error, inode: 400037, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 1)
Feb  5 12:02:40 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 12:02:40 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 12:02:40 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007E5FF replication status: Unknown LizardFS error
Feb  5 12:02:40 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007E398 replication status: Unknown LizardFS error
Feb  5 12:03:11 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 12:03:22 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 12:03:22 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 00000000000733A5 replication status: Unknown LizardFS error
Feb  5 12:03:22 universum mfsmount: write file error, inode: 400037, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 1)
Feb  5 12:03:22 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 12:03:22 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007AB54 replication status: Unknown LizardFS error
Feb  5 12:03:27 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 12:03:35 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)
Feb  5 12:03:35 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.2:9422)
Feb  5 12:03:35 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 00000000000747CE replication status: Unknown LizardFS error
Feb  5 12:03:47 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 2)
Feb  5 12:03:47 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 12:03:47 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 00000000000760C9 replication status: Unknown LizardFS error
Feb  5 12:03:52 universum dockerd[1458]: time="2018-02-05T12:03:52.106085774+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:jer1lv709js55yyve3pudm9sq leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 12:03:52 universum dockerd[1458]: time="2018-02-05T12:03:52.106194180+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:xx0m50f4cog3dik8xwqxd4g0q leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:03:52 universum dockerd[1458]: time="2018-02-05T12:03:52.106215985+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:npq26i6igr6n6fn1idntes46c leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 12:03:52 universum dockerd[1458]: time="2018-02-05T12:03:52.106235080+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:4vpfofq2v8hubh30a3td4xwgx leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 12:03:52 universum dockerd[1458]: time="2018-02-05T12:03:52.106272510+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:87vvi5rhozzl8qws2ufe7ltl2 leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:03:52 universum dockerd[1458]: time="2018-02-05T12:03:52.106324843+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:mh26wxh3legx189b8ummizzvg leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:03:52 universum dockerd[1458]: time="2018-02-05T12:03:52.106358124+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:mwursuov7sc87fxhzkmpdtyes leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:03:52 universum dockerd[1458]: time="2018-02-05T12:03:52.106395805+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:2vg30sgcyr0omhvavdcptdrgv leaving:false netPeers:2 entries:48 Queue qLen:0 netMsg/s:0"
Feb  5 12:03:52 universum dockerd[1458]: time="2018-02-05T12:03:52.106413597+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:o5j79tmzry4nxic6p0uymd9qr leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 12:03:52 universum dockerd[1458]: time="2018-02-05T12:03:52.106430222+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:sv1iurfiu8hvvlkql0955wr3x leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 12:03:52 universum dockerd[1458]: time="2018-02-05T12:03:52.106446484+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:ksx36pb1e18i5cywyl0itazk1 leaving:false netPeers:1 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:03:52 universum dockerd[1458]: time="2018-02-05T12:03:52.106479591+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:hixi0z35nd69dxzd621dtjl0e leaving:false netPeers:2 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 12:03:52 universum dockerd[1458]: time="2018-02-05T12:03:52.106495912+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:fbh4j37soxs97qxvjj36b2eis leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 12:03:52 universum dockerd[1458]: time="2018-02-05T12:03:52.106513661+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:azbnydw3jzhx3cp6dy38fmldk leaving:false netPeers:1 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:03:52 universum dockerd[1458]: time="2018-02-05T12:03:52.106551088+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:2ntxgh39t5rk8b55tapu1l8gh leaving:false netPeers:2 entries:8 Queue qLen:0 netMsg/s:0"
Feb  5 12:03:58 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 3)
Feb  5 12:04:10 universum mfsmount: write file error, inode: 399491, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 4)
Feb  5 12:04:23 universum dockerd[1458]: time="2018-02-05T12:04:23.435646989+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 12:04:23 universum dockerd[1458]: time="2018-02-05T12:04:23.435720109+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 12:04:23 universum dockerd[1458]: time="2018-02-05T12:04:23.435754698+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 12:04:32 universum mfsmaster[8676]: chunk 000000000000f127 has not enough valid parts (2) consider repairing it manually
Feb  5 12:04:32 universum mfsmaster[8676]: chunk 000000000000f127_0000000f - invalid part on (192.168.99.137 - ver:0000000e)
Feb  5 12:04:32 universum mfsmaster[8676]: chunk 000000000000f127_0000000f - invalid part on (192.168.99.3 - ver:0000000e)
Feb  5 12:04:38 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 12:04:43 universum mfsmount: write file error, inode: 140664, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 1)
Feb  5 12:05:02 universum mfsmount: write file error, inode: 140664, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 2)
Feb  5 12:05:02 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 12:05:02 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007BA0A replication status: Unknown LizardFS error
Feb  5 12:05:02 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 12:05:02 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007B638 replication status: Unknown LizardFS error
Feb  5 12:05:03 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 12:05:23 universum mfsmount: write file error, inode: 140664, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 3)
Feb  5 12:05:24 universum dhclient[1339]: DHCPREQUEST of 192.168.99.137 on eno1 to 192.168.99.1 port 67 (xid=0x5da66249)
Feb  5 12:05:24 universum dhclient[1339]: DHCPACK of 192.168.99.137 from 192.168.99.1
Feb  5 12:05:24 universum root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 12:05:24 universum dhclient[1339]: bound to 192.168.99.137 -- renewal in 255 seconds.
Feb  5 12:05:33 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 12:05:33 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 12:05:33 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007B799 replication status: Unknown LizardFS error
Feb  5 12:05:51 universum mfsmount: write file error, inode: 400427, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)
Feb  5 12:06:02 universum mfsmount: write file error, inode: 400427, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 2)
Feb  5 12:06:10 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 12:06:10 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 12:06:10 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 0000000000079E90 replication status: Unknown LizardFS error
Feb  5 12:06:10 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007DB96 replication status: Unknown LizardFS error
Feb  5 12:06:13 universum mfsmount: write file error, inode: 400427, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 3)
Feb  5 12:06:13 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 12:06:16 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 12:06:24 universum mfsmount: write file error, inode: 400427, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 4)
Feb  5 12:06:28 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 12:06:28 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 12:06:28 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 00000000000782A2 replication status: Unknown LizardFS error
Feb  5 12:06:28 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 000000000007C305 replication status: Unknown LizardFS error
Feb  5 12:06:33 universum dockerd[1458]: time="2018-02-05T12:06:33.634990609+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 12:06:33 universum kernel: [239133.251368] IPVS: __ip_vs_del_service: enter
Feb  5 12:06:39 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 12:07:11 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 12:07:11 universum mfschunkserver[7368]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb  5 12:07:11 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 00000000000812C0 replication status: Unknown LizardFS error
Feb  5 12:07:11 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 0000000000081245 replication status: Unknown LizardFS error
Feb  5 12:07:12 universum mfsmount: write file error, inode: 48513, index: 0 - Timeout after 37108 ms (Timeout) (try counter: 1)
Feb  5 12:07:13 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 12:07:13 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 12:07:34 universum mfsmount: write file error, inode: 400427, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)
Feb  5 12:07:45 universum mfsmount: write file error, inode: 400427, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 2)
Feb  5 12:07:57 universum mfsmount: write file error, inode: 400427, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 3)
Feb  5 12:08:09 universum mfsmount: write file error, inode: 400427, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 4)
Feb  5 12:08:13 universum mfschunkserver[7368]: replication error: Chunkserver communication timed out: 192.168.99.3:9422
Feb  5 12:08:13 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 0000000000079B0C replication status: Unknown LizardFS error
Feb  5 12:08:13 universum mfschunkserver[7368]: replication error: Chunkserver communication timed out: 192.168.99.3:9422
Feb  5 12:08:13 universum mfsmaster[8676]: (192.168.99.137:9422) chunk: 0000000000079CF8 replication status: Unknown LizardFS error
Feb  5 12:08:16 universum mfsmount: write file error, inode: 409072, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)
Feb  5 12:08:27 universum mfsmount: write file error, inode: 409072, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 2)
Feb  5 12:08:34 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 12:08:36 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 12:08:36 universum mfsmount: write file error, inode: 140664, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 1)
Feb  5 12:08:38 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 12:08:39 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 12:08:40 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 4)
Feb  5 12:08:43 universum dockerd[1458]: time="2018-02-05T12:08:43.434724346+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 12:08:43 universum kernel: [239263.047510] IPVS: __ip_vs_del_service: enter
Feb  5 12:08:52 universum dockerd[1458]: time="2018-02-05T12:08:52.306033047+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:4vpfofq2v8hubh30a3td4xwgx leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 12:08:52 universum dockerd[1458]: time="2018-02-05T12:08:52.306101830+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:mh26wxh3legx189b8ummizzvg leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:08:52 universum dockerd[1458]: time="2018-02-05T12:08:52.306122038+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:o5j79tmzry4nxic6p0uymd9qr leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 12:08:52 universum dockerd[1458]: time="2018-02-05T12:08:52.306139536+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:jer1lv709js55yyve3pudm9sq leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 12:08:52 universum dockerd[1458]: time="2018-02-05T12:08:52.306174622+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:mwursuov7sc87fxhzkmpdtyes leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:08:52 universum dockerd[1458]: time="2018-02-05T12:08:52.306191648+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:azbnydw3jzhx3cp6dy38fmldk leaving:false netPeers:1 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:08:52 universum dockerd[1458]: time="2018-02-05T12:08:52.306225132+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:xx0m50f4cog3dik8xwqxd4g0q leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:08:52 universum dockerd[1458]: time="2018-02-05T12:08:52.306242263+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:ksx36pb1e18i5cywyl0itazk1 leaving:false netPeers:1 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:08:52 universum dockerd[1458]: time="2018-02-05T12:08:52.306258691+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:npq26i6igr6n6fn1idntes46c leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 12:08:52 universum dockerd[1458]: time="2018-02-05T12:08:52.306291511+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:2vg30sgcyr0omhvavdcptdrgv leaving:false netPeers:2 entries:47 Queue qLen:0 netMsg/s:0"
Feb  5 12:08:52 universum dockerd[1458]: time="2018-02-05T12:08:52.306325107+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:87vvi5rhozzl8qws2ufe7ltl2 leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:08:52 universum dockerd[1458]: time="2018-02-05T12:08:52.306342317+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:sv1iurfiu8hvvlkql0955wr3x leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 12:08:52 universum dockerd[1458]: time="2018-02-05T12:08:52.306375309+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:hixi0z35nd69dxzd621dtjl0e leaving:false netPeers:2 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 12:08:52 universum dockerd[1458]: time="2018-02-05T12:08:52.306412309+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:2ntxgh39t5rk8b55tapu1l8gh leaving:false netPeers:2 entries:8 Queue qLen:0 netMsg/s:0"
Feb  5 12:08:52 universum dockerd[1458]: time="2018-02-05T12:08:52.306430023+01:00" level=info msg="NetworkDB stats universum(8c31e37f1156) - netID:fbh4j37soxs97qxvjj36b2eis leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 12:08:53 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 1)
Feb  5 12:08:55 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 2)
Feb  5 12:08:57 universum mfsmount: read file error, inode: 141719, index: 28, chunk: 507912, version: 1 - Chunkserver communication timed out: 192.168.99.7:9422 (try counter: 3)
Feb  5 12:09:00 universum mfschunkserver[7368]: Did not manage to receive packet header
Feb  5 12:09:02 universum dockerd[1458]: time="2018-02-05T12:09:02.539874916+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 12:09:02 universum dockerd[1458]: time="2018-02-05T12:09:02.539957196+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 12:09:03 universum mfsmount: write file error, inode: 48506, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 1)
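
Since the excerpt above is long, a small tally can help show which chunkserver addresses the mount errors point at. The following is only a minimal sketch: the regular expressions are assumptions based on the line shapes visible in this log, not on any documented LizardFS log format, and `count_mount_errors` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumption: both patterns are derived only from the syslog lines pasted
# in this issue; other LizardFS versions may format messages differently.
MOUNT_ERROR = re.compile(r"mfsmount: (read|write) file error")
SERVER = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3}):9422")


def count_mount_errors(lines):
    """Count mfsmount errors per (operation, chunkserver address)."""
    counts = Counter()
    for line in lines:
        op = MOUNT_ERROR.search(line)
        if not op:
            continue  # not an mfsmount read/write error line
        server = SERVER.search(line)
        # Some lines (e.g. "Timeout after 37108 ms") name no server.
        address = server.group(1) if server else "unknown"
        counts[(op.group(1), address)] += 1
    return counts


# Three lines copied from the log above, used as sample input.
sample = [
    "Feb  5 12:01:20 universum mfsmount: read file error, inode: 141719, "
    "index: 28, chunk: 507912, version: 1 - Chunkserver communication "
    "timed out: 192.168.99.7:9422 (try counter: 1)",
    "Feb  5 12:02:30 universum mfsmount: write file error, inode: 400037, "
    "index: 0 - Chunkserver timed out (server 192.168.99.3:9422) "
    "(try counter: 1)",
    "Feb  5 12:07:12 universum mfsmount: write file error, inode: 48513, "
    "index: 0 - Timeout after 37108 ms (Timeout) (try counter: 1)",
]

for (operation, address), n in sorted(count_mount_errors(sample).items()):
    print(operation, address, n)
```

To run it against a real log, pass an open file instead of `sample`, for example `count_mount_errors(open("/var/log/syslog"))`.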

Host urknall (Chunk and Metalogger):

Feb  5 11:44:59 urknall kernel: [439533.674510] device veth8fbb4a7 left promiscuous mode
Feb  5 11:44:59 urknall kernel: [439533.674521] docker_gwbridge: port 21(veth8fbb4a7) entered disabled state
Feb  5 11:44:59 urknall dockerd[1676]: time="2018-02-05T11:44:59+01:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/b56bb316fe60faf95cea28cc55fa544f854ba19096fa340393e7f297c6be6cfa/shim.sock" debug=false module="containerd/tasks" pid=11659
Feb  5 11:44:59 urknall kernel: [439533.707232] br0: port 5(veth1042) entered disabled state
Feb  5 11:44:59 urknall kernel: [439533.707839] veth7b0561a: renamed from eth0
Feb  5 11:44:59 urknall kernel: [439533.794199] IPVS: Creating netns size=2192 id=1956
Feb  5 11:44:59 urknall kernel: [439533.853586] IPVS: __ip_vs_del_service: enter
Feb  5 11:45:00 urknall mfsmetalogger[8545]: sessions downloaded 742B/0.003541s (0.210 MB/s)
Feb  5 11:45:00 urknall kernel: [439534.918243] eth0: renamed from veth7df8d6d
Feb  5 11:45:00 urknall kernel: [439534.938086] br0: port 4(veth1041) entered forwarding state
Feb  5 11:45:00 urknall kernel: [439534.938115] br0: port 4(veth1041) entered forwarding state
Feb  5 11:45:08 urknall dockerd[1676]: time="2018-02-05T11:45:08.634000848+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:87vvi5rhozzl8qws2ufe7ltl2 leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:45:08 urknall dockerd[1676]: time="2018-02-05T11:45:08.634166121+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:xx0m50f4cog3dik8xwqxd4g0q leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:45:08 urknall dockerd[1676]: time="2018-02-05T11:45:08.634222758+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:dpu8s5w8nx7vtqyiqai8rflqm leaving:false netPeers:1 entries:12 Queue qLen:0 netMsg/s:0"
Feb  5 11:45:08 urknall dockerd[1676]: time="2018-02-05T11:45:08.634297624+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:2vg30sgcyr0omhvavdcptdrgv leaving:false netPeers:2 entries:49 Queue qLen:0 netMsg/s:0"
Feb  5 11:45:08 urknall dockerd[1676]: time="2018-02-05T11:45:08.634370376+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:mh26wxh3legx189b8ummizzvg leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:45:08 urknall dockerd[1676]: time="2018-02-05T11:45:08.634422287+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:gliwcqzlhjzjulhs3xdbxuqrv leaving:false netPeers:1 entries:7 Queue qLen:0 netMsg/s:0"
Feb  5 11:45:08 urknall dockerd[1676]: time="2018-02-05T11:45:08.634493086+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:2ntxgh39t5rk8b55tapu1l8gh leaving:false netPeers:2 entries:8 Queue qLen:0 netMsg/s:0"
Feb  5 11:45:08 urknall dockerd[1676]: time="2018-02-05T11:45:08.634564372+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:hixi0z35nd69dxzd621dtjl0e leaving:false netPeers:2 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:45:08 urknall dockerd[1676]: time="2018-02-05T11:45:08.634639279+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:mwursuov7sc87fxhzkmpdtyes leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:45:13 urknall mfsmount: write file error, inode: 67783, index: 0 - Timeout after 12008 ms (Timeout) (try counter: 29)
Feb  5 11:45:15 urknall mfschunkserver[8794]: Did not manage to receive packet header
Feb  5 11:45:15 urknall kernel: [439549.728988] br0: port 5(veth1042) entered disabled state
Feb  5 11:45:15 urknall kernel: [439549.736377] device veth1042 left promiscuous mode
Feb  5 11:45:15 urknall kernel: [439549.736386] br0: port 5(veth1042) entered disabled state
Feb  5 11:45:15 urknall kernel: [439549.948972] br0: port 4(veth1041) entered forwarding state
Feb  5 11:45:17 urknall kernel: [439551.477655] br0: port 2(veth0) entered disabled state
Feb  5 11:45:17 urknall kernel: [439551.482445] device veth0 left promiscuous mode
Feb  5 11:45:17 urknall kernel: [439551.482471] br0: port 2(veth0) entered disabled state
Feb  5 11:45:18 urknall kernel: [439552.232287] br0: port 3(veth5) entered disabled state
Feb  5 11:45:18 urknall kernel: [439552.232316] br0: port 1(vxlan0) entered disabled state
Feb  5 11:45:18 urknall kernel: [439552.237236] ov-00100d-dpu8s: renamed from br0
Feb  5 11:45:18 urknall kernel: [439552.254267] device veth5 left promiscuous mode
Feb  5 11:45:18 urknall kernel: [439552.254318] ov-00100d-dpu8s: port 3(veth5) entered disabled state
Feb  5 11:45:18 urknall kernel: [439552.273546] device vxlan0 left promiscuous mode
Feb  5 11:45:18 urknall kernel: [439552.273586] ov-00100d-dpu8s: port 1(vxlan0) entered disabled state
Feb  5 11:45:18 urknall kernel: [439552.303426] eth1: renamed from veth2040f5e
Feb  5 11:45:18 urknall kernel: [439552.324078] vx-00100d-dpu8s: renamed from vxlan0
Feb  5 11:45:18 urknall kernel: [439552.338467] IPVS: __ip_vs_del_service: enter
Feb  5 11:45:18 urknall kernel: [439552.355298] docker_gwbridge: port 40(vetha782956) entered forwarding state
Feb  5 11:45:18 urknall kernel: [439552.355373] docker_gwbridge: port 40(vetha782956) entered forwarding state
Feb  5 11:45:18 urknall kernel: [439552.371458] vethd110afc: renamed from veth5
Feb  5 11:45:18 urknall kernel: [439552.441719] vethed4e8de: renamed from eth2
Feb  5 11:45:15 urknall mfschunkserver[8794]: Did not manage to receive packet header
Feb  5 11:45:19 urknall dockerd[1676]: time="2018-02-05T11:45:19.072637973+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/containers/delete type="*events.ContainerDelete"
Feb  5 11:45:19 urknall kernel: [439553.181765] eth2: renamed from veth6eac6ab
Feb  5 11:45:19 urknall kernel: [439553.198533] br0: port 2(veth0) entered forwarding state
Feb  5 11:45:19 urknall kernel: [439553.198554] br0: port 2(veth0) entered forwarding state
Feb  5 11:45:28 urknall kernel: [439562.694571] docker_gwbridge: port 37(veth0b18fd5) entered disabled state
Feb  5 11:45:28 urknall kernel: [439562.694864] veth204cabe: renamed from eth1
Feb  5 11:45:33 urknall kernel: [439567.359237] docker_gwbridge: port 40(vetha782956) entered forwarding state
Feb  5 11:45:34 urknall mfsmount: write file error, inode: 67783, index: 0 - Chunk write error (Disconnected) (try counter: 30)
Feb  5 11:45:34 urknall kernel: [439568.255447] br0: port 2(veth0) entered forwarding state
Feb  5 11:45:34 urknall dockerd[1676]: time="2018-02-05T11:45:34.513176377+01:00" level=error msg="fatal task error" error="task: non-zero exit (3)" module=node/agent/taskmanager node.id=4z5kkc15en2q29wt1crkxomc8 service.id=g6trfvrh61m8pdp2l1ttht8m8 task.id=n7oevpc7g1s9tnya7ma8nhrp1
Feb  5 11:45:35 urknall dockerd[1676]: time="2018-02-05T11:45:35.754305089+01:00" level=warning msg="unknown container" container=b56bb316fe60faf95cea28cc55fa544f854ba19096fa340393e7f297c6be6cfa module=libcontainerd namespace=plugins.moby
Feb  5 11:45:35 urknall dockerd[1676]: time="2018-02-05T11:45:35.844875898+01:00" level=warning msg="unknown container" container=b56bb316fe60faf95cea28cc55fa544f854ba19096fa340393e7f297c6be6cfa module=libcontainerd namespace=plugins.moby
Feb  5 11:45:37 urknall mfschunkserver[8794]: Did not manage to receive packet header
Feb  5 11:45:45 urknall dhclient[1979]: DHCPREQUEST of 192.168.99.3 on em1 to 192.168.99.1 port 67 (xid=0x1f01ed1a)
Feb  5 11:45:45 urknall dhclient[1979]: DHCPACK of 192.168.99.3 from 192.168.99.1
Feb  5 11:45:45 urknall marc: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 11:45:50 urknall mfsmount: read file error, inode: 387293, index: 0, chunk: 618667, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)
Feb  5 11:45:52 urknall mfsmount: read file error, inode: 387293, index: 0, chunk: 618667, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)
Feb  5 11:45:54 urknall mfsmount: read file error, inode: 387293, index: 0, chunk: 618667, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 3)
Feb  5 11:45:56 urknall mfsmount: read file error, inode: 387293, index: 0, chunk: 618667, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 4)
Feb  5 11:45:57 urknall dhclient[1979]: bound to 192.168.99.3 -- renewal in 299 seconds.
Feb  5 11:45:57 urknall mfsmount: write file error, inode: 196764, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 21)
Feb  5 11:45:57 urknall mfsmount: write file error, inode: 67783, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 31)
Feb  5 11:45:57 urknall kernel: [439591.782830] docker_gwbridge: port 37(veth0b18fd5) entered disabled state
Feb  5 11:45:57 urknall kernel: [439591.795431] device veth0b18fd5 left promiscuous mode
Feb  5 11:45:57 urknall kernel: [439591.795440] docker_gwbridge: port 37(veth0b18fd5) entered disabled state
Feb  5 11:46:00 urknall mfsmetalogger[8545]: sessions downloaded 742B/0.000618s (1.201 MB/s)
Feb  5 11:46:04 urknall mfsmount: read file error, inode: 413912, index: 0, chunk: 622939, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)
Feb  5 11:46:06 urknall mfsmount: read file error, inode: 413912, index: 0, chunk: 622939, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)
Feb  5 11:46:08 urknall mfsmount: read file error, inode: 413912, index: 0, chunk: 622939, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 3)
Feb  5 11:46:10 urknall mfsmount: read file error, inode: 413912, index: 0, chunk: 622939, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 4)
Feb  5 11:46:12 urknall mfsmount: read file error, inode: 413912, index: 0, chunk: 622939, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 5)
Feb  5 11:46:15 urknall kernel: [439609.724889] IPVS: __ip_vs_del_service: enter
Feb  5 11:46:17 urknall mfschunkserver[8794]: Did not manage to receive packet header
Feb  5 11:46:17 urknall mfschunkserver[8794]: Did not manage to receive packet header
Feb  5 11:46:17 urknall dockerd[1676]: time="2018-02-05T11:46:17.894089436+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/containers/delete type="*events.ContainerDelete"
Feb  5 11:46:26 urknall mfsmount: read file error, inode: 413912, index: 0, chunk: 622939, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)
Feb  5 11:46:28 urknall mfsmount: read file error, inode: 413912, index: 0, chunk: 622939, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)
Feb  5 11:46:30 urknall mfsmount: read file error, inode: 413912, index: 0, chunk: 622939, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 3)
Feb  5 11:46:32 urknall mfsmount: read file error, inode: 413912, index: 0, chunk: 622939, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 4)
Feb  5 11:46:33 urknall mfsmount: write file error, inode: 67788, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 1)
Feb  5 11:46:33 urknall mfsmount: write file error, inode: 3791, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 37)
Feb  5 11:46:35 urknall dockerd[1676]: time="2018-02-05T11:46:35.343356885+01:00" level=error msg="fatal task error" error="task: non-zero exit (1)" module=node/agent/taskmanager node.id=4z5kkc15en2q29wt1crkxomc8 service.id=gq2beoh3n3f0y9ix02ri5kiyl task.id=onf2yrslt1rihpxg6jf750a98
Feb  5 11:46:36 urknall mfschunkserver[8794]: Did not manage to receive packet header
Feb  5 11:46:47 urknall mfsmount: read file error, inode: 73950, index: 0, chunk: 67081, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)
Feb  5 11:46:49 urknall mfsmount: read file error, inode: 73950, index: 0, chunk: 67081, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)
Feb  5 11:46:51 urknall mfsmount: read file error, inode: 73950, index: 0, chunk: 67081, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 3)
Feb  5 11:46:53 urknall mfsmount: read file error, inode: 73950, index: 0, chunk: 67081, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 4)
Feb  5 11:46:55 urknall mfsmount: read file error, inode: 73950, index: 0, chunk: 67081, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 5)
Feb  5 11:46:57 urknall mfsmount: read file error, inode: 73950, index: 0, chunk: 67081, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 6)
Feb  5 11:46:59 urknall mfsmount: read file error, inode: 73950, index: 0, chunk: 67081, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 7)
Feb  5 11:47:01 urknall mfsmount: read file error, inode: 73950, index: 0, chunk: 67081, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 8)
Feb  5 11:47:02 urknall mfsmount: write file error, inode: 67783, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 32)
Feb  5 11:47:03 urknall mfsmount: read file error, inode: 73950, index: 0, chunk: 67081, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 9)
Feb  5 11:47:03 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 38)
Feb  5 11:47:05 urknall mfsmount: read file error, inode: 73950, index: 0, chunk: 67081, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 10)
Feb  5 11:47:07 urknall mfsmount: read file error, inode: 73950, index: 0, chunk: 67081, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 11)
Feb  5 11:47:09 urknall mfsmount: read file error, inode: 73950, index: 0, chunk: 67081, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 12)
Feb  5 11:47:10 urknall mfsmetalogger[8545]: sessions downloaded 742B/0.000647s (1.147 MB/s)
Feb  5 11:47:10 urknall dockerd[1676]: time="2018-02-05T11:47:10.756025073+01:00" level=warning msg="unknown container" container=b56bb316fe60faf95cea28cc55fa544f854ba19096fa340393e7f297c6be6cfa module=libcontainerd namespace=plugins.moby
Feb  5 11:47:10 urknall dockerd[1676]: time="2018-02-05T11:47:10+01:00" level=info msg="shim reaped" id=b56bb316fe60faf95cea28cc55fa544f854ba19096fa340393e7f297c6be6cfa module="containerd/tasks"
Feb  5 11:47:10 urknall dockerd[1676]: time="2018-02-05T11:47:10.909541156+01:00" level=info msg="ignoring event" module=libcontainerd namespace=plugins.moby topic=/tasks/delete type="*events.TaskDelete"
Feb  5 11:47:10 urknall dockerd[1676]: time="2018-02-05T11:47:10.909708254+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Feb  5 11:47:10 urknall kernel: [439665.026807] br0: port 4(veth1041) entered disabled state
Feb  5 11:47:10 urknall kernel: [439665.036044] veth7df8d6d: renamed from eth0
Feb  5 11:47:10 urknall kernel: [439665.090674] IPVS: __ip_vs_del_service: enter
Feb  5 11:47:21 urknall kernel: [439675.386698] br0: port 4(veth1041) entered disabled state
Feb  5 11:47:21 urknall kernel: [439675.398582] device veth1041 left promiscuous mode
Feb  5 11:47:21 urknall kernel: [439675.398592] br0: port 4(veth1041) entered disabled state
Feb  5 11:47:21 urknall kernel: [439675.437724] br0: port 2(veth0) entered disabled state
Feb  5 11:47:21 urknall kernel: [439675.437785] br0: port 1(vxlan0) entered disabled state
Feb  5 11:47:21 urknall kernel: [439675.441846] ov-00100b-gliwc: renamed from br0
Feb  5 11:47:21 urknall kernel: [439675.465408] device veth0 left promiscuous mode
Feb  5 11:47:21 urknall kernel: [439675.465443] ov-00100b-gliwc: port 2(veth0) entered disabled state
Feb  5 11:47:21 urknall kernel: [439675.481183] device vxlan0 left promiscuous mode
Feb  5 11:47:21 urknall kernel: [439675.481220] ov-00100b-gliwc: port 1(vxlan0) entered disabled state
Feb  5 11:47:21 urknall kernel: [439675.545315] vx-00100b-gliwc: renamed from vxlan0
Feb  5 11:47:21 urknall kernel: [439675.596712] veth54eaca9: renamed from veth0
Feb  5 11:47:21 urknall kernel: [439675.680849] veth6eac6ab: renamed from eth2
Feb  5 11:47:22 urknall kernel: [439676.605356] docker_gwbridge: port 40(vetha782956) entered disabled state
Feb  5 11:47:22 urknall kernel: [439676.605501] veth2040f5e: renamed from eth1
Feb  5 11:47:24 urknall kernel: [439678.586334] docker_gwbridge: port 40(vetha782956) entered disabled state
Feb  5 11:47:24 urknall kernel: [439678.604355] device vetha782956 left promiscuous mode
Feb  5 11:47:24 urknall kernel: [439678.604365] docker_gwbridge: port 40(vetha782956) entered disabled state
Feb  5 11:47:24 urknall dockerd[1676]: time="2018-02-05T11:47:24.537105545+01:00" level=warning msg="Peer operation failed:Unable to find the peerDB for nid:dpu8s5w8nx7vtqyiqai8rflqm op:&{3 dpu8s5w8nx7vtqyiqai8rflqm  [] [] [] [] false false false DeleteNetwork}"
Feb  5 11:47:35 urknall mfsmount: write file error, inode: 3791, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 39)
Feb  5 11:47:50 urknall mfsmount: write file error, inode: 67783, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 33)
Feb  5 11:47:51 urknall kernel: [439705.200793] IPVS: __ip_vs_del_service: enter
Feb  5 11:48:11 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 2)
Feb  5 11:48:11 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunk write error (Disconnected) (try counter: 40)
Feb  5 11:48:23 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 3)
Feb  5 11:48:30 urknall dockerd[1676]: time="2018-02-05T11:48:30.021923199+01:00" level=error msg="792a0ffc1c14e4fac6e81ae9c79de4c67148e60e3581ee9ba032aaa7ee78b14b cleanup: failed to delete container from containerd: no such container"
Feb  5 11:48:32 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunk write error (Disconnected) (try counter: 41)
Feb  5 11:48:34 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 4)
Feb  5 11:48:52 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunk write error (Disconnected) (try counter: 42)
Feb  5 11:48:52 urknall dhclient[1206]: DHCPREQUEST of 192.168.99.3 on em1 to 192.168.99.1 port 67 (xid=0x65d559d6)
Feb  5 11:48:52 urknall dhclient[1206]: DHCPACK of 192.168.99.3 from 192.168.99.1
Feb  5 11:48:54 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 5)
Feb  5 11:49:04 urknall root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 11:49:05 urknall mfsmetalogger[8545]: sessions downloaded 742B/30.022665s (0.000 MB/s)
Feb  5 11:49:22 urknall mfsmetalogger[8545]: sessions downloaded 742B/0.000352s (2.108 MB/s)
Feb  5 11:49:22 urknall dhclient[1206]: bound to 192.168.99.3 -- renewal in 224 seconds.
Feb  5 11:49:22 urknall dockerd[1676]: time="2018-02-05T11:49:22.197179631+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/containers/delete type="*events.ContainerDelete"
Feb  5 11:49:22 urknall mfsmount: write file error, inode: 3791, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 43)
Feb  5 11:49:22 urknall mfsmount: write file error, inode: 67783, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 34)
Feb  5 11:49:48 urknall mfsmount: write file error, inode: 67783, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 35)
Feb  5 11:49:48 urknall mfsmount: write file error, inode: 3791, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 44)
Feb  5 11:49:54 urknall dockerd[1676]: time="2018-02-05T11:49:54.512666154+01:00" level=error msg="fatal task error" error="task: non-zero exit (255)" module=node/agent/taskmanager node.id=4z5kkc15en2q29wt1crkxomc8 service.id=dun7e29phoubjq10posm5f24q task.id=008ucp94pkosrhsv7zasm51hh
Feb  5 11:49:54 urknall kernel: [439828.651560] IPVS: Creating netns size=2192 id=1957
Feb  5 11:49:55 urknall kernel: [439829.169906] br0: renamed from ov-00100d-dpu8s
Feb  5 11:49:55 urknall dockerd[1676]: time="2018-02-05T11:49:55.063937041+01:00" level=error msg="Failed to deserialize netlink ndmsg: invalid argument"
Feb  5 11:49:55 urknall kernel: [439829.218927] vxlan0: renamed from vx-00100d-dpu8s
Feb  5 11:49:55 urknall dockerd[1676]: time="2018-02-05T11:49:55.106285802+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:49:55 urknall kernel: [439829.233267] device vxlan0 entered promiscuous mode
Feb  5 11:49:55 urknall kernel: [439829.236216] br0: port 1(vxlan0) entered forwarding state
Feb  5 11:49:55 urknall kernel: [439829.236236] br0: port 1(vxlan0) entered forwarding state
Feb  5 11:49:55 urknall dockerd[1676]: time="2018-02-05T11:49:55.120645305+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:49:56 urknall kernel: [439830.365261] veth0: renamed from vethe550f6a
Feb  5 11:49:56 urknall dockerd[1676]: time="2018-02-05T11:49:56.258586112+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:49:56 urknall kernel: [439830.385256] device veth0 entered promiscuous mode
Feb  5 11:49:58 urknall kernel: [439832.505545] device vethe467f7c entered promiscuous mode
Feb  5 11:50:00 urknall kernel: [439834.527886] veth1: renamed from veth9f98ea4
Feb  5 11:50:00 urknall dockerd[1676]: time="2018-02-05T11:50:00.413372939+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:50:00 urknall kernel: [439834.541936] device veth1 entered promiscuous mode
Feb  5 11:50:00 urknall mfsmetalogger[8545]: sessions downloaded 742B/0.000438s (1.694 MB/s)
Feb  5 11:50:08 urknall dockerd[1676]: time="2018-02-05T11:50:08.833933025+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:dpu8s5w8nx7vtqyiqai8rflqm leaving:false netPeers:1 entries:12 Queue qLen:0 netMsg/s:0"
Feb  5 11:50:08 urknall dockerd[1676]: time="2018-02-05T11:50:08.835035985+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:xx0m50f4cog3dik8xwqxd4g0q leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:50:08 urknall dockerd[1676]: time="2018-02-05T11:50:08.835216525+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:hixi0z35nd69dxzd621dtjl0e leaving:false netPeers:2 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:50:08 urknall dockerd[1676]: time="2018-02-05T11:50:08.835369189+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:mwursuov7sc87fxhzkmpdtyes leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:50:08 urknall dockerd[1676]: time="2018-02-05T11:50:08.835448181+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:87vvi5rhozzl8qws2ufe7ltl2 leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:50:08 urknall dockerd[1676]: time="2018-02-05T11:50:08.835503566+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:gliwcqzlhjzjulhs3xdbxuqrv leaving:false netPeers:1 entries:6 Queue qLen:0 netMsg/s:0"
Feb  5 11:50:08 urknall dockerd[1676]: time="2018-02-05T11:50:08.835580191+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:2vg30sgcyr0omhvavdcptdrgv leaving:false netPeers:2 entries:46 Queue qLen:0 netMsg/s:0"
Feb  5 11:50:08 urknall dockerd[1676]: time="2018-02-05T11:50:08.835654156+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:mh26wxh3legx189b8ummizzvg leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:50:08 urknall dockerd[1676]: time="2018-02-05T11:50:08.835727869+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:2ntxgh39t5rk8b55tapu1l8gh leaving:false netPeers:2 entries:8 Queue qLen:0 netMsg/s:0"
Feb  5 11:50:10 urknall kernel: [439844.258584] br0: port 1(vxlan0) entered forwarding state
Feb  5 11:50:18 urknall kernel: [439852.280876] device vethc7329c3 entered promiscuous mode
Feb  5 11:50:44 urknall mfsmount: write file error, inode: 67783, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 36)
Feb  5 11:50:44 urknall mfsmount: write file error, inode: 67788, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 1)
Feb  5 11:50:44 urknall mfsmount: write file error, inode: 67787, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 6)
Feb  5 11:50:44 urknall mfsmount: write file error, inode: 3791, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 45)
Feb  5 11:50:47 urknall dockerd[1676]: time="2018-02-05T11:50:47.666860168+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/containers/create type="*events.ContainerCreate"
Feb  5 11:50:50 urknall dockerd[1676]: time="2018-02-05T11:50:50+01:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/792a0ffc1c14e4fac6e81ae9c79de4c67148e60e3581ee9ba032aaa7ee78b14b/shim.sock" debug=false module="containerd/tasks" pid=12285
Feb  5 11:50:50 urknall kernel: [439884.962145] IPVS: Creating netns size=2192 id=1958
Feb  5 11:50:51 urknall kernel: [439885.234381] eth0: renamed from veth4e7c042
Feb  5 11:50:51 urknall kernel: [439885.248521] br0: port 2(veth0) entered forwarding state
Feb  5 11:50:51 urknall kernel: [439885.248543] br0: port 2(veth0) entered forwarding state
Feb  5 11:50:51 urknall kernel: [439885.924293] eth1: renamed from veth6a35d9b
Feb  5 11:50:51 urknall kernel: [439885.948585] docker_gwbridge: port 21(vethe467f7c) entered forwarding state
Feb  5 11:50:51 urknall kernel: [439885.948635] docker_gwbridge: port 21(vethe467f7c) entered forwarding state
Feb  5 11:50:52 urknall kernel: [439886.701906] veth1043: renamed from vethecb6e62
Feb  5 11:50:52 urknall dockerd[1676]: time="2018-02-05T11:50:52.585252199+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:50:52 urknall kernel: [439886.720517] device veth1043 entered promiscuous mode
Feb  5 11:50:53 urknall dockerd[1676]: time="2018-02-05T11:50:53.107459203+01:00" level=warning msg="unknown container" container=792a0ffc1c14e4fac6e81ae9c79de4c67148e60e3581ee9ba032aaa7ee78b14b module=libcontainerd namespace=plugins.moby
Feb  5 11:50:53 urknall dockerd[1676]: time="2018-02-05T11:50:53.296299525+01:00" level=warning msg="unknown container" container=792a0ffc1c14e4fac6e81ae9c79de4c67148e60e3581ee9ba032aaa7ee78b14b module=libcontainerd namespace=plugins.moby
Feb  5 11:50:56 urknall dhclient[1979]: DHCPREQUEST of 192.168.99.3 on em1 to 192.168.99.1 port 67 (xid=0x1f01ed1a)
Feb  5 11:50:56 urknall dhclient[1979]: DHCPACK of 192.168.99.3 from 192.168.99.1
Feb  5 11:51:06 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunk write error (Disconnected) (try counter: 46)
Feb  5 11:51:06 urknall kernel: [439900.265651] br0: port 2(veth0) entered forwarding state
Feb  5 11:51:06 urknall kernel: [439900.969761] docker_gwbridge: port 21(vethe467f7c) entered forwarding state
Feb  5 11:51:16 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 2)
Feb  5 11:51:23 urknall marc: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 11:51:27 urknall mfsmount: read file error, inode: 67936, index: 0, chunk: 61732, version: 11 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)
Feb  5 11:51:29 urknall mfsmount: read file error, inode: 67936, index: 0, chunk: 61732, version: 11 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)
Feb  5 11:51:31 urknall mfsmount: read file error, inode: 67936, index: 0, chunk: 61732, version: 11 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 3)
Feb  5 11:51:33 urknall mfsmount: read file error, inode: 67936, index: 0, chunk: 61732, version: 11 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 4)
Feb  5 11:51:33 urknall dhclient[1979]: bound to 192.168.99.3 -- renewal in 250 seconds.
Feb  5 11:51:33 urknall mfsmetalogger[8545]: sessions downloaded 742B/23.100761s (0.000 MB/s)
Feb  5 11:51:33 urknall mfsmount: write file error, inode: 3791, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 47)
Feb  5 11:51:33 urknall mfsmount: write file error, inode: 67788, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 3)
Feb  5 11:51:34 urknall dockerd[1676]: time="2018-02-05T11:51:34.611059330+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/containers/create type="*events.ContainerCreate"
Feb  5 11:51:34 urknall kernel: [439929.122562] veth1044: renamed from vethae5b2af
Feb  5 11:51:35 urknall dockerd[1676]: time="2018-02-05T11:51:35.001296156+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:51:35 urknall kernel: [439929.141973] device veth1044 entered promiscuous mode
Feb  5 11:51:35 urknall kernel: [439929.147985] device veth997d831 entered promiscuous mode
Feb  5 11:51:35 urknall kernel: [439929.148202] docker_gwbridge: port 40(veth997d831) entered forwarding state
Feb  5 11:51:35 urknall kernel: [439929.148235] docker_gwbridge: port 40(veth997d831) entered forwarding state
Feb  5 11:51:35 urknall kernel: [439929.261455] docker_gwbridge: port 40(veth997d831) entered disabled state
Feb  5 11:51:35 urknall dockerd[1676]: time="2018-02-05T11:51:35+01:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/e9ead84691f5fa24a4fea7d2ca2f603ab729d792f4e1a355d56b24e7cff3e14c/shim.sock" debug=false module="containerd/tasks" pid=12502
Feb  5 11:51:35 urknall kernel: [439929.556275] IPVS: Creating netns size=2192 id=1959
Feb  5 11:51:35 urknall kernel: [439929.988796] eth0: renamed from veth4da4725
Feb  5 11:51:35 urknall kernel: [439930.014132] br0: port 4(veth1043) entered forwarding state
Feb  5 11:51:35 urknall kernel: [439930.014154] br0: port 4(veth1043) entered forwarding state
Feb  5 11:51:36 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 1)
Feb  5 11:51:36 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 2)
Feb  5 11:51:36 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 3)
Feb  5 11:51:36 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 4)
Feb  5 11:51:36 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 5)
Feb  5 11:51:36 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 6)
Feb  5 11:51:36 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 7)
Feb  5 11:51:36 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 8)
Feb  5 11:51:36 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 9)
Feb  5 11:51:37 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 10)
Feb  5 11:51:37 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 11)
Feb  5 11:51:37 urknall kernel: [439932.031842] eth1: renamed from veth4add636
Feb  5 11:51:37 urknall kernel: [439932.042748] docker_gwbridge: port 37(vethc7329c3) entered forwarding state
Feb  5 11:51:37 urknall kernel: [439932.042798] docker_gwbridge: port 37(vethc7329c3) entered forwarding state
Feb  5 11:51:38 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 12)
Feb  5 11:51:39 urknall kernel: [439933.315776] eth2: renamed from veth4d297c1
Feb  5 11:51:39 urknall kernel: [439933.330799] br0: port 3(veth1) entered forwarding state
Feb  5 11:51:39 urknall kernel: [439933.330821] br0: port 3(veth1) entered forwarding state
Feb  5 11:51:40 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 13)
Feb  5 11:51:44 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 14)
Feb  5 11:51:49 urknall dockerd[1676]: time="2018-02-05T11:51:49.455392320+01:00" level=warning msg="deleteServiceInfoFromCluster NetworkDB DeleteEntry failed for e181b2e5fc881c505a15e78c0f923613581f1bddfafd17992bbdcf783e85d8ab 2vg30sgcyr0omhvavdcptdrgv err:cannot delete entry as the entry in table endpoint_table with network id 2vg30sgcyr0omhvavdcptdrgv and key e181b2e5fc881c505a15e78c0f923613581f1bddfafd17992bbdcf783e85d8ab does not exist"
Feb  5 11:51:49 urknall dockerd[1676]: time="2018-02-05T11:51:49.456691795+01:00" level=warning msg="rmServiceBinding deleteServiceInfoFromCluster ldap-mrw-world_ldap e181b2e5fc881c505a15e78c0f923613581f1bddfafd17992bbdcf783e85d8ab aborted c.serviceBindings[skey] !ok"
Feb  5 11:51:49 urknall dockerd[1676]: time="2018-02-05T11:51:49.502960504+01:00" level=error msg="Failed to receive from netlink: interrupted system call "
Feb  5 11:51:49 urknall dockerd[1676]: time="2018-02-05T11:51:49.771461372+01:00" level=warning msg="unknown container" container=e9ead84691f5fa24a4fea7d2ca2f603ab729d792f4e1a355d56b24e7cff3e14c module=libcontainerd namespace=plugins.moby
Feb  5 11:51:49 urknall dockerd[1676]: time="2018-02-05T11:51:49.875547166+01:00" level=warning msg="unknown container" container=e9ead84691f5fa24a4fea7d2ca2f603ab729d792f4e1a355d56b24e7cff3e14c module=libcontainerd namespace=plugins.moby
Feb  5 11:51:50 urknall kernel: [439945.071333] br0: port 4(veth1043) entered forwarding state
Feb  5 11:51:52 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 15)
Feb  5 11:51:52 urknall kernel: [439947.055535] docker_gwbridge: port 37(vethc7329c3) entered forwarding state
Feb  5 11:51:54 urknall kernel: [439948.335691] br0: port 3(veth1) entered forwarding state
Feb  5 11:52:00 urknall dockerd[1676]: time="2018-02-05T11:52:00.321543525+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:52:00 urknall dockerd[1676]: time="2018-02-05T11:52:00.322494535+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:52:00 urknall dockerd[1676]: time="2018-02-05T11:52:00.323020592+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:52:00 urknall dockerd[1676]: time="2018-02-05T11:52:00.514016079+01:00" level=warning msg="unknown container" container=e9ead84691f5fa24a4fea7d2ca2f603ab729d792f4e1a355d56b24e7cff3e14c module=libcontainerd namespace=plugins.moby
Feb  5 11:52:02 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 16)
Feb  5 11:52:07 urknall dockerd[1676]: time="2018-02-05T11:52:07+01:00" level=info msg="shim reaped" id=e9ead84691f5fa24a4fea7d2ca2f603ab729d792f4e1a355d56b24e7cff3e14c module="containerd/tasks"
Feb  5 11:52:07 urknall dockerd[1676]: time="2018-02-05T11:52:07.185962068+01:00" level=info msg="ignoring event" module=libcontainerd namespace=plugins.moby topic=/tasks/delete type="*events.TaskDelete"
Feb  5 11:52:07 urknall dockerd[1676]: time="2018-02-05T11:52:07.186140401+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Feb  5 11:52:07 urknall mfsmetalogger[8545]: sessions downloaded 742B/0.000550s (1.349 MB/s)
Feb  5 11:52:12 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 17)
Feb  5 11:52:13 urknall kernel: [439967.363943] docker_gwbridge: port 40(veth997d831) entered disabled state
Feb  5 11:52:13 urknall kernel: [439967.381794] device veth997d831 left promiscuous mode
Feb  5 11:52:13 urknall kernel: [439967.381806] docker_gwbridge: port 40(veth997d831) entered disabled state
Feb  5 11:52:13 urknall kernel: [439967.400329] br0: port 4(veth1043) entered disabled state
Feb  5 11:52:13 urknall kernel: [439967.400469] veth4da4725: renamed from eth0
Feb  5 11:52:13 urknall kernel: [439967.680379] IPVS: __ip_vs_del_service: enter
Feb  5 11:52:14 urknall kernel: [439969.006841] br0: port 4(veth1043) entered disabled state
Feb  5 11:52:14 urknall kernel: [439969.013139] device veth1043 left promiscuous mode
Feb  5 11:52:14 urknall kernel: [439969.013172] br0: port 4(veth1043) entered disabled state
Feb  5 11:52:14 urknall kernel: [439969.039110] br0: port 3(veth1) entered disabled state
Feb  5 11:52:14 urknall kernel: [439969.039291] veth4d297c1: renamed from eth2
Feb  5 11:52:14 urknall kernel: [439969.070944] IPVS: __ip_vs_del_service: enter
Feb  5 11:52:15 urknall kernel: [439969.472410] br0: port 5(veth1044) entered disabled state
Feb  5 11:52:15 urknall kernel: [439969.483568] device veth1044 left promiscuous mode
Feb  5 11:52:15 urknall kernel: [439969.483578] br0: port 5(veth1044) entered disabled state
Feb  5 11:52:15 urknall dockerd[1676]: time="2018-02-05T11:52:15.911033925+01:00" level=error msg="d6e570d2de914967b7ec3c3b7d86d5235719df625ca0b2e320f2510e9125bdd6 cleanup: failed to delete container from containerd: no such container"
Feb  5 11:52:16 urknall kernel: [439970.760350] docker_gwbridge: port 37(vethc7329c3) entered disabled state
Feb  5 11:52:16 urknall kernel: [439970.760572] veth4add636: renamed from eth1
Feb  5 11:52:17 urknall kernel: [439971.955689] docker_gwbridge: port 37(vethc7329c3) entered disabled state
Feb  5 11:52:17 urknall kernel: [439971.961985] device vethc7329c3 left promiscuous mode
Feb  5 11:52:17 urknall kernel: [439971.961995] docker_gwbridge: port 37(vethc7329c3) entered disabled state
Feb  5 11:52:20 urknall kernel: [439974.339791] veth1045: renamed from veth94c16b8
Feb  5 11:52:20 urknall dockerd[1676]: time="2018-02-05T11:52:20.209147926+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:52:20 urknall kernel: [439974.355563] device veth1045 entered promiscuous mode
Feb  5 11:52:20 urknall kernel: [439974.894622] device veth15fcb08 entered promiscuous mode
Feb  5 11:52:22 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 18)
Feb  5 11:52:23 urknall kernel: [439977.197639] br0: port 3(veth1) entered disabled state
Feb  5 11:52:23 urknall kernel: [439977.208461] device veth1 left promiscuous mode
Feb  5 11:52:23 urknall kernel: [439977.208472] br0: port 3(veth1) entered disabled state
Feb  5 11:52:23 urknall kernel: [439977.295579] IPVS: __ip_vs_del_service: enter
Feb  5 11:52:23 urknall kernel: [439977.295591] IPVS: __ip_vs_del_service: enter
Feb  5 11:52:25 urknall dockerd[1676]: time="2018-02-05T11:52:25.221572872+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/containers/delete type="*events.ContainerDelete"
Feb  5 11:52:31 urknall dockerd[1676]: time="2018-02-05T11:52:31.827055756+01:00" level=error msg="fatal task error" error="task: non-zero exit (1)" module=node/agent/taskmanager node.id=4z5kkc15en2q29wt1crkxomc8 service.id=gq2beoh3n3f0y9ix02ri5kiyl task.id=w2pc86ezvwi62zyvli54xg5g0
Feb  5 11:52:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 19)
Feb  5 11:52:34 urknall kernel: [439988.202974] IPVS: Creating netns size=2192 id=1960
Feb  5 11:52:34 urknall dockerd[1676]: time="2018-02-05T11:52:34.189823883+01:00" level=error msg="Failed to deserialize netlink ndmsg: invalid argument"
Feb  5 11:52:34 urknall kernel: [439988.338652] br0: renamed from ov-00100b-gliwc
Feb  5 11:52:34 urknall kernel: [439988.390437] vxlan0: renamed from vx-00100b-gliwc
Feb  5 11:52:34 urknall kernel: [439988.405652] device vxlan0 entered promiscuous mode
Feb  5 11:52:34 urknall dockerd[1676]: time="2018-02-05T11:52:34.265524840+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:52:34 urknall dockerd[1676]: time="2018-02-05T11:52:34.268166234+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:52:34 urknall kernel: [439988.416995] br0: port 1(vxlan0) entered forwarding state
Feb  5 11:52:34 urknall kernel: [439988.417019] br0: port 1(vxlan0) entered forwarding state
Feb  5 11:52:35 urknall kernel: [439989.974775] veth0: renamed from veth3fef976
Feb  5 11:52:35 urknall dockerd[1676]: time="2018-02-05T11:52:35.841191421+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:52:35 urknall kernel: [439989.989610] device veth0 entered promiscuous mode
Feb  5 11:52:42 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 20)
Feb  5 11:52:49 urknall kernel: [440003.446684] br0: port 1(vxlan0) entered forwarding state
Feb  5 11:52:49 urknall dockerd[1676]: time="2018-02-05T11:52:49.437228331+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/containers/create type="*events.ContainerCreate"
Feb  5 11:52:50 urknall dockerd[1676]: time="2018-02-05T11:52:50+01:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/d6e570d2de914967b7ec3c3b7d86d5235719df625ca0b2e320f2510e9125bdd6/shim.sock" debug=false module="containerd/tasks" pid=12914
Feb  5 11:52:50 urknall kernel: [440004.988381] IPVS: Creating netns size=2192 id=1961
Feb  5 11:52:51 urknall kernel: [440005.757851] eth0: renamed from veth2f79259
Feb  5 11:52:51 urknall kernel: [440005.767899] br0: port 4(veth1045) entered forwarding state
Feb  5 11:52:51 urknall kernel: [440005.767922] br0: port 4(veth1045) entered forwarding state
Feb  5 11:52:52 urknall kernel: [440006.165240] eth1: renamed from veth4ae28bc
Feb  5 11:52:52 urknall kernel: [440006.183817] docker_gwbridge: port 37(veth15fcb08) entered forwarding state
Feb  5 11:52:52 urknall kernel: [440006.183871] docker_gwbridge: port 37(veth15fcb08) entered forwarding state
Feb  5 11:52:52 urknall kernel: [440006.609355] eth2: renamed from veth6aa3f55
Feb  5 11:52:52 urknall kernel: [440006.627949] br0: port 2(veth0) entered forwarding state
Feb  5 11:52:52 urknall kernel: [440006.627974] br0: port 2(veth0) entered forwarding state
Feb  5 11:52:52 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 21)
Feb  5 11:52:54 urknall dockerd[1676]: time="2018-02-05T11:52:54.781558470+01:00" level=warning msg="unknown container" container=d6e570d2de914967b7ec3c3b7d86d5235719df625ca0b2e320f2510e9125bdd6 module=libcontainerd namespace=plugins.moby
Feb  5 11:52:55 urknall dockerd[1676]: time="2018-02-05T11:52:55.157119449+01:00" level=warning msg="unknown container" container=d6e570d2de914967b7ec3c3b7d86d5235719df625ca0b2e320f2510e9125bdd6 module=libcontainerd namespace=plugins.moby
Feb  5 11:53:02 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 22)
Feb  5 11:53:05 urknall dhclient[1206]: DHCPREQUEST of 192.168.99.3 on em1 to 192.168.99.1 port 67 (xid=0x65d559d6)
Feb  5 11:53:05 urknall dhclient[1206]: DHCPACK of 192.168.99.3 from 192.168.99.1
Feb  5 11:53:06 urknall kernel: [440020.793028] br0: port 4(veth1045) entered forwarding state
Feb  5 11:53:07 urknall kernel: [440021.244973] docker_gwbridge: port 37(veth15fcb08) entered forwarding state
Feb  5 11:53:07 urknall kernel: [440021.657050] br0: port 2(veth0) entered forwarding state
Feb  5 11:53:12 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunk write error (Disconnected) (try counter: 48)
Feb  5 11:53:12 urknall mfsmount: write file error, inode: 413364, index: 0 - Chunk write error (Disconnected) (try counter: 1)
Feb  5 11:53:12 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 23)
Feb  5 11:53:22 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 1)
Feb  5 11:53:22 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 24)
Feb  5 11:53:24 urknall mfsmount: write file error, inode: 413364, index: 0 - Chunk write error (Disconnected) (try counter: 2)
Feb  5 11:53:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 25)
Feb  5 11:53:33 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunk write error (Disconnected) (try counter: 49)
Feb  5 11:53:34 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 2)
Feb  5 11:53:35 urknall root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 11:53:35 urknall mfsmetalogger[8545]: sessions downloaded 742B/0.000619s (1.199 MB/s)
Feb  5 11:53:36 urknall mfsmount: write file error, inode: 413364, index: 0 - Chunk write error (Disconnected) (try counter: 3)
Feb  5 11:53:42 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 26)
Feb  5 11:53:47 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 3)
Feb  5 11:53:50 urknall dhclient[1206]: bound to 192.168.99.3 -- renewal in 267 seconds.
Feb  5 11:53:50 urknall mfsmount: write file error, inode: 413364, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 4)
Feb  5 11:53:52 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 27)
Feb  5 11:53:54 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunk write error (Disconnected) (try counter: 50)
Feb  5 11:54:01 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 4)
Feb  5 11:54:01 urknall mfschunkserver[8794]: replication error: Chunkserver communication timed out: 192.168.99.7:9422
Feb  5 11:54:02 urknall mfsmount: write file error, inode: 413364, index: 0 - Chunk write error (Disconnected) (try counter: 5)
Feb  5 11:54:02 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 28)
Feb  5 11:54:03 urknall mfsmetalogger[8545]: sessions downloaded 742B/1.775205s (0.000 MB/s)
Feb  5 11:54:07 urknall mfsmount: read file error, inode: 413364, index: 0, chunk: 622947, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)
Feb  5 11:54:09 urknall mfsmount: read file error, inode: 413364, index: 0, chunk: 622947, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)
Feb  5 11:54:11 urknall mfsmount: read file error, inode: 413364, index: 0, chunk: 622947, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 3)
Feb  5 11:54:12 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 29)
Feb  5 11:54:13 urknall mfsmount: read file error, inode: 413364, index: 0, chunk: 622947, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 4)
Feb  5 11:54:22 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 30)
Feb  5 11:54:27 urknall kernel: [440101.242413] veth1046: renamed from veth53fe535
Feb  5 11:54:27 urknall kernel: [440101.260214] device veth1046 entered promiscuous mode
Feb  5 11:54:27 urknall dockerd[1676]: time="2018-02-05T11:54:27.100613684+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:54:27 urknall kernel: [440101.288595] device vethf4cc4fc entered promiscuous mode
Feb  5 11:54:27 urknall kernel: [440101.288876] docker_gwbridge: port 40(vethf4cc4fc) entered forwarding state
Feb  5 11:54:27 urknall kernel: [440101.288904] docker_gwbridge: port 40(vethf4cc4fc) entered forwarding state
Feb  5 11:54:27 urknall kernel: [440101.507321] docker_gwbridge: port 40(vethf4cc4fc) entered disabled state
Feb  5 11:54:28 urknall dockerd[1676]: time="2018-02-05T11:54:28.496588535+01:00" level=warning msg="unknown container" container=d6e570d2de914967b7ec3c3b7d86d5235719df625ca0b2e320f2510e9125bdd6 module=libcontainerd namespace=plugins.moby
Feb  5 11:54:28 urknall dockerd[1676]: time="2018-02-05T11:54:28+01:00" level=info msg="shim reaped" id=d6e570d2de914967b7ec3c3b7d86d5235719df625ca0b2e320f2510e9125bdd6 module="containerd/tasks"
Feb  5 11:54:28 urknall dockerd[1676]: time="2018-02-05T11:54:28.597322211+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Feb  5 11:54:28 urknall dockerd[1676]: time="2018-02-05T11:54:28.597143313+01:00" level=info msg="ignoring event" module=libcontainerd namespace=plugins.moby topic=/tasks/delete type="*events.TaskDelete"
Feb  5 11:54:30 urknall kernel: [440104.337282] br0: port 4(veth1045) entered disabled state
Feb  5 11:54:30 urknall kernel: [440104.337416] veth2f79259: renamed from eth0
Feb  5 11:54:30 urknall kernel: [440104.373433] IPVS: __ip_vs_del_service: enter
Feb  5 11:54:30 urknall kernel: [440105.124455] br0: port 4(veth1045) entered disabled state
Feb  5 11:54:30 urknall kernel: [440105.133302] device veth1045 left promiscuous mode
Feb  5 11:54:30 urknall kernel: [440105.133312] br0: port 4(veth1045) entered disabled state
Feb  5 11:54:31 urknall kernel: [440105.164525] br0: port 2(veth0) entered disabled state
Feb  5 11:54:31 urknall kernel: [440105.164586] br0: port 1(vxlan0) entered disabled state
Feb  5 11:54:31 urknall kernel: [440105.170421] ov-00100b-gliwc: renamed from br0
Feb  5 11:54:31 urknall kernel: [440105.188133] device veth0 left promiscuous mode
Feb  5 11:54:31 urknall kernel: [440105.188179] ov-00100b-gliwc: port 2(veth0) entered disabled state
Feb  5 11:54:31 urknall kernel: [440105.200077] device vxlan0 left promiscuous mode
Feb  5 11:54:31 urknall kernel: [440105.200115] ov-00100b-gliwc: port 1(vxlan0) entered disabled state
Feb  5 11:54:31 urknall kernel: [440105.253747] vx-00100b-gliwc: renamed from vxlan0
Feb  5 11:54:31 urknall kernel: [440105.295739] veth3fef976: renamed from veth0
Feb  5 11:54:31 urknall kernel: [440105.364172] veth6aa3f55: renamed from eth2
Feb  5 11:54:31 urknall kernel: [440106.023498] veth2: renamed from veth0f94f9f
Feb  5 11:54:31 urknall dockerd[1676]: time="2018-02-05T11:54:31.877422051+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:54:31 urknall kernel: [440106.040510] device veth2 entered promiscuous mode
Feb  5 11:54:31 urknall kernel: [440106.040830] br0: port 3(veth2) entered forwarding state
Feb  5 11:54:31 urknall kernel: [440106.040844] br0: port 3(veth2) entered forwarding state
Feb  5 11:54:32 urknall kernel: [440106.335845] br0: port 3(veth2) entered disabled state
Feb  5 11:54:32 urknall kernel: [440106.800846] docker_gwbridge: port 37(veth15fcb08) entered disabled state
Feb  5 11:54:32 urknall kernel: [440106.801099] veth4ae28bc: renamed from eth1
Feb  5 11:54:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 31)
Feb  5 11:54:42 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 32)
Feb  5 11:54:52 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 33)
Feb  5 11:55:02 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 34)
Feb  5 11:55:09 urknall dockerd[1676]: time="2018-02-05T11:55:09.035431938+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:2vg30sgcyr0omhvavdcptdrgv leaving:false netPeers:2 entries:48 Queue qLen:0 netMsg/s:0"
Feb  5 11:55:09 urknall dockerd[1676]: time="2018-02-05T11:55:09.035652147+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:mh26wxh3legx189b8ummizzvg leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:55:09 urknall dockerd[1676]: time="2018-02-05T11:55:09.035731265+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:2ntxgh39t5rk8b55tapu1l8gh leaving:false netPeers:2 entries:8 Queue qLen:0 netMsg/s:0"
Feb  5 11:55:09 urknall dockerd[1676]: time="2018-02-05T11:55:09.035996584+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:dpu8s5w8nx7vtqyiqai8rflqm leaving:false netPeers:1 entries:13 Queue qLen:0 netMsg/s:0"
Feb  5 11:55:09 urknall dockerd[1676]: time="2018-02-05T11:55:09.036052077+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:gliwcqzlhjzjulhs3xdbxuqrv leaving:false netPeers:1 entries:6 Queue qLen:0 netMsg/s:0"
Feb  5 11:55:09 urknall dockerd[1676]: time="2018-02-05T11:55:09.036130815+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:xx0m50f4cog3dik8xwqxd4g0q leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:55:09 urknall dockerd[1676]: time="2018-02-05T11:55:09.036204016+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:hixi0z35nd69dxzd621dtjl0e leaving:false netPeers:2 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 11:55:09 urknall dockerd[1676]: time="2018-02-05T11:55:09.036276625+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:mwursuov7sc87fxhzkmpdtyes leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:55:09 urknall dockerd[1676]: time="2018-02-05T11:55:09.036407726+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:87vvi5rhozzl8qws2ufe7ltl2 leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 11:55:12 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 35)
Feb  5 11:55:22 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 36)
Feb  5 11:55:25 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 1)
Feb  5 11:55:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 37)
Feb  5 11:55:42 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 38)
Feb  5 11:55:43 urknall dhclient[1979]: DHCPREQUEST of 192.168.99.3 on em1 to 192.168.99.1 port 67 (xid=0x1f01ed1a)
Feb  5 11:55:43 urknall dhclient[1979]: DHCPACK of 192.168.99.3 from 192.168.99.1
Feb  5 11:55:46 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 2)
Feb  5 11:55:52 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 39)
Feb  5 11:55:58 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 3)
Feb  5 11:56:02 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 40)
Feb  5 11:56:12 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 41)
Feb  5 11:56:17 urknall marc: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 11:56:17 urknall dhclient[1979]: bound to 192.168.99.3 -- renewal in 238 seconds.
Feb  5 11:56:17 urknall mfsmetalogger[8545]: sessions downloaded 742B/55.797584s (0.000 MB/s)
Feb  5 11:56:17 urknall mfsmount: write file error, inode: 67788, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 4)
Feb  5 11:56:22 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 42)
Feb  5 11:56:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 43)
Feb  5 11:56:36 urknall mfsmetalogger[8545]: sessions downloaded 742B/0.001396s (0.532 MB/s)
Feb  5 11:56:38 urknall mfsmount: write file error, inode: 196764, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 22)
Feb  5 11:56:38 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 5)
Feb  5 11:56:42 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 44)
Feb  5 11:56:50 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 6)
Feb  5 11:56:52 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 45)
Feb  5 11:56:53 urknall mfsmount: write file error, inode: 413353, index: 0 - Chunk write error (Disconnected) (try counter: 1)
Feb  5 11:57:02 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 7)
Feb  5 11:57:02 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 46)
Feb  5 11:57:05 urknall mfsmount: write file error, inode: 196764, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 23)
Feb  5 11:57:05 urknall mfsmount: write file error, inode: 413353, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 2)
Feb  5 11:57:12 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 47)
Feb  5 11:57:16 urknall mfsmount: write file error, inode: 67783, index: 0 - Chunk write error (Disconnected) (try counter: 37)
Feb  5 11:57:17 urknall mfsmount: write file error, inode: 413353, index: 0 - Chunk write error (Disconnected) (try counter: 3)
Feb  5 11:57:22 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 48)
Feb  5 11:57:24 urknall mfsmetalogger[8545]: sessions downloaded 742B/0.000974s (0.762 MB/s)
Feb  5 11:57:24 urknall kernel: [440278.434954] docker_gwbridge: port 37(veth15fcb08) entered disabled state
Feb  5 11:57:24 urknall kernel: [440278.441134] device veth15fcb08 left promiscuous mode
Feb  5 11:57:24 urknall kernel: [440278.441143] docker_gwbridge: port 37(veth15fcb08) entered disabled state
Feb  5 11:57:24 urknall dockerd[1676]: time="2018-02-05T11:57:24.258181805+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/containers/create type="*events.ContainerCreate"
Feb  5 11:57:29 urknall mfsmount: write file error, inode: 413353, index: 0 - Chunk write error (Disconnected) (try counter: 4)
Feb  5 11:57:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 49)
Feb  5 11:57:35 urknall mfsmount: write file error, inode: 196764, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 24)
Feb  5 11:57:37 urknall mfsmount: write file error, inode: 67783, index: 0 - Chunk write error (Disconnected) (try counter: 38)
Feb  5 11:57:40 urknall dockerd[1676]: time="2018-02-05T11:57:40+01:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/4e9dc2919f07e3b5c419533ffdd0831950a4c1139915b497036137d913c48601/shim.sock" debug=false module="containerd/tasks" pid=13329
Feb  5 11:57:40 urknall kernel: [440294.839686] IPVS: Creating netns size=2192 id=1962
Feb  5 11:57:40 urknall kernel: [440295.132327] eth0: renamed from veth29c0558
Feb  5 11:57:40 urknall kernel: [440295.171823] br0: port 5(veth1046) entered forwarding state
Feb  5 11:57:40 urknall kernel: [440295.171844] br0: port 5(veth1046) entered forwarding state
Feb  5 11:57:42 urknall kernel: [440296.450930] eth1: renamed from vethdc6891f
Feb  5 11:57:42 urknall kernel: [440296.474116] docker_gwbridge: port 40(vethf4cc4fc) entered forwarding state
Feb  5 11:57:42 urknall kernel: [440296.474176] docker_gwbridge: port 40(vethf4cc4fc) entered forwarding state
Feb  5 11:57:42 urknall kernel: [440296.484187] IPVS: __ip_vs_del_service: enter
Feb  5 11:57:42 urknall dockerd[1676]: time="2018-02-05T11:57:42.642311406+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/containers/delete type="*events.ContainerDelete"
Feb  5 11:57:42 urknall kernel: [440296.846810] eth2: renamed from vethc48ad91
Feb  5 11:57:42 urknall kernel: [440296.861075] br0: port 3(veth2) entered forwarding state
Feb  5 11:57:42 urknall kernel: [440296.861097] br0: port 3(veth2) entered forwarding state
Feb  5 11:57:42 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 50)
Feb  5 11:57:52 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 51)
Feb  5 11:57:56 urknall kernel: [440310.209720] br0: port 5(veth1046) entered forwarding state
Feb  5 11:57:57 urknall kernel: [440311.518045] docker_gwbridge: port 40(vethf4cc4fc) entered forwarding state
Feb  5 11:57:57 urknall kernel: [440311.901930] br0: port 3(veth2) entered forwarding state
Feb  5 11:58:02 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 52)
Feb  5 11:58:03 urknall dockerd[1676]: time="2018-02-05T11:58:03.383062223+01:00" level=error msg="fatal task error" error="task: non-zero exit (255)" module=node/agent/taskmanager node.id=4z5kkc15en2q29wt1crkxomc8 service.id=dun7e29phoubjq10posm5f24q task.id=ln2d322kranyv8xu8xm9138ei
Feb  5 11:58:03 urknall mfsmount: write file error, inode: 67788, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 8)
Feb  5 11:58:03 urknall mfsmount: write file error, inode: 3791, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 51)
Feb  5 11:58:03 urknall mfsmetalogger[8545]: sessions downloaded 742B/0.000733s (1.012 MB/s)
Feb  5 11:58:03 urknall dockerd[1676]: time="2018-02-05T11:58:03.595713851+01:00" level=warning msg="unknown container" container=4e9dc2919f07e3b5c419533ffdd0831950a4c1139915b497036137d913c48601 module=libcontainerd namespace=plugins.moby
Feb  5 11:58:03 urknall dockerd[1676]: time="2018-02-05T11:58:03.800291983+01:00" level=warning msg="unknown container" container=4e9dc2919f07e3b5c419533ffdd0831950a4c1139915b497036137d913c48601 module=libcontainerd namespace=plugins.moby
Feb  5 11:58:12 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 53)
Feb  5 11:58:15 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 9)
Feb  5 11:58:16 urknall dhclient[1206]: DHCPREQUEST of 192.168.99.3 on em1 to 192.168.99.1 port 67 (xid=0x65d559d6)
Feb  5 11:58:16 urknall dhclient[1206]: DHCPACK of 192.168.99.3 from 192.168.99.1
Feb  5 11:58:16 urknall root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 11:58:22 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 54)
Feb  5 11:58:23 urknall mfsmount: read file error, inode: 27960, index: 0, chunk: 542619, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)
Feb  5 11:58:24 urknall mfsmount: write file error, inode: 3791, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 52)
Feb  5 11:58:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 55)
Feb  5 11:58:33 urknall dockerd[1676]: time="2018-02-05T11:58:33.426478867+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:58:33 urknall dockerd[1676]: time="2018-02-05T11:58:33.427930972+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:58:33 urknall dockerd[1676]: time="2018-02-05T11:58:33.428072352+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 11:58:33 urknall dockerd[1676]: time="2018-02-05T11:58:33.701412653+01:00" level=warning msg="unknown container" container=4e9dc2919f07e3b5c419533ffdd0831950a4c1139915b497036137d913c48601 module=libcontainerd namespace=plugins.moby
Feb  5 11:58:42 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 56)
Feb  5 11:58:49 urknall mfsmount: write file error, inode: 67783, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 39)
Feb  5 11:58:49 urknall mfsmount: write file error, inode: 196764, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 25)
Feb  5 11:58:52 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 57)
Feb  5 11:58:54 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 53)
Feb  5 11:59:02 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 58)
Feb  5 11:59:03 urknall dhclient[1206]: bound to 192.168.99.3 -- renewal in 280 seconds.
Feb  5 11:59:04 urknall mfsmetalogger[8545]: sessions downloaded 742B/0.000484s (1.533 MB/s)
Feb  5 11:59:04 urknall dockerd[1676]: time="2018-02-05T11:59:04+01:00" level=info msg="shim reaped" id=4e9dc2919f07e3b5c419533ffdd0831950a4c1139915b497036137d913c48601 module="containerd/tasks"
Feb  5 11:59:04 urknall dockerd[1676]: time="2018-02-05T11:59:04.435883342+01:00" level=info msg="ignoring event" module=libcontainerd namespace=plugins.moby topic=/tasks/delete type="*events.TaskDelete"
Feb  5 11:59:04 urknall dockerd[1676]: time="2018-02-05T11:59:04.436053942+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Feb  5 11:59:04 urknall kernel: [440378.645159] br0: port 5(veth1046) entered disabled state
Feb  5 11:59:04 urknall kernel: [440378.662653] veth29c0558: renamed from eth0
Feb  5 11:59:06 urknall mfsmount: read file error, inode: 370135, index: 0, chunk: 616311, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)
Feb  5 11:59:06 urknall kernel: [440380.867797] IPVS: __ip_vs_del_service: enter
Feb  5 11:59:08 urknall mfsmount: read file error, inode: 370135, index: 0, chunk: 616311, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)
Feb  5 11:59:10 urknall mfsmount: write file error, inode: 196764, index: 0 - Chunk write error (Disconnected) (try counter: 26)
Feb  5 11:59:10 urknall mfsmount: write file error, inode: 67783, index: 0 - Chunk write error (Disconnected) (try counter: 40)
Feb  5 11:59:10 urknall mfsmount: read file error, inode: 370135, index: 0, chunk: 616311, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 3)
Feb  5 11:59:12 urknall mfsmount: read file error, inode: 370135, index: 0, chunk: 616311, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 4)
Feb  5 11:59:12 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 59)
Feb  5 11:59:14 urknall mfsmount: read file error, inode: 370135, index: 0, chunk: 616311, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 5)
Feb  5 11:59:16 urknall mfsmount: read file error, inode: 370135, index: 0, chunk: 616311, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 6)
Feb  5 11:59:18 urknall mfsmount: read file error, inode: 370135, index: 0, chunk: 616311, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 7)
Feb  5 11:59:20 urknall mfsmount: read file error, inode: 370135, index: 0, chunk: 616311, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 8)
Feb  5 11:59:22 urknall mfsmount: read file error, inode: 370135, index: 0, chunk: 616311, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 9)
Feb  5 11:59:22 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 60)
Feb  5 11:59:24 urknall mfsmount: read file error, inode: 370135, index: 0, chunk: 616311, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 10)
Feb  5 11:59:24 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 54)
Feb  5 11:59:26 urknall mfsmount: read file error, inode: 370135, index: 0, chunk: 616311, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 11)
Feb  5 11:59:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 61)
Feb  5 11:59:34 urknall kernel: [440408.844171] br0: port 5(veth1046) entered disabled state
Feb  5 11:59:34 urknall kernel: [440408.852282] device veth1046 left promiscuous mode
Feb  5 11:59:34 urknall kernel: [440408.852291] br0: port 5(veth1046) entered disabled state
Feb  5 11:59:34 urknall kernel: [440408.887228] br0: port 3(veth2) entered disabled state
Feb  5 11:59:34 urknall kernel: [440408.887362] vethc48ad91: renamed from eth2
Feb  5 11:59:34 urknall kernel: [440408.920116] IPVS: __ip_vs_del_service: enter
Feb  5 11:59:35 urknall kernel: [440409.307652] docker_gwbridge: port 40(vethf4cc4fc) entered disabled state
Feb  5 11:59:35 urknall kernel: [440409.307934] vethdc6891f: renamed from eth1
Feb  5 11:59:35 urknall kernel: [440410.053143] docker_gwbridge: port 40(vethf4cc4fc) entered disabled state
Feb  5 11:59:35 urknall kernel: [440410.073836] device vethf4cc4fc left promiscuous mode
Feb  5 11:59:35 urknall kernel: [440410.073845] docker_gwbridge: port 40(vethf4cc4fc) entered disabled state
Feb  5 11:59:37 urknall kernel: [440411.519059] br0: port 3(veth2) entered disabled state
Feb  5 11:59:37 urknall kernel: [440411.541079] device veth2 left promiscuous mode
Feb  5 11:59:37 urknall kernel: [440411.541090] br0: port 3(veth2) entered disabled state
Feb  5 11:59:37 urknall kernel: [440411.646714] IPVS: __ip_vs_del_service: enter
Feb  5 11:59:37 urknall kernel: [440411.646726] IPVS: __ip_vs_del_service: enter
Feb  5 11:59:43 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 62)
Feb  5 11:59:49 urknall mfsmount: write file error, inode: 3791, index: 0 - Timeout after 10041 ms (Timeout) (try counter: 55)
Feb  5 11:59:51 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 10)
Feb  5 11:59:51 urknall mfsmount: write file error, inode: 67783, index: 0 - Chunk write error (Disconnected) (try counter: 41)
Feb  5 11:59:52 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 63)
Feb  5 11:59:55 urknall mfsmount: write file error, inode: 67786, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 1)
Feb  5 11:59:55 urknall dockerd[1676]: time="2018-02-05T11:59:55.351173147+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/containers/delete type="*events.ContainerDelete"
Feb  5 12:00:02 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 64)
Feb  5 12:00:09 urknall dockerd[1676]: time="2018-02-05T12:00:09.234567161+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:dpu8s5w8nx7vtqyiqai8rflqm leaving:false netPeers:1 entries:12 Queue qLen:0 netMsg/s:0"
Feb  5 12:00:09 urknall dockerd[1676]: time="2018-02-05T12:00:09.236714307+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:2vg30sgcyr0omhvavdcptdrgv leaving:false netPeers:2 entries:47 Queue qLen:0 netMsg/s:0"
Feb  5 12:00:09 urknall dockerd[1676]: time="2018-02-05T12:00:09.238627818+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:mh26wxh3legx189b8ummizzvg leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:00:09 urknall dockerd[1676]: time="2018-02-05T12:00:09.240044566+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:mwursuov7sc87fxhzkmpdtyes leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:00:09 urknall dockerd[1676]: time="2018-02-05T12:00:09.241408979+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:xx0m50f4cog3dik8xwqxd4g0q leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:00:09 urknall dockerd[1676]: time="2018-02-05T12:00:09.242163874+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:hixi0z35nd69dxzd621dtjl0e leaving:false netPeers:2 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 12:00:09 urknall dockerd[1676]: time="2018-02-05T12:00:09.242584012+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:87vvi5rhozzl8qws2ufe7ltl2 leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:00:09 urknall dockerd[1676]: time="2018-02-05T12:00:09.243010542+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:gliwcqzlhjzjulhs3xdbxuqrv leaving:false netPeers:1 entries:6 Queue qLen:0 netMsg/s:0"
Feb  5 12:00:09 urknall dockerd[1676]: time="2018-02-05T12:00:09.243935699+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:2ntxgh39t5rk8b55tapu1l8gh leaving:false netPeers:2 entries:8 Queue qLen:0 netMsg/s:0"
Feb  5 12:00:12 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 65)
Feb  5 12:00:13 urknall mfsmetalogger[8545]: sessions downloaded 742B/0.001120s (0.662 MB/s)
Feb  5 12:00:13 urknall dockerd[1676]: time="2018-02-05T12:00:13.619278525+01:00" level=error msg="fatal task error" error="task: non-zero exit (1)" module=node/agent/taskmanager node.id=4z5kkc15en2q29wt1crkxomc8 service.id=gq2beoh3n3f0y9ix02ri5kiyl task.id=zdiqqm853yjfol0skyhfe2sza
Feb  5 12:00:15 urknall dhclient[1979]: DHCPREQUEST of 192.168.99.3 on em1 to 192.168.99.1 port 67 (xid=0x1f01ed1a)
Feb  5 12:00:15 urknall dhclient[1979]: DHCPACK of 192.168.99.3 from 192.168.99.1
Feb  5 12:00:22 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 66)
Feb  5 12:00:23 urknall mfsmount: write file error, inode: 3791, index: 0 - Timeout after 10025 ms (Timeout) (try counter: 56)
Feb  5 12:00:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 67)
Feb  5 12:00:33 urknall mfsmount: write file error, inode: 66435, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 1)
Feb  5 12:00:33 urknall mfsmount: write file error, inode: 196764, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 27)
Feb  5 12:00:42 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 68)
Feb  5 12:00:52 urknall marc: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 12:00:52 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 69)
Feb  5 12:00:56 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 57)
Feb  5 12:00:59 urknall mfsmount: write file error, inode: 67783, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 42)
Feb  5 12:01:02 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 70)
Feb  5 12:01:03 urknall mfsmount: write file error, inode: 196764, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 28)
Feb  5 12:01:12 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 71)
Feb  5 12:01:15 urknall dhclient[1979]: bound to 192.168.99.3 -- renewal in 190 seconds.
Feb  5 12:01:21 urknall mfsmetalogger[8545]: sessions downloaded 742B/17.311260s (0.000 MB/s)
Feb  5 12:01:22 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 72)
Feb  5 12:01:30 urknall kernel: [440524.298418] veth1047: renamed from vethb05a062
Feb  5 12:01:30 urknall dockerd[1676]: time="2018-02-05T12:01:30.099781278+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 12:01:30 urknall kernel: [440524.313521] device veth1047 entered promiscuous mode
Feb  5 12:01:30 urknall kernel: [440524.328893] device veth886ee36 entered promiscuous mode
Feb  5 12:01:30 urknall kernel: [440524.329592] docker_gwbridge: port 37(veth886ee36) entered forwarding state
Feb  5 12:01:30 urknall kernel: [440524.329620] docker_gwbridge: port 37(veth886ee36) entered forwarding state
Feb  5 12:01:30 urknall kernel: [440524.993196] docker_gwbridge: port 37(veth886ee36) entered disabled state
Feb  5 12:01:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 73)
Feb  5 12:01:39 urknall kernel: [440533.831872] IPVS: Creating netns size=2192 id=1963
Feb  5 12:01:40 urknall kernel: [440534.239771] br0: renamed from ov-00100b-gliwc
Feb  5 12:01:40 urknall dockerd[1676]: time="2018-02-05T12:01:40.049308941+01:00" level=error msg="Failed to deserialize netlink ndmsg: invalid argument"
Feb  5 12:01:40 urknall kernel: [440534.304748] vxlan0: renamed from vx-00100b-gliwc
Feb  5 12:01:40 urknall dockerd[1676]: time="2018-02-05T12:01:40.110911929+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 12:01:40 urknall dockerd[1676]: time="2018-02-05T12:01:40.112023572+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 12:01:40 urknall kernel: [440534.328044] device vxlan0 entered promiscuous mode
Feb  5 12:01:40 urknall kernel: [440534.331001] br0: port 1(vxlan0) entered forwarding state
Feb  5 12:01:40 urknall kernel: [440534.331024] br0: port 1(vxlan0) entered forwarding state
Feb  5 12:01:40 urknall kernel: [440534.985662] veth0: renamed from veth3162d51
Feb  5 12:01:40 urknall kernel: [440535.007002] device veth0 entered promiscuous mode
Feb  5 12:01:40 urknall kernel: [440535.007320] br0: port 2(veth0) entered forwarding state
Feb  5 12:01:40 urknall kernel: [440535.007332] br0: port 2(veth0) entered forwarding state
Feb  5 12:01:40 urknall dockerd[1676]: time="2018-02-05T12:01:40.794124810+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 12:01:41 urknall kernel: [440535.327216] br0: port 2(veth0) entered disabled state
Feb  5 12:01:42 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 74)
Feb  5 12:01:52 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 75)
Feb  5 12:01:55 urknall kernel: [440549.376159] br0: port 1(vxlan0) entered forwarding state
Feb  5 12:02:00 urknall mfsmount: write file error, inode: 67787, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 7)
Feb  5 12:02:00 urknall mfsmount: write file error, inode: 67788, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 1)
Feb  5 12:02:00 urknall mfsmount: write file error, inode: 3791, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 58)
Feb  5 12:02:00 urknall mfsmetalogger[8545]: sessions downloaded 742B/0.000655s (1.133 MB/s)
Feb  5 12:02:02 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 76)
Feb  5 12:02:11 urknall mfsmount: write file error, inode: 67783, index: 0 - Timeout after 10005 ms (Timeout) (try counter: 43)
Feb  5 12:02:12 urknall mfsmount: write file error, inode: 67786, index: 0 - Chunk write error (Disconnected) (try counter: 2)
Feb  5 12:02:12 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 2)
Feb  5 12:02:12 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 77)
Feb  5 12:02:21 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunk write error (Disconnected) (try counter: 59)
Feb  5 12:02:22 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 78)
Feb  5 12:02:32 urknall mfsmount: write file error, inode: 67783, index: 0 - Chunk write error (Disconnected) (try counter: 44)
Feb  5 12:02:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 79)
Feb  5 12:02:32 urknall mfsmount: write file error, inode: 67786, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 3)
Feb  5 12:02:32 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 3)
Feb  5 12:02:41 urknall dockerd[1676]: time="2018-02-05T12:02:41.038454337+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/containers/create type="*events.ContainerCreate"
Feb  5 12:02:41 urknall dockerd[1676]: time="2018-02-05T12:02:41+01:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/fb1d30cb50dcd52e996d0837d78afdcdd68ab0f8e51a6fed6619ac53040cc0c5/shim.sock" debug=false module="containerd/tasks" pid=13841
Feb  5 12:02:41 urknall kernel: [440595.995097] IPVS: Creating netns size=2192 id=1964
Feb  5 12:02:42 urknall kernel: [440596.250518] eth0: renamed from vethd926fc2
Feb  5 12:02:42 urknall kernel: [440596.267050] br0: port 4(veth1047) entered forwarding state
Feb  5 12:02:42 urknall kernel: [440596.267075] br0: port 4(veth1047) entered forwarding state
Feb  5 12:02:42 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 80)
Feb  5 12:02:42 urknall kernel: [440597.216963] eth1: renamed from veth769d6bf
Feb  5 12:02:43 urknall kernel: [440597.231884] docker_gwbridge: port 37(veth886ee36) entered forwarding state
Feb  5 12:02:43 urknall kernel: [440597.231940] docker_gwbridge: port 37(veth886ee36) entered forwarding state
Feb  5 12:02:43 urknall kernel: [440598.042829] eth2: renamed from vethe98a694
Feb  5 12:02:43 urknall kernel: [440598.064148] br0: port 2(veth0) entered forwarding state
Feb  5 12:02:43 urknall kernel: [440598.064170] br0: port 2(veth0) entered forwarding state
Feb  5 12:02:45 urknall dockerd[1676]: time="2018-02-05T12:02:45.015428623+01:00" level=warning msg="unknown container" container=fb1d30cb50dcd52e996d0837d78afdcdd68ab0f8e51a6fed6619ac53040cc0c5 module=libcontainerd namespace=plugins.moby
Feb  5 12:02:45 urknall dockerd[1676]: time="2018-02-05T12:02:45.126432307+01:00" level=warning msg="unknown container" container=fb1d30cb50dcd52e996d0837d78afdcdd68ab0f8e51a6fed6619ac53040cc0c5 module=libcontainerd namespace=plugins.moby
Feb  5 12:02:52 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 81)
Feb  5 12:02:57 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunk write error (Disconnected) (try counter: 60)
Feb  5 12:02:57 urknall kernel: [440611.300122] br0: port 4(veth1047) entered forwarding state
Feb  5 12:02:58 urknall kernel: [440612.260226] docker_gwbridge: port 37(veth886ee36) entered forwarding state
Feb  5 12:02:58 urknall kernel: [440613.092320] br0: port 2(veth0) entered forwarding state
Feb  5 12:03:02 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 82)
Feb  5 12:03:07 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 4)
Feb  5 12:03:07 urknall mfsmount: write file error, inode: 67787, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 8)
Feb  5 12:03:07 urknall mfsmount: write file error, inode: 67786, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 4)
Feb  5 12:03:12 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 83)
Feb  5 12:03:22 urknall mfsmount: write file error, inode: 67786, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 5)
Feb  5 12:03:22 urknall mfsmount: write file error, inode: 67787, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 9)
Feb  5 12:03:22 urknall mfsmount: write file error, inode: 67788, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 5)
Feb  5 12:03:22 urknall mfsmount: write file error, inode: 3791, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 61)
Feb  5 12:03:22 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 84)
Feb  5 12:03:27 urknall mfsmount: read file error, inode: 413364, index: 0, chunk: 622947, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)
Feb  5 12:03:29 urknall mfsmount: read file error, inode: 413364, index: 0, chunk: 622947, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)
Feb  5 12:03:31 urknall mfsmount: read file error, inode: 413364, index: 0, chunk: 622947, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 3)
Feb  5 12:03:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 85)
Feb  5 12:03:33 urknall mfsmount: read file error, inode: 413364, index: 0, chunk: 622947, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 4)
Feb  5 12:03:34 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 6)
Feb  5 12:03:35 urknall mfsmount: read file error, inode: 413364, index: 0, chunk: 622947, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 5)
Feb  5 12:03:37 urknall mfsmount: read file error, inode: 413364, index: 0, chunk: 622947, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 6)
Feb  5 12:03:39 urknall mfsmount: read file error, inode: 413364, index: 0, chunk: 622947, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 7)
Feb  5 12:03:41 urknall mfsmount: read file error, inode: 413364, index: 0, chunk: 622947, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 8)
Feb  5 12:03:42 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 86)
Feb  5 12:03:43 urknall dhclient[1206]: DHCPREQUEST of 192.168.99.3 on em1 to 192.168.99.1 port 67 (xid=0x65d559d6)
Feb  5 12:03:43 urknall dhclient[1206]: DHCPACK of 192.168.99.3 from 192.168.99.1
Feb  5 12:03:43 urknall mfsmount: read file error, inode: 413364, index: 0, chunk: 622947, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 9)
Feb  5 12:03:45 urknall mfsmount: read file error, inode: 413364, index: 0, chunk: 622947, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 10)
Feb  5 12:03:47 urknall root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 12:03:47 urknall mfsmetalogger[8545]: sessions downloaded 742B/3.401270s (0.000 MB/s)
Feb  5 12:03:47 urknall dhclient[1206]: bound to 192.168.99.3 -- renewal in 263 seconds.
Feb  5 12:03:47 urknall mfsmount: write file error, inode: 67788, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 7)
Feb  5 12:03:47 urknall mfsmount: write file error, inode: 3791, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 62)
Feb  5 12:03:47 urknall mfsmount: read file error, inode: 413364, index: 0, chunk: 622947, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 11)
Feb  5 12:03:50 urknall mfsmount: read file error, inode: 413364, index: 0, chunk: 622947, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 12)
Feb  5 12:03:52 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 87)
Feb  5 12:03:54 urknall mfsmount: read file error, inode: 413364, index: 0, chunk: 622947, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 13)
Feb  5 12:04:02 urknall mfsmount: read file error, inode: 413364, index: 0, chunk: 622947, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 14)
Feb  5 12:04:02 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 88)
Feb  5 12:04:08 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunk write error (Disconnected) (try counter: 63)
Feb  5 12:04:08 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 8)
Feb  5 12:04:12 urknall mfsmount: read file error, inode: 413364, index: 0, chunk: 622947, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 15)
Feb  5 12:04:12 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 89)
Feb  5 12:04:22 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 90)
Feb  5 12:04:23 urknall kernel: [440697.581733] veth1048: renamed from veth979e383
Feb  5 12:04:23 urknall dockerd[1676]: time="2018-02-05T12:04:23.359400712+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 12:04:23 urknall kernel: [440697.595692] device veth1048 entered promiscuous mode
Feb  5 12:04:23 urknall kernel: [440697.632090] device veth6663354 entered promiscuous mode
Feb  5 12:04:23 urknall mfsmetalogger[8545]: sessions downloaded 742B/3.950299s (0.000 MB/s)
Feb  5 12:04:25 urknall dhclient[1979]: DHCPREQUEST of 192.168.99.3 on em1 to 192.168.99.1 port 67 (xid=0x1f01ed1a)
Feb  5 12:04:25 urknall dhclient[1979]: DHCPACK of 192.168.99.3 from 192.168.99.1
Feb  5 12:04:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 91)
Feb  5 12:04:34 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 9)
Feb  5 12:04:34 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunk write error (Disconnected) (try counter: 64)
Feb  5 12:04:42 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 92)
Feb  5 12:04:45 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 10)
Feb  5 12:04:52 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 93)
Feb  5 12:04:55 urknall marc: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 12:04:57 urknall mfsmount: write file error, inode: 411287, index: 0 - Timeout after 30040 ms (Timeout) (try counter: 1)
Feb  5 12:05:02 urknall dhclient[1979]: bound to 192.168.99.3 -- renewal in 209 seconds.
Feb  5 12:05:02 urknall mfsmetalogger[8545]: sessions downloaded 742B/0.000602s (1.233 MB/s)
Feb  5 12:05:02 urknall mfsmount: write file error, inode: 67788, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 11)
Feb  5 12:05:02 urknall mfsmount: write file error, inode: 3791, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 65)
Feb  5 12:05:02 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 94)
Feb  5 12:05:09 urknall dockerd[1676]: time="2018-02-05T12:05:09.433889982+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:gliwcqzlhjzjulhs3xdbxuqrv leaving:false netPeers:1 entries:5 Queue qLen:0 netMsg/s:0"
Feb  5 12:05:09 urknall dockerd[1676]: time="2018-02-05T12:05:09.434060410+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:2ntxgh39t5rk8b55tapu1l8gh leaving:false netPeers:2 entries:8 Queue qLen:0 netMsg/s:0"
Feb  5 12:05:09 urknall dockerd[1676]: time="2018-02-05T12:05:09.434142880+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:hixi0z35nd69dxzd621dtjl0e leaving:false netPeers:2 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 12:05:09 urknall dockerd[1676]: time="2018-02-05T12:05:09.434196186+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:dpu8s5w8nx7vtqyiqai8rflqm leaving:false netPeers:1 entries:10 Queue qLen:0 netMsg/s:0"
Feb  5 12:05:09 urknall dockerd[1676]: time="2018-02-05T12:05:09.434272947+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:mh26wxh3legx189b8ummizzvg leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:05:09 urknall dockerd[1676]: time="2018-02-05T12:05:09.434346553+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:87vvi5rhozzl8qws2ufe7ltl2 leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:05:09 urknall dockerd[1676]: time="2018-02-05T12:05:09.434421311+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:xx0m50f4cog3dik8xwqxd4g0q leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:05:09 urknall dockerd[1676]: time="2018-02-05T12:05:09.434494570+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:mwursuov7sc87fxhzkmpdtyes leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:05:09 urknall dockerd[1676]: time="2018-02-05T12:05:09.434571762+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:2vg30sgcyr0omhvavdcptdrgv leaving:false netPeers:2 entries:46 Queue qLen:0 netMsg/s:0"
Feb  5 12:05:12 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 95)
Feb  5 12:05:13 urknall mfsmount: write file error, inode: 196764, index: 0 - Chunk write error (Disconnected) (try counter: 29)
Feb  5 12:05:22 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 96)
Feb  5 12:05:23 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunk write error (Disconnected) (try counter: 66)
Feb  5 12:05:23 urknall mfsmount: write file error, inode: 409250, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 1)
Feb  5 12:05:24 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 12)
Feb  5 12:05:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 97)
Feb  5 12:05:33 urknall mfsmount: write file error, inode: 196764, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 30)
Feb  5 12:05:39 urknall kernel: [440774.075370] veth3: renamed from veth3967052
Feb  5 12:05:39 urknall dockerd[1676]: time="2018-02-05T12:05:39.841308580+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 12:05:39 urknall kernel: [440774.089477] device veth3 entered promiscuous mode
Feb  5 12:05:42 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 98)
Feb  5 12:05:44 urknall mfsmount: read file error, inode: 409247, index: 0, chunk: 622965, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)
Feb  5 12:05:46 urknall mfsmount: read file error, inode: 409247, index: 0, chunk: 622965, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 2)
Feb  5 12:05:48 urknall mfsmount: read file error, inode: 409247, index: 0, chunk: 622965, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 3)
Feb  5 12:05:50 urknall mfsmount: read file error, inode: 409247, index: 0, chunk: 622965, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 4)
Feb  5 12:05:51 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 1)
Feb  5 12:05:52 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunk write error (Disconnected) (try counter: 67)
Feb  5 12:05:52 urknall mfsmount: read file error, inode: 409247, index: 0, chunk: 622965, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 5)
Feb  5 12:05:52 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 99)
Feb  5 12:05:54 urknall mfsmount: read file error, inode: 409247, index: 0, chunk: 622965, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 6)
Feb  5 12:05:55 urknall mfsmount: write file error, inode: 196764, index: 0 - Chunk write error (Disconnected) (try counter: 31)
Feb  5 12:05:55 urknall mfsmount: write file error, inode: 409249, index: 0 - Chunk write error (Disconnected) (try counter: 1)
Feb  5 12:05:55 urknall mfsmount: write file error, inode: 409250, index: 0 - Chunk write error (Disconnected) (try counter: 1)
Feb  5 12:05:56 urknall mfsmount: read file error, inode: 409247, index: 0, chunk: 622965, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 7)
Feb  5 12:05:58 urknall mfsmount: read file error, inode: 409247, index: 0, chunk: 622965, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 8)
Feb  5 12:06:00 urknall mfsmount: read file error, inode: 409247, index: 0, chunk: 622965, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 9)
Feb  5 12:06:02 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 2)
Feb  5 12:06:02 urknall mfsmount: read file error, inode: 409247, index: 0, chunk: 622965, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 10)
Feb  5 12:06:02 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 100)
Feb  5 12:06:04 urknall mfsmount: read file error, inode: 409247, index: 0, chunk: 622965, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 11)
Feb  5 12:06:06 urknall mfsmount: read file error, inode: 409247, index: 0, chunk: 622965, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 12)
Feb  5 12:06:10 urknall mfsmount: read file error, inode: 409247, index: 0, chunk: 622965, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 13)
Feb  5 12:06:12 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 101)
Feb  5 12:06:15 urknall mfsmount: write file error, inode: 196764, index: 0 - Chunk write error (Disconnected) (try counter: 32)
Feb  5 12:06:15 urknall mfsmount: write file error, inode: 409250, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 2)
Feb  5 12:06:15 urknall mfsmount: write file error, inode: 409249, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 2)
Feb  5 12:06:18 urknall mfsmount: read file error, inode: 409247, index: 0, chunk: 622965, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 14)
Feb  5 12:06:22 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 68)
Feb  5 12:06:22 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 102)
Feb  5 12:06:22 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 3)
Feb  5 12:06:27 urknall mfsmount: write file error, inode: 409249, index: 0 - Chunk write error (Disconnected) (try counter: 3)
Feb  5 12:06:27 urknall mfsmount: write file error, inode: 409250, index: 0 - Chunk write error (Disconnected) (try counter: 3)
Feb  5 12:06:28 urknall mfsmetalogger[8545]: sessions downloaded 742B/0.000734s (1.011 MB/s)
Feb  5 12:06:28 urknall mfsmount: read file error, inode: 409247, index: 0, chunk: 622965, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 15)
Feb  5 12:06:29 urknall dockerd[1676]: time="2018-02-05T12:06:29.265668536+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/containers/create type="*events.ContainerCreate"
Feb  5 12:06:30 urknall dockerd[1676]: time="2018-02-05T12:06:30+01:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/e17345c4539366316e5a760f71f72b626a4008443a9a58c94a5b758b462b9f9c/shim.sock" debug=false module="containerd/tasks" pid=14157
Feb  5 12:06:30 urknall kernel: [440824.498238] IPVS: Creating netns size=2192 id=1965
Feb  5 12:06:30 urknall kernel: [440824.732599] eth0: renamed from veth691b387
Feb  5 12:06:30 urknall kernel: [440824.752197] br0: port 5(veth1048) entered forwarding state
Feb  5 12:06:30 urknall kernel: [440824.752219] br0: port 5(veth1048) entered forwarding state
Feb  5 12:06:31 urknall kernel: [440825.861131] eth1: renamed from vetha200f2b
Feb  5 12:06:31 urknall kernel: [440825.876139] docker_gwbridge: port 40(veth6663354) entered forwarding state
Feb  5 12:06:31 urknall kernel: [440825.876199] docker_gwbridge: port 40(veth6663354) entered forwarding state
Feb  5 12:06:32 urknall kernel: [440826.653244] eth2: renamed from vethf9d4e18
Feb  5 12:06:32 urknall kernel: [440826.668970] br0: port 3(veth3) entered forwarding state
Feb  5 12:06:32 urknall kernel: [440826.668991] br0: port 3(veth3) entered forwarding state
Feb  5 12:06:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 103)
Feb  5 12:06:33 urknall dockerd[1676]: time="2018-02-05T12:06:33.329723925+01:00" level=warning msg="unknown container" container=fb1d30cb50dcd52e996d0837d78afdcdd68ab0f8e51a6fed6619ac53040cc0c5 module=libcontainerd namespace=plugins.moby
Feb  5 12:06:33 urknall dockerd[1676]: time="2018-02-05T12:06:33+01:00" level=info msg="shim reaped" id=fb1d30cb50dcd52e996d0837d78afdcdd68ab0f8e51a6fed6619ac53040cc0c5 module="containerd/tasks"
Feb  5 12:06:33 urknall dockerd[1676]: time="2018-02-05T12:06:33.419368524+01:00" level=info msg="ignoring event" module=libcontainerd namespace=plugins.moby topic=/tasks/delete type="*events.TaskDelete"
Feb  5 12:06:33 urknall dockerd[1676]: time="2018-02-05T12:06:33.419527406+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Feb  5 12:06:33 urknall dockerd[1676]: time="2018-02-05T12:06:33.426277217+01:00" level=warning msg="unknown container" container=e17345c4539366316e5a760f71f72b626a4008443a9a58c94a5b758b462b9f9c module=libcontainerd namespace=plugins.moby
Feb  5 12:06:33 urknall kernel: [440827.680276] br0: port 4(veth1047) entered disabled state
Feb  5 12:06:33 urknall kernel: [440827.680456] vethd926fc2: renamed from eth0
Feb  5 12:06:33 urknall kernel: [440827.719467] IPVS: __ip_vs_del_service: enter
Feb  5 12:06:33 urknall dockerd[1676]: time="2018-02-05T12:06:33.680225489+01:00" level=warning msg="unknown container" container=e17345c4539366316e5a760f71f72b626a4008443a9a58c94a5b758b462b9f9c module=libcontainerd namespace=plugins.moby
Feb  5 12:06:34 urknall kernel: [440828.618377] br0: port 4(veth1047) entered disabled state
Feb  5 12:06:34 urknall kernel: [440828.624045] device veth1047 left promiscuous mode
Feb  5 12:06:34 urknall kernel: [440828.624053] br0: port 4(veth1047) entered disabled state
Feb  5 12:06:34 urknall kernel: [440828.667723] br0: port 2(veth0) entered disabled state
Feb  5 12:06:34 urknall kernel: [440828.667776] br0: port 1(vxlan0) entered disabled state
Feb  5 12:06:34 urknall kernel: [440828.672279] ov-00100b-gliwc: renamed from br0
Feb  5 12:06:34 urknall kernel: [440828.692134] device veth0 left promiscuous mode
Feb  5 12:06:34 urknall kernel: [440828.692170] ov-00100b-gliwc: port 2(veth0) entered disabled state
Feb  5 12:06:34 urknall kernel: [440828.724087] device vxlan0 left promiscuous mode
Feb  5 12:06:34 urknall kernel: [440828.724167] ov-00100b-gliwc: port 1(vxlan0) entered disabled state
Feb  5 12:06:34 urknall kernel: [440828.806527] vx-00100b-gliwc: renamed from vxlan0
Feb  5 12:06:34 urknall kernel: [440828.888277] veth3162d51: renamed from veth0
Feb  5 12:06:34 urknall kernel: [440829.005902] vethe98a694: renamed from eth2
Feb  5 12:06:42 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 104)
Feb  5 12:06:45 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunk write error (Disconnected) (try counter: 69)
Feb  5 12:06:45 urknall kernel: [440839.777251] br0: port 5(veth1048) entered forwarding state
Feb  5 12:06:46 urknall mfsmount: write file error, inode: 67783, index: 0 - Chunk write error (Disconnected) (try counter: 45)
Feb  5 12:06:46 urknall mfsmount: write file error, inode: 67787, index: 0 - Chunk write error (Disconnected) (try counter: 10)
Feb  5 12:06:46 urknall kernel: [440840.929364] docker_gwbridge: port 40(veth6663354) entered forwarding state
Feb  5 12:06:47 urknall kernel: [440841.697451] br0: port 3(veth3) entered forwarding state
Feb  5 12:06:52 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 105)
Feb  5 12:07:02 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 106)
Feb  5 12:07:06 urknall mfsmount: write file error, inode: 67783, index: 0 - Chunk write error (Disconnected) (try counter: 46)
Feb  5 12:07:07 urknall mfsmount: write file error, inode: 67787, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 11)
Feb  5 12:07:11 urknall mfsmount: write file error, inode: 3791, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 70)
Feb  5 12:07:12 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 107)
Feb  5 12:07:21 urknall mfsmount: read file error, inode: 27960, index: 0, chunk: 542619, version: 1 - Chunkserver communication timed out: 192.168.99.3:9422 (try counter: 1)
Feb  5 12:07:22 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 108)
Feb  5 12:07:23 urknall mfsmetalogger[8545]: sessions downloaded 742B/12.324977s (0.000 MB/s)
Feb  5 12:07:23 urknall kernel: [440877.366045] docker_gwbridge: port 37(veth886ee36) entered disabled state
Feb  5 12:07:23 urknall kernel: [440877.366324] veth769d6bf: renamed from eth1
Feb  5 12:07:24 urknall dockerd[1676]: time="2018-02-05T12:07:24.698969135+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 12:07:24 urknall dockerd[1676]: time="2018-02-05T12:07:24.699129256+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 12:07:24 urknall dockerd[1676]: time="2018-02-05T12:07:24.699216071+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb  5 12:07:24 urknall dockerd[1676]: time="2018-02-05T12:07:24.873453119+01:00" level=warning msg="unknown container" container=e17345c4539366316e5a760f71f72b626a4008443a9a58c94a5b758b462b9f9c module=libcontainerd namespace=plugins.moby
Feb  5 12:07:24 urknall dockerd[1676]: time="2018-02-05T12:07:24+01:00" level=info msg="shim reaped" id=e17345c4539366316e5a760f71f72b626a4008443a9a58c94a5b758b462b9f9c module="containerd/tasks"
Feb  5 12:07:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 109)
Feb  5 12:07:34 urknall mfsmount: write file error, inode: 196764, index: 0 - Chunk write error (Disconnected) (try counter: 33)
Feb  5 12:07:34 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 4)
Feb  5 12:07:34 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunk write error (Disconnected) (try counter: 71)
Feb  5 12:07:42 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 110)
Feb  5 12:07:45 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 5)
Feb  5 12:07:52 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 111)
Feb  5 12:07:54 urknall mfsmount: write file error, inode: 196764, index: 0 - Chunk write error (Disconnected) (try counter: 34)
Feb  5 12:07:54 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunk write error (Disconnected) (try counter: 72)
Feb  5 12:07:57 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 6)
Feb  5 12:08:02 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 112)
Feb  5 12:08:03 urknall dockerd[1676]: time="2018-02-05T12:08:03.903462468+01:00" level=info msg="ignoring event" module=libcontainerd namespace=plugins.moby topic=/tasks/delete type="*events.TaskDelete"
Feb  5 12:08:03 urknall dockerd[1676]: time="2018-02-05T12:08:03.903637495+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Feb  5 12:08:10 urknall mfsmount: write file error, inode: 409239, index: 0 - Timeout after 30012 ms (Timeout) (try counter: 1)
Feb  5 12:08:11 urknall dhclient[1206]: DHCPREQUEST of 192.168.99.3 on em1 to 192.168.99.1 port 67 (xid=0x65d559d6)
Feb  5 12:08:11 urknall dhclient[1206]: DHCPACK of 192.168.99.3 from 192.168.99.1
Feb  5 12:08:12 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 113)
Feb  5 12:08:16 urknall root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 12:08:16 urknall mfsmetalogger[8545]: sessions downloaded 742B/3.925297s (0.000 MB/s)
Feb  5 12:08:16 urknall mfsmount: write file error, inode: 3791, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 73)
Feb  5 12:08:16 urknall mfsmount: write file error, inode: 196764, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 35)
Feb  5 12:08:16 urknall mfsmount: write file error, inode: 67788, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 7)
Feb  5 12:08:22 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 114)
Feb  5 12:08:28 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 8)
Feb  5 12:08:31 urknall dhclient[1979]: DHCPREQUEST of 192.168.99.3 on em1 to 192.168.99.1 port 67 (xid=0x1f01ed1a)
Feb  5 12:08:31 urknall dhclient[1979]: DHCPACK of 192.168.99.3 from 192.168.99.1
Feb  5 12:08:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 115)
Feb  5 12:08:37 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunk write error (Disconnected) (try counter: 74)
Feb  5 12:08:37 urknall mfsmount: write file error, inode: 196764, index: 0 - Chunk write error (Disconnected) (try counter: 36)
Feb  5 12:08:37 urknall dhclient[1206]: bound to 192.168.99.3 -- renewal in 229 seconds.
Feb  5 12:08:37 urknall marc: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 12:08:42 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 116)
Feb  5 12:08:43 urknall dhclient[1979]: bound to 192.168.99.3 -- renewal in 285 seconds.
Feb  5 12:08:43 urknall kernel: [440957.348909] docker_gwbridge: port 37(veth886ee36) entered disabled state
Feb  5 12:08:43 urknall kernel: [440957.373837] device veth886ee36 left promiscuous mode
Feb  5 12:08:43 urknall kernel: [440957.373849] docker_gwbridge: port 37(veth886ee36) entered disabled state
Feb  5 12:08:43 urknall kernel: [440957.398167] br0: port 5(veth1048) entered disabled state
Feb  5 12:08:43 urknall kernel: [440957.398339] veth691b387: renamed from eth0
Feb  5 12:08:43 urknall kernel: [440957.549936] IPVS: __ip_vs_del_service: enter
Feb  5 12:08:44 urknall kernel: [440958.979267] br0: port 5(veth1048) entered disabled state
Feb  5 12:08:44 urknall kernel: [440958.996825] device veth1048 left promiscuous mode
Feb  5 12:08:44 urknall kernel: [440958.996836] br0: port 5(veth1048) entered disabled state
Feb  5 12:08:45 urknall kernel: [440960.017558] br0: port 3(veth3) entered disabled state
Feb  5 12:08:45 urknall kernel: [440960.017685] vethf9d4e18: renamed from eth2
Feb  5 12:08:45 urknall kernel: [440960.057124] IPVS: __ip_vs_del_service: enter
Feb  5 12:08:47 urknall kernel: [440962.185054] IPVS: __ip_vs_del_service: enter
Feb  5 12:08:48 urknall kernel: [440962.759301] vetha200f2b: renamed from eth1
Feb  5 12:08:48 urknall kernel: [440962.789643] docker_gwbridge: port 40(veth6663354) entered disabled state
Feb  5 12:08:49 urknall dockerd[1676]: time="2018-02-05T12:08:49.566163881+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/containers/delete type="*events.ContainerDelete"
Feb  5 12:08:53 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 117)
Feb  5 12:09:01 urknall mfsmount: write file error, inode: 67786, index: 0 - Timeout after 10026 ms (Timeout) (try counter: 1)
Feb  5 12:09:01 urknall mfsmount: write file error, inode: 67788, index: 0 - Timeout after 10038 ms (Timeout) (try counter: 9)
Feb  5 12:09:01 urknall mfsmount: write file error, inode: 67787, index: 0 - Timeout after 10022 ms (Timeout) (try counter: 12)
Feb  5 12:09:02 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 118)
Feb  5 12:09:10 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 75)
Feb  5 12:09:12 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 119)
Feb  5 12:09:13 urknall mfsmount: write file error, inode: 67786, index: 0 - Chunk write error (Disconnected) (try counter: 2)
Feb  5 12:09:13 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 10)
Feb  5 12:09:15 urknall mfsmount: write file error, inode: 67787, index: 0 - Chunk write error (Disconnected) (try counter: 13)
Feb  5 12:09:22 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 120)
Feb  5 12:09:29 urknall mfsmount: write file error, inode: 67787, index: 0 - Chunk write error (Disconnected) (try counter: 14)
Feb  5 12:09:29 urknall kernel: [441003.935899] docker_gwbridge: port 40(veth6663354) entered disabled state
Feb  5 12:09:29 urknall mfsmount: write file error, inode: 67788, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 11)
Feb  5 12:09:29 urknall mfsmount: write file error, inode: 67786, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 3)
Feb  5 12:09:29 urknall kernel: [441003.944068] device veth6663354 left promiscuous mode
Feb  5 12:09:29 urknall kernel: [441003.944077] docker_gwbridge: port 40(veth6663354) entered disabled state
Feb  5 12:09:30 urknall CRON[14618]: (root) CMD (  [ -x /usr/lib/php5/maxlifetime ] && [ -x /usr/lib/php5/sessionclean ] && [ -d /var/lib/php5 ] && /usr/lib/php5/sessionclean /var/lib/php5 $(/usr/lib/php5/maxlifetime))
Feb  5 12:09:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 121)
Feb  5 12:09:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 1)
Feb  5 12:09:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 2)
Feb  5 12:09:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 3)
Feb  5 12:09:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 4)
Feb  5 12:09:32 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 5)
Feb  5 12:09:33 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 6)
Feb  5 12:09:33 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 7)
Feb  5 12:09:33 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 8)
Feb  5 12:09:33 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 9)
Feb  5 12:09:33 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 10)
Feb  5 12:09:34 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 11)
Feb  5 12:09:35 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 12)
Feb  5 12:09:37 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 13)
Feb  5 12:09:41 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 14)
Feb  5 12:09:47 urknall dockerd[1676]: time="2018-02-05T12:09:47.990461848+01:00" level=error msg="fatal task error" error="task: non-zero exit (255)" module=node/agent/taskmanager node.id=4z5kkc15en2q29wt1crkxomc8 service.id=dun7e29phoubjq10posm5f24q task.id=f76ijutsskh8j8uvps8dsqhr4
Feb  5 12:09:48 urknall mfsmetalogger[8545]: sessions downloaded 742B/23.364226s (0.000 MB/s)
Feb  5 12:09:49 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 15)
Feb  5 12:09:50 urknall mfsmount: write file error, inode: 67786, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 4)
Feb  5 12:09:50 urknall mfsmount: write file error, inode: 196764, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 37)
Feb  5 12:09:51 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 12)
Feb  5 12:09:54 urknall mfsmount: write file error, inode: 67787, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 15)
Feb  5 12:09:57 urknall kernel: [441031.340176] br0: port 3(veth3) entered disabled state
Feb  5 12:09:57 urknall kernel: [441031.348857] device veth3 left promiscuous mode
Feb  5 12:09:57 urknall kernel: [441031.348869] br0: port 3(veth3) entered disabled state
Feb  5 12:09:57 urknall kernel: [441031.441712] IPVS: __ip_vs_del_service: enter
Feb  5 12:09:57 urknall kernel: [441031.441724] IPVS: __ip_vs_del_service: enter
Feb  5 12:09:57 urknall dockerd[1676]: time="2018-02-05T12:09:57.999295732+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/containers/delete type="*events.ContainerDelete"
Feb  5 12:09:59 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 16)
Feb  5 12:10:09 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 17)
Feb  5 12:10:09 urknall dockerd[1676]: time="2018-02-05T12:10:09.634083440+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:mh26wxh3legx189b8ummizzvg leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:10:09 urknall dockerd[1676]: time="2018-02-05T12:10:09.634290442+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:mwursuov7sc87fxhzkmpdtyes leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:10:09 urknall dockerd[1676]: time="2018-02-05T12:10:09.634347558+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:gliwcqzlhjzjulhs3xdbxuqrv leaving:false netPeers:1 entries:6 Queue qLen:0 netMsg/s:0"
Feb  5 12:10:09 urknall dockerd[1676]: time="2018-02-05T12:10:09.634429591+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:xx0m50f4cog3dik8xwqxd4g0q leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:10:09 urknall dockerd[1676]: time="2018-02-05T12:10:09.634480657+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:dpu8s5w8nx7vtqyiqai8rflqm leaving:false netPeers:1 entries:12 Queue qLen:0 netMsg/s:0"
Feb  5 12:10:09 urknall dockerd[1676]: time="2018-02-05T12:10:09.634553602+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:2vg30sgcyr0omhvavdcptdrgv leaving:false netPeers:2 entries:47 Queue qLen:0 netMsg/s:0"
Feb  5 12:10:09 urknall dockerd[1676]: time="2018-02-05T12:10:09.634624971+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:hixi0z35nd69dxzd621dtjl0e leaving:false netPeers:2 entries:2 Queue qLen:0 netMsg/s:0"
Feb  5 12:10:09 urknall dockerd[1676]: time="2018-02-05T12:10:09.634696822+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:87vvi5rhozzl8qws2ufe7ltl2 leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb  5 12:10:09 urknall dockerd[1676]: time="2018-02-05T12:10:09.634772000+01:00" level=info msg="NetworkDB stats urknall(f1266e563c51) - netID:2ntxgh39t5rk8b55tapu1l8gh leaving:false netPeers:2 entries:8 Queue qLen:0 netMsg/s:0"
Feb  5 12:10:10 urknall mfsmount: write file error, inode: 3791, index: 0 - Timeout after 10037 ms (Timeout) (try counter: 76)
Feb  5 12:10:12 urknall mfsmount: write file error, inode: 67788, index: 0 - Chunk write error (Disconnected) (try counter: 13)
Feb  5 12:10:17 urknall mfsmount: write file error, inode: 67787, index: 0 - Chunk write error (Disconnected) (try counter: 16)
Feb  5 12:10:19 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 18)
Feb  5 12:10:20 urknall mfsmount: write file error, inode: 196764, index: 0 - Chunkserver timed out (server 192.168.99.3:9422) (try counter: 38)
Feb  5 12:10:21 urknall mfsmetalogger[8545]: sessions downloaded 742B/0.000566s (1.311 MB/s)
Feb  5 12:10:22 urknall mfschunkserver[8794]: Did not manage to receive packet header
Feb  5 12:10:22 urknall dockerd[1676]: time="2018-02-05T12:10:22.782034139+01:00" level=error msg="fatal task error" error="task: non-zero exit (1)" module=node/agent/taskmanager node.id=4z5kkc15en2q29wt1crkxomc8 service.id=gq2beoh3n3f0y9ix02ri5kiyl task.id=w4urw8nvv1629qikt3f4d6apj
Feb  5 12:10:29 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 19)
Feb  5 12:10:39 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 20)
Feb  5 12:10:46 urknall mfsmount: write file error, inode: 67783, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 47)
Feb  5 12:10:46 urknall mfsmount: write file error, inode: 196764, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 39)
Feb  5 12:10:46 urknall mfsmount: write file error, inode: 67788, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 14)
Feb  5 12:10:49 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 21)
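
To make the excerpt above easier to read, the mfsmount errors can be tallied by operation and reason, together with the highest retry count seen per inode. The following is only a rough sketch: the regular expression is derived from the log lines shown here and may not cover every message mfsmount can emit, and the names `MFS_ERROR` and `summarize` are made up for illustration (they are not part of LizardFS).

```python
import re
from collections import Counter

# Pattern inferred from the mfsmount lines in the syslog excerpt above
# (an assumption, not an official LizardFS log format description).
MFS_ERROR = re.compile(
    r"mfsmount: (?P<op>read|write) file error, inode: (?P<inode>\d+), .*? - "
    r"(?P<reason>.*?) \(try counter: (?P<tries>\d+)\)\s*$"
)


def summarize(lines):
    """Count errors per (operation, reason) and keep the highest try counter per inode."""
    reasons = Counter()
    max_tries = {}
    for line in lines:
        m = MFS_ERROR.search(line)
        if not m:
            continue  # not an mfsmount error line (docker, kernel, dhclient, ...)
        # Collapse "Timeout after 10026 ms" style messages into one bucket.
        reason = re.sub(r"\d+ ms", "N ms", m.group("reason"))
        reasons[(m.group("op"), reason)] += 1
        inode = int(m.group("inode"))
        max_tries[inode] = max(max_tries.get(inode, 0), int(m.group("tries")))
    return reasons, max_tries


if __name__ == "__main__":
    # A few lines copied from the urknall log above.
    sample = [
        "Feb  5 12:06:42 urknall mfsmount: read file error, inode: 67938, index: 0, chunk: 61735, version: 15 - no valid copies (try counter: 104)",
        "Feb  5 12:06:45 urknall mfsmount: write file error, inode: 3791, index: 0 - Chunk write error (Disconnected) (try counter: 69)",
        "Feb  5 12:09:01 urknall mfsmount: write file error, inode: 67786, index: 0 - Timeout after 10026 ms (Timeout) (try counter: 1)",
    ]
    reasons, max_tries = summarize(sample)
    for (op, reason), count in reasons.most_common():
        print(f"{count:4d}  {op:5s}  {reason}")
    for inode, tries in sorted(max_tries.items()):
        print(f"inode {inode}: up to {tries} tries")
```

Run against the full syslog (for example by passing `open("/var/log/syslog")` to `summarize`), this should show at a glance which inodes are stuck retrying and whether timeouts, disconnects or missing copies dominate.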

Host pulsar (Chunk and Metalogger):

Feb  5 11:41:13 pulsar dhclient[1004]: DHCPREQUEST of 192.168.99.2 on em1 to 192.168.99.1 port 67 (xid=0x45d74c8a)                                                                           
Feb  5 11:41:13 pulsar dhclient[1004]: DHCPACK of 192.168.99.2 from 192.168.99.1                                                                                                             
Feb  5 11:41:24 pulsar mfsmetalogger[6534]: connection was reset by Master                                                                                                                   
Feb  5 11:41:25 pulsar mfsmetalogger[6534]: connecting to Master                                                                                                                             
Feb  5 11:41:25 pulsar mfsmetalogger[6534]: connected to Master                                                                                                                              
Feb  5 11:42:07 pulsar root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1                                                                                          
Feb  5 11:42:07 pulsar dhclient[1004]: bound to 192.168.99.2 -- renewal in 221 seconds.                                                                                                      
Feb  5 11:45:01 pulsar CRON[22885]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)                                                                                         
Feb  5 11:45:49 pulsar dhclient[1004]: DHCPREQUEST of 192.168.99.2 on em1 to 192.168.99.1 port 67 (xid=0x45d74c8a)                                                                           
Feb  5 11:45:49 pulsar dhclient[1004]: DHCPACK of 192.168.99.2 from 192.168.99.1                                                                                                             
Feb  5 11:46:18 pulsar mfsmetalogger[6534]: connecting to Master                                                                                                                             
Feb  5 11:46:18 pulsar mfsmetalogger[6534]: connected to Master                                                                                                                              
Feb  5 11:46:22 pulsar root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1                                                                                          
Feb  5 11:46:22 pulsar dhclient[1004]: bound to 192.168.99.2 -- renewal in 257 seconds.                                                                                                      
Feb  5 11:50:39 pulsar dhclient[1004]: DHCPREQUEST of 192.168.99.2 on em1 to 192.168.99.1 port 67 (xid=0x45d74c8a)                                                                           
Feb  5 11:50:39 pulsar dhclient[1004]: DHCPACK of 192.168.99.2 from 192.168.99.1                                                                                                             
Feb  5 11:50:41 pulsar mfsmetalogger[6534]: connecting to Master                                                                                                                             
Feb  5 11:50:41 pulsar mfsmetalogger[6534]: connected to Master                                                                                                                              
Feb  5 11:50:45 pulsar root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1                                                                                          
Feb  5 11:50:45 pulsar dhclient[1004]: bound to 192.168.99.2 -- renewal in 238 seconds.                                                                                                      
Feb  5 11:54:43 pulsar dhclient[1004]: DHCPREQUEST of 192.168.99.2 on em1 to 192.168.99.1 port 67 (xid=0x45d74c8a)                                                                           
Feb  5 11:54:43 pulsar dhclient[1004]: DHCPACK of 192.168.99.2 from 192.168.99.1                                                                                                             
Feb  5 11:55:01 pulsar CRON[23597]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)                                                                                         
Feb  5 11:55:11 pulsar mfsmetalogger[6534]: connecting to Master                                                                                                                             
Feb  5 11:55:11 pulsar mfsmetalogger[6534]: connected to Master                                                                                                                              
Feb  5 11:55:14 pulsar root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1                                                                                          
Feb  5 11:55:14 pulsar dhclient[1004]: bound to 192.168.99.2 -- renewal in 204 seconds.                                                                                                      
Feb  5 11:57:26 pulsar mfsmetalogger[6534]: connecting to Master                                                                                                                             
Feb  5 11:57:26 pulsar mfsmetalogger[6534]: connected to Master                                                                                                                              
Feb  5 11:58:38 pulsar dhclient[1004]: DHCPREQUEST of 192.168.99.2 on em1 to 192.168.99.1 port 67 (xid=0x45d74c8a)                                                                           
Feb  5 11:58:38 pulsar dhclient[1004]: DHCPACK of 192.168.99.2 from 192.168.99.1
Feb  5 11:59:35 pulsar root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 11:59:35 pulsar dhclient[1004]: bound to 192.168.99.2 -- renewal in 220 seconds.
Feb  5 12:03:15 pulsar dhclient[1004]: DHCPREQUEST of 192.168.99.2 on em1 to 192.168.99.1 port 67 (xid=0x45d74c8a)
Feb  5 12:03:15 pulsar dhclient[1004]: DHCPACK of 192.168.99.2 from 192.168.99.1
Feb  5 12:04:26 pulsar root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 12:04:26 pulsar dhclient[1004]: bound to 192.168.99.2 -- renewal in 173 seconds.
Feb  5 12:05:01 pulsar CRON[24237]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Feb  5 12:07:18 pulsar mfsmetalogger[6534]: connecting to Master
Feb  5 12:07:18 pulsar mfsmetalogger[6534]: connected to Master
Feb  5 12:07:20 pulsar dhclient[1004]: DHCPREQUEST of 192.168.99.2 on em1 to 192.168.99.1 port 67 (xid=0x45d74c8a)
Feb  5 12:07:20 pulsar dhclient[1004]: DHCPACK of 192.168.99.2 from 192.168.99.1
Feb  5 12:07:21 pulsar root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 12:07:21 pulsar dhclient[1004]: bound to 192.168.99.2 -- renewal in 236 seconds.
Feb  5 12:09:03 pulsar kernel: [443248.752180] audit_printk_skb: 66 callbacks suppressed
Feb  5 12:09:03 pulsar kernel: [443248.752188] audit: type=1400 audit(1517828943.975:5728): apparmor="DENIED" operation="ptrace" profile="docker-default" pid=24418 comm="pidof" requested_mask="trace" denied_mask="trace" peer="docker-default"
Feb  5 12:09:03 pulsar kernel: [443248.752303] audit: type=1400 audit(1517828943.975:5729): apparmor="DENIED" operation="ptrace" profile="docker-default" pid=24418 comm="pidof" requested_mask="trace" denied_mask="trace" peer="docker-default"
Feb  5 12:09:03 pulsar kernel: [443248.752338] audit: type=1400 audit(1517828943.975:5730): apparmor="DENIED" operation="ptrace" profile="docker-default" pid=24418 comm="pidof" requested_mask="trace" denied_mask="trace" peer="docker-default"
Feb  5 12:09:03 pulsar kernel: [443248.752409] audit: type=1400 audit(1517828943.975:5731): apparmor="DENIED" operation="ptrace" profile="docker-default" pid=24418 comm="pidof" requested_mask="trace" denied_mask="trace" peer="docker-default"
Feb  5 12:09:03 pulsar kernel: [443248.752441] audit: type=1400 audit(1517828943.975:5732): apparmor="DENIED" operation="ptrace" profile="docker-default" pid=24418 comm="pidof" requested_mask="trace" denied_mask="trace" peer="docker-default"
Feb  5 12:09:03 pulsar kernel: [443248.752509] audit: type=1400 audit(1517828943.975:5733): apparmor="DENIED" operation="ptrace" profile="docker-default" pid=24418 comm="pidof" requested_mask="trace" denied_mask="trace" peer="docker-default"
Feb  5 12:09:03 pulsar kernel: [443248.752542] audit: type=1400 audit(1517828943.975:5734): apparmor="DENIED" operation="ptrace" profile="docker-default" pid=24418 comm="pidof" requested_mask="trace" denied_mask="trace" peer="docker-default"
Feb  5 12:09:03 pulsar kernel: [443248.752606] audit: type=1400 audit(1517828943.975:5735): apparmor="DENIED" operation="ptrace" profile="docker-default" pid=24418 comm="pidof" requested_mask="trace" denied_mask="trace" peer="docker-default"
Feb  5 12:09:03 pulsar kernel: [443248.752639] audit: type=1400 audit(1517828943.975:5736): apparmor="DENIED" operation="ptrace" profile="docker-default" pid=24418 comm="pidof" requested_mask="trace" denied_mask="trace" peer="docker-default"
Feb  5 12:09:03 pulsar kernel: [443248.752708] audit: type=1400 audit(1517828943.975:5737): apparmor="DENIED" operation="ptrace" profile="docker-default" pid=24418 comm="pidof" requested_mask="trace" denied_mask="trace" peer="docker-default"
Feb  5 12:11:18 pulsar dhclient[1004]: DHCPREQUEST of 192.168.99.2 on em1 to 192.168.99.1 port 67 (xid=0x45d74c8a)
Feb  5 12:11:18 pulsar dhclient[1004]: DHCPACK of 192.168.99.2 from 192.168.99.1
Feb  5 12:11:18 pulsar root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 12:11:18 pulsar dhclient[1004]: bound to 192.168.99.2 -- renewal in 299 seconds

Host raum (Chunk and CGI-Server):

Feb  5 11:42:16 raum dhclient[1051]: DHCPREQUEST of 192.168.99.7 on em1 to 192.168.99.1 port 67 (xid=0x67ed8381)                                                                             
Feb  5 11:42:16 raum dhclient[1051]: DHCPACK of 192.168.99.7 from 192.168.99.1                                                                                                               
Feb  5 11:42:16 raum root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1                                                                                            
Feb  5 11:42:16 raum dhclient[1051]: bound to 192.168.99.7 -- renewal in 275 seconds.                                                                                                        
Feb  5 11:43:03 raum apcupsd[955]: Communications with UPS lost.                                                                                                                             
Feb  5 11:45:01 raum CRON[46398]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)                                                                                           
Feb  5 11:46:51 raum dhclient[1051]: DHCPREQUEST of 192.168.99.7 on em1 to 192.168.99.1 port 67 (xid=0x67ed8381)                                                                             
Feb  5 11:46:51 raum dhclient[1051]: DHCPACK of 192.168.99.7 from 192.168.99.1                                                                                                               
Feb  5 11:46:51 raum root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1                                                                                            
Feb  5 11:46:51 raum dhclient[1051]: bound to 192.168.99.7 -- renewal in 244 seconds.                                                                                                        
Feb  5 11:50:56 raum dhclient[1051]: DHCPREQUEST of 192.168.99.7 on em1 to 192.168.99.1 port 67 (xid=0x67ed8381)                                                                             
Feb  5 11:50:56 raum dhclient[1051]: DHCPACK of 192.168.99.7 from 192.168.99.1                                                                                                               
Feb  5 11:50:56 raum root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1                                                                                            
Feb  5 11:50:56 raum dhclient[1051]: bound to 192.168.99.7 -- renewal in 239 seconds.                                                                                                        
Feb  5 11:53:03 raum apcupsd[955]: Communications with UPS lost.                                                                                                                             
Feb  5 11:54:55 raum dhclient[1051]: DHCPREQUEST of 192.168.99.7 on em1 to 192.168.99.1 port 67 (xid=0x67ed8381)                                                                             
Feb  5 11:54:55 raum dhclient[1051]: DHCPACK of 192.168.99.7 from 192.168.99.1                                                                                                               
Feb  5 11:54:55 raum root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1                                                                                            
Feb  5 11:54:55 raum dhclient[1051]: bound to 192.168.99.7 -- renewal in 277 seconds.                                                                                                        
Feb  5 11:55:01 raum CRON[54920]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)                                                                                           
Feb  5 11:59:32 raum dhclient[1051]: DHCPREQUEST of 192.168.99.7 on em1 to 192.168.99.1 port 67 (xid=0x67ed8381)                                                                             
Feb  5 11:59:32 raum dhclient[1051]: DHCPACK of 192.168.99.7 from 192.168.99.1                                                                                                               
Feb  5 11:59:33 raum root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1                                                                                            
Feb  5 11:59:33 raum dhclient[1051]: bound to 192.168.99.7 -- renewal in 282 seconds.                                                                                                        
Feb  5 12:03:03 raum apcupsd[955]: Communications with UPS lost.                                                                                                                             
Feb  5 12:04:15 raum dhclient[1051]: DHCPREQUEST of 192.168.99.7 on em1 to 192.168.99.1 port 67 (xid=0x67ed8381)                                                                             
Feb  5 12:04:15 raum dhclient[1051]: DHCPACK of 192.168.99.7 from 192.168.99.1                                                                                                               
Feb  5 12:04:16 raum root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1                                                                                            
Feb  5 12:04:16 raum dhclient[1051]: bound to 192.168.99.7 -- renewal in 249 seconds.                                                                                                        
Feb  5 12:05:01 raum CRON[54024]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)                                                                                           
Feb  5 12:08:25 raum dhclient[1051]: DHCPREQUEST of 192.168.99.7 on em1 to 192.168.99.1 port 67 (xid=0x67ed8381)
Feb  5 12:08:25 raum dhclient[1051]: DHCPACK of 192.168.99.7 from 192.168.99.1
Feb  5 12:08:25 raum root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Feb  5 12:08:26 raum dhclient[1051]: bound to 192.168.99.7 -- renewal in 289 seconds.
Feb  5 12:09:01 raum CRON[64865]: (root) CMD (  [ -x /usr/lib/php5/maxlifetime ] && [ -x /usr/lib/php5/sessionclean ] && [ -d /var/lib/php5 ] && /usr/lib/php5/sessionclean /var/lib/php5 $(/usr/lib/php5/maxlifetime))
@mwaeckerlin

mwaeckerlin commented Feb 5, 2018

In the CGI GUI, I cannot see any problem (except that 2 chunks have somehow been lost):

image

image

To me, the statistics don't look bad either. The memory graph looks completely full, but it is scaled to 275 MB while 40 GB are available, so that's nothing.

image
image
image
image

Current configuration:
image

@DarkHaze

Contributor

DarkHaze commented Feb 5, 2018

First of all, switching the goal from 1 to 2 increases the number of write operations by a factor of 2, so LizardFS has twice as many write operations to perform. It's not surprising that this reduced system performance.
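
To put rough numbers on that, here is a minimal sketch of the arithmetic, assuming plain replication goals (no XOR or erasure-coded goals). The function names and the 100 GiB figure are made up for illustration and are not part of LizardFS.

```python
def disk_bytes_written(logical_bytes, goal):
    """Bytes that land on chunkserver disks for a plain replication goal."""
    if goal < 1:
        raise ValueError("goal must be at least 1")
    return logical_bytes * goal


def rereplication_bytes(stored_bytes, old_goal, new_goal):
    """Extra bytes copied in the background when the goal of existing data is raised."""
    return stored_bytes * max(new_goal - old_goal, 0)


GIB = 1024 ** 3
example = 100 * GIB  # illustrative amount of data, not taken from this cluster

print(disk_bytes_written(example, 1) // GIB)      # 100
print(disk_bytes_written(example, 2) // GIB)      # 200
print(rereplication_bytes(example, 1, 2) // GIB)  # 100
```

The second function is the part that matters right after a goal change: data that already exists has to be copied once more in the background, on top of the doubled cost of every new write.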

In your case, I'm guessing that your hard drives are overloaded. You can check this with the command
iostat -x 5
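
If the output gets long, a small script can help pick out the overloaded devices. The following is only a sketch: it assumes the column layout of the `iostat -x` version shown in this thread (a `Device:` header row followed by one row per device), and the thresholds of 90 for `%util` and 100 ms for `await` are arbitrary illustrative values, not recommendations. The sample rows are copied from the pulsar output posted further down.

```python
def parse_iostat(text):
    """Parse `iostat -x` text into a list of reports: [{device: {column: value}}, ...]."""
    reports = []
    columns = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            columns = None  # a blank line ends the current device table
            continue
        if fields[0] in ("Device:", "Device"):
            columns = fields[1:]
            reports.append({})
            continue
        if columns is not None and len(fields) == len(columns) + 1:
            try:
                values = [float(v) for v in fields[1:]]
            except ValueError:
                continue  # not a numeric device row
            reports[-1][fields[0]] = dict(zip(columns, values))
    return reports


def busy_devices(report, util_threshold=90.0, await_threshold=100.0):
    """Return device names whose %util or await is at or above the given thresholds."""
    return sorted(
        name
        for name, stats in report.items()
        if stats.get("%util", 0.0) >= util_threshold
        or stats.get("await", 0.0) >= await_threshold
    )


SAMPLE = """\
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.60    0.00    14.40     0.00    48.00     0.01    9.33    9.33    0.00   6.67   0.40
sdd               0.00    18.20    0.00  544.00     0.00  9737.60    35.80   143.37  263.74    0.00  263.74   1.84 100.00
dm-1              0.00     0.00    1.80    1.40   579.20     5.60   365.50  2850.70 672229.50   16.44 1536503.43 312.50 100.00
"""

reports = parse_iostat(SAMPLE)
print(busy_devices(reports[0]))  # ['dm-1', 'sdd']
```

Note that the first report iostat prints covers the time since boot, so when parsing real output it is usually better to look at `reports[1:]`.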

If you tell us what performance you need and describe your system (number and types of hard drives, and whether they are configured as RAID), we can try to estimate whether your system can achieve it.

@mwaeckerlin

mwaeckerlin commented Feb 5, 2018

@DarkHaze, can I limit the replication bandwidth? That used to be buggy, so I don't dare to enable it (see #658).

Hard disk configuration (details in my blog):

  • there is no RAID
  • all hard disks are in one huge LUKS volume
  • on top of LUKS, there is dm-crypt
  • on top of dm-crypt there is the btrfs root filesystem
  • the chunks are in a path within the root filesystem
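
Because of this stacking, the `dm-N` names in the iostat output are hard to match to the layers above. A small sketch that reads the mapping from sysfs may help; it assumes a Linux system where each `/sys/block/dm-N` directory has a `dm/name` file and a `slaves` directory. The function name `dm_device_map` is made up for this example.

```python
from pathlib import Path


def dm_device_map(sysfs_block="/sys/block"):
    """Return {kernel_name: (mapper_name, [underlying devices])} for all dm-* devices."""
    root = Path(sysfs_block)
    result = {}
    if not root.is_dir():
        return result  # not a Linux system, or sysfs is not mounted
    for dev in sorted(root.glob("dm-*")):
        name_file = dev / "dm" / "name"
        name = name_file.read_text().strip() if name_file.is_file() else "?"
        slaves_dir = dev / "slaves"
        slaves = sorted(p.name for p in slaves_dir.iterdir()) if slaves_dir.is_dir() else []
        result[dev.name] = (name, slaves)
    return result


for kernel_name, (name, slaves) in dm_device_map().items():
    print(f"{kernel_name}: {name} <- {', '.join(slaves) or 'no underlying devices listed'}")
```

With that mapping, a busy `dm-N` row can be traced down to the physical disk it sits on.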

Here are the results of iostat -x 5 10 on the servers. Currently, all chunks should migrate to universum and urknall, so these two are probably the most interesting. What does this tell you?

Host universum (here only sda is used; sdb is experimentally formatted for ZFS):

Linux 4.4.0-112-generic (universum)     05.02.2018      _x86_64_        (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.19    0.02    4.54   21.89    0.00   71.36

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.04    4.05     0.32   318.02   155.39     0.04   10.23    2.73   10.31   9.33   3.82
sda               1.34    11.58    9.98  102.14  3423.63  7355.45   192.28     1.09    9.72   46.73    6.11   3.13  35.14
dm-0              0.00     0.00    1.47    3.66     5.93    14.63     8.02     3.81  743.60   16.18 1036.30   8.16   4.19
dm-1              0.00     0.00    8.33  107.63  3417.65  7340.36   185.55     1.77   15.24   47.08   12.78   3.02  35.05
dm-2              0.00     0.00    8.33  107.63  3417.65  7340.36   185.55    15.41  132.89   47.79  139.48   3.03  35.16
dm-3              0.00     0.00    1.47    3.66     5.90    14.63     8.01     3.82  744.95   16.22 1037.94   8.18   4.20

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.96    0.00    2.51   19.38    0.00   77.15

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00    8.20     0.00   631.20   153.95     0.10   11.71    0.00   11.71  10.24   8.40
sda               0.00    11.40   27.00   59.60 13128.00 26463.20   914.35    24.26  280.08   32.71  392.15   4.28  37.04
dm-0              0.00     0.00    0.00   11.20     0.00    44.80     8.00    11.17  997.43    0.00  997.43  21.29  23.84
dm-1              0.00     0.00   16.80   37.40 13128.00 26418.40  1459.28    14.86  274.24   32.95  382.63   6.82  36.96
dm-2              0.00     0.00   16.80   37.40 13128.00 26418.40  1459.28    16.89  311.66   34.19  436.30   6.82  36.96
dm-3              0.00     0.00    0.00   11.20     0.00    44.80     8.00    11.17  997.43    0.00  997.43  21.29  23.84

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.63    0.00    0.78   14.75    0.00   83.84

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    1.40   18.80     5.60  1738.40   172.67     0.19    9.58    0.00   10.30   8.91  18.00
sda               0.00     0.00    1.40    0.40    32.00     2.40    38.22     0.02    9.78   10.29    8.00   7.11   1.28
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-1              0.00     0.00    1.40    0.40    32.00     2.40    38.22     0.02    9.78   10.29    8.00   7.11   1.28
dm-2              0.00     0.00    1.40    0.40    32.00     2.40    38.22     0.02    9.78   10.29    8.00   7.11   1.28
dm-3              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.89    0.00    2.53   13.08    0.00   83.50

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     1.60    0.00   40.00     0.00 18704.80   935.24    12.97  233.38    0.00  233.38   4.66  18.64
dm-0              0.00     0.00    0.00    1.60     0.00     6.40     8.00     0.50  144.00    0.00  144.00 101.00  16.16
dm-1              0.00     0.00    0.00   29.40     0.00 26282.40  1787.92     6.67  164.24    0.00  164.24   6.31  18.56
dm-2              0.00     0.00    0.00   29.40     0.00 26282.40  1787.92     6.72  165.50    0.00  165.50   6.37  18.72
dm-3              0.00     0.00    0.00    1.60     0.00     6.40     8.00     0.50  144.00    0.00  144.00 101.00  16.16

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.51    0.00    1.39   18.03    0.00   80.07

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     6.00   11.80  229.00  5390.40 14276.80   163.35    35.36  161.94   28.20  168.83   1.93  46.48
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.40    0.00    0.00    0.00   0.00   6.64
dm-1              0.00     0.00    6.80  220.00  5390.40  6692.80   106.55    35.31  163.80   27.76  168.01   2.05  46.40
dm-2              0.00     0.00    6.80  220.00  5390.40  6692.80   106.55   123.01  550.53   29.06  566.65   2.06  46.64
dm-3              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.40    0.00    0.00    0.00   0.00   6.64

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.86    0.00    1.67   12.46    0.00   85.01

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-3              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.89    0.00    1.57   13.77    0.00   83.77

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00    13.60    0.00   31.00     0.00 13352.00   861.42     7.94  241.42    0.00  241.42   5.06  15.68
dm-0              0.00     0.00    0.00   15.00     0.00    60.00     8.00     7.64  509.33    0.00  509.33   8.53  12.80
dm-1              0.00     0.00    0.00   29.80     0.00 26266.40  1762.85     4.10  130.04    0.00  130.04   4.81  14.32
dm-2              0.00     0.00    0.00   29.80     0.00 26266.40  1762.85     4.16  130.95    0.00  130.95   4.83  14.40
dm-3              0.00     0.00    0.00   15.00     0.00    60.00     8.00     7.64  509.33    0.00  509.33   8.53  12.80

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.04    0.00    1.44   14.20    0.00   83.32

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00   32.00     0.00 13080.80   817.55     7.25  240.70    0.00  240.70   4.58  14.64
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    6.60     0.00   106.40    32.24     3.95  632.97    0.00  632.97  22.18  14.64
dm-2              0.00     0.00    0.00    6.60     0.00   106.40    32.24     3.95  636.48    0.00  636.48  22.18  14.64
dm-3              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.61    0.00    0.88   12.64    0.00   85.87

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00    94.60    1.40    7.40   164.80   413.60   131.45     0.15   16.91   11.43   17.95   3.27   2.88
dm-0              0.00     0.00    0.00  102.40     0.00   409.60     8.00     2.23   21.82    0.00   21.82   0.09   0.96
dm-1              0.00     0.00    1.40    0.60   164.80     4.00   168.80     0.03   12.80   11.43   16.00   9.60   1.92
dm-2              0.00     0.00    1.40    0.60   164.80     4.00   168.80     0.03   12.80   11.43   16.00   9.60   1.92
dm-3              0.00     0.00    0.00  102.40     0.00   409.60     8.00     2.24   21.90    0.00   21.90   0.09   0.96

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.96    0.00    2.50   15.42    0.00   81.11

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     4.20    0.00   62.40     0.00 26416.80   846.69    15.39  246.63    0.00  246.63   4.36  27.20
dm-0              0.00     0.00    0.00    4.00     0.00    16.00     8.00     2.25  562.40    0.00  562.40  28.40  11.36
dm-1              0.00     0.00    0.00   37.00     0.00 26400.80  1427.07     8.13  219.70    0.00  219.70   7.31  27.04
dm-2              0.00     0.00    0.00   37.00     0.00 26400.80  1427.07     8.18  221.10    0.00  221.10   7.33  27.12
dm-3              0.00     0.00    0.00    4.00     0.00    16.00     8.00     2.25  562.60    0.00  562.60  28.40  11.36

Host urknall:

Linux 4.4.0-104-generic (urknall)       05.02.2018      _x86_64_        (2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.39    0.38   26.42   57.10    0.00   11.71

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.09     5.78    3.28  152.22   879.68  3960.89    62.26     3.28   21.12   72.12   20.02   2.39  37.12
sdb               0.09     5.33    2.72   25.50  1329.62  2062.46   240.46     3.67  130.19   65.90  137.04  20.30  57.26
sdc               0.00     0.04    0.03    2.58     0.78    48.40    37.62     0.88  338.21   14.82  342.16   3.42   0.89
sdd               0.00     0.07    0.04    3.54     0.82    71.59    40.47     1.00  279.81   16.45  282.55   2.56   0.91
dm-0              0.00     0.00    5.20  194.05  2210.87  6143.35    83.86     7.15   35.89   72.62   34.90   4.63  92.34
dm-1              0.00     0.00    5.20  194.05  2210.86  6143.35    83.86     4.10   20.59  107.54   18.26   4.66  92.93

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.83    0.00   26.27    3.28    0.00   64.62

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     1.00    0.20    5.60     1.60   132.00    46.07     0.13   21.66   12.00   22.00   6.07   3.52
sdb               0.00     0.00    0.00   18.80     0.00 13115.20  1395.23     2.78  148.04    0.00  148.04  13.70  25.76
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.20   20.80     1.60 13247.20  1261.79     2.32  110.40   12.00  111.35  13.22  27.76
dm-1              0.00     0.00    0.20   20.80     1.60 13247.20  1261.79     5.65  268.91   12.00  271.38  13.68  28.72

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           6.24    0.00   27.34    6.57    0.00   59.85

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.40    0.00   10.40     0.00   255.20    49.08     0.27   26.08    0.00   26.08   5.62   5.84
sdb               0.00     0.00    0.00   20.40     0.00 13116.80  1285.96     1.29   63.37    0.00   63.37  11.88  24.24
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00   25.40     0.00 13372.00  1052.91     1.22   48.16    0.00   48.16  11.46  29.12
dm-1              0.00     0.00    0.00   25.40     0.00 13372.00  1052.91     5.93  233.32    0.00  233.32  11.94  30.32

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.47    0.00   38.53    7.68    0.00   50.32

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.40    0.00    7.40     0.00   168.00    45.41     0.12   16.00    0.00   16.00   3.24   2.40
sdb               0.80     0.00   17.20   21.00 13112.00 14293.60  1434.85     3.21   84.08   34.33  124.84  11.92  45.52
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00   17.80   23.20 13112.00 14461.60  1345.05     2.82   68.86   34.88   94.93  11.30  46.32
dm-1              0.00     0.00   17.80   23.20 13112.00 14461.60  1345.05     7.70  187.82   70.70  277.69  12.31  50.48

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.13    0.00   19.77    5.25    0.00   73.84

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     2.00    0.00   36.53     0.00  5433.13   297.49     3.38   65.05    0.00   65.05   6.78  24.75
sdb               0.00     0.00    0.00    1.80     0.00   426.35   474.67     0.11   59.11    0.00   59.11  38.67   6.95
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00   69.06     0.00  6389.62   185.04     3.58   36.60    0.00   36.60   3.82  26.35
dm-1              0.00     0.00    0.00  281.64     0.00 11358.88    80.66    17.09   10.12    0.00   10.12   0.95  26.75

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.49    0.00   32.49   13.29    0.00   48.73

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00    35.40    0.00  211.60     0.00  5580.00    52.74    34.77  169.10    0.00  169.10   1.46  30.88
sdb               0.00     0.20   26.20    2.00 13108.80     9.60   930.38     0.68   24.14   20.49   72.00   5.25  14.80
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00   13.60  220.00 13108.80  5058.40   155.54    41.70  183.02   21.41  193.01   1.90  44.48
dm-1              0.00     0.00   13.60    7.00 13108.80    79.20  1280.39   167.89 8842.64   85.59 25856.34  26.10  53.76

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.82    0.00    6.50    0.65    0.00   90.03

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.20    0.00    1.40     0.00    24.00    34.29     0.03   16.57    0.00   16.57   9.14   1.28
sdb               0.00     0.00    0.00    0.40     0.00     2.40    12.00     0.01   34.00    0.00   34.00  34.00   1.36
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    2.20     0.00    29.60    26.91     0.04   17.45    0.00   17.45  12.00   2.64
dm-1              0.00     0.00    0.00    2.20     0.00    29.60    26.91     0.04   17.45    0.00   17.45  12.00   2.64

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           7.90    0.00   25.50    0.32    0.00   66.28

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00    0.40     0.00     4.80    24.00     0.02   74.00    0.00   74.00  60.00   2.40
sdb               0.00     0.00    0.00    0.20     0.00     0.80     8.00     0.01   36.00    0.00   36.00  36.00   0.72
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.40     0.00     2.40    12.00     0.03   92.00    0.00   92.00  78.00   3.12
dm-1              0.00     0.00    0.00    0.40     0.00     2.40    12.00     0.03   92.00    0.00   92.00  78.00   3.12

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           6.80    0.00   42.83    0.97    0.00   49.41

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    4.20    0.00  1568.80     0.00   747.05     0.04    9.14    9.14    0.00   6.29   2.64
sdb               2.80     0.00   47.60    0.20 24657.60     0.80  1031.73     0.62   13.00   12.89   40.00   3.23  15.44
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00   31.60    0.20 26226.40     0.80  1649.51     0.45   14.29   14.13   40.00   5.51  17.52
dm-1              0.00     0.00   31.60    0.20 26226.40     0.80  1649.51     1.88   59.14   59.27   40.00  13.11  41.68

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.01    0.00   25.16    1.51    0.00   70.32

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.20     0.00    2.80    0.60  1947.20     2.40  1146.82     0.09   27.06   28.86   18.67   9.88   3.36
sdb               0.00     0.00    0.00   15.20     0.00 10237.60  1347.05     2.57  109.95    0.00  109.95  11.53  17.52
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    3.00   13.80  1947.20 13112.00  1792.76     1.90   68.48   32.00   76.41  12.10  20.32
dm-1              0.00     0.00    3.00   13.80  1947.20 13112.00  1792.76     5.20  200.43   48.27  233.51  12.76  21.44

Host pulsar:

Linux 4.4.0-104-generic (pulsar)        05.02.2018      _x86_64_        (2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           8.78    0.26   15.83   58.98    0.00   16.15

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     1.58    0.44   19.92    11.95   619.75    62.06     2.76  135.37   30.45  137.67   4.32   8.79
sdb               0.05     2.21    2.31   37.78   903.66  1850.31   137.38     8.33  207.82  209.03  207.75  20.61  82.64
sdc               0.01     0.22    0.71    6.56    48.65   287.61    92.48     1.43  197.21    8.56  217.60   3.68   2.67
sdd               0.00     0.40    0.46   15.17   133.02   786.23   117.60     2.84  181.69   27.56  186.38   3.97   6.20
dm-0              0.00     0.00    3.49   82.93  1097.24  3543.83   107.41     3.70   42.81  127.52   39.24  11.02  95.26
dm-1              0.00     0.00    3.49   82.93  1097.24  3543.83   107.41     7.14   82.63  146.48   79.94  11.07  95.69

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.17    0.00    8.11   22.49    0.00   64.24

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00   12.80     0.00    51.20     8.00     0.09    6.88    0.00    6.88   6.88   8.80
sdb               0.00     0.00    0.00   13.00     0.00    59.20     9.11     0.46   36.31    0.00   36.31  35.75  46.48
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     0.80    0.00   98.40     0.00  1638.40    33.30     1.69   17.18    0.00   17.18   4.11  40.40
dm-0              0.00     0.00    0.00  124.80     0.00  1748.00    28.01     2.26   18.17    0.00   18.17   7.69  96.00
dm-1              0.00     0.00    0.00  125.40     0.00  1757.60    28.03     2.43   19.42    0.00   19.42   7.84  98.32

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.81    0.00   61.85   16.16    0.00   20.18

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00   36.20     0.00  4421.60   244.29     1.56   43.14    0.00   43.14   8.09  29.28
sdb               0.00     0.00    0.00    4.00     0.00   398.40   199.20     0.20   50.20    0.00   50.20  42.00  16.80
sdc               0.00     0.00    0.00    0.80     0.00   204.80   512.00     0.02   19.00    0.00   19.00  13.00   1.04
sdd               0.00    11.00    0.00  199.80     0.00 13408.00   134.21    41.79  182.39    0.00  182.39   3.75  74.88
dm-0              0.00     0.00    0.00  282.20     0.00 19008.80   134.72    47.28  147.45    0.00  147.45   3.02  85.28
dm-1              0.00     0.00    0.00 1099.40     0.00 33588.00    61.10  1218.66   97.09    0.00   97.09   0.78  86.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.61    0.00    4.86   75.99    0.00   18.54

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.60    0.00    14.40     0.00    48.00     0.01    9.33    9.33    0.00   6.67   0.40
sdb               0.00     0.00    1.40    0.00   564.80     0.00   806.86     0.01    8.00    8.00    0.00   5.14   0.72
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00    18.20    0.00  544.00     0.00  9737.60    35.80   143.37  263.74    0.00  263.74   1.84 100.00
dm-0              0.00     0.00    1.80  562.20   579.20  9702.40    36.46   149.33  264.86    8.89  265.68   1.77 100.00
dm-1              0.00     0.00    1.80    1.40   579.20     5.60   365.50  2850.70 672229.50   16.44 1536503.43 312.50 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.14    0.00    8.79   43.54    0.00   43.54

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00    3.79     0.00    15.17     8.00     0.05   14.11    0.00   14.11   7.16   2.71
sdb               0.00     0.00    0.00   17.17     0.00  1219.96   142.14     2.50  145.67    0.00  145.67  21.53  36.97
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00    16.37    0.00  282.04     0.00  5663.87    40.16    69.26  264.07    0.00  264.07   2.19  61.80
dm-0              0.00     0.00    0.00  289.62     0.00  6378.44    44.05    77.36  286.44    0.00  286.44   3.39  98.04
dm-1              0.00     0.00    0.00   33.13     0.00  1496.21    90.31   374.43 65862.77    0.00 65862.77  29.88  99.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.54    0.00    6.19   21.50    0.00   69.78

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00   10.60     0.00    42.40     8.00     0.08    7.09    0.00    7.09   7.09   7.52
sdb               0.00     0.00    0.00   14.00     0.00   191.20    27.31     0.70   50.00    0.00   50.00  36.80  51.52
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     0.00    0.00   85.60     0.00  1374.40    32.11     1.69   19.92    0.00   19.92   4.63  39.60
dm-0              0.00     0.00    0.00  109.20     0.00  1589.60    29.11     2.47   22.67    0.00   22.67   8.81  96.24
dm-1              0.00     0.00    0.00  109.20     0.00  1589.60    29.11     2.58   23.74    0.00   23.74   9.03  98.56

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.16    0.00    8.05   21.51    0.00   67.28

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    1.20   11.80    28.80    47.20    11.69     0.10    7.32   11.33    6.92   6.95   9.04
sdb               0.00     0.00    1.20   14.60   105.60   149.60    32.30     0.68   43.09   22.67   44.77  34.94  55.20
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     3.20    0.00   82.20     0.00  1526.40    37.14     1.41   17.01    0.00   17.01   4.42  36.32
dm-0              0.00     0.00    2.40  112.20   134.40  1732.00    32.57     2.22   19.32   17.00   19.37   8.40  96.32
dm-1              0.00     0.00    2.40  112.20   134.40  1732.00    32.57     2.33   20.24   19.33   20.26   8.57  98.24

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.03    0.00    7.58   21.82    0.00   67.58

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.40   12.00     6.40    48.00     8.77     0.10    8.00   16.00    7.73   8.00   9.92
sdb               0.00     0.00    1.20   14.00    80.00   102.40    24.00     0.63   41.26   20.00   43.09  35.00  53.20
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     6.40    0.00   69.00     0.00  1529.60    44.34     1.38   20.21    0.00   20.21   5.29  36.48
dm-0              0.00     0.00    1.60  101.80    86.40  1686.40    34.29     2.20   21.41   19.00   21.45   9.28  96.00
dm-1              0.00     0.00    1.60  102.40    86.40  1696.00    34.28     2.33   22.52   21.00   22.55   9.50  98.80

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.36    0.00    7.58   21.74    0.00   65.32

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00   11.80     0.00    48.00     8.14     0.09    7.73    0.00    7.73   7.59   8.96
sdb               0.00     0.00    0.00   13.60     0.00   132.80    19.53     0.56   41.24    0.00   41.24  36.59  49.76
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     1.60    0.00   87.80     0.00  1488.00    33.90     1.68   19.16    0.00   19.16   4.47  39.28
dm-0              0.00     0.00    0.00  114.00     0.00  1655.20    29.04     2.35   20.59    0.00   20.59   8.39  95.68
dm-1              0.00     0.00    0.00  113.40     0.00  1645.60    29.02     2.46   21.73    0.00   21.73   8.60  97.52

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.05    0.00    6.11   23.42    0.00   67.41

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00   11.60     0.00    48.80     8.41     0.11    9.72    0.00    9.72   9.10  10.56
sdb               0.00     0.00    0.00   11.80     0.00    69.60    11.80     0.48   40.47    0.00   40.47  38.85  45.84
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     0.40    0.00   94.60     0.00  1575.20    33.30     1.90   20.09    0.00   20.09   4.40  41.60
dm-0              0.00     0.00    0.00  120.20     0.00  1723.20    28.67     2.50   20.79    0.00   20.79   7.99  96.08
dm-1              0.00     0.00    0.00  120.20     0.00  1723.20    28.67     2.62   21.76    0.00   21.76   8.17  98.16

Host raum:

Linux 4.4.0-104-generic (raum)  05.02.2018      _x86_64_        (2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          12.98    0.08   27.03   16.57    0.00   43.34

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda              13.55    16.70   26.75   27.26 10233.84   689.15   404.49     4.70   86.95   38.25  134.75   4.44  23.99
sdb               0.17     0.04    4.52    4.15  2305.61  1664.37   915.07     0.56   65.03   24.89  108.77   5.38   4.67
sdc               0.19     5.24   11.64   96.53  1859.16  3270.92    94.85     2.44   22.57   45.59   19.80   3.10  33.49
sdd               0.00     0.00    0.00    0.00     0.01     0.00    28.15     0.00    2.42    2.42    0.00   2.18   0.00
dm-0              0.00     0.00   20.99   18.66    83.97    74.62     8.00     3.45   87.03   26.41  155.25   1.79   7.11
dm-1              0.00     0.00   27.28  130.73 14314.61  5549.73   251.42     8.12   51.40   42.17   53.33   3.28  51.78
dm-2              0.00     0.00   27.28  130.73 14314.61  5549.73   251.42     0.38    2.37   53.19   58.88   3.37  53.24
dm-3              0.00     0.00   20.99   18.66    83.96    74.62     8.00     4.88  123.07   26.62  231.60   1.82   7.23

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          10.54    0.00   15.05    1.54    0.00   72.88

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.60    0.80     9.60     3.20    18.29     0.06   40.00   58.67   26.00  37.14   5.20
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdc               0.80     0.00   16.00    0.00  5816.00     0.00   727.00     0.26   15.95   15.95    0.00   4.45   7.12
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.80     0.00     3.20     8.00     0.02   26.00    0.00   26.00  21.00   1.68
dm-1              0.00     0.00   14.40    0.00  5825.60     0.00   809.11     0.26   17.89   17.89    0.00   6.67   9.60
dm-2              0.00     0.00   14.40    0.00  5825.60     0.00   809.11     0.34   23.83   23.83    0.00   7.50  10.80
dm-3              0.00     0.00    0.00    0.80     0.00     3.20     8.00     0.02   26.00    0.00   26.00  21.00   1.68

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          22.87    0.00   30.87    2.15    0.00   44.10

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.80    0.00    12.80     0.00    32.00     0.06   70.00   70.00    0.00  70.00   5.60
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdc               0.00     0.00    4.60    0.60    96.80     2.40    38.15     0.07   13.85   12.00   28.00   9.23   4.80
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-1              0.00     0.00    5.40    0.60   109.60     2.40    37.33     0.13   21.20   20.59   26.67  17.07  10.24
dm-2              0.00     0.00    5.40    0.60   109.60     2.40    37.33     0.13   21.47   20.89   26.67  17.33  10.40
dm-3              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           7.04    0.00   16.84    2.86    0.00   73.27

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.20     1.00    1.20   11.60    42.40    50.40    14.50     1.16   90.88   46.67   95.45  12.69  16.24
sdb               0.00     0.00   24.80    0.20 13008.00     0.80  1040.70     0.36   14.34   14.32   16.00   3.49   8.72
sdc               0.20     0.00    2.60    0.00    43.20     0.00    33.23     0.02    8.92    8.92    0.00   8.92   2.32
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00   12.60     0.00    50.40     8.00     1.28  101.52    0.00  101.52   9.40  11.84
dm-1              0.00     0.00   18.40    0.20 13093.60     0.80  1408.00     0.30   16.00   16.00   16.00   7.10  13.20
dm-2              0.00     0.00   18.40    0.20 13093.60     0.80  1408.00     0.47   25.46   25.57   16.00   9.03  16.80
dm-3              0.00     0.00    0.00   12.60     0.00    50.40     8.00     1.29  102.73    0.00  102.73   9.52  12.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.05    0.00   11.17    7.27    0.00   79.51

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    1.60   17.60    21.60  1261.60   133.67     0.66   34.17   18.50   35.59   7.25  13.92
sdb               0.80     0.00   16.80    1.40  9881.60   108.80  1097.85     0.33   17.93   17.86   18.86   4.48   8.16
sdc               0.00     0.00    7.00    5.20   116.80   563.20   111.48     0.34   27.15   17.60   40.00  11.08  13.52
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    7.40     0.00    29.60     8.00     0.32   43.68    0.00   43.68   8.32   6.16
dm-1              0.00     0.00   19.60   16.80 10023.20  1904.00   655.34     0.88   24.20   17.47   32.05   6.92  25.20
dm-2              0.00     0.00   19.60   16.80 10023.20  1904.00   655.34     1.05   28.66   24.04   34.05   7.49  27.28
dm-3              0.00     0.00    0.00    7.40     0.00    29.60     8.00     0.33   44.11    0.00   44.11   8.43   6.24

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          13.02    0.00   30.88   43.49    0.00   12.61

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.20   46.00     3.20  5044.80   218.53     4.64  100.38  144.00  100.19   4.87  22.48
sdb               0.40     0.00    5.80    0.00  3328.00     0.00  1147.59     0.04    7.03    7.03    0.00   2.76   1.60
sdc               0.00    47.00    3.40  534.80    45.60 15830.40    59.00   105.25  188.75  185.65  188.77   1.71  92.00
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    6.80     0.00    27.20     8.00     0.35   51.41    0.00   51.41   8.47   5.76
dm-1              0.00     0.00    7.60  649.00  3380.00 21526.40    75.86   119.38  175.83   91.26  176.82   1.47  96.40
dm-2              0.00     0.00    7.60  753.60  3380.00 24097.60    72.20  1503.51 1338.17  101.89 1350.64   1.27  96.64
dm-3              0.00     0.00    0.00    6.80     0.00    27.20     8.00     0.35   51.53    0.00   51.53   8.47   5.76

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.01    0.00   13.60   12.68    0.00   68.71

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00    2.00     0.00     8.00     8.00     0.23  116.40    0.00  116.40  43.20   8.64
sdb               1.20     0.00   21.60    0.20 11880.80    90.40  1098.28     0.30   13.43   13.37   20.00   4.37   9.52
sdc               0.40    18.40    3.40  114.00    60.80  3159.20    54.86    20.62  206.86  141.41  208.81   1.85  21.68
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    1.60     0.00     6.40     8.00     0.14   85.50    0.00   85.50  34.00   5.44
dm-1              0.00     0.00   18.40  105.20 12243.20  2573.60   239.75    25.59  238.83   38.30  273.90   2.30  28.48
dm-2              0.00     0.00   18.40    0.60 12243.20     2.40  1289.01    56.61 28499.62   47.57 901029.33  16.55  31.44
dm-3              0.00     0.00    0.00    1.60     0.00     6.40     8.00     0.14   85.50    0.00   85.50  34.00   5.44

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.26    0.00   10.48    3.49    0.00   83.76

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.80    6.60    12.80    26.40    10.59     0.51   68.54   70.00   68.36  23.78  17.60
sdb               1.60     0.00   23.00    0.00 10957.60     0.00   952.83     0.47   20.80   20.80    0.00   5.08  11.68
sdc               0.40     0.00    4.20    0.00    78.40     0.00    37.33     0.04    9.90    9.90    0.00   9.14   3.84
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    6.40     0.00    25.60     8.00     0.45   70.50    0.00   70.50  18.88  12.08
dm-1              0.00     0.00   22.40    0.00 10740.80     0.00   959.00     0.48   21.96   21.96    0.00   7.64  17.12
dm-2              0.00     0.00   22.40    0.00 10740.80     0.00   959.00     0.62   28.14   28.14    0.00   8.54  19.12
dm-3              0.00     0.00    0.00    6.40     0.00    25.60     8.00     0.45   70.62    0.00   70.62  18.88  12.08

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           6.26    0.00   11.79    4.92    0.00   77.03

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     1.00    1.40    6.40    21.60    29.60    13.13     0.71   91.08   87.43   91.88  18.97  14.80
sdb               1.20     0.00   19.20    0.00  8858.40     0.00   922.75     0.29   14.96   14.96    0.00   5.38  10.32
sdc               0.00     0.00    3.00    0.40    48.00     2.40    29.65     0.06   16.94   16.80   18.00  14.12   4.80
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    7.40     0.00    29.60     8.00     0.70   94.92    0.00   94.92   3.46   2.56
dm-1              0.00     0.00   19.00    0.40  9133.60     2.40   941.86     0.42   21.24   21.31   18.00  12.04  23.36
dm-2              0.00     0.00   19.00    0.40  9133.60     2.40   941.86     0.52   25.90   26.06   18.00  12.70  24.64
dm-3              0.00     0.00    0.00    7.40     0.00    29.60     8.00     0.70   95.24    0.00   95.24   3.57   2.64

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.05    0.00   11.07    4.41    0.00   82.48

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.20     4.60    0.60    8.40    12.80    52.00    14.40     0.51   57.07  112.00   53.14  15.64  14.08
sdb               1.60     0.00   22.60    0.00 12444.00     0.00  1101.24     0.28   12.50   12.50    0.00   4.50  10.16
sdc               0.20     0.00    4.20    0.00    70.40     0.00    33.52     0.07   16.00   16.00    0.00  15.43   6.48
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00   13.00     0.00    52.00     8.00     0.71   54.58    0.00   54.58   6.15   8.00
dm-1              0.00     0.00   21.60    0.00 12321.60     0.00  1140.89     0.39   18.59   18.59    0.00   7.81  16.88
dm-2              0.00     0.00   21.60    0.00 12321.60     0.00  1140.89     0.56   26.70   26.70    0.00   9.22  19.92
dm-3              0.00     0.00    0.00   13.00     0.00    52.00     8.00     0.71   54.58    0.00   54.58   6.15   8.00

To me, this looks quite good and does not explain the problems.
Any idea how to proceed?
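
To make the iostat dumps above easier to scan, here is a minimal Python sketch (not part of the original report) that flags device rows whose await or %util exceeds a threshold. The thresholds (100 ms and 90 %) and the function names are assumptions chosen for illustration; the sample rows are copied from the raum output above.

```python
# Column names as printed by the extended iostat output shown above.
COLUMNS = ("rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz "
           "await r_await w_await svctm %util").split()


def parse_device_line(line):
    """Return (device, stats) for a device row, or None for any other line."""
    parts = line.split()
    if len(parts) != len(COLUMNS) + 1:
        return None  # avg-cpu lines, blank lines, host banners
    try:
        values = [float(p) for p in parts[1:]]
    except ValueError:
        return None  # the "Device: rrqm/s ..." header row
    return parts[0], dict(zip(COLUMNS, values))


def busy_devices(text, max_await_ms=100.0, max_util=90.0):
    """List (device, await, %util) for rows above either threshold.

    The default thresholds are illustrative assumptions, not tuned values.
    """
    hits = []
    for line in text.splitlines():
        parsed = parse_device_line(line)
        if parsed is None:
            continue
        name, stats = parsed
        if stats["await"] > max_await_ms or stats["%util"] > max_util:
            hits.append((name, stats["await"], stats["%util"]))
    return hits


# Four rows copied from one of the raum samples above.
SAMPLE = """\
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.20   46.00     3.20  5044.80   218.53     4.64  100.38  144.00  100.19   4.87  22.48
sdb               0.40     0.00    5.80    0.00  3328.00     0.00  1147.59     0.04    7.03    7.03    0.00   2.76   1.60
sdc               0.00    47.00    3.40  534.80    45.60 15830.40    59.00   105.25  188.75  185.65  188.77   1.71  92.00
dm-2              0.00     0.00    7.60  753.60  3380.00 24097.60    72.20  1503.51 1338.17  101.89 1350.64   1.27  96.64
"""

for name, await_ms, util in busy_devices(SAMPLE):
    print(f"{name}: await {await_ms} ms, util {util} %")
```

Feeding the full output of each host through busy_devices would show which samples contain the write bursts, for example the dm-2 rows on raum.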

Blackpaw commented Feb 5, 2018

Your iowaits are extremely high. Has your rebalance finished yet? 5TB will take a while.

Not a big fan of running filesystems (lizardfs in this case) on top of encrypted filesystems - adds unnecessary complications and load. Can't you run dmcrypt on top of lizardfs?

mwaeckerlin commented Feb 6, 2018

@Blackpaw, yes of course it is rebalancing, and yes of course this requires resources. At the current rebalancing speed, rebalancing takes a whole month! But a distributed filesystem shouldn't break down only because it is rebalancing.

But after the answer of @psarna in #658, I'll now try to set REPLICATION_BANDWIDTH_LIMIT_KBPS to 4k, and after the answer of @psarna in #657, I'll also increase the read timeout in the mount points.

I'll report later whether it was successful.

guestisp commented Feb 6, 2018

But after the answer of @psarna in #658, I'll now try to set REPLICATION_BANDWIDTH_LIMIT_KBPS to 4k

4k? 4 kbps? You'll never ever finish a 5TB replication... Better to run with no replication at all in that case.

mwaeckerlin commented Feb 6, 2018

@guestisp, what replication rate would you suggest? Of course I want data duplication, otherwise I would not need LizardFS, but could use standard NFS…

mwaeckerlin commented Feb 6, 2018

@guestisp, let's calculate: Currently, replication would take about a month, so let's run it at a speed that takes two months, rounded up to 80 days. Then with 5TB (=4.3980465111e+13 bits), this results in:

5 TB / 80 Days
= 5TB * 1024 GB/TB * 1024 MB/GB * 1024 kB/MB * 1024 B/kB * 8b/B / 80 d / 24 h/d / 60 min/h / 60 s/min
= 5 * 1024^4 * 8 / 80 / 24 / 3600 b/s
= 6362915 b/s
= 6213 kb/s
= 6 Mb/s

So, you're right, 4kb/s would be too slow.

Correct?

Anyway, I'll start with a lower value and check whether it fixes my problem before I increase it.
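
The estimate above can be double-checked with a short script. This is only a sketch: it follows the comment in treating 1 TB as 1024^4 bytes and 1 kb as 1024 bits, and it does not settle whether REPLICATION_BANDWIDTH_LIMIT_KBPS counts kilobits or kilobytes. The function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 3600


def total_bits(size_tb):
    """Size in bits, using binary units (1 TB = 1024**4 bytes) as in the comment."""
    return size_tb * 1024**4 * 8


def required_rate_bits_per_s(size_tb, days):
    """Average rate needed to move size_tb within the given number of days."""
    return total_bits(size_tb) / (days * SECONDS_PER_DAY)


def days_needed(size_tb, rate_bits_per_s):
    """Days needed to move size_tb at a constant rate in bits per second."""
    return total_bits(size_tb) / rate_bits_per_s / SECONDS_PER_DAY


rate = required_rate_bits_per_s(5, 80)
print(f"needed for 5 TB in 80 days: {rate:.0f} b/s "
      f"= {rate / 1024:.1f} kb/s = {rate / 1024**2:.2f} Mb/s")

# The 4 kb/s setting discussed above, read as 4 * 1024 bits per second.
print(f"5 TB at 4 kb/s: {days_needed(5, 4 * 1024):.0f} days")
```

Under these assumptions the script reproduces the 6362915 b/s figure, and a limit of 4 kb/s works out to roughly 124,000 days, which supports the conclusion that it would be far too slow.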

mwaeckerlin commented Feb 12, 2018

I have now reduced the bandwidth to 1000 kbps. This improves stability a lot, but LizardFS is still unstable.

E.g., MySQL often fails with errors like this:

…
Ibuf: size 1, free list len 3, seg size 5, 0 merges
merged operations:
 insert 0, delete mark 0, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 0 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 17065240959
Log flushed up to   17065235051
Pages flushed up to 17065233844
Last checkpoint at  17065233835
1 pending log flushes, 0 pending chkp writes
415 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 137428992
Dictionary memory allocated 402816
Buffer pool size   8192
Free buffers       7538
Database pages     644
Old database pages 257
Modified db pages  14
Pending reads      0
Pending writes: LRU 0, flush list 6, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 589, created 55, written 1158
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 644, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
Process ID=1, Main thread ID=140406700369664, state: flushing log
Number of rows inserted 301, updated 260, deleted 60, read 6085
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================

=====================================
2018-02-12 09:50:42 0x7fb2fbfff700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 20 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 143 srv_active, 0 srv_shutdown, 15928 srv_idle
srv_master_thread log flush and writes: 16070
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 624
--Thread 140406892402432 has waited at dict0dict.cc line 1238 for 820.00 seconds the semaphore:
Mutex at 0x3c82038, Mutex DICT_SYS created dict0dict.cc:1172, lock var 1

--Thread 140406683584256 has waited at row0purge.cc line 862 for 820.00 seconds the semaphore:
S-lock on RW-latch at 0x3ca8738 created in file dict0dict.cc line 1183
a writer (thread id 140406373218048) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file row0purge.cc line 862
Last time write locked in file /export/home/pb2/build/sb_0-26514852-1514434083.31/release/mysql-5.7.21/storage/innobase/dict/dict0stats.cc line 2376
OS WAIT ARRAY INFO: signal count 589
RW-shared spins 0, rounds 286, OS waits 138
RW-excl spins 0, rounds 1171, OS waits 32
RW-sx spins 6, rounds 159, OS waits 5
Spin rounds per wait: 286.00 RW-shared, 1171.00 RW-excl, 26.50 RW-sx
------------
TRANSACTIONS
------------
Trx id counter 6510754
Purge done for trx's n:o < 6510752 undo n:o < 0 state: running
History list length 35
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 6510749, ACTIVE 820 sec
3 lock struct(s), heap size 1136, 9 row lock(s), undo log entries 8
MySQL thread id 31, OS thread handle 140406892402432, query id 4862 10.0.7.12 icinga Opening tables
INSERT INTO icinga_conninfo (instance_id, connect_time, last_checkin_time, agent_name, agent_version, connect_type, data_start_time) VALUES (1, NOW(), NOW(), 'icinga2 db_ido_mysql', 'r2.8.1-1', 'INITIAL', NOW())
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
 ibuf aio reads:, log i/o's:, sync i/o's:
Pending flushes (fsync) log: 1; buffer pool: 0
941 OS file reads, 1705 OS file writes, 787 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 3, seg size 5, 0 merges
merged operations:
 insert 0, delete mark 0, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 0 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 17065240959
Log flushed up to   17065235051
Pages flushed up to 17065233844
Last checkpoint at  17065233835
1 pending log flushes, 0 pending chkp writes
415 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 137428992
Dictionary memory allocated 402816
Buffer pool size   8192
Free buffers       7538
Database pages     644
Old database pages 257
Modified db pages  14
Pending reads      0
Pending writes: LRU 0, flush list 6, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 589, created 55, written 1158
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 644, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
Process ID=1, Main thread ID=140406700369664, state: flushing log
Number of rows inserted 301, updated 260, deleted 60, read 6085
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
InnoDB: ###### Diagnostic info printed to the standard error stream
2018-02-12T09:50:52.848206Z 0 [Warning] InnoDB: A long semaphore wait:
--Thread 140406892402432 has waited at dict0dict.cc line 1238 for 830.00 seconds the semaphore:
Mutex at 0x3c82038, Mutex DICT_SYS created dict0dict.cc:1172, lock var 1

2018-02-12T09:50:52.848379Z 0 [Warning] InnoDB: A long semaphore wait:
--Thread 140406683584256 has waited at row0purge.cc line 862 for 830.00 seconds the semaphore:
S-lock on RW-latch at 0x3ca8738 created in file dict0dict.cc line 1183
a writer (thread id 140406373218048) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file row0purge.cc line 862
Last time write locked in file /export/home/pb2/build/sb_0-26514852-1514434083.31/release/mysql-5.7.21/storage/innobase/dict/dict0stats.cc line 2376
InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:
InnoDB: Pending preads 0, pwrites 0

=====================================
2018-02-12 09:51:02 0x7fb2fbfff700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 20 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 143 srv_active, 0 srv_shutdown, 15928 srv_idle
srv_master_thread log flush and writes: 16070
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 624
--Thread 140406892402432 has waited at dict0dict.cc line 1238 for 840.00 seconds the semaphore:
Mutex at 0x3c82038, Mutex DICT_SYS created dict0dict.cc:1172, lock var 1

--Thread 140406683584256 has waited at row0purge.cc line 862 for 840.00 seconds the semaphore:
S-lock on RW-latch at 0x3ca8738 created in file dict0dict.cc line 1183
a writer (thread id 140406373218048) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file row0purge.cc line 862
Last time write locked in file /export/home/pb2/build/sb_0-26514852-1514434083.31/release/mysql-5.7.21/storage/innobase/dict/dict0stats.cc line 2376
OS WAIT ARRAY INFO: signal count 589
RW-shared spins 0, rounds 286, OS waits 138
RW-excl spins 0, rounds 1171, OS waits 32
RW-sx spins 6, rounds 159, OS waits 5
Spin rounds per wait: 286.00 RW-shared, 1171.00 RW-excl, 26.50 RW-sx
------------
TRANSACTIONS
------------
Trx id counter 6510754
Purge done for trx's n:o < 6510752 undo n:o < 0 state: running
History list length 35
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 6510749, ACTIVE 840 sec
3 lock struct(s), heap size 1136, 9 row lock(s), undo log entries 8
MySQL thread id 31, OS thread handle 140406892402432, query id 4862 10.0.7.12 icinga Opening tables
INSERT INTO icinga_conninfo (instance_id, connect_time, last_checkin_time, agent_name, agent_version, connect_type, data_start_time) VALUES (1, NOW(), NOW(), 'icinga2 db_ido_mysql', 'r2.8.1-1', 'INITIAL', NOW())
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
 ibuf aio reads:, log i/o's:, sync i/o's:
Pending flushes (fsync) log: 1; buffer pool: 0
941 OS file reads, 1705 OS file writes, 787 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 3, seg size 5, 0 merges
merged operations:
 insert 0, delete mark 0, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 0 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 17065240959
Log flushed up to   17065235051
Pages flushed up to 17065233844
Last checkpoint at  17065233835
1 pending log flushes, 0 pending chkp writes
415 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 137428992
Dictionary memory allocated 402816
Buffer pool size   8192
Free buffers       7538
Database pages     644
Old database pages 257
Modified db pages  14
Pending reads      0
Pending writes: LRU 0, flush list 6, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 589, created 55, written 1158
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 644, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
Process ID=1, Main thread ID=140406700369664, state: flushing log
Number of rows inserted 301, updated 260, deleted 60, read 6085
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================

=====================================
2018-02-12 09:51:22 0x7fb2fbfff700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 20 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 143 srv_active, 0 srv_shutdown, 15928 srv_idle
srv_master_thread log flush and writes: 16070
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 624
--Thread 140406892402432 has waited at dict0dict.cc line 1238 for 860.00 seconds the semaphore:
Mutex at 0x3c82038, Mutex DICT_SYS created dict0dict.cc:1172, lock var 1

--Thread 140406683584256 has waited at row0purge.cc line 862 for 860.00 seconds the semaphore:
S-lock on RW-latch at 0x3ca8738 created in file dict0dict.cc line 1183
a writer (thread id 140406373218048) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file row0purge.cc line 862
Last time write locked in file /export/home/pb2/build/sb_0-26514852-1514434083.31/release/mysql-5.7.21/storage/innobase/dict/dict0stats.cc line 2376
OS WAIT ARRAY INFO: signal count 589
RW-shared spins 0, rounds 286, OS waits 138
RW-excl spins 0, rounds 1171, OS waits 32
RW-sx spins 6, rounds 159, OS waits 5
Spin rounds per wait: 286.00 RW-shared, 1171.00 RW-excl, 26.50 RW-sx
------------
TRANSACTIONS
------------
Trx id counter 6510754
Purge done for trx's n:o < 6510752 undo n:o < 0 state: running
History list length 35
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 6510749, ACTIVE 860 sec
3 lock struct(s), heap size 1136, 9 row lock(s), undo log entries 8
MySQL thread id 31, OS thread handle 140406892402432, query id 4862 10.0.7.12 icinga Opening tables
INSERT INTO icinga_conninfo (instance_id, connect_time, last_checkin_time, agent_name, agent_version, connect_type, data_start_time) VALUES (1, NOW(), NOW(), 'icinga2 db_ido_mysql', 'r2.8.1-1', 'INITIAL', NOW())
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
 ibuf aio reads:, log i/o's:, sync i/o's:
Pending flushes (fsync) log: 1; buffer pool: 0
941 OS file reads, 1705 OS file writes, 787 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 3, seg size 5, 0 merges
merged operations:
 insert 0, delete mark 0, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 0 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 17065240959
Log flushed up to   17065235051
Pages flushed up to 17065233844
Last checkpoint at  17065233835
1 pending log flushes, 0 pending chkp writes
415 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 137428992
Dictionary memory allocated 402816
Buffer pool size   8192
Free buffers       7538
Database pages     644
Old database pages 257
Modified db pages  14
Pending reads      0
Pending writes: LRU 0, flush list 6, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 589, created 55, written 1158
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 644, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
Process ID=1, Main thread ID=140406700369664, state: flushing log
Number of rows inserted 301, updated 260, deleted 60, read 6085
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
InnoDB: ###### Diagnostic info printed to the standard error stream
2018-02-12T09:51:23.857324Z 0 [Warning] InnoDB: A long semaphore wait:
--Thread 140406892402432 has waited at dict0dict.cc line 1238 for 861.00 seconds the semaphore:
Mutex at 0x3c82038, Mutex DICT_SYS created dict0dict.cc:1172, lock var 1

2018-02-12T09:51:23.857406Z 0 [Warning] InnoDB: A long semaphore wait:
--Thread 140406683584256 has waited at row0purge.cc line 862 for 861.00 seconds the semaphore:
S-lock on RW-latch at 0x3ca8738 created in file dict0dict.cc line 1183
a writer (thread id 140406373218048) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file row0purge.cc line 862
Last time write locked in file /export/home/pb2/build/sb_0-26514852-1514434083.31/release/mysql-5.7.21/storage/innobase/dict/dict0stats.cc line 2376
InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:
InnoDB: Pending preads 0, pwrites 0

=====================================
2018-02-12 09:51:42 0x7fb2fbfff700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 20 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 143 srv_active, 0 srv_shutdown, 15928 srv_idle
srv_master_thread log flush and writes: 16070
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 624
--Thread 140406892402432 has waited at dict0dict.cc line 1238 for 880.00 seconds the semaphore:
Mutex at 0x3c82038, Mutex DICT_SYS created dict0dict.cc:1172, lock var 1

--Thread 140406683584256 has waited at row0purge.cc line 862 for 880.00 seconds the semaphore:
S-lock on RW-latch at 0x3ca8738 created in file dict0dict.cc line 1183
a writer (thread id 140406373218048) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file row0purge.cc line 862
Last time write locked in file /export/home/pb2/build/sb_0-26514852-1514434083.31/release/mysql-5.7.21/storage/innobase/dict/dict0stats.cc line 2376
OS WAIT ARRAY INFO: signal count 589
RW-shared spins 0, rounds 286, OS waits 138
RW-excl spins 0, rounds 1171, OS waits 32
RW-sx spins 6, rounds 159, OS waits 5
Spin rounds per wait: 286.00 RW-shared, 1171.00 RW-excl, 26.50 RW-sx
------------
TRANSACTIONS
------------
Trx id counter 6510754
Purge done for trx's n:o < 6510752 undo n:o < 0 state: running
History list length 35
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 6510749, ACTIVE 880 sec
3 lock struct(s), heap size 1136, 9 row lock(s), undo log entries 8
MySQL thread id 31, OS thread handle 140406892402432, query id 4862 10.0.7.12 icinga Opening tables
INSERT INTO icinga_conninfo (instance_id, connect_time, last_checkin_time, agent_name, agent_version, connect_type, data_start_time) VALUES (1, NOW(), NOW(), 'icinga2 db_ido_mysql', 'r2.8.1-1', 'INITIAL', NOW())
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
 ibuf aio reads:, log i/o's:, sync i/o's:
Pending flushes (fsync) log: 1; buffer pool: 0
941 OS file reads, 1705 OS file writes, 787 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 3, seg size 5, 0 merges
merged operations:
 insert 0, delete mark 0, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 0 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 17065240959
Log flushed up to   17065235051
Pages flushed up to 17065233844
Last checkpoint at  17065233835
1 pending log flushes, 0 pending chkp writes
415 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 137428992
Dictionary memory allocated 402816
Buffer pool size   8192
Free buffers       7538
Database pages     644
Old database pages 257
Modified db pages  14
Pending reads      0
Pending writes: LRU 0, flush list 6, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 589, created 55, written 1158
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 644, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
Process ID=1, Main thread ID=140406700369664, state: flushing log
Number of rows inserted 301, updated 260, deleted 60, read 6085
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
InnoDB: ###### Diagnostic info printed to the standard error stream
2018-02-12T09:51:54.857869Z 0 [Warning] InnoDB: A long semaphore wait:
--Thread 140406892402432 has waited at dict0dict.cc line 1238 for 892.00 seconds the semaphore:
Mutex at 0x3c82038, Mutex DICT_SYS created dict0dict.cc:1172, lock var 1

2018-02-12T09:51:54.857982Z 0 [Warning] InnoDB: A long semaphore wait:
--Thread 140406683584256 has waited at row0purge.cc line 862 for 892.00 seconds the semaphore:
S-lock on RW-latch at 0x3ca8738 created in file dict0dict.cc line 1183
a writer (thread id 140406373218048) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file row0purge.cc line 862
Last time write locked in file /export/home/pb2/build/sb_0-26514852-1514434083.31/release/mysql-5.7.21/storage/innobase/dict/dict0stats.cc line 2376
InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:
InnoDB: Pending preads 0, pwrites 0
2018-02-12T09:51:55.078513Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 890943ms. The settings might not be optimal. (flushed=12 and evicted=0, during the time.)

=====================================
2018-02-12 09:52:02 0x7fb2fbfff700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 20 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 145 srv_active, 0 srv_shutdown, 15930 srv_idle
srv_master_thread log flush and writes: 16075
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 633
OS WAIT ARRAY INFO: signal count 600
RW-shared spins 0, rounds 290, OS waits 140
RW-excl spins 0, rounds 1262, OS waits 36
RW-sx spins 6, rounds 159, OS waits 5
Spin rounds per wait: 290.00 RW-shared, 1262.00 RW-excl, 26.50 RW-sx
------------
TRANSACTIONS
------------
Trx id counter 6510756
Purge done for trx's n:o < 6510755 undo n:o < 0 state: running but idle
History list length 36
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 6510755, ACTIVE (PREPARED) 4 sec
3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1
MySQL thread id 31, OS thread handle 140406892402432, query id 4887 10.0.7.12 icinga starting
COMMIT
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
 ibuf aio reads:, log i/o's:, sync i/o's:
Pending flushes (fsync) log: 1; buffer pool: 0
941 OS file reads, 1740 OS file writes, 805 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 1.75 writes/s, 0.90 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 3, seg size 5, 0 merges
merged operations:
 insert 0, delete mark 0, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 0 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
0.10 hash searches/s, 1.60 non-hash searches/s
---
LOG
---
Log sequence number 17065242608
Log flushed up to   17065242447
Pages flushed up to 17065242608
Last checkpoint at  17065235646
1 pending log flushes, 0 pending chkp writes
421 log i/o's done, 0.30 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 137428992
Dictionary memory allocated 402816
Buffer pool size   8192
Free buffers       7538
Database pages     644
Old database pages 257
Modified db pages  0
Pending reads      0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 589, created 55, written 1184
0.00 reads/s, 0.00 creates/s, 1.30 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 644, unzip_LRU len: 0
I/O sum[0]:cur[16], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
Process ID=1, Main thread ID=140406700369664, state: making checkpoint
Number of rows inserted 302, updated 261, deleted 60, read 6698
0.05 inserts/s, 0.05 updates/s, 0.00 deletes/s, 30.65 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================

=====================================
2018-02-12 09:52:22 0x7fb2fbfff700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 20 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 145 srv_active, 0 srv_shutdown, 15930 srv_idle
srv_master_thread log flush and writes: 16075
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 634
--Thread 140406675191552 has waited at row0purge.cc line 862 for 18.00 seconds the semaphore:
S-lock on RW-latch at 0x3ca8738 created in file dict0dict.cc line 1183
a writer (thread id 140406373218048) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file row0purge.cc line 862
Last time write locked in file /export/home/pb2/build/sb_0-26514852-1514434083.31/release/mysql-5.7.21/storage/innobase/dict/dict0stats.cc line 2376
OS WAIT ARRAY INFO: signal count 600
RW-shared spins 0, rounds 290, OS waits 140
RW-excl spins 0, rounds 1262, OS waits 36
RW-sx spins 6, rounds 159, OS waits 5
Spin rounds per wait: 290.00 RW-shared, 1262.00 RW-excl, 26.50 RW-sx
------------
TRANSACTIONS
------------
Trx id counter 6510758
Purge done for trx's n:o < 6510758 undo n:o < 0 state: running
History list length 37
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 6510755, ACTIVE (PREPARED) 24 sec
3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1
MySQL thread id 31, OS thread handle 140406892402432, query id 4887 10.0.7.12 icinga starting
COMMIT
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
 ibuf aio reads:, log i/o's:, sync i/o's:
Pending flushes (fsync) log: 1; buffer pool: 0
941 OS file reads, 1740 OS file writes, 805 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 3, seg size 5, 0 merges
merged operations:
 insert 0, delete mark 0, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 0 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
0.00 hash searches/s, 0.10 non-hash searches/s
---
LOG
---
Log sequence number 17065243081
Log flushed up to   17065242447
Pages flushed up to 17065242608
Last checkpoint at  17065235646
1 pending log flushes, 0 pending chkp writes
421 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 137428992
Dictionary memory allocated 402816
Buffer pool size   8192
Free buffers       7538
Database pages     644
Old database pages 257
Modified db pages  3
Pending reads      0
Pending writes: LRU 0, flush list 2, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 589, created 55, written 1184
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 644, unzip_LRU len: 0
I/O sum[0]:cur[16], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
Process ID=1, Main thread ID=140406700369664, state: making checkpoint
Number of rows inserted 302, updated 261, deleted 60, read 6698
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
InnoDB: ###### Diagnostic info printed to the standard error stream
2018-02-12T09:56:05.900079Z 0 [Warning] InnoDB: A long semaphore wait:
--Thread 140406675191552 has waited at row0purge.cc line 862 for 241.00 seconds the semaphore:
S-lock on RW-latch at 0x3ca8738 created in file dict0dict.cc line 1183
a writer (thread id 140406373218048) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file row0purge.cc line 862
Last time write locked in file /export/home/pb2/build/sb_0-26514852-1514434083.31/release/mysql-5.7.21/storage/innobase/dict/dict0stats.cc line 2376
InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:
InnoDB: Pending preads 0, pwrites 0

=====================================
2018-02-12 09:56:22 0x7fb2fbfff700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 57 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 145 srv_active, 0 srv_shutdown, 15930 srv_idle
srv_master_thread log flush and writes: 16075
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 634
--Thread 140406675191552 has waited at row0purge.cc line 862 for 258.00 seconds the semaphore:
S-lock on RW-latch at 0x3ca8738 created in file dict0dict.cc line 1183
a writer (thread id 140406373218048) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file row0purge.cc line 862
Last time write locked in file /export/home/pb2/build/sb_0-26514852-1514434083.31/release/mysql-5.7.21/storage/innobase/dict/dict0stats.cc line 2376
OS WAIT ARRAY INFO: signal count 600
RW-shared spins 0, rounds 290, OS waits 140
RW-excl spins 0, rounds 1262, OS waits 36
RW-sx spins 6, rounds 159, OS waits 5
Spin rounds per wait: 290.00 RW-shared, 1262.00 RW-excl, 26.50 RW-sx
------------
TRANSACTIONS
------------
Trx id counter 6510758
Purge done for trx's n:o < 6510758 undo n:o < 0 state: running
History list length 37
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 6510755, ACTIVE (PREPARED) 264 sec
3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1
MySQL thread id 31, OS thread handle 140406892402432, query id 4887 10.0.7.12 icinga starting
COMMIT
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
 ibuf aio reads:, log i/o's:, sync i/o's:
Pending flushes (fsync) log: 1; buffer pool: 0
941 OS file reads, 1740 OS file writes, 805 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 3, seg size 5, 0 merges
merged operations:
 insert 0, delete mark 0, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 0 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 17065243081
Log flushed up to   17065242447
Pages flushed up to 17065242608
Last checkpoint at  17065235646
1 pending log flushes, 0 pending chkp writes
421 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 137428992
Dictionary memory allocated 402816
Buffer pool size   8192
Free buffers       7538
Database pages     644
Old database pages 257
Modified db pages  3
Pending reads      0
Pending writes: LRU 0, flush list 2, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 589, created 55, written 1184
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 644, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
Process ID=1, Main thread ID=140406700369664, state: making checkpoint
Number of rows inserted 302, updated 261, deleted 60, read 6698
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
2018-02-12T09:56:34.063290Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 268777ms. The settings might not be optimal. (flushed=3 and evicted=0, during the time.)
InnoDB: ###### Diagnostic info printed to the standard error stream
2018-02-12T10:00:33.404619Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 227338ms. The settings might not be optimal. (flushed=3 and evicted=0, during the time.)
2018-02-12T10:01:04.755823Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 12344ms. The settings might not be optimal. (flushed=119 and evicted=0, during the time.)
2018-02-12T10:05:22.260892Z 31 [Warning] InnoDB: fsync(): An error occurred during synchronization, retrying
2018-02-12T10:05:42.345934Z 31 [Warning] InnoDB: fsync(): An error occurred during synchronization, retrying
2018-02-12T10:06:02.405710Z 31 [Warning] InnoDB: fsync(): An error occurred during synchronization, retrying
2018-02-12T10:06:22.477300Z 31 [Warning] InnoDB: fsync(): An error occurred during synchronization, retrying
2018-02-12T10:06:42.533837Z 31 [Warning] InnoDB: fsync(): An error occurred during synchronization, retrying
2018-02-12T10:07:02.599146Z 31 [Warning] InnoDB: fsync(): An error occurred during synchronization, retrying
2018-02-12T10:07:22.694580Z 31 [Warning] InnoDB: fsync(): An error occurred during synchronization, retrying
2018-02-12T10:07:42.753513Z 31 [Warning] InnoDB: fsync(): An error occurred during synchronization, retrying
2018-02-12T10:08:02.837357Z 31 [Warning] InnoDB: fsync(): An error occurred during synchronization, retrying
2018-02-12 10:08:22 0x7fb306f21700  InnoDB: Assertion failure in thread 140406892402432 in file os0file.cc line 3079
InnoDB: Failing assertion: failures < 1000
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
10:08:23 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=5
max_threads=151
thread_count=1
connection_count=1
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 68193 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7fb2cc0008c0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fb306f20e80 thread_stack 0x40000
mysqld(my_print_stacktrace+0x2c)[0xe8ba7c]
mysqld(handle_fatal_signal+0x459)[0x7ac489]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf890)[0x7fb31cc62890]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7fb31b66b067]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7fb31b66c448]
mysqld[0x78223f]
mysqld(_Z18os_file_flush_funci+0x4b8)[0xff94a8]
mysqld(_Z9fil_flushm+0x265)[0x11c79e5]
mysqld(_Z15log_write_up_tomb+0x76e)[0xfce85e]
mysqld(_Z24log_buffer_flush_to_diskb+0x104)[0xfcf294]
mysqld[0xf667df]
mysqld[0x7f307f]
mysqld(_Z24plugin_foreach_with_maskP3THDPPFcS0_P13st_plugin_intPvEijS3_+0x1ae)[0xc80f9e]
mysqld(_Z24plugin_foreach_with_maskP3THDPFcS0_P13st_plugin_intPvEijS3_+0x1d)[0xc8112d]
mysqld(_Z13ha_flush_logsP10handlertonb+0x5a)[0x7f997a]
mysqld(_ZN13MYSQL_BIN_LOG25process_flush_stage_queueEPyPbPP3THD+0x3b)[0xe2501b]
mysqld(_ZN13MYSQL_BIN_LOG14ordered_commitEP3THDbb+0xf9)[0xe251d9]
mysqld(_ZN13MYSQL_BIN_LOG6commitEP3THDb+0x947)[0xe26b37]
mysqld(_Z15ha_commit_transP3THDbb+0x1d2)[0x7f8f72]
mysqld(_Z12trans_commitP3THD+0x39)[0xd02bb9]
mysqld(_Z21mysql_execute_commandP3THDb+0x6e8)[0xc59648]
mysqld(_Z11mysql_parseP3THDP12Parser_state+0x395)[0xc5e3e5]
mysqld(_Z16dispatch_commandP3THDPK8COM_DATA19enum_server_command+0xc21)[0xc5f0b1]
mysqld(_Z10do_commandP3THD+0x18f)[0xc6053f]
mysqld(handle_connection+0x270)[0xd1ea10]
mysqld(pfs_spawn_thread+0x1b4)[0xea35b4]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8064)[0x7fb31cc5b064]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fb31b71e62d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fb2cc07ddda): COMMIT
Connection ID (thread ID): 31
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

Any idea?

  • What is the root cause of the problem?
  • What can I do to get a fast and stable lizardfs?
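
A minimal sketch for summarizing the long semaphore waits in a log like the one above. The function name `longest_waits` and the `SAMPLE` string are illustrative assumptions; the regular expression only targets the "has waited at ... for ... seconds" lines as they appear in this log.

```python
import re
from collections import defaultdict

# Matches lines such as:
# --Thread 140406892402432 has waited at dict0dict.cc line 1238 for 860.00 seconds the semaphore:
WAIT_RE = re.compile(
    r"--Thread (\d+) has waited at (\S+) line (\d+) for ([\d.]+) seconds"
)


def longest_waits(log_text):
    """Return {(thread_id, 'file:line'): longest reported wait in seconds}."""
    longest = defaultdict(float)
    for match in WAIT_RE.finditer(log_text):
        thread_id, source_file, line_no, seconds = match.groups()
        key = (int(thread_id), f"{source_file}:{line_no}")
        longest[key] = max(longest[key], float(seconds))
    return dict(longest)


# Three lines copied from the log above, used here only as sample input.
SAMPLE = """\
--Thread 140406892402432 has waited at dict0dict.cc line 1238 for 860.00 seconds the semaphore:
--Thread 140406683584256 has waited at row0purge.cc line 862 for 860.00 seconds the semaphore:
--Thread 140406892402432 has waited at dict0dict.cc line 1238 for 892.00 seconds the semaphore:
"""

if __name__ == "__main__":
    for (thread_id, location), seconds in sorted(longest_waits(SAMPLE).items()):
        print(f"thread {thread_id} at {location}: up to {seconds:.0f} s")
```

Feeding the whole error log to `longest_waits` instead of `SAMPLE` gives one line per waiting thread and source location, which makes stalls of many minutes easier to spot than scrolling through the repeated monitor dumps.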
@mwaeckerlin

mwaeckerlin commented Feb 12, 2018

Now, all of a sudden, I have a lost chunk. Fortunately, that file was rsynced to lizardfs, so I can just restart the rsync operation. But why are there lost chunks at all? This must never happen, under any circumstances! All chunk servers are up and running.

currently unavailable chunk 593819 (inode: 392788 ; index: 0)
* currently unavailable file 392788: boar-mrw-sh/boar/blobs/e1/e1af5fd9138b48809697d24d460d39fd
unavailable chunks: 1
unavailable files: 1
@psarna

Member

psarna commented Feb 12, 2018

Ok, this thread is longish right now, so to clarify your configuration:

  • is everything in your cluster (master, chunkservers, clients) in version 3.12? If anything is not, update first. Even one chunkserver in version < 3.12 might be the cause of your problems. Especially due to this bugfix, since you use replication limits in your installation: 745c2e8
  • your current OS is ubuntu 16.04, right?
  • is your goal 2 for all files? Including the one that's lost? Or was it 1?
  • is your network stable? Chunkservers might be up and running, but since your goal configuration is either 1 or 0 additional replicas, some false lost chunks might appear if the connection is laggy. The same goes for clients: if your write operation is disturbed, the chunk version might be mismatched. Now, if your goal for this file was 1, there's no other copy to restore from, so the chunk is lost, because there's only one version of it and it's not the most recent one. (Usually this problem can be solved manually with lizardfs filerepair.)
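
The version-mismatch reasoning in the last point can be illustrated with a toy model. This is only a sketch under stated assumptions, not LizardFS code: `ChunkCopy`, `classify_chunk`, the status labels, the server names and the version numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ChunkCopy:
    server: str   # hypothetical chunkserver label
    version: int  # version of the copy stored on that server


def classify_chunk(expected_version, goal, copies):
    """Classify a chunk the way a master might see it (toy model).

    A copy only counts if its version matches the version the master expects.
    """
    valid = [c for c in copies if c.version == expected_version]
    if not valid:
        return "unavailable"  # no up-to-date copy left to read or replicate from
    if len(valid) < goal:
        return "undergoal"    # readable, but fewer valid copies than requested
    return "ok"


if __name__ == "__main__":
    # Goal 1 and an interrupted write: the only copy is stale.
    print(classify_chunk(5, 1, [ChunkCopy("cs-a", 4)]))
    # Goal 2: one stale copy, but a valid one remains to replicate from.
    print(classify_chunk(5, 2, [ChunkCopy("cs-a", 5), ChunkCopy("cs-b", 4)]))
```

In this model a goal of 2 turns the same interrupted write into a recoverable "undergoal" state rather than an unavailable chunk, which is the point made above.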
@mwaeckerlin

mwaeckerlin commented Feb 12, 2018

@psarna

is everything in your cluster (master, chunkservers, clients) in version 3.12?

Yes. You see it in the screenshots.

your current OS is ubuntu 16.04

Yes.

is your goal 2 for all files? Including the one that's lost?

Yes, 2 with specifically defined servers: 1 on universum, 1 on urknall (each chunk server has a label that is identical to the hostname).

is your network stable?

Yes, it should be very stable, since 3 of the chunkservers (including the 2 that are now configured) are directly connected to their own 1GB switch (which can handle 10GB in total, 1GB on each port).

I could give you access to the lizard-cgi if that helps you; it is behind basic authentication at https://lizardfs.mrw.sh. (Is it a security issue to have mfs.cgi on a public server without access restrictions? AFAIK, it offers only read access to statistics?)

@psarna

Member

psarna commented Feb 13, 2018

CGI offers statistics only, but it was never designed to be a public server, rather a local tool. Unless you export it through a decent server like nginx, I wouldn't advise making it totally public.

I think all the screens you provided might have enough data. I'll take a closer look at them soon.

@mwaeckerlin

mwaeckerlin commented Feb 13, 2018

What further analysis with iotop and iostat shows me:

The problem seems to be on the second host, urknall, where there are many more I/O waits. iostat shows me that mainly only one of the four disks is in use. iotop mostly shows these as the most I/O-consuming processes: [dmcrypt_write] (not a surprise), mfschunkserver start, agetty --noclear tty1 linux, systemd --system --deserialize 20 and dockerd -H fd://.

Filesystem is 78.95% used. I'll replace a 2TB disk with a new (and faster) 8TB disk (WD Gold). I hope this will distribute the I/O load better.
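
The per-disk imbalance described above can also be checked without iostat. The following is a minimal Python sketch, assuming a Linux host where /proc/diskstats is available; the function name io_time_share, the device names sda to sdd and the sample numbers are hypothetical and not taken from the hosts in this thread. It compares the cumulative "time spent doing I/O" counter per device, so it shows the share since boot, not the current rate.

```python
def io_time_share(diskstats_text, devices):
    """Return each device's share of the total time spent doing I/O.

    diskstats_text: contents of /proc/diskstats.
    devices: device names to compare, e.g. ["sda", "sdb"].
    """
    busy_ms = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Field 3 is the device name; field 13 is the number of
        # milliseconds the device has spent doing I/O since boot.
        if len(fields) >= 13 and fields[2] in devices:
            busy_ms[fields[2]] = int(fields[12])
    total = sum(busy_ms.values())
    if total == 0:
        return {name: 0.0 for name in busy_ms}
    return {name: ms / total for name, ms in busy_ms.items()}


# Hypothetical sample in /proc/diskstats layout (made-up numbers).
sample = """\
   8       0 sda 1000 0 8000 500 2000 0 16000 900 0 9000 1400
   8      16 sdb 10 0 80 5 20 0 160 9 0 500 14
   8      32 sdc 10 0 80 5 20 0 160 9 0 300 14
   8      48 sdd 10 0 80 5 20 0 160 9 0 200 14
"""

shares = io_time_share(sample, ["sda", "sdb", "sdc", "sdd"])
for name, share in sorted(shares.items()):
    print(f"{name}: {share:.0%} of total I/O time")
```

On a real host, the sample string would be replaced by the contents of /proc/diskstats and the list by the actual device names behind the chunkserver's storage.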

What I am also currently trying on all servers:

  1. I removed all btrfs snapshots
  2. I run btrfs filesystem defragment /
  3. I run btrfs balance start /
@mwaeckerlin

mwaeckerlin commented Feb 13, 2018

BTW: Now there are 0 chunks missing. So the missing chunk magically reappeared…?!?

@psarna

Member

psarna commented Feb 13, 2018

Reappeared? No. Most likely they were there all along. First of all, the statistics you see in CGI are gathered over time in order to take some stress off the master server. Because of that, they are not always exactly up to date.

So, if any of your servers had problems connecting to master for some reason (high I/O, network problem, whatever), master might decide that this server is offline. Then, some chunks might, for a really short time, appear as missing, especially if their goal configuration is low (1 or 2). Then, it might take time to update all the statistics properly, so some chunks may appear as "lost", while they aren't lost at all. If there's a lot going on in the system (like massive replication or high I/O), CGI is even more likely to be a little outdated.

The most important thing is that after the turmoil LizardFS stabilizes itself and everything is balanced and replicated. I suppose that is the case now, since no chunks are missing anymore?

And as for taking I/O load into account, you can turn it on in mfschunkserver.cfg:

ENABLE_LOAD_FACTOR
    if enabled, chunkserver will send periodical reports of its I/O load to master, which will be taken into consideration when picking chunkservers for I/O operations.
@mwaeckerlin

mwaeckerlin commented Feb 13, 2018

What is the syntax for the value? Is it ENABLE_LOAD_FACTOR=1?

@mwaeckerlin

mwaeckerlin commented Feb 13, 2018

So my current mfschunkserver.cfg becomes (e.g. on universum):

MASTER_HOST = universum
HDD_TEST_FREQ = 20
LABEL = universum
CSSERV_TIMEOUT = 20
REPLICATION_BANDWIDTH_LIMIT_KBPS = 1000
ENABLE_LOAD_FACTOR = 1

And my /etc/fstab is now:

mfsmount /var/volumes fuse rw,mfsmaster=universum,mfsdelayedinit,mfschunkserverwriteto=40000,mfsioretries=120,mfschunkserverconnectreadto=20000,mfschunkserverwavereadto=5000,mfschunkservertotalreadto=20000 0 0
@psarna

Member

psarna commented Feb 13, 2018

Yes, you enable it by setting it to 1. There's also a twin setting in the master that you should set, so that the two work properly together:

LOAD_FACTOR_PENALTY
    When set, percentage of load will be added to chunkserver disk usage to determine most fitting chunkserver. Heavy loaded chunkservers will be picked for operations less frequently. (default is 0, correct values are in range from 0 to 0.5)

This one is more advanced and complicated, but basically 0 means "do nothing", and 0.5 means "please take I/O load into account a lot".
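
As a rough illustration of the description above (not LizardFS's actual code), here is a minimal Python sketch. It assumes that the master scores each chunkserver as disk usage plus penalty times I/O load, both expressed as fractions between 0 and 1, and prefers the lowest score. The exact formula, the function name pick_chunkserver and the sample numbers are assumptions made only for illustration.

```python
def pick_chunkserver(servers, load_factor_penalty=0.0):
    """Return the name of the server with the lowest score.

    servers: dict mapping name -> (disk_usage, io_load), both fractions in [0, 1].
    Score = disk_usage + load_factor_penalty * io_load (assumed formula).
    """
    if not 0.0 <= load_factor_penalty <= 0.5:
        raise ValueError("LOAD_FACTOR_PENALTY must be in the range 0 to 0.5")

    def score(name):
        disk_usage, io_load = servers[name]
        return disk_usage + load_factor_penalty * io_load

    return min(servers, key=score)


# Hypothetical numbers: urknall has slightly more free space but is busy with I/O.
servers = {
    "universum": (0.80, 0.10),
    "urknall": (0.78, 0.90),
}
print(pick_chunkserver(servers, 0.0))  # penalty off: disk usage alone decides
print(pick_chunkserver(servers, 0.5))  # penalty on: I/O load is weighed in
```

With these made-up numbers, a penalty of 0 selects the server with the lower disk usage, while a penalty of 0.5 shifts the choice to the less loaded server.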

@mwaeckerlin

mwaeckerlin commented Feb 13, 2018

So, my mfsmaster.cfg is now set to:

LOAD_FACTOR_PENALTY = 0.5
@borkd

borkd commented Feb 14, 2018

Starting with the lowest-hanging fruit: is there a good reason you decided to go with DHCP and a ~5 min lease time for a machine that is essential to the happiness of your distributed filesystem (master and cs)?

@mwaeckerlin

mwaeckerlin commented Feb 14, 2018

@borkd, good point. No, I simply did not pay attention to it. The default lease time on my Mikrotik is 10min. The documentation says that static addresses or addresses with 0 lease time are leased «forever», but this statement is simply wrong; both are leased for the default time.

I increased the default lease time to 23:59:59, one day, which is the maximum the router accepts. That should be sufficient. I'll check the server logs over the next few hours.

[edit] → It does not help; there are still many, many I/O errors. Currently, a btrfs balance is still running on universum, and on urknall a pvmove is running to replace a small 2TB HD with a new 8TB HD. Perhaps it will get better once these jobs are done. Current logs, e.g. on universum:

…
Feb 14 10:17:16 universum mfsmount: write file error, inode: 420236, index: 32 - Timeout after 10030 ms (Timeout) (try counter: 1)
Feb 14 10:17:16 universum mfsmount: write file error, inode: 395869, index: 10 - Timeout after 10034 ms (Timeout) (try counter: 1)
Feb 14 10:17:16 universum mfsmount: write file error, inode: 420401, index: 4 - Timeout after 10033 ms (Timeout) (try counter: 1)
Feb 14 10:17:16 universum mfsmount: write file error, inode: 420501, index: 26 - Timeout after 10033 ms (Timeout) (try counter: 1)
Feb 14 10:17:17 universum mfsmount: write file error, inode: 402299, index: 13 - Timeout after 10036 ms (Timeout) (try counter: 1)
Feb 14 10:17:17 universum mfsmount: write file error, inode: 410070, index: 0 - Timeout after 10029 ms (Timeout) (try counter: 5)
Feb 14 10:17:17 universum mfsmount: write file error, inode: 144336, index: 22 - Timeout after 10017 ms (Timeout) (try counter: 7)
Feb 14 10:17:22 universum mfsmount: write file error, inode: 144118, index: 17 - Timeout after 10015 ms (Timeout) (try counter: 2)
Feb 14 10:17:23 universum kernel: [1010161.105684] BTRFS info (device dm-2): found 178 extents
Feb 14 10:17:25 universum mfschunkserver[13947]: replication error: Can't connect to 192.168.99.7:9422
Feb 14 10:17:25 universum mfsmaster[32350]: (192.168.99.137:9422) chunk: 000000000007C3A3 replication status: Unknown LizardFS error
Feb 14 10:17:26 universum mfsmaster[32350]: (192.168.99.3:9422) chunk: 00000000000A71B4 replication status: Unknown LizardFS error
Feb 14 10:17:26 universum mfsmount: write file error, inode: 147188, index: 0 - Timeout after 10019 ms (Timeout) (try counter: 2)
Feb 14 10:17:26 universum mfschunkserver[13947]: Did not manage to receive packet header
Feb 14 10:17:27 universum mfsmount: write file error, inode: 420421, index: 31 - Timeout after 10010 ms (Timeout) (try counter: 2)
Feb 14 10:17:27 universum mfsmount: write file error, inode: 420186, index: 13 - Timeout after 10014 ms (Timeout) (try counter: 2)
Feb 14 10:17:27 universum mfsmount: write file error, inode: 420555, index: 26 - Timeout after 10014 ms (Timeout) (try counter: 2)
Feb 14 10:17:27 universum mfsmount: write file error, inode: 420466, index: 7 - Timeout after 10014 ms (Timeout) (try counter: 2)
Feb 14 10:17:27 universum mfsmount: write file error, inode: 420436, index: 29 - Timeout after 10019 ms (Timeout) (try counter: 2)
Feb 14 10:17:27 universum mfsmount: write file error, inode: 420506, index: 27 - Timeout after 10018 ms (Timeout) (try counter: 2)
Feb 14 10:17:27 universum mfsmount: write file error, inode: 420785, index: 0 - Timeout after 10018 ms (Timeout) (try counter: 1)
Feb 14 10:17:27 universum mfsmount: write file error, inode: 420409, index: 31 - Timeout after 10018 ms (Timeout) (try counter: 8)
Feb 14 10:17:29 universum mfsmount: write file error, inode: 27964, index: 2 - error sent by master server (Chunk locked)
Feb 14 10:17:36 universum mfsmount: write file error, inode: 419405, index: 17 - Timeout after 10034 ms (Timeout) (try counter: 1)
Feb 14 10:17:37 universum mfsmount: write file error, inode: 145107, index: 4 - Timeout after 10016 ms (Timeout) (try counter: 1)
Feb 14 10:17:37 universum mfsmount: write file error, inode: 420617, index: 16 - Timeout after 10016 ms (Timeout) (try counter: 1)
Feb 14 10:17:37 universum mfsmount: write file error, inode: 420308, index: 5 - Timeout after 10016 ms (Timeout) (try counter: 1)
Feb 14 10:17:37 universum mfsmount: write file error, inode: 144065, index: 23 - Timeout after 10016 ms (Timeout) (try counter: 8)
Feb 14 10:17:37 universum mfsmount: write file error, inode: 141217, index: 9 - Timeout after 10016 ms (Timeout) (try counter: 1)
Feb 14 10:17:37 universum mfsmount: write file error, inode: 420348, index: 5 - Timeout after 10016 ms (Timeout) (try counter: 1)
Feb 14 10:17:37 universum mfsmount: write file error, inode: 401632, index: 36 - Timeout after 10017 ms (Timeout) (try counter: 1)
Feb 14 10:17:37 universum mfsmount: write file error, inode: 406888, index: 20 - Timeout after 10028 ms (Timeout) (try counter: 8)
Feb 14 10:17:38 universum mfsmaster[32350]: chunk 00000000000a71a1 has not enough valid parts (1) consider repairing it manually
Feb 14 10:17:38 universum mfsmaster[32350]: chunk 00000000000a71a1_00000002 - invalid part on (192.168.99.3 - ver:00000001)
Feb 14 10:17:39 universum mfsmount: write file error, inode: 420212, index: 31 - Timeout after 10023 ms (Timeout) (try counter: 2)
Feb 14 10:17:44 universum kernel: [1010182.355868] BTRFS info (device dm-2): found 178 extents
Feb 14 10:17:46 universum mfsmount: write file error, inode: 410680, index: 25 - Timeout after 10045 ms (Timeout) (try counter: 2)
Feb 14 10:17:47 universum mfsmount: write file error, inode: 420456, index: 19 - Timeout after 10014 ms (Timeout) (try counter: 2)
Feb 14 10:17:47 universum mfsmount: write file error, inode: 402299, index: 13 - Timeout after 10024 ms (Timeout) (try counter: 2)
Feb 14 10:17:47 universum mfsmount: write file error, inode: 420401, index: 4 - Timeout after 10025 ms (Timeout) (try counter: 2)
Feb 14 10:17:47 universum mfsmount: write file error, inode: 395869, index: 10 - Timeout after 10025 ms (Timeout) (try counter: 2)
Feb 14 10:17:47 universum mfsmount: write file error, inode: 420236, index: 32 - Timeout after 10026 ms (Timeout) (try counter: 2)
Feb 14 10:17:47 universum mfsmount: write file error, inode: 420501, index: 26 - Timeout after 10025 ms (Timeout) (try counter: 2)
Feb 14 10:17:47 universum mfsmount: write file error, inode: 410070, index: 0 - Timeout after 10034 ms (Timeout) (try counter: 6)
Feb 14 10:17:47 universum mfsmount: write file error, inode: 144336, index: 22 - Timeout after 10037 ms (Timeout) (try counter: 8)
Feb 14 10:17:49 universum mfsmount: write file error, inode: 144118, index: 17 - Timeout after 10006 ms (Timeout) (try counter: 3)
Feb 14 10:17:55 universum kernel: [1010193.865388] BTRFS info (device dm-2): relocating block group 6879258345472 flags 1
Feb 14 10:17:56 universum mfsmount: write file error, inode: 27964, index: 2 - error sent by master server (Chunk locked)
Feb 14 10:17:56 universum mfsmount: write file error, inode: 147188, index: 0 - Timeout after 10010 ms (Timeout) (try counter: 3)
Feb 14 10:17:57 universum mfsmount: write file error, inode: 420555, index: 26 - Timeout after 10001 ms (Timeout) (try counter: 3)
Feb 14 10:17:57 universum mfsmount: write file error, inode: 420186, index: 13 - Timeout after 10001 ms (Timeout) (try counter: 3)
Feb 14 10:17:57 universum mfsmount: write file error, inode: 420466, index: 7 - Timeout after 10000 ms (Timeout) (try counter: 3)
Feb 14 10:17:57 universum mfsmount: write file error, inode: 420436, index: 29 - Timeout after 10000 ms (Timeout) (try counter: 3)
Feb 14 10:17:57 universum mfsmount: write file error, inode: 420506, index: 27 - Timeout after 10000 ms (Timeout) (try counter: 3)
Feb 14 10:17:57 universum mfsmount: write file error, inode: 420421, index: 31 - Timeout after 10044 ms (Timeout) (try counter: 3)
Feb 14 10:17:57 universum mfsmount: write file error, inode: 420409, index: 31 - Timeout after 10009 ms (Timeout) (try counter: 9)
Feb 14 10:17:59 universum mfsmaster[32350]: (192.168.99.3:9422) chunk: 000000000000723D replication status: Unknown LizardFS error
Feb 14 10:18:06 universum mfsmount: write file error, inode: 145107, index: 4 - Timeout after 10001 ms (Timeout) (try counter: 2)
Feb 14 10:18:06 universum mfsmount: write file error, inode: 419405, index: 17 - Timeout after 10005 ms (Timeout) (try counter: 2)
Feb 14 10:18:07 universum mfsmount: write file error, inode: 144065, index: 23 - Timeout after 10025 ms (Timeout) (try counter: 9)
Feb 14 10:18:07 universum mfsmount: write file error, inode: 420308, index: 5 - Timeout after 10029 ms (Timeout) (try counter: 2)
Feb 14 10:18:07 universum mfsmount: write file error, inode: 420348, index: 5 - Timeout after 10029 ms (Timeout) (try counter: 2)
Feb 14 10:18:07 universum mfsmount: write file error, inode: 406888, index: 20 - Timeout after 10020 ms (Timeout) (try counter: 9)
Feb 14 10:18:10 universum mfsmount: write file error, inode: 27964, index: 2 - error sent by master server (Chunk locked)
Feb 14 10:18:22 universum mfschunkserver[13947]: replication error: Can't connect to 192.168.99.3:9422
Feb 14 10:18:22 universum mfsmaster[32350]: (192.168.99.137:9422) chunk: 000000000007C1A3 replication status: Unknown LizardFS error
Feb 14 10:18:23 universum mfsmount: write file error, inode: 27964, index: 2 - error sent by master server (Chunk locked)
Feb 14 10:18:24 universum mfschunkserver[13947]: Did not manage to receive packet header
Feb 14 10:18:26 universum kernel: [1010225.021460] BTRFS info (device dm-2): found 73 extents
Feb 14 10:18:27 universum mfsmaster[32350]: (192.168.99.3:9422) chunk: 00000000000987A9 replication status: Unknown LizardFS error
Feb 14 10:18:28 universum mfsmount: write file error, inode: 144065, index: 23 - Timeout after 10047 ms (Timeout) (try counter: 10)
Feb 14 10:18:28 universum mfsmount: write file error, inode: 420348, index: 5 - Timeout after 10021 ms (Timeout) (try counter: 3)
Feb 14 10:18:28 universum mfsmount: write file error, inode: 420308, index: 5 - Timeout after 10021 ms (Timeout) (try counter: 3)
Feb 14 10:18:32 universum mfsmount: write file error, inode: 406888, index: 20 - Timeout after 10013 ms (Timeout) (try counter: 10)
Feb 14 10:18:33 universum mfsmount: write file error, inode: 402299, index: 13 - Timeout after 10025 ms (Timeout) (try counter: 1)
Feb 14 10:18:33 universum mfsmount: write file error, inode: 410070, index: 0 - Timeout after 10026 ms (Timeout) (try counter: 7)
Feb 14 10:18:33 universum mfsmount: write file error, inode: 144005, index: 0 - Timeout after 10024 ms (Timeout) (try counter: 5)
Feb 14 10:18:33 universum mfsmount: write file error, inode: 420401, index: 4 - Timeout after 10028 ms (Timeout) (try counter: 1)
Feb 14 10:18:33 universum mfsmount: write file error, inode: 420456, index: 19 - Timeout after 10032 ms (Timeout) (try counter: 1)
Feb 14 10:18:33 universum mfsmount: write file error, inode: 410680, index: 25 - Timeout after 10038 ms (Timeout) (try counter: 1)
Feb 14 10:18:41 universum mfschunkserver[13947]: replication error: Can't connect to 192.168.99.7:9422
Feb 14 10:18:41 universum mfsmaster[32350]: (192.168.99.137:9422) chunk: 000000000007DC0B replication status: Unknown LizardFS error
Feb 14 10:18:43 universum mfschunkserver[13947]: Did not manage to receive packet header
Feb 14 10:18:45 universum kernel: [1010243.234063] BTRFS info (device dm-2): found 73 extents
Feb 14 10:18:46 universum mfsmount: write file error, inode: 420555, index: 26 - Timeout after 10049 ms (Timeout) (try counter: 1)
Feb 14 10:18:48 universum mfsmount: write file error, inode: 420409, index: 31 - Timeout after 10025 ms (Timeout) (try counter: 1)
Feb 14 10:18:48 universum mfsmount: write file error, inode: 420436, index: 29 - Timeout after 10006 ms (Timeout) (try counter: 1)
Feb 14 10:18:48 universum mfsmount: write file error, inode: 141217, index: 9 - Timeout after 10007 ms (Timeout) (try counter: 1)
Feb 14 10:18:48 universum mfsmount: write file error, inode: 401632, index: 36 - Timeout after 10006 ms (Timeout) (try counter: 1)
Feb 14 10:18:48 universum mfsmount: write file error, inode: 420212, index: 31 - Timeout after 10018 ms (Timeout) (try counter: 1)
Feb 14 10:18:48 universum mfsmount: write file error, inode: 420617, index: 16 - Timeout after 10000 ms (Timeout) (try counter: 1)
Feb 14 10:18:48 universum mfsmount: write file error, inode: 419405, index: 17 - Timeout after 10001 ms (Timeout) (try counter: 1)
Feb 14 10:18:48 universum mfsmount: write file error, inode: 420506, index: 27 - Timeout after 10001 ms (Timeout) (try counter: 1)
Feb 14 10:18:53 universum kernel: [1010251.244991] BTRFS info (device dm-2): relocating block group 6878184603648 flags 1
Feb 14 10:19:05 universum mfsmount: write file error, inode: 395869, index: 10 - Timeout after 10000 ms (Timeout) (try counter: 1)
Feb 14 10:19:05 universum mfsmount: write file error, inode: 144118, index: 17 - Timeout after 10001 ms (Timeout) (try counter: 1)
Feb 14 10:19:05 universum mfsmount: write file error, inode: 420501, index: 26 - Timeout after 10047 ms (Timeout) (try counter: 1)
Feb 14 10:19:05 universum mfsmount: write file error, inode: 420435, index: 3 - Timeout after 10042 ms (Timeout) (try counter: 1)
Feb 14 10:19:05 universum mfsmount: write file error, inode: 144336, index: 22 - Timeout after 10049 ms (Timeout) (try counter: 1)
Feb 14 10:19:07 universum mfsmount: write file error, inode: 420236, index: 32 - Timeout after 10020 ms (Timeout) (try counter: 1)
Feb 14 10:19:07 universum systemd[1]: Started Session 19899 of user nagios.
Feb 14 10:19:07 universum mfsmount: write file error, inode: 420186, index: 13 - Timeout after 10025 ms (Timeout) (try counter: 1)
Feb 14 10:19:10 universum mfsmaster[32350]: (192.168.99.3:9422) chunk: 000000000004E45F replication status: Unknown LizardFS error
Feb 14 10:19:15 universum mfsmount: write file error, inode: 420436, index: 29 - Timeout after 10006 ms (Timeout) (try counter: 2)
Feb 14 10:19:15 universum mfsmount: write file error, inode: 420409, index: 31 - Timeout after 10020 ms (Timeout) (try counter: 2)
Feb 14 10:19:15 universum mfsmount: write file error, inode: 141217, index: 9 - Timeout after 10023 ms (Timeout) (try counter: 2)
Feb 14 10:19:15 universum mfsmount: write file error, inode: 420212, index: 31 - Timeout after 10023 ms (Timeout) (try counter: 2)
Feb 14 10:19:15 universum mfsmount: write file error, inode: 401632, index: 36 - Timeout after 10023 ms (Timeout) (try counter: 2)
Feb 14 10:19:25 universum kernel: [1010283.157076] BTRFS info (device dm-2): found 506 extents
Feb 14 10:19:29 universum mfsmaster[32350]: (192.168.99.3:9422) chunk: 0000000000060C4C replication status: Unknown LizardFS error
Feb 14 10:19:30 universum mfsmount: write file error, inode: 420501, index: 26 - Timeout after 10018 ms (Timeout) (try counter: 2)
Feb 14 10:19:35 universum mfsmount: write file error, inode: 420435, index: 3 - Timeout after 10000 ms (Timeout) (try counter: 2)
Feb 14 10:19:35 universum mfsmount: write file error, inode: 420236, index: 32 - Timeout after 10030 ms (Timeout) (try counter: 2)
Feb 14 10:19:35 universum mfsmount: write file error, inode: 420186, index: 13 - Timeout after 10008 ms (Timeout) (try counter: 2)
Feb 14 10:19:35 universum mfsmount: write file error, inode: 420638, index: 19 - Timeout after 10005 ms (Timeout) (try counter: 1)
Feb 14 10:19:35 universum mfsmount: write file error, inode: 420555, index: 26 - Timeout after 10014 ms (Timeout) (try counter: 1)
Feb 14 10:19:36 universum mfsmount: write file error, inode: 420409, index: 31 - Timeout after 10011 ms (Timeout) (try counter: 3)
Feb 14 10:19:36 universum mfsmount: write file error, inode: 420436, index: 29 - Timeout after 10015 ms (Timeout) (try counter: 3)
Feb 14 10:19:36 universum mfschunkserver[13947]: replication error: Can't connect to 192.168.99.3:9422
Feb 14 10:19:36 universum mfsmaster[32350]: (192.168.99.137:9422) chunk: 000000000007CD10 replication status: Unknown LizardFS error
Feb 14 10:19:39 universum mfschunkserver[13947]: Did not manage to receive packet header
Feb 14 10:19:40 universum mfsmount: write file error, inode: 141217, index: 9 - Timeout after 10009 ms (Timeout) (try counter: 3)
Feb 14 10:19:41 universum mfsmount: write file error, inode: 420212, index: 31 - Timeout after 10045 ms (Timeout) (try counter: 3)
Feb 14 10:19:46 universum mfschunkserver[13947]: replication error: Chunkserver communication timed out: 192.168.99.3:9422
Feb 14 10:19:46 universum mfsmaster[32350]: (192.168.99.137:9422) chunk: 000000000007BAC6 replication status: Unknown LizardFS error
Feb 14 10:19:49 universum mfsmount: write file error, inode: 420348, index: 5 - Timeout after 10044 ms (Timeout) (try counter: 1)
Feb 14 10:19:50 universum mfsmount: write file error, inode: 402299, index: 13 - Timeout after 10012 ms (Timeout) (try counter: 1)
Feb 14 10:19:51 universum mfsmount: write file error, inode: 420421, index: 31 - Timeout after 10012 ms (Timeout) (try counter: 1)
Feb 14 10:19:53 universum mfsmount: write file error, inode: 395869, index: 10 - Timeout after 10007 ms (Timeout) (try counter: 1)
Feb 14 10:19:53 universum mfsmount: write file error, inode: 410070, index: 0 - Timeout after 10022 ms (Timeout) (try counter: 8)
Feb 14 10:19:53 universum mfsmount: write file error, inode: 144118, index: 17 - Timeout after 10022 ms (Timeout) (try counter: 1)
Feb 14 10:19:53 universum mfsmount: write file error, inode: 420401, index: 4 - Timeout after 10023 ms (Timeout) (try counter: 1)
Feb 14 10:19:53 universum mfsmount: write file error, inode: 406888, index: 20 - Timeout after 10032 ms (Timeout) (try counter: 1)
Feb 14 10:19:54 universum mfsmount: write file error, inode: 420466, index: 8 - Timeout after 10022 ms (Timeout) (try counter: 1)
Feb 14 10:19:54 universum mfsmount: write file error, inode: 373531, index: 0 - Timeout after 10020 ms (Timeout) (try counter: 27)
Feb 14 10:19:55 universum kernel: [1010313.067972] BTRFS info (device dm-2): found 506 extents
Feb 14 10:19:56 universum systemd[1]: Started Session 19900 of user nagios.
Feb 14 10:19:59 universum mfsmount: write file error, inode: 420501, index: 26 - Timeout after 10033 ms (Timeout) (try counter: 3)
Feb 14 10:20:00 universum mfsmount: write file error, inode: 144336, index: 23 - Timeout after 10019 ms (Timeout) (try counter: 1)
Feb 14 10:20:01 universum mfsmount: write file error, inode: 420435, index: 3 - Timeout after 10023 ms (Timeout) (try counter: 3)
Feb 14 10:20:03 universum mfsmount: write file error, inode: 420555, index: 26 - Timeout after 10017 ms (Timeout) (try counter: 2)
Feb 14 10:20:03 universum mfsmount: write file error, inode: 420638, index: 19 - Timeout after 10018 ms (Timeout) (try counter: 2)
Feb 14 10:20:03 universum mfsmount: write file error, inode: 420409, index: 31 - Timeout after 10012 ms (Timeout) (try counter: 4)
Feb 14 10:20:03 universum mfsmount: write file error, inode: 420236, index: 32 - Timeout after 10049 ms (Timeout) (try counter: 3)
Feb 14 10:20:03 universum mfsmount: write file error, inode: 420186, index: 13 - Timeout after 10033 ms (Timeout) (try counter: 3)
Feb 14 10:20:04 universum mfsmount: write file error, inode: 147452, index: 0 - Timeout after 10000 ms (Timeout) (try counter: 1)
Feb 14 10:20:04 universum mfsmount: write file error, inode: 420436, index: 29 - Timeout after 10003 ms (Timeout) (try counter: 4)
Feb 14 10:20:04 universum kernel: [1010322.965472] BTRFS info (device dm-2): relocating block group 6877110861824 flags 1
Feb 14 10:20:09 universum mfsmount: write file error, inode: 141217, index: 9 - Timeout after 10011 ms (Timeout) (try counter: 4)
Feb 14 10:20:17 universum mfsmaster[32350]: (192.168.99.3:9422) chunk: 0000000000006961 replication status: Unknown LizardFS error
Feb 14 10:20:20 universum mfsmount: write file error, inode: 420421, index: 31 - Timeout after 10003 ms (Timeout) (try counter: 2)
Feb 14 10:20:20 universum mfsmount: write file error, inode: 395869, index: 10 - Timeout after 10039 ms (Timeout) (try counter: 2)
Feb 14 10:20:20 universum mfsmount: write file error, inode: 406888, index: 20 - Timeout after 10007 ms (Timeout) (try counter: 2)
Feb 14 10:20:20 universum mfsmount: write file error, inode: 420401, index: 4 - Timeout after 10008 ms (Timeout) (try counter: 2)
Feb 14 10:20:20 universum mfsmount: write file error, inode: 144118, index: 17 - Timeout after 10011 ms (Timeout) (try counter: 2)
Feb 14 10:20:20 universum mfsmount: write file error, inode: 420466, index: 8 - Timeout after 10005 ms (Timeout) (try counter: 2)
Feb 14 10:20:21 universum mfsmount: write file error, inode: 420501, index: 26 - Timeout after 10008 ms (Timeout) (try counter: 4)
Feb 14 10:20:21 universum mfsmount: write file error, inode: 144336, index: 23 - Timeout after 10027 ms (Timeout) (try counter: 2)
Feb 14 10:20:23 universum mfsmount: write file error, inode: 420435, index: 3 - Timeout after 10021 ms (Timeout) (try counter: 4)
Feb 14 10:20:23 universum systemd[1]: Started LVM2 poll daemon.
Feb 14 10:20:24 universum mfsmount: write file error, inode: 420555, index: 26 - Timeout after 10016 ms (Timeout) (try counter: 3)
Feb 14 10:20:30 universum mfsmaster[32350]: (192.168.99.3:9422) chunk: 0000000000006226 replication status: Unknown LizardFS error
Feb 14 10:20:40 universum mfsmount: write file error, inode: 144118, index: 17 - Timeout after 10001 ms (Timeout) (try counter: 3)
Feb 14 10:20:40 universum mfsmount: write file error, inode: 420466, index: 8 - Timeout after 10009 ms (Timeout) (try counter: 3)
Feb 14 10:20:44 universum mfsmount: write file error, inode: 147191, index: 0 - Timeout after 10021 ms (Timeout) (try counter: 1)
Feb 14 10:20:44 universum mfsmount: write file error, inode: 420409, index: 31 - Timeout after 10020 ms (Timeout) (try counter: 1)
Feb 14 10:20:44 universum mfsmount: write file error, inode: 144336, index: 23 - Timeout after 10022 ms (Timeout) (try counter: 3)
Feb 14 10:20:44 universum mfsmount: write file error, inode: 420501, index: 26 - Timeout after 10023 ms (Timeout) (try counter: 5)
Feb 14 10:20:44 universum mfsmount: write file error, inode: 420435, index: 3 - Timeout after 10023 ms (Timeout) (try counter: 5)
Feb 14 10:20:44 universum mfsmount: write file error, inode: 420555, index: 26 - Timeout after 10021 ms (Timeout) (try counter: 4)
Feb 14 10:20:45 universum mfsmount: write file error, inode: 420638, index: 19 - Timeout after 10024 ms (Timeout) (try counter: 1)
Feb 14 10:20:46 universum mfsmount: write file error, inode: 420236, index: 32 - Timeout after 10025 ms (Timeout) (try counter: 1)
Feb 14 10:20:50 universum mfsmount: write file error, inode: 420186, index: 13 - Timeout after 10024 ms (Timeout) (try counter: 1)
Feb 14 10:20:51 universum kernel: [1010369.048085] BTRFS info (device dm-2): found 260 extents
Feb 14 10:20:51 universum mfschunkserver[13947]: replication error: Chunkserver communication timed out: 192.168.99.3:9422
Feb 14 10:20:51 universum mfschunkserver[13947]: replication error: Can't connect to 192.168.99.3:9422
Feb 14 10:20:51 universum mfsmaster[32350]: (192.168.99.137:9422) chunk: 000000000007FC05 replication status: Unknown LizardFS error
Feb 14 10:20:51 universum mfsmaster[32350]: (192.168.99.137:9422) chunk: 00000000000106F0 replication status: Unknown LizardFS error
Feb 14 10:20:52 universum mfsmaster[32350]: (192.168.99.3:9422) chunk: 000000000005BB64 replication status: Unknown LizardFS error
Feb 14 10:20:53 universum systemd[1]: Started Session 19901 of user nagios.
Feb 14 10:20:54 universum mfsmount: write file error, inode: 420506, index: 27 - Timeout after 10031 ms (Timeout) (try counter: 1)
Feb 14 10:20:54 universum mfsmount: write file error, inode: 420436, index: 29 - Timeout after 10032 ms (Timeout) (try counter: 1)
Feb 14 10:20:54 universum mfsmount: write file error, inode: 141217, index: 9 - Timeout after 10033 ms (Timeout) (try counter: 1)
Feb 14 10:20:55 universum mfsmount: write file error, inode: 145107, index: 4 - Timeout after 10012 ms (Timeout) (try counter: 1)
Feb 14 10:21:03 universum systemd[1]: Started Session 19902 of user nagios.
Feb 14 10:21:05 universum kernel: [1010383.550908] BTRFS info (device dm-2): found 260 extents
Feb 14 10:21:05 universum mfsmount: write file error, inode: 420466, index: 8 - Timeout after 10047 ms (Timeout) (try counter: 4)
Feb 14 10:21:09 universum mfsmount: write file error, inode: 147191, index: 0 - Timeout after 10022 ms (Timeout) (try counter: 2)
Feb 14 10:21:10 universum mfsmount: write file error, inode: 420409, index: 31 - Timeout after 10008 ms (Timeout) (try counter: 2)
Feb 14 10:21:10 universum mfsmount: write file error, inode: 420555, index: 26 - Timeout after 10002 ms (Timeout) (try counter: 5)
Feb 14 10:21:10 universum mfsmount: write file error, inode: 144336, index: 23 - Timeout after 10048 ms (Timeout) (try counter: 4)
Feb 14 10:21:10 universum mfsmount: write file error, inode: 420501, index: 26 - Timeout after 10048 ms (Timeout) (try counter: 6)
Feb 14 10:21:10 universum mfsmount: write file error, inode: 420435, index: 3 - Timeout after 10048 ms (Timeout) (try counter: 6)
Feb 14 10:21:10 universum mfsmount: write file error, inode: 420236, index: 32 - Timeout after 10041 ms (Timeout) (try counter: 2)
Feb 14 10:21:10 universum mfsmount: write file error, inode: 420638, index: 19 - Timeout after 10047 ms (Timeout) (try counter: 2)
Feb 14 10:21:10 universum kernel: [1010388.898717] BTRFS info (device dm-2): relocating block group 6876037120000 flags 1
Feb 14 10:21:36 universum dockerd[8103]: time="2018-02-14T10:21:36.940282939+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb 14 10:21:37 universum mfsmount: write file error, inode: 419405, index: 17 - Timeout after 10008 ms (Timeout) (try counter: 1)
Feb 14 10:21:37 universum mfsmount: write file error, inode: 420436, index: 29 - Timeout after 10025 ms (Timeout) (try counter: 1)
Feb 14 10:21:37 universum dockerd[8103]: time="2018-02-14T10:21:37.940316543+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb 14 10:21:38 universum mfsmount: write file error, inode: 402299, index: 13 - Timeout after 10001 ms (Timeout) (try counter: 1)
Feb 14 10:21:38 universum dockerd[8103]: time="2018-02-14T10:21:38.940098386+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb 14 10:21:39 universum mfsmount: write file error, inode: 420617, index: 16 - Timeout after 10028 ms (Timeout) (try counter: 1)
Feb 14 10:21:39 universum mfsmount: write file error, inode: 420308, index: 6 - Timeout after 10017 ms (Timeout) (try counter: 1)
Feb 14 10:21:39 universum mfsmount: write file error, inode: 406888, index: 20 - Timeout after 10019 ms (Timeout) (try counter: 3)
Feb 14 10:21:41 universum mfsmount: write file error, inode: 144065, index: 23 - Timeout after 10010 ms (Timeout) (try counter: 1)
Feb 14 10:21:42 universum mfsmount: write file error, inode: 395869, index: 10 - Timeout after 10022 ms (Timeout) (try counter: 1)
Feb 14 10:21:42 universum kernel: [1010420.615690] BTRFS info (device dm-2): found 198 extents
Feb 14 10:21:43 universum mfsmaster[32350]: (192.168.99.3:9422) chunk: 000000000000596A replication status: Unknown LizardFS error
Feb 14 10:21:46 universum mfsmount: write file error, inode: 420565, index: 0 - Timeout after 10013 ms (Timeout) (try counter: 15)
Feb 14 10:21:46 universum mfsmount: write file error, inode: 144118, index: 17 - Timeout after 10023 ms (Timeout) (try counter: 1)
Feb 14 10:21:47 universum systemd[1]: Started Session 19903 of user nagios.
Feb 14 10:21:47 universum dockerd[8103]: time="2018-02-14T10:21:47.611020811+01:00" level=info msg="NetworkDB stats universum(7ef05be1e6e7) - netID:87vvi5rhozzl8qws2ufe7ltl2 leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb 14 10:21:47 universum dockerd[8103]: time="2018-02-14T10:21:47.611114171+01:00" level=info msg="NetworkDB stats universum(7ef05be1e6e7) - netID:mh26wxh3legx189b8ummizzvg leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb 14 10:21:47 universum dockerd[8103]: time="2018-02-14T10:21:47.611134362+01:00" level=info msg="NetworkDB stats universum(7ef05be1e6e7) - netID:sv1iurfiu8hvvlkql0955wr3x leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb 14 10:21:47 universum dockerd[8103]: time="2018-02-14T10:21:47.611167301+01:00" level=info msg="NetworkDB stats universum(7ef05be1e6e7) - netID:ksx36pb1e18i5cywyl0itazk1 leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb 14 10:21:47 universum dockerd[8103]: time="2018-02-14T10:21:47.611184786+01:00" level=info msg="NetworkDB stats universum(7ef05be1e6e7) - netID:fbh4j37soxs97qxvjj36b2eis leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb 14 10:21:47 universum dockerd[8103]: time="2018-02-14T10:21:47.611201211+01:00" level=info msg="NetworkDB stats universum(7ef05be1e6e7) - netID:npq26i6igr6n6fn1idntes46c leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb 14 10:21:47 universum dockerd[8103]: time="2018-02-14T10:21:47.611219006+01:00" level=info msg="NetworkDB stats universum(7ef05be1e6e7) - netID:4vpfofq2v8hubh30a3td4xwgx leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb 14 10:21:47 universum dockerd[8103]: time="2018-02-14T10:21:47.611235860+01:00" level=info msg="NetworkDB stats universum(7ef05be1e6e7) - netID:o5j79tmzry4nxic6p0uymd9qr leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb 14 10:21:47 universum dockerd[8103]: time="2018-02-14T10:21:47.611252553+01:00" level=info msg="NetworkDB stats universum(7ef05be1e6e7) - netID:jer1lv709js55yyve3pudm9sq leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb 14 10:21:47 universum dockerd[8103]: time="2018-02-14T10:21:47.611285551+01:00" level=info msg="NetworkDB stats universum(7ef05be1e6e7) - netID:mwursuov7sc87fxhzkmpdtyes leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"
Feb 14 10:21:47 universum dockerd[8103]: time="2018-02-14T10:21:47.611302565+01:00" level=info msg="NetworkDB stats universum(7ef05be1e6e7) - netID:dtjvr71i8opae6x4y2rxxv35n leaving:false netPeers:1 entries:4 Queue qLen:0 netMsg/s:0"
Feb 14 10:21:47 universum dockerd[8103]: time="2018-02-14T10:21:47.611318867+01:00" level=info msg="NetworkDB stats universum(7ef05be1e6e7) - netID:hixi0z35nd69dxzd621dtjl0e leaving:false netPeers:1 entries:2 Queue qLen:0 netMsg/s:0"
Feb 14 10:21:47 universum dockerd[8103]: time="2018-02-14T10:21:47.611350581+01:00" level=info msg="NetworkDB stats universum(7ef05be1e6e7) - netID:2vg30sgcyr0omhvavdcptdrgv leaving:false netPeers:2 entries:36 Queue qLen:0 netMsg/s:0"
Feb 14 10:21:47 universum dockerd[8103]: time="2018-02-14T10:21:47.611368133+01:00" level=info msg="NetworkDB stats universum(7ef05be1e6e7) - netID:azbnydw3jzhx3cp6dy38fmldk leaving:false netPeers:1 entries:4 Queue qLen:0 netMsg/s:0"
Feb 14 10:21:47 universum dockerd[8103]: time="2018-02-14T10:21:47.611402792+01:00" level=info msg="NetworkDB stats universum(7ef05be1e6e7) - netID:swt5qberixre25ua4i508w5v9 leaving:false netPeers:2 entries:8 Queue qLen:0 netMsg/s:0"
Feb 14 10:21:47 universum mfsmount: write file error, inode: 420421, index: 31 - Timeout after 10039 ms (Timeout) (try counter: 1)
Feb 14 10:21:47 universum mfsmount: write file error, inode: 420791, index: 0 - Timeout after 10025 ms (Timeout) (try counter: 1)
Feb 14 10:21:48 universum mfsmount: write file error, inode: 420348, index: 5 - Timeout after 10047 ms (Timeout) (try counter: 1)
Feb 14 10:21:49 universum mfsmount: write file error, inode: 420466, index: 8 - Timeout after 10047 ms (Timeout) (try counter: 1)
Feb 14 10:21:49 universum mfsmount: write file error, inode: 420409, index: 31 - Timeout after 10039 ms (Timeout) (try counter: 1)
Feb 14 10:21:49 universum mfsmount: write file error, inode: 401632, index: 36 - Timeout after 10040 ms (Timeout) (try counter: 1)
Feb 14 10:21:51 universum mfsmount: write file error, inode: 420555, index: 26 - Timeout after 10040 ms (Timeout) (try counter: 1)
Feb 14 10:21:52 universum mfsmount: write file error, inode: 420501, index: 26 - Timeout after 10036 ms (Timeout) (try counter: 1)
Feb 14 10:21:56 universum mfsmount: write file error, inode: 144336, index: 23 - Timeout after 10011 ms (Timeout) (try counter: 1)
Feb 14 10:21:56 universum mfsmount: write file error, inode: 420435, index: 3 - Timeout after 10030 ms (Timeout) (try counter: 1)
Feb 14 10:21:57 universum mfsmount: write file error, inode: 420236, index: 32 - Timeout after 10034 ms (Timeout) (try counter: 1)
Feb 14 10:21:57 universum mfsmount: write file error, inode: 420212, index: 32 - Timeout after 10034 ms (Timeout) (try counter: 1)
Feb 14 10:21:58 universum mfsmount: write file error, inode: 420506, index: 28 - Timeout after 10034 ms (Timeout) (try counter: 1)
Feb 14 10:21:58 universum mfschunkserver[13947]: replication error: Can't connect to 192.168.99.7:9422
Feb 14 10:21:58 universum mfsmaster[32350]: (192.168.99.137:9422) chunk: 0000000000081378 replication status: Unknown LizardFS error
Feb 14 10:21:59 universum mfsmount: write file error, inode: 420186, index: 13 - Timeout after 10003 ms (Timeout) (try counter: 1)
Feb 14 10:21:59 universum mfsmount: write file error, inode: 420638, index: 19 - Timeout after 10004 ms (Timeout) (try counter: 1)
Feb 14 10:22:01 universum mfsmount: write file error, inode: 410070, index: 0 - Timeout after 10021 ms (Timeout) (try counter: 9)
Feb 14 10:22:02 universum kernel: [1010440.645854] BTRFS info (device dm-2): found 198 extents
Feb 14 10:22:03 universum mfschunkserver[13947]: replication error: Can't connect to 192.168.99.7:9422
Feb 14 10:22:03 universum mfsmaster[32350]: (192.168.99.137:9422) chunk: 0000000000081282 replication status: Unknown LizardFS error
Feb 14 10:22:04 universum mfsmount: write file error, inode: 145107, index: 4 - Timeout after 10034 ms (Timeout) (try counter: 1)
Feb 14 10:22:05 universum mfsmount: write file error, inode: 394705, index: 0 - Timeout after 10023 ms (Timeout) (try counter: 1)
Feb 14 10:22:06 universum mfsmount: write file error, inode: 141217, index: 9 - Timeout after 10015 ms (Timeout) (try counter: 1)
Feb 14 10:22:06 universum mfsmount: write file error, inode: 419405, index: 17 - Timeout after 10016 ms (Timeout) (try counter: 2)
Feb 14 10:22:07 universum mfsmount: write file error, inode: 420436, index: 29 - Timeout after 10022 ms (Timeout) (try counter: 2)
Feb 14 10:22:07 universum mfsmount: write file error, inode: 402299, index: 13 - Timeout after 10024 ms (Timeout) (try counter: 2)
Feb 14 10:22:08 universum mfsmount: write file error, inode: 420617, index: 16 - Timeout after 10024 ms (Timeout) (try counter: 2)
Feb 14 10:22:09 universum mfsmount: write file error, inode: 420308, index: 6 - Timeout after 10041 ms (Timeout) (try counter: 2)
Feb 14 10:22:09 universum mfsmount: write file error, inode: 406888, index: 20 - Timeout after 10028 ms (Timeout) (try counter: 4)
Feb 14 10:22:11 universum kernel: [1010449.076445] BTRFS info (device dm-2): relocating block group 6874963378176 flags 1
Feb 14 10:22:11 universum mfsmount: write file error, inode: 144065, index: 23 - Timeout after 10009 ms (Timeout) (try counter: 2)
Feb 14 10:22:12 universum mfsmaster[32350]: (192.168.99.3:9422) chunk: 000000000005469C replication status: Unknown LizardFS error
Feb 14 10:22:14 universum mfsmount: write file error, inode: 395869, index: 10 - Timeout after 10003 ms (Timeout) (try counter: 2)
Feb 14 10:22:15 universum mfsmount: write file error, inode: 144118, index: 17 - Timeout after 10046 ms (Timeout) (try counter: 2)
Feb 14 10:22:16 universum mfsmount: write file error, inode: 420421, index: 31 - Timeout after 10045 ms (Timeout) (try counter: 2)
Feb 14 10:22:16 universum mfsmount: write file error, inode: 420791, index: 0 - Timeout after 10045 ms (Timeout) (try counter: 2)
Feb 14 10:22:17 universum mfsmount: write file error, inode: 420348, index: 5 - Timeout after 10002 ms (Timeout) (try counter: 2)
Feb 14 10:22:17 universum mfsmount: write file error, inode: 420466, index: 8 - Timeout after 10002 ms (Timeout) (try counter: 2)
Feb 14 10:22:18 universum mfsmount: write file error, inode: 420409, index: 31 - Timeout after 10005 ms (Timeout) (try counter: 2)
Feb 14 10:22:19 universum mfsmount: write file error, inode: 401632, index: 36 - Timeout after 10024 ms (Timeout) (try counter: 2)
Feb 14 10:22:37 universum mfsmount: write file error, inode: 144118, index: 17 - Timeout after 10037 ms (Timeout) (try counter: 3)
Feb 14 10:22:37 universum mfsmount: write file error, inode: 420421, index: 31 - Timeout after 10039 ms (Timeout) (try counter: 3)
Feb 14 10:22:38 universum mfsmount: write file error, inode: 420348, index: 5 - Timeout after 10001 ms (Timeout) (try counter: 3)
Feb 14 10:22:39 universum mfsmount: write file error, inode: 420466, index: 8 - Timeout after 10029 ms (Timeout) (try counter: 3)
Feb 14 10:22:40 universum mfsmount: write file error, inode: 420409, index: 31 - Timeout after 10036 ms (Timeout) (try counter: 3)
Feb 14 10:22:40 universum mfsmount: write file error, inode: 420555, index: 26 - Timeout after 10001 ms (Timeout) (try counter: 1)
Feb 14 10:22:40 universum mfsmount: write file error, inode: 401632, index: 36 - Timeout after 10002 ms (Timeout) (try counter: 3)
Feb 14 10:22:41 universum mfsmount: write file error, inode: 420501, index: 26 - Timeout after 10010 ms (Timeout) (try counter: 1)
Feb 14 10:22:42 universum mfsmount: write file error, inode: 147480, index: 0 - Timeout after 10013 ms (Timeout) (try counter: 1)
Feb 14 10:22:42 universum mfsmount: write file error, inode: 144336, index: 23 - Timeout after 10048 ms (Timeout) (try counter: 1)
Feb 14 10:22:43 universum kernel: [1010481.210702] BTRFS info (device dm-2): found 235 extents
Feb 14 10:22:47 universum mfsmount: write file error, inode: 420435, index: 3 - Timeout after 10034 ms (Timeout) (try counter: 1)
Feb 14 10:22:47 universum mfsmount: write file error, inode: 420236, index: 32 - Timeout after 10017 ms (Timeout) (try counter: 1)
Feb 14 10:22:48 universum mfsmount: write file error, inode: 420212, index: 32 - Timeout after 10028 ms (Timeout) (try counter: 1)
Feb 14 10:22:49 universum mfsmount: write file error, inode: 420506, index: 28 - Timeout after 10022 ms (Timeout) (try counter: 1)
Feb 14 10:22:50 universum mfsmount: write file error, inode: 420638, index: 19 - Timeout after 10022 ms (Timeout) (try counter: 1)
Feb 14 10:22:50 universum mfsmount: write file error, inode: 145107, index: 4 - Timeout after 10015 ms (Timeout) (try counter: 1)
Feb 14 10:22:50 universum mfsmount: write file error, inode: 420186, index: 13 - Timeout after 10016 ms (Timeout) (try counter: 1)
Feb 14 10:22:51 universum mfsmount: write file error, inode: 402299, index: 13 - Timeout after 10015 ms (Timeout) (try counter: 3)
Feb 14 10:22:52 universum dockerd[8103]: time="2018-02-14T10:22:52.204280570+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb 14 10:22:52 universum mfsmount: write file error, inode: 141217, index: 9 - Timeout after 10010 ms (Timeout) (try counter: 1)
Feb 14 10:22:52 universum mfsmount: write file error, inode: 419405, index: 17 - Timeout after 10008 ms (Timeout) (try counter: 1)
Feb 14 10:22:53 universum dockerd[8103]: time="2018-02-14T10:22:53.204155240+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb 14 10:22:54 universum dockerd[8103]: time="2018-02-14T10:22:54.204118114+01:00" level=error msg="Failed to deserialize netlink ndmsg: Link not found"
Feb 14 10:22:57 universum mfsmount: write file error, inode: 420436, index: 29 - Timeout after 10039 ms (Timeout) (try counter: 1)
Feb 14 10:22:57 universum mfsmount: write file error, inode: 144065, index: 23 - Timeout after 10008 ms (Timeout) (try counter: 1)
Feb 14 10:22:58 universum mfsmount: write file error, inode: 27964, index: 2 - Timeout after 10007 ms (Timeout) (try counter: 1)
Feb 14 10:22:59 universum mfsmaster[32350]: chunk 00000000000a71a1 has not enough valid parts (1) consider repairing it manually
Feb 14 10:22:59 universum mfsmaster[32350]: chunk 00000000000a71a1_00000002 - invalid part on (192.168.99.3 - ver:00000001)
Feb 14 10:22:59 universum mfsmount: write file error, inode: 420617, index: 16 - Timeout after 10014 ms (Timeout) (try counter: 1)
Feb 14 10:23:00 universum mfsmount: write file error, inode: 410013, index: 0 - Timeout after 10010 ms (Timeout) (try counter: 1)
Feb 14 10:23:00 universum mfsmount: write file error, inode: 420308, index: 6 - Timeout after 10019 ms (Timeout) (try counter: 1)
Feb 14 10:23:00 universum mfsmount: write file error, inode: 410002, index: 0 - Timeout after 10020 ms (Timeout) (try counter: 1)
Feb 14 10:23:01 universum mfsmount: write file error, inode: 406888, index: 20 - Timeout after 10020 ms (Timeout) (try counter: 1)
Feb 14 10:23:02 universum mfsmount: write file error, inode: 144118, index: 17 - Timeout after 10022 ms (Timeout) (try counter: 4)
Feb 14 10:23:02 universum mfsmount: write file error, inode: 420421, index: 31 - Timeout after 10023 ms (Timeout) (try counter: 4)
Feb 14 10:23:07 universum mfschunkserver[13947]: replication error: Chunkserver communication timed out: 192.168.99.3:9422
Feb 14 10:23:07 universum mfsmaster[32350]: (192.168.99.137:9422) chunk: 000000000007C604 replication status: Unknown LizardFS error
Feb 14 10:23:07 universum mfsmount: write file error, inode: 420348, index: 5 - Timeout after 10021 ms (Timeout) (try counter: 4)
Feb 14 10:23:07 universum mfsmount: write file error, inode: 420466, index: 8 - Timeout after 10021 ms (Timeout) (try counter: 4)
Feb 14 10:23:08 universum mfsmount: write file error, inode: 420409, index: 31 - Timeout after 10022 ms (Timeout) (try counter: 4)
Feb 14 10:23:09 universum mfsmaster[32350]: (192.168.99.3:9422) chunk: 0000000000002D18 replication status: Unknown LizardFS error
Feb 14 10:23:09 universum mfschunkserver[13947]: Did not manage to receive packet header
Feb 14 10:23:09 universum mfsmount: write file error, inode: 420555, index: 26 - Timeout after 10022 ms (Timeout) (try counter: 2)
Feb 14 10:23:10 universum mfsmount: write file error, inode: 401632, index: 36 - Timeout after 10021 ms (Timeout) (try counter: 4)
Feb 14 10:23:10 universum mfsmount: write file error, inode: 420501, index: 26 - Timeout after 10022 ms (Timeout) (try counter: 2)
Feb 14 10:23:10 universum mfsmount: write file error, inode: 147480, index: 0 - Timeout after 10022 ms (Timeout) (try counter: 2)
Feb 14 10:23:11 universum mfsmount: write file error, inode: 144336, index: 23 - Timeout after 10021 ms (Timeout) (try counter: 2)
Feb 14 10:23:12 universum mfsmount: write file error, inode: 420435, index: 3 - Timeout after 10022 ms (Timeout) (try counter: 2)
Feb 14 10:23:12 universum mfsmount: write file error, inode: 420236, index: 32 - Timeout after 10021 ms (Timeout) (try counter: 2)
Feb 14 10:23:16 universum kernel: [1010514.070846] BTRFS info (device dm-2): found 235 extents
Feb 14 10:23:16 universum mfschunkserver[13947]: replication error: Can't connect to 192.168.99.7:9422
Feb 14 10:23:16 universum mfsmaster[32350]: (192.168.99.137:9422) chunk: 00000000000791A3 replication status: Unknown LizardFS error
Feb 14 10:23:17 universum mfsmount: write file error, inode: 420212, index: 32 - Timeout after 10024 ms (Timeout) (try counter: 2)
Feb 14 10:23:17 universum mfschunkserver[13947]: Did not manage to receive packet header
Feb 14 10:23:17 universum mfsmount: write file error, inode: 420506, index: 28 - Timeout after 10023 ms (Timeout) (try counter: 2)
Feb 14 10:23:18 universum mfsmount: write file error, inode: 420638, index: 19 - Timeout after 10018 ms (Timeout) (try counter: 2)
Feb 14 10:23:19 universum mfsmount: write file error, inode: 145107, index: 4 - Timeout after 10016 ms (Timeout) (try counter: 2)
Feb 14 10:23:20 universum mfsmount: write file error, inode: 420186, index: 13 - Timeout after 10020 ms (Timeout) (try counter: 2)
Feb 14 10:23:20 universum mfsmount: write file error, inode: 141217, index: 9 - Timeout after 10016 ms (Timeout) (try counter: 2)
Feb 14 10:23:20 universum mfsmount: write file error, inode: 402299, index: 13 - Timeout after 10016 ms (Timeout) (try counter: 4)
Feb 14 10:23:21 universum mfsmount: write file error, inode: 419405, index: 17 - Timeout after 10018 ms (Timeout) (try counter: 2)
Feb 14 10:23:22 universum mfsmount: write file error, inode: 420795, index: 0 - Timeout after 10020 ms (Timeout) (try counter: 1)
Feb 14 10:23:22 universum mfsmount: write file error, inode: 420436, index: 29 - Timeout after 10021 ms (Timeout) (try counter: 2)
Feb 14 10:23:24 universum mfsmaster[32350]: (192.168.99.3:9422) chunk: 00000000000065C0 replication status: Unknown LizardFS error
Feb 14 10:23:27 universum mfsmount: write file error, inode: 144065, index: 23 - Timeout after 10034 ms (Timeout) (try counter: 2)
Feb 14 10:23:27 universum mfsmount: write file error, inode: 420617, index: 16 - Timeout after 10013 ms (Timeout) (try counter: 2)
Feb 14 10:23:28 universum mfsmount: write file error, inode: 420308, index: 6 - Timeout after 10012 ms (Timeout) (try counter: 2)
Feb 14 10:23:28 universum kernel: [1010526.980680] BTRFS info (device dm-2): relocating block group 6873889636352 flags 1
Feb 14 10:23:29 universum mfsmount: write file error, inode: 406888, index: 20 - Timeout after 10019 ms (Timeout) (try counter: 2)
Feb 14 10:23:30 universum mfsmount: write file error, inode: 144118, index: 17 - Timeout after 10019 ms (Timeout) (try counter: 5)
Feb 14 10:23:30 universum mfsmount: write file error, inode: 420348, index: 5 - Timeout after 10020 ms (Timeout) (try counter: 5)
Feb 14 10:23:30 universum mfsmount: write file error, inode: 420421, index: 31 - Timeout after 10021 ms (Timeout) (try counter: 5)
Feb 14 10:23:31 universum mfsmount: write file error, inode: 420466, index: 8 - Timeout after 10020 ms (Timeout) (try counter: 5)
Feb 14 10:23:32 universum mfsmount: write file error, inode: 420409, index: 31 - Timeout after 10021 ms (Timeout) (try counter: 5)
Feb 14 10:23:32 universum mfsmount: write file error, inode: 420555, index: 26 - Timeout after 10006 ms (Timeout) (try counter: 3)
Feb 14 10:23:48 universum mfsmount: write file error, inode: 420617, index: 16 - Timeout after 10028 ms (Timeout) (try counter: 3)
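
To get an overview of a log excerpt like the one above, the mfsmount errors can be grouped by operation, inode and chunk index. The following is a minimal Python sketch, not a LizardFS tool: the regular expression is derived only from the line format visible in this excerpt, and `summarize` and `worst` are hypothetical helper names.

```python
import re
from collections import defaultdict

# Matches the mfsmount error lines as they appear in the excerpt above.
# The format is inferred from these log lines only (assumption).
LINE_RE = re.compile(
    r"mfsmount: (?P<op>read|write) file error, inode: (?P<inode>\d+), "
    r"index: (?P<index>\d+).*\(try counter: (?P<tries>\d+)\)"
)


def summarize(lines):
    """Return {(op, inode, index): (error_count, max_try_counter)}."""
    stats = defaultdict(lambda: [0, 0])
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # skip dockerd, kernel, mfsmaster and other lines
        key = (m["op"], int(m["inode"]), int(m["index"]))
        stats[key][0] += 1
        stats[key][1] = max(stats[key][1], int(m["tries"]))
    return {key: tuple(value) for key, value in stats.items()}


def worst(stats, n=10):
    """Return the n entries with the highest try counter, highest first."""
    return sorted(stats.items(), key=lambda kv: kv[1][1], reverse=True)[:n]
```

For example, `worst(summarize(open("/var/log/syslog")))` would list the ten entries with the highest try counter. An inode that keeps reappearing with a growing counter may point at a single chunk that never completes rather than at a general slowdown.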

borkd commented Feb 14, 2018

You still have a lot of moving parts there, and your initial overall config, while not technically invalid, shows a lot of fragility when it has to absorb the heavy I/O of repairs or rebalancing (and the ability to heal automatically is likely why you chose a distributed filesystem in the first place). What else could be looked at:

  • When you are replicating from endangered chunks, any unnecessary I/O on the devices that house them will send latencies sky high. If your hardware is on the budget side to begin with, this will hurt.
  • The copious graphs and logs you have pasted here are missing the most important data: the history of read and write operation times.
  • I have no (good) experience with btrfs, but the system reports rebalancing while it is trying to replicate. COW filesystems are fantastic when used right, but in edge situations they amplify I/O, and this can percolate up the layers and cause problems elsewhere.

borkd commented Feb 14, 2018

To keep data online and usable with minimal disturbance while life happens, plan your topology and failure domains well, and borrow replication factors from the sizing of small expedition teams:

  • 1 is an accident waiting to happen
  • 2 is a witness
  • 3 is a rescue
  • 4-6 is still manageable and not so big as to become a burden

mwaeckerlin commented Feb 17, 2018

@borkd what should be different, what is too fragile?

Here is an image of my setup:
meine-cloud

Here are the read data operation times:
image

Here are the write data operation times:
image

Currently I don't use the btrfs COW functionality. I tried to make timed snapshots, but that failed miserably and tore everything down.

Do you need any more information?

borkd commented Feb 20, 2018

You are working with a bunch of constraints. For a happy experience, aim for reasonably low latency and minimal variance in write and read times.
Tune replication goals, replication delays during failures/disconnects, and the number of chunk tests per time unit to something that makes sense in your environment. Point-to-point and aggregate bandwidth, latency, and the topology of your network (powerline seems like an obvious bottleneck) will impact read/write latencies. The underlying filesystems will add to that, as will encryption and the block devices themselves.

mwaeckerlin commented Feb 20, 2018

@borkd

Questions:

  • What would you change in my configuration?
  • Which parameters would you set to what values?
  • How should I connect a remote location (powerline) with slower connection?

Regarding Powerline:

  • According to the documentation I read, remote replication with latency, even to another datacenter, should not be a problem. That's why I added that host. Is this wrong?
  • I have now changed the configuration with labels so that all chunks will be stored on only two explicitly defined local hosts. So everything that might still be on the remote host will now migrate to these two local hosts.
  • I'd like to have a copy in another floor / room, just for cases like a fire in a room.
  • The powerline link in fact delivers ~70 Mbit/s (13× slower than local, which is ~900 Mbit/s according to iperf) and has higher latency (~1.5 ms, 6.5× slower than local, which is ~230 µs according to ping).
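
As a quick sanity check on the figures in the list above, here is a minimal Python sketch that recomputes the two ratios and the raw wire time for one chunk. It assumes a local ping time of about 0.23 ms (the value that makes the quoted 6.5× ratio consistent) and a full-size 64 MiB chunk; `transfer_seconds` is a hypothetical helper, and the result ignores protocol overhead, disk I/O and any competing traffic.

```python
# Figures quoted in the list above (iperf and ping measurements).
LOCAL_MBIT = 900.0
POWERLINE_MBIT = 70.0
POWERLINE_RTT_MS = 1.5
LOCAL_RTT_MS = 0.23  # assumption: ~230 microseconds, see the text above

# Assumption: a full-size chunk of 64 MiB.
CHUNK_BYTES = 64 * 1024 * 1024


def transfer_seconds(n_bytes, mbit_per_s):
    """Raw wire time for n_bytes on an otherwise idle link."""
    return n_bytes * 8 / (mbit_per_s * 1_000_000)


bandwidth_ratio = LOCAL_MBIT / POWERLINE_MBIT    # about 12.9
latency_ratio = POWERLINE_RTT_MS / LOCAL_RTT_MS  # about 6.5

print(f"bandwidth ratio: {bandwidth_ratio:.1f}x")
print(f"latency ratio:   {latency_ratio:.1f}x")
print(f"64 MiB over powerline: {transfer_seconds(CHUNK_BYTES, POWERLINE_MBIT):.2f} s")
print(f"64 MiB over local LAN: {transfer_seconds(CHUNK_BYTES, LOCAL_MBIT):.2f} s")
```

This is only wire time on an idle link. It does not model how LizardFS actually schedules writes or replication, so it gives a lower bound on per-chunk transfer time, nothing more.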

As far as I understand, a client writes to only one chunkserver, and that chunkserver replicates later. Likewise, does it read the data from one nearby chunkserver? If so, a slow connection (powerline) from one replica to one other chunkserver should not be a huge problem? Or am I missing something?

borkd commented Feb 20, 2018

You seem to severely underestimate the impact of everyone competing for the same significant constraint. If I understood what you are trying to achieve, the powerline machine needs to receive a replica of metadata and any and all chunks that are otherwise spread on 3 well connected machines, in real time. See the problem yet? Once the latencies stack up beyond hardcoded/configured timeout thresholds your experience will deteriorate.

MooseFS and LizardFS both come with decent defaults for gigabit ethernet (end-to-end). If you feel realtime replication over a slow, high-latency link is a must-have for you, then you will need to spend some time experimenting and tweaking to adjust for the peculiarities of your specific topology. Georeplication is not easy.

Another workaround: write to folders and files with replication goals spanning well connected machines only. Switch to a goal that consists of the original set of labels PLUS powerline after a while, ideally when powerline is not doing any other data transfers. Play with network QoS to ensure the metalogger on your powerline machine is never starved. Chunks won't be worth much without metadata.

Good luck.

Required reading:
https://github.com/lizardfs/lizardfs/search?q=geo&type=Issues
https://github.com/lizardfs/lizardfs/search?q=topology&type=Issues

mwaeckerlin commented Feb 20, 2018

Just to make this clear: I am going to take the remote location out of the configuration as soon as it is empty. Currently, weeks after I removed it from the goals, it still contains 70k chunks (total chunks are ~512k). So I expect it to take at least one more month to get rid of it.

I'm going to read the links, thank you, @borkd.

What else can I do to improve the performance?

borkd commented Feb 20, 2018

If moving the machine to the same network to complete the drain in a couple of hours is out of the question, try setting REPLICATION_BANDWIDTH_LIMIT_KBPS on the well connected chunkservers and CHUNKS_READ_REP_LIMIT=1 on the master, and wait it out.
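
For a rough idea of how long such a drain could take, here is a back-of-envelope Python sketch. It assumes the 70k remaining chunks mentioned earlier in the thread, treats every chunk as a full 64 MiB (an upper bound, since real chunks are often smaller), and uses the ~70 Mbit/s and ~900 Mbit/s link speeds reported above. `drain_hours` is a hypothetical helper, and the estimate ignores protocol overhead, disk I/O, replication limits and competing traffic.

```python
# Assumption: every remaining chunk is a full 64 MiB (upper bound).
CHUNK_BYTES = 64 * 1024 * 1024
# Figure from the thread: chunks still stored on the remote host.
REMAINING_CHUNKS = 70_000


def drain_hours(chunks, mbit_per_s, chunk_bytes=CHUNK_BYTES):
    """Upper-bound wire time in hours to move `chunks` chunks over one link."""
    bits = chunks * chunk_bytes * 8
    return bits / (mbit_per_s * 1_000_000) / 3600


# Link speeds as measured with iperf earlier in the thread.
for name, rate in (("powerline", 70), ("gigabit LAN", 900)):
    print(f"{name}: at most ~{drain_hours(REMAINING_CHUNKS, rate):.0f} h")
```

Under these assumptions the wire time is at most about 149 hours (roughly 6 days) over powerline and about 12 hours on the local network. If the drain takes much longer than that, the bottleneck is probably not raw link bandwidth alone.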

mwaeckerlin commented Feb 22, 2018

@cratoo, no, here everything is reachable and LizardFS is running natively, outside of Docker.

IMHO local chunkservers should increase, not lower, the performance. In the longer term, I want to have a copy of each chunk on each chunkserver, so that data is always local (and therefore fast). But before I increase the goals, the performance issue must be fixed.

The real problem started when I began copying all my many TB of data to LizardFS and then noticed that the default goal was only «1», so I increased it to «2». This was about 1-2 months ago, and since then it has replicated only about 50% of all chunks, so 50% are still endangered. This is also very slow. I expect replication to finish in about a month…

mwaeckerlin commented Feb 22, 2018

I suppose the problem arises from the many, many read and write errors I see in the logs. But where do those come from? The network is fine, the disks are fine, so what is the problem? How can I get more information?

universum (master and chunkserver):

Feb 22 14:28:06 universum mfsmount: write file error, inode: 427189, index: 15 - Timeout after 10024 ms (Timeout) (try counter: 28)
Feb 22 14:28:06 universum mfsmount: write file error, inode: 430938, index: 27 - Timeout after 10023 ms (Timeout) (try counter: 27)
Feb 22 14:28:07 universum mfsmount: write file error, inode: 427298, index: 18 - Timeout after 10024 ms (Timeout) (try counter: 22)
Feb 22 14:28:10 universum mfsmount: read file error, inode: 359486, index: 0, chunk: 614969, version: 33 - no valid copies (try counter: 114)
Feb 22 14:28:11 universum mfsmount: write file error, inode: 422319, index: 44 - Timeout after 10022 ms (Timeout) (try counter: 17)
Feb 22 14:28:11 universum mfsmount: write file error, inode: 427255, index: 16 - Timeout after 10022 ms (Timeout) (try counter: 22)
Feb 22 14:28:13 universum mfsmount: write file error, inode: 428006, index: 0 - Timeout after 10026 ms (Timeout) (try counter: 5)
Feb 22 14:28:13 universum mfsmount: write file error, inode: 428111, index: 29 - Timeout after 10032 ms (Timeout) (try counter: 17)
Feb 22 14:28:14 universum mfsmount: write file error, inode: 426441, index: 15 - Timeout after 10026 ms (Timeout) (try counter: 27)
Feb 22 14:28:16 universum mfsmount: write file error, inode: 314118, index: 33 - Timeout after 10026 ms (Timeout) (try counter: 27)
Feb 22 14:28:16 universum mfsmount: write file error, inode: 427374, index: 20 - Timeout after 10026 ms (Timeout) (try counter: 27)
Feb 22 14:28:16 universum mfsmount: write file error, inode: 410856, index: 36 - Timeout after 10035 ms (Timeout) (try counter: 17)
Feb 22 14:28:16 universum mfsmount: write file error, inode: 427426, index: 22 - Timeout after 10022 ms (Timeout) (try counter: 27)
Feb 22 14:28:17 universum mfsmount: write file error, inode: 338037, index: 9 - Timeout after 10024 ms (Timeout) (try counter: 17)
Feb 22 14:28:20 universum mfsmount: read file error, inode: 359486, index: 0, chunk: 614969, version: 33 - no valid copies (try counter: 115)
Feb 22 14:28:21 universum mfsmount: write file error, inode: 412016, index: 13 - Timeout after 10027 ms (Timeout) (try counter: 17)
Feb 22 14:28:21 universum mfsmount: write file error, inode: 431153, index: 28 - Timeout after 10024 ms (Timeout) (try counter: 6)
Feb 22 14:28:23 universum mfsmount: write file error, inode: 319208, index: 32 - Timeout after 10024 ms (Timeout) (try counter: 6)
Feb 22 14:28:23 universum mfsmount: write file error, inode: 427459, index: 23 - Timeout after 10023 ms (Timeout) (try counter: 27)
Feb 22 14:28:24 universum mfsmount: write file error, inode: 428066, index: 24 - Timeout after 10026 ms (Timeout) (try counter: 6)
Feb 22 14:28:26 universum mfsmount: write file error, inode: 318905, index: 31 - Timeout after 10024 ms (Timeout) (try counter: 22)
Feb 22 14:28:26 universum mfsmount: write file error, inode: 404635, index: 35 - Timeout after 10025 ms (Timeout) (try counter: 27)
Feb 22 14:28:26 universum mfsmount: write file error, inode: 147271, index: 0 - Timeout after 10022 ms (Timeout) (try counter: 6)
Feb 22 14:28:27 universum mfsmount: write file error, inode: 427206, index: 16 - Timeout after 10023 ms (Timeout) (try counter: 17)
Feb 22 14:28:27 universum mfsmount: write file error, inode: 431200, index: 29 - Timeout after 10024 ms (Timeout) (try counter: 28)
Feb 22 14:28:30 universum mfsmount: read file error, inode: 359486, index: 0, chunk: 614969, version: 33 - no valid copies (try counter: 116)
Feb 22 14:28:31 universum mfsmount: write file error, inode: 429405, index: 15 - Timeout after 10024 ms (Timeout) (try counter: 17)
Feb 22 14:28:31 universum mfsmount: write file error, inode: 422366, index: 45 - Timeout after 10025 ms (Timeout) (try counter: 17)
Feb 22 14:28:33 universum mfsmount: write file error, inode: 27964, index: 4 - Timeout after 10023 ms (Timeout) (try counter: 4)
Feb 22 14:28:33 universum mfsmount: write file error, inode: 64516, index: 0 - Timeout after 10023 ms (Timeout) (try counter: 6)
Feb 22 14:28:34 universum mfsmount: write file error, inode: 428032, index: 0 - Timeout after 10026 ms (Timeout) (try counter: 22)
Feb 22 14:28:36 universum mfsmount: write file error, inode: 371763, index: 0 - Timeout after 10027 ms (Timeout) (try counter: 10)
Feb 22 14:28:36 universum mfsmount: write file error, inode: 410002, index: 0 - Timeout after 10026 ms (Timeout) (try counter: 13)
Feb 22 14:28:36 universum mfsmount: write file error, inode: 427442, index: 24 - Timeout after 10026 ms (Timeout) (try counter: 28)
Feb 22 14:28:37 universum mfsmount: write file error, inode: 424221, index: 45 - Timeout after 10030 ms (Timeout) (try counter: 28)
Feb 22 14:28:37 universum mfsmount: write file error, inode: 428006, index: 0 - Timeout after 10027 ms (Timeout) (try counter: 6)
Feb 22 14:28:38 universum mfschunkserver[24047]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb 22 14:28:38 universum mfsmaster[6469]: (192.168.99.137:9422) chunk: 000000000007E979 replication status: Unknown LizardFS error
Feb 22 14:28:40 universum mfsmount: read file error, inode: 359486, index: 0, chunk: 614969, version: 33 - no valid copies (try counter: 117)
Feb 22 14:28:40 universum mfsmaster[6469]: (192.168.99.3:9422) chunk: 000000000001BC5C replication status: Unknown LizardFS error
Feb 22 14:28:40 universum mfsmaster[6469]: (192.168.99.3:9422) chunk: 0000000000054DA6 replication status: Unknown LizardFS error
Feb 22 14:28:41 universum mfsmount: write file error, inode: 427189, index: 15 - Timeout after 10031 ms (Timeout) (try counter: 29)
Feb 22 14:28:41 universum mfsmount: write file error, inode: 430938, index: 27 - Timeout after 10022 ms (Timeout) (try counter: 28)
Feb 22 14:28:42 universum mfschunkserver[24047]: replication error: Chunkserver communication timed out
Feb 22 14:28:42 universum mfsmaster[6469]: (192.168.99.137:9422) chunk: 000000000007D4D1 replication status: Unknown LizardFS error
Feb 22 14:28:43 universum mfsmount: write file error, inode: 427298, index: 18 - Timeout after 10025 ms (Timeout) (try counter: 23)
Feb 22 14:28:43 universum mfsmount: write file error, inode: 422319, index: 44 - Timeout after 10025 ms (Timeout) (try counter: 18)
Feb 22 14:28:43 universum mfschunkserver[24047]: Did not manage to receive packet header
Feb 22 14:28:44 universum mfsmount: write file error, inode: 428111, index: 29 - Timeout after 10022 ms (Timeout) (try counter: 18)
Feb 22 14:28:46 universum mfsmount: write file error, inode: 427255, index: 16 - Timeout after 10023 ms (Timeout) (try counter: 23)
Feb 22 14:28:46 universum mfsmount: write file error, inode: 431153, index: 28 - Timeout after 10024 ms (Timeout) (try counter: 7)
Feb 22 14:28:46 universum mfsmount: write file error, inode: 426441, index: 15 - Timeout after 10022 ms (Timeout) (try counter: 28)
Feb 22 14:28:47 universum mfsmount: write file error, inode: 410856, index: 36 - Timeout after 10023 ms (Timeout) (try counter: 18)
Feb 22 14:28:47 universum mfsmount: write file error, inode: 319208, index: 32 - Timeout after 10024 ms (Timeout) (try counter: 7)
Feb 22 14:28:50 universum mfsmount: read file error, inode: 359486, index: 0, chunk: 614969, version: 33 - no valid copies (try counter: 118)
Feb 22 14:28:51 universum mfsmount: write file error, inode: 428066, index: 24 - Timeout after 10024 ms (Timeout) (try counter: 7)
Feb 22 14:28:51 universum mfsmount: write file error, inode: 338037, index: 9 - Timeout after 10024 ms (Timeout) (try counter: 18)
Feb 22 14:28:53 universum mfsmount: write file error, inode: 314118, index: 33 - Timeout after 10024 ms (Timeout) (try counter: 28)
Feb 22 14:28:53 universum mfsmount: write file error, inode: 427374, index: 20 - Timeout after 10024 ms (Timeout) (try counter: 28)
…
Feb 22 14:32:36 universum mfsmount: write file error, inode: 404635, index: 35 - Timeout after 10014 ms (Timeout) (try counter: 34)
Feb 22 14:32:36 universum mfsmount: read file error, inode: 359486, index: 0, chunk: 614969, version: 33 - no valid copies (try counter: 33)
Feb 22 14:32:37 universum mfsmount: write file error, inode: 431200, index: 29 - Timeout after 10049 ms (Timeout) (try counter: 35)
Feb 22 14:32:37 universum mfsmount: write file error, inode: 430938, index: 27 - Timeout after 10026 ms (Timeout) (try counter: 8)
Feb 22 14:32:37 universum mfsmount: write file error, inode: 431153, index: 28 - Timeout after 10039 ms (Timeout) (try counter: 8)
Feb 22 14:32:38 universum mfsmount: write file error, inode: 429405, index: 15 - Timeout after 10024 ms (Timeout) (try counter: 24)
Feb 22 14:32:39 universum mfsmount: write file error, inode: 427189, index: 15 - Timeout after 10029 ms (Timeout) (try counter: 8)
Feb 22 14:32:39 universum mfsmount: write file error, inode: 422366, index: 45 - Timeout after 10029 ms (Timeout) (try counter: 24)
Feb 22 14:32:42 universum mfsmount: write file error, inode: 422319, index: 44 - Timeout after 10020 ms (Timeout) (try counter: 8)
Feb 22 14:32:42 universum mfsmount: write file error, inode: 428032, index: 0 - Timeout after 10022 ms (Timeout) (try counter: 29)
Feb 22 14:32:45 universum mfsmount: write file error, inode: 410856, index: 36 - Timeout after 10022 ms (Timeout) (try counter: 25)
Feb 22 14:32:46 universum mfschunkserver[24047]: replication error: Read from chunkserver error: connection reset by peer (server 192.168.99.3:9422)
Feb 22 14:32:46 universum mfsmaster[6469]: (192.168.99.137:9422) chunk: 000000000007ECD6 replication status: Unknown LizardFS error
Feb 22 14:32:46 universum mfsmount: write file error, inode: 40714, index: 0 - Timeout after 10021 ms (Timeout) (try counter: 7)
Feb 22 14:32:46 universum mfsmount: read file error, inode: 359486, index: 0, chunk: 614969, version: 33 - no valid copies (try counter: 34)
Feb 22 14:32:47 universum mfsmount: write file error, inode: 427255, index: 16 - Timeout after 10035 ms (Timeout) (try counter: 30)
Feb 22 14:32:47 universum mfsmount: write file error, inode: 428111, index: 29 - Timeout after 10035 ms (Timeout) (try counter: 8)
Feb 22 14:32:47 universum mfsmount: write file error, inode: 428066, index: 24 - Timeout after 10023 ms (Timeout) (try counter: 8)
Feb 22 14:32:48 universum mfsmount: write file error, inode: 371763, index: 0 - Timeout after 10020 ms (Timeout) (try counter: 17)
Feb 22 14:32:49 universum mfsmount: write file error, inode: 426441, index: 15 - Timeout after 10034 ms (Timeout) (try counter: 35)
Feb 22 14:32:50 universum mfsmount: write file error, inode: 427998, index: 0 - Timeout after 10034 ms (Timeout) (try counter: 6)
Feb 22 14:32:56 universum mfsmount: read file error, inode: 359486, index: 0, chunk: 614969, version: 33 - no valid copies (try counter: 35)
Feb 22 14:32:58 universum mfsmount: write file error, inode: 427442, index: 24 - Timeout after 10038 ms (Timeout) (try counter: 9)
Feb 22 14:33:00 universum mfsmount: write file error, inode: 427426, index: 22 - Timeout after 10038 ms (Timeout) (try counter: 35)
Feb 22 14:33:00 universum mfsmount: write file error, inode: 319208, index: 32 - Timeout after 10034 ms (Timeout) (try counter: 9)
Feb 22 14:33:00 universum mfsmount: write file error, inode: 424221, index: 45 - Timeout after 10028 ms (Timeout) (try counter: 9)
Feb 22 14:33:00 universum mfsmount: write file error, inode: 427374, index: 20 - Timeout after 10038 ms (Timeout) (try counter: 35)
Feb 22 14:33:01 universum mfsmount: write file error, inode: 427459, index: 23 - Timeout after 10008 ms (Timeout) (try counter: 35)
Feb 22 14:33:01 universum mfsmount: write file error, inode: 431153, index: 28 - Timeout after 10001 ms (Timeout) (try counter: 9)
Feb 22 14:33:01 universum mfsmount: write file error, inode: 430938, index: 27 - Timeout after 10003 ms (Timeout) (try counter: 9)
Feb 22 14:33:03 universum mfsmount: write file error, inode: 427189, index: 15 - Timeout after 10044 ms (Timeout) (try counter: 9)
Feb 22 14:33:03 universum mfsmount: write file error, inode: 427206, index: 16 - Timeout after 10044 ms (Timeout) (try counter: 25)
Feb 22 14:33:06 universum mfsmount: read file error, inode: 359486, index: 0, chunk: 614969, version: 33 - no valid copies (try counter: 36)
Feb 22 14:33:08 universum mfsmount: write file error, inode: 318905, index: 31 - Timeout after 10043 ms (Timeout) (try counter: 30)
Feb 22 14:33:10 universum mfsmount: write file error, inode: 422319, index: 44 - Timeout after 10042 ms (Timeout) (try counter: 9)
Feb 22 14:33:10 universum mfsmount: write file error, inode: 431200, index: 29 - Timeout after 10001 ms (Timeout) (try counter: 36)
Feb 22 14:33:10 universum mfsmount: write file error, inode: 404635, index: 35 - Timeout after 10046 ms (Timeout) (try counter: 35)
Feb 22 14:33:11 universum mfsmount: write file error, inode: 428111, index: 29 - Timeout after 10019 ms (Timeout) (try counter: 9)
Feb 22 14:33:11 universum mfsmount: write file error, inode: 428066, index: 24 - Timeout after 10021 ms (Timeout) (try counter: 9)
Feb 22 14:33:11 universum mfsmount: write file error, inode: 429405, index: 15 - Timeout after 10019 ms (Timeout) (try counter: 25)
Feb 22 14:33:11 universum mfsmount: write file error, inode: 422366, index: 45 - Timeout after 10020 ms (Timeout) (try counter: 25)
Feb 22 14:33:13 universum mfsmount: write file error, inode: 427998, index: 0 - Timeout after 10021 ms (Timeout) (try counter: 7)
Feb 22 14:33:13 universum mfsmount: write file error, inode: 427298, index: 18 - Timeout after 10022 ms (Timeout) (try counter: 1)
Feb 22 14:33:16 universum mfsmount: read file error, inode: 359486, index: 0, chunk: 614969, version: 33 - no valid copies (try counter: 37)
Feb 22 14:33:18 universum mfsmount: write file error, inode: 314118, index: 33 - Timeout after 10021 ms (Timeout) (try counter: 1)
Feb 22 14:33:20 universum mfsmount: write file error, inode: 147509, index: 0 - Timeout after 10023 ms (Timeout) (try counter: 1)
Feb 22 14:33:20 universum mfsmount: write file error, inode: 428032, index: 0 - Timeout after 10022 ms (Timeout) (try counter: 30)
Feb 22 14:33:20 universum mfsmount: write file error, inode: 412016, index: 13 - Timeout after 10021 ms (Timeout) (try counter: 1)
Feb 22 14:33:21 universum mfsmount: write file error, inode: 338037, index: 9 - Timeout after 10021 ms (Timeout) (try counter: 1)
Feb 22 14:33:21 universum mfsmount: write file error, inode: 410856, index: 36 - Timeout after 10020 ms (Timeout) (try counter: 26)
Feb 22 14:33:21 universum mfsmount: write file error, inode: 427255, index: 16 - Timeout after 10019 ms (Timeout) (try counter: 31)
Feb 22 14:33:21 universum mfsmount: write file error, inode: 371763, index: 0 - Timeout after 10021 ms (Timeout) (try counter: 18)
Feb 22 14:33:23 universum mfsmount: write file error, inode: 427442, index: 24 - Timeout after 10021 ms (Timeout) (try counter: 10)
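With this many mfsmount lines it is hard to see which error kinds dominate and which inodes are stuck retrying. A minimal sketch, assuming the log lines have exactly the shape shown above, that groups them by kind and records the highest try counter per inode. The function names `classify` and `summarize` are made up for this illustration and are not part of LizardFS:

```python
import re
from collections import Counter

# Matches mfsmount error lines of the form seen in the syslog excerpts above.
# The ", chunk: ..., version: ..." part only appears on read errors.
LINE_RE = re.compile(
    r"mfsmount: (?P<op>read|write) file error, inode: (?P<inode>\d+), "
    r"index: (?P<index>\d+)(?:, chunk: \d+, version: \d+)? - "
    r"(?P<msg>.*?) \(try counter: (?P<tries>\d+)\)"
)


def classify(msg):
    """Replace variable parts (durations, server addresses) so messages group together."""
    msg = re.sub(r"Timeout after \d+ ms", "Timeout after N ms", msg)
    msg = re.sub(r"\(server [\d.]+:\d+\)", "(server X)", msg)
    return msg


def summarize(lines):
    """Return (count per (operation, error kind), highest try counter per inode)."""
    by_kind = Counter()
    max_tries = {}
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not an mfsmount read/write error line
        by_kind[(m["op"], classify(m["msg"]))] += 1
        inode = int(m["inode"])
        max_tries[inode] = max(max_tries.get(inode, 0), int(m["tries"]))
    return by_kind, max_tries
```

Calling `summarize(open("/var/log/syslog", errors="replace"))` on each host would give one table per host that is easier to compare than the raw dump.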

urknall (metalogger, chunkserver (used in the goals)):

Feb 22 14:38:40 urknall dockerd[1337]: time="2018-02-22T14:38:40.224990600+01:00" level=info msg="NetworkDB stats urknall(31763b0ca947) - netID:ksx36pb1e18i5cywyl0itazk1 leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"                                                                                                                                                     
Feb 22 14:38:40 urknall dockerd[1337]: time="2018-02-22T14:38:40.225081464+01:00" level=info msg="NetworkDB stats urknall(31763b0ca947) - netID:87vvi5rhozzl8qws2ufe7ltl2 leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"                                                                                                                                                     
Feb 22 14:38:40 urknall dockerd[1337]: time="2018-02-22T14:38:40.225155639+01:00" level=info msg="NetworkDB stats urknall(31763b0ca947) - netID:mh26wxh3legx189b8ummizzvg leaving:false netPeers:2 entries:4 Queue qLen:0 netMsg/s:0"                                                                                                                                                     
Feb 22 14:38:45 urknall mfsmount: read file error, inode: 359131, index: 0, chunk: 614967, version: 23 - no valid copies (try counter: 34)                                                   
Feb 22 14:38:46 urknall mfsmount: write file error, inode: 67783, index: 0 - Chunk write error (Disconnected) (try counter: 38)                                                              
Feb 22 14:38:47 urknall mfsmount: write file error, inode: 427967, index: 0 - Chunk write error (Disconnected) (try counter: 1)                                                              
Feb 22 14:38:55 urknall mfsmount: read file error, inode: 359131, index: 0, chunk: 614967, version: 23 - no valid copies (try counter: 35)                                                   
Feb 22 14:38:58 urknall systemd[1]: Started Session 9590 of user nagios.                                                                                                                     
Feb 22 14:39:01 urknall CRON[6403]: (root) CMD (  [ -x /usr/lib/php5/maxlifetime ] && [ -x /usr/lib/php5/sessionclean ] && [ -d /var/lib/php5 ] && /usr/lib/php5/sessionclean /var/lib/php5 $(/usr/lib/php5/maxlifetime))                                                                                                                                                                 
Feb 22 14:39:03 urknall mfsmount: write file error, inode: 67784, index: 0 - Timeout after 30034 ms (Timeout) (try counter: 5)                                                               
Feb 22 14:39:05 urknall mfsmount: read file error, inode: 359131, index: 0, chunk: 614967, version: 23 - no valid copies (try counter: 36)                                                   
Feb 22 14:39:07 urknall mfsmetalogger[1371]: sessions downloaded 742B/0.000795s (0.933 MB/s)                                                                                                 
Feb 22 14:39:07 urknall mfschunkserver[22581]: replication error: Can't connect to 192.168.99.137:9422                                                                                       
Feb 22 14:39:07 urknall mfsmount: write file error, inode: 147140, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 41)                  
Feb 22 14:39:07 urknall mfschunkserver[22581]: replication error: Can't connect to 192.168.99.137:9422                                                                                       
Feb 22 14:39:07 urknall mfsmount: write file error, inode: 427967, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 2)                   
Feb 22 14:39:08 urknall mfsmount: write file error, inode: 67783, index: 0 - Chunk write error (Disconnected) (try counter: 39)                                                              
Feb 22 14:39:15 urknall mfsmount: read file error, inode: 359131, index: 0, chunk: 614967, version: 23 - no valid copies (try counter: 37)                                                   
Feb 22 14:39:19 urknall mfsmount: write file error, inode: 427967, index: 0 - Chunk write error (Disconnected) (try counter: 3)
Feb 22 14:39:25 urknall mfsmount: read file error, inode: 359131, index: 0, chunk: 614967, version: 23 - no valid copies (try counter: 38)
Feb 22 14:39:30 urknall mfsmount: write file error, inode: 427967, index: 0 - Chunk write error (Disconnected) (try counter: 4)
Feb 22 14:39:34 urknall mfsmount: write file error, inode: 67784, index: 0 - Timeout after 30021 ms (Timeout) (try counter: 6)
Feb 22 14:39:35 urknall mfsmount: read file error, inode: 359131, index: 0, chunk: 614967, version: 23 - no valid copies (try counter: 39)
Feb 22 14:39:38 urknall systemd[1]: Started Session 9593 of user nagios.
Feb 22 14:39:42 urknall mfsmount: write file error, inode: 427967, index: 0 - Chunk write error (Disconnected) (try counter: 5)
Feb 22 14:39:45 urknall mfsmount: read file error, inode: 359131, index: 0, chunk: 614967, version: 23 - no valid copies (try counter: 40)
Feb 22 14:39:47 urknall mfsmount: write file error, inode: 147140, index: 0 - Timeout after 30045 ms (Timeout) (try counter: 42)
Feb 22 14:39:48 urknall mfsmount: write file error, inode: 67783, index: 0 - Timeout after 30039 ms (Timeout) (try counter: 40)
Feb 22 14:39:50 urknall mfsmount: write file error, inode: 67784, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 7)
Feb 22 14:39:55 urknall mfsmount: read file error, inode: 359131, index: 0, chunk: 614967, version: 23 - no valid copies (try counter: 41)
Feb 22 14:39:56 urknall systemd[1]: Started Session 9594 of user nagios.
Feb 22 14:40:00 urknall mfsmetalogger[1371]: sessions downloaded 742B/0.000703s (1.055 MB/s)
Feb 22 14:40:05 urknall mfsmount: read file error, inode: 359131, index: 0, chunk: 614967, version: 23 - no valid copies (try counter: 42)

Why can't urknall connect to universum when the port is open?

marc@urknall:~$ nmap -p 9422 192.168.99.137

Starting Nmap 7.01 ( https://nmap.org ) at 2018-02-22 14:41 CET
Nmap scan report for universum (192.168.99.137)
Host is up (0.00045s latency).
PORT     STATE SERVICE
9422/tcp open  unknown
Feb 22 14:39:07 urknall mfschunkserver[22581]: replication error: Can't connect to 192.168.99.137:9422                                                                                       
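The nmap result only shows that the port accepted a connection at that one moment, while the errors in the log come and go. A minimal sketch, assuming plain TCP reachability is what matters here, that repeats the connect attempt and records failures and connect latency. The helper name `probe` and its default values are made up for this illustration:

```python
import socket
import time


def probe(host, port, attempts=20, timeout=2.0, pause=0.5):
    """Attempt a full TCP connect `attempts` times.

    Returns (failures, latencies): failures is a list of (attempt index, error),
    latencies is a list of connect times in seconds for successful attempts.
    """
    failures = []
    latencies = []
    for i in range(attempts):
        start = time.monotonic()
        try:
            # create_connection performs the full TCP handshake, then we close at once.
            with socket.create_connection((host, port), timeout=timeout):
                latencies.append(time.monotonic() - start)
        except OSError as exc:  # covers refused connections and timeouts
            failures.append((i, repr(exc)))
        time.sleep(pause)
    return failures, latencies
```

Running `probe("192.168.99.137", 9422)` from urknall while the replication errors are being logged would show whether connects fail or slow down intermittently. It does not exercise the LizardFS protocol itself, so a clean result would not rule out a problem inside the chunkserver.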

pulsar (metalogger, chunkserver (not in the goals), and a client that is currently writing 8 MB at ~15 kB/s):

Feb 22 14:30:26 pulsar mfsmount: write file error, inode: 428002, index: 0 - Timeout after 30035 ms (Timeout) (try counter: 1)
Feb 22 14:30:27 pulsar mfsmount: write file error, inode: 145229, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 2)
Feb 22 14:30:28 pulsar mfsmount: write file error, inode: 428016, index: 0 - Timeout after 30028 ms (Timeout) (try counter: 26)
Feb 22 14:30:36 pulsar mfsmetalogger[6534]: connection was reset by Master
Feb 22 14:30:37 pulsar mfsmetalogger[6534]: connecting to Master
Feb 22 14:30:37 pulsar mfsmetalogger[6534]: connected to Master
Feb 22 14:30:42 pulsar mfsmount: write file error, inode: 145229, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 3)
Feb 22 14:30:42 pulsar mfsmount: write file error, inode: 428002, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 2)
Feb 22 14:30:53 pulsar mfsmount: write file error, inode: 428016, index: 0 - Chunk write error (Disconnected) (try counter: 27)
Feb 22 14:30:54 pulsar mfsmount: write file error, inode: 428002, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 3)
Feb 22 14:31:06 pulsar mfsmount: write file error, inode: 428002, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 4)
Feb 22 14:31:13 pulsar mfsmount: write file error, inode: 145229, index: 0 - Timeout after 30046 ms (Timeout) (try counter: 4)
Feb 22 14:31:20 pulsar mfsmount: write file error, inode: 428002, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 5)
Feb 22 14:31:24 pulsar systemd[1]: Started Session 72074 of user nagios.
Feb 22 14:31:31 pulsar mfsmount: write file error, inode: 428002, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 6)
Feb 22 14:31:33 pulsar mfsmount: write file error, inode: 428016, index: 0 - Timeout after 30028 ms (Timeout) (try counter: 28)
Feb 22 14:31:43 pulsar mfsmount: write file error, inode: 428002, index: 0 - Chunk write error (Disconnected) (try counter: 7)                                                               
Feb 22 14:31:44 pulsar mfsmount: write file error, inode: 145229, index: 0 - Timeout after 30033 ms (Timeout) (try counter: 5)                                                               
Feb 22 14:31:55 pulsar systemd[1]: Started Session 72075 of user nagios.                                                                                                                     
Feb 22 14:31:56 pulsar mfsmount: write file error, inode: 428016, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 29)                 
Feb 22 14:31:56 pulsar mfsmount: write file error, inode: 145229, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 6)                  
Feb 22 14:31:58 pulsar mfsmetalogger[6534]: sessions downloaded 742B/0.000644s (1.152 MB/s)                                                                                                  
Feb 22 14:32:13 pulsar mfsmount: write file error, inode: 428002, index: 0 - Timeout after 30038 ms (Timeout) (try counter: 8)                                                               
Feb 22 14:32:25 pulsar mfsmount: write file error, inode: 428002, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 9)                    
Feb 22 14:32:25 pulsar mfsmount: write file error, inode: 428016, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 30)                   
Feb 22 14:32:25 pulsar mfsmount: write file error, inode: 145229, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 7)                    
Feb 22 14:32:38 pulsar mfsmount: write file error, inode: 145229, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 8)                  
Feb 22 14:32:46 pulsar mfsmount: write file error, inode: 428016, index: 0 - Chunk write error (Disconnected) (try counter: 31)                                                              
Feb 22 14:32:46 pulsar mfsmount: write file error, inode: 428002, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.3:9422) (try counter: 10)                   
Feb 22 14:33:00 pulsar systemd[1]: Started Session 72076 of user nagios.                                                                                                                     
Feb 22 14:33:03 pulsar mfsmetalogger[6534]: connection was reset by Master                                                                                                                   
Feb 22 14:33:04 pulsar mfsmetalogger[6534]: connecting to Master                                                                                                                             
Feb 22 14:33:04 pulsar mfsmetalogger[6534]: connected to Master                                                                                                                              
Feb 22 14:33:09 pulsar mfsmount: write file error, inode: 145229, index: 0 - Timeout after 30032 ms (Timeout) (try counter: 9)                                                               
Feb 22 14:33:10 pulsar systemd[1]: Started Session 72077 of user nagios.
Feb 22 14:33:13 pulsar mfsmount: write file error, inode: 428002, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 11)
Feb 22 14:33:21 pulsar mfsmount: write file error, inode: 145229, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 10)
Feb 22 14:33:26 pulsar mfsmount: write file error, inode: 428016, index: 0 - Timeout after 30026 ms (Timeout) (try counter: 32)
Feb 22 14:33:33 pulsar mfsmount: write file error, inode: 145229, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 11)
Feb 22 14:33:45 pulsar mfsmount: write file error, inode: 428002, index: 0 - Timeout after 30007 ms (Timeout) (try counter: 12)
Feb 22 14:33:50 pulsar mfsmount: write file error, inode: 145229, index: 0 - Chunk write error (Disconnected) (try counter: 12)
Feb 22 14:34:01 pulsar mfsmount: write file error, inode: 428002, index: 0 - Chunk write error (Disconnected) (try counter: 13)
Feb 22 14:34:01 pulsar mfsmount: write file error, inode: 428016, index: 0 - Chunk write error (Disconnected) (try counter: 33)
Feb 22 14:34:04 pulsar mfsmount: write file error, inode: 145229, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 13)
Feb 22 14:34:07 pulsar systemd[1]: Started Session 72078 of user nagios.
Feb 22 14:34:15 pulsar mfsmount: write file error, inode: 428002, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 14)
Feb 22 14:34:18 pulsar mfsmount: write file error, inode: 145229, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 14)
Feb 22 14:34:21 pulsar mfsmount: write file error, inode: 428016, index: 0 - Chunk write error (Disconnected) (try counter: 34)
Feb 22 14:34:35 pulsar mfsmetalogger[6534]: sessions downloaded 742B/0.000614s (1.208 MB/s)
Feb 22 14:34:36 pulsar mfsmount: write file error, inode: 428002, index: 0 - Read from chunkserver: connection closed by peer (server 192.168.99.137:9422) (try counter: 15)

cratoo commented Feb 22, 2018

IMHO there is also a local problem:

Feb 22 14:30:36 pulsar mfsmetalogger[6534]: connection was reset by Master
Feb 22 14:30:37 pulsar mfsmetalogger[6534]: connecting to Master
Feb 22 14:30:37 pulsar mfsmetalogger[6534]: connected to Master

I don't see any logs from the master in your dump. Does it maybe log to another file?

mwaeckerlin commented Feb 22, 2018

@cratoo, but what problem, and how can I localize it?

Nope, the master logs to syslog as well. Here is its log since the last restart 10 minutes ago:

marc@universum:~$ grep mfsmaster /var/log/syslog
…
Feb 22 14:55:06 universum mfsmaster[6469]: terminate signal received
Feb 22 14:55:06 universum mfsmaster[6469]: connection with ML(192.168.99.2) has been closed by peer
Feb 22 14:55:12 universum mfsmaster[6469]: (192.168.99.3:9422) chunk: 000000000004BDAD replication status: Unknown LizardFS error
Feb 22 14:55:12 universum mfsmaster[6469]: main master server module: closing *:9421
Feb 22 14:55:12 universum mfsmaster[6469]: master <-> tapeservers module: closing socket *:9424
Feb 22 14:55:12 universum mfsmaster[6469]: master <-> chunkservers module: closing *:9420
Feb 22 14:55:12 universum mfsmaster[6469]: master <-> metaloggers module: closing *:9419
Feb 22 14:55:14 universum mfsmaster: set gid to 120
Feb 22 14:55:14 universum mfsmaster: set uid to 114
Feb 22 14:55:14 universum mfsmaster: changed working directory to: /var/lib/mfs
Feb 22 14:55:14 universum mfsmaster: lockfile /var/lib/lizardfs/.mfsmaster.lock created and locked
Feb 22 14:55:14 universum mfsmaster: sessions have been loaded
Feb 22 14:55:14 universum mfsmaster: initialized sessions from file /var/lib/lizardfs/sessions.mfs
Feb 22 14:55:14 universum mfsmaster: initialized exports from file /etc/mfs/mfsexports.cfg
Feb 22 14:55:14 universum mfsmaster: initialized topology from file /etc/mfs/mfstopology.cfg
Feb 22 14:55:14 universum mfsmaster: initialized goal definitions from file /etc/mfs/mfsgoals.cfg
Feb 22 14:55:14 universum mfsmaster: opened metadata file /var/lib/lizardfs/metadata.mfs
Feb 22 14:55:14 universum mfsmaster: loading objects (files,directories,etc.) from the metadata file
Feb 22 14:55:14 universum mfsmaster: loading names from the metadata file
Feb 22 14:55:14 universum mfsmaster: loading deletion timestamps from the metadata file
Feb 22 14:55:14 universum mfsmaster: loading extra attributes (xattr) from the metadata file
Feb 22 14:55:14 universum mfsmaster: loading access control lists from the metadata file
Feb 22 14:55:14 universum mfsmaster: loading quota entries from the metadata file
Feb 22 14:55:14 universum mfsmaster: loading file locks from the metadata file
Feb 22 14:55:14 universum mfsmaster: loading chunks data from the metadata file
Feb 22 14:55:15 universum mfsmaster: checking filesystem consistency of the metadata file
Feb 22 14:55:15 universum mfsmaster: connecting files and chunks
Feb 22 14:55:15 universum mfsmaster: calculating checksum of the metadata
Feb 22 14:55:16 universum mfsmaster: metadata file /var/lib/lizardfs/metadata.mfs read (424478 inodes including 15130 directory inodes and 409126 file inodes, 522619 chunks)
Feb 22 14:55:16 universum mfsmaster: loaded charts data file from /var/lib/lizardfs/stats.mfs
Feb 22 14:55:16 universum mfsmaster: master <-> metaloggers module: listen on *:9419
Feb 22 14:55:16 universum mfsmaster: master <-> chunkservers module: listen on *:9420
Feb 22 14:55:16 universum mfsmaster: master <-> tapeservers module: listen on (*:9424)
Feb 22 14:55:16 universum mfsmaster: main master server module: listen on *:9421
Feb 22 14:55:16 universum mfsmaster: open files limit: 1024
Feb 22 14:55:16 universum mfsmaster: mfsmaster daemon initialized properly
Feb 22 14:55:17 universum mfsmaster[22205]: chunkserver register begin (packet version: 5) - ip: 192.168.99.2, port: 9422
Feb 22 14:55:17 universum mfsmaster[22205]: chunkserver register end (packet version: 5) - ip: 192.168.99.2, port: 9422, usedspace: 15947834593280 (14852.58 GiB), totalspace: 17988980342784 (16753.54 GiB)
Feb 22 14:55:17 universum mfsmaster[22205]: chunkserver (ip: 192.168.99.2, port 9422) changed its label from '_' to 'pulsar'
Feb 22 14:55:18 universum mfsmaster[22205]: chunkserver register begin (packet version: 5) - ip: 192.168.99.7, port: 9422
Feb 22 14:55:18 universum mfsmaster[22205]: chunkserver register end (packet version: 5) - ip: 192.168.99.7, port: 9422, usedspace: 15342432038912 (14288.75 GiB), totalspace: 21890394423296 (20387.02 GiB)
Feb 22 14:55:18 universum mfsmaster[22205]: chunkserver (ip: 192.168.99.7, port 9422) changed its label from '_' to 'raum'
Feb 22 14:55:48 universum mfsmaster[22205]: terminate signal received
Feb 22 14:55:48 universum mfsmaster[22205]: connection with ML(192.168.99.2) has been closed by peer
Feb 22 14:55:48 universum mfsmaster[22205]: main master server module: closing *:9421
Feb 22 14:55:48 universum mfsmaster[22205]: master <-> tapeservers module: closing socket *:9424
Feb 22 14:55:48 universum mfsmaster[22205]: master <-> chunkservers module: closing *:9420
Feb 22 14:55:48 universum mfsmaster[22205]: master <-> metaloggers module: closing *:9419
Feb 22 14:55:50 universum mfsmaster: set gid to 120
Feb 22 14:55:50 universum mfsmaster: set uid to 114
Feb 22 14:55:50 universum mfsmaster: changed working directory to: /var/lib/mfs
Feb 22 14:55:50 universum mfsmaster: lockfile /var/lib/lizardfs/.mfsmaster.lock created and locked
Feb 22 14:55:50 universum mfsmaster: sessions have been loaded
Feb 22 14:55:50 universum mfsmaster: initialized sessions from file /var/lib/lizardfs/sessions.mfs
Feb 22 14:55:50 universum mfsmaster: initialized exports from file /etc/mfs/mfsexports.cfg
Feb 22 14:55:50 universum mfsmaster: initialized topology from file /etc/mfs/mfstopology.cfg
Feb 22 14:55:50 universum mfsmaster: initialized goal definitions from file /etc/mfs/mfsgoals.cfg
Feb 22 14:55:50 universum mfsmaster: opened metadata file /var/lib/lizardfs/metadata.mfs
Feb 22 14:55:50 universum mfsmaster: loading objects (files,directories,etc.) from the metadata file
Feb 22 14:55:50 universum mfsmaster: loading names from the metadata file
Feb 22 14:55:50 universum mfsmaster: loading deletion timestamps from the metadata file
Feb 22 14:55:50 universum mfsmaster: loading extra attributes (xattr) from the metadata file
Feb 22 14:55:50 universum mfsmaster: loading access control lists from the metadata file
Feb 22 14:55:50 universum mfsmaster: loading quota entries from the metadata file
Feb 22 14:55:50 universum mfsmaster: loading file locks from the metadata file
Feb 22 14:55:50 universum mfsmaster: loading chunks data from the metadata file
Feb 22 14:55:51 universum mfsmaster: checking filesystem consistency of the metadata file
Feb 22 14:55:51 universum mfsmaster: connecting files and chunks
Feb 22 14:55:51 universum mfsmaster: calculating checksum of the metadata
Feb 22 14:55:52 universum mfsmaster: metadata file /var/lib/lizardfs/metadata.mfs read (424478 inodes including 15130 directory inodes and 409126 file inodes, 522619 chunks)
Feb 22 14:55:52 universum mfsmaster: loaded charts data file from /var/lib/lizardfs/stats.mfs
Feb 22 14:55:52 universum mfsmaster: master <-> metaloggers module: listen on *:9419
Feb 22 14:55:52 universum mfsmaster: master <-> chunkservers module: listen on *:9420
Feb 22 14:55:52 universum mfsmaster: master <-> tapeservers module: listen on (*:9424)
Feb 22 14:55:52 universum mfsmaster: main master server module: listen on *:9421
Feb 22 14:55:52 universum mfsmaster: open files limit: 1024
Feb 22 14:55:52 universum mfsmaster: mfsmaster daemon initialized properly
Feb 22 14:55:53 universum mfsmaster[29369]: chunkserver register begin (packet version: 5) - ip: 192.168.99.7, port: 9422
Feb 22 14:55:53 universum mfsmaster[29369]: chunkserver register end (packet version: 5) - ip: 192.168.99.7, port: 9422, usedspace: 15342438563840 (14288.76 GiB), totalspace: 21890394193920 (20387.02 GiB)
Feb 22 14:55:53 universum mfsmaster[29369]: chunkserver (ip: 192.168.99.7, port 9422) changed its label from '_' to 'raum'
Feb 22 14:55:55 universum mfsmaster[29369]: chunkserver register begin (packet version: 5) - ip: 192.168.99.3, port: 9422
Feb 22 14:55:56 universum mfsmaster[29369]: chunkserver register end (packet version: 5) - ip: 192.168.99.3, port: 9422, usedspace: 17091632549888 (15917.82 GiB), totalspace: 23957967372288 (22312.60 GiB)
Feb 22 14:55:56 universum mfsmaster[29369]: chunkserver (ip: 192.168.99.3, port 9422) changed its label from '_' to 'urknall'
Feb 22 14:55:57 universum mfsmaster[29369]: chunkserver register begin (packet version: 5) - ip: 192.168.99.2, port: 9422
Feb 22 14:55:57 universum mfsmaster[29369]: chunkserver register end (packet version: 5) - ip: 192.168.99.2, port: 9422, usedspace: 15947832352768 (14852.58 GiB), totalspace: 17988977754112 (16753.54 GiB)
Feb 22 14:55:57 universum mfsmaster[29369]: chunkserver (ip: 192.168.99.2, port 9422) changed its label from '_' to 'pulsar'
Feb 22 14:55:57 universum mfsmaster[29369]: chunkserver register begin (packet version: 5) - ip: 192.168.99.137, port: 9422
Feb 22 14:55:58 universum mfsmaster[29369]: chunkserver register end (packet version: 5) - ip: 192.168.99.137, port: 9422, usedspace: 10200488116224 (9499.94 GiB), totalspace: 15982325530624 (14884.70 GiB)
Feb 22 14:55:58 universum mfsmaster[29369]: chunkserver (ip: 192.168.99.137, port 9422) changed its label from '_' to 'universum'
Feb 22 14:57:08 universum mfsmaster[29369]: connection with CS(192.168.99.137) has been closed by peer
Feb 22 14:57:08 universum mfsmaster[29369]: chunkserver disconnected - ip: 192.168.99.137, port: 9422, usedspace: 10200478605312 (9499.94 GiB), totalspace: 15982325661696 (14884.70 GiB)
Feb 22 14:57:09 universum mfsmaster[29369]: chunkserver register begin (packet version: 5) - ip: 192.168.99.137, port: 9422
Feb 22 14:57:09 universum mfsmaster[29369]: chunkserver register end (packet version: 5) - ip: 192.168.99.137, port: 9422, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)
Feb 22 14:57:09 universum mfsmaster[29369]: chunkserver (ip: 192.168.99.137, port 9422) changed its label from '_' to 'universum'
Feb 22 14:58:07 universum mfsmaster[29369]: connection with CS(192.168.99.3) has been closed by peer
Feb 22 14:58:07 universum mfsmaster[29369]: chunkserver disconnected - ip: 192.168.99.3, port: 9422, usedspace: 17091584020480 (15917.78 GiB), totalspace: 23957967699968 (22312.60 GiB)
Feb 22 14:59:55 universum mfsmaster[29369]: chunkserver register begin (packet version: 5) - ip: 192.168.99.3, port: 9422
Feb 22 14:59:55 universum mfsmaster[29369]: chunkserver register end (packet version: 5) - ip: 192.168.99.3, port: 9422, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)
Feb 22 14:59:55 universum mfsmaster[29369]: chunkserver (ip: 192.168.99.3, port 9422) changed its label from '_' to 'urknall'
Feb 22 15:01:34 universum mfsmaster[29369]: (192.168.99.137:9422) chunk: 000000000007FFC2 replication status: Unknown LizardFS error
Feb 22 15:01:34 universum mfsmaster[29369]: (192.168.99.137:9422) chunk: 0000000000080601 replication status: Unknown LizardFS error
Feb 22 15:01:54 universum mfsmaster[29369]: (192.168.99.7:9422) chunk: 00000000000B96D4 replication status: Unknown LizardFS error
Feb 22 15:01:55 universum mfsmaster[29369]: chunkserver disconnected - ip: 192.168.99.2, port: 9422, usedspace: 15947886387200 (14852.63 GiB), totalspace: 17988980539392 (16753.54 GiB)
Feb 22 15:01:55 universum mfsmaster[29369]: chunk 00000000000987b8 has not enough valid parts (1) consider repairing it manually
Feb 22 15:01:55 universum mfsmaster[29369]: chunk 00000000000987b8_00000011 - invalid part on (192.168.99.137 - ver:00000010)
Feb 22 15:01:56 universum mfsmaster[29369]: chunkserver register begin (packet version: 5) - ip: 192.168.99.2, port: 9422
Feb 22 15:01:56 universum mfsmaster[29369]: chunkserver register end (packet version: 5) - ip: 192.168.99.2, port: 9422, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)
Feb 22 15:01:56 universum mfsmaster[29369]: chunkserver (ip: 192.168.99.2, port 9422) changed its label from '_' to 'pulsar'
Feb 22 15:01:57 universum mfsmaster[29369]: (192.168.99.7:9422) chunk: 00000000000B96D5 replication status: Unknown LizardFS error
Feb 22 15:02:08 universum mfsmaster[29369]: chunk 0000000000096239 has not enough valid parts (1) consider repairing it manually
Feb 22 15:02:08 universum mfsmaster[29369]: chunk 0000000000096239_00000021 - invalid part on (192.168.99.7 - ver:00000020)
Feb 22 15:02:11 universum mfsmaster[29369]: connection with CS(192.168.99.7) has been closed by peer
Feb 22 15:02:11 universum mfsmaster[29369]: chunkserver disconnected - ip: 192.168.99.7, port: 9422, usedspace: 15342580011008 (14288.89 GiB), totalspace: 21890394652672 (20387.02 GiB)
Feb 22 15:02:12 universum mfsmaster[29369]: chunkserver register begin (packet version: 5) - ip: 192.168.99.7, port: 9422
Feb 22 15:02:12 universum mfsmaster[29369]: chunkserver register end (packet version: 5) - ip: 192.168.99.7, port: 9422, usedspace: 0 (0.00 GiB), totalspace: 0 (0.00 GiB)
Feb 22 15:02:12 universum mfsmaster[29369]: chunkserver (ip: 192.168.99.7, port 9422) changed its label from '_' to 'raum'
Feb 22 15:02:36 universum mfsmaster[29369]: (192.168.99.137:9422) chunk: 000000000007C497 replication status: Unknown LizardFS error
Feb 22 15:02:36 universum mfsmaster[29369]: (192.168.99.137:9422) chunk: 000000000007BED3 replication status: Unknown LizardFS error
Feb 22 15:02:57 universum mfsmaster[29369]: chunk 0000000000096225 has not enough valid parts (1) consider repairing it manually
Feb 22 15:02:57 universum mfsmaster[29369]: chunk 0000000000096225_00000015 - invalid part on (192.168.99.7 - ver:00000014)
Feb 22 15:03:37 universum mfsmaster[29369]: (192.168.99.137:9422) chunk: 000000000007FAEF replication status: Unknown LizardFS error
Feb 22 15:03:37 universum mfsmaster[29369]: (192.168.99.137:9422) chunk: 000000000007F4B0 replication status: Unknown LizardFS error
Feb 22 15:04:09 universum mfsmaster[29369]: (192.168.99.137:9422) chunk: 000000000007AE7B replication status: Unknown LizardFS error
Feb 22 15:04:09 universum mfsmaster[29369]: (192.168.99.137:9422) chunk: 000000000007A55A replication status: Unknown LizardFS error
Feb 22 15:05:12 universum mfsmaster[29369]: (192.168.99.137:9422) chunk: 00000000000824BE replication status: Unknown LizardFS error
Feb 22 15:05:19 universum mfsmaster[29369]: (192.168.99.137:9422) chunk: 000000000007C82E replication status: Unknown LizardFS error
Feb 22 15:06:14 universum mfsmaster[29369]: (192.168.99.137:9422) chunk: 000000000007CA48 replication status: Unknown LizardFS error
Feb 22 15:06:16 universum mfsmaster[29369]: (192.168.99.137:9422) chunk: 000000000007D542 replication status: Unknown LizardFS error
Feb 22 15:07:02 universum mfsmaster[29369]: chunk hasn't been deleted since previous loop - retry
Feb 22 15:07:02 universum mfsmaster[29369]: chunk 00000000000987b8 has not enough valid parts (1) consider repairing it manually
Feb 22 15:07:02 universum mfsmaster[29369]: chunk 00000000000987b8_00000011 - invalid part on (192.168.99.137 - ver:00000010)
Feb 22 15:07:15 universum mfsmaster[29369]: chunk 0000000000096239 has not enough valid parts (1) consider repairing it manually
Feb 22 15:07:15 universum mfsmaster[29369]: chunk 0000000000096239_00000021 - invalid part on (192.168.99.7 - ver:00000020)
Feb 22 15:07:16 universum mfsmaster[29369]: (192.168.99.137:9422) chunk: 000000000007ECFE replication status: Unknown LizardFS error
Feb 22 15:07:17 universum mfsmaster[29369]: (192.168.99.137:9422) chunk: 0000000000078E00 replication status: Unknown LizardFS error
Feb 22 15:07:20 universum mfsmaster[29369]: chunk 0000000000096237 has not enough valid parts (1) consider repairing it manually
Feb 22 15:07:20 universum mfsmaster[29369]: chunk 0000000000096237_00000017 - invalid part on (192.168.99.7 - ver:00000016)
Feb 22 15:07:53 universum mfsmaster[29369]: (192.168.99.137:9422) chunk: 0000000000081662 replication status: Unknown LizardFS error
Feb 22 15:07:53 universum mfsmaster[29369]: (192.168.99.137:9422) chunk: 000000000007C42A replication status: Unknown LizardFS error
Feb 22 15:08:00 universum mfsmaster[29369]: (192.168.99.3:9422) chunk: 00000000000B9660 deletion status: No such chunk
Feb 22 15:08:05 universum mfsmaster[29369]: chunk 0000000000096225 has not enough valid parts (1) consider repairing it manually
Feb 22 15:08:05 universum mfsmaster[29369]: chunk 0000000000096225_00000015 - invalid part on (192.168.99.7 - ver:00000014)
@mwaeckerlin

mwaeckerlin commented Feb 22, 2018

For the first time, I see a massive improvement!

In #611 @Blackpaw recommends:

NR_OF_NETWORK_WORKERS=4
NR_OF_HDD_WORKERS_PER_NETWORK_WORKER=4
PERFORM_FSYNC = 0

I configured this and get:

8388608 bytes (8.4 MB, 8.0 MiB) copied, 7.95291 s, 1.1 MB/s
8388608 bytes (8.4 MB, 8.0 MiB) copied, 11.238 s, 746 kB/s                         
8388608 bytes (8.4 MB, 8.0 MiB) copied, 10.072 s, 833 kB/s                                   
8388608 bytes (8.4 MB, 8.0 MiB) copied, 0.888299 s, 9.4 MB/s                                 
8388608 bytes (8.4 MB, 8.0 MiB) copied, 1.09541 s, 7.7 MB/s

That's not much, but more than I ever had (it used to be just ~15kB/s).

@mwaeckerlin

mwaeckerlin commented Feb 22, 2018

Following @guestisp in #611, I set the following in the master config (he wrote that he set it to 2000, @psarna wrote that this is too much, so I only raised it by a factor of 10):

CHUNKS_WRITE_REP_LIMIT = 20
CHUNKS_READ_REP_LIMIT = 100

It doesn't seem to get worse, but it does not seem to influence the write speed either. Perhaps balancing will be faster now?

@mwaeckerlin

mwaeckerlin commented Feb 22, 2018

More comments or ideas?

Write speed is now ~10MB/s, still too slow by a factor of ~10.

@mwaeckerlin

mwaeckerlin commented Feb 22, 2018

My current master-config:

LOAD_FACTOR_PENALTY = 0.5
#CHUNKS_READ_REP_LIMIT = 4                                                                   
ENDANGERED_CHUNKS_PRIORITY = 0.6
REJECT_OLD_CLIENTS = 1
CHUNKS_WRITE_REP_LIMIT = 20
CHUNKS_READ_REP_LIMIT = 100

My current chunk config:

MASTER_HOST = universum
#HDD_TEST_FREQ = 20
LABEL = universum
#CSSERV_TIMEOUT = 20
#REPLICATION_BANDWIDTH_LIMIT_KBPS = 1000
ENABLE_LOAD_FACTOR = 1
#NICE_LEVEL = 0
NR_OF_NETWORK_WORKERS = 10
NR_OF_HDD_WORKERS_PER_NETWORK_WORKER = 4
PERFORM_FSYNC = 0

Any comments or possible improvements regarding this config?

@borkd

borkd commented Feb 22, 2018

What are the average and std.dev. of the chunk size on the machine you are draining?

@mwaeckerlin

mwaeckerlin commented Feb 22, 2018

@borkd, I don't understand. What is «the average and std.dev» and how do I get them?

BTW: The new parameters do better than before, but still not well enough. Processes still crash due to write failures / timeouts. But the chunk distribution is much faster now, without impact on the other operations; on the contrary, they are still much better than yesterday.

Is it somehow dangerous to set PERFORM_FSYNC = 0, or why isn't this default?

I reduced:

CHUNKS_WRITE_REP_LIMIT = 4
CHUNKS_READ_REP_LIMIT = 20

What I am missing in the documentation is more background information on the parameters and more recommendations. What do you achieve by changing which parameter, etc.

@borkd

borkd commented Feb 22, 2018

@mwaeckerlin, you can scan the underlying filesystem to get an idea of what the size distribution looks like. If you have lots of small files and small chunks, the end-to-end latencies could be the rate-determining factor. For predominantly large files the bottlenecks will likely be elsewhere.
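
One way to get those two numbers is to walk a chunkserver's data directory and summarize the file sizes. This is only a minimal sketch: it assumes the chunks are stored as regular files somewhere below a directory you pass in, and it counts every regular file it finds there. The function name is made up for illustration.

```python
import os
import statistics

def chunk_size_stats(root):
    """Return (count, mean, stdev) of regular-file sizes in bytes below root."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sizes.append(os.path.getsize(path))
            except OSError:
                continue  # file vanished or is unreadable; skip it
    if not sizes:
        return 0, 0.0, 0.0
    mean = statistics.mean(sizes)
    stdev = statistics.pstdev(sizes)  # population standard deviation
    return len(sizes), mean, stdev
```

Called as `chunk_size_stats("/path/to/chunk/dir")` on each chunkserver disk (the path is a placeholder), a small mean with a large file count would point towards the latency-bound case described above.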

Before you opt for a permanent snowflake configuration, find out if you will be able to pick up the pieces when things fall apart. That is, tweak the knobs if you must, but make sure that the benefits of tweaking a tunable parameter outweigh any associated risks.

PERFORM_FSYNC = 1 makes perfect sense as a default, so unsuspecting folks don't lose their "already written and acknowledged" data in the face of a power outage or a hard crash.

Your timeouts are likely not the source of the problem, but the manifestation of cumulative latencies determined by how you connected and configured your hardware, OS, any 3rd party applications, and then all LizardFS servers and clients. I base this on my experience with a couple of larger multi-site snowflakes.

With all that said, the documentation could always be made better. Your contributions are welcome.

@mwaeckerlin

mwaeckerlin commented Feb 22, 2018

And: When I shut down the remote host and brought it to the other network, I again lost 5 important chunks after rebooting. I hope they will come back, but they have been gone for 1 day now, and at least one service is broken and constantly restarting due to the defective files. :(

@mwaeckerlin

mwaeckerlin commented Feb 22, 2018

Just tried the following, without any notable performance change:

sudo bash -c 'echo "DirectIO=true" > .lizardfs_tweaks'

and the mount parameters are now extended by mfswritecachesize=2048,nosuid,nodev,noatime,big_writes:

mfsmount /var/volumes fuse rw,mfsmaster=universum,mfsdelayedinit,mfschunkserverwriteto=40000,mfsioretries=120,mfschunkserverconnectreadto=20000,mfschunkserverwavereadto=5000,mfschunkservertotalreadto=200000,mfswritecachesize=2048,nosuid,nodev,noatime,big_writes 0 0
@mwaeckerlin

mwaeckerlin commented Feb 23, 2018

Question: Are changes to .lizardfs_tweaks persistent?

@mwaeckerlin

mwaeckerlin commented Feb 27, 2018

Now replication is so fast that it filled my two new 8TB hard disks over the weekend. I have now added a new 10TB HD to the first server and am about to add one to the second (no more slots, so I need to pvmove data from a 2TB device first).

But: After rebooting, I lost 59 important chunks!

It seems that every time I restart a server, chunks are lost! What's that? I use LizardFS in order not to lose data! How do I get these chunks back?

@hradec

hradec commented Feb 28, 2018

I am actually running into a similar situation as you are, my friend! :)

This is what I learned after a few months running LFS on 9 chunkservers, with a variety of hard disks, servers, NICs and SATA/SAS boards:

1. lost chunks:

A. Indeed we've lost chunks... actually we ALWAYS lose chunks if goal is 1!!! Don't do goal 1, unless your data is not important!

B. We've actually lost chunks even with ec(2,1). We've learned the hard way that you need at least 2 parity parts for ec goals (ec(N,2)) or else we lose chunks!!

C. At least in our configuration, ec(N,2) goals proved to be a bit "faster" than standard N goals for some reason, and because we do have space constraints, we've decided to run ec() goals only. We've replaced goal 2 (2 chunkservers) with ec(2,2) (4 chunkservers), since the ec(2,2) version takes the same 200% in space, but allows for 2 chunkservers to fail. We're testing ec(6,2) (only 133% of space, with a redundancy of 2), but since it requires 8 chunkservers to write a file, we're running into slowdowns every now and then.
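
The space figures quoted above follow from a simple ratio: an ec(K,M) goal stores K data parts plus M parity parts for every K parts of payload, so the raw space used is (K+M)/K of the data size. A small sketch of that arithmetic (the function names are illustrative, not part of any LizardFS tool):

```python
def ec_space_factor(data_parts, parity_parts):
    """Raw space used per byte stored for an ec(data_parts, parity_parts) goal."""
    return (data_parts + parity_parts) / data_parts

def replica_space_factor(copies):
    """Raw space used per byte stored for a plain goal with `copies` replicas."""
    return float(copies)

# (label, space factor, number of chunkservers that may fail)
goals = [
    ("goal 2", replica_space_factor(2), 1),
    ("ec(2,1)", ec_space_factor(2, 1), 1),
    ("ec(2,2)", ec_space_factor(2, 2), 2),
    ("ec(6,2)", ec_space_factor(6, 2), 2),
]
for label, factor, tolerated in goals:
    print(f"{label:8s} space {factor:.0%}, tolerates {tolerated} failed chunkserver(s)")
```

The number of chunkservers needed for a write is K+M, which is where the 4 for ec(2,2) and the 8 for ec(6,2) come from.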

2. Replication/Rebalancing indeed is a Bi@#$!

It took a lot of fiddling around with CHUNKS_WRITE_REP_LIMIT, CHUNKS_READ_REP_LIMIT, CHUNKS_LOOP_MAX_CPS, CHUNKS_LOOP_MIN_TIME and CHUNKS_LOOP_PERIOD to reduce the "scrub" speed, so clients can access the chunks.

Please keep in mind that my conclusions are based on observation, not code analysis. So if someone here knows better, I would love to hear from you!!

Those parameters dictate how "choked" the chunks get, and it's kind of annoying, since they are GLOBAL parameters, but depending on the latency/disk speed of each of the chunkservers, a setup that works great for one chunkserver won't work for another!

At the end of the day, I've decided to create 2 mfsmaster.cfg files, one that is used during the day, so LFS is available to clients, and another "maintenance" one that runs at night, so as to optimize the "scrubbing" when no one is using LFS.

We run a couple of cron jobs to set each configuration at night/morning.
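
The day/night switch could be sketched roughly like this. Everything concrete here is an assumption for illustration: the hours, the profile names and the `mfsmaster.cfg.<profile>` file naming are invented, not taken from the setup described above.

```python
import datetime

# Hypothetical schedule; adjust to your own working hours.
DAY_START = 7     # 07:00, clients start working
NIGHT_START = 22  # 22:00, maintenance window begins

def profile_for(now):
    """Pick 'production' during the day and 'maintenance' at night."""
    return "production" if DAY_START <= now.hour < NIGHT_START else "maintenance"

def config_path_for(now, directory="/etc/mfs"):
    """Path of the prepared config for this time of day (file naming is an assumption)."""
    return f"{directory}/mfsmaster.cfg.{profile_for(now)}"
```

A cron job would then copy the selected file over mfsmaster.cfg and ask the master to re-read its configuration (for example with `mfsmaster reload`; check that your version picks up these particular settings on reload rather than only on restart).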

A. CHUNKS_LOOP_MAX_CPS, CHUNKS_LOOP_MIN_TIME and CHUNKS_LOOP_PERIOD:
The relation between CHUNKS_LOOP_MAX_CPS and CHUNKS_LOOP_MIN_TIME is very interesting and confusing. I'm still struggling with it.

The idea is that one can set a time in CHUNKS_LOOP_MIN_TIME over which the maximum number of replications given in CHUNKS_LOOP_MAX_CPS is run. So, if you set CHUNKS_LOOP_MAX_CPS=500 and CHUNKS_LOOP_MIN_TIME=5000, in theory mfsmaster will execute 1 replication every 10 time units (5000 / 500; I'm not sure whether the unit is milliseconds or seconds, though).

Then, CHUNKS_LOOP_PERIOD is actually the delay time between the execution of the function that "fills up" the 500 replications defined in CHUNKS_LOOP_MAX_CPS.

The tricky part here is that it seems if you set CHUNKS_LOOP_MIN_TIME too low, LFS basically ignores it, since it HAS to execute the 500 replications defined in CHUNKS_LOOP_MAX_CPS. If they take longer than the time in CHUNKS_LOOP_MIN_TIME, it will keep going until it finishes. (apparently)

And if the time it takes is longer than the CHUNKS_LOOP_PERIOD delay, it seems that LFS will queue another 500 replications on top of the ones that are still finishing, basically creating an ever-growing queue.

I've learned this the hard way, since when this happens, LFS will eventually become INACCESSIBLE for clients, requiring a full restart (and mfsmaster crashed more than once, and we lost some goal 1 chunks! lol)

BUT, all things considered, one can actually "slow down" the number of replications this way instead of reducing CHUNKS_WRITE_REP_LIMIT/CHUNKS_READ_REP_LIMIT!!

B. CHUNKS_WRITE_REP_LIMIT and CHUNKS_READ_REP_LIMIT
These are DIRECTLY connected to replication, and indeed are the easiest ones to lower to increase availability to clients.

The BIGGEST problem with those for me is that, when you decrease them, it seems it ALSO affects the write/read performance of CLIENTS, not ONLY replication!!!!

It was MIND BLOWING for me... because when I was initially decreasing the write one, clients seemed to get better performance. BUT, as soon as replication ended, I noticed the performance was still the same for clients... And I thought it should get better after replication ended, but no!

Then, when I've increased WRITE/READ to 20/200, WHUAAU... what a change!! clients started to FLY!!!!

SOOooo.. although documentation says CHUNKS_WRITE_REP_LIMIT/CHUNKS_READ_REP_LIMIT is ONLY for replication, it seems it indeed makes a huge difference for normal client operations as well, at least in our case. (remember we're using ec(2,2) and ec(6,2))

Of course, once we switched some goals around and it started to replicate again, we had to decrease the 20/200 to something like 2/10, since the whole LFS became INACCESSIBLE to clients!!!

C. CONCLUSION

Soooo... ideally, I want to be able to leave it at 20/200 or more, and dial the REPLICATION ONLY writing/reading using CHUNKS_LOOP_MAX_CPS, CHUNKS_LOOP_MIN_TIME and CHUNKS_LOOP_PERIOD.

But, unfortunately, I wasn't able to do so, since CHUNKS_LOOP_MAX_CPS, CHUNKS_LOOP_MIN_TIME and CHUNKS_LOOP_PERIOD ALL have min/max caps. So at the minimum CHUNKS_LOOP_MAX_CPS and the maximum CHUNKS_LOOP_MIN_TIME, I still get TOO MANY replications, enough to choke my 20/200 CHUNKS_WRITE_REP_LIMIT/CHUNKS_READ_REP_LIMIT setup. (The graphs always show a maximum of 30 replications... I can never get the max to go lower due to the caps!)
image

That's why we're still running 2 configurations, "production" and "maintenance", basically tweaking CHUNKS_WRITE_REP_LIMIT/CHUNKS_READ_REP_LIMIT around.

3. CHUNKSERVER TESTING

There's one parameter in mfschunkserver.cfg that no one ever talks about, but it made a HUGE difference for me, especially regarding timeouts on machines with slow disks.

HDD_TEST_FREQ=10 (default)

This is the amount of time, in seconds, between chunk tests!

What I've learned is that this testing can cause A LOT OF IOWAIT on slow disks (especially when you're running standalone disks, not a RAID setup), causing chunkserver timeouts!

I've increased it to 3600, so now I get ONE test every hour. After this, we had a 70% drop in timeout messages on clients and chunkservers,
as you can see here:
image
One test every hour... so it's almost "disabled".

When there was NO replication happening, HDD_TEST_FREQ=3600 made ALL timeout messages go away!

So the default of 10 does indeed cause a lot of IOWAIT on chunkservers, to the point of causing timeouts on clients.

4. TIMEOUT messages...

Getting timeout messages has been the biggest problem for us regarding performance. If we have no timeout messages, LFS runs smoothly!

That That That That That's All Folks...

I hope this helps you in some way!
And if someone has anything to add, regarding my conclusions with the parameters, please do!!

cheers...
-H

@mwaeckerlin

mwaeckerlin commented Mar 7, 2018

@hradec, thank you very much for this detailed and useful information.

For me, currently, it is much better than before, write speed is now ~100MB/s, read speed >1GB/s.

Here is my new configuration:

  • /etc/fstab:
    mfsmount /var/volumes fuse rw,mfsmaster=universum,mfsdelayedinit,nosuid,nodev,noatime,big_writes 0 0
    
  • /etc/mfs/mfsmaster.cfg:
    LOAD_FACTOR_PENALTY = 0.5
    REJECT_OLD_CLIENTS = 1
    CHUNKS_WRITE_REP_LIMIT = 20
    CHUNKS_READ_REP_LIMIT = 200
    
  • /etc/mfs/mfschunkserver.cfg:
    HDD_TEST_FREQ = 3600
    ENABLE_LOAD_FACTOR = 1
    NR_OF_NETWORK_WORKERS = 10
    NR_OF_HDD_WORKERS_PER_NETWORK_WORKER = 4
    PERFORM_FSYNC = 0
    

Of course, I would still like to get more information:

  • Who can explain the exact setting of the CHUNKS_LOOP_*-parameters?
  • Who can judge all the hints and findings and values of @hradec, are there better combinations?
  • What are your comments regarding my current settings above?
@hradec

hradec commented Mar 8, 2018

@mwaeckerlin

mwaeckerlin commented Mar 8, 2018

@hradec See above, I posted all the info about my environment, including a network picture. The only change: I moved the remote node to the same switch as the other three, so all 4 are on the same 1GB hub. The >1GB/s is read only, with all chunks on the same host; the goal is now "universum urknall" (everything on these two hosts), and docker swarm runs on the same two hosts. Ask again if you are missing something.

@mwaeckerlin

mwaeckerlin commented Mar 8, 2018

The mysql of my nextcloud service crashed again 13 minutes ago, after running stably for 16 hours (which is a new record). So it is still not at the level it should be.

@mwaeckerlin

mwaeckerlin commented Mar 8, 2018

Here are some current read / write performance checks. All chunks are local, mount is on master:

write:

marc@universum:~$ for ((i=2;i<8;i+=2)); do for ((j=2;j<8;j+=2)); do echo -n "**** ${j}×${i}M - "; dd if=/dev/zero of=/srv/volumes/tmp/test.mrw bs=${i}M count=${j} 2>&1 | grep copied; done; done
**** 2×2M - 4194304 bytes (4.2 MB, 4.0 MiB) copied, 0.0536145 s, 78.2 MB/s
**** 4×2M - 8388608 bytes (8.4 MB, 8.0 MiB) copied, 0.0773159 s, 108 MB/s
**** 6×2M - 12582912 bytes (13 MB, 12 MiB) copied, 0.121807 s, 103 MB/s
**** 2×4M - 8388608 bytes (8.4 MB, 8.0 MiB) copied, 0.0938356 s, 89.4 MB/s
**** 4×4M - 16777216 bytes (17 MB, 16 MiB) copied, 0.156148 s, 107 MB/s
**** 6×4M - 25165824 bytes (25 MB, 24 MiB) copied, 0.235448 s, 107 MB/s
**** 2×6M - 12582912 bytes (13 MB, 12 MiB) copied, 0.816886 s, 15.4 MB/s
**** 4×6M - 25165824 bytes (25 MB, 24 MiB) copied, 0.256055 s, 98.3 MB/s
**** 6×6M - 37748736 bytes (38 MB, 36 MiB) copied, 0.352179 s, 107 MB/s

read:

marc@universum:~$ for ((i=2;i<8;i+=2)); do for ((j=2;j<8;j+=2)); do echo -n "**** ${j}×${i}M - "; dd of=/dev/null if=/srv/volumes/tmp/test.mrw bs=${i}M count=${j} 2>&1 | grep copied; done; done
**** 2×2M - 4194304 bytes (4.2 MB, 4.0 MiB) copied, 0.00368404 s, 1.1 GB/s
**** 4×2M - 8388608 bytes (8.4 MB, 8.0 MiB) copied, 0.00357403 s, 2.3 GB/s
**** 6×2M - 12582912 bytes (13 MB, 12 MiB) copied, 0.00516449 s, 2.4 GB/s
**** 2×4M - 8388608 bytes (8.4 MB, 8.0 MiB) copied, 0.00915208 s, 917 MB/s
**** 4×4M - 16777216 bytes (17 MB, 16 MiB) copied, 0.0114839 s, 1.5 GB/s
**** 6×4M - 25165824 bytes (25 MB, 24 MiB) copied, 0.0138809 s, 1.8 GB/s
**** 2×6M - 12582912 bytes (13 MB, 12 MiB) copied, 0.0149608 s, 841 MB/s
**** 4×6M - 25165824 bytes (25 MB, 24 MiB) copied, 0.0148196 s, 1.7 GB/s
**** 6×6M - 37748736 bytes (38 MB, 36 MiB) copied, 0.0207043 s, 1.8 GB/s
@hradec

hradec commented Mar 8, 2018

> See above,I posted all infos about my environment,including network picture. ...

Hoo yeah... sorry! I replied from my email, and didn't come back to this github issue page to check!!


> The mysql of my nextcloud service crashed again 13 minutes ago, after running stable for 16 hours (which is new record). So it is still not at the level it should be.

We don't have databases on our setup, but we do use a lot of "search paths" for python code. I've learned that LFS is TERRIBLE at traversing a long search path, like a PYTHONPATH with 20 paths.

It can take about 80 seconds to traverse on LFS, as opposed to 0.7 seconds on NFS (a ZFS raidz exported over NFS).

Which I found very odd, considering that metadata is supposed to be really fast on LFS. But for some reason, traversing a long list of folders looking for a file takes WAY too long!

We could confirm this behaviour by running the Google Chrome cache folder over LFS. Having the cache folder on LFS is so SLOW that internet browsing goes down to 300kbps!

Running the same cache folder over NFS, internet browsing in Chrome gets the proper speed it should have.

To be fair, I saw this same "search path" behaviour on every FUSE filesystem. I've tried it on an SSHFS running from a localhost connection, and the traversal of a pythonpath search path gets 10X slower as well.
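
A search-path lookup like the one described can be reproduced in isolation, which makes it easier to compare mounts. The sketch below is a simplified stand-in for what an import or PATH lookup does (one metadata check per directory until the file is found); it is not what Python's import machinery literally runs, and the function names are invented here.

```python
import os
import time

def find_in_search_path(filename, search_path):
    """Return the first directory in search_path containing filename, else None.

    Each miss costs one stat() call, i.e. one metadata round trip on a
    network or FUSE filesystem.
    """
    for directory in search_path:
        if os.path.isfile(os.path.join(directory, filename)):
            return directory
    return None

def time_lookups(filenames, search_path):
    """Look up every filename once; return (elapsed seconds, number found)."""
    start = time.perf_counter()
    found = sum(
        find_in_search_path(name, search_path) is not None for name in filenames
    )
    return time.perf_counter() - start, found
```

Running `time_lookups` with the same list of names against a search path on the LFS mount and against one on NFS or a local disk gives a like-for-like comparison of pure lookup cost.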

Even pcache or catfs, which are "cache" file systems to cache a filesystem on a local fast disk, have difficulty with search paths.

The only fuse filesystem that works reasonably well with search paths is fuse unionfs. There's a penalty, but on the scale of 1.3X, not 10-20X.

I've ended up setting up a local NFS server to serve the mfsmount mountpoint LOCALLY, so I could re-mount LFS on a second mountpoint over NFS.

This actually made our searchpath traverse WAAAAY faster, after a first run! On the first time, we get the 80 seconds from LFS, but after that, we get the 0.7 speed from NFS!

But this "hack" sometimes crashes, and things get weird to the point of having to reboot the machine.

And indeed, from time to time, we get 30 chunktimeouts which cause read errors, which make python code crash!!

So, give it a try and reduce your

CHUNKS_WRITE_REP_LIMIT = 20
CHUNKS_READ_REP_LIMIT = 200

to something like

CHUNKS_WRITE_REP_LIMIT = 2
CHUNKS_READ_REP_LIMIT = 6

and see if your mysql can hold on for more than 16 hours! It may be that it's crashing due to chunks timeouts! (I saw timeouts happening even when chunks/master/mount are all on the same machine!!!)

Keep an eye on your logs for mfsmount messages as well..
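
Counting those messages per chunkserver makes it easier to tell whether a config change helped. The sketch below matches only the two line shapes that appear in the syslog excerpts in this thread (mfsmount errors naming a `(server ip:port)` and mfsmaster `chunkserver disconnected - ip: ...` lines); other message formats would need their own patterns.

```python
import re
from collections import Counter

# Patterns derived from the syslog lines quoted in this issue.
MOUNT_ERROR = re.compile(r"mfsmount: .*error.*\(server (\d+\.\d+\.\d+\.\d+):\d+\)")
DISCONNECT = re.compile(
    r"mfsmaster\[\d+\]: chunkserver disconnected - ip: (\d+\.\d+\.\d+\.\d+)"
)

def count_problems(lines):
    """Return (mount_errors, disconnects) as Counters keyed by chunkserver IP."""
    mount_errors = Counter()
    disconnects = Counter()
    for line in lines:
        match = MOUNT_ERROR.search(line)
        if match:
            mount_errors[match.group(1)] += 1
            continue
        match = DISCONNECT.search(line)
        if match:
            disconnects[match.group(1)] += 1
    return mount_errors, disconnects
```

For example, `count_problems(open("/var/log/syslog"))` before and after lowering the REP limits shows whether the errors concentrate on one chunkserver.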


> Here are some current read / write performance check.

Cool!!! Yeah, on my very first test, I had chunkservers running on the same host as the master, and indeed, I got about 110 MB/s reading AND writing, which was the limit of my hard disks!!

Can I suggest you one thing? Add this to the dd command line:
oflag=sync status=progress

The oflag=sync makes dd write synchronously, so your write numbers will be exactly what the filesystem can do, unaffected by Linux file caching (note that it only applies to the output file, so it does not stop reads from being served out of the cache)! The status=progress is just informative, so you can see the speed in realtime as it reads/writes.

It may be that your speeds are so good because the transfers are actually being done entirely in the Linux file cache memory, especially considering the examples are very small.
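
The same kind of measurement can be scripted, which helps when testing larger sizes repeatedly. This is a sketch for Linux: opening the file with `O_SYNC` is roughly what dd's `oflag=sync` does, and the function name and defaults are made up for illustration.

```python
import os
import time

def timed_write(path, total_bytes, block_size=1024 * 1024, sync=True):
    """Write total_bytes of zeros to path and return the throughput in MB/s.

    With sync=True the file is opened with O_SYNC, so each write returns only
    after the data has been handed to the storage layer instead of the cache.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if sync:
        flags |= os.O_SYNC
    block = bytes(block_size)
    fd = os.open(path, flags, 0o644)
    start = time.perf_counter()
    try:
        remaining = total_bytes
        while remaining > 0:
            chunk = block if remaining >= block_size else block[:remaining]
            remaining -= os.write(fd, chunk)  # os.write may write fewer bytes
    finally:
        os.close(fd)
    elapsed = time.perf_counter() - start
    return total_bytes / elapsed / 1e6
```

Comparing `timed_write(path, size, sync=True)` with `sync=False` on the LFS mount, using sizes well above the tens of megabytes tested so far, shows how much of the measured speed comes from caching.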

All my poor speed is always from reading/writing large files, like 10 to 40 GBytes in size. As our studio does Visual Effects, the smallest files we generate are 0.5 GB... it's very common to have 10GB, 20GB and even 100GB files.

Sometimes even hundreds of 0.5GB files that, ideally, would need to be read at a rate of 24 files per second for playback, so about 12GB per second... which we never get, obviously. But 1GB/sec would be AMAZING, meaning a playback rate of 2 frames per second, which we can only achieve with a local SSD raid connected to a workstation.
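
The playback numbers above are a straight multiplication, which can be turned around to see what frame rate a given throughput supports. A tiny sketch (function names are illustrative):

```python
def required_rate_gb_s(frame_size_gb, frames_per_second):
    """Read throughput needed to play frames of the given size in real time."""
    return frame_size_gb * frames_per_second

def achievable_fps(throughput_gb_s, frame_size_gb):
    """Frames per second a given read throughput can sustain."""
    return throughput_gb_s / frame_size_gb

print(required_rate_gb_s(0.5, 24))  # 0.5 GB frames at 24 fps
print(achievable_fps(1.0, 0.5))     # what 1 GB/s would give
```

Plugging a measured throughput into `achievable_fps` gives a quick sense of how far a setup is from real-time playback.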


But there's definitely something going on when things are running over different machines. From what I could gather here at github, reading all the issues and all, there are a lot of people running LFS with really good systems, like fast disks, fast network, etc.

Every story I read about someone trying LFS over WAN, it's always terrible.

It makes me think that, because everyone is mostly running on "dreamy" systems (dreamy for my reality at least!! LOL), it may be that LFS ends up being more optimized for a "faster" and stable network setup.

The constant timeouts from chunkservers I get all the time, no matter what parameters I change on master/chunk setup, tell me that there's an "assumption" in the code about how fast things should be, and even adjusting the available parameters is not enough to put LFS to work on a slower network, like WAN.

I wish I had the time to dig into the code. I'm sure I could find out more information and even some workaround to make it faster and more reliable on slower networks, but since there's not much documentation about the code itself, it's really hard to start when you have loads of other things to do! LOL

If I could make a request to the developers, I think it would be for them to make a little "visual" documentation of the code files, like:

                 FOLDER A -> [ base code used by every server ]
                /        |        \
folderB[master]   folderC[chunk]   folderD[meta]

fileA,B,C - network protocol
bla bla bla...

That would, at least for me, help a LOT to, at least, look in the right place, you know?

Of course, Doxygen-like documentation would be awesome too.

To be fair, I always build lizardfs using YAOURT on Arch Linux, which basically builds and generates a package to be installed.

I've never built it by hand, so maybe there's Doxygen documentation that can be generated.

But it would be really cool to have Doxygen-like documentation somewhere on the web, though!


Sorry for so much text, man... sometimes I talk too much! LOL

@mwaeckerlin

mwaeckerlin commented Mar 9, 2018

@hradec, Isn't it possible to use NFS with LizardFS, instead of the fuse mount client? Would that help anything?

I've seen this: Providing advanced NFS services via the NFS Ganesha plugin

I don't understand why you now recommend lowering CHUNKS_WRITE_REP_LIMIT and CHUNKS_READ_REP_LIMIT. Wasn't it your hint to increase them which finally solved my performance problem?

@hradec

hradec commented Mar 9, 2018

@hradec

hradec commented Mar 9, 2018

@mwaeckerlin

mwaeckerlin commented Mar 13, 2018

@hradec:

> if you make [CHUNKS_WRITE_REP_LIMIT and CHUNKS_READ_REP_LIMIT] TOO HIGH, it can cause the chunks to get too busy

OK, I set them to lower values, but I still have sporadic mysql failures, especially when I do work that writes into the DBs. As long as it is read access only, it works stably.

I'll set the variables back to higher values (20/100), and I'll increase again the read/write timeouts in the mount point, in /etc/fstab:

mfsmount /srv/volumes fuse rw,mfsmaster=universum,mfssubfolder=/volumes,mfsdelayedinit,nosuid,nodev,noatime,big_writes,mfschunkserverwriteto=40000,mfsioretries=120,mfschunkserverconnectreadto=20000,mfschunkserverwavereadto=5000,mfschunkservertotalreadto=20000 0 0
mfsmount /srv/configs fuse rw,mfsmaster=universum,mfssubfolder=/configs,mfsdelayedinit,nosuid,nodev,noatime,big_writes,mfschunkserverwriteto=40000,mfsioretries=120,mfschunkserverconnectreadto=20000,mfschunkserverwavereadto=5000,mfschunkservertotalreadto=20000 0 0
@4Dolio

4Dolio commented Mar 13, 2018

Mysql and databases in general can be very disk heavy.. I am not sure I would back them by LFS... Maybe try backing another loopbacked filesystem on LFS and then databases within that?? I use this method for some types of data, but mostly for archives to reduce metadata sizes and add compression if I format the loopback with ZFS.. I am not saying SQL can not be LFS backed, but you may have to get it tuned properly for it to work well..

I am backing iSCSI for Xen Servers with LFS on HDDs (which I am trying not to use), SSDs and M.2s, but occasionally the VMs will go RO, so I still have some tuning to do before it is foolproof.

Early on I had to do a lot of fiddling with my REP limits to determine the maximum my environment could handle, then back down from those but keep a note about my particular max, which varies depending on all sorts of stuff like disk, network, average chunk sizes, etc.. There is no one correct setting for many of the tunables..
