Kernel crash using BTRFS and KernelMemory with 1.10 #20080

Closed
nfnty opened this issue Feb 6, 2016 · 7 comments

Comments

@nfnty

nfnty commented Feb 6, 2016

Description of problem:

docker version:

Client:
 Version:      1.10.0
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   590d5108
 Built:        Fri Feb  5 21:08:32 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.10.0
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   590d5108
 Built:        Fri Feb  5 21:08:32 2016
 OS/Arch:      linux/amd64

docker info:

Containers: 21
 Running: 21
 Paused: 0
 Stopped: 0
Images: 229
Server Version: 1.10.0
Storage Driver: btrfs
 Build Version: Btrfs v4.3.1
 Library Version: 101
Execution Driver: native-0.2
Logging Driver: json-file
Plugins: 
 Volume: local
 Network: bridge null host
Kernel Version: 4.1.17-1-lts
Operating System: Arch Linux
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 15.64 GiB
Name: nfnty

uname -a:

Linux nfnty 4.1.17-1-lts #1 SMP Mon Feb 1 16:35:08 CET 2016 x86_64 GNU/Linux

Environment details (AWS, VirtualBox, physical, etc.):

Physical.

How reproducible:

Steps to Reproduce:

  1. Use the BTRFS storage driver
  2. Set KernelMemory to 4194304 bytes (the minimum enforced by Docker)
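
For readers who want to see what step 2 looks like in a request, here is a minimal sketch of a body for `/containers/create`. It assumes `KernelMemory` is passed under `HostConfig`, as in the Docker Remote API of that era; the image name and the helper `build_create_body` are hypothetical, and the storage driver is a daemon-level setting that is not part of this request.

```python
import json

# Minimum kernel memory limit quoted in the report (4 MiB, in bytes).
MIN_KERNEL_MEMORY = 4194304


def build_create_body(image, kernel_memory=MIN_KERNEL_MEMORY):
    """Return a JSON body for POST /containers/create with a kernel memory limit.

    Assumption: KernelMemory lives under HostConfig in the create request.
    """
    if kernel_memory < MIN_KERNEL_MEMORY:
        raise ValueError("KernelMemory is below the 4194304-byte minimum")
    return json.dumps({
        "Image": image,  # hypothetical image name, not taken from the report
        "HostConfig": {"KernelMemory": kernel_memory},
    })


body = build_create_body("example/image")
print(body)
```

The actual request data used by the reporter is in the containers.yaml file linked under "Additional info"; this sketch only illustrates the single field relevant to the crash.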

Actual Results:
Crash and data loss

Expected Results:
No crash

Additional info:
Had no problems with 1.9.1.
You can see all /containers/create request data over at https://github.com/nfnty/dockerfiles/blob/master/containers.yaml

Feb 06 19:46:39 nfnty kernel: slab_out_of_memory: 1897 callbacks suppressed
Feb 06 19:46:39 nfnty kernel: SLUB: Unable to allocate memory on node -1 (gfp=0x8050)
Feb 06 19:46:39 nfnty kernel:   cache: kmalloc-128(138:20ba8059599eff5c5cfa50bb45625d06e25a5d53f97077123988eb2add106413), object size: 128, buffer size: 128, default order: 0, min order: 0
Feb 06 19:46:39 nfnty kernel:   node 0: slabs: 8, objs: 256, free: 0
Feb 06 19:46:39 nfnty kernel: SLUB: Unable to allocate memory on node -1 (gfp=0x8050)
Feb 06 19:46:39 nfnty kernel:   cache: kmalloc-128(138:20ba8059599eff5c5cfa50bb45625d06e25a5d53f97077123988eb2add106413), object size: 128, buffer size: 128, default order: 0, min order: 0
Feb 06 19:46:39 nfnty kernel:   node 0: slabs: 8, objs: 256, free: 0
Feb 06 19:46:39 nfnty kernel: SLUB: Unable to allocate memory on node -1 (gfp=0x50)
Feb 06 19:46:39 nfnty kernel:   cache: radix_tree_node(138:20ba8059599eff5c5cfa50bb45625d06e25a5d53f97077123988eb2add106413), object size: 576, buffer size: 584, default order: 2, min order: 0
Feb 06 19:46:40 nfnty kernel:   node 0: slabs: 2, objs: 56, free: 0
Feb 06 19:46:40 nfnty kernel: SLUB: Unable to allocate memory on node -1 (gfp=0x50)
Feb 06 19:46:40 nfnty kernel:   cache: radix_tree_node(138:20ba8059599eff5c5cfa50bb45625d06e25a5d53f97077123988eb2add106413), object size: 576, buffer size: 584, default order: 2, min order: 0
Feb 06 19:46:40 nfnty kernel:   node 0: slabs: 2, objs: 56, free: 0
Feb 06 19:46:40 nfnty kernel: SLUB: Unable to allocate memory on node -1 (gfp=0x50)
Feb 06 19:46:40 nfnty kernel:   cache: kmalloc-192(138:20ba8059599eff5c5cfa50bb45625d06e25a5d53f97077123988eb2add106413), object size: 192, buffer size: 192, default order: 0, min order: 0
Feb 06 19:46:40 nfnty kernel:   node 0: slabs: 10, objs: 210, free: 0
Feb 06 19:46:40 nfnty kernel: SLUB: Unable to allocate memory on node -1 (gfp=0x50)
Feb 06 19:46:40 nfnty kernel:   cache: btrfs_extent_state(138:20ba8059599eff5c5cfa50bb45625d06e25a5d53f97077123988eb2add106413), object size: 80, buffer size: 80, default order: 0, min order: 0
Feb 06 19:46:40 nfnty kernel:   node 0: slabs: 4, objs: 204, free: 0
Feb 06 19:46:40 nfnty kernel: ------------[ cut here ]------------
Feb 06 19:46:40 nfnty kernel: kernel BUG at fs/btrfs/extent_io.c:855!
Feb 06 19:46:40 nfnty kernel: invalid opcode: 0000 [#2] SMP 
Feb 06 19:46:40 nfnty kernel: Modules linked in: ctr ccm veth tun ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_nat iptable_nat nf_nat_ipv4 nf_nat xt_mark iptable_mangle ebt_ip xt_pkttype xt_mac ipt_REJECT xt_tcpudp nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 xt_NFLOG xt_multiport ebt_nflog nf_conntrack_ipv6 nf_defrag_ipv6 xt_addrtype xt_owner ebtable_filter xt_iprange xt_conntrack nf_conntrack ebtables iptable_filter ip6table_filter ip6_tables nfnetlink_log nfnetlink arc4 ath9k ath9k_common ath9k_hw ath led_class nls_iso8859_1 nls_cp437 vfat fat iTCO_wdt intel_rapl ipmi_ssif iosf_mbi iTCO_vendor_support ppdev mac80211 x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm psmouse cfg80211 igb serio_raw mei_me rfkill ie31200_edac ptp pps_core pcspkr evdev dca edac_core i2c_algo_bit
Feb 06 19:46:40 nfnty kernel:  mac_hid lpc_ich mei i2c_i801 shpchp thermal fan parport_pc battery parport ipmi_si ipmi_msghandler tpm_infineon tpm_tis video tpm button processor sch_fq_codel 8021q mrp br_netfilter bridge stp llc ip_tables x_tables algif_skcipher af_alg dm_crypt dm_mod btrfs xor hid_generic usbhid hid raid6_pq sd_mod atkbd libps2 crct10dif_pclmul crc32_pclmul crc32c_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd ahci libahci libata scsi_mod xhci_pci ehci_pci xhci_hcd ehci_hcd usbcore usb_common nvme i8042 serio
Feb 06 19:46:40 nfnty kernel: CPU: 5 PID: 30862 Comm: java Tainted: G      D W       4.1.17-1-lts #1
Feb 06 19:46:40 nfnty kernel: task: ffff8802ae039e30 ti: ffff88023b5d4000 task.ti: ffff88023b5d4000
Feb 06 19:46:40 nfnty kernel: RIP: 0010:[<ffffffffa02bc19e>]  [<ffffffffa02bc19e>] __set_extent_bit+0x2de/0x5b0 [btrfs]
Feb 06 19:46:40 nfnty kernel: RSP: 0018:ffff88023b5d7b18  EFLAGS: 00010246
Feb 06 19:46:40 nfnty kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000026
Feb 06 19:46:40 nfnty kernel: RDX: 000060fbd000c6f0 RSI: 0000000000000046 RDI: ffff8801fe8a2100
Feb 06 19:46:40 nfnty kernel: RBP: ffff88023b5d7bb8 R08: 0000000000000000 R09: 0000000000000000
Feb 06 19:46:40 nfnty kernel: R10: ffff88041f021f00 R11: 0000000000000b3c R12: ffff8802fade35bc
Feb 06 19:46:40 nfnty kernel: R13: ffff8802fade35a0 R14: 0000000000000000 R15: 0000000000000000
Feb 06 19:46:40 nfnty kernel: FS:  00007f9f62bad700(0000) GS:ffff88042fd40000(0000) knlGS:0000000000000000
Feb 06 19:46:40 nfnty kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 06 19:46:40 nfnty kernel: CR2: 000000c825222000 CR3: 000000023be5f000 CR4: 00000000001406e0
Feb 06 19:46:40 nfnty kernel: Stack:
Feb 06 19:46:40 nfnty kernel:  ffff88042fd56240 ffff88042fd56240 ffff88023b5d7c10 ffff8802fb8fe900
Feb 06 19:46:40 nfnty kernel:  ffff88023b5d8000 ffff88041d6a1420 0000000000000fff ffff88023b5d8000
Feb 06 19:46:40 nfnty kernel:  ffff8802fade35a0 0000100800000008 0000000000020001 ffff88023b5d8000
Feb 06 19:46:40 nfnty kernel: Call Trace:
Feb 06 19:46:40 nfnty kernel:  [<ffffffffa02bd13b>] lock_extent_bits+0x8b/0x210 [btrfs]
Feb 06 19:46:40 nfnty kernel:  [<ffffffff810e6947>] ? ktime_get+0x37/0xb0
Feb 06 19:46:40 nfnty kernel:  [<ffffffff811164c6>] ? delayacct_end+0x56/0x60
Feb 06 19:46:40 nfnty kernel:  [<ffffffffa02c0e17>] __extent_read_full_page+0xa7/0x100 [btrfs]
Feb 06 19:46:40 nfnty kernel:  [<ffffffffa02a2fc0>] ? btrfs_direct_IO+0x320/0x320 [btrfs]
Feb 06 19:46:40 nfnty kernel:  [<ffffffffa02c1206>] extent_read_full_page+0x46/0x80 [btrfs]
Feb 06 19:46:40 nfnty kernel:  [<ffffffff810b87d0>] ? autoremove_wake_function+0x40/0x40
Feb 06 19:46:40 nfnty kernel:  [<ffffffffa02a18f5>] btrfs_readpage+0x25/0x30 [btrfs]
Feb 06 19:46:40 nfnty kernel:  [<ffffffff8116131a>] generic_file_read_iter+0x33a/0x600
Feb 06 19:46:40 nfnty kernel:  [<ffffffff811d7c7e>] __vfs_read+0xce/0x100
Feb 06 19:46:40 nfnty kernel:  [<ffffffff811d8517>] vfs_read+0x87/0x140
Feb 06 19:46:40 nfnty kernel:  [<ffffffff811d9329>] SyS_read+0x59/0xd0
Feb 06 19:46:40 nfnty kernel:  [<ffffffff815830ee>] system_call_fastpath+0x12/0x71
Feb 06 19:46:40 nfnty kernel: Code: 85 ae fd ff ff 0f 1f 84 00 00 00 00 00 f6 45 18 10 0f 84 9c fd ff ff 8b 7d 18 e8 2e e7 ff ff 48 85 c0 49 89 c7 0f 85 88 fd ff ff <0f> 0b 4c 89 e8 4d 89 cf 49 89 dd 48 89 c2 48 89 c3 48 8b 45 90 
Feb 06 19:46:40 nfnty kernel: RIP  [<ffffffffa02bc19e>] __set_extent_bit+0x2de/0x5b0 [btrfs]
Feb 06 19:46:40 nfnty kernel:  RSP <ffff88023b5d7b18>
Feb 06 19:46:40 nfnty kernel: ---[ end trace f97126fd3d3f020f ]---
Feb 06 19:46:33 nfnty kernel: BTRFS warning (device dm-0): __btrfs_unlink_inode:3955: Aborting unused transaction(Out of memory).
Feb 06 19:46:33 nfnty kernel: ------------[ cut here ]------------
Feb 06 19:46:33 nfnty kernel: kernel BUG at fs/btrfs/inode.c:1131!
Feb 06 19:46:33 nfnty kernel: invalid opcode: 0000 [#1] SMP 
Feb 06 19:46:33 nfnty kernel: Modules linked in: veth tun ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_nat iptable_nat nf_nat_ipv4 nf_nat xt_mark iptable_mangle ebt_ip xt_pkttype xt_mac ipt_REJECT xt_tcpudp nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 xt_NFLOG xt_multiport ebt_nflog nf_conntrack_ipv6 nf_defrag_ipv6 xt_addrtype xt_owner ebtable_filter xt_iprange xt_conntrack nf_conntrack ebtables iptable_filter ip6table_filter ip6_tables nfnetlink_log nfnetlink arc4 ath9k ath9k_common ath9k_hw ath led_class nls_iso8859_1 nls_cp437 vfat fat iTCO_wdt intel_rapl ipmi_ssif iosf_mbi iTCO_vendor_support ppdev mac80211 x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm psmouse cfg80211 igb serio_raw mei_me rfkill ie31200_edac ptp pps_core pcspkr evdev dca edac_core i2c_algo_bit mac_hid
Feb 06 19:46:33 nfnty kernel:  lpc_ich mei i2c_i801 shpchp thermal fan parport_pc battery parport ipmi_si ipmi_msghandler tpm_infineon tpm_tis video tpm button processor sch_fq_codel 8021q mrp br_netfilter bridge stp llc ip_tables x_tables algif_skcipher af_alg dm_crypt dm_mod btrfs xor hid_generic usbhid hid raid6_pq sd_mod atkbd libps2 crct10dif_pclmul crc32_pclmul crc32c_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd ahci libahci libata scsi_mod xhci_pci ehci_pci xhci_hcd ehci_hcd usbcore usb_common nvme i8042 serio
Feb 06 19:46:33 nfnty kernel: CPU: 5 PID: 29063 Comm: Finalizer Tainted: G        W       4.1.17-1-lts #1
Feb 06 19:46:33 nfnty kernel: task: ffff880279c58a10 ti: ffff88003d210000 task.ti: ffff88003d210000
Feb 06 19:46:33 nfnty kernel: RIP: 0010:[<ffffffffa02a683a>]  [<ffffffffa02a683a>] run_delalloc_range+0x37a/0x3f0 [btrfs]
Feb 06 19:46:33 nfnty kernel: RSP: 0018:ffff88003d213af8  EFLAGS: 00010246
Feb 06 19:46:33 nfnty kernel: RAX: 0000000000000000 RBX: ffff88040dc60000 RCX: 0000000100022c98
Feb 06 19:46:33 nfnty kernel: RDX: 000060fbd000edc0 RSI: 0000000000000092 RDI: ffff8802ae553400
Feb 06 19:46:33 nfnty kernel: RBP: ffff88003d213b98 R08: ffffffffffffffff R09: ffffffffa02a65b6
Feb 06 19:46:33 nfnty kernel: R10: ffffea00026c4740 R11: 0000000000000220 R12: 0000000000000000
Feb 06 19:46:33 nfnty kernel: R13: ffff88030231a030 R14: 0000000000000000 R15: 0000000000007fff
Feb 06 19:46:33 nfnty kernel: FS:  00007ff69fc79700(0000) GS:ffff88042fd40000(0000) knlGS:0000000000000000
Feb 06 19:46:33 nfnty kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 06 19:46:33 nfnty kernel: CR2: 000000000259905f CR3: 000000023a9e6000 CR4: 00000000001406e0
Feb 06 19:46:33 nfnty kernel: Stack:
Feb 06 19:46:33 nfnty kernel:  0000000000000000 ffffffff00000050 ffff88003d213b38 ffff88003d213bd4
Feb 06 19:46:33 nfnty kernel:  ffff88003d213b50 ffff88030231a030 ffff88003d213c68 ffffea0001223c40
Feb 06 19:46:33 nfnty kernel:  ffff88030231a030 ffffea0001223c40 0000000000000000 0000000000007fff
Feb 06 19:46:33 nfnty kernel: Call Trace:
Feb 06 19:46:33 nfnty kernel:  [<ffffffffa02bdfe4>] writepage_delalloc.isra.19+0x114/0x180 [btrfs]
Feb 06 19:46:33 nfnty kernel:  [<ffffffffa02bfe35>] __extent_writepage+0xc5/0x300 [btrfs]
Feb 06 19:46:33 nfnty kernel:  [<ffffffffa02c03d2>] extent_write_cache_pages.isra.15.constprop.31+0x362/0x420 [btrfs]
Feb 06 19:46:33 nfnty kernel:  [<ffffffffa02c175c>] extent_writepages+0x5c/0x90 [btrfs]
Feb 06 19:46:33 nfnty kernel:  [<ffffffffa02a2fc0>] ? btrfs_direct_IO+0x320/0x320 [btrfs]
Feb 06 19:46:33 nfnty kernel:  [<ffffffffa02a15b8>] btrfs_writepages+0x28/0x30 [btrfs]
Feb 06 19:46:33 nfnty kernel:  [<ffffffff8116caae>] do_writepages+0x1e/0x30
Feb 06 19:46:33 nfnty kernel:  [<ffffffff81160e65>] __filemap_fdatawrite_range+0x65/0x90
Feb 06 19:46:33 nfnty kernel:  [<ffffffff81160edc>] filemap_flush+0x1c/0x20
Feb 06 19:46:33 nfnty kernel:  [<ffffffffa02afff9>] btrfs_release_file+0x49/0x60 [btrfs]
Feb 06 19:46:33 nfnty kernel:  [<ffffffff811da0bc>] __fput+0x9c/0x200
Feb 06 19:46:33 nfnty kernel:  [<ffffffff811da26e>] ____fput+0xe/0x10
Feb 06 19:46:33 nfnty kernel:  [<ffffffff81092977>] task_work_run+0xb7/0xf0
Feb 06 19:46:33 nfnty kernel:  [<ffffffff81013ca5>] do_notify_resume+0x75/0x80
Feb 06 19:46:33 nfnty kernel:  [<ffffffff815832bc>] int_signal+0x12/0x17
Feb 06 19:46:33 nfnty kernel: Code: 00 00 4c 89 e2 4c 89 f7 e8 64 ec ff ff 48 8b 4d c8 65 48 33 0c 25 28 00 00 00 75 78 48 83 c4 78 5b 41 5c 41 5d 41 5e 41 5f 5d c3 <0f> 0b 48 8d bf 90 fe ff ff 45 31 c9 45 31 c0 b9 40 00 00 00 4c 
Feb 06 19:46:33 nfnty kernel: RIP  [<ffffffffa02a683a>] run_delalloc_range+0x37a/0x3f0 [btrfs]
Feb 06 19:46:33 nfnty kernel:  RSP <ffff88003d213af8>
Feb 06 19:46:33 nfnty kernel: ---[ end trace f97126fd3d3f020e ]---
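
The log above is long, so a small script can help when triaging it. This is only a sketch: it counts how often each slab cache appears in a "cache:" line and collects the "kernel BUG at" locations. The regular expressions are assumptions based on the line shapes in this particular log, and `summarize` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches e.g. "  cache: kmalloc-128(138:20ba...), object size: 128, ..."
CACHE_RE = re.compile(r"cache: ([\w-]+)\(")
# Matches e.g. "kernel BUG at fs/btrfs/extent_io.c:855!"
BUG_RE = re.compile(r"kernel BUG at (\S+):(\d+)!")


def summarize(log_text):
    """Count slab caches named in 'cache:' lines and list kernel BUG sites."""
    caches = Counter()
    bugs = []
    for line in log_text.splitlines():
        cache_match = CACHE_RE.search(line)
        if cache_match:
            caches[cache_match.group(1)] += 1
        bug_match = BUG_RE.search(line)
        if bug_match:
            bugs.append((bug_match.group(1), int(bug_match.group(2))))
    return caches, bugs


# A few lines copied from the log above, shortened for readability.
sample = """\
Feb 06 19:46:39 nfnty kernel: SLUB: Unable to allocate memory on node -1 (gfp=0x8050)
Feb 06 19:46:39 nfnty kernel:   cache: kmalloc-128(138:20ba8059), object size: 128, buffer size: 128
Feb 06 19:46:40 nfnty kernel:   cache: btrfs_extent_state(138:20ba8059), object size: 80, buffer size: 80
Feb 06 19:46:40 nfnty kernel: kernel BUG at fs/btrfs/extent_io.c:855!
"""

caches, bugs = summarize(sample)
print(dict(caches))
print(bugs)
```

Run against the full log, this kind of summary makes it easier to see that the failed allocations and both BUG sites (fs/btrfs/extent_io.c:855 and fs/btrfs/inode.c:1131) cluster within a few seconds of each other.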
@cpuguy83
Member

cpuguy83 commented Feb 7, 2016

Could be we need to up the minimum.
Can you try with 10MB instead of 4MB?
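
For anyone trying this suggestion: the limit is given in bytes, so the values need converting. A minimal sketch, assuming "MB" here means 1024 * 1024 bytes (which matches the 4194304 figure in the report); `mib_to_bytes` is a hypothetical helper.

```python
def mib_to_bytes(mib):
    """Convert a size in MiB to bytes (1 MiB = 1024 * 1024 bytes)."""
    return mib * 1024 * 1024


# The current minimum (4) and the suggested value to try (10).
for mib in (4, 10):
    print(mib, "MiB =", mib_to_bytes(mib), "bytes")
```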

@nfnty
Author

nfnty commented Feb 7, 2016

Sorry, I won't be able to experiment; I had to hard-reset the server after a freeze, which resulted in really bad data loss.

@unclejack
Contributor

This is a kernel bug which should be reported upstream. Raising the memory limit shouldn't be needed, but we can do it if that has become a requirement.

@nfnty
Author

nfnty commented Feb 7, 2016

Reported upstream: https://bugzilla.kernel.org/show_bug.cgi?id=112101

@nfnty
Author

nfnty commented Feb 12, 2016

Add the 'area/storage/btrfs' label to this.

@hervenicol

hervenicol commented Apr 10, 2017

Hi,

I've had the same problem (OOM and "kernel BUG at extent_io.c") with larger containers (16MB RAM).

So raising the minimum memory limit won't solve this issue.

  • My environment:

Kernel 4.4.0-66-generic #87~14.04.1-Ubuntu SMP

  • docker version:

Client:
Version: 1.12.6
API version: 1.24
Go version: go1.6.4
Git commit: 78d1802
Built: Tue Jan 10 20:26:30 2017
OS/Arch: linux/amd64

Server:
Version: 1.12.6
API version: 1.24
Go version: go1.6.4
Git commit: 78d1802
Built: Tue Jan 10 20:26:30 2017
OS/Arch: linux/amd64

  • docker info:

Storage Driver: btrfs
Runtimes: runc
Default Runtime: runc
Security Options: apparmor
Kernel Version: 4.4.0-66-generic
Operating System: Ubuntu 14.04.5 LTS
OSType: linux
Architecture: x86_64
CPUs: 16
Total Memory: 62.81 GiB

@thaJeztah
Member

Let me close this ticket for now, as it looks like it went stale.

thaJeztah closed this as not planned (won't fix, can't repro, duplicate, stale) on Sep 17, 2023