Skip to content

4.0.y: strangeness with errors when using krealloc and later mapping the reallocated memory #959

@msperl

Description

@msperl

@notro found some issue when testing the new feature of the spi-bcm2835 driver that strangely ONLY occurs in the foundation kernel - it does NOT show in 4.1.0-rc2.

Here an example output of what we see on the console:

May  7 20:18:17 raspb kernel: [  118.908906] fbtft: module is from the staging directory, the quality is unknown, you have been warned.
May  7 20:18:17 raspb kernel: [  118.916843] fb_st7735r: module is from the staging directory, the quality is unknown, you have been warned.
May  7 20:18:17 raspb kernel: [  118.921871] fbtft_of_value: buswidth = 8
May  7 20:18:17 raspb kernel: [  118.921917] fbtft_of_value: txbuflen = 32768
May  7 20:18:18 raspb kernel: [  119.866243] graphics fb1: fb_st7735r frame buffer, 128x160, 40 KiB video memory, 32 KiB DMA buffer memory, fps=20, spi0.4 at 8 MHz
May  7 20:18:18 raspb kernel: [  119.941569] Modules linked in: fb_st7735r(C) fbtft(C) syscopyarea sysfillrect sysimgblt fb_sys_fops spi_bcm2835 binfmt_misc i2c_bcm2708 uio_pdrv_genirq uio
May  7 20:18:18 raspb kernel: [  119.941662] CPU: 0 PID: 19 Comm: kworker/0:1 Tainted: G         C      4.0.1+ #31
May  7 20:18:18 raspb kernel: [  119.941680] Hardware name: BCM2708
May  7 20:18:18 raspb kernel: [  119.941713] Workqueue: events fb_deferred_io_work
May  7 20:18:18 raspb kernel: [  119.941784] [<c0015b40>] (unwind_backtrace) from [<c0012914>] (show_stack+0x20/0x24)
May  7 20:18:18 raspb kernel: [  119.941833] [<c0012914>] (show_stack) from [<c05656b8>] (dump_stack+0x20/0x28)
May  7 20:18:18 raspb kernel: [  119.941880] [<c05656b8>] (dump_stack) from [<c00e5a9c>] (bad_page+0xc8/0x128)
May  7 20:18:18 raspb kernel: [  119.941918] [<c00e5a9c>] (bad_page) from [<c00e8c48>] (get_page_from_freelist+0x664/0x960)
May  7 20:18:18 raspb kernel: [  119.941953] [<c00e8c48>] (get_page_from_freelist) from [<c00e9044>] (__alloc_pages_nodemask+0x100/0x8f4)
May  7 20:18:18 raspb kernel: [  119.941987] [<c00e9044>] (__alloc_pages_nodemask) from [<c00e9920>] (alloc_kmem_pages+0x20/0x28)
May  7 20:18:18 raspb kernel: [  119.942023] [<c00e9920>] (alloc_kmem_pages) from [<c010395c>] (kmalloc_order+0x20/0x5c)
May  7 20:18:18 raspb kernel: [  119.942054] [<c010395c>] (kmalloc_order) from [<c01039c4>] (kmalloc_order_trace+0x2c/0xd4)
May  7 20:18:18 raspb kernel: [  119.942090] [<c01039c4>] (kmalloc_order_trace) from [<c0128a24>] (__kmalloc_track_caller+0x218/0x224)
May  7 20:18:18 raspb kernel: [  119.942123] [<c0128a24>] (__kmalloc_track_caller) from [<c0103860>] (krealloc+0x60/0xb8)
May  7 20:18:18 raspb kernel: [  119.942156] [<c0103860>] (krealloc) from [<c03c5064>] (__spi_pump_messages+0x6f8/0x76c)
May  7 20:18:18 raspb kernel: [  119.942188] [<c03c5064>] (__spi_pump_messages) from [<c03c52f4>] (__spi_sync+0x21c/0x22c)
May  7 20:18:18 raspb kernel: [  119.942214] [<c03c52f4>] (__spi_sync) from [<c03c5340>] (spi_sync+0x1c/0x20)
May  7 20:18:18 raspb kernel: [  119.942300] [<c03c5340>] (spi_sync) from [<bf032148>] (fbtft_write_spi+0x90/0x108 [fbtft])
May  7 20:18:18 raspb kernel: [  119.942382] [<bf032148>] (fbtft_write_spi [fbtft]) from [<bf0315b0>] (fbtft_write_vmem16_bus8+0xec/0x14c [fbtft])
May  7 20:18:18 raspb kernel: [  119.942444] [<bf0315b0>] (fbtft_write_vmem16_bus8 [fbtft]) from [<bf030128>] (fbtft_update_display+0xd8/0x33c [fbtft])
May  7 20:18:18 raspb kernel: [  119.942502] [<bf030128>] (fbtft_update_display [fbtft]) from [<bf0304b0>] (fbtft_deferred_io+0x124/0x150 [fbtft])
May  7 20:18:18 raspb kernel: [  119.942547] [<bf0304b0>] (fbtft_deferred_io [fbtft]) from [<c033954c>] (fb_deferred_io_work+0x9c/0xf4)
May  7 20:18:18 raspb kernel: [  119.942584] [<c033954c>] (fb_deferred_io_work) from [<c003ad04>] (process_one_work+0x13c/0x490)
May  7 20:18:18 raspb kernel: [  119.942616] [<c003ad04>] (process_one_work) from [<c003b24c>] (worker_thread+0x1b0/0x528)
May  7 20:18:18 raspb kernel: [  119.942656] [<c003b24c>] (worker_thread) from [<c00404b0>] (kthread+0xdc/0xf8)
May  7 20:18:18 raspb kernel: [  119.942694] [<c00404b0>] (kthread) from [<c000ec90>] (ret_from_fork+0x14/0x24)
May  7 20:18:18 raspb kernel: [  119.942711] Disabling lock debugging due to kernel taint

We have now traced this down to spi_map_msg, which:

  • for messages that only have TX or RX transfers will create a RX/TX-buffer with krealloc
  • then map the buffer by first using sg_set_page/sg_set_buf and subsequently calling dma_map_sg

if both code sections are run together, then things break and also impact OTHER parts of the kernel like this one:

May  7 20:18:41 raspb kernel: [  142.344159] Hardware name: BCM2708
May  7 20:18:41 raspb kernel: [  142.344234] [<c0015b40>] (unwind_backtrace) from [<c0012914>] (show_stack+0x20/0x24)
May  7 20:18:41 raspb kernel: [  142.344282] [<c0012914>] (show_stack) from [<c05656b8>] (dump_stack+0x20/0x28)
May  7 20:18:41 raspb kernel: [  142.344329] [<c05656b8>] (dump_stack) from [<c00e5a9c>] (bad_page+0xc8/0x128)
May  7 20:18:41 raspb kernel: [  142.344367] [<c00e5a9c>] (bad_page) from [<c00e8c48>] (get_page_from_freelist+0x664/0x960)
May  7 20:18:41 raspb kernel: [  142.344401] [<c00e8c48>] (get_page_from_freelist) from [<c00e9044>] (__alloc_pages_nodemask+0x100/0x8f4)
May  7 20:18:41 raspb kernel: [  142.344450] [<c00e9044>] (__alloc_pages_nodemask) from [<c0475be4>] (__alloc_page_frag+0x8c/0x148)
May  7 20:18:41 raspb kernel: [  142.344487] [<c0475be4>] (__alloc_page_frag) from [<c0476490>] (__netdev_alloc_frag+0x38/0x88)
May  7 20:18:41 raspb kernel: [  142.344521] [<c0476490>] (__netdev_alloc_frag) from [<c047ae74>] (__alloc_rx_skb+0xd4/0xec)
May  7 20:18:41 raspb kernel: [  142.344553] [<c047ae74>] (__alloc_rx_skb) from [<c047af08>] (__netdev_alloc_skb+0x28/0x50)
May  7 20:18:41 raspb kernel: [  142.344595] [<c047af08>] (__netdev_alloc_skb) from [<c03ccbb8>] (rx_submit+0x3c/0x284)
May  7 20:18:41 raspb kernel: [  142.344631] [<c03ccbb8>] (rx_submit) from [<c03cd058>] (rx_complete+0x1e8/0x1f8)
May  7 20:18:41 raspb kernel: [  142.344663] [<c03cd058>] (rx_complete) from [<c03d545c>] (__usb_hcd_giveback_urb+0x80/0x154)
May  7 20:18:41 raspb kernel: [  142.344693] [<c03d545c>] (__usb_hcd_giveback_urb) from [<c03d56d4>] (usb_hcd_giveback_urb+0x48/0x120)
May  7 20:18:41 raspb kernel: [  142.344733] [<c03d56d4>] (usb_hcd_giveback_urb) from [<c03fdec4>] (completion_tasklet_func+0x6c/0x98)
May  7 20:18:41 raspb kernel: [  142.344769] [<c03fdec4>] (completion_tasklet_func) from [<c040be80>] (tasklet_callback+0x20/0x24)
May  7 20:18:41 raspb kernel: [  142.344816] [<c040be80>] (tasklet_callback) from [<c0027ccc>] (tasklet_hi_action+0x84/0xd8)

This does NOT happen with an upstream 4.1.0-rc2 kernel, so it must be something specific to downstream/custom code

It also only happens for the second mapping request that is 16384 bytes (actually might be anything > 8192 bytes - it was only tested with 4K/8K/16K/32K transfers)

Here with some dev_info messages arround the krealloc:

[   82.843285] fbtft: module is from the staging directory, the quality is unknown, you have been warned.
[   82.852350] fb_st7735r: module is from the staging directory, the quality is unknown, you have been warned.
[   82.855984] fbtft_of_value: buswidth = 8
[   82.856030] fbtft_of_value: txbuflen = 16384
[   82.977874] spi_master spi0: krealloc-rx - 00000000 - 1
[   82.977923] spi_master spi0: krealloc-rx - da6c8be0
[   83.128369] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.128414] spi_master spi0: krealloc-rx - da6c8be0
[   83.629290] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.629332] spi_master spi0: krealloc-rx - da6c8be0
[   83.629443] spi_master spi0: krealloc-rx - 00000000 - 3
[   83.629472] spi_master spi0: krealloc-rx - da6c8be0
[   83.629565] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.629590] spi_master spi0: krealloc-rx - da6c8be0
[   83.629669] spi_master spi0: krealloc-rx - 00000000 - 3
[   83.629697] spi_master spi0: krealloc-rx - da6c8be0
[   83.629788] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.629816] spi_master spi0: krealloc-rx - da6c8be0
[   83.629894] spi_master spi0: krealloc-rx - 00000000 - 6
[   83.629920] spi_master spi0: krealloc-rx - da6c8be0
[   83.630007] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.630032] spi_master spi0: krealloc-rx - da6c8be0
[   83.630110] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.630139] spi_master spi0: krealloc-rx - da6c8be0
[   83.630219] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.630247] spi_master spi0: krealloc-rx - da6c8be0
[   83.630322] spi_master spi0: krealloc-rx - 00000000 - 3
[   83.630348] spi_master spi0: krealloc-rx - da6c8be0
[   83.630431] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.630458] spi_master spi0: krealloc-rx - da6c8be0
[   83.630534] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.630561] spi_master spi0: krealloc-rx - da6c8be0
[   83.630645] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.630671] spi_master spi0: krealloc-rx - da6c8be0
[   83.630747] spi_master spi0: krealloc-rx - 00000000 - 2
[   83.630774] spi_master spi0: krealloc-rx - da6c8be0
[   83.630856] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.630885] spi_master spi0: krealloc-rx - da6c8be0
[   83.630960] spi_master spi0: krealloc-rx - 00000000 - 2
[   83.630988] spi_master spi0: krealloc-rx - da6c8be0
[   83.631070] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.631096] spi_master spi0: krealloc-rx - da6c8be0
[   83.631172] spi_master spi0: krealloc-rx - 00000000 - 2
[   83.631198] spi_master spi0: krealloc-rx - da6c8be0
[   83.631280] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.631307] spi_master spi0: krealloc-rx - da6c8be0
[   83.631423] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.631453] spi_master spi0: krealloc-rx - da6c8be0
[   83.631577] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.631607] spi_master spi0: krealloc-rx - da6c8be0
[   83.631696] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.631723] spi_master spi0: krealloc-rx - da6c8be0
[   83.631802] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.631829] spi_master spi0: krealloc-rx - da6c8be0
[   83.631906] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.631934] spi_master spi0: krealloc-rx - da6c8be0
[   83.736459] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.736508] spi_master spi0: krealloc-rx - da6c8be0
[   83.747254] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.747299] spi_master spi0: krealloc-rx - da6c8be0
[   83.747410] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.747440] spi_master spi0: krealloc-rx - da6c8be0
[   83.747524] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.747552] spi_master spi0: krealloc-rx - da6c8be0
[   83.747630] spi_master spi0: krealloc-rx - 00000000 - 4
[   83.747657] spi_master spi0: krealloc-rx - da6c8be0
[   83.747736] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.747765] spi_master spi0: krealloc-rx - da6c8be0
[   83.747839] spi_master spi0: krealloc-rx - 00000000 - 4
[   83.747865] spi_master spi0: krealloc-rx - da6c8be0
[   83.747943] spi_master spi0: krealloc-rx - 00000000 - 1
[   83.747971] spi_master spi0: krealloc-rx - da6c8be0
[   83.748271] spi_master spi0: krealloc-rx - 00000000 - 16384
[   83.748314] spi_master spi0: krealloc-rx - da6ac000
[   83.765678] spi_master spi0: krealloc-rx - 00000000 - 16384
[   83.765731] BUG: Bad page state in process modprobe  pfn:1a6ac
[   83.774518] page:db7ba030 count:0 mapcount:0 mapping:  (null) index:0x0
[   83.783143] flags: 0x200(arch_1)
[   83.791861] page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set
[   83.803147] bad because of flags:
[   83.808397] flags: 0x200(arch_1)
[   83.813208] Modules linked in: fb_st7735r(C+) fbtft(C) syscopyarea sysfillrect sysimgblt fb_sys_fops spi_bcm2835 binfmt_misc i2c_bcm2708 uio_pdrv_genirq uio
[   83.813297] CPU: 0 PID: 2133 Comm: modprobe Tainted: G         C      4.0.1+ #35
[   83.813313] Hardware name: BCM2708
[   83.813389] [<c001672c>] (unwind_backtrace) from [<c0013314>] (show_stack+0x20/0x24)
[   83.813439] [<c0013314>] (show_stack) from [<c0576504>] (dump_stack+0x20/0x28)
[   83.813481] [<c0576504>] (dump_stack) from [<c00e7314>] (bad_page+0xc8/0x128)
[   83.813519] [<c00e7314>] (bad_page) from [<c00ea4c0>] (get_page_from_freelist+0x664/0x960)
[   83.813553] [<c00ea4c0>] (get_page_from_freelist) from [<c00ea8bc>] (__alloc_pages_nodemask+0x100/0x8f4)
[   83.813582] [<c00ea8bc>] (__alloc_pages_nodemask) from [<c00eb198>] (alloc_kmem_pages+0x20/0x28)
[   83.813612] [<c00eb198>] (alloc_kmem_pages) from [<c01051d4>] (kmalloc_order+0x20/0x5c)
[   83.813641] [<c01051d4>] (kmalloc_order) from [<c010523c>] (kmalloc_order_trace+0x2c/0xd4)
[   83.813675] [<c010523c>] (kmalloc_order_trace) from [<c012a2d4>] (__kmalloc_track_caller+0x218/0x224)
[   83.813703] [<c012a2d4>] (__kmalloc_track_caller) from [<c01050d8>] (krealloc+0x60/0xb8)
[   83.813734] [<c01050d8>] (krealloc) from [<c03cce08>] (__spi_pump_messages+0x70c/0x79c)
[   83.813764] [<c03cce08>] (__spi_pump_messages) from [<c03cd0b4>] (__spi_sync+0x21c/0x22c)
[   83.813789] [<c03cd0b4>] (__spi_sync) from [<c03cd100>] (spi_sync+0x1c/0x20)
[   83.813871] [<c03cd100>] (spi_sync) from [<bf0341c4>] (fbtft_write_spi+0x90/0x108 [fbtft])
[   83.813953] [<bf0341c4>] (fbtft_write_spi [fbtft]) from [<bf0335fc>] (fbtft_write_vmem16_bus8+0xfc/0x18c [fbtft])
[   83.814018] [<bf0335fc>] (fbtft_write_vmem16_bus8 [fbtft]) from [<bf032128>] (fbtft_update_display+0xd8/0x33c [fbtft])
[   83.814076] [<bf032128>] (fbtft_update_display [fbtft]) from [<bf0310e0>] (fbtft_register_framebuffer+0x114/0x2c0 [fbtft])
[   83.814131] [<bf0310e0>] (fbtft_register_framebuffer [fbtft]) from [<bf031a44>] (fbtft_probe_common+0x124/0x47c [fbtft])
[   83.814190] [<bf031a44>] (fbtft_probe_common [fbtft]) from [<bf03f098>] (fbtft_driver_probe_spi+0x24/0x2c [fb_st7735r])
[   83.814230] [<bf03f098>] (fbtft_driver_probe_spi [fb_st7735r]) from [<c03cc2f4>] (spi_drv_probe+0x48/0x4c)
[   83.814269] [<c03cc2f4>] (spi_drv_probe) from [<c037ce50>] (driver_probe_device+0x130/0x378)
[   83.814300] [<c037ce50>] (driver_probe_device) from [<c037d134>] (__driver_attach+0x9c/0xa0)
[   83.814326] [<c037d134>] (__driver_attach) from [<c037b0c8>] (bus_for_each_dev+0x64/0x98)
[   83.814353] [<c037b0c8>] (bus_for_each_dev) from [<c037c828>] (driver_attach+0x28/0x30)
[   83.814380] [<c037c828>] (driver_attach) from [<c037c420>] (bus_add_driver+0xf8/0x218)
[   83.814410] [<c037c420>] (bus_add_driver) from [<c037d664>] (driver_register+0x88/0x104)
[   83.814438] [<c037d664>] (driver_register) from [<c03cc298>] (spi_register_driver+0x54/0x68)
[   83.814472] [<c03cc298>] (spi_register_driver) from [<bf041014>] (fbtft_driver_module_init+0x14/0x38 [fb_st7735r])
[   83.814512] [<bf041014>] (fbtft_driver_module_init [fb_st7735r]) from [<c00087b4>] (do_one_initcall+0xbc/0x204)
[   83.814555] [<c00087b4>] (do_one_initcall) from [<c008ad28>] (do_init_module+0x6c/0x1c4)
[   83.814586] [<c008ad28>] (do_init_module) from [<c008c7ac>] (load_module+0x17f4/0x1ff0)
[   83.814616] [<c008c7ac>] (load_module) from [<c008d080>] (SyS_init_module+0xd8/0x150)
[   83.814651] [<c008d080>] (SyS_init_module) from [<c000f5c0>] (ret_fast_syscall+0x0/0x4c)
[   83.814668] Disabling lock debugging due to kernel taint
[   83.814708] spi_master spi0: krealloc-rx - da4c0000
[   83.832113] spi_master spi0: krealloc-rx - da4c0000 - 8192
[   83.832156] spi_master spi0: krealloc-rx - da4c0000
[   83.840915] spi_master spi0: krealloc-rx - da4c0000 - 1
[   83.840956] spi_master spi0: krealloc-rx - da4c0000
[   83.841006] spi_master spi0: krealloc-rx - da4c0000 - 16
[   83.841029] spi_master spi0: krealloc-rx - da4c0000
[   83.841086] spi_master spi0: krealloc-rx - da4c0000 - 1
[   83.841108] spi_master spi0: krealloc-rx - da4c0000
[   83.841148] spi_master spi0: krealloc-rx - da4c0000 - 16
[   83.841170] spi_master spi0: krealloc-rx - da4c0000
[   83.841837] graphics fb1: fb_st7735r frame buffer, 128x160, 40 KiB video memory, 16 KiB DMA buffer memory, fps=20, spi0.4 at 8 MHz
[   85.506454] spi_master spi0: krealloc-rx - 00000000 - 1
[   85.506502] spi_master spi0: krealloc-rx - dafac180
[   85.506574] spi_master spi0: krealloc-rx - dafac180 - 4

Finally see also discussions in this thread:
msperl/spi-bcm2835#13 (comment)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions