Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: NFSv4.2 does not handle context mounts correctly #35

Closed
stephensmalley opened this issue May 30, 2017 · 3 comments
Closed

BUG: NFSv4.2 does not handle context mounts correctly #35

stephensmalley opened this issue May 30, 2017 · 3 comments
Labels

Comments

@stephensmalley
Copy link
Member

stephensmalley commented May 30, 2017

NFSv4.2 introduces support for file security labels. However, context mounts should still operate in the same manner as before, i.e. all files in the NFS filesystem should appear to be labeled with the context mount label on the client, and the NFS client filesystem should not try to set the context of any newly created files on the server. At present, we get mixed behavior with NFSv4.2: the top-level mount directory is labeled with the context mount, files that already existed on the server show up with the server file labels, newly created files on the client have their contexts set on the server to the context mount value. It is unclear whether this was ever correct for NFSv4.2, probably not.

$ cat nfs-bug.sh
#!/bin/sh
#Remove security_label if testing with an older nfs-utils, e.g. RHEL7.
cat > /etc/exports <<EOF
/home localhost(rw,no_root_squash,security_label)
EOF
exportfs -a
systemctl start nfs-server
mkdir -p /mnt/home
mount -t nfs -o vers=4.0,context=system_u:object_r:etc_t:s0 localhost:/home /mnt/home
echo "Under NFSv4.0:"
#Everything should be labeled with the context mount value.
ls -Z /mnt/home
#When we create a new file, it should appear to be labeled with the context mount value on the client.
touch /mnt/home/foo
ls -Z /mnt/home/foo
#But we should not try to set it on the server.
ls -Z /home/foo
rm /mnt/home/foo
umount /mnt/home
mount -t nfs -o vers=4.2,context=system_u:object_r:etc_t:s0 localhost:/home /mnt/home
echo "Under NFSv4.2:"
ls -Z /mnt/home
touch /mnt/home/foo
ls -Z /mnt/home/foo
ls -Z /home/foo
rm /home/foo
umount /mnt/home
rmdir /mnt/home
rm /etc/exports
exportfs -ua
systemctl stop nfs-server

$ sudo ./nfs-bug.sh
./nfs-bug.sh
Under NFSv4.0:
system_u:object_r:etc_t:s0 lost+found
system_u:object_r:etc_t:s0 sds
system_u:object_r:etc_t:s0 /mnt/home/foo
system_u:object_r:home_root_t:s0 /home/foo
Under NFSv4.2:
system_u:object_r:lost_found_t:s0 lost+found
unconfined_u:object_r:user_home_dir_t:s0 sds
system_u:object_r:etc_t:s0 /mnt/home/foo
system_u:object_r:etc_t:s0 /home/foo

@stephensmalley
Copy link
Member Author

NB For 4.11 and later, this difference in behavior only manifests if exported with security_label; if not exported with security_label, then NFSv4.2 has the same behavior as NFSv4.0 wrt context mounts. However, regardless of how the filesystem is exported, clients should still have the ability to override the server's labels (e.g. if the client does not trust the server or uses a different policy) via context mounts.

@stephensmalley
Copy link
Member Author

The way this is supposed to work is that the NFSv4 client code (nfs_set_sb_security) checks whether the server supports security labels, and if so, passes SECURITY_LSM_NATIVE_LABELS via kern_flags into security_sb_set_mnt_opts(). selinux_set_mnt_opts() internally handles context mount options and only sets SECURITY_LSM_NATIVE_LABELS in *set_kern_flags if there is no context mount. Then nfs_set_sb_security() checks those flags and if the flag is not set, then it clears NFS_CAP_SECURITY_LABEL from the server capabilities. That is supposed to disable all use of labels from the server or setting of labels on the server.

pcmoore pushed a commit that referenced this issue Jun 9, 2017
…native labeling behavior

When an NFSv4 client performs a mount operation, it first mounts the
NFSv4 root and then does path walk to the exported path and performs a
submount on that, cloning the security mount options from the root's
superblock to the submount's superblock in the process.

Unless the NFS server has an explicit fsid=0 export with the
"security_label" option, the NFSv4 root superblock will not have
SBLABEL_MNT set, and neither will the submount superblock after cloning
the security mount options.  As a result, setxattr's of security labels
over NFSv4.2 will fail.  In a similar fashion, NFSv4.2 mounts mounted
with the context= mount option will not show the correct labels because
the nfs_server->caps flags of the cloned superblock will still have
NFS_CAP_SECURITY_LABEL set.

Allowing the NFSv4 client to enable or disable SECURITY_LSM_NATIVE_LABELS
behavior will ensure that the SBLABEL_MNT flag has the correct value
when the client traverses from an exported path without the
"security_label" option to one with the "security_label" option and
vice versa.  Similarly, checking to see if SECURITY_LSM_NATIVE_LABELS is
set upon return from security_sb_clone_mnt_opts() and clearing
NFS_CAP_SECURITY_LABEL if necessary will allow the correct labels to
be displayed for NFSv4.2 mounts mounted with the context= mount option.

Resolves: #35

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Tested-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
@stephensmalley
Copy link
Member Author

Fixed by 0b4d345

pcmoore pushed a commit that referenced this issue Jul 6, 2017
Using the syzkaller kernel fuzzer, Andrey Konovalov generated the
following error in gadgetfs:

> BUG: KASAN: use-after-free in __lock_acquire+0x3069/0x3690
> kernel/locking/lockdep.c:3246
> Read of size 8 at addr ffff88003a2bdaf8 by task kworker/3:1/903
>
> CPU: 3 PID: 903 Comm: kworker/3:1 Not tainted 4.12.0-rc4+ #35
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Workqueue: usb_hub_wq hub_event
> Call Trace:
>  __dump_stack lib/dump_stack.c:16 [inline]
>  dump_stack+0x292/0x395 lib/dump_stack.c:52
>  print_address_description+0x78/0x280 mm/kasan/report.c:252
>  kasan_report_error mm/kasan/report.c:351 [inline]
>  kasan_report+0x230/0x340 mm/kasan/report.c:408
>  __asan_report_load8_noabort+0x19/0x20 mm/kasan/report.c:429
>  __lock_acquire+0x3069/0x3690 kernel/locking/lockdep.c:3246
>  lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
>  __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>  _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
>  spin_lock include/linux/spinlock.h:299 [inline]
>  gadgetfs_suspend+0x89/0x130 drivers/usb/gadget/legacy/inode.c:1682
>  set_link_state+0x88e/0xae0 drivers/usb/gadget/udc/dummy_hcd.c:455
>  dummy_hub_control+0xd7e/0x1fb0 drivers/usb/gadget/udc/dummy_hcd.c:2074
>  rh_call_control drivers/usb/core/hcd.c:689 [inline]
>  rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
>  usb_hcd_submit_urb+0x92f/0x20b0 drivers/usb/core/hcd.c:1650
>  usb_submit_urb+0x8b2/0x12c0 drivers/usb/core/urb.c:542
>  usb_start_wait_urb+0x148/0x5b0 drivers/usb/core/message.c:56
>  usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
>  usb_control_msg+0x341/0x4d0 drivers/usb/core/message.c:151
>  usb_clear_port_feature+0x74/0xa0 drivers/usb/core/hub.c:412
>  hub_port_disable+0x123/0x510 drivers/usb/core/hub.c:4177
>  hub_port_init+0x1ed/0x2940 drivers/usb/core/hub.c:4648
>  hub_port_connect drivers/usb/core/hub.c:4826 [inline]
>  hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
>  port_event drivers/usb/core/hub.c:5105 [inline]
>  hub_event+0x1ae1/0x3d40 drivers/usb/core/hub.c:5185
>  process_one_work+0xc08/0x1bd0 kernel/workqueue.c:2097
>  process_scheduled_works kernel/workqueue.c:2157 [inline]
>  worker_thread+0xb2b/0x1860 kernel/workqueue.c:2233
>  kthread+0x363/0x440 kernel/kthread.c:231
>  ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:424
>
> Allocated by task 9958:
>  save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:513
>  set_track mm/kasan/kasan.c:525 [inline]
>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:617
>  kmem_cache_alloc_trace+0x87/0x280 mm/slub.c:2745
>  kmalloc include/linux/slab.h:492 [inline]
>  kzalloc include/linux/slab.h:665 [inline]
>  dev_new drivers/usb/gadget/legacy/inode.c:170 [inline]
>  gadgetfs_fill_super+0x24f/0x540 drivers/usb/gadget/legacy/inode.c:1993
>  mount_single+0xf6/0x160 fs/super.c:1192
>  gadgetfs_mount+0x31/0x40 drivers/usb/gadget/legacy/inode.c:2019
>  mount_fs+0x9c/0x2d0 fs/super.c:1223
>  vfs_kern_mount.part.25+0xcb/0x490 fs/namespace.c:976
>  vfs_kern_mount fs/namespace.c:2509 [inline]
>  do_new_mount fs/namespace.c:2512 [inline]
>  do_mount+0x41b/0x2d90 fs/namespace.c:2834
>  SYSC_mount fs/namespace.c:3050 [inline]
>  SyS_mount+0xb0/0x120 fs/namespace.c:3027
>  entry_SYSCALL_64_fastpath+0x1f/0xbe
>
> Freed by task 9960:
>  save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:513
>  set_track mm/kasan/kasan.c:525 [inline]
>  kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:590
>  slab_free_hook mm/slub.c:1357 [inline]
>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>  slab_free mm/slub.c:2961 [inline]
>  kfree+0xed/0x2b0 mm/slub.c:3882
>  put_dev+0x124/0x160 drivers/usb/gadget/legacy/inode.c:163
>  gadgetfs_kill_sb+0x33/0x60 drivers/usb/gadget/legacy/inode.c:2027
>  deactivate_locked_super+0x8d/0xd0 fs/super.c:309
>  deactivate_super+0x21e/0x310 fs/super.c:340
>  cleanup_mnt+0xb7/0x150 fs/namespace.c:1112
>  __cleanup_mnt+0x1b/0x20 fs/namespace.c:1119
>  task_work_run+0x1a0/0x280 kernel/task_work.c:116
>  exit_task_work include/linux/task_work.h:21 [inline]
>  do_exit+0x18a8/0x2820 kernel/exit.c:878
>  do_group_exit+0x14e/0x420 kernel/exit.c:982
>  get_signal+0x784/0x1780 kernel/signal.c:2318
>  do_signal+0xd7/0x2130 arch/x86/kernel/signal.c:808
>  exit_to_usermode_loop+0x1ac/0x240 arch/x86/entry/common.c:157
>  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
>  syscall_return_slowpath+0x3ba/0x410 arch/x86/entry/common.c:263
>  entry_SYSCALL_64_fastpath+0xbc/0xbe
>
> The buggy address belongs to the object at ffff88003a2bdae0
>  which belongs to the cache kmalloc-1024 of size 1024
> The buggy address is located 24 bytes inside of
>  1024-byte region [ffff88003a2bdae0, ffff88003a2bdee0)
> The buggy address belongs to the page:
> page:ffffea0000e8ae00 count:1 mapcount:0 mapping:          (null)
> index:0x0 compound_mapcount: 0
> flags: 0x100000000008100(slab|head)
> raw: 0100000000008100 0000000000000000 0000000000000000 0000000100170017
> raw: ffffea0000ed3020 ffffea0000f5f820 ffff88003e80efc0 0000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
>  ffff88003a2bd980: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>  ffff88003a2bda00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >ffff88003a2bda80: fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb fb
>                                                                 ^
>  ffff88003a2bdb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff88003a2bdb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==================================================================

What this means is that the gadgetfs_suspend() routine was trying to
access dev->lock after it had been deallocated.  The root cause is a
race in the dummy_hcd driver; the dummy_udc_stop() routine can race
with the rest of the driver because it contains no locking.  And even
when proper locking is added, it can still race with the
set_link_state() function because that function incorrectly drops the
private spinlock before invoking any gadget driver callbacks.

The result of this race, as seen above, is that set_link_state() can
invoke a callback in gadgetfs even after gadgetfs has been unbound
from dummy_hcd's UDC and its private data structures have been
deallocated.

include/linux/usb/gadget.h documents that the ->reset, ->disconnect,
->suspend, and ->resume callbacks may be invoked in interrupt context.
In general this is necessary, to prevent races with gadget driver
removal.  This patch fixes dummy_hcd to retain the spinlock across
these calls, and it adds a spinlock acquisition to dummy_udc_stop() to
prevent the race.

The net2280 driver makes the same mistake of dropping the private
spinlock for its ->disconnect and ->reset callback invocations.  The
patch fixes it too.

Lastly, since gadgetfs_suspend() may be invoked in interrupt context,
it cannot assume that interrupts are enabled when it runs.  It must
use spin_lock_irqsave() instead of spin_lock_irq().  The patch fixes
that bug as well.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: Andrey Konovalov <andreyknvl@google.com>
CC: <stable@vger.kernel.org>
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
pcmoore pushed a commit that referenced this issue Sep 5, 2017
If madvise(..., MADV_FREE) split a transparent hugepage, it called
put_page() before unlock_page().

This was wrong because put_page() can free the page, e.g. if a
concurrent madvise(..., MADV_DONTNEED) has removed it from the memory
mapping. put_page() then rightfully complained about freeing a locked
page.

Fix this by moving the unlock_page() before put_page().

This bug was found by syzkaller, which encountered the following splat:

    BUG: Bad page state in process syzkaller412798  pfn:1bd800
    page:ffffea0006f60000 count:0 mapcount:0 mapping:          (null) index:0x20a00
    flags: 0x200000000040019(locked|uptodate|dirty|swapbacked)
    raw: 0200000000040019 0000000000000000 0000000000020a00 00000000ffffffff
    raw: ffffea0006f60020 ffffea0006f60020 0000000000000000 0000000000000000
    page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
    bad because of flags: 0x1(locked)
    Modules linked in:
    CPU: 1 PID: 3037 Comm: syzkaller412798 Not tainted 4.13.0-rc5+ #35
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
     __dump_stack lib/dump_stack.c:16 [inline]
     dump_stack+0x194/0x257 lib/dump_stack.c:52
     bad_page+0x230/0x2b0 mm/page_alloc.c:565
     free_pages_check_bad+0x1f0/0x2e0 mm/page_alloc.c:943
     free_pages_check mm/page_alloc.c:952 [inline]
     free_pages_prepare mm/page_alloc.c:1043 [inline]
     free_pcp_prepare mm/page_alloc.c:1068 [inline]
     free_hot_cold_page+0x8cf/0x12b0 mm/page_alloc.c:2584
     __put_single_page mm/swap.c:79 [inline]
     __put_page+0xfb/0x160 mm/swap.c:113
     put_page include/linux/mm.h:814 [inline]
     madvise_free_pte_range+0x137a/0x1ec0 mm/madvise.c:371
     walk_pmd_range mm/pagewalk.c:50 [inline]
     walk_pud_range mm/pagewalk.c:108 [inline]
     walk_p4d_range mm/pagewalk.c:134 [inline]
     walk_pgd_range mm/pagewalk.c:160 [inline]
     __walk_page_range+0xc3a/0x1450 mm/pagewalk.c:249
     walk_page_range+0x200/0x470 mm/pagewalk.c:326
     madvise_free_page_range.isra.9+0x17d/0x230 mm/madvise.c:444
     madvise_free_single_vma+0x353/0x580 mm/madvise.c:471
     madvise_dontneed_free mm/madvise.c:555 [inline]
     madvise_vma mm/madvise.c:664 [inline]
     SYSC_madvise mm/madvise.c:832 [inline]
     SyS_madvise+0x7d3/0x13c0 mm/madvise.c:760
     entry_SYSCALL_64_fastpath+0x1f/0xbe

Here is a C reproducer:

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define MADV_FREE	8
    #define PAGE_SIZE	4096

    static void *mapping;
    static const size_t mapping_size = 0x1000000;

    static void *madvise_thrproc(void *arg)
    {
        madvise(mapping, mapping_size, (long)arg);
    }

    int main(void)
    {
        pthread_t t[2];

        for (;;) {
            mapping = mmap(NULL, mapping_size, PROT_WRITE,
                           MAP_POPULATE|MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);

            munmap(mapping + mapping_size / 2, PAGE_SIZE);

            pthread_create(&t[0], 0, madvise_thrproc, (void*)MADV_DONTNEED);
            pthread_create(&t[1], 0, madvise_thrproc, (void*)MADV_FREE);
            pthread_join(t[0], NULL);
            pthread_join(t[1], NULL);
            munmap(mapping, mapping_size);
        }
    }

Note: to see the splat, CONFIG_TRANSPARENT_HUGEPAGE=y and
CONFIG_DEBUG_VM=y are needed.

Google Bug Id: 64696096

Link: http://lkml.kernel.org/r/20170823205235.132061-1-ebiggers3@gmail.com
Fixes: 854e9ed ("mm: support madvise(MADV_FREE)")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>	[v4.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
stephensmalley pushed a commit to stephensmalley/selinux-kernel that referenced this issue Aug 26, 2020
In the existing NVMeOF Passthru core command handling on failure of
nvme_alloc_request() it errors out with rq value set to NULL. In the
error handling path it calls blk_put_request() without checking if
rq is set to NULL or not which produces following Oops:-

[ 1457.346861] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 1457.347838] #PF: supervisor read access in kernel mode
[ 1457.348464] #PF: error_code(0x0000) - not-present page
[ 1457.349085] PGD 0 P4D 0
[ 1457.349402] Oops: 0000 [SELinuxProject#1] SMP NOPTI
[ 1457.349851] CPU: 18 PID: 10782 Comm: kworker/18:2 Tainted: G           OE     5.8.0-rc4nvme-5.9+ SELinuxProject#35
[ 1457.350951] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e3214
[ 1457.352347] Workqueue: events nvme_loop_execute_work [nvme_loop]
[ 1457.353062] RIP: 0010:blk_mq_free_request+0xe/0x110
[ 1457.353651] Code: 3f ff ff ff 83 f8 01 75 0d 4c 89 e7 e8 1b db ff ff e9 2d ff ff ff 0f 0b eb ef 66 8
[ 1457.355975] RSP: 0018:ffffc900035b7de0 EFLAGS: 00010282
[ 1457.356636] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
[ 1457.357526] RDX: ffffffffa060bd05 RSI: 0000000000000000 RDI: 0000000000000000
[ 1457.358416] RBP: 0000000000000037 R08: 0000000000000000 R09: 0000000000000000
[ 1457.359317] R10: 0000000000000000 R11: 000000000000006d R12: 0000000000000000
[ 1457.360424] R13: ffff8887ffa68600 R14: 0000000000000000 R15: ffff8888150564c8
[ 1457.361322] FS:  0000000000000000(0000) GS:ffff888814600000(0000) knlGS:0000000000000000
[ 1457.362337] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1457.363058] CR2: 0000000000000000 CR3: 000000081c0ac000 CR4: 00000000003406e0
[ 1457.363973] Call Trace:
[ 1457.364296]  nvmet_passthru_execute_cmd+0x150/0x2c0 [nvmet]
[ 1457.364990]  process_one_work+0x24e/0x5a0
[ 1457.365493]  ? __schedule+0x353/0x840
[ 1457.365957]  worker_thread+0x3c/0x380
[ 1457.366426]  ? process_one_work+0x5a0/0x5a0
[ 1457.366948]  kthread+0x135/0x150
[ 1457.367362]  ? kthread_create_on_node+0x60/0x60
[ 1457.367934]  ret_from_fork+0x22/0x30
[ 1457.368388] Modules linked in: nvme_loop(OE) nvmet(OE) nvme_fabrics(OE) null_blk nvme(OE) nvme_corer
[ 1457.368414]  ata_piix crc32c_intel virtio_pci libata virtio_ring serio_raw t10_pi virtio floppy dm_]
[ 1457.380849] CR2: 0000000000000000
[ 1457.381288] ---[ end trace c6cab61bfd1f68fd ]---
[ 1457.381861] RIP: 0010:blk_mq_free_request+0xe/0x110
[ 1457.382469] Code: 3f ff ff ff 83 f8 01 75 0d 4c 89 e7 e8 1b db ff ff e9 2d ff ff ff 0f 0b eb ef 66 8
[ 1457.384749] RSP: 0018:ffffc900035b7de0 EFLAGS: 00010282
[ 1457.385393] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
[ 1457.386264] RDX: ffffffffa060bd05 RSI: 0000000000000000 RDI: 0000000000000000
[ 1457.387142] RBP: 0000000000000037 R08: 0000000000000000 R09: 0000000000000000
[ 1457.388029] R10: 0000000000000000 R11: 000000000000006d R12: 0000000000000000
[ 1457.388914] R13: ffff8887ffa68600 R14: 0000000000000000 R15: ffff8888150564c8
[ 1457.389798] FS:  0000000000000000(0000) GS:ffff888814600000(0000) knlGS:0000000000000000
[ 1457.390796] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1457.391508] CR2: 0000000000000000 CR3: 000000081c0ac000 CR4: 00000000003406e0
[ 1457.392525] Kernel panic - not syncing: Fatal exception
[ 1457.394138] Kernel Offset: disabled
[ 1457.394677] ---[ end Kernel panic - not syncing: Fatal exception ]---

We fix this Oops by adding a new goto label out_put_req and reordering
the blk_put_request call to avoid calling blk_put_request() with rq
value is set to NULL. Here we also update the rest of the code
accordingly.

Fixes: 06b7164dfdc0 ("nvmet: add passthru code to process commands")
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
pcmoore pushed a commit that referenced this issue Oct 17, 2022
Inspired by commit 9fb7410("arm64/BUG: Use BRK instruction for
generic BUG traps"), do similar for LoongArch to use generic BUG()
handler.

This patch uses the BREAK software breakpoint instruction to generate
a trap instead, similarly to most other arches, with the generic BUG
code generating the dmesg boilerplate.

This allows bug metadata to be moved to a separate table and reduces
the amount of inline code at BUG() and WARN() sites. This also avoids
clobbering any registers before they can be dumped.

To mitigate the size of the bug table further, this patch makes use of
the existing infrastructure for encoding addresses within the bug table
as 32-bit relative pointers instead of absolute pointers.

(Note: this limits the max kernel size to 2GB.)

Before patch:
[ 3018.338013] lkdtm: Performing direct entry BUG
[ 3018.342445] Kernel bug detected[#5]:
[ 3018.345992] CPU: 2 PID: 865 Comm: cat Tainted: G D 6.0.0-rc6+ #35

After patch:
[  125.585985] lkdtm: Performing direct entry BUG
[  125.590433] ------------[ cut here ]------------
[  125.595020] kernel BUG at drivers/misc/lkdtm/bugs.c:78!
[  125.600211] Oops - BUG[#1]:
[  125.602980] CPU: 3 PID: 410 Comm: cat Not tainted 6.0.0-rc6+ #36

Out-of-line file/line data information obtained compared to before.

Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant