Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

auto update fork #1

Closed
nickdesaulniers opened this issue Mar 19, 2018 · 2 comments
Closed

auto update fork #1

nickdesaulniers opened this issue Mar 19, 2018 · 2 comments
Assignees
Labels
infrastructure Not a bug per se, but an improvement to our workflow

Comments

@nickdesaulniers
Copy link
Member

since this is not a hard fork, it would be good to automate pulling updates from upstream. A quick search looks like https://backstroke.co/ is a service that may help.

@nickdesaulniers nickdesaulniers added the infrastructure Not a bug per se, but an improvement to our workflow label Mar 19, 2018
@nickdesaulniers
Copy link
Member Author

added a wiki page: https://github.com/ClangBuiltLinux/linux/wiki/Updating-the-repo

can probably just add this to my local cron.

nickdesaulniers pushed a commit that referenced this issue Aug 22, 2018
[  155.018460] ======================================================
[  155.021431] WARNING: possible circular locking dependency detected
[  155.024339] 4.18.0-rc3+ #5 Tainted: G           OE
[  155.026879] ------------------------------------------------------
[  155.029783] umount/2901 is trying to acquire lock:
[  155.032187] 00000000c4282f1f (kn->count#130){++++}, at: kernfs_remove+0x1f/0x30
[  155.035439]
[  155.035439] but task is already holding lock:
[  155.038892] 0000000056e4307b (&type->s_umount_key#41){++++}, at: deactivate_super+0x33/0x50
[  155.042602]
[  155.042602] which lock already depends on the new lock.
[  155.042602]
[  155.047465]
[  155.047465] the existing dependency chain (in reverse order) is:
[  155.051354]
[  155.051354] -> #1 (&type->s_umount_key#41){++++}:
[  155.054768]        f2fs_sbi_store+0x61/0x460 [f2fs]
[  155.057083]        kernfs_fop_write+0x113/0x1a0
[  155.059277]        __vfs_write+0x36/0x180
[  155.061250]        vfs_write+0xbe/0x1b0
[  155.063179]        ksys_write+0x55/0xc0
[  155.065068]        do_syscall_64+0x60/0x1b0
[  155.067071]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  155.069529]
[  155.069529] -> #0 (kn->count#130){++++}:
[  155.072421]        __kernfs_remove+0x26f/0x2e0
[  155.074452]        kernfs_remove+0x1f/0x30
[  155.076342]        kobject_del.part.5+0xe/0x40
[  155.078354]        f2fs_put_super+0x12d/0x290 [f2fs]
[  155.080500]        generic_shutdown_super+0x6c/0x110
[  155.082655]        kill_block_super+0x21/0x50
[  155.084634]        kill_f2fs_super+0x9c/0xc0 [f2fs]
[  155.086726]        deactivate_locked_super+0x3f/0x70
[  155.088826]        cleanup_mnt+0x3b/0x70
[  155.090584]        task_work_run+0x93/0xc0
[  155.092367]        exit_to_usermode_loop+0xf0/0x100
[  155.094466]        do_syscall_64+0x162/0x1b0
[  155.096312]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  155.098603]
[  155.098603] other info that might help us debug this:
[  155.098603]
[  155.102418]  Possible unsafe locking scenario:
[  155.102418]
[  155.105134]        CPU0                    CPU1
[  155.107037]        ----                    ----
[  155.108910]   lock(&type->s_umount_key#41);
[  155.110674]                                lock(kn->count#130);
[  155.113010]                                lock(&type->s_umount_key#41);
[  155.115608]   lock(kn->count#130);

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
nickdesaulniers pushed a commit that referenced this issue Aug 22, 2018
As Anatoly Trosinenko reported in bugzilla:

How to reproduce:
1. Compile the 73fcb1a version of the kernel using the config attached
2. Unpack and mount the attached filesystem image as F2FS
3. The kernel will BUG() on mount (BUGs are explicitly enabled in config)

[    2.233612] F2FS-fs (sda): Found nat_bits in checkpoint
[    2.248422] ------------[ cut here ]------------
[    2.248857] kernel BUG at fs/f2fs/node.c:1967!
[    2.249760] invalid opcode: 0000 [#1] SMP NOPTI
[    2.250219] Modules linked in:
[    2.251848] CPU: 0 PID: 944 Comm: mount Not tainted 4.17.0-rc5+ #1
[    2.252331] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[    2.253305] RIP: 0010:build_free_nids+0x337/0x3f0
[    2.253672] RSP: 0018:ffffae7fc0857c50 EFLAGS: 00000246
[    2.254080] RAX: 00000000ffffffff RBX: 0000000000000123 RCX: 0000000000000001
[    2.254638] RDX: ffff9aa7063d5c00 RSI: 0000000000000122 RDI: ffff9aa705852e00
[    2.255190] RBP: ffff9aa705852e00 R08: 0000000000000001 R09: ffff9aa7059090c0
[    2.255719] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9aa705852e00
[    2.256242] R13: ffff9aa7063ad000 R14: ffff9aa705919000 R15: 0000000000000123
[    2.256809] FS:  00000000023078c0(0000) GS:ffff9aa707800000(0000) knlGS:0000000000000000
[    2.258654] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    2.259153] CR2: 00000000005511ae CR3: 0000000005872000 CR4: 00000000000006f0
[    2.259801] Call Trace:
[    2.260583]  build_node_manager+0x5cd/0x600
[    2.260963]  f2fs_fill_super+0x66a/0x17c0
[    2.261300]  ? f2fs_commit_super+0xe0/0xe0
[    2.261622]  mount_bdev+0x16e/0x1a0
[    2.261899]  mount_fs+0x30/0x150
[    2.262398]  vfs_kern_mount.part.28+0x4f/0xf0
[    2.262743]  do_mount+0x5d0/0xc60
[    2.263010]  ? _copy_from_user+0x37/0x60
[    2.263313]  ? memdup_user+0x39/0x60
[    2.263692]  ksys_mount+0x7b/0xd0
[    2.263960]  __x64_sys_mount+0x1c/0x20
[    2.264268]  do_syscall_64+0x43/0xf0
[    2.264560]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    2.265095] RIP: 0033:0x48d31a
[    2.265502] RSP: 002b:00007ffc6fe60a08 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[    2.266089] RAX: ffffffffffffffda RBX: 0000000000008000 RCX: 000000000048d31a
[    2.266607] RDX: 00007ffc6fe62fa5 RSI: 00007ffc6fe62f9d RDI: 00007ffc6fe62f94
[    2.267130] RBP: 00000000023078a0 R08: 0000000000000000 R09: 0000000000000000
[    2.267670] R10: 0000000000008000 R11: 0000000000000246 R12: 0000000000000000
[    2.268192] R13: 0000000000000000 R14: 00007ffc6fe60c78 R15: 0000000000000000
[    2.268767] Code: e8 5f c3 ff ff 83 c3 01 41 83 c7 01 81 fb c7 01 00 00 74 48 44 39 7d 04 76 42 48 63 c3 48 8d 04 c0 41 8b 44 06 05 83 f8 ff 75 c1 <0f> 0b 49 8b 45 50 48 8d b8 b0 00 00 00 e8 37 59 69 00 b9 01 00
[    2.270434] RIP: build_free_nids+0x337/0x3f0 RSP: ffffae7fc0857c50
[    2.271426] ---[ end trace ab20c06cd3c8fde4 ]---

During loading NAT entries, we will do sanity check, once the entry info
is corrupted, it will cause BUG_ON directly to protect user data from
being overwrited.

In this case, it will be better to just return failure on mount() instead
of panic, so that user can get hint from kmsg and try fsck for recovery
immediately rather than after an abnormal reboot.

https://bugzilla.kernel.org/show_bug.cgi?id=199769

Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
nickdesaulniers pushed a commit that referenced this issue Aug 22, 2018
As Wen Xu reported in below link:

https://bugzilla.kernel.org/show_bug.cgi?id=200183

- Overview
Divide zero in reset_curseg() when mounting a crafted f2fs image

- Reproduce

- Kernel message
[  588.281510] divide error: 0000 [#1] SMP KASAN PTI
[  588.282701] CPU: 0 PID: 1293 Comm: mount Not tainted 4.18.0-rc1+ #4
[  588.284000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  588.286178] RIP: 0010:reset_curseg+0x94/0x1a0
[  588.298166] RSP: 0018:ffff8801e88d7940 EFLAGS: 00010246
[  588.299360] RAX: 0000000000000014 RBX: ffff8801e1d46d00 RCX: ffffffffb88bf60b
[  588.300809] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8801e1d46d64
[  588.305272] R13: 0000000000000000 R14: 0000000000000014 R15: 0000000000000000
[  588.306822] FS:  00007fad85008840(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[  588.308456] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  588.309623] CR2: 0000000001705078 CR3: 00000001f30f8000 CR4: 00000000000006f0
[  588.311085] Call Trace:
[  588.311637]  f2fs_build_segment_manager+0x103f/0x3410
[  588.316136]  ? f2fs_commit_super+0x1b0/0x1b0
[  588.317031]  ? set_blocksize+0x90/0x140
[  588.319473]  f2fs_mount+0x15/0x20
[  588.320166]  mount_fs+0x60/0x1a0
[  588.320847]  ? alloc_vfsmnt+0x309/0x360
[  588.321647]  vfs_kern_mount+0x6b/0x1a0
[  588.322432]  do_mount+0x34a/0x18c0
[  588.323175]  ? strndup_user+0x46/0x70
[  588.323937]  ? copy_mount_string+0x20/0x20
[  588.324793]  ? memcg_kmem_put_cache+0x1b/0xa0
[  588.325702]  ? kasan_check_write+0x14/0x20
[  588.326562]  ? _copy_from_user+0x6a/0x90
[  588.327375]  ? memdup_user+0x42/0x60
[  588.328118]  ksys_mount+0x83/0xd0
[  588.328808]  __x64_sys_mount+0x67/0x80
[  588.329607]  do_syscall_64+0x78/0x170
[  588.330400]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  588.331461] RIP: 0033:0x7fad848e8b9a
[  588.336022] RSP: 002b:00007ffd7c5b6be8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[  588.337547] RAX: ffffffffffffffda RBX: 00000000016f8030 RCX: 00007fad848e8b9a
[  588.338999] RDX: 00000000016f8210 RSI: 00000000016f9f30 RDI: 0000000001700ec0
[  588.340442] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[  588.341887] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000000001700ec0
[  588.343341] R13: 00000000016f8210 R14: 0000000000000000 R15: 0000000000000003
[  588.354891] ---[ end trace 4ce02f25ff7d3df5 ]---
[  588.355862] RIP: 0010:reset_curseg+0x94/0x1a0
[  588.360742] RSP: 0018:ffff8801e88d7940 EFLAGS: 00010246
[  588.361812] RAX: 0000000000000014 RBX: ffff8801e1d46d00 RCX: ffffffffb88bf60b
[  588.363485] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8801e1d46d64
[  588.365213] RBP: ffff8801e88d7968 R08: ffffed003c32266f R09: ffffed003c32266f
[  588.366661] R10: 0000000000000001 R11: ffffed003c32266e R12: ffff8801f0337700
[  588.368110] R13: 0000000000000000 R14: 0000000000000014 R15: 0000000000000000
[  588.370057] FS:  00007fad85008840(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[  588.372099] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  588.373291] CR2: 0000000001705078 CR3: 00000001f30f8000 CR4: 00000000000006f0

- Location
https://elixir.bootlin.com/linux/latest/source/fs/f2fs/segment.c#L2147
        curseg->zone = GET_ZONE_FROM_SEG(sbi, curseg->segno);

If secs_per_zone is corrupted due to fuzzing test, it will cause divide
zero operation when using GET_ZONE_FROM_SEG macro, so we should do more
sanity check with secs_per_zone during mount to avoid this issue.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
nickdesaulniers pushed a commit that referenced this issue Aug 22, 2018
If FI_EXTRA_ATTR is set in inode by fuzzing, inode.i_addr[0] will be
parsed as inode.i_extra_isize, then in __recover_inline_status, inline
data address will beyond boundary of page, result in accessing invalid
memory.

So in this condition, during reading inode page, let's do sanity check
with EXTRA_ATTR feature of fs and extra_attr bit of inode, if they're
inconsistent, deny to load this inode.

- Overview
Out-of-bound access in f2fs_iget() when mounting a corrupted f2fs image

- Reproduce

The following message will be got in KASAN build of 4.18 upstream kernel.
[  819.392227] ==================================================================
[  819.393901] BUG: KASAN: slab-out-of-bounds in f2fs_iget+0x736/0x1530
[  819.395329] Read of size 4 at addr ffff8801f099c968 by task mount/1292

[  819.397079] CPU: 1 PID: 1292 Comm: mount Not tainted 4.18.0-rc1+ #4
[  819.397082] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  819.397088] Call Trace:
[  819.397124]  dump_stack+0x7b/0xb5
[  819.397154]  print_address_description+0x70/0x290
[  819.397159]  kasan_report+0x291/0x390
[  819.397163]  ? f2fs_iget+0x736/0x1530
[  819.397176]  check_memory_region+0x139/0x190
[  819.397182]  __asan_loadN+0xf/0x20
[  819.397185]  f2fs_iget+0x736/0x1530
[  819.397197]  f2fs_fill_super+0x1b4f/0x2b40
[  819.397202]  ? f2fs_fill_super+0x1b4f/0x2b40
[  819.397208]  ? f2fs_commit_super+0x1b0/0x1b0
[  819.397227]  ? set_blocksize+0x90/0x140
[  819.397241]  mount_bdev+0x1c5/0x210
[  819.397245]  ? f2fs_commit_super+0x1b0/0x1b0
[  819.397252]  f2fs_mount+0x15/0x20
[  819.397256]  mount_fs+0x60/0x1a0
[  819.397267]  ? alloc_vfsmnt+0x309/0x360
[  819.397272]  vfs_kern_mount+0x6b/0x1a0
[  819.397282]  do_mount+0x34a/0x18c0
[  819.397300]  ? lockref_put_or_lock+0xcf/0x160
[  819.397306]  ? copy_mount_string+0x20/0x20
[  819.397318]  ? memcg_kmem_put_cache+0x1b/0xa0
[  819.397324]  ? kasan_check_write+0x14/0x20
[  819.397334]  ? _copy_from_user+0x6a/0x90
[  819.397353]  ? memdup_user+0x42/0x60
[  819.397359]  ksys_mount+0x83/0xd0
[  819.397365]  __x64_sys_mount+0x67/0x80
[  819.397388]  do_syscall_64+0x78/0x170
[  819.397403]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  819.397422] RIP: 0033:0x7f54c667cb9a
[  819.397424] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48
[  819.397483] RSP: 002b:00007ffd8f46cd08 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[  819.397496] RAX: ffffffffffffffda RBX: 0000000000dfa030 RCX: 00007f54c667cb9a
[  819.397498] RDX: 0000000000dfa210 RSI: 0000000000dfbf30 RDI: 0000000000e02ec0
[  819.397501] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[  819.397503] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 0000000000e02ec0
[  819.397505] R13: 0000000000dfa210 R14: 0000000000000000 R15: 0000000000000003

[  819.397866] Allocated by task 139:
[  819.398702]  save_stack+0x46/0xd0
[  819.398705]  kasan_kmalloc+0xad/0xe0
[  819.398709]  kasan_slab_alloc+0x11/0x20
[  819.398713]  kmem_cache_alloc+0xd1/0x1e0
[  819.398717]  dup_fd+0x50/0x4c0
[  819.398740]  copy_process.part.37+0xbed/0x32e0
[  819.398744]  _do_fork+0x16e/0x590
[  819.398748]  __x64_sys_clone+0x69/0x80
[  819.398752]  do_syscall_64+0x78/0x170
[  819.398756]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[  819.399097] Freed by task 159:
[  819.399743]  save_stack+0x46/0xd0
[  819.399747]  __kasan_slab_free+0x13c/0x1a0
[  819.399750]  kasan_slab_free+0xe/0x10
[  819.399754]  kmem_cache_free+0x89/0x1e0
[  819.399757]  put_files_struct+0x132/0x150
[  819.399761]  exit_files+0x62/0x70
[  819.399766]  do_exit+0x47b/0x1390
[  819.399770]  do_group_exit+0x86/0x130
[  819.399774]  __x64_sys_exit_group+0x2c/0x30
[  819.399778]  do_syscall_64+0x78/0x170
[  819.399782]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[  819.400115] The buggy address belongs to the object at ffff8801f099c680
                which belongs to the cache files_cache of size 704
[  819.403234] The buggy address is located 40 bytes to the right of
                704-byte region [ffff8801f099c680, ffff8801f099c940)
[  819.405689] The buggy address belongs to the page:
[  819.406709] page:ffffea0007c26700 count:1 mapcount:0 mapping:ffff8801f69a3340 index:0xffff8801f099d380 compound_mapcount: 0
[  819.408984] flags: 0x2ffff0000008100(slab|head)
[  819.409932] raw: 02ffff0000008100 ffffea00077fb600 0000000200000002 ffff8801f69a3340
[  819.411514] raw: ffff8801f099d380 0000000080130000 00000001ffffffff 0000000000000000
[  819.413073] page dumped because: kasan: bad access detected

[  819.414539] Memory state around the buggy address:
[  819.415521]  ffff8801f099c800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  819.416981]  ffff8801f099c880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  819.418454] >ffff8801f099c900: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[  819.419921]                                                           ^
[  819.421265]  ffff8801f099c980: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
[  819.422745]  ffff8801f099ca00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  819.424206] ==================================================================
[  819.425668] Disabling lock debugging due to kernel taint
[  819.457463] F2FS-fs (loop0): Mounted with checkpoint version = 3

The kernel still mounts the image. If you run the following program on the mounted folder mnt,

(poc.c)

static void activity(char *mpoint) {

  char *foo_bar_baz;
  int err;

  static int buf[8192];
  memset(buf, 0, sizeof(buf));

  err = asprintf(&foo_bar_baz, "%s/foo/bar/baz", mpoint);
    int fd = open(foo_bar_baz, O_RDONLY, 0);
  if (fd >= 0) {
      read(fd, (char *)buf, 11);
      close(fd);
  }
}

int main(int argc, char *argv[]) {
  activity(argv[1]);
  return 0;
}

You can get kernel crash:
[  819.457463] F2FS-fs (loop0): Mounted with checkpoint version = 3
[  918.028501] BUG: unable to handle kernel paging request at ffffed0048000d82
[  918.044020] PGD 23ffee067 P4D 23ffee067 PUD 23fbef067 PMD 0
[  918.045207] Oops: 0000 [#1] SMP KASAN PTI
[  918.046048] CPU: 0 PID: 1309 Comm: poc Tainted: G    B             4.18.0-rc1+ #4
[  918.047573] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  918.049552] RIP: 0010:check_memory_region+0x5e/0x190
[  918.050565] Code: f8 49 c1 e8 03 49 89 db 49 c1 eb 03 4d 01 cb 4d 01 c1 4d 8d 63 01 4c 89 c8 4d 89 e2 4d 29 ca 49 83 fa 10 7f 3d 4d 85 d2 74 32 <41> 80 39 00 75 23 48 b8 01 00 00 00 00 fc ff df 4d 01 d1 49 01 c0
[  918.054322] RSP: 0018:ffff8801e3a1f258 EFLAGS: 00010202
[  918.055400] RAX: ffffed0048000d82 RBX: ffff880240006c11 RCX: ffffffffb8867d14
[  918.056832] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff880240006c10
[  918.058253] RBP: ffff8801e3a1f268 R08: 1ffff10048000d82 R09: ffffed0048000d82
[  918.059717] R10: 0000000000000001 R11: ffffed0048000d82 R12: ffffed0048000d83
[  918.061159] R13: ffff8801e3a1f390 R14: 0000000000000000 R15: ffff880240006c08
[  918.062614] FS:  00007fac9732c700(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[  918.064246] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  918.065412] CR2: ffffed0048000d82 CR3: 00000001df77a000 CR4: 00000000000006f0
[  918.066882] Call Trace:
[  918.067410]  __asan_loadN+0xf/0x20
[  918.068149]  f2fs_find_target_dentry+0xf4/0x270
[  918.069083]  ? __get_node_page+0x331/0x5b0
[  918.069925]  f2fs_find_in_inline_dir+0x24b/0x310
[  918.070881]  ? f2fs_recover_inline_data+0x4c0/0x4c0
[  918.071905]  ? unwind_next_frame.part.5+0x34f/0x490
[  918.072901]  ? unwind_dump+0x290/0x290
[  918.073695]  ? is_bpf_text_address+0xe/0x20
[  918.074566]  __f2fs_find_entry+0x599/0x670
[  918.075408]  ? kasan_unpoison_shadow+0x36/0x50
[  918.076315]  ? kasan_kmalloc+0xad/0xe0
[  918.077100]  ? memcg_kmem_put_cache+0x55/0xa0
[  918.077998]  ? f2fs_find_target_dentry+0x270/0x270
[  918.079006]  ? d_set_d_op+0x30/0x100
[  918.079749]  ? __d_lookup_rcu+0x69/0x2e0
[  918.080556]  ? __d_alloc+0x275/0x450
[  918.081297]  ? kasan_check_write+0x14/0x20
[  918.082135]  ? memset+0x31/0x40
[  918.082820]  ? fscrypt_setup_filename+0x1ec/0x4c0
[  918.083782]  ? d_alloc_parallel+0x5bb/0x8c0
[  918.084640]  f2fs_find_entry+0xe9/0x110
[  918.085432]  ? __f2fs_find_entry+0x670/0x670
[  918.086308]  ? kasan_check_write+0x14/0x20
[  918.087163]  f2fs_lookup+0x297/0x590
[  918.087902]  ? f2fs_link+0x2b0/0x2b0
[  918.088646]  ? legitimize_path.isra.29+0x61/0xa0
[  918.089589]  __lookup_slow+0x12e/0x240
[  918.090371]  ? may_delete+0x2b0/0x2b0
[  918.091123]  ? __nd_alloc_stack+0xa0/0xa0
[  918.091944]  lookup_slow+0x44/0x60
[  918.092642]  walk_component+0x3ee/0xa40
[  918.093428]  ? is_bpf_text_address+0xe/0x20
[  918.094283]  ? pick_link+0x3e0/0x3e0
[  918.095047]  ? in_group_p+0xa5/0xe0
[  918.095771]  ? generic_permission+0x53/0x1e0
[  918.096666]  ? security_inode_permission+0x1d/0x70
[  918.097646]  ? inode_permission+0x7a/0x1f0
[  918.098497]  link_path_walk+0x2a2/0x7b0
[  918.099298]  ? apparmor_capget+0x3d0/0x3d0
[  918.100140]  ? walk_component+0xa40/0xa40
[  918.100958]  ? path_init+0x2e6/0x580
[  918.101695]  path_openat+0x1bb/0x2160
[  918.102471]  ? __save_stack_trace+0x92/0x100
[  918.103352]  ? save_stack+0xb5/0xd0
[  918.104070]  ? vfs_unlink+0x250/0x250
[  918.104822]  ? save_stack+0x46/0xd0
[  918.105538]  ? kasan_slab_alloc+0x11/0x20
[  918.106370]  ? kmem_cache_alloc+0xd1/0x1e0
[  918.107213]  ? getname_flags+0x76/0x2c0
[  918.107997]  ? getname+0x12/0x20
[  918.108677]  ? do_sys_open+0x14b/0x2c0
[  918.109450]  ? __x64_sys_open+0x4c/0x60
[  918.110255]  ? do_syscall_64+0x78/0x170
[  918.111083]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  918.112148]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  918.113204]  ? f2fs_empty_inline_dir+0x1e0/0x1e0
[  918.114150]  ? timespec64_trunc+0x5c/0x90
[  918.114993]  ? wb_io_lists_depopulated+0x1a/0xc0
[  918.115937]  ? inode_io_list_move_locked+0x102/0x110
[  918.116949]  do_filp_open+0x12b/0x1d0
[  918.117709]  ? may_open_dev+0x50/0x50
[  918.118475]  ? kasan_kmalloc+0xad/0xe0
[  918.119246]  do_sys_open+0x17c/0x2c0
[  918.119983]  ? do_sys_open+0x17c/0x2c0
[  918.120751]  ? filp_open+0x60/0x60
[  918.121463]  ? task_work_run+0x4d/0xf0
[  918.122237]  __x64_sys_open+0x4c/0x60
[  918.123001]  do_syscall_64+0x78/0x170
[  918.123759]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  918.124802] RIP: 0033:0x7fac96e3e040
[  918.125537] Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 83 3d 09 27 2d 00 00 75 10 b8 02 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 7e e0 01 00 48 89 04 24
[  918.129341] RSP: 002b:00007fff1b37f848 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
[  918.130870] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fac96e3e040
[  918.132295] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000122d080
[  918.133748] RBP: 00007fff1b37f9b0 R08: 00007fac9710bbd8 R09: 0000000000000001
[  918.135209] R10: 000000000000069d R11: 0000000000000246 R12: 0000000000400c20
[  918.136650] R13: 00007fff1b37fab0 R14: 0000000000000000 R15: 0000000000000000
[  918.138093] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy
[  918.147924] CR2: ffffed0048000d82
[  918.148619] ---[ end trace 4ce02f25ff7d3df5 ]---
[  918.149563] RIP: 0010:check_memory_region+0x5e/0x190
[  918.150576] Code: f8 49 c1 e8 03 49 89 db 49 c1 eb 03 4d 01 cb 4d 01 c1 4d 8d 63 01 4c 89 c8 4d 89 e2 4d 29 ca 49 83 fa 10 7f 3d 4d 85 d2 74 32 <41> 80 39 00 75 23 48 b8 01 00 00 00 00 fc ff df 4d 01 d1 49 01 c0
[  918.154360] RSP: 0018:ffff8801e3a1f258 EFLAGS: 00010202
[  918.155411] RAX: ffffed0048000d82 RBX: ffff880240006c11 RCX: ffffffffb8867d14
[  918.156833] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff880240006c10
[  918.158257] RBP: ffff8801e3a1f268 R08: 1ffff10048000d82 R09: ffffed0048000d82
[  918.159722] R10: 0000000000000001 R11: ffffed0048000d82 R12: ffffed0048000d83
[  918.161149] R13: ffff8801e3a1f390 R14: 0000000000000000 R15: ffff880240006c08
[  918.162587] FS:  00007fac9732c700(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[  918.164203] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  918.165356] CR2: ffffed0048000d82 CR3: 00000001df77a000 CR4: 00000000000006f0

Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
nickdesaulniers pushed a commit that referenced this issue Aug 22, 2018
This patch fixs to do sanity check with user_block_count.

- Overview
Divide zero in utilization when mount() a corrupted f2fs image

- Reproduce (4.18 upstream kernel)

- Kernel message
[  564.099503] F2FS-fs (loop0): invalid crc value
[  564.101991] divide error: 0000 [#1] SMP KASAN PTI
[  564.103103] CPU: 1 PID: 1298 Comm: f2fs_discard-7: Not tainted 4.18.0-rc1+ #4
[  564.104584] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  564.106624] RIP: 0010:issue_discard_thread+0x248/0x5c0
[  564.107692] Code: ff ff 48 8b bd e8 fe ff ff 41 8b 9d 4c 04 00 00 e8 cd b8 ad ff 41 8b 85 50 04 00 00 31 d2 48 8d 04 80 48 8d 04 80 48 c1 e0 02 <48> f7 f3 83 f8 50 7e 16 41 c7 86 7c ff ff ff 01 00 00 00 41 c7 86
[  564.111686] RSP: 0018:ffff8801f3117dc0 EFLAGS: 00010206
[  564.112775] RAX: 0000000000000384 RBX: 0000000000000000 RCX: ffffffffb88c1e03
[  564.114250] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8801e3aa4850
[  564.115706] RBP: ffff8801f3117f00 R08: 1ffffffff751a1d0 R09: fffffbfff751a1d0
[  564.117177] R10: 0000000000000001 R11: fffffbfff751a1d0 R12: 00000000fffffffc
[  564.118634] R13: ffff8801e3aa4400 R14: ffff8801f3117ed8 R15: ffff8801e2050000
[  564.120094] FS:  0000000000000000(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000
[  564.121748] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  564.122923] CR2: 000000000202b078 CR3: 00000001f11ac000 CR4: 00000000000006e0
[  564.124383] Call Trace:
[  564.124924]  ? __issue_discard_cmd+0x480/0x480
[  564.125882]  ? __sched_text_start+0x8/0x8
[  564.126756]  ? __kthread_parkme+0xcb/0x100
[  564.127620]  ? kthread_blkcg+0x70/0x70
[  564.128412]  kthread+0x180/0x1d0
[  564.129105]  ? __issue_discard_cmd+0x480/0x480
[  564.130029]  ? kthread_associate_blkcg+0x150/0x150
[  564.131033]  ret_from_fork+0x35/0x40
[  564.131794] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy
[  564.141798] ---[ end trace 4ce02f25ff7d3df5 ]---
[  564.142773] RIP: 0010:issue_discard_thread+0x248/0x5c0
[  564.143885] Code: ff ff 48 8b bd e8 fe ff ff 41 8b 9d 4c 04 00 00 e8 cd b8 ad ff 41 8b 85 50 04 00 00 31 d2 48 8d 04 80 48 8d 04 80 48 c1 e0 02 <48> f7 f3 83 f8 50 7e 16 41 c7 86 7c ff ff ff 01 00 00 00 41 c7 86
[  564.147776] RSP: 0018:ffff8801f3117dc0 EFLAGS: 00010206
[  564.148856] RAX: 0000000000000384 RBX: 0000000000000000 RCX: ffffffffb88c1e03
[  564.150424] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8801e3aa4850
[  564.151906] RBP: ffff8801f3117f00 R08: 1ffffffff751a1d0 R09: fffffbfff751a1d0
[  564.153463] R10: 0000000000000001 R11: fffffbfff751a1d0 R12: 00000000fffffffc
[  564.154915] R13: ffff8801e3aa4400 R14: ffff8801f3117ed8 R15: ffff8801e2050000
[  564.156405] FS:  0000000000000000(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000
[  564.158070] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  564.159279] CR2: 000000000202b078 CR3: 00000001f11ac000 CR4: 00000000000006e0
[  564.161043] ==================================================================
[  564.162587] BUG: KASAN: stack-out-of-bounds in from_kuid_munged+0x1d/0x50
[  564.163994] Read of size 4 at addr ffff8801f3117c84 by task f2fs_discard-7:/1298

[  564.165852] CPU: 1 PID: 1298 Comm: f2fs_discard-7: Tainted: G      D           4.18.0-rc1+ #4
[  564.167593] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  564.169522] Call Trace:
[  564.170057]  dump_stack+0x7b/0xb5
[  564.170778]  print_address_description+0x70/0x290
[  564.171765]  kasan_report+0x291/0x390
[  564.172540]  ? from_kuid_munged+0x1d/0x50
[  564.173408]  __asan_load4+0x78/0x80
[  564.174148]  from_kuid_munged+0x1d/0x50
[  564.174962]  do_notify_parent+0x1f5/0x4f0
[  564.175808]  ? send_sigqueue+0x390/0x390
[  564.176639]  ? css_set_move_task+0x152/0x340
[  564.184197]  do_exit+0x1290/0x1390
[  564.184950]  ? __issue_discard_cmd+0x480/0x480
[  564.185884]  ? mm_update_next_owner+0x380/0x380
[  564.186829]  ? __sched_text_start+0x8/0x8
[  564.187672]  ? __kthread_parkme+0xcb/0x100
[  564.188528]  ? kthread_blkcg+0x70/0x70
[  564.189333]  ? kthread+0x180/0x1d0
[  564.190052]  ? __issue_discard_cmd+0x480/0x480
[  564.190983]  rewind_stack_do_exit+0x17/0x20

[  564.192190] The buggy address belongs to the page:
[  564.193213] page:ffffea0007cc45c0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[  564.194856] flags: 0x2ffff0000000000()
[  564.195644] raw: 02ffff0000000000 0000000000000000 dead000000000200 0000000000000000
[  564.197247] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[  564.198826] page dumped because: kasan: bad access detected

[  564.200299] Memory state around the buggy address:
[  564.201306]  ffff8801f3117b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  564.202779]  ffff8801f3117c00: 00 00 00 00 00 00 00 00 00 00 00 f3 f3 f3 f3 f3
[  564.204252] >ffff8801f3117c80: f3 f3 f3 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
[  564.205742]                    ^
[  564.206424]  ffff8801f3117d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  564.207908]  ffff8801f3117d80: f3 f3 f3 f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00
[  564.209389] ==================================================================
[  564.231795] F2FS-fs (loop0): Mounted with checkpoint version = 2

- Location
https://elixir.bootlin.com/linux/v4.18-rc1/source/fs/f2fs/segment.h#L586
	return div_u64((u64)valid_user_blocks(sbi) * 100,
					sbi->user_block_count);
Missing checks on sbi->user_block_count.

Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
nickdesaulniers pushed a commit that referenced this issue Aug 22, 2018
This patch adds to do sanity check with below fields of inode to
avoid reported panic.
- node footer
- iblocks

https://bugzilla.kernel.org/show_bug.cgi?id=200223

- Overview
BUG() triggered in f2fs_truncate_inode_blocks() when un-mounting a mounted f2fs image after writing to it

- Reproduce

- POC (poc.c)

static void activity(char *mpoint) {

  char *foo_bar_baz;
  int err;

  static int buf[8192];
  memset(buf, 0, sizeof(buf));

  err = asprintf(&foo_bar_baz, "%s/foo/bar/baz", mpoint);

  // open / write / read
  int fd = open(foo_bar_baz, O_RDWR | O_TRUNC, 0777);
  if (fd >= 0) {
    write(fd, (char *)buf, 517);
    write(fd, (char *)buf, sizeof(buf));
    close(fd);
  }

}

int main(int argc, char *argv[]) {
  activity(argv[1]);
  return 0;
}

- Kernel meesage
[  552.479723] F2FS-fs (loop0): Mounted with checkpoint version = 2
[  556.451891] ------------[ cut here ]------------
[  556.451899] kernel BUG at fs/f2fs/node.c:987!
[  556.452920] invalid opcode: 0000 [#1] SMP KASAN PTI
[  556.453936] CPU: 1 PID: 1310 Comm: umount Not tainted 4.18.0-rc1+ #4
[  556.455213] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  556.457140] RIP: 0010:f2fs_truncate_inode_blocks+0x4a7/0x6f0
[  556.458280] Code: e8 ae ea ff ff 41 89 c7 c1 e8 1f 84 c0 74 0a 41 83 ff fe 0f 85 35 ff ff ff 81 85 b0 fe ff ff fb 03 00 00 e9 f7 fd ff ff 0f 0b <0f> 0b e8 62 b7 9a 00 48 8b bd a0 fe ff ff e8 56 54 ae ff 48 8b b5
[  556.462015] RSP: 0018:ffff8801f292f808 EFLAGS: 00010286
[  556.463068] RAX: ffffed003e73242d RBX: ffff8801f292f958 RCX: ffffffffb88b81bc
[  556.464479] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff8801f3992164
[  556.465901] RBP: ffff8801f292f980 R08: ffffed003e73242d R09: ffffed003e73242d
[  556.467311] R10: 0000000000000001 R11: ffffed003e73242c R12: 00000000fffffc64
[  556.468706] R13: ffff8801f3992000 R14: 0000000000000058 R15: 00000000ffff8801
[  556.470117] FS:  00007f8029297840(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000
[  556.471702] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  556.472838] CR2: 000055f5f57305d8 CR3: 00000001f18b0000 CR4: 00000000000006e0
[  556.474265] Call Trace:
[  556.474782]  ? f2fs_alloc_nid_failed+0xf0/0xf0
[  556.475686]  ? truncate_nodes+0x980/0x980
[  556.476516]  ? pagecache_get_page+0x21f/0x2f0
[  556.477412]  ? __asan_loadN+0xf/0x20
[  556.478153]  ? __get_node_page+0x331/0x5b0
[  556.478992]  ? reweight_entity+0x1e6/0x3b0
[  556.479826]  f2fs_truncate_blocks+0x55e/0x740
[  556.480709]  ? f2fs_truncate_data_blocks+0x20/0x20
[  556.481689]  ? __radix_tree_lookup+0x34/0x160
[  556.482630]  ? radix_tree_lookup+0xd/0x10
[  556.483445]  f2fs_truncate+0xd4/0x1a0
[  556.484206]  f2fs_evict_inode+0x5ce/0x630
[  556.485032]  evict+0x16f/0x290
[  556.485664]  iput+0x280/0x300
[  556.486300]  dentry_unlink_inode+0x165/0x1e0
[  556.487169]  __dentry_kill+0x16a/0x260
[  556.487936]  dentry_kill+0x70/0x250
[  556.488651]  shrink_dentry_list+0x125/0x260
[  556.489504]  shrink_dcache_parent+0xc1/0x110
[  556.490379]  ? shrink_dcache_sb+0x200/0x200
[  556.491231]  ? bit_wait_timeout+0xc0/0xc0
[  556.492047]  do_one_tree+0x12/0x40
[  556.492743]  shrink_dcache_for_umount+0x3f/0xa0
[  556.493656]  generic_shutdown_super+0x43/0x1c0
[  556.494561]  kill_block_super+0x52/0x80
[  556.495341]  kill_f2fs_super+0x62/0x70
[  556.496105]  deactivate_locked_super+0x6f/0xa0
[  556.497004]  deactivate_super+0x5e/0x80
[  556.497785]  cleanup_mnt+0x61/0xa0
[  556.498492]  __cleanup_mnt+0x12/0x20
[  556.499218]  task_work_run+0xc8/0xf0
[  556.499949]  exit_to_usermode_loop+0x125/0x130
[  556.500846]  do_syscall_64+0x138/0x170
[  556.501609]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  556.502659] RIP: 0033:0x7f8028b77487
[  556.503384] Code: 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 c9 2b 00 f7 d8 64 89 01 48
[  556.507137] RSP: 002b:00007fff9f2e3598 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[  556.508637] RAX: 0000000000000000 RBX: 0000000000ebd030 RCX: 00007f8028b77487
[  556.510069] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000ec41e0
[  556.511481] RBP: 0000000000ec41e0 R08: 0000000000000000 R09: 0000000000000014
[  556.512892] R10: 00000000000006b2 R11: 0000000000000246 R12: 00007f802908083c
[  556.514320] R13: 0000000000000000 R14: 0000000000ebd210 R15: 00007fff9f2e3820
[  556.515745] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy
[  556.529276] ---[ end trace 4ce02f25ff7d3df5 ]---
[  556.530340] RIP: 0010:f2fs_truncate_inode_blocks+0x4a7/0x6f0
[  556.531513] Code: e8 ae ea ff ff 41 89 c7 c1 e8 1f 84 c0 74 0a 41 83 ff fe 0f 85 35 ff ff ff 81 85 b0 fe ff ff fb 03 00 00 e9 f7 fd ff ff 0f 0b <0f> 0b e8 62 b7 9a 00 48 8b bd a0 fe ff ff e8 56 54 ae ff 48 8b b5
[  556.535330] RSP: 0018:ffff8801f292f808 EFLAGS: 00010286
[  556.536395] RAX: ffffed003e73242d RBX: ffff8801f292f958 RCX: ffffffffb88b81bc
[  556.537824] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff8801f3992164
[  556.539290] RBP: ffff8801f292f980 R08: ffffed003e73242d R09: ffffed003e73242d
[  556.540709] R10: 0000000000000001 R11: ffffed003e73242c R12: 00000000fffffc64
[  556.542131] R13: ffff8801f3992000 R14: 0000000000000058 R15: 00000000ffff8801
[  556.543579] FS:  00007f8029297840(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000
[  556.545180] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  556.546338] CR2: 000055f5f57305d8 CR3: 00000001f18b0000 CR4: 00000000000006e0
[  556.547809] ==================================================================
[  556.549248] BUG: KASAN: stack-out-of-bounds in arch_tlb_gather_mmu+0x52/0x170
[  556.550672] Write of size 8 at addr ffff8801f292fd10 by task umount/1310

[  556.552338] CPU: 1 PID: 1310 Comm: umount Tainted: G      D           4.18.0-rc1+ #4
[  556.553886] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  556.555756] Call Trace:
[  556.556264]  dump_stack+0x7b/0xb5
[  556.556944]  print_address_description+0x70/0x290
[  556.557903]  kasan_report+0x291/0x390
[  556.558649]  ? arch_tlb_gather_mmu+0x52/0x170
[  556.559537]  __asan_store8+0x57/0x90
[  556.560268]  arch_tlb_gather_mmu+0x52/0x170
[  556.561110]  tlb_gather_mmu+0x12/0x40
[  556.561862]  exit_mmap+0x123/0x2a0
[  556.562555]  ? __ia32_sys_munmap+0x50/0x50
[  556.563384]  ? exit_aio+0x98/0x230
[  556.564079]  ? __x32_compat_sys_io_submit+0x260/0x260
[  556.565099]  ? taskstats_exit+0x1f4/0x640
[  556.565925]  ? kasan_check_read+0x11/0x20
[  556.566739]  ? mm_update_next_owner+0x322/0x380
[  556.567652]  mmput+0x8b/0x1d0
[  556.568260]  do_exit+0x43a/0x1390
[  556.568937]  ? mm_update_next_owner+0x380/0x380
[  556.569855]  ? deactivate_super+0x5e/0x80
[  556.570668]  ? cleanup_mnt+0x61/0xa0
[  556.571395]  ? __cleanup_mnt+0x12/0x20
[  556.572156]  ? task_work_run+0xc8/0xf0
[  556.572917]  ? exit_to_usermode_loop+0x125/0x130
[  556.573861]  rewind_stack_do_exit+0x17/0x20
[  556.574707] RIP: 0033:0x7f8028b77487
[  556.575428] Code: Bad RIP value.
[  556.576106] RSP: 002b:00007fff9f2e3598 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[  556.577599] RAX: 0000000000000000 RBX: 0000000000ebd030 RCX: 00007f8028b77487
[  556.579020] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000ec41e0
[  556.580422] RBP: 0000000000ec41e0 R08: 0000000000000000 R09: 0000000000000014
[  556.581833] R10: 00000000000006b2 R11: 0000000000000246 R12: 00007f802908083c
[  556.583252] R13: 0000000000000000 R14: 0000000000ebd210 R15: 00007fff9f2e3820

[  556.584983] The buggy address belongs to the page:
[  556.585961] page:ffffea0007ca4bc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[  556.587540] flags: 0x2ffff0000000000()
[  556.588296] raw: 02ffff0000000000 0000000000000000 dead000000000200 0000000000000000
[  556.589822] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[  556.591359] page dumped because: kasan: bad access detected

[  556.592786] Memory state around the buggy address:
[  556.593753]  ffff8801f292fc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  556.595191]  ffff8801f292fc80: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
[  556.596613] >ffff8801f292fd00: 00 00 f3 00 00 00 00 f3 f3 00 00 00 00 f4 f4 f4
[  556.598044]                          ^
[  556.598797]  ffff8801f292fd80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
[  556.600225]  ffff8801f292fe00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 f4 f4 f4
[  556.601647] ==================================================================

- Location
https://elixir.bootlin.com/linux/v4.18-rc1/source/fs/f2fs/node.c#L987
		case NODE_DIND_BLOCK:
			err = truncate_nodes(&dn, nofs, offset[1], 3);
			cont = 0;
			break;

		default:
			BUG(); <---
		}

Reported-by Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
nickdesaulniers pushed a commit that referenced this issue Aug 22, 2018
As Wen Xu reported in bugzilla, after image was injected with random data
by fuzzing, inline inode would contain invalid reserved blkaddr, then
during inline conversion, we will encounter illegal memory accessing
reported by KASAN, the root cause of this is when writing out converted
inline page, we will use invalid reserved blkaddr to update sit bitmap,
result in accessing memory beyond sit bitmap boundary.

In order to fix this issue, let's do sanity check with reserved block
address of inline inode to avoid above condition.

https://bugzilla.kernel.org/show_bug.cgi?id=200179

[ 1428.846352] BUG: KASAN: use-after-free in update_sit_entry+0x80/0x7f0
[ 1428.846618] Read of size 4 at addr ffff880194483540 by task a.out/2741

[ 1428.846855] CPU: 0 PID: 2741 Comm: a.out Tainted: G        W         4.17.0+ #1
[ 1428.846858] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 1428.846860] Call Trace:
[ 1428.846868]  dump_stack+0x71/0xab
[ 1428.846875]  print_address_description+0x6b/0x290
[ 1428.846881]  kasan_report+0x28e/0x390
[ 1428.846888]  ? update_sit_entry+0x80/0x7f0
[ 1428.846898]  update_sit_entry+0x80/0x7f0
[ 1428.846906]  f2fs_allocate_data_block+0x6db/0xc70
[ 1428.846914]  ? f2fs_get_node_info+0x14f/0x590
[ 1428.846920]  do_write_page+0xc8/0x150
[ 1428.846928]  f2fs_outplace_write_data+0xfe/0x210
[ 1428.846935]  ? f2fs_do_write_node_page+0x170/0x170
[ 1428.846941]  ? radix_tree_tag_clear+0xff/0x130
[ 1428.846946]  ? __mod_node_page_state+0x22/0xa0
[ 1428.846951]  ? inc_zone_page_state+0x54/0x100
[ 1428.846956]  ? __test_set_page_writeback+0x336/0x5d0
[ 1428.846964]  f2fs_convert_inline_page+0x407/0x6d0
[ 1428.846971]  ? f2fs_read_inline_data+0x3b0/0x3b0
[ 1428.846978]  ? __get_node_page+0x335/0x6b0
[ 1428.846987]  f2fs_convert_inline_inode+0x41b/0x500
[ 1428.846994]  ? f2fs_convert_inline_page+0x6d0/0x6d0
[ 1428.847000]  ? kasan_unpoison_shadow+0x31/0x40
[ 1428.847005]  ? kasan_kmalloc+0xa6/0xd0
[ 1428.847024]  f2fs_file_mmap+0x79/0xc0
[ 1428.847029]  mmap_region+0x58b/0x880
[ 1428.847037]  ? arch_get_unmapped_area+0x370/0x370
[ 1428.847042]  do_mmap+0x55b/0x7a0
[ 1428.847048]  vm_mmap_pgoff+0x16f/0x1c0
[ 1428.847055]  ? vma_is_stack_for_current+0x50/0x50
[ 1428.847062]  ? __fsnotify_update_child_dentry_flags.part.1+0x160/0x160
[ 1428.847068]  ? do_sys_open+0x206/0x2a0
[ 1428.847073]  ? __fget+0xb4/0x100
[ 1428.847079]  ksys_mmap_pgoff+0x278/0x360
[ 1428.847085]  ? find_mergeable_anon_vma+0x50/0x50
[ 1428.847091]  do_syscall_64+0x73/0x160
[ 1428.847098]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1428.847102] RIP: 0033:0x7fb1430766ba
[ 1428.847103] Code: 89 f5 41 54 49 89 fc 55 53 74 35 49 63 e8 48 63 da 4d 89 f9 49 89 e8 4d 63 d6 48 89 da 4c 89 ee 4c 89 e7 b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 56 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f 1f 00
[ 1428.847162] RSP: 002b:00007ffc651d9388 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
[ 1428.847167] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fb1430766ba
[ 1428.847170] RDX: 0000000000000001 RSI: 0000000000001000 RDI: 0000000000000000
[ 1428.847173] RBP: 0000000000000003 R08: 0000000000000003 R09: 0000000000000000
[ 1428.847176] R10: 0000000000008002 R11: 0000000000000246 R12: 0000000000000000
[ 1428.847179] R13: 0000000000001000 R14: 0000000000008002 R15: 0000000000000000

[ 1428.847252] Allocated by task 2683:
[ 1428.847372]  kasan_kmalloc+0xa6/0xd0
[ 1428.847380]  kmem_cache_alloc+0xc8/0x1e0
[ 1428.847385]  getname_flags+0x73/0x2b0
[ 1428.847390]  user_path_at_empty+0x1d/0x40
[ 1428.847395]  vfs_statx+0xc1/0x150
[ 1428.847401]  __do_sys_newlstat+0x7e/0xd0
[ 1428.847405]  do_syscall_64+0x73/0x160
[ 1428.847411]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[ 1428.847466] Freed by task 2683:
[ 1428.847566]  __kasan_slab_free+0x137/0x190
[ 1428.847571]  kmem_cache_free+0x85/0x1e0
[ 1428.847575]  filename_lookup+0x191/0x280
[ 1428.847580]  vfs_statx+0xc1/0x150
[ 1428.847585]  __do_sys_newlstat+0x7e/0xd0
[ 1428.847590]  do_syscall_64+0x73/0x160
[ 1428.847596]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[ 1428.847648] The buggy address belongs to the object at ffff880194483300
                which belongs to the cache names_cache of size 4096
[ 1428.847946] The buggy address is located 576 bytes inside of
                4096-byte region [ffff880194483300, ffff880194484300)
[ 1428.848234] The buggy address belongs to the page:
[ 1428.848366] page:ffffea0006512000 count:1 mapcount:0 mapping:ffff8801f3586380 index:0x0 compound_mapcount: 0
[ 1428.848606] flags: 0x17fff8000008100(slab|head)
[ 1428.848737] raw: 017fff8000008100 dead000000000100 dead000000000200 ffff8801f3586380
[ 1428.848931] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
[ 1428.849122] page dumped because: kasan: bad access detected

[ 1428.849305] Memory state around the buggy address:
[ 1428.849436]  ffff880194483400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1428.849620]  ffff880194483480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1428.849804] >ffff880194483500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1428.849985]                                            ^
[ 1428.850120]  ffff880194483580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1428.850303]  ffff880194483600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1428.850498] ==================================================================

Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
nickdesaulniers pushed a commit that referenced this issue Aug 22, 2018
If segment type in SSA and SIT is inconsistent, we will encounter below
BUG_ON during GC, to avoid this panic, let's just skip doing GC on such
segment.

The bug is triggered with image reported in below link:

https://bugzilla.kernel.org/show_bug.cgi?id=200223

[  388.060262] ------------[ cut here ]------------
[  388.060268] kernel BUG at /home/y00370721/git/devf2fs/gc.c:989!
[  388.061172] invalid opcode: 0000 [#1] SMP
[  388.061773] Modules linked in: f2fs(O) bluetooth ecdh_generic xt_tcpudp iptable_filter ip_tables x_tables lp ttm drm_kms_helper drm intel_rapl sb_edac crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel fb_sys_fops ppdev aes_x86_64 syscopyarea crypto_simd sysfillrect parport_pc joydev sysimgblt glue_helper parport cryptd i2c_piix4 serio_raw mac_hid btrfs hid_generic usbhid hid raid6_pq psmouse pata_acpi floppy
[  388.064247] CPU: 7 PID: 4151 Comm: f2fs_gc-7:0 Tainted: G           O    4.13.0-rc1+ #26
[  388.065306] Hardware name: Xen HVM domU, BIOS 4.1.2_115-900.260_ 11/06/2015
[  388.066058] task: ffff880201583b80 task.stack: ffffc90004d7c000
[  388.069948] RIP: 0010:do_garbage_collect+0xcc8/0xcd0 [f2fs]
[  388.070766] RSP: 0018:ffffc90004d7fc68 EFLAGS: 00010202
[  388.071783] RAX: ffff8801ed227000 RBX: 0000000000000001 RCX: ffffea0007b489c0
[  388.072700] RDX: ffff880000000000 RSI: 0000000000000001 RDI: ffffea0007b489c0
[  388.073607] RBP: ffffc90004d7fd58 R08: 0000000000000003 R09: ffffea0007b489dc
[  388.074619] R10: 0000000000000000 R11: 0052782ab317138d R12: 0000000000000018
[  388.075625] R13: 0000000000000018 R14: ffff880211ceb000 R15: ffff880211ceb000
[  388.076687] FS:  0000000000000000(0000) GS:ffff880214fc0000(0000) knlGS:0000000000000000
[  388.083277] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  388.084536] CR2: 0000000000e18c60 CR3: 00000001ecf2e000 CR4: 00000000001406e0
[  388.085748] Call Trace:
[  388.086690]  ? find_next_bit+0xb/0x10
[  388.088091]  f2fs_gc+0x1a8/0x9d0 [f2fs]
[  388.088888]  ? lock_timer_base+0x7d/0xa0
[  388.090213]  ? try_to_del_timer_sync+0x44/0x60
[  388.091698]  gc_thread_func+0x342/0x4b0 [f2fs]
[  388.092892]  ? wait_woken+0x80/0x80
[  388.094098]  kthread+0x109/0x140
[  388.095010]  ? f2fs_gc+0x9d0/0x9d0 [f2fs]
[  388.096043]  ? kthread_park+0x60/0x60
[  388.097281]  ret_from_fork+0x25/0x30
[  388.098401] Code: ff ff 48 83 e8 01 48 89 44 24 58 e9 27 f8 ff ff 48 83 e8 01 e9 78 fc ff ff 48 8d 78 ff e9 17 fb ff ff 48 83 ef 01 e9 4d f4 ff ff <0f> 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55
[  388.100864] RIP: do_garbage_collect+0xcc8/0xcd0 [f2fs] RSP: ffffc90004d7fc68
[  388.101810] ---[ end trace 81c73d6e6b7da61d ]---

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
nickdesaulniers pushed a commit that referenced this issue Aug 22, 2018
This patch add to do sanity check with below field:
- cp_pack_total_block_count
- blkaddr of data/node
- extent info

- Overview
BUG() in verify_block_addr() when writing to a corrupted f2fs image

- Reproduce (4.18 upstream kernel)

- POC (poc.c)

static void activity(char *mpoint) {

  char *foo_bar_baz;
  int err;

  static int buf[8192];
  memset(buf, 0, sizeof(buf));

  err = asprintf(&foo_bar_baz, "%s/foo/bar/baz", mpoint);

  int fd = open(foo_bar_baz, O_RDWR | O_TRUNC, 0777);
  if (fd >= 0) {
    write(fd, (char *)buf, sizeof(buf));
    fdatasync(fd);
    close(fd);
  }
}

int main(int argc, char *argv[]) {
  activity(argv[1]);
  return 0;
}

- Kernel message
[  689.349473] F2FS-fs (loop0): Mounted with checkpoint version = 3
[  699.728662] WARNING: CPU: 0 PID: 1309 at fs/f2fs/segment.c:2860 f2fs_inplace_write_data+0x232/0x240
[  699.728670] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy
[  699.729056] CPU: 0 PID: 1309 Comm: a.out Not tainted 4.18.0-rc1+ #4
[  699.729064] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  699.729074] RIP: 0010:f2fs_inplace_write_data+0x232/0x240
[  699.729076] Code: ff e9 cf fe ff ff 49 8d 7d 10 e8 39 45 ad ff 4d 8b 7d 10 be 04 00 00 00 49 8d 7f 48 e8 07 49 ad ff 45 8b 7f 48 e9 fb fe ff ff <0f> 0b f0 41 80 4d 48 04 e9 65 fe ff ff 90 66 66 66 66 90 55 48 8d
[  699.729130] RSP: 0018:ffff8801f43af568 EFLAGS: 00010202
[  699.729139] RAX: 000000000000003f RBX: ffff8801f43af7b8 RCX: ffffffffb88c9113
[  699.729142] RDX: 0000000000000003 RSI: dffffc0000000000 RDI: ffff8802024e5540
[  699.729144] RBP: ffff8801f43af590 R08: 0000000000000009 R09: ffffffffffffffe8
[  699.729147] R10: 0000000000000001 R11: ffffed0039b0596a R12: ffff8802024e5540
[  699.729149] R13: ffff8801f0335500 R14: ffff8801e3e7a700 R15: ffff8801e1ee4450
[  699.729154] FS:  00007f9bf97f5700(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[  699.729156] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  699.729159] CR2: 00007f9bf925d170 CR3: 00000001f0c34000 CR4: 00000000000006f0
[  699.729171] Call Trace:
[  699.729192]  f2fs_do_write_data_page+0x2e2/0xe00
[  699.729203]  ? f2fs_should_update_outplace+0xd0/0xd0
[  699.729238]  ? memcg_drain_all_list_lrus+0x280/0x280
[  699.729269]  ? __radix_tree_replace+0xa3/0x120
[  699.729276]  __write_data_page+0x5c7/0xe30
[  699.729291]  ? kasan_check_read+0x11/0x20
[  699.729310]  ? page_mapped+0x8a/0x110
[  699.729321]  ? page_mkclean+0xe9/0x160
[  699.729327]  ? f2fs_do_write_data_page+0xe00/0xe00
[  699.729331]  ? invalid_page_referenced_vma+0x130/0x130
[  699.729345]  ? clear_page_dirty_for_io+0x332/0x450
[  699.729351]  f2fs_write_cache_pages+0x4ca/0x860
[  699.729358]  ? __write_data_page+0xe30/0xe30
[  699.729374]  ? percpu_counter_add_batch+0x22/0xa0
[  699.729380]  ? kasan_check_write+0x14/0x20
[  699.729391]  ? _raw_spin_lock+0x17/0x40
[  699.729403]  ? f2fs_mark_inode_dirty_sync.part.18+0x16/0x30
[  699.729413]  ? iov_iter_advance+0x113/0x640
[  699.729418]  ? f2fs_write_end+0x133/0x2e0
[  699.729423]  ? balance_dirty_pages_ratelimited+0x239/0x640
[  699.729428]  f2fs_write_data_pages+0x329/0x520
[  699.729433]  ? generic_perform_write+0x250/0x320
[  699.729438]  ? f2fs_write_cache_pages+0x860/0x860
[  699.729454]  ? current_time+0x110/0x110
[  699.729459]  ? f2fs_preallocate_blocks+0x1ef/0x370
[  699.729464]  do_writepages+0x37/0xb0
[  699.729468]  ? f2fs_write_cache_pages+0x860/0x860
[  699.729472]  ? do_writepages+0x37/0xb0
[  699.729478]  __filemap_fdatawrite_range+0x19a/0x1f0
[  699.729483]  ? delete_from_page_cache_batch+0x4e0/0x4e0
[  699.729496]  ? __vfs_write+0x2b2/0x410
[  699.729501]  file_write_and_wait_range+0x66/0xb0
[  699.729506]  f2fs_do_sync_file+0x1f9/0xd90
[  699.729511]  ? truncate_partial_data_page+0x290/0x290
[  699.729521]  ? __sb_end_write+0x30/0x50
[  699.729526]  ? vfs_write+0x20f/0x260
[  699.729530]  f2fs_sync_file+0x9a/0xb0
[  699.729534]  ? f2fs_do_sync_file+0xd90/0xd90
[  699.729548]  vfs_fsync_range+0x68/0x100
[  699.729554]  ? __fget_light+0xc9/0xe0
[  699.729558]  do_fsync+0x3d/0x70
[  699.729562]  __x64_sys_fdatasync+0x24/0x30
[  699.729585]  do_syscall_64+0x78/0x170
[  699.729595]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  699.729613] RIP: 0033:0x7f9bf930d800
[  699.729615] Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 83 3d 49 bf 2c 00 00 75 10 b8 4b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 be 78 01 00 48 89 04 24
[  699.729668] RSP: 002b:00007ffee3606c68 EFLAGS: 00000246 ORIG_RAX: 000000000000004b
[  699.729673] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9bf930d800
[  699.729675] RDX: 0000000000008000 RSI: 00000000006010a0 RDI: 0000000000000003
[  699.729678] RBP: 00007ffee3606ca0 R08: 0000000001503010 R09: 0000000000000000
[  699.729680] R10: 00000000000002e8 R11: 0000000000000246 R12: 0000000000400610
[  699.729683] R13: 00007ffee3606da0 R14: 0000000000000000 R15: 0000000000000000
[  699.729687] ---[ end trace 4ce02f25ff7d3df5 ]---
[  699.729782] ------------[ cut here ]------------
[  699.729785] kernel BUG at fs/f2fs/segment.h:654!
[  699.731055] invalid opcode: 0000 [#1] SMP KASAN PTI
[  699.732104] CPU: 0 PID: 1309 Comm: a.out Tainted: G        W         4.18.0-rc1+ #4
[  699.733684] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  699.735611] RIP: 0010:f2fs_submit_page_bio+0x29b/0x730
[  699.736649] Code: 54 49 8d bd 18 04 00 00 e8 b2 59 af ff 41 8b 8d 18 04 00 00 8b 45 b8 41 d3 e6 44 01 f0 4c 8d 73 14 41 39 c7 0f 82 37 fe ff ff <0f> 0b 65 8b 05 2c 04 77 47 89 c0 48 0f a3 05 52 c1 d5 01 0f 92 c0
[  699.740524] RSP: 0018:ffff8801f43af508 EFLAGS: 00010283
[  699.741573] RAX: 0000000000000000 RBX: ffff8801f43af7b8 RCX: ffffffffb88a7cef
[  699.743006] RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff8801e3e7a64c
[  699.744426] RBP: ffff8801f43af558 R08: ffffed003e066b55 R09: ffffed003e066b55
[  699.745833] R10: 0000000000000001 R11: ffffed003e066b54 R12: ffffea0007876940
[  699.747256] R13: ffff8801f0335500 R14: ffff8801e3e7a600 R15: 0000000000000001
[  699.748683] FS:  00007f9bf97f5700(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[  699.750293] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  699.751462] CR2: 00007f9bf925d170 CR3: 00000001f0c34000 CR4: 00000000000006f0
[  699.752874] Call Trace:
[  699.753386]  ? f2fs_inplace_write_data+0x93/0x240
[  699.754341]  f2fs_inplace_write_data+0xd2/0x240
[  699.755271]  f2fs_do_write_data_page+0x2e2/0xe00
[  699.756214]  ? f2fs_should_update_outplace+0xd0/0xd0
[  699.757215]  ? memcg_drain_all_list_lrus+0x280/0x280
[  699.758209]  ? __radix_tree_replace+0xa3/0x120
[  699.759164]  __write_data_page+0x5c7/0xe30
[  699.760002]  ? kasan_check_read+0x11/0x20
[  699.760823]  ? page_mapped+0x8a/0x110
[  699.761573]  ? page_mkclean+0xe9/0x160
[  699.762345]  ? f2fs_do_write_data_page+0xe00/0xe00
[  699.763332]  ? invalid_page_referenced_vma+0x130/0x130
[  699.764374]  ? clear_page_dirty_for_io+0x332/0x450
[  699.765347]  f2fs_write_cache_pages+0x4ca/0x860
[  699.766276]  ? __write_data_page+0xe30/0xe30
[  699.767161]  ? percpu_counter_add_batch+0x22/0xa0
[  699.768112]  ? kasan_check_write+0x14/0x20
[  699.768951]  ? _raw_spin_lock+0x17/0x40
[  699.769739]  ? f2fs_mark_inode_dirty_sync.part.18+0x16/0x30
[  699.770885]  ? iov_iter_advance+0x113/0x640
[  699.771743]  ? f2fs_write_end+0x133/0x2e0
[  699.772569]  ? balance_dirty_pages_ratelimited+0x239/0x640
[  699.773680]  f2fs_write_data_pages+0x329/0x520
[  699.774603]  ? generic_perform_write+0x250/0x320
[  699.775544]  ? f2fs_write_cache_pages+0x860/0x860
[  699.776510]  ? current_time+0x110/0x110
[  699.777299]  ? f2fs_preallocate_blocks+0x1ef/0x370
[  699.778279]  do_writepages+0x37/0xb0
[  699.779026]  ? f2fs_write_cache_pages+0x860/0x860
[  699.779978]  ? do_writepages+0x37/0xb0
[  699.780755]  __filemap_fdatawrite_range+0x19a/0x1f0
[  699.781746]  ? delete_from_page_cache_batch+0x4e0/0x4e0
[  699.782820]  ? __vfs_write+0x2b2/0x410
[  699.783597]  file_write_and_wait_range+0x66/0xb0
[  699.784540]  f2fs_do_sync_file+0x1f9/0xd90
[  699.785381]  ? truncate_partial_data_page+0x290/0x290
[  699.786415]  ? __sb_end_write+0x30/0x50
[  699.787204]  ? vfs_write+0x20f/0x260
[  699.787941]  f2fs_sync_file+0x9a/0xb0
[  699.788694]  ? f2fs_do_sync_file+0xd90/0xd90
[  699.789572]  vfs_fsync_range+0x68/0x100
[  699.790360]  ? __fget_light+0xc9/0xe0
[  699.791128]  do_fsync+0x3d/0x70
[  699.791779]  __x64_sys_fdatasync+0x24/0x30
[  699.792614]  do_syscall_64+0x78/0x170
[  699.793371]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  699.794406] RIP: 0033:0x7f9bf930d800
[  699.795134] Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 83 3d 49 bf 2c 00 00 75 10 b8 4b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 be 78 01 00 48 89 04 24
[  699.798960] RSP: 002b:00007ffee3606c68 EFLAGS: 00000246 ORIG_RAX: 000000000000004b
[  699.800483] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9bf930d800
[  699.801923] RDX: 0000000000008000 RSI: 00000000006010a0 RDI: 0000000000000003
[  699.803373] RBP: 00007ffee3606ca0 R08: 0000000001503010 R09: 0000000000000000
[  699.804798] R10: 00000000000002e8 R11: 0000000000000246 R12: 0000000000400610
[  699.806233] R13: 00007ffee3606da0 R14: 0000000000000000 R15: 0000000000000000
[  699.807667] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy
[  699.817079] ---[ end trace 4ce02f25ff7d3df6 ]---
[  699.818068] RIP: 0010:f2fs_submit_page_bio+0x29b/0x730
[  699.819114] Code: 54 49 8d bd 18 04 00 00 e8 b2 59 af ff 41 8b 8d 18 04 00 00 8b 45 b8 41 d3 e6 44 01 f0 4c 8d 73 14 41 39 c7 0f 82 37 fe ff ff <0f> 0b 65 8b 05 2c 04 77 47 89 c0 48 0f a3 05 52 c1 d5 01 0f 92 c0
[  699.822919] RSP: 0018:ffff8801f43af508 EFLAGS: 00010283
[  699.823977] RAX: 0000000000000000 RBX: ffff8801f43af7b8 RCX: ffffffffb88a7cef
[  699.825436] RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff8801e3e7a64c
[  699.826881] RBP: ffff8801f43af558 R08: ffffed003e066b55 R09: ffffed003e066b55
[  699.828292] R10: 0000000000000001 R11: ffffed003e066b54 R12: ffffea0007876940
[  699.829750] R13: ffff8801f0335500 R14: ffff8801e3e7a600 R15: 0000000000000001
[  699.831192] FS:  00007f9bf97f5700(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[  699.832793] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  699.833981] CR2: 00007f9bf925d170 CR3: 00000001f0c34000 CR4: 00000000000006f0
[  699.835556] ==================================================================
[  699.837029] BUG: KASAN: stack-out-of-bounds in update_stack_state+0x38c/0x3e0
[  699.838462] Read of size 8 at addr ffff8801f43af970 by task a.out/1309

[  699.840086] CPU: 0 PID: 1309 Comm: a.out Tainted: G      D W         4.18.0-rc1+ #4
[  699.841603] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  699.843475] Call Trace:
[  699.843982]  dump_stack+0x7b/0xb5
[  699.844661]  print_address_description+0x70/0x290
[  699.845607]  kasan_report+0x291/0x390
[  699.846351]  ? update_stack_state+0x38c/0x3e0
[  699.853831]  __asan_load8+0x54/0x90
[  699.854569]  update_stack_state+0x38c/0x3e0
[  699.855428]  ? __read_once_size_nocheck.constprop.7+0x20/0x20
[  699.856601]  ? __save_stack_trace+0x5e/0x100
[  699.857476]  unwind_next_frame.part.5+0x18e/0x490
[  699.858448]  ? unwind_dump+0x290/0x290
[  699.859217]  ? clear_page_dirty_for_io+0x332/0x450
[  699.860185]  __unwind_start+0x106/0x190
[  699.860974]  __save_stack_trace+0x5e/0x100
[  699.861808]  ? __save_stack_trace+0x5e/0x100
[  699.862691]  ? unlink_anon_vmas+0xba/0x2c0
[  699.863525]  save_stack_trace+0x1f/0x30
[  699.864312]  save_stack+0x46/0xd0
[  699.864993]  ? __alloc_pages_slowpath+0x1420/0x1420
[  699.865990]  ? flush_tlb_mm_range+0x15e/0x220
[  699.866889]  ? kasan_check_write+0x14/0x20
[  699.867724]  ? __dec_node_state+0x92/0xb0
[  699.868543]  ? lock_page_memcg+0x85/0xf0
[  699.869350]  ? unlock_page_memcg+0x16/0x80
[  699.870185]  ? page_remove_rmap+0x198/0x520
[  699.871048]  ? mark_page_accessed+0x133/0x200
[  699.871930]  ? _cond_resched+0x1a/0x50
[  699.872700]  ? unmap_page_range+0xcd4/0xe50
[  699.873551]  ? rb_next+0x58/0x80
[  699.874217]  ? rb_next+0x58/0x80
[  699.874895]  __kasan_slab_free+0x13c/0x1a0
[  699.875734]  ? unlink_anon_vmas+0xba/0x2c0
[  699.876563]  kasan_slab_free+0xe/0x10
[  699.877315]  kmem_cache_free+0x89/0x1e0
[  699.878095]  unlink_anon_vmas+0xba/0x2c0
[  699.878913]  free_pgtables+0x101/0x1b0
[  699.879677]  exit_mmap+0x146/0x2a0
[  699.880378]  ? __ia32_sys_munmap+0x50/0x50
[  699.881214]  ? kasan_check_read+0x11/0x20
[  699.882052]  ? mm_update_next_owner+0x322/0x380
[  699.882985]  mmput+0x8b/0x1d0
[  699.883602]  do_exit+0x43a/0x1390
[  699.884288]  ? mm_update_next_owner+0x380/0x380
[  699.885212]  ? f2fs_sync_file+0x9a/0xb0
[  699.885995]  ? f2fs_do_sync_file+0xd90/0xd90
[  699.886877]  ? vfs_fsync_range+0x68/0x100
[  699.887694]  ? __fget_light+0xc9/0xe0
[  699.888442]  ? do_fsync+0x3d/0x70
[  699.889118]  ? __x64_sys_fdatasync+0x24/0x30
[  699.889996]  rewind_stack_do_exit+0x17/0x20
[  699.890860] RIP: 0033:0x7f9bf930d800
[  699.891585] Code: Bad RIP value.
[  699.892268] RSP: 002b:00007ffee3606c68 EFLAGS: 00000246 ORIG_RAX: 000000000000004b
[  699.893781] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9bf930d800
[  699.895220] RDX: 0000000000008000 RSI: 00000000006010a0 RDI: 0000000000000003
[  699.896643] RBP: 00007ffee3606ca0 R08: 0000000001503010 R09: 0000000000000000
[  699.898069] R10: 00000000000002e8 R11: 0000000000000246 R12: 0000000000400610
[  699.899505] R13: 00007ffee3606da0 R14: 0000000000000000 R15: 0000000000000000

[  699.901241] The buggy address belongs to the page:
[  699.902215] page:ffffea0007d0ebc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[  699.903811] flags: 0x2ffff0000000000()
[  699.904585] raw: 02ffff0000000000 0000000000000000 ffffffff07d00101 0000000000000000
[  699.906125] raw: 0000000000000000 0000000000240000 00000000ffffffff 0000000000000000
[  699.907673] page dumped because: kasan: bad access detected

[  699.909108] Memory state around the buggy address:
[  699.910077]  ffff8801f43af800: 00 f1 f1 f1 f1 00 f4 f4 f4 f3 f3 f3 f3 00 00 00
[  699.911528]  ffff8801f43af880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  699.912953] >ffff8801f43af900: 00 00 00 00 00 00 00 00 f1 01 f4 f4 f4 f2 f2 f2
[  699.914392]                                                              ^
[  699.915758]  ffff8801f43af980: f2 00 f4 f4 00 00 00 00 f2 00 00 00 00 00 00 00
[  699.917193]  ffff8801f43afa00: 00 00 00 00 00 00 00 00 00 f3 f3 f3 00 00 00 00
[  699.918634] ==================================================================

- Location
https://elixir.bootlin.com/linux/v4.18-rc1/source/fs/f2fs/segment.h#L644

Reported-by Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
nickdesaulniers pushed a commit that referenced this issue Aug 22, 2018
If inode.i_extra_isize was fuzzed to an abnormal value, when
calculating inline data size, the result will overflow, result
in accessing invalid memory area when operating inline data.

Let's do sanity check with i_extra_isize during inode loading
for fixing.

https://bugzilla.kernel.org/show_bug.cgi?id=200421

- Reproduce

- POC (poc.c)
    #define _GNU_SOURCE
    #include <sys/types.h>
    #include <sys/mount.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <sys/xattr.h>

    #include <dirent.h>
    #include <errno.h>
    #include <error.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #include <linux/falloc.h>
    #include <linux/loop.h>

    static void activity(char *mpoint) {

      char *foo_bar_baz;
      char *foo_baz;
      char *xattr;
      int err;

      err = asprintf(&foo_bar_baz, "%s/foo/bar/baz", mpoint);
      err = asprintf(&foo_baz, "%s/foo/baz", mpoint);
      err = asprintf(&xattr, "%s/foo/bar/xattr", mpoint);

      rename(foo_bar_baz, foo_baz);

      char buf2[113];
      memset(buf2, 0, sizeof(buf2));
      listxattr(xattr, buf2, sizeof(buf2));
      removexattr(xattr, "user.mime_type");

    }

    int main(int argc, char *argv[]) {
      activity(argv[1]);
      return 0;
    }

- Kernel message
Umount the image will leave the following message
[ 2910.995489] F2FS-fs (loop0): Mounted with checkpoint version = 2
[ 2918.416465] ==================================================================
[ 2918.416807] BUG: KASAN: slab-out-of-bounds in f2fs_iget+0xcb9/0x1a80
[ 2918.417009] Read of size 4 at addr ffff88018efc2068 by task a.out/1229

[ 2918.417311] CPU: 1 PID: 1229 Comm: a.out Not tainted 4.17.0+ #1
[ 2918.417314] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 2918.417323] Call Trace:
[ 2918.417366]  dump_stack+0x71/0xab
[ 2918.417401]  print_address_description+0x6b/0x290
[ 2918.417407]  kasan_report+0x28e/0x390
[ 2918.417411]  ? f2fs_iget+0xcb9/0x1a80
[ 2918.417415]  f2fs_iget+0xcb9/0x1a80
[ 2918.417422]  ? f2fs_lookup+0x2e7/0x580
[ 2918.417425]  f2fs_lookup+0x2e7/0x580
[ 2918.417433]  ? __recover_dot_dentries+0x400/0x400
[ 2918.417447]  ? legitimize_path.isra.29+0x5a/0xa0
[ 2918.417453]  __lookup_slow+0x11c/0x220
[ 2918.417457]  ? may_delete+0x2a0/0x2a0
[ 2918.417475]  ? deref_stack_reg+0xe0/0xe0
[ 2918.417479]  ? __lookup_hash+0xb0/0xb0
[ 2918.417483]  lookup_slow+0x3e/0x60
[ 2918.417488]  walk_component+0x3ac/0x990
[ 2918.417492]  ? generic_permission+0x51/0x1e0
[ 2918.417495]  ? inode_permission+0x51/0x1d0
[ 2918.417499]  ? pick_link+0x3e0/0x3e0
[ 2918.417502]  ? link_path_walk+0x4b1/0x770
[ 2918.417513]  ? _raw_spin_lock_irqsave+0x25/0x50
[ 2918.417518]  ? walk_component+0x990/0x990
[ 2918.417522]  ? path_init+0x2e6/0x580
[ 2918.417526]  path_lookupat+0x13f/0x430
[ 2918.417531]  ? trailing_symlink+0x3a0/0x3a0
[ 2918.417534]  ? do_renameat2+0x270/0x7b0
[ 2918.417538]  ? __kasan_slab_free+0x14c/0x190
[ 2918.417541]  ? do_renameat2+0x270/0x7b0
[ 2918.417553]  ? kmem_cache_free+0x85/0x1e0
[ 2918.417558]  ? do_renameat2+0x270/0x7b0
[ 2918.417563]  filename_lookup+0x13c/0x280
[ 2918.417567]  ? filename_parentat+0x2b0/0x2b0
[ 2918.417572]  ? kasan_unpoison_shadow+0x31/0x40
[ 2918.417575]  ? kasan_kmalloc+0xa6/0xd0
[ 2918.417593]  ? strncpy_from_user+0xaa/0x1c0
[ 2918.417598]  ? getname_flags+0x101/0x2b0
[ 2918.417614]  ? path_listxattr+0x87/0x110
[ 2918.417619]  path_listxattr+0x87/0x110
[ 2918.417623]  ? listxattr+0xc0/0xc0
[ 2918.417637]  ? mm_fault_error+0x1b0/0x1b0
[ 2918.417654]  do_syscall_64+0x73/0x160
[ 2918.417660]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2918.417676] RIP: 0033:0x7f2f3a3480d7
[ 2918.417677] Code: f0 ff ff 73 01 c3 48 8b 0d be dd 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 c2 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 91 dd 2b 00 f7 d8 64 89 01 48
[ 2918.417732] RSP: 002b:00007fff4095b7d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000c2
[ 2918.417744] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2f3a3480d7
[ 2918.417746] RDX: 0000000000000071 RSI: 00007fff4095b810 RDI: 000000000126a0c0
[ 2918.417749] RBP: 00007fff4095b890 R08: 000000000126a010 R09: 0000000000000000
[ 2918.417751] R10: 00000000000001ab R11: 0000000000000206 R12: 00000000004005e0
[ 2918.417753] R13: 00007fff4095b990 R14: 0000000000000000 R15: 0000000000000000

[ 2918.417853] Allocated by task 329:
[ 2918.418002]  kasan_kmalloc+0xa6/0xd0
[ 2918.418007]  kmem_cache_alloc+0xc8/0x1e0
[ 2918.418023]  mempool_init_node+0x194/0x230
[ 2918.418027]  mempool_init+0x12/0x20
[ 2918.418042]  bioset_init+0x2bd/0x380
[ 2918.418052]  blk_alloc_queue_node+0xe9/0x540
[ 2918.418075]  dm_create+0x2c0/0x800
[ 2918.418080]  dev_create+0xd2/0x530
[ 2918.418083]  ctl_ioctl+0x2a3/0x5b0
[ 2918.418087]  dm_ctl_ioctl+0xa/0x10
[ 2918.418092]  do_vfs_ioctl+0x13e/0x8c0
[ 2918.418095]  ksys_ioctl+0x66/0x70
[ 2918.418098]  __x64_sys_ioctl+0x3d/0x50
[ 2918.418102]  do_syscall_64+0x73/0x160
[ 2918.418106]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[ 2918.418204] Freed by task 0:
[ 2918.418301] (stack is not available)

[ 2918.418521] The buggy address belongs to the object at ffff88018efc0000
                which belongs to the cache biovec-max of size 8192
[ 2918.418894] The buggy address is located 104 bytes to the right of
                8192-byte region [ffff88018efc0000, ffff88018efc2000)
[ 2918.419257] The buggy address belongs to the page:
[ 2918.419431] page:ffffea00063bf000 count:1 mapcount:0 mapping:ffff8801f2242540 index:0x0 compound_mapcount: 0
[ 2918.419702] flags: 0x17fff8000008100(slab|head)
[ 2918.419879] raw: 017fff8000008100 dead000000000100 dead000000000200 ffff8801f2242540
[ 2918.420101] raw: 0000000000000000 0000000000030003 00000001ffffffff 0000000000000000
[ 2918.420322] page dumped because: kasan: bad access detected

[ 2918.420599] Memory state around the buggy address:
[ 2918.420764]  ffff88018efc1f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 2918.420975]  ffff88018efc1f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 2918.421194] >ffff88018efc2000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 2918.421406]                                                           ^
[ 2918.421627]  ffff88018efc2080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 2918.421838]  ffff88018efc2100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 2918.422046] ==================================================================
[ 2918.422264] Disabling lock debugging due to kernel taint
[ 2923.901641] BUG: unable to handle kernel paging request at ffff88018f0db000
[ 2923.901884] PGD 22226a067 P4D 22226a067 PUD 222273067 PMD 18e642063 PTE 800000018f0db061
[ 2923.902120] Oops: 0003 [#1] SMP KASAN PTI
[ 2923.902274] CPU: 1 PID: 1231 Comm: umount Tainted: G    B             4.17.0+ #1
[ 2923.902490] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 2923.902761] RIP: 0010:__memset+0x24/0x30
[ 2923.902906] Code: 90 90 90 90 90 90 66 66 90 66 90 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3
[ 2923.903446] RSP: 0018:ffff88018ddf7ae0 EFLAGS: 00010206
[ 2923.903622] RAX: 0000000000000000 RBX: ffff8801d549d888 RCX: 1ffffffffffdaffb
[ 2923.903833] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88018f0daffc
[ 2923.904062] RBP: ffff88018efc206c R08: 1ffff10031df840d R09: ffff88018efc206c
[ 2923.904273] R10: ffffffffffffe1ee R11: ffffed0031df65fa R12: 0000000000000000
[ 2923.904485] R13: ffff8801d549dc98 R14: 00000000ffffc3db R15: ffffea00063bec80
[ 2923.904693] FS:  00007fa8b2f8a840(0000) GS:ffff8801f3b00000(0000) knlGS:0000000000000000
[ 2923.904937] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2923.910080] CR2: ffff88018f0db000 CR3: 000000018f892000 CR4: 00000000000006e0
[ 2923.914930] Call Trace:
[ 2923.919724]  f2fs_truncate_inline_inode+0x114/0x170
[ 2923.924487]  f2fs_truncate_blocks+0x11b/0x7c0
[ 2923.929178]  ? f2fs_truncate_data_blocks+0x10/0x10
[ 2923.933834]  ? dqget+0x670/0x670
[ 2923.938437]  ? f2fs_destroy_extent_tree+0xd6/0x270
[ 2923.943107]  ? __radix_tree_lookup+0x2f/0x150
[ 2923.947772]  f2fs_truncate+0xd4/0x1a0
[ 2923.952491]  f2fs_evict_inode+0x5ab/0x610
[ 2923.957204]  evict+0x15f/0x280
[ 2923.961898]  __dentry_kill+0x161/0x250
[ 2923.966634]  shrink_dentry_list+0xf3/0x250
[ 2923.971897]  shrink_dcache_parent+0xa9/0x100
[ 2923.976561]  ? shrink_dcache_sb+0x1f0/0x1f0
[ 2923.981177]  ? wait_for_completion+0x8a/0x210
[ 2923.985781]  ? migrate_swap_stop+0x2d0/0x2d0
[ 2923.990332]  do_one_tree+0xe/0x40
[ 2923.994735]  shrink_dcache_for_umount+0x3a/0xa0
[ 2923.999077]  generic_shutdown_super+0x3e/0x1c0
[ 2924.003350]  kill_block_super+0x4b/0x70
[ 2924.007619]  deactivate_locked_super+0x65/0x90
[ 2924.011812]  cleanup_mnt+0x5c/0xa0
[ 2924.015995]  task_work_run+0xce/0xf0
[ 2924.020174]  exit_to_usermode_loop+0x115/0x120
[ 2924.024293]  do_syscall_64+0x12f/0x160
[ 2924.028479]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2924.032709] RIP: 0033:0x7fa8b2868487
[ 2924.036888] Code: 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 c9 2b 00 f7 d8 64 89 01 48
[ 2924.045750] RSP: 002b:00007ffc39824d58 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[ 2924.050190] RAX: 0000000000000000 RBX: 00000000008ea030 RCX: 00007fa8b2868487
[ 2924.054604] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00000000008f4360
[ 2924.058940] RBP: 00000000008f4360 R08: 0000000000000000 R09: 0000000000000014
[ 2924.063186] R10: 00000000000006b2 R11: 0000000000000246 R12: 00007fa8b2d7183c
[ 2924.067418] R13: 0000000000000000 R14: 00000000008ea210 R15: 00007ffc39824fe0
[ 2924.071534] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer joydev input_leds serio_raw snd soundcore mac_hid i2c_piix4 ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi btrfs zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear 8139too qxl ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel psmouse aes_x86_64 8139cp crypto_simd cryptd mii glue_helper pata_acpi floppy
[ 2924.098044] CR2: ffff88018f0db000
[ 2924.102520] ---[ end trace a8e0d899985faf31 ]---
[ 2924.107012] RIP: 0010:__memset+0x24/0x30
[ 2924.111448] Code: 90 90 90 90 90 90 66 66 90 66 90 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3
[ 2924.120724] RSP: 0018:ffff88018ddf7ae0 EFLAGS: 00010206
[ 2924.125312] RAX: 0000000000000000 RBX: ffff8801d549d888 RCX: 1ffffffffffdaffb
[ 2924.129931] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88018f0daffc
[ 2924.134537] RBP: ffff88018efc206c R08: 1ffff10031df840d R09: ffff88018efc206c
[ 2924.139175] R10: ffffffffffffe1ee R11: ffffed0031df65fa R12: 0000000000000000
[ 2924.143825] R13: ffff8801d549dc98 R14: 00000000ffffc3db R15: ffffea00063bec80
[ 2924.148500] FS:  00007fa8b2f8a840(0000) GS:ffff8801f3b00000(0000) knlGS:0000000000000000
[ 2924.153247] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2924.158003] CR2: ffff88018f0db000 CR3: 000000018f892000 CR4: 00000000000006e0
[ 2924.164641] BUG: Bad rss-counter state mm:00000000fa04621e idx:0 val:4
[ 2924.170007] BUG: Bad rss-counter
tate mm:00000000fa04621e idx:1 val:2

- Location
https://elixir.bootlin.com/linux/v4.18-rc3/source/fs/f2fs/inline.c#L78
	memset(addr + from, 0, MAX_INLINE_DATA(inode) - from);
Here the length can be negative.

Reported-by Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
nickdesaulniers pushed a commit that referenced this issue Aug 22, 2018
After fuzzing, cp_pack_start_sum could be corrupted, so current log's
summary info should be wrong due to loading incorrect summary block.
Then, if segment's type in current log is exceeded NR_CURSEG_TYPE, it
can lead accessing invalid dirty_i->dirty_segmap bitmap finally.

Add sanity check for cp_pack_start_sum to fix this issue.

https://bugzilla.kernel.org/show_bug.cgi?id=200419

- Reproduce

- Kernel message (f2fs-dev w/ KASAN)
[ 3117.578432] F2FS-fs (loop0): Invalid log blocks per segment (8)

[ 3117.578445] F2FS-fs (loop0): Can't find valid F2FS filesystem in 2th superblock
[ 3117.581364] F2FS-fs (loop0): invalid crc_offset: 30716
[ 3117.583564] WARNING: CPU: 1 PID: 1225 at fs/f2fs/checkpoint.c:90 __get_meta_page+0x448/0x4b0
[ 3117.583570] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer joydev input_leds serio_raw snd soundcore mac_hid i2c_piix4 ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi btrfs zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear 8139too qxl ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel psmouse aes_x86_64 8139cp crypto_simd cryptd mii glue_helper pata_acpi floppy
[ 3117.584014] CPU: 1 PID: 1225 Comm: mount Not tainted 4.17.0+ #1
[ 3117.584017] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 3117.584022] RIP: 0010:__get_meta_page+0x448/0x4b0
[ 3117.584023] Code: 00 49 8d bc 24 84 00 00 00 e8 74 54 da ff 41 83 8c 24 84 00 00 00 08 4c 89 f6 4c 89 ef e8 c0 d9 95 00 48 89 ef e8 18 e3 00 00 <0f> 0b f0 80 4d 48 04 e9 0f fe ff ff 0f 0b 48 89 c7 48 89 04 24 e8
[ 3117.584072] RSP: 0018:ffff88018eb678c0 EFLAGS: 00010286
[ 3117.584082] RAX: ffff88018f0a6a78 RBX: ffffea0007a46600 RCX: ffffffff9314d1b2
[ 3117.584085] RDX: ffffffff00000001 RSI: 0000000000000000 RDI: ffff88018f0a6a98
[ 3117.584087] RBP: ffff88018ebe9980 R08: 0000000000000002 R09: 0000000000000001
[ 3117.584090] R10: 0000000000000001 R11: ffffed00326e4450 R12: ffff880193722200
[ 3117.584092] R13: ffff88018ebe9afc R14: 0000000000000206 R15: ffff88018eb67900
[ 3117.584096] FS:  00007f5694636840(0000) GS:ffff8801f3b00000(0000) knlGS:0000000000000000
[ 3117.584098] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3117.584101] CR2: 00000000016f21b8 CR3: 0000000191c22000 CR4: 00000000000006e0
[ 3117.584112] Call Trace:
[ 3117.584121]  ? f2fs_set_meta_page_dirty+0x150/0x150
[ 3117.584127]  ? f2fs_build_segment_manager+0xbf9/0x3190
[ 3117.584133]  ? f2fs_npages_for_summary_flush+0x75/0x120
[ 3117.584145]  f2fs_build_segment_manager+0xda8/0x3190
[ 3117.584151]  ? f2fs_get_valid_checkpoint+0x298/0xa00
[ 3117.584156]  ? f2fs_flush_sit_entries+0x10e0/0x10e0
[ 3117.584184]  ? map_id_range_down+0x17c/0x1b0
[ 3117.584188]  ? __put_user_ns+0x30/0x30
[ 3117.584206]  ? find_next_bit+0x53/0x90
[ 3117.584237]  ? cpumask_next+0x16/0x20
[ 3117.584249]  f2fs_fill_super+0x1948/0x2b40
[ 3117.584258]  ? f2fs_commit_super+0x1a0/0x1a0
[ 3117.584279]  ? sget_userns+0x65e/0x690
[ 3117.584296]  ? set_blocksize+0x88/0x130
[ 3117.584302]  ? f2fs_commit_super+0x1a0/0x1a0
[ 3117.584305]  mount_bdev+0x1c0/0x200
[ 3117.584310]  mount_fs+0x5c/0x190
[ 3117.584320]  vfs_kern_mount+0x64/0x190
[ 3117.584330]  do_mount+0x2e4/0x1450
[ 3117.584343]  ? lockref_put_return+0x130/0x130
[ 3117.584347]  ? copy_mount_string+0x20/0x20
[ 3117.584357]  ? kasan_unpoison_shadow+0x31/0x40
[ 3117.584362]  ? kasan_kmalloc+0xa6/0xd0
[ 3117.584373]  ? memcg_kmem_put_cache+0x16/0x90
[ 3117.584377]  ? __kmalloc_track_caller+0x196/0x210
[ 3117.584383]  ? _copy_from_user+0x61/0x90
[ 3117.584396]  ? memdup_user+0x3e/0x60
[ 3117.584401]  ksys_mount+0x7e/0xd0
[ 3117.584405]  __x64_sys_mount+0x62/0x70
[ 3117.584427]  do_syscall_64+0x73/0x160
[ 3117.584440]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3117.584455] RIP: 0033:0x7f5693f14b9a
[ 3117.584456] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48
[ 3117.584505] RSP: 002b:00007fff27346488 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[ 3117.584510] RAX: ffffffffffffffda RBX: 00000000016e2030 RCX: 00007f5693f14b9a
[ 3117.584512] RDX: 00000000016e2210 RSI: 00000000016e3f30 RDI: 00000000016ee040
[ 3117.584514] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[ 3117.584516] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 00000000016ee040
[ 3117.584519] R13: 00000000016e2210 R14: 0000000000000000 R15: 0000000000000003
[ 3117.584523] ---[ end trace a8e0d899985faf31 ]---
[ 3117.685663] F2FS-fs (loop0): f2fs_check_nid_range: out-of-range nid=2, run fsck to fix.
[ 3117.685673] F2FS-fs (loop0): recover_data: ino = 2 (i_size: recover) recovered = 1, err = 0
[ 3117.685707] ==================================================================
[ 3117.685955] BUG: KASAN: slab-out-of-bounds in __remove_dirty_segment+0xdd/0x1e0
[ 3117.686175] Read of size 8 at addr ffff88018f0a63d0 by task mount/1225

[ 3117.686477] CPU: 0 PID: 1225 Comm: mount Tainted: G        W         4.17.0+ #1
[ 3117.686481] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 3117.686483] Call Trace:
[ 3117.686494]  dump_stack+0x71/0xab
[ 3117.686512]  print_address_description+0x6b/0x290
[ 3117.686517]  kasan_report+0x28e/0x390
[ 3117.686522]  ? __remove_dirty_segment+0xdd/0x1e0
[ 3117.686527]  __remove_dirty_segment+0xdd/0x1e0
[ 3117.686532]  locate_dirty_segment+0x189/0x190
[ 3117.686538]  f2fs_allocate_new_segments+0xa9/0xe0
[ 3117.686543]  recover_data+0x703/0x2c20
[ 3117.686547]  ? f2fs_recover_fsync_data+0x48f/0xd50
[ 3117.686553]  ? ksys_mount+0x7e/0xd0
[ 3117.686564]  ? policy_nodemask+0x1a/0x90
[ 3117.686567]  ? policy_node+0x56/0x70
[ 3117.686571]  ? add_fsync_inode+0xf0/0xf0
[ 3117.686592]  ? blk_finish_plug+0x44/0x60
[ 3117.686597]  ? f2fs_ra_meta_pages+0x38b/0x5e0
[ 3117.686602]  ? find_inode_fast+0xac/0xc0
[ 3117.686606]  ? f2fs_is_valid_blkaddr+0x320/0x320
[ 3117.686618]  ? __radix_tree_lookup+0x150/0x150
[ 3117.686633]  ? dqget+0x670/0x670
[ 3117.686648]  ? pagecache_get_page+0x29/0x410
[ 3117.686656]  ? kmem_cache_alloc+0x176/0x1e0
[ 3117.686660]  ? f2fs_is_valid_blkaddr+0x11d/0x320
[ 3117.686664]  f2fs_recover_fsync_data+0xc23/0xd50
[ 3117.686670]  ? f2fs_space_for_roll_forward+0x60/0x60
[ 3117.686674]  ? rb_insert_color+0x323/0x3d0
[ 3117.686678]  ? f2fs_recover_orphan_inodes+0xa5/0x700
[ 3117.686683]  ? proc_register+0x153/0x1d0
[ 3117.686686]  ? f2fs_remove_orphan_inode+0x10/0x10
[ 3117.686695]  ? f2fs_attr_store+0x50/0x50
[ 3117.686700]  ? proc_create_single_data+0x52/0x60
[ 3117.686707]  f2fs_fill_super+0x1d06/0x2b40
[ 3117.686728]  ? f2fs_commit_super+0x1a0/0x1a0
[ 3117.686735]  ? sget_userns+0x65e/0x690
[ 3117.686740]  ? set_blocksize+0x88/0x130
[ 3117.686745]  ? f2fs_commit_super+0x1a0/0x1a0
[ 3117.686748]  mount_bdev+0x1c0/0x200
[ 3117.686753]  mount_fs+0x5c/0x190
[ 3117.686758]  vfs_kern_mount+0x64/0x190
[ 3117.686762]  do_mount+0x2e4/0x1450
[ 3117.686769]  ? lockref_put_return+0x130/0x130
[ 3117.686773]  ? copy_mount_string+0x20/0x20
[ 3117.686777]  ? kasan_unpoison_shadow+0x31/0x40
[ 3117.686780]  ? kasan_kmalloc+0xa6/0xd0
[ 3117.686786]  ? memcg_kmem_put_cache+0x16/0x90
[ 3117.686790]  ? __kmalloc_track_caller+0x196/0x210
[ 3117.686795]  ? _copy_from_user+0x61/0x90
[ 3117.686801]  ? memdup_user+0x3e/0x60
[ 3117.686804]  ksys_mount+0x7e/0xd0
[ 3117.686809]  __x64_sys_mount+0x62/0x70
[ 3117.686816]  do_syscall_64+0x73/0x160
[ 3117.686824]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3117.686829] RIP: 0033:0x7f5693f14b9a
[ 3117.686830] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48
[ 3117.686887] RSP: 002b:00007fff27346488 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[ 3117.686892] RAX: ffffffffffffffda RBX: 00000000016e2030 RCX: 00007f5693f14b9a
[ 3117.686894] RDX: 00000000016e2210 RSI: 00000000016e3f30 RDI: 00000000016ee040
[ 3117.686896] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[ 3117.686899] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 00000000016ee040
[ 3117.686901] R13: 00000000016e2210 R14: 0000000000000000 R15: 0000000000000003

[ 3117.687005] Allocated by task 1225:
[ 3117.687152]  kasan_kmalloc+0xa6/0xd0
[ 3117.687157]  kmem_cache_alloc_trace+0xfd/0x200
[ 3117.687161]  f2fs_build_segment_manager+0x2d09/0x3190
[ 3117.687165]  f2fs_fill_super+0x1948/0x2b40
[ 3117.687168]  mount_bdev+0x1c0/0x200
[ 3117.687171]  mount_fs+0x5c/0x190
[ 3117.687174]  vfs_kern_mount+0x64/0x190
[ 3117.687177]  do_mount+0x2e4/0x1450
[ 3117.687180]  ksys_mount+0x7e/0xd0
[ 3117.687182]  __x64_sys_mount+0x62/0x70
[ 3117.687186]  do_syscall_64+0x73/0x160
[ 3117.687190]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[ 3117.687285] Freed by task 19:
[ 3117.687412]  __kasan_slab_free+0x137/0x190
[ 3117.687416]  kfree+0x8b/0x1b0
[ 3117.687460]  ttm_bo_man_put_node+0x61/0x80 [ttm]
[ 3117.687476]  ttm_bo_cleanup_refs+0x15f/0x250 [ttm]
[ 3117.687492]  ttm_bo_delayed_delete+0x2f0/0x300 [ttm]
[ 3117.687507]  ttm_bo_delayed_workqueue+0x17/0x50 [ttm]
[ 3117.687528]  process_one_work+0x2f9/0x740
[ 3117.687531]  worker_thread+0x78/0x6b0
[ 3117.687541]  kthread+0x177/0x1c0
[ 3117.687545]  ret_from_fork+0x35/0x40

[ 3117.687638] The buggy address belongs to the object at ffff88018f0a6300
                which belongs to the cache kmalloc-192 of size 192
[ 3117.688014] The buggy address is located 16 bytes to the right of
                192-byte region [ffff88018f0a6300, ffff88018f0a63c0)
[ 3117.688382] The buggy address belongs to the page:
[ 3117.688554] page:ffffea00063c2980 count:1 mapcount:0 mapping:ffff8801f3403180 index:0x0
[ 3117.688788] flags: 0x17fff8000000100(slab)
[ 3117.688944] raw: 017fff8000000100 ffffea00063c2840 0000000e0000000e ffff8801f3403180
[ 3117.689166] raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
[ 3117.689386] page dumped because: kasan: bad access detected

[ 3117.689653] Memory state around the buggy address:
[ 3117.689816]  ffff88018f0a6280: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 3117.690027]  ffff88018f0a6300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 3117.690239] >ffff88018f0a6380: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 3117.690448]                                                  ^
[ 3117.690644]  ffff88018f0a6400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 3117.690868]  ffff88018f0a6480: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 3117.691077] ==================================================================
[ 3117.691290] Disabling lock debugging due to kernel taint
[ 3117.693893] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[ 3117.694120] PGD 80000001f01bc067 P4D 80000001f01bc067 PUD 1d9638067 PMD 0
[ 3117.694338] Oops: 0002 [#1] SMP KASAN PTI
[ 3117.694490] CPU: 1 PID: 1225 Comm: mount Tainted: G    B   W         4.17.0+ #1
[ 3117.694703] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 3117.695073] RIP: 0010:__remove_dirty_segment+0xe2/0x1e0
[ 3117.695246] Code: c4 48 89 c7 e8 cf bb d7 ff 45 0f b6 24 24 41 83 e4 3f 44 88 64 24 07 41 83 e4 3f 4a 8d 7c e3 08 e8 b3 bc d7 ff 4a 8b 4c e3 08 <f0> 4c 0f b3 29 0f 82 94 00 00 00 48 8d bd 20 04 00 00 e8 97 bb d7
[ 3117.695793] RSP: 0018:ffff88018eb67638 EFLAGS: 00010292
[ 3117.695969] RAX: 0000000000000000 RBX: ffff88018f0a6300 RCX: 0000000000000000
[ 3117.696182] RDX: 0000000000000000 RSI: 0000000000000297 RDI: 0000000000000297
[ 3117.696391] RBP: ffff88018ebe9980 R08: ffffed003e743ebb R09: ffffed003e743ebb
[ 3117.696604] R10: 0000000000000001 R11: ffffed003e743eba R12: 0000000000000019
[ 3117.696813] R13: 0000000000000014 R14: 0000000000000320 R15: ffff88018ebe99e0
[ 3117.697032] FS:  00007f5694636840(0000) GS:ffff8801f3b00000(0000) knlGS:0000000000000000
[ 3117.697280] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3117.702357] CR2: 00007fe89bb1a000 CR3: 0000000191c22000 CR4: 00000000000006e0
[ 3117.707235] Call Trace:
[ 3117.712077]  locate_dirty_segment+0x189/0x190
[ 3117.716891]  f2fs_allocate_new_segments+0xa9/0xe0
[ 3117.721617]  recover_data+0x703/0x2c20
[ 3117.726316]  ? f2fs_recover_fsync_data+0x48f/0xd50
[ 3117.730957]  ? ksys_mount+0x7e/0xd0
[ 3117.735573]  ? policy_nodemask+0x1a/0x90
[ 3117.740198]  ? policy_node+0x56/0x70
[ 3117.744829]  ? add_fsync_inode+0xf0/0xf0
[ 3117.749487]  ? blk_finish_plug+0x44/0x60
[ 3117.754152]  ? f2fs_ra_meta_pages+0x38b/0x5e0
[ 3117.758831]  ? find_inode_fast+0xac/0xc0
[ 3117.763448]  ? f2fs_is_valid_blkaddr+0x320/0x320
[ 3117.768046]  ? __radix_tree_lookup+0x150/0x150
[ 3117.772603]  ? dqget+0x670/0x670
[ 3117.777159]  ? pagecache_get_page+0x29/0x410
[ 3117.781648]  ? kmem_cache_alloc+0x176/0x1e0
[ 3117.786067]  ? f2fs_is_valid_blkaddr+0x11d/0x320
[ 3117.790476]  f2fs_recover_fsync_data+0xc23/0xd50
[ 3117.794790]  ? f2fs_space_for_roll_forward+0x60/0x60
[ 3117.799086]  ? rb_insert_color+0x323/0x3d0
[ 3117.803304]  ? f2fs_recover_orphan_inodes+0xa5/0x700
[ 3117.807563]  ? proc_register+0x153/0x1d0
[ 3117.811766]  ? f2fs_remove_orphan_inode+0x10/0x10
[ 3117.815947]  ? f2fs_attr_store+0x50/0x50
[ 3117.820087]  ? proc_create_single_data+0x52/0x60
[ 3117.824262]  f2fs_fill_super+0x1d06/0x2b40
[ 3117.828367]  ? f2fs_commit_super+0x1a0/0x1a0
[ 3117.832432]  ? sget_userns+0x65e/0x690
[ 3117.836500]  ? set_blocksize+0x88/0x130
[ 3117.840501]  ? f2fs_commit_super+0x1a0/0x1a0
[ 3117.844420]  mount_bdev+0x1c0/0x200
[ 3117.848275]  mount_fs+0x5c/0x190
[ 3117.852053]  vfs_kern_mount+0x64/0x190
[ 3117.855810]  do_mount+0x2e4/0x1450
[ 3117.859441]  ? lockref_put_return+0x130/0x130
[ 3117.862996]  ? copy_mount_string+0x20/0x20
[ 3117.866417]  ? kasan_unpoison_shadow+0x31/0x40
[ 3117.869719]  ? kasan_kmalloc+0xa6/0xd0
[ 3117.872948]  ? memcg_kmem_put_cache+0x16/0x90
[ 3117.876121]  ? __kmalloc_track_caller+0x196/0x210
[ 3117.879333]  ? _copy_from_user+0x61/0x90
[ 3117.882467]  ? memdup_user+0x3e/0x60
[ 3117.885604]  ksys_mount+0x7e/0xd0
[ 3117.888700]  __x64_sys_mount+0x62/0x70
[ 3117.891742]  do_syscall_64+0x73/0x160
[ 3117.894692]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3117.897669] RIP: 0033:0x7f5693f14b9a
[ 3117.900563] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48
[ 3117.906922] RSP: 002b:00007fff27346488 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[ 3117.910159] RAX: ffffffffffffffda RBX: 00000000016e2030 RCX: 00007f5693f14b9a
[ 3117.913469] RDX: 00000000016e2210 RSI: 00000000016e3f30 RDI: 00000000016ee040
[ 3117.916764] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[ 3117.920071] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 00000000016ee040
[ 3117.923393] R13: 00000000016e2210 R14: 0000000000000000 R15: 0000000000000003
[ 3117.926680] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer joydev input_leds serio_raw snd soundcore mac_hid i2c_piix4 ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi btrfs zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear 8139too qxl ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel psmouse aes_x86_64 8139cp crypto_simd cryptd mii glue_helper pata_acpi floppy
[ 3117.949979] CR2: 0000000000000000
[ 3117.954283] ---[ end trace a8e0d899985faf32 ]---
[ 3117.958575] RIP: 0010:__remove_dirty_segment+0xe2/0x1e0
[ 3117.962810] Code: c4 48 89 c7 e8 cf bb d7 ff 45 0f b6 24 24 41 83 e4 3f 44 88 64 24 07 41 83 e4 3f 4a 8d 7c e3 08 e8 b3 bc d7 ff 4a 8b 4c e3 08 <f0> 4c 0f b3 29 0f 82 94 00 00 00 48 8d bd 20 04 00 00 e8 97 bb d7
[ 3117.971789] RSP: 0018:ffff88018eb67638 EFLAGS: 00010292
[ 3117.976333] RAX: 0000000000000000 RBX: ffff88018f0a6300 RCX: 0000000000000000
[ 3117.980926] RDX: 0000000000000000 RSI: 0000000000000297 RDI: 0000000000000297
[ 3117.985497] RBP: ffff88018ebe9980 R08: ffffed003e743ebb R09: ffffed003e743ebb
[ 3117.990098] R10: 0000000000000001 R11: ffffed003e743eba R12: 0000000000000019
[ 3117.994761] R13: 0000000000000014 R14: 0000000000000320 R15: ffff88018ebe99e0
[ 3117.999392] FS:  00007f5694636840(0000) GS:ffff8801f3b00000(0000) knlGS:0000000000000000
[ 3118.004096] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3118.008816] CR2: 00007fe89bb1a000 CR3: 0000000191c22000 CR4: 00000000000006e0

- Location
https://elixir.bootlin.com/linux/v4.18-rc3/source/fs/f2fs/segment.c#L775
		if (test_and_clear_bit(segno, dirty_i->dirty_segmap[t]))
			dirty_i->nr_dirty[t]--;
Here dirty_i->dirty_segmap[t] can be NULL which leads to crash in test_and_clear_bit()

Reported-by Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
nickdesaulniers pushed a commit that referenced this issue Aug 22, 2018
syzbot found the following crash on:

HEAD commit:    d9bd94c Add linux-next specific files for 20180801
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1001189c400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=cc8964ea4d04518c
dashboard link: https://syzkaller.appspot.com/bug?extid=c966a82db0b14aa37e81
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+c966a82db0b14aa37e81@syzkaller.appspotmail.com

loop7: rw=12288, want=8200, limit=20
netlink: 65342 bytes leftover after parsing attributes in process `syz-executor4'.
openvswitch: netlink: Message has 8 unknown bytes.
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
CPU: 1 PID: 7615 Comm: syz-executor7 Not tainted 4.18.0-rc7-next-20180801+ #29
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__read_once_size include/linux/compiler.h:188 [inline]
RIP: 0010:compound_head include/linux/page-flags.h:142 [inline]
RIP: 0010:PageLocked include/linux/page-flags.h:272 [inline]
RIP: 0010:f2fs_put_page fs/f2fs/f2fs.h:2011 [inline]
RIP: 0010:validate_checkpoint+0x66d/0xec0 fs/f2fs/checkpoint.c:835
Code: e8 58 05 7f fe 4c 8d 6b 80 4d 8d 74 24 08 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 c6 04 02 00 4c 89 f2 48 c1 ea 03 <80> 3c 02 00 0f 85 f4 06 00 00 4c 89 ea 4d 8b 7c 24 08 48 b8 00 00
RSP: 0018:ffff8801937cebe8 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: ffff8801937cef30 RCX: ffffc90006035000
RDX: 0000000000000000 RSI: ffffffff82fd9658 RDI: 0000000000000005
RBP: ffff8801937cef58 R08: ffff8801ab254700 R09: fffff94000d9e026
R10: fffff94000d9e026 R11: ffffea0006cf0137 R12: fffffffffffffffb
R13: ffff8801937ceeb0 R14: 0000000000000003 R15: ffff880193419b40
FS:  00007f36a61d5700(0000) GS:ffff8801db100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc04ff93000 CR3: 00000001d0562000 CR4: 00000000001426e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 f2fs_get_valid_checkpoint+0x436/0x1ec0 fs/f2fs/checkpoint.c:860
 f2fs_fill_super+0x2d42/0x8110 fs/f2fs/super.c:2883
 mount_bdev+0x314/0x3e0 fs/super.c:1344
 f2fs_mount+0x3c/0x50 fs/f2fs/super.c:3133
 legacy_get_tree+0x131/0x460 fs/fs_context.c:729
 vfs_get_tree+0x1cb/0x5c0 fs/super.c:1743
 do_new_mount fs/namespace.c:2603 [inline]
 do_mount+0x6f2/0x1e20 fs/namespace.c:2927
 ksys_mount+0x12d/0x140 fs/namespace.c:3143
 __do_sys_mount fs/namespace.c:3157 [inline]
 __se_sys_mount fs/namespace.c:3154 [inline]
 __x64_sys_mount+0xbe/0x150 fs/namespace.c:3154
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x45943a
Code: b8 a6 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 bd 8a fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f 83 9a 8a fb ff c3 66 0f 1f 84 00 00 00 00 00
RSP: 002b:00007f36a61d4a88 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007f36a61d4b30 RCX: 000000000045943a
RDX: 00007f36a61d4ad0 RSI: 0000000020000100 RDI: 00007f36a61d4af0
RBP: 0000000020000100 R08: 00007f36a61d4b30 R09: 00007f36a61d4ad0
R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000013
R13: 0000000000000000 R14: 00000000004c8ea0 R15: 0000000000000000
Modules linked in:
Dumping ftrace buffer:
   (ftrace buffer empty)
---[ end trace bd8550c129352286 ]---
RIP: 0010:__read_once_size include/linux/compiler.h:188 [inline]
RIP: 0010:compound_head include/linux/page-flags.h:142 [inline]
RIP: 0010:PageLocked include/linux/page-flags.h:272 [inline]
RIP: 0010:f2fs_put_page fs/f2fs/f2fs.h:2011 [inline]
RIP: 0010:validate_checkpoint+0x66d/0xec0 fs/f2fs/checkpoint.c:835
Code: e8 58 05 7f fe 4c 8d 6b 80 4d 8d 74 24 08 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 c6 04 02 00 4c 89 f2 48 c1 ea 03 <80> 3c 02 00 0f 85 f4 06 00 00 4c 89 ea 4d 8b 7c 24 08 48 b8 00 00
RSP: 0018:ffff8801937cebe8 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: ffff8801937cef30 RCX: ffffc90006035000
RDX: 0000000000000000 RSI: ffffffff82fd9658 RDI: 0000000000000005
netlink: 65342 bytes leftover after parsing attributes in process `syz-executor4'.
RBP: ffff8801937cef58 R08: ffff8801ab254700 R09: fffff94000d9e026
openvswitch: netlink: Message has 8 unknown bytes.
R10: fffff94000d9e026 R11: ffffea0006cf0137 R12: fffffffffffffffb
R13: ffff8801937ceeb0 R14: 0000000000000003 R15: ffff880193419b40
FS:  00007f36a61d5700(0000) GS:ffff8801db100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc04ff93000 CR3: 00000001d0562000 CR4: 00000000001426e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

In validate_checkpoint(), if we failed to call get_checkpoint_version(), we
will pass returned invalid page pointer into f2fs_put_page, cause accessing
invalid memory, this patch tries to handle error path correctly to fix this
issue.

Signed-off-by: Chao Yu <yuchao0@huawei.com>

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
nickdesaulniers pushed a commit that referenced this issue Aug 22, 2018
https://bugzilla.kernel.org/show_bug.cgi?id=200221

- Overview
BUG() in clear_inode() when mounting and un-mounting a corrupted f2fs image

- Reproduce

- Kernel message
[  538.601448] F2FS-fs (loop0): Invalid segment/section count (31, 24 x 1376257)
[  538.601458] F2FS-fs (loop0): Can't find valid F2FS filesystem in 2th superblock
[  538.724091] F2FS-fs (loop0): Try to recover 2th superblock, ret: 0
[  538.724102] F2FS-fs (loop0): Mounted with checkpoint version = 2
[  540.970834] ------------[ cut here ]------------
[  540.970838] kernel BUG at fs/inode.c:512!
[  540.971750] invalid opcode: 0000 [#1] SMP KASAN PTI
[  540.972755] CPU: 1 PID: 1305 Comm: umount Not tainted 4.18.0-rc1+ #4
[  540.974034] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  540.982913] RIP: 0010:clear_inode+0xc0/0xd0
[  540.983774] Code: 8d a3 30 01 00 00 4c 89 e7 e8 1c ec f8 ff 48 8b 83 30 01 00 00 49 39 c4 75 1a 48 c7 83 a0 00 00 00 60 00 00 00 5b 41 5c 5d c3 <0f> 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 1f 40 00 66 66 66 66 90 55
[  540.987570] RSP: 0018:ffff8801e34a7b70 EFLAGS: 00010002
[  540.988636] RAX: 0000000000000000 RBX: ffff8801e9b744e8 RCX: ffffffffb840eb3a
[  540.990063] RDX: dffffc0000000000 RSI: 0000000000000004 RDI: ffff8801e9b746b8
[  540.991499] RBP: ffff8801e34a7b80 R08: ffffed003d36e8ce R09: ffffed003d36e8ce
[  540.992923] R10: 0000000000000001 R11: ffffed003d36e8cd R12: ffff8801e9b74668
[  540.994360] R13: ffff8801e9b74760 R14: ffff8801e9b74528 R15: ffff8801e9b74530
[  540.995786] FS:  00007f4662bdf840(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000
[  540.997403] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  540.998571] CR2: 000000000175c568 CR3: 00000001dcfe6000 CR4: 00000000000006e0
[  541.000015] Call Trace:
[  541.000554]  f2fs_evict_inode+0x253/0x630
[  541.001381]  evict+0x16f/0x290
[  541.002015]  iput+0x280/0x300
[  541.002654]  dentry_unlink_inode+0x165/0x1e0
[  541.003528]  __dentry_kill+0x16a/0x260
[  541.004300]  dentry_kill+0x70/0x250
[  541.005018]  dput+0x154/0x1d0
[  541.005635]  do_one_tree+0x34/0x40
[  541.006354]  shrink_dcache_for_umount+0x3f/0xa0
[  541.007285]  generic_shutdown_super+0x43/0x1c0
[  541.008192]  kill_block_super+0x52/0x80
[  541.008978]  kill_f2fs_super+0x62/0x70
[  541.009750]  deactivate_locked_super+0x6f/0xa0
[  541.010664]  deactivate_super+0x5e/0x80
[  541.011450]  cleanup_mnt+0x61/0xa0
[  541.012151]  __cleanup_mnt+0x12/0x20
[  541.012893]  task_work_run+0xc8/0xf0
[  541.013635]  exit_to_usermode_loop+0x125/0x130
[  541.014555]  do_syscall_64+0x138/0x170
[  541.015340]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  541.016375] RIP: 0033:0x7f46624bf487
[  541.017104] Code: 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 c9 2b 00 f7 d8 64 89 01 48
[  541.020923] RSP: 002b:00007fff5e12e9a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[  541.022452] RAX: 0000000000000000 RBX: 0000000001753030 RCX: 00007f46624bf487
[  541.023885] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000000000175a1e0
[  541.025318] RBP: 000000000175a1e0 R08: 0000000000000000 R09: 0000000000000014
[  541.026755] R10: 00000000000006b2 R11: 0000000000000246 R12: 00007f46629c883c
[  541.028186] R13: 0000000000000000 R14: 0000000001753210 R15: 00007fff5e12ec30
[  541.029626] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy
[  541.039445] ---[ end trace 4ce02f25ff7d3df5 ]---
[  541.040392] RIP: 0010:clear_inode+0xc0/0xd0
[  541.041240] Code: 8d a3 30 01 00 00 4c 89 e7 e8 1c ec f8 ff 48 8b 83 30 01 00 00 49 39 c4 75 1a 48 c7 83 a0 00 00 00 60 00 00 00 5b 41 5c 5d c3 <0f> 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 1f 40 00 66 66 66 66 90 55
[  541.045042] RSP: 0018:ffff8801e34a7b70 EFLAGS: 00010002
[  541.046099] RAX: 0000000000000000 RBX: ffff8801e9b744e8 RCX: ffffffffb840eb3a
[  541.047537] RDX: dffffc0000000000 RSI: 0000000000000004 RDI: ffff8801e9b746b8
[  541.048965] RBP: ffff8801e34a7b80 R08: ffffed003d36e8ce R09: ffffed003d36e8ce
[  541.050402] R10: 0000000000000001 R11: ffffed003d36e8cd R12: ffff8801e9b74668
[  541.051832] R13: ffff8801e9b74760 R14: ffff8801e9b74528 R15: ffff8801e9b74530
[  541.053263] FS:  00007f4662bdf840(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000
[  541.054891] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  541.056039] CR2: 000000000175c568 CR3: 00000001dcfe6000 CR4: 00000000000006e0
[  541.058506] ==================================================================
[  541.059991] BUG: KASAN: stack-out-of-bounds in update_stack_state+0x38c/0x3e0
[  541.061513] Read of size 8 at addr ffff8801e34a7970 by task umount/1305

[  541.063302] CPU: 1 PID: 1305 Comm: umount Tainted: G      D           4.18.0-rc1+ #4
[  541.064838] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  541.066778] Call Trace:
[  541.067294]  dump_stack+0x7b/0xb5
[  541.067986]  print_address_description+0x70/0x290
[  541.068941]  kasan_report+0x291/0x390
[  541.069692]  ? update_stack_state+0x38c/0x3e0
[  541.070598]  __asan_load8+0x54/0x90
[  541.071315]  update_stack_state+0x38c/0x3e0
[  541.072172]  ? __read_once_size_nocheck.constprop.7+0x20/0x20
[  541.073340]  ? vprintk_func+0x27/0x60
[  541.074096]  ? printk+0xa3/0xd3
[  541.074762]  ? __save_stack_trace+0x5e/0x100
[  541.075634]  unwind_next_frame.part.5+0x18e/0x490
[  541.076594]  ? unwind_dump+0x290/0x290
[  541.077368]  ? __show_regs+0x2c4/0x330
[  541.078142]  __unwind_start+0x106/0x190
[  541.085422]  __save_stack_trace+0x5e/0x100
[  541.086268]  ? __save_stack_trace+0x5e/0x100
[  541.087161]  ? unlink_anon_vmas+0xba/0x2c0
[  541.087997]  save_stack_trace+0x1f/0x30
[  541.088782]  save_stack+0x46/0xd0
[  541.089475]  ? __alloc_pages_slowpath+0x1420/0x1420
[  541.090477]  ? flush_tlb_mm_range+0x15e/0x220
[  541.091364]  ? __dec_node_state+0x24/0xb0
[  541.092180]  ? lock_page_memcg+0x85/0xf0
[  541.092979]  ? unlock_page_memcg+0x16/0x80
[  541.093812]  ? page_remove_rmap+0x198/0x520
[  541.094674]  ? mark_page_accessed+0x133/0x200
[  541.095559]  ? _cond_resched+0x1a/0x50
[  541.096326]  ? unmap_page_range+0xcd4/0xe50
[  541.097179]  ? rb_next+0x58/0x80
[  541.097845]  ? rb_next+0x58/0x80
[  541.098518]  __kasan_slab_free+0x13c/0x1a0
[  541.099352]  ? unlink_anon_vmas+0xba/0x2c0
[  541.100184]  kasan_slab_free+0xe/0x10
[  541.100934]  kmem_cache_free+0x89/0x1e0
[  541.101724]  unlink_anon_vmas+0xba/0x2c0
[  541.102534]  free_pgtables+0x101/0x1b0
[  541.103299]  exit_mmap+0x146/0x2a0
[  541.103996]  ? __ia32_sys_munmap+0x50/0x50
[  541.104829]  ? kasan_check_read+0x11/0x20
[  541.105649]  ? mm_update_next_owner+0x322/0x380
[  541.106578]  mmput+0x8b/0x1d0
[  541.107191]  do_exit+0x43a/0x1390
[  541.107876]  ? mm_update_next_owner+0x380/0x380
[  541.108791]  ? deactivate_super+0x5e/0x80
[  541.109610]  ? cleanup_mnt+0x61/0xa0
[  541.110351]  ? __cleanup_mnt+0x12/0x20
[  541.111115]  ? task_work_run+0xc8/0xf0
[  541.111879]  ? exit_to_usermode_loop+0x125/0x130
[  541.112817]  rewind_stack_do_exit+0x17/0x20
[  541.113666] RIP: 0033:0x7f46624bf487
[  541.114404] Code: Bad RIP value.
[  541.115094] RSP: 002b:00007fff5e12e9a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[  541.116605] RAX: 0000000000000000 RBX: 0000000001753030 RCX: 00007f46624bf487
[  541.118034] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000000000175a1e0
[  541.119472] RBP: 000000000175a1e0 R08: 0000000000000000 R09: 0000000000000014
[  541.120890] R10: 00000000000006b2 R11: 0000000000000246 R12: 00007f46629c883c
[  541.122321] R13: 0000000000000000 R14: 0000000001753210 R15: 00007fff5e12ec30

[  541.124061] The buggy address belongs to the page:
[  541.125042] page:ffffea00078d29c0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[  541.126651] flags: 0x2ffff0000000000()
[  541.127418] raw: 02ffff0000000000 dead000000000100 dead000000000200 0000000000000000
[  541.128963] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[  541.130516] page dumped because: kasan: bad access detected

[  541.131954] Memory state around the buggy address:
[  541.132924]  ffff8801e34a7800: 00 f1 f1 f1 f1 00 f4 f4 f4 f3 f3 f3 f3 00 00 00
[  541.134378]  ffff8801e34a7880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  541.135814] >ffff8801e34a7900: 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1
[  541.137253]                                                              ^
[  541.138637]  ffff8801e34a7980: f1 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  541.140075]  ffff8801e34a7a00: 00 00 00 00 00 00 00 00 f3 00 00 00 00 00 00 00
[  541.141509] ==================================================================

- Location
https://elixir.bootlin.com/linux/v4.18-rc1/source/fs/inode.c#L512
	BUG_ON(inode->i_data.nrpages);

The root cause is root directory inode is corrupted, it has both
inline_data and inline_dentry flag, and its nlink is zero, so in
->evict(), after dropping all page cache, it grabs page #0 for inline
data truncation, result in panic in later clear_inode() where we will
check inode->i_data.nrpages value.

This patch adds inline flags check in sanity_check_inode, in addition,
do sanity check with root inode's nlink.

Reported-by Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
nickdesaulniers pushed a commit that referenced this issue Aug 22, 2018
This patch adds f2fs_is_valid_blkaddr() in below functions to do sanity
check with block address to avoid pentential panic:
- f2fs_grab_read_bio()
- __written_first_block()

https://bugzilla.kernel.org/show_bug.cgi?id=200465

- Reproduce

- POC (poc.c)
    #define _GNU_SOURCE
    #include <sys/types.h>
    #include <sys/mount.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <sys/xattr.h>

    #include <dirent.h>
    #include <errno.h>
    #include <error.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #include <linux/falloc.h>
    #include <linux/loop.h>

    static void activity(char *mpoint) {

      char *xattr;
      int err;

      err = asprintf(&xattr, "%s/foo/bar/xattr", mpoint);

      char buf2[113];
      memset(buf2, 0, sizeof(buf2));
      listxattr(xattr, buf2, sizeof(buf2));

    }

    int main(int argc, char *argv[]) {
      activity(argv[1]);
      return 0;
    }

- kernel message
[  844.718738] F2FS-fs (loop0): Mounted with checkpoint version = 2
[  846.430929] F2FS-fs (loop0): access invalid blkaddr:1024
[  846.431058] WARNING: CPU: 1 PID: 1249 at fs/f2fs/checkpoint.c:154 f2fs_is_valid_blkaddr+0x10f/0x160
[  846.431059] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd input_leds joydev soundcore serio_raw i2c_piix4 mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 multipath linear qxl ttm crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops pcbc drm 8139too aesni_intel 8139cp floppy psmouse mii aes_x86_64 crypto_simd pata_acpi cryptd glue_helper
[  846.431310] CPU: 1 PID: 1249 Comm: a.out Not tainted 4.18.0-rc3+ #1
[  846.431312] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  846.431315] RIP: 0010:f2fs_is_valid_blkaddr+0x10f/0x160
[  846.431316] Code: 00 eb ed 31 c0 83 fa 05 75 ae 48 83 ec 08 48 8b 3f 89 f1 48 c7 c2 fc 0b 0f 8b 48 c7 c6 8b d7 09 8b 88 44 24 07 e8 61 8b ff ff <0f> 0b 0f b6 44 24 07 48 83 c4 08 eb 81 4c 8b 47 10 8b 8f 38 04 00
[  846.431347] RSP: 0018:ffff961c414a7bc0 EFLAGS: 00010282
[  846.431349] RAX: 0000000000000000 RBX: ffffc5f787b8ea80 RCX: 0000000000000000
[  846.431350] RDX: 0000000000000000 RSI: ffff89dfffd165d8 RDI: ffff89dfffd165d8
[  846.431351] RBP: ffff961c414a7c20 R08: 0000000000000001 R09: 0000000000000248
[  846.431353] R10: 0000000000000000 R11: 0000000000000248 R12: 0000000000000007
[  846.431369] R13: ffff89dff5492800 R14: ffff89dfae3aa000 R15: ffff89dff4ff88d0
[  846.431372] FS:  00007f882e2fb700(0000) GS:ffff89dfffd00000(0000) knlGS:0000000000000000
[  846.431373] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  846.431374] CR2: 0000000001a88008 CR3: 00000001eb572000 CR4: 00000000000006e0
[  846.431384] Call Trace:
[  846.431426]  f2fs_iget+0x6f4/0xe70
[  846.431430]  ? f2fs_find_entry+0x71/0x90
[  846.431432]  f2fs_lookup+0x1aa/0x390
[  846.431452]  __lookup_slow+0x97/0x150
[  846.431459]  lookup_slow+0x35/0x50
[  846.431462]  walk_component+0x1c6/0x470
[  846.431479]  ? memcg_kmem_charge_memcg+0x70/0x90
[  846.431488]  ? page_add_file_rmap+0x13/0x200
[  846.431491]  path_lookupat+0x76/0x230
[  846.431501]  ? __alloc_pages_nodemask+0xfc/0x280
[  846.431504]  filename_lookup+0xb8/0x1a0
[  846.431534]  ? _cond_resched+0x16/0x40
[  846.431541]  ? kmem_cache_alloc+0x160/0x1d0
[  846.431549]  ? path_listxattr+0x41/0xa0
[  846.431551]  path_listxattr+0x41/0xa0
[  846.431570]  do_syscall_64+0x55/0x100
[  846.431583]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  846.431607] RIP: 0033:0x7f882de1c0d7
[  846.431607] Code: f0 ff ff 73 01 c3 48 8b 0d be dd 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 c2 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 91 dd 2b 00 f7 d8 64 89 01 48
[  846.431639] RSP: 002b:00007ffe8e66c238 EFLAGS: 00000202 ORIG_RAX: 00000000000000c2
[  846.431641] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f882de1c0d7
[  846.431642] RDX: 0000000000000071 RSI: 00007ffe8e66c280 RDI: 0000000001a880c0
[  846.431643] RBP: 00007ffe8e66c300 R08: 0000000001a88010 R09: 0000000000000000
[  846.431645] R10: 00000000000001ab R11: 0000000000000202 R12: 0000000000400550
[  846.431646] R13: 00007ffe8e66c400 R14: 0000000000000000 R15: 0000000000000000
[  846.431648] ---[ end trace abca54df39d14f5c ]---
[  846.431651] F2FS-fs (loop0): invalid blkaddr: 1024, type: 5, run fsck to fix.
[  846.431762] WARNING: CPU: 1 PID: 1249 at fs/f2fs/f2fs.h:2697 f2fs_iget+0xd17/0xe70
[  846.431763] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd input_leds joydev soundcore serio_raw i2c_piix4 mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 multipath linear qxl ttm crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops pcbc drm 8139too aesni_intel 8139cp floppy psmouse mii aes_x86_64 crypto_simd pata_acpi cryptd glue_helper
[  846.431797] CPU: 1 PID: 1249 Comm: a.out Tainted: G        W         4.18.0-rc3+ #1
[  846.431798] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  846.431800] RIP: 0010:f2fs_iget+0xd17/0xe70
[  846.431801] Code: ff ff 48 63 d8 e9 e1 f6 ff ff 48 8b 45 c8 41 b8 05 00 00 00 48 c7 c2 d8 e8 0e 8b 48 c7 c6 1d b0 0a 8b 48 8b 38 e8 f9 b4 00 00 <0f> 0b 48 8b 45 c8 f0 80 48 48 04 e9 d8 f9 ff ff 0f 0b 48 8b 43 18
[  846.431832] RSP: 0018:ffff961c414a7bd0 EFLAGS: 00010282
[  846.431834] RAX: 0000000000000000 RBX: ffffc5f787b8ea80 RCX: 0000000000000006
[  846.431835] RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffff89dfffd165d0
[  846.431836] RBP: ffff961c414a7c20 R08: 0000000000000000 R09: 0000000000000273
[  846.431837] R10: 0000000000000000 R11: ffff89dfad50ca60 R12: 0000000000000007
[  846.431838] R13: ffff89dff5492800 R14: ffff89dfae3aa000 R15: ffff89dff4ff88d0
[  846.431840] FS:  00007f882e2fb700(0000) GS:ffff89dfffd00000(0000) knlGS:0000000000000000
[  846.431841] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  846.431842] CR2: 0000000001a88008 CR3: 00000001eb572000 CR4: 00000000000006e0
[  846.431846] Call Trace:
[  846.431850]  ? f2fs_find_entry+0x71/0x90
[  846.431853]  f2fs_lookup+0x1aa/0x390
[  846.431856]  __lookup_slow+0x97/0x150
[  846.431858]  lookup_slow+0x35/0x50
[  846.431874]  walk_component+0x1c6/0x470
[  846.431878]  ? memcg_kmem_charge_memcg+0x70/0x90
[  846.431880]  ? page_add_file_rmap+0x13/0x200
[  846.431882]  path_lookupat+0x76/0x230
[  846.431884]  ? __alloc_pages_nodemask+0xfc/0x280
[  846.431886]  filename_lookup+0xb8/0x1a0
[  846.431890]  ? _cond_resched+0x16/0x40
[  846.431891]  ? kmem_cache_alloc+0x160/0x1d0
[  846.431894]  ? path_listxattr+0x41/0xa0
[  846.431896]  path_listxattr+0x41/0xa0
[  846.431898]  do_syscall_64+0x55/0x100
[  846.431901]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  846.431902] RIP: 0033:0x7f882de1c0d7
[  846.431903] Code: f0 ff ff 73 01 c3 48 8b 0d be dd 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 c2 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 91 dd 2b 00 f7 d8 64 89 01 48
[  846.431934] RSP: 002b:00007ffe8e66c238 EFLAGS: 00000202 ORIG_RAX: 00000000000000c2
[  846.431936] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f882de1c0d7
[  846.431937] RDX: 0000000000000071 RSI: 00007ffe8e66c280 RDI: 0000000001a880c0
[  846.431939] RBP: 00007ffe8e66c300 R08: 0000000001a88010 R09: 0000000000000000
[  846.431940] R10: 00000000000001ab R11: 0000000000000202 R12: 0000000000400550
[  846.431941] R13: 00007ffe8e66c400 R14: 0000000000000000 R15: 0000000000000000
[  846.431943] ---[ end trace abca54df39d14f5d ]---
[  846.432033] F2FS-fs (loop0): access invalid blkaddr:1024
[  846.432051] WARNING: CPU: 1 PID: 1249 at fs/f2fs/checkpoint.c:154 f2fs_is_valid_blkaddr+0x10f/0x160
[  846.432051] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd input_leds joydev soundcore serio_raw i2c_piix4 mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 multipath linear qxl ttm crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops pcbc drm 8139too aesni_intel 8139cp floppy psmouse mii aes_x86_64 crypto_simd pata_acpi cryptd glue_helper
[  846.432085] CPU: 1 PID: 1249 Comm: a.out Tainted: G        W         4.18.0-rc3+ #1
[  846.432086] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  846.432089] RIP: 0010:f2fs_is_valid_blkaddr+0x10f/0x160
[  846.432089] Code: 00 eb ed 31 c0 83 fa 05 75 ae 48 83 ec 08 48 8b 3f 89 f1 48 c7 c2 fc 0b 0f 8b 48 c7 c6 8b d7 09 8b 88 44 24 07 e8 61 8b ff ff <0f> 0b 0f b6 44 24 07 48 83 c4 08 eb 81 4c 8b 47 10 8b 8f 38 04 00
[  846.432120] RSP: 0018:ffff961c414a7900 EFLAGS: 00010286
[  846.432122] RAX: 0000000000000000 RBX: 0000000000000400 RCX: 0000000000000006
[  846.432123] RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffff89dfffd165d0
[  846.432124] RBP: ffff89dff5492800 R08: 0000000000000001 R09: 000000000000029d
[  846.432125] R10: ffff961c414a7820 R11: 000000000000029d R12: 0000000000000400
[  846.432126] R13: 0000000000000000 R14: ffff89dff4ff88d0 R15: 0000000000000000
[  846.432128] FS:  00007f882e2fb700(0000) GS:ffff89dfffd00000(0000) knlGS:0000000000000000
[  846.432130] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  846.432131] CR2: 0000000001a88008 CR3: 00000001eb572000 CR4: 00000000000006e0
[  846.432135] Call Trace:
[  846.432151]  f2fs_wait_on_block_writeback+0x20/0x110
[  846.432158]  f2fs_grab_read_bio+0xbc/0xe0
[  846.432161]  f2fs_submit_page_read+0x21/0x280
[  846.432163]  f2fs_get_read_data_page+0xb7/0x3c0
[  846.432165]  f2fs_get_lock_data_page+0x29/0x1e0
[  846.432167]  f2fs_get_new_data_page+0x148/0x550
[  846.432170]  f2fs_add_regular_entry+0x1d2/0x550
[  846.432178]  ? __switch_to+0x12f/0x460
[  846.432181]  f2fs_add_dentry+0x6a/0xd0
[  846.432184]  f2fs_do_add_link+0xe9/0x140
[  846.432186]  __recover_dot_dentries+0x260/0x280
[  846.432189]  f2fs_lookup+0x343/0x390
[  846.432193]  __lookup_slow+0x97/0x150
[  846.432195]  lookup_slow+0x35/0x50
[  846.432208]  walk_component+0x1c6/0x470
[  846.432212]  ? memcg_kmem_charge_memcg+0x70/0x90
[  846.432215]  ? page_add_file_rmap+0x13/0x200
[  846.432217]  path_lookupat+0x76/0x230
[  846.432219]  ? __alloc_pages_nodemask+0xfc/0x280
[  846.432221]  filename_lookup+0xb8/0x1a0
[  846.432224]  ? _cond_resched+0x16/0x40
[  846.432226]  ? kmem_cache_alloc+0x160/0x1d0
[  846.432228]  ? path_listxattr+0x41/0xa0
[  846.432230]  path_listxattr+0x41/0xa0
[  846.432233]  do_syscall_64+0x55/0x100
[  846.432235]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  846.432237] RIP: 0033:0x7f882de1c0d7
[  846.432237] Code: f0 ff ff 73 01 c3 48 8b 0d be dd 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 c2 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 91 dd 2b 00 f7 d8 64 89 01 48
[  846.432269] RSP: 002b:00007ffe8e66c238 EFLAGS: 00000202 ORIG_RAX: 00000000000000c2
[  846.432271] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f882de1c0d7
[  846.432272] RDX: 0000000000000071 RSI: 00007ffe8e66c280 RDI: 0000000001a880c0
[  846.432273] RBP: 00007ffe8e66c300 R08: 0000000001a88010 R09: 0000000000000000
[  846.432274] R10: 00000000000001ab R11: 0000000000000202 R12: 0000000000400550
[  846.432275] R13: 00007ffe8e66c400 R14: 0000000000000000 R15: 0000000000000000
[  846.432277] ---[ end trace abca54df39d14f5e ]---
[  846.432279] F2FS-fs (loop0): invalid blkaddr: 1024, type: 5, run fsck to fix.
[  846.432376] WARNING: CPU: 1 PID: 1249 at fs/f2fs/f2fs.h:2697 f2fs_wait_on_block_writeback+0xb1/0x110
[  846.432376] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd input_leds joydev soundcore serio_raw i2c_piix4 mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 multipath linear qxl ttm crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops pcbc drm 8139too aesni_intel 8139cp floppy psmouse mii aes_x86_64 crypto_simd pata_acpi cryptd glue_helper
[  846.432410] CPU: 1 PID: 1249 Comm: a.out Tainted: G        W         4.18.0-rc3+ #1
[  846.432411] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  846.432413] RIP: 0010:f2fs_wait_on_block_writeback+0xb1/0x110
[  846.432414] Code: 66 90 f0 ff 4b 34 74 59 5b 5d c3 48 8b 7d 00 41 b8 05 00 00 00 89 d9 48 c7 c2 d8 e8 0e 8b 48 c7 c6 1d b0 0a 8b e8 df bc fd ff <0f> 0b f0 80 4d 48 04 e9 67 ff ff ff 48 8b 03 48 c1 e8 37 83 e0 07
[  846.432445] RSP: 0018:ffff961c414a7910 EFLAGS: 00010286
[  846.432447] RAX: 0000000000000000 RBX: 0000000000000400 RCX: 0000000000000006
[  846.432448] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff89dfffd165d0
[  846.432449] RBP: ffff89dff5492800 R08: 0000000000000000 R09: 00000000000002d1
[  846.432450] R10: ffff961c414a7820 R11: ffff89dfad50cf80 R12: 0000000000000400
[  846.432451] R13: 0000000000000000 R14: ffff89dff4ff88d0 R15: 0000000000000000
[  846.432453] FS:  00007f882e2fb700(0000) GS:ffff89dfffd00000(0000) knlGS:0000000000000000
[  846.432454] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  846.432455] CR2: 0000000001a88008 CR3: 00000001eb572000 CR4: 00000000000006e0
[  846.432459] Call Trace:
[  846.432463]  f2fs_grab_read_bio+0xbc/0xe0
[  846.432464]  f2fs_submit_page_read+0x21/0x280
[  846.432466]  f2fs_get_read_data_page+0xb7/0x3c0
[  846.432468]  f2fs_get_lock_data_page+0x29/0x1e0
[  846.432470]  f2fs_get_new_data_page+0x148/0x550
[  846.432473]  f2fs_add_regular_entry+0x1d2/0x550
[  846.432475]  ? __switch_to+0x12f/0x460
[  846.432477]  f2fs_add_dentry+0x6a/0xd0
[  846.432480]  f2fs_do_add_link+0xe9/0x140
[  846.432483]  __recover_dot_dentries+0x260/0x280
[  846.432485]  f2fs_lookup+0x343/0x390
[  846.432488]  __lookup_slow+0x97/0x150
[  846.432490]  lookup_slow+0x35/0x50
[  846.432505]  walk_component+0x1c6/0x470
[  846.432509]  ? memcg_kmem_charge_memcg+0x70/0x90
[  846.432511]  ? page_add_file_rmap+0x13/0x200
[  846.432513]  path_lookupat+0x76/0x230
[  846.432515]  ? __alloc_pages_nodemask+0xfc/0x280
[  846.432517]  filename_lookup+0xb8/0x1a0
[  846.432520]  ? _cond_resched+0x16/0x40
[  846.432522]  ? kmem_cache_alloc+0x160/0x1d0
[  846.432525]  ? path_listxattr+0x41/0xa0
[  846.432526]  path_listxattr+0x41/0xa0
[  846.432529]  do_syscall_64+0x55/0x100
[  846.432531]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  846.432533] RIP: 0033:0x7f882de1c0d7
[  846.432533] Code: f0 ff ff 73 01 c3 48 8b 0d be dd 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 c2 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 91 dd 2b 00 f7 d8 64 89 01 48
[  846.432565] RSP: 002b:00007ffe8e66c238 EFLAGS: 00000202 ORIG_RAX: 00000000000000c2
[  846.432567] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f882de1c0d7
[  846.432568] RDX: 0000000000000071 RSI: 00007ffe8e66c280 RDI: 0000000001a880c0
[  846.432569] RBP: 00007ffe8e66c300 R08: 0000000001a88010 R09: 0000000000000000
[  846.432570] R10: 00000000000001ab R11: 0000000000000202 R12: 0000000000400550
[  846.432571] R13: 00007ffe8e66c400 R14: 0000000000000000 R15: 0000000000000000
[  846.432573] ---[ end trace abca54df39d14f5f ]---
[  846.434280] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[  846.434424] PGD 80000001ebd3a067 P4D 80000001ebd3a067 PUD 1eb1ae067 PMD 0
[  846.434551] Oops: 0000 [#1] SMP PTI
[  846.434697] CPU: 0 PID: 44 Comm: kworker/u5:0 Tainted: G        W         4.18.0-rc3+ #1
[  846.434805] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  846.435000] Workqueue: fscrypt_read_queue decrypt_work
[  846.435174] RIP: 0010:fscrypt_do_page_crypto+0x6e/0x2d0
[  846.435351] Code: 00 65 48 8b 04 25 28 00 00 00 48 89 84 24 88 00 00 00 31 c0 e8 43 c2 e0 ff 49 8b 86 48 02 00 00 85 ed c7 44 24 70 00 00 00 00 <48> 8b 58 08 0f 84 14 02 00 00 48 8b 78 10 48 8b 0c 24 48 c7 84 24
[  846.435696] RSP: 0018:ffff961c40f9bd60 EFLAGS: 00010206
[  846.435870] RAX: 0000000000000000 RBX: ffffc5f787719b80 RCX: ffffc5f787719b80
[  846.436051] RDX: ffffffff8b9f4b88 RSI: ffffffff8b0ae622 RDI: ffff961c40f9bdb8
[  846.436261] RBP: 0000000000001000 R08: ffffc5f787719b80 R09: 0000000000001000
[  846.436433] R10: 0000000000000018 R11: fefefefefefefeff R12: ffffc5f787719b80
[  846.436562] R13: ffffc5f787719b80 R14: ffff89dff4ff88d0 R15: 0ffff89dfaddee60
[  846.436658] FS:  0000000000000000(0000) GS:ffff89dfffc00000(0000) knlGS:0000000000000000
[  846.436758] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  846.436898] CR2: 0000000000000008 CR3: 00000001eddd0000 CR4: 00000000000006f0
[  846.437001] Call Trace:
[  846.437181]  ? check_preempt_wakeup+0xf2/0x230
[  846.437276]  ? check_preempt_curr+0x7c/0x90
[  846.437370]  fscrypt_decrypt_page+0x48/0x4d
[  846.437466]  __fscrypt_decrypt_bio+0x5b/0x90
[  846.437542]  decrypt_work+0x12/0x20
[  846.437651]  process_one_work+0x15e/0x3d0
[  846.437740]  worker_thread+0x4c/0x440
[  846.437848]  kthread+0xf8/0x130
[  846.437938]  ? rescuer_thread+0x350/0x350
[  846.438022]  ? kthread_associate_blkcg+0x90/0x90
[  846.438117]  ret_from_fork+0x35/0x40
[  846.438201] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd input_leds joydev soundcore serio_raw i2c_piix4 mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 multipath linear qxl ttm crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops pcbc drm 8139too aesni_intel 8139cp floppy psmouse mii aes_x86_64 crypto_simd pata_acpi cryptd glue_helper
[  846.438653] CR2: 0000000000000008
[  846.438713] ---[ end trace abca54df39d14f60 ]---
[  846.438796] RIP: 0010:fscrypt_do_page_crypto+0x6e/0x2d0
[  846.438844] Code: 00 65 48 8b 04 25 28 00 00 00 48 89 84 24 88 00 00 00 31 c0 e8 43 c2 e0 ff 49 8b 86 48 02 00 00 85 ed c7 44 24 70 00 00 00 00 <48> 8b 58 08 0f 84 14 02 00 00 48 8b 78 10 48 8b 0c 24 48 c7 84 24
[  846.439084] RSP: 0018:ffff961c40f9bd60 EFLAGS: 00010206
[  846.439176] RAX: 0000000000000000 RBX: ffffc5f787719b80 RCX: ffffc5f787719b80
[  846.440927] RDX: ffffffff8b9f4b88 RSI: ffffffff8b0ae622 RDI: ffff961c40f9bdb8
[  846.442083] RBP: 0000000000001000 R08: ffffc5f787719b80 R09: 0000000000001000
[  846.443284] R10: 0000000000000018 R11: fefefefefefefeff R12: ffffc5f787719b80
[  846.444448] R13: ffffc5f787719b80 R14: ffff89dff4ff88d0 R15: 0ffff89dfaddee60
[  846.445558] FS:  0000000000000000(0000) GS:ffff89dfffc00000(0000) knlGS:0000000000000000
[  846.446687] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  846.447796] CR2: 0000000000000008 CR3: 00000001eddd0000 CR4: 00000000000006f0

- Location
https://elixir.bootlin.com/linux/v4.18-rc4/source/fs/crypto/crypto.c#L149
	struct crypto_skcipher *tfm = ci->ci_ctfm;
Here ci can be NULL

Note that this issue maybe require CONFIG_F2FS_FS_ENCRYPTION=y to reproduce.

Reported-by Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
nickdesaulniers pushed a commit that referenced this issue Aug 22, 2018
Currently, whenever a new node is created/re-used from the memhotplug
path, we call free_area_init_node()->free_area_init_core().  But there is
some code that we do not really need to run when we are coming from such
path.

free_area_init_core() performs the following actions:

1) Initializes pgdat internals, such as spinlock, waitqueues and more.
2) Account # nr_all_pages and # nr_kernel_pages. These values are used later on
   when creating hash tables.
3) Account number of managed_pages per zone, substracting dma_reserved and
   memmap pages.
4) Initializes some fields of the zone structure data
5) Calls init_currently_empty_zone to initialize all the freelists
6) Calls memmap_init to initialize all pages belonging to certain zone

When called from memhotplug path, free_area_init_core() only performs
actions #1 and #4.

Action #2 is pointless as the zones do not have any pages since either the
node was freed, or we are re-using it, eitherway all zones belonging to
this node should have 0 pages.  For the same reason, action #3 results
always in manages_pages being 0.

Action #5 and #6 are performed later on when onlining the pages:
 online_pages()->move_pfn_range_to_zone()->init_currently_empty_zone()
 online_pages()->move_pfn_range_to_zone()->memmap_init_zone()

This patch does two things:

First, moves the node/zone initializtion to their own function, so it
allows us to create a small version of free_area_init_core, where we only
perform:

1) Initialization of pgdat internals, such as spinlock, waitqueues and more
4) Initialization of some fields of the zone structure data

These two functions are: pgdat_init_internals() and zone_init_internals().

The second thing this patch does, is to introduce
free_area_init_core_hotplug(), the memhotplug version of
free_area_init_core():

Currently, we call free_area_init_node() from the memhotplug path.  In
there, we set some pgdat's fields, and call calculate_node_totalpages().
calculate_node_totalpages() calculates the # of pages the node has.

Since the node is either new, or we are re-using it, the zones belonging
to this node should not have any pages, so there is no point to calculate
this now.

Actually, we re-set these values to 0 later on with the calls to:

reset_node_managed_pages()
reset_node_present_pages()

The # of pages per node and the # of pages per zone will be calculated when
onlining the pages:

online_pages()->move_pfn_range()->move_pfn_range_to_zone()->resize_zone_range()
online_pages()->move_pfn_range()->move_pfn_range_to_zone()->resize_pgdat_range()

Also, since free_area_init_core/free_area_init_node will now only get called during early init, let us replace
__paginginit with __init, so their code gets freed up.

[osalvador@techadventures.net: fix section usage]
  Link: http://lkml.kernel.org/r/20180731101752.GA473@techadventures.net
[osalvador@suse.de: v6]
  Link: http://lkml.kernel.org/r/20180801122348.21588-6-osalvador@techadventures.net
Link: http://lkml.kernel.org/r/20180730101757.28058-5-osalvador@techadventures.net
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com>
Cc: Aaron Lu <aaron.lu@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
nickdesaulniers pushed a commit that referenced this issue Aug 22, 2018
Patch series "add support for relative references in special sections", v10.

This adds support for emitting special sections such as initcall arrays,
PCI fixups and tracepoints as relative references rather than absolute
references.  This reduces the size by 50% on 64-bit architectures, but
more importantly, it removes the need for carrying relocation metadata for
these sections in relocatable kernels (e.g., for KASLR) that needs to be
fixed up at boot time.  On arm64, this reduces the vmlinux footprint of
such a reference by 8x (8 byte absolute reference + 24 byte RELA entry vs
4 byte relative reference)

Patch #3 was sent out before as a single patch.  This series supersedes
the previous submission.  This version makes relative ksymtab entries
dependent on the new Kconfig symbol HAVE_ARCH_PREL32_RELOCATIONS rather
than trying to infer from kbuild test robot replies for which
architectures it should be blacklisted.

Patch #1 introduces the new Kconfig symbol HAVE_ARCH_PREL32_RELOCATIONS,
and sets it for the main architectures that are expected to benefit the
most from this feature, i.e., 64-bit architectures or ones that use
runtime relocations.

Patch #2 add support for #define'ing __DISABLE_EXPORTS to get rid of
ksymtab/kcrctab sections in decompressor and EFI stub objects when
rebuilding existing C files to run in a different context.

Patches #4 - #6 implement relative references for initcalls, PCI fixups
and tracepoints, respectively, all of which produce sections with order
~1000 entries on an arm64 defconfig kernel with tracing enabled.  This
means we save about 28 KB of vmlinux space for each of these patches.

[From the v7 series blurb, which included the jump_label patches as well]:

  For the arm64 kernel, all patches combined reduce the memory footprint
  of vmlinux by about 1.3 MB (using a config copied from Ubuntu that has
  KASLR enabled), of which ~1 MB is the size reduction of the RELA section
  in .init, and the remaining 300 KB is reduction of .text/.data.

This patch (of 6):

Before updating certain subsystems to use place relative 32-bit
relocations in special sections, to save space and reduce the number of
absolute relocations that need to be processed at runtime by relocatable
kernels, introduce the Kconfig symbol and define it for some architectures
that should be able to support and benefit from it.

Link: http://lkml.kernel.org/r/20180704083651.24360-2-ard.biesheuvel@linaro.org
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: James Morris <jmorris@namei.org>
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
Cc: James Morris <james.morris@microsoft.com>
Cc: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
nickdesaulniers pushed a commit that referenced this issue Sep 11, 2018
Test case btrfs/164 reports use-after-free:

[ 6712.084324] general protection fault: 0000 [#1] PREEMPT SMP
..
[ 6712.195423]  btrfs_update_commit_device_size+0x75/0xf0 [btrfs]
[ 6712.201424]  btrfs_commit_transaction+0x57d/0xa90 [btrfs]
[ 6712.206999]  btrfs_rm_device+0x627/0x850 [btrfs]
[ 6712.211800]  btrfs_ioctl+0x2b03/0x3120 [btrfs]

Reason for this is that btrfs_shrink_device adds the resized device to
the fs_devices::resized_devices after it has called the last commit
transaction.

So the list fs_devices::resized_devices is not empty when
btrfs_shrink_device returns.  Now the parent function
btrfs_rm_device calls:

        btrfs_close_bdev(device);
        call_rcu(&device->rcu, free_device_rcu);

and then does the transactio ncommit. It goes through the
fs_devices::resized_devices in btrfs_update_commit_device_size and
leads to use-after-free.

Fix this by making sure btrfs_shrink_device calls the last needed
btrfs_commit_transaction before the return. This is consistent with what
the grow counterpart does and this makes sure the on-disk state is
persistent when the function returns.

Reported-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Tested-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
nickdesaulniers pushed a commit that referenced this issue Sep 11, 2018
This patch fixes sleep-in-atomic bugs in AES-CBC and AES-XTS VMX
implementations. The problem is that the blkcipher_* functions should
not be called in atomic context.

The bugs can be reproduced via the AF_ALG interface by trying to
encrypt/decrypt sufficiently large buffers (at least 64 KiB) using the
VMX implementations of 'cbc(aes)' or 'xts(aes)'. Such operations then
trigger BUG in crypto_yield():

[  891.863680] BUG: sleeping function called from invalid context at include/crypto/algapi.h:424
[  891.864622] in_atomic(): 1, irqs_disabled(): 0, pid: 12347, name: kcapi-enc
[  891.864739] 1 lock held by kcapi-enc/12347:
[  891.864811]  #0: 00000000f5d42c46 (sk_lock-AF_ALG){+.+.}, at: skcipher_recvmsg+0x50/0x530
[  891.865076] CPU: 5 PID: 12347 Comm: kcapi-enc Not tainted 4.19.0-0.rc0.git3.1.fc30.ppc64le #1
[  891.865251] Call Trace:
[  891.865340] [c0000003387578c0] [c000000000d67ea4] dump_stack+0xe8/0x164 (unreliable)
[  891.865511] [c000000338757910] [c000000000172a58] ___might_sleep+0x2f8/0x310
[  891.865679] [c000000338757990] [c0000000006bff74] blkcipher_walk_done+0x374/0x4a0
[  891.865825] [c0000003387579e0] [d000000007e73e70] p8_aes_cbc_encrypt+0x1c8/0x260 [vmx_crypto]
[  891.865993] [c000000338757ad0] [c0000000006c0ee0] skcipher_encrypt_blkcipher+0x60/0x80
[  891.866128] [c000000338757b10] [c0000000006ec504] skcipher_recvmsg+0x424/0x530
[  891.866283] [c000000338757bd0] [c000000000b00654] sock_recvmsg+0x74/0xa0
[  891.866403] [c000000338757c10] [c000000000b00f64] ___sys_recvmsg+0xf4/0x2f0
[  891.866515] [c000000338757d90] [c000000000b02bb8] __sys_recvmsg+0x68/0xe0
[  891.866631] [c000000338757e30] [c00000000000bbe4] system_call+0x5c/0x70

Fixes: 8c755ac ("crypto: vmx - Adding CBC routines for VMX module")
Fixes: c07f5d3 ("crypto: vmx - Adding support for XTS")
Cc: stable@vger.kernel.org
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
nickdesaulniers pushed a commit that referenced this issue Sep 11, 2018
I changed the way mac80211 updates the PM state of the peer.
I forgot that we could also have multicast frames from the
peer and that those frame should of course not change the
PM state of the peer: A peer goes to power save when it
needs to scan, but it won't send the broadcast Probe Request
with the PM bit set.

This made us mark the peer as awake when it wasn't and then
Intel's firmware would fail to transmit because the peer is
asleep according to its database. The driver warned about
this and it looked like this:

 WARNING: CPU: 0 PID: 184 at /usr/src/linux-4.16.14/drivers/net/wireless/intel/iwlwifi/mvm/tx.c:1369 iwl_mvm_rx_tx_cmd+0x53b/0x860
 CPU: 0 PID: 184 Comm: irq/124-iwlwifi Not tainted 4.16.14 #1
 RIP: 0010:iwl_mvm_rx_tx_cmd+0x53b/0x860
 Call Trace:
  iwl_pcie_rx_handle+0x220/0x880
  iwl_pcie_irq_handler+0x6c9/0xa20
  ? irq_forced_thread_fn+0x60/0x60
  ? irq_thread_dtor+0x90/0x90

The relevant code that spits the WARNING is:

        case TX_STATUS_FAIL_DEST_PS:
                /* the FW should have stopped the queue and not
                 * return this status
                 */
                WARN_ON(1);
                info->flags |= IEEE80211_TX_STAT_TX_FILTERED;

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=199967.

Fixes: 9fef654 ("mac80211: always update the PM state of a peer on MGMT / DATA frames")
Cc: <stable@vger.kernel.org>   #4.16+
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
nickdesaulniers pushed a commit that referenced this issue Sep 11, 2018
in the (rare) case of failure in nla_nest_start(), missing NULL checks in
tcf_pedit_key_ex_dump() can make the following command

 # tc action add action pedit ex munge ip ttl set 64

dereference a NULL pointer:

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
 PGD 800000007d1cd067 P4D 800000007d1cd067 PUD 7acd3067 PMD 0
 Oops: 0002 [#1] SMP PTI
 CPU: 0 PID: 3336 Comm: tc Tainted: G            E     4.18.0.pedit+ #425
 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
 RIP: 0010:tcf_pedit_dump+0x19d/0x358 [act_pedit]
 Code: be 02 00 00 00 48 89 df 66 89 44 24 20 e8 9b b1 fd e0 85 c0 75 46 8b 83 c8 00 00 00 49 83 c5 08 48 03 83 d0 00 00 00 4d 39 f5 <66> 89 04 25 00 00 00 00 0f 84 81 01 00 00 41 8b 45 00 48 8d 4c 24
 RSP: 0018:ffffb5d4004478a8 EFLAGS: 00010246
 RAX: ffff8880fcda2070 RBX: ffff8880fadd2900 RCX: 0000000000000000
 RDX: 0000000000000002 RSI: ffffb5d4004478ca RDI: ffff8880fcda206e
 RBP: ffff8880fb9cb900 R08: 0000000000000008 R09: ffff8880fcda206e
 R10: ffff8880fadd2900 R11: 0000000000000000 R12: ffff8880fd26cf40
 R13: ffff8880fc957430 R14: ffff8880fc957430 R15: ffff8880fb9cb988
 FS:  00007f75a537a740(0000) GS:ffff8880fda00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 000000007a2fa005 CR4: 00000000001606f0
 Call Trace:
  ? __nla_reserve+0x38/0x50
  tcf_action_dump_1+0xd2/0x130
  tcf_action_dump+0x6a/0xf0
  tca_get_fill.constprop.31+0xa3/0x120
  tcf_action_add+0xd1/0x170
  tc_ctl_action+0x137/0x150
  rtnetlink_rcv_msg+0x263/0x2d0
  ? _cond_resched+0x15/0x40
  ? rtnl_calcit.isra.30+0x110/0x110
  netlink_rcv_skb+0x4d/0x130
  netlink_unicast+0x1a3/0x250
  netlink_sendmsg+0x2ae/0x3a0
  sock_sendmsg+0x36/0x40
  ___sys_sendmsg+0x26f/0x2d0
  ? do_wp_page+0x8e/0x5f0
  ? handle_pte_fault+0x6c3/0xf50
  ? __handle_mm_fault+0x38e/0x520
  ? __sys_sendmsg+0x5e/0xa0
  __sys_sendmsg+0x5e/0xa0
  do_syscall_64+0x5b/0x180
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7f75a4583ba0
 Code: c3 48 8b 05 f2 62 2c 00 f7 db 64 89 18 48 83 cb ff eb dd 0f 1f 80 00 00 00 00 83 3d fd c3 2c 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ae cc 00 00 48 89 04 24
 RSP: 002b:00007fff60ee7418 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
 RAX: ffffffffffffffda RBX: 00007fff60ee7540 RCX: 00007f75a4583ba0
 RDX: 0000000000000000 RSI: 00007fff60ee7490 RDI: 0000000000000003
 RBP: 000000005b842d3e R08: 0000000000000002 R09: 0000000000000000
 R10: 00007fff60ee6ea0 R11: 0000000000000246 R12: 0000000000000000
 R13: 00007fff60ee7554 R14: 0000000000000001 R15: 000000000066c100
 Modules linked in: act_pedit(E) ip6table_filter ip6_tables iptable_filter binfmt_misc crct10dif_pclmul ext4 crc32_pclmul mbcache ghash_clmulni_intel jbd2 pcbc snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd snd_timer cryptd glue_helper snd joydev pcspkr soundcore virtio_balloon i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi virtio_net net_failover virtio_blk virtio_console failover qxl crc32c_intel drm_kms_helper syscopyarea serio_raw sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix virtio_pci libata virtio_ring i2c_core virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_pedit]
 CR2: 0000000000000000

Like it's done for other TC actions, give up dumping pedit rules and return
an error if nla_nest_start() returns NULL.

Fixes: 71d0ed7 ("net/act_pedit: Support using offset relative to the conventional network headers")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
nickdesaulniers pushed a commit that referenced this issue Sep 11, 2018
Commit 15e6680 ("ipv6: reorder icmpv6_init() and ip6_mr_init()")
moved the cleanup label for ipmr_fail, but should have changed the
contents of the cleanup labels as well. Now we can end up cleaning up
icmpv6 even though it hasn't been initialized (jump to icmp_fail or
ipmr_fail).

Simply undo things in the reverse order of their initialization.

Example of panic (triggered by faking a failure of icmpv6_init):

    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI
    [...]
    RIP: 0010:__list_del_entry_valid+0x79/0x160
    [...]
    Call Trace:
     ? lock_release+0x8a0/0x8a0
     unregister_pernet_operations+0xd4/0x560
     ? ops_free_list+0x480/0x480
     ? down_write+0x91/0x130
     ? unregister_pernet_subsys+0x15/0x30
     ? down_read+0x1b0/0x1b0
     ? up_read+0x110/0x110
     ? kmem_cache_create_usercopy+0x1b4/0x240
     unregister_pernet_subsys+0x1d/0x30
     icmpv6_cleanup+0x1d/0x30
     inet6_init+0x1b5/0x23f

Fixes: 15e6680 ("ipv6: reorder icmpv6_init() and ip6_mr_init()")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
nickdesaulniers pushed a commit that referenced this issue Sep 11, 2018
Commit 6d0bfe2 ("net: ipv6: Add IPv6 support to the ping socket.")
contains an error in the cleanup path of inet6_init(): when
proto_register(&pingv6_prot, 1) fails, we try to unregister
&pingv6_prot. When rawv6_init() fails, we skip unregistering
&pingv6_prot.

Example of panic (triggered by faking a failure of
 proto_register(&pingv6_prot, 1)):

    general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI
    [...]
    RIP: 0010:__list_del_entry_valid+0x79/0x160
    [...]
    Call Trace:
     proto_unregister+0xbb/0x550
     ? trace_preempt_on+0x6f0/0x6f0
     ? sock_no_shutdown+0x10/0x10
     inet6_init+0x153/0x1b8

Fixes: 6d0bfe2 ("net: ipv6: Add IPv6 support to the ping socket.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
nickdesaulniers pushed a commit that referenced this issue Sep 11, 2018
…registered

rtnl_unregister_all(PF_INET6) gets called from inet6_init in cases when
no handler has been registered for PF_INET6 yet, for example if
ip6_mr_init() fails. Abort and avoid a NULL pointer deref in that case.

Example of panic (triggered by faking a failure of
 register_pernet_subsys):

    general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI
    [...]
    RIP: 0010:rtnl_unregister_all+0x17e/0x2a0
    [...]
    Call Trace:
     ? rtnetlink_net_init+0x250/0x250
     ? sock_unregister+0x103/0x160
     ? kernel_getsockopt+0x200/0x200
     inet6_init+0x197/0x20d

Fixes: e2fddf5 ("[IPV6]: Make af_inet6 to check ip6_route_init return value.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
nickdesaulniers pushed a commit that referenced this issue Sep 11, 2018
Commit 6d526ee ("arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA")
only enabled HOLES_IN_ZONE for NUMA systems because the NUMA code was
choking on the missing zone for nomap pages. This problem doesn't just
apply to NUMA systems.

If the architecture doesn't set HAVE_ARCH_PFN_VALID, pfn_valid() will
return true if the pfn is part of a valid sparsemem section.

When working with multiple pages, the mm code uses pfn_valid_within()
to test each page it uses within the sparsemem section is valid. On
most systems memory comes in MAX_ORDER_NR_PAGES chunks which all
have valid/initialised struct pages. In this case pfn_valid_within()
is optimised out.

Systems where this isn't true (e.g. due to nomap) should set
HOLES_IN_ZONE and provide HAVE_ARCH_PFN_VALID so that mm tests each
page as it works with it.

Currently non-NUMA arm64 systems can't enable HOLES_IN_ZONE, leading to
a VM_BUG_ON():

| page:fffffdff802e1780 is uninitialized and poisoned
| raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
| raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
| page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
| ------------[ cut here ]------------
| kernel BUG at include/linux/mm.h:978!
| Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[...]
| CPU: 1 PID: 25236 Comm: dd Not tainted 4.18.0 #7
| Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| pstate: 40000085 (nZcv daIf -PAN -UAO)
| pc : move_freepages_block+0x144/0x248
| lr : move_freepages_block+0x144/0x248
| sp : fffffe0071177680
[...]
| Process dd (pid: 25236, stack limit = 0x0000000094cc07fb)
| Call trace:
|  move_freepages_block+0x144/0x248
|  steal_suitable_fallback+0x100/0x16c
|  get_page_from_freelist+0x440/0xb20
|  __alloc_pages_nodemask+0xe8/0x838
|  new_slab+0xd4/0x418
|  ___slab_alloc.constprop.27+0x380/0x4a8
|  __slab_alloc.isra.21.constprop.26+0x24/0x34
|  kmem_cache_alloc+0xa8/0x180
|  alloc_buffer_head+0x1c/0x90
|  alloc_page_buffers+0x68/0xb0
|  create_empty_buffers+0x20/0x1ec
|  create_page_buffers+0xb0/0xf0
|  __block_write_begin_int+0xc4/0x564
|  __block_write_begin+0x10/0x18
|  block_write_begin+0x48/0xd0
|  blkdev_write_begin+0x28/0x30
|  generic_perform_write+0x98/0x16c
|  __generic_file_write_iter+0x138/0x168
|  blkdev_write_iter+0x80/0xf0
|  __vfs_write+0xe4/0x10c
|  vfs_write+0xb4/0x168
|  ksys_write+0x44/0x88
|  sys_write+0xc/0x14
|  el0_svc_naked+0x30/0x34
| Code: aa1303e0 90001a01 91296421 94008902 (d4210000)
| ---[ end trace 1601ba47f6e883fe ]---

Remove the NUMA dependency.

Link: https://www.spinics.net/lists/arm-kernel/msg671851.html
Cc: <stable@vger.kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Tested-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
nickdesaulniers pushed a commit that referenced this issue Sep 11, 2018
This patch fixes the race between netvsc_probe() and
rndis_set_subchannel(), which can cause a deadlock.

These are the related 3 paths which show the deadlock:

path #1:
    Workqueue: hv_vmbus_con vmbus_onmessage_work [hv_vmbus]
    Call Trace:
     schedule
     schedule_preempt_disabled
     __mutex_lock
     __device_attach
     bus_probe_device
     device_add
     vmbus_device_register
     vmbus_onoffer
     vmbus_onmessage_work
     process_one_work
     worker_thread
     kthread
     ret_from_fork

path #2:
    schedule
     schedule_preempt_disabled
     __mutex_lock
     netvsc_probe
     vmbus_probe
     really_probe
     __driver_attach
     bus_for_each_dev
     driver_attach_async
     async_run_entry_fn
     process_one_work
     worker_thread
     kthread
     ret_from_fork

path #3:
    Workqueue: events netvsc_subchan_work [hv_netvsc]
    Call Trace:
     schedule
     rndis_set_subchannel
     netvsc_subchan_work
     process_one_work
     worker_thread
     kthread
     ret_from_fork

Before path #1 finishes, path #2 can start to run, because just before
the "bus_probe_device(dev);" in device_add() in path #1, there is a line
"object_uevent(&dev->kobj, KOBJ_ADD);", so systemd-udevd can
immediately try to load hv_netvsc and hence path #2 can start to run.

Next, path #2 offloads the subchannal's initialization to a workqueue,
i.e. path #3, so we can end up in a deadlock situation like this:

Path #2 gets the device lock, and is trying to get the rtnl lock;
Path #3 gets the rtnl lock and is waiting for all the subchannel messages
to be processed;
Path #1 is trying to get the device lock, but since #2 is not releasing
the device lock, path #1 has to sleep; since the VMBus messages are
processed one by one, this means the sub-channel messages can't be
procedded, so #3 has to sleep with the rtnl lock held, and finally #2
has to sleep... Now all the 3 paths are sleeping and we hit the deadlock.

With the patch, we can make sure #2 gets both the device lock and the
rtnl lock together, gets its job done, and releases the locks, so #1
and #3 will not be blocked for ever.

Fixes: 8195b13 ("hv_netvsc: fix deadlock on hotplug")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
nickdesaulniers pushed a commit that referenced this issue Sep 11, 2018
It can make sure that trace_hardirqs_off/trace_hardirqs_on can get a correct
return address by frame pointer through __builtin_return_address() in this fix.

Unable to handle kernel paging request at virtual address fffffffc
pgd = 3c42e9cf
[fffffffc] *pgd=02a9c000

Internal error: Oops: 1 [#1]
Modules linked in:
CPU: 0
PC is at trace_hardirqs_off+0x78/0xec
LP is at common_exception_handler+0xda/0xf4
pc : [<b23ea5a4>]    lp : [<b2352eba>]    Tainted: G        W
sp : ada60ab0  fp : efcaff48  gp : 3a020490
r25: efcb0000  r24: 00000000
r23: 00000000  r22: 00000000  r21: 00000000  r20: 000700c1
r19: 000700ca  r18: 3a21b018  r17: 00000001  r16: 00000002
r15: 00000001  r14: 0000002a  r13: 3a00a804  r12: ada60ab0
r11: 3a113af8  r10: 3a01c530  r9 : 3a124404  r8 : 00120f9c
r7 : b2352eba  r6 : 00000000  r5 : 3a126b58  r4 : 00000000
r3 : 3a1726a8  r2 : b2921000  r1 : 00000000  r0 : 00000000
  IRQs off  Segment user
Process init (pid: 1, stack limit = 0x069d7f15)
Stack: (0xada60ab0 to 0xada61000)
Stack: 0aa0:                                     00000000 00000003 3a110000 0011f000
Stack: 0ac0: 00000005 00000000 00000000 00000000 ada60b1 3a01fe68 ada60b0c ada60b08
Stack: 0ae0: 00000000 ada60ab8 ada60b30 3a020550 00000000 00000001 3a11c2f8 3a01c6e8
Stack: 0b00: 3a01cb80 fffffba8 3a113af8 3a21b018 3a122c28 00003ec4 00000165 00000000
Stack: 0b20: 3a126aec 0000006c 00000000 00000001 3a01fe68 00000000 00000003 00000000
Stack: 0b40: 00000001 000003f8 3a020930 3a01c530 00000008 ada60c18 3a020490 3a003120
Stack: 0b60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Stack: 0b80: 00000000 00000000 00000000 00000000 ffff8000 00000000 00000000 00000000
Stack: 0ba0: 00000000 00000001 3a020550 00000000 3a01d020 00000000 fffff000 fffff000
Stack: 0bc0: 00000000 00000000 00000000 00000000 ada60f2c 00000000 00000001 00000000
Stack: 0be0: 00000000 00000000 3a01fe68 fffffab0 00008034 00000008 3a0010cc 3a01fe68
Stack: 0c00: 00000000 00000000 00000001 ada60c88 3a020490 3a0139d4 0009dc6f 00000000
Stack: 0c20: 00000000 00000000 ada60fce fffff000 00000000 0000ebe0 3a020038 3a020550
Stack: 0c40: ada60f20 ada60c90 3a0007f0 3a0002a8 ada60c8c 00000000 00000000 ada60c88
Stack: 0c60: 3a020490 3a004570 00000000 00000000 ada60f20 3a0007f0 3a000000 00000000
Stack: 0c80: 3a020490 3a004850 00000000 3a013f24 3a000000 00000000 3a01ff44 00000000
Stack: 0ca0: 00000000 00000000 00000000 00000000 00000000 00000000 3a01ff84 3a01ff7c
Stack: 0cc0: 3a01ff4c 3a01ff5c 3a01ff64 3a01ff9c 3a01ffa4 3a01ffac 3a01ff6c 3a01ff74
Stack: 0ce0: 00000000 00000000 3a01ff44 00000000 00000000 00000000 00000000 00000000
Stack: 0d00: 3a01ff8c 00000000 00000000 3a01ff94 00000000 00000000 00000000 00000000
Stack: 0d20: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Stack: 0d40: 3a01ffbc 3a01ffb4 00000000 00000000 00000000 00000000 00000000 00000000
Stack: 0d60: 00000000 00000000 00000000 00000000 00000000 3a01ffc4 00000000 00000000
Stack: 0d80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Stack: 0da0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Stack: 0dc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 3a01ff54
Stack: 0de0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Stack: 0e00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Stack: 0e20: 00000000 00000004 00000000 00000000 00000000 00000000 00000000 00000000
Stack: 0e40: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Stack: 0e60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Stack: 0e80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Stack: 0ea0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Stack: 0ec0: 00000000 00000000 00000000 00000000 ffffffff 00000000 00000000 00000000
Stack: 0ee0: 00000000 00000000 00000000 00000000 ada60f20 00000000 00000000 00000000
Stack: 0f00: 00000000 00000000 00000000 00000000 00000000 00000000 3a020490 3a000b24
Stack: 0f20: 00000001 ada60fde 00000000 ada60fe4 ada60feb 00000000 00000021 3a038000
Stack: 0f40: 00000010 0009dc6f 00000006 00001000 00000011 00000064 00000003 00008034
Stack: 0f60: 00000004 00000020 00000005 00000008 00000007 3a000000 00000008 00000000
Stack: 0f80: 00000009 0000ebe0 0000000b 00000000 0000000c 00000000 0000000d 00000000
Stack: 0fa0: 0000000e 00000000 00000017 00000000 00000019 ada60fce 0000001f ada60ff6
Stack: 0fc0: 00000000 00000000 00000000 b5010000 fa839914 23b5dd89 a2aea540 692fc82e
Stack: 0fe0: 0074696e 454d4f48 54002f3d 3d4d5245 756e696c 692f0078 0074696e 00000000
CPU: 0 PID: 1 Comm: init Tainted: G        W         4.18.0-00015-g1888b64a2558-dirty #112
Hardware name: andestech,ae3xx (DT)
Call Trace:
[<b27a8e34>] dump_stack+0x2c/0x38
[<b2354874>] die+0x128/0x18c
[<b2356f4c>] do_page_fault+0x3b8/0x4e0
[<b2352ed4>] ret_from_exception+0x0/0x10
[<b2352eba>] common_exception_handler+0xda/0xf4

Signed-off-by: Greentime Hu <greentime@andestech.com>
nickdesaulniers pushed a commit that referenced this issue Sep 11, 2018
Borislav reported the following splat:

 =============================
 WARNING: suspicious RCU usage
 4.19.0-rc1+ #1 Not tainted
 -----------------------------
 ./include/linux/rcupdate.h:631 rcu_read_lock() used illegally while idle!
 other info that might help us debug this:

 RCU used illegally from idle CPU!
 rcu_scheduler_active = 2, debug_locks = 1
 RCU used illegally from extended quiescent state!
 1 lock held by swapper/0/0:
  #0: 000000004557ee0e (rcu_read_lock){....}, at: perf_event_output_forward+0x0/0x130

 stack backtrace:
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0-rc1+ #1
 Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
 Call Trace:
  dump_stack+0x85/0xcb
  perf_event_output_forward+0xf6/0x130
  __perf_event_overflow+0x52/0xe0
  perf_swevent_overflow+0x91/0xb0
  perf_tp_event+0x11a/0x350
  ? find_held_lock+0x2d/0x90
  ? __lock_acquire+0x2ce/0x1350
  ? __lock_acquire+0x2ce/0x1350
  ? retint_kernel+0x2d/0x2d
  ? find_held_lock+0x2d/0x90
  ? tick_nohz_get_sleep_length+0x83/0xb0
  ? perf_trace_cpu+0xbb/0xd0
  ? perf_trace_buf_alloc+0x5a/0xa0
  perf_trace_cpu+0xbb/0xd0
  cpuidle_enter_state+0x185/0x340
  do_idle+0x1eb/0x260
  cpu_startup_entry+0x5f/0x70
  start_kernel+0x49b/0x4a6
  secondary_startup_64+0xa4/0xb0

This is due to the tracepoints moving to SRCU usage which does not require
RCU to be "watching". But perf uses these tracepoints with RCU and expects
it to be. Hence, we still need to add in the rcu_irq_enter/exit_irqson()
calls for "rcuidle" tracepoints. This is a temporary fix until we have SRCU
working in NMI context, and then perf can be converted to use that instead
of normal RCU.

Link: http://lkml.kernel.org/r/20180904162611.6a120068@gandalf.local.home

Cc: x86-ml <x86@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Reported-by: Borislav Petkov <bp@alien8.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Fixes: e6753f2 ("tracepoint: Make rcuidle tracepoint callers use SRCU")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
nathanchance pushed a commit that referenced this issue Jan 14, 2021
Ido Schimmel says:

====================
mlxsw: core: Thermal control fixes

This series includes two fixes for thermal control in mlxsw.

Patch #1 validates that the alarm temperature threshold read from a
transceiver is above the warning temperature threshold. If not, the
current thresholds are maintained. It was observed that some transceiver
might be unreliable and sometimes report a too low alarm temperature
threshold which would result in thermal shutdown of the system.

Patch #2 increases the temperature threshold above which thermal
shutdown is triggered for the ASIC thermal zone. It is currently too low
and might result in thermal shutdown under perfectly fine operational
conditions.
====================

Link: https://lore.kernel.org/r/20210108145210.1229820-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
nathanchance pushed a commit that referenced this issue Jan 16, 2021
On some specific hardware on early boot we occasionally get:

[ 1193.920255][    T0] BUG: sleeping function called from invalid context at mm/mempool.c:381
[ 1193.936616][    T0] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/69
[ 1193.953233][    T0] no locks held by swapper/69/0.
[ 1193.965871][    T0] irq event stamp: 575062
[ 1193.977724][    T0] hardirqs last  enabled at (575061): [<ffffffffab73f662>] tick_nohz_idle_exit+0xe2/0x3e0
[ 1194.002762][    T0] hardirqs last disabled at (575062): [<ffffffffab74e8af>] flush_smp_call_function_from_idle+0x4f/0x80
[ 1194.029035][    T0] softirqs last  enabled at (575050): [<ffffffffad600fd2>] asm_call_irq_on_stack+0x12/0x20
[ 1194.054227][    T0] softirqs last disabled at (575043): [<ffffffffad600fd2>] asm_call_irq_on_stack+0x12/0x20
[ 1194.079389][    T0] CPU: 69 PID: 0 Comm: swapper/69 Not tainted 5.10.6-cloudflare-kasan-2021.1.4-dev #1
[ 1194.104103][    T0] Hardware name: NULL R162-Z12-CD/MZ12-HD4-CD, BIOS R10 06/04/2020
[ 1194.119591][    T0] Call Trace:
[ 1194.130233][    T0]  dump_stack+0x9a/0xcc
[ 1194.141617][    T0]  ___might_sleep.cold+0x180/0x1b0
[ 1194.153825][    T0]  mempool_alloc+0x16b/0x300
[ 1194.165313][    T0]  ? remove_element+0x160/0x160
[ 1194.176961][    T0]  ? blk_mq_end_request+0x4b/0x490
[ 1194.188778][    T0]  crypt_convert+0x27f6/0x45f0 [dm_crypt]
[ 1194.201024][    T0]  ? rcu_read_lock_sched_held+0x3f/0x70
[ 1194.212906][    T0]  ? module_assert_mutex_or_preempt+0x3e/0x70
[ 1194.225318][    T0]  ? __module_address.part.0+0x1b/0x3a0
[ 1194.237212][    T0]  ? is_kernel_percpu_address+0x5b/0x190
[ 1194.249238][    T0]  ? crypt_iv_tcw_ctr+0x4a0/0x4a0 [dm_crypt]
[ 1194.261593][    T0]  ? is_module_address+0x25/0x40
[ 1194.272905][    T0]  ? static_obj+0x8a/0xc0
[ 1194.283582][    T0]  ? lockdep_init_map_waits+0x26a/0x700
[ 1194.295570][    T0]  ? __raw_spin_lock_init+0x39/0x110
[ 1194.307330][    T0]  kcryptd_crypt_read_convert+0x31c/0x560 [dm_crypt]
[ 1194.320496][    T0]  ? kcryptd_queue_crypt+0x1be/0x380 [dm_crypt]
[ 1194.333203][    T0]  blk_update_request+0x6d7/0x1500
[ 1194.344841][    T0]  ? blk_mq_trigger_softirq+0x190/0x190
[ 1194.356831][    T0]  blk_mq_end_request+0x4b/0x490
[ 1194.367994][    T0]  ? blk_mq_trigger_softirq+0x190/0x190
[ 1194.379693][    T0]  flush_smp_call_function_queue+0x24b/0x560
[ 1194.391847][    T0]  flush_smp_call_function_from_idle+0x59/0x80
[ 1194.403969][    T0]  do_idle+0x287/0x450
[ 1194.413891][    T0]  ? arch_cpu_idle_exit+0x40/0x40
[ 1194.424716][    T0]  ? lockdep_hardirqs_on_prepare+0x286/0x3f0
[ 1194.436399][    T0]  ? _raw_spin_unlock_irqrestore+0x39/0x40
[ 1194.447759][    T0]  cpu_startup_entry+0x19/0x20
[ 1194.458038][    T0]  secondary_startup_64_no_verify+0xb0/0xbb

IO completion can be queued to a different CPU by the block subsystem as a "call
single function/data". The CPU may run these routines from the idle task, but it
does so with interrupts disabled.

It is not a good idea to do decryption with irqs disabled even in an idle task
context, so just defer it to a tasklet (as is done with requests from hard irqs).

Fixes: 39d42fa ("dm crypt: add flags to optionally bypass kcryptd workqueues")
Cc: stable@vger.kernel.org # v5.9+
Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
nathanchance pushed a commit that referenced this issue Jan 16, 2021
general protection fault, probably for non-canonical address
	0xdffffc0000000022: 0000 [#1] KASAN: null-ptr-deref
	in range [0x0000000000000110-0x0000000000000117]
RIP: 0010:io_ring_set_wakeup_flag fs/io_uring.c:6929 [inline]
RIP: 0010:io_disable_sqo_submit+0xdb/0x130 fs/io_uring.c:8891
Call Trace:
 io_uring_create fs/io_uring.c:9711 [inline]
 io_uring_setup+0x12b1/0x38e0 fs/io_uring.c:9739
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

io_disable_sqo_submit() might be called before user rings were
allocated, don't do io_ring_set_wakeup_flag() in those cases.

Reported-by: syzbot+ab412638aeb652ded540@syzkaller.appspotmail.com
Fixes: d9d0521 ("io_uring: stop SQPOLL submit on creator's death")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
nathanchance pushed a commit that referenced this issue Jan 22, 2021
crypto: Fix divide error in do_xor_speed()

From: Kirill Tkhai <ktkhai@virtuozzo.com>

Latest (but not only latest) linux-next panics with divide
error on my QEMU setup.

The patch at the bottom of this message fixes the problem.

xor: measuring software checksum speed
divide error: 0000 [#1] PREEMPT SMP KASAN
PREEMPT SMP KASAN
CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.10.0-next-20201223+ #2177
RIP: 0010:do_xor_speed+0xbb/0xf3
Code: 41 ff cc 75 b5 bf 01 00 00 00 e8 3d 23 8b fe 65 8b 05 f6 49 83 7d 85 c0 75 05 e8
 84 70 81 fe b8 00 00 50 c3 31 d2 48 8d 7b 10 <f7> f5 41 89 c4 e8 58 07 a2 fe 44 89 63 10 48 8d 7b 08
 e8 cb 07 a2
RSP: 0000:ffff888100137dc8 EFLAGS: 00010246
RAX: 00000000c3500000 RBX: ffffffff823f0160 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000808 RDI: ffffffff823f0170
RBP: 0000000000000000 R08: ffffffff8109c50f R09: ffffffff824bb6f7
R10: fffffbfff04976de R11: 0000000000000001 R12: 0000000000000000
R13: ffff888101997000 R14: ffff888101994000 R15: ffffffff823f0178
FS:  0000000000000000(0000) GS:ffff8881f7780000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000000220e000 CR4: 00000000000006a0
Call Trace:
 calibrate_xor_blocks+0x13c/0x1c4
 ? do_xor_speed+0xf3/0xf3
 do_one_initcall+0xc1/0x1b7
 ? start_kernel+0x373/0x373
 ? unpoison_range+0x3a/0x60
 kernel_init_freeable+0x1dd/0x238
 ? rest_init+0xc6/0xc6
 kernel_init+0x8/0x10a
 ret_from_fork+0x1f/0x30
---[ end trace 5bd3c1d0b77772da ]---

Fixes: c055e3e ("crypto: xor - use ktime for template benchmarking")
Cc: <stable@vger.kernel.org>
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
nathanchance pushed a commit that referenced this issue Jan 22, 2021
While testing the error paths of relocation I hit the following lockdep
splat:

  ======================================================
  WARNING: possible circular locking dependency detected
  5.10.0-rc6+ #217 Not tainted
  ------------------------------------------------------
  mount/779 is trying to acquire lock:
  ffffa0e676945418 (&fs_info->balance_mutex){+.+.}-{3:3}, at: btrfs_recover_balance+0x2f0/0x340

  but task is already holding lock:
  ffffa0e60ee31da8 (btrfs-root-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x27/0x100

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #2 (btrfs-root-00){++++}-{3:3}:
	 down_read_nested+0x43/0x130
	 __btrfs_tree_read_lock+0x27/0x100
	 btrfs_read_lock_root_node+0x31/0x40
	 btrfs_search_slot+0x462/0x8f0
	 btrfs_update_root+0x55/0x2b0
	 btrfs_drop_snapshot+0x398/0x750
	 clean_dirty_subvols+0xdf/0x120
	 btrfs_recover_relocation+0x534/0x5a0
	 btrfs_start_pre_rw_mount+0xcb/0x170
	 open_ctree+0x151f/0x1726
	 btrfs_mount_root.cold+0x12/0xea
	 legacy_get_tree+0x30/0x50
	 vfs_get_tree+0x28/0xc0
	 vfs_kern_mount.part.0+0x71/0xb0
	 btrfs_mount+0x10d/0x380
	 legacy_get_tree+0x30/0x50
	 vfs_get_tree+0x28/0xc0
	 path_mount+0x433/0xc10
	 __x64_sys_mount+0xe3/0x120
	 do_syscall_64+0x33/0x40
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #1 (sb_internal#2){.+.+}-{0:0}:
	 start_transaction+0x444/0x700
	 insert_balance_item.isra.0+0x37/0x320
	 btrfs_balance+0x354/0xf40
	 btrfs_ioctl_balance+0x2cf/0x380
	 __x64_sys_ioctl+0x83/0xb0
	 do_syscall_64+0x33/0x40
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #0 (&fs_info->balance_mutex){+.+.}-{3:3}:
	 __lock_acquire+0x1120/0x1e10
	 lock_acquire+0x116/0x370
	 __mutex_lock+0x7e/0x7b0
	 btrfs_recover_balance+0x2f0/0x340
	 open_ctree+0x1095/0x1726
	 btrfs_mount_root.cold+0x12/0xea
	 legacy_get_tree+0x30/0x50
	 vfs_get_tree+0x28/0xc0
	 vfs_kern_mount.part.0+0x71/0xb0
	 btrfs_mount+0x10d/0x380
	 legacy_get_tree+0x30/0x50
	 vfs_get_tree+0x28/0xc0
	 path_mount+0x433/0xc10
	 __x64_sys_mount+0xe3/0x120
	 do_syscall_64+0x33/0x40
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  other info that might help us debug this:

  Chain exists of:
    &fs_info->balance_mutex --> sb_internal#2 --> btrfs-root-00

   Possible unsafe locking scenario:

	 CPU0                    CPU1
	 ----                    ----
    lock(btrfs-root-00);
				 lock(sb_internal#2);
				 lock(btrfs-root-00);
    lock(&fs_info->balance_mutex);

   *** DEADLOCK ***

  2 locks held by mount/779:
   #0: ffffa0e60dc040e0 (&type->s_umount_key#47/1){+.+.}-{3:3}, at: alloc_super+0xb5/0x380
   #1: ffffa0e60ee31da8 (btrfs-root-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x27/0x100

  stack backtrace:
  CPU: 0 PID: 779 Comm: mount Not tainted 5.10.0-rc6+ #217
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
  Call Trace:
   dump_stack+0x8b/0xb0
   check_noncircular+0xcf/0xf0
   ? trace_call_bpf+0x139/0x260
   __lock_acquire+0x1120/0x1e10
   lock_acquire+0x116/0x370
   ? btrfs_recover_balance+0x2f0/0x340
   __mutex_lock+0x7e/0x7b0
   ? btrfs_recover_balance+0x2f0/0x340
   ? btrfs_recover_balance+0x2f0/0x340
   ? rcu_read_lock_sched_held+0x3f/0x80
   ? kmem_cache_alloc_trace+0x2c4/0x2f0
   ? btrfs_get_64+0x5e/0x100
   btrfs_recover_balance+0x2f0/0x340
   open_ctree+0x1095/0x1726
   btrfs_mount_root.cold+0x12/0xea
   ? rcu_read_lock_sched_held+0x3f/0x80
   legacy_get_tree+0x30/0x50
   vfs_get_tree+0x28/0xc0
   vfs_kern_mount.part.0+0x71/0xb0
   btrfs_mount+0x10d/0x380
   ? __kmalloc_track_caller+0x2f2/0x320
   legacy_get_tree+0x30/0x50
   vfs_get_tree+0x28/0xc0
   ? capable+0x3a/0x60
   path_mount+0x433/0xc10
   __x64_sys_mount+0xe3/0x120
   do_syscall_64+0x33/0x40
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

This is straightforward to fix, simply release the path before we setup
the balance_ctl.

CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
nathanchance pushed a commit that referenced this issue Jan 22, 2021
In Linux, if a driver does disable_irq() and later does enable_irq()
on its interrupt, I believe it's expecting these properties:
* If an interrupt was pending when the driver disabled then it will
  still be pending after the driver re-enables.
* If an edge-triggered interrupt comes in while an interrupt is
  disabled it should assert when the interrupt is re-enabled.

If you think that the above sounds a lot like the disable_irq() and
enable_irq() are supposed to be masking/unmasking the interrupt
instead of disabling/enabling it then you've made an astute
observation.  Specifically when talking about interrupts, "mask"
usually means to stop posting interrupts but keep tracking them and
"disable" means to fully shut off interrupt detection.  It's
unfortunate that this is so confusing, but presumably this is all the
way it is for historical reasons.

Perhaps more confusing than the above is that, even though clients of
IRQs themselves don't have a way to request mask/unmask
vs. disable/enable calls, IRQ chips themselves can implement both.
...and yet more confusing is that if an IRQ chip implements
disable/enable then they will be called when a client driver calls
disable_irq() / enable_irq().

It does feel like some of the above could be cleared up.  However,
without any other core interrupt changes it should be clear that when
an IRQ chip gets a request to "disable" an IRQ that it has to treat it
like a mask of that IRQ.

In any case, after that long interlude you can see that the "unmask
and clear" can break things.  Maulik tried to fix it so that we no
longer did "unmask and clear" in commit 71266d9 ("pinctrl: qcom:
Move clearing pending IRQ to .irq_request_resources callback"), but it
only handled the PDC case and it had problems (it caused
sc7180-trogdor devices to fail to suspend).  Let's fix.

>From my understanding the source of the phantom interrupt in the
were these two things:
1. One that could have been introduced in msm_gpio_irq_set_type()
   (only for the non-PDC case).
2. Edges could have been detected when a GPIO was muxed away.

Fixing case #1 is easy.  We can just add a clear in
msm_gpio_irq_set_type().

Fixing case #2 is harder.  Let's use a concrete example.  In
sc7180-trogdor.dtsi we configure the uart3 to have two pinctrl states,
sleep and default, and mux between the two during runtime PM and
system suspend (see geni_se_resources_{on,off}() for more
details). The difference between the sleep and default state is that
the RX pin is muxed to a GPIO during sleep and muxed to the UART
otherwise.

As per Qualcomm, when we mux the pin over to the UART function the PDC
(or the non-PDC interrupt detection logic) is still watching it /
latching edges.  These edges don't cause interrupts because the
current code masks the interrupt unless we're entering suspend.
However, as soon as we enter suspend we unmask the interrupt and it's
counted as a wakeup.

Let's deal with the problem like this:
* When we mux away, we'll mask our interrupt.  This isn't necessary in
  the above case since the client already masked us, but it's a good
  idea in general.
* When we mux back will clear any interrupts and unmask.

Fixes: 4b7618f ("pinctrl: qcom: Add irq_enable callback for msm gpio")
Fixes: 71266d9 ("pinctrl: qcom: Move clearing pending IRQ to .irq_request_resources callback")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Maulik Shah <mkshah@codeaurora.org>
Tested-by: Maulik Shah <mkshah@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20210114191601.v7.4.I7cf3019783720feb57b958c95c2b684940264cd1@changeid
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
nathanchance pushed a commit that referenced this issue Jan 22, 2021
Wolfram reports that his R-Car H2-based Lager board can no longer be
rebooted in v5.11-rc1, as it crashes with an imprecise external abort.
The issue can be reproduced on other boards (e.g. Koelsch with R-Car
M2-W) too, if CONFIG_IP_PNP is disabled, and the Ethernet interface is
down at reboot time:

    Unhandled fault: imprecise external abort (0x1406) at 0x00000000
    pgd = (ptrval)
    [00000000] *pgd=422b6835, *pte=00000000, *ppte=00000000
    Internal error: : 1406 [#1] ARM
    Modules linked in:
    CPU: 0 PID: 1105 Comm: init Tainted: G        W         5.10.0-rc1-00402-ge2f016cf7751 #1048
    Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
    PC is at sh_mdio_ctrl+0x44/0x60
    LR is at sh_mmd_ctrl+0x20/0x24
    ...
    Backtrace:
    [<c0451f30>] (sh_mdio_ctrl) from [<c0451fd4>] (sh_mmd_ctrl+0x20/0x24)
     r7:0000001f r6:00000020 r5:00000002 r4:c22a1dc4
    [<c0451fb4>] (sh_mmd_ctrl) from [<c044fc18>] (mdiobb_cmd+0x38/0xa8)
    [<c044fbe0>] (mdiobb_cmd) from [<c044feb8>] (mdiobb_read+0x58/0xdc)
     r9:c229f844 r8:c0c329dc r7:c221e000 r6:00000001 r5:c22a1dc4 r4:00000001
    [<c044fe60>] (mdiobb_read) from [<c044c854>] (__mdiobus_read+0x74/0xe0)
     r7:0000001f r6:00000001 r5:c221e000 r4:c221e000
    [<c044c7e0>] (__mdiobus_read) from [<c044c9d8>] (mdiobus_read+0x40/0x54)
     r7:0000001f r6:00000001 r5:c221e000 r4:c221e458
    [<c044c998>] (mdiobus_read) from [<c044d678>] (phy_read+0x1c/0x20)
     r7:ffffe000 r6:c221e470 r5:00000200 r4:c229f800
    [<c044d65c>] (phy_read) from [<c044d94c>] (kszphy_config_intr+0x44/0x80)
    [<c044d908>] (kszphy_config_intr) from [<c044694c>] (phy_disable_interrupts+0x44/0x50)
     r5:c229f800 r4:c229f800
    [<c0446908>] (phy_disable_interrupts) from [<c0449370>] (phy_shutdown+0x18/0x1c)
     r5:c229f800 r4:c229f804
    [<c0449358>] (phy_shutdown) from [<c040066c>] (device_shutdown+0x168/0x1f8)
    [<c0400504>] (device_shutdown) from [<c013de44>] (kernel_restart_prepare+0x3c/0x48)
     r9:c22d2000 r8:c0100264 r7:c0b0d034 r6:00000000 r5:4321fedc r4:00000000
    [<c013de08>] (kernel_restart_prepare) from [<c013dee0>] (kernel_restart+0x1c/0x60)
    [<c013dec4>] (kernel_restart) from [<c013e1d8>] (__do_sys_reboot+0x168/0x208)
     r5:4321fedc r4:01234567
    [<c013e070>] (__do_sys_reboot) from [<c013e2e8>] (sys_reboot+0x18/0x1c)
     r7:00000058 r6:00000000 r5:00000000 r4:00000000
    [<c013e2d0>] (sys_reboot) from [<c0100060>] (ret_fast_syscall+0x0/0x54)

As of commit e2f016c ("net: phy: add a shutdown procedure"),
system reboot calls phy_disable_interrupts() during shutdown.  As this
happens unconditionally, the PHY registers may be accessed while the
device is suspended, causing undefined behavior, which may crash the
system.

Fix this by wrapping the PHY bitbang accessors in the sh_eth driver by
wrappers that take care of Runtime PM, to resume the device when needed.

Reported-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
nickdesaulniers pushed a commit that referenced this issue Jan 29, 2022
We encountered a crash in smc_setsockopt() and it is caused by
accessing smc->clcsock after clcsock was released.

 BUG: kernel NULL pointer dereference, address: 0000000000000020
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] PREEMPT SMP PTI
 CPU: 1 PID: 50309 Comm: nginx Kdump: loaded Tainted: G E     5.16.0-rc4+ #53
 RIP: 0010:smc_setsockopt+0x59/0x280 [smc]
 Call Trace:
  <TASK>
  __sys_setsockopt+0xfc/0x190
  __x64_sys_setsockopt+0x20/0x30
  do_syscall_64+0x34/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f16ba83918e
  </TASK>

This patch tries to fix it by holding clcsock_release_lock and
checking whether clcsock has already been released before access.

In case that a crash of the same reason happens in smc_getsockopt()
or smc_switch_to_fallback(), this patch also checkes smc->clcsock
in them too. And the caller of smc_switch_to_fallback() will identify
whether fallback succeeds according to the return value.

Fixes: fd57770 ("net/smc: wait for pending work before clcsock release_sock")
Link: https://lore.kernel.org/lkml/5dd7ffd1-28e2-24cc-9442-1defec27375e@linux.ibm.com/T/
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
nickdesaulniers pushed a commit that referenced this issue Jan 31, 2022
We encountered a crash in smc_setsockopt() and it is caused by
accessing smc->clcsock after clcsock was released.

 BUG: kernel NULL pointer dereference, address: 0000000000000020
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] PREEMPT SMP PTI
 CPU: 1 PID: 50309 Comm: nginx Kdump: loaded Tainted: G E     5.16.0-rc4+ #53
 RIP: 0010:smc_setsockopt+0x59/0x280 [smc]
 Call Trace:
  <TASK>
  __sys_setsockopt+0xfc/0x190
  __x64_sys_setsockopt+0x20/0x30
  do_syscall_64+0x34/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f16ba83918e
  </TASK>

This patch tries to fix it by holding clcsock_release_lock and
checking whether clcsock has already been released before access.

In case that a crash of the same reason happens in smc_getsockopt()
or smc_switch_to_fallback(), this patch also checkes smc->clcsock
in them too. And the caller of smc_switch_to_fallback() will identify
whether fallback succeeds according to the return value.

Fixes: fd57770 ("net/smc: wait for pending work before clcsock release_sock")
Link: https://lore.kernel.org/lkml/5dd7ffd1-28e2-24cc-9442-1defec27375e@linux.ibm.com/T/
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
nickdesaulniers pushed a commit that referenced this issue Jan 31, 2022
Zero vmcs.HOST_IA32_SYSENTER_ESP when initializing *constant* host state
if and only if SYSENTER cannot be used, i.e. the kernel is a 64-bit
kernel and is not emulating 32-bit syscalls.  As the name suggests,
vmx_set_constant_host_state() is intended for state that is *constant*.
When SYSENTER is used, SYSENTER_ESP isn't constant because stacks are
per-CPU, and the VMCS must be updated whenever the vCPU is migrated to a
new CPU.  The logic in vmx_vcpu_load_vmcs() doesn't differentiate between
"never loaded" and "loaded on a different CPU", i.e. setting SYSENTER_ESP
on VMCS load also handles setting correct host state when the VMCS is
first loaded.

Because a VMCS must be loaded before it is initialized during vCPU RESET,
zeroing the field in vmx_set_constant_host_state() obliterates the value
that was written when the VMCS was loaded.  If the vCPU is run before it
is migrated, the subsequent VM-Exit will zero out MSR_IA32_SYSENTER_ESP,
leading to a #DF on the next 32-bit syscall.

  double fault: 0000 [#1] SMP
  CPU: 0 PID: 990 Comm: stable Not tainted 5.16.0+ #97
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  EIP: entry_SYSENTER_32+0x0/0xe7
  Code: <9c> 50 eb 17 0f 20 d8 a9 00 10 00 00 74 0d 25 ff ef ff ff 0f 22 d8
  EAX: 000000a2 EBX: a8d1300c ECX: a8d13014 EDX: 00000000
  ESI: a8f87000 EDI: a8d13014 EBP: a8d12fc0 ESP: 00000000
  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00210093
  CR0: 80050033 CR2: fffffffc CR3: 02c3b000 CR4: 00152e90

Fixes: 6ab8a40 ("KVM: VMX: Avoid to rdmsrl(MSR_IA32_SYSENTER_ESP)")
Cc: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220122015211.1468758-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
nickdesaulniers pushed a commit that referenced this issue Jan 31, 2022
Paweł Marciniak reports the following crash, observed when clearing
the chassis intrusion alarm.

BUG: kernel NULL pointer dereference, address: 0000000000000028
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 3 PID: 4815 Comm: bash Tainted: G S                5.16.2-200.fc35.x86_64 #1
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z97 Extreme4, BIOS P2.60A 05/03/2018
RIP: 0010:clear_caseopen+0x5a/0x120 [nct6775]
Code: 68 70 e8 e9 32 b1 e3 85 c0 0f 85 d2 00 00 00 48 83 7c 24 ...
RSP: 0018:ffffabcb02803dd8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000
RDX: ffff8e8808192880 RSI: 0000000000000000 RDI: ffff8e87c7509a68
RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000000000a
R10: 000000000000000a R11: f000000000000000 R12: 000000000000001f
R13: ffff8e87c7509828 R14: ffff8e87c7509a68 R15: ffff8e88494527a0
FS:  00007f4db9151740(0000) GS:ffff8e8ebfec0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000028 CR3: 0000000166b66001 CR4: 00000000001706e0
Call Trace:
 <TASK>
 kernfs_fop_write_iter+0x11c/0x1b0
 new_sync_write+0x10b/0x180
 vfs_write+0x209/0x2a0
 ksys_write+0x4f/0xc0
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xae

The problem is that the device passed to clear_caseopen() is the hwmon
device, not the platform device, and the platform data is not set in the
hwmon device. Store the pointer to sio_data in struct nct6775_data and
get if from there if needed.

Fixes: 2e7b988 ("hwmon: (nct6775) Use superio_*() function pointers in sio_data.")
Cc: Denis Pauk <pauk.denis@gmail.com>
Cc: Bernhard Seibold <mail@bernhard-seibold.de>
Reported-by: Paweł Marciniak <pmarciniak@lodz.home.pl>
Tested-by: Denis Pauk <pauk.denis@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
nickdesaulniers pushed a commit that referenced this issue Jan 31, 2022
Call trace seen when creating NPIV ports, only 32 out of 64 show online.
stag work was not initialized for vport, hence initialize the stag work.

WARNING: CPU: 8 PID: 645 at kernel/workqueue.c:1635 __queue_delayed_work+0x68/0x80
CPU: 8 PID: 645 Comm: kworker/8:1 Kdump: loaded Tainted: G IOE    --------- --
 4.18.0-348.el8.x86_64 #1
Hardware name: Dell Inc. PowerEdge MX740c/0177V9, BIOS 2.12.2 07/09/2021
Workqueue: events fc_lport_timeout [libfc]
RIP: 0010:__queue_delayed_work+0x68/0x80
Code: 89 b2 88 00 00 00 44 89 82 90 00 00 00 48 01 c8 48 89 42 50 41 81
f8 00 20 00 00 75 1d e9 60 24 07 00 44 89 c7 e9 98 f6 ff ff <0f> 0b eb
c5 0f 0b eb a1 0f 0b eb a7 0f 0b eb ac 44 89 c6 e9 40 23
RSP: 0018:ffffae514bc3be40 EFLAGS: 00010006
RAX: ffff8d25d6143750 RBX: 0000000000000202 RCX: 0000000000000002
RDX: ffff8d2e31383748 RSI: ffff8d25c000d600 RDI: ffff8d2e31383788
RBP: ffff8d2e31380de0 R08: 0000000000002000 R09: ffff8d2e31383750
R10: ffffffffc0c957e0 R11: ffff8d2624800000 R12: ffff8d2e31380a58
R13: ffff8d2d915eb000 R14: ffff8d25c499b5c0 R15: ffff8d2e31380e18
FS:  0000000000000000(0000) GS:ffff8d2d1fb00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055fd0484b8b8 CR3: 00000008ffc10006 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
  queue_delayed_work_on+0x36/0x40
  qedf_elsct_send+0x57/0x60 [qedf]
  fc_lport_enter_flogi+0x90/0xc0 [libfc]
  fc_lport_timeout+0xb7/0x140 [libfc]
  process_one_work+0x1a7/0x360
  ? create_worker+0x1a0/0x1a0
  worker_thread+0x30/0x390
  ? create_worker+0x1a0/0x1a0
  kthread+0x116/0x130
  ? kthread_flush_work_fn+0x10/0x10
  ret_from_fork+0x35/0x40
 ---[ end trace 008f00f722f2c2ff ]--

Initialize stag work for all the vports.

Link: https://lore.kernel.org/r/20220117135311.6256-2-njavali@marvell.com
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
nickdesaulniers pushed a commit that referenced this issue Jan 31, 2022
Hung task call trace was seen during LOGO processing.

[  974.309060] [0000:00:00.0]:[qedf_eh_device_reset:868]: 1:0:2:0: LUN RESET Issued...
[  974.309065] [0000:00:00.0]:[qedf_initiate_tmf:2422]: tm_flags 0x10 sc_cmd 00000000c16b930f op = 0x2a target_id = 0x2 lun=0
[  974.309178] [0000:00:00.0]:[qedf_initiate_tmf:2431]: portid=016900 tm_flags =LUN RESET
[  974.309222] [0000:00:00.0]:[qedf_initiate_tmf:2438]: orig io_req = 00000000ec78df8f xid = 0x180 ref_cnt = 1.
[  974.309625] host1: rport 016900: Received LOGO request while in state Ready
[  974.309627] host1: rport 016900: Delete port
[  974.309642] host1: rport 016900: work event 3
[  974.309644] host1: rport 016900: lld callback ev 3
[  974.313243] [0000:61:00.2]:[qedf_execute_tmf:2383]:1: fcport is uploading, not executing flush.
[  974.313295] [0000:61:00.2]:[qedf_execute_tmf:2400]:1: task mgmt command success...
[  984.031088] INFO: task jbd2/dm-15-8:7645 blocked for more than 120 seconds.
[  984.031136]       Not tainted 4.18.0-305.el8.x86_64 #1

[  984.031166] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  984.031209] jbd2/dm-15-8    D    0  7645      2 0x80004080
[  984.031212] Call Trace:
[  984.031222]  __schedule+0x2c4/0x700
[  984.031230]  ? unfreeze_partials.isra.83+0x16e/0x1a0
[  984.031233]  ? bit_wait_timeout+0x90/0x90
[  984.031235]  schedule+0x38/0xa0
[  984.031238]  io_schedule+0x12/0x40
[  984.031240]  bit_wait_io+0xd/0x50
[  984.031243]  __wait_on_bit+0x6c/0x80
[  984.031248]  ? free_buffer_head+0x21/0x50
[  984.031251]  out_of_line_wait_on_bit+0x91/0xb0
[  984.031257]  ? init_wait_var_entry+0x50/0x50
[  984.031268]  jbd2_journal_commit_transaction+0x112e/0x19f0 [jbd2]
[  984.031280]  kjournald2+0xbd/0x270 [jbd2]
[  984.031284]  ? finish_wait+0x80/0x80
[  984.031291]  ? commit_timeout+0x10/0x10 [jbd2]
[  984.031294]  kthread+0x116/0x130
[  984.031300]  ? kthread_flush_work_fn+0x10/0x10
[  984.031305]  ret_from_fork+0x1f/0x40

There was a ref count issue when LOGO is received during TMF. This leads to
one of the I/Os hanging with the driver. Fix the ref count.

Link: https://lore.kernel.org/r/20220117135311.6256-3-njavali@marvell.com
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
nickdesaulniers pushed a commit that referenced this issue Jan 31, 2022
…e_put()

The bnx2fc_destroy() functions are removing the interface before calling
destroy_work. This results multiple WARNings from sysfs_remove_group() as
the controller rport device attributes are removed too early.

Replace the fcoe_port's destroy_work queue. It's not needed.

The problem is easily reproducible with the following steps.

Example:

  $ dmesg -w &
  $ systemctl enable --now fcoe
  $ fipvlan -s -c ens2f1
  $ fcoeadm -d ens2f1.802
  [  583.464488] host2: libfc: Link down on port (7500a1)
  [  583.472651] bnx2fc: 7500a1 - rport not created Yet!!
  [  583.490468] ------------[ cut here ]------------
  [  583.538725] sysfs group 'power' not found for kobject 'rport-2:0-0'
  [  583.568814] WARNING: CPU: 3 PID: 192 at fs/sysfs/group.c:279 sysfs_remove_group+0x6f/0x80
  [  583.607130] Modules linked in: dm_service_time 8021q garp mrp stp llc bnx2fc cnic uio rpcsec_gss_krb5 auth_rpcgss nfsv4 ...
  [  583.942994] CPU: 3 PID: 192 Comm: kworker/3:2 Kdump: loaded Not tainted 5.14.0-39.el9.x86_64 #1
  [  583.984105] Hardware name: HP ProLiant DL120 G7, BIOS J01 07/01/2013
  [  584.016535] Workqueue: fc_wq_2 fc_rport_final_delete [scsi_transport_fc]
  [  584.050691] RIP: 0010:sysfs_remove_group+0x6f/0x80
  [  584.074725] Code: ff 5b 48 89 ef 5d 41 5c e9 ee c0 ff ff 48 89 ef e8 f6 b8 ff ff eb d1 49 8b 14 24 48 8b 33 48 c7 c7 ...
  [  584.162586] RSP: 0018:ffffb567c15afdc0 EFLAGS: 00010282
  [  584.188225] RAX: 0000000000000000 RBX: ffffffff8eec4220 RCX: 0000000000000000
  [  584.221053] RDX: ffff8c1586ce84c0 RSI: ffff8c1586cd7cc0 RDI: ffff8c1586cd7cc0
  [  584.255089] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffb567c15afc00
  [  584.287954] R10: ffffb567c15afbf8 R11: ffffffff8fbe7f28 R12: ffff8c1486326400
  [  584.322356] R13: ffff8c1486326480 R14: ffff8c1483a4a000 R15: 0000000000000004
  [  584.355379] FS:  0000000000000000(0000) GS:ffff8c1586cc0000(0000) knlGS:0000000000000000
  [  584.394419] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [  584.421123] CR2: 00007fe95a6f7840 CR3: 0000000107674002 CR4: 00000000000606e0
  [  584.454888] Call Trace:
  [  584.466108]  device_del+0xb2/0x3e0
  [  584.481701]  device_unregister+0x13/0x60
  [  584.501306]  bsg_unregister_queue+0x5b/0x80
  [  584.522029]  bsg_remove_queue+0x1c/0x40
  [  584.541884]  fc_rport_final_delete+0xf3/0x1d0 [scsi_transport_fc]
  [  584.573823]  process_one_work+0x1e3/0x3b0
  [  584.592396]  worker_thread+0x50/0x3b0
  [  584.609256]  ? rescuer_thread+0x370/0x370
  [  584.628877]  kthread+0x149/0x170
  [  584.643673]  ? set_kthread_struct+0x40/0x40
  [  584.662909]  ret_from_fork+0x22/0x30
  [  584.680002] ---[ end trace 53575ecefa942ece ]---

Link: https://lore.kernel.org/r/20220115040044.1013475-1-jmeneghi@redhat.com
Fixes: 0cbf32e ("[SCSI] bnx2fc: Avoid calling bnx2fc_if_destroy with unnecessary locks")
Tested-by: Guangwu Zhang <guazhang@redhat.com>
Co-developed-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: John Meneghini <jmeneghi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
nickdesaulniers pushed a commit that referenced this issue Jan 31, 2022
Crashed at i.mx8qm platform when suspend if enable remote wakeup

Internal error: synchronous external abort: 96000210 [#1] PREEMPT SMP
Modules linked in:
CPU: 2 PID: 244 Comm: kworker/u12:6 Not tainted 5.15.5-dirty #12
Hardware name: Freescale i.MX8QM MEK (DT)
Workqueue: events_unbound async_run_entry_fn
pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : xhci_disable_hub_port_wake.isra.62+0x60/0xf8
lr : xhci_disable_hub_port_wake.isra.62+0x34/0xf8
sp : ffff80001394bbf0
x29: ffff80001394bbf0 x28: 0000000000000000 x27: ffff00081193b578
x26: ffff00081193b570 x25: 0000000000000000 x24: 0000000000000000
x23: ffff00081193a29c x22: 0000000000020001 x21: 0000000000000001
x20: 0000000000000000 x19: ffff800014e90490 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
x14: 0000000000000000 x13: 0000000000000002 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000960 x9 : ffff80001394baa0
x8 : ffff0008145d1780 x7 : ffff0008f95b8e80 x6 : 000000001853b453
x5 : 0000000000000496 x4 : 0000000000000000 x3 : ffff00081193a29c
x2 : 0000000000000001 x1 : 0000000000000000 x0 : ffff000814591620
Call trace:
 xhci_disable_hub_port_wake.isra.62+0x60/0xf8
 xhci_suspend+0x58/0x510
 xhci_plat_suspend+0x50/0x78
 platform_pm_suspend+0x2c/0x78
 dpm_run_callback.isra.25+0x50/0xe8
 __device_suspend+0x108/0x3c0

The basic flow:
	1. run time suspend call xhci_suspend, xhci parent devices gate the clock.
        2. echo mem >/sys/power/state, system _device_suspend call xhci_suspend
        3. xhci_suspend call xhci_disable_hub_port_wake, which access register,
	   but clock already gated by run time suspend.

This problem was hidden by power domain driver, which call run time resume before it.

But the below commit remove it and make this issue happen.
	commit c1df456 ("PM: domains: Don't runtime resume devices at genpd_prepare()")

This patch call run time resume before suspend to make sure clock is on
before access register.

Reviewed-by: Peter Chen <peter.chen@kernel.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Testeb-by: Abel Vesa <abel.vesa@nxp.com>
Link: https://lore.kernel.org/r/20220110172738.31686-1-Frank.Li@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
nickdesaulniers pushed a commit that referenced this issue Jan 31, 2022
Fix the following false positive warning:
 =============================
 WARNING: suspicious RCU usage
 5.16.0-rc4+ #57 Not tainted
 -----------------------------
 arch/x86/kvm/../../../virt/kvm/eventfd.c:484 RCU-list traversed in non-reader section!!

 other info that might help us debug this:

 rcu_scheduler_active = 2, debug_locks = 1
 3 locks held by fc_vcpu 0/330:
  #0: ffff8884835fc0b0 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x88/0x6f0 [kvm]
  #1: ffffc90004c0bb68 (&kvm->srcu){....}-{0:0}, at: vcpu_enter_guest+0x600/0x1860 [kvm]
  #2: ffffc90004c0c1d0 (&kvm->irq_srcu){....}-{0:0}, at: kvm_notify_acked_irq+0x36/0x180 [kvm]

 stack backtrace:
 CPU: 26 PID: 330 Comm: fc_vcpu 0 Not tainted 5.16.0-rc4+
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
 Call Trace:
  <TASK>
  dump_stack_lvl+0x44/0x57
  kvm_notify_acked_gsi+0x6b/0x70 [kvm]
  kvm_notify_acked_irq+0x8d/0x180 [kvm]
  kvm_ioapic_update_eoi+0x92/0x240 [kvm]
  kvm_apic_set_eoi_accelerated+0x2a/0xe0 [kvm]
  handle_apic_eoi_induced+0x3d/0x60 [kvm_intel]
  vmx_handle_exit+0x19c/0x6a0 [kvm_intel]
  vcpu_enter_guest+0x66e/0x1860 [kvm]
  kvm_arch_vcpu_ioctl_run+0x438/0x7f0 [kvm]
  kvm_vcpu_ioctl+0x38a/0x6f0 [kvm]
  __x64_sys_ioctl+0x89/0xc0
  do_syscall_64+0x3a/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Since kvm_unregister_irq_ack_notifier() does synchronize_srcu(&kvm->irq_srcu),
kvm->irq_ack_notifier_list is protected by kvm->irq_srcu. In fact,
kvm->irq_srcu SRCU read lock is held in kvm_notify_acked_irq(), making it
a false positive warning. So use hlist_for_each_entry_srcu() instead of
hlist_for_each_entry_rcu().

Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Hou Wenlong <houwenlong93@linux.alibaba.com>
Message-Id: <f98bac4f5052bad2c26df9ad50f7019e40434512.1643265976.git.houwenlong.hwl@antgroup.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
nickdesaulniers pushed a commit that referenced this issue Jan 31, 2022
…/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 5.17, take #1

- Correctly update the shadow register on exception injection when
  running in nVHE mode

- Correctly use the mm_ops indirection when performing cache invalidation
  from the page-table walker

- Restrict the vgic-v3 workaround for SEIS to the two known broken
  implementations
nickdesaulniers pushed a commit that referenced this issue Jan 31, 2022
A ceph user has reported that ceph is crashing with kernel NULL pointer
dereference. Following is the backtrace.

/proc/version: Linux version 5.16.2-arch1-1 (linux@archlinux) (gcc (GCC)
11.1.0, GNU ld (GNU Binutils) 2.36.1) #1 SMP PREEMPT Thu, 20 Jan 2022
16:18:29 +0000
distro / arch: Arch Linux / x86_64
SELinux is not enabled
ceph cluster version: 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503)

relevant dmesg output:
[   30.947129] BUG: kernel NULL pointer dereference, address:
0000000000000000
[   30.947206] #PF: supervisor read access in kernel mode
[   30.947258] #PF: error_code(0x0000) - not-present page
[   30.947310] PGD 0 P4D 0
[   30.947342] Oops: 0000 [#1] PREEMPT SMP PTI
[   30.947388] CPU: 5 PID: 778 Comm: touch Not tainted 5.16.2-arch1-1 #1
86fbf2c313cc37a553d65deb81d98e9dcc2a3659
[   30.947486] Hardware name: Gigabyte Technology Co., Ltd. B365M
DS3H/B365M DS3H, BIOS F5 08/13/2019
[   30.947569] RIP: 0010:strlen+0x0/0x20
[   30.947616] Code: b6 07 38 d0 74 16 48 83 c7 01 84 c0 74 05 48 39 f7 75
ec 31 c0 31 d2 89 d6 89 d7 c3 48 89 f8 31 d2 89 d6 89 d7 c3 0
f 1f 40 00 <80> 3f 00 74 12 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 31
ff
[   30.947782] RSP: 0018:ffffa4ed80ffbbb8 EFLAGS: 00010246
[   30.947836] RAX: 0000000000000000 RBX: ffffa4ed80ffbc60 RCX:
0000000000000000
[   30.947904] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
0000000000000000
[   30.947971] RBP: ffff94b0d15c0ae0 R08: 0000000000000000 R09:
0000000000000000
[   30.948040] R10: 0000000000000000 R11: 0000000000000000 R12:
0000000000000000
[   30.948106] R13: 0000000000000001 R14: ffffa4ed80ffbc60 R15:
0000000000000000
[   30.948174] FS:  00007fc7520f0740(0000) GS:ffff94b7ced40000(0000)
knlGS:0000000000000000
[   30.948252] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   30.948308] CR2: 0000000000000000 CR3: 0000000104a40001 CR4:
00000000003706e0
[   30.948376] Call Trace:
[   30.948404]  <TASK>
[   30.948431]  ceph_security_init_secctx+0x7b/0x240 [ceph
49f9c4b9bf5be8760f19f1747e26da33920bce4b]
[   30.948582]  ceph_atomic_open+0x51e/0x8a0 [ceph
49f9c4b9bf5be8760f19f1747e26da33920bce4b]
[   30.948708]  ? get_cached_acl+0x4d/0xa0
[   30.948759]  path_openat+0x60d/0x1030
[   30.948809]  do_filp_open+0xa5/0x150
[   30.948859]  do_sys_openat2+0xc4/0x190
[   30.948904]  __x64_sys_openat+0x53/0xa0
[   30.948948]  do_syscall_64+0x5c/0x90
[   30.948989]  ? exc_page_fault+0x72/0x180
[   30.949034]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   30.949091] RIP: 0033:0x7fc7521e25bb
[   30.950849] Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00
00 00 85 c0 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 0
0 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 91 00 00 00 48 8b 54 24 28 64 48 2b 14
25

Core of the problem is that ceph checks for return code from
security_dentry_init_security() and if return code is 0, it assumes
everything is fine and continues to call strlen(name), which crashes.

Typically SELinux LSM returns 0 and sets name to "security.selinux" and
it is not a problem. Or if selinux is not compiled in or disabled, it
returns -EOPNOTSUP and ceph deals with it.

But somehow in this configuration, 0 is being returned and "name" is
not being initialized and that's creating the problem.

Our suspicion is that BPF LSM is registering a hook for
dentry_init_security() and returns hook default of 0.

LSM_HOOK(int, 0, dentry_init_security, struct dentry *dentry,...)

I have not been able to reproduce it just by doing CONFIG_BPF_LSM=y.
Stephen has tested the patch though and confirms it solves the problem
for him.

dentry_init_security() is written in such a way that it expects only one
LSM to register the hook. Atleast that's the expectation with current code.

If another LSM returns a hook and returns default, it will simply return
0 as of now and that will break ceph.

Hence, suggestion is that change semantics of this hook a bit. If there
are no LSMs or no LSM is taking ownership and initializing security context,
then return -EOPNOTSUP. Also allow at max one LSM to initialize security
context. This hook can't deal with multiple LSMs trying to init security
context. This patch implements this new behavior.

Reported-by: Stephen Muth <smuth4@gmail.com>
Tested-by: Stephen Muth <smuth4@gmail.com>
Suggested-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: <stable@vger.kernel.org> # 5.16.0
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: James Morris <jmorris@namei.org>
nickdesaulniers pushed a commit that referenced this issue Jan 31, 2022
With CONFIG_FORTIFY_SOURCE enabled, string functions will also perform
dynamic checks using __builtin_object_size(ptr), which when failed will
panic the kernel.

Because the KASAN test deliberately performs out-of-bounds operations,
the kernel panics with FORTIFY_SOURCE, for example:

 | kernel BUG at lib/string_helpers.c:910!
 | invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 | CPU: 1 PID: 137 Comm: kunit_try_catch Tainted: G    B             5.16.0-rc3+ #3
 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
 | RIP: 0010:fortify_panic+0x19/0x1b
 | ...
 | Call Trace:
 |  kmalloc_oob_in_memset.cold+0x16/0x16
 |  ...

Fix it by also hiding `ptr` from the optimizer, which will ensure that
__builtin_object_size() does not return a valid size, preventing
fortified string functions from panicking.

Link: https://lkml.kernel.org/r/20220124160744.1244685-1-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reported-by: Nico Pache <npache@redhat.com>
Reviewed-by: Nico Pache <npache@redhat.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
infrastructure Not a bug per se, but an improvement to our workflow
Projects
None yet
Development

No branches or pull requests

2 participants