Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Small comment correction #135

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

foodprocessor
Copy link

The time field in blk_io_trace is in nanoseconds. Correct the comment from microseconds to nanoseconds.
The field was once in microseconds, but was changed to nanoseconds in commit 6c051ce on March 25, 2009. Update the comment to match a 5-year-old reality.

The time field in blk_io_trace is in nanoseconds. Correct the comment from microseconds to nanoseconds.
The field was once in microseconds, but was changed to nanoseconds in commit 6c051ce on March 25, 2009. Update the comment to match a 5-year-old reality.
@simar7
Copy link

simar7 commented Nov 14, 2014

Linus doesn't accept PRs on github.

tklauser pushed a commit to tklauser/linux that referenced this pull request Nov 17, 2014
Calling irq_find_mapping from outside a irq_{enter,exit} section is
unsafe and produces ugly messages if CONFIG_PROVE_RCU is enabled:
If coming from the idle state, the rcu_read_lock call in irq_find_mapping
will generate an unpleasant warning:

<quote>
===============================
[ INFO: suspicious RCU usage. ]
3.16.0-rc1+ torvalds#135 Not tainted
-------------------------------
include/linux/rcupdate.h:871 rcu_read_lock() used illegally while idle!

other info that might help us debug this:

RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
1 lock held by swapper/0/0:
 #0:  (rcu_read_lock){......}, at: [<ffffffc00010206c>]
irq_find_mapping+0x4c/0x198
</quote>

As this issue is fairly widespread and involves at least three
different architectures, a possible solution is to add a new
handle_domain_irq entry point into the generic IRQ code that
the interrupt controller code can call.

This new function takes an irq_domain, and calls into irq_find_domain
inside the irq_{enter,exit} block. An additional "lookup" parameter is
used to allow non-domain architecture code to be replaced by this as well.

Interrupt controllers can then be updated to use the new mechanism.

This code is sitting behind a new CONFIG_HANDLE_DOMAIN_IRQ, as not all
architectures implement set_irq_regs (yes, mn10300, I'm looking at you...).

Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/1409047421-27649-2-git-send-email-marc.zyngier@arm.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
@septyaman
Copy link

:(

rsalveti pushed a commit to rsalveti/linux that referenced this pull request Nov 3, 2015
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Dec 10, 2015
Directory inodes should start off with i_nlink == 2 (for "." entry).
Of course the same rule should be applied to automount dentries for
child and parent inodes as well.

Also now automount dentry does fsnotify_mkdir.

Without this patch kernel complains when sees i_nlink == 0:

  [   86.288070] WARNING: CPU: 1 PID: 3616 at fs/inode.c:273 drop_nlink+0x3e/0x50()
  [   86.288461] Modules linked in: debugfs_example2(O-)
  [   86.288745] CPU: 1 PID: 3616 Comm: rmmod Tainted: G           O    4.4.0-rc3-next-20151207+ torvalds#135
  [   86.289197] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150617_082717-anatol 04/01/2014
  [   86.289696]  ffffffff81be05c9 ffff8800b9e6fda0 ffffffff81352e2c 0000000000000000
  [   86.290110]  ffff8800b9e6fdd8 ffffffff81065142 ffff8801399175e8 ffff8800bb78b240
  [   86.290507]  ffff8801399175e8 ffff8800b73d7898 ffff8800b73d7840 ffff8800b9e6fde8
  [   86.290933] Call Trace:
  [   86.291080]  [<ffffffff81352e2c>] dump_stack+0x4e/0x82
  [   86.291340]  [<ffffffff81065142>] warn_slowpath_common+0x82/0xc0
  [   86.291640]  [<ffffffff8106523a>] warn_slowpath_null+0x1a/0x20
  [   86.291932]  [<ffffffff811ae62e>] drop_nlink+0x3e/0x50
  [   86.292208]  [<ffffffff811ba35b>] simple_unlink+0x4b/0x60
  [   86.292481]  [<ffffffff811ba3a7>] simple_rmdir+0x37/0x50
  [   86.292748]  [<ffffffff812d9808>] __debugfs_remove.part.16+0xa8/0xd0
  [   86.293082]  [<ffffffff812d9a0b>] debugfs_remove_recursive+0xdb/0x1c0
  [   86.293406]  [<ffffffffa00004dd>] cleanup_module+0x2d/0x3b [debugfs_example2]
  [   86.293762]  [<ffffffff810d959b>] SyS_delete_module+0x16b/0x220
  [   86.294077]  [<ffffffff818ef857>] entry_SYSCALL_64_fastpath+0x12/0x6a
  [   86.294405] ---[ end trace c9fc53353fe14a36 ]---
  [   86.294639] ------------[ cut here ]------------
  [   86.294871] WARNING: CPU: 1 PID: 3616 at fs/inode.c:273 drop_nlink+0x3e/0x50()
  [   86.295249] Modules linked in: debugfs_example2(O-)
  [   86.295510] CPU: 1 PID: 3616 Comm: rmmod Tainted: G        W  O    4.4.0-rc3-next-20151207+ torvalds#135
  [   86.295943] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150617_082717-anatol 04/01/2014
  [   86.296450]  ffffffff81be05c9 ffff8800b9e6fda0 ffffffff81352e2c 0000000000000000
  [   86.296846]  ffff8800b9e6fdd8 ffffffff81065142 ffff880139917810 ffff8800b73d7840
  [   86.297253]  ffff880139917810 ffff8800b73d7898 ffff8800b73d7840 ffff8800b9e6fde8
  [   86.297650] Call Trace:
  [   86.297777]  [<ffffffff81352e2c>] dump_stack+0x4e/0x82
  [   86.298043]  [<ffffffff81065142>] warn_slowpath_common+0x82/0xc0
  [   86.298344]  [<ffffffff8106523a>] warn_slowpath_null+0x1a/0x20
  [   86.298636]  [<ffffffff811ae62e>] drop_nlink+0x3e/0x50
  [   86.298893]  [<ffffffff811ba35b>] simple_unlink+0x4b/0x60
  [   86.299174]  [<ffffffff811ba3a7>] simple_rmdir+0x37/0x50
  [   86.299442]  [<ffffffff812d9808>] __debugfs_remove.part.16+0xa8/0xd0
  [   86.299760]  [<ffffffff812d9aaf>] debugfs_remove_recursive+0x17f/0x1c0
  [   86.300095]  [<ffffffffa00004dd>] cleanup_module+0x2d/0x3b [debugfs_example2]
  [   86.300453]  [<ffffffff810d959b>] SyS_delete_module+0x16b/0x220
  [   86.300750]  [<ffffffff818ef857>] entry_SYSCALL_64_fastpath+0x12/0x6a
  [   86.301198] ---[ end trace c9fc53353fe14a37 ]---

Signed-off-by: Roman Pen <r.peniaev@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-kernel@vger.kernel.org
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Feb 9, 2016
Directory inodes should start off with i_nlink == 2 (one extra ref
for "." entry).  debugfs_create_automount() increases neither the
i_nlink reference for current inode nor for parent inode.

On attempt to remove the automount dentry, kernel complains:

  [   86.288070] WARNING: CPU: 1 PID: 3616 at fs/inode.c:273 drop_nlink+0x3e/0x50()
  [   86.288461] Modules linked in: debugfs_example2(O-)
  [   86.288745] CPU: 1 PID: 3616 Comm: rmmod Tainted: G           O    4.4.0-rc3-next-20151207+ torvalds#135
  [   86.289197] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150617_082717-anatol 04/01/2014
  [   86.289696]  ffffffff81be05c9 ffff8800b9e6fda0 ffffffff81352e2c 0000000000000000
  [   86.290110]  ffff8800b9e6fdd8 ffffffff81065142 ffff8801399175e8 ffff8800bb78b240
  [   86.290507]  ffff8801399175e8 ffff8800b73d7898 ffff8800b73d7840 ffff8800b9e6fde8
  [   86.290933] Call Trace:
  [   86.291080]  [<ffffffff81352e2c>] dump_stack+0x4e/0x82
  [   86.291340]  [<ffffffff81065142>] warn_slowpath_common+0x82/0xc0
  [   86.291640]  [<ffffffff8106523a>] warn_slowpath_null+0x1a/0x20
  [   86.291932]  [<ffffffff811ae62e>] drop_nlink+0x3e/0x50
  [   86.292208]  [<ffffffff811ba35b>] simple_unlink+0x4b/0x60
  [   86.292481]  [<ffffffff811ba3a7>] simple_rmdir+0x37/0x50
  [   86.292748]  [<ffffffff812d9808>] __debugfs_remove.part.16+0xa8/0xd0
  [   86.293082]  [<ffffffff812d9a0b>] debugfs_remove_recursive+0xdb/0x1c0
  [   86.293406]  [<ffffffffa00004dd>] cleanup_module+0x2d/0x3b [debugfs_example2]
  [   86.293762]  [<ffffffff810d959b>] SyS_delete_module+0x16b/0x220
  [   86.294077]  [<ffffffff818ef857>] entry_SYSCALL_64_fastpath+0x12/0x6a
  [   86.294405] ---[ end trace c9fc53353fe14a36 ]---
  [   86.294639] ------------[ cut here ]------------

To reproduce the issue it is enough to invoke these lines:

     autom = debugfs_create_automount("automount", NULL, vfsmount_cb, data);
     BUG_ON(IS_ERR_OR_NULL(autom));
     debugfs_remove(autom);

The issue is fixed by increasing inode i_nlink references for current
and parent inodes.

Signed-off-by: Roman Pen <r.peniaev@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-kernel@vger.kernel.org
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Mar 6, 2016
Directory inodes should start off with i_nlink == 2 (one extra ref
for "." entry).  debugfs_create_automount() increases neither the
i_nlink reference for current inode nor for parent inode.

On attempt to remove the automount dentry, kernel complains:

  [   86.288070] WARNING: CPU: 1 PID: 3616 at fs/inode.c:273 drop_nlink+0x3e/0x50()
  [   86.288461] Modules linked in: debugfs_example2(O-)
  [   86.288745] CPU: 1 PID: 3616 Comm: rmmod Tainted: G           O    4.4.0-rc3-next-20151207+ torvalds#135
  [   86.289197] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150617_082717-anatol 04/01/2014
  [   86.289696]  ffffffff81be05c9 ffff8800b9e6fda0 ffffffff81352e2c 0000000000000000
  [   86.290110]  ffff8800b9e6fdd8 ffffffff81065142 ffff8801399175e8 ffff8800bb78b240
  [   86.290507]  ffff8801399175e8 ffff8800b73d7898 ffff8800b73d7840 ffff8800b9e6fde8
  [   86.290933] Call Trace:
  [   86.291080]  [<ffffffff81352e2c>] dump_stack+0x4e/0x82
  [   86.291340]  [<ffffffff81065142>] warn_slowpath_common+0x82/0xc0
  [   86.291640]  [<ffffffff8106523a>] warn_slowpath_null+0x1a/0x20
  [   86.291932]  [<ffffffff811ae62e>] drop_nlink+0x3e/0x50
  [   86.292208]  [<ffffffff811ba35b>] simple_unlink+0x4b/0x60
  [   86.292481]  [<ffffffff811ba3a7>] simple_rmdir+0x37/0x50
  [   86.292748]  [<ffffffff812d9808>] __debugfs_remove.part.16+0xa8/0xd0
  [   86.293082]  [<ffffffff812d9a0b>] debugfs_remove_recursive+0xdb/0x1c0
  [   86.293406]  [<ffffffffa00004dd>] cleanup_module+0x2d/0x3b [debugfs_example2]
  [   86.293762]  [<ffffffff810d959b>] SyS_delete_module+0x16b/0x220
  [   86.294077]  [<ffffffff818ef857>] entry_SYSCALL_64_fastpath+0x12/0x6a
  [   86.294405] ---[ end trace c9fc53353fe14a36 ]---
  [   86.294639] ------------[ cut here ]------------

To reproduce the issue it is enough to invoke these lines:

     autom = debugfs_create_automount("automount", NULL, vfsmount_cb, data);
     BUG_ON(IS_ERR_OR_NULL(autom));
     debugfs_remove(autom);

The issue is fixed by increasing inode i_nlink references for current
and parent inodes.

Signed-off-by: Roman Pen <r.peniaev@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Mar 28, 2016
Directory inodes should start off with i_nlink == 2 (one extra ref
for "." entry).  debugfs_create_automount() increases neither the
i_nlink reference for current inode nor for parent inode.

On attempt to remove the automount dentry, kernel complains:

  [   86.288070] WARNING: CPU: 1 PID: 3616 at fs/inode.c:273 drop_nlink+0x3e/0x50()
  [   86.288461] Modules linked in: debugfs_example2(O-)
  [   86.288745] CPU: 1 PID: 3616 Comm: rmmod Tainted: G           O    4.4.0-rc3-next-20151207+ torvalds#135
  [   86.289197] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150617_082717-anatol 04/01/2014
  [   86.289696]  ffffffff81be05c9 ffff8800b9e6fda0 ffffffff81352e2c 0000000000000000
  [   86.290110]  ffff8800b9e6fdd8 ffffffff81065142 ffff8801399175e8 ffff8800bb78b240
  [   86.290507]  ffff8801399175e8 ffff8800b73d7898 ffff8800b73d7840 ffff8800b9e6fde8
  [   86.290933] Call Trace:
  [   86.291080]  [<ffffffff81352e2c>] dump_stack+0x4e/0x82
  [   86.291340]  [<ffffffff81065142>] warn_slowpath_common+0x82/0xc0
  [   86.291640]  [<ffffffff8106523a>] warn_slowpath_null+0x1a/0x20
  [   86.291932]  [<ffffffff811ae62e>] drop_nlink+0x3e/0x50
  [   86.292208]  [<ffffffff811ba35b>] simple_unlink+0x4b/0x60
  [   86.292481]  [<ffffffff811ba3a7>] simple_rmdir+0x37/0x50
  [   86.292748]  [<ffffffff812d9808>] __debugfs_remove.part.16+0xa8/0xd0
  [   86.293082]  [<ffffffff812d9a0b>] debugfs_remove_recursive+0xdb/0x1c0
  [   86.293406]  [<ffffffffa00004dd>] cleanup_module+0x2d/0x3b [debugfs_example2]
  [   86.293762]  [<ffffffff810d959b>] SyS_delete_module+0x16b/0x220
  [   86.294077]  [<ffffffff818ef857>] entry_SYSCALL_64_fastpath+0x12/0x6a
  [   86.294405] ---[ end trace c9fc53353fe14a36 ]---
  [   86.294639] ------------[ cut here ]------------

To reproduce the issue it is enough to invoke these lines:

     autom = debugfs_create_automount("automount", NULL, vfsmount_cb, data);
     BUG_ON(IS_ERR_OR_NULL(autom));
     debugfs_remove(autom);

The issue is fixed by increasing inode i_nlink references for current
and parent inodes.

Signed-off-by: Roman Pen <r.peniaev@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Mar 30, 2016
Directory inodes should start off with i_nlink == 2 (one extra ref
for "." entry).  debugfs_create_automount() increases neither the
i_nlink reference for current inode nor for parent inode.

On attempt to remove the automount dentry, kernel complains:

  [   86.288070] WARNING: CPU: 1 PID: 3616 at fs/inode.c:273 drop_nlink+0x3e/0x50()
  [   86.288461] Modules linked in: debugfs_example2(O-)
  [   86.288745] CPU: 1 PID: 3616 Comm: rmmod Tainted: G           O    4.4.0-rc3-next-20151207+ torvalds#135
  [   86.289197] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150617_082717-anatol 04/01/2014
  [   86.289696]  ffffffff81be05c9 ffff8800b9e6fda0 ffffffff81352e2c 0000000000000000
  [   86.290110]  ffff8800b9e6fdd8 ffffffff81065142 ffff8801399175e8 ffff8800bb78b240
  [   86.290507]  ffff8801399175e8 ffff8800b73d7898 ffff8800b73d7840 ffff8800b9e6fde8
  [   86.290933] Call Trace:
  [   86.291080]  [<ffffffff81352e2c>] dump_stack+0x4e/0x82
  [   86.291340]  [<ffffffff81065142>] warn_slowpath_common+0x82/0xc0
  [   86.291640]  [<ffffffff8106523a>] warn_slowpath_null+0x1a/0x20
  [   86.291932]  [<ffffffff811ae62e>] drop_nlink+0x3e/0x50
  [   86.292208]  [<ffffffff811ba35b>] simple_unlink+0x4b/0x60
  [   86.292481]  [<ffffffff811ba3a7>] simple_rmdir+0x37/0x50
  [   86.292748]  [<ffffffff812d9808>] __debugfs_remove.part.16+0xa8/0xd0
  [   86.293082]  [<ffffffff812d9a0b>] debugfs_remove_recursive+0xdb/0x1c0
  [   86.293406]  [<ffffffffa00004dd>] cleanup_module+0x2d/0x3b [debugfs_example2]
  [   86.293762]  [<ffffffff810d959b>] SyS_delete_module+0x16b/0x220
  [   86.294077]  [<ffffffff818ef857>] entry_SYSCALL_64_fastpath+0x12/0x6a
  [   86.294405] ---[ end trace c9fc53353fe14a36 ]---
  [   86.294639] ------------[ cut here ]------------

To reproduce the issue it is enough to invoke these lines:

     autom = debugfs_create_automount("automount", NULL, vfsmount_cb, data);
     BUG_ON(IS_ERR_OR_NULL(autom));
     debugfs_remove(autom);

The issue is fixed by increasing inode i_nlink references for current
and parent inodes.

Signed-off-by: Roman Pen <r.peniaev@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Aug 19, 2016
I've been seeing these recently:

    INFO: task trinity-c3:14933 blocked for more than 120 seconds.
	  Not tainted 4.8.0-rc1+ torvalds#135
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    trinity-c3      D ffff88010c16fc88     0 14933      1 0x00080004
     ffff88010c16fc88 000000003b9aca00 0000000000000000 0000000000000296
     00000000776cdf88 ffff88011a520ae0 ffff88011a520b08 ffff88011a520198
     ffffffff867d7f00 ffff88011942c080 ffff880116841580 ffff88010c168000
    Call Trace:
     [<ffffffff845e9d37>] schedule+0x77/0x230
     [<ffffffff833cb8b9>] __lock_sock+0x129/0x250
     [<ffffffff833cb790>] ? __sk_destruct+0x450/0x450
     [<ffffffff81408ac0>] ? wake_bit_function+0x2e0/0x2e0
     [<ffffffff833d832b>] lock_sock_nested+0xeb/0x120
     [<ffffffff83bad815>] irda_setsockopt+0x65/0xb40
     [<ffffffff833c6c09>] SyS_setsockopt+0x139/0x230
     [<ffffffff833c6ad0>] ? SyS_recv+0x20/0x20
     [<ffffffff81004660>] ? trace_event_raw_event_sys_enter+0xb90/0xb90
     [<ffffffff823c7023>] ? __this_cpu_preempt_check+0x13/0x20
     [<ffffffff8162ee60>] ? __context_tracking_exit.part.3+0x30/0x1b0
     [<ffffffff833c6ad0>] ? SyS_recv+0x20/0x20
     [<ffffffff81007bd3>] do_syscall_64+0x1b3/0x4b0
     [<ffffffff845f84aa>] entry_SYSCALL64_slow_path+0x25/0x25

    Showing all locks held in the system:
    2 locks held by khungtaskd/563:
     #0:  (rcu_read_lock){......}, at: [<ffffffff81534ce6>] watchdog+0x106/0x910
     #1:  (tasklist_lock){......}, at: [<ffffffff8141b3c4>] debug_show_all_locks+0x74/0x360
    1 lock held by trinity-c0/19280:
     #0:  (sk_lock-AF_IRDA){......}, at: [<ffffffff83bab7c6>] irda_accept+0x176/0x10f0
    1 lock held by trinity-c0/12865:
     #0:  (sk_lock-AF_IRDA){......}, at: [<ffffffff83bab7c6>] irda_accept+0x176/0x10f0

The problem seems to be that irda_accept() goes to sleep after locking
the socket, which means that others trying to get the lock will be
"blocked for more than 120 seconds" like above.

There are unfortunately other places in the irda code that seem to be
doing the same thing: irda_connect(), irda_sendmsg(), and
irda_getsockopt() as far as I can tell at a glance. I'll start with
this patch to see if we're going in the right direction -- it does fix
the trinity problem for me, although I haven't tested any real IrDA
workloads.

Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Aug 19, 2016
When we get a hung task it can often be valuable to see _all_ the held
locks on the system (in case we are being blocked on trying to acquire
one), e.g. with this patch we can immediately see where the problem is
below:

    INFO: task trinity-c3:14933 blocked for more than 120 seconds.
	  Not tainted 4.8.0-rc1+ torvalds#135
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    trinity-c3      D ffff88010c16fc88     0 14933      1 0x00080004
     ffff88010c16fc88 000000003b9aca00 0000000000000000 0000000000000296
     00000000776cdf88 ffff88011a520ae0 ffff88011a520b08 ffff88011a520198
     ffffffff867d7f00 ffff88011942c080 ffff880116841580 ffff88010c168000
    Call Trace:
     [<ffffffff845e9d37>] schedule+0x77/0x230
     [<ffffffff833cb8b9>] __lock_sock+0x129/0x250
     [<ffffffff833cb790>] ? __sk_destruct+0x450/0x450
     [<ffffffff81408ac0>] ? wake_bit_function+0x2e0/0x2e0
     [<ffffffff833d832b>] lock_sock_nested+0xeb/0x120
     [<ffffffff83bad815>] irda_setsockopt+0x65/0xb40
     [<ffffffff833c6c09>] SyS_setsockopt+0x139/0x230
     [<ffffffff833c6ad0>] ? SyS_recv+0x20/0x20
     [<ffffffff81004660>] ? trace_event_raw_event_sys_enter+0xb90/0xb90
     [<ffffffff823c7023>] ? __this_cpu_preempt_check+0x13/0x20
     [<ffffffff8162ee60>] ? __context_tracking_exit.part.3+0x30/0x1b0
     [<ffffffff833c6ad0>] ? SyS_recv+0x20/0x20
     [<ffffffff81007bd3>] do_syscall_64+0x1b3/0x4b0
     [<ffffffff845f84aa>] entry_SYSCALL64_slow_path+0x25/0x25

    Showing all locks held in the system:
    2 locks held by khungtaskd/563:
     #0:  (rcu_read_lock){......}, at: [<ffffffff81534ce6>] watchdog+0x106/0x910
     #1:  (tasklist_lock){......}, at: [<ffffffff8141b3c4>] debug_show_all_locks+0x74/0x360
    1 lock held by trinity-c0/19280:
     #0:  (sk_lock-AF_IRDA){......}, at: [<ffffffff83bab7c6>] irda_accept+0x176/0x10f0
    1 lock held by trinity-c0/12865:
     #0:  (sk_lock-AF_IRDA){......}, at: [<ffffffff83bab7c6>] irda_accept+0x176/0x10f0

Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Aug 25, 2016
When we get a hung task it can often be valuable to see _all_ the held
locks on the system (in case we are being blocked on trying to acquire
one), e.g. with this patch we can immediately see where the problem is
below:

    INFO: task trinity-c3:14933 blocked for more than 120 seconds.
	  Not tainted 4.8.0-rc1+ torvalds#135
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    trinity-c3      D ffff88010c16fc88     0 14933      1 0x00080004
     ffff88010c16fc88 000000003b9aca00 0000000000000000 0000000000000296
     00000000776cdf88 ffff88011a520ae0 ffff88011a520b08 ffff88011a520198
     ffffffff867d7f00 ffff88011942c080 ffff880116841580 ffff88010c168000
    Call Trace:
     [<ffffffff845e9d37>] schedule+0x77/0x230
     [<ffffffff833cb8b9>] __lock_sock+0x129/0x250
     [<ffffffff833cb790>] ? __sk_destruct+0x450/0x450
     [<ffffffff81408ac0>] ? wake_bit_function+0x2e0/0x2e0
     [<ffffffff833d832b>] lock_sock_nested+0xeb/0x120
     [<ffffffff83bad815>] irda_setsockopt+0x65/0xb40
     [<ffffffff833c6c09>] SyS_setsockopt+0x139/0x230
     [<ffffffff833c6ad0>] ? SyS_recv+0x20/0x20
     [<ffffffff81004660>] ? trace_event_raw_event_sys_enter+0xb90/0xb90
     [<ffffffff823c7023>] ? __this_cpu_preempt_check+0x13/0x20
     [<ffffffff8162ee60>] ? __context_tracking_exit.part.3+0x30/0x1b0
     [<ffffffff833c6ad0>] ? SyS_recv+0x20/0x20
     [<ffffffff81007bd3>] do_syscall_64+0x1b3/0x4b0
     [<ffffffff845f84aa>] entry_SYSCALL64_slow_path+0x25/0x25

    Showing all locks held in the system:
    2 locks held by khungtaskd/563:
     #0:  (rcu_read_lock){......}, at: [<ffffffff81534ce6>] watchdog+0x106/0x910
     #1:  (tasklist_lock){......}, at: [<ffffffff8141b3c4>] debug_show_all_locks+0x74/0x360
    1 lock held by trinity-c0/19280:
     #0:  (sk_lock-AF_IRDA){......}, at: [<ffffffff83bab7c6>] irda_accept+0x176/0x10f0
    1 lock held by trinity-c0/12865:
     #0:  (sk_lock-AF_IRDA){......}, at: [<ffffffff83bab7c6>] irda_accept+0x176/0x10f0

Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1471538460-7505-1-git-send-email-vegard.nossum@oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
laijs pushed a commit to laijs/linux that referenced this pull request Feb 13, 2017
…hread

lkl: Free all memory for the initial syscall thread
mlyle pushed a commit to mlyle/linux that referenced this pull request Oct 20, 2017
ERROR: code indent should use tabs where possible
torvalds#124: FILE: kernel/panic.c:604:
+                        clear_warn_once_set,$

WARNING: please, no spaces at the start of a line
torvalds#124: FILE: kernel/panic.c:604:
+                        clear_warn_once_set,$

ERROR: code indent should use tabs where possible
torvalds#125: FILE: kernel/panic.c:605:
+^I^I        "%lld\n");$

WARNING: please use device_initcall() or more appropriate function instead of __initcall() (see include/linux/init.h)
torvalds#135: FILE: kernel/panic.c:615:
+__initcall(register_warn_debugfs);

total: 2 errors, 2 warnings, 87 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

NOTE: Whitespace errors detected.
      You may wish to use scripts/cleanpatch or scripts/cleanfile

./patches/support-resetting-warn_once.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Oct 29, 2017
ERROR: code indent should use tabs where possible
torvalds#124: FILE: kernel/panic.c:604:
+                        clear_warn_once_set,$

WARNING: please, no spaces at the start of a line
torvalds#124: FILE: kernel/panic.c:604:
+                        clear_warn_once_set,$

ERROR: code indent should use tabs where possible
torvalds#125: FILE: kernel/panic.c:605:
+^I^I        "%lld\n");$

WARNING: please use device_initcall() or more appropriate function instead of __initcall() (see include/linux/init.h)
torvalds#135: FILE: kernel/panic.c:615:
+__initcall(register_warn_debugfs);

total: 2 errors, 2 warnings, 87 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

NOTE: Whitespace errors detected.
      You may wish to use scripts/cleanpatch or scripts/cleanfile

./patches/support-resetting-warn_once.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Nov 8, 2017
ERROR: code indent should use tabs where possible
torvalds#124: FILE: kernel/panic.c:604:
+                        clear_warn_once_set,$

WARNING: please, no spaces at the start of a line
torvalds#124: FILE: kernel/panic.c:604:
+                        clear_warn_once_set,$

ERROR: code indent should use tabs where possible
torvalds#125: FILE: kernel/panic.c:605:
+^I^I        "%lld\n");$

WARNING: please use device_initcall() or more appropriate function instead of __initcall() (see include/linux/init.h)
torvalds#135: FILE: kernel/panic.c:615:
+__initcall(register_warn_debugfs);

total: 2 errors, 2 warnings, 87 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

NOTE: Whitespace errors detected.
      You may wish to use scripts/cleanpatch or scripts/cleanfile

./patches/support-resetting-warn_once.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
thierryreding pushed a commit to thierryreding/linux that referenced this pull request Nov 13, 2017
ERROR: code indent should use tabs where possible
torvalds#124: FILE: kernel/panic.c:604:
+                        clear_warn_once_set,$

WARNING: please, no spaces at the start of a line
torvalds#124: FILE: kernel/panic.c:604:
+                        clear_warn_once_set,$

ERROR: code indent should use tabs where possible
torvalds#125: FILE: kernel/panic.c:605:
+^I^I        "%lld\n");$

WARNING: please use device_initcall() or more appropriate function instead of __initcall() (see include/linux/init.h)
torvalds#135: FILE: kernel/panic.c:615:
+__initcall(register_warn_debugfs);

total: 2 errors, 2 warnings, 87 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

NOTE: Whitespace errors detected.
      You may wish to use scripts/cleanpatch or scripts/cleanfile

./patches/support-resetting-warn_once.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
ddstreet pushed a commit to ddstreet/linux that referenced this pull request Nov 15, 2017
ERROR: code indent should use tabs where possible
torvalds#124: FILE: kernel/panic.c:604:
+                        clear_warn_once_set,$

WARNING: please, no spaces at the start of a line
torvalds#124: FILE: kernel/panic.c:604:
+                        clear_warn_once_set,$

ERROR: code indent should use tabs where possible
torvalds#125: FILE: kernel/panic.c:605:
+^I^I        "%lld\n");$

WARNING: please use device_initcall() or more appropriate function instead of __initcall() (see include/linux/init.h)
torvalds#135: FILE: kernel/panic.c:615:
+__initcall(register_warn_debugfs);

total: 2 errors, 2 warnings, 87 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

NOTE: Whitespace errors detected.
      You may wish to use scripts/cleanpatch or scripts/cleanfile

./patches/support-resetting-warn_once.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
kraj pushed a commit to kraj/linux that referenced this pull request Dec 1, 2017
The apparmor_audit_data struct ordering got messed up during a merge
conflict, resulting in the signal integer and peer pointer being in
a union instead of a struct.

For most of the 4.13 and 4.14 life cycle, this was hidden by
commit 651e28c ("apparmor: add base infastructure for socket
mediation") which fixed the apparmor_audit_data struct when its data
was added. When that commit was reverted in -rc7 the signal audit bug
was exposed, and unfortunately it never showed up in any of the
testing until after 4.14 was released. Shaun Khan, Zephaniah
E. Loss-Cutler-Hull filed nearly simultaneous bug reports (with
different oopes, the smaller of which is included below).

Full credit goes to Tetsuo Handa for jumping on this as well and
noticing the audit data struct problem and reporting it.

[   76.178568] BUG: unable to handle kernel paging request at
ffffffff0eee3bc0
[   76.178579] IP: audit_signal_cb+0x6c/0xe0
[   76.178581] PGD 1a640a067 P4D 1a640a067 PUD 0
[   76.178586] Oops: 0000 [#1] PREEMPT SMP
[   76.178589] Modules linked in: fuse rfcomm bnep usblp uvcvideo btusb
btrtl btbcm btintel bluetooth ecdh_generic ip6table_filter ip6_tables
xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack
iptable_filter ip_tables x_tables intel_rapl joydev wmi_bmof serio_raw
iwldvm iwlwifi shpchp kvm_intel kvm irqbypass autofs4 algif_skcipher
nls_iso8859_1 nls_cp437 crc32_pclmul ghash_clmulni_intel
[   76.178620] CPU: 0 PID: 10675 Comm: pidgin Not tainted
4.14.0-f1-dirty torvalds#135
[   76.178623] Hardware name: Hewlett-Packard HP EliteBook Folio
9470m/18DF, BIOS 68IBD Ver. F.62 10/22/2015
[   76.178625] task: ffff9c7a94c31dc0 task.stack: ffffa09b02a4c000
[   76.178628] RIP: 0010:audit_signal_cb+0x6c/0xe0
[   76.178631] RSP: 0018:ffffa09b02a4fc08 EFLAGS: 00010292
[   76.178634] RAX: ffffa09b02a4fd60 RBX: ffff9c7aee0741f8 RCX:
0000000000000000
[   76.178636] RDX: ffffffffee012290 RSI: 0000000000000006 RDI:
ffff9c7a9493d800
[   76.178638] RBP: ffffa09b02a4fd40 R08: 000000000000004d R09:
ffffa09b02a4fc46
[   76.178641] R10: ffffa09b02a4fcb8 R11: ffff9c7ab44f5072 R12:
ffffa09b02a4fd40
[   76.178643] R13: ffffffff9e447be0 R14: ffff9c7a94c31dc0 R15:
0000000000000001
[   76.178646] FS:  00007f8b11ba2a80(0000) GS:ffff9c7afea00000(0000)
knlGS:0000000000000000
[   76.178648] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   76.178650] CR2: ffffffff0eee3bc0 CR3: 00000003d5209002 CR4:
00000000001606f0
[   76.178652] Call Trace:
[   76.178660]  common_lsm_audit+0x1da/0x780
[   76.178665]  ? d_absolute_path+0x60/0x90
[   76.178669]  ? aa_check_perms+0xcd/0xe0
[   76.178672]  aa_check_perms+0xcd/0xe0
[   76.178675]  profile_signal_perm.part.0+0x90/0xa0
[   76.178679]  aa_may_signal+0x16e/0x1b0
[   76.178686]  apparmor_task_kill+0x51/0x120
[   76.178690]  security_task_kill+0x44/0x60
[   76.178695]  group_send_sig_info+0x25/0x60
[   76.178699]  kill_pid_info+0x36/0x60
[   76.178703]  SYSC_kill+0xdb/0x180
[   76.178707]  ? preempt_count_sub+0x92/0xd0
[   76.178712]  ? _raw_write_unlock_irq+0x13/0x30
[   76.178716]  ? task_work_run+0x6a/0x90
[   76.178720]  ? exit_to_usermode_loop+0x80/0xa0
[   76.178723]  entry_SYSCALL_64_fastpath+0x13/0x94
[   76.178727] RIP: 0033:0x7f8b0e58b767
[   76.178729] RSP: 002b:00007fff19efd4d8 EFLAGS: 00000206 ORIG_RAX:
000000000000003e
[   76.178732] RAX: ffffffffffffffda RBX: 0000557f3e3c2050 RCX:
00007f8b0e58b767
[   76.178735] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
000000000000263b
[   76.178737] RBP: 0000000000000000 R08: 0000557f3e3c2270 R09:
0000000000000001
[   76.178739] R10: 000000000000022d R11: 0000000000000206 R12:
0000000000000000
[   76.178741] R13: 0000000000000001 R14: 0000557f3e3c13c0 R15:
0000000000000000
[   76.178745] Code: 48 8b 55 18 48 89 df 41 b8 20 00 08 01 5b 5d 48 8b
42 10 48 8b 52 30 48 63 48 4c 48 8b 44 c8 48 31 c9 48 8b 70 38 e9 f4 fd
00 00 <48> 8b 14 d5 40 27 e5 9e 48 c7 c6 7d 07 19 9f 48 89 df e8 fd 35
[   76.178794] RIP: audit_signal_cb+0x6c/0xe0 RSP: ffffa09b02a4fc08
[   76.178796] CR2: ffffffff0eee3bc0
[   76.178799] ---[ end trace 514af9529297f1a3 ]---

Fixes: cd1dbf7 ("apparmor: add the ability to mediate signals")
Reported-by: Zephaniah E. Loss-Cutler-Hull <warp-spam_kernel@aehallh.com>
Reported-by: Shuah Khan <shuahkh@osg.samsung.com>
Suggested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Tested-by: Ivan Kozik <ivan@ludios.org>
Tested-by: Zephaniah E. Loss-Cutler-Hull <warp-spam_kernel@aehallh.com>
Tested-by: Christian Boltz <apparmor@cboltz.de>
Tested-by: Shuah Khan <shuahkh@osg.samsung.com>
Cc: stable@vger.kernel.org
Signed-off-by: John Johansen <john.johansen@canonical.com>
CallMeFoxie pushed a commit to CallMeFoxie/linux that referenced this pull request Dec 5, 2017
commit b12cbb2 upstream.

The apparmor_audit_data struct ordering got messed up during a merge
conflict, resulting in the signal integer and peer pointer being in
a union instead of a struct.

For most of the 4.13 and 4.14 life cycle, this was hidden by
commit 651e28c ("apparmor: add base infastructure for socket
mediation") which fixed the apparmor_audit_data struct when its data
was added. When that commit was reverted in -rc7 the signal audit bug
was exposed, and unfortunately it never showed up in any of the
testing until after 4.14 was released. Shaun Khan, Zephaniah
E. Loss-Cutler-Hull filed nearly simultaneous bug reports (with
different oopes, the smaller of which is included below).

Full credit goes to Tetsuo Handa for jumping on this as well and
noticing the audit data struct problem and reporting it.

[   76.178568] BUG: unable to handle kernel paging request at
ffffffff0eee3bc0
[   76.178579] IP: audit_signal_cb+0x6c/0xe0
[   76.178581] PGD 1a640a067 P4D 1a640a067 PUD 0
[   76.178586] Oops: 0000 [#1] PREEMPT SMP
[   76.178589] Modules linked in: fuse rfcomm bnep usblp uvcvideo btusb
btrtl btbcm btintel bluetooth ecdh_generic ip6table_filter ip6_tables
xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack
iptable_filter ip_tables x_tables intel_rapl joydev wmi_bmof serio_raw
iwldvm iwlwifi shpchp kvm_intel kvm irqbypass autofs4 algif_skcipher
nls_iso8859_1 nls_cp437 crc32_pclmul ghash_clmulni_intel
[   76.178620] CPU: 0 PID: 10675 Comm: pidgin Not tainted
4.14.0-f1-dirty torvalds#135
[   76.178623] Hardware name: Hewlett-Packard HP EliteBook Folio
9470m/18DF, BIOS 68IBD Ver. F.62 10/22/2015
[   76.178625] task: ffff9c7a94c31dc0 task.stack: ffffa09b02a4c000
[   76.178628] RIP: 0010:audit_signal_cb+0x6c/0xe0
[   76.178631] RSP: 0018:ffffa09b02a4fc08 EFLAGS: 00010292
[   76.178634] RAX: ffffa09b02a4fd60 RBX: ffff9c7aee0741f8 RCX:
0000000000000000
[   76.178636] RDX: ffffffffee012290 RSI: 0000000000000006 RDI:
ffff9c7a9493d800
[   76.178638] RBP: ffffa09b02a4fd40 R08: 000000000000004d R09:
ffffa09b02a4fc46
[   76.178641] R10: ffffa09b02a4fcb8 R11: ffff9c7ab44f5072 R12:
ffffa09b02a4fd40
[   76.178643] R13: ffffffff9e447be0 R14: ffff9c7a94c31dc0 R15:
0000000000000001
[   76.178646] FS:  00007f8b11ba2a80(0000) GS:ffff9c7afea00000(0000)
knlGS:0000000000000000
[   76.178648] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   76.178650] CR2: ffffffff0eee3bc0 CR3: 00000003d5209002 CR4:
00000000001606f0
[   76.178652] Call Trace:
[   76.178660]  common_lsm_audit+0x1da/0x780
[   76.178665]  ? d_absolute_path+0x60/0x90
[   76.178669]  ? aa_check_perms+0xcd/0xe0
[   76.178672]  aa_check_perms+0xcd/0xe0
[   76.178675]  profile_signal_perm.part.0+0x90/0xa0
[   76.178679]  aa_may_signal+0x16e/0x1b0
[   76.178686]  apparmor_task_kill+0x51/0x120
[   76.178690]  security_task_kill+0x44/0x60
[   76.178695]  group_send_sig_info+0x25/0x60
[   76.178699]  kill_pid_info+0x36/0x60
[   76.178703]  SYSC_kill+0xdb/0x180
[   76.178707]  ? preempt_count_sub+0x92/0xd0
[   76.178712]  ? _raw_write_unlock_irq+0x13/0x30
[   76.178716]  ? task_work_run+0x6a/0x90
[   76.178720]  ? exit_to_usermode_loop+0x80/0xa0
[   76.178723]  entry_SYSCALL_64_fastpath+0x13/0x94
[   76.178727] RIP: 0033:0x7f8b0e58b767
[   76.178729] RSP: 002b:00007fff19efd4d8 EFLAGS: 00000206 ORIG_RAX:
000000000000003e
[   76.178732] RAX: ffffffffffffffda RBX: 0000557f3e3c2050 RCX:
00007f8b0e58b767
[   76.178735] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
000000000000263b
[   76.178737] RBP: 0000000000000000 R08: 0000557f3e3c2270 R09:
0000000000000001
[   76.178739] R10: 000000000000022d R11: 0000000000000206 R12:
0000000000000000
[   76.178741] R13: 0000000000000001 R14: 0000557f3e3c13c0 R15:
0000000000000000
[   76.178745] Code: 48 8b 55 18 48 89 df 41 b8 20 00 08 01 5b 5d 48 8b
42 10 48 8b 52 30 48 63 48 4c 48 8b 44 c8 48 31 c9 48 8b 70 38 e9 f4 fd
00 00 <48> 8b 14 d5 40 27 e5 9e 48 c7 c6 7d 07 19 9f 48 89 df e8 fd 35
[   76.178794] RIP: audit_signal_cb+0x6c/0xe0 RSP: ffffa09b02a4fc08
[   76.178796] CR2: ffffffff0eee3bc0
[   76.178799] ---[ end trace 514af9529297f1a3 ]---

Fixes: cd1dbf7 ("apparmor: add the ability to mediate signals")
Reported-by: Zephaniah E. Loss-Cutler-Hull <warp-spam_kernel@aehallh.com>
Reported-by: Shuah Khan <shuahkh@osg.samsung.com>
Suggested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Tested-by: Ivan Kozik <ivan@ludios.org>
Tested-by: Zephaniah E. Loss-Cutler-Hull <warp-spam_kernel@aehallh.com>
Tested-by: Christian Boltz <apparmor@cboltz.de>
Tested-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Arzte pushed a commit to Arzte/linux that referenced this pull request Jan 7, 2018
commit b12cbb2 upstream.

The apparmor_audit_data struct ordering got messed up during a merge
conflict, resulting in the signal integer and peer pointer being in
a union instead of a struct.

For most of the 4.13 and 4.14 life cycle, this was hidden by
commit 651e28c ("apparmor: add base infastructure for socket
mediation") which fixed the apparmor_audit_data struct when its data
was added. When that commit was reverted in -rc7 the signal audit bug
was exposed, and unfortunately it never showed up in any of the
testing until after 4.14 was released. Shaun Khan, Zephaniah
E. Loss-Cutler-Hull filed nearly simultaneous bug reports (with
different oopes, the smaller of which is included below).

Full credit goes to Tetsuo Handa for jumping on this as well and
noticing the audit data struct problem and reporting it.

[   76.178568] BUG: unable to handle kernel paging request at
ffffffff0eee3bc0
[   76.178579] IP: audit_signal_cb+0x6c/0xe0
[   76.178581] PGD 1a640a067 P4D 1a640a067 PUD 0
[   76.178586] Oops: 0000 [#1] PREEMPT SMP
[   76.178589] Modules linked in: fuse rfcomm bnep usblp uvcvideo btusb
btrtl btbcm btintel bluetooth ecdh_generic ip6table_filter ip6_tables
xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack
iptable_filter ip_tables x_tables intel_rapl joydev wmi_bmof serio_raw
iwldvm iwlwifi shpchp kvm_intel kvm irqbypass autofs4 algif_skcipher
nls_iso8859_1 nls_cp437 crc32_pclmul ghash_clmulni_intel
[   76.178620] CPU: 0 PID: 10675 Comm: pidgin Not tainted
4.14.0-f1-dirty torvalds#135
[   76.178623] Hardware name: Hewlett-Packard HP EliteBook Folio
9470m/18DF, BIOS 68IBD Ver. F.62 10/22/2015
[   76.178625] task: ffff9c7a94c31dc0 task.stack: ffffa09b02a4c000
[   76.178628] RIP: 0010:audit_signal_cb+0x6c/0xe0
[   76.178631] RSP: 0018:ffffa09b02a4fc08 EFLAGS: 00010292
[   76.178634] RAX: ffffa09b02a4fd60 RBX: ffff9c7aee0741f8 RCX:
0000000000000000
[   76.178636] RDX: ffffffffee012290 RSI: 0000000000000006 RDI:
ffff9c7a9493d800
[   76.178638] RBP: ffffa09b02a4fd40 R08: 000000000000004d R09:
ffffa09b02a4fc46
[   76.178641] R10: ffffa09b02a4fcb8 R11: ffff9c7ab44f5072 R12:
ffffa09b02a4fd40
[   76.178643] R13: ffffffff9e447be0 R14: ffff9c7a94c31dc0 R15:
0000000000000001
[   76.178646] FS:  00007f8b11ba2a80(0000) GS:ffff9c7afea00000(0000)
knlGS:0000000000000000
[   76.178648] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   76.178650] CR2: ffffffff0eee3bc0 CR3: 00000003d5209002 CR4:
00000000001606f0
[   76.178652] Call Trace:
[   76.178660]  common_lsm_audit+0x1da/0x780
[   76.178665]  ? d_absolute_path+0x60/0x90
[   76.178669]  ? aa_check_perms+0xcd/0xe0
[   76.178672]  aa_check_perms+0xcd/0xe0
[   76.178675]  profile_signal_perm.part.0+0x90/0xa0
[   76.178679]  aa_may_signal+0x16e/0x1b0
[   76.178686]  apparmor_task_kill+0x51/0x120
[   76.178690]  security_task_kill+0x44/0x60
[   76.178695]  group_send_sig_info+0x25/0x60
[   76.178699]  kill_pid_info+0x36/0x60
[   76.178703]  SYSC_kill+0xdb/0x180
[   76.178707]  ? preempt_count_sub+0x92/0xd0
[   76.178712]  ? _raw_write_unlock_irq+0x13/0x30
[   76.178716]  ? task_work_run+0x6a/0x90
[   76.178720]  ? exit_to_usermode_loop+0x80/0xa0
[   76.178723]  entry_SYSCALL_64_fastpath+0x13/0x94
[   76.178727] RIP: 0033:0x7f8b0e58b767
[   76.178729] RSP: 002b:00007fff19efd4d8 EFLAGS: 00000206 ORIG_RAX:
000000000000003e
[   76.178732] RAX: ffffffffffffffda RBX: 0000557f3e3c2050 RCX:
00007f8b0e58b767
[   76.178735] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
000000000000263b
[   76.178737] RBP: 0000000000000000 R08: 0000557f3e3c2270 R09:
0000000000000001
[   76.178739] R10: 000000000000022d R11: 0000000000000206 R12:
0000000000000000
[   76.178741] R13: 0000000000000001 R14: 0000557f3e3c13c0 R15:
0000000000000000
[   76.178745] Code: 48 8b 55 18 48 89 df 41 b8 20 00 08 01 5b 5d 48 8b
42 10 48 8b 52 30 48 63 48 4c 48 8b 44 c8 48 31 c9 48 8b 70 38 e9 f4 fd
00 00 <48> 8b 14 d5 40 27 e5 9e 48 c7 c6 7d 07 19 9f 48 89 df e8 fd 35
[   76.178794] RIP: audit_signal_cb+0x6c/0xe0 RSP: ffffa09b02a4fc08
[   76.178796] CR2: ffffffff0eee3bc0
[   76.178799] ---[ end trace 514af9529297f1a3 ]---

Fixes: cd1dbf7 ("apparmor: add the ability to mediate signals")
Reported-by: Zephaniah E. Loss-Cutler-Hull <warp-spam_kernel@aehallh.com>
Reported-by: Shuah Khan <shuahkh@osg.samsung.com>
Suggested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Tested-by: Ivan Kozik <ivan@ludios.org>
Tested-by: Zephaniah E. Loss-Cutler-Hull <warp-spam_kernel@aehallh.com>
Tested-by: Christian Boltz <apparmor@cboltz.de>
Tested-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
iaguis pushed a commit to kinvolk/linux that referenced this pull request Feb 6, 2018
Rebase 4.14.15 patches onto 4.14.16
krzk pushed a commit to krzk/linux that referenced this pull request Aug 10, 2018
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
torvalds#49: FILE: fs/mpage.c:139:
+	unsigned nr_pages;

WARNING: line over 80 characters
torvalds#96: FILE: fs/mpage.c:190:
+	if (buffer_mapped(map_bh) && block_in_file > args->first_logical_block &&

WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
torvalds#98: FILE: fs/mpage.c:192:
+		unsigned map_offset = block_in_file - args->first_logical_block;

WARNING: line over 80 characters
torvalds#135: FILE: fs/mpage.c:296:
+				min_t(int, args->nr_pages, BIO_MAX_PAGES), args->gfp);

ERROR: code indent should use tabs where possible
torvalds#169: FILE: fs/mpage.c:321:
+^I        block_read_full_page(page, args->get_block);$

total: 1 errors, 4 warnings, 196 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

NOTE: Whitespace errors detected.
      You may wish to use scripts/cleanpatch or scripts/cleanfile

./patches/mpage-add-argument-structure-for-do_mpage_readpage.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
rgushchin pushed a commit to rgushchin/linux that referenced this pull request Aug 16, 2018
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
torvalds#49: FILE: fs/mpage.c:139:
+	unsigned nr_pages;

WARNING: line over 80 characters
torvalds#96: FILE: fs/mpage.c:190:
+	if (buffer_mapped(map_bh) && block_in_file > args->first_logical_block &&

WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
torvalds#98: FILE: fs/mpage.c:192:
+		unsigned map_offset = block_in_file - args->first_logical_block;

WARNING: line over 80 characters
torvalds#135: FILE: fs/mpage.c:296:
+				min_t(int, args->nr_pages, BIO_MAX_PAGES), args->gfp);

ERROR: code indent should use tabs where possible
torvalds#169: FILE: fs/mpage.c:321:
+^I        block_read_full_page(page, args->get_block);$

total: 1 errors, 4 warnings, 196 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

NOTE: Whitespace errors detected.
      You may wish to use scripts/cleanpatch or scripts/cleanfile

./patches/mpage-add-argument-structure-for-do_mpage_readpage.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches


Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
theopolis pushed a commit to theopolis/linux that referenced this pull request Feb 22, 2019
Summary:
1. Change I2C bus 0 clock to 400KHz
2. Add retry when get BIC update status fail
Pull Request resolved: facebookexternal/openbmc.accton#135

Reviewed By: tomrepo

Differential Revision: D14145628

Pulled By: mikechoifb

fbshipit-source-id: 8013c4d8aa
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Jun 5, 2019
The current cgroup OOM memory info dump doesn't include all the memory
we are tracking, nor does it give insight into what the VM tried to do
leading up to the OOM. All that useful info is in memory.stat.

Furthermore, the recursive printing for every child cgroup can
generate absurd amounts of data on the console for larger cgroup
trees, and it's not like we provide a per-cgroup breakdown during
global OOM kills.

When an OOM kill is triggered, print one set of recursive memory.stat
items at the level whose limit triggered the OOM condition.

Example output:

stress invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
CPU: 2 PID: 210 Comm: stress Not tainted 5.2.0-rc2-mm1-00247-g47d49835983c torvalds#135
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181126_142135-anatol 04/01/2014
Call Trace:
 dump_stack+0x46/0x60
 dump_header+0x4c/0x2d0
 oom_kill_process.cold.10+0xb/0x10
 out_of_memory+0x200/0x270
 ? try_to_free_mem_cgroup_pages+0xdf/0x130
 mem_cgroup_out_of_memory+0xb7/0xc0
 try_charge+0x680/0x6f0
 mem_cgroup_try_charge+0xb5/0x160
 __add_to_page_cache_locked+0xc6/0x300
 ? list_lru_destroy+0x80/0x80
 add_to_page_cache_lru+0x45/0xc0
 pagecache_get_page+0x11b/0x290
 filemap_fault+0x458/0x6d0
 ext4_filemap_fault+0x27/0x36
 __do_fault+0x2f/0xb0
 __handle_mm_fault+0x9c5/0x1140
 ? apic_timer_interrupt+0xa/0x20
 handle_mm_fault+0xc5/0x180
 __do_page_fault+0x1ab/0x440
 ? page_fault+0x8/0x30
 page_fault+0x1e/0x30
RIP: 0033:0x55c32167fc10
Code: Bad RIP value.
RSP: 002b:00007fff1d031c50 EFLAGS: 00010206
RAX: 000000000dc00000 RBX: 00007fd2db000010 RCX: 00007fd2db000010
RDX: 0000000000000000 RSI: 0000000010001000 RDI: 0000000000000000
RBP: 000055c321680a54 R08: 00000000ffffffff R09: 0000000000000000
R10: 0000000000000022 R11: 0000000000000246 R12: ffffffffffffffff
R13: 0000000000000002 R14: 0000000000001000 R15: 0000000010000000
memory: usage 1024kB, limit 1024kB, failcnt 75131
swap: usage 0kB, limit 9007199254740988kB, failcnt 0
Memory cgroup stats for /foo:
anon 0
file 0
kernel_stack 36864
slab 274432
sock 0
shmem 0
file_mapped 0
file_dirty 0
file_writeback 0
anon_thp 0
inactive_anon 126976
active_anon 0
inactive_file 0
active_file 0
unevictable 0
slab_reclaimable 0
slab_unreclaimable 274432
pgfault 59466
pgmajfault 1617
workingset_refault 2145
workingset_activate 0
workingset_nodereclaim 0
pgrefill 98952
pgscan 200060
pgsteal 59340
pgactivate 40095
pgdeactivate 96787
pglazyfree 0
pglazyfreed 0
thp_fault_alloc 0
thp_collapse_alloc 0
Tasks state (memory values in pages):
[  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[    200]     0   200     1121      884    53248       29             0 bash
[    209]     0   209      905      246    45056       19             0 stress
[    210]     0   210    66442       56   499712    56349             0 stress
oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),oom_memcg=/foo,task_memcg=/foo,task=stress,pid=210,uid=0
Memory cgroup out of memory: Killed process 210 (stress) total-vm:265768kB, anon-rss:0kB, file-rss:224kB, shmem-rss:0kB
oom_reaper: reaped process 210 (stress), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

Link: http://lkml.kernel.org/r/20190604210509.9744-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mrchapp pushed a commit to mrchapp/linux that referenced this pull request Jun 6, 2019
The current cgroup OOM memory info dump doesn't include all the memory
we are tracking, nor does it give insight into what the VM tried to do
leading up to the OOM. All that useful info is in memory.stat.

Furthermore, the recursive printing for every child cgroup can
generate absurd amounts of data on the console for larger cgroup
trees, and it's not like we provide a per-cgroup breakdown during
global OOM kills.

When an OOM kill is triggered, print one set of recursive memory.stat
items at the level whose limit triggered the OOM condition.

Example output:

stress invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
CPU: 2 PID: 210 Comm: stress Not tainted 5.2.0-rc2-mm1-00247-g47d49835983c torvalds#135
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181126_142135-anatol 04/01/2014
Call Trace:
 dump_stack+0x46/0x60
 dump_header+0x4c/0x2d0
 oom_kill_process.cold.10+0xb/0x10
 out_of_memory+0x200/0x270
 ? try_to_free_mem_cgroup_pages+0xdf/0x130
 mem_cgroup_out_of_memory+0xb7/0xc0
 try_charge+0x680/0x6f0
 mem_cgroup_try_charge+0xb5/0x160
 __add_to_page_cache_locked+0xc6/0x300
 ? list_lru_destroy+0x80/0x80
 add_to_page_cache_lru+0x45/0xc0
 pagecache_get_page+0x11b/0x290
 filemap_fault+0x458/0x6d0
 ext4_filemap_fault+0x27/0x36
 __do_fault+0x2f/0xb0
 __handle_mm_fault+0x9c5/0x1140
 ? apic_timer_interrupt+0xa/0x20
 handle_mm_fault+0xc5/0x180
 __do_page_fault+0x1ab/0x440
 ? page_fault+0x8/0x30
 page_fault+0x1e/0x30
RIP: 0033:0x55c32167fc10
Code: Bad RIP value.
RSP: 002b:00007fff1d031c50 EFLAGS: 00010206
RAX: 000000000dc00000 RBX: 00007fd2db000010 RCX: 00007fd2db000010
RDX: 0000000000000000 RSI: 0000000010001000 RDI: 0000000000000000
RBP: 000055c321680a54 R08: 00000000ffffffff R09: 0000000000000000
R10: 0000000000000022 R11: 0000000000000246 R12: ffffffffffffffff
R13: 0000000000000002 R14: 0000000000001000 R15: 0000000010000000
memory: usage 1024kB, limit 1024kB, failcnt 75131
swap: usage 0kB, limit 9007199254740988kB, failcnt 0
Memory cgroup stats for /foo:
anon 0
file 0
kernel_stack 36864
slab 274432
sock 0
shmem 0
file_mapped 0
file_dirty 0
file_writeback 0
anon_thp 0
inactive_anon 126976
active_anon 0
inactive_file 0
active_file 0
unevictable 0
slab_reclaimable 0
slab_unreclaimable 274432
pgfault 59466
pgmajfault 1617
workingset_refault 2145
workingset_activate 0
workingset_nodereclaim 0
pgrefill 98952
pgscan 200060
pgsteal 59340
pgactivate 40095
pgdeactivate 96787
pglazyfree 0
pglazyfreed 0
thp_fault_alloc 0
thp_collapse_alloc 0
Tasks state (memory values in pages):
[  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[    200]     0   200     1121      884    53248       29             0 bash
[    209]     0   209      905      246    45056       19             0 stress
[    210]     0   210    66442       56   499712    56349             0 stress
oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),oom_memcg=/foo,task_memcg=/foo,task=stress,pid=210,uid=0
Memory cgroup out of memory: Killed process 210 (stress) total-vm:265768kB, anon-rss:0kB, file-rss:224kB, shmem-rss:0kB
oom_reaper: reaped process 210 (stress), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

Link: http://lkml.kernel.org/r/20190604210509.9744-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request Dec 11, 2023
syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request Dec 11, 2023
syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request Dec 11, 2023
syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request Dec 11, 2023
syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request Dec 11, 2023
syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request Dec 11, 2023
syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request Dec 11, 2023
syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request Dec 12, 2023
syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request Dec 12, 2023
syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request Dec 12, 2023
syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request Dec 12, 2023
syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request Dec 12, 2023
syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request Dec 12, 2023
syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request Dec 12, 2023
syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request Dec 12, 2023
syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request Dec 13, 2023
syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Signed-off-by: NipaLocal <nipa@local>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Dec 13, 2023
syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Link: https://lore.kernel.org/r/20231210020200.1539875-1-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Dec 18, 2023
[ Upstream commit f99cd56 ]

syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Link: https://lore.kernel.org/r/20231210020200.1539875-1-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
staging-kernelci-org pushed a commit to kernelci/linux that referenced this pull request Dec 20, 2023
[ Upstream commit f99cd56 ]

syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Link: https://lore.kernel.org/r/20231210020200.1539875-1-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Dec 20, 2023
[ Upstream commit f99cd56 ]

syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Link: https://lore.kernel.org/r/20231210020200.1539875-1-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Dec 20, 2023
[ Upstream commit f99cd56 ]

syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Link: https://lore.kernel.org/r/20231210020200.1539875-1-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Dec 20, 2023
[ Upstream commit f99cd56 ]

syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Link: https://lore.kernel.org/r/20231210020200.1539875-1-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Dec 20, 2023
[ Upstream commit f99cd56 ]

syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Link: https://lore.kernel.org/r/20231210020200.1539875-1-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ptr1337 pushed a commit to CachyOS/linux that referenced this pull request Dec 20, 2023
[ Upstream commit f99cd56 ]

syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Link: https://lore.kernel.org/r/20231210020200.1539875-1-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Dec 20, 2023
[ Upstream commit f99cd56 ]

syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Link: https://lore.kernel.org/r/20231210020200.1539875-1-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
fallen pushed a commit to kalray/linux_coolidge that referenced this pull request Jan 8, 2024
Summary:
In order to enable lockdep support, we need a compelte TRACE_IRQFLAGS
support. While enabled previously, there no trace_hardirqs support
leading to various errors when enabling PROVE_LOCKING.
This patch adds necessary calls to trace_hardirqs_on/off in assembly to
correctly track irqs state when entering exceptions.
During testing, a problem arose when calling trace_hardirqs_off
indirectly from module:

```
 ------------[ cut here ]------------
 WARNING: CPU: 0 PID: 59 at kernel/locking/lockdep.c:4329 check_flags.part.23+0x23c/0x240
 DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
 Modules linked in: crc32c_generic(+) loop ext4 crc16 mbcache jbd2 crypto_hash crypto_algapi crypto m)
 CPU: 0 PID: 59 Comm: modprobe Tainted: G           O      5.3.0 torvalds#135
 Hardware name: Kalray HAPS prototyping (LS) (DT)
 ...
 Call Trace:
 [<ffffff80007816e8>] dump_stack+0x30/0x50
 [<ffffff8000216c98>] __warn+0x138/0x158
 [<ffffff8000216d10>] warn_slowpath_fmt+0x58/0x78
 [<ffffff800026224c>] check_flags.part.23+0x23c/0x240
 [<ffffff800026649c>] lock_is_held_type+0x1a4/0x1b0
 [<ffffff8000282dd0>] rcu_read_lock_sched_held+0x88/0x90
 [<ffffff80002a2fac>] module_assert_mutex_or_preempt+0x2c/0x98
 [<ffffff80002a3988>] __module_address+0x48/0x150
 [<ffffff80002a3ab0>] __module_text_address+0x20/0xb0
 [<ffffff80002a7900>] is_module_text_address+0x18/0x38
 [<ffffff800023be04>] kernel_text_address+0x8c/0xb8
 [<ffffff800023be50>] __kernel_text_address+0x20/0x90
 [<ffffff800020ded8>] walk_stackframe+0x78/0xe0
 [<ffffff800020e5e8>] return_address+0x58/0x90
 [<ffffff80002c3650>] trace_hardirqs_off+0x68/0x1b8
 [<ffffff800036912c>] kfree+0xe4/0x328
 [<ffffff8040087c3c>] crypto_larval_destroy+0x54/0x78 [crypto]
 [<ffffff8040087464>] crypto_larval_kill+0xd4/0xf8 [crypto]
 [<ffffff804008080c>] crypto_wait_for_test+0x9c/0x108 [crypto_algapi]
 [<ffffff8040080e24>] crypto_register_alg+0xdc/0xe8 [crypto_algapi]
 [<ffffff804008d308>] crypto_register_shash+0x48/0x70 [crypto_hash]
 [<ffffff8040096030>] crc32c_mod_init+0x30/0x4c [crc32c_generic]
 [<ffffff80002099d8>] do_one_initcall+0x70/0x2b0
 [<ffffff80002a7ed4>] do_init_module+0x74/0x270
 [<ffffff80002a6a04>] load_module+0x223c/0x26f0
 [<ffffff80002a70c8>] sys_finit_module+0xb0/0x118
 irq event stamp: 1989
 hardirqs last  enabled at (1989): [<ffffff80007a8854>] _raw_spin_unlock_irqrestore+0x8c/0xa0
 hardirqs last disabled at (1988): [<ffffff80007a861c>] _raw_spin_lock_irqsave+0x34/0x88
 softirqs last  enabled at (1940): [<ffffff80007ab04c>] _etext+0x38c/0x2340
 softirqs last disabled at (1935): [<ffffff800021c754>] irq_exit+0xa4/0xa8
 ---[ end trace f667df957dc78eed ]---
 possible reason: unannotated irqs-off.
```

Problem is due to the fact trace_hardirqs_off used CALLER_ADDR0 and
CALLER_ADDR1. These macros indirectly uses return_addr. Then in
walk_stackframe, address of frame pointer ra is checked to be a kernel
address using __kernel_text_address. In this function, __module_address
take a lock which trigger a hardirqs check that fails. However, since we are
calling it to report that trace_hardirqs are off, this is triggering a check
before even tracking that irqs are off. Remove the call to
__kernel_text_address as it is also not done on arm64.

Ref T10401

Test Plan:
Executed both smp & valid images on FPGA and verified that there are no
signaled lockdep errors. Note that this can't run on debug image since the
overhead to call trace_hardirqs_on in exception does not leave any time for
normal code to execute.

Reviewers: O51 Linux Coolidge, gthouvenin

Reviewed By: O51 Linux Coolidge, gthouvenin

Subscribers: gthouvenin, jcpince, alfred, #linux_coolidge_cc

Maniphest Tasks: T10401

Differential Revision: https://phab.kalray.eu/D2034
fallen pushed a commit to kalray/linux_coolidge that referenced this pull request Jan 15, 2024
Summary:
In order to enable lockdep support, we need a compelte TRACE_IRQFLAGS
support. While enabled previously, there no trace_hardirqs support
leading to various errors when enabling PROVE_LOCKING.
This patch adds necessary calls to trace_hardirqs_on/off in assembly to
correctly track irqs state when entering exceptions.
During testing, a problem arose when calling trace_hardirqs_off
indirectly from module:

```
 ------------[ cut here ]------------
 WARNING: CPU: 0 PID: 59 at kernel/locking/lockdep.c:4329 check_flags.part.23+0x23c/0x240
 DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
 Modules linked in: crc32c_generic(+) loop ext4 crc16 mbcache jbd2 crypto_hash crypto_algapi crypto m)
 CPU: 0 PID: 59 Comm: modprobe Tainted: G           O      5.3.0 torvalds#135
 Hardware name: Kalray HAPS prototyping (LS) (DT)
 ...
 Call Trace:
 [<ffffff80007816e8>] dump_stack+0x30/0x50
 [<ffffff8000216c98>] __warn+0x138/0x158
 [<ffffff8000216d10>] warn_slowpath_fmt+0x58/0x78
 [<ffffff800026224c>] check_flags.part.23+0x23c/0x240
 [<ffffff800026649c>] lock_is_held_type+0x1a4/0x1b0
 [<ffffff8000282dd0>] rcu_read_lock_sched_held+0x88/0x90
 [<ffffff80002a2fac>] module_assert_mutex_or_preempt+0x2c/0x98
 [<ffffff80002a3988>] __module_address+0x48/0x150
 [<ffffff80002a3ab0>] __module_text_address+0x20/0xb0
 [<ffffff80002a7900>] is_module_text_address+0x18/0x38
 [<ffffff800023be04>] kernel_text_address+0x8c/0xb8
 [<ffffff800023be50>] __kernel_text_address+0x20/0x90
 [<ffffff800020ded8>] walk_stackframe+0x78/0xe0
 [<ffffff800020e5e8>] return_address+0x58/0x90
 [<ffffff80002c3650>] trace_hardirqs_off+0x68/0x1b8
 [<ffffff800036912c>] kfree+0xe4/0x328
 [<ffffff8040087c3c>] crypto_larval_destroy+0x54/0x78 [crypto]
 [<ffffff8040087464>] crypto_larval_kill+0xd4/0xf8 [crypto]
 [<ffffff804008080c>] crypto_wait_for_test+0x9c/0x108 [crypto_algapi]
 [<ffffff8040080e24>] crypto_register_alg+0xdc/0xe8 [crypto_algapi]
 [<ffffff804008d308>] crypto_register_shash+0x48/0x70 [crypto_hash]
 [<ffffff8040096030>] crc32c_mod_init+0x30/0x4c [crc32c_generic]
 [<ffffff80002099d8>] do_one_initcall+0x70/0x2b0
 [<ffffff80002a7ed4>] do_init_module+0x74/0x270
 [<ffffff80002a6a04>] load_module+0x223c/0x26f0
 [<ffffff80002a70c8>] sys_finit_module+0xb0/0x118
 irq event stamp: 1989
 hardirqs last  enabled at (1989): [<ffffff80007a8854>] _raw_spin_unlock_irqrestore+0x8c/0xa0
 hardirqs last disabled at (1988): [<ffffff80007a861c>] _raw_spin_lock_irqsave+0x34/0x88
 softirqs last  enabled at (1940): [<ffffff80007ab04c>] _etext+0x38c/0x2340
 softirqs last disabled at (1935): [<ffffff800021c754>] irq_exit+0xa4/0xa8
 ---[ end trace f667df957dc78eed ]---
 possible reason: unannotated irqs-off.
```

Problem is due to the fact trace_hardirqs_off used CALLER_ADDR0 and
CALLER_ADDR1. These macros indirectly uses return_addr. Then in
walk_stackframe, address of frame pointer ra is checked to be a kernel
address using __kernel_text_address. In this function, __module_address
take a lock which trigger a hardirqs check that fails. However, since we are
calling it to report that trace_hardirqs are off, this is triggering a check
before even tracking that irqs are off. Remove the call to
__kernel_text_address as it is also not done on arm64.

Ref T10401

Test Plan:
Executed both smp & valid images on FPGA and verified that there are no
signaled lockdep errors. Note that this can't run on debug image since the
overhead to call trace_hardirqs_on in exception does not leave any time for
normal code to execute.

Reviewers: O51 Linux Coolidge, gthouvenin

Reviewed By: O51 Linux Coolidge, gthouvenin

Subscribers: gthouvenin, jcpince, alfred, #linux_coolidge_cc

Maniphest Tasks: T10401

Differential Revision: https://phab.kalray.eu/D2034
snajpa pushed a commit to vpsfreecz/linux that referenced this pull request Mar 2, 2024
[ Upstream commit f99cd56 ]

syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Link: https://lore.kernel.org/r/20231210020200.1539875-1-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
snajpa pushed a commit to vpsfreecz/linux that referenced this pull request Mar 2, 2024
[ Upstream commit f99cd56 ]

syzkaller report:

 kernel BUG at net/core/skbuff.c:3452!
 invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty torvalds#135
 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
 Call Trace:
 icmp_glue_bits (net/ipv4/icmp.c:357)
 __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
 ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
 icmp_push_reply (net/ipv4/icmp.c:370)
 __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
 ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
 __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
 ip_output (net/ipv4/ip_output.c:427)
 __ip_queue_xmit (net/ipv4/ip_output.c:535)
 __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
 __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
 tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
 tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
 tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)

The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  // TCP B: not send challenge ack for ack limit or packet loss
  // TCP A: close
	tcp_close
	   tcp_send_fin
              if (!tskb && tcp_under_memory_pressure(sk))
                  tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
           TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag

  6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...

  // TCP B: send challenge ack to SYN_FIN_ACK

  7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack

  // TCP A:  <SND.UNA=101>

  8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic

	__tcp_retransmit_skb  //skb->len=0
	    tcp_trim_head
		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
		    __pskb_trim_head
			skb->data_len -= len // skb->len=-1, wrap around
	    ... ...
	    ip_fragment
		icmp_glue_bits //BUG_ON

If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.

Fixes: 1da177e ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Link: https://lore.kernel.org/r/20231210020200.1539875-1-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
fallen pushed a commit to kalray/linux_coolidge that referenced this pull request Mar 6, 2024
Summary:
In order to enable lockdep support, we need a compelte TRACE_IRQFLAGS
support. While enabled previously, there no trace_hardirqs support
leading to various errors when enabling PROVE_LOCKING.
This patch adds necessary calls to trace_hardirqs_on/off in assembly to
correctly track irqs state when entering exceptions.
During testing, a problem arose when calling trace_hardirqs_off
indirectly from module:

```
 ------------[ cut here ]------------
 WARNING: CPU: 0 PID: 59 at kernel/locking/lockdep.c:4329 check_flags.part.23+0x23c/0x240
 DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
 Modules linked in: crc32c_generic(+) loop ext4 crc16 mbcache jbd2 crypto_hash crypto_algapi crypto m)
 CPU: 0 PID: 59 Comm: modprobe Tainted: G           O      5.3.0 torvalds#135
 Hardware name: Kalray HAPS prototyping (LS) (DT)
 ...
 Call Trace:
 [<ffffff80007816e8>] dump_stack+0x30/0x50
 [<ffffff8000216c98>] __warn+0x138/0x158
 [<ffffff8000216d10>] warn_slowpath_fmt+0x58/0x78
 [<ffffff800026224c>] check_flags.part.23+0x23c/0x240
 [<ffffff800026649c>] lock_is_held_type+0x1a4/0x1b0
 [<ffffff8000282dd0>] rcu_read_lock_sched_held+0x88/0x90
 [<ffffff80002a2fac>] module_assert_mutex_or_preempt+0x2c/0x98
 [<ffffff80002a3988>] __module_address+0x48/0x150
 [<ffffff80002a3ab0>] __module_text_address+0x20/0xb0
 [<ffffff80002a7900>] is_module_text_address+0x18/0x38
 [<ffffff800023be04>] kernel_text_address+0x8c/0xb8
 [<ffffff800023be50>] __kernel_text_address+0x20/0x90
 [<ffffff800020ded8>] walk_stackframe+0x78/0xe0
 [<ffffff800020e5e8>] return_address+0x58/0x90
 [<ffffff80002c3650>] trace_hardirqs_off+0x68/0x1b8
 [<ffffff800036912c>] kfree+0xe4/0x328
 [<ffffff8040087c3c>] crypto_larval_destroy+0x54/0x78 [crypto]
 [<ffffff8040087464>] crypto_larval_kill+0xd4/0xf8 [crypto]
 [<ffffff804008080c>] crypto_wait_for_test+0x9c/0x108 [crypto_algapi]
 [<ffffff8040080e24>] crypto_register_alg+0xdc/0xe8 [crypto_algapi]
 [<ffffff804008d308>] crypto_register_shash+0x48/0x70 [crypto_hash]
 [<ffffff8040096030>] crc32c_mod_init+0x30/0x4c [crc32c_generic]
 [<ffffff80002099d8>] do_one_initcall+0x70/0x2b0
 [<ffffff80002a7ed4>] do_init_module+0x74/0x270
 [<ffffff80002a6a04>] load_module+0x223c/0x26f0
 [<ffffff80002a70c8>] sys_finit_module+0xb0/0x118
 irq event stamp: 1989
 hardirqs last  enabled at (1989): [<ffffff80007a8854>] _raw_spin_unlock_irqrestore+0x8c/0xa0
 hardirqs last disabled at (1988): [<ffffff80007a861c>] _raw_spin_lock_irqsave+0x34/0x88
 softirqs last  enabled at (1940): [<ffffff80007ab04c>] _etext+0x38c/0x2340
 softirqs last disabled at (1935): [<ffffff800021c754>] irq_exit+0xa4/0xa8
 ---[ end trace f667df957dc78eed ]---
 possible reason: unannotated irqs-off.
```

Problem is due to the fact trace_hardirqs_off used CALLER_ADDR0 and
CALLER_ADDR1. These macros indirectly uses return_addr. Then in
walk_stackframe, address of frame pointer ra is checked to be a kernel
address using __kernel_text_address. In this function, __module_address
take a lock which trigger a hardirqs check that fails. However, since we are
calling it to report that trace_hardirqs are off, this is triggering a check
before even tracking that irqs are off. Remove the call to
__kernel_text_address as it is also not done on arm64.

Ref T10401

Test Plan:
Executed both smp & valid images on FPGA and verified that there are no
signaled lockdep errors. Note that this can't run on debug image since the
overhead to call trace_hardirqs_on in exception does not leave any time for
normal code to execute.

Reviewers: O51 Linux Coolidge, gthouvenin

Reviewed By: O51 Linux Coolidge, gthouvenin

Subscribers: gthouvenin, jcpince, alfred, #linux_coolidge_cc

Maniphest Tasks: T10401

Differential Revision: https://phab.kalray.eu/D2034
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants