Skip to content

patchscan: fix E: vs W: classification and extend to 26.04_linux-nvidia#374

Merged
ianm-nv merged 2 commits into
NVIDIA:github-actionsfrom
nirmoy:patchscan-e-vs-w-and-26.04
Apr 17, 2026
Merged

patchscan: fix E: vs W: classification and extend to 26.04_linux-nvidia#374
ianm-nv merged 2 commits into
NVIDIA:github-actionsfrom
nirmoy:patchscan-e-vs-w-and-26.04

Conversation

@nirmoy
Copy link
Copy Markdown
Collaborator

@nirmoy nirmoy commented Apr 16, 2026

Summary

Two commits:

1. patchscan: distinguish E: (verification errors) from W: (missing fixes)

  • Bug: E: (upstream commit-message mismatch) and W: (missing Fixes: patch) both set fixes_found=true, so the :warning: Missing Fixes Detected comment appeared even when All fixes: was empty.
  • Root cause 1: Two separate if blocks both wrote to GITHUB_OUTPUT for the same variable; the second write always overwrote the first.
  • Root cause 2: The [Uu]pstream commit regex in get_src_commit_strs() matched upstream commit <sha> embedded in prose, not just as a backport trailer. Jacob's original script uses only from commit and correctly ignores such prose references.
  • Fix 1: Single mutually-exclusive if/elif/else chain:
    • W: / Fixes forfixes_found=true:warning: Missing Fixes Detected
    • E: / non-zero exit → fixes_found=error:x: Upstream Verification Error
    • neither → fixes_found=false:white_check_mark: No Missing Fixes
  • Fix 2: Anchor [Uu]pstream commit to line start with re.MULTILINE so it only matches as a standalone trailer.

Triggered by PR #342 where ad33ca9f had upstream commit d18f1b7beadf in description prose, causing a spurious failure with an empty All fixes: section.

2. patchscan: extend to 26.04_linux-nvidia branch

Add 26.04_linux-nvidia to the pull_request_target branches list.

Test plan

  • E:-only PR (prose upstream SHA reference) → :x: Upstream Verification Error, no longer "Missing Fixes Detected"
  • W: PR (actual missing Fixes: patch) → :warning: Missing Fixes Detected
  • Clean PR → :white_check_mark: No Missing Fixes
  • ad33ca9f-style prose reference no longer matched by regex (verified with unit test)

Tested on nirmoy/NV-Kernels fork. Supersedes #368.

🤖 Generated with Claude Code

Previously, both E: (upstream commit-message mismatch) and W: (missing
Fixes: patch) set fixes_found=true, causing the "Missing Fixes Detected"
comment to appear even when no Fixes: patches were missing.  PR NVIDIA#342
triggered exactly this: a SAUCE config commit referencing an upstream SHA
with a different title caused E: output, but All fixes: was empty.

Replace the two separate if-blocks (which could overwrite each other via
GITHUB_OUTPUT) with a single mutually-exclusive chain:
  W: / "Fixes for"  → fixes_found=true  (missing Fixes: patches)
  E: / non-zero rc  → fixes_found=error  (upstream verification failure)
  neither           → fixes_found=false  (all-clear)

Update the "error" PR comment title and body to explain this is typically
a false positive from SAUCE commits that reference upstream SHAs in their
message body with a different title.

Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@nirmoy nirmoy requested review from ianm-nv and nvmochs April 16, 2026 18:06
@nvmochs
Copy link
Copy Markdown
Collaborator

nvmochs commented Apr 16, 2026

We can try this. Jacob's script on looks for "from commit ([a-fA-F0-9]+) [\w-])?", but maybe there are cases where "[u|U]pstream commit" is also needed.

@nirmoy Can you also add 26.04_linux-nvidia-bos to the scanner?

…anches

Add 26.04_linux-nvidia and 26.04_linux-nvidia-bos to the
pull_request_target branches list so patchscan also runs on PRs
targeting those branches.

Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@nirmoy nirmoy force-pushed the patchscan-e-vs-w-and-26.04 branch from 81d2e46 to 7a07732 Compare April 17, 2026 10:48
@nirmoy
Copy link
Copy Markdown
Collaborator Author

nirmoy commented Apr 17, 2026

Added 26.04_linux-nvidia-bos. Branches are now sorted: 24.04_linux-nvidia-6.17-next, 26.04_linux-nvidia-bos, 26.04_linux-nvidia.

@nvmochs
Copy link
Copy Markdown
Collaborator

nvmochs commented Apr 17, 2026

Thanks Nirmoy, LGTM!

Acked-by: Matthew R. Ochs <mochs@nvidia.com>

Copy link
Copy Markdown
Collaborator

@ianm-nv ianm-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks Nirmoy!

Signed-off-by: Ian May <ianm@nvidia.com>

@ianm-nv ianm-nv merged commit 2e77179 into NVIDIA:github-actions Apr 17, 2026
nvidia-bfigg pushed a commit that referenced this pull request Apr 20, 2026
Unloading the HMM test module produces the following warning:

[ 3782.224783] ------------[ cut here ]------------
[ 3782.226323] Device 'hmm_dmirror0' does not have a release() function, it is broken and must be fixed. See Documentation/core-api/kobject.rst.
[ 3782.230570] WARNING: drivers/base/core.c:2567 at device_release+0x185/0x210, CPU#20: rmmod/1924
[ 3782.233949] Modules linked in: test_hmm(-) nvidia_uvm(O) nvidia(O)
[ 3782.236321] CPU: 20 UID: 0 PID: 1924 Comm: rmmod Tainted: G           O        7.0.0-rc1+ #374 PREEMPT(full)
[ 3782.240226] Tainted: [O]=OOT_MODULE
[ 3782.241639] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
[ 3782.246193] RIP: 0010:device_release+0x185/0x210
[ 3782.247860] Code: 00 00 fc ff df 48 8d 7b 50 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 86 00 00 00 48 8b 73 50 48 85 f6 74 11 48 8d 3d db 25 29 03 <67> 48 0f b9 3a e9 0d ff ff ff 48 b8 00 00 00 00 00 fc ff df 48 89
[ 3782.254211] RSP: 0018:ffff888126577d98 EFLAGS: 00010246
[ 3782.256054] RAX: dffffc0000000000 RBX: ffffffffc2b70310 RCX: ffffffff8fe61ba1
[ 3782.258512] RDX: 1ffffffff856e062 RSI: ffff88811341eea0 RDI: ffffffff91bbacb0
[ 3782.261041] RBP: ffff888111475000 R08: 0000000000000001 R09: fffffbfff856e069
[ 3782.263471] R10: ffffffffc2b7034b R11: 00000000ffffffff R12: 0000000000000000
[ 3782.265983] R13: dffffc0000000000 R14: ffff88811341eea0 R15: 0000000000000000
[ 3782.268443] FS:  00007fd5a3689040(0000) GS:ffff88842c8d0000(0000) knlGS:0000000000000000
[ 3782.271236] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3782.273251] CR2: 00007fd5a36d2c10 CR3: 00000001242b8000 CR4: 00000000000006f0
[ 3782.275362] Call Trace:
[ 3782.276071]  <TASK>
[ 3782.276678]  kobject_put+0x146/0x270
[ 3782.277731]  hmm_dmirror_exit+0x7a/0x130 [test_hmm]
[ 3782.279135]  __do_sys_delete_module+0x341/0x510
[ 3782.280438]  ? module_flags+0x300/0x300
[ 3782.281547]  do_syscall_64+0x111/0x670
[ 3782.282620]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 3782.284091] RIP: 0033:0x7fd5a3793b37
[ 3782.285303] Code: 73 01 c3 48 8b 0d c9 82 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 99 82 0c 00 f7 d8 64 89 01 48
[ 3782.290708] RSP: 002b:00007ffd68b7dc68 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 3782.292817] RAX: ffffffffffffffda RBX: 000055e3c0d1c770 RCX: 00007fd5a3793b37
[ 3782.294735] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000055e3c0d1c7d8
[ 3782.296661] RBP: 0000000000000000 R08: 1999999999999999 R09: 0000000000000000
[ 3782.298622] R10: 00007fd5a3806ac0 R11: 0000000000000206 R12: 00007ffd68b7deb0
[ 3782.300576] R13: 00007ffd68b7e781 R14: 000055e3c0d1b2a0 R15: 00007ffd68b7deb8
[ 3782.301963]  </TASK>
[ 3782.302371] irq event stamp: 5019
[ 3782.302987] hardirqs last  enabled at (5027): [<ffffffff8cf1f062>] __up_console_sem+0x52/0x60
[ 3782.304507] hardirqs last disabled at (5036): [<ffffffff8cf1f047>] __up_console_sem+0x37/0x60
[ 3782.306086] softirqs last  enabled at (4940): [<ffffffff8cd9a4b0>] __irq_exit_rcu+0xc0/0xf0
[ 3782.307567] softirqs last disabled at (4929): [<ffffffff8cd9a4b0>] __irq_exit_rcu+0xc0/0xf0
[ 3782.309105] ---[ end trace 0000000000000000 ]---

This is because the test module doesn't have a device.release method.  In
this case one probably isn't needed for correctness - the device structs
are in a static array so don't need freeing when the final reference goes
away.

However some device state is freed on exit, so to ensure this happens at
the right time and to silence the warning move the deinitialisation to a
release method and assign that as the device release callback.  Whilst
here also fix a minor error handling bug where cdev_device_del() wasn't
being called if allocation failed.

Link: https://lore.kernel.org/20260331063445.3551404-4-apopple@nvidia.com
Fixes: 6a760f5 ("mm/hmm/test: use char dev with struct device to get device node")
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Acked-by: Balbir Singh <balbirs@nvidia.com>
Tested-by: Zenghui Yu (Huawei) <zenghui.yu@linux.dev>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger,kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants