Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for AR5BBU22 [0489:e03c] #17

Closed
wants to merge 1 commit into from
Closed

Add support for AR5BBU22 [0489:e03c] #17

wants to merge 1 commit into from

Conversation

reejk
Copy link

@reejk reejk commented May 11, 2012

No description provided.

@reejk reejk closed this May 11, 2012
@torvalds
Copy link
Owner

I don't do github pull requests.

github throws away all the relevant information, like having even a
valid email address for the person asking me to pull. The diffstat is
also deficient and useless.

Git comes with a nice pull-request generation module, but github
instead decided to replace it with their own totally inferior version.
As a result, I consider github useless for these kinds of things. It's
fine for hosting, but the pull requests and the online commit
editing, are just pure garbage.

I've told github people about my concerns, they didn't think they
mattered, so I gave up. Feel free to make a bugreport to github.

                Linus

On Fri, May 11, 2012 at 4:27 AM, Roman
reply@reply.github.com
wrote:

You can merge this Pull Request by running:

 git pull https://github.com/WNeZRoS/linux master

Or you can view, comment on it, or merge it online at:

 #17

-- Commit Summary --

  • Add support for AR5BBU22 [0489:e03c]

-- File Changes --

M drivers/bluetooth/btusb.c (3)

-- Patch Links --

 https://github.com/torvalds/linux/pull/17.patch
 https://github.com/torvalds/linux/pull/17.diff


Reply to this email directly or view it on GitHub:
#17

@orblivion
Copy link

How do you feel about merging in things that may include commits downstream that have been pull requested with github? Seems hard to stop that.

@jaseemabid
Copy link

Somebody please look at the diff. Thats a simple 3 line code addition. I agree to you @torvalds but you could have excused this time :)

@jaseemabid
Copy link

By the way, its quite funny that github is sending instructions to @torvalds on using git.

@torvalds
Copy link
Owner

On Fri, May 11, 2012 at 1:03 PM, orblivion
reply@reply.github.com
wrote:

How do you feel about merging in things that may include commits downstream that have been pull requested with github? Seems hard to stop that.

Read my email.

I have no problem with people using github as a hosting site.

But in order for me to pull from github, you need to

(a) make a real pull request, not the braindamaged crap that github
does when you ask it to request a pull: real explanation, proper email
addresses, proper shortlog, and proper diffstat.

(b) since github identities are random, I expect the pull request to
be a signed tag, so that I can verify the identity of the person in
question.

I also refuse to pull commits that have been made with the github web
interface. Again, the reason for that is that the way the github web
interface work, those commits are invariably pure crap. Commits done
on github invariably have totally unreadable descriptions, because the
github commit making thing doesn't do any of the simplest things
that the kernel people expect from a commit message:

  • no "short one-line description in the first line"
  • no sane word-wrap of the long description you type: github commit
    messages tend to be (if they have any description at all) one long
    unreadable line.
  • no sign-offs etc that we require for kernel submissions.

github could make it easy to write good commit messages and enforce
the proper "oneliner for shortlogs and gitk, full explanation for full
logs". But github doesn't. Instead, the github "commit on the web"
interface is one single horrible text-entry field with absolutely no
sane way to write a good-looking message.

Maybe some of this has changed, I haven't checked lately. But in
general, the quality of stuff I have seen from people who use the
github web interfaces has been so low that it's not worth my time.

I'm writing these explanations in the (probably vain) hope that people
who use github will actually take them to heart, and github will
eventually improve. But right now github is a total ghetto of crap
commit messages and unreadable and unusable pull requests.

And the fact that other projects apparently have so low expectations
of commit messages that these things get used is just sad. People
should try to compare the quality of the kernel git logs with some
other projects, and cry themselves to sleep.

               Linus

@torvalds
Copy link
Owner

Btw, Joseph, you're a quality example of why I detest the github
interface. For some reason, github has attracted people who have zero
taste, don't care about commit logs, and can't be bothered.

The fact that I have higher standards then makes people like you make
snarky comments, thinking that you are cool.

You're a moron.

               Linus

@skalnik
Copy link

skalnik commented May 11, 2012

@torvalds The GitHub commit UI provides a text area for commit messages. This supports new lines and makes it easy to do nicely formatted commit messages :)

@jedahan
Copy link

jedahan commented May 11, 2012

@skalnik would be nice if it had an 80-character line to help format things nicely.

@anaisbetts
Copy link

Every time another Pull Request fiasco happens on one of Linus's repos it makes me sad, especially because I want someone whose work I greatly respect, to have a good experience on GitHub - instead he gets dozens of troll comments.

An OS kernel very rightfully demands a very disciplined approach to development that is in many ways not compatible with the goals of GitHub, which is to get as many people of all skill levels involved in Free / Open Source Software. We can certainly make improvements though, and I appreciate that Linus has taken some time to detail exactly why he doesn't use PRs, even if it's a bit harsh.

@tubbo
Copy link

tubbo commented May 11, 2012

 - no sane word-wrap of the long description you type: github commit
messages tend to be (if they have any description at all) one long
unreadable line.

I think this is only because people who are new to Git are using GitHub and not understanding about Git-style committing. Remember, a lot of these newbies are just out of the gate from using SVN for years. I bet a lot of them don't even realize that git commit with the "-m" omitted just opens up COMMIT_EDITMSG in your editor. It isn't even very apparent (to newbies) of the 50-char title rule and 72-char every other line rule with commit messages.

github *could* make it easy to write good commit messages and enforce
the proper "oneliner for shortlogs and gitk, full explanation for full
logs". But github doesn't. Instead, the github "commit on the web"
interface is one single horrible text-entry field with absolutely no
sane way to write a good-looking message.

I have to agree with you there. Commit message viewing on Github sucks and I hope they change it soon.

@torvalds
Copy link
Owner

On Fri, May 11, 2012 at 1:29 PM, Mike Skalnik
reply@reply.github.com
wrote:

@torvalds The GitHub commit UI provides a text area for commit messages. This supports new lines and makes it easy to do nicely formatted commit messages :)

No it doesn't.

What it supports is writing long lines that you have not a f*cking
clue how long they are. The text area does not do line breaks for you,
and you have no way to judge where the line breaks would go.

In other words, it makes it very hard indeed to do "nicely formatted
commit messages". It also doesn't enforce the trivial "oneliner for
shortlog" model, so the commit messages often end up looking like
total crap in shortlogs and in gitk.

So the github commit UI should have

  • separate "shortlog" one-liner text window, so that people cannot
    screw that up.
  • some way to actually do sane word-wrap at the standard 72-column mark.
  • reminders about sign-offs etc that some projects need for
    project-specific or even legal reasons.

It didn't do any of those last time I checked.

              Linus

@jedahan
Copy link

jedahan commented May 11, 2012

I always thought of the title of a pull request as the one-liner ...

@jrep
Copy link

jrep commented May 11, 2012

Newbie question I know, but can someone point me to this "nice pull-request generation module" Linus mentions? My google fu, documentation fu, and command-line-help fu all failed.

@torvalds
Copy link
Owner

On Fri, May 11, 2012 at 1:40 PM, Tom Scott
reply@reply.github.com
wrote:

  • no sane word-wrap of the long description you type: github commit
       messages tend to be (if they have any description at all) one long
       unreadable line.

I think this is only because people who are new to Git are using GitHub and not understanding about Git-style committing.

The thing is, even if you do understand about git-style committing,
it's actually really hard to do that with the github web interface.

The best way to do it is literally to open up another text editor
for the commit message, and then cut-and-paste the end result into the
web interface text tool.

Yes, commit messages should have proper word-wrap, with empty lines in
between paragraphs, and at the same time sometimes you need a long
line without word-wrap (compiler error messages or other "non-prose"
explanation).

And yes, that would almost require some kind of "markup" format with
quoting markers etc. And yes, it would be a more complex model of
writing commit messages. But if the default is "word-wrap at 72
characters, put empty lines in between paragraphs", then people who
don't know about the markup would still on average get better results
(even if the word-wrap would then occasionally be the wrong thing to
do)

Right now, github simply seems to default to "broken horrible
messages", and make it really really hard to do a good job.

And I think it should default to "nice readable messages" with some
effort needed for special things.

            Linus

@technoweenie
Copy link

@jrep: I believe he's referring to git-request-pull.

@nugend
Copy link

nugend commented May 11, 2012

I'm not sure I understand why the commit message itself should be hard word-wrapped. Naively, it seems like that should be a display property of the editor used to write the commit message or the tool used to display the commit message.

@torvalds
Copy link
Owner

On Fri, May 11, 2012 at 1:48 PM, Dominik Dabrowski
reply@reply.github.com
wrote:

You might have fun raging on the internet, but I think your goals would be better served if you expressed your thoughts in a clear (maybe even polite) manner that doesn't embarrass the people whose actions you're trying to influence.

Umm. I think I've been able to reach my goals on the internet better
than most people.

The fact that I'm very clear about my opinions is probably part of it.

If people get offended by accurate portrayals of the current state of
github pull requests, that's their problem.

I hate that whole "victim philosophy". The truth shouldn't be sugarcoated.

                    Linus

@scomma
Copy link

scomma commented May 11, 2012

While I do have great respect for you @torvalds and your work, and it's totally valid for the repository of Linux to have rather rigorous standards, have you considered the possibility there could be a lot of GitHub users who don't really need nor care about any of those "features" you try to portray as objectively superior?

@torvalds
Copy link
Owner

On Fri, May 11, 2012 at 1:49 PM, Daniel Nugent
reply@reply.github.com
wrote:

I'm not sure I understand why the commit message itself should be hard word-wrapped. Naively, it seems like that should be a display property of the editor used to write the commit message or the tool used to display the commit message.

No it shouldn't.

Word-wrapping is a property of the text. And the tool you use to
visualize things cannot know. End result: you do word-wrapping at the
only stage where you can do it, namely when writing it. Not when
showing it.

Some things should not be word-wrapped. They may be some kind of
quoted text - long compiler error messages, oops reports, whatever.
Things that have a certain specific format.

The tool displaying the thing can't know. The person writing the
commit message can. End result: you'd better do word-wrapping at
commit time, because that's the only time you know the difference.

Sure, the alternative would be to have commit messages be some
non-pure-textual format (html or similar). But no, that's not how git
does things. Sure, technically it could, but realistically the rule is
simple: we use 72-character columns for word-wrapping, except for
quoted material that has a specific line format.

(And the rule is not 80 characters, because you do want to allow the
standard indentation from git log, and you do want to leave some room
for quoting).

Anyway, you are obviously free to do your commit messages any way you
want. However, these are the rules we try to follow in the kernel, and
in git itself.

And quite frankly, anybody who thinks they have better rules had
better prove their point by showing a project with better commit
messages. Quite frankly, I've seen a lot of open-source projects, and
I have yet to see any project that does a better job of doing good
commit messages than the kernel or git. And I've seen a lot of
projects that do much worse.

So I would suggest taking the cue for good log messages from projects
that have proven that they really can do good log messages. Linux and
git are both good examples of that.

             Linus

@tylermenezes
Copy link

If you add .patch onto this URL you'll get a git-am style patch.

(Github is very silly for not exposing this in the interface, and for not even really mentioning this feature.)

I agree with you on the messages, I wish the text areas were at least monospaced.

@torvalds
Copy link
Owner

On Fri, May 11, 2012 at 2:01 PM, Prathan Thananart
reply@reply.github.com
wrote:

While I do have great respect for you @torvalds and your work, and it's totally valid for the repository of Linux to have rather rigorous standards, have you considered the possibility there could be a lot of GitHub users who don't really need nor care about any of those "features" you try to portray as objectively superior?

Sure.

And when those people with lower standards try to get their commits
included in the kernel, I will ridicule them and point out how broken
their commit messages or pull requests are.

Agreed?

Btw, the commit message rules we use in the kernel really are
objectively better. The fact that some other projects don't care that
much is fine. But just compare kernel message logs to other projects,
and I think you'll find that no, it's not just "my opinion". We do
have standards, and the standards are there to make for better logs.

               Linus

@torvalds
Copy link
Owner

On Fri, May 11, 2012 at 2:03 PM, Mahmut Bulut
reply@reply.github.com
wrote:

So, if you can't "impolite" dear @torvalds, we can say 'why the "linux kernel" is here'?

.. because I think github does some things very well.

So sure, you may think I hate github. I don't. I hate very specific
parts of github that I think are done badly.

But other parts are done really really well.

I think github does a stellar job at the actual hosting part. I
really do. There is no question in my mind that github is one of the
absolute best places to host a project. It's fast, it's efficient, it
works, and it's available to anybody.

That's wonderful. I think github is absolutely lovely in many respects.

And that then makes me really annoyed at the places where I think
github does a subpar job: pull requests and committing changes using
the web interface.

            Linus

@mmorris-gc
Copy link

Word-wrapping is a property of the text. And the tool you use to
visualize things cannot know. End result: you do word-wrapping at the
only stage where you can do it, namely when writing it. Not when
showing it.

Just curious - why is it that the tool used to visualize things cannot know how to wrap text it displays? And if it is the case, isn't that a problem with the viewer itself, rather than a reason to hard wrap?

@valpackett
Copy link

Commit messages must be limited to 140 characters, like tweets. Right in git's core.

(See what I did there? What's “pure garbage” for you is just perfect for a lot of people.)

@vertexclique
Copy link

@torvalds Thank you for your rational and good opinion. I appreciate you.

@brettalton
Copy link

Do you guys not understand that this is Linus' blessed repository and he can accept and reject whomever and whichever request he likes? He has specific and pertinent rules when it comes to merging that he's learned over 20 years of maintaining the Linux kernel. He developed git - in case you forgot, he was the initial developer - with features specifically for gpg signoffs, shortlogs, etc. - things he and other intelligent computer scientists find useful for maintaining repositories.

I've maintained small projects with three developers plus myself and as soon as you become loose with your merging criteria, the entire repository goes to hell. If he wants gpg signoffs, then he'll get gpg signoffs. Try maintaining 20 millions lines of code and merges requests from 2,000 developers, and then you can give Linus advise.

@dustalov
Copy link

I think @torvalds is a pretty cool guy. eh scolds githubs and doesnt afraid of anything.

@MostAwesomeDude
Copy link
Contributor

While I do have great respect for you @torvalds and your work, and it's totally valid for the repository of Linux to have rather rigorous standards, have you considered the possibility there could be a lot of GitHub users who don't really need nor care about any of those "features" you try to portray as objectively superior?

"GitHub is the best place to share code with friends, co-workers,
classmates, and complete strangers." As long as GH actually, genuinely
cares about making this statement true, they should be providing these
features.

Roman, in the future, you should follow the kernel's guide for
submitting patches. I believe that drivers/bluetooth is covered by the
list at linux-bluetooth@vger.kernel.org and you can submit your patch
to them, with a proper Signed-off-by tag.

FWIW, Reviewed-by: Corbin Simpson MostAwesomeDude@gmail.com, but
there's no way to confirm that since GH is going to hide my email
address and I can't easily sign this message.

(As an example of broken UI, while writing this message, I split my
screen between Firefox and vim, vertically. Linus' messages, being
wrapped, were perfectly readable, but because Github has a massive
minimum width, I had to scroll back and forth in order to read everybody
else's messages.)

intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Oct 27, 2023
Add test to validate BPF verifier's register range bounds tracking logic.

The main bulk is a lot of auto-generated tests based on a small set of
seed values for lower and upper 32 bits of full 64-bit values.
Currently we validate only range vs const comparisons, but the idea is
to start validating range over range comparisons in subsequent patch set.

When setting up initial register ranges we treat registers as one of
u64/s64/u32/s32 numeric types, and then independently perform conditional
comparisons based on a potentially different u64/s64/u32/s32 types. This
tests lots of tricky cases of deriving bounds information across
different numeric domains.

Given there are lots of auto-generated cases, we guard them behind
SLOW_TESTS=1 envvar requirement, and skip them altogether otherwise.
With current full set of upper/lower seed value, all supported
comparison operators and all the combinations of u64/s64/u32/s32 number
domains, we get about 7.7 million tests, which run in about 35 minutes
on my local qemu instance without parallelization. But we also split
those tests by init/cond numeric types, which allows to rely on
test_progs's parallelization of tests with `-j` option, getting run time
down to about 5 minutes on 8 cores. It's still something that shouldn't
be run during normal test_progs run.  But we can run it a reasonable
time, and so perhaps a nightly CI test run (once we have it) would be
a good option for this.

We also add a small set of tricky conditions that came up during
development and triggered various bugs or corner cases in either
selftest's reimplementation of range bounds logic or in verifier's logic
itself. These are fast enough to be run as part of normal test_progs
test run and are great for a quick sanity checking.

Let's take a look at test output to understand what's going on:

  $ sudo ./test_progs -t reg_bounds_crafted
  torvalds#191/1   reg_bounds_crafted/(u64)[0; 0xffffffff] (u64)< 0:OK
  ...
  torvalds#191/115 reg_bounds_crafted/(u64)[0; 0x17fffffff] (s32)< 0:OK
  ...
  torvalds#191/137 reg_bounds_crafted/(u64)[0xffffffff; 0x100000000] (u64)== 0:OK

Each test case is uniquely and fully described by this generated string.
E.g.: "(u64)[0; 0x17fffffff] (s32)< 0". This means that we
initialize a register (R6) in such a way that verifier knows that it can
have a value in [(u64)0; (u64)0x17fffffff] range. Another
register (R7) is also set up as u64, but this time a constant (zero in
this case). They then are compared using 32-bit signed < operation.
Resulting TRUE/FALSE branches are evaluated (including cases where it's
known that one of the branches will never be taken, in which case we
validate that verifier also determines this as a dead code). Test
validates that verifier's final register state matches expected state
based on selftest's own reg_state logic, implemented from scratch for
cross-checking purposes.

These test names can be conveniently used for further debugging, and if -vv
verboseness is requested we can get a corresponding verifier log (with
mark_precise logs filtered out as irrelevant and distracting). Example below is
slightly redacted for brevity, omitting irrelevant register output in
some places, marked with [...].

  $ sudo ./test_progs -a 'reg_bounds_crafted/(u32)[0; U32_MAX] (s32)< -1' -vv
  ...
  VERIFIER LOG:
  ========================
  func#0 @0
  0: R1=ctx(off=0,imm=0) R10=fp0
  0: (05) goto pc+2
  3: (85) call bpf_get_current_pid_tgid#14      ; R0_w=scalar()
  4: (bc) w6 = w0                       ; R0_w=scalar() R6_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
  5: (85) call bpf_get_current_pid_tgid#14      ; R0_w=scalar()
  6: (bc) w7 = w0                       ; R0_w=scalar() R7_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
  7: (b4) w1 = 0                        ; R1_w=0
  8: (b4) w2 = -1                       ; R2=4294967295
  9: (ae) if w6 < w1 goto pc-9
  9: R1=0 R6=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
  10: (2e) if w6 > w2 goto pc-10
  10: R2=4294967295 R6=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
  11: (b4) w1 = -1                      ; R1_w=4294967295
  12: (b4) w2 = -1                      ; R2_w=4294967295
  13: (ae) if w7 < w1 goto pc-13        ; R1_w=4294967295 R7=4294967295
  14: (2e) if w7 > w2 goto pc-14
  14: R2_w=4294967295 R7=4294967295
  15: (bc) w0 = w6                      ; [...] R6=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
  16: (bc) w0 = w7                      ; [...] R7=4294967295
  17: (ce) if w6 s< w7 goto pc+3        ; R6=scalar(id=1,smin=0,smax=umax=4294967295,smin32=-1,var_off=(0x0; 0xffffffff)) R7=4294967295
  18: (bc) w0 = w6                      ; [...] R6=scalar(id=1,smin=0,smax=umax=4294967295,smin32=-1,var_off=(0x0; 0xffffffff))
  19: (bc) w0 = w7                      ; [...] R7=4294967295
  20: (95) exit

  from 17 to 21: [...]
  21: (bc) w0 = w6                      ; [...] R6=scalar(id=1,smin=umin=umin32=2147483648,smax=umax=umax32=4294967294,smax32=-2,var_off=(0x80000000; 0x7fffffff))
  22: (bc) w0 = w7                      ; [...] R7=4294967295
  23: (95) exit

  from 13 to 1: [...]
  1: [...]
  1: (b7) r0 = 0                        ; R0_w=0
  2: (95) exit
  processed 24 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1
  =====================

Verifier log above is for `(u32)[0; U32_MAX] (s32)< -1` use cases, where u32
range is used for initialization, followed by signed < operator. Note
how we use w6/w7 in this case for register initialization (it would be
R6/R7 for 64-bit types) and then `if w6 s< w7` for comparison at
instruction torvalds#17. It will be `if R6 < R7` for 64-bit unsigned comparison.
Above example gives a good impression of the overall structure of a BPF
programs generated for reg_bounds tests.

In the future, this "framework" can be extended to test not just
conditional jumps, but also arithmetic operations. Adding randomized
testing is another possibility.

Some implementation notes. We basically have our own generics-like
operations on numbers, where all the numbers are stored in u64, but how
they are interpreted is passed as runtime argument enum num_t. Further,
`struct range` represents a bounds range, and those are collected
together into a minimal `struct reg_state`, which collects range bounds
across all four numberical domains: u64, s64, u32, s64.

Based on these primitives and `enum op` representing possible
conditional operation (<, <=, >, >=, ==, !=), there is a set of generic
helpers to perform "range arithmetics", which is used to maintain struct
reg_state. We simulate what verifier will do for reg bounds of R6 and R7
registers using these range and reg_state primitives. Simulated
information is used to determine branch taken conclusion and expected
exact register state across all four number domains.

Implementation of "range arithmetics" is more generic than what verifier
is currently performing: it allows range over range comparisons and
adjustments. This is the intended end goal of this patch set overall and verifier
logic is enhanced in subsequent patches in this series to handle range
vs range operations, at which point selftests are extended to validate
these conditions as well. For now it's range vs const cases only.

Note that tests are split into multiple groups by their numeric types
for initialization of ranges and for comparison operation. This allows
to use test_progs's -j parallelization to speed up tests, as we now have
16 groups of parallel running tests. Overall reduction of running time
that allows is pretty good, we go down from more than 30 minutes to
slightly less than 5 minutes running time.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
NishadSaraf pushed a commit to NishadSaraf/linux that referenced this pull request Oct 30, 2023
Signed-off-by: david <davidzha@amd.com>

Signed-off-by: david <davidzha@amd.com>
dangowrt pushed a commit to dangowrt/linux that referenced this pull request Nov 16, 2023
Add test to validate BPF verifier's register range bounds tracking logic.

The main bulk is a lot of auto-generated tests based on a small set of
seed values for lower and upper 32 bits of full 64-bit values.
Currently we validate only range vs const comparisons, but the idea is
to start validating range over range comparisons in subsequent patch set.

When setting up initial register ranges we treat registers as one of
u64/s64/u32/s32 numeric types, and then independently perform conditional
comparisons based on a potentially different u64/s64/u32/s32 types. This
tests lots of tricky cases of deriving bounds information across
different numeric domains.

Given there are lots of auto-generated cases, we guard them behind
SLOW_TESTS=1 envvar requirement, and skip them altogether otherwise.
With current full set of upper/lower seed value, all supported
comparison operators and all the combinations of u64/s64/u32/s32 number
domains, we get about 7.7 million tests, which run in about 35 minutes
on my local qemu instance without parallelization. But we also split
those tests by init/cond numeric types, which allows to rely on
test_progs's parallelization of tests with `-j` option, getting run time
down to about 5 minutes on 8 cores. It's still something that shouldn't
be run during normal test_progs run.  But we can run it a reasonable
time, and so perhaps a nightly CI test run (once we have it) would be
a good option for this.

We also add a small set of tricky conditions that came up during
development and triggered various bugs or corner cases in either
selftest's reimplementation of range bounds logic or in verifier's logic
itself. These are fast enough to be run as part of normal test_progs
test run and are great for a quick sanity checking.

Let's take a look at test output to understand what's going on:

  $ sudo ./test_progs -t reg_bounds_crafted
  torvalds#191/1   reg_bounds_crafted/(u64)[0; 0xffffffff] (u64)< 0:OK
  ...
  torvalds#191/115 reg_bounds_crafted/(u64)[0; 0x17fffffff] (s32)< 0:OK
  ...
  torvalds#191/137 reg_bounds_crafted/(u64)[0xffffffff; 0x100000000] (u64)== 0:OK

Each test case is uniquely and fully described by this generated string.
E.g.: "(u64)[0; 0x17fffffff] (s32)< 0". This means that we
initialize a register (R6) in such a way that verifier knows that it can
have a value in [(u64)0; (u64)0x17fffffff] range. Another
register (R7) is also set up as u64, but this time a constant (zero in
this case). They then are compared using 32-bit signed < operation.
Resulting TRUE/FALSE branches are evaluated (including cases where it's
known that one of the branches will never be taken, in which case we
validate that verifier also determines this as a dead code). Test
validates that verifier's final register state matches expected state
based on selftest's own reg_state logic, implemented from scratch for
cross-checking purposes.

These test names can be conveniently used for further debugging, and if -vv
verboseness is requested we can get a corresponding verifier log (with
mark_precise logs filtered out as irrelevant and distracting). Example below is
slightly redacted for brevity, omitting irrelevant register output in
some places, marked with [...].

  $ sudo ./test_progs -a 'reg_bounds_crafted/(u32)[0; U32_MAX] (s32)< -1' -vv
  ...
  VERIFIER LOG:
  ========================
  func#0 @0
  0: R1=ctx(off=0,imm=0) R10=fp0
  0: (05) goto pc+2
  3: (85) call bpf_get_current_pid_tgid#14      ; R0_w=scalar()
  4: (bc) w6 = w0                       ; R0_w=scalar() R6_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
  5: (85) call bpf_get_current_pid_tgid#14      ; R0_w=scalar()
  6: (bc) w7 = w0                       ; R0_w=scalar() R7_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
  7: (b4) w1 = 0                        ; R1_w=0
  8: (b4) w2 = -1                       ; R2=4294967295
  9: (ae) if w6 < w1 goto pc-9
  9: R1=0 R6=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
  10: (2e) if w6 > w2 goto pc-10
  10: R2=4294967295 R6=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
  11: (b4) w1 = -1                      ; R1_w=4294967295
  12: (b4) w2 = -1                      ; R2_w=4294967295
  13: (ae) if w7 < w1 goto pc-13        ; R1_w=4294967295 R7=4294967295
  14: (2e) if w7 > w2 goto pc-14
  14: R2_w=4294967295 R7=4294967295
  15: (bc) w0 = w6                      ; [...] R6=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
  16: (bc) w0 = w7                      ; [...] R7=4294967295
  17: (ce) if w6 s< w7 goto pc+3        ; R6=scalar(id=1,smin=0,smax=umax=4294967295,smin32=-1,var_off=(0x0; 0xffffffff)) R7=4294967295
  18: (bc) w0 = w6                      ; [...] R6=scalar(id=1,smin=0,smax=umax=4294967295,smin32=-1,var_off=(0x0; 0xffffffff))
  19: (bc) w0 = w7                      ; [...] R7=4294967295
  20: (95) exit

  from 17 to 21: [...]
  21: (bc) w0 = w6                      ; [...] R6=scalar(id=1,smin=umin=umin32=2147483648,smax=umax=umax32=4294967294,smax32=-2,var_off=(0x80000000; 0x7fffffff))
  22: (bc) w0 = w7                      ; [...] R7=4294967295
  23: (95) exit

  from 13 to 1: [...]
  1: [...]
  1: (b7) r0 = 0                        ; R0_w=0
  2: (95) exit
  processed 24 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1
  =====================

Verifier log above is for `(u32)[0; U32_MAX] (s32)< -1` use cases, where u32
range is used for initialization, followed by signed < operator. Note
how we use w6/w7 in this case for register initialization (it would be
R6/R7 for 64-bit types) and then `if w6 s< w7` for comparison at
instruction torvalds#17. It will be `if R6 < R7` for 64-bit unsigned comparison.
Above example gives a good impression of the overall structure of a BPF
programs generated for reg_bounds tests.

In the future, this "framework" can be extended to test not just
conditional jumps, but also arithmetic operations. Adding randomized
testing is another possibility.

Some implementation notes. We basically have our own generics-like
operations on numbers, where all the numbers are stored in u64, but how
they are interpreted is passed as runtime argument enum num_t. Further,
`struct range` represents a bounds range, and those are collected
together into a minimal `struct reg_state`, which collects range bounds
across all four numberical domains: u64, s64, u32, s64.

Based on these primitives and `enum op` representing possible
conditional operation (<, <=, >, >=, ==, !=), there is a set of generic
helpers to perform "range arithmetics", which is used to maintain struct
reg_state. We simulate what verifier will do for reg bounds of R6 and R7
registers using these range and reg_state primitives. Simulated
information is used to determine branch taken conclusion and expected
exact register state across all four number domains.

Implementation of "range arithmetics" is more generic than what verifier
is currently performing: it allows range over range comparisons and
adjustments. This is the intended end goal of this patch set overall and verifier
logic is enhanced in subsequent patches in this series to handle range
vs range operations, at which point selftests are extended to validate
these conditions as well. For now it's range vs const cases only.

Note that tests are split into multiple groups by their numeric types
for initialization of ranges and for comparison operation. This allows
to use test_progs's -j parallelization to speed up tests, as we now have
16 groups of parallel running tests. Overall reduction of running time
that allows is pretty good, we go down from more than 30 minutes to
slightly less than 5 minutes running time.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Link: https://lore.kernel.org/r/20231112010609.848406-8-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
kuba-moo added a commit to linux-netdev/testing that referenced this pull request Nov 30, 2023
Petr Machata says:

====================
mlxsw: Support CFF flood mode

The registers to configure to initialize a flood table differ between the
controlled and CFF flood modes. In therefore needs to be an op. Add it,
hook up the current init to the existing families, and invoke the op.

PGT is an in-HW table that maps addresses to sets of ports. Then when some
HW process needs a set of ports as an argument, instead of embedding the
actual set in the dynamic configuration, what gets configured is the
address referencing the set. The HW then works with the appropriate PGT
entry.

Among other allocations, the PGT currently contains two large blocks for
bridge flooding: one for 802.1q and one for 802.1d. Within each of these
blocks are three tables, for unknown-unicast, multicast and broadcast
flooding:

      . . . |    802.1q    |    802.1d    | . . .
            | UC | MC | BC | UC | MC | BC |
             \______ _____/ \_____ ______/
                    v             v
                   FID flood vectors

Thus each FID (which corresponds to an 802.1d bridge or one VLAN in an
802.1q bridge) uses three flood vectors spread across a fairly large region
of PGT.

This way of organizing the flood table (called "controlled") is not very
flexible. E.g. to decrease a bridge scale and store more IP MC vectors, one
would need to completely rewrite the bridge PGT blocks, or resort to hacks
such as storing individual MC flood vectors into unused part of the bridge
table.

In order to address these shortcomings, Spectrum-2 and above support what
is called CFF flood mode, for Compressed FID Flooding. In CFF flood mode,
each FID has a little table of its own, with three entries adjacent to each
other, one for unknown-UC, one for MC, one for BC. This allows for a much
more fine-grained approach to PGT management, where bits of it are
allocated on demand.

      . . . | FID | FID | FID | FID | FID | . . .
            |U|M|B|U|M|B|U|M|B|U|M|B|U|M|B|
             \_____________ _____________/
                           v
                   FID flood vectors

Besides the FID table organization, the CFF flood mode also impacts Router
Subport (RSP) table. This table contains flood vectors for rFIDs, which are
FIDs that reference front panel ports or LAGs. The RSP table contains two
entries per front panel port and LAG, one for unknown-UC traffic, and one
for everything else. Currently, the FW allocates and manages the table in
its own part of PGT. rFIDs are marked with flood_rsp bit and managed
specially. In CFF mode, rFIDs are managed as all other FIDs. The driver
therefore has to allocate and maintain the flood vectors. Like with bridge
FIDs, this is more work, but increases flexibility of the system.

The FW currently supports both the controlled and CFF flood modes. To shed
complexity, in the future it should only support CFF flood mode. Hence this
patchset, which adds CFF flood mode support to mlxsw.

Since mlxsw needs to maintain both the controlled mode as well as CFF mode
support, we will keep the layout as compatible as possible. The bridge
tables will stay in the same overall shape, just their inner organization
will change from flood mode -> FID to FID -> flood mode. Likewise will RSP
be kept as a contiguous block of PGT memory, as was the case when the FW
maintained it.

- The way FIDs get configured under the CFF flood mode differs from the
  currently used controlled mode. The simple approach of having several
  globally visible arrays for spectrum.c to statically choose from no
  longer works.

  Patch #1 thus privatizes all FID initialization and finalization logic,
  and exposes it as ops instead.

- Patch #2 renames the ops that are specific to the controlled mode, to
  make room in the namespace for the CFF variants.

  Patch #3 extracts a helper to compute flood table base out of
  mlxsw_sp_fid_flood_table_mid().

- The op fid_setup configured fid_offset, i.e. the number of this FID
  within its family. For rFIDs in CFF mode, to determine this number, the
  driver will need to do fallible queries.

  Thus in patch #4, make the FID setup operation fallible as well.

- Flood mode initialization routine differs between the controlled and CFF
  flood modes. The controlled mode needs to configure flood table layout,
  which the CFF mode does not need to do.

  In patch #5, move mlxsw_sp_fid_flood_table_init() up so that the
  following patch can make use of it.

  In patch torvalds#6, add an op to be invoked per table (if defined).

- The current way of determining PGT allocation size depends on the number
  of FIDs and number of flood tables. RFIDs however have PGT footprint
  depending not on number of FIDs, but on number of ports and LAGs, because
  which ports an rFID should flood to does not depend on the FID itself,
  but on the port or LAG that it references.

  Therefore in patch torvalds#7, add FID family ops for determining PGT allocation
  size.

- As elaborated above, layout of PGT will differ between controlled and CFF
  flood modes. In CFF mode, it will further differ between rFIDs and other
  FIDs (as described at previous patch). The way to pack the SFMR register
  to configure a FID will likewise differ from controlled to CFF.

  Thus in patches torvalds#8 and torvalds#9 add FID family ops to determine PGT base
  address for a FID and to pack SFMR.

- Patches torvalds#10 and torvalds#11 add more bits for RSP support. In patch torvalds#10, add a
  new traffic type enumerator, for non-UC traffic. This is a combination of
  BC and MC traffic, but the way that mlxsw maps these mnemonic names to
  actual traffic type configurations requires that we have a new name to
  describe this class of traffic.

  Patch torvalds#11 then adds hooks necessary for RSP table maintenance. As ports
  come and go, and join and leave LAGs, it is necessary to update flood
  vectors that the rFIDs use. These new hooks will make that possible.

- Patches torvalds#12, torvalds#13 and torvalds#14 introduce flood profiles. These have been
  implicit so far, but the way that CFF flood mode works with profile IDs
  requires that we make them explicit.

  Thus in patch torvalds#12, introduce flood profile objects as a set of flood
  tables that FID families then refer to. The FID code currently only
  uses a single flood profile.

  In patch torvalds#13, add a flood profile ID to flood profile objects.

  In patch torvalds#14, when in CFF mode, configure SFFP according to the existing
  flood profiles (or the one that exists as of that point).

- Patches torvalds#15 and torvalds#16 add code to implement, respectively, bridge FIDs and
  RSP FIDs in CFF mode.

- In patch torvalds#17, toggle flood_mode_prefer_cff on Spectrum-2 and above, which
  makes the newly-added code live.
====================

Link: https://lore.kernel.org/r/cover.1701183891.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
jonhunter pushed a commit to jonhunter/linux that referenced this pull request Dec 7, 2023
When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a
cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when
removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be
dereferenced as wrong struct in irdma_free_pending_cqp_request().

  PID: 3669   TASK: ffff88aef892c000  CPU: 28  COMMAND: "kworker/28:0"
   #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34
   #1 [fffffe0000549e40] nmi_handle at ffffffff810788b2
   #2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f
   #3 [fffffe0000549eb8] do_nmi at ffffffff81079582
   #4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4
      [exception RIP: native_queued_spin_lock_slowpath+1291]
      RIP: ffffffff8127e72b  RSP: ffff88aa841ef778  RFLAGS: 00000046
      RAX: 0000000000000000  RBX: ffff88b01f849700  RCX: ffffffff8127e47e
      RDX: 0000000000000000  RSI: 0000000000000004  RDI: ffffffff83857ec0
      RBP: ffff88afe3e4efc8   R8: ffffed15fc7c9dfa   R9: ffffed15fc7c9dfa
      R10: 0000000000000001  R11: ffffed15fc7c9df9  R12: 0000000000740000
      R13: ffff88b01f849708  R14: 0000000000000003  R15: ffffed1603f092e1
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
  -- <NMI exception stack> --
   #5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b
   torvalds#6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4
   torvalds#7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363
   torvalds#8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma]
   torvalds#9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma]
   torvalds#10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma]
   torvalds#11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma]
   torvalds#12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb
   torvalds#13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6
   torvalds#14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278
   torvalds#15 [ffff88aa841efb88] device_del at ffffffff82179d23
   torvalds#16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice]
   torvalds#17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice]
   torvalds#18 [ffff88aa841efde8] process_one_work at ffffffff811c589a
   torvalds#19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff
   torvalds#20 [ffff88aa841eff10] kthread at ffffffff811d87a0
   torvalds#21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f

Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions")
Link: https://lore.kernel.org/r/20231130081415.891006-1-lishifeng@sangfor.com.cn
Suggested-by: "Ismail, Mustafa" <mustafa.ismail@intel.com>
Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
gbhardwaja pushed a commit to gbhardwaja/linux that referenced this pull request Dec 8, 2023
Merge in OBUDPST/udpst from OBUDPST-17-add-json-output-option to develop

* commit '4b218a251f30e520e9b1d9bc359e7dfecee7b995':
  Add example for JSON output in README.md
  udpst_control.c edited online with Bitbucket
  added json for test config
  update variable names
  Revert "Merge branch 'OBUDPST-20-add-docker-support-to-run-server-in-container' into OBUDPST-17-add-json-output-option"
  Added docker server support
  Removed personal comments also format tidyup
  minor lib update for ubuntu
  made json unformated added newline
  fixed maximum output var name
  add loss ratio and truncate to 2 decimal places
  alined json variable name to LC's suggestion
  command line option now f with optarg
  fixed optstring
  add first cut json output
  added basic json output for client
mj22226 pushed a commit to mj22226/linux that referenced this pull request Dec 12, 2023
[ Upstream commit e3e82fc ]

When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a
cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when
removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be
dereferenced as wrong struct in irdma_free_pending_cqp_request().

  PID: 3669   TASK: ffff88aef892c000  CPU: 28  COMMAND: "kworker/28:0"
   #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34
   #1 [fffffe0000549e40] nmi_handle at ffffffff810788b2
   #2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f
   #3 [fffffe0000549eb8] do_nmi at ffffffff81079582
   #4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4
      [exception RIP: native_queued_spin_lock_slowpath+1291]
      RIP: ffffffff8127e72b  RSP: ffff88aa841ef778  RFLAGS: 00000046
      RAX: 0000000000000000  RBX: ffff88b01f849700  RCX: ffffffff8127e47e
      RDX: 0000000000000000  RSI: 0000000000000004  RDI: ffffffff83857ec0
      RBP: ffff88afe3e4efc8   R8: ffffed15fc7c9dfa   R9: ffffed15fc7c9dfa
      R10: 0000000000000001  R11: ffffed15fc7c9df9  R12: 0000000000740000
      R13: ffff88b01f849708  R14: 0000000000000003  R15: ffffed1603f092e1
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
  -- <NMI exception stack> --
   #5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b
   torvalds#6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4
   torvalds#7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363
   torvalds#8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma]
   torvalds#9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma]
   torvalds#10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma]
   torvalds#11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma]
   torvalds#12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb
   torvalds#13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6
   torvalds#14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278
   torvalds#15 [ffff88aa841efb88] device_del at ffffffff82179d23
   torvalds#16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice]
   torvalds#17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice]
   torvalds#18 [ffff88aa841efde8] process_one_work at ffffffff811c589a
   torvalds#19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff
   torvalds#20 [ffff88aa841eff10] kthread at ffffffff811d87a0
   torvalds#21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f

Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions")
Link: https://lore.kernel.org/r/20231130081415.891006-1-lishifeng@sangfor.com.cn
Suggested-by: "Ismail, Mustafa" <mustafa.ismail@intel.com>
Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
BoukeHaarsma23 pushed a commit to ChimeraOS/linux that referenced this pull request Dec 13, 2023
[ Upstream commit e3e82fc ]

When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a
cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when
removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be
dereferenced as wrong struct in irdma_free_pending_cqp_request().

  PID: 3669   TASK: ffff88aef892c000  CPU: 28  COMMAND: "kworker/28:0"
   #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34
   #1 [fffffe0000549e40] nmi_handle at ffffffff810788b2
   #2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f
   #3 [fffffe0000549eb8] do_nmi at ffffffff81079582
   #4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4
      [exception RIP: native_queued_spin_lock_slowpath+1291]
      RIP: ffffffff8127e72b  RSP: ffff88aa841ef778  RFLAGS: 00000046
      RAX: 0000000000000000  RBX: ffff88b01f849700  RCX: ffffffff8127e47e
      RDX: 0000000000000000  RSI: 0000000000000004  RDI: ffffffff83857ec0
      RBP: ffff88afe3e4efc8   R8: ffffed15fc7c9dfa   R9: ffffed15fc7c9dfa
      R10: 0000000000000001  R11: ffffed15fc7c9df9  R12: 0000000000740000
      R13: ffff88b01f849708  R14: 0000000000000003  R15: ffffed1603f092e1
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
  -- <NMI exception stack> --
   #5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b
   torvalds#6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4
   torvalds#7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363
   torvalds#8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma]
   torvalds#9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma]
   torvalds#10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma]
   torvalds#11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma]
   torvalds#12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb
   torvalds#13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6
   torvalds#14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278
   torvalds#15 [ffff88aa841efb88] device_del at ffffffff82179d23
   torvalds#16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice]
   torvalds#17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice]
   torvalds#18 [ffff88aa841efde8] process_one_work at ffffffff811c589a
   torvalds#19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff
   torvalds#20 [ffff88aa841eff10] kthread at ffffffff811d87a0
   torvalds#21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f

Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions")
Link: https://lore.kernel.org/r/20231130081415.891006-1-lishifeng@sangfor.com.cn
Suggested-by: "Ismail, Mustafa" <mustafa.ismail@intel.com>
Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Dec 13, 2023
[ Upstream commit e3e82fc ]

When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a
cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when
removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be
dereferenced as wrong struct in irdma_free_pending_cqp_request().

  PID: 3669   TASK: ffff88aef892c000  CPU: 28  COMMAND: "kworker/28:0"
   #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34
   #1 [fffffe0000549e40] nmi_handle at ffffffff810788b2
   #2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f
   #3 [fffffe0000549eb8] do_nmi at ffffffff81079582
   #4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4
      [exception RIP: native_queued_spin_lock_slowpath+1291]
      RIP: ffffffff8127e72b  RSP: ffff88aa841ef778  RFLAGS: 00000046
      RAX: 0000000000000000  RBX: ffff88b01f849700  RCX: ffffffff8127e47e
      RDX: 0000000000000000  RSI: 0000000000000004  RDI: ffffffff83857ec0
      RBP: ffff88afe3e4efc8   R8: ffffed15fc7c9dfa   R9: ffffed15fc7c9dfa
      R10: 0000000000000001  R11: ffffed15fc7c9df9  R12: 0000000000740000
      R13: ffff88b01f849708  R14: 0000000000000003  R15: ffffed1603f092e1
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
  -- <NMI exception stack> --
   #5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b
   torvalds#6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4
   torvalds#7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363
   torvalds#8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma]
   torvalds#9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma]
   torvalds#10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma]
   torvalds#11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma]
   torvalds#12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb
   torvalds#13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6
   torvalds#14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278
   torvalds#15 [ffff88aa841efb88] device_del at ffffffff82179d23
   torvalds#16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice]
   torvalds#17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice]
   torvalds#18 [ffff88aa841efde8] process_one_work at ffffffff811c589a
   torvalds#19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff
   torvalds#20 [ffff88aa841eff10] kthread at ffffffff811d87a0
   torvalds#21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f

Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions")
Link: https://lore.kernel.org/r/20231130081415.891006-1-lishifeng@sangfor.com.cn
Suggested-by: "Ismail, Mustafa" <mustafa.ismail@intel.com>
Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Dec 13, 2023
[ Upstream commit e3e82fc ]

When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a
cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when
removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be
dereferenced as wrong struct in irdma_free_pending_cqp_request().

  PID: 3669   TASK: ffff88aef892c000  CPU: 28  COMMAND: "kworker/28:0"
   #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34
   #1 [fffffe0000549e40] nmi_handle at ffffffff810788b2
   #2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f
   #3 [fffffe0000549eb8] do_nmi at ffffffff81079582
   #4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4
      [exception RIP: native_queued_spin_lock_slowpath+1291]
      RIP: ffffffff8127e72b  RSP: ffff88aa841ef778  RFLAGS: 00000046
      RAX: 0000000000000000  RBX: ffff88b01f849700  RCX: ffffffff8127e47e
      RDX: 0000000000000000  RSI: 0000000000000004  RDI: ffffffff83857ec0
      RBP: ffff88afe3e4efc8   R8: ffffed15fc7c9dfa   R9: ffffed15fc7c9dfa
      R10: 0000000000000001  R11: ffffed15fc7c9df9  R12: 0000000000740000
      R13: ffff88b01f849708  R14: 0000000000000003  R15: ffffed1603f092e1
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
  -- <NMI exception stack> --
   #5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b
   torvalds#6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4
   torvalds#7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363
   torvalds#8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma]
   torvalds#9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma]
   torvalds#10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma]
   torvalds#11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma]
   torvalds#12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb
   torvalds#13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6
   torvalds#14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278
   torvalds#15 [ffff88aa841efb88] device_del at ffffffff82179d23
   torvalds#16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice]
   torvalds#17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice]
   torvalds#18 [ffff88aa841efde8] process_one_work at ffffffff811c589a
   torvalds#19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff
   torvalds#20 [ffff88aa841eff10] kthread at ffffffff811d87a0
   torvalds#21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f

Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions")
Link: https://lore.kernel.org/r/20231130081415.891006-1-lishifeng@sangfor.com.cn
Suggested-by: "Ismail, Mustafa" <mustafa.ismail@intel.com>
Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Dec 28, 2023
When the ATS Invalidation request timeout happens, the qi_submit_sync()
will restart and loop for the invalidation request forever till it is
done, it will block another Invalidation thread such as the fq_timer
to issue invalidation request, cause the system lockup as following

[exception RIP: native_queued_spin_lock_slowpath+92]

RIP: ffffffffa9d1025c RSP: ffffb202f268cdc8 RFLAGS: 00000002

RAX: 0000000000000101 RBX: ffffffffab36c2a0 RCX: 0000000000000000

RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffab36c2a0

RBP: ffffffffab36c2a0 R8: 0000000000000001 R9: 0000000000000000

R10: 0000000000000010 R11: 0000000000000018 R12: 0000000000000000

R13: 0000000000000004 R14: ffff9e10d71b1c88 R15: ffff9e10d71b1980

ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018

torvalds#12 [ffffb202f268cdc8] native_queued_spin_lock_slowpath at ffffffffa9d1025c

torvalds#13 [ffffb202f268cdc8] do_raw_spin_lock at ffffffffa9d121f1

torvalds#14 [ffffb202f268cdd8] _raw_spin_lock_irqsave at ffffffffaa51795b

torvalds#15 [ffffb202f268cdf8] iommu_flush_dev_iotlb at ffffffffaa20df48

torvalds#16 [ffffb202f268ce28] iommu_flush_iova at ffffffffaa20e182

torvalds#17 [ffffb202f268ce60] iova_domain_flush at ffffffffaa220e27

torvalds#18 [ffffb202f268ce70] fq_flush_timeout at ffffffffaa221c9d

torvalds#19 [ffffb202f268cea8] call_timer_fn at ffffffffa9d46661

torvalds#20 [ffffb202f268cf08] run_timer_softirq at ffffffffa9d47933

torvalds#21 [ffffb202f268cf98] __softirqentry_text_start at ffffffffaa8000e0

torvalds#22 [ffffb202f268cff0] asm_call_sysvec_on_stack at ffffffffaa60114f
fallen pushed a commit to kalray/linux_coolidge that referenced this pull request Jan 8, 2024
Summary:
In order to display registers when a signal is not handled,
SYSCTL_EXCEPTION_TRACE exposes a sysctl parameter which allows to enable
or disable this feature. Since it should be handled where signals are
generated, factorized the existing kill function to incorporate the
check for show_unhandled_signals. If this is set and the signal is not
handled, display registers.

Test Plan:
Ran a test which segfault to check the results is the expected one:
```
  # splice01
  tst_test.c:1096: INFO: Timeout per run is 0h 05m 00s
  [  254.090000] splice01[444]: unhandled signal 11 code 0x1 at 0x0 in
  splice01[10000+2b000]
  [  254.110000] CPU: 0 PID: 444 Comm: splice01 Tainted: G        W
  5.1.0 torvalds#17
  [  254.120000] Hardware name: HAPS prototyping (LS) (DT)
  [  254.130000]
  [  254.130000] mode: user
  [  254.130000]     PC: 0000000000000000    PS: 0000000000130f31
  [  254.130000]     CS: 0000000000000000    RA: 0000000000000000
  [  254.130000]     LS: 0000000000020370    LE: 0000000000020374
  [  254.130000]     LC: 0000000000000001
  [  254.130000]
  [  254.140000]     R0: ffffffffffffffff    R1: 0000000000000001
  [  254.150000]     R2: 0000000000000000    R3: 0000000000000003
  [  254.160000]     R4: 0000000000000003    R5: 000000000004b7c8
  [  254.170000]     R6: 0000000000000039    R7: 0000000000000003
  [  254.180000]     R8: ffffff801f4c0b38    R9: ffffff801f4c0b98
  [  254.190000]     R10: 0000000000000018    R11: 0000000000000028
  [  254.200000]     R12: 0000007ffffffa20    R13: 0000000000057d48
  [  254.210000]     R14: 0000007ffffffab0    R15: 62732f0000000000
  [  254.220000]     R16: 0000000000000000    R17: 000000004667f14d
  [  254.230000]     R18: 000000000004b7c0    R19: 0000000000000003
  [  254.240000]     R20: 0000000000034c28    R21: 000000000000006a
  [  254.250000]     R22: 0000000000034f08    R23: 000000000004bbe8
  [  254.260000]     R24: 0000000000035808    R25: 0000000000000000
  [  254.270000]     R26: 0000000000000000    R27: 00000000000554b8
  [  254.280000]     R28: 0000000000000001    R29: 00000000000357e0
  [  254.290000]     R30: 00000000000354d8    R31: 00000000000355d0
  [  254.300000]     R32: 0000000000000000    R33: 0000000000000000
  [  254.310000]     R34: ffffffffffecf0ce    R35: 0000000000000000
  [  254.320000]     R36: ffffff801e97c380    R37: 0000000000000000
  [  254.330000]     R38: 0000000000000002    R39: 0000000000000000
  [  254.340000]     R40: 0000000000000000    R41: 000000000002116c
  [  254.350000]     R42: 0000000000130f31    R43: 0000000000039163
  [  254.360000]     R44: 0000000000000000    R45: 0000000000010000
  [  254.380000]     R46: 0000000000000000    R47: 0000000000000000
  [  254.390000]     R48: 0000000000000000    R49: 0000000000000000
  [  254.400000]     R50: 0000000000000173    R51: 0000000000000000
  [  254.410000]     R52: 0000007ffffff9c0    R53: 0000000000057d48
  [  254.420000]     R54: 0000007ffffffab0    R55: 62732f0000000000
  [  254.430000]     R56: 0000000000000000    R57: 0000000000000000
  [  254.440000]     R58: 0000000000000000    R59: 0000000000000000
  [  254.450000]     R60: 0000000000000001    R61: 0000000000020374
  [  254.460000]     R62: 0000000000020370    R63: 0000000000015244
  [  254.470000]
  [  254.470000]
  tst_test.c:1141: BROK: Test killed by SIGSEGV!

  # sysctl -w debug.exception-trace=0
  # splice01
  tst_test.c:1096: INFO: Timeout per run is 0h 05m 00s
  tst_test.c:1141: BROK: Test killed by SIGSEGV!
```

Reviewers: O51 Linux Coolidge, gthouvenin

Reviewed By: O51 Linux Coolidge, gthouvenin

Subscribers: gthouvenin, #linux_coolidge_cc

Differential Revision: https://phab.kalray.eu/D1140
fallen pushed a commit to kalray/linux_coolidge that referenced this pull request Jan 15, 2024
Summary:
In order to display registers when a signal is not handled,
SYSCTL_EXCEPTION_TRACE exposes a sysctl parameter which allows to enable
or disable this feature. Since it should be handled where signals are
generated, factorized the existing kill function to incorporate the
check for show_unhandled_signals. If this is set and the signal is not
handled, display registers.

Test Plan:
Ran a test which segfault to check the results is the expected one:
```
  # splice01
  tst_test.c:1096: INFO: Timeout per run is 0h 05m 00s
  [  254.090000] splice01[444]: unhandled signal 11 code 0x1 at 0x0 in
  splice01[10000+2b000]
  [  254.110000] CPU: 0 PID: 444 Comm: splice01 Tainted: G        W
  5.1.0 torvalds#17
  [  254.120000] Hardware name: HAPS prototyping (LS) (DT)
  [  254.130000]
  [  254.130000] mode: user
  [  254.130000]     PC: 0000000000000000    PS: 0000000000130f31
  [  254.130000]     CS: 0000000000000000    RA: 0000000000000000
  [  254.130000]     LS: 0000000000020370    LE: 0000000000020374
  [  254.130000]     LC: 0000000000000001
  [  254.130000]
  [  254.140000]     R0: ffffffffffffffff    R1: 0000000000000001
  [  254.150000]     R2: 0000000000000000    R3: 0000000000000003
  [  254.160000]     R4: 0000000000000003    R5: 000000000004b7c8
  [  254.170000]     R6: 0000000000000039    R7: 0000000000000003
  [  254.180000]     R8: ffffff801f4c0b38    R9: ffffff801f4c0b98
  [  254.190000]     R10: 0000000000000018    R11: 0000000000000028
  [  254.200000]     R12: 0000007ffffffa20    R13: 0000000000057d48
  [  254.210000]     R14: 0000007ffffffab0    R15: 62732f0000000000
  [  254.220000]     R16: 0000000000000000    R17: 000000004667f14d
  [  254.230000]     R18: 000000000004b7c0    R19: 0000000000000003
  [  254.240000]     R20: 0000000000034c28    R21: 000000000000006a
  [  254.250000]     R22: 0000000000034f08    R23: 000000000004bbe8
  [  254.260000]     R24: 0000000000035808    R25: 0000000000000000
  [  254.270000]     R26: 0000000000000000    R27: 00000000000554b8
  [  254.280000]     R28: 0000000000000001    R29: 00000000000357e0
  [  254.290000]     R30: 00000000000354d8    R31: 00000000000355d0
  [  254.300000]     R32: 0000000000000000    R33: 0000000000000000
  [  254.310000]     R34: ffffffffffecf0ce    R35: 0000000000000000
  [  254.320000]     R36: ffffff801e97c380    R37: 0000000000000000
  [  254.330000]     R38: 0000000000000002    R39: 0000000000000000
  [  254.340000]     R40: 0000000000000000    R41: 000000000002116c
  [  254.350000]     R42: 0000000000130f31    R43: 0000000000039163
  [  254.360000]     R44: 0000000000000000    R45: 0000000000010000
  [  254.380000]     R46: 0000000000000000    R47: 0000000000000000
  [  254.390000]     R48: 0000000000000000    R49: 0000000000000000
  [  254.400000]     R50: 0000000000000173    R51: 0000000000000000
  [  254.410000]     R52: 0000007ffffff9c0    R53: 0000000000057d48
  [  254.420000]     R54: 0000007ffffffab0    R55: 62732f0000000000
  [  254.430000]     R56: 0000000000000000    R57: 0000000000000000
  [  254.440000]     R58: 0000000000000000    R59: 0000000000000000
  [  254.450000]     R60: 0000000000000001    R61: 0000000000020374
  [  254.460000]     R62: 0000000000020370    R63: 0000000000015244
  [  254.470000]
  [  254.470000]
  tst_test.c:1141: BROK: Test killed by SIGSEGV!

  # sysctl -w debug.exception-trace=0
  # splice01
  tst_test.c:1096: INFO: Timeout per run is 0h 05m 00s
  tst_test.c:1141: BROK: Test killed by SIGSEGV!
```

Reviewers: O51 Linux Coolidge, gthouvenin

Reviewed By: O51 Linux Coolidge, gthouvenin

Subscribers: gthouvenin, #linux_coolidge_cc

Differential Revision: https://phab.kalray.eu/D1140
logic10492 pushed a commit to logic10492/linux-amd-zen2 that referenced this pull request Jan 18, 2024
gyroninja added a commit to gyroninja/linux that referenced this pull request Jan 28, 2024
KSAN calls into rcu code which then triggers a write that reenters into KSAN
getting the system stuck doing infinite recursion.

#0  kmsan_get_context () at mm/kmsan/kmsan.h:106
#1  __msan_get_context_state () at mm/kmsan/instrumentation.c:331
#2  0xffffffff81495671 in get_current () at ./arch/x86/include/asm/current.h:42
#3  rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#4  __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#5  0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#6  pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#7  kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#8  virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#9  0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#10 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#11 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#12 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#13 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#14 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#15 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#16 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#17 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#18 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#19 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#20 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#21 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#22 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#23 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#24 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#25 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#26 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#27 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#28 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#29 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#30 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#31 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#32 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#33 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#34 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#35 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#36 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#37 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#38 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#39 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#40 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#41 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#42 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#43 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#44 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#45 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#46 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#47 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#48 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#49 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#50 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#51 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#52 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#53 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#54 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#55 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#56 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#57 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#58 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#59 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#60 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#61 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#62 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#63 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#64 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#65 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#66 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#67 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#68 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#69 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#70 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#71 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#72 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#73 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#74 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#75 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#76 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#77 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff86203c90) at ./arch/x86/include/asm/kmsan.h:82
torvalds#78 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff86203c90) at mm/kmsan/shadow.c:75
torvalds#79 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff86203c90, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#80 kmsan_get_shadow_origin_ptr (address=0xffffffff86203c90, size=8, store=false) at mm/kmsan/shadow.c:97
torvalds#81 0xffffffff81b1dc72 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=8, store=false) at mm/kmsan/instrumentation.c:36
torvalds#82 __msan_metadata_ptr_for_load_8 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:92
torvalds#83 0xffffffff814fdb9e in filter_irq_stacks (entries=<optimized out>, nr_entries=4) at kernel/stacktrace.c:397
torvalds#84 0xffffffff829520e8 in stack_depot_save_flags (entries=0xffffffff8620d974 <init_task+1012>, nr_entries=4, alloc_flags=0, depot_flags=0) at lib/stackdepot.c:500
torvalds#85 0xffffffff81b1e560 in __msan_poison_alloca (address=0xffffffff86203da0, size=24, descr=<optimized out>) at mm/kmsan/instrumentation.c:285
torvalds#86 0xffffffff8562821c in _printk (fmt=0xffffffff85f191a5 "\0016Attempting lock1") at kernel/printk/printk.c:2324
torvalds#87 0xffffffff81942aa2 in kmem_cache_create_usercopy (name=0xffffffff85f18903 "mm_struct", size=1296, align=0, flags=270336, useroffset=<optimized out>, usersize=<optimized out>, ctor=0x0 <fixed_percpu_data>) at mm/slab_common.c:296
torvalds#88 0xffffffff86f337a0 in mm_cache_init () at kernel/fork.c:3262
torvalds#89 0xffffffff86eacb8e in start_kernel () at init/main.c:932
torvalds#90 0xffffffff86ecdf94 in x86_64_start_reservations (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:555
torvalds#91 0xffffffff86ecde9b in x86_64_start_kernel (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:536
torvalds#92 0xffffffff810001d3 in secondary_startup_64 () at /pool/workspace/linux/arch/x86/kernel/head_64.S:461
torvalds#93 0x0000000000000000 in ??
gyroninja added a commit to gyroninja/linux that referenced this pull request Jan 28, 2024
As of 5ec8e8e(mm/sparsemem: fix race in accessing memory_section->usage) KMSAN
now calls into RCU tree code during kmsan_get_metadata. This will trigger a
write that will reenter into KMSAN getting the system stuck doing infinite
recursion.

#0  kmsan_get_context () at mm/kmsan/kmsan.h:106
#1  __msan_get_context_state () at mm/kmsan/instrumentation.c:331
#2  0xffffffff81495671 in get_current () at ./arch/x86/include/asm/current.h:42
#3  rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#4  __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#5  0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#6  pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#7  kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#8  virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#9  0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#10 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#11 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#12 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#13 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#14 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#15 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#16 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#17 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#18 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#19 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#20 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#21 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#22 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#23 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#24 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#25 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#26 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#27 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#28 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#29 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#30 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#31 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#32 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#33 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#34 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#35 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#36 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#37 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#38 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#39 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#40 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#41 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#42 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#43 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#44 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#45 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#46 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#47 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#48 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#49 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#50 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#51 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#52 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#53 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#54 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#55 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#56 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#57 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#58 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#59 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#60 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#61 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#62 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#63 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#64 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#65 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#66 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#67 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#68 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#69 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#70 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#71 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#72 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#73 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#74 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#75 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#76 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#77 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff86203c90) at ./arch/x86/include/asm/kmsan.h:82
torvalds#78 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff86203c90) at mm/kmsan/shadow.c:75
torvalds#79 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff86203c90, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#80 kmsan_get_shadow_origin_ptr (address=0xffffffff86203c90, size=8, store=false) at mm/kmsan/shadow.c:97
torvalds#81 0xffffffff81b1dc72 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=8, store=false) at mm/kmsan/instrumentation.c:36
torvalds#82 __msan_metadata_ptr_for_load_8 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:92
torvalds#83 0xffffffff814fdb9e in filter_irq_stacks (entries=<optimized out>, nr_entries=4) at kernel/stacktrace.c:397
torvalds#84 0xffffffff829520e8 in stack_depot_save_flags (entries=0xffffffff8620d974 <init_task+1012>, nr_entries=4, alloc_flags=0, depot_flags=0) at lib/stackdepot.c:500
torvalds#85 0xffffffff81b1e560 in __msan_poison_alloca (address=0xffffffff86203da0, size=24, descr=<optimized out>) at mm/kmsan/instrumentation.c:285
torvalds#86 0xffffffff8562821c in _printk (fmt=0xffffffff85f191a5 "\0016Attempting lock1") at kernel/printk/printk.c:2324
torvalds#87 0xffffffff81942aa2 in kmem_cache_create_usercopy (name=0xffffffff85f18903 "mm_struct", size=1296, align=0, flags=270336, useroffset=<optimized out>, usersize=<optimized out>, ctor=0x0 <fixed_percpu_data>) at mm/slab_common.c:296
torvalds#88 0xffffffff86f337a0 in mm_cache_init () at kernel/fork.c:3262
torvalds#89 0xffffffff86eacb8e in start_kernel () at init/main.c:932
torvalds#90 0xffffffff86ecdf94 in x86_64_start_reservations (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:555
torvalds#91 0xffffffff86ecde9b in x86_64_start_kernel (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:536
torvalds#92 0xffffffff810001d3 in secondary_startup_64 () at /pool/workspace/linux/arch/x86/kernel/head_64.S:461
torvalds#93 0x0000000000000000 in ??
kuba-moo added a commit to linux-netdev/testing that referenced this pull request Feb 13, 2024
Test runners on debug kernels occasionally fail with:

 # #  RUN           tls_err.13_aes_gcm.poll_partial_rec_async ...
 # # tls.c:1883:poll_partial_rec_async:Expected poll(&pfd, 1, 5) (0) == 1 (1)
 # # tls.c:1870:poll_partial_rec_async:Expected status (256) == 0 (0)
 # # poll_partial_rec_async: Test failed at step torvalds#17
 # #          FAIL  tls_err.13_aes_gcm.poll_partial_rec_async
 # not ok 699 tls_err.13_aes_gcm.poll_partial_rec_async
 # # FAILED: 698 / 699 tests passed.

This points to the second poll() in the test which is expected
to wait for the sender to send the rest of the data.
Apparently under some conditions that doesn't happen within 5ms,
bump the timeout to 20ms.

Fixes: 23fcb62 ("selftests: tls: add tests for poll behavior")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo added a commit to linux-netdev/testing that referenced this pull request Feb 13, 2024
Test runners on debug kernels occasionally fail with:

 # #  RUN           tls_err.13_aes_gcm.poll_partial_rec_async ...
 # # tls.c:1883:poll_partial_rec_async:Expected poll(&pfd, 1, 5) (0) == 1 (1)
 # # tls.c:1870:poll_partial_rec_async:Expected status (256) == 0 (0)
 # # poll_partial_rec_async: Test failed at step torvalds#17
 # #          FAIL  tls_err.13_aes_gcm.poll_partial_rec_async
 # not ok 699 tls_err.13_aes_gcm.poll_partial_rec_async
 # # FAILED: 698 / 699 tests passed.

This points to the second poll() in the test which is expected
to wait for the sender to send the rest of the data.
Apparently under some conditions that doesn't happen within 5ms,
bump the timeout to 20ms.

Fixes: 23fcb62 ("selftests: tls: add tests for poll behavior")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo added a commit to linux-netdev/testing that referenced this pull request Feb 14, 2024
Test runners on debug kernels occasionally fail with:

 # #  RUN           tls_err.13_aes_gcm.poll_partial_rec_async ...
 # # tls.c:1883:poll_partial_rec_async:Expected poll(&pfd, 1, 5) (0) == 1 (1)
 # # tls.c:1870:poll_partial_rec_async:Expected status (256) == 0 (0)
 # # poll_partial_rec_async: Test failed at step torvalds#17
 # #          FAIL  tls_err.13_aes_gcm.poll_partial_rec_async
 # not ok 699 tls_err.13_aes_gcm.poll_partial_rec_async
 # # FAILED: 698 / 699 tests passed.

This points to the second poll() in the test which is expected
to wait for the sender to send the rest of the data.
Apparently under some conditions that doesn't happen within 5ms,
bump the timeout to 20ms.

Fixes: 23fcb62 ("selftests: tls: add tests for poll behavior")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo added a commit to linux-netdev/testing that referenced this pull request Feb 14, 2024
Test runners on debug kernels occasionally fail with:

 # #  RUN           tls_err.13_aes_gcm.poll_partial_rec_async ...
 # # tls.c:1883:poll_partial_rec_async:Expected poll(&pfd, 1, 5) (0) == 1 (1)
 # # tls.c:1870:poll_partial_rec_async:Expected status (256) == 0 (0)
 # # poll_partial_rec_async: Test failed at step torvalds#17
 # #          FAIL  tls_err.13_aes_gcm.poll_partial_rec_async
 # not ok 699 tls_err.13_aes_gcm.poll_partial_rec_async
 # # FAILED: 698 / 699 tests passed.

This points to the second poll() in the test which is expected
to wait for the sender to send the rest of the data.
Apparently under some conditions that doesn't happen within 5ms,
bump the timeout to 20ms.

Fixes: 23fcb62 ("selftests: tls: add tests for poll behavior")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo added a commit to linux-netdev/testing that referenced this pull request Feb 14, 2024
Test runners on debug kernels occasionally fail with:

 # #  RUN           tls_err.13_aes_gcm.poll_partial_rec_async ...
 # # tls.c:1883:poll_partial_rec_async:Expected poll(&pfd, 1, 5) (0) == 1 (1)
 # # tls.c:1870:poll_partial_rec_async:Expected status (256) == 0 (0)
 # # poll_partial_rec_async: Test failed at step torvalds#17
 # #          FAIL  tls_err.13_aes_gcm.poll_partial_rec_async
 # not ok 699 tls_err.13_aes_gcm.poll_partial_rec_async
 # # FAILED: 698 / 699 tests passed.

This points to the second poll() in the test which is expected
to wait for the sender to send the rest of the data.
Apparently under some conditions that doesn't happen within 5ms,
bump the timeout to 20ms.

Fixes: 23fcb62 ("selftests: tls: add tests for poll behavior")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo added a commit to linux-netdev/testing that referenced this pull request Feb 14, 2024
Test runners on debug kernels occasionally fail with:

 # #  RUN           tls_err.13_aes_gcm.poll_partial_rec_async ...
 # # tls.c:1883:poll_partial_rec_async:Expected poll(&pfd, 1, 5) (0) == 1 (1)
 # # tls.c:1870:poll_partial_rec_async:Expected status (256) == 0 (0)
 # # poll_partial_rec_async: Test failed at step torvalds#17
 # #          FAIL  tls_err.13_aes_gcm.poll_partial_rec_async
 # not ok 699 tls_err.13_aes_gcm.poll_partial_rec_async
 # # FAILED: 698 / 699 tests passed.

This points to the second poll() in the test which is expected
to wait for the sender to send the rest of the data.
Apparently under some conditions that doesn't happen within 5ms,
bump the timeout to 20ms.

Fixes: 23fcb62 ("selftests: tls: add tests for poll behavior")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo added a commit to linux-netdev/testing that referenced this pull request Feb 14, 2024
Test runners on debug kernels occasionally fail with:

 # #  RUN           tls_err.13_aes_gcm.poll_partial_rec_async ...
 # # tls.c:1883:poll_partial_rec_async:Expected poll(&pfd, 1, 5) (0) == 1 (1)
 # # tls.c:1870:poll_partial_rec_async:Expected status (256) == 0 (0)
 # # poll_partial_rec_async: Test failed at step torvalds#17
 # #          FAIL  tls_err.13_aes_gcm.poll_partial_rec_async
 # not ok 699 tls_err.13_aes_gcm.poll_partial_rec_async
 # # FAILED: 698 / 699 tests passed.

This points to the second poll() in the test which is expected
to wait for the sender to send the rest of the data.
Apparently under some conditions that doesn't happen within 5ms,
bump the timeout to 20ms.

Fixes: 23fcb62 ("selftests: tls: add tests for poll behavior")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo added a commit to linux-netdev/testing that referenced this pull request Feb 14, 2024
Test runners on debug kernels occasionally fail with:

 # #  RUN           tls_err.13_aes_gcm.poll_partial_rec_async ...
 # # tls.c:1883:poll_partial_rec_async:Expected poll(&pfd, 1, 5) (0) == 1 (1)
 # # tls.c:1870:poll_partial_rec_async:Expected status (256) == 0 (0)
 # # poll_partial_rec_async: Test failed at step torvalds#17
 # #          FAIL  tls_err.13_aes_gcm.poll_partial_rec_async
 # not ok 699 tls_err.13_aes_gcm.poll_partial_rec_async
 # # FAILED: 698 / 699 tests passed.

This points to the second poll() in the test which is expected
to wait for the sender to send the rest of the data.
Apparently under some conditions that doesn't happen within 5ms,
bump the timeout to 20ms.

Fixes: 23fcb62 ("selftests: tls: add tests for poll behavior")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo added a commit to linux-netdev/testing that referenced this pull request Feb 14, 2024
Test runners on debug kernels occasionally fail with:

 # #  RUN           tls_err.13_aes_gcm.poll_partial_rec_async ...
 # # tls.c:1883:poll_partial_rec_async:Expected poll(&pfd, 1, 5) (0) == 1 (1)
 # # tls.c:1870:poll_partial_rec_async:Expected status (256) == 0 (0)
 # # poll_partial_rec_async: Test failed at step torvalds#17
 # #          FAIL  tls_err.13_aes_gcm.poll_partial_rec_async
 # not ok 699 tls_err.13_aes_gcm.poll_partial_rec_async
 # # FAILED: 698 / 699 tests passed.

This points to the second poll() in the test which is expected
to wait for the sender to send the rest of the data.
Apparently under some conditions that doesn't happen within 5ms,
bump the timeout to 20ms.

Fixes: 23fcb62 ("selftests: tls: add tests for poll behavior")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo added a commit to linux-netdev/testing that referenced this pull request Feb 14, 2024
Test runners on debug kernels occasionally fail with:

 # #  RUN           tls_err.13_aes_gcm.poll_partial_rec_async ...
 # # tls.c:1883:poll_partial_rec_async:Expected poll(&pfd, 1, 5) (0) == 1 (1)
 # # tls.c:1870:poll_partial_rec_async:Expected status (256) == 0 (0)
 # # poll_partial_rec_async: Test failed at step torvalds#17
 # #          FAIL  tls_err.13_aes_gcm.poll_partial_rec_async
 # not ok 699 tls_err.13_aes_gcm.poll_partial_rec_async
 # # FAILED: 698 / 699 tests passed.

This points to the second poll() in the test which is expected
to wait for the sender to send the rest of the data.
Apparently under some conditions that doesn't happen within 5ms,
bump the timeout to 20ms.

Fixes: 23fcb62 ("selftests: tls: add tests for poll behavior")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo added a commit to linux-netdev/testing that referenced this pull request Feb 15, 2024
Test runners on debug kernels occasionally fail with:

 # #  RUN           tls_err.13_aes_gcm.poll_partial_rec_async ...
 # # tls.c:1883:poll_partial_rec_async:Expected poll(&pfd, 1, 5) (0) == 1 (1)
 # # tls.c:1870:poll_partial_rec_async:Expected status (256) == 0 (0)
 # # poll_partial_rec_async: Test failed at step torvalds#17
 # #          FAIL  tls_err.13_aes_gcm.poll_partial_rec_async
 # not ok 699 tls_err.13_aes_gcm.poll_partial_rec_async
 # # FAILED: 698 / 699 tests passed.

This points to the second poll() in the test which is expected
to wait for the sender to send the rest of the data.
Apparently under some conditions that doesn't happen within 5ms,
bump the timeout to 20ms.

Fixes: 23fcb62 ("selftests: tls: add tests for poll behavior")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo added a commit to linux-netdev/testing that referenced this pull request Feb 15, 2024
Test runners on debug kernels occasionally fail with:

 # #  RUN           tls_err.13_aes_gcm.poll_partial_rec_async ...
 # # tls.c:1883:poll_partial_rec_async:Expected poll(&pfd, 1, 5) (0) == 1 (1)
 # # tls.c:1870:poll_partial_rec_async:Expected status (256) == 0 (0)
 # # poll_partial_rec_async: Test failed at step torvalds#17
 # #          FAIL  tls_err.13_aes_gcm.poll_partial_rec_async
 # not ok 699 tls_err.13_aes_gcm.poll_partial_rec_async
 # # FAILED: 698 / 699 tests passed.

This points to the second poll() in the test which is expected
to wait for the sender to send the rest of the data.
Apparently under some conditions that doesn't happen within 5ms,
bump the timeout to 20ms.

Fixes: 23fcb62 ("selftests: tls: add tests for poll behavior")
Link: https://lore.kernel.org/r/20240213142055.395564-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
fallen pushed a commit to kalray/linux_coolidge that referenced this pull request Mar 6, 2024
Summary:
In order to display registers when a signal is not handled,
SYSCTL_EXCEPTION_TRACE exposes a sysctl parameter which allows to enable
or disable this feature. Since it should be handled where signals are
generated, factorized the existing kill function to incorporate the
check for show_unhandled_signals. If this is set and the signal is not
handled, display registers.

Test Plan:
Ran a test which segfault to check the results is the expected one:
```
  # splice01
  tst_test.c:1096: INFO: Timeout per run is 0h 05m 00s
  [  254.090000] splice01[444]: unhandled signal 11 code 0x1 at 0x0 in
  splice01[10000+2b000]
  [  254.110000] CPU: 0 PID: 444 Comm: splice01 Tainted: G        W
  5.1.0 torvalds#17
  [  254.120000] Hardware name: HAPS prototyping (LS) (DT)
  [  254.130000]
  [  254.130000] mode: user
  [  254.130000]     PC: 0000000000000000    PS: 0000000000130f31
  [  254.130000]     CS: 0000000000000000    RA: 0000000000000000
  [  254.130000]     LS: 0000000000020370    LE: 0000000000020374
  [  254.130000]     LC: 0000000000000001
  [  254.130000]
  [  254.140000]     R0: ffffffffffffffff    R1: 0000000000000001
  [  254.150000]     R2: 0000000000000000    R3: 0000000000000003
  [  254.160000]     R4: 0000000000000003    R5: 000000000004b7c8
  [  254.170000]     R6: 0000000000000039    R7: 0000000000000003
  [  254.180000]     R8: ffffff801f4c0b38    R9: ffffff801f4c0b98
  [  254.190000]     R10: 0000000000000018    R11: 0000000000000028
  [  254.200000]     R12: 0000007ffffffa20    R13: 0000000000057d48
  [  254.210000]     R14: 0000007ffffffab0    R15: 62732f0000000000
  [  254.220000]     R16: 0000000000000000    R17: 000000004667f14d
  [  254.230000]     R18: 000000000004b7c0    R19: 0000000000000003
  [  254.240000]     R20: 0000000000034c28    R21: 000000000000006a
  [  254.250000]     R22: 0000000000034f08    R23: 000000000004bbe8
  [  254.260000]     R24: 0000000000035808    R25: 0000000000000000
  [  254.270000]     R26: 0000000000000000    R27: 00000000000554b8
  [  254.280000]     R28: 0000000000000001    R29: 00000000000357e0
  [  254.290000]     R30: 00000000000354d8    R31: 00000000000355d0
  [  254.300000]     R32: 0000000000000000    R33: 0000000000000000
  [  254.310000]     R34: ffffffffffecf0ce    R35: 0000000000000000
  [  254.320000]     R36: ffffff801e97c380    R37: 0000000000000000
  [  254.330000]     R38: 0000000000000002    R39: 0000000000000000
  [  254.340000]     R40: 0000000000000000    R41: 000000000002116c
  [  254.350000]     R42: 0000000000130f31    R43: 0000000000039163
  [  254.360000]     R44: 0000000000000000    R45: 0000000000010000
  [  254.380000]     R46: 0000000000000000    R47: 0000000000000000
  [  254.390000]     R48: 0000000000000000    R49: 0000000000000000
  [  254.400000]     R50: 0000000000000173    R51: 0000000000000000
  [  254.410000]     R52: 0000007ffffff9c0    R53: 0000000000057d48
  [  254.420000]     R54: 0000007ffffffab0    R55: 62732f0000000000
  [  254.430000]     R56: 0000000000000000    R57: 0000000000000000
  [  254.440000]     R58: 0000000000000000    R59: 0000000000000000
  [  254.450000]     R60: 0000000000000001    R61: 0000000000020374
  [  254.460000]     R62: 0000000000020370    R63: 0000000000015244
  [  254.470000]
  [  254.470000]
  tst_test.c:1141: BROK: Test killed by SIGSEGV!

  # sysctl -w debug.exception-trace=0
  # splice01
  tst_test.c:1096: INFO: Timeout per run is 0h 05m 00s
  tst_test.c:1141: BROK: Test killed by SIGSEGV!
```

Reviewers: O51 Linux Coolidge, gthouvenin

Reviewed By: O51 Linux Coolidge, gthouvenin

Subscribers: gthouvenin, #linux_coolidge_cc

Differential Revision: https://phab.kalray.eu/D1140
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment