New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added backlight support for some samsung laptops #11

Closed
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
7 participants
@xonatius

xonatius commented Sep 6, 2011

Models:

  • N120
  • R468/R418
  • X320/X420/X520
  • R510/P510
  • N350
  • R470/R420
  • R528/R728
  • SQ1S
Added backlight support for some samsung laptops
Models:
 * N120
 * R468/R418
 * X320/X420/X520
 * R510/P510
 * N350
 * R470/R420
 * R528/R728
 * SQ1S
@torvalds

This comment has been minimized.

Show comment
Hide comment
@torvalds

torvalds Sep 6, 2011

Owner

I'm not doing github pulls. The pull requests are seriously
misdesigned, and github does horrible things to the commits.

Please don't press the "pull request" github button. Do proper kernel
pull request with diffstat, git source tree (which can be on github,
of course), branch, commit information etc etc etc.

                  Linus
Owner

torvalds commented Sep 6, 2011

I'm not doing github pulls. The pull requests are seriously
misdesigned, and github does horrible things to the commits.

Please don't press the "pull request" github button. Do proper kernel
pull request with diffstat, git source tree (which can be on github,
of course), branch, commit information etc etc etc.

                  Linus

@xonatius xonatius closed this Sep 7, 2011

@nils-werner

This comment has been minimized.

Show comment
Hide comment
@nils-werner

nils-werner Sep 7, 2011

How are pullrequest seriously misdesigned (apart from that you might be used to a different kind of workflow)?

nils-werner commented Sep 7, 2011

How are pullrequest seriously misdesigned (apart from that you might be used to a different kind of workflow)?

@myfreeweb

This comment has been minimized.

Show comment
Hide comment
@myfreeweb

myfreeweb Sep 7, 2011

I'm not doing linux kernel pulls. The kernel pulls are seriously
misdesigned, and linux does horrible things to the commits.

Please don't press the "pull request" kernel button. Do proper github
pull request with diffstat, git source tree (which can be on linux,
of course), branch, commit information etc etc etc.

GitHub

myfreeweb commented Sep 7, 2011

I'm not doing linux kernel pulls. The kernel pulls are seriously
misdesigned, and linux does horrible things to the commits.

Please don't press the "pull request" kernel button. Do proper github
pull request with diffstat, git source tree (which can be on linux,
of course), branch, commit information etc etc etc.

GitHub

@tilsammans

This comment has been minimized.

Show comment
Hide comment
@tilsammans

tilsammans Sep 7, 2011

I honestly would like to know why github pull requests are misdesigned. I'll grant that I didn't actually create git but they seem to work just fine, is there something I am missing?

tilsammans commented Sep 7, 2011

I honestly would like to know why github pull requests are misdesigned. I'll grant that I didn't actually create git but they seem to work just fine, is there something I am missing?

@hotwoofy

This comment has been minimized.

Show comment
Hide comment
@nils-werner

This comment has been minimized.

Show comment
Hide comment
@nils-werner

nils-werner Sep 9, 2011

See https://github.com/torvalds/diveclog/pull/18#issuecomment-2023552

Wow, great discussion went on there. Shacon raised perfectly valid points and Torvalds was basically "f this, I don't care, you're crazy". Great response!

nils-werner commented Sep 9, 2011

See https://github.com/torvalds/diveclog/pull/18#issuecomment-2023552

Wow, great discussion went on there. Shacon raised perfectly valid points and Torvalds was basically "f this, I don't care, you're crazy". Great response!

@torvalds

This comment has been minimized.

Show comment
Hide comment
@torvalds

torvalds Sep 9, 2011

Owner

On Fri, Sep 9, 2011 at 12:49 AM, Nils Werner
reply@reply.github.com
wrote:

Wow, great discussion went on there. Shacon raised perfectly valid points and Torvalds was basically "f this, I don't care, you're crazy". Great response!

Can you read?

"If the merge message doesn't tell me who the merge is from and what
branch it was, the merge message is totally useless."

If you can't understand that, then yes, you're crazy. Or just terminally stupid.

The quality of github "issues" and comments really is very low. This
being just another example of it.

                          Linus
Owner

torvalds commented Sep 9, 2011

On Fri, Sep 9, 2011 at 12:49 AM, Nils Werner
reply@reply.github.com
wrote:

Wow, great discussion went on there. Shacon raised perfectly valid points and Torvalds was basically "f this, I don't care, you're crazy". Great response!

Can you read?

"If the merge message doesn't tell me who the merge is from and what
branch it was, the merge message is totally useless."

If you can't understand that, then yes, you're crazy. Or just terminally stupid.

The quality of github "issues" and comments really is very low. This
being just another example of it.

                          Linus
@nils-werner

This comment has been minimized.

Show comment
Hide comment
@nils-werner

nils-werner Sep 9, 2011

First, I agree with Scott: In many cases people delete their fork (or at least the branch). So where would the message point you to? The pull request of the pulling repository will much more likely be around for a long time.

Also, what if the branch you'll pull from has changed in the meantime? You'd end up with changes that are not documented in the pull request and thus not reviewed by the ones discussing the pull request. As soon as the PR is posted you must put them out of reach of the author to keep them from sneaking in changes.

Great response!

Can you read?

Also, you did notice that you've proven my point right there, right?

nils-werner commented Sep 9, 2011

First, I agree with Scott: In many cases people delete their fork (or at least the branch). So where would the message point you to? The pull request of the pulling repository will much more likely be around for a long time.

Also, what if the branch you'll pull from has changed in the meantime? You'd end up with changes that are not documented in the pull request and thus not reviewed by the ones discussing the pull request. As soon as the PR is posted you must put them out of reach of the author to keep them from sneaking in changes.

Great response!

Can you read?

Also, you did notice that you've proven my point right there, right?

@torvalds

This comment has been minimized.

Show comment
Hide comment
@torvalds

torvalds Sep 9, 2011

Owner

On Fri, Sep 9, 2011 at 12:10 PM, Nils Werner
reply@reply.github.com
wrote:

First, I agree with Scott: In many cases people delete their fork (or at least the branch). So where would the message point you to? The pull request of the pulling repository will much more likely be around for a long time.

That's a "implementation problem". It's not an argument for doing crap.

Simple solution: if people delete the branch or repository, consider
the pull request dead.

You can make the "pull request" namespace separate from the branch
namespace, but do it on the source side, instead of on the
destination side like you do now. So if somebody says "please pull by
branch xyzzy", you turn it into a pull request for

git pull git://github.com/ pull/xyzzy

and then if there i a previous pull request, add a number to it (so it
becomes "pull/xyzzy-2" or whatever).

Or something along those lines. The important part is that YOU MUST
NOT THROW AWAY THE SOURCE INFORMATION!

Also, what if the branch you'll pull from has changed in the meantime?

We actually do this in the kernel on purpose sometimes - people fix up
their stuff.

That said, again, you could do the same thing: if somebody changes a
branch after created a pull request off it, just invalidate the pull
request and refuse to honor it. Again, if you do a separate
"pull/xyzzy" namespace, you should be able to validate that trivially
(save off the commit ID at the time of the pull, and refuse to serve
"pull/xyzzy" if the commit ID doesn't match the branch "xyzzy" any
more).

You'd end up with changes that are not documented in the pull request and thus not reviewed by the ones discussing the pull request.

Umm, considering that the pull requests used to have no documentation
what-so-ever before I even complained about it, that's a pretty damn
weak argument, isn't it?

As soon as the PR is posted you must put them out of reach of the
author to keep them from sneaking in changes.

Can you read?

Also, you did notice that you've proven my point right there, right?

Umm. I'm not polite. Big news. I'd rather be acerbic than stupid.

                       Linus
Owner

torvalds commented Sep 9, 2011

On Fri, Sep 9, 2011 at 12:10 PM, Nils Werner
reply@reply.github.com
wrote:

First, I agree with Scott: In many cases people delete their fork (or at least the branch). So where would the message point you to? The pull request of the pulling repository will much more likely be around for a long time.

That's a "implementation problem". It's not an argument for doing crap.

Simple solution: if people delete the branch or repository, consider
the pull request dead.

You can make the "pull request" namespace separate from the branch
namespace, but do it on the source side, instead of on the
destination side like you do now. So if somebody says "please pull by
branch xyzzy", you turn it into a pull request for

git pull git://github.com/ pull/xyzzy

and then if there i a previous pull request, add a number to it (so it
becomes "pull/xyzzy-2" or whatever).

Or something along those lines. The important part is that YOU MUST
NOT THROW AWAY THE SOURCE INFORMATION!

Also, what if the branch you'll pull from has changed in the meantime?

We actually do this in the kernel on purpose sometimes - people fix up
their stuff.

That said, again, you could do the same thing: if somebody changes a
branch after created a pull request off it, just invalidate the pull
request and refuse to honor it. Again, if you do a separate
"pull/xyzzy" namespace, you should be able to validate that trivially
(save off the commit ID at the time of the pull, and refuse to serve
"pull/xyzzy" if the commit ID doesn't match the branch "xyzzy" any
more).

You'd end up with changes that are not documented in the pull request and thus not reviewed by the ones discussing the pull request.

Umm, considering that the pull requests used to have no documentation
what-so-ever before I even complained about it, that's a pretty damn
weak argument, isn't it?

As soon as the PR is posted you must put them out of reach of the
author to keep them from sneaking in changes.

Can you read?

Also, you did notice that you've proven my point right there, right?

Umm. I'm not polite. Big news. I'd rather be acerbic than stupid.

                       Linus
@nils-werner

This comment has been minimized.

Show comment
Hide comment
@nils-werner

nils-werner Sep 9, 2011

That's a "implementation problem".

A decentralized system that doesn't accept disappearing nodes sounds more like a design problem.

Simple solution: if people delete the branch or repository, consider the pull request dead.

Years after the branch has been merged? Is that a problem we wanted to solve?

Also, what if the branch you'll pull from has changed in the meantime?

We actually do this in the kernel on purpose sometimes - people fix up their stuff.

I meant malicuous changes. Hierarchies are shallow, elite circles basically nonexistant so that's a real issue. And the biggest strength of GitHub.

save off the commit ID at the time of the pull, and refuse to serve "pull/xyzzy" if the commit ID doesn't match the branch "xyzzy" any more

Thats the first constructive comment to this discussion. And sounds like a good idea, apart from the problem that you'd lose the link to the PR wich, to many, is more useful than being able to immediately recognise the source.

Also it would probably require lots of modifications to the deamon though.And very disciplined contributors (always make sure to use dead-end topic-branches, not everybody does that). Separating the two simply improves the workflow a lot.

It'd be interesting what @schacon has to say about it.

Umm, considering that the pull requests used to have no documentation what-so-ever before I even complained about it, that's a pretty damn weak argument, isn't it?

When was that? Months ago? I am talking about your comment 2 days ago.

nils-werner commented Sep 9, 2011

That's a "implementation problem".

A decentralized system that doesn't accept disappearing nodes sounds more like a design problem.

Simple solution: if people delete the branch or repository, consider the pull request dead.

Years after the branch has been merged? Is that a problem we wanted to solve?

Also, what if the branch you'll pull from has changed in the meantime?

We actually do this in the kernel on purpose sometimes - people fix up their stuff.

I meant malicuous changes. Hierarchies are shallow, elite circles basically nonexistant so that's a real issue. And the biggest strength of GitHub.

save off the commit ID at the time of the pull, and refuse to serve "pull/xyzzy" if the commit ID doesn't match the branch "xyzzy" any more

Thats the first constructive comment to this discussion. And sounds like a good idea, apart from the problem that you'd lose the link to the PR wich, to many, is more useful than being able to immediately recognise the source.

Also it would probably require lots of modifications to the deamon though.And very disciplined contributors (always make sure to use dead-end topic-branches, not everybody does that). Separating the two simply improves the workflow a lot.

It'd be interesting what @schacon has to say about it.

Umm, considering that the pull requests used to have no documentation what-so-ever before I even complained about it, that's a pretty damn weak argument, isn't it?

When was that? Months ago? I am talking about your comment 2 days ago.

@nils-werner

This comment has been minimized.

Show comment
Hide comment
@nils-werner

nils-werner Sep 9, 2011

Umm. I'm not polite. Big news. I'd rather be acerbic than stupid.

A personal, unrelated note: Being unable to lead an objective discussion. Judging people, then insulting them just to prove a point. Recognising ones flaws but being unwilling to change them, instead bragging about them. Missing the ability to reflect on ones actions during interactions with others.

That sounds pretty stupid to me. Anyways, I'm moving on.

nils-werner commented Sep 9, 2011

Umm. I'm not polite. Big news. I'd rather be acerbic than stupid.

A personal, unrelated note: Being unable to lead an objective discussion. Judging people, then insulting them just to prove a point. Recognising ones flaws but being unwilling to change them, instead bragging about them. Missing the ability to reflect on ones actions during interactions with others.

That sounds pretty stupid to me. Anyways, I'm moving on.

@jeffWelling

This comment has been minimized.

Show comment
Hide comment
@jeffWelling

jeffWelling Sep 9, 2011

@nils-werner

A decentralized system that doesn't accept disappearing nodes sounds more like a design problem.

I thought we were talking about pull requests and branches? When did a branch become a node?
Perhaps I'm missing something but this sounds simple; if you have a change and you want someone else to pull it, it sounds reasonable to expect you to keep the change published at least until it is pulled.

I meant malicuous changes. Hierarchies are shallow, elite circles basically nonexistant so that's a real issue. And the biggest strength of GitHub.

Except that, as indicated by Scott Chacon [0], the most common scenario is to perform the pull request locally on your machine, allowing you to pull the code and then review it without said code being changed before merging. I can understand your argument in relation to pull requests done using the button on the website though.

[0] https://github.com/torvalds/diveclog/pull/18

jeffWelling commented Sep 9, 2011

@nils-werner

A decentralized system that doesn't accept disappearing nodes sounds more like a design problem.

I thought we were talking about pull requests and branches? When did a branch become a node?
Perhaps I'm missing something but this sounds simple; if you have a change and you want someone else to pull it, it sounds reasonable to expect you to keep the change published at least until it is pulled.

I meant malicuous changes. Hierarchies are shallow, elite circles basically nonexistant so that's a real issue. And the biggest strength of GitHub.

Except that, as indicated by Scott Chacon [0], the most common scenario is to perform the pull request locally on your machine, allowing you to pull the code and then review it without said code being changed before merging. I can understand your argument in relation to pull requests done using the button on the website though.

[0] https://github.com/torvalds/diveclog/pull/18

cuviper pushed a commit to cuviper/linux-uprobes that referenced this pull request Nov 3, 2011

kvm tools: Dump vCPUs in order
* Ingo Molnar <mingo@elte.hu> wrote:

> The patch below addresses these concerns, serializes the output, tidies up the
> printout, resulting in this new output:

There's one bug remaining that my patch does not address: the vCPUs are not
printed in order:

# vCPU #0's dump:
# vCPU #2's dump:
# vCPU #24's dump:
# vCPU #5's dump:
# vCPU #39's dump:
# vCPU #38's dump:
# vCPU #51's dump:
# vCPU #11's dump:
# vCPU #10's dump:
# vCPU #12's dump:

This is undesirable as the order of printout is highly random, so successive
dumps are difficult to compare.

The patch below serializes the signalling itself. (this is on top of the
previous patch)

The patch also tweaks the vCPU printout line a bit so that it does not start
with '#', which is discarded if such messages are pasted into Git commit
messages.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>

torvalds pushed a commit that referenced this pull request Dec 15, 2011

ext4: avoid hangs in ext4_da_should_update_i_disksize()
If the pte mapping in generic_perform_write() is unmapped between
iov_iter_fault_in_readable() and iov_iter_copy_from_user_atomic(), the
"copied" parameter to ->end_write can be zero. ext4 couldn't cope with
it with delayed allocations enabled. This skips the i_disksize
enlargement logic if copied is zero and no new data was appeneded to
the inode.

 gdb> bt
 #0  0xffffffff811afe80 in ext4_da_should_update_i_disksize (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x1\
 08000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2467
 #1  ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\
 xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
 #2  0xffffffff810d97f1 in generic_perform_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value o\
 ptimized out>, pos=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2440
 #3  generic_file_buffered_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value optimized out>, p\
 os=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2482
 #4  0xffffffff810db5d1 in __generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, ppos=0\
 xffff88001e26be40) at mm/filemap.c:2600
 #5  0xffffffff810db853 in generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=<value optimi\
 zed out>, pos=<value optimized out>) at mm/filemap.c:2632
 #6  0xffffffff811a71aa in ext4_file_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, pos=0x108000) a\
 t fs/ext4/file.c:136
 #7  0xffffffff811375aa in do_sync_write (filp=0xffff88003f606a80, buf=<value optimized out>, len=<value optimized out>, \
 ppos=0xffff88001e26bf48) at fs/read_write.c:406
 #8  0xffffffff81137e56 in vfs_write (file=0xffff88003f606a80, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x4\
 000, pos=0xffff88001e26bf48) at fs/read_write.c:435
 #9  0xffffffff8113816c in sys_write (fd=<value optimized out>, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x\
 4000) at fs/read_write.c:487
 #10 <signal handler called>
 #11 0x00007f120077a390 in __brk_reservation_fn_dmi_alloc__ ()
 #12 0x0000000000000000 in ?? ()
 gdb> print offset
 $22 = 0xffffffffffffffff
 gdb> print idx
 $23 = 0xffffffff
 gdb> print inode->i_blkbits
 $24 = 0xc
 gdb> up
 #1  ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\
 xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
 2512                    if (ext4_da_should_update_i_disksize(page, end)) {
 gdb> print start
 $25 = 0x0
 gdb> print end
 $26 = 0xffffffffffffffff
 gdb> print pos
 $27 = 0x108000
 gdb> print new_i_size
 $28 = 0x108000
 gdb> print ((struct ext4_inode_info *)((char *)inode-((int)(&((struct ext4_inode_info *)0)->vfs_inode))))->i_disksize
 $29 = 0xd9000
 gdb> down
 2467            for (i = 0; i < idx; i++)
 gdb> print i
 $30 = 0xd44acbee

This is 100% reproducible with some autonuma development code tuned in
a very aggressive manner (not normal way even for knumad) which does
"exotic" changes to the ptes. It wouldn't normally trigger but I don't
see why it can't happen normally if the page is added to swap cache in
between the two faults leading to "copied" being zero (which then
hangs in ext4). So it should be fixed. Especially possible with lumpy
reclaim (albeit disabled if compaction is enabled) as that would
ignore the young bits in the ptes.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

gdamjan pushed a commit to gdamjan/linux that referenced this pull request Dec 24, 2011

ext4: avoid hangs in ext4_da_should_update_i_disksize()
commit ea51d13 upstream.

If the pte mapping in generic_perform_write() is unmapped between
iov_iter_fault_in_readable() and iov_iter_copy_from_user_atomic(), the
"copied" parameter to ->end_write can be zero. ext4 couldn't cope with
it with delayed allocations enabled. This skips the i_disksize
enlargement logic if copied is zero and no new data was appeneded to
the inode.

 gdb> bt
 #0  0xffffffff811afe80 in ext4_da_should_update_i_disksize (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x1\
 08000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2467
 #1  ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\
 xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
 #2  0xffffffff810d97f1 in generic_perform_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value o\
 ptimized out>, pos=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2440
 #3  generic_file_buffered_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value optimized out>, p\
 os=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2482
 #4  0xffffffff810db5d1 in __generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, ppos=0\
 xffff88001e26be40) at mm/filemap.c:2600
 #5  0xffffffff810db853 in generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=<value optimi\
 zed out>, pos=<value optimized out>) at mm/filemap.c:2632
 #6  0xffffffff811a71aa in ext4_file_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, pos=0x108000) a\
 t fs/ext4/file.c:136
 #7  0xffffffff811375aa in do_sync_write (filp=0xffff88003f606a80, buf=<value optimized out>, len=<value optimized out>, \
 ppos=0xffff88001e26bf48) at fs/read_write.c:406
 #8  0xffffffff81137e56 in vfs_write (file=0xffff88003f606a80, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x4\
 000, pos=0xffff88001e26bf48) at fs/read_write.c:435
 #9  0xffffffff8113816c in sys_write (fd=<value optimized out>, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x\
 4000) at fs/read_write.c:487
 #10 <signal handler called>
 #11 0x00007f120077a390 in __brk_reservation_fn_dmi_alloc__ ()
 #12 0x0000000000000000 in ?? ()
 gdb> print offset
 $22 = 0xffffffffffffffff
 gdb> print idx
 $23 = 0xffffffff
 gdb> print inode->i_blkbits
 $24 = 0xc
 gdb> up
 #1  ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\
 xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
 2512                    if (ext4_da_should_update_i_disksize(page, end)) {
 gdb> print start
 $25 = 0x0
 gdb> print end
 $26 = 0xffffffffffffffff
 gdb> print pos
 $27 = 0x108000
 gdb> print new_i_size
 $28 = 0x108000
 gdb> print ((struct ext4_inode_info *)((char *)inode-((int)(&((struct ext4_inode_info *)0)->vfs_inode))))->i_disksize
 $29 = 0xd9000
 gdb> down
 2467            for (i = 0; i < idx; i++)
 gdb> print i
 $30 = 0xd44acbee

This is 100% reproducible with some autonuma development code tuned in
a very aggressive manner (not normal way even for knumad) which does
"exotic" changes to the ptes. It wouldn't normally trigger but I don't
see why it can't happen normally if the page is added to swap cache in
between the two faults leading to "copied" being zero (which then
hangs in ext4). So it should be fixed. Especially possible with lumpy
reclaim (albeit disabled if compaction is enabled) as that would
ignore the young bits in the ptes.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

jkstrick added a commit to jkstrick/linux that referenced this pull request Feb 11, 2012

ixgbe: do not update real num queues when netdev is going away
If the netdev is already in NETREG_UNREGISTERING/_UNREGISTERED state, do not
update the real num tx queues. netdev_queue_update_kobjects() is already
called via remove_queue_kobjects() at NETREG_UNREGISTERING time. So, when
upper layer driver, e.g., FCoE protocol stack is monitoring the netdev
event of NETDEV_UNREGISTER and calls back to LLD ndo_fcoe_disable() to remove
extra queues allocated for FCoE, the associated txq sysfs kobjects are already
removed, and trying to update the real num queues would cause something like
below:

...
PID: 25138  TASK: ffff88021e64c440  CPU: 3   COMMAND: "kworker/3:3"
 #0 [ffff88021f007760] machine_kexec at ffffffff810226d9
 #1 [ffff88021f0077d0] crash_kexec at ffffffff81089d2d
 #2 [ffff88021f0078a0] oops_end at ffffffff813bca78
 #3 [ffff88021f0078d0] no_context at ffffffff81029e72
 #4 [ffff88021f007920] __bad_area_nosemaphore at ffffffff8102a155
 #5 [ffff88021f0079f0] bad_area_nosemaphore at ffffffff8102a23e
 #6 [ffff88021f007a00] do_page_fault at ffffffff813bf32e
 #7 [ffff88021f007b10] page_fault at ffffffff813bc045
    [exception RIP: sysfs_find_dirent+17]
    RIP: ffffffff81178611  RSP: ffff88021f007bc0  RFLAGS: 00010246
    RAX: ffff88021e64c440  RBX: ffffffff8156cc63  RCX: 0000000000000004
    RDX: ffffffff8156cc63  RSI: 0000000000000000  RDI: 0000000000000000
    RBP: ffff88021f007be0   R8: 0000000000000004   R9: 0000000000000008
    R10: ffffffff816fed00  R11: 0000000000000004  R12: 0000000000000000
    R13: ffffffff8156cc63  R14: 0000000000000000  R15: ffff8802222a0000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffff88021f007be8] sysfs_get_dirent at ffffffff81178c07
 #9 [ffff88021f007c18] sysfs_remove_group at ffffffff8117ac27
#10 [ffff88021f007c48] netdev_queue_update_kobjects at ffffffff813178f9
#11 [ffff88021f007c88] netif_set_real_num_tx_queues at ffffffff81303e38
#12 [ffff88021f007cc8] ixgbe_set_num_queues at ffffffffa0249763 [ixgbe]
#13 [ffff88021f007cf8] ixgbe_init_interrupt_scheme at ffffffffa024ea89 [ixgbe]
#14 [ffff88021f007d48] ixgbe_fcoe_disable at ffffffffa0267113 [ixgbe]
#15 [ffff88021f007d68] vlan_dev_fcoe_disable at ffffffffa014fef5 [8021q]
#16 [ffff88021f007d78] fcoe_interface_cleanup at ffffffffa02b7dfd [fcoe]
#17 [ffff88021f007df8] fcoe_destroy_work at ffffffffa02b7f08 [fcoe]
#18 [ffff88021f007e18] process_one_work at ffffffff8105d7ca
#19 [ffff88021f007e68] worker_thread at ffffffff81060513
#20 [ffff88021f007ee8] kthread at ffffffff810648b6
#21 [ffff88021f007f48] kernel_thread_helper at ffffffff813c40f4

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

zachariasmaladroit pushed a commit to galaxys-cm7miui-kernel/linux that referenced this pull request Feb 11, 2012

ixgbe: do not update real num queues when netdev is going away
If the netdev is already in NETREG_UNREGISTERING/_UNREGISTERED state, do not
update the real num tx queues. netdev_queue_update_kobjects() is already
called via remove_queue_kobjects() at NETREG_UNREGISTERING time. So, when
upper layer driver, e.g., FCoE protocol stack is monitoring the netdev
event of NETDEV_UNREGISTER and calls back to LLD ndo_fcoe_disable() to remove
extra queues allocated for FCoE, the associated txq sysfs kobjects are already
removed, and trying to update the real num queues would cause something like
below:

...
PID: 25138  TASK: ffff88021e64c440  CPU: 3   COMMAND: "kworker/3:3"
 #0 [ffff88021f007760] machine_kexec at ffffffff810226d9
 #1 [ffff88021f0077d0] crash_kexec at ffffffff81089d2d
 #2 [ffff88021f0078a0] oops_end at ffffffff813bca78
 #3 [ffff88021f0078d0] no_context at ffffffff81029e72
 #4 [ffff88021f007920] __bad_area_nosemaphore at ffffffff8102a155
 #5 [ffff88021f0079f0] bad_area_nosemaphore at ffffffff8102a23e
 #6 [ffff88021f007a00] do_page_fault at ffffffff813bf32e
 #7 [ffff88021f007b10] page_fault at ffffffff813bc045
    [exception RIP: sysfs_find_dirent+17]
    RIP: ffffffff81178611  RSP: ffff88021f007bc0  RFLAGS: 00010246
    RAX: ffff88021e64c440  RBX: ffffffff8156cc63  RCX: 0000000000000004
    RDX: ffffffff8156cc63  RSI: 0000000000000000  RDI: 0000000000000000
    RBP: ffff88021f007be0   R8: 0000000000000004   R9: 0000000000000008
    R10: ffffffff816fed00  R11: 0000000000000004  R12: 0000000000000000
    R13: ffffffff8156cc63  R14: 0000000000000000  R15: ffff8802222a0000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffff88021f007be8] sysfs_get_dirent at ffffffff81178c07
 #9 [ffff88021f007c18] sysfs_remove_group at ffffffff8117ac27
#10 [ffff88021f007c48] netdev_queue_update_kobjects at ffffffff813178f9
#11 [ffff88021f007c88] netif_set_real_num_tx_queues at ffffffff81303e38
#12 [ffff88021f007cc8] ixgbe_set_num_queues at ffffffffa0249763 [ixgbe]
#13 [ffff88021f007cf8] ixgbe_init_interrupt_scheme at ffffffffa024ea89 [ixgbe]
#14 [ffff88021f007d48] ixgbe_fcoe_disable at ffffffffa0267113 [ixgbe]
#15 [ffff88021f007d68] vlan_dev_fcoe_disable at ffffffffa014fef5 [8021q]
#16 [ffff88021f007d78] fcoe_interface_cleanup at ffffffffa02b7dfd [fcoe]
#17 [ffff88021f007df8] fcoe_destroy_work at ffffffffa02b7f08 [fcoe]
#18 [ffff88021f007e18] process_one_work at ffffffff8105d7ca
#19 [ffff88021f007e68] worker_thread at ffffffff81060513
#20 [ffff88021f007ee8] kthread at ffffffff810648b6
#21 [ffff88021f007f48] kernel_thread_helper at ffffffff813c40f4

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

tworaz pushed a commit to tworaz/linux that referenced this pull request Feb 13, 2012

mm: compaction: check pfn_valid when entering a new MAX_ORDER_NR_PAGE…
…S block during isolation for migration

commit 0bf380b upstream.

When isolating for migration, migration starts at the start of a zone
which is not necessarily pageblock aligned.  Further, it stops isolating
when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
not aligned.  This allows isolate_migratepages() to call pfn_to_page() on
an invalid PFN which can result in a crash.  This was originally reported
against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff000  EBX: 00000001  ECX: 00001600  EDX: 00000431
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned.  Lets say we have a case
like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond
the hole.  When isolate_migratepages started, it scans from migrate_pfn to
migrate_pfn+pageblock_nr_pages which is now in a memory hole.  It checks
pfn_valid() on the first PFN but then scans into the hole where there are
not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.

Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xXorAa pushed a commit to xXorAa/linux that referenced this pull request Feb 17, 2012

mm: compaction: check pfn_valid when entering a new MAX_ORDER_NR_PAGE…
…S block during isolation for migration

commit 0bf380b upstream.

When isolating for migration, migration starts at the start of a zone
which is not necessarily pageblock aligned.  Further, it stops isolating
when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
not aligned.  This allows isolate_migratepages() to call pfn_to_page() on
an invalid PFN which can result in a crash.  This was originally reported
against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff000  EBX: 00000001  ECX: 00001600  EDX: 00000431
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned.  Lets say we have a case
like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond
the hole.  When isolate_migratepages started, it scans from migrate_pfn to
migrate_pfn+pageblock_nr_pages which is now in a memory hole.  It checks
pfn_valid() on the first PFN but then scans into the hole where there are
not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.

Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

koct9i pushed a commit to koct9i/linux that referenced this pull request Feb 20, 2012

Frans Meulenbroeks Guenter Roeck
hwmon: (adt7411) fix checkpatch issues
fixed:
WARNING: please, no space before tabs
#11: FILE: adt7411.c:11:
+ * ^I  use power-down mode for suspend?, interrupt handling?$

not fixed as all other macros around it are the same structure and this one is only 2 chars longer:
WARNING: line over 80 characters
#229: FILE: adt7411.c:229:
+static ADT7411_BIT_ATTR(fast_sampling, ADT7411_REG_CFG3, ADT7411_CFG3_ADC_CLK_225);

Signed-off-by: Frans Meulenbroeks <fransmeulenbroeks@gmail.com>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>

koenkooi added a commit to koenkooi/linux that referenced this pull request Feb 23, 2012

mm: compaction: check pfn_valid when entering a new MAX_ORDER_NR_PAGE…
…S block during isolation for migration

commit 0bf380b upstream.

When isolating for migration, migration starts at the start of a zone
which is not necessarily pageblock aligned.  Further, it stops isolating
when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
not aligned.  This allows isolate_migratepages() to call pfn_to_page() on
an invalid PFN which can result in a crash.  This was originally reported
against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff000  EBX: 00000001  ECX: 00001600  EDX: 00000431
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned.  Lets say we have a case
like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond
the hole.  When isolate_migratepages started, it scans from migrate_pfn to
migrate_pfn+pageblock_nr_pages which is now in a memory hole.  It checks
pfn_valid() on the first PFN but then scans into the hole where there are
not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.

Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

koenkooi added a commit to koenkooi/linux that referenced this pull request Mar 1, 2012

mm: compaction: check pfn_valid when entering a new MAX_ORDER_NR_PAGE…
…S block during isolation for migration

commit 0bf380b upstream.

When isolating for migration, migration starts at the start of a zone
which is not necessarily pageblock aligned.  Further, it stops isolating
when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
not aligned.  This allows isolate_migratepages() to call pfn_to_page() on
an invalid PFN which can result in a crash.  This was originally reported
against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff000  EBX: 00000001  ECX: 00001600  EDX: 00000431
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned.  Lets say we have a case
like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond
the hole.  When isolate_migratepages started, it scans from migrate_pfn to
migrate_pfn+pageblock_nr_pages which is now in a memory hole.  It checks
pfn_valid() on the first PFN but then scans into the hole where there are
not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.

Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

koenkooi added a commit to koenkooi/linux that referenced this pull request Mar 19, 2012

mm: compaction: check pfn_valid when entering a new MAX_ORDER_NR_PAGE…
…S block during isolation for migration

commit 0bf380b upstream.

When isolating for migration, migration starts at the start of a zone
which is not necessarily pageblock aligned.  Further, it stops isolating
when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
not aligned.  This allows isolate_migratepages() to call pfn_to_page() on
an invalid PFN which can result in a crash.  This was originally reported
against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff000  EBX: 00000001  ECX: 00001600  EDX: 00000431
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned.  Lets say we have a case
like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond
the hole.  When isolate_migratepages started, it scans from migrate_pfn to
migrate_pfn+pageblock_nr_pages which is now in a memory hole.  It checks
pfn_valid() on the first PFN but then scans into the hole where there are
not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.

Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

koenkooi added a commit to koenkooi/linux that referenced this pull request Mar 22, 2012

mm: compaction: check pfn_valid when entering a new MAX_ORDER_NR_PAGE…
…S block during isolation for migration

commit 0bf380b upstream.

When isolating for migration, migration starts at the start of a zone
which is not necessarily pageblock aligned.  Further, it stops isolating
when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
not aligned.  This allows isolate_migratepages() to call pfn_to_page() on
an invalid PFN which can result in a crash.  This was originally reported
against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff000  EBX: 00000001  ECX: 00001600  EDX: 00000431
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned.  Lets say we have a case
like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond
the hole.  When isolate_migratepages started, it scans from migrate_pfn to
migrate_pfn+pageblock_nr_pages which is now in a memory hole.  It checks
pfn_valid() on the first PFN but then scans into the hole where there are
not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.

Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

koenkooi added a commit to koenkooi/linux that referenced this pull request Apr 2, 2012

mm: compaction: check pfn_valid when entering a new MAX_ORDER_NR_PAGE…
…S block during isolation for migration

commit 0bf380b upstream.

When isolating for migration, migration starts at the start of a zone
which is not necessarily pageblock aligned.  Further, it stops isolating
when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
not aligned.  This allows isolate_migratepages() to call pfn_to_page() on
an invalid PFN which can result in a crash.  This was originally reported
against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff000  EBX: 00000001  ECX: 00001600  EDX: 00000431
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned.  Lets say we have a case
like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond
the hole.  When isolate_migratepages started, it scans from migrate_pfn to
migrate_pfn+pageblock_nr_pages which is now in a memory hole.  It checks
pfn_valid() on the first PFN but then scans into the hole where there are
not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.

Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

koenkooi added a commit to koenkooi/linux that referenced this pull request Apr 9, 2012

mm: compaction: check pfn_valid when entering a new MAX_ORDER_NR_PAGE…
…S block during isolation for migration

commit 0bf380b upstream.

When isolating for migration, migration starts at the start of a zone
which is not necessarily pageblock aligned.  Further, it stops isolating
when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
not aligned.  This allows isolate_migratepages() to call pfn_to_page() on
an invalid PFN which can result in a crash.  This was originally reported
against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff000  EBX: 00000001  ECX: 00001600  EDX: 00000431
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned.  Lets say we have a case
like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond
the hole.  When isolate_migratepages started, it scans from migrate_pfn to
migrate_pfn+pageblock_nr_pages which is now in a memory hole.  It checks
pfn_valid() on the first PFN but then scans into the hole where there are
not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.

Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

koenkooi added a commit to koenkooi/linux that referenced this pull request Apr 11, 2012

mm: compaction: check pfn_valid when entering a new MAX_ORDER_NR_PAGE…
…S block during isolation for migration

commit 0bf380b upstream.

When isolating for migration, migration starts at the start of a zone
which is not necessarily pageblock aligned.  Further, it stops isolating
when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
not aligned.  This allows isolate_migratepages() to call pfn_to_page() on
an invalid PFN which can result in a crash.  This was originally reported
against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff000  EBX: 00000001  ECX: 00001600  EDX: 00000431
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned.  Lets say we have a case
like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond
the hole.  When isolate_migratepages started, it scans from migrate_pfn to
migrate_pfn+pageblock_nr_pages which is now in a memory hole.  It checks
pfn_valid() on the first PFN but then scans into the hole where there are
not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.

Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

koenkooi added a commit to koenkooi/linux that referenced this pull request Apr 12, 2012

mm: compaction: check pfn_valid when entering a new MAX_ORDER_NR_PAGE…
…S block during isolation for migration

commit 0bf380b upstream.

When isolating for migration, migration starts at the start of a zone
which is not necessarily pageblock aligned.  Further, it stops isolating
when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
not aligned.  This allows isolate_migratepages() to call pfn_to_page() on
an invalid PFN which can result in a crash.  This was originally reported
against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff000  EBX: 00000001  ECX: 00001600  EDX: 00000431
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned.  Lets say we have a case
like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond
the hole.  When isolate_migratepages started, it scans from migrate_pfn to
migrate_pfn+pageblock_nr_pages which is now in a memory hole.  It checks
pfn_valid() on the first PFN but then scans into the hole where there are
not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.

Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fengguang pushed a commit to 0day-ci/linux that referenced this pull request Jun 29, 2018

uart: fix race between uart_put_char() and uart_shutdown()
We have reports of the following crash:

    PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
    #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
    #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
    #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
    #3 [ffff88085c6db860] no_context at ffffffff81050b8f
    #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
    #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
    #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
    #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
    #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
    [exception RIP: uart_put_char+149]
    RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
    RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
    RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
    RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
    R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
    R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
    #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
    #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
    #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
    #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
    #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
    #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
    #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
    #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
    #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
    #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
    #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
    #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
    #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​

after slogging through some dissasembly:

ffffffff814b6720 <uart_put_char>:
ffffffff814b6720:	55                   	push   %rbp
ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
ffffffff814b6754:	00 00
ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
ffffffff814b675d:	00
ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
ffffffff814b676f:	00
ffffffff814b6770:	89 ca                	mov    %ecx,%edx
ffffffff814b6772:	f7 d2                	not    %edx
ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
ffffffff814b677b:	00
ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
ffffffff814b67a5:	c9                   	leaveq
ffffffff814b67a6:	c3                   	retq
ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
ffffffff814b67ae:	00
ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
ffffffff814b67c0:	00
ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
ffffffff814b67d1:	00
ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
ffffffff814b67db:	00 00 00 00 00

for our build, this is crashing at:

    circ->buf[circ->head] = c;

Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
protected by the "per-port mutex", which based on uart_port_check() is
state->port.mutex. Indeed, the lock acquired in uart_put_char() is
uport->lock, i.e. not the same lock.

Anyway, since the lock is not acquired, if uart_shutdown() is called, the
last chunk of that function may release state->xmit.buf before its assigned
to null, and cause the race above.

To fix it, let's lock uport->lock when allocating/deallocating
state->xmit.buf in addition to the per-port mutex.

v2: switch to locking uport->lock on allocation/deallocation instead of
    locking the per-port mutex in uart_put_char. Note that since
    uport->lock is a spin lock, we have to switch the allocation to
    GFP_ATOMIC.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>

fengguang pushed a commit to 0day-ci/linux that referenced this pull request Jun 30, 2018

Merge branch 'mlxsw-Add-resource-scale-tests'
Petr Machata says:

====================
mlxsw: Add resource scale tests

There are a number of tests that check features of the Linux networking
stack. By running them on suitable interfaces, one can exercise the
mlxsw offloading code. However none of these tests attempts to push
mlxsw to the limits supported by the ASIC.

As an additional wrinkle, the "limits supported by the ASIC" themselves
may not be a set of fixed numbers, but rather depend on a profile that
determines how the ASIC resources are allocated for different purposes.

This patchset introduces several tests that verify capability of mlxsw
to offload amounts of routes, flower rules, and mirroring sessions that
match predicted ASIC capacity, at different configuration profiles.
Additionally they verify that amounts exceeding the predicted capacity
can *not* be offloaded.

These are not generic tests, but ones that are tailored for mlxsw
specifically. For that reason they are not added to net/forwarding
selftests subdirectory, but rather to a newly-added drivers/net/mlxsw.

Patches #1, #2 and #3 tweak the generic forwarding/lib.sh to support the
new additions.

In patches #4 and #5, new libraries for interfacing with devlink are
introduced, first a generic one, then a Spectrum-specific one.

In patch #6, a devlink resource test is introduced.

Patches #7 and #8, #9 and #10, and #11 and #12 introduce three scale
tests: router, flower and mirror-to-gretap. The first of each pair of
patches introduces a generic portion of the test (mlxsw-specific), the
second introduces a Spectrum-specific wrapper.

Patch #13 then introduces a scale test driver that runs (possibly a
subset of) the tests introduced by patches from previous paragraph.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

fengguang pushed a commit to 0day-ci/linux that referenced this pull request Jul 2, 2018

qla2xxx: Fix NULL pointer derefrence for fcport search
Crash dump shows following instructions

crash> bt
PID: 0      TASK: ffffffffbe412480  CPU: 0   COMMAND: "swapper/0"
 #0 [ffff891ee0003868] machine_kexec at ffffffffbd063ef1
 #1 [ffff891ee00038c8] __crash_kexec at ffffffffbd12b6f2
 #2 [ffff891ee0003998] crash_kexec at ffffffffbd12c84c
 #3 [ffff891ee00039b8] oops_end at ffffffffbd030f0a
 #4 [ffff891ee00039e0] no_context at ffffffffbd074643
 #5 [ffff891ee0003a40] __bad_area_nosemaphore at ffffffffbd07496e
 #6 [ffff891ee0003a90] bad_area_nosemaphore at ffffffffbd074a64
 #7 [ffff891ee0003aa0] __do_page_fault at ffffffffbd074b0a
 #8 [ffff891ee0003b18] do_page_fault at ffffffffbd074fc8
 #9 [ffff891ee0003b50] page_fault at ffffffffbda01925
    [exception RIP: qlt_schedule_sess_for_deletion+15]
    RIP: ffffffffc02e526f  RSP: ffff891ee0003c08  RFLAGS: 00010046
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: ffffffffc0307847
    RDX: 00000000000020e6  RSI: ffff891edbc377c8  RDI: 0000000000000000
    RBP: ffff891ee0003c18   R8: ffffffffc02f0b20   R9: 0000000000000250
    R10: 0000000000000258  R11: 000000000000b780  R12: ffff891ed9b43000
    R13: 00000000000000f0  R14: 0000000000000006  R15: ffff891edbc377c8
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #10 [ffff891ee0003c20] qla2x00_fcport_event_handler at ffffffffc02853d3 [qla2xxx]
 #11 [ffff891ee0003cf0] __dta_qla24xx_async_gnl_sp_done_333 at ffffffffc0285a1d [qla2xxx]
 #12 [ffff891ee0003de8] qla24xx_process_response_queue at ffffffffc02a2eb5 [qla2xxx]
 #13 [ffff891ee0003e88] qla24xx_msix_rsp_q at ffffffffc02a5403 [qla2xxx]
 #14 [ffff891ee0003ec0] __handle_irq_event_percpu at ffffffffbd0f4c59
 #15 [ffff891ee0003f10] handle_irq_event_percpu at ffffffffbd0f4e02
 #16 [ffff891ee0003f40] handle_irq_event at ffffffffbd0f4e90
 #17 [ffff891ee0003f68] handle_edge_irq at ffffffffbd0f8984
 #18 [ffff891ee0003f88] handle_irq at ffffffffbd0305d5
 #19 [ffff891ee0003fb8] do_IRQ at ffffffffbda02a18
 --- <IRQ stack> ---
 #20 [ffffffffbe403d30] ret_from_intr at ffffffffbda0094e
    [exception RIP: unknown or invalid address]
    RIP: 000000000000001f  RSP: 0000000000000000  RFLAGS: fff3b8c2091ebb3f
    RAX: ffffbba5a0000200  RBX: 0000be8cdfa8f9fa  RCX: 0000000000000018
    RDX: 0000000000000101  RSI: 000000000000015d  RDI: 0000000000000193
    RBP: 0000000000000083   R8: ffffffffbe403e38   R9: 0000000000000002
    R10: 0000000000000000  R11: ffffffffbe56b820  R12: ffff891ee001cf00
    R13: ffffffffbd11c0a4  R14: ffffffffbe403d60  R15: 0000000000000001
    ORIG_RAX: ffff891ee0022ac0  CS: 0000  SS: ffffffffffffffb9
 bt: WARNING: possibly bogus exception frame
 #21 [ffffffffbe403dd8] cpuidle_enter_state at ffffffffbd67c6fd
 #22 [ffffffffbe403e40] cpuidle_enter at ffffffffbd67c907
 #23 [ffffffffbe403e50] call_cpuidle at ffffffffbd0d98f3
 #24 [ffffffffbe403e60] do_idle at ffffffffbd0d9b42
 #25 [ffffffffbe403e98] cpu_startup_entry at ffffffffbd0d9da3
 #26 [ffffffffbe403ec0] rest_init at ffffffffbd81d4aa
 #27 [ffffffffbe403ed0] start_kernel at ffffffffbe67d2ca
 #28 [ffffffffbe403f28] x86_64_start_reservations at ffffffffbe67c675
 #29 [ffffffffbe403f38] x86_64_start_kernel at ffffffffbe67c6eb
 #30 [ffffffffbe403f50] secondary_startup_64 at ffffffffbd0000d5

Fixes: 040036b ("scsi: qla2xxx: Delay loop id allocation at login")
Cc: <stable@vger.kernel.org> #4.17.0
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>

fengguang pushed a commit to 0day-ci/linux that referenced this pull request Jul 5, 2018

Thomas Gleixner Sebastian Andrzej Siewior
workqueue: Prevent workqueue versus ata-piix livelock
An Intel i7 system regularly detected rcu_preempt stalls after the kernel
was upgraded from 3.6-rt to 3.8-rt. When the stall happened, disk I/O was no
longer possible, unless the system was restarted.

The kernel message was:
INFO: rcu_preempt self-detected stall on CPU { 6}
[..]
NMI backtrace for cpu 6
CPU 6
Pid: 119, comm: irq/19-ata_piix Not tainted 3.8.13-rt13 #11 Shuttle Inc. SX58/SX58
RIP: 0010:[<ffffffff8124ca60>]  [<ffffffff8124ca60>] ip_compute_csum+0x30/0x30
RSP: 0018:ffff880333303cb0  EFLAGS: 00000002
RAX: 0000000000000006 RBX: 00000000000003e9 RCX: 0000000000000034
RDX: 0000000000000000 RSI: ffffffff81aa16d0 RDI: 0000000000000001
RBP: ffff880333303ce8 R08: ffffffff81aa16d0 R09: ffffffff81c1b8cc
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000005161f
R13: 0000000000000006 R14: ffffffff81aa16d0 R15: 0000000000000002
FS:  0000000000000000(0000) GS:ffff880333300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003c1b2bb420 CR3: 0000000001a0f000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process irq/19-ata_piix (pid: 119, threadinfo ffff88032d88a000, task ffff88032df80000)
Stack:
ffffffff8124cb32 000000000005161e 00000000000003e9 0000000000001000
0000000000009022 ffffffff81aa16d0 0000000000000002 ffff880333303cf8
ffffffff8124caa9 ffff880333303d08 ffffffff8124cad2 ffff880333303d28
Call Trace:
<IRQ>
[<ffffffff8124cb32>] ? delay_tsc+0x33/0xe3
[<ffffffff8124caa9>] __delay+0xf/0x11
[<ffffffff8124cad2>] __const_udelay+0x27/0x29
[<ffffffff8102d1fa>] native_safe_apic_wait_icr_idle+0x39/0x45
[<ffffffff8102dc9b>] __default_send_IPI_dest_field.constprop.0+0x1e/0x58
[<ffffffff8102dd1e>] default_send_IPI_mask_sequence_phys+0x49/0x7d
[<ffffffff81030326>] physflat_send_IPI_all+0x17/0x19
[<ffffffff8102de53>] arch_trigger_all_cpu_backtrace+0x50/0x79
[<ffffffff810b21d0>] rcu_check_callbacks+0x1cb/0x568
[<ffffffff81048c9c>] ? raise_softirq+0x2e/0x35
[<ffffffff81086be0>] ? tick_sched_do_timer+0x38/0x38
[<ffffffff8104f653>] update_process_times+0x44/0x55
[<ffffffff81086866>] tick_sched_handle+0x4a/0x59
[<ffffffff81086c1c>] tick_sched_timer+0x3c/0x5b
[<ffffffff81062845>] __run_hrtimer+0x9b/0x158
[<ffffffff810631d8>] hrtimer_interrupt+0x172/0x2aa
[<ffffffff8102d498>] smp_apic_timer_interrupt+0x76/0x89
[<ffffffff814d881d>] apic_timer_interrupt+0x6d/0x80
<EOI>
[<ffffffff81057cd2>] ? __local_lock_irqsave+0x17/0x4a
[<ffffffff81059336>] try_to_grab_pending+0x42/0x17e
[<ffffffff8105a699>] mod_delayed_work_on+0x32/0x88
[<ffffffff8105a70b>] mod_delayed_work+0x1c/0x1e
[<ffffffff8122ae84>] blk_run_queue_async+0x37/0x39
[<ffffffff81230985>] flush_end_io+0xf1/0x107
[<ffffffff8122e0da>] blk_finish_request+0x21e/0x264
[<ffffffff8122e162>] blk_end_bidi_request+0x42/0x60
[<ffffffff8122e1ba>] blk_end_request+0x10/0x12
[<ffffffff8132de46>] scsi_io_completion+0x1bf/0x492
[<ffffffff81335cec>] ? sd_done+0x298/0x2ef
[<ffffffff81325a02>] scsi_finish_command+0xe9/0xf2
[<ffffffff8132dbcb>] scsi_softirq_done+0x106/0x10f
[<ffffffff812333d3>] blk_done_softirq+0x77/0x87
[<ffffffff8104826f>] do_current_softirqs+0x172/0x2e1
[<ffffffff810aa820>] ? irq_thread_fn+0x3a/0x3a
[<ffffffff81048466>] local_bh_enable+0x43/0x72
[<ffffffff810aa866>] irq_forced_thread_fn+0x46/0x52
[<ffffffff810ab089>] irq_thread+0x8c/0x17c
[<ffffffff810ab179>] ? irq_thread+0x17c/0x17c
[<ffffffff810aaffd>] ? wake_threads_waitq+0x44/0x44
[<ffffffff8105eb18>] kthread+0x8d/0x95
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65
[<ffffffff814d7b7c>] ret_from_fork+0x7c/0xb0
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65

The state of softirqd of this CPU at the time of the crash was:
ksoftirqd/6     R  running task        0    53      2 0x00000000
ffff88032fc39d18 0000000000000046 ffff88033330c4c0 ffff8803303f4710
ffff88032fc39fd8 ffff88032fc39fd8 0000000000000000 0000000000062500
ffff88032df88000 ffff8803303f4710 0000000000000000 ffff88032fc38000
Call Trace:
[<ffffffff8105a3ae>] ? __queue_work+0x27c/0x27c
[<ffffffff814d178c>] preempt_schedule+0x61/0x76
[<ffffffff8106cccf>] migrate_enable+0xe5/0x1df
[<ffffffff8105a3ae>] ? __queue_work+0x27c/0x27c
[<ffffffff8104ef52>] run_timer_softirq+0x161/0x1d6
[<ffffffff8104826f>] do_current_softirqs+0x172/0x2e1
[<ffffffff8104840b>] run_ksoftirqd+0x2d/0x45
[<ffffffff8106658a>] smpboot_thread_fn+0x2ea/0x308
[<ffffffff810662a0>] ? test_ti_thread_flag+0xc/0xc
[<ffffffff810662a0>] ? test_ti_thread_flag+0xc/0xc
[<ffffffff8105eb18>] kthread+0x8d/0x95
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65
[<ffffffff814d7afc>] ret_from_fork+0x7c/0xb0
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65

Apparently, the softirq demon and the ata_piix IRQ handler were waiting
for each other to finish ending up in a livelock. After the below patch
was applied, the system no longer crashes.

Reported-by: Carsten Emde <C.Emde@osadl.org>
Proposed-by: Thomas Gleixner <tglx@linutronix.de>
Tested by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

tych0 added a commit to tych0/linux that referenced this pull request Jul 6, 2018

uart: fix race between uart_put_char() and uart_shutdown()
We have reports of the following crash:

    PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
    #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
    #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
    #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
    #3 [ffff88085c6db860] no_context at ffffffff81050b8f
    #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
    #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
    #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
    #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
    #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
    [exception RIP: uart_put_char+149]
    RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
    RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
    RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
    RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
    R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
    R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
    #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
    #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
    #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
    #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
    #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
    #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
    #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
    #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
    #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
    #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
    #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
    #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
    #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​

after slogging through some dissasembly:

ffffffff814b6720 <uart_put_char>:
ffffffff814b6720:	55                   	push   %rbp
ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
ffffffff814b6754:	00 00
ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
ffffffff814b675d:	00
ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
ffffffff814b676f:	00
ffffffff814b6770:	89 ca                	mov    %ecx,%edx
ffffffff814b6772:	f7 d2                	not    %edx
ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
ffffffff814b677b:	00
ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
ffffffff814b67a5:	c9                   	leaveq
ffffffff814b67a6:	c3                   	retq
ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
ffffffff814b67ae:	00
ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
ffffffff814b67c0:	00
ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
ffffffff814b67d1:	00
ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
ffffffff814b67db:	00 00 00 00 00

for our build, this is crashing at:

    circ->buf[circ->head] = c;

Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
protected by the "per-port mutex", which based on uart_port_check() is
state->port.mutex. Indeed, the lock acquired in uart_put_char() is
uport->lock, i.e. not the same lock.

Anyway, since the lock is not acquired, if uart_shutdown() is called, the
last chunk of that function may release state->xmit.buf before its assigned
to null, and cause the race above.

To fix it, let's lock uport->lock when allocating/deallocating
state->xmit.buf in addition to the per-port mutex.

v2: switch to locking uport->lock on allocation/deallocation instead of
    locking the per-port mutex in uart_put_char. Note that since
    uport->lock is a spin lock, we have to switch the allocation to
    GFP_ATOMIC.
v3: move the allocation outside the lock, so we can switch back to
    GFP_KERNEL

Signed-off-by: Tycho Andersen <tycho@tycho.ws>

fengguang pushed a commit to 0day-ci/linux that referenced this pull request Jul 6, 2018

uart: fix race between uart_put_char() and uart_shutdown()
We have reports of the following crash:

    PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
    #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
    #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
    #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
    #3 [ffff88085c6db860] no_context at ffffffff81050b8f
    #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
    #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
    #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
    #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
    #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
    [exception RIP: uart_put_char+149]
    RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
    RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
    RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
    RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
    R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
    R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
    #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
    #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
    #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
    #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
    #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
    #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
    #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
    #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
    #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
    #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
    #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
    #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
    #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​

after slogging through some dissasembly:

ffffffff814b6720 <uart_put_char>:
ffffffff814b6720:	55                   	push   %rbp
ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
ffffffff814b6754:	00 00
ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
ffffffff814b675d:	00
ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
ffffffff814b676f:	00
ffffffff814b6770:	89 ca                	mov    %ecx,%edx
ffffffff814b6772:	f7 d2                	not    %edx
ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
ffffffff814b677b:	00
ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
ffffffff814b67a5:	c9                   	leaveq
ffffffff814b67a6:	c3                   	retq
ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
ffffffff814b67ae:	00
ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
ffffffff814b67c0:	00
ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
ffffffff814b67d1:	00
ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
ffffffff814b67db:	00 00 00 00 00

for our build, this is crashing at:

    circ->buf[circ->head] = c;

Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
protected by the "per-port mutex", which based on uart_port_check() is
state->port.mutex. Indeed, the lock acquired in uart_put_char() is
uport->lock, i.e. not the same lock.

Anyway, since the lock is not acquired, if uart_shutdown() is called, the
last chunk of that function may release state->xmit.buf before its assigned
to null, and cause the race above.

To fix it, let's lock uport->lock when allocating/deallocating
state->xmit.buf in addition to the per-port mutex.

v2: switch to locking uport->lock on allocation/deallocation instead of
    locking the per-port mutex in uart_put_char. Note that since
    uport->lock is a spin lock, we have to switch the allocation to
    GFP_ATOMIC.
v3: move the allocation outside the lock, so we can switch back to
    GFP_KERNEL

Signed-off-by: Tycho Andersen <tycho@tycho.ws>

Reichl pushed a commit to Reichl/linux-odroid that referenced this pull request Jul 9, 2018

clk: qcom: gcc-msm8996: Disable halt check on UFS tx clock
Commit 12d807c ("clk: qcom: gcc-msm8996: Disable halt check on UFS
clocks") marked BRANCH_HALT_SKIP for ufs rx clocks, but missed ufs tx
clocks. The result of that is kernel warnings at reboot:

[  105.624283] gcc_ufs_tx_symbol_0_clk status stuck at 'on'
[  105.624311] WARNING: CPU: 1 PID: 1 at drivers/clk/qcom/clk-branch.c:100 clk_branch_toggle+0x190/0x1b0
[  105.633235] Modules linked in:
[  105.645118] CPU: 1 PID: 1 Comm: systemd-shutdow Tainted: G        W         4.18.0-rc2-00002-g2bfbe52a53a3 #11
[  105.647988] Hardware name: Qualcomm Technologies, Inc. DB820c (DT)
[  105.657966] pstate: 60000085 (nZCv daIf -PAN -UAO)
[  105.664127] pc : clk_branch_toggle+0x190/0x1b0
[  105.668900] lr : clk_branch_toggle+0x190/0x1b0
[  105.673324] sp : ffff00000805bb40
[  105.677751] x29: ffff00000805bb40 x28: 0000000000000000
[  105.681140] x27: ffff8000d947cc60 x26: 0000000000000001
[  105.686520] x25: ffff000008f71900 x24: 0000000000000000
[  105.691816] x23: ffff00000925e338 x22: ffff00000855f8e0
[  105.697114] x21: 0000000000000000 x20: 0000000000000000
[  105.702407] x19: ffff0000091c9000 x18: ffffffffffffffff
[  105.707702] x17: 0000ffffac148c58 x16: ffff000008b82928
[  105.712998] x15: ffff0000091c96c8 x14: ffff0000893817c7
[  105.718293] x13: ffff0000093817d5 x12: ffff0000091c9940
[  105.723587] x11: ffff0000085e3e70 x10: ffff00000805b780
[  105.728884] x9 : ffff00000805bb40 x8 : 7320737574617473
[  105.734179] x7 : 206b6c635f305f6c x6 : 00000000000001e5
[  105.739472] x5 : 0000000000000000 x4 : 0000000000000000
[  105.744769] x3 : ffffffffffffffff x2 : ffff0000091e2658
[  105.750063] x1 : a7c4712dd5e09c00 x0 : 0000000000000000
[  105.755360] Call trace:
[  105.760652]  clk_branch_toggle+0x190/0x1b0
[  105.762824]  clk_branch2_disable+0x18/0x20
[  105.766994]  clk_core_disable+0x58/0xa8
[  105.771069]  clk_core_disable_lock+0x20/0x38
[  105.774803]  clk_disable+0x1c/0x28
[  105.779320]  __ufshcd_setup_clocks+0x298/0x308
[  105.782529]  ufshcd_suspend+0x160/0x308
[  105.786953]  ufshcd_shutdown+0x38/0xa0
[  105.790690]  ufshcd_pltfrm_shutdown+0x10/0x18
[  105.794512]  platform_drv_shutdown+0x20/0x30
[  105.798935]  device_shutdown+0x110/0x1e8
[  105.803278]  kernel_restart_prepare+0x34/0x40
[  105.807181]  kernel_restart+0x14/0x78
[  105.811434]  sys_reboot+0x200/0x248
[  105.815081]  el0_svc_naked+0x30/0x34
[  105.818378] ---[ end trace 8d2322276b27879c ]---

Mark gcc_ufs_tx_symbol_0_clk as BRANCH_HALT_SKIP as well.

Fixes: 12d807c ("clk: qcom: gcc-msm8996: Disable halt check on UFS clocks")
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>

fengguang pushed a commit to 0day-ci/linux that referenced this pull request Jul 11, 2018

ipc: reorganize initialization of kern_ipc_perm.id
Patch series "ipc: cleanups & bugfixes, rhashtable update", v2.

I have merged patches from Dmitry, Davidlohr and myself:

- patch #1-#6: Fix syzcall findings & further race cleanups
- patch #7: Cleanup from Dmitry for ipc_idr_alloc.
- patch #8-#11: rhashtable improvement from Davidlohr
- patch #12: Another cleanup for ipc_idr_alloc.

This patch (of 12):

ipc_addid() initializes kern_ipc_perm.id after having called
ipc_idr_alloc().

Thus a parallel semop() or msgrcv() that uses ipc_obtain_object_idr() may
see an uninitialized value.

The patch moves all accesses to kern_ipc_perm.id under the spin_lock().

The issues is related to the finding of
syzbot+2827ef6b3385deb07eaf@syzkaller.appspotmail.com: syzbot found an
issue with kern_ipc_perm.seq

Link: http://lkml.kernel.org/r/20180709151019.1336-2-manfred@colorfullife.com
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>

tych0 added a commit to tych0/linux that referenced this pull request Jul 11, 2018

uart: fix race between uart_put_char() and uart_shutdown()
We have reports of the following crash:

    PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
    #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
    #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
    #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
    #3 [ffff88085c6db860] no_context at ffffffff81050b8f
    #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
    #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
    #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
    #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
    #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
    [exception RIP: uart_put_char+149]
    RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
    RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
    RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
    RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
    R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
    R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
    #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
    #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
    #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
    #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
    #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
    #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
    #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
    #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
    #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
    #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
    #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
    #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
    #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​

after slogging through some dissasembly:

ffffffff814b6720 <uart_put_char>:
ffffffff814b6720:	55                   	push   %rbp
ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
ffffffff814b6754:	00 00
ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
ffffffff814b675d:	00
ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
ffffffff814b676f:	00
ffffffff814b6770:	89 ca                	mov    %ecx,%edx
ffffffff814b6772:	f7 d2                	not    %edx
ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
ffffffff814b677b:	00
ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
ffffffff814b67a5:	c9                   	leaveq
ffffffff814b67a6:	c3                   	retq
ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
ffffffff814b67ae:	00
ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
ffffffff814b67c0:	00
ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
ffffffff814b67d1:	00
ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
ffffffff814b67db:	00 00 00 00 00

for our build, this is crashing at:

    circ->buf[circ->head] = c;

Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
protected by the "per-port mutex", which based on uart_port_check() is
state->port.mutex. Indeed, the lock acquired in uart_put_char() is
uport->lock, i.e. not the same lock.

Anyway, since the lock is not acquired, if uart_shutdown() is called, the
last chunk of that function may release state->xmit.buf before its assigned
to null, and cause the race above.

To fix it, let's lock uport->lock when allocating/deallocating
state->xmit.buf in addition to the per-port mutex.

v2: switch to locking uport->lock on allocation/deallocation instead of
    locking the per-port mutex in uart_put_char. Note that since
    uport->lock is a spin lock, we have to switch the allocation to
    GFP_ATOMIC.
v3: move the allocation outside the lock, so we can switch back to
    GFP_KERNEL
v4: * switch to positive condition of state->xmit.buf in
      uart_port_startup()
    * declare `flags` on its own line
    * use the result of uart_port_lock() in uart_shutdown() to avoid
      uninitialized warning
    * don't use the uart_port_lock/unlock macros in uart_port_startup,
      instead test against uport directly; the compiler can't seem to "see"
      through the macros/ref/unref calls to not warn about uninitialized
      flags. We don't need to do a ref here since we hold the per-port
      mutex anyway.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>

fengguang pushed a commit to 0day-ci/linux that referenced this pull request Jul 11, 2018

scsi: qla2xxx: Fix NULL pointer dereference for fcport search
Crash dump shows following instructions

crash> bt
PID: 0      TASK: ffffffffbe412480  CPU: 0   COMMAND: "swapper/0"
 #0 [ffff891ee0003868] machine_kexec at ffffffffbd063ef1
 #1 [ffff891ee00038c8] __crash_kexec at ffffffffbd12b6f2
 #2 [ffff891ee0003998] crash_kexec at ffffffffbd12c84c
 #3 [ffff891ee00039b8] oops_end at ffffffffbd030f0a
 #4 [ffff891ee00039e0] no_context at ffffffffbd074643
 #5 [ffff891ee0003a40] __bad_area_nosemaphore at ffffffffbd07496e
 #6 [ffff891ee0003a90] bad_area_nosemaphore at ffffffffbd074a64
 #7 [ffff891ee0003aa0] __do_page_fault at ffffffffbd074b0a
 #8 [ffff891ee0003b18] do_page_fault at ffffffffbd074fc8
 #9 [ffff891ee0003b50] page_fault at ffffffffbda01925
    [exception RIP: qlt_schedule_sess_for_deletion+15]
    RIP: ffffffffc02e526f  RSP: ffff891ee0003c08  RFLAGS: 00010046
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: ffffffffc0307847
    RDX: 00000000000020e6  RSI: ffff891edbc377c8  RDI: 0000000000000000
    RBP: ffff891ee0003c18   R8: ffffffffc02f0b20   R9: 0000000000000250
    R10: 0000000000000258  R11: 000000000000b780  R12: ffff891ed9b43000
    R13: 00000000000000f0  R14: 0000000000000006  R15: ffff891edbc377c8
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #10 [ffff891ee0003c20] qla2x00_fcport_event_handler at ffffffffc02853d3 [qla2xxx]
 #11 [ffff891ee0003cf0] __dta_qla24xx_async_gnl_sp_done_333 at ffffffffc0285a1d [qla2xxx]
 #12 [ffff891ee0003de8] qla24xx_process_response_queue at ffffffffc02a2eb5 [qla2xxx]
 #13 [ffff891ee0003e88] qla24xx_msix_rsp_q at ffffffffc02a5403 [qla2xxx]
 #14 [ffff891ee0003ec0] __handle_irq_event_percpu at ffffffffbd0f4c59
 #15 [ffff891ee0003f10] handle_irq_event_percpu at ffffffffbd0f4e02
 #16 [ffff891ee0003f40] handle_irq_event at ffffffffbd0f4e90
 #17 [ffff891ee0003f68] handle_edge_irq at ffffffffbd0f8984
 #18 [ffff891ee0003f88] handle_irq at ffffffffbd0305d5
 #19 [ffff891ee0003fb8] do_IRQ at ffffffffbda02a18
 --- <IRQ stack> ---
 #20 [ffffffffbe403d30] ret_from_intr at ffffffffbda0094e
    [exception RIP: unknown or invalid address]
    RIP: 000000000000001f  RSP: 0000000000000000  RFLAGS: fff3b8c2091ebb3f
    RAX: ffffbba5a0000200  RBX: 0000be8cdfa8f9fa  RCX: 0000000000000018
    RDX: 0000000000000101  RSI: 000000000000015d  RDI: 0000000000000193
    RBP: 0000000000000083   R8: ffffffffbe403e38   R9: 0000000000000002
    R10: 0000000000000000  R11: ffffffffbe56b820  R12: ffff891ee001cf00
    R13: ffffffffbd11c0a4  R14: ffffffffbe403d60  R15: 0000000000000001
    ORIG_RAX: ffff891ee0022ac0  CS: 0000  SS: ffffffffffffffb9
 bt: WARNING: possibly bogus exception frame
 #21 [ffffffffbe403dd8] cpuidle_enter_state at ffffffffbd67c6fd
 #22 [ffffffffbe403e40] cpuidle_enter at ffffffffbd67c907
 #23 [ffffffffbe403e50] call_cpuidle at ffffffffbd0d98f3
 #24 [ffffffffbe403e60] do_idle at ffffffffbd0d9b42
 #25 [ffffffffbe403e98] cpu_startup_entry at ffffffffbd0d9da3
 #26 [ffffffffbe403ec0] rest_init at ffffffffbd81d4aa
 #27 [ffffffffbe403ed0] start_kernel at ffffffffbe67d2ca
 #28 [ffffffffbe403f28] x86_64_start_reservations at ffffffffbe67c675
 #29 [ffffffffbe403f38] x86_64_start_kernel at ffffffffbe67c6eb
 #30 [ffffffffbe403f50] secondary_startup_64 at ffffffffbd0000d5

Fixes: 040036b ("scsi: qla2xxx: Delay loop id allocation at login")
Cc: <stable@vger.kernel.org> # v4.17+
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

fengguang pushed a commit to 0day-ci/linux that referenced this pull request Jul 11, 2018

Thomas Gleixner Sebastian Andrzej Siewior
workqueue: Prevent workqueue versus ata-piix livelock
An Intel i7 system regularly detected rcu_preempt stalls after the kernel
was upgraded from 3.6-rt to 3.8-rt. When the stall happened, disk I/O was no
longer possible, unless the system was restarted.

The kernel message was:
INFO: rcu_preempt self-detected stall on CPU { 6}
[..]
NMI backtrace for cpu 6
CPU 6
Pid: 119, comm: irq/19-ata_piix Not tainted 3.8.13-rt13 #11 Shuttle Inc. SX58/SX58
RIP: 0010:[<ffffffff8124ca60>]  [<ffffffff8124ca60>] ip_compute_csum+0x30/0x30
RSP: 0018:ffff880333303cb0  EFLAGS: 00000002
RAX: 0000000000000006 RBX: 00000000000003e9 RCX: 0000000000000034
RDX: 0000000000000000 RSI: ffffffff81aa16d0 RDI: 0000000000000001
RBP: ffff880333303ce8 R08: ffffffff81aa16d0 R09: ffffffff81c1b8cc
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000005161f
R13: 0000000000000006 R14: ffffffff81aa16d0 R15: 0000000000000002
FS:  0000000000000000(0000) GS:ffff880333300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003c1b2bb420 CR3: 0000000001a0f000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process irq/19-ata_piix (pid: 119, threadinfo ffff88032d88a000, task ffff88032df80000)
Stack:
ffffffff8124cb32 000000000005161e 00000000000003e9 0000000000001000
0000000000009022 ffffffff81aa16d0 0000000000000002 ffff880333303cf8
ffffffff8124caa9 ffff880333303d08 ffffffff8124cad2 ffff880333303d28
Call Trace:
<IRQ>
[<ffffffff8124cb32>] ? delay_tsc+0x33/0xe3
[<ffffffff8124caa9>] __delay+0xf/0x11
[<ffffffff8124cad2>] __const_udelay+0x27/0x29
[<ffffffff8102d1fa>] native_safe_apic_wait_icr_idle+0x39/0x45
[<ffffffff8102dc9b>] __default_send_IPI_dest_field.constprop.0+0x1e/0x58
[<ffffffff8102dd1e>] default_send_IPI_mask_sequence_phys+0x49/0x7d
[<ffffffff81030326>] physflat_send_IPI_all+0x17/0x19
[<ffffffff8102de53>] arch_trigger_all_cpu_backtrace+0x50/0x79
[<ffffffff810b21d0>] rcu_check_callbacks+0x1cb/0x568
[<ffffffff81048c9c>] ? raise_softirq+0x2e/0x35
[<ffffffff81086be0>] ? tick_sched_do_timer+0x38/0x38
[<ffffffff8104f653>] update_process_times+0x44/0x55
[<ffffffff81086866>] tick_sched_handle+0x4a/0x59
[<ffffffff81086c1c>] tick_sched_timer+0x3c/0x5b
[<ffffffff81062845>] __run_hrtimer+0x9b/0x158
[<ffffffff810631d8>] hrtimer_interrupt+0x172/0x2aa
[<ffffffff8102d498>] smp_apic_timer_interrupt+0x76/0x89
[<ffffffff814d881d>] apic_timer_interrupt+0x6d/0x80
<EOI>
[<ffffffff81057cd2>] ? __local_lock_irqsave+0x17/0x4a
[<ffffffff81059336>] try_to_grab_pending+0x42/0x17e
[<ffffffff8105a699>] mod_delayed_work_on+0x32/0x88
[<ffffffff8105a70b>] mod_delayed_work+0x1c/0x1e
[<ffffffff8122ae84>] blk_run_queue_async+0x37/0x39
[<ffffffff81230985>] flush_end_io+0xf1/0x107
[<ffffffff8122e0da>] blk_finish_request+0x21e/0x264
[<ffffffff8122e162>] blk_end_bidi_request+0x42/0x60
[<ffffffff8122e1ba>] blk_end_request+0x10/0x12
[<ffffffff8132de46>] scsi_io_completion+0x1bf/0x492
[<ffffffff81335cec>] ? sd_done+0x298/0x2ef
[<ffffffff81325a02>] scsi_finish_command+0xe9/0xf2
[<ffffffff8132dbcb>] scsi_softirq_done+0x106/0x10f
[<ffffffff812333d3>] blk_done_softirq+0x77/0x87
[<ffffffff8104826f>] do_current_softirqs+0x172/0x2e1
[<ffffffff810aa820>] ? irq_thread_fn+0x3a/0x3a
[<ffffffff81048466>] local_bh_enable+0x43/0x72
[<ffffffff810aa866>] irq_forced_thread_fn+0x46/0x52
[<ffffffff810ab089>] irq_thread+0x8c/0x17c
[<ffffffff810ab179>] ? irq_thread+0x17c/0x17c
[<ffffffff810aaffd>] ? wake_threads_waitq+0x44/0x44
[<ffffffff8105eb18>] kthread+0x8d/0x95
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65
[<ffffffff814d7b7c>] ret_from_fork+0x7c/0xb0
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65

The state of softirqd of this CPU at the time of the crash was:
ksoftirqd/6     R  running task        0    53      2 0x00000000
ffff88032fc39d18 0000000000000046 ffff88033330c4c0 ffff8803303f4710
ffff88032fc39fd8 ffff88032fc39fd8 0000000000000000 0000000000062500
ffff88032df88000 ffff8803303f4710 0000000000000000 ffff88032fc38000
Call Trace:
[<ffffffff8105a3ae>] ? __queue_work+0x27c/0x27c
[<ffffffff814d178c>] preempt_schedule+0x61/0x76
[<ffffffff8106cccf>] migrate_enable+0xe5/0x1df
[<ffffffff8105a3ae>] ? __queue_work+0x27c/0x27c
[<ffffffff8104ef52>] run_timer_softirq+0x161/0x1d6
[<ffffffff8104826f>] do_current_softirqs+0x172/0x2e1
[<ffffffff8104840b>] run_ksoftirqd+0x2d/0x45
[<ffffffff8106658a>] smpboot_thread_fn+0x2ea/0x308
[<ffffffff810662a0>] ? test_ti_thread_flag+0xc/0xc
[<ffffffff810662a0>] ? test_ti_thread_flag+0xc/0xc
[<ffffffff8105eb18>] kthread+0x8d/0x95
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65
[<ffffffff814d7afc>] ret_from_fork+0x7c/0xb0
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65

Apparently, the softirq demon and the ata_piix IRQ handler were waiting
for each other to finish ending up in a livelock. After the below patch
was applied, the system no longer crashes.

Reported-by: Carsten Emde <C.Emde@osadl.org>
Proposed-by: Thomas Gleixner <tglx@linutronix.de>
Tested by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

fengguang pushed a commit to 0day-ci/linux that referenced this pull request Jul 11, 2018

uart: fix race between uart_put_char() and uart_shutdown()
We have reports of the following crash:

    PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
    #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
    #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
    #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
    #3 [ffff88085c6db860] no_context at ffffffff81050b8f
    #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
    #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
    #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
    #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
    #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
    [exception RIP: uart_put_char+149]
    RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
    RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
    RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
    RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
    R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
    R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
    #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
    #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
    #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
    #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
    #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
    #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
    #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
    #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
    #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
    #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
    #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
    #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
    #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​

after slogging through some dissasembly:

ffffffff814b6720 <uart_put_char>:
ffffffff814b6720:	55                   	push   %rbp
ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
ffffffff814b6754:	00 00
ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
ffffffff814b675d:	00
ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
ffffffff814b676f:	00
ffffffff814b6770:	89 ca                	mov    %ecx,%edx
ffffffff814b6772:	f7 d2                	not    %edx
ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
ffffffff814b677b:	00
ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
ffffffff814b67a5:	c9                   	leaveq
ffffffff814b67a6:	c3                   	retq
ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
ffffffff814b67ae:	00
ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
ffffffff814b67c0:	00
ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
ffffffff814b67d1:	00
ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
ffffffff814b67db:	00 00 00 00 00

for our build, this is crashing at:

    circ->buf[circ->head] = c;

Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
protected by the "per-port mutex", which based on uart_port_check() is
state->port.mutex. Indeed, the lock acquired in uart_put_char() is
uport->lock, i.e. not the same lock.

Anyway, since the lock is not acquired, if uart_shutdown() is called, the
last chunk of that function may release state->xmit.buf before its assigned
to null, and cause the race above.

To fix it, let's lock uport->lock when allocating/deallocating
state->xmit.buf in addition to the per-port mutex.

v2: switch to locking uport->lock on allocation/deallocation instead of
    locking the per-port mutex in uart_put_char. Note that since
    uport->lock is a spin lock, we have to switch the allocation to
    GFP_ATOMIC.
v3: move the allocation outside the lock, so we can switch back to
    GFP_KERNEL
v4: * switch to positive condition of state->xmit.buf in
      uart_port_startup()
    * declare `flags` on its own line
    * use the result of uart_port_lock() in uart_shutdown() to avoid
      uninitialized warning
    * don't use the uart_port_lock/unlock macros in uart_port_startup,
      instead test against uport directly; the compiler can't seem to "see"
      through the macros/ref/unref calls to not warn about uninitialized
      flags. We don't need to do a ref here since we hold the per-port
      mutex anyway.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>

Reichl pushed a commit to Reichl/linux-odroid that referenced this pull request Jul 12, 2018

ipc: reorganize initialization of kern_ipc_perm.id
Patch series "ipc: cleanups & bugfixes, rhashtable update", v2.

I have merged patches from Dmitry, Davidlohr and myself:

- patch #1-#6: Fix syzcall findings & further race cleanups
- patch #7: Cleanup from Dmitry for ipc_idr_alloc.
- patch #8-#11: rhashtable improvement from Davidlohr
- patch #12: Another cleanup for ipc_idr_alloc.

This patch (of 12):

ipc_addid() initializes kern_ipc_perm.id after having called
ipc_idr_alloc().

Thus a parallel semop() or msgrcv() that uses ipc_obtain_object_idr() may
see an uninitialized value.

The patch moves all accesses to kern_ipc_perm.id under the spin_lock().

The issues is related to the finding of
syzbot+2827ef6b3385deb07eaf@syzkaller.appspotmail.com: syzbot found an
issue with kern_ipc_perm.seq

Link: http://lkml.kernel.org/r/20180709151019.1336-2-manfred@colorfullife.com
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>

tych0 added a commit to tych0/linux that referenced this pull request Jul 12, 2018

uart: fix race between uart_put_char() and uart_shutdown()
We have reports of the following crash:

    PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
    #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
    #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
    #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
    #3 [ffff88085c6db860] no_context at ffffffff81050b8f
    #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
    #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
    #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
    #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
    #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
    [exception RIP: uart_put_char+149]
    RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
    RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
    RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
    RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
    R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
    R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
    #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
    #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
    #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
    #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
    #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
    #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
    #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
    #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
    #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
    #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
    #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
    #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
    #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​

after slogging through some dissasembly:

ffffffff814b6720 <uart_put_char>:
ffffffff814b6720:	55                   	push   %rbp
ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
ffffffff814b6754:	00 00
ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
ffffffff814b675d:	00
ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
ffffffff814b676f:	00
ffffffff814b6770:	89 ca                	mov    %ecx,%edx
ffffffff814b6772:	f7 d2                	not    %edx
ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
ffffffff814b677b:	00
ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
ffffffff814b67a5:	c9                   	leaveq
ffffffff814b67a6:	c3                   	retq
ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
ffffffff814b67ae:	00
ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
ffffffff814b67c0:	00
ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
ffffffff814b67d1:	00
ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
ffffffff814b67db:	00 00 00 00 00

for our build, this is crashing at:

    circ->buf[circ->head] = c;

Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
protected by the "per-port mutex", which based on uart_port_check() is
state->port.mutex. Indeed, the lock acquired in uart_put_char() is
uport->lock, i.e. not the same lock.

Anyway, since the lock is not acquired, if uart_shutdown() is called, the
last chunk of that function may release state->xmit.buf before its assigned
to null, and cause the race above.

To fix it, let's lock uport->lock when allocating/deallocating
state->xmit.buf in addition to the per-port mutex.

v2: switch to locking uport->lock on allocation/deallocation instead of
    locking the per-port mutex in uart_put_char. Note that since
    uport->lock is a spin lock, we have to switch the allocation to
    GFP_ATOMIC.
v3: move the allocation outside the lock, so we can switch back to
    GFP_KERNEL
v4: * switch to positive condition of state->xmit.buf in
      uart_port_startup()
    * declare `flags` on its own line
    * use the result of uart_port_lock() in uart_shutdown() to avoid
      uninitialized warning
    * don't use the uart_port_lock/unlock macros in uart_port_startup,
      instead test against uport directly; the compiler can't seem to "see"
      through the macros/ref/unref calls to not warn about uninitialized
      flags. We don't need to do a ref here since we hold the per-port
      mutex anyway.
v5: use uart_port_lock/unlock, but free the page if it is unused outside of
    the critical section

Signed-off-by: Tycho Andersen <tycho@tycho.ws>

fengguang pushed a commit to 0day-ci/linux that referenced this pull request Jul 13, 2018

uart: fix race between uart_put_char() and uart_shutdown()
On Thu, Jul 12, 2018 at 05:40:15PM +0200, Greg Kroah-Hartman wrote:
> On Thu, Jul 12, 2018 at 09:08:22AM -0600, Tycho Andersen wrote:
> > On Thu, Jul 12, 2018 at 05:04:38PM +0200, Greg Kroah-Hartman wrote:
> > > On Wed, Jul 11, 2018 at 10:07:44AM -0600, Tycho Andersen wrote:
> > > > +	if (uport)
> > > > +		spin_lock_irqsave(&uport->lock, flags);
> > >
> > > That's the same thing as just calling uart_port_lock(), why aren't you
> > > doing that?
> >
> > Because the compiler can't seem to "see" through the macros/ref calls,
> > and I get the warning I mentioned here if I use them:
> >
> > https://lkml.org/lkml/2018/7/6/840
>
> What horrible version of gcc are you using that give you that?  Don't
> open-code things just because of a broken compiler.

I've tried with both 7.3.0 and 5.4.0. I think the reason we see this
here but not elsewhere in the file is because there's an actual
function call (free_page()) in the critical section.

If we move that out, something like the below patch, it all works for
me.

Tycho

From 532e82ceec67142b71c60ce74ce08c5339195a94 Mon Sep 17 00:00:00 2001
From: Tycho Andersen <tycho@tycho.ws>
Date: Mon, 4 Jun 2018 09:55:09 -0600
Subject: [PATCH] uart: fix race between uart_put_char() and uart_shutdown()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

We have reports of the following crash:

    PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
    #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
    #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
    #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
    #3 [ffff88085c6db860] no_context at ffffffff81050b8f
    #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
    #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
    #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
    #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
    #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
    [exception RIP: uart_put_char+149]
    RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
    RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
    RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
    RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
    R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
    R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
    #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
    #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
    #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
    #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
    #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
    #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
    #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
    #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
    #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
    #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
    #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
    #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
    #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​

after slogging through some dissasembly:

ffffffff814b6720 <uart_put_char>:
ffffffff814b6720:	55                   	push   %rbp
ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
ffffffff814b6754:	00 00
ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
ffffffff814b675d:	00
ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
ffffffff814b676f:	00
ffffffff814b6770:	89 ca                	mov    %ecx,%edx
ffffffff814b6772:	f7 d2                	not    %edx
ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
ffffffff814b677b:	00
ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
ffffffff814b67a5:	c9                   	leaveq
ffffffff814b67a6:	c3                   	retq
ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
ffffffff814b67ae:	00
ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
ffffffff814b67c0:	00
ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
ffffffff814b67d1:	00
ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
ffffffff814b67db:	00 00 00 00 00

for our build, this is crashing at:

    circ->buf[circ->head] = c;

Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
protected by the "per-port mutex", which based on uart_port_check() is
state->port.mutex. Indeed, the lock acquired in uart_put_char() is
uport->lock, i.e. not the same lock.

Anyway, since the lock is not acquired, if uart_shutdown() is called, the
last chunk of that function may release state->xmit.buf before its assigned
to null, and cause the race above.

To fix it, let's lock uport->lock when allocating/deallocating
state->xmit.buf in addition to the per-port mutex.

v2: switch to locking uport->lock on allocation/deallocation instead of
    locking the per-port mutex in uart_put_char. Note that since
    uport->lock is a spin lock, we have to switch the allocation to
    GFP_ATOMIC.
v3: move the allocation outside the lock, so we can switch back to
    GFP_KERNEL
v4: * switch to positive condition of state->xmit.buf in
      uart_port_startup()
    * declare `flags` on its own line
    * use the result of uart_port_lock() in uart_shutdown() to avoid
      uninitialized warning
    * don't use the uart_port_lock/unlock macros in uart_port_startup,
      instead test against uport directly; the compiler can't seem to "see"
      through the macros/ref/unref calls to not warn about uninitialized
      flags. We don't need to do a ref here since we hold the per-port
      mutex anyway.
v5: use uart_port_lock/unlock, but free the page outside the critical
    section if it is unused

Signed-off-by: Tycho Andersen <tycho@tycho.ws>

Reichl pushed a commit to Reichl/linux-odroid that referenced this pull request Jul 13, 2018

ipc: reorganize initialization of kern_ipc_perm.id
Patch series "ipc: cleanups & bugfixes, rhashtable update", v2.

I have merged patches from Dmitry, Davidlohr and myself:

- patch #1-#6: Fix syzcall findings & further race cleanups
- patch #7: Cleanup from Dmitry for ipc_idr_alloc.
- patch #8-#11: rhashtable improvement from Davidlohr
- patch #12: Another cleanup for ipc_idr_alloc.

This patch (of 12):

ipc_addid() initializes kern_ipc_perm.id after having called
ipc_idr_alloc().

Thus a parallel semop() or msgrcv() that uses ipc_obtain_object_idr() may
see an uninitialized value.

The patch moves all accesses to kern_ipc_perm.id under the spin_lock().

The issues is related to the finding of
syzbot+2827ef6b3385deb07eaf@syzkaller.appspotmail.com: syzbot found an
issue with kern_ipc_perm.seq

Link: http://lkml.kernel.org/r/20180709151019.1336-2-manfred@colorfullife.com
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>

fengguang pushed a commit to 0day-ci/linux that referenced this pull request Jul 13, 2018

uart: fix race between uart_put_char() and uart_shutdown()
We have reports of the following crash:

    PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
    #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
    #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
    #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
    #3 [ffff88085c6db860] no_context at ffffffff81050b8f
    #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
    #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
    #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
    #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
    #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
    [exception RIP: uart_put_char+149]
    RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
    RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
    RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
    RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
    R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
    R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
    #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
    #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
    #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
    #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
    #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
    #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
    #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
    #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
    #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
    #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
    #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
    #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
    #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​

after slogging through some dissasembly:

ffffffff814b6720 <uart_put_char>:
ffffffff814b6720:	55                   	push   %rbp
ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
ffffffff814b6754:	00 00
ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
ffffffff814b675d:	00
ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
ffffffff814b676f:	00
ffffffff814b6770:	89 ca                	mov    %ecx,%edx
ffffffff814b6772:	f7 d2                	not    %edx
ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
ffffffff814b677b:	00
ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
ffffffff814b67a5:	c9                   	leaveq
ffffffff814b67a6:	c3                   	retq
ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
ffffffff814b67ae:	00
ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
ffffffff814b67c0:	00
ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
ffffffff814b67d1:	00
ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
ffffffff814b67db:	00 00 00 00 00

for our build, this is crashing at:

    circ->buf[circ->head] = c;

Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
protected by the "per-port mutex", which based on uart_port_check() is
state->port.mutex. Indeed, the lock acquired in uart_put_char() is
uport->lock, i.e. not the same lock.

Anyway, since the lock is not acquired, if uart_shutdown() is called, the
last chunk of that function may release state->xmit.buf before its assigned
to null, and cause the race above.

To fix it, let's lock uport->lock when allocating/deallocating
state->xmit.buf in addition to the per-port mutex.

v2: switch to locking uport->lock on allocation/deallocation instead of
    locking the per-port mutex in uart_put_char. Note that since
    uport->lock is a spin lock, we have to switch the allocation to
    GFP_ATOMIC.
v3: move the allocation outside the lock, so we can switch back to
    GFP_KERNEL

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fengguang pushed a commit to 0day-ci/linux that referenced this pull request Jul 13, 2018

perf tools: Fix struct comm_str removal crash
We occasionaly hit following assert failure in perf top,
when processing the /proc info in multiple threads.

  perf: ...include/linux/refcount.h:109: refcount_inc:
        Assertion `!(!refcount_inc_not_zero(r))' failed.

The gdb backtrace looks like this:

  [Switching to Thread 0x7ffff11ba700 (LWP 13749)]
  0x00007ffff50839fb in raise () from /lib64/libc.so.6
  (gdb)
  #0  0x00007ffff50839fb in raise () from /lib64/libc.so.6
  #1  0x00007ffff5085800 in abort () from /lib64/libc.so.6
  #2  0x00007ffff507c0da in __assert_fail_base () from /lib64/libc.so.6
  #3  0x00007ffff507c152 in __assert_fail () from /lib64/libc.so.6
  #4  0x0000000000535373 in refcount_inc (r=0x7fffdc009be0)
      at ...include/linux/refcount.h:109
  #5  0x00000000005354f1 in comm_str__get (cs=0x7fffdc009bc0)
      at util/comm.c:24
  #6  0x00000000005356bd in __comm_str__findnew (str=0x7fffd000b260 ":2",
      root=0xbed5c0 <comm_str_root>) at util/comm.c:72
  #7  0x000000000053579e in comm_str__findnew (str=0x7fffd000b260 ":2",
      root=0xbed5c0 <comm_str_root>) at util/comm.c:95
  #8  0x000000000053582e in comm__new (str=0x7fffd000b260 ":2",
      timestamp=0, exec=false) at util/comm.c:111
  #9  0x00000000005363bc in thread__new (pid=2, tid=2) at util/thread.c:57
  #10 0x0000000000523da0 in ____machine__findnew_thread (machine=0xbfde38,
      threads=0xbfdf28, pid=2, tid=2, create=true) at util/machine.c:457
  #11 0x0000000000523eb4 in __machine__findnew_thread (machine=0xbfde38,
  ...

The failing assertion is this one:

  REFCOUNT_WARN(!refcount_inc_not_zero(r), ...

The problem is that we keep global comm_str_root list, which
is accessed by multiple threads during the perf top startup
and following 2 paths can race:

  thread 1:
    ...
    thread__new
      comm__new
        comm_str__findnew
          down_write(&comm_str_lock);
          __comm_str__findnew
            comm_str__get

  thread 2:
    ...
    comm__override or comm__free
      comm_str__put
        refcount_dec_and_test
          down_write(&comm_str_lock);
          rb_erase(&cs->rb_node, &comm_str_root);

Because thread 2 first decrements the refcnt and only after then it
removes the struct comm_str from the list, the thread 1 can find this
object on the list with refcnt equls to 0 and hit the assert.

This patch fixes the thread 2 path, by removing the struct comm_str
FIRST from the list and only AFTER calling comm_str__put on it. This
way the thread 1 finds only valid objects on the list.

We also need to ensure now, that only one caller removes the struct
comm_str, from the list. Adding 'removed' bool to track that.

Link: http://lkml.kernel.org/n/tip-vrizt6sw1lu1ybsrl9l0wwln@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>

jan-kiszka pushed a commit to siemens/linux that referenced this pull request Jul 25, 2018

Thomas Gleixner Sebastian Andrzej Siewior
workqueue: Prevent workqueue versus ata-piix livelock
An Intel i7 system regularly detected rcu_preempt stalls after the kernel
was upgraded from 3.6-rt to 3.8-rt. When the stall happened, disk I/O was no
longer possible, unless the system was restarted.

The kernel message was:
INFO: rcu_preempt self-detected stall on CPU { 6}
[..]
NMI backtrace for cpu 6
CPU 6
Pid: 119, comm: irq/19-ata_piix Not tainted 3.8.13-rt13 #11 Shuttle Inc. SX58/SX58
RIP: 0010:[<ffffffff8124ca60>]  [<ffffffff8124ca60>] ip_compute_csum+0x30/0x30
RSP: 0018:ffff880333303cb0  EFLAGS: 00000002
RAX: 0000000000000006 RBX: 00000000000003e9 RCX: 0000000000000034
RDX: 0000000000000000 RSI: ffffffff81aa16d0 RDI: 0000000000000001
RBP: ffff880333303ce8 R08: ffffffff81aa16d0 R09: ffffffff81c1b8cc
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000005161f
R13: 0000000000000006 R14: ffffffff81aa16d0 R15: 0000000000000002
FS:  0000000000000000(0000) GS:ffff880333300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003c1b2bb420 CR3: 0000000001a0f000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process irq/19-ata_piix (pid: 119, threadinfo ffff88032d88a000, task ffff88032df80000)
Stack:
ffffffff8124cb32 000000000005161e 00000000000003e9 0000000000001000
0000000000009022 ffffffff81aa16d0 0000000000000002 ffff880333303cf8
ffffffff8124caa9 ffff880333303d08 ffffffff8124cad2 ffff880333303d28
Call Trace:
<IRQ>
[<ffffffff8124cb32>] ? delay_tsc+0x33/0xe3
[<ffffffff8124caa9>] __delay+0xf/0x11
[<ffffffff8124cad2>] __const_udelay+0x27/0x29
[<ffffffff8102d1fa>] native_safe_apic_wait_icr_idle+0x39/0x45
[<ffffffff8102dc9b>] __default_send_IPI_dest_field.constprop.0+0x1e/0x58
[<ffffffff8102dd1e>] default_send_IPI_mask_sequence_phys+0x49/0x7d
[<ffffffff81030326>] physflat_send_IPI_all+0x17/0x19
[<ffffffff8102de53>] arch_trigger_all_cpu_backtrace+0x50/0x79
[<ffffffff810b21d0>] rcu_check_callbacks+0x1cb/0x568
[<ffffffff81048c9c>] ? raise_softirq+0x2e/0x35
[<ffffffff81086be0>] ? tick_sched_do_timer+0x38/0x38
[<ffffffff8104f653>] update_process_times+0x44/0x55
[<ffffffff81086866>] tick_sched_handle+0x4a/0x59
[<ffffffff81086c1c>] tick_sched_timer+0x3c/0x5b
[<ffffffff81062845>] __run_hrtimer+0x9b/0x158
[<ffffffff810631d8>] hrtimer_interrupt+0x172/0x2aa
[<ffffffff8102d498>] smp_apic_timer_interrupt+0x76/0x89
[<ffffffff814d881d>] apic_timer_interrupt+0x6d/0x80
<EOI>
[<ffffffff81057cd2>] ? __local_lock_irqsave+0x17/0x4a
[<ffffffff81059336>] try_to_grab_pending+0x42/0x17e
[<ffffffff8105a699>] mod_delayed_work_on+0x32/0x88
[<ffffffff8105a70b>] mod_delayed_work+0x1c/0x1e
[<ffffffff8122ae84>] blk_run_queue_async+0x37/0x39
[<ffffffff81230985>] flush_end_io+0xf1/0x107
[<ffffffff8122e0da>] blk_finish_request+0x21e/0x264
[<ffffffff8122e162>] blk_end_bidi_request+0x42/0x60
[<ffffffff8122e1ba>] blk_end_request+0x10/0x12
[<ffffffff8132de46>] scsi_io_completion+0x1bf/0x492
[<ffffffff81335cec>] ? sd_done+0x298/0x2ef
[<ffffffff81325a02>] scsi_finish_command+0xe9/0xf2
[<ffffffff8132dbcb>] scsi_softirq_done+0x106/0x10f
[<ffffffff812333d3>] blk_done_softirq+0x77/0x87
[<ffffffff8104826f>] do_current_softirqs+0x172/0x2e1
[<ffffffff810aa820>] ? irq_thread_fn+0x3a/0x3a
[<ffffffff81048466>] local_bh_enable+0x43/0x72
[<ffffffff810aa866>] irq_forced_thread_fn+0x46/0x52
[<ffffffff810ab089>] irq_thread+0x8c/0x17c
[<ffffffff810ab179>] ? irq_thread+0x17c/0x17c
[<ffffffff810aaffd>] ? wake_threads_waitq+0x44/0x44
[<ffffffff8105eb18>] kthread+0x8d/0x95
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65
[<ffffffff814d7b7c>] ret_from_fork+0x7c/0xb0
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65

The state of softirqd of this CPU at the time of the crash was:
ksoftirqd/6     R  running task        0    53      2 0x00000000
ffff88032fc39d18 0000000000000046 ffff88033330c4c0 ffff8803303f4710
ffff88032fc39fd8 ffff88032fc39fd8 0000000000000000 0000000000062500
ffff88032df88000 ffff8803303f4710 0000000000000000 ffff88032fc38000
Call Trace:
[<ffffffff8105a3ae>] ? __queue_work+0x27c/0x27c
[<ffffffff814d178c>] preempt_schedule+0x61/0x76
[<ffffffff8106cccf>] migrate_enable+0xe5/0x1df
[<ffffffff8105a3ae>] ? __queue_work+0x27c/0x27c
[<ffffffff8104ef52>] run_timer_softirq+0x161/0x1d6
[<ffffffff8104826f>] do_current_softirqs+0x172/0x2e1
[<ffffffff8104840b>] run_ksoftirqd+0x2d/0x45
[<ffffffff8106658a>] smpboot_thread_fn+0x2ea/0x308
[<ffffffff810662a0>] ? test_ti_thread_flag+0xc/0xc
[<ffffffff810662a0>] ? test_ti_thread_flag+0xc/0xc
[<ffffffff8105eb18>] kthread+0x8d/0x95
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65
[<ffffffff814d7afc>] ret_from_fork+0x7c/0xb0
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65

Apparently, the softirq demon and the ata_piix IRQ handler were waiting
for each other to finish ending up in a livelock. After the below patch
was applied, the system no longer crashes.

Reported-by: Carsten Emde <C.Emde@osadl.org>
Proposed-by: Thomas Gleixner <tglx@linutronix.de>
Tested by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

fengguang pushed a commit to 0day-ci/linux that referenced this pull request Jul 25, 2018

Jiri Olsa Arnaldo Carvalho de Melo
perf tools: Fix struct comm_str removal crash
We occasionaly hit following assert failure in 'perf top', when processing the
/proc info in multiple threads.

  perf: ...include/linux/refcount.h:109: refcount_inc:
        Assertion `!(!refcount_inc_not_zero(r))' failed.

The gdb backtrace looks like this:

  [Switching to Thread 0x7ffff11ba700 (LWP 13749)]
  0x00007ffff50839fb in raise () from /lib64/libc.so.6
  (gdb)
  #0  0x00007ffff50839fb in raise () from /lib64/libc.so.6
  #1  0x00007ffff5085800 in abort () from /lib64/libc.so.6
  #2  0x00007ffff507c0da in __assert_fail_base () from /lib64/libc.so.6
  #3  0x00007ffff507c152 in __assert_fail () from /lib64/libc.so.6
  #4  0x0000000000535373 in refcount_inc (r=0x7fffdc009be0)
      at ...include/linux/refcount.h:109
  #5  0x00000000005354f1 in comm_str__get (cs=0x7fffdc009bc0)
      at util/comm.c:24
  #6  0x00000000005356bd in __comm_str__findnew (str=0x7fffd000b260 ":2",
      root=0xbed5c0 <comm_str_root>) at util/comm.c:72
  #7  0x000000000053579e in comm_str__findnew (str=0x7fffd000b260 ":2",
      root=0xbed5c0 <comm_str_root>) at util/comm.c:95
  #8  0x000000000053582e in comm__new (str=0x7fffd000b260 ":2",
      timestamp=0, exec=false) at util/comm.c:111
  #9  0x00000000005363bc in thread__new (pid=2, tid=2) at util/thread.c:57
  #10 0x0000000000523da0 in ____machine__findnew_thread (machine=0xbfde38,
      threads=0xbfdf28, pid=2, tid=2, create=true) at util/machine.c:457
  #11 0x0000000000523eb4 in __machine__findnew_thread (machine=0xbfde38,
  ...

The failing assertion is this one:

  REFCOUNT_WARN(!refcount_inc_not_zero(r), ...

The problem is that we keep global comm_str_root list, which
is accessed by multiple threads during the 'perf top' startup
and following 2 paths can race:

  thread 1:
    ...
    thread__new
      comm__new
        comm_str__findnew
          down_write(&comm_str_lock);
          __comm_str__findnew
            comm_str__get

  thread 2:
    ...
    comm__override or comm__free
      comm_str__put
        refcount_dec_and_test
          down_write(&comm_str_lock);
          rb_erase(&cs->rb_node, &comm_str_root);

Because thread 2 first decrements the refcnt and only after then it removes the
struct comm_str from the list, the thread 1 can find this object on the list
with refcnt equls to 0 and hit the assert.

This patch fixes the thread 1 __comm_str__findnew path, by ignoring objects
that already dropped the refcnt to 0. For the rest of the objects we take the
refcnt before comparing its name and release it afterwards with comm_str__put,
which can also release the object completely.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Lukasz Odzioba <lukasz.odzioba@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20180720101740.GA27176@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

jmarcet pushed a commit to jmarcet/linux that referenced this pull request Jul 26, 2018

scsi: qla2xxx: Fix NULL pointer dereference for fcport search
commit 36eb8ff upstream.

Crash dump shows following instructions

crash> bt
PID: 0      TASK: ffffffffbe412480  CPU: 0   COMMAND: "swapper/0"
 #0 [ffff891ee0003868] machine_kexec at ffffffffbd063ef1
 #1 [ffff891ee00038c8] __crash_kexec at ffffffffbd12b6f2
 #2 [ffff891ee0003998] crash_kexec at ffffffffbd12c84c
 #3 [ffff891ee00039b8] oops_end at ffffffffbd030f0a
 #4 [ffff891ee00039e0] no_context at ffffffffbd074643
 #5 [ffff891ee0003a40] __bad_area_nosemaphore at ffffffffbd07496e
 #6 [ffff891ee0003a90] bad_area_nosemaphore at ffffffffbd074a64
 #7 [ffff891ee0003aa0] __do_page_fault at ffffffffbd074b0a
 #8 [ffff891ee0003b18] do_page_fault at ffffffffbd074fc8
 #9 [ffff891ee0003b50] page_fault at ffffffffbda01925
    [exception RIP: qlt_schedule_sess_for_deletion+15]
    RIP: ffffffffc02e526f  RSP: ffff891ee0003c08  RFLAGS: 00010046
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: ffffffffc0307847
    RDX: 00000000000020e6  RSI: ffff891edbc377c8  RDI: 0000000000000000
    RBP: ffff891ee0003c18   R8: ffffffffc02f0b20   R9: 0000000000000250
    R10: 0000000000000258  R11: 000000000000b780  R12: ffff891ed9b43000
    R13: 00000000000000f0  R14: 0000000000000006  R15: ffff891edbc377c8
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #10 [ffff891ee0003c20] qla2x00_fcport_event_handler at ffffffffc02853d3 [qla2xxx]
 #11 [ffff891ee0003cf0] __dta_qla24xx_async_gnl_sp_done_333 at ffffffffc0285a1d [qla2xxx]
 #12 [ffff891ee0003de8] qla24xx_process_response_queue at ffffffffc02a2eb5 [qla2xxx]
 #13 [ffff891ee0003e88] qla24xx_msix_rsp_q at ffffffffc02a5403 [qla2xxx]
 #14 [ffff891ee0003ec0] __handle_irq_event_percpu at ffffffffbd0f4c59
 #15 [ffff891ee0003f10] handle_irq_event_percpu at ffffffffbd0f4e02
 #16 [ffff891ee0003f40] handle_irq_event at ffffffffbd0f4e90
 #17 [ffff891ee0003f68] handle_edge_irq at ffffffffbd0f8984
 #18 [ffff891ee0003f88] handle_irq at ffffffffbd0305d5
 #19 [ffff891ee0003fb8] do_IRQ at ffffffffbda02a18
 --- <IRQ stack> ---
 #20 [ffffffffbe403d30] ret_from_intr at ffffffffbda0094e
    [exception RIP: unknown or invalid address]
    RIP: 000000000000001f  RSP: 0000000000000000  RFLAGS: fff3b8c2091ebb3f
    RAX: ffffbba5a0000200  RBX: 0000be8cdfa8f9fa  RCX: 0000000000000018
    RDX: 0000000000000101  RSI: 000000000000015d  RDI: 0000000000000193
    RBP: 0000000000000083   R8: ffffffffbe403e38   R9: 0000000000000002
    R10: 0000000000000000  R11: ffffffffbe56b820  R12: ffff891ee001cf00
    R13: ffffffffbd11c0a4  R14: ffffffffbe403d60  R15: 0000000000000001
    ORIG_RAX: ffff891ee0022ac0  CS: 0000  SS: ffffffffffffffb9
 bt: WARNING: possibly bogus exception frame
 #21 [ffffffffbe403dd8] cpuidle_enter_state at ffffffffbd67c6fd
 #22 [ffffffffbe403e40] cpuidle_enter at ffffffffbd67c907
 #23 [ffffffffbe403e50] call_cpuidle at ffffffffbd0d98f3
 #24 [ffffffffbe403e60] do_idle at ffffffffbd0d9b42
 #25 [ffffffffbe403e98] cpu_startup_entry at ffffffffbd0d9da3
 #26 [ffffffffbe403ec0] rest_init at ffffffffbd81d4aa
 #27 [ffffffffbe403ed0] start_kernel at ffffffffbe67d2ca
 #28 [ffffffffbe403f28] x86_64_start_reservations at ffffffffbe67c675
 #29 [ffffffffbe403f38] x86_64_start_kernel at ffffffffbe67c6eb
 #30 [ffffffffbe403f50] secondary_startup_64 at ffffffffbd0000d5

Fixes: 040036b ("scsi: qla2xxx: Delay loop id allocation at login")
Cc: <stable@vger.kernel.org> # v4.17+
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fengguang pushed a commit to 0day-ci/linux that referenced this pull request Jul 27, 2018

Thomas Gleixner Sebastian Andrzej Siewior
workqueue: Prevent workqueue versus ata-piix livelock
An Intel i7 system regularly detected rcu_preempt stalls after the kernel
was upgraded from 3.6-rt to 3.8-rt. When the stall happened, disk I/O was no
longer possible, unless the system was restarted.

The kernel message was:
INFO: rcu_preempt self-detected stall on CPU { 6}
[..]
NMI backtrace for cpu 6
CPU 6
Pid: 119, comm: irq/19-ata_piix Not tainted 3.8.13-rt13 #11 Shuttle Inc. SX58/SX58
RIP: 0010:[<ffffffff8124ca60>]  [<ffffffff8124ca60>] ip_compute_csum+0x30/0x30
RSP: 0018:ffff880333303cb0  EFLAGS: 00000002
RAX: 0000000000000006 RBX: 00000000000003e9 RCX: 0000000000000034
RDX: 0000000000000000 RSI: ffffffff81aa16d0 RDI: 0000000000000001
RBP: ffff880333303ce8 R08: ffffffff81aa16d0 R09: ffffffff81c1b8cc
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000005161f
R13: 0000000000000006 R14: ffffffff81aa16d0 R15: 0000000000000002
FS:  0000000000000000(0000) GS:ffff880333300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003c1b2bb420 CR3: 0000000001a0f000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process irq/19-ata_piix (pid: 119, threadinfo ffff88032d88a000, task ffff88032df80000)
Stack:
ffffffff8124cb32 000000000005161e 00000000000003e9 0000000000001000
0000000000009022 ffffffff81aa16d0 0000000000000002 ffff880333303cf8
ffffffff8124caa9 ffff880333303d08 ffffffff8124cad2 ffff880333303d28
Call Trace:
<IRQ>
[<ffffffff8124cb32>] ? delay_tsc+0x33/0xe3
[<ffffffff8124caa9>] __delay+0xf/0x11
[<ffffffff8124cad2>] __const_udelay+0x27/0x29
[<ffffffff8102d1fa>] native_safe_apic_wait_icr_idle+0x39/0x45
[<ffffffff8102dc9b>] __default_send_IPI_dest_field.constprop.0+0x1e/0x58
[<ffffffff8102dd1e>] default_send_IPI_mask_sequence_phys+0x49/0x7d
[<ffffffff81030326>] physflat_send_IPI_all+0x17/0x19
[<ffffffff8102de53>] arch_trigger_all_cpu_backtrace+0x50/0x79
[<ffffffff810b21d0>] rcu_check_callbacks+0x1cb/0x568
[<ffffffff81048c9c>] ? raise_softirq+0x2e/0x35
[<ffffffff81086be0>] ? tick_sched_do_timer+0x38/0x38
[<ffffffff8104f653>] update_process_times+0x44/0x55
[<ffffffff81086866>] tick_sched_handle+0x4a/0x59
[<ffffffff81086c1c>] tick_sched_timer+0x3c/0x5b
[<ffffffff81062845>] __run_hrtimer+0x9b/0x158
[<ffffffff810631d8>] hrtimer_interrupt+0x172/0x2aa
[<ffffffff8102d498>] smp_apic_timer_interrupt+0x76/0x89
[<ffffffff814d881d>] apic_timer_interrupt+0x6d/0x80
<EOI>
[<ffffffff81057cd2>] ? __local_lock_irqsave+0x17/0x4a
[<ffffffff81059336>] try_to_grab_pending+0x42/0x17e
[<ffffffff8105a699>] mod_delayed_work_on+0x32/0x88
[<ffffffff8105a70b>] mod_delayed_work+0x1c/0x1e
[<ffffffff8122ae84>] blk_run_queue_async+0x37/0x39
[<ffffffff81230985>] flush_end_io+0xf1/0x107
[<ffffffff8122e0da>] blk_finish_request+0x21e/0x264
[<ffffffff8122e162>] blk_end_bidi_request+0x42/0x60
[<ffffffff8122e1ba>] blk_end_request+0x10/0x12
[<ffffffff8132de46>] scsi_io_completion+0x1bf/0x492
[<ffffffff81335cec>] ? sd_done+0x298/0x2ef
[<ffffffff81325a02>] scsi_finish_command+0xe9/0xf2
[<ffffffff8132dbcb>] scsi_softirq_done+0x106/0x10f
[<ffffffff812333d3>] blk_done_softirq+0x77/0x87
[<ffffffff8104826f>] do_current_softirqs+0x172/0x2e1
[<ffffffff810aa820>] ? irq_thread_fn+0x3a/0x3a
[<ffffffff81048466>] local_bh_enable+0x43/0x72
[<ffffffff810aa866>] irq_forced_thread_fn+0x46/0x52
[<ffffffff810ab089>] irq_thread+0x8c/0x17c
[<ffffffff810ab179>] ? irq_thread+0x17c/0x17c
[<ffffffff810aaffd>] ? wake_threads_waitq+0x44/0x44
[<ffffffff8105eb18>] kthread+0x8d/0x95
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65
[<ffffffff814d7b7c>] ret_from_fork+0x7c/0xb0
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65

The state of softirqd of this CPU at the time of the crash was:
ksoftirqd/6     R  running task        0    53      2 0x00000000
ffff88032fc39d18 0000000000000046 ffff88033330c4c0 ffff8803303f4710
ffff88032fc39fd8 ffff88032fc39fd8 0000000000000000 0000000000062500
ffff88032df88000 ffff8803303f4710 0000000000000000 ffff88032fc38000
Call Trace:
[<ffffffff8105a3ae>] ? __queue_work+0x27c/0x27c
[<ffffffff814d178c>] preempt_schedule+0x61/0x76
[<ffffffff8106cccf>] migrate_enable+0xe5/0x1df
[<ffffffff8105a3ae>] ? __queue_work+0x27c/0x27c
[<ffffffff8104ef52>] run_timer_softirq+0x161/0x1d6
[<ffffffff8104826f>] do_current_softirqs+0x172/0x2e1
[<ffffffff8104840b>] run_ksoftirqd+0x2d/0x45
[<ffffffff8106658a>] smpboot_thread_fn+0x2ea/0x308
[<ffffffff810662a0>] ? test_ti_thread_flag+0xc/0xc
[<ffffffff810662a0>] ? test_ti_thread_flag+0xc/0xc
[<ffffffff8105eb18>] kthread+0x8d/0x95
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65
[<ffffffff814d7afc>] ret_from_fork+0x7c/0xb0
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65

Apparently, the softirq demon and the ata_piix IRQ handler were waiting
for each other to finish ending up in a livelock. After the below patch
was applied, the system no longer crashes.

Reported-by: Carsten Emde <C.Emde@osadl.org>
Proposed-by: Thomas Gleixner <tglx@linutronix.de>
Tested by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

fengguang pushed a commit to 0day-ci/linux that referenced this pull request Aug 1, 2018

Thomas Gleixner Sebastian Andrzej Siewior
workqueue: Prevent workqueue versus ata-piix livelock
An Intel i7 system regularly detected rcu_preempt stalls after the kernel
was upgraded from 3.6-rt to 3.8-rt. When the stall happened, disk I/O was no
longer possible, unless the system was restarted.

The kernel message was:
INFO: rcu_preempt self-detected stall on CPU { 6}
[..]
NMI backtrace for cpu 6
CPU 6
Pid: 119, comm: irq/19-ata_piix Not tainted 3.8.13-rt13 #11 Shuttle Inc. SX58/SX58
RIP: 0010:[<ffffffff8124ca60>]  [<ffffffff8124ca60>] ip_compute_csum+0x30/0x30
RSP: 0018:ffff880333303cb0  EFLAGS: 00000002
RAX: 0000000000000006 RBX: 00000000000003e9 RCX: 0000000000000034
RDX: 0000000000000000 RSI: ffffffff81aa16d0 RDI: 0000000000000001
RBP: ffff880333303ce8 R08: ffffffff81aa16d0 R09: ffffffff81c1b8cc
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000005161f
R13: 0000000000000006 R14: ffffffff81aa16d0 R15: 0000000000000002
FS:  0000000000000000(0000) GS:ffff880333300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003c1b2bb420 CR3: 0000000001a0f000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process irq/19-ata_piix (pid: 119, threadinfo ffff88032d88a000, task ffff88032df80000)
Stack:
ffffffff8124cb32 000000000005161e 00000000000003e9 0000000000001000
0000000000009022 ffffffff81aa16d0 0000000000000002 ffff880333303cf8
ffffffff8124caa9 ffff880333303d08 ffffffff8124cad2 ffff880333303d28
Call Trace:
<IRQ>
[<ffffffff8124cb32>] ? delay_tsc+0x33/0xe3
[<ffffffff8124caa9>] __delay+0xf/0x11
[<ffffffff8124cad2>] __const_udelay+0x27/0x29
[<ffffffff8102d1fa>] native_safe_apic_wait_icr_idle+0x39/0x45
[<ffffffff8102dc9b>] __default_send_IPI_dest_field.constprop.0+0x1e/0x58
[<ffffffff8102dd1e>] default_send_IPI_mask_sequence_phys+0x49/0x7d
[<ffffffff81030326>] physflat_send_IPI_all+0x17/0x19
[<ffffffff8102de53>] arch_trigger_all_cpu_backtrace+0x50/0x79
[<ffffffff810b21d0>] rcu_check_callbacks+0x1cb/0x568
[<ffffffff81048c9c>] ? raise_softirq+0x2e/0x35
[<ffffffff81086be0>] ? tick_sched_do_timer+0x38/0x38
[<ffffffff8104f653>] update_process_times+0x44/0x55
[<ffffffff81086866>] tick_sched_handle+0x4a/0x59
[<ffffffff81086c1c>] tick_sched_timer+0x3c/0x5b
[<ffffffff81062845>] __run_hrtimer+0x9b/0x158
[<ffffffff810631d8>] hrtimer_interrupt+0x172/0x2aa
[<ffffffff8102d498>] smp_apic_timer_interrupt+0x76/0x89
[<ffffffff814d881d>] apic_timer_interrupt+0x6d/0x80
<EOI>
[<ffffffff81057cd2>] ? __local_lock_irqsave+0x17/0x4a
[<ffffffff81059336>] try_to_grab_pending+0x42/0x17e
[<ffffffff8105a699>] mod_delayed_work_on+0x32/0x88
[<ffffffff8105a70b>] mod_delayed_work+0x1c/0x1e
[<ffffffff8122ae84>] blk_run_queue_async+0x37/0x39
[<ffffffff81230985>] flush_end_io+0xf1/0x107
[<ffffffff8122e0da>] blk_finish_request+0x21e/0x264
[<ffffffff8122e162>] blk_end_bidi_request+0x42/0x60
[<ffffffff8122e1ba>] blk_end_request+0x10/0x12
[<ffffffff8132de46>] scsi_io_completion+0x1bf/0x492
[<ffffffff81335cec>] ? sd_done+0x298/0x2ef
[<ffffffff81325a02>] scsi_finish_command+0xe9/0xf2
[<ffffffff8132dbcb>] scsi_softirq_done+0x106/0x10f
[<ffffffff812333d3>] blk_done_softirq+0x77/0x87
[<ffffffff8104826f>] do_current_softirqs+0x172/0x2e1
[<ffffffff810aa820>] ? irq_thread_fn+0x3a/0x3a
[<ffffffff81048466>] local_bh_enable+0x43/0x72
[<ffffffff810aa866>] irq_forced_thread_fn+0x46/0x52
[<ffffffff810ab089>] irq_thread+0x8c/0x17c
[<ffffffff810ab179>] ? irq_thread+0x17c/0x17c
[<ffffffff810aaffd>] ? wake_threads_waitq+0x44/0x44
[<ffffffff8105eb18>] kthread+0x8d/0x95
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65
[<ffffffff814d7b7c>] ret_from_fork+0x7c/0xb0
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65

The state of softirqd of this CPU at the time of the crash was:
ksoftirqd/6     R  running task        0    53      2 0x00000000
ffff88032fc39d18 0000000000000046 ffff88033330c4c0 ffff8803303f4710
ffff88032fc39fd8 ffff88032fc39fd8 0000000000000000 0000000000062500
ffff88032df88000 ffff8803303f4710 0000000000000000 ffff88032fc38000
Call Trace:
[<ffffffff8105a3ae>] ? __queue_work+0x27c/0x27c
[<ffffffff814d178c>] preempt_schedule+0x61/0x76
[<ffffffff8106cccf>] migrate_enable+0xe5/0x1df
[<ffffffff8105a3ae>] ? __queue_work+0x27c/0x27c
[<ffffffff8104ef52>] run_timer_softirq+0x161/0x1d6
[<ffffffff8104826f>] do_current_softirqs+0x172/0x2e1
[<ffffffff8104840b>] run_ksoftirqd+0x2d/0x45
[<ffffffff8106658a>] smpboot_thread_fn+0x2ea/0x308
[<ffffffff810662a0>] ? test_ti_thread_flag+0xc/0xc
[<ffffffff810662a0>] ? test_ti_thread_flag+0xc/0xc
[<ffffffff8105eb18>] kthread+0x8d/0x95
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65
[<ffffffff814d7afc>] ret_from_fork+0x7c/0xb0
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65

Apparently, the softirq demon and the ata_piix IRQ handler were waiting
for each other to finish ending up in a livelock. After the below patch
was applied, the system no longer crashes.

Reported-by: Carsten Emde <C.Emde@osadl.org>
Proposed-by: Thomas Gleixner <tglx@linutronix.de>
Tested by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

tombriden pushed a commit to tombriden/linux that referenced this pull request Sep 9, 2018

uart: fix race between uart_put_char() and uart_shutdown()
commit a5ba1d9 upstream.

We have reports of the following crash:

    PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
    #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
    #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
    #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
    #3 [ffff88085c6db860] no_context at ffffffff81050b8f
    #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
    #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
    #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
    #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
    #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
    [exception RIP: uart_put_char+149]
    RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
    RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
    RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
    RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
    R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
    R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
    #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
    #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
    #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
    #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
    #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
    #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
    #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
    #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
    #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
    #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
    #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
    #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
    #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​

after slogging through some dissasembly:

ffffffff814b6720 <uart_put_char>:
ffffffff814b6720:	55                   	push   %rbp
ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
ffffffff814b6754:	00 00
ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
ffffffff814b675d:	00
ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
ffffffff814b676f:	00
ffffffff814b6770:	89 ca                	mov    %ecx,%edx
ffffffff814b6772:	f7 d2                	not    %edx
ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
ffffffff814b677b:	00
ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
ffffffff814b67a5:	c9                   	leaveq
ffffffff814b67a6:	c3                   	retq
ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
ffffffff814b67ae:	00
ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
ffffffff814b67c0:	00
ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
ffffffff814b67d1:	00
ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
ffffffff814b67db:	00 00 00 00 00

for our build, this is crashing at:

    circ->buf[circ->head] = c;

Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
protected by the "per-port mutex", which based on uart_port_check() is
state->port.mutex. Indeed, the lock acquired in uart_put_char() is
uport->lock, i.e. not the same lock.

Anyway, since the lock is not acquired, if uart_shutdown() is called, the
last chunk of that function may release state->xmit.buf before its assigned
to null, and cause the race above.

To fix it, let's lock uport->lock when allocating/deallocating
state->xmit.buf in addition to the per-port mutex.

v2: switch to locking uport->lock on allocation/deallocation instead of
    locking the per-port mutex in uart_put_char. Note that since
    uport->lock is a spin lock, we have to switch the allocation to
    GFP_ATOMIC.
v3: move the allocation outside the lock, so we can switch back to
    GFP_KERNEL

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Noltari pushed a commit to Noltari/linux that referenced this pull request Sep 9, 2018

uart: fix race between uart_put_char() and uart_shutdown()
commit a5ba1d9 upstream.

We have reports of the following crash:

    PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
    #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
    #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
    #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
    #3 [ffff88085c6db860] no_context at ffffffff81050b8f
    #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
    #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
    #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
    #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
    #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
    [exception RIP: uart_put_char+149]
    RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
    RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
    RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
    RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
    R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
    R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
    #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
    #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
    #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
    #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
    #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
    #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
    #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
    #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
    #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
    #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
    #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
    #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
    #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​

after slogging through some dissasembly:

ffffffff814b6720 <uart_put_char>:
ffffffff814b6720:	55                   	push   %rbp
ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
ffffffff814b6754:	00 00
ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
ffffffff814b675d:	00
ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
ffffffff814b676f:	00
ffffffff814b6770:	89 ca                	mov    %ecx,%edx
ffffffff814b6772:	f7 d2                	not    %edx
ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
ffffffff814b677b:	00
ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
ffffffff814b67a5:	c9                   	leaveq
ffffffff814b67a6:	c3                   	retq
ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
ffffffff814b67ae:	00
ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
ffffffff814b67c0:	00
ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
ffffffff814b67d1:	00
ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
ffffffff814b67db:	00 00 00 00 00

for our build, this is crashing at:

    circ->buf[circ->head] = c;

Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
protected by the "per-port mutex", which based on uart_port_check() is
state->port.mutex. Indeed, the lock acquired in uart_put_char() is
uport->lock, i.e. not the same lock.

Anyway, since the lock is not acquired, if uart_shutdown() is called, the
last chunk of that function may release state->xmit.buf before its assigned
to null, and cause the race above.

To fix it, let's lock uport->lock when allocating/deallocating
state->xmit.buf in addition to the per-port mutex.

v2: switch to locking uport->lock on allocation/deallocation instead of
    locking the per-port mutex in uart_put_char. Note that since
    uport->lock is a spin lock, we have to switch the allocation to
    GFP_ATOMIC.
v3: move the allocation outside the lock, so we can switch back to
    GFP_KERNEL

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

vlomovtsev pushed a commit to vlomovtsev/linux that referenced this pull request Sep 10, 2018

uart: fix race between uart_put_char() and uart_shutdown()
commit a5ba1d9 upstream.

We have reports of the following crash:

    PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
    #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
    #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
    #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
    #3 [ffff88085c6db860] no_context at ffffffff81050b8f
    #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
    #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
    #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
    #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
    #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
    [exception RIP: uart_put_char+149]
    RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
    RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
    RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
    RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
    R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
    R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
    #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
    #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
    #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
    #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
    #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
    #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
    #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
    #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
    #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
    #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
    #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
    #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
    #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​

after slogging through some dissasembly:

ffffffff814b6720 <uart_put_char>:
ffffffff814b6720:	55                   	push   %rbp
ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
ffffffff814b6754:	00 00
ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
ffffffff814b675d:	00
ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
ffffffff814b676f:	00
ffffffff814b6770:	89 ca                	mov    %ecx,%edx
ffffffff814b6772:	f7 d2                	not    %edx
ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
ffffffff814b677b:	00
ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
ffffffff814b67a5:	c9                   	leaveq
ffffffff814b67a6:	c3                   	retq
ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
ffffffff814b67ae:	00
ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
ffffffff814b67c0:	00
ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
ffffffff814b67d1:	00
ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
ffffffff814b67db:	00 00 00 00 00

for our build, this is crashing at:

    circ->buf[circ->head] = c;

Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
protected by the "per-port mutex", which based on uart_port_check() is
state->port.mutex. Indeed, the lock acquired in uart_put_char() is
uport->lock, i.e. not the same lock.

Anyway, since the lock is not acquired, if uart_shutdown() is called, the
last chunk of that function may release state->xmit.buf before its assigned
to null, and cause the race above.

To fix it, let's lock uport->lock when allocating/deallocating
state->xmit.buf in addition to the per-port mutex.

v2: switch to locking uport->lock on allocation/deallocation instead of
    locking the per-port mutex in uart_put_char. Note that since
    uport->lock is a spin lock, we have to switch the allocation to
    GFP_ATOMIC.
v3: move the allocation outside the lock, so we can switch back to
    GFP_KERNEL

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fengguang pushed a commit to 0day-ci/linux that referenced this pull request Sep 18, 2018

Thomas Gleixner Sebastian Andrzej Siewior
workqueue: Prevent workqueue versus ata-piix livelock
An Intel i7 system regularly detected rcu_preempt stalls after the kernel
was upgraded from 3.6-rt to 3.8-rt. When the stall happened, disk I/O was no
longer possible, unless the system was restarted.

The kernel message was:
INFO: rcu_preempt self-detected stall on CPU { 6}
[..]
NMI backtrace for cpu 6
CPU 6
Pid: 119, comm: irq/19-ata_piix Not tainted 3.8.13-rt13 #11 Shuttle Inc. SX58/SX58
RIP: 0010:[<ffffffff8124ca60>]  [<ffffffff8124ca60>] ip_compute_csum+0x30/0x30
RSP: 0018:ffff880333303cb0  EFLAGS: 00000002
RAX: 0000000000000006 RBX: 00000000000003e9 RCX: 0000000000000034
RDX: 0000000000000000 RSI: ffffffff81aa16d0 RDI: 0000000000000001
RBP: ffff880333303ce8 R08: ffffffff81aa16d0 R09: ffffffff81c1b8cc
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000005161f
R13: 0000000000000006 R14: ffffffff81aa16d0 R15: 0000000000000002
FS:  0000000000000000(0000) GS:ffff880333300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003c1b2bb420 CR3: 0000000001a0f000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process irq/19-ata_piix (pid: 119, threadinfo ffff88032d88a000, task ffff88032df80000)
Stack:
ffffffff8124cb32 000000000005161e 00000000000003e9 0000000000001000
0000000000009022 ffffffff81aa16d0 0000000000000002 ffff880333303cf8
ffffffff8124caa9 ffff880333303d08 ffffffff8124cad2 ffff880333303d28
Call Trace:
<IRQ>
[<ffffffff8124cb32>] ? delay_tsc+0x33/0xe3
[<ffffffff8124caa9>] __delay+0xf/0x11
[<ffffffff8124cad2>] __const_udelay+0x27/0x29
[<ffffffff8102d1fa>] native_safe_apic_wait_icr_idle+0x39/0x45
[<ffffffff8102dc9b>] __default_send_IPI_dest_field.constprop.0+0x1e/0x58
[<ffffffff8102dd1e>] default_send_IPI_mask_sequence_phys+0x49/0x7d
[<ffffffff81030326>] physflat_send_IPI_all+0x17/0x19
[<ffffffff8102de53>] arch_trigger_all_cpu_backtrace+0x50/0x79
[<ffffffff810b21d0>] rcu_check_callbacks+0x1cb/0x568
[<ffffffff81048c9c>] ? raise_softirq+0x2e/0x35
[<ffffffff81086be0>] ? tick_sched_do_timer+0x38/0x38
[<ffffffff8104f653>] update_process_times+0x44/0x55
[<ffffffff81086866>] tick_sched_handle+0x4a/0x59
[<ffffffff81086c1c>] tick_sched_timer+0x3c/0x5b
[<ffffffff81062845>] __run_hrtimer+0x9b/0x158
[<ffffffff810631d8>] hrtimer_interrupt+0x172/0x2aa
[<ffffffff8102d498>] smp_apic_timer_interrupt+0x76/0x89
[<ffffffff814d881d>] apic_timer_interrupt+0x6d/0x80
<EOI>
[<ffffffff81057cd2>] ? __local_lock_irqsave+0x17/0x4a
[<ffffffff81059336>] try_to_grab_pending+0x42/0x17e
[<ffffffff8105a699>] mod_delayed_work_on+0x32/0x88
[<ffffffff8105a70b>] mod_delayed_work+0x1c/0x1e
[<ffffffff8122ae84>] blk_run_queue_async+0x37/0x39
[<ffffffff81230985>] flush_end_io+0xf1/0x107
[<ffffffff8122e0da>] blk_finish_request+0x21e/0x264
[<ffffffff8122e162>] blk_end_bidi_request+0x42/0x60
[<ffffffff8122e1ba>] blk_end_request+0x10/0x12
[<ffffffff8132de46>] scsi_io_completion+0x1bf/0x492
[<ffffffff81335cec>] ? sd_done+0x298/0x2ef
[<ffffffff81325a02>] scsi_finish_command+0xe9/0xf2
[<ffffffff8132dbcb>] scsi_softirq_done+0x106/0x10f
[<ffffffff812333d3>] blk_done_softirq+0x77/0x87
[<ffffffff8104826f>] do_current_softirqs+0x172/0x2e1
[<ffffffff810aa820>] ? irq_thread_fn+0x3a/0x3a
[<ffffffff81048466>] local_bh_enable+0x43/0x72
[<ffffffff810aa866>] irq_forced_thread_fn+0x46/0x52
[<ffffffff810ab089>] irq_thread+0x8c/0x17c
[<ffffffff810ab179>] ? irq_thread+0x17c/0x17c
[<ffffffff810aaffd>] ? wake_threads_waitq+0x44/0x44
[<ffffffff8105eb18>] kthread+0x8d/0x95
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65
[<ffffffff814d7b7c>] ret_from_fork+0x7c/0xb0
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65

The state of softirqd of this CPU at the time of the crash was:
ksoftirqd/6     R  running task        0    53      2 0x00000000
ffff88032fc39d18 0000000000000046 ffff88033330c4c0 ffff8803303f4710
ffff88032fc39fd8 ffff88032fc39fd8 0000000000000000 0000000000062500
ffff88032df88000 ffff8803303f4710 0000000000000000 ffff88032fc38000
Call Trace:
[<ffffffff8105a3ae>] ? __queue_work+0x27c/0x27c
[<ffffffff814d178c>] preempt_schedule+0x61/0x76
[<ffffffff8106cccf>] migrate_enable+0xe5/0x1df
[<ffffffff8105a3ae>] ? __queue_work+0x27c/0x27c
[<ffffffff8104ef52>] run_timer_softirq+0x161/0x1d6
[<ffffffff8104826f>] do_current_softirqs+0x172/0x2e1
[<ffffffff8104840b>] run_ksoftirqd+0x2d/0x45
[<ffffffff8106658a>] smpboot_thread_fn+0x2ea/0x308
[<ffffffff810662a0>] ? test_ti_thread_flag+0xc/0xc
[<ffffffff810662a0>] ? test_ti_thread_flag+0xc/0xc
[<ffffffff8105eb18>] kthread+0x8d/0x95
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65
[<ffffffff814d7afc>] ret_from_fork+0x7c/0xb0
[<ffffffff8105ea8b>] ? __kthread_parkme+0x65/0x65

Apparently, the softirq demon and the ata_piix IRQ handler were waiting
for each other to finish ending up in a livelock. After the below patch
was applied, the system no longer crashes.

Reported-by: Carsten Emde <C.Emde@osadl.org>
Proposed-by: Thomas Gleixner <tglx@linutronix.de>
Tested by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

fengguang pushed a commit to 0day-ci/linux that referenced this pull request Sep 19, 2018

zinc: Poly1305 ARM and ARM64 implementations
These NEON and non-NEON implementations come from Andy Polyakov's
implementation. They are exactly the same as Andy Polyakov's original,
with the following exceptions:

- Entries and exits use the proper kernel convention macro.
- CPU feature checking is done in C by the glue code, so that has been
  removed from the assembly.
- The function names have been renamed to fit kernel conventions.
- Labels have been renamed to fit kernel conventions.
- The neon code can jump to the scalar code when it makes sense to do
  so.

After '/^#/d;/^\..*[^:]$/d', the code has the following diff in actual
instructions from the original.

ARM:

-poly1305_init:
-.Lpoly1305_init:
+ENTRY(poly1305_init_arm)
 	stmdb	sp!,{r4-r11}

 	eor	r3,r3,r3
@@ -18,8 +25,6 @@
 	moveq	r0,#0
 	beq	.Lno_key

-	adr	r11,.Lpoly1305_init
-	ldr	r12,.LOPENSSL_armcap
 	ldrb	r4,[r1,#0]
 	mov	r10,#0x0fffffff
 	ldrb	r5,[r1,#1]
@@ -34,8 +39,6 @@
 	ldrb	r7,[r1,#6]
 	and	r4,r4,r10

-	ldr	r12,[r11,r12]		@ OPENSSL_armcap_P
-	ldr	r12,[r12]
 	ldrb	r8,[r1,#7]
 	orr	r5,r5,r6,lsl#8
 	ldrb	r6,[r1,#8]
@@ -45,22 +48,6 @@
 	ldrb	r8,[r1,#10]
 	and	r5,r5,r3

-	tst	r12,#ARMV7_NEON		@ check for NEON
-	adr	r9,poly1305_blocks_neon
-	adr	r11,poly1305_blocks
-	it	ne
-	movne	r11,r9
-	adr	r12,poly1305_emit
-	adr	r10,poly1305_emit_neon
-	it	ne
-	movne	r12,r10
-	itete	eq
-	addeq	r12,r11,#(poly1305_emit-.Lpoly1305_init)
-	addne	r12,r11,#(poly1305_emit_neon-.Lpoly1305_init)
-	addeq	r11,r11,#(poly1305_blocks-.Lpoly1305_init)
-	addne	r11,r11,#(poly1305_blocks_neon-.Lpoly1305_init)
-	orr	r12,r12,#1	@ thumb-ify address
-	orr	r11,r11,#1
 	ldrb	r9,[r1,#11]
 	orr	r6,r6,r7,lsl#8
 	ldrb	r7,[r1,#12]
@@ -79,17 +66,16 @@
 	str	r6,[r0,#8]
 	and	r7,r7,r3
 	str	r7,[r0,#12]
-	stmia	r2,{r11,r12}		@ fill functions table
-	mov	r0,#1
-	mov	r0,#0
 .Lno_key:
 	ldmia	sp!,{r4-r11}
 	bx	lr				@ bx	lr
 	tst	lr,#1
 	moveq	pc,lr			@ be binary compatible with V4, yet
 	.word	0xe12fff1e			@ interoperable with Thumb ISA:-)
-poly1305_blocks:
-.Lpoly1305_blocks:
+ENDPROC(poly1305_init_arm)
+
+ENTRY(poly1305_blocks_arm)
+.Lpoly1305_blocks_arm:
 	stmdb	sp!,{r3-r11,lr}

 	ands	r2,r2,#-16
@@ -231,10 +217,11 @@
 	tst	lr,#1
 	moveq	pc,lr			@ be binary compatible with V4, yet
 	.word	0xe12fff1e			@ interoperable with Thumb ISA:-)
-poly1305_emit:
+ENDPROC(poly1305_blocks_arm)
+
+ENTRY(poly1305_emit_arm)
 	stmdb	sp!,{r4-r11}
 .Lpoly1305_emit_enter:
-
 	ldmia	r0,{r3-r7}
 	adds	r8,r3,#5		@ compare to modulus
 	adcs	r9,r4,#0
@@ -305,8 +292,12 @@
 	tst	lr,#1
 	moveq	pc,lr			@ be binary compatible with V4, yet
 	.word	0xe12fff1e			@ interoperable with Thumb ISA:-)
+ENDPROC(poly1305_emit_arm)
+
+

-poly1305_init_neon:
+ENTRY(poly1305_init_neon)
+.Lpoly1305_init_neon:
 	ldr	r4,[r0,#20]		@ load key base 2^32
 	ldr	r5,[r0,#24]
 	ldr	r6,[r0,#28]
@@ -515,8 +506,9 @@
 	vst1.32		{d8[1]},[r7]

 	bx	lr				@ bx	lr
+ENDPROC(poly1305_init_neon)

-poly1305_blocks_neon:
+ENTRY(poly1305_blocks_neon)
 	ldr	ip,[r0,#36]		@ is_base2_26
 	ands	r2,r2,#-16
 	beq	.Lno_data_neon
@@ -524,7 +516,7 @@
 	cmp	r2,#64
 	bhs	.Lenter_neon
 	tst	ip,ip			@ is_base2_26?
-	beq	.Lpoly1305_blocks
+	beq	.Lpoly1305_blocks_arm

 .Lenter_neon:
 	stmdb	sp!,{r4-r7}
@@ -534,7 +526,7 @@
 	bne	.Lbase2_26_neon

 	stmdb	sp!,{r1-r3,lr}
-	bl	poly1305_init_neon
+	bl	.Lpoly1305_init_neon

 	ldr	r4,[r0,#0]		@ load hash value base 2^32
 	ldr	r5,[r0,#4]
@@ -989,8 +981,9 @@
 	ldmia	sp!,{r4-r7}
 .Lno_data_neon:
 	bx	lr					@ bx	lr
+ENDPROC(poly1305_blocks_neon)

-poly1305_emit_neon:
+ENTRY(poly1305_emit_neon)
 	ldr	ip,[r0,#36]		@ is_base2_26

 	stmdb	sp!,{r4-r11}
@@ -1055,6 +1048,6 @@

 	ldmia	sp!,{r4-r11}
 	bx	lr				@ bx	lr
+ENDPROC(poly1305_emit_neon)

ARM64:

-poly1305_init:
+ENTRY(poly1305_init_arm)
 	cmp	x1,xzr
 	stp	xzr,xzr,[x0]		// zero hash value
 	stp	xzr,xzr,[x0,#16]	// [along with is_base2_26]
@@ -11,14 +15,9 @@
 	csel	x0,xzr,x0,eq
 	b.eq	.Lno_key

-	ldrsw	x11,.LOPENSSL_armcap_P
-	ldr	x11,.LOPENSSL_armcap_P
-	adr	x10,.LOPENSSL_armcap_P
-
 	ldp	x7,x8,[x1]		// load key
 	mov	x9,#0xfffffffc0fffffff
 	movk	x9,#0x0fff,lsl#48
-	ldr	w17,[x10,x11]
 	rev	x7,x7			// flip bytes
 	rev	x8,x8
 	and	x7,x7,x9		// &=0ffffffc0fffffff
@@ -26,24 +25,11 @@
 	and	x8,x8,x9		// &=0ffffffc0ffffffc
 	stp	x7,x8,[x0,#32]	// save key value

-	tst	w17,#ARMV7_NEON
-
-	adr	x12,poly1305_blocks
-	adr	x7,poly1305_blocks_neon
-	adr	x13,poly1305_emit
-	adr	x8,poly1305_emit_neon
-
-	csel	x12,x12,x7,eq
-	csel	x13,x13,x8,eq
-
-	stp	w12,w13,[x2]
-	stp	x12,x13,[x2]
-
-	mov	x0,#1
 .Lno_key:
 	ret
+ENDPROC(poly1305_init_arm)

-poly1305_blocks:
+ENTRY(poly1305_blocks_arm)
 	ands	x2,x2,#-16
 	b.eq	.Lno_data

@@ -100,8 +86,9 @@

 .Lno_data:
 	ret
+ENDPROC(poly1305_blocks_arm)

-poly1305_emit:
+ENTRY(poly1305_emit_arm)
 	ldp	x4,x5,[x0]		// load hash base 2^64
 	ldr	x6,[x0,#16]
 	ldp	x10,x11,[x2]	// load nonce
@@ -124,7 +111,9 @@
 	stp	x4,x5,[x1]		// write result

 	ret
-poly1305_mult:
+ENDPROC(poly1305_emit_arm)
+
+__poly1305_mult:
 	mul	x12,x4,x7		// h0*r0
 	umulh	x13,x4,x7

@@ -158,7 +147,7 @@

 	ret

-poly1305_splat:
+__poly1305_splat:
 	and	x12,x4,#0x03ffffff	// base 2^64 -> base 2^26
 	ubfx	x13,x4,#26,#26
 	extr	x14,x5,x4,#52
@@ -182,11 +171,11 @@

 	ret

-poly1305_blocks_neon:
+ENTRY(poly1305_blocks_neon)
 	ldr	x17,[x0,#24]
 	cmp	x2,#128
 	b.hs	.Lblocks_neon
-	cbz	x17,poly1305_blocks
+	cbz	x17,poly1305_blocks_arm

 .Lblocks_neon:
 	stp	x29,x30,[sp,#-80]!
@@ -232,7 +221,7 @@
 	adcs	x5,x5,x13
 	adc	x6,x6,x3

-	bl	poly1305_mult
+	bl	__poly1305_mult
 	ldr	x30,[sp,#8]

 	cbz	x3,.Lstore_base2_64_neon
@@ -274,7 +263,7 @@
 	adcs	x5,x5,x13
 	adc	x6,x6,x3

-	bl	poly1305_mult
+	bl	__poly1305_mult

 .Linit_neon:
 	and	x10,x4,#0x03ffffff	// base 2^64 -> base 2^26
@@ -301,19 +290,19 @@
 	mov	x5,x8
 	mov	x6,xzr
 	add	x0,x0,#48+12
-	bl	poly1305_splat
+	bl	__poly1305_splat

-	bl	poly1305_mult		// r^2
+	bl	__poly1305_mult		// r^2
 	sub	x0,x0,#4
-	bl	poly1305_splat
+	bl	__poly1305_splat

-	bl	poly1305_mult		// r^3
+	bl	__poly1305_mult		// r^3
 	sub	x0,x0,#4
-	bl	poly1305_splat
+	bl	__poly1305_splat

-	bl	poly1305_mult		// r^4
+	bl	__poly1305_mult		// r^4
 	sub	x0,x0,#4
-	bl	poly1305_splat
+	bl	__poly1305_splat
 	ldr	x30,[sp,#8]

 	add	x16,x1,#32
@@ -743,10 +732,11 @@
 .Lno_data_neon:
 	ldr	x29,[sp],#80
 	ret
+ENDPROC(poly1305_blocks_neon)

-poly1305_emit_neon:
+ENTRY(poly1305_emit_neon)
 	ldr	x17,[x0,#24]
-	cbz	x17,poly1305_emit
+	cbz	x17,poly1305_emit_arm

 	ldp	w10,w11,[x0]		// load hash value base 2^26
 	ldp	w12,w13,[x0,#8]
@@ -788,6 +778,6 @@
 	stp	x4,x5,[x1]		// write result

 	ret
+ENDPROC(poly1305_emit_neon)

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Polyakov <appro@openssl.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org

fengguang pushed a commit to 0day-ci/linux that referenced this pull request Sep 19, 2018

Revert "ipv6: fix double refcount of fib6_metrics"
This reverts commit e70a3aa.

This change causes use-after-free on dst->_metrics.
The crash trace looks like this:
[   97.763269] BUG: KASAN: use-after-free in ip6_mtu+0x116/0x140
[   97.769038] Read of size 4 at addr ffff881781d2cf84 by task svw_NetThreadEv/8801

[   97.777954] CPU: 76 PID: 8801 Comm: svw_NetThreadEv Not tainted 4.15.0-smp-DEV #11
[   97.777956] Hardware name: Default string Default string/Indus_QC_02, BIOS 5.46.4 03/29/2018
[   97.777957] Call Trace:
[   97.777971]  [<ffffffff895709db>] dump_stack+0x4d/0x72
[   97.777985]  [<ffffffff881651df>] print_address_description+0x6f/0x260
[   97.777997]  [<ffffffff88165747>] kasan_report+0x257/0x370
[   97.778001]  [<ffffffff894488e6>] ? ip6_mtu+0x116/0x140
[   97.778004]  [<ffffffff881658b9>] __asan_report_load4_noabort+0x19/0x20
[   97.778008]  [<ffffffff894488e6>] ip6_mtu+0x116/0x140
[   97.778013]  [<ffffffff892bb91e>] tcp_current_mss+0x12e/0x280
[   97.778016]  [<ffffffff892bb7f0>] ? tcp_mtu_to_mss+0x2d0/0x2d0
[   97.778022]  [<ffffffff887b45b8>] ? depot_save_stack+0x138/0x4a0
[   97.778037]  [<ffffffff87c38985>] ? __mmdrop+0x145/0x1f0
[   97.778040]  [<ffffffff881643b1>] ? save_stack+0xb1/0xd0
[   97.778046]  [<ffffffff89264c82>] tcp_send_mss+0x22/0x220
[   97.778059]  [<ffffffff89273a49>] tcp_sendmsg_locked+0x4f9/0x39f0
[   97.778062]  [<ffffffff881642b4>] ? kasan_check_write+0x14/0x20
[   97.778066]  [<ffffffff89273550>] ? tcp_sendpage+0x60/0x60
[   97.778070]  [<ffffffff881cb359>] ? rw_copy_check_uvector+0x69/0x280
[   97.778075]  [<ffffffff8873c65f>] ? import_iovec+0x9f/0x430
[   97.778078]  [<ffffffff88164be7>] ? kasan_slab_free+0x87/0xc0
[   97.778082]  [<ffffffff8873c5c0>] ? memzero_page+0x140/0x140
[   97.778085]  [<ffffffff881642b4>] ? kasan_check_write+0x14/0x20
[   97.778088]  [<ffffffff89276f6c>] tcp_sendmsg+0x2c/0x50
[   97.778092]  [<ffffffff89276f6c>] ? tcp_sendmsg+0x2c/0x50
[   97.778098]  [<ffffffff89352d43>] inet_sendmsg+0x103/0x480
[   97.778102]  [<ffffffff89352c40>] ? inet_gso_segment+0x15b0/0x15b0
[   97.778105]  [<ffffffff890294da>] sock_sendmsg+0xba/0xf0
[   97.778108]  [<ffffffff8902ab6a>] ___sys_sendmsg+0x6ca/0x8e0
[   97.778113]  [<ffffffff87dccac1>] ? hrtimer_try_to_cancel+0x71/0x3b0
[   97.778116]  [<ffffffff8902a4a0>] ? copy_msghdr_from_user+0x3d0/0x3d0
[   97.778119]  [<ffffffff881646d1>] ? memset+0x31/0x40
[   97.778123]  [<ffffffff87a0cff5>] ? schedule_hrtimeout_range_clock+0x165/0x380
[   97.778127]  [<ffffffff87a0ce90>] ? hrtimer_nanosleep_restart+0x250/0x250
[   97.778130]  [<ffffffff87dcc700>] ? __hrtimer_init+0x180/0x180
[   97.778133]  [<ffffffff87dd1f82>] ? ktime_get_ts64+0x172/0x200
[   97.778137]  [<ffffffff8822b8ec>] ? __fget_light+0x8c/0x2f0
[   97.778141]  [<ffffffff8902d5c6>] __sys_sendmsg+0xe6/0x190
[   97.778144]  [<ffffffff8902d5c6>] ? __sys_sendmsg+0xe6/0x190
[   97.778147]  [<ffffffff8902d4e0>] ? SyS_shutdown+0x20/0x20
[   97.778152]  [<ffffffff87cd4370>] ? wake_up_q+0xe0/0xe0
[   97.778155]  [<ffffffff8902d670>] ? __sys_sendmsg+0x190/0x190
[   97.778158]  [<ffffffff8902d683>] SyS_sendmsg+0x13/0x20
[   97.778162]  [<ffffffff87a1600c>] do_syscall_64+0x2ac/0x430
[   97.778166]  [<ffffffff87c17515>] ? do_page_fault+0x35/0x3d0
[   97.778171]  [<ffffffff8960131f>] ? page_fault+0x2f/0x50
[   97.778174]  [<ffffffff89600071>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[   97.778177] RIP: 0033:0x7f83fa36000d
[   97.778178] RSP: 002b:00007f83ef9229e0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
[   97.778180] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f83fa36000d
[   97.778182] RDX: 0000000000004000 RSI: 00007f83ef922f00 RDI: 0000000000000036
[   97.778183] RBP: 00007f83ef923040 R08: 00007f83ef9231f8 R09: 00007f83ef923168
[   97.778184] R10: 0000000000000000 R11: 0000000000000293 R12: 00007f83f69c5b40
[   97.778185] R13: 000000000000001c R14: 0000000000000001 R15: 0000000000004000

[   97.779684] Allocated by task 5919:
[   97.783185]  save_stack+0x46/0xd0
[   97.783187]  kasan_kmalloc+0xad/0xe0
[   97.783189]  kmem_cache_alloc_trace+0xdf/0x580
[   97.783190]  ip6_convert_metrics.isra.79+0x7e/0x190
[   97.783192]  ip6_route_info_create+0x60a/0x2480
[   97.783193]  ip6_route_add+0x1d/0x80
[   97.783195]  inet6_rtm_newroute+0xdd/0xf0
[   97.783198]  rtnetlink_rcv_msg+0x641/0xb10
[   97.783200]  netlink_rcv_skb+0x27b/0x3e0
[   97.783202]  rtnetlink_rcv+0x15/0x20
[   97.783203]  netlink_unicast+0x4be/0x720
[   97.783204]  netlink_sendmsg+0x7bc/0xbf0
[   97.783205]  sock_sendmsg+0xba/0xf0
[   97.783207]  ___sys_sendmsg+0x6ca/0x8e0
[   97.783208]  __sys_sendmsg+0xe6/0x190
[   97.783209]  SyS_sendmsg+0x13/0x20
[   97.783211]  do_syscall_64+0x2ac/0x430
[   97.783213]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

[   97.784709] Freed by task 0:
[   97.785056] knetbase: Error: /proc/sys/net/core/txcs_enable does not exist
[   97.794497]  save_stack+0x46/0xd0
[   97.794499]  kasan_slab_free+0x71/0xc0
[   97.794500]  kfree+0x7c/0xf0
[   97.794501]  fib6_info_destroy_rcu+0x24f/0x310
[   97.794504]  rcu_process_callbacks+0x38b/0x1730
[   97.794506]  __do_softirq+0x1c8/0x5d0

Reported-by: John Sperbeck <jsperbeck@google.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fengguang pushed a commit to 0day-ci/linux that referenced this pull request Sep 20, 2018

Merge branch 'mlxsw-Further-MC-awareness-configuration'
Ido Schimmel says:

====================
mlxsw: Further MC-awareness configuration

Petr says:

Due to an issue in Spectrum chips, when unicast traffic shares the same
queue as BUM traffic, and there is congestion, the BUM traffic is
admitted to the queue anyway, thus pushing out all UC traffic. In order
to give unicast traffic precedence over BUM traffic, multicast-aware
mode is now configured on all ports. Under MC-aware mode, egress TCs
8..15 are used for BUM traffic, which has its own dedicated pool.

This patch set improves the way that the MC pool and the higher-order
TCs are integrated into the system.

In patch #1, shaper at the higher TCs is configured to the same value
that it has by default. It's better to have the corresponding artifact
in the code explicitly.

The 8 following patches gradually extend the devlink handling in mlxsw
to support the extra TCs and the new MC pool.

Patch #2 changes the way that pools are indexed in mlxsw. Instead of
using (FW index, direction) tuple to identify the pool and the
associated cache, mlxsw now uses devlink index. This change is necessary
because the new pool 15 is not contiguously adjacent to the
currently-used pools 0..3, and because it's only relevant on egress.
Using devlink index relaxes the requirement for symmetry and adjacency
imposed by using FW indexing.

In patch #3, the assumption that number of ingress TCs matches that of
egress TCs is relaxed to allow exposition of egress TCs 8..15.

In patches #4, #5 and #6, support for infinite quotas is introduced.
Infinite quotas are reported as taking all the memory in the system, but
actually use a mechanism where the infinity is configured explicitly.

In patches #7 and #8, support for configuring static pool sizes in
introduced. Statically-sized pools have been supported for a while now,
but during initialization, all pools have dynamic size. The patches
allow there to be a mix of by-default static and dynamic pools.

In patches #9 and #10, pool 15 resp. per-priority MC quotas are
explicitly configured to be in sync with the current recommendation for
handling BUM traffic in Spectrum chips.

In the following 3 patches, an mlxsw-specific selftest is added to test
the MC-awareness configuration.

First in patches #11 and #12, lib.sh is extended with functions to
collect ethtool stats, and to manage port MTU.

Then in patch #13 the selftest itself is added.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment