[LEGAL][COPYRIGHT INFRINGEMENT] Comply with the terms of the GNU GPLv2 license, release the Mi Pad's Linux© kernel source code! #解放小米 #LibreXiaoMi #18

Closed
Vdragon opened this issue Apr 26, 2015 · 141 comments

@Vdragon commented Apr 26, 2015

English version

Dear @MiCode, hehao hehao@xiaomi.com, Hugo Barra and XiaoMi corp.,

The Linux© kernel is copyrighted software written by Linus Torvalds (@torvalds) et al. and released under the GNU GPLv2 license. Under the license terms, a distributor of a derived work of the Linux kernel must provide the modified Linux© source code along with the distributed software. Since your product, the "Mi Pad", uses the Linux© kernel and is distributed to the public, XiaoMi corp. must comply with the requirements of the GNU GPLv2 license and release the Mi Pad's Linux© kernel source code.

Failing to comply with the license of a copyrighted work is COPYRIGHT INFRINGEMENT!

Seriously, your Mi Pad customer.

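Chinese (Simplified) (China) version

The Linux© operating system kernel is software designed by @torvalds et al. that is protected by intellectual property rights and released under the GNU GPL version 2 license. Under the license terms, a distributor of a derivative work based on the Linux© operating system kernel must provide the modified Linux© kernel source code along with the software it distributes. Since the Mi Pad uses the Linux© operating system kernel and has been distributed to the public, @MiCode must comply with the terms of the GNU GPL version 2 license and release the Mi Pad's Linux© operating system kernel source code.

Violating the terms of an intellectual property license is infringement!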

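Chinese (Traditional) (Taiwan) version

To the MiCode community (@MiCodeGitHub), hehao hehao@xiaomi.com (the Xiaomi member with write access to the Xiaomi_Kernel_OpenSource repository), Hugo Barra (Xiaomi global vice president), Lei Jun <http://www.leijun.com/ > (Xiaomi founder, chairman and CEO), and Xiaomi Inc. <http://www.xiaomi.cn/ >:

The Linux© operating system kernel is intellectual property designed by Linus Torvalds (@torvalds) et al. and released under the GNU GPL version 2 license. The GNU GPL version 2 requires a distributor of a derivative work based on the Linux© operating system kernel to provide the modified Linux© kernel source code along with the derivative work. Since your company's product, the "Mi Pad", uses the Linux© operating system kernel and has been distributed to the public, your company must comply with the requirements of the GNU GPL version 2 and release the "Mi Pad" Linux© operating system kernel source code.

Violating the terms of an intellectual property license is infringement!

Solemnly, a "Mi Pad" customer.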

@Vdragon (Author) commented Apr 26, 2015

Hi everyone concerned with this issue: please contribute the title (and the issue summary too) in other languages so we can get more attention from @MiCode ;)

@omarhasan95 commented May 7, 2015

Provide us with the Mi Pad kernel sources.

@vartom commented May 7, 2015

Agreed. Provide us with the Mi Pad kernel sources.

@mikoda commented May 7, 2015

Totally agree.

@TheShinobi commented May 7, 2015

Yes, release the kernel sources for the Xiaomi Mi Pad!

@omarhasan95 commented May 8, 2015

Why is there no reply from Mi???

@abhiyoyo commented May 8, 2015

Yes, they must provide the kernel sources for the Mi Pad, keeping customer interest in mind. If they can only make a half-baked MIUI ROM, then give us the sources and many talented devs will make stable ROMs. Keep your image clean; your biggest con is being closed source. That is why you were denied a place in the XDA forum. It's good to see that many Mi phones got sources in recent times, but why not the Mi Pad? Is it something NVIDIA does not want you to release, or what? Please just be clear about why it's not possible, or when it will be possible. Your silence on this issue hurts more than the unavailability of the sources. We are still not sure whether we are banging our heads needlessly, or in the wrong place.

@Doraha-Raman commented May 8, 2015

Yes, they should have released the kernel sources already.

@Vdragon changed the title from "Please comply to terms of GNU GPL2 License, release Mi Pad's Linux© kernel source code!" to "Please comply to terms of GNU GPLv2 License, release Mi Pad's Linux© kernel source code!" (the Simplified and Traditional Chinese parts of the title received the same change) May 8, 2015

@KinG5Pac commented May 9, 2015

I don't think they will care about this. It was the same scenario with the Mi 3. And @hugo lied again about the kernel release; I'm talking about the 3-month kernel release for all devices.

@abhiyoyo commented May 9, 2015

But @KinG5Pac, they released the sources for the Mi 3 eventually.

@KinG5Pac commented May 9, 2015

@abhiyoyo A 2-3 year old phone. And you call that released?

@abhiyoyo commented May 9, 2015

Yes

@co11ider commented May 9, 2015

Yes. Don't make me regret buying a Mi Pad, or I will never ever buy a Xiaomi product again. Also, I suspect you guys must have hidden some dirty secret inside it, like the Indian government warned.

@Juanmiwow commented May 10, 2015

We need the kernel source! PLEASE, provide us with the Mi Pad kernel source!

@sharifsonic commented May 15, 2015

YEAH GUYS, don't defame yourselves.
I used to be a great Mi fan..
I'm losing it now!!
Release the Mi Pad kernel.

@ankurs commented May 26, 2015

+1

@liuguo09 (Contributor) commented May 28, 2015

We will, but I can't give you an accurate time.

@Vdragon (Author) commented May 28, 2015

@liuguo09 What about your will on issue #4?

@abhiyoyo commented May 28, 2015

OK sir, thanks for the reply @liuguo09. Sir, at least an ETA? Will we get it within 1 month?

@ghost commented May 28, 2015

He said that he can't give you a time. If you work in software development you should understand how useless ETAs are.

@abhiyoyo commented May 28, 2015

@stefant234 chill, he said exact time, not ETA, sir! I am not holding a gun to his head and asking forcibly; he will reply if he wants to. Please don't jump in between.

@omarhasan95 commented May 28, 2015

Only the sources matter to us. Please provide them ASAP; we don't need an ETA.

@abhiyoyo commented May 28, 2015

@omarhasan95 I hope so, but I know it takes a little time. I don't know, 1 month should be enough. Again, I don't work in software development so I have no exact idea, but the Redmi 1S and Note 4G got theirs quickly, so I hope the Mi Pad will be quick too.

@ghost commented May 28, 2015

I am relaxed. Just out of curiosity, which of you do kernel development? Bombarding the poor man won't help anyone and won't change company policy. I'm also not saying that you are forcing anyone; I'm just telling you that any ETA in software development is not to be trusted, because 99.9% of the time it can't be met. That's what I meant. Since this is public I can jump in as I please ;)

@abhiyoyo commented May 28, 2015

OK, jump, hehe, but it's not bombarding lol, it's user requests @stefant234. And poor man?? Lol, we're just asking, not throwing grenades haha

@ghost commented May 28, 2015

Imagine being in his position. Xiaomi sent him to maintain their GitHub and now a few dozen people see him as the personification of GPL violations :D Anyway, the state of this source code is why I abandoned Xiaomi and am going to stick with Samsung for a while. At least they provide the source code for every firmware update :)

@abhiyoyo commented May 28, 2015

Man @stefant234, people are not targeting him, they target Xiaomi only. And yes, it's a GPL violation, and this is the page where we can actually ask about the sources lol. Who even said he violated anything? This post was about Xiaomi :D Users don't see who is maintaining the page; they just see it's the kernel source page and ask their question :p If anyone tagged him and said that he violated anything, that's dumbness bro :p

@ghost commented May 28, 2015

I think our problem is English, you apparently don't fully get me while I don't fully get you either :D

@abhiyoyo commented May 28, 2015

Ok hehe

@kdunning commented Jan 26, 2017

To add to this, @hehao, please can you provide details as to when the kernel will be available for the latest Xiaomi products, including the Mi Note 2 and the Mi 5s/Mi 5s Plus.

@ohjames commented Jul 2, 2017

It's a shame Xiaomi are GPL violators. My next phone would be a Xiaomi phone if they would clean up their act. As things stand I won't buy a phone from people who steal IP.

AndropaX pushed a commit to AndropaX/android_kernel_xiaomi_msm8992 that referenced this issue Jul 10, 2017

Jon Maxwell Willy Tarreau
dccp/tcp: fix routing redirect race
commit 45caeaa upstream.

As Eric Dumazet pointed out this also needs to be fixed in IPv6.
v2: Contains the IPv6 tcp/Ipv6 dccp patches as well.

We have seen a few incidents lately where a dst_entry has been freed
with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that
dst_entry. If the conditions/timings are right a crash then ensues when the
freed dst_entry is referenced later on. A common crashing back trace is:

 #8 [] page_fault at ffffffff8163e648
    [exception RIP: __tcp_ack_snd_check+74]
.
.
 #9 [] tcp_rcv_established at ffffffff81580b64
#10 [] tcp_v4_do_rcv at ffffffff8158b54a
#11 [] tcp_v4_rcv at ffffffff8158cd02
#12 [] ip_local_deliver_finish at ffffffff815668f4
#13 [] ip_local_deliver at ffffffff81566bd9
#14 [] ip_rcv_finish at ffffffff8156656d
#15 [] ip_rcv at ffffffff81566f06
#16 [] __netif_receive_skb_core at ffffffff8152b3a2
#17 [] __netif_receive_skb at ffffffff8152b608
#18 [] netif_receive_skb at ffffffff8152b690
#19 [] vmxnet3_rq_rx_complete at ffffffffa015eeaf [vmxnet3]
#20 [] vmxnet3_poll_rx_only at ffffffffa015f32a [vmxnet3]
#21 [] net_rx_action at ffffffff8152bac2
#22 [] __do_softirq at ffffffff81084b4f
#23 [] call_softirq at ffffffff8164845c
#24 [] do_softirq at ffffffff81016fc5
#25 [] irq_exit at ffffffff81084ee5
#26 [] do_IRQ at ffffffff81648ff8

Of course it may happen with other NIC drivers as well.

The freed dst_entry was found here:

    static bool tcp_in_quickack_mode(struct sock *sk)
    {
            const struct inet_connection_sock *icsk = inet_csk(sk);
            const struct dst_entry *dst = __sk_dst_get(sk);

            return (dst && dst_metric(dst, RTAX_QUICKACK)) ||
                   (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);
    }

But there are other backtraces attributed to the same freed dst_entry in
netfilter code as well.

All the vmcores showed 2 significant clues:

- Remote hosts behind the default gateway had always been redirected to a
different gateway. A rtable/dst_entry will be added for that host. Making
more dst_entrys with lower reference counts. Making this more probable.

- All vmcores showed a positive LockDroppedIcmps value, e.g.:

LockDroppedIcmps                  267

A closer look at the tcp_v4_err() handler revealed that do_redirect() will run
regardless of whether user space has the socket locked. This can result in a
race condition where the same dst_entry cached in sk->sk_dst_cache can be
decremented twice for the same socket via:

do_redirect()->__sk_dst_check()-> dst_release().

Which leads to the dst_entry being prematurely freed with another socket
pointing to it via sk->sk_dst_cache and a subsequent crash.

To fix this skip do_redirect() if user space has the socket locked. Instead let
the redirect take place later when user space does not have the socket
locked.

The dccp/IPv6 code is very similar in this respect, so fixing it there too.

As Eric Garver pointed out the following commit now invalidates routes. Which
can set the dst->obsolete flag so that ipv4_dst_check() returns null and
triggers the dst_release().

Fixes: ceb3320 ("ipv4: Kill routes during PMTU/redirect updates.")
Cc: Eric Garver <egarver@redhat.com>
Cc: Hannes Sowa <hsowa@redhat.com>
Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
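
The guard described in the commit message above can be pictured with a small user-space model. This is only a sketch under stated assumptions: `toy_dst`, `toy_sock`, `toy_drop_cached_dst` and `toy_handle_redirect` are hypothetical names invented for illustration, not kernel APIs, and the model is single-threaded, so it shows the "skip while user space owns the socket" rule rather than reproducing the race itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel objects; every name here is hypothetical. */
struct toy_dst {
    int refcnt;                 /* entry counts as freed when this hits 0 */
};

struct toy_sock {
    struct toy_dst *dst_cache;  /* plays the role of sk->sk_dst_cache */
    bool owned_by_user;         /* true while user space holds the socket lock */
};

/* Drop the socket's cached route reference at most once. */
static void toy_drop_cached_dst(struct toy_sock *sk)
{
    if (sk->dst_cache != NULL) {
        sk->dst_cache->refcnt--;
        sk->dst_cache = NULL;
    }
}

/*
 * Redirect handler with the guard from the commit message: do nothing while
 * user space owns the socket. Returns true if the redirect was processed.
 */
static bool toy_handle_redirect(struct toy_sock *sk)
{
    if (sk->owned_by_user)
        return false;           /* skipped; a later redirect can retry */
    toy_drop_cached_dst(sk);
    return true;
}
```

In this model a redirect that arrives while `owned_by_user` is set leaves the reference count untouched, and a later redirect on the unlocked socket drops the cached reference exactly once.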

mihirshah006 added a commit to mihirshah006/Xiaomi_Kernel_OpenSource that referenced this issue Jul 25, 2017

mm, vmscan: Do not wait for page writeback for GFP_NOFS allocations
commit ecf5fc6 upstream.

Nikolay has reported a hang when a memcg reclaim got stuck with the
following backtrace:

PID: 18308  TASK: ffff883d7c9b0a30  CPU: 1   COMMAND: "rsync"
  #0 __schedule at ffffffff815ab152
  #1 schedule at ffffffff815ab76e
  #2 schedule_timeout at ffffffff815ae5e5
  #3 io_schedule_timeout at ffffffff815aad6a
  #4 bit_wait_io at ffffffff815abfc6
  #5 __wait_on_bit at ffffffff815abda5
  #6 wait_on_page_bit at ffffffff8111fd4f
  #7 shrink_page_list at ffffffff81135445
  #8 shrink_inactive_list at ffffffff81135845
  #9 shrink_lruvec at ffffffff81135ead
 #10 shrink_zone at ffffffff811360c3
 #11 shrink_zones at ffffffff81136eff
 #12 do_try_to_free_pages at ffffffff8113712f
 #13 try_to_free_mem_cgroup_pages at ffffffff811372be
 #14 try_charge at ffffffff81189423
 #15 mem_cgroup_try_charge at ffffffff8118c6f5
 #16 __add_to_page_cache_locked at ffffffff8112137d
 #17 add_to_page_cache_lru at ffffffff81121618
 #18 pagecache_get_page at ffffffff8112170b
 #19 grow_dev_page at ffffffff811c8297
 #20 __getblk_slow at ffffffff811c91d6
 #21 __getblk_gfp at ffffffff811c92c1
 #22 ext4_ext_grow_indepth at ffffffff8124565c
 #23 ext4_ext_create_new_leaf at ffffffff81246ca8
 #24 ext4_ext_insert_extent at ffffffff81246f09
 #25 ext4_ext_map_blocks at ffffffff8124a848
 #26 ext4_map_blocks at ffffffff8121a5b7
 #27 mpage_map_one_extent at ffffffff8121b1fa
 #28 mpage_map_and_submit_extent at ffffffff8121f07b
 #29 ext4_writepages at ffffffff8121f6d5
 #30 do_writepages at ffffffff8112c490
 #31 __filemap_fdatawrite_range at ffffffff81120199
 #32 filemap_flush at ffffffff8112041c
 #33 ext4_alloc_da_blocks at ffffffff81219da1
 #34 ext4_rename at ffffffff81229b91
 #35 ext4_rename2 at ffffffff81229e32
 #36 vfs_rename at ffffffff811a08a5
 #37 SYSC_renameat2 at ffffffff811a3ffc
 #38 sys_renameat2 at ffffffff811a408e
 #39 sys_rename at ffffffff8119e51e
 #40 system_call_fastpath at ffffffff815afa89

Dave Chinner has properly pointed out that this is a deadlock in the
reclaim code because ext4 doesn't submit pages which are marked by
PG_writeback right away.

The heuristic was introduced by commit e62e384 ("memcg: prevent OOM
with too many dirty pages") and it was applied only when may_enter_fs
was specified.  The code has been changed by c3b94f4 ("memcg:
further prevent OOM with too many dirty pages") which has removed the
__GFP_FS restriction with a reasoning that we do not get into the fs
code.  But this is not sufficient apparently because the fs doesn't
necessarily submit pages marked PG_writeback for IO right away.

ext4_bio_write_page calls io_submit_add_bh but that doesn't necessarily
submit the bio.  Instead it tries to map more pages into the bio and
mpage_map_one_extent might trigger memcg charge which might end up
waiting on a page which is marked PG_writeback but hasn't been submitted
yet so we would end up waiting for something that never finishes.

Fix this issue by replacing __GFP_IO by may_enter_fs check (for case 2)
before we go to wait on the writeback.  The page fault path, which is
the only path that triggers memcg oom killer since 3.12, shouldn't
require GFP_NOFS and so we shouldn't reintroduce the premature OOM
killer issue which was originally addressed by the heuristic.

According to Dave Chinner, xfs has been doing a similar thing since 2.6.15,
so ext4 is not the only affected filesystem.  Moreover he notes:

: For example: IO completion might require unwritten extent conversion
: which executes filesystem transactions and GFP_NOFS allocations. The
: writeback flag on the pages can not be cleared until unwritten
: extent conversion completes. Hence memory reclaim cannot wait on
: page writeback to complete in GFP_NOFS context because it is not
: safe to do so, memcg reclaim or otherwise.

[tytso@mit.edu: corrected the control flow]
Fixes: c3b94f4 ("memcg: further prevent OOM with too many dirty pages")
Reported-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[@MSF-Jarvis: Fix conflicts from "mm: vmscan: stall page reclaim after a list of pages have been processed" ]

Change-Id: I09aa7c565388b4b323034d5c71a463f4fb175462
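
The change of condition described in the commit message above (checking may_enter_fs instead of __GFP_IO before waiting on writeback) can be sketched as a pair of small predicates. This is a simplified model under stated assumptions: the `TOY_GFP_*` bit values and all `toy_*` function names are hypothetical, the swap-cache clause in `toy_may_enter_fs` reflects how reclaim computed may_enter_fs in kernels of that era as far as I recall and is not stated in the message, and the other conditions of the reclaim path (global reclaim, page flags) are left out.

```c
#include <stdbool.h>

/* Hypothetical flag bits; the real __GFP_IO/__GFP_FS values differ. */
#define TOY_GFP_IO 0x1u
#define TOY_GFP_FS 0x2u

/*
 * Assumed shape of may_enter_fs: filesystem calls are allowed, or the page
 * is in the swap cache and plain IO is allowed.
 */
static bool toy_may_enter_fs(unsigned int gfp_mask, bool page_in_swap_cache)
{
    return (gfp_mask & TOY_GFP_FS) ||
           (page_in_swap_cache && (gfp_mask & TOY_GFP_IO));
}

/* Old rule from the message: waiting is allowed whenever IO is allowed. */
static bool toy_old_may_wait(unsigned int gfp_mask)
{
    return (gfp_mask & TOY_GFP_IO) != 0;
}

/* New rule from the message: waiting is allowed only if may_enter_fs holds. */
static bool toy_new_may_wait(unsigned int gfp_mask, bool page_in_swap_cache)
{
    return toy_may_enter_fs(gfp_mask, page_in_swap_cache);
}
```

With a GFP_NOFS-like mask (IO allowed, FS not allowed) and an ordinary page-cache page, the old rule lets reclaim wait on writeback while the new rule does not, which is the case the backtrace above got stuck in.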

mihirshah006 added a commit to mihirshah006/Xiaomi_Kernel_OpenSource that referenced this issue Jul 25, 2017

dccp/tcp: fix routing redirect race
commit 45caeaa upstream.

@AntonyLeons commented Jul 30, 2017

They should release all kernel sources; this does not damage your operations! It helps your image.

@lzto commented Nov 5, 2017

Guys, I am joining you in requesting the source code for the mibox2.
It sucks that the source code is not released.

@ghost commented Nov 6, 2017

I'm joining this for the Redmi 4X.
Such d*****s. @hehao is always delaying the kernel source and never hears us. They always act like they don't care, wtf. If you always act like this, @hehao, my next phone won't be a Xiaomi again. Waiting hurts.

@luk1337 commented Nov 6, 2017

haha reeeeeeeeeeeeeeee

@julian-alarcon

This comment has been minimized.

@ghost commented Jan 18, 2018

Spamming bish

maxprzemo added a commit to maxprzemo/Xiaomi_Kernel_OpenSource that referenced this issue Feb 18, 2018

dccp/tcp: fix routing redirect race
commit 45caeaa upstream.


maxprzemo added a commit to maxprzemo/Xiaomi_Kernel_OpenSource that referenced this issue Feb 18, 2018

xfs: fix up xfs_swap_extent_forks inline extent handling
commit 4dfce57 upstream.

There have been several reports over the years of NULL pointer
dereferences in xfs_trans_log_inode during xfs_fsr processes,
when the process is doing an fput and tearing down extents
on the temporary inode, something like:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
PID: 29439  TASK: ffff880550584fa0  CPU: 6   COMMAND: "xfs_fsr"
    [exception RIP: xfs_trans_log_inode+0x10]
 #9 [ffff8800a57bbbe0] xfs_bunmapi at ffffffffa037398e [xfs]
#10 [ffff8800a57bbce8] xfs_itruncate_extents at ffffffffa0391b29 [xfs]
#11 [ffff8800a57bbd88] xfs_inactive_truncate at ffffffffa0391d0c [xfs]
#12 [ffff8800a57bbdb8] xfs_inactive at ffffffffa0392508 [xfs]
#13 [ffff8800a57bbdd8] xfs_fs_evict_inode at ffffffffa035907e [xfs]
#14 [ffff8800a57bbe00] evict at ffffffff811e1b67
#15 [ffff8800a57bbe28] iput at ffffffff811e23a5
#16 [ffff8800a57bbe58] dentry_kill at ffffffff811dcfc8
#17 [ffff8800a57bbe88] dput at ffffffff811dd06c
#18 [ffff8800a57bbea8] __fput at ffffffff811c823b
#19 [ffff8800a57bbef0] ____fput at ffffffff811c846e
#20 [ffff8800a57bbf00] task_work_run at ffffffff81093b27
#21 [ffff8800a57bbf30] do_notify_resume at ffffffff81013b0c
#22 [ffff8800a57bbf50] int_signal at ffffffff8161405d

As it turns out, this is because the i_itemp pointer, along
with the d_ops pointer, has been overwritten with zeros
when we tear down the extents during truncate.  When the in-core
inode fork on the temporary inode used by xfs_fsr was originally
set up during the extent swap, we mistakenly looked at di_nextents
to determine whether all extents fit inline, but this misses extents
generated by speculative preallocation; we should be using if_bytes
instead.

This mistake corrupts the in-memory inode, and code in
xfs_iext_remove_inline eventually gets bad inputs, causing
it to memmove and memset incorrect ranges; this became apparent
because the two values in ifp->if_u2.if_inline_ext[1] contained
what should have been in d_ops and i_itemp; they were memmoved due
to incorrect array indexing and then the original locations
were zeroed with memset, again due to an array overrun.

Fix this by properly using i_df.if_bytes to determine the number
of extents, not di_nextents.

Thanks to dchinner for looking at this with me and spotting the
root cause.

[nborisov: backported to 4.4]

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
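
The fix described in the commit message above derives the extent count from the in-memory byte count rather than from the on-disk counter. A minimal sketch of that idea follows, under stated assumptions: `toy_extent_rec`, `toy_fork`, `TOY_INLINE_EXTS` and the helper functions are hypothetical names, the two-word record layout is inferred from the message's mention of "the two values" in an inline extent slot, and the inline capacity of 2 is an assumption rather than something the message states.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy extent record made of two 64-bit words (assumed layout). */
struct toy_extent_rec {
    uint64_t l0;
    uint64_t l1;
};

/* Hypothetical number of records that fit in the inline array. */
#define TOY_INLINE_EXTS 2

/* Toy in-core fork: only the field the fix relies on. */
struct toy_fork {
    size_t if_bytes;    /* bytes of extent records currently held in memory */
};

/* Number of in-memory extents, derived from the byte count. */
static size_t toy_extent_count(const struct toy_fork *fork)
{
    return fork->if_bytes / sizeof(struct toy_extent_rec);
}

/* Decide whether the extents fit inline using the in-memory count. */
static bool toy_fits_inline(const struct toy_fork *fork)
{
    return toy_extent_count(fork) <= TOY_INLINE_EXTS;
}
```

If the on-disk counter says 2 while speculative preallocation has left 3 records in memory, a check against the on-disk counter would wrongly pick the inline layout, whereas the byte-based count does not.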
@Vdragon (Author) commented Feb 26, 2018

Hello to everyone who has followed this issue: as the kernel source of the Mi Pad (mocha) has been released by @hehao, this issue will now be closed.

Xiaomi, Inc. took more than 16 months (counted from this issue's creation date to the release commit, and even longer counted from the product release date) to release the kernel source, which is far from what product users, third-party developers and the original kernel developers can reasonably expect. They can do better.

Lastly, thanks to everyone here fighting for software freedom; I hope you get a much more successful outcome than this one.

@Vdragon closed this Feb 26, 2018

@dineshdb commented Mar 28, 2018

Hello everyone, I am a Redmi 4A user and would like the kernel sources for it. Have they been released already? I made a huge mistake buying a Xiaomi smartphone. I will start turning other users away unless they get released soon!

@dineshdb commented Mar 28, 2018

Found this issue: #825

@denzuko commented Mar 28, 2018

http://gpl-violations.org/ galore for this company. Hopefully anon doesn't get wind of this.

@dineshdb commented Mar 29, 2018

@denzuko Is there something we can do about GPL violations as users? Like reporting to the FSF or something similar? I don't like sitting here doing nothing. Petitions are good but there could be better options.

@Vdragon (Author) commented Mar 29, 2018

@dineshdb It should be possible to start crowdfunding to cover the legal costs of GPL software copyright holders suing the violators.

Goayandi added a commit to Goayandi/android_kernel_xiaomi_cappu that referenced this issue Apr 3, 2018

dccp/tcp: fix routing redirect race
commit 45caeaa upstream.

As Eric Dumazet pointed out this also needs to be fixed in IPv6.
v2: Contains the IPv6 tcp/Ipv6 dccp patches as well.

We have seen a few incidents lately where a dst_entry has been freed
with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that
dst_entry. If the conditions/timings are right a crash then ensues when the
freed dst_entry is referenced later on. A common crashing back trace is:

 #8 [] page_fault at ffffffff8163e648
    [exception RIP: __tcp_ack_snd_check+74]
.
.
 #9 [] tcp_rcv_established at ffffffff81580b64
#10 [] tcp_v4_do_rcv at ffffffff8158b54a
#11 [] tcp_v4_rcv at ffffffff8158cd02
#12 [] ip_local_deliver_finish at ffffffff815668f4
#13 [] ip_local_deliver at ffffffff81566bd9
#14 [] ip_rcv_finish at ffffffff8156656d
#15 [] ip_rcv at ffffffff81566f06
#16 [] __netif_receive_skb_core at ffffffff8152b3a2
#17 [] __netif_receive_skb at ffffffff8152b608
#18 [] netif_receive_skb at ffffffff8152b690
#19 [] vmxnet3_rq_rx_complete at ffffffffa015eeaf [vmxnet3]
#20 [] vmxnet3_poll_rx_only at ffffffffa015f32a [vmxnet3]
#21 [] net_rx_action at ffffffff8152bac2
#22 [] __do_softirq at ffffffff81084b4f
#23 [] call_softirq at ffffffff8164845c
#24 [] do_softirq at ffffffff81016fc5
#25 [] irq_exit at ffffffff81084ee5
#26 [] do_IRQ at ffffffff81648ff8

Of course it may happen with other NIC drivers as well.

The freed dst_entry is dereferenced here:

static bool tcp_in_quickack_mode(struct sock *sk)
{
        const struct inet_connection_sock *icsk = inet_csk(sk);
        const struct dst_entry *dst = __sk_dst_get(sk);

        return (dst && dst_metric(dst, RTAX_QUICKACK)) ||
                (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);
}

But there are other backtraces attributed to the same freed dst_entry in
netfilter code as well.

All the vmcores showed 2 significant clues:

- Remote hosts behind the default gateway had always been redirected to a
different gateway. A rtable/dst_entry is added for each such host, which
creates more dst_entries with lower reference counts and makes the race
more probable.

- All vmcores showed a positive LockDroppedIcmps value, e.g.:

LockDroppedIcmps                  267

A closer look at the tcp_v4_err() handler revealed that do_redirect() will run
regardless of whether user space has the socket locked. This can result in a
race condition where the reference count of the same dst_entry cached in
sk->sk_dst_cache can be decremented twice for the same socket via:

do_redirect()->__sk_dst_check()->dst_release().

This leads to the dst_entry being prematurely freed with another socket
pointing to it via sk->sk_dst_cache, and a subsequent crash.

To fix this, skip do_redirect() if user space has the socket locked. Instead, let
the redirect take place later when user space does not have the socket
locked.
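
The guard described above can be illustrated with a small userspace model. This is only a sketch and not the kernel patch: `toy_dst`, `toy_sock`, `toy_do_redirect()` and `toy_handle_redirect()` are hypothetical stand-ins for the dst_entry, the socket, do_redirect() and the ICMP error handler, and the `owned_by_user` flag models "user space has the socket locked".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct toy_dst {
    int refcnt;     /* references held on this route */
    bool obsolete;  /* set once the route has been invalidated */
};

struct toy_sock {
    bool owned_by_user;         /* models "user space has the socket locked" */
    struct toy_dst *dst_cache;  /* models sk->sk_dst_cache */
    int skipped_redirects;      /* redirects left for later processing */
};

/* Invalidate the cached route and drop the socket's single reference. */
static void toy_do_redirect(struct toy_sock *sk)
{
    struct toy_dst *dst = sk->dst_cache;

    if (dst == NULL)
        return;
    dst->obsolete = true;
    sk->dst_cache = NULL;
    dst->refcnt--;
}

/*
 * Error-handler entry point: touch the cached route only when the socket
 * is not owned by user space. Returns true if the redirect was processed.
 */
bool toy_handle_redirect(struct toy_sock *sk)
{
    if (sk->owned_by_user) {
        sk->skipped_redirects++;
        return false;
    }
    toy_do_redirect(sk);
    return true;
}
```

In this model, a redirect that arrives while the owner flag is set leaves both the reference count and the cached pointer untouched, so the owner can never observe a route that was released behind its back.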

The dccp/IPv6 code is very similar in this respect, so it is fixed there too.

As Eric Garver pointed out, the following commit now invalidates routes, which
can set the dst->obsolete flag so that ipv4_dst_check() returns NULL and
triggers the dst_release().

Fixes: ceb3320 ("ipv4: Kill routes during PMTU/redirect updates.")
Cc: Eric Garver <egarver@redhat.com>
Cc: Hannes Sowa <hsowa@redhat.com>
Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Goayandi added a commit to Goayandi/android_kernel_xiaomi_cappu that referenced this issue Apr 3, 2018

xfs: fix up xfs_swap_extent_forks inline extent handling
commit 4dfce57 upstream.

There have been several reports over the years of NULL pointer
dereferences in xfs_trans_log_inode during xfs_fsr processes,
when the process is doing an fput and tearing down extents
on the temporary inode, something like:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
PID: 29439  TASK: ffff880550584fa0  CPU: 6   COMMAND: "xfs_fsr"
    [exception RIP: xfs_trans_log_inode+0x10]
 #9 [ffff8800a57bbbe0] xfs_bunmapi at ffffffffa037398e [xfs]
#10 [ffff8800a57bbce8] xfs_itruncate_extents at ffffffffa0391b29 [xfs]
#11 [ffff8800a57bbd88] xfs_inactive_truncate at ffffffffa0391d0c [xfs]
#12 [ffff8800a57bbdb8] xfs_inactive at ffffffffa0392508 [xfs]
#13 [ffff8800a57bbdd8] xfs_fs_evict_inode at ffffffffa035907e [xfs]
#14 [ffff8800a57bbe00] evict at ffffffff811e1b67
#15 [ffff8800a57bbe28] iput at ffffffff811e23a5
#16 [ffff8800a57bbe58] dentry_kill at ffffffff811dcfc8
#17 [ffff8800a57bbe88] dput at ffffffff811dd06c
#18 [ffff8800a57bbea8] __fput at ffffffff811c823b
#19 [ffff8800a57bbef0] ____fput at ffffffff811c846e
#20 [ffff8800a57bbf00] task_work_run at ffffffff81093b27
#21 [ffff8800a57bbf30] do_notify_resume at ffffffff81013b0c
#22 [ffff8800a57bbf50] int_signal at ffffffff8161405d

As it turns out, this is because the i_itemp pointer, along
with the d_ops pointer, has been overwritten with zeros
when we tear down the extents during truncate.  When the in-core
inode fork on the temporary inode used by xfs_fsr was originally
set up during the extent swap, we mistakenly looked at di_nextents
to determine whether all extents fit inline, but this misses extents
generated by speculative preallocation; we should be using if_bytes
instead.

This mistake corrupts the in-memory inode, and code in
xfs_iext_remove_inline eventually gets bad inputs, causing
it to memmove and memset incorrect ranges; this became apparent
because the two values in ifp->if_u2.if_inline_ext[1] contained
what should have been in d_ops and i_itemp; they were memmoved due
to incorrect array indexing and then the original locations
were zeroed with memset, again due to an array overrun.

Fix this by properly using i_df.if_bytes to determine the number
of extents, not di_nextents.
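
The difference between the two counts can be sketched in a few lines of userspace C. This is an illustration under stated assumptions, not the XFS code: `toy_extent_rec`, `toy_fork`, `TOY_INLINE_EXTS` and both helpers are hypothetical names, the 16-byte record size and the inline limit of 2 are assumed values, and keeping `di_nextents` next to `if_bytes` in one struct is a simplification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-core extent record: two 64-bit words (assumed layout). */
struct toy_extent_rec {
    uint64_t l0;
    uint64_t l1;
};

/* Assumed number of extent records that fit in the inline area. */
#define TOY_INLINE_EXTS 2

struct toy_fork {
    size_t if_bytes;       /* bytes of extent records held in memory */
    unsigned di_nextents;  /* extent count recorded in the on-disk inode */
};

/* Count extents from the in-memory size, as the fix describes. */
size_t toy_incore_extents(const struct toy_fork *f)
{
    return f->if_bytes / sizeof(struct toy_extent_rec);
}

/* Decide whether all in-memory extents fit in the inline area. */
bool toy_fits_inline(const struct toy_fork *f)
{
    return toy_incore_extents(f) <= TOY_INLINE_EXTS;
}
```

With `di_nextents` equal to 2 and `if_bytes` covering three records (one extra from speculative preallocation), a check against `di_nextents` would wrongly conclude that everything fits inline, while `toy_fits_inline()` reports that it does not.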

Thanks to dchinner for looking at this with me and spotting the
root cause.

[nborisov: backported to 4.4]

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Goayandi added a commit to Goayandi/android_kernel_xiaomi_cappu that referenced this issue Apr 13, 2018

dccp/tcp: fix routing redirect race
commit 45caeaa upstream.

Goayandi added a commit to Goayandi/android_kernel_xiaomi_cappu that referenced this issue Apr 13, 2018

xfs: fix up xfs_swap_extent_forks inline extent handling
commit 4dfce57 upstream.

Goayandi added a commit to Goayandi/android_kernel_xiaomi_cappu that referenced this issue Aug 12, 2018

cdrom: do not call check_disk_change() inside cdrom_open()
[ Upstream commit 2bbea6e117357d17842114c65e9a9cf2d13ae8a3 ]

When mounting an ISO filesystem, the system sometimes (very rarely)
hangs because of a race condition between two tasks.

PID: 6766   TASK: ffff88007b2a6dd0  CPU: 0   COMMAND: "mount"
 #0 [ffff880078447ae0] __schedule at ffffffff8168d605
 #1 [ffff880078447b48] schedule_preempt_disabled at ffffffff8168ed49
 #2 [ffff880078447b58] __mutex_lock_slowpath at ffffffff8168c995
 #3 [ffff880078447bb8] mutex_lock at ffffffff8168bdef
 #4 [ffff880078447bd0] sr_block_ioctl at ffffffffa00b6818 [sr_mod]
 #5 [ffff880078447c10] blkdev_ioctl at ffffffff812fea50
 #6 [ffff880078447c70] ioctl_by_bdev at ffffffff8123a8b3
 #7 [ffff880078447c90] isofs_fill_super at ffffffffa04fb1e1 [isofs]
 #8 [ffff880078447da8] mount_bdev at ffffffff81202570
 #9 [ffff880078447e18] isofs_mount at ffffffffa04f9828 [isofs]
#10 [ffff880078447e28] mount_fs at ffffffff81202d09
#11 [ffff880078447e70] vfs_kern_mount at ffffffff8121ea8f
#12 [ffff880078447ea8] do_mount at ffffffff81220fee
#13 [ffff880078447f28] sys_mount at ffffffff812218d6
#14 [ffff880078447f80] system_call_fastpath at ffffffff81698c49
    RIP: 00007fd9ea914e9a  RSP: 00007ffd5d9bf648  RFLAGS: 00010246
    RAX: 00000000000000a5  RBX: ffffffff81698c49  RCX: 0000000000000010
    RDX: 00007fd9ec2bc210  RSI: 00007fd9ec2bc290  RDI: 00007fd9ec2bcf30
    RBP: 0000000000000000   R8: 0000000000000000   R9: 0000000000000010
    R10: 00000000c0ed0001  R11: 0000000000000206  R12: 00007fd9ec2bc040
    R13: 00007fd9eb6b2380  R14: 00007fd9ec2bc210  R15: 00007fd9ec2bcf30
    ORIG_RAX: 00000000000000a5  CS: 0033  SS: 002b

This task was trying to mount the cdrom.  It allocated and configured a
super_block struct and owned the write-lock for the super_block->s_umount
rwsem. While exclusively owning the s_umount lock, it called
sr_block_ioctl and waited to acquire the global sr_mutex lock.

PID: 6785   TASK: ffff880078720fb0  CPU: 0   COMMAND: "systemd-udevd"
 #0 [ffff880078417898] __schedule at ffffffff8168d605
 #1 [ffff880078417900] schedule at ffffffff8168dc59
 #2 [ffff880078417910] rwsem_down_read_failed at ffffffff8168f605
 #3 [ffff880078417980] call_rwsem_down_read_failed at ffffffff81328838
 #4 [ffff8800784179d0] down_read at ffffffff8168cde0
 #5 [ffff8800784179e8] get_super at ffffffff81201cc7
 #6 [ffff880078417a10] __invalidate_device at ffffffff8123a8de
 #7 [ffff880078417a40] flush_disk at ffffffff8123a94b
 #8 [ffff880078417a88] check_disk_change at ffffffff8123ab50
 #9 [ffff880078417ab0] cdrom_open at ffffffffa00a29e1 [cdrom]
#10 [ffff880078417b68] sr_block_open at ffffffffa00b6f9b [sr_mod]
#11 [ffff880078417b98] __blkdev_get at ffffffff8123ba86
#12 [ffff880078417bf0] blkdev_get at ffffffff8123bd65
#13 [ffff880078417c78] blkdev_open at ffffffff8123bf9b
#14 [ffff880078417c90] do_dentry_open at ffffffff811fc7f7
#15 [ffff880078417cd8] vfs_open at ffffffff811fc9cf
#16 [ffff880078417d00] do_last at ffffffff8120d53d
#17 [ffff880078417db0] path_openat at ffffffff8120e6b2
#18 [ffff880078417e48] do_filp_open at ffffffff8121082b
#19 [ffff880078417f18] do_sys_open at ffffffff811fdd33
#20 [ffff880078417f70] sys_open at ffffffff811fde4e
#21 [ffff880078417f80] system_call_fastpath at ffffffff81698c49
    RIP: 00007f29438b0c20  RSP: 00007ffc76624b78  RFLAGS: 00010246
    RAX: 0000000000000002  RBX: ffffffff81698c49  RCX: 0000000000000000
    RDX: 00007f2944a5fa70  RSI: 00000000000a0800  RDI: 00007f2944a5fa70
    RBP: 00007f2944a5f540   R8: 0000000000000000   R9: 0000000000000020
    R10: 00007f2943614c40  R11: 0000000000000246  R12: ffffffff811fde4e
    R13: ffff880078417f78  R14: 000000000000000c  R15: 00007f2944a4b010
    ORIG_RAX: 0000000000000002  CS: 0033  SS: 002b

This task tried to open the cdrom device; the sr_block_open function
acquired the global sr_mutex lock. The call to check_disk_change()
then saw an event flag indicating a possible media change and tried
to flush any cached data for the device.
As part of the flush, it tried to acquire the super_block->s_umount
lock associated with the cdrom device.
This was the same super_block as created and locked by the previous task.

The first task acquires the s_umount lock and then the sr_mutex lock;
the second task acquires the sr_mutex lock and then the s_umount lock.

This patch fixes the issue by moving check_disk_change() out of
cdrom_open() and letting the caller take care of it.
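
The inverted lock order described above can be made concrete with a tiny order-tracking model. This is only a sketch: `toy_lockdep` and `toy_record_order()` are hypothetical names, the model is far simpler than the kernel's own lock validator, and it assumes the two locks passed in are distinct.

```c
#include <stdbool.h>

/* The two locks involved in the hang, as toy identifiers. */
enum toy_lock {
    TOY_S_UMOUNT = 0,  /* models super_block->s_umount */
    TOY_SR_MUTEX = 1,  /* models the global sr_mutex */
    TOY_NLOCKS
};

/* seen[a][b] is true once some task wanted lock b while holding lock a. */
struct toy_lockdep {
    bool seen[TOY_NLOCKS][TOY_NLOCKS];
};

/*
 * Record that a task holding `held` now wants `wanted`.
 * Returns true if the opposite order was recorded earlier, which is
 * the ABBA pattern that can deadlock two tasks.
 */
bool toy_record_order(struct toy_lockdep *ld, enum toy_lock held,
                      enum toy_lock wanted)
{
    ld->seen[held][wanted] = true;
    return ld->seen[wanted][held];
}
```

Feeding in the "mount" task (holds s_umount, wants sr_mutex) and then the "systemd-udevd" task (holds sr_mutex, wants s_umount) makes the second call report an inversion, matching the two backtraces.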

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Goayandi added a commit to Goayandi/android_kernel_xiaomi_cappu that referenced this issue Sep 6, 2018

brcmfmac: stop watchdog before detach and free everything
[ Upstream commit 373c83a801f15b1e3d02d855fad89112bd4ccbe0 ]

Using the driver built into the kernel image without a firmware in the filesystem
or in the kernel image can lead to a kernel NULL pointer dereference.
The watchdog needs to be stopped in brcmf_sdio_remove.
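
The teardown order named in the commit title (stop the watchdog, then detach and free) can be sketched with a single-threaded toy model. Everything here is hypothetical: `toy_bus`, `toy_dev`, `toy_watchdog_tick()` and `toy_remove()` are illustrative names and not the brcmfmac API, and a real driver would also have to wait for the watchdog thread to exit.

```c
#include <stdbool.h>
#include <stdlib.h>

/* State the watchdog touches on every iteration (hypothetical). */
struct toy_bus {
    int ticks;
};

struct toy_dev {
    bool wd_running;      /* models the watchdog being active */
    struct toy_bus *bus;  /* freed during removal */
};

/* One watchdog iteration; returns false when the watchdog is stopped. */
bool toy_watchdog_tick(struct toy_dev *dev)
{
    if (!dev->wd_running)
        return false;
    dev->bus->ticks++;
    return true;
}

/* Removal: stop the watchdog first, only then free what it uses. */
void toy_remove(struct toy_dev *dev)
{
    dev->wd_running = false;
    free(dev->bus);
    dev->bus = NULL;
}
```

After `toy_remove()`, a late tick returns early instead of dereferencing the freed bus state, which is the kind of NULL pointer access shown in the log below.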

The system is going down NOW!
[ 1348.110759] Unable to handle kernel NULL pointer dereference at virtual address 000002f8
Sent SIGTERM to all processes
[ 1348.121412] Mem abort info:
[ 1348.126962]   ESR = 0x96000004
[ 1348.130023]   Exception class = DABT (current EL), IL = 32 bits
[ 1348.135948]   SET = 0, FnV = 0
[ 1348.138997]   EA = 0, S1PTW = 0
[ 1348.142154] Data abort info:
[ 1348.145045]   ISV = 0, ISS = 0x00000004
[ 1348.148884]   CM = 0, WnR = 0
[ 1348.151861] user pgtable: 4k pages, 48-bit VAs, pgdp = (____ptrval____)
[ 1348.158475] [00000000000002f8] pgd=0000000000000000
[ 1348.163364] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 1348.168927] Modules linked in: ipv6
[ 1348.172421] CPU: 3 PID: 1421 Comm: brcmf_wdog/mmc0 Not tainted 4.17.0-rc5-next-20180517 #18
[ 1348.180757] Hardware name: Amarula A64-Relic (DT)
[ 1348.185455] pstate: 60000005 (nZCv daif -PAN -UAO)
[ 1348.190251] pc : brcmf_sdiod_freezer_count+0x0/0x20
[ 1348.195124] lr : brcmf_sdio_watchdog_thread+0x64/0x290
[ 1348.200253] sp : ffff00000b85be30
[ 1348.203561] x29: ffff00000b85be30 x28: 0000000000000000
[ 1348.208868] x27: ffff00000b6cb918 x26: ffff80003b990638
[ 1348.214176] x25: ffff0000087b1a20 x24: ffff80003b94f800
[ 1348.219483] x23: ffff000008e620c8 x22: ffff000008f0b660
[ 1348.224790] x21: ffff000008c6a858 x20: 00000000fffffe00
[ 1348.230097] x19: ffff80003b94f800 x18: 0000000000000001
[ 1348.235404] x17: 0000ffffab2e8a74 x16: ffff0000080d7de8
[ 1348.240711] x15: 0000000000000000 x14: 0000000000000400
[ 1348.246018] x13: 0000000000000400 x12: 0000000000000001
[ 1348.251324] x11: 00000000000002c4 x10: 0000000000000a10
[ 1348.256631] x9 : ffff00000b85bc40 x8 : ffff80003be11870
[ 1348.261937] x7 : ffff80003dfc7308 x6 : 000000078ff08b55
[ 1348.267243] x5 : 00000139e1058400 x4 : 0000000000000000
[ 1348.272550] x3 : dead000000000100 x2 : 958f2788d6618100
[ 1348.277856] x1 : 00000000fffffe00 x0 : 0000000000000000

Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Tested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Goayandi added a commit to Goayandi/android_kernel_xiaomi_cappu that referenced this issue Oct 7, 2018

brcmfmac: stop watchdog before detach and free everything
[ Upstream commit 373c83a801f15b1e3d02d855fad89112bd4ccbe0 ]

Ahmed-Hady added a commit to Ahmed-Hady/android_xiaomi_daisy that referenced this issue Nov 18, 2018

cdrom: do not call check_disk_change() inside cdrom_open()
[ Upstream commit 2bbea6e117357d17842114c65e9a9cf2d13ae8a3 ]
