Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Lineage 17.1] ASB 2022-05 fixes #19

Merged
merged 6 commits into from
May 12, 2022

Conversation

bananafunction
Copy link

... I hope that I found all :-) (as always, please check)

Jianmin Zhu and others added 2 commits March 23, 2022 04:07
Add bound check before access array to avoid out of bound issue.
Separate array bound and duplicate check of 11a and 11b since they have
different length and type.

Change-Id: Icb9382cd42385339532518759de0f6137c5203bd
CRs-Fixed: 3051517
commit 9bbd42e upstream.

Doing a "get_user_pages()" on a copy-on-write page for reading can be
ambiguous: the page can be COW'ed at any time afterwards, and the
direction of a COW event isn't defined.

Yes, whoever writes to it will generally do the COW, but if the thread
that did the get_user_pages() unmapped the page before the write (and
that could happen due to memory pressure in addition to any outright
action), the writer could also just take over the old page instead.

End result: the get_user_pages() call might result in a page pointer
that is no longer associated with the original VM, and is associated
with - and controlled by - another VM having taken it over instead.

So when doing a get_user_pages() on a COW mapping, the only really safe
thing to do would be to break the COW when getting the page, even when
only getting it for reading.

At the same time, some users simply don't even care.

For example, the perf code wants to look up the page not because it
cares about the page, but because the code simply wants to look up the
physical address of the access for informational purposes, and doesn't
really care about races when a page might be unmapped and remapped
elsewhere.

This adds logic to force a COW event by setting FOLL_WRITE on any
copy-on-write mapping when FOLL_GET (or FOLL_PIN) is used to get a page
pointer as a result.

The current semantics end up being:

 - __get_user_pages_fast(): no change. If you don't ask for a write,
   you won't break COW. You'd better know what you're doing.

 - get_user_pages_fast(): the fast-case "look it up in the page tables
   without anything getting mmap_sem" now refuses to follow a read-only
   page, since it might need COW breaking.  Which happens in the slow
   path - the fast path doesn't know if the memory might be COW or not.

 - get_user_pages() (including the slow-path fallback for gup_fast()):
   for a COW mapping, turn on FOLL_WRITE for FOLL_GET/FOLL_PIN, with
   very similar semantics to FOLL_FORCE.

If it turns out that we want finer granularity (ie "only break COW when
it might actually matter" - things like the zero page are special and
don't need to be broken) we might need to push these semantics deeper
into the lookup fault path.  So if people care enough, it's possible
that we might end up adding a new internal FOLL_BREAK_COW flag to go
with the internal FOLL_COW flag we already have for tracking "I had a
COW".

Alternatively, if it turns out that different callers might want to
explicitly control the forced COW break behavior, we might even want to
make such a flag visible to the users of get_user_pages() instead of
using the above default semantics.

But for now, this is mostly commentary on the issue (this commit message
being a lot bigger than the patch, and that patch in turn is almost all
comments), with that minimal "enable COW breaking early" logic using the
existing FOLL_WRITE behavior.

[ It might be worth noting that we've always had this ambiguity, and it
  could arguably be seen as a user-space issue.

  You only get private COW mappings that could break either way in
  situations where user space is doing cooperative things (ie fork()
  before an execve() etc), but it _is_ surprising and very subtle, and
  fork() is supposed to give you independent address spaces.

  So let's treat this as a kernel issue and make the semantics of
  get_user_pages() easier to understand. Note that obviously a true
  shared mapping will still get a page that can change under us, so this
  does _not_ mean that get_user_pages() somehow returns any "stable"
  page ]

[surenb: backport notes
	Replaced (gup_flags | FOLL_WRITE) with write=1 in gup_pgd_range.
	Removed FOLL_PIN usage in should_force_cow_break since it's missing in
	the earlier kernels.]

Reported-by: Jann Horn <jannh@google.com>
Tested-by: Christoph Hellwig <hch@lst.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Kirill Shutemov <kirill@shutemov.name>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[surenb: backport to 4.19 kernel]
Cc: stable@vger.kernel.org # 4.19.x
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 4.9:
 - Generic get_user_pages_fast() calls __get_user_pages_fast() here,
   so make it pass write=1
 - Various architectures have their own implementations of
   get_user_pages_fast(), so apply the corresponding change there
 - Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
@bananafunction
Copy link
Author

bananafunction commented May 10, 2022

I think we do already have 1 and 3. I am gonna check for 2 🙂

Feedback:

@Flamefire
Copy link
Owner

Something looks wrong with the 2 commits: a902aba and 1533c58

The 2nd seems to be a revert of the first and both don't do what they say or resemble the originals

@bananafunction
Copy link
Author

bananafunction commented May 11, 2022

Something looks wrong with the 2 commits: a902aba and 1533c58

The 2nd seems to be a revert of the first and both don't do what they say or resemble the originals

Indeed that looks strange. I am gonna check that soon...

Update:
Something is wired in our drivers/staging/android/ion/ion.c: it seems that any merge or cherry-pick has not been applied correctly since we do have the function "static int ion_share_dma_buf_fd_nolock" duplicated. My cherry-pick (ion: remove unsafe function ion_handle_get_by_id()) just removed the duplicate as it was shown as conflict. Actually my cherry-pick is totally not necessary as we do already have it (see 52ad8f0).

I do not understand what you mean by "msm: kgsl: Add a sysfs node to control performance counter reads" is a revert?

I am gonna force push an update.

Update 2:
it was my own cherry-pick (staging: android: ion: fix ION_IOC_{MAP,SHARE} use-after-free) that created the mess :-(
For information: I got the commits from here:
https://git.codelinaro.org/clo/la/kernel/msm-4.4/-/commits/kernel.lnx.4.4.r40-rel/
and here:
https://git.codelinaro.org/clo/la/kernel/msm-4.4/-/commits/LA.UM.8.4.c25-05300-8x98.0

lixiang and others added 4 commits May 11, 2022 13:29
When handling memory import, payload_count is used for memory alloc
calculation. If the payload_count is too large, size will overflow
when creating page list.
Adding a sanity check for payload_count is necessary.

Change-Id: I6d60cea0c62bd29092852c55b766b77a94cb6e3b
Signed-off-by: lixiang <lixiang@codeaurora.org>
…ages()

Consider a scenario where user allocates anonymous memory but does not
write to it. Here the physical pages are not yet allocated. Now when this
memory is requested to be imported, a list of newly allocated zero pages
is obtained using get_user_pages(). Currently cache flush is not done for
these pages and hence GPU sees stale data. Fix this by performing cache
flush on these pages.

Change-Id: Id1e8aa20e8a9de112761732ed92f30c01088840b
Signed-off-by: Puranam V G Tejaswi <quic_pvgtejas@quicinc.com>
Signed-off-by: Sebanti Das <quic_sebadas@quicinc.com>
Signed-off-by: Kamal Agrawal <quic_kamaagra@quicinc.com>
Currently performance counters are global and can be read by anyone. Change
the behaviour to disable reading global counters as default and add a sysfs
node to enable/disable reads.

Change-Id: Ic3785acd9bd7425c2a844ed103d7b870d9f80adf
Signed-off-by: Mohammed Mirza Mandayappurath Manzoor <quic_mmandaya@quicinc.com>
Signed-off-by: Harshitha Sai Neelati <quic_hsaineel@quicinc.com>
Signed-off-by: Pankaj Gupta <quic_gpankaj@quicinc.com>
Signed-off-by: Kamal Agrawal <quic_kamaagra@quicinc.com>
	* qcacld-3.0: Avoid possible array OOB
@Flamefire
Copy link
Owner

Thanks for the work, however I'd like to focus on CIP updates and official ASB patches instead of cherry-picking random commits from yet another repo which may create merge conflicts later on. It is already hard enough.
Using those for now anyway as they look right, but e.g. 499b849 and c2f0f9c are not required. The first is related to some further work which actually makes use of that node and the second is related to some component we don't use.

I'd propose the following policy:

@Flamefire Flamefire merged commit 7a30290 into Flamefire:lineage-17.1 May 12, 2022
@bananafunction
Copy link
Author

Thanks for the work, however I'd like to focus on CIP updates and official ASB patches instead of cherry-picking random commits from yet another repo which may create merge conflicts later on. It is already hard enough.
Using those for now anyway as they look right, but e.g. 499b849 and c2f0f9c are not required. The first is related to some further work which actually makes use of that node and the second is related to some component we don't use.

I'd propose the following policy:

Of course we can do so and I fully understand your argumentation 🙂
Yes, CAF repos have been moved.
In means of "up to date" codebase I'd like to mention that we already merged CAF kernel branches before, so I think that keeping track of a CAF kernel branch might be reasonable.

Anyway, thank you for taking care of the kernel 👍

bananafunction pushed a commit to bananafunction/android_kernel_sony_msm8998-1 that referenced this pull request May 16, 2022
[ Upstream commit 4224cfd7fb6523f7a9d1c8bb91bb5df1e38eb624 ]

When bringing down the netdevice or system shutdown, a panic can be
triggered while accessing the sysfs path because the device is already
removed.

    [  755.549084] mlx5_core 0000:12:00.1: Shutdown was called
    [  756.404455] mlx5_core 0000:12:00.0: Shutdown was called
    ...
    [  757.937260] BUG: unable to handle kernel NULL pointer dereference at           (null)
    [  758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280

    crash> bt
    ...
    PID: 12649  TASK: ffff8924108f2100  CPU: 1   COMMAND: "amsd"
    ...
     Flamefire#9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778
        [exception RIP: dma_pool_alloc+0x1ab]
        RIP: ffffffff8ee11acb  RSP: ffff89240e1a3968  RFLAGS: 00010046
        RAX: 0000000000000246  RBX: ffff89243d874100  RCX: 0000000000001000
        RDX: 0000000000000000  RSI: 0000000000000246  RDI: ffff89243d874090
        RBP: ffff89240e1a39c0   R8: 000000000001f080   R9: ffff8905ffc03c00
        R10: ffffffffc04680d4  R11: ffffffff8edde9fd  R12: 00000000000080d0
        R13: ffff89243d874090  R14: ffff89243d874080  R15: 0000000000000000
        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
    Flamefire#10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core]
    Flamefire#11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core]
    Flamefire#12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core]
    Flamefire#13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core]
    Flamefire#14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core]
    Flamefire#15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core]
    Flamefire#16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core]
    Flamefire#17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46
    Flamefire#18 [ffff89240e1a3d48] speed_show at ffffffff8f277208
    Flamefire#19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3
    Flamefire#20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf
    Flamefire#21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596
    Flamefire#22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10
    Flamefire#23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5
    Flamefire#24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff
    Flamefire#25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f
    Flamefire#26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92

    crash> net_device.state ffff89443b0c0000
      state = 0x5  (__LINK_STATE_START| __LINK_STATE_NOCARRIER)

To prevent this scenario, we also make sure that the netdevice is present.

Signed-off-by: suresh kumar <suresh2514@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Flamefire pushed a commit that referenced this pull request May 17, 2022
[ Upstream commit 4224cfd7fb6523f7a9d1c8bb91bb5df1e38eb624 ]

When bringing down the netdevice or system shutdown, a panic can be
triggered while accessing the sysfs path because the device is already
removed.

    [  755.549084] mlx5_core 0000:12:00.1: Shutdown was called
    [  756.404455] mlx5_core 0000:12:00.0: Shutdown was called
    ...
    [  757.937260] BUG: unable to handle kernel NULL pointer dereference at           (null)
    [  758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280

    crash> bt
    ...
    PID: 12649  TASK: ffff8924108f2100  CPU: 1   COMMAND: "amsd"
    ...
     #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778
        [exception RIP: dma_pool_alloc+0x1ab]
        RIP: ffffffff8ee11acb  RSP: ffff89240e1a3968  RFLAGS: 00010046
        RAX: 0000000000000246  RBX: ffff89243d874100  RCX: 0000000000001000
        RDX: 0000000000000000  RSI: 0000000000000246  RDI: ffff89243d874090
        RBP: ffff89240e1a39c0   R8: 000000000001f080   R9: ffff8905ffc03c00
        R10: ffffffffc04680d4  R11: ffffffff8edde9fd  R12: 00000000000080d0
        R13: ffff89243d874090  R14: ffff89243d874080  R15: 0000000000000000
        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
    #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core]
    #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core]
    #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core]
    #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core]
    #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core]
    #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core]
    #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core]
    #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46
    #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208
    #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3
    #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf
    #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596
    #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10
    #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5
    #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff
    #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f
    #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92

    crash> net_device.state ffff89443b0c0000
      state = 0x5  (__LINK_STATE_START| __LINK_STATE_NOCARRIER)

To prevent this scenario, we also make sure that the netdevice is present.

Signed-off-by: suresh kumar <suresh2514@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Flamefire pushed a commit that referenced this pull request Jun 3, 2024
[ Upstream commit f8bbc07ac535593139c875ffa19af924b1084540 ]

vhost_worker will call tun call backs to receive packets. If too many
illegal packets arrives, tun_do_read will keep dumping packet contents.
When console is enabled, it will costs much more cpu time to dump
packet and soft lockup will be detected.

net_ratelimit mechanism can be used to limit the dumping rate.

PID: 33036    TASK: ffff949da6f20000  CPU: 23   COMMAND: "vhost-32980"
 #0 [fffffe00003fce50] crash_nmi_callback at ffffffff89249253
 #1 [fffffe00003fce58] nmi_handle at ffffffff89225fa3
 #2 [fffffe00003fceb0] default_do_nmi at ffffffff8922642e
 #3 [fffffe00003fced0] do_nmi at ffffffff8922660d
 #4 [fffffe00003fcef0] end_repeat_nmi at ffffffff89c01663
    [exception RIP: io_serial_in+20]
    RIP: ffffffff89792594  RSP: ffffa655314979e8  RFLAGS: 00000002
    RAX: ffffffff89792500  RBX: ffffffff8af428a0  RCX: 0000000000000000
    RDX: 00000000000003fd  RSI: 0000000000000005  RDI: ffffffff8af428a0
    RBP: 0000000000002710   R8: 0000000000000004   R9: 000000000000000f
    R10: 0000000000000000  R11: ffffffff8acbf64f  R12: 0000000000000020
    R13: ffffffff8acbf698  R14: 0000000000000058  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #5 [ffffa655314979e8] io_serial_in at ffffffff89792594
 #6 [ffffa655314979e8] wait_for_xmitr at ffffffff89793470
 #7 [ffffa65531497a08] serial8250_console_putchar at ffffffff897934f6
 #8 [ffffa65531497a20] uart_console_write at ffffffff8978b605
 #9 [ffffa65531497a48] serial8250_console_write at ffffffff89796558
 #10 [ffffa65531497ac8] console_unlock at ffffffff89316124
 #11 [ffffa65531497b10] vprintk_emit at ffffffff89317c07
 #12 [ffffa65531497b68] printk at ffffffff89318306
 #13 [ffffa65531497bc8] print_hex_dump at ffffffff89650765
 #14 [ffffa65531497ca8] tun_do_read at ffffffffc0b06c27 [tun]
 #15 [ffffa65531497d38] tun_recvmsg at ffffffffc0b06e34 [tun]
 #16 [ffffa65531497d68] handle_rx at ffffffffc0c5d682 [vhost_net]
 #17 [ffffa65531497ed0] vhost_worker at ffffffffc0c644dc [vhost]
 #18 [ffffa65531497f10] kthread at ffffffff892d2e72
 #19 [ffffa65531497f50] ret_from_fork at ffffffff89c0022f

Fixes: ef3db4a ("tun: avoid BUG, dump packet on GSO errors")
Signed-off-by: Lei Chen <lei.chen@smartx.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20240415020247.2207781-1-lei.chen@smartx.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[uli: backport to 4.4]
Signed-off-by: Ulrich Hecht <uli@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants