From a32997c73bce9e95b9da133d41ef8de73f241177 Mon Sep 17 00:00:00 2001 From: RageLtMan Date: Mon, 19 Apr 2021 11:17:30 -0400 Subject: [PATCH 1/2] Kernel Hardening: Linux Hardened patch stack Network equipment is critical infrastructure with long uptimes and significant throughput/processing demands, especially in undercloud fabric. The OS kernel is responsible for managing raw system resources and for enforcing security (privilege/access) boundaries. These responsibilities, along with technical factors such as long-lived memory layouts and direct page table access, make the kernel a high-value target for attackers. Rebooting the system for upgrades can be problematic, and patches providing correct fixes for ring0 concerns may take some time to reach stable releases - leaving gaps in the security posture of systems. To reduce exposure during these gaps, and to limit the impact and feasibility of 0-day attacks, this high-value target needs to be better protected with probabilistic, deterministic, and semantic defenses. While this effort is by no means a replacement for the professional-grade mitigations in Grsecurity/PaX, it does start down the path of an elevated defensive posture by introducing the Linux Hardened kernel patchset maintained by Daniel Micay (GrapheneOS) and others. The hardening patchset implements a number of C-level fixes, higher-entropy ASLR, namespace protections, filesystem access restrictions on sensitive targets like /dev/mem, and syscall restrictions. Atop the basics, it adds GCC plugins (or improves upon the upstream ones) to randomize struct layouts and to initify and initialize variables at compile time, and it provides a PRNG fed from the jitterentropy source. More information is available at https://www.whonix.org/wiki/Hardened-kernel as well as in the source repo https://github.com/anthraxx/linux-hardened. Notes: While outside the scope of this pull request, the kernel-tier mechanisms provided here should be complemented by Daniel Micay's hardened-malloc to guard against userspace memory corruption, use-after-free, and other malfeasance. This effort parallels a similar pull request for VyOS - #132. The added functionality provided there regarding LVS, XTables, and other patches can be backported here on request. Testing: None on this branch; we maintain 5.4 and 5.10 branches in-house --- patch/0000-Linux-Hardened-4.19.patch | 3181 ++++++++++++++++++++++++++ 1 file changed, 3181 insertions(+) create mode 100644 patch/0000-Linux-Hardened-4.19.patch diff --git a/patch/0000-Linux-Hardened-4.19.patch b/patch/0000-Linux-Hardened-4.19.patch new file mode 100644 index 000000000..a144f9570 --- /dev/null +++ b/patch/0000-Linux-Hardened-4.19.patch @@ -0,0 +1,3181 @@ +diff --git i/Documentation/admin-guide/kernel-parameters.txt w/Documentation/admin-guide/kernel-parameters.txt +index b2d0e714d..8ff932fc8 100644 +--- i/Documentation/admin-guide/kernel-parameters.txt ++++ w/Documentation/admin-guide/kernel-parameters.txt +@@ -500,16 +500,6 @@ + nosocket -- Disable socket memory accounting. + nokmem -- Disable kernel memory accounting. + +- checkreqprot [SELINUX] Set initial checkreqprot flag value. +- Format: { "0" | "1" } +- See security/selinux/Kconfig help text. +- 0 -- check protection applied by kernel (includes +- any implied execute protection). +- 1 -- check protection requested by application. +- Default value is set via a kernel config option. +- Value can be changed at runtime via +- /selinux/checkreqprot. +- + cio_ignore= [S390] + See Documentation/s390/CommonIO for details. 
+ clk_ignore_unused +@@ -3211,6 +3201,11 @@ + the specified number of seconds. This is to be used if + your oopses keep scrolling off the screen. + ++ extra_latent_entropy ++ Enable a very simple form of latent entropy extraction ++ from the first 4GB of memory as the bootmem allocator ++ passes the memory pages to the buddy allocator. ++ + pcbit= [HW,ISDN] + + pcd. [PARIDE] +diff --git i/Documentation/networking/ip-sysctl.txt w/Documentation/networking/ip-sysctl.txt +index 7eb936642..a74b6ff7b 100644 +--- i/Documentation/networking/ip-sysctl.txt ++++ w/Documentation/networking/ip-sysctl.txt +@@ -556,6 +556,23 @@ tcp_comp_sack_nr - INTEGER + + Detault : 44 + ++tcp_simult_connect - BOOLEAN ++ Enable TCP simultaneous connect that adds a weakness in Linux's strict ++ implementation of TCP that allows two clients to connect to each other ++ without either entering a listening state. The weakness allows an attacker ++ to easily prevent a client from connecting to a known server provided the ++ source port for the connection is guessed correctly. ++ ++ As the weakness could be used to prevent an antivirus or IPS from fetching ++ updates, or prevent an SSL gateway from fetching a CRL, it should be ++ eliminated by disabling this option. Though Linux is one of few operating ++ systems supporting simultaneous connect, it has no legitimate use in ++ practice and is rarely supported by firewalls. ++ ++ Disabling this may break TCP STUNT which is used by some applications for ++ NAT traversal. ++ Default: Value of CONFIG_TCP_SIMULT_CONNECT_DEFAULT_ON ++ + tcp_slow_start_after_idle - BOOLEAN + If set, provide RFC2861 behavior and time out the congestion + window after an idle period. An idle period is defined at +diff --git i/Documentation/sysctl/kernel.txt w/Documentation/sysctl/kernel.txt +index 37a679501..59b747920 100644 +--- i/Documentation/sysctl/kernel.txt ++++ w/Documentation/sysctl/kernel.txt +@@ -94,6 +94,7 @@ show up in /proc/sys/kernel: + - sysctl_writes_strict + - tainted + - threads-max ++- tiocsti_restrict + - unknown_nmi_panic + - watchdog + - watchdog_thresh +@@ -1041,6 +1042,26 @@ available RAM pages threads-max is reduced accordingly. + + ============================================================== + ++tiocsti_restrict: ++ ++This toggle indicates whether unprivileged users are prevented ++from using the TIOCSTI ioctl to inject commands into other processes ++which share a tty session. ++ ++When tiocsti_restrict is set to (0) there are no restrictions(accept ++the default restriction of only being able to injection commands into ++one's own tty). When tiocsti_restrict is set to (1), users must ++have CAP_SYS_ADMIN to use the TIOCSTI ioctl. ++ ++When user namespaces are in use, the check for the capability ++CAP_SYS_ADMIN is done against the user namespace that originally ++opened the tty. ++ ++The kernel config option CONFIG_SECURITY_TIOCSTI_RESTRICT sets the ++default value of tiocsti_restrict. ++ ++============================================================== ++ + unknown_nmi_panic: + + The value in this file affects behavior of handling NMI. 
When the +diff --git i/Makefile w/Makefile +index 0907f7b1e..fe4e0e5cc 100644 +--- i/Makefile ++++ w/Makefile +@@ -707,6 +707,9 @@ stackp-flags-$(CONFIG_STACKPROTECTOR_STRONG) := -fstack-protector-strong + KBUILD_CFLAGS += $(stackp-flags-y) + + ifeq ($(cc-name),clang) ++ifdef CONFIG_LOCAL_INIT ++KBUILD_CFLAGS += -fsanitize=local-init ++endif + KBUILD_CPPFLAGS += $(call cc-option,-Qunused-arguments,) + KBUILD_CFLAGS += $(call cc-disable-warning, format-invalid-specifier) + KBUILD_CFLAGS += $(call cc-disable-warning, gnu) +diff --git i/arch/Kconfig w/arch/Kconfig +index a33654848..bbe821420 100644 +--- i/arch/Kconfig ++++ w/arch/Kconfig +@@ -599,7 +599,7 @@ config ARCH_MMAP_RND_BITS + int "Number of bits to use for ASLR of mmap base address" if EXPERT + range ARCH_MMAP_RND_BITS_MIN ARCH_MMAP_RND_BITS_MAX + default ARCH_MMAP_RND_BITS_DEFAULT if ARCH_MMAP_RND_BITS_DEFAULT +- default ARCH_MMAP_RND_BITS_MIN ++ default ARCH_MMAP_RND_BITS_MAX + depends on HAVE_ARCH_MMAP_RND_BITS + help + This value can be used to select the number of bits to use to +@@ -633,7 +633,7 @@ config ARCH_MMAP_RND_COMPAT_BITS + int "Number of bits to use for ASLR of mmap base address for compatible applications" if EXPERT + range ARCH_MMAP_RND_COMPAT_BITS_MIN ARCH_MMAP_RND_COMPAT_BITS_MAX + default ARCH_MMAP_RND_COMPAT_BITS_DEFAULT if ARCH_MMAP_RND_COMPAT_BITS_DEFAULT +- default ARCH_MMAP_RND_COMPAT_BITS_MIN ++ default ARCH_MMAP_RND_COMPAT_BITS_MAX + depends on HAVE_ARCH_MMAP_RND_COMPAT_BITS + help + This value can be used to select the number of bits to use to +@@ -838,6 +838,7 @@ config ARCH_HAS_REFCOUNT + + config REFCOUNT_FULL + bool "Perform full reference count validation at the expense of speed" ++ default y + help + Enabling this switches the refcounting infrastructure from a fast + unchecked atomic_t implementation to a fully state checked +diff --git i/arch/arm64/Kconfig w/arch/arm64/Kconfig +index 1fe3e5cb2..7683d9c7d 100644 +--- i/arch/arm64/Kconfig ++++ w/arch/arm64/Kconfig +@@ -1049,6 +1049,7 @@ endif + + config ARM64_SW_TTBR0_PAN + bool "Emulate Privileged Access Never using TTBR0_EL1 switching" ++ default y + help + Enabling this option prevents the kernel from accessing + user-space memory directly by pointing TTBR0_EL1 to a reserved +@@ -1224,6 +1225,7 @@ config RANDOMIZE_BASE + bool "Randomize the address of the kernel image" + select ARM64_MODULE_PLTS if MODULES + select RELOCATABLE ++ default y + help + Randomizes the virtual address at which the kernel image is + loaded, as a security feature that deters exploit attempts +diff --git i/arch/arm64/Kconfig.debug w/arch/arm64/Kconfig.debug +index 69c9170bd..a786227db 100644 +--- i/arch/arm64/Kconfig.debug ++++ w/arch/arm64/Kconfig.debug +@@ -42,6 +42,7 @@ config ARM64_RANDOMIZE_TEXT_OFFSET + config DEBUG_WX + bool "Warn on W+X mappings at boot" + select ARM64_PTDUMP_CORE ++ default y + ---help--- + Generate a warning if any W+X mappings are found at boot. + +diff --git i/arch/arm64/configs/defconfig w/arch/arm64/configs/defconfig +index 1a4f8b67b..85273063e 100644 +--- i/arch/arm64/configs/defconfig ++++ w/arch/arm64/configs/defconfig +@@ -1,4 +1,3 @@ +-CONFIG_SYSVIPC=y + CONFIG_POSIX_MQUEUE=y + CONFIG_AUDIT=y + CONFIG_NO_HZ_IDLE=y +diff --git i/arch/arm64/include/asm/elf.h w/arch/arm64/include/asm/elf.h +index 433b9554c..1f4b06317 100644 +--- i/arch/arm64/include/asm/elf.h ++++ w/arch/arm64/include/asm/elf.h +@@ -114,10 +114,10 @@ + + /* + * This is the base location for PIE (ET_DYN with INTERP) loads. 
On +- * 64-bit, this is above 4GB to leave the entire 32-bit address ++ * 64-bit, this is raised to 4GB to leave the entire 32-bit address + * space open for things that want to use the area for 32-bit pointers. + */ +-#define ELF_ET_DYN_BASE (2 * TASK_SIZE_64 / 3) ++#define ELF_ET_DYN_BASE 0x100000000UL + + #ifndef __ASSEMBLY__ + +@@ -171,10 +171,10 @@ extern int arch_setup_additional_pages(struct linux_binprm *bprm, + /* 1GB of VA */ + #ifdef CONFIG_COMPAT + #define STACK_RND_MASK (test_thread_flag(TIF_32BIT) ? \ +- 0x7ff >> (PAGE_SHIFT - 12) : \ +- 0x3ffff >> (PAGE_SHIFT - 12)) ++ ((1UL << mmap_rnd_compat_bits) - 1) >> (PAGE_SHIFT - 12) : \ ++ ((1UL << mmap_rnd_bits) - 1) >> (PAGE_SHIFT - 12)) + #else +-#define STACK_RND_MASK (0x3ffff >> (PAGE_SHIFT - 12)) ++#define STACK_RND_MASK (((1UL << mmap_rnd_bits) - 1) >> (PAGE_SHIFT - 12)) + #endif + + #ifdef __AARCH64EB__ +diff --git i/arch/arm64/kernel/process.c w/arch/arm64/kernel/process.c +index d6a49bb07..16e4214c2 100644 +--- i/arch/arm64/kernel/process.c ++++ w/arch/arm64/kernel/process.c +@@ -517,9 +517,9 @@ unsigned long arch_align_stack(unsigned long sp) + unsigned long arch_randomize_brk(struct mm_struct *mm) + { + if (is_compat_task()) +- return randomize_page(mm->brk, SZ_32M); ++ return mm->brk + get_random_long() % SZ_32M + PAGE_SIZE; + else +- return randomize_page(mm->brk, SZ_1G); ++ return mm->brk + get_random_long() % SZ_1G + PAGE_SIZE; + } + + /* +diff --git i/arch/x86/Kconfig w/arch/x86/Kconfig +index d2453b251..59440667b 100644 +--- i/arch/x86/Kconfig ++++ w/arch/x86/Kconfig +@@ -1189,8 +1189,7 @@ config VM86 + default X86_LEGACY_VM86 + + config X86_16BIT +- bool "Enable support for 16-bit segments" if EXPERT +- default y ++ bool "Enable support for 16-bit segments" + depends on MODIFY_LDT_SYSCALL + ---help--- + This option is required by programs like Wine to run 16-bit +@@ -2319,7 +2318,7 @@ config COMPAT_VDSO + choice + prompt "vsyscall table for legacy applications" + depends on X86_64 +- default LEGACY_VSYSCALL_EMULATE ++ default LEGACY_VSYSCALL_NONE + help + Legacy user code that does not know how to find the vDSO expects + to be able to issue three syscalls by calling fixed addresses in +@@ -2400,8 +2399,7 @@ config CMDLINE_OVERRIDE + be set to 'N' under normal conditions. + + config MODIFY_LDT_SYSCALL +- bool "Enable the LDT (local descriptor table)" if EXPERT +- default y ++ bool "Enable the LDT (local descriptor table)" + ---help--- + Linux can allow user programs to install a per-process x86 + Local Descriptor Table (LDT) using the modify_ldt(2) system +diff --git i/arch/x86/Kconfig.debug w/arch/x86/Kconfig.debug +index 687cd1a21..29075c2bc 100644 +--- i/arch/x86/Kconfig.debug ++++ w/arch/x86/Kconfig.debug +@@ -101,6 +101,7 @@ config EFI_PGT_DUMP + config DEBUG_WX + bool "Warn on W+X mappings at boot" + select X86_PTDUMP_CORE ++ default y + ---help--- + Generate a warning if any W+X mappings are found at boot. 
+ +diff --git i/arch/x86/configs/x86_64_defconfig w/arch/x86/configs/x86_64_defconfig +index 146a12293..7435cb4b2 100644 +--- i/arch/x86/configs/x86_64_defconfig ++++ w/arch/x86/configs/x86_64_defconfig +@@ -1,5 +1,4 @@ + # CONFIG_LOCALVERSION_AUTO is not set +-CONFIG_SYSVIPC=y + CONFIG_POSIX_MQUEUE=y + CONFIG_BSD_PROCESS_ACCT=y + CONFIG_TASKSTATS=y +diff --git i/arch/x86/entry/vdso/vma.c w/arch/x86/entry/vdso/vma.c +index 5b8b556db..a569f08b4 100644 +--- i/arch/x86/entry/vdso/vma.c ++++ w/arch/x86/entry/vdso/vma.c +@@ -204,55 +204,9 @@ static int map_vdso(const struct vdso_image *image, unsigned long addr) + } + + #ifdef CONFIG_X86_64 +-/* +- * Put the vdso above the (randomized) stack with another randomized +- * offset. This way there is no hole in the middle of address space. +- * To save memory make sure it is still in the same PTE as the stack +- * top. This doesn't give that many random bits. +- * +- * Note that this algorithm is imperfect: the distribution of the vdso +- * start address within a PMD is biased toward the end. +- * +- * Only used for the 64-bit and x32 vdsos. +- */ +-static unsigned long vdso_addr(unsigned long start, unsigned len) +-{ +- unsigned long addr, end; +- unsigned offset; +- +- /* +- * Round up the start address. It can start out unaligned as a result +- * of stack start randomization. +- */ +- start = PAGE_ALIGN(start); +- +- /* Round the lowest possible end address up to a PMD boundary. */ +- end = (start + len + PMD_SIZE - 1) & PMD_MASK; +- if (end >= TASK_SIZE_MAX) +- end = TASK_SIZE_MAX; +- end -= len; +- +- if (end > start) { +- offset = get_random_int() % (((end - start) >> PAGE_SHIFT) + 1); +- addr = start + (offset << PAGE_SHIFT); +- } else { +- addr = start; +- } +- +- /* +- * Forcibly align the final address in case we have a hardware +- * issue that requires alignment for performance reasons. +- */ +- addr = align_vdso_addr(addr); +- +- return addr; +-} +- + static int map_vdso_randomized(const struct vdso_image *image) + { +- unsigned long addr = vdso_addr(current->mm->start_stack, image->size-image->sym_vvar_start); +- +- return map_vdso(image, addr); ++ return map_vdso(image, 0); + } + #endif + +diff --git i/arch/x86/include/asm/elf.h w/arch/x86/include/asm/elf.h +index 0a55def01..3785937d5 100644 +--- i/arch/x86/include/asm/elf.h ++++ w/arch/x86/include/asm/elf.h +@@ -251,11 +251,11 @@ extern int force_personality32; + + /* + * This is the base location for PIE (ET_DYN with INTERP) loads. On +- * 64-bit, this is above 4GB to leave the entire 32-bit address ++ * 64-bit, this is raised to 4GB to leave the entire 32-bit address + * space open for things that want to use the area for 32-bit pointers. + */ + #define ELF_ET_DYN_BASE (mmap_is_ia32() ? 0x000400000UL : \ +- (DEFAULT_MAP_WINDOW / 3 * 2)) ++ 0x100000000UL) + + /* This yields a mask that user programs can use to figure out what + instruction set this CPU supports. This could be done in user space, +@@ -315,8 +315,8 @@ extern bool mmap_address_hint_valid(unsigned long addr, unsigned long len); + + #ifdef CONFIG_X86_32 + +-#define __STACK_RND_MASK(is32bit) (0x7ff) +-#define STACK_RND_MASK (0x7ff) ++#define __STACK_RND_MASK(is32bit) ((1UL << mmap_rnd_bits) - 1) ++#define STACK_RND_MASK ((1UL << mmap_rnd_bits) - 1) + + #define ARCH_DLINFO ARCH_DLINFO_IA32 + +@@ -325,7 +325,11 @@ extern bool mmap_address_hint_valid(unsigned long addr, unsigned long len); + #else /* CONFIG_X86_32 */ + + /* 1GB for 64bit, 8MB for 32bit */ +-#define __STACK_RND_MASK(is32bit) ((is32bit) ? 
0x7ff : 0x3fffff) ++#ifdef CONFIG_COMPAT ++#define __STACK_RND_MASK(is32bit) ((is32bit) ? (1UL << mmap_rnd_compat_bits) - 1 : (1UL << mmap_rnd_bits) - 1) ++#else ++#define __STACK_RND_MASK(is32bit) ((1UL << mmap_rnd_bits) - 1) ++#endif + #define STACK_RND_MASK __STACK_RND_MASK(mmap_is_ia32()) + + #define ARCH_DLINFO \ +@@ -383,5 +387,4 @@ struct va_alignment { + } ____cacheline_aligned; + + extern struct va_alignment va_align; +-extern unsigned long align_vdso_addr(unsigned long); + #endif /* _ASM_X86_ELF_H */ +diff --git i/arch/x86/include/asm/tlbflush.h w/arch/x86/include/asm/tlbflush.h +index 79ec7add5..2950448e0 100644 +--- i/arch/x86/include/asm/tlbflush.h ++++ w/arch/x86/include/asm/tlbflush.h +@@ -310,6 +310,7 @@ static inline void cr4_set_bits(unsigned long mask) + + local_irq_save(flags); + cr4 = this_cpu_read(cpu_tlbstate.cr4); ++ BUG_ON(cr4 != __read_cr4()); + if ((cr4 | mask) != cr4) + __cr4_set(cr4 | mask); + local_irq_restore(flags); +@@ -322,6 +323,7 @@ static inline void cr4_clear_bits(unsigned long mask) + + local_irq_save(flags); + cr4 = this_cpu_read(cpu_tlbstate.cr4); ++ BUG_ON(cr4 != __read_cr4()); + if ((cr4 & ~mask) != cr4) + __cr4_set(cr4 & ~mask); + local_irq_restore(flags); +@@ -332,6 +334,7 @@ static inline void cr4_toggle_bits_irqsoff(unsigned long mask) + unsigned long cr4; + + cr4 = this_cpu_read(cpu_tlbstate.cr4); ++ BUG_ON(cr4 != __read_cr4()); + __cr4_set(cr4 ^ mask); + } + +@@ -438,6 +441,7 @@ static inline void __native_flush_tlb_global(void) + raw_local_irq_save(flags); + + cr4 = this_cpu_read(cpu_tlbstate.cr4); ++ BUG_ON(cr4 != __read_cr4()); + /* toggle PGE */ + native_write_cr4(cr4 ^ X86_CR4_PGE); + /* write old PGE again and flush TLBs */ +diff --git i/arch/x86/kernel/cpu/common.c w/arch/x86/kernel/cpu/common.c +index 2058e8c0e..820f8508a 100644 +--- i/arch/x86/kernel/cpu/common.c ++++ w/arch/x86/kernel/cpu/common.c +@@ -1824,7 +1824,6 @@ void cpu_init(void) + wrmsrl(MSR_KERNEL_GS_BASE, 0); + barrier(); + +- x86_configure_nx(); + x2apic_setup(); + + /* +diff --git i/arch/x86/kernel/process.c w/arch/x86/kernel/process.c +index cd138bfd9..2a6d5617d 100644 +--- i/arch/x86/kernel/process.c ++++ w/arch/x86/kernel/process.c +@@ -39,6 +39,8 @@ + #include + #include + #include ++#include ++#include + + #include "process.h" + +@@ -775,7 +777,10 @@ unsigned long arch_align_stack(unsigned long sp) + + unsigned long arch_randomize_brk(struct mm_struct *mm) + { +- return randomize_page(mm->brk, 0x02000000); ++ if (mmap_is_ia32()) ++ return mm->brk + get_random_long() % SZ_32M + PAGE_SIZE; ++ else ++ return mm->brk + get_random_long() % SZ_1G + PAGE_SIZE; + } + + /* +diff --git i/arch/x86/kernel/sys_x86_64.c w/arch/x86/kernel/sys_x86_64.c +index 6a78d4b36..715009f7a 100644 +--- i/arch/x86/kernel/sys_x86_64.c ++++ w/arch/x86/kernel/sys_x86_64.c +@@ -54,13 +54,6 @@ static unsigned long get_align_bits(void) + return va_align.bits & get_align_mask(); + } + +-unsigned long align_vdso_addr(unsigned long addr) +-{ +- unsigned long align_mask = get_align_mask(); +- addr = (addr + align_mask) & ~align_mask; +- return addr | get_align_bits(); +-} +- + static int __init control_va_addr_alignment(char *str) + { + /* guard against enabling this on other CPU families */ +@@ -122,10 +115,7 @@ static void find_start_end(unsigned long addr, unsigned long flags, + } + + *begin = get_mmap_base(1); +- if (in_compat_syscall()) +- *end = task_size_32bit(); +- else +- *end = task_size_64bit(addr > DEFAULT_MAP_WINDOW); ++ *end = get_mmap_base(0); + } + + unsigned long +@@ -210,7 +200,7 
@@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0, + + info.flags = VM_UNMAPPED_AREA_TOPDOWN; + info.length = len; +- info.low_limit = PAGE_SIZE; ++ info.low_limit = get_mmap_base(1); + info.high_limit = get_mmap_base(0); + + /* +diff --git i/arch/x86/mm/init_32.c w/arch/x86/mm/init_32.c +index 79b95910f..fcda13aa0 100644 +--- i/arch/x86/mm/init_32.c ++++ w/arch/x86/mm/init_32.c +@@ -560,9 +560,9 @@ static void __init pagetable_init(void) + + #define DEFAULT_PTE_MASK ~(_PAGE_NX | _PAGE_GLOBAL) + /* Bits supported by the hardware: */ +-pteval_t __supported_pte_mask __read_mostly = DEFAULT_PTE_MASK; ++pteval_t __supported_pte_mask __ro_after_init = DEFAULT_PTE_MASK; + /* Bits allowed in normal kernel mappings: */ +-pteval_t __default_kernel_pte_mask __read_mostly = DEFAULT_PTE_MASK; ++pteval_t __default_kernel_pte_mask __ro_after_init = DEFAULT_PTE_MASK; + EXPORT_SYMBOL_GPL(__supported_pte_mask); + /* Used in PAGE_KERNEL_* macros which are reasonably used out-of-tree: */ + EXPORT_SYMBOL(__default_kernel_pte_mask); +@@ -870,7 +870,7 @@ void arch_remove_memory(int nid, u64 start, u64 size, + } + #endif + +-int kernel_set_to_readonly __read_mostly; ++int kernel_set_to_readonly __ro_after_init; + + void set_kernel_text_rw(void) + { +@@ -922,12 +922,11 @@ void mark_rodata_ro(void) + unsigned long start = PFN_ALIGN(_text); + unsigned long size = PFN_ALIGN(_etext) - start; + ++ kernel_set_to_readonly = 1; + set_pages_ro(virt_to_page(start), size >> PAGE_SHIFT); + printk(KERN_INFO "Write protecting the kernel text: %luk\n", + size >> 10); + +- kernel_set_to_readonly = 1; +- + #ifdef CONFIG_CPA_DEBUG + printk(KERN_INFO "Testing CPA: Reverting %lx-%lx\n", + start, start+size); +diff --git i/arch/x86/mm/init_64.c w/arch/x86/mm/init_64.c +index 81e85a8dd..f0403d1ba 100644 +--- i/arch/x86/mm/init_64.c ++++ w/arch/x86/mm/init_64.c +@@ -66,9 +66,9 @@ + */ + + /* Bits supported by the hardware: */ +-pteval_t __supported_pte_mask __read_mostly = ~0; ++pteval_t __supported_pte_mask __ro_after_init = ~0; + /* Bits allowed in normal kernel mappings: */ +-pteval_t __default_kernel_pte_mask __read_mostly = ~0; ++pteval_t __default_kernel_pte_mask __ro_after_init = ~0; + EXPORT_SYMBOL_GPL(__supported_pte_mask); + /* Used in PAGE_KERNEL_* macros which are reasonably used out-of-tree: */ + EXPORT_SYMBOL(__default_kernel_pte_mask); +@@ -1190,7 +1190,7 @@ void __init mem_init(void) + mem_init_print_info(NULL); + } + +-int kernel_set_to_readonly; ++int kernel_set_to_readonly __ro_after_init; + + void set_kernel_text_rw(void) + { +@@ -1239,9 +1239,8 @@ void mark_rodata_ro(void) + + printk(KERN_INFO "Write protecting the kernel read-only data: %luk\n", + (end - start) >> 10); +- set_memory_ro(start, (end - start) >> PAGE_SHIFT); +- + kernel_set_to_readonly = 1; ++ set_memory_ro(start, (end - start) >> PAGE_SHIFT); + + /* + * The rodata/data/bss/brk section (but not the kernel text!) +diff --git i/block/blk-softirq.c w/block/blk-softirq.c +index 15c1f5e12..ff72cccec 100644 +--- i/block/blk-softirq.c ++++ w/block/blk-softirq.c +@@ -20,7 +20,7 @@ static DEFINE_PER_CPU(struct list_head, blk_cpu_done); + * Softirq action handler - move entries to local list and loop over them + * while passing them to the queue registered handler. 
+ */ +-static __latent_entropy void blk_done_softirq(struct softirq_action *h) ++static __latent_entropy void blk_done_softirq(void) + { + struct list_head *cpu_list, local_list; + +diff --git i/drivers/ata/libata-core.c w/drivers/ata/libata-core.c +index db1d86af2..020407bd4 100644 +--- i/drivers/ata/libata-core.c ++++ w/drivers/ata/libata-core.c +@@ -5162,7 +5162,7 @@ void ata_qc_free(struct ata_queued_cmd *qc) + struct ata_port *ap; + unsigned int tag; + +- WARN_ON_ONCE(qc == NULL); /* ata_qc_from_tag _might_ return NULL */ ++ BUG_ON(qc == NULL); /* ata_qc_from_tag _might_ return NULL */ + ap = qc->ap; + + qc->flags = 0; +@@ -5179,7 +5179,7 @@ void __ata_qc_complete(struct ata_queued_cmd *qc) + struct ata_port *ap; + struct ata_link *link; + +- WARN_ON_ONCE(qc == NULL); /* ata_qc_from_tag _might_ return NULL */ ++ BUG_ON(qc == NULL); /* ata_qc_from_tag _might_ return NULL */ + WARN_ON_ONCE(!(qc->flags & ATA_QCFLAG_ACTIVE)); + ap = qc->ap; + link = qc->dev->link; +diff --git i/drivers/char/Kconfig w/drivers/char/Kconfig +index 1df9cb8e6..eb71148a4 100644 +--- i/drivers/char/Kconfig ++++ w/drivers/char/Kconfig +@@ -9,7 +9,6 @@ source "drivers/tty/Kconfig" + + config DEVMEM + bool "/dev/mem virtual device support" +- default y + help + Say Y here if you want to support the /dev/mem device. + The /dev/mem device is used to access areas of physical +@@ -531,7 +530,6 @@ config TELCLOCK + config DEVPORT + bool "/dev/port character device" + depends on ISA || PCI +- default y + help + Say Y here if you want to support the /dev/port device. The /dev/port + device is similar to /dev/mem, but for I/O ports. +diff --git i/drivers/tty/Kconfig w/drivers/tty/Kconfig +index e0a04bfc8..ec93f827c 100644 +--- i/drivers/tty/Kconfig ++++ w/drivers/tty/Kconfig +@@ -122,7 +122,6 @@ config UNIX98_PTYS + + config LEGACY_PTYS + bool "Legacy (BSD) PTY support" +- default y + ---help--- + A pseudo terminal (PTY) is a software device consisting of two + halves: a master and a slave. 
The slave device behaves identical to +diff --git i/drivers/tty/tty_io.c w/drivers/tty/tty_io.c +index ac8025cd4..a89e48f53 100644 +--- i/drivers/tty/tty_io.c ++++ w/drivers/tty/tty_io.c +@@ -172,6 +172,7 @@ static void free_tty_struct(struct tty_struct *tty) + put_device(tty->dev); + kfree(tty->write_buf); + tty->magic = 0xDEADDEAD; ++ put_user_ns(tty->owner_user_ns); + kfree(tty); + } + +@@ -2177,11 +2178,19 @@ static int tty_fasync(int fd, struct file *filp, int on) + * FIXME: may race normal receive processing + */ + ++int tiocsti_restrict = IS_ENABLED(CONFIG_SECURITY_TIOCSTI_RESTRICT); ++ + static int tiocsti(struct tty_struct *tty, char __user *p) + { + char ch, mbz = 0; + struct tty_ldisc *ld; + ++ if (tiocsti_restrict && ++ !ns_capable(tty->owner_user_ns, CAP_SYS_ADMIN)) { ++ dev_warn_ratelimited(tty->dev, ++ "Denied TIOCSTI ioctl for non-privileged process\n"); ++ return -EPERM; ++ } + if ((current->signal->tty != tty) && !capable(CAP_SYS_ADMIN)) + return -EPERM; + if (get_user(ch, p)) +@@ -2865,6 +2874,7 @@ struct tty_struct *alloc_tty_struct(struct tty_driver *driver, int idx) + tty->index = idx; + tty_line_name(driver, idx, tty->name); + tty->dev = tty_get_device(tty); ++ tty->owner_user_ns = get_user_ns(current_user_ns()); + + return tty; + } +diff --git i/drivers/usb/core/Makefile w/drivers/usb/core/Makefile +index 18e874b04..a010a4a58 100644 +--- i/drivers/usb/core/Makefile ++++ w/drivers/usb/core/Makefile +@@ -11,6 +11,7 @@ usbcore-y += phy.o port.o + usbcore-$(CONFIG_OF) += of.o + usbcore-$(CONFIG_USB_PCI) += hcd-pci.o + usbcore-$(CONFIG_ACPI) += usb-acpi.o ++usbcore-$(CONFIG_SYSCTL) += sysctl.o + + obj-$(CONFIG_USB) += usbcore.o + +diff --git i/drivers/usb/core/hub.c w/drivers/usb/core/hub.c +index fa28f23a4..2900ffcf4 100644 +--- i/drivers/usb/core/hub.c ++++ w/drivers/usb/core/hub.c +@@ -4980,6 +4980,12 @@ static void hub_port_connect(struct usb_hub *hub, int port1, u16 portstatus, + goto done; + return; + } ++ ++ if (deny_new_usb) { ++ dev_err(&port_dev->dev, "denied insert of USB device on port %d\n", port1); ++ goto done; ++ } ++ + if (hub_is_superspeed(hub->hdev)) + unit_load = 150; + else +diff --git i/drivers/usb/core/usb.c w/drivers/usb/core/usb.c +index 4ebfbd737..5bb29503a 100644 +--- i/drivers/usb/core/usb.c ++++ w/drivers/usb/core/usb.c +@@ -74,6 +74,9 @@ MODULE_PARM_DESC(autosuspend, "default autosuspend delay"); + #define usb_autosuspend_delay 0 + #endif + ++int deny_new_usb __read_mostly = 0; ++EXPORT_SYMBOL(deny_new_usb); ++ + static bool match_endpoint(struct usb_endpoint_descriptor *epd, + struct usb_endpoint_descriptor **bulk_in, + struct usb_endpoint_descriptor **bulk_out, +@@ -1196,6 +1199,9 @@ static int __init usb_init(void) + usb_debugfs_init(); + + usb_acpi_register(); ++ retval = usb_init_sysctl(); ++ if (retval) ++ goto sysctl_init_failed; + retval = bus_register(&usb_bus_type); + if (retval) + goto bus_register_failed; +@@ -1230,6 +1236,8 @@ static int __init usb_init(void) + bus_notifier_failed: + bus_unregister(&usb_bus_type); + bus_register_failed: ++ usb_exit_sysctl(); ++sysctl_init_failed: + usb_acpi_unregister(); + usb_debugfs_cleanup(); + out: +@@ -1253,6 +1261,7 @@ static void __exit usb_exit(void) + usb_hub_cleanup(); + bus_unregister_notifier(&usb_bus_type, &usb_bus_nb); + bus_unregister(&usb_bus_type); ++ usb_exit_sysctl(); + usb_acpi_unregister(); + usb_debugfs_cleanup(); + idr_destroy(&usb_bus_idr); +diff --git i/fs/exec.c w/fs/exec.c +index 1093ea805..3c3a8808b 100644 +--- i/fs/exec.c ++++ w/fs/exec.c +@@ -32,6 +32,7 @@ + 
#include + #include + #include ++#include + #include + #include + #include +@@ -62,6 +63,7 @@ + #include + #include + #include ++#include + + #include + #include +@@ -321,6 +323,8 @@ static int __bprm_mm_init(struct linux_binprm *bprm) + arch_bprm_mm_init(mm, vma); + up_write(&mm->mmap_sem); + bprm->p = vma->vm_end - sizeof(void *); ++ if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space) ++ bprm->p ^= get_random_int() & ~PAGE_MASK; + return 0; + err: + up_write(&mm->mmap_sem); +diff --git i/fs/namei.c w/fs/namei.c +index 5a68db76d..87288b137 100644 +--- i/fs/namei.c ++++ w/fs/namei.c +@@ -887,8 +887,8 @@ static inline void put_link(struct nameidata *nd) + + int sysctl_protected_symlinks __read_mostly = 1; + int sysctl_protected_hardlinks __read_mostly = 1; +-int sysctl_protected_fifos __read_mostly; +-int sysctl_protected_regular __read_mostly; ++int sysctl_protected_fifos __read_mostly = 2; ++int sysctl_protected_regular __read_mostly = 2; + + /** + * may_follow_link - Check symlink following for unsafe situations +diff --git i/fs/nfs/Kconfig w/fs/nfs/Kconfig +index ac3e06367..06a2e4cf4 100644 +--- i/fs/nfs/Kconfig ++++ w/fs/nfs/Kconfig +@@ -195,4 +195,3 @@ config NFS_DEBUG + bool + depends on NFS_FS && SUNRPC_DEBUG + select CRC32 +- default y +diff --git i/fs/proc/Kconfig w/fs/proc/Kconfig +index 817c02b13..b8cd62b5c 100644 +--- i/fs/proc/Kconfig ++++ w/fs/proc/Kconfig +@@ -40,7 +40,6 @@ config PROC_KCORE + config PROC_VMCORE + bool "/proc/vmcore support" + depends on PROC_FS && CRASH_DUMP +- default y + help + Exports the dump image of crashed kernel in ELF format. + +diff --git i/fs/stat.c w/fs/stat.c +index f8e6fb2c3..240c1432e 100644 +--- i/fs/stat.c ++++ w/fs/stat.c +@@ -40,8 +40,13 @@ void generic_fillattr(struct inode *inode, struct kstat *stat) + stat->gid = inode->i_gid; + stat->rdev = inode->i_rdev; + stat->size = i_size_read(inode); +- stat->atime = inode->i_atime; +- stat->mtime = inode->i_mtime; ++ if (is_sidechannel_device(inode) && !capable_noaudit(CAP_MKNOD)) { ++ stat->atime = inode->i_ctime; ++ stat->mtime = inode->i_ctime; ++ } else { ++ stat->atime = inode->i_atime; ++ stat->mtime = inode->i_mtime; ++ } + stat->ctime = inode->i_ctime; + stat->blksize = i_blocksize(inode); + stat->blocks = inode->i_blocks; +@@ -75,9 +80,14 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat, + stat->result_mask |= STATX_BASIC_STATS; + request_mask &= STATX_ALL; + query_flags &= KSTAT_QUERY_FLAGS; +- if (inode->i_op->getattr) +- return inode->i_op->getattr(path, stat, request_mask, +- query_flags); ++ if (inode->i_op->getattr) { ++ int retval = inode->i_op->getattr(path, stat, request_mask, query_flags); ++ if (!retval && is_sidechannel_device(inode) && !capable_noaudit(CAP_MKNOD)) { ++ stat->atime = stat->ctime; ++ stat->mtime = stat->ctime; ++ } ++ return retval; ++ } + + generic_fillattr(inode, stat); + return 0; +diff --git i/include/linux/cache.h w/include/linux/cache.h +index 750621e41..e7157c18c 100644 +--- i/include/linux/cache.h ++++ w/include/linux/cache.h +@@ -31,6 +31,8 @@ + #define __ro_after_init __attribute__((__section__(".data..ro_after_init"))) + #endif + ++#define __read_only __ro_after_init ++ + #ifndef ____cacheline_aligned + #define ____cacheline_aligned __attribute__((__aligned__(SMP_CACHE_BYTES))) + #endif +diff --git i/include/linux/capability.h w/include/linux/capability.h +index f640dcbc8..2b4f5d651 100644 +--- i/include/linux/capability.h ++++ w/include/linux/capability.h +@@ -207,6 +207,7 @@ extern bool 
has_capability_noaudit(struct task_struct *t, int cap); + extern bool has_ns_capability_noaudit(struct task_struct *t, + struct user_namespace *ns, int cap); + extern bool capable(int cap); ++extern bool capable_noaudit(int cap); + extern bool ns_capable(struct user_namespace *ns, int cap); + extern bool ns_capable_noaudit(struct user_namespace *ns, int cap); + #else +@@ -232,6 +233,10 @@ static inline bool capable(int cap) + { + return true; + } ++static inline bool capable_noaudit(int cap) ++{ ++ return true; ++} + static inline bool ns_capable(struct user_namespace *ns, int cap) + { + return true; +diff --git i/include/linux/dccp.h w/include/linux/dccp.h +index 6b64b6cc2..fe1770732 100644 +--- i/include/linux/dccp.h ++++ w/include/linux/dccp.h +@@ -259,6 +259,7 @@ struct dccp_ackvec; + * @dccps_sync_scheduled - flag which signals "send out-of-band message soon" + * @dccps_xmitlet - tasklet scheduled by the TX CCID to dequeue data packets + * @dccps_xmit_timer - used by the TX CCID to delay sending (rate-based pacing) ++ * @dccps_ccid_timer - used by the CCIDs + * @dccps_syn_rtt - RTT sample from Request/Response exchange (in usecs) + */ + struct dccp_sock { +@@ -303,6 +304,7 @@ struct dccp_sock { + __u8 dccps_sync_scheduled:1; + struct tasklet_struct dccps_xmitlet; + struct timer_list dccps_xmit_timer; ++ struct timer_list dccps_ccid_timer; + }; + + static inline struct dccp_sock *dccp_sk(const struct sock *sk) +diff --git i/include/linux/fs.h w/include/linux/fs.h +index 40378e5bb..6eecd25c6 100644 +--- i/include/linux/fs.h ++++ w/include/linux/fs.h +@@ -3483,4 +3483,15 @@ extern void inode_nohighmem(struct inode *inode); + extern int vfs_fadvise(struct file *file, loff_t offset, loff_t len, + int advice); + ++extern int device_sidechannel_restrict; ++ ++static inline bool is_sidechannel_device(const struct inode *inode) ++{ ++ umode_t mode; ++ if (!device_sidechannel_restrict) ++ return false; ++ mode = inode->i_mode; ++ return ((S_ISCHR(mode) || S_ISBLK(mode)) && (mode & (S_IROTH | S_IWOTH))); ++} ++ + #endif /* _LINUX_FS_H */ +diff --git i/include/linux/fsnotify.h w/include/linux/fsnotify.h +index fd1ce1055..1905d2476 100644 +--- i/include/linux/fsnotify.h ++++ w/include/linux/fsnotify.h +@@ -177,6 +177,9 @@ static inline void fsnotify_access(struct file *file) + struct inode *inode = file_inode(file); + __u32 mask = FS_ACCESS; + ++ if (is_sidechannel_device(inode)) ++ return; ++ + if (S_ISDIR(inode->i_mode)) + mask |= FS_ISDIR; + +@@ -195,6 +198,9 @@ static inline void fsnotify_modify(struct file *file) + struct inode *inode = file_inode(file); + __u32 mask = FS_MODIFY; + ++ if (is_sidechannel_device(inode)) ++ return; ++ + if (S_ISDIR(inode->i_mode)) + mask |= FS_ISDIR; + +diff --git i/include/linux/gfp.h w/include/linux/gfp.h +index f78d1e895..ff139ff8d 100644 +--- i/include/linux/gfp.h ++++ w/include/linux/gfp.h +@@ -553,9 +553,9 @@ extern struct page *alloc_pages_vma(gfp_t gfp_mask, int order, + extern unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order); + extern unsigned long get_zeroed_page(gfp_t gfp_mask); + +-void *alloc_pages_exact(size_t size, gfp_t gfp_mask); ++void *alloc_pages_exact(size_t size, gfp_t gfp_mask) __attribute__((alloc_size(1))); + void free_pages_exact(void *virt, size_t size); +-void * __meminit alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask); ++void * __meminit alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask) __attribute__((alloc_size(2))); + + #define __get_free_page(gfp_mask) \ + __get_free_pages((gfp_mask), 0) 
+diff --git i/include/linux/highmem.h w/include/linux/highmem.h +index 069067983..b9394bc86 100644 +--- i/include/linux/highmem.h ++++ w/include/linux/highmem.h +@@ -191,6 +191,13 @@ static inline void clear_highpage(struct page *page) + kunmap_atomic(kaddr); + } + ++static inline void verify_zero_highpage(struct page *page) ++{ ++ void *kaddr = kmap_atomic(page); ++ BUG_ON(memchr_inv(kaddr, 0, PAGE_SIZE)); ++ kunmap_atomic(kaddr); ++} ++ + static inline void zero_user_segments(struct page *page, + unsigned start1, unsigned end1, + unsigned start2, unsigned end2) +diff --git i/include/linux/interrupt.h w/include/linux/interrupt.h +index eeceac337..78ad558bc 100644 +--- i/include/linux/interrupt.h ++++ w/include/linux/interrupt.h +@@ -490,7 +490,7 @@ extern const char * const softirq_to_name[NR_SOFTIRQS]; + + struct softirq_action + { +- void (*action)(struct softirq_action *); ++ void (*action)(void); + }; + + asmlinkage void do_softirq(void); +@@ -505,7 +505,7 @@ static inline void do_softirq_own_stack(void) + } + #endif + +-extern void open_softirq(int nr, void (*action)(struct softirq_action *)); ++extern void __init open_softirq(int nr, void (*action)(void)); + extern void softirq_init(void); + extern void __raise_softirq_irqoff(unsigned int nr); + +diff --git i/include/linux/kobject_ns.h w/include/linux/kobject_ns.h +index 069aa2ebe..cb9e3637a 100644 +--- i/include/linux/kobject_ns.h ++++ w/include/linux/kobject_ns.h +@@ -45,7 +45,7 @@ struct kobj_ns_type_operations { + void (*drop_ns)(void *); + }; + +-int kobj_ns_type_register(const struct kobj_ns_type_operations *ops); ++int __init kobj_ns_type_register(const struct kobj_ns_type_operations *ops); + int kobj_ns_type_registered(enum kobj_ns_type type); + const struct kobj_ns_type_operations *kobj_child_ns_ops(struct kobject *parent); + const struct kobj_ns_type_operations *kobj_ns_ops(struct kobject *kobj); +diff --git i/include/linux/mm.h w/include/linux/mm.h +index 43ba8bd98..40456c6a3 100644 +--- i/include/linux/mm.h ++++ w/include/linux/mm.h +@@ -571,7 +571,7 @@ static inline int is_vmalloc_or_module_addr(const void *x) + } + #endif + +-extern void *kvmalloc_node(size_t size, gfp_t flags, int node); ++extern void *kvmalloc_node(size_t size, gfp_t flags, int node) __attribute__((alloc_size(1))); + static inline void *kvmalloc(size_t size, gfp_t flags) + { + return kvmalloc_node(size, flags, NUMA_NO_NODE); +diff --git i/include/linux/percpu.h w/include/linux/percpu.h +index 70b7123f3..09f301948 100644 +--- i/include/linux/percpu.h ++++ w/include/linux/percpu.h +@@ -129,7 +129,7 @@ extern int __init pcpu_page_first_chunk(size_t reserved_size, + pcpu_fc_populate_pte_fn_t populate_pte_fn); + #endif + +-extern void __percpu *__alloc_reserved_percpu(size_t size, size_t align); ++extern void __percpu *__alloc_reserved_percpu(size_t size, size_t align) __attribute__((alloc_size(1))); + extern bool __is_kernel_percpu_address(unsigned long addr, unsigned long *can_addr); + extern bool is_kernel_percpu_address(unsigned long addr); + +@@ -137,8 +137,8 @@ extern bool is_kernel_percpu_address(unsigned long addr); + extern void __init setup_per_cpu_areas(void); + #endif + +-extern void __percpu *__alloc_percpu_gfp(size_t size, size_t align, gfp_t gfp); +-extern void __percpu *__alloc_percpu(size_t size, size_t align); ++extern void __percpu *__alloc_percpu_gfp(size_t size, size_t align, gfp_t gfp) __attribute__((alloc_size(1))); ++extern void __percpu *__alloc_percpu(size_t size, size_t align) __attribute__((alloc_size(1))); + extern void 
free_percpu(void __percpu *__pdata); + extern phys_addr_t per_cpu_ptr_to_phys(void *addr); + +diff --git i/include/linux/slab.h w/include/linux/slab.h +index d6393413e..f11e06e87 100644 +--- i/include/linux/slab.h ++++ w/include/linux/slab.h +@@ -180,8 +180,8 @@ void memcg_destroy_kmem_caches(struct mem_cgroup *); + /* + * Common kmalloc functions provided by all allocators + */ +-void * __must_check __krealloc(const void *, size_t, gfp_t); +-void * __must_check krealloc(const void *, size_t, gfp_t); ++void * __must_check __krealloc(const void *, size_t, gfp_t) __attribute__((alloc_size(2))); ++void * __must_check krealloc(const void *, size_t, gfp_t) __attribute((alloc_size(2))); + void kfree(const void *); + void kzfree(const void *); + size_t ksize(const void *); +@@ -354,7 +354,7 @@ static __always_inline unsigned int kmalloc_index(size_t size) + } + #endif /* !CONFIG_SLOB */ + +-void *__kmalloc(size_t size, gfp_t flags) __assume_kmalloc_alignment __malloc; ++void *__kmalloc(size_t size, gfp_t flags) __assume_kmalloc_alignment __malloc __attribute__((alloc_size(1))); + void *kmem_cache_alloc(struct kmem_cache *, gfp_t flags) __assume_slab_alignment __malloc; + void kmem_cache_free(struct kmem_cache *, void *); + +@@ -378,7 +378,7 @@ static __always_inline void kfree_bulk(size_t size, void **p) + } + + #ifdef CONFIG_NUMA +-void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment __malloc; ++void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment __malloc __attribute__((alloc_size(1))); + void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node) __assume_slab_alignment __malloc; + #else + static __always_inline void *__kmalloc_node(size_t size, gfp_t flags, int node) +@@ -500,7 +500,7 @@ static __always_inline void *kmalloc_large(size_t size, gfp_t flags) + * for general use, and so are not documented here. For a full list of + * potential flags, always refer to linux/gfp.h. + */ +-static __always_inline void *kmalloc(size_t size, gfp_t flags) ++static __always_inline __attribute__((alloc_size(1))) void *kmalloc(size_t size, gfp_t flags) + { + if (__builtin_constant_p(size)) { + if (size > KMALLOC_MAX_CACHE_SIZE) +@@ -540,7 +540,7 @@ static __always_inline unsigned int kmalloc_size(unsigned int n) + return 0; + } + +-static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node) ++static __always_inline __attribute__((alloc_size(1))) void *kmalloc_node(size_t size, gfp_t flags, int node) + { + #ifndef CONFIG_SLOB + if (__builtin_constant_p(size) && +diff --git i/include/linux/slub_def.h w/include/linux/slub_def.h +index 3a1a1dbc6..ff38fec9e 100644 +--- i/include/linux/slub_def.h ++++ w/include/linux/slub_def.h +@@ -121,6 +121,11 @@ struct kmem_cache { + unsigned long random; + #endif + ++#ifdef CONFIG_SLAB_CANARY ++ unsigned long random_active; ++ unsigned long random_inactive; ++#endif ++ + #ifdef CONFIG_NUMA + /* + * Defragmentation by allocating from a remote node. 
+diff --git i/include/linux/string.h w/include/linux/string.h +index 4db285b83..a479f93d5 100644 +--- i/include/linux/string.h ++++ w/include/linux/string.h +@@ -238,6 +238,12 @@ void __read_overflow2(void) __compiletime_error("detected read beyond size of ob + void __read_overflow3(void) __compiletime_error("detected read beyond size of object passed as 3rd parameter"); + void __write_overflow(void) __compiletime_error("detected write beyond size of object passed as 1st parameter"); + ++#ifdef CONFIG_FORTIFY_SOURCE_STRICT_STRING ++#define __string_size(p) __builtin_object_size(p, 1) ++#else ++#define __string_size(p) __builtin_object_size(p, 0) ++#endif ++ + #if !defined(__NO_FORTIFY) && defined(__OPTIMIZE__) && defined(CONFIG_FORTIFY_SOURCE) + + #ifdef CONFIG_KASAN +@@ -266,7 +272,7 @@ extern char *__underlying_strncpy(char *p, const char *q, __kernel_size_t size) + + __FORTIFY_INLINE char *strncpy(char *p, const char *q, __kernel_size_t size) + { +- size_t p_size = __builtin_object_size(p, 0); ++ size_t p_size = __string_size(p); + if (__builtin_constant_p(size) && p_size < size) + __write_overflow(); + if (p_size < size) +@@ -276,7 +282,7 @@ __FORTIFY_INLINE char *strncpy(char *p, const char *q, __kernel_size_t size) + + __FORTIFY_INLINE char *strcat(char *p, const char *q) + { +- size_t p_size = __builtin_object_size(p, 0); ++ size_t p_size = __string_size(p); + if (p_size == (size_t)-1) + return __underlying_strcat(p, q); + if (strlcat(p, q, p_size) >= p_size) +@@ -287,7 +293,7 @@ __FORTIFY_INLINE char *strcat(char *p, const char *q) + __FORTIFY_INLINE __kernel_size_t strlen(const char *p) + { + __kernel_size_t ret; +- size_t p_size = __builtin_object_size(p, 0); ++ size_t p_size = __string_size(p); + + /* Work around gcc excess stack consumption issue */ + if (p_size == (size_t)-1 || +@@ -302,7 +308,7 @@ __FORTIFY_INLINE __kernel_size_t strlen(const char *p) + extern __kernel_size_t __real_strnlen(const char *, __kernel_size_t) __RENAME(strnlen); + __FORTIFY_INLINE __kernel_size_t strnlen(const char *p, __kernel_size_t maxlen) + { +- size_t p_size = __builtin_object_size(p, 0); ++ size_t p_size = __string_size(p); + __kernel_size_t ret = __real_strnlen(p, maxlen < p_size ? 
maxlen : p_size); + if (p_size <= ret && maxlen != ret) + fortify_panic(__func__); +@@ -314,8 +320,8 @@ extern size_t __real_strlcpy(char *, const char *, size_t) __RENAME(strlcpy); + __FORTIFY_INLINE size_t strlcpy(char *p, const char *q, size_t size) + { + size_t ret; +- size_t p_size = __builtin_object_size(p, 0); +- size_t q_size = __builtin_object_size(q, 0); ++ size_t p_size = __string_size(p); ++ size_t q_size = __string_size(q); + if (p_size == (size_t)-1 && q_size == (size_t)-1) + return __real_strlcpy(p, q, size); + ret = strlen(q); +@@ -335,8 +341,8 @@ __FORTIFY_INLINE size_t strlcpy(char *p, const char *q, size_t size) + __FORTIFY_INLINE char *strncat(char *p, const char *q, __kernel_size_t count) + { + size_t p_len, copy_len; +- size_t p_size = __builtin_object_size(p, 0); +- size_t q_size = __builtin_object_size(q, 0); ++ size_t p_size = __string_size(p); ++ size_t q_size = __string_size(q); + if (p_size == (size_t)-1 && q_size == (size_t)-1) + return __underlying_strncat(p, q, count); + p_len = strlen(p); +@@ -449,8 +455,8 @@ __FORTIFY_INLINE void *kmemdup(const void *p, size_t size, gfp_t gfp) + /* defined after fortified strlen and memcpy to reuse them */ + __FORTIFY_INLINE char *strcpy(char *p, const char *q) + { +- size_t p_size = __builtin_object_size(p, 0); +- size_t q_size = __builtin_object_size(q, 0); ++ size_t p_size = __string_size(p); ++ size_t q_size = __string_size(q); + if (p_size == (size_t)-1 && q_size == (size_t)-1) + return __underlying_strcpy(p, q); + memcpy(p, q, strlen(q) + 1); +diff --git i/include/linux/sysctl.h w/include/linux/sysctl.h +index b769ecfcc..f4d860437 100644 +--- i/include/linux/sysctl.h ++++ w/include/linux/sysctl.h +@@ -51,6 +51,8 @@ extern int proc_dointvec_minmax(struct ctl_table *, int, + extern int proc_douintvec_minmax(struct ctl_table *table, int write, + void __user *buffer, size_t *lenp, + loff_t *ppos); ++extern int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write, ++ void *buffer, size_t *lenp, loff_t *ppos); + extern int proc_dointvec_jiffies(struct ctl_table *, int, + void __user *, size_t *, loff_t *); + extern int proc_dointvec_userhz_jiffies(struct ctl_table *, int, +diff --git i/include/linux/tty.h w/include/linux/tty.h +index 74226a8f9..a4280e6a3 100644 +--- i/include/linux/tty.h ++++ w/include/linux/tty.h +@@ -14,6 +14,7 @@ + #include + #include + #include ++#include + + + /* +@@ -338,6 +339,7 @@ struct tty_struct { + /* If the tty has a pending do_SAK, queue it here - akpm */ + struct work_struct SAK_work; + struct tty_port *port; ++ struct user_namespace *owner_user_ns; + } __randomize_layout; + + /* Each of a tty's open files has private_data pointing to tty_file_private */ +@@ -347,6 +349,8 @@ struct tty_file_private { + struct list_head list; + }; + ++extern int tiocsti_restrict; ++ + /* tty magic number */ + #define TTY_MAGIC 0x5401 + +diff --git i/include/linux/usb.h w/include/linux/usb.h +index ff010d1fd..de5f042cc 100644 +--- i/include/linux/usb.h ++++ w/include/linux/usb.h +@@ -8,6 +8,16 @@ + #define USB_MAJOR 180 + #define USB_DEVICE_MAJOR 189 + ++/* sysctl.c */ ++extern int deny_new_usb; ++#ifdef CONFIG_SYSCTL ++extern int usb_init_sysctl(void); ++extern void usb_exit_sysctl(void); ++#else ++static inline int usb_init_sysctl(void) { return 0; } ++static inline void usb_exit_sysctl(void) { } ++#endif /* CONFIG_SYSCTL */ ++ + + #ifdef __KERNEL__ + +diff --git i/include/linux/vmalloc.h w/include/linux/vmalloc.h +index 206957b1b..17ec08604 100644 +--- i/include/linux/vmalloc.h ++++ 
w/include/linux/vmalloc.h +@@ -69,19 +69,19 @@ static inline void vmalloc_init(void) + } + #endif + +-extern void *vmalloc(unsigned long size); +-extern void *vzalloc(unsigned long size); +-extern void *vmalloc_user(unsigned long size); +-extern void *vmalloc_node(unsigned long size, int node); +-extern void *vzalloc_node(unsigned long size, int node); +-extern void *vmalloc_exec(unsigned long size); +-extern void *vmalloc_32(unsigned long size); +-extern void *vmalloc_32_user(unsigned long size); +-extern void *__vmalloc(unsigned long size, gfp_t gfp_mask, pgprot_t prot); ++extern void *vmalloc(unsigned long size) __attribute__((alloc_size(1))); ++extern void *vzalloc(unsigned long size) __attribute__((alloc_size(1))); ++extern void *vmalloc_user(unsigned long size) __attribute__((alloc_size(1))); ++extern void *vmalloc_node(unsigned long size, int node) __attribute__((alloc_size(1))); ++extern void *vzalloc_node(unsigned long size, int node) __attribute__((alloc_size(1))); ++extern void *vmalloc_exec(unsigned long size) __attribute__((alloc_size(1))); ++extern void *vmalloc_32(unsigned long size) __attribute__((alloc_size(1))); ++extern void *vmalloc_32_user(unsigned long size) __attribute__((alloc_size(1))); ++extern void *__vmalloc(unsigned long size, gfp_t gfp_mask, pgprot_t prot) __attribute__((alloc_size(1))); + extern void *__vmalloc_node_range(unsigned long size, unsigned long align, + unsigned long start, unsigned long end, gfp_t gfp_mask, + pgprot_t prot, unsigned long vm_flags, int node, +- const void *caller); ++ const void *caller) __attribute__((alloc_size(1))); + #ifndef CONFIG_MMU + extern void *__vmalloc_node_flags(unsigned long size, int node, gfp_t flags); + static inline void *__vmalloc_node_flags_caller(unsigned long size, int node, +diff --git i/include/net/tcp.h w/include/net/tcp.h +index 0d4501f44..9bb7bfb7e 100644 +--- i/include/net/tcp.h ++++ w/include/net/tcp.h +@@ -245,6 +245,7 @@ void tcp_time_wait(struct sock *sk, int state, int timeo); + /* sysctl variables for tcp */ + extern int sysctl_tcp_max_orphans; + extern long sysctl_tcp_mem[3]; ++extern int sysctl_tcp_simult_connect; + + #define TCP_RACK_LOSS_DETECTION 0x1 /* Use RACK to detect losses */ + #define TCP_RACK_STATIC_REO_WND 0x2 /* Use static RACK reo wnd */ +diff --git i/init/Kconfig w/init/Kconfig +index 47035b5a4..ffa102b5f 100644 +--- i/init/Kconfig ++++ w/init/Kconfig +@@ -326,6 +326,7 @@ config USELIB + config AUDIT + bool "Auditing support" + depends on NET ++ default y + help + Enable auditing infrastructure that can be used with another + kernel subsystem, such as SELinux (which requires this for +@@ -957,6 +958,22 @@ config USER_NS + + If unsure, say N. + ++config USER_NS_UNPRIVILEGED ++ bool "Allow unprivileged users to create namespaces" ++ depends on USER_NS ++ default n ++ help ++ When disabled, unprivileged users will not be able to create ++ new namespaces. Allowing users to create their own namespaces ++ has been part of several recent local privilege escalation ++ exploits, so if you need user namespaces but are ++ paranoid^Wsecurity-conscious you want to disable this. ++ ++ This setting can be overridden at runtime via the ++ kernel.unprivileged_userns_clone sysctl. ++ ++ If unsure, say N. ++ + config PID_NS + bool "PID Namespaces" + default y +@@ -1091,6 +1108,12 @@ config CC_OPTIMIZE_FOR_SIZE + + endchoice + ++config LOCAL_INIT ++ bool "Zero uninitialized locals" ++ help ++ Zero-fill uninitialized local variables, other than variable-length ++ arrays. Requires compiler support. 
++ + config HAVE_LD_DEAD_CODE_DATA_ELIMINATION + bool + help +@@ -1167,9 +1190,8 @@ menuconfig EXPERT + Only use this if you really know what you are doing. + + config UID16 +- bool "Enable 16-bit UID system calls" if EXPERT ++ bool "Enable 16-bit UID system calls" + depends on HAVE_UID16 && MULTIUSER +- default y + help + This enables the legacy 16-bit UID syscall wrappers. + +@@ -1198,14 +1220,13 @@ config SGETMASK_SYSCALL + If unsure, leave the default option here. + + config SYSFS_SYSCALL +- bool "Sysfs syscall support" if EXPERT +- default y ++ bool "Sysfs syscall support" + ---help--- + sys_sysfs is an obsolete system call no longer supported in libc. + Note that disabling this option is more secure but might break + compatibility with some systems. + +- If unsure say Y here. ++ If unsure say N here. + + config SYSCTL_SYSCALL + bool "Sysctl syscall support" if EXPERT +@@ -1377,8 +1398,7 @@ config SHMEM + which may be appropriate on small systems without swap. + + config AIO +- bool "Enable AIO support" if EXPERT +- default y ++ bool "Enable AIO support" + help + This option enables POSIX asynchronous I/O which may by used + by some high performance threaded applications. Disabling +@@ -1595,7 +1615,7 @@ config VM_EVENT_COUNTERS + + config SLUB_DEBUG + default y +- bool "Enable SLUB debugging support" if EXPERT ++ bool "Enable SLUB debugging support" + depends on SLUB && SYSFS + help + SLUB has extensive debug support features. Disabling these can +@@ -1619,7 +1639,6 @@ config SLUB_MEMCG_SYSFS_ON + + config COMPAT_BRK + bool "Disable heap randomization" +- default y + help + Randomizing heap placement makes heap exploits harder, but it + also breaks ancient binaries (including anything libc5 based). +@@ -1666,7 +1685,6 @@ endchoice + + config SLAB_MERGE_DEFAULT + bool "Allow slab caches to be merged" +- default y + help + For reduced kernel memory fragmentation, slab caches can be + merged when they share the same size and other characteristics. +@@ -1679,9 +1697,9 @@ config SLAB_MERGE_DEFAULT + command line. + + config SLAB_FREELIST_RANDOM +- default n + depends on SLAB || SLUB + bool "SLAB freelist randomization" ++ default y + help + Randomizes the freelist order used on creating new pages. This + security feature reduces the predictability of the kernel slab +@@ -1690,12 +1708,56 @@ config SLAB_FREELIST_RANDOM + config SLAB_FREELIST_HARDENED + bool "Harden slab freelist metadata" + depends on SLUB ++ default y + help + Many kernel heap attacks try to target slab cache metadata and + other infrastructure. This options makes minor performance + sacrifies to harden the kernel slab allocator against common + freelist exploit methods. + ++config SLAB_HARDENED ++ default y ++ depends on SLUB ++ bool "Hardened SLAB infrastructure" ++ help ++ Make minor performance sacrifices to harden the kernel slab ++ allocator. ++ ++config SLAB_CANARY ++ depends on SLUB ++ depends on !SLAB_MERGE_DEFAULT ++ bool "SLAB canaries" ++ default y ++ help ++ Place canaries at the end of kernel slab allocations, sacrificing ++ some performance and memory usage for security. ++ ++ Canaries can detect some forms of heap corruption when allocations ++ are freed and as part of the HARDENED_USERCOPY feature. It provides ++ basic use-after-free detection for HARDENED_USERCOPY. ++ ++ Canaries absorb small overflows (rendering them harmless), mitigate ++ non-NUL terminated C string overflows on 64-bit via a guaranteed zero ++ byte and provide basic double-free detection. 
++ ++config SLAB_SANITIZE ++ bool "Sanitize SLAB allocations" ++ depends on SLUB ++ default y ++ help ++ Zero fill slab allocations on free, reducing the lifetime of ++ sensitive data and helping to mitigate use-after-free bugs. ++ ++ For slabs with debug poisoning enabling, this has no impact. ++ ++config SLAB_SANITIZE_VERIFY ++ depends on SLAB_SANITIZE && PAGE_SANITIZE ++ default y ++ bool "Verify sanitized SLAB allocations" ++ help ++ Verify that newly allocated slab allocations are zeroed to detect ++ write-after-free bugs. ++ + config SLUB_CPU_PARTIAL + default y + depends on SLUB && SMP +diff --git i/kernel/audit.c w/kernel/audit.c +index 45741c3c4..a2de0700e 100644 +--- i/kernel/audit.c ++++ w/kernel/audit.c +@@ -1641,6 +1641,9 @@ static int __init audit_enable(char *str) + + if (audit_default == AUDIT_OFF) + audit_initialized = AUDIT_DISABLED; ++ else if (!audit_ever_enabled) ++ audit_initialized = AUDIT_UNINITIALIZED; ++ + if (audit_set_enabled(audit_default)) + pr_err("audit: error setting audit state (%d)\n", + audit_default); +diff --git i/kernel/bpf/core.c w/kernel/bpf/core.c +index 36be400c3..50fa38718 100644 +--- i/kernel/bpf/core.c ++++ w/kernel/bpf/core.c +@@ -368,7 +368,7 @@ void bpf_prog_kallsyms_del_all(struct bpf_prog *fp) + #ifdef CONFIG_BPF_JIT + /* All BPF JIT sysctl knobs here. */ + int bpf_jit_enable __read_mostly = IS_BUILTIN(CONFIG_BPF_JIT_ALWAYS_ON); +-int bpf_jit_harden __read_mostly; ++int bpf_jit_harden __read_mostly = 2; + int bpf_jit_kallsyms __read_mostly; + long bpf_jit_limit __read_mostly; + +diff --git i/kernel/bpf/syscall.c w/kernel/bpf/syscall.c +index 8bbabab3a..796be1451 100644 +--- i/kernel/bpf/syscall.c ++++ w/kernel/bpf/syscall.c +@@ -48,7 +48,7 @@ static DEFINE_SPINLOCK(prog_idr_lock); + static DEFINE_IDR(map_idr); + static DEFINE_SPINLOCK(map_idr_lock); + +-int sysctl_unprivileged_bpf_disabled __read_mostly; ++int sysctl_unprivileged_bpf_disabled __read_mostly = 1; + + static const struct bpf_map_ops * const bpf_map_types[] = { + #define BPF_PROG_TYPE(_id, _ops) +diff --git i/kernel/capability.c w/kernel/capability.c +index 7718d7dca..8a4ce459d 100644 +--- i/kernel/capability.c ++++ w/kernel/capability.c +@@ -432,6 +432,12 @@ bool capable(int cap) + return ns_capable(&init_user_ns, cap); + } + EXPORT_SYMBOL(capable); ++ ++bool capable_noaudit(int cap) ++{ ++ return ns_capable_noaudit(&init_user_ns, cap); ++} ++EXPORT_SYMBOL(capable_noaudit); + #endif /* CONFIG_MULTIUSER */ + + /** +diff --git i/kernel/power/snapshot.c w/kernel/power/snapshot.c +index f2635fc75..a4c445bf7 100644 +--- i/kernel/power/snapshot.c ++++ w/kernel/power/snapshot.c +@@ -1145,7 +1145,7 @@ void free_basic_memory_bitmaps(void) + + void clear_free_pages(void) + { +-#ifdef CONFIG_PAGE_POISONING_ZERO ++#if defined(CONFIG_PAGE_POISONING_ZERO) || defined(CONFIG_PAGE_SANITIZE) + struct memory_bitmap *bm = free_pages_map; + unsigned long pfn; + +@@ -1162,7 +1162,7 @@ void clear_free_pages(void) + } + memory_bm_position_reset(bm); + pr_info("free pages cleared after restore\n"); +-#endif /* PAGE_POISONING_ZERO */ ++#endif /* PAGE_POISONING_ZERO || PAGE_SANITIZE */ + } + + /** +diff --git i/kernel/rcu/tiny.c w/kernel/rcu/tiny.c +index befc9321a..61e192565 100644 +--- i/kernel/rcu/tiny.c ++++ w/kernel/rcu/tiny.c +@@ -162,7 +162,7 @@ static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp) + } + } + +-static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused) ++static __latent_entropy void rcu_process_callbacks(void) + { + 
__rcu_process_callbacks(&rcu_sched_ctrlblk); + __rcu_process_callbacks(&rcu_bh_ctrlblk); +diff --git i/kernel/rcu/tree.c w/kernel/rcu/tree.c +index f7e89c989..527c17081 100644 +--- i/kernel/rcu/tree.c ++++ w/kernel/rcu/tree.c +@@ -2870,7 +2870,7 @@ __rcu_process_callbacks(struct rcu_state *rsp) + /* + * Do RCU core processing for the current CPU. + */ +-static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused) ++static __latent_entropy void rcu_process_callbacks(void) + { + struct rcu_state *rsp; + +diff --git i/kernel/sched/fair.c w/kernel/sched/fair.c +index 696d08a45..9b273b3e2 100644 +--- i/kernel/sched/fair.c ++++ w/kernel/sched/fair.c +@@ -9732,7 +9732,7 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf) + * run_rebalance_domains is triggered when needed from the scheduler tick. + * Also triggered for nohz idle balancing (with nohz_balancing_kick set). + */ +-static __latent_entropy void run_rebalance_domains(struct softirq_action *h) ++static __latent_entropy void run_rebalance_domains(void) + { + struct rq *this_rq = this_rq(); + enum cpu_idle_type idle = this_rq->idle_balance ? +diff --git i/kernel/softirq.c w/kernel/softirq.c +index 6f584861d..1943fe60f 100644 +--- i/kernel/softirq.c ++++ w/kernel/softirq.c +@@ -53,7 +53,7 @@ DEFINE_PER_CPU_ALIGNED(irq_cpustat_t, irq_stat); + EXPORT_PER_CPU_SYMBOL(irq_stat); + #endif + +-static struct softirq_action softirq_vec[NR_SOFTIRQS] __cacheline_aligned_in_smp; ++static struct softirq_action softirq_vec[NR_SOFTIRQS] __ro_after_init __aligned(PAGE_SIZE); + + DEFINE_PER_CPU(struct task_struct *, ksoftirqd); + +@@ -289,7 +289,7 @@ asmlinkage __visible void __softirq_entry __do_softirq(void) + kstat_incr_softirqs_this_cpu(vec_nr); + + trace_softirq_entry(vec_nr); +- h->action(h); ++ h->action(); + trace_softirq_exit(vec_nr); + if (unlikely(prev_count != preempt_count())) { + pr_err("huh, entered softirq %u %s %p with preempt_count %08x, exited with %08x?\n", +@@ -451,7 +451,7 @@ void __raise_softirq_irqoff(unsigned int nr) + or_softirq_pending(1UL << nr); + } + +-void open_softirq(int nr, void (*action)(struct softirq_action *)) ++void __init open_softirq(int nr, void (*action)(void)) + { + softirq_vec[nr].action = action; + } +@@ -497,8 +497,7 @@ void __tasklet_hi_schedule(struct tasklet_struct *t) + } + EXPORT_SYMBOL(__tasklet_hi_schedule); + +-static void tasklet_action_common(struct softirq_action *a, +- struct tasklet_head *tl_head, ++static void tasklet_action_common(struct tasklet_head *tl_head, + unsigned int softirq_nr) + { + struct tasklet_struct *list; +@@ -535,14 +534,14 @@ static void tasklet_action_common(struct softirq_action *a, + } + } + +-static __latent_entropy void tasklet_action(struct softirq_action *a) ++static __latent_entropy void tasklet_action(void) + { +- tasklet_action_common(a, this_cpu_ptr(&tasklet_vec), TASKLET_SOFTIRQ); ++ tasklet_action_common(this_cpu_ptr(&tasklet_vec), TASKLET_SOFTIRQ); + } + +-static __latent_entropy void tasklet_hi_action(struct softirq_action *a) ++static __latent_entropy void tasklet_hi_action(void) + { +- tasklet_action_common(a, this_cpu_ptr(&tasklet_hi_vec), HI_SOFTIRQ); ++ tasklet_action_common(this_cpu_ptr(&tasklet_hi_vec), HI_SOFTIRQ); + } + + void tasklet_init(struct tasklet_struct *t, +diff --git i/kernel/sysctl.c w/kernel/sysctl.c +index 9f4557040..554ef6bc4 100644 +--- i/kernel/sysctl.c ++++ w/kernel/sysctl.c +@@ -95,6 +95,9 @@ + #ifdef CONFIG_LOCKUP_DETECTOR + #include + #endif ++#if defined CONFIG_TTY ++#include ++#endif + + #if 
defined(CONFIG_SYSCTL) + +@@ -108,6 +111,9 @@ extern unsigned int core_pipe_limit; + #ifdef CONFIG_USER_NS + extern int unprivileged_userns_clone; + #endif ++#ifdef CONFIG_USER_NS ++extern int unprivileged_userns_clone; ++#endif + extern int pid_max; + extern int pid_max_min, pid_max_max; + extern int percpu_pagelist_fraction; +@@ -119,35 +125,35 @@ extern int sysctl_nr_trim_pages; + + /* Constants used for minimum and maximum */ + #ifdef CONFIG_LOCKUP_DETECTOR +-static int sixty = 60; ++static int sixty __read_only = 60; + #endif + +-static int __maybe_unused neg_one = -1; ++static int __maybe_unused neg_one __read_only = -1; + + static int zero; +-static int __maybe_unused one = 1; +-static int __maybe_unused two = 2; +-static int __maybe_unused four = 4; +-static unsigned long zero_ul; +-static unsigned long one_ul = 1; +-static unsigned long long_max = LONG_MAX; +-static int one_hundred = 100; +-static int one_thousand = 1000; ++static int __maybe_unused one __read_only = 1; ++static int __maybe_unused two __read_only = 2; ++static int __maybe_unused four __read_only = 4; ++static unsigned long zero_ul __read_only; ++static unsigned long one_ul __read_only = 1; ++static unsigned long long_max __read_only = LONG_MAX; ++static int one_hundred __read_only = 100; ++static int one_thousand __read_only = 1000; + #ifdef CONFIG_PRINTK +-static int ten_thousand = 10000; ++static int ten_thousand __read_only = 10000; + #endif + #ifdef CONFIG_PERF_EVENTS +-static int six_hundred_forty_kb = 640 * 1024; ++static int six_hundred_forty_kb __read_only = 640 * 1024; + #endif + + /* this is needed for the proc_doulongvec_minmax of vm_dirty_bytes */ +-static unsigned long dirty_bytes_min = 2 * PAGE_SIZE; ++static unsigned long dirty_bytes_min __read_only = 2 * PAGE_SIZE; + + /* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and GID */ +-static int maxolduid = 65535; +-static int minolduid; ++static int maxolduid __read_only = 65535; ++static int minolduid __read_only; + +-static int ngroups_max = NGROUPS_MAX; ++static int ngroups_max __read_only = NGROUPS_MAX; + static const int cap_last_cap = CAP_LAST_CAP; + + /* +@@ -155,9 +161,12 @@ static const int cap_last_cap = CAP_LAST_CAP; + * and hung_task_check_interval_secs + */ + #ifdef CONFIG_DETECT_HUNG_TASK +-static unsigned long hung_task_timeout_max = (LONG_MAX/HZ); ++static unsigned long hung_task_timeout_max __read_only = (LONG_MAX/HZ); + #endif + ++int device_sidechannel_restrict __read_mostly = 1; ++EXPORT_SYMBOL(device_sidechannel_restrict); ++ + #ifdef CONFIG_INOTIFY_USER + #include + #endif +@@ -215,11 +224,6 @@ static int proc_taint(struct ctl_table *table, int write, + void __user *buffer, size_t *lenp, loff_t *ppos); + #endif + +-#ifdef CONFIG_PRINTK +-static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write, +- void __user *buffer, size_t *lenp, loff_t *ppos); +-#endif +- + static int proc_dointvec_minmax_coredump(struct ctl_table *table, int write, + void __user *buffer, size_t *lenp, loff_t *ppos); + #ifdef CONFIG_COREDUMP +@@ -301,19 +305,19 @@ static struct ctl_table sysctl_base_table[] = { + }; + + #ifdef CONFIG_SCHED_DEBUG +-static int min_sched_granularity_ns = 100000; /* 100 usecs */ +-static int max_sched_granularity_ns = NSEC_PER_SEC; /* 1 second */ +-static int min_wakeup_granularity_ns; /* 0 usecs */ +-static int max_wakeup_granularity_ns = NSEC_PER_SEC; /* 1 second */ ++static int min_sched_granularity_ns __read_only = 100000; /* 100 usecs */ ++static int max_sched_granularity_ns __read_only = 
NSEC_PER_SEC; /* 1 second */ ++static int min_wakeup_granularity_ns __read_only; /* 0 usecs */ ++static int max_wakeup_granularity_ns __read_only = NSEC_PER_SEC; /* 1 second */ + #ifdef CONFIG_SMP +-static int min_sched_tunable_scaling = SCHED_TUNABLESCALING_NONE; +-static int max_sched_tunable_scaling = SCHED_TUNABLESCALING_END-1; ++static int min_sched_tunable_scaling __read_only = SCHED_TUNABLESCALING_NONE; ++static int max_sched_tunable_scaling __read_only = SCHED_TUNABLESCALING_END-1; + #endif /* CONFIG_SMP */ + #endif /* CONFIG_SCHED_DEBUG */ + + #ifdef CONFIG_COMPACTION +-static int min_extfrag_threshold; +-static int max_extfrag_threshold = 1000; ++static int min_extfrag_threshold __read_only; ++static int max_extfrag_threshold __read_only = 1000; + #endif + + static struct ctl_table kern_table[] = { +@@ -528,6 +532,15 @@ static struct ctl_table kern_table[] = { + .proc_handler = proc_dointvec, + }, + #endif ++#ifdef CONFIG_USER_NS ++ { ++ .procname = "unprivileged_userns_clone", ++ .data = &unprivileged_userns_clone, ++ .maxlen = sizeof(int), ++ .mode = 0644, ++ .proc_handler = proc_dointvec, ++ }, ++#endif + #ifdef CONFIG_PROC_SYSCTL + { + .procname = "tainted", +@@ -877,6 +890,26 @@ static struct ctl_table kern_table[] = { + .extra2 = &two, + }, + #endif ++#if defined CONFIG_TTY ++ { ++ .procname = "tiocsti_restrict", ++ .data = &tiocsti_restrict, ++ .maxlen = sizeof(int), ++ .mode = 0644, ++ .proc_handler = proc_dointvec_minmax_sysadmin, ++ .extra1 = &zero, ++ .extra2 = &one, ++ }, ++#endif ++ { ++ .procname = "device_sidechannel_restrict", ++ .data = &device_sidechannel_restrict, ++ .maxlen = sizeof(int), ++ .mode = 0644, ++ .proc_handler = proc_dointvec_minmax_sysadmin, ++ .extra1 = &zero, ++ .extra2 = &one, ++ }, + { + .procname = "ngroups_max", + .data = &ngroups_max, +@@ -2537,8 +2570,27 @@ static int proc_taint(struct ctl_table *table, int write, + return err; + } + +-#ifdef CONFIG_PRINTK +-static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write, ++/** ++ * proc_dointvec_minmax_sysadmin - read a vector of integers with min/max values ++ * checking CAP_SYS_ADMIN on write ++ * @table: the sysctl table ++ * @write: %TRUE if this is a write to the sysctl file ++ * @buffer: the user buffer ++ * @lenp: the size of the user buffer ++ * @ppos: file position ++ * ++ * Reads/writes up to table->maxlen/sizeof(unsigned int) integer ++ * values from/to the user buffer, treated as an ASCII string. ++ * ++ * This routine will ensure the values are within the range specified by ++ * table->extra1 (min) and table->extra2 (max). ++ * ++ * Writing is only allowed when root has CAP_SYS_ADMIN. ++ * ++ * Returns 0 on success, -EPERM on permission failure or -EINVAL on write ++ * when the range check fails. 
++ */ ++int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write, + void __user *buffer, size_t *lenp, loff_t *ppos) + { + if (write && !capable(CAP_SYS_ADMIN)) +@@ -2546,7 +2598,6 @@ static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write, + + return proc_dointvec_minmax(table, write, buffer, lenp, ppos); + } +-#endif + + /** + * struct do_proc_dointvec_minmax_conv_param - proc_dointvec_minmax() range checking structure +@@ -3226,6 +3277,12 @@ int proc_douintvec_minmax(struct ctl_table *table, int write, + return -ENOSYS; + } + ++int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write, ++ void *buffer, size_t *lenp, loff_t *ppos) ++{ ++ return -ENOSYS; ++} ++ + int proc_dointvec_jiffies(struct ctl_table *table, int write, + void __user *buffer, size_t *lenp, loff_t *ppos) + { +@@ -3269,6 +3326,7 @@ EXPORT_SYMBOL(proc_douintvec); + EXPORT_SYMBOL(proc_dointvec_jiffies); + EXPORT_SYMBOL(proc_dointvec_minmax); + EXPORT_SYMBOL_GPL(proc_douintvec_minmax); ++EXPORT_SYMBOL(proc_dointvec_minmax_sysadmin); + EXPORT_SYMBOL(proc_dointvec_userhz_jiffies); + EXPORT_SYMBOL(proc_dointvec_ms_jiffies); + EXPORT_SYMBOL(proc_dostring); +diff --git i/kernel/time/hrtimer.c w/kernel/time/hrtimer.c +index 736255441..fb8902236 100644 +--- i/kernel/time/hrtimer.c ++++ w/kernel/time/hrtimer.c +@@ -1465,7 +1465,7 @@ static void __hrtimer_run_queues(struct hrtimer_cpu_base *cpu_base, ktime_t now, + } + } + +-static __latent_entropy void hrtimer_run_softirq(struct softirq_action *h) ++static __latent_entropy void hrtimer_run_softirq(void) + { + struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases); + unsigned long flags; +diff --git i/kernel/time/timer.c w/kernel/time/timer.c +index 61e41ea3a..253b57f3c 100644 +--- i/kernel/time/timer.c ++++ w/kernel/time/timer.c +@@ -1709,7 +1709,7 @@ static inline void __run_timers(struct timer_base *base) + /* + * This function runs timers and the timer-tq in bottom half context. + */ +-static __latent_entropy void run_timer_softirq(struct softirq_action *h) ++static __latent_entropy void run_timer_softirq(void) + { + struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]); + +diff --git i/kernel/user_namespace.c w/kernel/user_namespace.c +index 6b9dbc257..5b4a596ca 100644 +--- i/kernel/user_namespace.c ++++ w/kernel/user_namespace.c +@@ -29,6 +29,13 @@ + /* sysctl */ + int unprivileged_userns_clone; + ++/* sysctl */ ++#ifdef CONFIG_USER_NS_UNPRIVILEGED ++int unprivileged_userns_clone = 1; ++#else ++int unprivileged_userns_clone; ++#endif ++ + static struct kmem_cache *user_ns_cachep __read_mostly; + static DEFINE_MUTEX(userns_state_mutex); + +diff --git i/lib/Kconfig.debug w/lib/Kconfig.debug +index 46a910acc..5b60c663a 100644 +--- i/lib/Kconfig.debug ++++ w/lib/Kconfig.debug +@@ -950,6 +950,7 @@ endmenu # "Debug lockups and hangs" + + config PANIC_ON_OOPS + bool "Panic on Oops" ++ default y + help + Say Y here to enable the kernel to panic when it oopses. This + has the same effect as setting oops=panic on the kernel command +@@ -959,7 +960,7 @@ config PANIC_ON_OOPS + anything erroneous after an oops which could result in data + corruption or other issues. + +- Say N if unsure. ++ Say Y if unsure. + + config PANIC_ON_OOPS_VALUE + int +@@ -1328,6 +1329,7 @@ config DEBUG_BUGVERBOSE + config DEBUG_LIST + bool "Debug linked list manipulation" + depends on DEBUG_KERNEL || BUG_ON_DATA_CORRUPTION ++ default y + help + Enable this to turn on extended checks in the linked-list + walking routines. 
+@@ -1983,6 +1985,7 @@ config MEMTEST + config BUG_ON_DATA_CORRUPTION + bool "Trigger a BUG when data corruption is detected" + select DEBUG_LIST ++ default y + help + Select this option if the kernel should BUG when it encounters + data corruption in kernel memory structures when they get checked +@@ -2022,6 +2025,7 @@ config STRICT_DEVMEM + config IO_STRICT_DEVMEM + bool "Filter I/O access to /dev/mem" + depends on STRICT_DEVMEM ++ default y + ---help--- + If this option is disabled, you allow userspace (root) access to all + io-memory regardless of whether a driver is actively using that +diff --git i/lib/irq_poll.c w/lib/irq_poll.c +index 86a709954..6f15787fc 100644 +--- i/lib/irq_poll.c ++++ w/lib/irq_poll.c +@@ -75,7 +75,7 @@ void irq_poll_complete(struct irq_poll *iop) + } + EXPORT_SYMBOL(irq_poll_complete); + +-static void __latent_entropy irq_poll_softirq(struct softirq_action *h) ++static void __latent_entropy irq_poll_softirq(void) + { + struct list_head *list = this_cpu_ptr(&blk_cpu_iopoll); + int rearm = 0, budget = irq_poll_budget; +diff --git i/lib/kobject.c w/lib/kobject.c +index 97d86dc17..388257c28 100644 +--- i/lib/kobject.c ++++ w/lib/kobject.c +@@ -978,9 +978,9 @@ EXPORT_SYMBOL_GPL(kset_create_and_add); + + + static DEFINE_SPINLOCK(kobj_ns_type_lock); +-static const struct kobj_ns_type_operations *kobj_ns_ops_tbl[KOBJ_NS_TYPES]; ++static const struct kobj_ns_type_operations *kobj_ns_ops_tbl[KOBJ_NS_TYPES] __ro_after_init; + +-int kobj_ns_type_register(const struct kobj_ns_type_operations *ops) ++int __init kobj_ns_type_register(const struct kobj_ns_type_operations *ops) + { + enum kobj_ns_type type = ops->type; + int error; +diff --git i/lib/nlattr.c w/lib/nlattr.c +index e335bcafa..f6334f882 100644 +--- i/lib/nlattr.c ++++ w/lib/nlattr.c +@@ -364,6 +364,8 @@ int nla_memcpy(void *dest, const struct nlattr *src, int count) + { + int minlen = min_t(int, count, nla_len(src)); + ++ BUG_ON(minlen < 0); ++ + memcpy(dest, nla_data(src), minlen); + if (count > minlen) + memset(dest + minlen, 0, count - minlen); +diff --git i/lib/vsprintf.c w/lib/vsprintf.c +index 812e59e13..2c2104884 100644 +--- i/lib/vsprintf.c ++++ w/lib/vsprintf.c +@@ -1371,7 +1371,7 @@ char *pointer_string(char *buf, char *end, const void *ptr, + return number(buf, end, (unsigned long int)ptr, spec); + } + +-int kptr_restrict __read_mostly; ++int kptr_restrict __read_mostly = 2; + + static noinline_for_stack + char *restricted_pointer(char *buf, char *end, const void *ptr, +diff --git i/mm/Kconfig w/mm/Kconfig +index b457e94ae..ec2440e66 100644 +--- i/mm/Kconfig ++++ w/mm/Kconfig +@@ -311,7 +311,8 @@ config KSM + config DEFAULT_MMAP_MIN_ADDR + int "Low address space to protect from user allocation" + depends on MMU +- default 4096 ++ default 32768 if ARM || (ARM64 && COMPAT) ++ default 65536 + help + This is the portion of low virtual memory which should be protected + from userspace allocation. 
Keeping a user from writing to low pages +diff --git i/mm/mmap.c w/mm/mmap.c +index af65f8895..63f5b2bf5 100644 +--- i/mm/mmap.c ++++ w/mm/mmap.c +@@ -224,6 +224,13 @@ SYSCALL_DEFINE1(brk, unsigned long, brk) + + newbrk = PAGE_ALIGN(brk); + oldbrk = PAGE_ALIGN(mm->brk); ++ /* properly handle unaligned min_brk as an empty heap */ ++ if (min_brk & ~PAGE_MASK) { ++ if (brk == min_brk) ++ newbrk -= PAGE_SIZE; ++ if (mm->brk == min_brk) ++ oldbrk -= PAGE_SIZE; ++ } + if (oldbrk == newbrk) + goto set_brk; + +diff --git i/mm/page_alloc.c w/mm/page_alloc.c +index 4325e7d58..7b18ed518 100644 +--- i/mm/page_alloc.c ++++ w/mm/page_alloc.c +@@ -67,6 +67,7 @@ + #include + #include + #include ++#include + + #include + #include +@@ -100,6 +101,15 @@ int _node_numa_mem_[MAX_NUMNODES]; + DEFINE_MUTEX(pcpu_drain_mutex); + DEFINE_PER_CPU(struct work_struct, pcpu_drain); + ++bool __meminitdata extra_latent_entropy; ++ ++static int __init setup_extra_latent_entropy(char *str) ++{ ++ extra_latent_entropy = true; ++ return 0; ++} ++early_param("extra_latent_entropy", setup_extra_latent_entropy); ++ + #ifdef CONFIG_GCC_PLUGIN_LATENT_ENTROPY + volatile unsigned long latent_entropy __latent_entropy; + EXPORT_SYMBOL(latent_entropy); +@@ -1056,6 +1066,13 @@ static __always_inline bool free_pages_prepare(struct page *page, + debug_check_no_obj_freed(page_address(page), + PAGE_SIZE << order); + } ++ ++ if (IS_ENABLED(CONFIG_PAGE_SANITIZE)) { ++ int i; ++ for (i = 0; i < (1 << order); i++) ++ clear_highpage(page + i); ++ } ++ + arch_free_page(page, order); + kernel_poison_pages(page, 1 << order, 0); + kernel_map_pages(page, 1 << order, 0); +@@ -1301,6 +1318,21 @@ static void __init __free_pages_boot_core(struct page *page, unsigned int order) + __ClearPageReserved(p); + set_page_count(p, 0); + ++ if (extra_latent_entropy && !PageHighMem(page) && page_to_pfn(page) < 0x100000) { ++ unsigned long hash = 0; ++ size_t index, end = PAGE_SIZE * nr_pages / sizeof hash; ++ const unsigned long *data = lowmem_page_address(page); ++ ++ for (index = 0; index < end; index++) ++ hash ^= hash + data[index]; ++#ifdef CONFIG_GCC_PLUGIN_LATENT_ENTROPY ++ latent_entropy ^= hash; ++ add_device_randomness((const void *)&latent_entropy, sizeof(latent_entropy)); ++#else ++ add_device_randomness((const void *)&hash, sizeof(hash)); ++#endif ++ } ++ + page_zone(page)->managed_pages += nr_pages; + set_page_refcounted(page); + __free_pages(page, order); +@@ -1886,8 +1918,8 @@ static inline int check_new_page(struct page *page) + + static inline bool free_pages_prezeroed(void) + { +- return IS_ENABLED(CONFIG_PAGE_POISONING_ZERO) && +- page_poisoning_enabled(); ++ return IS_ENABLED(CONFIG_PAGE_SANITIZE) || ++ (IS_ENABLED(CONFIG_PAGE_POISONING_ZERO) && page_poisoning_enabled()); + } + + #ifdef CONFIG_DEBUG_VM +@@ -1944,6 +1976,11 @@ static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags + + post_alloc_hook(page, order, gfp_flags); + ++ if (IS_ENABLED(CONFIG_PAGE_SANITIZE_VERIFY)) { ++ for (i = 0; i < (1 << order); i++) ++ verify_zero_highpage(page + i); ++ } ++ + if (!free_pages_prezeroed() && (gfp_flags & __GFP_ZERO)) + for (i = 0; i < (1 << order); i++) + clear_highpage(page + i); +diff --git i/mm/slab.h w/mm/slab.h +index 9632772e1..802ff9ee8 100644 +--- i/mm/slab.h ++++ w/mm/slab.h +@@ -314,7 +314,11 @@ static inline bool is_root_cache(struct kmem_cache *s) + static inline bool slab_equal_or_root(struct kmem_cache *s, + struct kmem_cache *p) + { ++#ifdef CONFIG_SLAB_HARDENED ++ return p == s; ++#else + return true; 
++#endif + } + + static inline const char *cache_name(struct kmem_cache *s) +@@ -366,18 +370,26 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x) + * to not do even the assignment. In that case, slab_equal_or_root + * will also be a constant. + */ +- if (!memcg_kmem_enabled() && ++ if (!IS_ENABLED(CONFIG_SLAB_HARDENED) && ++ !memcg_kmem_enabled() && + !unlikely(s->flags & SLAB_CONSISTENCY_CHECKS)) + return s; + + page = virt_to_head_page(x); ++#ifdef CONFIG_SLAB_HARDENED ++ BUG_ON(!PageSlab(page)); ++#endif + cachep = page->slab_cache; + if (slab_equal_or_root(cachep, s)) + return cachep; + + pr_err("%s: Wrong slab cache. %s but object is from %s\n", + __func__, s->name, cachep->name); ++#ifdef CONFIG_BUG_ON_DATA_CORRUPTION ++ BUG_ON(1); ++#else + WARN_ON_ONCE(1); ++#endif + return s; + } + +@@ -402,7 +414,7 @@ static inline size_t slab_ksize(const struct kmem_cache *s) + * back there or track user information then we can + * only use the space before that information. + */ +- if (s->flags & (SLAB_TYPESAFE_BY_RCU | SLAB_STORE_USER)) ++ if ((s->flags & (SLAB_TYPESAFE_BY_RCU | SLAB_STORE_USER)) || IS_ENABLED(CONFIG_SLAB_CANARY)) + return s->inuse; + /* + * Else we can use all the padding etc for the allocation +diff --git i/mm/slab_common.c w/mm/slab_common.c +index a94b9981e..52a95af6a 100644 +--- i/mm/slab_common.c ++++ w/mm/slab_common.c +@@ -27,10 +27,10 @@ + + #include "slab.h" + +-enum slab_state slab_state; ++enum slab_state slab_state __ro_after_init; + LIST_HEAD(slab_caches); + DEFINE_MUTEX(slab_mutex); +-struct kmem_cache *kmem_cache; ++struct kmem_cache *kmem_cache __ro_after_init; + + #ifdef CONFIG_HARDENED_USERCOPY + bool usercopy_fallback __ro_after_init = +@@ -58,7 +58,7 @@ static DECLARE_WORK(slab_caches_to_rcu_destroy_work, + /* + * Merge control. If this is set then no merging of slab caches will occur. 
+ */ +-static bool slab_nomerge = !IS_ENABLED(CONFIG_SLAB_MERGE_DEFAULT); ++static bool slab_nomerge __ro_after_init = !IS_ENABLED(CONFIG_SLAB_MERGE_DEFAULT); + + static int __init setup_slab_nomerge(char *str) + { +diff --git i/mm/slub.c w/mm/slub.c +index dfc9b4267..3fc9de7ef 100644 +--- i/mm/slub.c ++++ w/mm/slub.c +@@ -124,6 +124,16 @@ static inline int kmem_cache_debug(struct kmem_cache *s) + #endif + } + ++static inline bool has_sanitize(struct kmem_cache *s) ++{ ++ return IS_ENABLED(CONFIG_SLAB_SANITIZE) && !(s->flags & (SLAB_TYPESAFE_BY_RCU | SLAB_POISON)); ++} ++ ++static inline bool has_sanitize_verify(struct kmem_cache *s) ++{ ++ return IS_ENABLED(CONFIG_SLAB_SANITIZE_VERIFY) && has_sanitize(s); ++} ++ + void *fixup_red_left(struct kmem_cache *s, void *p) + { + if (kmem_cache_debug(s) && s->flags & SLAB_RED_ZONE) +@@ -297,6 +307,35 @@ static inline void set_freepointer(struct kmem_cache *s, void *object, void *fp) + *(void **)freeptr_addr = freelist_ptr(s, fp, freeptr_addr); + } + ++#ifdef CONFIG_SLAB_CANARY ++static inline unsigned long *get_canary(struct kmem_cache *s, void *object) ++{ ++ if (s->offset) ++ return object + s->offset + sizeof(void *); ++ return object + s->inuse; ++} ++ ++static inline unsigned long get_canary_value(const void *canary, unsigned long value) ++{ ++ return (value ^ (unsigned long)canary) & CANARY_MASK; ++} ++ ++static inline void set_canary(struct kmem_cache *s, void *object, unsigned long value) ++{ ++ unsigned long *canary = get_canary(s, object); ++ *canary = get_canary_value(canary, value); ++} ++ ++static inline void check_canary(struct kmem_cache *s, void *object, unsigned long value) ++{ ++ unsigned long *canary = get_canary(s, object); ++ BUG_ON(*canary != get_canary_value(canary, value)); ++} ++#else ++#define set_canary(s, object, value) ++#define check_canary(s, object, value) ++#endif ++ + /* Loop over all objects in a slab */ + #define for_each_object(__p, __s, __addr, __objects) \ + for (__p = fixup_red_left(__s, __addr); \ +@@ -469,13 +508,13 @@ static inline void *restore_red_left(struct kmem_cache *s, void *p) + * Debug settings: + */ + #if defined(CONFIG_SLUB_DEBUG_ON) +-static slab_flags_t slub_debug = DEBUG_DEFAULT_FLAGS; ++static slab_flags_t slub_debug __ro_after_init = DEBUG_DEFAULT_FLAGS; + #else +-static slab_flags_t slub_debug; ++static slab_flags_t slub_debug __ro_after_init; + #endif + +-static char *slub_debug_slabs; +-static int disable_higher_order_debug; ++static char *slub_debug_slabs __ro_after_init; ++static int disable_higher_order_debug __ro_after_init; + + /* + * slub is about to manipulate internal object metadata. This memory lies +@@ -535,6 +574,9 @@ static struct track *get_track(struct kmem_cache *s, void *object, + else + p = object + s->inuse; + ++ if (IS_ENABLED(CONFIG_SLAB_CANARY)) ++ p = (void *)p + sizeof(void *); ++ + return p + alloc; + } + +@@ -688,6 +730,9 @@ static void print_trailer(struct kmem_cache *s, struct page *page, u8 *p) + else + off = s->inuse; + ++ if (IS_ENABLED(CONFIG_SLAB_CANARY)) ++ off += sizeof(void *); ++ + if (s->flags & SLAB_STORE_USER) + off += 2 * sizeof(struct track); + +@@ -817,6 +862,9 @@ static int check_pad_bytes(struct kmem_cache *s, struct page *page, u8 *p) + /* Freepointer is placed after the object. 
*/ + off += sizeof(void *); + ++ if (IS_ENABLED(CONFIG_SLAB_CANARY)) ++ off += sizeof(void *); ++ + if (s->flags & SLAB_STORE_USER) + /* We also have user information there */ + off += 2 * sizeof(struct track); +@@ -1436,8 +1484,9 @@ static void setup_object(struct kmem_cache *s, struct page *page, + void *object) + { + setup_object_debug(s, page, object); ++ set_canary(s, object, s->random_inactive); + kasan_init_slab_obj(s, object); +- if (unlikely(s->ctor)) { ++ if (unlikely(s->ctor) && !has_sanitize_verify(s)) { + kasan_unpoison_object_data(s, object); + s->ctor(object); + kasan_poison_object_data(s, object); +@@ -2735,9 +2784,21 @@ static __always_inline void *slab_alloc_node(struct kmem_cache *s, + stat(s, ALLOC_FASTPATH); + } + +- if (unlikely(gfpflags & __GFP_ZERO) && object) ++ if (has_sanitize_verify(s) && object) { ++ size_t offset = s->offset ? 0 : sizeof(void *); ++ BUG_ON(memchr_inv(object + offset, 0, s->object_size - offset)); ++ if (s->ctor) ++ s->ctor(object); ++ if (unlikely(gfpflags & __GFP_ZERO) && offset) ++ memset(object, 0, sizeof(void *)); ++ } else if (unlikely(gfpflags & __GFP_ZERO) && object) + memset(object, 0, s->object_size); + ++ if (object) { ++ check_canary(s, object, s->random_inactive); ++ set_canary(s, object, s->random_active); ++ } ++ + slab_post_alloc_hook(s, gfpflags, 1, &object); + + return object; +@@ -2944,6 +3005,27 @@ static __always_inline void do_slab_free(struct kmem_cache *s, + void *tail_obj = tail ? : head; + struct kmem_cache_cpu *c; + unsigned long tid; ++ bool sanitize = has_sanitize(s); ++ ++ if (IS_ENABLED(CONFIG_SLAB_CANARY) || sanitize) { ++ __maybe_unused int offset = s->offset ? 0 : sizeof(void *); ++ void *x = head; ++ ++ while (1) { ++ check_canary(s, x, s->random_active); ++ set_canary(s, x, s->random_inactive); ++ ++ if (sanitize) { ++ memset(x + offset, 0, s->object_size - offset); ++ if (!IS_ENABLED(CONFIG_SLAB_SANITIZE_VERIFY) && s->ctor) ++ s->ctor(x); ++ } ++ if (x == tail_obj) ++ break; ++ x = get_freepointer(s, x); ++ } ++ } ++ + redo: + /* + * Determine the currently cpus per cpu slab. +@@ -3122,7 +3204,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size, + void **p) + { + struct kmem_cache_cpu *c; +- int i; ++ int i, k; + + /* memcg and kmem_cache debug support */ + s = slab_pre_alloc_hook(s, flags); +@@ -3168,13 +3250,29 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size, + local_irq_enable(); + + /* Clear memory outside IRQ disabled fastpath loop */ +- if (unlikely(flags & __GFP_ZERO)) { ++ if (has_sanitize_verify(s)) { ++ int j; ++ ++ for (j = 0; j < i; j++) { ++ size_t offset = s->offset ? 0 : sizeof(void *); ++ BUG_ON(memchr_inv(p[j] + offset, 0, s->object_size - offset)); ++ if (s->ctor) ++ s->ctor(p[j]); ++ if (unlikely(flags & __GFP_ZERO) && offset) ++ memset(p[j], 0, sizeof(void *)); ++ } ++ } else if (unlikely(flags & __GFP_ZERO)) { + int j; + + for (j = 0; j < i; j++) + memset(p[j], 0, s->object_size); + } + ++ for (k = 0; k < i; k++) { ++ check_canary(s, p[k], s->random_inactive); ++ set_canary(s, p[k], s->random_active); ++ } ++ + /* memcg and kmem_cache debug support */ + slab_post_alloc_hook(s, flags, size, p); + return i; +@@ -3206,9 +3304,9 @@ EXPORT_SYMBOL(kmem_cache_alloc_bulk); + * and increases the number of allocations possible without having to + * take the list_lock. 
+ */ +-static unsigned int slub_min_order; +-static unsigned int slub_max_order = PAGE_ALLOC_COSTLY_ORDER; +-static unsigned int slub_min_objects; ++static unsigned int slub_min_order __ro_after_init; ++static unsigned int slub_max_order __ro_after_init = PAGE_ALLOC_COSTLY_ORDER; ++static unsigned int slub_min_objects __ro_after_init; + + /* + * Calculate the order of allocation given an slab object size. +@@ -3380,6 +3478,7 @@ static void early_kmem_cache_node_alloc(int node) + init_object(kmem_cache_node, n, SLUB_RED_ACTIVE); + init_tracking(kmem_cache_node, n); + #endif ++ set_canary(kmem_cache_node, n, kmem_cache_node->random_active); + kasan_kmalloc(kmem_cache_node, n, sizeof(struct kmem_cache_node), + GFP_KERNEL); + init_kmem_cache_node(n); +@@ -3536,6 +3635,9 @@ static int calculate_sizes(struct kmem_cache *s, int forced_order) + size += sizeof(void *); + } + ++ if (IS_ENABLED(CONFIG_SLAB_CANARY)) ++ size += sizeof(void *); ++ + #ifdef CONFIG_SLUB_DEBUG + if (flags & SLAB_STORE_USER) + /* +@@ -3608,6 +3710,10 @@ static int kmem_cache_open(struct kmem_cache *s, slab_flags_t flags) + #ifdef CONFIG_SLAB_FREELIST_HARDENED + s->random = get_random_long(); + #endif ++#ifdef CONFIG_SLAB_CANARY ++ s->random_active = get_random_long(); ++ s->random_inactive = get_random_long(); ++#endif + + if (!calculate_sizes(s, -1)) + goto error; +@@ -3884,6 +3990,8 @@ void __check_heap_object(const void *ptr, unsigned long n, struct page *page, + offset -= s->red_left_pad; + } + ++ check_canary(s, (void *)ptr - offset, s->random_active); ++ + /* Allow address range falling entirely within usercopy region. */ + if (offset >= s->useroffset && + offset - s->useroffset <= s->usersize && +@@ -3917,7 +4025,11 @@ static size_t __ksize(const void *object) + page = virt_to_head_page(object); + + if (unlikely(!PageSlab(page))) { ++#ifdef CONFIG_BUG_ON_DATA_CORRUPTION ++ BUG_ON(!PageCompound(page)); ++#else + WARN_ON(!PageCompound(page)); ++#endif + return PAGE_SIZE << compound_order(page); + } + +@@ -4777,7 +4889,7 @@ enum slab_stat_type { + #define SO_TOTAL (1 << SL_TOTAL) + + #ifdef CONFIG_MEMCG +-static bool memcg_sysfs_enabled = IS_ENABLED(CONFIG_SLUB_MEMCG_SYSFS_ON); ++static bool memcg_sysfs_enabled __ro_after_init = IS_ENABLED(CONFIG_SLUB_MEMCG_SYSFS_ON); + + static int __init setup_slub_memcg_sysfs(char *str) + { +diff --git i/mm/swap.c w/mm/swap.c +index 45fdbfb6b..55ec851eb 100644 +--- i/mm/swap.c ++++ w/mm/swap.c +@@ -93,6 +93,13 @@ static void __put_compound_page(struct page *page) + if (!PageHuge(page)) + __page_cache_release(page); + dtor = get_compound_page_dtor(page); ++ if (!PageHuge(page)) ++ BUG_ON(dtor != free_compound_page ++#ifdef CONFIG_TRANSPARENT_HUGEPAGE ++ && dtor != free_transhuge_page ++#endif ++ ); ++ + (*dtor)(page); + } + +diff --git i/net/core/dev.c w/net/core/dev.c +index c77d12a35..830418c2d 100644 +--- i/net/core/dev.c ++++ w/net/core/dev.c +@@ -4533,7 +4533,7 @@ int netif_rx_ni(struct sk_buff *skb) + } + EXPORT_SYMBOL(netif_rx_ni); + +-static __latent_entropy void net_tx_action(struct softirq_action *h) ++static __latent_entropy void net_tx_action(void) + { + struct softnet_data *sd = this_cpu_ptr(&softnet_data); + +@@ -6312,7 +6312,7 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll) + return work; + } + +-static __latent_entropy void net_rx_action(struct softirq_action *h) ++static __latent_entropy void net_rx_action(void) + { + struct softnet_data *sd = this_cpu_ptr(&softnet_data); + unsigned long time_limit = jiffies + +diff --git 
i/net/dccp/ccids/ccid2.c w/net/dccp/ccids/ccid2.c +index 842a9c7c7..c1fc4a4a3 100644 +--- i/net/dccp/ccids/ccid2.c ++++ w/net/dccp/ccids/ccid2.c +@@ -139,21 +139,26 @@ static void dccp_tasklet_schedule(struct sock *sk) + + static void ccid2_hc_tx_rto_expire(struct timer_list *t) + { +- struct ccid2_hc_tx_sock *hc = from_timer(hc, t, tx_rtotimer); +- struct sock *sk = hc->sk; +- const bool sender_was_blocked = ccid2_cwnd_network_limited(hc); ++ struct dccp_sock *dp = from_timer(dp, t, dccps_ccid_timer); ++ struct sock *sk = (struct sock *)dp; ++ struct ccid2_hc_tx_sock *hc; ++ bool sender_was_blocked; + + bh_lock_sock(sk); ++ ++ if (inet_sk_state_load(sk) == DCCP_CLOSED) ++ goto out; ++ ++ hc = ccid_priv(dp->dccps_hc_tx_ccid); ++ sender_was_blocked = ccid2_cwnd_network_limited(hc); ++ + if (sock_owned_by_user(sk)) { +- sk_reset_timer(sk, &hc->tx_rtotimer, jiffies + HZ / 5); ++ sk_reset_timer(sk, &dp->dccps_ccid_timer, jiffies + HZ / 5); + goto out; + } + + ccid2_pr_debug("RTO_EXPIRE\n"); + +- if (sk->sk_state == DCCP_CLOSED) +- goto out; +- + /* back-off timer */ + hc->tx_rto <<= 1; + if (hc->tx_rto > DCCP_RTO_MAX) +@@ -179,7 +184,7 @@ static void ccid2_hc_tx_rto_expire(struct timer_list *t) + if (sender_was_blocked) + dccp_tasklet_schedule(sk); + /* restart backed-off timer */ +- sk_reset_timer(sk, &hc->tx_rtotimer, jiffies + hc->tx_rto); ++ sk_reset_timer(sk, &dp->dccps_ccid_timer, jiffies + hc->tx_rto); + out: + bh_unlock_sock(sk); + sock_put(sk); +@@ -343,7 +348,7 @@ static void ccid2_hc_tx_packet_sent(struct sock *sk, unsigned int len) + } + #endif + +- sk_reset_timer(sk, &hc->tx_rtotimer, jiffies + hc->tx_rto); ++ sk_reset_timer(sk, &dp->dccps_ccid_timer, jiffies + hc->tx_rto); + + #ifdef CONFIG_IP_DCCP_CCID2_DEBUG + do { +@@ -713,9 +718,9 @@ static void ccid2_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb) + + /* restart RTO timer if not all outstanding data has been acked */ + if (hc->tx_pipe == 0) +- sk_stop_timer(sk, &hc->tx_rtotimer); ++ sk_stop_timer(sk, &dp->dccps_ccid_timer); + else +- sk_reset_timer(sk, &hc->tx_rtotimer, jiffies + hc->tx_rto); ++ sk_reset_timer(sk, &dp->dccps_ccid_timer, jiffies + hc->tx_rto); + done: + /* check if incoming Acks allow pending packets to be sent */ + if (sender_was_blocked && !ccid2_cwnd_network_limited(hc)) +@@ -750,17 +755,18 @@ static int ccid2_hc_tx_init(struct ccid *ccid, struct sock *sk) + hc->tx_last_cong = hc->tx_lsndtime = hc->tx_cwnd_stamp = ccid2_jiffies32; + hc->tx_cwnd_used = 0; + hc->sk = sk; +- timer_setup(&hc->tx_rtotimer, ccid2_hc_tx_rto_expire, 0); ++ timer_setup(&dp->dccps_ccid_timer, ccid2_hc_tx_rto_expire, 0); + INIT_LIST_HEAD(&hc->tx_av_chunks); + return 0; + } + + static void ccid2_hc_tx_exit(struct sock *sk) + { ++ struct dccp_sock *dp = dccp_sk(sk); + struct ccid2_hc_tx_sock *hc = ccid2_hc_tx_sk(sk); + int i; + +- sk_stop_timer(sk, &hc->tx_rtotimer); ++ sk_stop_timer(sk, &dp->dccps_ccid_timer); + + for (i = 0; i < hc->tx_seqbufc; i++) + kfree(hc->tx_seqbuf[i]); +diff --git i/net/dccp/ccids/ccid3.c w/net/dccp/ccids/ccid3.c +index 12877a151..be3a80a09 100644 +--- i/net/dccp/ccids/ccid3.c ++++ w/net/dccp/ccids/ccid3.c +@@ -197,17 +197,24 @@ static inline void ccid3_hc_tx_update_win_count(struct ccid3_hc_tx_sock *hc, + + static void ccid3_hc_tx_no_feedback_timer(struct timer_list *t) + { +- struct ccid3_hc_tx_sock *hc = from_timer(hc, t, tx_no_feedback_timer); +- struct sock *sk = hc->sk; ++ struct dccp_sock *dp = from_timer(dp, t, dccps_ccid_timer); ++ struct ccid3_hc_tx_sock *hc; ++ struct sock *sk = (struct sock 
*)dp; + unsigned long t_nfb = USEC_PER_SEC / 5; + + bh_lock_sock(sk); ++ ++ if (inet_sk_state_load(sk) == DCCP_CLOSED) ++ goto out; ++ + if (sock_owned_by_user(sk)) { + /* Try again later. */ + /* XXX: set some sensible MIB */ + goto restart_timer; + } + ++ hc = ccid_priv(dp->dccps_hc_tx_ccid); ++ + ccid3_pr_debug("%s(%p, state=%s) - entry\n", dccp_role(sk), sk, + ccid3_tx_state_name(hc->tx_state)); + +@@ -263,8 +270,8 @@ static void ccid3_hc_tx_no_feedback_timer(struct timer_list *t) + t_nfb = max(hc->tx_t_rto, 2 * hc->tx_t_ipi); + + restart_timer: +- sk_reset_timer(sk, &hc->tx_no_feedback_timer, +- jiffies + usecs_to_jiffies(t_nfb)); ++ sk_reset_timer(sk, &dp->dccps_ccid_timer, ++ jiffies + usecs_to_jiffies(t_nfb)); + out: + bh_unlock_sock(sk); + sock_put(sk); +@@ -293,7 +300,7 @@ static int ccid3_hc_tx_send_packet(struct sock *sk, struct sk_buff *skb) + return -EBADMSG; + + if (hc->tx_state == TFRC_SSTATE_NO_SENT) { +- sk_reset_timer(sk, &hc->tx_no_feedback_timer, (jiffies + ++ sk_reset_timer(sk, &dp->dccps_ccid_timer, (jiffies + + usecs_to_jiffies(TFRC_INITIAL_TIMEOUT))); + hc->tx_last_win_count = 0; + hc->tx_t_last_win_count = now; +@@ -367,6 +374,7 @@ static void ccid3_hc_tx_packet_sent(struct sock *sk, unsigned int len) + static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb) + { + struct ccid3_hc_tx_sock *hc = ccid3_hc_tx_sk(sk); ++ struct dccp_sock *dp = dccp_sk(sk); + struct tfrc_tx_hist_entry *acked; + ktime_t now; + unsigned long t_nfb; +@@ -433,7 +441,7 @@ static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb) + (unsigned int)(hc->tx_x >> 6)); + + /* unschedule no feedback timer */ +- sk_stop_timer(sk, &hc->tx_no_feedback_timer); ++ sk_stop_timer(sk, &dp->dccps_ccid_timer); + + /* + * As we have calculated new ipi, delta, t_nom it is possible +@@ -458,8 +466,8 @@ static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb) + "expire in %lu jiffies (%luus)\n", + dccp_role(sk), sk, usecs_to_jiffies(t_nfb), t_nfb); + +- sk_reset_timer(sk, &hc->tx_no_feedback_timer, +- jiffies + usecs_to_jiffies(t_nfb)); ++ sk_reset_timer(sk, &dp->dccps_ccid_timer, ++ jiffies + usecs_to_jiffies(t_nfb)); + } + + static int ccid3_hc_tx_parse_options(struct sock *sk, u8 packet_type, +@@ -501,21 +509,23 @@ static int ccid3_hc_tx_parse_options(struct sock *sk, u8 packet_type, + + static int ccid3_hc_tx_init(struct ccid *ccid, struct sock *sk) + { ++ struct dccp_sock *dp = dccp_sk(sk); + struct ccid3_hc_tx_sock *hc = ccid_priv(ccid); + + hc->tx_state = TFRC_SSTATE_NO_SENT; + hc->tx_hist = NULL; + hc->sk = sk; +- timer_setup(&hc->tx_no_feedback_timer, ++ timer_setup(&dp->dccps_ccid_timer, + ccid3_hc_tx_no_feedback_timer, 0); + return 0; + } + + static void ccid3_hc_tx_exit(struct sock *sk) + { ++ struct dccp_sock *dp = dccp_sk(sk); + struct ccid3_hc_tx_sock *hc = ccid3_hc_tx_sk(sk); + +- sk_stop_timer(sk, &hc->tx_no_feedback_timer); ++ sk_stop_timer(sk, &dp->dccps_ccid_timer); + tfrc_tx_hist_purge(&hc->tx_hist); + } + +diff --git i/net/dccp/proto.c w/net/dccp/proto.c +index 43733accf..d08459db0 100644 +--- i/net/dccp/proto.c ++++ w/net/dccp/proto.c +@@ -283,7 +283,9 @@ int dccp_disconnect(struct sock *sk, int flags) + + dccp_clear_xmit_timers(sk); + ccid_hc_rx_delete(dp->dccps_hc_rx_ccid, sk); ++ ccid_hc_tx_delete(dp->dccps_hc_tx_ccid, sk); + dp->dccps_hc_rx_ccid = NULL; ++ dp->dccps_hc_tx_ccid = NULL; + + __skb_queue_purge(&sk->sk_receive_queue); + __skb_queue_purge(&sk->sk_write_queue); +diff --git i/net/ipv4/Kconfig w/net/ipv4/Kconfig +index 
2e12f8482..99718ae2a 100644 +--- i/net/ipv4/Kconfig ++++ w/net/ipv4/Kconfig +@@ -266,6 +266,7 @@ config IP_PIMSM_V2 + + config SYN_COOKIES + bool "IP: TCP syncookie support" ++ default y + ---help--- + Normal TCP/IP networking is open to an attack known as "SYN + flooding". This denial-of-service attack prevents legitimate remote +@@ -754,3 +755,26 @@ config TCP_MD5SIG + on the Internet. + + If unsure, say N. ++ ++config TCP_SIMULT_CONNECT_DEFAULT_ON ++ bool "Enable TCP simultaneous connect" ++ help ++ Enable TCP simultaneous connect that adds a weakness in Linux's strict ++ implementation of TCP that allows two clients to connect to each other ++ without either entering a listening state. The weakness allows an ++ attacker to easily prevent a client from connecting to a known server ++ provided the source port for the connection is guessed correctly. ++ ++ As the weakness could be used to prevent an antivirus or IPS from ++ fetching updates, or prevent an SSL gateway from fetching a CRL, it ++ should be eliminated by disabling this option. Though Linux is one of ++ few operating systems supporting simultaneous connect, it has no ++ legitimate use in practice and is rarely supported by firewalls. ++ ++ Disabling this may break TCP STUNT which is used by some applications ++ for NAT traversal. ++ ++ This setting can be overridden at runtime via the ++ net.ipv4.tcp_simult_connect sysctl. ++ ++ If unsure, say N. +diff --git i/net/ipv4/sysctl_net_ipv4.c w/net/ipv4/sysctl_net_ipv4.c +index ad132b6e8..0e17aa9d6 100644 +--- i/net/ipv4/sysctl_net_ipv4.c ++++ w/net/ipv4/sysctl_net_ipv4.c +@@ -552,6 +552,15 @@ static struct ctl_table ipv4_table[] = { + .mode = 0644, + .proc_handler = proc_doulongvec_minmax, + }, ++ { ++ .procname = "tcp_simult_connect", ++ .data = &sysctl_tcp_simult_connect, ++ .maxlen = sizeof(int), ++ .mode = 0644, ++ .proc_handler = proc_dointvec_minmax, ++ .extra1 = &zero, ++ .extra2 = &one, ++ }, + { } + }; + +diff --git i/net/ipv4/tcp_input.c w/net/ipv4/tcp_input.c +index 9813d62de..36333f003 100644 +--- i/net/ipv4/tcp_input.c ++++ w/net/ipv4/tcp_input.c +@@ -81,6 +81,7 @@ + #include + + int sysctl_tcp_max_orphans __read_mostly = NR_FILE; ++int sysctl_tcp_simult_connect __read_mostly = IS_ENABLED(CONFIG_TCP_SIMULT_CONNECT_DEFAULT_ON); + + #define FLAG_DATA 0x01 /* Incoming frame contained data. */ + #define FLAG_WIN_UPDATE 0x02 /* Incoming ACK was a window update. */ +@@ -5941,7 +5942,7 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, + tcp_paws_reject(&tp->rx_opt, 0)) + goto discard_and_undo; + +- if (th->syn) { ++ if (th->syn && sysctl_tcp_simult_connect) { + /* We see SYN without ACK. It is attempt of + * simultaneous connect with crossed SYNs. + * Particularly, it can be connect to self. +diff --git i/scripts/gcc-plugins/Kconfig w/scripts/gcc-plugins/Kconfig +index cb0c889e1..305f52f58 100644 +--- i/scripts/gcc-plugins/Kconfig ++++ w/scripts/gcc-plugins/Kconfig +@@ -59,6 +59,11 @@ config GCC_PLUGIN_LATENT_ENTROPY + is some slowdown of the boot process (about 0.5%) and fork and + irq processing. + ++ When extra_latent_entropy is passed on the kernel command line, ++ entropy will be extracted from up to the first 4GB of RAM while the ++ runtime memory allocator is being initialized. This costs even more ++ slowdown of the boot process. ++ + Note that entropy extracted this way is not cryptographically + secure! 
+ +diff --git i/scripts/mod/modpost.c w/scripts/mod/modpost.c +index 91a80036c..41692ca62 100644 +--- i/scripts/mod/modpost.c ++++ w/scripts/mod/modpost.c +@@ -35,6 +35,7 @@ static int vmlinux_section_warnings = 1; + static int warn_unresolved = 0; + /* How a symbol is exported */ + static int sec_mismatch_count = 0; ++static int writable_fptr_count = 0; + static int sec_mismatch_verbose = 1; + static int sec_mismatch_fatal = 0; + /* ignore missing files */ +@@ -954,6 +955,7 @@ enum mismatch { + ANY_EXIT_TO_ANY_INIT, + EXPORT_TO_INIT_EXIT, + EXTABLE_TO_NON_TEXT, ++ DATA_TO_TEXT + }; + + /** +@@ -1080,6 +1082,12 @@ static const struct sectioncheck sectioncheck[] = { + .good_tosec = {ALL_TEXT_SECTIONS , NULL}, + .mismatch = EXTABLE_TO_NON_TEXT, + .handler = extable_mismatch_handler, ++}, ++/* Do not reference code from writable data */ ++{ ++ .fromsec = { DATA_SECTIONS, NULL }, ++ .bad_tosec = { ALL_TEXT_SECTIONS, NULL }, ++ .mismatch = DATA_TO_TEXT + } + }; + +@@ -1267,10 +1275,10 @@ static Elf_Sym *find_elf_symbol(struct elf_info *elf, Elf64_Sword addr, + continue; + if (!is_valid_name(elf, sym)) + continue; +- if (sym->st_value == addr) +- return sym; + /* Find a symbol nearby - addr are maybe negative */ + d = sym->st_value - addr; ++ if (d == 0) ++ return sym; + if (d < 0) + d = addr - sym->st_value; + if (d < distance) { +@@ -1405,7 +1413,11 @@ static void report_sec_mismatch(const char *modname, + char *prl_from; + char *prl_to; + +- sec_mismatch_count++; ++ if (mismatch->mismatch == DATA_TO_TEXT) ++ writable_fptr_count++; ++ else ++ sec_mismatch_count++; ++ + if (!sec_mismatch_verbose) + return; + +@@ -1529,6 +1541,14 @@ static void report_sec_mismatch(const char *modname, + fatal("There's a special handler for this mismatch type, " + "we should never get here."); + break; ++ case DATA_TO_TEXT: ++#if 0 ++ fprintf(stderr, ++ "The %s %s:%s references\n" ++ "the %s %s:%s%s\n", ++ from, fromsec, fromsym, to, tosec, tosym, to_p); ++#endif ++ break; + } + fprintf(stderr, "\n"); + } +@@ -2540,6 +2560,14 @@ int main(int argc, char **argv) + } + } + free(buf.p); ++ if (writable_fptr_count) { ++ if (!sec_mismatch_verbose) { ++ warn("modpost: Found %d writable function pointer(s).\n" ++ "To see full details build your kernel with:\n" ++ "'make CONFIG_DEBUG_SECTION_MISMATCH=y'\n", ++ writable_fptr_count); ++ } ++ } + + return err; + } +diff --git i/security/Kconfig w/security/Kconfig +index e3cb7bc6d..0d798a1f4 100644 +--- i/security/Kconfig ++++ w/security/Kconfig +@@ -8,7 +8,7 @@ source security/keys/Kconfig + + config SECURITY_DMESG_RESTRICT + bool "Restrict unprivileged access to the kernel syslog" +- default n ++ default y + help + This enforces restrictions on unprivileged users reading the kernel + syslog via dmesg(8). +@@ -27,10 +27,34 @@ config SECURITY_PERF_EVENTS_RESTRICT + perf_event_open syscall will be permitted unless it is + changed. + ++config SECURITY_PERF_EVENTS_RESTRICT ++ bool "Restrict unprivileged use of performance events" ++ depends on PERF_EVENTS ++ default y ++ help ++ If you say Y here, the kernel.perf_event_paranoid sysctl ++ will be set to 3 by default, and no unprivileged use of the ++ perf_event_open syscall will be permitted unless it is ++ changed. ++ ++config SECURITY_TIOCSTI_RESTRICT ++ bool "Restrict unprivileged use of tiocsti command injection" ++ default y ++ help ++ This enforces restrictions on unprivileged users injecting commands ++ into other processes which share a tty session using the TIOCSTI ++ ioctl. 
This option makes TIOCSTI use require CAP_SYS_ADMIN. ++ ++ If this option is not selected, no restrictions will be enforced ++ unless the tiocsti_restrict sysctl is explicitly set to (1). ++ ++ If you are unsure how to answer this question, answer N. ++ + config SECURITY + bool "Enable different security models" + depends on SYSFS + depends on MULTIUSER ++ default y + help + This allows you to choose different security modules to be + configured into your kernel. +@@ -57,6 +81,7 @@ config SECURITYFS + config SECURITY_NETWORK + bool "Socket and Networking Security Hooks" + depends on SECURITY ++ default y + help + This enables the socket and networking security hooks. + If enabled, a security module can use these hooks to +@@ -163,6 +188,7 @@ config HARDENED_USERCOPY + bool "Harden memory copies between kernel and userspace" + depends on HAVE_HARDENED_USERCOPY_ALLOCATOR + imply STRICT_DEVMEM ++ default y + help + This option checks for obviously wrong memory regions when + copying memory to/from the kernel (via copy_to_user() and +@@ -175,7 +201,6 @@ config HARDENED_USERCOPY + config HARDENED_USERCOPY_FALLBACK + bool "Allow usercopy whitelist violations to fallback to object size" + depends on HARDENED_USERCOPY +- default y + help + This is a temporary option that allows missing usercopy whitelists + to be discovered via a WARN() to the kernel log, instead of +@@ -200,10 +225,36 @@ config HARDENED_USERCOPY_PAGESPAN + config FORTIFY_SOURCE + bool "Harden common str/mem functions against buffer overflows" + depends on ARCH_HAS_FORTIFY_SOURCE ++ default y + help + Detect overflows of buffers in common string and memory functions + where the compiler can determine and validate the buffer sizes. + ++config FORTIFY_SOURCE_STRICT_STRING ++ bool "Harden common functions against buffer overflows" ++ depends on FORTIFY_SOURCE ++ depends on EXPERT ++ help ++ Perform stricter overflow checks catching overflows within objects ++ for common C string functions rather than only between objects. ++ ++ This is not yet intended for production use, only bug finding. ++ ++config PAGE_SANITIZE ++ bool "Sanitize pages" ++ default y ++ help ++ Zero fill page allocations on free, reducing the lifetime of ++ sensitive data and helping to mitigate use-after-free bugs. ++ ++config PAGE_SANITIZE_VERIFY ++ bool "Verify sanitized pages" ++ depends on PAGE_SANITIZE ++ default y ++ help ++ Verify that newly allocated pages are zeroed to detect ++ write-after-free bugs. ++ + config STATIC_USERMODEHELPER + bool "Force all usermode helper calls through a single binary" + help +diff --git i/security/selinux/Kconfig w/security/selinux/Kconfig +index 8af7a690e..6539694b0 100644 +--- i/security/selinux/Kconfig ++++ w/security/selinux/Kconfig +@@ -2,7 +2,7 @@ config SECURITY_SELINUX + bool "NSA SELinux Support" + depends on SECURITY_NETWORK && AUDIT && NET && INET + select NETWORK_SECMARK +- default n ++ default y + help + This selects NSA Security-Enhanced Linux (SELinux). + You will also need a policy configuration and a labeled filesystem. +@@ -79,23 +79,3 @@ config SECURITY_SELINUX_AVC_STATS + This option collects access vector cache statistics to + /selinux/avc/cache_stats, which may be monitored via + tools such as avcstat. 
+- +-config SECURITY_SELINUX_CHECKREQPROT_VALUE +- int "NSA SELinux checkreqprot default value" +- depends on SECURITY_SELINUX +- range 0 1 +- default 0 +- help +- This option sets the default value for the 'checkreqprot' flag +- that determines whether SELinux checks the protection requested +- by the application or the protection that will be applied by the +- kernel (including any implied execute for read-implies-exec) for +- mmap and mprotect calls. If this option is set to 0 (zero), +- SELinux will default to checking the protection that will be applied +- by the kernel. If this option is set to 1 (one), SELinux will +- default to checking the protection requested by the application. +- The checkreqprot flag may be changed from the default via the +- 'checkreqprot=' boot parameter. It may also be changed at runtime +- via /selinux/checkreqprot if authorized by policy. +- +- If you are unsure how to answer this question, answer 0. +diff --git i/security/selinux/hooks.c w/security/selinux/hooks.c +index 250b725f5..ff9ae6034 100644 +--- i/security/selinux/hooks.c ++++ w/security/selinux/hooks.c +@@ -135,18 +135,7 @@ __setup("selinux=", selinux_enabled_setup); + int selinux_enabled = 1; + #endif + +-static unsigned int selinux_checkreqprot_boot = +- CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE; +- +-static int __init checkreqprot_setup(char *str) +-{ +- unsigned long checkreqprot; +- +- if (!kstrtoul(str, 0, &checkreqprot)) +- selinux_checkreqprot_boot = checkreqprot ? 1 : 0; +- return 1; +-} +-__setup("checkreqprot=", checkreqprot_setup); ++static const unsigned int selinux_checkreqprot_boot; + + static struct kmem_cache *sel_inode_cache; + static struct kmem_cache *file_security_cache; +diff --git i/security/selinux/selinuxfs.c w/security/selinux/selinuxfs.c +index 60b3f16bb..591a30b5e 100644 +--- i/security/selinux/selinuxfs.c ++++ w/security/selinux/selinuxfs.c +@@ -640,7 +640,6 @@ static ssize_t sel_read_checkreqprot(struct file *filp, char __user *buf, + static ssize_t sel_write_checkreqprot(struct file *file, const char __user *buf, + size_t count, loff_t *ppos) + { +- struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info; + char *page; + ssize_t length; + unsigned int new_value; +@@ -664,10 +663,9 @@ static ssize_t sel_write_checkreqprot(struct file *file, const char __user *buf, + return PTR_ERR(page); + + length = -EINVAL; +- if (sscanf(page, "%u", &new_value) != 1) ++ if (sscanf(page, "%u", &new_value) != 1 || new_value) + goto out; + +- fsi->state->checkreqprot = new_value ? 1 : 0; + length = count; + out: + kfree(page); +diff --git i/security/yama/Kconfig w/security/yama/Kconfig +index 96b274055..485c1b85c 100644 +--- i/security/yama/Kconfig ++++ w/security/yama/Kconfig +@@ -1,7 +1,7 @@ + config SECURITY_YAMA + bool "Yama support" + depends on SECURITY +- default n ++ default y + help + This selects Yama, which extends DAC support with additional + system-wide security settings beyond regular Linux discretionary From 6e62be44652910f31665fdc5b391b44f4a86c400 Mon Sep 17 00:00:00 2001 From: RageLtMan Date: Mon, 19 Apr 2021 11:20:30 -0400 Subject: [PATCH 2/2] Kernel Hardening: Linux Kernel Runtime Guard Import the Linux Kernel Runtime Guard (LKRG) from OpenWall by Adam Zabrocki and and Alex Peslyak. LKRG provides additional tiers of mitigation by actively hashing and validating kernel memory regions, further restricting access to common LPE and escape vectors, as well as mechanisms for modifying the running kernel commonly used to bypass LSMs. 
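As a rough smoke test (illustrative only; the module and sysctl names below follow
upstream LKRG conventions and are not established by the Kconfig/Makefile glue in
this patch), a modular build can be loaded and inspected along these lines:

  modprobe lkrg                            # assumed module name; upstream builds ship lkrg.ko (older releases: p_lkrg.ko)
  dmesg | grep -i lkrg | tail -n 5         # confirm LKRG initialized and built its integrity baseline
  sysctl -a 2>/dev/null | grep '^lkrg\.'   # upstream LKRG registers its tunables under the lkrg.* sysctl namespace
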
LKRG can be built directly into the kernel to provide enforcement from early-boot, but should be deployed as a module initially while tunables and operational stability are ironed out and validated on this platform. More information is available at the projects homepage: https://www.openwall.com/lkrg/ and in their source repo: https://github.com/openwall/lkrg --- patch/0000-Linux-Kernel-Runtime-Guard.patch | 25 +++++++++++++++++++++ 1 file changed, 25 insertions(+) create mode 100644 patch/0000-Linux-Kernel-Runtime-Guard.patch diff --git a/patch/0000-Linux-Kernel-Runtime-Guard.patch b/patch/0000-Linux-Kernel-Runtime-Guard.patch new file mode 100644 index 000000000..2ee6bbf70 --- /dev/null +++ b/patch/0000-Linux-Kernel-Runtime-Guard.patch @@ -0,0 +1,25 @@ +diff --git i/security/Kconfig w/security/Kconfig +index 0d798a1f4..f53076cd1 100644 +--- i/security/Kconfig ++++ w/security/Kconfig +@@ -321,6 +321,7 @@ source security/loadpin/Kconfig + source security/yama/Kconfig + + source security/integrity/Kconfig ++source security/lkrg/Kconfig + + choice + prompt "Default security module" +diff --git i/security/Makefile w/security/Makefile +index 507ac8c52..c2a7493be 100644 +--- i/security/Makefile ++++ w/security/Makefile +@@ -33,3 +33,8 @@ obj-$(CONFIG_INTEGRITY) += integrity/ + + # Allow the kernel to be locked down + obj-$(CONFIG_LOCK_DOWN_KERNEL) += lock_down.o ++ ++# LKRG file list ++subdir-$(CONFIG_SECURITY_LKRG) += lkrg ++obj-$(CONFIG_SECURITY_LKRG) += lkrg/ ++