Branch: 9.0.0-sultan
Commits on Aug 6, 2019
  1. Merge tag 'android-9.0.0_r0.112' into 9.0.0-sultan

    kerneltoast committed Aug 6, 2019
    Android 9.0.0 Release 0.112 (PQ3A.190801.002, taimen/walleye)
Commits on Aug 5, 2019
  1. iommu: msm: Rewrite to improve clarity and performance

    kerneltoast committed Aug 4, 2019
    The scope of this driver's lock usage is extremely wide, leading to
    excessively long lock hold times. Additionally, an entire linked list
    is traversed for the sole purpose of trying to find a reason to invoke a
    BUG. These are the two most significant contributors to poor performance
    in this driver.
    
    Fix all of this by greatly reducing the scope of the locks used and by
    using atomic reader/writer locks. The superfluous linked list traversal
    is also gone, and the entire driver's code is refactored and improved
    for better cleanliness and performance.
    
    Note that ION must be modified to pass a known structure via the private
    dma_buf pointer, so that the iommu driver can prevent races when
    operating on the same buffer concurrently. This is the only way to
    eliminate said buffer races without hurting the iommu driver's
    performance.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  2. ion: Rewrite for improved clarity and performance

    kerneltoast committed Aug 5, 2019
    The ION driver suffers from massive code bloat caused by excessive
    debug features, as well as poor lock usage as a result of that. Multiple
    locks in ION exist to make the debug features thread-safe, which hurts
    ION's actual performance when doing its job.
    
    There are numerous code paths in ION that hold mutexes for no reason and
    hold them for longer than necessary. This results in not only unwanted
    lock contention, but also long delays when a mutex lock results in the
    calling thread getting preempted for a while. All lock usage in ION
    follows this pattern, which causes poor performance across the board.
    Furthermore, a single big lock is used almost everywhere rather than
    multiple fine-grained locks.
    
    Most of the mutex locks can be replaced with simple atomic operations.
    Where a mutex lock can't be eliminated completely, a spinlock or rwlock
    can be used instead for quick operations, thereby avoiding long delays
    due to preemption. Fine-grained locks are also now used in place of the
    single big lock that was used before.
    
    Additionally, ion_dupe_sg_table is called very frequently, and lies
    within the rendering path for the display. Speed it up by reserving
    caches for its sg_table and page-sized scatterlist allocations, as well
    as by improving the sg copy process. Note that sg_alloc_table zeroes
    out `table`, so there's no need to zero it out using the memory
    allocator.
    
    Overall, just rewrite ION entirely to fix its deficiencies. This
    optimizes ION for excellent performance and discards its rarely-used
    debug bloat.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  3. tas2557: Don't return uninitialized value in tas2557_codec_resume

    kerneltoast committed Aug 1, 2019
    Store the return value of pTAS2557->resume() so tas2557_codec_resume
    doesn't return a garbage value.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  4. msm: mdss: Fix uninitialized vsync_time in mdss_mdp_cmd_pingpong_done

    kerneltoast committed Aug 1, 2019
    Initialize vsync_time the same way other mdss functions do, so it isn't
    passed uninitialized.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  5. qcacld-3.0: Remove unused parameter from process_rx_public_action_frame

    kerneltoast committed Aug 1, 2019
    The frm_type parameter is not only passed uninitialized, but it's also
    unused. Remove it.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  6. msm: sps: Fix uninitialized result usage when an invalid IRQ is found

    kerneltoast committed Aug 1, 2019
    When an invalid IRQ number is detected, result is used uninitialized.
    Immediately return in this case to avoid the uninitialized variable
    usage.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  7. input: synaptics_dsx_htc: Don't return an uninitialized retval

    kerneltoast committed Aug 1, 2019
    When SYN_I2C_RETRY_TIMES is reached in synaptics_rmi4_i2c_set_page,
    retval is left uninitialized. Since the success value for retval is
    PAGE_SELECT_LEN, which is 2, set the default return value to 0 to
    indicate an error.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  8. fib_rules: Fix payload calculation

    kerneltoast committed Aug 1, 2019
    The errant semicolon here results in the final addend getting discarded.
    
    Fixes: ef5fbba ("net: core: add UID to flows, rules, and routes")
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  9. wahoo_defconfig: Panic in schedule() when stack corruption is found

    kerneltoast committed Aug 1, 2019
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  10. lib: Kconfig.debug: Remove debug dependency from SCHED_STACK_END_CHECK

    kerneltoast committed Aug 1, 2019
    This is a very useful feature that doesn't have any real dependencies on
    DEBUG_KERNEL. Let it be used in the absence of DEBUG_KERNEL.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  11. ANDROID: sdcardfs: Allocate temporary name buffer on the stack

    kerneltoast committed Aug 1, 2019
    Since this 4 KiB name buffer is only used temporarily, allocate it on
    the stack to improve performance. This is confirmed to be safe according
    to runtime stack usage measurements.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  12. kobject_uevent: Allocate environment buffer on the stack

    kerneltoast committed Aug 1, 2019
    The environment buffer isn't very big; when it's allocated on the stack,
    kobject_uevent_env's stack frame size increases to just over 2 KiB,
    which is safe considering that we have a 16 KiB stack.
    
    Allocate the environment buffer on the stack instead of using the slab
    allocator in order to improve performance.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  13. ion: system_heap: Speed up system heap allocations

    kerneltoast committed Aug 1, 2019
    The system heap allocation process consists of allocating numerous
    temporary buffers that can instead be placed on the stack, to an extent;
    if the new 4 KiB on-stack buffer runs out, page_info allocations will
    fall back to using kmalloc.
    
    Additionally, since system heap allocations are frequent, they can
    benefit from the use of a memory pool for allocating the persistent
    sg_table structures. These allocations, along with a few others, also
    don't need to be zeroed out.
    
    These changes improve system heap allocation performance considerably.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  14. ANDROID: crypto: heh - Avoid dynamically allocating memory for keys

    kerneltoast committed Jul 31, 2019
    The derived keys are usually quite small (48 B). We can use a small
    on-stack buffer of 1 KiB to dodge dynamic memory allocation, speeding up
    heh_setkey.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  15. ext4: Avoid dynamically allocating memory in ext4_ext_remove_space

    kerneltoast committed Jul 31, 2019
    Although path depth is unbounded, we can still fulfill many path
    allocations here with a 4 KiB stack allocation, thereby improving
    performance.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  16. xattr: Avoid dynamically allocating memory in getxattr

    kerneltoast committed Jul 31, 2019
    Although the maximum xattr size is too big to fit on the stack (64 KiB),
    we can still fulfill most getxattr requests with a 4 KiB stack
    allocation, thereby improving performance. Such a large stack allocation
    here is confirmed to be safe via stack usage measurements at runtime.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  17. binfmt_elf: Don't allocate memory dynamically in load_elf_binary

    kerneltoast committed Jul 31, 2019
    The dynamic memory allocations in load_elf_binary can be replaced by
    large stack allocations that bring the total frame size for the function
    up to about 4 KiB. We have 16 KiB of stack space, and runtime
    measurements confirm that using this much stack memory here is safe.
    This improves performance by eliminating the overhead of dynamic memory
    allocation.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  18. staging: sync: Use an on-stack allocation for fence info ioctl

    kerneltoast committed Jul 31, 2019
    Since the fence info ioctl limits output data length to 4096 bytes, we
    can just use a 4096-byte on-stack buffer for it (which is confirmed to
    be safe via runtime stack usage measurements). This ioctl is used for
    every frame rendered to the display, so eliminating dynamic memory
    allocation overhead here improves frame rendering performance.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  19. media: v4l2-ioctl: Use larger on-stack video copy buffers

    kerneltoast committed Jul 31, 2019
    We have a 16 KiB stack; buffers of 4 KiB and 512 B work perfectly fine
    in place of a small 128-byte buffer, or no on-stack buffer at all.
    This avoids dynamic memory allocation more often, improving
    performance, and it's safe according to stack usage measurements at
    runtime.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  20. msm: camera: Optimize memory allocation for small buffers

    kerneltoast committed Jul 31, 2019
    Try to use an on-stack buffer for memory allocations that are small
    enough to not warrant a dynamic allocation, and eliminate dynamic memory
    allocation entirely in msm_camera_cci_i2c_read_seq. This improves
    performance by skipping latency-prone dynamic memory allocation when it
    isn't needed.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  21. msm: kgsl: Don't allocate memory dynamically for temp command buffers

    kerneltoast committed Jul 31, 2019
    The temporary command buffer in _set_pagetable_gpu is only the size of a
    single page; it is therefore easy to replace the dynamic command buffer
    allocation with a static one to improve performance by avoiding the
    latency of dynamic memory allocation.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  22. msm: mdss: Don't allocate memory dynamically for small layer buffers

    kerneltoast committed Jul 31, 2019
    There's no reason to dynamically allocate memory for a single, small
    struct instance (output_layer) when it can be allocated on the stack.
    Additionally, layer_list and validate_info_list are limited by the
    maximum number of layers allowed, so they can be replaced by stack
    allocations.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  23. ext4 crypto: Use a larger on-stack file name buffer

    kerneltoast committed Jul 31, 2019
    32 bytes for the on-stack file name buffer is rather small and doesn't
    fit many file names, causing dynamic allocation to be used more often
    than necessary instead. Increasing the on-stack buffer to 4 KiB is safe
    and helps this function avoid dynamic memory allocations far more
    frequently.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  24. selinux: Avoid dynamic memory allocation for small context buffers

    kerneltoast committed Jul 31, 2019
    Most context buffers are rather small and can fit on the stack,
    eliminating the need to allocate them dynamically. Reserve a 4 KiB
    stack buffer for this purpose to avoid the overhead of dynamic
    memory allocation.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  25. kernfs: Avoid dynamic memory allocation for small write buffers

    kerneltoast committed Jul 30, 2019
    Most write buffers are rather small and can fit on the stack,
    eliminating the need to allocate them dynamically. Reserve a 4 KiB
    stack buffer for this purpose to avoid the overhead of dynamic
    memory allocation.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  26. msm: kgsl: Avoid dynamically allocating small command buffers

    kerneltoast committed Jul 30, 2019
    Most command buffers here are rather small (fewer than 256 words); it's
    a waste of time to dynamically allocate memory for such a small buffer
    when it could easily fit on the stack.
    
    Conditionally using an on-stack command buffer when the size is small
    enough eliminates the need for using a dynamically-allocated buffer most
    of the time, reducing GPU command submission latency.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  27. wahoo_defconfig: Disable stack frame size warning

    kerneltoast committed Jul 30, 2019
    The stack frame size warning can be deceptive when it is clear that a
    function with a large frame size won't cause stack overflows given how
    it is used. Since this warning is more of a nuisance than a help,
    disable it.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  28. ext4: Allocate allocation-context on the stack

    kerneltoast committed Jul 13, 2019
    The allocation context structure is quite small and easily fits on the
    stack. There's no need to allocate it dynamically.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  29. scatterlist: Don't allocate sg lists using __get_free_page

    kerneltoast committed Jul 12, 2019
    Allocating pages with __get_free_page is slower than going through the
    slab allocator to grab free pages out from a pool.
    
    These are the results from running the code at the bottom of this
    message:
    [    1.278602] speedtest: __get_free_page: 9 us
    [    1.278606] speedtest: kmalloc: 4 us
    [    1.278609] speedtest: kmem_cache_alloc: 4 us
    [    1.278611] speedtest: vmalloc: 13 us
    
    kmalloc and kmem_cache_alloc (which is what kmalloc uses for common
    sizes behind the scenes) are the fastest choices. Use kmalloc to speed
    up sg list allocation.
    
    This is the code used to produce the above measurements:
    #include <linux/init.h>
    #include <linux/kthread.h>
    #include <linux/ktime.h>
    #include <linux/slab.h>
    #include <linux/vmalloc.h>
    
    static int speedtest(void *data)
    {
    	static const struct sched_param sched_max_rt_prio = {
    		.sched_priority = MAX_RT_PRIO - 1
    	};
    	volatile s64 ctotal = 0, gtotal = 0, ktotal = 0, vtotal = 0;
    	struct kmem_cache *page_pool;
    	int i, j, trials = 1000;
    	volatile ktime_t start;
    	void *ptr[100];
    
    	sched_setscheduler_nocheck(current, SCHED_FIFO, &sched_max_rt_prio);
    
    	page_pool = kmem_cache_create("pages", PAGE_SIZE, PAGE_SIZE, SLAB_PANIC,
    				      NULL);
    	for (i = 0; i < trials; i++) {
    		start = ktime_get();
    		for (j = 0; j < ARRAY_SIZE(ptr); j++)
    			while (!(ptr[j] = kmem_cache_alloc(page_pool, GFP_KERNEL)));
    		ctotal += ktime_us_delta(ktime_get(), start);
    		for (j = 0; j < ARRAY_SIZE(ptr); j++)
    			kmem_cache_free(page_pool, ptr[j]);
    
    		start = ktime_get();
    		for (j = 0; j < ARRAY_SIZE(ptr); j++)
    			while (!(ptr[j] = (void *)__get_free_page(GFP_KERNEL)));
    		gtotal += ktime_us_delta(ktime_get(), start);
    		for (j = 0; j < ARRAY_SIZE(ptr); j++)
    			free_page((unsigned long)ptr[j]);
    
    		start = ktime_get();
    		for (j = 0; j < ARRAY_SIZE(ptr); j++)
    			while (!(ptr[j] = kmalloc(PAGE_SIZE, GFP_KERNEL)));
    		ktotal += ktime_us_delta(ktime_get(), start);
    		for (j = 0; j < ARRAY_SIZE(ptr); j++)
    			kfree(ptr[j]);
    
    		start = ktime_get();
    		*ptr = vmalloc(ARRAY_SIZE(ptr) * PAGE_SIZE);
    		vtotal += ktime_us_delta(ktime_get(), start);
    		vfree(*ptr);
    	}
    	kmem_cache_destroy(page_pool);
    
    	printk("%s: __get_free_page: %lld us\n", __func__, gtotal / trials);
    	printk("%s: kmalloc: %lld us\n", __func__, ktotal / trials);
    	printk("%s: kmem_cache_alloc: %lld us\n", __func__, ctotal / trials);
    	printk("%s: vmalloc: %lld us\n", __func__, vtotal / trials);
    	complete(data);
    	return 0;
    }
    
    static int __init start_test(void)
    {
    	DECLARE_COMPLETION_ONSTACK(done);
    
    	BUG_ON(IS_ERR(kthread_run(speedtest, &done, "malloc_test")));
    	wait_for_completion(&done);
    	return 0;
    }
    late_initcall(start_test);
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  30. mm: kmemleak: Don't die when memory allocation fails

    kerneltoast committed Jul 7, 2019
    When memory is leaking, it becomes harder to allocate more memory,
    making this failure condition inside kmemleak more likely to manifest.
    This is extremely frustrating, since kmemleak kills itself upon the
    first memory allocation failure.
    
    Bypass that and make kmemleak more resilient when memory is running low.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  31. mm: kmemleak: Don't require global debug options

    kerneltoast committed Jul 7, 2019
    This allows kmemleak to function even when debugfs is globally
    disabled, so it can still give accurate results with
    CONFIG_DEBUG_FS=n.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  32. mbcache2: Speed up cache entry creation

    kerneltoast committed Jul 3, 2019
    To avoid racing against itself and creating redundant entries,
    mb2_cache_entry_create scans through a large hash list of all current
    entries to see whether another allocation for the requested new entry
    has already been made. Furthermore, it allocates memory for a new
    entry before scanning through this hash list, so the allocated memory
    is discarded whenever the requested entry turns out to already be
    present. This happens more than half the time.
    
    Speed up cache entry creation by keeping a small linked list of
    requested new entries in progress, and scanning through that first
    instead of the large hash-list. Additionally, don't bother allocating
    memory for a new entry until it's known that the allocated memory will
    be used.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  33. msm: mdss: Don't allocate memory dynamically for destination scaler

    kerneltoast committed Jul 31, 2019
    Every atomic frame commit allocates memory dynamically for the
    destination scaler, when those allocations can just be stored on the
    stack instead. Eliminate these dynamic memory allocations in the frame
    commit path to improve performance.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  34. wahoo_defconfig: Disable SMACK and Integrity security suites

    kerneltoast committed Jul 1, 2019
    We don't use these.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>