Branch: 9.0.0-sultan
Commits on Nov 6, 2019
  1. wahoo_defconfig: Use Simple LMK

    kerneltoast committed May 12, 2019
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  2. simple_lmk: Mark reclaim kthread as performance critical

    kerneltoast committed Nov 5, 2019
    Simple LMK's reclaim thread needs to run as quickly as possible to
    reduce memory allocation latency when memory pressure is high. Mark it
    as performance critical to schedule it on faster CPUs.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  3. simple_lmk: Introduce Simple Low Memory Killer for Android

    kerneltoast committed Nov 6, 2019
    This is a complete low memory killer solution for Android that is small
    and simple. Processes are killed according to the priorities that
    Android gives them, so that the least important processes are always
    killed first. Processes are killed until memory deficits are satisfied,
    as observed from kswapd struggling to free up pages. Simple LMK stops
    killing processes when kswapd finally goes back to sleep.
    
    The only tunables are the desired amount of memory to be freed per
    reclaim event and desired frequency of reclaim events. Simple LMK tries
    to free at least the desired amount of memory per reclaim and waits
    until all of its victims' memory is freed before proceeding to kill more
    processes.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
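
    A rough sketch of the kill pass described above (illustrative only;
    names, locking, and structure are assumptions, not the actual
    simple_lmk code):

    #include <linux/mm.h>
    #include <linux/oom.h>
    #include <linux/sched.h>

    /* Walk victims from the least important oom_score_adj downward until
     * enough resident memory has been targeted. The real driver sorts its
     * victims, takes the proper task locks, and waits for their memory to
     * actually be freed before killing more. */
    static void example_kill_until_satisfied(unsigned long pages_needed)
    {
    	unsigned long pages_found = 0;
    	struct task_struct *tsk;
    	short adj;

    	for (adj = OOM_SCORE_ADJ_MAX;
    	     adj >= 0 && pages_found < pages_needed; adj--) {
    		rcu_read_lock();
    		for_each_process(tsk) {
    			if (!tsk->mm || tsk->signal->oom_score_adj != adj)
    				continue;

    			/* Count the victim's RSS toward the reclaim goal */
    			pages_found += get_mm_rss(tsk->mm);
    			send_sig(SIGKILL, tsk, 0);
    			if (pages_found >= pages_needed)
    				break;
    		}
    		rcu_read_unlock();
    	}
    }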
Commits on Nov 5, 2019
  1. ion: system_heap: Fix uninitialized sg-table usage

    kerneltoast committed Oct 19, 2019
    The table_sync sg-table is used uninitialized when nents_sync is zero.
    Fix it by only using it when it's allocated.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
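
    Roughly the shape of the fix, with assumed names (illustrative only):

    #include <linux/scatterlist.h>

    /* table_sync is only valid when nents_sync is nonzero, so guard every
     * use of it; otherwise its fields are uninitialized garbage. */
    static void example_cleanup(struct sg_table *table_sync, int nents_sync)
    {
    	if (nents_sync)
    		sg_free_table(table_sync);
    }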
  2. ion: Rewrite to improve clarity and performance

    kerneltoast committed Oct 20, 2019
    The ION driver suffers from massive code bloat caused by excessive
    debug features, as well as the poor lock usage those features bring
    with them. Multiple locks in ION exist solely to make the debug
    features thread-safe, which hurts ION's performance when doing its
    actual job.
    
    There are numerous code paths in ION that hold mutexes for no reason and
    hold them for longer than necessary. This results not only in unwanted
    lock contention, but also in long delays whenever taking a mutex causes
    the calling thread to be preempted for a while. All lock usage in ION
    follows this pattern, which causes poor performance across the board.
    Furthermore, a single big mutex is used almost everywhere, degrading
    performance through unnecessary lock overhead.
    
    Instead of having a big mutex lock, multiple fine-grained locks are now
    used, improving performance.
    
    Additionally, ion_dup_sg_table is called very frequently, and lies
    within the rendering path for the display. Speed it up by copying
    scatterlists in page-sized chunks rather than iterating one at a time.
    Note that sg_alloc_table zeroes out `table`, so there's no need to zero
    it out using the memory allocator.
    
    Overall, just rewrite ION entirely to fix its deficiencies. This
    optimizes ION for excellent performance and discards its debug cruft.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
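
    A minimal sketch of the locking change (struct and field names are
    assumptions, not the actual ION code): instead of one driver-wide mutex
    serializing unrelated operations, each object carries only the small
    lock it actually needs.

    #include <linux/list.h>
    #include <linux/mutex.h>
    #include <linux/scatterlist.h>
    #include <linux/spinlock.h>

    struct example_heap {
    	spinlock_t free_lock;		/* protects free_list, nothing else */
    	struct list_head free_list;
    };

    struct example_buffer {
    	struct mutex map_lock;		/* protects this buffer's mappings only */
    	struct sg_table *sgt;
    	struct list_head node;
    };

    static void example_buffer_free(struct example_heap *heap,
    				struct example_buffer *buf)
    {
    	/* Only the free list needs the heap-wide lock, and only for the
    	 * list manipulation itself; no big driver mutex is held. */
    	spin_lock(&heap->free_lock);
    	list_add(&buf->node, &heap->free_list);
    	spin_unlock(&heap->free_lock);
    }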
  3. iommu: msm: Rewrite to improve clarity and performance

    kerneltoast committed Oct 20, 2019
    The scope of this driver's lock usage is extremely wide, leading to
    excessively long lock hold times. Additionally, there is a lot of
    excessive linked-list traversal and unnecessary dynamic memory
    allocation in a critical path, causing poor performance across the
    board.
    
    Fix all of this by greatly reducing the scope of the locks used and by
    significantly reducing the amount of operations performed when
    msm_dma_map_sg_attrs() is called. The entire driver's code is overhauled
    for better cleanliness and performance.
    
    Note that ION must be modified to pass a known structure via the private
    dma_buf pointer, so that the iommu driver can prevent races when
    operating on the same buffer concurrently. This is the only way to
    eliminate said buffer races without hurting the iommu driver's
    performance.
    
    Some additional members are added to the device struct as well to make
    these various performance improvements possible.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
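
    Conceptually, the ION/IOMMU handshake described above could look like
    the following sketch (names are assumptions, not the actual driver
    code): ION stores a structure it controls in dma_buf->priv, and the
    IOMMU driver takes that buffer's own lock instead of a driver-wide one.

    #include <linux/dma-buf.h>
    #include <linux/list.h>
    #include <linux/mutex.h>

    /* Structure ION is assumed to place in dma_buf->priv */
    struct example_ion_buffer {
    	struct mutex map_lock;		/* serializes mappings of this buffer */
    	struct list_head iommu_maps;	/* existing per-device mappings */
    };

    static int example_msm_dma_map(struct dma_buf *dmabuf)
    {
    	struct example_ion_buffer *buf = dmabuf->priv;

    	/* Two threads mapping the same buffer contend only with each
    	 * other, not with every other mapping in the system. */
    	mutex_lock(&buf->map_lock);
    	/* ... look up an existing IOMMU mapping or create a new one ... */
    	mutex_unlock(&buf->map_lock);
    	return 0;
    }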
  4. Merge tag 'android-9.0.0_r0.112' into 9.0.0-sultan

    kerneltoast committed Nov 5, 2019
    Android 9.0.0 Release 0.112 (PQ3A.190801.002,taimen/walleye)
  5. tas2557: Don't return uninitialized value in tas2557_codec_resume

    kerneltoast committed Aug 1, 2019
    Store the return value of pTAS2557->resume() so tas2557_codec_resume
    doesn't return a garbage value.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  6. msm: mdss: Fix uninitialized vsync_time in mdss_mdp_cmd_pingpong_done

    kerneltoast committed Aug 1, 2019
    Initialize vsync_time the same way other mdss functions do, so it isn't
    passed uninitialized.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  7. qcacld-3.0: Remove unused parameter from process_rx_public_action_frame

    kerneltoast committed Aug 1, 2019
    The frm_type parameter is not only passed uninitialized, but it's also
    unused. Remove it.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  8. msm: sps: Fix uninitialized result usage when an invalid IRQ is found

    kerneltoast committed Aug 1, 2019
    When an invalid IRQ number is detected, result is used uninitialized.
    Immediately return in this case to avoid the uninitialized variable
    usage.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  9. input: synaptics_dsx_htc: Don't return an uninitialized retval

    kerneltoast committed Aug 1, 2019
    When SYN_I2C_RETRY_TIMES is reached in synaptics_rmi4_i2c_set_page,
    retval is left uninitialized. Since the success value for retval is
    PAGE_SELECT_LEN, which is 2, set the default return value to be 0 to
    indicate an error.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  10. fib_rules: Fix payload calculation

    kerneltoast committed Aug 1, 2019
    The errant semicolon here results in the final addend getting discarded.
    
    Fixes: ef5fbba ("net: core: add UID to flows, rules, and routes")
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
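
    For illustration, the bug class looks like this (not the exact
    fib_rules code): the first semicolon terminates the initializer, so the
    final addend becomes a standalone expression statement whose value is
    silently thrown away.

    #include <net/fib_rules.h>
    #include <net/netlink.h>

    static size_t example_payload_size(void)
    {
    	size_t payload = NLMSG_ALIGN(sizeof(struct fib_rule_hdr))
    			 + nla_total_size(4);	/* errant ';' ends the sum here */
    			 + nla_total_size(8);	/* evaluated, result discarded */

    	return payload;		/* smaller than intended */
    }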
  11. wahoo_defconfig: Panic in schedule() when stack corruption is found

    kerneltoast committed Aug 1, 2019
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  12. lib: Kconfig.debug: Remove debug dependency from SCHED_STACK_END_CHECK

    kerneltoast committed Aug 1, 2019
    This is a very useful feature that doesn't have any real dependencies on
    DEBUG_KERNEL. Let it be used in the absence of DEBUG_KERNEL.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  13. ANDROID: sdcardfs: Allocate temporary name buffer on the stack

    kerneltoast committed Aug 1, 2019
    Since this 4 KiB name buffer is only used temporarily, allocate it on
    the stack to improve performance. This is confirmed to be safe according
    to runtime stack usage measurements.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  14. kobject_uevent: Allocate environment buffer on the stack

    kerneltoast committed Aug 1, 2019
    The environment buffer isn't very big; when it's allocated on the stack,
    kobject_uevent_env's stack frame size increases to just over 2 KiB,
    which is safe considering that we have a 16 KiB stack.
    
    Allocate the environment buffer on the stack instead of using the slab
    allocator in order to improve performance.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  15. ion: system_heap: Speed up system heap allocations

    kerneltoast committed Aug 1, 2019
    The system heap allocation process consists of allocating numerous
    temporary buffers that can, up to a point, be placed on the stack
    instead: if the new 4 KiB on-stack buffer runs out, page_info
    allocations fall back to kmalloc.
    
    Additionally, since system heap allocations are frequent, they can
    benefit from the use of a memory pool for allocating the persistent
    sg_table structures. These allocations, along with a few others, also
    don't need to be zeroed out.
    
    These changes improve system heap allocation performance considerably.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
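
    A sketch of the two techniques described above (names and layouts are
    assumptions, not the actual system heap code): a 4 KiB on-stack array
    covers the common case with kmalloc only as a fallback, and persistent
    sg_table structs come from a dedicated slab pool.

    #include <linux/kernel.h>
    #include <linux/scatterlist.h>
    #include <linux/slab.h>

    static struct kmem_cache *example_sgt_pool;	/* pool for sg_table structs */

    static int __init example_heap_init(void)
    {
    	example_sgt_pool = KMEM_CACHE(sg_table, 0);
    	return example_sgt_pool ? 0 : -ENOMEM;
    }

    struct example_page_info {
    	struct page *page;
    	unsigned int order;
    };

    static int example_allocate(unsigned int nbufs)
    {
    	struct example_page_info stack_infos[4096 / sizeof(struct example_page_info)];
    	struct example_page_info *infos = stack_infos;

    	/* Fall back to kmalloc only when the request doesn't fit in the
    	 * 4 KiB on-stack array; no zeroing is needed either way. */
    	if (nbufs > ARRAY_SIZE(stack_infos)) {
    		infos = kmalloc_array(nbufs, sizeof(*infos), GFP_KERNEL);
    		if (!infos)
    			return -ENOMEM;
    	}

    	/* ... fill infos[] and build an sg_table allocated from
    	 * example_sgt_pool ... */

    	if (infos != stack_infos)
    		kfree(infos);
    	return 0;
    }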
  16. ANDROID: crypto: heh - Avoid dynamically allocating memory for keys

    kerneltoast committed Jul 31, 2019
    The derived keys are usually quite small (48 B). We can use a small
    on-stack buffer of 1 KiB to dodge dynamic memory allocation, speeding up
    heh_setkey.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  17. ext4: Avoid dynamically allocating memory in ext4_ext_remove_space

    kerneltoast committed Jul 31, 2019
    Although path depth is unbounded, we can still fulfill many path
    allocations here with a 4 KiB stack allocation, thereby improving
    performance.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  18. xattr: Avoid dynamically allocating memory in getxattr

    kerneltoast committed Jul 31, 2019
    Although the maximum xattr size is too big to fit on the stack (64 KiB),
    we can still fulfill most getxattr requests with a 4 KiB stack
    allocation, thereby improving performance. Such a large stack allocation
    here is confirmed to be safe via stack usage measurements at runtime.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  19. binfmt_elf: Don't allocate memory dynamically in load_elf_binary

    kerneltoast committed Jul 31, 2019
    The dynamic memory allocations in load_elf_binary can be replaced by
    large stack allocations that bring the total frame size for the function
    up to about 4 KiB. We have 16 KiB of stack space, and runtime
    measurements confirm that using this much stack memory here is safe.
    This improves performance by eliminating the overhead of dynamic memory
    allocation.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  20. staging: sync: Use an on-stack allocation for fence info ioctl

    kerneltoast committed Jul 31, 2019
    Since the fence info ioctl limits output data length to 4096 bytes, we
    can just use a 4096-byte on-stack buffer for it (which is confirmed to
    be safe via runtime stack usage measurements). This ioctl is used for
    every frame rendered to the display, so eliminating dynamic memory
    allocation overhead here improves frame rendering performance.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  21. media: v4l2-ioctl: Use larger on-stack video copy buffers

    kerneltoast committed Jul 31, 2019
    We have a 16 KiB stack; buffers of 4 KiB and 512 B work perfectly fine
    in place of the existing small 128-byte buffer and, in some cases, no
    on-stack buffer at all. This avoids dynamic memory allocation more
    often, improving performance, and it's safe according to stack usage
    measurements at runtime.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  22. msm: camera: Optimize memory allocation for small buffers

    kerneltoast committed Jul 31, 2019
    Try to use an on-stack buffer for memory allocations that are small
    enough to not warrant a dynamic allocation, and eliminate dynamic memory
    allocation entirely in msm_camera_cci_i2c_read_seq. This improves
    performance by skipping latency-prone dynamic memory allocation when it
    isn't needed.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  23. msm: kgsl: Don't allocate memory dynamically for temp command buffers

    kerneltoast committed Jul 31, 2019
    The temporary command buffer in _set_pagetable_gpu is only the size of a
    single page; it is therefore easy to replace the dynamic command buffer
    allocation with a static one to improve performance by avoiding the
    latency of dynamic memory allocation.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  24. msm: mdss: Don't allocate memory dynamically for small layer buffers

    kerneltoast committed Jul 31, 2019
    There's no reason to dynamically allocate memory for a single, small
    struct instance (output_layer) when it can be allocated on the stack.
    Additionally, layer_list and validate_info_list are limited by the
    maximum number of layers allowed, so they can be replaced by stack
    allocations.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  25. ext4 crypto: Use a larger on-stack file name buffer

    kerneltoast committed Jul 31, 2019
    32 bytes for the on-stack file name buffer is rather small and doesn't
    fit many file names, so dynamic allocation ends up being used more often
    than necessary. Increasing the on-stack buffer to 4 KiB is safe and lets
    this function avoid dynamic memory allocation far more often.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  26. selinux: Avoid dynamic memory allocation for small context buffers

    kerneltoast committed Jul 31, 2019
    Most context buffers are rather small and can fit on the stack,
    eliminating the need to allocate them dynamically. Reserve a 4 KiB
    stack buffer for this purpose to avoid the overhead of dynamic
    memory allocation.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  27. kernfs: Avoid dynamic memory allocation for small write buffers

    kerneltoast committed Jul 30, 2019
    Most write buffers are rather small and can fit on the stack,
    eliminating the need to allocate them dynamically. Reserve a 4 KiB
    stack buffer for this purpose to avoid the overhead of dynamic
    memory allocation.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  28. msm: kgsl: Avoid dynamically allocating small command buffers

    kerneltoast committed Jul 30, 2019
    Most command buffers here are rather small (fewer than 256 words); it's
    a waste of time to dynamically allocate memory for such a small buffer
    when it could easily fit on the stack.
    
    Conditionally using an on-stack command buffer when the size is small
    enough eliminates the need for using a dynamically-allocated buffer most
    of the time, reducing GPU command submission latency.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  29. wahoo_defconfig: Disable stack frame size warning

    kerneltoast committed Jul 30, 2019
    The stack frame size warning can be deceptive when it is clear that a
    function with a large frame size won't cause stack overflows given how
    it is used. Since this warning is more of a nuisance than a help here,
    disable it.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  30. ext4: Allocate allocation-context on the stack

    kerneltoast committed Jul 13, 2019
    The allocation context structure is quite small and easily fits on the
    stack. There's no need to allocate it dynamically.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  31. scatterlist: Don't allocate sg lists using __get_free_page

    kerneltoast committed Jul 12, 2019
    Allocating pages with __get_free_page is slower than going through the
    slab allocator to grab free pages out from a pool.
    
    These are the results from running the code at the bottom of this
    message:
    [    1.278602] speedtest: __get_free_page: 9 us
    [    1.278606] speedtest: kmalloc: 4 us
    [    1.278609] speedtest: kmem_cache_alloc: 4 us
    [    1.278611] speedtest: vmalloc: 13 us
    
    kmalloc and kmem_cache_alloc (which is what kmalloc uses for common
    sizes behind the scenes) are the fastest choices. Use kmalloc to speed
    up sg list allocation.
    
    This is the code used to produce the above measurements:
    #include <linux/kthread.h>
    #include <linux/slab.h>
    #include <linux/vmalloc.h>
    
    static int speedtest(void *data)
    {
    	static const struct sched_param sched_max_rt_prio = {
    		.sched_priority = MAX_RT_PRIO - 1
    	};
    	volatile s64 ctotal = 0, gtotal = 0, ktotal = 0, vtotal = 0;
    	struct kmem_cache *page_pool;
    	int i, j, trials = 1000;
    	volatile ktime_t start;
    	void *ptr[100];
    
    	sched_setscheduler_nocheck(current, SCHED_FIFO, &sched_max_rt_prio);
    
    	page_pool = kmem_cache_create("pages", PAGE_SIZE, PAGE_SIZE, SLAB_PANIC,
    				      NULL);
    	for (i = 0; i < trials; i++) {
    		start = ktime_get();
    		for (j = 0; j < ARRAY_SIZE(ptr); j++)
    			while (!(ptr[j] = kmem_cache_alloc(page_pool, GFP_KERNEL)));
    		ctotal += ktime_us_delta(ktime_get(), start);
    		for (j = 0; j < ARRAY_SIZE(ptr); j++)
    			kmem_cache_free(page_pool, ptr[j]);
    
    		start = ktime_get();
    		for (j = 0; j < ARRAY_SIZE(ptr); j++)
    			while (!(ptr[j] = (void *)__get_free_page(GFP_KERNEL)));
    		gtotal += ktime_us_delta(ktime_get(), start);
    		for (j = 0; j < ARRAY_SIZE(ptr); j++)
    			free_page((unsigned long)ptr[j]);
    
    		start = ktime_get();
    		for (j = 0; j < ARRAY_SIZE(ptr); j++)
    			while (!(ptr[j] = kmalloc(PAGE_SIZE, GFP_KERNEL)));
    		ktotal += ktime_us_delta(ktime_get(), start);
    		for (j = 0; j < ARRAY_SIZE(ptr); j++)
    			kfree(ptr[j]);
    
    		start = ktime_get();
    		*ptr = vmalloc(ARRAY_SIZE(ptr) * PAGE_SIZE);
    		vtotal += ktime_us_delta(ktime_get(), start);
    		vfree(*ptr);
    	}
    	kmem_cache_destroy(page_pool);
    
    	printk("%s: __get_free_page: %lld us\n", __func__, gtotal / trials);
    	printk("%s: kmalloc: %lld us\n", __func__, ktotal / trials);
    	printk("%s: kmem_cache_alloc: %lld us\n", __func__, ctotal / trials);
    	printk("%s: vmalloc: %lld us\n", __func__, vtotal / trials);
    	complete(data);
    	return 0;
    }
    
    static int __init start_test(void)
    {
    	DECLARE_COMPLETION_ONSTACK(done);
    
    	BUG_ON(IS_ERR(kthread_run(speedtest, &done, "malloc_test")));
    	wait_for_completion(&done);
    	return 0;
    }
    late_initcall(start_test);
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
  32. mm: kmemleak: Don't die when memory allocation fails

    kerneltoast committed Jul 7, 2019
    When memory is leaking, allocating more memory becomes harder, making it
    more likely for this failure condition inside kmemleak to manifest
    itself. This is extremely frustrating since kmemleak kills itself upon
    the first instance of memory allocation failure.
    
    Bypass that and make kmemleak more resilient when memory is running low.
    
    Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
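
    A loose sketch of the idea (not the exact kmemleak change): when the
    tracking metadata can't be allocated, warn and carry on instead of
    permanently disabling the whole subsystem.

    #include <linux/printk.h>
    #include <linux/slab.h>

    static void *example_track_object(struct kmem_cache *object_cache, gfp_t gfp)
    {
    	void *object = kmem_cache_alloc(object_cache, gfp);

    	/* Losing track of one object is better than killing kmemleak. */
    	if (!object)
    		pr_warn_once("low memory, skipping this object\n");
    	return object;
    }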