Commits on Mar 3, 2010
  1. Add CM kernel config used on Donut.

    Steve Kondik committed
  2. Update Compcache notify support patch.

    Steve Kondik committed
  3. mmc: msm_sdcc: Fix the bug in platform resume method

    Sahitya Tummala committed with Steve Kondik
    If the config MMC_UNSAFE_RESUME is not defined, the suspend/resume is
    basically a removal/insertion event.  The current code resumes the host
    only if there is a card and is not SDIO type.  Resume the host even if
    no card is present or if card is not SDIO type.
    Signed-off-by: Sahitya Tummala <>
Commits on Jan 31, 2010
  1. [ARM] 5383/2: unwind: Add core support for ARM stack unwinding

    Catalin Marinas committed with Steve Kondik
    This patch adds the main functionality for parsing the stack unwinding
    information generated by the ARM EABI toolchains. The unwinding
    information consists of an index with a pair of words per function and a
    table with unwinding instructions. For more information, see "Exception
    Handling ABI for the ARM Architecture" at:
    Signed-off-by: Catalin Marinas <>
    Signed-off-by: Russell King <>
  2. ARM: 5746/1: Handle possible translation errors in ARMv6/v7 coherent_…

    Catalin Marinas committed with Steve Kondik
    This is needed because applications using the sys_cacheflush system call
    can pass a memory range which isn't mapped yet even though the
    corresponding vma is valid. The patch also adds unwinding annotations
    for correct backtraces from the coherent_user_range() functions.
    Signed-off-by: Catalin Marinas <>
    Signed-off-by: Russell King <>
  3. PM: wakelocks: Use seq_file for /proc/wakelocks so we can get more th…

    Arve Hjønnevåg committed with Steve Kondik
    …an 3K of stats.
    Change-Id: I42ed8bea639684f7a8a95b2057516764075c6b01
    Signed-off-by: Arve Hjønnevåg <>
  4. [ARM] msm: adjust i2c driving strength to 8mA

    Farmer Tseng committed with Steve Kondik
    According to HW and TI's suggestion, set i2c driving strength to 8mA.
    HW has confirmed that 8mA also suits Dream and Sapphire.
    Change-Id: Ie3e0b81c4b4430d0f9799111dcc85b699b7824e6
    Signed-off-by: Farmer Tseng <>
  5. msm_rmnet: ensure packet writes are atomic

    Brian Swetland committed with Steve Kondik
    Use the smd_write_atomic() function to prevent concurrent
    packet writes to the transport from stepping on each other.
    Signed-off-by: Brian Swetland <>
  6. msm: smd: provide atomic channel writes

    Brian Swetland committed with Steve Kondik
    Some smd clients may write from multiple threads, in which case it's
    not safe to call smd_write without holding a lock.  smd_write_atomic()
    provides the same functionality as smd_write() but obtains the smd
    lock first.
    Signed-off-by: Brian Swetland <>
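    As a rough user-space illustration of the pattern described in the
    message above (not the driver code itself), the sketch below wraps an
    unlocked write in a variant that takes a lock first. The names
    struct channel, chan_write() and chan_write_atomic() are hypothetical,
    and a pthread mutex stands in for the smd lock.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical channel: a fixed buffer plus the lock that guards it. */
struct channel {
	pthread_mutex_t lock;
	char buf[64];
	size_t used;
};

/* Unlocked write: only safe if the caller already serializes writers.
 * Returns the number of bytes written, or -1 if the data does not fit. */
static int chan_write(struct channel *ch, const void *data, size_t len)
{
	if (len > sizeof(ch->buf) - ch->used)
		return -1;
	memcpy(ch->buf + ch->used, data, len);
	ch->used += len;
	return (int)len;
}

/* Same behaviour as chan_write(), but takes the channel lock first so
 * that concurrent writers cannot interleave their packets. */
static int chan_write_atomic(struct channel *ch, const void *data, size_t len)
{
	int ret;

	pthread_mutex_lock(&ch->lock);
	ret = chan_write(ch, data, len);
	pthread_mutex_unlock(&ch->lock);
	return ret;
}
```

    A client that writes from several threads would call only the
    locked variant; the unlocked one stays available for callers that
    already hold the lock.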
  7. Revert "BFS: Squashed set of prereqs from 2.6.31"

    Steve Kondik committed
    This reverts commit 676d825.
Commits on Jan 12, 2010
  1. mmc: sdio: Add high speed support to sdio_reset_comm()

    Daniel Chen committed with Steve Kondik
    Signed-off-by: San Mehat <>
  2. mmc: msm_sdcc: Fix issue where clocks could be disabled mid transaction

    San Mehat committed with Steve Kondik
        msmsdcc_enable_clocks() was incorrectly being called depending on
    the state of host->clks_on. This means the busclk idle timer was never
    being deleted if the clock was already on. Bogus.
        Also fixes a possible double clk disable if the call to
    del_timer_sync() in msmsdcc_disable_clocks() raced with
    the busclk timer.
    Signed-off-by: San Mehat <>
  3. mmc: msm_sdcc: Fix the dma exec function to use the proper delays

    San Mehat committed with Steve Kondik
    Signed-off-by: San Mehat <>
  4. mmc: msm_sdcc: Don't set host->curr.mrq until after we're sure the bu…

    Dmitry Shmidt committed with Steve Kondik
    …sclk timer won't fire
    Signed-off-by: San Mehat <>
  5. mmc: msm_sdcc: Enable busclk idle timer for power savings

    San Mehat committed with Steve Kondik
    Signed-off-by: San Mehat <>
  6. mmc: msm_sdcc: Don't disable interrupts while suspending

    San Mehat committed with Steve Kondik
    Signed-off-by: San Mehat <>
  7. mmc: msm_sdcc: Add DataMover channel 8 status to debugfs

    San Mehat committed with Steve Kondik
    Signed-off-by: San Mehat <>
  8. mmc: msm_sdcc: Fix issue where we might not end a successful request

    San Mehat committed with Steve Kondik
    Signed-off-by: San Mehat <>
  9. mmc: msm_sdcc: Featurize busclock power save and disable it by default

    San Mehat committed with Steve Kondik
    Signed-off-by: San Mehat <>
  10. mmc: msm_sdcc: Fix bug where busclk expiry timer was not properly dis…

    San Mehat committed with Steve Kondik
    Signed-off-by: San Mehat <>
  11. mmc: msm_sdcc: Reduce command timeouts and improve reliability.

    San Mehat committed with Steve Kondik
    Based on an original patch by Brent DeGraaf:
    "Previous versions of the SD driver were beset with excessive command
    timeouts. These timeouts were silent by default, but happened
    frequently, especially during heavy system activity and concurrent
    access of two or more SD devices. Worst case, these timeouts would
    occasionally hit at the end of a successful write, resulting in false
    failures that could adversely affect journaling file systems if timing
    was unfortunate. This update tightens the association and timing between
    dma transfers and the commands that trigger them by utilizing a new api
    implemented in the datamover.  In addition, it also fixes a dma cache
    coherency issue that was exposed during testing of this fix that
    occasionally resulted in card corruption.  Processing of results in the
    interrupt status routine was modified to process command results prior to
    data because overwritten command results were observed during testing
    since the data section can result in command issuances of its own.
    This change also eliminates the software command timeout, relying entirely
    on the hardware version, since the software timeout was found to cause
    problems of its own after extensive testing (having hardware timer and
    software timers addressing the same issue was found to cause a race
    condition under heavy system load)."
    This change originally added PROG_DONE handling, which has been split out
    into a separate patch. Also on our platform, the data mover driver maintains
    coherency to ensure API reliability, so the above mentioned cache corruption
    issue was not an issue for us.
    Signed-off-by: San Mehat <>
    Cc: Brian Swetland <>
    Signed-off-by: San Mehat <>
  12. Revert "mmc: msm_sdcc: Create a data wdt and gather better informatio…

    San Mehat committed with Steve Kondik
    …n when either fire."
    This reverts commit d6a46c1.
  13. [ARM] msm: Add 'execute' datamover callback

    San Mehat committed with Steve Kondik
    Based on a patch from Brent DeGraaf:
    "The datamover supports channels which can be shared amongst devices.
    As a result, the actual data transfer may occur some time after the
    request is queued up. Some devices such as mmc host controllers
    will timeout if a command is issued too far in advance of the actual
    transfer, so if dma to other devices on the same channel is already
    in progress or queued up, the added delay can cause pending transfers
    to fail before they start. This change extends the api to allow a
    user callback to be invoked just before the actual transfer takes
    place, thus allowing actions directly associated with the dma
    transfer, such as device commands, to be invoked with precise timing.
    Without this mechanism, there is no way for a driver to realize
    this timing. Also adds a user pointer to the command structure for use
    by the caller to reference information that may be needed by the
    callback routine for proper identification and processing associated
    with that specific request. This change is necessary to fix problems
    associated with excessive command timeouts and race conditions in the
    mmc driver."
    This patch also fixes all the callers of msm_dmov_enqueue_cmd() to
    ensure their callback function is NULL.
    Signed-off-by: San Mehat <>
    Cc: Brent DeGraaf <>
    Cc: Brian Swetland <>
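    The idea in the message above, a callback that runs just before a
    queued transfer starts plus a user pointer for its context, can be
    sketched in plain C as follows. This is only an illustration: the
    names struct dma_cmd, dma_enqueue(), dma_start_next() and
    count_exec() are hypothetical and are not the msm_dmov API.

```c
#include <stddef.h>

/* Hypothetical DMA command; field names are illustrative only. */
struct dma_cmd {
	void (*exec)(struct dma_cmd *cmd); /* run just before the transfer; may be NULL */
	void *user;                        /* caller context for the callback */
	struct dma_cmd *next;
};

struct dma_queue {
	struct dma_cmd *head, *tail;
};

/* Queue a command; the transfer itself may start much later. */
static void dma_enqueue(struct dma_queue *q, struct dma_cmd *cmd)
{
	cmd->next = NULL;
	if (q->tail)
		q->tail->next = cmd;
	else
		q->head = cmd;
	q->tail = cmd;
}

/* Dequeue the next command and run its callback first, so that any
 * device command is issued right before the data would move. */
static struct dma_cmd *dma_start_next(struct dma_queue *q)
{
	struct dma_cmd *cmd = q->head;

	if (!cmd)
		return NULL;
	q->head = cmd->next;
	if (!q->head)
		q->tail = NULL;
	if (cmd->exec)
		cmd->exec(cmd);
	/* A real driver would program the hardware transfer here. */
	return cmd;
}

/* Example callback: counts invocations through the user pointer. */
static void count_exec(struct dma_cmd *cmd)
{
	(*(int *)cmd->user)++;
}
```

    Callers that do not need the hook leave exec set to NULL, which
    matches the note above about existing callers passing a NULL callback.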
  14. Revert "mmc: msm_sdcc: Lock AXI rate at 128mhz when we're using the bus"

    San Mehat committed with Steve Kondik
    This reverts commit 13bc34a.
  15. mmc: msm_sdcc: Lock AXI rate at 128mhz when we're using the bus

    San Mehat committed with Steve Kondik
    Signed-off-by: San Mehat <>
  16. mmc: msm_sdcc: Create a data wdt and gather better information when e…

    San Mehat committed with Steve Kondik
    …ither fire.
    Signed-off-by: San Mehat <>
  17. mmc: msm_sdcc: Schedule clock disable after probe

    San Mehat committed with Steve Kondik
    Signed-off-by: San Mehat <>
  18. mmc: msm_sdcc: Wrap readl/writel calls with appropriate clk delays

    San Mehat committed with Steve Kondik
        As it turns out, all sdcc register writes must be delayed by at
    least 3 core clock cycles for the writes to take effect. *sigh*
        Also removes the 30us constant delay on clock enable in favor
    of a 3 core clock delay.
    Signed-off-by: San Mehat <>
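    To give a feel for what "3 core clock cycles" means in time, the
    helper below converts a cycle count into microseconds for a given
    clock rate, rounding up so the wait is never too short. It is a
    sketch only: cycles_to_us() is a hypothetical name, and the clock
    rates in the usage note are example values, not taken from the driver.

```c
/* Microseconds needed to cover `cycles` ticks of a clock running at
 * `rate_hz`, rounded up. Assumes rate_hz is non-zero. */
static unsigned int cycles_to_us(unsigned int cycles, unsigned long rate_hz)
{
	unsigned long long scaled = 1000000ULL * cycles;

	return (unsigned int)((scaled + rate_hz - 1) / rate_hz);
}
```

    With a 400 kHz clock, cycles_to_us(3, 400000) gives 8 (7.5us rounded
    up); with a 25 MHz clock it gives 1. Either is well below a fixed
    30us delay.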
  19. mmc: msm_sdcc: Driver clocking/irq improvements

    San Mehat committed with Steve Kondik
    - Clocks are now disabled after 1 second of inactivity
    - Fixed issue which was causing us to loop through our ISR twice
    - Bump core clock enable delay to 30us
    Signed-off-by: San Mehat <>
  20. mmc: msm_sdcc: Clean up clock management and add a 10us delay after e…

    San Mehat committed with Steve Kondik
    …nabling clocks
    It appears that in some cases there may be a delay on the ARM9 in enabling our clock.
    As a result, we may put the controller into a bad state. Delay 10us after enabling
    clocks to let the peripheral settle. Note: this is all empirical.
    Also ensure set_ios() callback grabs the host lock.
    Signed-off-by: San Mehat <>
  21. mmc: msm_sdcc: Snoop SDIO_CCCR_ABORT register

    San Mehat committed with Steve Kondik
    Signed-off-by: San Mehat <>
  22. mmc: core: Hack for allowing access to SDIO_CCCR_ABORT register

    San Mehat committed with Steve Kondik
    Signed-off-by: San Mehat <>
Commits on Jan 4, 2010
  1. Revert "Revert "Revert "Revert to old mt hack"""

    Steve Kondik committed
    This reverts commit e658853.
  2. BFS: Squashed set of prereqs from 2.6.31

    utrace committed with Steve Kondik
    Includes the following squashed commits:
    pids: document task_pgrp/task_session is not safe without tasklist/rcu
    Even if task == current, it is not safe to dereference the result of
    task_pgrp/task_session.  We can race with another thread which changes the
    special pid via setpgid/setsid.
    Document this.  The next 2 patches give an example of the unsafe usage, we
    have more bad users.
    [ coding-style fixes]
    Signed-off-by: Oleg Nesterov <>
    Cc: Louis Rilling <>
    Cc: "Eric W. Biederman" <>
    Cc: Pavel Emelyanov <>
    Cc: Sukadev Bhattiprolu <>
    Cc: Roland McGrath <>
    Signed-off-by: Andrew Morton <>
    Signed-off-by: Linus Torvalds <>
    pids: refactor vnr/nr_ns helpers to make them safe
    Imho, the safety rules for vnr/nr_ns helpers are horrible and buggy.
    task_pid_nr_ns(task) needs rcu/tasklist depending on task == current.
    As for "special" pids, vnr/nr_ns helpers always need rcu.  However, if
    task != current, they are unsafe even under rcu lock, we can't trust
    task->group_leader without the special checks.
    And almost every helper has a callsite which needs a fix.
    Also, it is a bit annoying that the implementations of, say,
    task_pgrp_vnr() and task_pgrp_nr_ns() are not "symmetrical".
    This patch introduces the new helper, __task_pid_nr_ns(), which is always
    safe to use, and turns all other helpers into the trivial wrappers.
    After this I'll send another patch which converts task_tgid_xxx() as well,
    they are a bit special.
    Signed-off-by: Oleg Nesterov <>
    Cc: Louis Rilling <>
    Cc: "Eric W. Biederman" <>
    Cc: Pavel Emelyanov <>
    Cc: Sukadev Bhattiprolu <>
    Cc: Roland McGrath <>
    Signed-off-by: Andrew Morton <>
    Signed-off-by: Linus Torvalds <>
    pids: kill signal_struct-> __pgrp/__session and friends
    We are wasting 2 words in signal_struct without any reason to implement
    task_pgrp_nr() and task_session_nr().
    task_session_nr() has no callers since
    2e2ba22, we can remove it.
    task_pgrp_nr() is still (I believe wrongly) used in fs/autofsX and
    This patch reimplements task_pgrp_nr() via task_pgrp_nr_ns(), and kills
    __pgrp/__session and the related helpers.
    The change in drivers/char/tty_io.c is cosmetic, but hopefully makes sense
    Signed-off-by: Oleg Nesterov <>
    Acked-by: Alan Cox <>		[tty parts]
    Cc: Cedric Le Goater <>
    Cc: Dave Hansen <>
    Cc: Eric Biederman <>
    Cc: Pavel Emelyanov <>
    Cc: Serge Hallyn <>
    Cc: Sukadev Bhattiprolu <>
    Cc: Roland McGrath <>
    Signed-off-by: Andrew Morton <>
    Signed-off-by: Linus Torvalds <>
    Simplify copy_thread()
    First argument unused since 2.3.11.
    [ coding-style fixes]
    Signed-off-by: Alexey Dobriyan <>
    Cc: <>
    Signed-off-by: Andrew Morton <>
    Signed-off-by: Linus Torvalds <>
    do_wait: fix waiting for the group stop with the dead leader
    do_wait(WSTOPPED) assumes that p->state must be == TASK_STOPPED, this is
    not true if the leader is already dead.  Check SIGNAL_STOP_STOPPED instead
    and use signal->group_exit_code.
    Trivial test-case:
    	void *tfunc(void *arg)
    	{
    		pause();
    		return NULL;
    	}
    	int main(void)
    	{
    		pthread_t thr;
    		pthread_create(&thr, NULL, tfunc, NULL);
    		pthread_exit(NULL);
    		return 0;
    	}
    It doesn't react to ^Z (and then to ^C or ^\). The task is stopped, but
    bash can't see this.
    The bug is very old, and it was reported multiple times. This patch was sent
    more than a year ago, but it was ignored.
    This change also fixes other oddities (but not all) in this area.  For
    example, before this patch:
    	$ sleep 100
    	[1]+  Stopped                 sleep 100
    	$ strace -p `pidof sleep`
    	Process 11442 attached - interrupt to quit
    strace hangs in do_wait(), because ->exit_code was already consumed by
    bash.  After this patch, strace happily proceeds:
    	--- SIGTSTP (Stopped) @ 0 (0) ---
    	restart_syscall(<... resuming interrupted call ...>
    To me, this looks much more "natural" and correct.
    Another example.  Let's suppose we have the main thread M and sub-thread
    T, the process is stopped, and its parent did wait(WSTOPPED).  Now we can
    ptrace T but not M.  This looks at least strange to me.
    Imho, do_wait() should not confuse the per-thread ptrace stops with the
    per-process job control stops.
    Signed-off-by: Oleg Nesterov <>
    Cc: Denys Vlasenko <>
    Cc: "Eric W. Biederman" <>
    Cc: Jan Kratochvil <>
    Cc: Kaz Kylheku <>
    Cc: Michael Kerrisk <>
    Cc: Roland McGrath <>
    Cc: Ulrich Drepper <>
    Signed-off-by: Andrew Morton <>
    Signed-off-by: Linus Torvalds <>
    reparent_thread: don't call kill_orphaned_pgrp() if task_detached()
    If task_detached(p) == T, then either
      a) p is not the main thread, we will find the group leader on the
         ->children list.
      b) p is the group leader but its ->exit_state = EXIT_DEAD.  This
         can only happen when the last sub-thread has died, but in that case
         that thread has already called kill_orphaned_pgrp() from
    In both cases kill_orphaned_pgrp() looks bogus.
    Move the task_detached() check up and simplify the code, this is also
    right from the "common sense" pov: we should do nothing with the detached
    children, except move them to the new parent's ->children list.
    Signed-off-by: Oleg Nesterov <>
    Cc: Roland McGrath <>
    Cc: "Eric W. Biederman" <>
    Signed-off-by: Andrew Morton <>
    Signed-off-by: Linus Torvalds <>
    reparent_thread: fix the "is it traced" check
    reparent_thread() uses ptrace_reparented() to check whether this thread is
    ptraced, in that case we should not notify the new parent.
    But ptrace_reparented() is not exactly correct when the reparented thread
    is traced by /sbin/init, because forget_original_parent() has already
    changed ->real_parent.
    Currently, the only problem is the false notification.  But with the next
    patch the kernel would crash in this (yes, pathological) case.
    Signed-off-by: Oleg Nesterov <>
    Cc: Roland McGrath <>
    Cc: "Eric W. Biederman" <>
    Signed-off-by: Andrew Morton <>
    Signed-off-by: Linus Torvalds <>
    reparent_thread: fix a zombie leak if /sbin/init ignores SIGCHLD
    If /sbin/init ignores SIGCHLD and we re-parent a zombie, it is leaked.
    reparent_thread() does do_notify_parent() which sets ->exit_signal = -1 in
    this case.  This means that nobody except us can reap it, the detached
    task is not visible to do_wait().
    Change reparent_thread() to return a boolean (like __pthread_detach) to
    indicate that the thread is dead and must be released.  Also change
    forget_original_parent() to add the child to ptrace_dead list in this
    The naming becomes insane, the next patch does the cleanup.
    Signed-off-by: Oleg Nesterov <>
    Cc: Roland McGrath <>
    Cc: "Eric W. Biederman" <>
    Signed-off-by: Andrew Morton <>
    Signed-off-by: Linus Torvalds <>
    mm: fix proc_dointvec_userhz_jiffies "breakage"
    On i386, HZ=1000, jiffies_to_clock_t() converts time in a somewhat strange
    way from the user's point of view:
    	# echo 500 >/proc/sys/vm/dirty_writeback_centisecs
    	# cat /proc/sys/vm/dirty_writeback_centisecs
    So, we have 5000 jiffies converted to only 499 clock ticks and reported
    TICK_NSEC = 999848
    ACTHZ = 256039
    Keeping in-kernel variable in units passed from userspace will fix issue
    of course, but this probably won't be right for every sysctl.
    [ coding-style fixes]
    Signed-off-by: Alexey Dobriyan <>
    Cc: Peter Zijlstra <>
    Cc: Nick Piggin <>
    Signed-off-by: Andrew Morton <>
    Signed-off-by: Linus Torvalds <>
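    The numbers in the proc_dointvec_userhz_jiffies message above can be
    reproduced with a few lines of integer arithmetic. The sketch assumes
    that the conversion back to clock ticks multiplies the jiffies count
    by TICK_NSEC and divides by the length of one user tick in
    nanoseconds; the function names are hypothetical and USER_HZ = 100 is
    an assumption, while HZ = 1000 and TICK_NSEC = 999848 come from the
    message.

```c
#include <stdint.h>

#define USER_HZ       100           /* assumed clock ticks per second in user space */
#define KERNEL_HZ     1000          /* HZ from the message */
#define NSEC_PER_SEC  1000000000ULL
#define TICK_NSEC     999848ULL     /* value quoted in the message */

/* Clock ticks written by the user -> jiffies stored in the kernel. */
static uint64_t user_ticks_to_jiffies(uint64_t ticks)
{
	return ticks * (KERNEL_HZ / USER_HZ);
}

/* Jiffies -> clock ticks reported back, using the real tick length.
 * Integer division truncates, which is where the lost tick comes from. */
static uint64_t jiffies_to_user_ticks(uint64_t jiffies)
{
	return (jiffies * TICK_NSEC) / (NSEC_PER_SEC / USER_HZ);
}
```

    Writing 500 stores 5000 jiffies. Reading back computes
    5000 * 999848 / 10000000 = 499.924, which truncates to 499.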
    mm: prevent divide error for small values of vm_dirty_bytes
    Avoid setting less than two pages for vm_dirty_bytes: this is necessary to
    avoid potential division by 0 (like the following) in get_dirty_limits().
    [   49.951610] divide error: 0000 [#1] PREEMPT SMP
    [   49.952195] last sysfs file: /sys/devices/pci0000:00/0000:00:01.1/host0/target0:0:0/0:0:0:0/block/sda/uevent
    [   49.952195] CPU 1
    [   49.952195] Modules linked in: pcspkr
    [   49.952195] Pid: 3064, comm: dd Not tainted 2.6.30-rc3 #1
    [   49.952195] RIP: 0010:[<ffffffff802d39a9>]  [<ffffffff802d39a9>] get_dirty_limits+0xe9/0x2c0
    [   49.952195] RSP: 0018:ffff88001de03a98  EFLAGS: 00010202
    [   49.952195] RAX: 00000000000000c0 RBX: ffff88001de03b80 RCX: 28f5c28f5c28f5c3
    [   49.952195] RDX: 0000000000000000 RSI: 00000000000000c0 RDI: 0000000000000000
    [   49.952195] RBP: ffff88001de03ae8 R08: 0000000000000000 R09: 0000000000000000
    [   49.952195] R10: ffff88001ddda9a0 R11: 0000000000000001 R12: 0000000000000001
    [   49.952195] R13: ffff88001fbc8218 R14: ffff88001de03b70 R15: ffff88001de03b78
    [   49.952195] FS:  00007fe9a435b6f0(0000) GS:ffff8800025d9000(0000) knlGS:0000000000000000
    [   49.952195] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [   49.952195] CR2: 00007fe9a39ab000 CR3: 000000001de38000 CR4: 00000000000006e0
    [   49.952195] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [   49.952195] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    [   49.952195] Process dd (pid: 3064, threadinfo ffff88001de02000, task ffff88001ddda250)
    [   49.952195] Stack:
    [   49.952195]  ffff88001fa0de00 ffff88001f2dbd70 ffff88001f9fe800 000080b900000000
    [   49.952195]  00000000000000c0 ffff8800027a6100 0000000000000400 ffff88001fbc8218
    [   49.952195]  0000000000000000 0000000000000600 ffff88001de03bb8 ffffffff802d3ed7
    [   49.952195] Call Trace:
    [   49.952195]  [<ffffffff802d3ed7>] balance_dirty_pages_ratelimited_nr+0x1d7/0x3f0
    [   49.952195]  [<ffffffff80368f8e>] ? ext3_writeback_write_end+0x9e/0x120
    [   49.952195]  [<ffffffff802cc7df>] generic_file_buffered_write+0x12f/0x330
    [   49.952195]  [<ffffffff802cce8d>] __generic_file_aio_write_nolock+0x26d/0x460
    [   49.952195]  [<ffffffff802cda32>] ? generic_file_aio_write+0x52/0xd0
    [   49.952195]  [<ffffffff802cda49>] generic_file_aio_write+0x69/0xd0
    [   49.952195]  [<ffffffff80365fa6>] ext3_file_write+0x26/0xc0
    [   49.952195]  [<ffffffff803034d1>] do_sync_write+0xf1/0x140
    [   49.952195]  [<ffffffff80290d1a>] ? get_lock_stats+0x2a/0x60
    [   49.952195]  [<ffffffff80280730>] ? autoremove_wake_function+0x0/0x40
    [   49.952195]  [<ffffffff8030411b>] vfs_write+0xcb/0x190
    [   49.952195]  [<ffffffff803042d0>] sys_write+0x50/0x90
    [   49.952195]  [<ffffffff8022ff6b>] system_call_fastpath+0x16/0x1b
    [   49.952195] Code: 00 00 00 2b 05 09 1c 17 01 48 89 c6 49 0f af f4 48 c1 ee 02 48 89 f0 48 f7 e1 48 89 d6 31 d2 48 c1 ee 02 48 0f af 75 d0 48 89 f0 <48> f7 f7 41 8b 95 ac 01 00 00 48 89 c7 49 0f af d4 48 c1 ea 02
    [   49.952195] RIP  [<ffffffff802d39a9>] get_dirty_limits+0xe9/0x2c0
    [   49.952195]  RSP <ffff88001de03a98>
    [   50.096523] ---[ end trace 008d7aa02f244d7b ]---
    Signed-off-by: Andrea Righi <>
    Cc: Peter Zijlstra <>
    Cc: David Rientjes <>
    Cc: Dave Chinner <>
    Cc: Christoph Lameter <>
    Signed-off-by: Andrew Morton <>
    Signed-off-by: Linus Torvalds <>
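    A minimal sketch of the two-page lower bound described in the
    vm_dirty_bytes message above. It is not the kernel's sysctl handler:
    set_dirty_bytes() and PAGE_SIZE_BYTES are hypothetical names, the
    4096-byte page size is an assumption, and rejecting a too-small value
    with -EINVAL is just one possible policy.

```c
#include <errno.h>

#define PAGE_SIZE_BYTES 4096UL  /* assumed page size for this example */

/* Accept a new dirty-bytes limit only if it covers at least two pages.
 * Returns 0 on success, or -EINVAL (leaving *limit unchanged) if the
 * requested value is too small. */
static int set_dirty_bytes(unsigned long *limit, unsigned long new_bytes)
{
	const unsigned long min_bytes = 2 * PAGE_SIZE_BYTES;

	if (new_bytes < min_bytes)
		return -EINVAL;
	*limit = new_bytes;
	return 0;
}
```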
    workqueue: avoid recursion in run_workqueue()
    1) lockdep will complain when run_workqueue() performs recursion.
    2) The recursive implementation of run_workqueue() means that
       flush_workqueue() and its documentation are inconsistent.  This may
       hide deadlocks and other bugs.
    3) The recursion in run_workqueue() will poison cwq->current_work, but
       flush_work() and __cancel_work_timer(), etcetera need a reliable
    Signed-off-by: Lai Jiangshan <>
    Acked-by: Oleg Nesterov <>
    Cc: Peter Zijlstra <>
    Cc: Ingo Molnar <>
    Cc: Frederic Weisbecker <>
    Cc: Eric Dumazet <>
    Cc: Rusty Russell <>
    Signed-off-by: Andrew Morton <>
    Signed-off-by: Linus Torvalds <>
    work_on_cpu(): rewrite it to create a kernel thread on demand
    Impact: circular locking bugfix
    The various implementations and proposed implementations of work_on_cpu()
    are vulnerable to various deadlocks because they all used queues of some
    Unrelated pieces of kernel code thus gained dependencies wherein if one
    work_on_cpu() caller holds a lock which some other work_on_cpu() callback
    also takes, the kernel could rarely deadlock.
    Fix this by creating a short-lived kernel thread for each work_on_cpu()
    This is not terribly fast, but the only current caller of work_on_cpu() is
    It would be nice to find some other way of doing the node-local
    allocations in the PCI probe code so that we can zap work_on_cpu()
    altogether.  The code there is rather nasty.  I can't think of anything
    simple at this time...
    Cc: Ingo Molnar <>
    Signed-off-by: Andrew Morton <>
    Signed-off-by: Rusty Russell <>
    epoll keyed wakeups: add __wake_up_locked_key() and __wake_up_sync_key()
    This patchset introduces wakeup hints for some of the most popular (from
    epoll POV) devices, so that epoll code can avoid spurious wakeups on its
    The problem with epoll is that the callback-based wakeups do not, ATM,
    carry any information about the events the wakeup is related to.  So the
    only choice epoll has (not being able to call f_op->poll() from inside the
    callback), is to add the file* to a ready-list and resolve the real events
    later on, at epoll_wait() (or its own f_op->poll()) time.  This can cause
    spurious wakeups, since the wake_up() itself might be for an event the
    caller is not interested in.
    The rate of these spurious wakeups can be pretty high in case of many
    network sockets being monitored.
    By allowing devices to report the events the wakeups refer to (at least
    the two major classes - POLLIN/POLLOUT), we are able to spare useless
    wakeups by proper handling inside the epoll's poll callback.
    Epoll will have in any case to call f_op->poll() on the file* later on,
    since the change to be done in order to have the full event set sent via
    wakeup, is too invasive for the way our f_op->poll() system works (the
    full event set is calculated inside the poll function - there are too many
    of them to even start thinking the change - also poll/select would need
    change too).
    Epoll is changed in a way that both devices which send event hints, and
    the ones that don't, are correctly handled.  The former will gain some
    efficiency though.
    A general rule for devices would be to add an event mask by using
    key-aware wakeup macros, when making up poll wait queues.  I tested it
    (together with the epoll's poll fix patch Andrew has in -mm) and wakeups
    for the supported devices are correctly filtered.
    Test program available here:
    This patch:
    Nothing revolutionary here.  Just using the available "key" that our
    wakeup core already supports.  The __wake_up_locked_key() was a no-brainer,
    since both __wake_up_locked() and __wake_up_locked_key() are thin wrappers
    around __wake_up_common().
    The __wake_up_sync() function had a body, so the choice was between
    borrowing the body for __wake_up_sync_key() and calling it from
    __wake_up_sync(), or making an inline and calling it from both.  I chose the
    former since in most archs it all resolves to "mov $0, REG; jmp ADDR".
    Signed-off-by: Davide Libenzi <>
    Cc: Alan Cox <>
    Cc: Ingo Molnar <>
    Cc: David Miller <>
    Cc: William Lee Irwin III <>
    Signed-off-by: Andrew Morton <>
    Signed-off-by: Linus Torvalds <>
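    The filtering described in the epoll message above can be pictured
    with a small user-space model. This is a sketch under assumptions,
    not the epoll code: struct waiter and waiter_wake() are hypothetical,
    and a key of 0 is taken to mean that the waker supplies no event hint.

```c
#include <poll.h>

/* Hypothetical waiter: the event mask it cares about plus a wake counter. */
struct waiter {
	unsigned long wanted;  /* e.g. POLLIN | POLLOUT */
	unsigned int wakeups;
};

/* Wakeup callback. `key` carries the events the waker reports, or 0 if
 * the waker gives no hint. Returns 1 if the waiter was woken. */
static int waiter_wake(struct waiter *w, unsigned long key)
{
	/* A hint that shares no bits with our mask is a spurious wakeup. */
	if (key && !(key & w->wanted))
		return 0;
	w->wakeups++;
	return 1;
}
```

    Wakers that pass no hint still wake every waiter, so devices that
    have not been converted keep working; only hinted wakeups are filtered.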
    kernel/posix-cpu-timers.c: fix sparse warning
    Sparse reports the following in kernel/posix-cpu-timers.c:
      warning: symbol 'firing' shadows an earlier one
    Signed-off-by: H Hartley Sweeten <>
    Cc: Subrata Modak <>
    LKML-Reference: <>
    Signed-off-by: Ingo Molnar <>
    posix timers: fix RLIMIT_CPU && fork()
    copy_signal() copies signal->rlim, but RLIMIT_CPU is "lost". Because
    posix_cpu_timers_init_group() sets cputime_expires.prof_exp = 0 and thus
    fastpath_timer_check() returns false unless we have other cpu timers.
    This is the minimal fix for 2.6.29 (tested) and 2.6.28. The patch is not
    optimal, we need further cleanups here. With this patch update_rlimit_cpu()
    is not really needed, but I don't think it should be removed.
    The proper fix (I think) is:
    	- set_process_cpu_timer() should just start the cputimer->running
    	  logic (it does), no need to change cputime_expires.xxx_exp
    	- posix_cpu_timers_init_group() should set ->running when needed
    	- fastpath_timer_check() can check ->running instead of
    Reported-by: Peter Lojkin <>
    Signed-off-by: Oleg Nesterov <>
    Cc: Peter Zijlstra <>
    Cc: Roland McGrath <>
    Cc: <> [for 2.6.29.x]
    LKML-Reference: <>
    Signed-off-by: Ingo Molnar <>
    posix_cpu_timers_exit_group(): Do not use thread_group_cputimer()
    When the process exits we don't have to run new cputimer nor
    use running one (as it does not account when tsk->exit_state != 0)
    to get process CPU times.  As there is only one thread we can
    just use CPU times fields from task and signal structs.
    Signed-off-by: Stanislaw Gruszka <>
    Cc: Peter Zijlstra <>
    Cc: Roland McGrath <>
    Cc: Vitaly Mayatskikh <>
    Signed-off-by: Andrew Morton <>
    Signed-off-by: Ingo Molnar <>
    sched, timers: move calc_load() to scheduler
    Dimitri Sivanich noticed that xtime_lock is held write locked across
    calc_load() which iterates over all online CPUs. That can cause long
    latencies for xtime_lock readers on large SMP systems.
    The load average calculation is a rough estimate anyway so there is
    no real need to protect the readers vs. the update. It's not a problem
    when the avenrun array is updated while a reader copies the values.
    Instead of iterating over all online CPUs let the scheduler_tick code
    update the number of active tasks shortly before the avenrun update
    happens. The avenrun update itself is handled by the CPU which calls
    [ Impact: reduce xtime_lock write locked section ]
    Signed-off-by: Thomas Gleixner <>
    Acked-by: Peter Zijlstra <>
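    For readers unfamiliar with how an avenrun entry is updated, the
    sketch below shows the usual fixed-point exponential average. The
    constants (11 fractional bits, 1884 as the one-minute decay factor)
    are assumed here from the kernel's long-standing convention and are
    not stated in the message above.

```c
#define FSHIFT   11               /* bits of fractional precision (assumed) */
#define FIXED_1  (1UL << FSHIFT)  /* 1.0 in fixed point */
#define EXP_1    1884UL           /* one-minute decay factor in fixed point (assumed) */

/* One update step: blend the old average with the current number of
 * active tasks. Both `load` and `active` are fixed-point values. */
static unsigned long calc_load(unsigned long load, unsigned long exp,
			       unsigned long active)
{
	load *= exp;
	load += active * (FIXED_1 - exp);
	return load >> FSHIFT;
}
```

    Starting from an idle average of 0 with one active task
    (active = FIXED_1), a single step yields 164, about 0.08 in fixed
    point. Because each value is a smoothed estimate, a reader seeing a
    slightly stale entry loses very little.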
    sched: remove unused fields from struct rq
    Impact: cleanup, new schedstat ABI
    Since they are used only in statistics and are always set to zero, the
    following fields from struct rq have been removed: yld_exp_empty,
    yld_act_empty and yld_both_empty.
    Both Sched Debug and SCHEDSTAT_VERSION versions have also been
    incremented since ABIs have been changed.
    The schedtop tool has been updated to properly handle new version of
    Signed-off-by: Luis Henriques <>
    Acked-by: Gregory Haskins <>
    Acked-by: Peter Zijlstra <>
    LKML-Reference: <>
    Signed-off-by: Ingo Molnar <>
    sched, timers: cleanup avenrun users
    avenrun is a rough estimate so we don't have to worry about
    consistency of the three avenrun values. Remove the xtime lock
    dependency and provide a function to scale the values. Cleanup the
    [ Impact: cleanup ]
    Signed-off-by: Thomas Gleixner <>
    Acked-by: Peter Zijlstra <>
  3. Revert "[ARM] msm: audio: Add error check for sched_setscheduler call"

    Steve Kondik committed
    This reverts commit 82711a2.