gby edited this page Mar 29, 2012 · 13 revisions

Sources of CPU interference in core Linux code

Linux version: 3.3

CPU interference: For the purpose of this document, CPU interference is defined as any case where kernel management functions take CPU time from a purely CPU-bound user task that makes no system calls and runs pinned to an isolated CPU with no other contending user tasks. Specifically, any kernel activity that occurs as a result of the specific task's own activity is not considered interference. However, kernel activity on that CPU that results from the work of other tasks, or from general kernel infrastructure not related to the specific task, is considered interference.
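As a rough illustration of this definition, the sketch below pins a process to one CPU and spins reading a monotonic clock, recording any gap between consecutive reads that exceeds a threshold. The function names, the 50 microsecond threshold, and the choice of CPU are assumptions made for the example and do not come from this page. On most Linux systems the clock read is served without a real system call, but the Python interpreter adds its own jitter, so a real measurement would more likely use a small C loop.

```python
import os
import time


def pin_to_cpu(cpu):
    """Restrict the calling process to a single CPU (Linux only)."""
    os.sched_setaffinity(0, {cpu})


def measure_gaps(duration_s=1.0, threshold_ns=50_000):
    """Busy-loop for duration_s seconds and return every gap (in ns)
    between consecutive clock reads that is at least threshold_ns.

    A large gap means something else used the CPU in between: kernel
    work, another task, or interpreter overhead. The threshold is a
    hypothetical value and needs tuning per machine.
    """
    gaps = []
    start = time.perf_counter_ns()
    deadline = start + int(duration_s * 1e9)
    prev = start
    while prev < deadline:
        now = time.perf_counter_ns()
        delta = now - prev
        if delta >= threshold_ns:
            gaps.append(delta)
        prev = now
    return gaps


if __name__ == "__main__":
    # Assumption: the highest-numbered CPU we may run on is the isolated one.
    cpu = max(os.sched_getaffinity(0))
    pin_to_cpu(cpu)
    gaps = measure_gaps()
    worst = max(gaps) if gaps else 0
    print(f"CPU {cpu}: {len(gaps)} gaps over threshold, worst {worst} ns")
```

Running it once on an isolated CPU and once on a busy housekeeping CPU gives a crude comparison of how often the loop is interrupted on each.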

Global IPIs

  1. fs/buffer.c - Global IPI for each LRU drain [1]
  2. kernel/hrtimer.c - Global IPI for setting the high-res timer when the system clock is changed [2]
  3. kernel/profile.c - double flip buffer management, used during profiling only [2]
  4. kernel/rcutree.c - rcu_barrier() implementation uses a global IPI to queue an RCU callback on every CPU
  5. mm/page_alloc.c - global IPI for draining all per-cpu pages [1] [4]
  6. mm/slab.c - global IPI during per cpu cache drain and tuning [3]
  7. mm/slub.c - global IPI during per cpu caches drain, called when destroying a cache (also does an rcu_barrier) [1]
  8. net/core/dev.c - global IPI to flush backlog during net device unregister (also does an rcu_barrier) [2]
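One way to observe whether IPIs such as those listed above reach a given CPU is to compare per-CPU counters in /proc/interrupts before and after a workload. The following is a minimal sketch: the helper names are hypothetical, and it assumes the x86 layout in which the row labelled CAL counts function call interrupts. Other architectures use different row labels.

```python
import time


def parse_interrupt_row(text, label):
    """Return {cpu_name: count} for the row of /proc/interrupts whose
    label (the text before the colon) equals `label`, or {} if absent."""
    lines = text.splitlines()
    if not lines:
        return {}
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        name, _, rest = line.partition(":")
        if name.strip() == label:
            counts = rest.split()[:len(cpus)]
            return {cpu: int(n) for cpu, n in zip(cpus, counts)}
    return {}


def count_delta(before, after):
    """Per-CPU increase between two snapshots from parse_interrupt_row."""
    return {cpu: after[cpu] - before.get(cpu, 0) for cpu in after}


if __name__ == "__main__":
    def snapshot():
        with open("/proc/interrupts") as f:
            return parse_interrupt_row(f.read(), "CAL")

    first = snapshot()
    time.sleep(5)  # run the workload of interest during this window
    print(count_delta(first, snapshot()))
```

A CPU whose counter does not move during the window received no function call IPIs in that period; a counter that does move says nothing about which kernel path sent them.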

Global work queue scheduling

  1. mm/swap.c - global work scheduled on each cpu for draining pagevecs to LRU lists [5]
  2. mm/slab.c - delayed work scheduled once per second to reap the per cpu slab cache
  3. mm/vmstat.c - delayed work scheduled once per second for pcp cache draining and VM statistics

Global kthread scheduling

  1. kernel/stop_machine.c - anything using stop_machine mechanism, such as
  • synchronize_..._expedited [7]
  • module unload [2]
  • cpu hot-unplug [2]
  • text_poke (used for kprobes) [2]

Global Timers

  1. kernel/time/tick-sched.c - the scheduler tick runs the scheduler, run queue time keeping, timer management, RCU [6]
  2. kernel/time/clocksource.c - the clocksource_watchdog registers a timer on each CPU in a cyclic manner [8]


  1. Dealt with in the "Reduce cross CPU IPI interference" patch set (https://lkml.org/lkml/2012/1/8/109). This patch set is now in Linus's 3.4-rc0 tree.
  2. Externally controlled events, most of which are not normally done during normal production run time (with the possible exception of module load/unload).
  3. Since slab and slub are mutually exclusive it is enough to deal with one of them.
  4. Made a very rare event by Mel Gorman's patch to not do a global drain in the direct reclaim path of the memory allocator (https://lkml.org/lkml/2012/1/11/93).
  5. Might be dealt with by the effort to get rid of pagevecs (see: https://lkml.org/lkml/2012/1/4/376) or we can do schedule_on_each_cpu_mask(...).
  6. Dealt with by Frederic Weisbecker's NOHZ cpuset patch set (see: http://lwn.net/Articles/455044/).
  7. Paul McKenney now has "Make synchronize_sched_expedited() avoid IPIing idle CPUs" on his to-do list: http://kernel.org/pub/linux/kernel/people/paulmck/rcutodo.html.
  8. Only relevant if your clock source is not stable. Can be turned off by a CONFIG option, which is currently hard coded to true for x86. This patch allows it to be turned off: https://lkml.org/lkml/2012/3/27/193.
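Regarding note 8, a quick way to see which clock source a running system uses is to read the clocksource entries in sysfs. This is a small sketch with a hypothetical helper name. The sysfs directory shown is the usual location on Linux, but treat it as an assumption for your kernel. The output only shows the current selection and does not say whether the watchdog timer is armed.

```python
import os

# Usual sysfs location of the clocksource device (assumed, not from this page).
CLOCKSOURCE_DIR = "/sys/devices/system/clocksource/clocksource0"


def read_clocksource(base=CLOCKSOURCE_DIR):
    """Return (current, available) where current is the name of the
    clock source in use and available is the list of selectable ones."""
    def read(name):
        with open(os.path.join(base, name)) as f:
            return f.read().strip()

    return read("current_clocksource"), read("available_clocksource").split()


if __name__ == "__main__":
    current, available = read_clocksource()
    print(f"current: {current}; available: {', '.join(available)}")
```

On x86, seeing a current source other than tsc while tsc is listed as available may indicate that the TSC was judged unstable at some point, though other causes (such as a boot parameter) are possible.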


Thank you to Frederic Weisbecker, Peter Zijlstra, Paul E. McKenney, and Christoph Lameter for contributing to this list!