sched/fair: Scan cluster before scanning LLC in wake-up path
For platforms having clusters like Kunpeng920, CPUs within the same cluster
have lower latency when synchronizing and accessing shared resources like
cache. Thus, this patch tries to find an idle CPU within the cluster of the
target CPU before scanning the whole LLC to gain lower latency.

Note neither Kunpeng920 nor x86 Jacobsville supports SMT, so this patch
doesn't consider SMT for the moment.

Testing has been done on Kunpeng920 by pinning tasks to one NUMA node and
to two NUMA nodes. On Kunpeng920, each NUMA node has 8 clusters and each
cluster has 4 CPUs.

With this patch, we noticed an improvement on tbench both within one NUMA
node and across two NUMA nodes.

On numa 0:
                           5.19-rc1                 patched
Hmean     1       350.27 (   0.00%)       406.88 *  16.16%*
Hmean     2       702.01 (   0.00%)       808.22 *  15.13%*
Hmean     4      1405.14 (   0.00%)      1614.34 *  14.89%*
Hmean     8      2830.53 (   0.00%)      3169.02 *  11.96%*
Hmean     16     5597.95 (   0.00%)      6224.20 *  11.19%*
Hmean     32    10537.38 (   0.00%)     10524.97 *  -0.12%*
Hmean     64     8366.04 (   0.00%)      8437.41 *   0.85%*
Hmean     128    7060.87 (   0.00%)      7150.25 *   1.27%*

On numa 0-1:
                           5.19-rc1                 patched
Hmean     1       346.11 (   0.00%)       408.47 *  18.02%*
Hmean     2       693.34 (   0.00%)       805.78 *  16.22%*
Hmean     4      1384.96 (   0.00%)      1602.49 *  15.71%*
Hmean     8      2699.45 (   0.00%)      3069.98 *  13.73%*
Hmean     16     5327.11 (   0.00%)      5688.19 *   6.78%*
Hmean     32    10019.10 (   0.00%)     11862.56 *  18.40%*
Hmean     64    13850.57 (   0.00%)     17748.54 *  28.14%*
Hmean     128   12498.25 (   0.00%)     15541.59 *  24.35%*
Hmean     256   11195.77 (   0.00%)     13854.06 *  23.74%*

Tested-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
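
The scan order described in the message can be illustrated with a small user-space model. This is only a sketch, not the kernel code from this patch: `pick_idle_cpu()`, `cluster_first_cpu()`, `NR_CPUS`, and `CPUS_PER_CLUSTER` are hypothetical names, and the model assumes the LLC spans 32 CPUs laid out as 8 contiguous clusters of 4 (the per-NUMA-node figures quoted above).

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed topology for illustration: one LLC of 8 clusters x 4 CPUs. */
#define NR_CPUS          32
#define CPUS_PER_CLUSTER 4

/* Hypothetical helper: first CPU of the cluster that contains `cpu`. */
static int cluster_first_cpu(int cpu)
{
    return cpu - (cpu % CPUS_PER_CLUSTER);
}

/*
 * Return an idle CPU for a task waking near `target`, or -1 if none.
 * The target's own cluster is scanned first; the rest of the LLC is
 * scanned only if that cluster has no idle CPU.
 */
static int pick_idle_cpu(const bool idle[NR_CPUS], int target)
{
    assert(target >= 0 && target < NR_CPUS);

    int first = cluster_first_cpu(target);

    /* Pass 1: the target's cluster, starting at target and wrapping. */
    for (int i = 0; i < CPUS_PER_CLUSTER; i++) {
        int cpu = first + (target - first + i) % CPUS_PER_CLUSTER;
        if (idle[cpu])
            return cpu;
    }

    /* Pass 2: the remaining CPUs of the LLC, skipping that cluster. */
    for (int i = 1; i < NR_CPUS; i++) {
        int cpu = (target + i) % NR_CPUS;
        if (cluster_first_cpu(cpu) == first)
            continue;
        if (idle[cpu])
            return cpu;
    }

    return -1;
}
```

In this model, with only CPUs 1 and 6 idle and a target of CPU 2, `pick_idle_cpu()` returns CPU 1 (same cluster as the target), whereas a plain linear scan starting from the target would reach CPU 6 first.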