clocksource: skip check while watchdog hung up or unstable

After patch 1f45f1f (clocksource: Make clocksource validation work for all clocksources), md_nsec may be 0 in some scenarios, such as when the watchdog is delayed for a long time or the watchdog has a time-warp.

We found a problem when testing nvme disks with fio, with multiple queue interrupts of a disk mapped to a single CPU. IO interrupt processing caused the watchdog to be delayed for a long time (155 seconds), after which the system reported the TSC as unstable and switched the clocksource to hpet. It seems that this scenario cannot be handled by optimizing softirq. Therefore, when md_nsec is 0, the machine or the watchdog should be considered to be in an unstable state, and the verification result is unreliable. Is it possible for us to skip the current check at this time?

1. If the watchdog is delayed because the system is busy, and the clocksource is switched to hpet due to a wrong judgment, the performance degradation may directly make the machine unavailable and cause more problems.

2. If the watchdog has a time-warp, we should not rely on hpet to directly mark the TSC as unstable. Later, when the watchdog is registered on another CPU and that CPU is not busy, the stability of the TSC can still be checked.

Signed-off-by: Chunguang Xu <brookxu@tencent.com>
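
The proposed behaviour can be illustrated with a small user-space sketch. This is not the kernel code or the patch itself: the names `clamped_delta`, `check_once`, and `enum wd_verdict` are hypothetical, the clamping rule is an assumption about how a validated delta can end up as 0, and the threshold is just a parameter chosen by the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Outcome of one simulated watchdog pass. All names are hypothetical. */
enum wd_verdict { WD_OK, WD_SKIP, WD_UNSTABLE };

/*
 * Assumed validation rule: a delta that looks negative or lands in the
 * upper half of the counter range is reported as 0 instead of a huge value.
 */
static uint64_t clamped_delta(uint64_t now, uint64_t last, uint64_t mask)
{
    uint64_t delta = (now - last) & mask;

    return (delta & ~(mask >> 1)) ? 0 : delta;
}

/*
 * Compare the interval seen by the watchdog (md_nsec) with the interval
 * seen by the clocksource under test (cs_nsec), both in nanoseconds.
 */
static enum wd_verdict check_once(uint64_t md_nsec, uint64_t cs_nsec,
                                  uint64_t threshold_ns)
{
    uint64_t skew;

    assert(threshold_ns > 0);

    /* A zero watchdog interval means the reference itself is suspect:
     * skip this round rather than blame the clocksource under test. */
    if (md_nsec == 0)
        return WD_SKIP;

    skew = md_nsec > cs_nsec ? md_nsec - cs_nsec : cs_nsec - md_nsec;
    return skew > threshold_ns ? WD_UNSTABLE : WD_OK;
}
```

With a 32-bit mask, a watchdog reading that appears to go backwards yields a clamped delta of 0, and `check_once` then returns `WD_SKIP` regardless of how large `cs_nsec` is. A normal, non-zero interval is still compared against the threshold as before.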