rv/monitor: Add safe watchdog monitor

The watchdog is an essential building block for using Linux in
safety-critical systems because it allows the system to be monitored
by an external element: the watchdog hardware, acting as a safety
monitor.

A user-space application controls the watchdog device via the watchdog
interface. This application, hereafter safety_app, enables the watchdog
and periodically pets it upon correct completion of the safety-related
processing.

If the safety_app, for any reason, stops pinging the watchdog, the
watchdog hardware can put the system into a fail-safe state, for
example by shutting the system down.

Given the importance of the safety_app / watchdog hardware pair, the
interaction between the two also needs some sort of monitoring. In
other words, "who monitors the monitor?"

The safe watchdog (safe_wtd) RV monitor monitors the interaction
between the safety_app and the watchdog device, enforcing the correct
sequence of events that leads the system to a safe state.

Furthermore, the safety_app can monitor the RV monitor by collecting
the events generated by the RV monitor itself via the tracing
interface. This closes the monitoring loop with the safety_app.

To reach a safe state, the safe_wtd RV monitor requires the safety_app
to:

  - Open the watchdog device
  - Start the watchdog
  - Set a timeout
  - Ping the watchdog at least once

The RV monitor also prevents some undesired actions, for example other
threads touching the watchdog.

The monitor also has a set of options, enabled via kernel command
line/module options. They are:

  - watchdog_id: the device id to monitor (default 0).
  - dont_stop: once enabled, do not allow the RV monitor to be stopped
    (default off).
  - safe_timeout: define a maximum safe value that a user-space
    application can set as the watchdog timeout (default unlimited).
  - check_timeout: after every ping, check if the time left in the
    watchdog is less than or equal to the last timeout set for the
    watchdog. It only works for watchdog devices that provide the
    get_timeleft() function (default off).

For further information, please refer to:
  Documentation/trace/rv/watchdog-monitor.rst

The monitor specification was developed together with Gabriele Paoloni,
in the context of the Linux Foundation Elisa Project.

Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Gabriele Paoloni <gpaoloni@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
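
The sequence that the commit message requires from the safety_app (open,
start, set a timeout, ping at least once) can be pictured as a small
state machine. The sketch below is a simplified user-space model for
illustration only, not the automaton implemented by the safe_wtd
monitor. All names (wd_model, wd_model_step, the WD_* states and events)
are hypothetical. It assumes the four steps must happen in exactly the
listed order, and it models safe_timeout as an upper bound where 0
stands for "unlimited".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model states; the real monitor's automaton may differ. */
enum wd_state {
	WD_INIT,        /* nothing has happened yet */
	WD_OPENED,      /* watchdog device opened */
	WD_STARTED,     /* watchdog started */
	WD_TIMEOUT_SET, /* timeout configured, no ping yet */
	WD_SAFE,        /* pinged at least once: safe state */
	WD_VIOLATION    /* unexpected event or unsafe timeout */
};

/* Hypothetical events, one per step listed in the commit message. */
enum wd_event { WD_EV_OPEN, WD_EV_START, WD_EV_SET_TIMEOUT, WD_EV_PING };

struct wd_model {
	enum wd_state state;
	unsigned int safe_timeout; /* assumption: 0 means unlimited */
};

static void wd_model_init(struct wd_model *m, unsigned int safe_timeout)
{
	assert(m != NULL);
	m->state = WD_INIT;
	m->safe_timeout = safe_timeout;
}

/*
 * Feed one event into the model. 'arg' is the requested timeout for
 * WD_EV_SET_TIMEOUT and is ignored for every other event.
 * Returns the state reached after the event.
 */
static enum wd_state wd_model_step(struct wd_model *m, enum wd_event ev,
				   unsigned int arg)
{
	assert(m != NULL);

	switch (m->state) {
	case WD_INIT:
		m->state = (ev == WD_EV_OPEN) ? WD_OPENED : WD_VIOLATION;
		break;
	case WD_OPENED:
		m->state = (ev == WD_EV_START) ? WD_STARTED : WD_VIOLATION;
		break;
	case WD_STARTED:
		/* Accept the timeout only if it respects safe_timeout. */
		if (ev == WD_EV_SET_TIMEOUT &&
		    (m->safe_timeout == 0 || arg <= m->safe_timeout))
			m->state = WD_TIMEOUT_SET;
		else
			m->state = WD_VIOLATION;
		break;
	case WD_TIMEOUT_SET:
	case WD_SAFE:
		/* The first ping reaches the safe state; later pings keep it. */
		m->state = (ev == WD_EV_PING) ? WD_SAFE : WD_VIOLATION;
		break;
	case WD_VIOLATION:
		/* Violations are sticky in this model. */
		break;
	}
	return m->state;
}
```

In this model, replaying open, start, set-timeout and ping in order ends
in WD_SAFE, while skipping a step or requesting a timeout above
safe_timeout ends in WD_VIOLATION.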