|
2 | 2 |
|
3 | 3 | Tracing and Logging high-level design
|
4 | 4 | #####################################
|
| 5 | + |
| 6 | +Both Trace and Log are built on top of a mechanism named shared |
| 7 | +buffer (Sbuf). |
| 8 | + |
| 9 | +Shared Buffer |
| 10 | +************* |
| 11 | + |
| 12 | +Shared Buffer is a ring buffer divided into predetermined-size slots. There |
| 13 | +are two use scenarios of Sbuf: |
| 14 | + |
| 15 | +- sbuf can serve as a lockless ring buffer to share data from ACRN HV to |
| 16 | + SOS in non-overwritten mode. (Writing will fail if an overrun |
| 17 | + happens.) |
| 18 | +- sbuf can serve as a conventional ring buffer in hypervisor in |
| 19 | + over-written mode. A lock is required to synchronize access by the |
| 20 | + producer and consumer. |
| 21 | + |
| 22 | +Both ACRNTrace and ACRNLog use sbuf as a lockless ring buffer. The Sbuf |
| 23 | +is allocated by SOS and assigned to HV via a hypercall. To hold pointers |
| 24 | +to sbuf passed down via hypercall, an array ``sbuf[ACRN_SBUF_ID_MAX]`` |
| 25 | +is defined in per_cpu region of HV, with predefined sbuf id to identify |
| 26 | +the usage, such as ACRNTrace, ACRNLog, etc. |
| 27 | + |
| 28 | +For each physical CPU there is a dedicated Sbuf. Only a single producer |
| 29 | +is allowed to put data into that Sbuf in HV, and a single consumer is |
| 30 | +allowed to get data from Sbuf in SOS. Therefore, no lock is required to |
| 31 | +synchronize access by the producer and consumer. |
| 32 | + |
| 33 | +sbuf APIs |
| 34 | +========= |
| 35 | + |
| 36 | +.. note:: reference APIs defined in hypervisor/include/debug/sbuf.h |
| 37 | + |
| 38 | + |
| 39 | +ACRN Trace |
| 40 | +********** |
| 41 | + |
| 42 | +ACRNTrace is a tool running on the Service OS (SOS) to capture trace |
| 43 | +data. It allows developers to add performance profiling trace points at |
| 44 | +key locations to get a picture of what is going on inside the |
| 45 | +hypervisor. Scripts to analyze the collected trace data are also |
| 46 | +provided. |
| 47 | + |
| 48 | +As shown in :numref:`acrntrace-arch`, ACRNTrace is built using |
| 49 | +Shared Buffers (Sbuf), and consists of three parts from bottom layer |
| 50 | +up: |
| 51 | + |
| 52 | +- **ACRNTrace userland app**: Userland application collecting trace data to |
| 53 | + files (Per Physical CPU) |
| 54 | + |
| 55 | +- **SOS Trace Module**: allocates/frees SBufs, creates device for each |
| 56 | + SBuf, sets up sbuf shared between SOS and HV, and provides a dev node for the |
| 57 | + userland app to retrieve trace data from Sbuf |
| 58 | + |
| 59 | +- **Trace APIs**: provide APIs to generate trace event and insert to Sbuf. |
| 60 | + |
| 61 | +.. figure:: images/log-image50.png |
| 62 | + :align: center |
| 63 | + :name: acrntrace-arch |
| 64 | + |
| 65 | + Architectural diagram of ACRNTrace |
| 66 | + |
| 67 | +Trace APIs |
| 68 | +========== |
| 69 | + |
| 70 | +.. note:: reference APIs defined in hypervisor/include/debug/trace.h |
| 71 | + for trace_entry struct and functions. |
| 72 | + |
| 73 | + |
| 74 | +SOS Trace Module |
| 75 | +================ |
| 76 | + |
| 77 | +The SOS trace module is responsible for: |
| 78 | + |
| 79 | +- allocating sbuf in sos memory range for each physical CPU, and assign |
| 80 | + the gpa of Sbuf to ``per_cpu sbuf[ACRN_TRACE]`` |
| 81 | +- create a misc device for each physical CPU |
| 82 | +- provide mmap operation to map entire Sbuf to userspace for high |
| 83 | + flexible and efficient access. |
| 84 | + |
| 85 | +On SOS shutdown, the trace module is responsible to remove misc devices, free |
| 86 | +SBufs, and set ``per_cpu sbuf[ACRN_TRACE]`` to null. |
| 87 | + |
| 88 | +ACRNTrace Application |
| 89 | +===================== |
| 90 | + |
| 91 | +ACRNTrace application includes a binary to retrieve trace data from |
| 92 | +Sbuf, and Python scripts to convert trace data from raw format into |
| 93 | +readable text, and do analysis. |
| 94 | + |
| 95 | +Figure 2.2 shows the sequence of trace initialization and trace data |
| 96 | +collection. With a debug build, trace components are initialized at boot |
| 97 | +time. After initialization, HV writes trace event date into sbuf |
| 98 | +until sbuf is full, which can happen easily if the ACRNTrace app is not |
| 99 | +consuming trace data from Sbuf on SOS user space. |
| 100 | + |
| 101 | +Once ACRNTrace is launched, for each physical CPU a consumer thread is |
| 102 | +created to periodically read RAW trace data from sbuf and write to a |
| 103 | +file. |
| 104 | + |
| 105 | +.. note:: figure is missing |
| 106 | + |
| 107 | + Figure 2.2 Sequence of trace init and trace data collection |
| 108 | + |
| 109 | +These are the Python scripts provided: |
| 110 | + |
| 111 | +- **acrntrace_format.py** converts RAW trace data to human-readable |
| 112 | + text offline according to given format; |
| 113 | + |
| 114 | +- **acrnalyze.py** analyzes trace data (as output by acrntrace) |
| 115 | + based on given analyzer filters, such as vm_exit or irq, and generates a |
| 116 | + report. |
| 117 | + |
| 118 | +See :ref:`acrntrace` for details and usage. |
| 119 | + |
| 120 | +ACRN Log |
| 121 | +******** |
| 122 | + |
| 123 | +acrnlog is a tool used to capture ACRN hypervisor log to files on |
| 124 | +SOS filesystem. It can run as an SOS service at boot, capturing two |
| 125 | +kinds of logs: |
| 126 | + |
| 127 | +- Current runtime logs; |
| 128 | +- Logs remaining in the buffer, from last crashed running. |
| 129 | + |
| 130 | +Architectural diagram |
| 131 | +===================== |
| 132 | + |
| 133 | +Similar to the design of ACRN Trace, ACRN Log is built on the top of |
| 134 | +Shared Buffer (Sbuf), and consists of three parts from bottom layer |
| 135 | +up: |
| 136 | + |
| 137 | +- **ACRN Log app**: Userland application collecting hypervisor log to |
| 138 | + files; |
| 139 | +- **SOS ACRN Log Module**: constructs/frees SBufs at reserved memory |
| 140 | + area, creates dev for current/last logs, sets up sbuf shared between |
| 141 | + SOS and HV, and provides a dev node for the userland app to |
| 142 | + retrieve logs |
| 143 | +- **ACRN log support in HV**: put logs at specified loglevel to Sbuf. |
| 144 | + |
| 145 | +.. figure:: images/log-image73.png |
| 146 | + :align: center |
| 147 | + |
| 148 | + Architectural diagram of ACRN Log |
| 149 | + |
| 150 | + |
| 151 | +ACRN log support in Hypervisor |
| 152 | +============================== |
| 153 | + |
| 154 | +To support acrn log, the following adaption was made to hypervisor log |
| 155 | +system: |
| 156 | + |
| 157 | +- log messages with severity level higher than a specified value will |
| 158 | + be put into Sbuf when calling logmsg in hypervisor |
| 159 | +- allocate sbuf to accommodate early hypervisor logs before SOS |
| 160 | + can allocate and set up sbuf |
| 161 | + |
| 162 | +There are 6 different loglevels, as shown below. The specified |
| 163 | +severity loglevel is stored in ``mem_loglevel``, initialized |
| 164 | +by :option:`CONFIG_MEM_LOGLEVEL_DEFAULT`. The loglevel can |
| 165 | +be set to a new value |
| 166 | +at runtime via hypervisor shell command "loglevel". |
| 167 | + |
| 168 | +.. code-block:: c |
| 169 | +
|
| 170 | + #define LOG_FATAL 1U |
| 171 | + #define LOG_ACRN 2U |
| 172 | + #define LOG_ERROR 3U |
| 173 | + #define LOG_WARNING 4U |
| 174 | + #define LOG_INFO 5U |
| 175 | + #define LOG_DEBUG 6U |
| 176 | +
|
| 177 | +
|
| 178 | +The element size of sbuf for logs is fixed at 80 bytes, and the max size |
| 179 | +of a single log message is 320 bytes. Log messages with a length between |
| 180 | +80 and 320 bytes will be separated into multiple sbuf elements. Log |
| 181 | +messages with length larger then 320 will be truncated. |
| 182 | + |
| 183 | +For security, SOS allocates sbuf in its memory range and assigns it to |
| 184 | +the hypervisor. To handle log messages before SOS boots, sbuf for each |
| 185 | +physical cpu will be allocated in acrn hypervisor memory range for any |
| 186 | +early log entries. Once sbuf in the SOS memory range is allocated and |
| 187 | +assigned to hypervisor via hypercall, the Hypervisor logmsg will switch |
| 188 | +to use SOS allocated sbuf, early logs will be copied, and early sbuf in |
| 189 | +hypervisor memory range will be freed. |
| 190 | + |
| 191 | +SOS ACRN Log Module |
| 192 | +=================== |
| 193 | + |
| 194 | +To enable retrieving log messages from a crash, 4MB of memory from |
| 195 | +0x6DE00000 is reserved for acrn log. This space is further divided into |
| 196 | +two each ranges, one for current run and one for last previous run: |
| 197 | + |
| 198 | +.. figure:: images/log-image59.png |
| 199 | + :align: center |
| 200 | + |
| 201 | + ACRN Log crash log/current log buffers |
| 202 | + |
| 203 | +On SOS boot, SOS acrnlog module is responsible to: |
| 204 | + |
| 205 | +- examine if there are log messages remaining from last crashed |
| 206 | + run by checking the magic number of each sbuf |
| 207 | + |
| 208 | +- if there are previous crash logs, construct sbuf and create misc devices for |
| 209 | + these last logs |
| 210 | + |
| 211 | +- construct sbuf in the usable buf range for each physical CPU, |
| 212 | + assign the gpa of Sbuf to ``per_cpu sbuf[ACRN_LOG]`` and create a misc |
| 213 | + device for each physical CPU |
| 214 | + |
| 215 | +- the misc devices implement read() file operation to allow |
| 216 | + userspace app to read one Sbuf element. |
| 217 | + |
| 218 | +When checking the validity of sbuf for last logs examination, it sets the |
| 219 | +current sbuf with magic number ``0x5aa57aa71aa13aa3``, and changes the |
| 220 | +magic number of last sbuf to ``0x5aa57aa71aa13aa2``, to distinguish which is |
| 221 | +the current/last. |
| 222 | + |
| 223 | +On SOS shutdown, the module is responsible to remove misc devices, |
| 224 | +free SBufs, and set ``per_cpu sbuf[ACRN_TRACE]`` to null. |
| 225 | + |
| 226 | +ACRN Log Application |
| 227 | +==================== |
| 228 | + |
| 229 | +ACRNLog application reads log messages from sbuf for each physical |
| 230 | +CPU and combines them into log files with log messages in ascending |
| 231 | +order by the global sequence number. If the sequence number is not |
| 232 | +continuous, a warning of "incontinuous logs" will be inserted. |
| 233 | + |
| 234 | +To avoid using up storage space, the size of a single log file and |
| 235 | +the total number of log files are both limited. By default, log file |
| 236 | +size limitation is 1MB and file number limitation is 4. |
| 237 | + |
| 238 | +If there are last log devices, ACRN log will read out the log |
| 239 | +messages, combine them, and save them into last log files. |
| 240 | + |
| 241 | +See :ref:`acrnlog` for usage details. |
0 commit comments