# **NOVA Microhypervisor Interface Specification**

Udo Steinberg udo@hypervisor.org

January 24, 2022

Copyright © 2006–2011 Udo Steinberg, Technische Universität Dresden

Copyright © 2012–2013 Udo Steinberg, Intel Corporation

Copyright © 2014–2016 Udo Steinberg, FireEye, Inc.

Copyright © 2019–2022 Udo Steinberg, BedRock Systems, Inc.

This specification is provided "as is" and may contain defects or deficiencies which cannot or will not be corrected. The author makes no representations or warranties, either expressed or implied, including but not limited to, warranties of merchantability, fitness for a particular purpose, or non-infringement that the contents of the specification are suitable for any purpose or that any practice or implementation of such contents will not infringe any third party patents, copyrights, trade secrets or other rights.

The specification could include technical inaccuracies or typographical errors. Additions and changes are periodically made to the information therein; these will be incorporated into new versions of the specification, if any.

# **Contents**

| Part / Chapter | Section | Title |
|----------------|---------|-------|
| I  |     | Introduction |
| 1  |     | System Architecture |
| II |     | Basic Abstractions |
| 2  |     | Kernel Objects |
|    | 2.1 | Protection Domain |
|    |     | 2.1.1 Object Space |
|    |     | 2.1.2 Memory Space |
|    |     | 2.1.3 I/O Port Space |
|    |     | 2.1.4 MSR Space |
|    | 2.2 | Execution Context |
|    | 2.3 | Scheduling Context |
|    | 2.4 | Portal |
|    | 2.5 | Semaphore |
| 3  |     | […] Resource[…] |
|    | 3.1 | System Time Counter |
| 4  |     | Data Types |
|    | 4.1 | Capability |
|    |     | 4.1.1 Null Capability |
|    |     | 4.1.2 Object Capability |
|    |     | 4.1.2.1 PD Object Capability |
|    |     | 4.1.2.2 EC Object Capability |
|    |     | 4.1.2.3 SC Object Capability |
|    |     | 4.1.2.4 PT Object Capability |
|    |     | 4.1.2.5 SM Object Capability |
|    |     | 4.1.3 […] Capability |
|    |     | 4.1.4 […] Capability |
|    |     | 4.1.5 […] Capability |
|    | 4.2 | […]tor |
|    | 4.3 | […] |
|    |     | 4.3.1 […] Layout |
|    |     | 4.3.2 […]tural Layout |
|    | 4.4 | […]er Descriptor |
|    |     | 4.4.1 […] IPC |
|    |     | 4.4.2 […]tural IPC |
| 5  |     | Hypercalls |
|    | 5.1 | […]tions |
|    |     | 5.1.1 […]ll Numbers |
|    |     | 5.1.2 […]odes |
|    |     | 5.1.3 […]ype |
|    |     | 5.1.4 Access […] |
|    | 5.2 | Communication |
|    |     | 5.2.1 IPC Call |
|    |     | 5.2.2 IPC Reply |
|    | 5.3 | Object Creation |
|    |     | 5.3.1 Create Protection Domain |
|    |     | 5.3.2 Create Execution Context |
|    |     | 5.3.3 Create Scheduling Context |
|    |     | 5.3.4 Create Portal |
|    |     | 5.3.5 Create Semaphore |
|    | 5.4 | Object Control |
|    |     | 5.4.1 Control Protection Domain |
|    |     | 5.4.2 Control Execution Context |
|    |     | 5.4.3 Control Scheduling Context |
|    |     | 5.4.4 Control Portal |
|    |     | 5.4.5 Control Semaphore |
|    | 5.5 | Platform Management |
|    |     | 5.5.1 Control Power Management |
|    |     | 5.5.2 Assign Interrupt |
|    |     | 5.5.3 Assign Device |
| 6  |     | Booting |
|    | 6.1 | Microhypervisor |
|    |     | 6.1.1 ELF Image Loading |
|    |     | 6.1.2 Platform Resource Access |
|    | 6.2 | Root Protection Domain |
|    |     | 6.2.1 ELF Image Format |
|    |     | 6.2.2 Initial Configuration |
|    |     | 6.2.2.1 Object Space |
|    |     | 6.2.2.2 Memory Space |
|    | 6.3 | Hypervisor Information Page |
| IV |     | Application Binary Interface |
| 7  |     | ABI aarch64 |
|    | 7.1 | Boot State |
|    |     | 7.1.1 NOVA Microhypervisor |
|    |     | 7.1.1.1 Multiboot v2 Launch |
|         | Ap<br>ABI                             | pplication Binary Interface  aarch64  Boot State  7.1.1 NOVA Microhypervisor  7.1.1.1 Multiboot v2 Launch  7.1.1.2 Multiboot v1 Launch                                                                                                                                                                                                                                                                                                                                                                                                                                | 37 39 40 40 40 40 40                                  |
|         | Ap<br>ABI                             | ## Splication Binary Interface  ### aarch64  Boot State                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | 37 39 40 40 40 40 40 40 40                            |
|         | <b>Ap</b><br><b>ABI</b><br>7.1        | aarch64 Boot State 7.1.1 NOVA Microhypervisor 7.1.1.1 Multiboot v2 Launch 7.1.1.2 Multiboot v1 Launch 7.1.1.3 Legacy Launch 7.1.2 Root Protection Domain                                                                                                                                                                                                                                                                                                                                                                                                              | 37 39 40 40 40 40 40 40 41                            |
|         | <b>Ap</b><br><b>ABI</b><br>7.1        | aarch64 Boot State 7.1.1 NOVA Microhypervisor 7.1.1.1 Multiboot v2 Launch 7.1.1.2 Multiboot v1 Launch 7.1.1.3 Legacy Launch 7.1.1.2 Root Protection Domain Protected Resources 7.2.1 Memory Space                                                                                                                                                                                                                                                                                                                                                                     | 37 39 40 40 40 40 40 41 42                            |
|         | <b>Ap</b><br><b>ABI</b><br>7.1        | aarch64 Boot State 7.1.1 NOVA Microhypervisor 7.1.1.1 Multiboot v2 Launch 7.1.1.2 Multiboot v1 Launch 7.1.1.3 Legacy Launch 7.1.2 Root Protection Domain Protected Resources 7.2.1 Memory Space Physical Memory                                                                                                                                                                                                                                                                                                                                                       | 37 40 40 40 40 40 41 42 42                            |
|         | <b>Ap</b><br><b>ABI</b><br>7.1        | aarch64 Boot State 7.1.1 NOVA Microhypervisor 7.1.1.1 Multiboot v2 Launch 7.1.1.2 Multiboot v1 Launch 7.1.1.3 Legacy Launch 7.1.1.2 Root Protection Domain Protected Resources 7.2.1 Memory Space Physical Memory 7.3.1 Memory Map                                                                                                                                                                                                                                                                                                                                    | 37 40 40 40 40 40 41 42 42 42                         |
|         | <b>Ap ABI</b> 7.1  7.2  7.3           | pplication Binary Interface  aarch64  Boot State 7.1.1 NOVA Microhypervisor 7.1.1.1 Multiboot v2 Launch 7.1.1.2 Multiboot v1 Launch 7.1.1.3 Legacy Launch 7.1.2 Root Protection Domain  Protected Resources 7.2.1 Memory Space Physical Memory 7.3.1 Memory Map  Virtual Memory                                                                                                                                                                                                                                                                                       | 37 40 40 40 40 41 42 42 42 42 42                      |
|         | <b>Ap ABI</b> 7.1  7.2  7.3           | pplication Binary Interface  aarch64  Boot State 7.1.1 NOVA Microhypervisor 7.1.1.1 Multiboot v2 Launch 7.1.1.2 Multiboot v1 Launch 7.1.1.3 Legacy Launch 7.1.2 Root Protection Domain  Protected Resources 7.2.1 Memory Space Physical Memory 7.3.1 Memory Map Virtual Memory 7.4.1 Cacheability Attributes                                                                                                                                                                                                                                                          | 37 40 40 40 40 41 42 42 42 42 42 42                   |
|         | <b>Ap ABI</b> 7.1  7.2  7.3           | pplication Binary Interface  aarch64  Boot State  7.1.1 NOVA Microhypervisor  7.1.1.1 Multiboot v2 Launch  7.1.1.2 Multiboot v1 Launch  7.1.1.3 Legacy Launch  7.1.2 Root Protection Domain  Protected Resources  7.2.1 Memory Space Physical Memory  7.3.1 Memory Map  Virtual Memory  7.4.1 Cacheability Attributes  7.4.2 Shareability Attributes                                                                                                                                                                                                                  | 37 40 40 40 40 41 42 42 42 42 42                      |
|         | <b>Api</b> 7.1 7.2 7.3 7.4            | arch64 Boot State 7.1.1 NOVA Microhypervisor 7.1.1.1 Multiboot v2 Launch 7.1.1.2 Multiboot v1 Launch 7.1.1.3 Legacy Launch 7.1.1.3 Legacy Launch 7.1.1 Memory Space Physical Memory 7.3.1 Memory Map Virtual Memory 7.4.1 Cacheability Attributes 7.4.2 Shareability Attributes Event-Specific Capability Selectors                                                                                                                                                                                                                                                   | 37 40 40 40 40 41 42 42 42 42 42 42 42 42             |
|         | <b>Api</b> 7.1 7.2 7.3 7.4            | pplication Binary Interface  aarch64  Boot State 7.1.1 NOVA Microhypervisor 7.1.1.1 Multiboot v2 Launch 7.1.1.2 Multiboot v1 Launch 7.1.1.3 Legacy Launch 7.1.2 Root Protection Domain  Protected Resources 7.2.1 Memory Space Physical Memory 7.3.1 Memory Map Virtual Memory 7.4.1 Cacheability Attributes 7.4.2 Shareability Attributes Event-Specific Capability Selectors 7.5.1 Architectural Events                                                                                                                                                             | 37 40 40 40 40 41 42 42 42 42 42 42 42 43             |
|         | <b>Ap ABI</b> 7.1  7.2  7.3  7.4  7.5 | plication Binary Interface  aarch64  Boot State 7.1.1 NOVA Microhypervisor 7.1.1.1 Multiboot v2 Launch 7.1.1.2 Multiboot v1 Launch 7.1.1.3 Legacy Launch 7.1.2 Root Protection Domain  Protected Resources 7.2.1 Memory Space Physical Memory 7.3.1 Memory Map Virtual Memory 7.4.1 Cacheability Attributes 7.4.2 Shareability Attributes Event-Specific Capability Selectors 7.5.1 Architectural Events 7.5.2 Microhypervisor Events                                                                                                                                 | 37 40 40 40 40 41 42 42 42 42 42 43 43 44             |
|         | <b>Api</b> 7.1 7.2 7.3 7.4            | aarch64 Boot State 7.1.1 NOVA Microhypervisor 7.1.1.1 Multiboot v2 Launch 7.1.1.2 Multiboot v1 Launch 7.1.1.3 Legacy Launch 7.1.2 Root Protection Domain Protected Resources 7.2.1 Memory Space Physical Memory 7.3.1 Memory Map Virtual Memory 7.4.1 Cacheability Attributes 7.4.2 Shareability Attributes Event-Specific Capability Selectors 7.5.1 Architectural Events 7.5.2 Microhypervisor Events Architecture-Dependent Structures                                                                                                                             | 37 39 40 40 40 40 41 42 42 42 42 42 42 43 43 44 45    |
|         | <b>Ap ABI</b> 7.1  7.2  7.3  7.4  7.5 | aarch64 Boot State 7.1.1 NOVA Microhypervisor 7.1.1.2 Multiboot v2 Launch 7.1.1.3 Legacy Launch 7.1.1.2 Root Protection Domain Protected Resources 7.2.1 Memory Space Physical Memory 7.3.1 Memory Map Virtual Memory 7.4.1 Cacheability Attributes 7.4.2 Shareability Attributes Event-Specific Capability Selectors 7.5.1 Architectural Events 7.5.2 Microhypervisor Events Architecture-Dependent Structures 7.6.1 Hypervisor Information Page                                                                                                                     | 37 39 40 40 40 40 41 42 42 42 42 42 42 43 43 44 45 45 |
|         | <b>Ap ABI</b> 7.1  7.2  7.3  7.4  7.5 | arch64 Boot State 7.1.1 NOVA Microhypervisor 7.1.1.1 Multiboot v2 Launch 7.1.1.2 Multiboot v1 Launch 7.1.1.3 Legacy Launch 7.1.2 Root Protection Domain Protected Resources 7.2.1 Memory Space Physical Memory 7.3.1 Memory Map Virtual Memory 7.4.1 Cacheability Attributes Fvent-Specific Capability Selectors 7.5.1 Architectural Events 7.5.2 Microhypervisor Events Architecture-Dependent Structures 7.6.1 Hypervisor Information Page 7.6.2 User Thread Control Block                                                                                          | 37 39 40 40 40 40 41 42 42 42 42 42 42 43 43 44 45 46 |
|         | <b>Api</b> 7.1 7.2 7.3 7.4 7.5 7.6    | plication Binary Interface  aarch64  Boot State 7.1.1 NOVA Microhypervisor 7.1.1.1 Multiboot v2 Launch 7.1.1.2 Multiboot v1 Launch 7.1.1.3 Legacy Launch 7.1.2 Root Protection Domain  Protected Resources 7.2.1 Memory Space Physical Memory 7.3.1 Memory Map Virtual Memory 7.4.1 Cacheability Attributes 7.4.2 Shareability Attributes 7.4.2 Shareability Attributes 7.5.1 Architectural Events 7.5.2 Microhypervisor Events Architecture-Dependent Structures 7.6.1 Hypervisor Information Page 7.6.2 User Thread Control Block 7.6.3 Message Transfer Descriptor | 37  40 40 40 40 41 42 42 42 42 42 42 45 45 46 47      |
|         | <b>Ap ABI</b> 7.1  7.2  7.3  7.4  7.5 | arch64 Boot State 7.1.1 NOVA Microhypervisor 7.1.1.1 Multiboot v2 Launch 7.1.1.2 Multiboot v1 Launch 7.1.1.3 Legacy Launch 7.1.2 Root Protection Domain Protected Resources 7.2.1 Memory Space Physical Memory 7.3.1 Memory Map Virtual Memory 7.4.1 Cacheability Attributes Fvent-Specific Capability Selectors 7.5.1 Architectural Events 7.5.2 Microhypervisor Events Architecture-Dependent Structures 7.6.1 Hypervisor Information Page 7.6.2 User Thread Control Block                                                                                          | 37 39 40 40 40 40 41 42 42 42 42 42 42 43 43 44 45 46 |

| 8 | ABI x86-64   |                            |
|---|--------------|----------------------------|
|   | 8.1          | Boot State                 |
|   |              | 8.1.1 NOVA Microhypervisor |
|   | 8.2          |                            |
|   | 8.3          |                            |
|   | 8.4          | Virtual Memory             |
|   | 8.5          |                            |
|   | 8.6          |                            |
|   | 8.7          | Calling Convention         |
| V | Appendix     |                            |
| A | Acronyms     |                            |
| B | Bibliography |                            |
| C | Console      |                            |
|   | C.1          | Memory-Buffer Console      |
|   | C.2          | UART Console               |
| D | Download     |                            |

## **Notation**

The key words **must**, **must not**, **required**, **should**, **should not**, **recommended**, **may** and **optional** in this document are to be interpreted as described in RFC 2119 [1].

Throughout this document, the following symbols are used:

- Indicates that the value of this parameter or field is **undefined**. Future versions of this specification may define a meaning for the parameter or field.
- \_ Indicates that the value of this parameter or field is **ignored**. Future versions of this specification may define a meaning for the parameter or field.
- Indicates that the value of this parameter or field is **unchanged**. The microhypervisor will preserve the value across hypercalls.

# Part I Introduction

## 1 System Architecture

The NOVA OS Virtualization Architecture [2] (NOVA) facilitates the coexistence of multiple legacy guest operating systems and a user-mode host framework on a single platform. The core system leverages hardware virtualization technology provided by modern x86 or ARM platforms and comprises the NOVA microhypervisor and one or more Virtual-Machine Monitors (VMMs).



Figure 1.1: System Architecture

Figure 1.1 shows the structure of the system. The microhypervisor is the only component executing in privileged host/kernel mode. It isolates the various user-mode components, including the virtual-machine monitors, from one another by placing them in different protection domains in unprivileged host/user mode. Each legacy guest operating system runs in its own virtual-machine environment in guest mode and is therefore isolated from the other components.

Besides spatial and temporal isolation, the microhypervisor also provides mechanisms for partitioning and delegating platform resources, such as CPU time, physical memory, I/O ports, and hardware interrupts, and for establishing communication channels and signaling between different protection domains.

The virtual-machine monitors handle virtualization events and implement virtual devices that enable legacy guest operating systems to function in the same manner as they would on bare-metal hardware. Providing this functionality outside the microhypervisor in the VMMs reduces the size of the trusted computing base significantly for all components that do not require virtualization support.

The architecture and interfaces of the VMM and the user-mode host framework are not described in this document.

# Part II Basic Abstractions

## 2 Kernel Objects

#### 2.1 Protection Domain

- 1. The Protection Domain (PD) is a unit of protection and spatial isolation.
- 2. Access to a Protection Domain is controlled by a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>).
- 3. A Protection Domain is composed of a set of spaces that store Capabilities (CAPs) to kernel objects or platform resources that can be accessed by Execution Contexts (ECs) within that PD. Not all spaces are available on all architectures (see 5.1.3 for details). The following subsections detail all spaces.

#### 2.1.1 Object Space

- 1. An Object Capability Selector (SEL<sub>OBJ</sub>) serves as index into the Object Space and selects a slot.
- 2. Each slot of the Object Space contains either a Null Capability (CAP<sub>0</sub>) or an Object Capability (CAP<sub>OBJ</sub>) that refers to a kernel object.
- 3. Each hypercall issued from within the PD explicitly specifies the SEL<sub>OBJ</sub> to select the CAP<sub>OBJ</sub> for the kernel object on which it operates.

#### 2.1.2 Memory Space

- 1. A Memory Capability Selector (SEL<sub>MEM</sub>) serves as index into the Memory Space and selects a slot.
- 2. Each slot of the Memory Space contains either a Null Capability (CAP<sub>0</sub>) or a Memory Capability (CAP<sub>MEM</sub>) that refers to a 4 KiB page frame in physical memory.
- 3. Each memory access issued from within the PD implicitly uses the virtual page number (VirtAddr >> 12) of the access as SEL<sub>MEM</sub> to select the CAP<sub>MEM</sub> for the 4 KiB page frame on which it operates.
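As a minimal sketch of the selector derivation described above, the following C helpers split a virtual address into the implied memory capability selector and the byte offset within the 4 KiB page frame. The function and macro names are illustrative assumptions and not part of the NOVA interface.

```c
#include <stdint.h>

/* 4 KiB page frames: 2^12 bytes per page. */
#define PAGE_SHIFT 12u

/* Memory capability selector implied by a virtual address,
 * i.e. the virtual page number (VirtAddr >> 12). */
static inline uint64_t sel_mem_from_vaddr(uint64_t vaddr)
{
    return vaddr >> PAGE_SHIFT;
}

/* Byte offset of the access within its 4 KiB page frame. */
static inline uint64_t page_offset(uint64_t vaddr)
{
    return vaddr & ((UINT64_C(1) << PAGE_SHIFT) - 1);
}
```

All addresses within the same 4 KiB page map to the same selector, so a single Memory Capability governs every access to that page.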

#### 2.1.3 I/O Port Space

- 1. An I/O Port Capability Selector (SEL<sub>PIO</sub>) serves as index into the I/O Port Space and selects a slot.
- 2. Each slot of the I/O Port Space contains either a Null Capability (CAP<sub>0</sub>) or an I/O Port Capability (CAP<sub>PIO</sub>) that refers to the physical I/O port corresponding to the slot number.
- 3. Each I/O access (IN/OUT instruction) issued from within the PD implicitly uses the I/O port number of the access as SEL<sub>PIO</sub> to select the CAP<sub>PIO</sub> for the I/O port on which it operates.

## 2.1.4 MSR Space

- 1. An MSR Capability Selector (SEL<sub>MSR</sub>) serves as index into the MSR Space and selects a slot.
- 2. Each slot of the MSR Space contains either a Null Capability (CAP<sub>0</sub>) or an MSR Capability (CAP<sub>MSR</sub>) that refers to the physical MSR corresponding to the slot number.
- 3. Each MSR access (RDMSR/WRMSR instruction) issued from within the PD implicitly uses the MSR number of the access as SEL<sub>MSR</sub> to select the CAP<sub>MSR</sub> for the MSR on which it operates.

#### 2.2 Execution Context

- 1. The Execution Context (EC) is an abstraction for an activity within a PD.
- 2. Access to an Execution Context is controlled by an EC Object Capability (CAP<sub>OBJ<sub>EC</sub></sub>).
- 3. An EC is permanently bound to exactly one physical CPU.
- 4. An EC is permanently bound to the PD for which it was created.
- 5. There exist three types of Execution Context:
  - Local Threads: these may optionally have PTs (but not SCs) bound to them.
  - Global Threads: these may optionally have an SC (but not PTs) bound to them.
  - Virtual CPUs: these may optionally have an SC (but not PTs) bound to them.
- 6. An EC comprises the following state:
  - Reference to bound PD (2.1)
  - Event Selector Base [ARM, x86] (SEL<sub>EVT</sub>)
  - User Thread Control Block [ARM, x86] (UTCB) (4.3)
  - Central Processing Unit (CPU) registers (architecture dependent)
  - Floating Point Unit (FPU) registers (architecture dependent)

## 2.3 Scheduling Context

- 1. The Scheduling Context (SC) is a unit of prioritization and temporal isolation.
- 2. Access to a Scheduling Context is controlled by an SC Object Capability (CAP<sub>OBJ<sub>SC</sub></sub>).
- 3. An SC is permanently bound to exactly one physical CPU.
- 4. An SC is permanently bound to the EC for which it was created.
- 5. Donation allows another EC to consume the budget of the SC for the duration of the donation.
- 6. A scheduling context comprises the following state:
  - Reference to bound EC (2.2)
  - Scheduling priority: numerically higher priorities always preempt numerically lower priorities
  - Scheduling budget: time after which the SC can be preempted by an SC with the same priority
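The priority and budget rules above can be illustrated with a small decision function. This is only a sketch: the structure, its field names, and the budget unit are hypothetical, and the actual scheduler of the microhypervisor may apply further rules that this section does not describe.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of the SC state listed above (names are illustrative). */
struct sc_view {
    unsigned prio;        /* scheduling priority; numerically higher wins */
    uint64_t budget_left; /* remaining budget, in an unspecified time unit */
};

/* Returns true if 'cand' may preempt the currently running 'cur'. */
static bool sc_may_preempt(const struct sc_view *cur, const struct sc_view *cand)
{
    if (cand->prio > cur->prio)
        return true;                  /* higher priority always preempts */
    if (cand->prio == cur->prio)
        return cur->budget_left == 0; /* equal priority: only once the budget is used up */
    return false;                     /* lower priority never preempts */
}
```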

#### 2.4 Portal

- 1. A Portal (PT) represents a dedicated entry point into the PD for which the portal was created.
- 2. Access to a Portal is controlled by a PT Object Capability (CAP<sub>OBJ<sub>PT</sub></sub>).
- 3. A PT is permanently bound to the EC for which it was created.
- 4. A portal comprises the following state:
  - Reference to bound EC (2.2)
  - Message Transfer Descriptor [ARM, x86] (MTD) (4.4)
  - Entry Instruction Pointer (IP)
  - Portal Identifier (PID)

## 2.5 Semaphore

- 1. A Semaphore (SM) provides a means to synchronize execution and interrupt delivery by selectively blocking and unblocking Execution Contexts (ECs).
- 2. Access to a Semaphore is controlled by an SM Object Capability (CAP<sub>OBJ<sub>SM</sub></sub>).

## 3 Hardware Resources

## 3.1 System Time Counter

The system time is represented by an unsigned 64-bit System Time Counter (STC) with the following properties:

- 1. The STC starts with a power-on value of 0.
- 2. Subsequent reads of the STC return a higher value that reflects the platform uptime.
- 3. While the platform is in a shallow sleep state, the STC retains its current value.
- 4. While the platform is running, the STC monotonically increments at a fixed frequency, which is conveyed in the Hypervisor Information Page [ARM, x86] (HIP).
- 5. The STC and its frequency are synchronized across all CPUs. Applications can use both values to convert between system time and wall clock time.
- 6. Applications can obtain the current STC value as follows:

**ARM:** By reading CNTVCT\_EL0 via the MRS instruction [3].

**x86:** By reading IA32\_TSC via the RDTSC instruction [4, 5].
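Given an STC value and the STC frequency from the HIP, elapsed time since power-on follows by division. The sketch below converts ticks to nanoseconds; the function name is an assumption, and how the frequency is read from the HIP is outside the scope of this example.

```c
#include <stdint.h>

#define NSEC_PER_SEC UINT64_C(1000000000)

/* Convert STC ticks to nanoseconds for a counter running at freq_hz.
 * freq_hz must be nonzero. Whole seconds and the sub-second remainder are
 * converted separately so that ticks * 10^9 is never computed directly;
 * the remainder term stays within 64 bits for frequencies up to
 * roughly 18 GHz. */
static uint64_t stc_to_ns(uint64_t ticks, uint64_t freq_hz)
{
    uint64_t secs = ticks / freq_hz;
    uint64_t rem  = ticks % freq_hz;
    return secs * NSEC_PER_SEC + rem * NSEC_PER_SEC / freq_hz;
}
```

To relate system time to wall clock time, an application could record one (STC value, wall clock time) pair and add the converted STC difference to that reference point.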

# Part III Application Programming Interface

# 4 Data Types

## 4.1 Capability

A Capability (CAP) is a reference to a resource coupled with auxiliary data, such as access permissions.

Capabilities are opaque and immutable for applications – they cannot be inspected or modified directly; instead applications refer to a Capability via a Capability Selector (SEL).

#### 4.1.1 Null Capability

A Null Capability (CAP<sub>0</sub>) does not refer to anything and carries no permissions.

#### 4.1.2 Object Capability

An Object Capability (CAP<sub>OBJ</sub>) is stored in the Object Space (SPC<sub>OBJ</sub>) of a PD and refers to a kernel object.

#### 4.1.2.1 PD Object Capability

A PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) refers to a Protection Domain (PD) and carries the following permissions:

```
CTRL      ctrl_pd permitted if set.
PD        create_pd permitted if set.
EC PT SM  create_ec, create_pt, create_sm permitted if set.
SC        create_sc permitted if set.
ASSIGN    assign_dev permitted if set.
```

### 4.1.2.2 EC Object Capability

An EC Object Capability (CAP<sub>OBJ<sub>EC</sub></sub>) refers to an Execution Context (EC) and carries the following permissions:

#### 4.1.2.3 SC Object Capability

An SC Object Capability (CAP<sub>OBJ<sub>SC</sub></sub>) refers to a Scheduling Context (SC) and carries the following permissions:



#### 4.1.2.4 PT Object Capability

A PT Object Capability (CAP<sub>OBJ<sub>PT</sub></sub>) refers to a Portal (PT) and carries the following permissions:



CTRL ctrl\_pt permitted if set.

CALL ipc\_call permitted if set.

EVENT Delivery of events permitted if set.

#### 4.1.2.5 SM Object Capability

An SM Object Capability (CAP<sub>OBJ<sub>SM</sub></sub>) refers to a Semaphore (SM) and carries the following permissions:



CTRL<sub>UP</sub> ctrl\_sm (Up) permitted if set.

CTRL<sub>DN</sub> ctrl\_sm (Down) permitted if set.

ASSIGN assign\_int permitted if set.

## 4.1.3 Memory Capability

A Memory Capability (CAP<sub>MEM</sub>) is stored in the Memory Space (SPC<sub>MEM</sub>) of a PD, refers to a 4 KiB page frame, and carries the following permissions:



R the page frame is readable if set.

W the page frame is writable if set.

$X_U$ <sup>‡</sup> the page frame is executable (in user mode) if set.

$X_S$ the page frame is executable (in supervisor mode) if set.

#### 4.1.4 I/O Port Capability

An I/O Port Capability (CAP<sub>PIO</sub>) is stored in the I/O Port Space (SPC<sub>PIO</sub>) of a PD, refers to an I/O port, and carries the following permissions:



A the I/O port is accessible (via IN/OUT) if set.

#### 4.1.5 MSR Capability

An MSR Capability (CAP<sub>MSR</sub>) is stored in the MSR Space (SPC<sub>MSR</sub>) of a PD, refers to a Model-Specific Register (MSR), and carries the following permissions:



R the MSR is readable (via RDMSR) if set.

W the MSR is writable (via WRMSR) if set.

<sup>†</sup> This permission bit is only defined for interrupt semaphores.

<sup>‡</sup> If the hardware supports only combined execute permissions (X) for both modes, then $X = X_U \vee X_S$.

## 4.2 Capability Selector

A Capability Selector (SEL) is an application-visible unsigned number as follows:

- An Object Capability Selector (SEL<sub>OBJ</sub>) indexes into the Object Space (SPC<sub>OBJ</sub>) of a Protection Domain (PD) and selects a slot that contains either a Null Capability (CAP<sub>0</sub>) or an Object Capability (CAP<sub>OBJ</sub>).
- A Memory Capability Selector (SEL<sub>MEM</sub>) indexes into the Memory Space (SPC<sub>MEM</sub>) of a Protection Domain (PD) and selects a slot that contains either a Null Capability (CAP<sub>0</sub>) or a Memory Capability (CAP<sub>MEM</sub>).
- An I/O Port Capability Selector (SEL<sub>PIO</sub>) indexes into the I/O Port Space (SPC<sub>PIO</sub>) of a Protection Domain (PD) and selects a slot that contains either a Null Capability (CAP<sub>0</sub>) or an I/O Port Capability (CAP<sub>PIO</sub>).
- An MSR Capability Selector (SEL<sub>MSR</sub>) indexes into the MSR Space (SPC<sub>MSR</sub>) of a Protection Domain (PD) and selects a slot that contains either a Null Capability (CAP<sub>0</sub>) or an MSR Capability (CAP<sub>MSR</sub>).

#### 4.3 User Thread Control Block

Each host EC (local/global thread) has its own User Thread Control Block [ARM, x86] (UTCB), which is mapped into the Memory Space (SPC<sub>MEM</sub>) of the PD in which that EC is executing. A guest EC (virtual CPU) does not have a UTCB.

A User Thread Control Block [ARM, x86] has a size of one memory page (4 KiB). Because a UTCB is owned by the microhypervisor, it cannot be delegated using ctrl\_pd.

To ensure proper visibility of loads and stores with relaxed memory ordering, application programs are expected to access a UTCB only from the EC to which that UTCB is bound.

#### 4.3.1 Regular Layout

During regular IPC (see 4.4.1), the UTCB is used for data transfer and has a regular layout with 512 message words.



The data transfer from one UTCB to another UTCB is defined as follows:

- The data transfer is performed by the CPU on which the caller EC and callee EC execute.
- The data transfer uses the regular layout.
- The data is copied from low words to high words, beginning with word<sub>0</sub>.
- The granularity of the loads and stores used for copying is undefined.
- Loads from and stores to the UTCB are non-atomic and use relaxed memory ordering.
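The transfer rules above can be sketched in C as follows. The word width is an assumption derived from the stated sizes (4 KiB divided by 512 message words gives 64-bit words), and the structure name, function name, and clamping behavior are illustrative rather than taken from this specification.

```c
#include <stddef.h>
#include <stdint.h>

#define UTCB_WORDS 512u

/* Assumed regular layout: 512 message words filling one 4 KiB page. */
struct utcb_regular {
    uint64_t word[UTCB_WORDS];
};

_Static_assert(sizeof(struct utcb_regular) == 4096,
               "regular layout is assumed to occupy exactly one 4 KiB page");

/* Copy 'count' message words from src to dst, from low words to high
 * words, beginning with word 0. Plain (non-atomic) loads and stores are
 * used. Clamping 'count' to 512 is a choice made for this sketch.
 * Returns the number of words actually copied. */
static size_t utcb_transfer(struct utcb_regular *dst,
                            const struct utcb_regular *src, size_t count)
{
    if (count > UTCB_WORDS)
        count = UTCB_WORDS;
    for (size_t i = 0; i < count; i++)
        dst->word[i] = src->word[i];
    return count;
}
```

Because the copy granularity and ordering guarantees are weak, a receiver should only rely on the message contents after the IPC operation has completed.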

#### 4.3.2 Architectural Layout

During architectural IPC (see 4.4.2), the UTCB is used for state transfer and has an architectural layout (ARM, x86).

The state transfer between the architectural registers and a UTCB is defined as follows:

- The state transfer is performed by the CPU on which the affected EC and callee EC execute.
- The state transfer uses the architectural layout.
- The state is copied between architectural registers and the UTCB in an undefined order.
- The granularity of the loads and stores used for copying is **undefined**.
- Loads from and stores to the UTCB are **non-atomic** and use **relaxed** memory ordering.

## 4.4 Message Transfer Descriptor

#### 4.4.1 Regular IPC

For regular Inter-Process Communication (IPC), the Message Transfer Descriptor [ARM, x86] (MTD) is provided by the sender, passed to the receiver, and uses the following layout:



The MTD controls the data transfer (see 4.3.1) as shown in Figure 4.1:

- During ipc\_call, it specifies the number of message words to transfer from the UTCB of the caller EC (sender) to the UTCB of the callee EC (receiver).
- During ipc\_reply, it specifies the number of message words to transfer from the UTCB of the callee EC (sender) to the UTCB of the caller EC (receiver).



Figure 4.1: Regular IPC

#### 4.4.2 Architectural IPC

For exceptions and intercepts, the Message Transfer Descriptor [ARM, x86] (MTD) is provided by the architectural event-specific portal (ARM, x86) or sender, passed to the receiver, and uses an architectural bitfield layout (ARM, x86):

- If a bit is 0, then the microhypervisor does **not** transmit the architectural state associated with that bit.
- If a bit is 1, then the microhypervisor transmits the architectural state associated with that bit.
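The bit semantics above amount to a simple mask test. The sketch below assumes the MTD fits into a 64-bit value and uses a plain bit index; the actual bit assignments are architecture-specific and defined in the ABI chapters, so no register names are assumed here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the architectural state associated with bit position
 * 'bit' is selected for transfer by the given MTD value. The 64-bit
 * container type is an assumption of this sketch. */
static inline bool mtd_transfers(uint64_t mtd, unsigned bit)
{
    return bit < 64 && ((mtd >> bit) & 1u) != 0;
}
```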

The MTD controls the state transfer (see 4.3.2) as shown in Figure 4.2:

- During an exception/intercept, it specifies the subset of registers to transfer from the architectural state of the affected EC (sender) to the UTCB of the callee EC (receiver).
- During ipc\_reply, it specifies the subset of registers to transfer from the UTCB of the callee EC (sender) to the architectural state of the affected EC (receiver).



Figure 4.2: Architectural IPC
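The bitfield semantics above can be sketched as follows. The bit assignments in `MTD_BITS` and the state group names are invented for illustration; the real layout is defined by the architecture-specific binding. The sketch only shows the rule that a set bit transmits the associated state and a clear bit leaves it untouched.

```python
# Hypothetical bit assignments; the real layout is architecture-specific.
MTD_BITS = {0: "gpr", 1: "sp", 2: "ip"}


def transfer_state(source, destination, mtd):
    """Copy each state group whose MTD bit is 1; leave the others untouched."""
    for bit, group in MTD_BITS.items():
        if mtd & (1 << bit):
            destination[group] = source[group]
    return destination


# Exception/intercept: architectural state of the affected EC -> callee UTCB
arch_state = {"gpr": [1, 2, 3], "sp": 0x7000, "ip": 0x1000}
utcb = {"gpr": None, "sp": None, "ip": None}
transfer_state(arch_state, utcb, 0b101)   # transmits "gpr" and "ip" only

# ipc_reply: callee UTCB -> architectural state of the affected EC
utcb["ip"] = 0x1004                       # handler modifies the IP
transfer_state(utcb, arch_state, 0b100)   # writes back "ip" only
```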

# **5 Hypercalls**

## 5.1 Definitions

#### 5.1.1 Hypercall Numbers

Each hypercall is identified by a unique number. The following hypercalls are currently defined:

| Number | Hypercall               | Section |
|--------|-------------------------|---------|
| 0x0    | ipc_call                | 5.2.1   |
| 0x1    | ipc_reply               | 5.2.2   |
| 0x2    | create_pd               | 5.3.1   |
| 0x3    | create_ec               | 5.3.2   |
| 0x4    | create_sc               | 5.3.3   |
| 0x5    | create_pt               | 5.3.4   |
| 0x6    | create_sm               | 5.3.5   |
| 0x7    | ctrl_pd                 | 5.4.1   |
| 0x8    | ctrl_ec                 | 5.4.2   |
| 0x9    | ctrl_sc                 | 5.4.3   |
| 0xa    | ctrl_pt                 | 5.4.4   |
| 0xb    | ctrl_sm                 | 5.4.5   |
| 0xc    | ctrl_pm                 | 5.5.1   |
| 0xd    | assign_int              | 5.5.2   |
| 0xe    | assign_dev              | 5.5.3   |
| 0xf    | reserved for future use |         |
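For user-level code that needs to refer to these numbers symbolically, the table can be mirrored in an enumeration. The Python sketch below is one possible representation; the class name `Hypercall` and the helper `decode_hypercall` are assumptions and not part of the interface.

```python
from enum import IntEnum


class Hypercall(IntEnum):
    """Hypercall numbers as listed in the table above."""
    IPC_CALL = 0x0
    IPC_REPLY = 0x1
    CREATE_PD = 0x2
    CREATE_EC = 0x3
    CREATE_SC = 0x4
    CREATE_PT = 0x5
    CREATE_SM = 0x6
    CTRL_PD = 0x7
    CTRL_EC = 0x8
    CTRL_SC = 0x9
    CTRL_PT = 0xA
    CTRL_SM = 0xB
    CTRL_PM = 0xC
    ASSIGN_INT = 0xD
    ASSIGN_DEV = 0xE


def decode_hypercall(number):
    """Return the Hypercall for a number, or None if it is reserved."""
    try:
        return Hypercall(number)
    except ValueError:
        return None
```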

#### 5.1.2 Status Codes

Hypercalls return a status code to indicate success or failure. The following status codes are currently defined:

| Number      | Status Code             | Description          |
|-------------|-------------------------|----------------------|
| 0x0         | SUCCESS                 | Operation Successful |
| 0x1         | TIMEOUT                 | Operation Timeout    |
| 0x2         | ABORTED                 | Operation Abort      |
| 0x3         | OVRFLOW                 | Operation Overflow   |
| 0x4         | BAD_HYP                 | Invalid Hypercall    |
| 0x5         | BAD_CAP                 | Invalid Capability   |
| 0x6         | BAD_PAR                 | Invalid Parameter    |
| 0x7         | BAD_FTR                 | Invalid Feature      |
| 0x8         | BAD_CPU                 | Invalid CPU Number   |
| 0x9         | BAD_DEV                 | Invalid Device ID    |
| 0xa         | INS_MEM                 | Insufficient Memory  |
| ≥0xb        | reserved for future use |                      |
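A caller typically wants to turn the returned number into something it can branch on. The following Python sketch mirrors the table in an enumeration and adds a helper that raises on anything other than SUCCESS. The names `Status`, `HypercallError` and `check` are hypothetical, and treating reserved values as errors is a modelling choice rather than something this specification prescribes.

```python
from enum import IntEnum


class Status(IntEnum):
    """Status codes as listed in the table above."""
    SUCCESS = 0x0
    TIMEOUT = 0x1
    ABORTED = 0x2
    OVRFLOW = 0x3
    BAD_HYP = 0x4
    BAD_CAP = 0x5
    BAD_PAR = 0x6
    BAD_FTR = 0x7
    BAD_CPU = 0x8
    BAD_DEV = 0x9
    INS_MEM = 0xA


class HypercallError(Exception):
    """Raised by check() for any status other than SUCCESS (hypothetical)."""


def check(raw_status):
    """Translate a raw status number; raise unless it is SUCCESS."""
    try:
        status = Status(raw_status)
    except ValueError:
        raise HypercallError(f"reserved status code {raw_status:#x}") from None
    if status is not Status.SUCCESS:
        raise HypercallError(status.name)
    return status
```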

#### 5.1.3 Space Type

The following table lists the currently defined space types and the architectures for which they are valid (✓):

| Number | TYPE<sub>SPC</sub>      | ARM | x86 | Description    |
|--------|-------------------------|-----|-----|----------------|
| 0x0    | SPC<sub>OBJ</sub>       | ✓   | ✓   | Object Space   |
| 0x1    | SPC<sub>MEM</sub>       | ✓   | ✓   | Memory Space   |
| 0x2    | SPC<sub>PIO</sub>       | ×   | ✓   | I/O Port Space |
| 0x3    | SPC<sub>MSR</sub>       | ×   | ✓   | MSR Space      |
| ≥0x4   | reserved for future use |     |     |                |
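The validity column can be expressed as a simple lookup. This Python sketch is illustrative only; the dictionary `SPACE_TYPES`, the function `space_type_valid` and the architecture strings "ARM" and "x86" are assumed names.

```python
# Space type number -> (name, architectures on which it is valid)
SPACE_TYPES = {
    0x0: ("SPC_OBJ", {"ARM", "x86"}),
    0x1: ("SPC_MEM", {"ARM", "x86"}),
    0x2: ("SPC_PIO", {"x86"}),
    0x3: ("SPC_MSR", {"x86"}),
}


def space_type_valid(number, arch):
    """True if the space type is defined and valid on the given architecture."""
    entry = SPACE_TYPES.get(number)
    return entry is not None and arch in entry[1]
```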

#### 5.1.4 Access Type

The following table lists the currently defined access types and the space types for which they are valid (✓):

| Number | TYPE<sub>ACC</sub>      | SPC<sub>OBJ</sub> | SPC<sub>MEM</sub> | SPC<sub>PIO</sub> | SPC<sub>MSR</sub> | Description           |
|--------|-------------------------|-------------------|-------------------|-------------------|-------------------|-----------------------|
| 0x0    | CPU_HST                 | ✓                 | ✓                 | ✓                 | ×                 | CPU Access from Host  |
| 0x1    | CPU_GST                 | ×                 | ✓                 | ✓                 | ✓                 | CPU Access from Guest |
| 0x2    | DMA_HST                 | ×                 | ✓                 | ×                 | ×                 | DMA Access from Host  |
| 0x3    | DMA_GST                 | ×                 | ✓                 | ×                 | ×                 | DMA Access from Guest |
| ≥0x4   | reserved for future use |                   |                   |                   |                   |                       |

## 5.2 Communication

#### 5.2.1 IPC Call

#### Parameters:

#### Flags:



#### **Description:**

Sends a message from EC<sub>CURRENT</sub> (caller) to the EC (callee) to which the specified Portal (PT) is bound. Prior to the hypercall:

• { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> pt } must refer to a PT Object Capability (CAP<sub>OBJ<sub>PT</sub></sub>) with permission CALL.

If the hypercall completed successfully:

- If **T=0** (**No Timeout**): If the callee **EC** was still busy handling a prior **ipc\_call**, then the caller **EC** has helped run that prior **ipc\_call** to completion, i.e. until the callee **EC** became available again.
- The microhypervisor has transferred a message from the UTCB of the caller EC to the UTCB of the callee EC. The content of that message is defined by the MTD mtd, which has been passed from the caller EC to the callee EC.
- The hypercall returns once the callee EC has issued an ipc\_reply. Upon return, the UTCB of the caller EC and the parameter mtd have been updated by the reply message.
- The Current Scheduling Context (SC<sub>CURRENT</sub>) has been donated to the callee EC upon ipc\_call and returned back upon ipc\_reply, thereby accounting the entire handling of the request to SC<sub>CURRENT</sub>.

#### Status:

#### **SUCCESS**

• The hypercall completed successfully.

#### BAD\_CAP

• { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> pt } did not refer to a PT Object Capability (CAP<sub>OBJ<sub>PT</sub></sub>) or that capability had insufficient permissions.

#### **BAD\_CPU**

• Caller EC and callee EC are on different CPUs.

#### **TIMEOUT**

• The callee EC is still busy handling a prior ipc\_call – only if T=1 (Timeout).

#### **ABORTED**

• The callee EC is dead and the operation aborted.

#### 5.2.2 IPC Reply

#### Parameters:

#### Flags:



#### **Description:**

Sends a reply message from EC<sub>CURRENT</sub> (callee) back to the caller EC (if one exists) and subsequently waits for the next incoming message.

If the hypercall completed successfully:

- If a caller **EC** exists:
  - The microhypervisor has transferred a reply message from the UTCB of the callee EC back to the UTCB of the caller EC.
  - The content of that reply message is defined by the MTD mtd, which has been passed from the callee EC back to the caller EC.
  - The Current Scheduling Context (SC<sub>CURRENT</sub>) that had been donated to the callee EC upon ipc\_call
    has been returned back to the caller EC.
- EC<sub>CURRENT</sub> blocks until the next incoming message arrives on any Portal (PT) bound to it.

#### Status:

This hypercall does not return directly.

Instead, when the next message arrives via a subsequent ipc\_call to any Portal (PT) bound to the callee EC:

- The microhypervisor passes the Portal Identifier (PID) of the called PT to the callee EC.
- The UTCB of the callee EC and the parameter mtd have been updated by the incoming message.
- Execution of the callee EC continues at the Instruction Pointer (IP) configured in the called PT.

## 5.3 Object Creation

#### 5.3.1 Create Protection Domain

#### Parameters:

#### Flags:



#### **Description:**

Creates a new Protection Domain (PD).

Prior to the hypercall:

- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own } must refer to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) with permission PD.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sel } must refer to a Null Capability (CAP<sub>0</sub>).

If the hypercall completed successfully:

- A new Protection Domain (PD) has been created.
- The resources for the created PD were accounted to the PD referred to by { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own }.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sel } refers to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) for the created PD with defined permissions inherited from { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own }.

#### Status:

#### **SUCCESS**

• The hypercall completed successfully.

#### BAD\_CAP

- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own } did not refer to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) or that capability had insufficient permissions.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sel } did not refer to a Null Capability (CAP<sub>0</sub>).

#### INS\_MEM

• { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own } had insufficient memory resources for PD creation.

#### 5.3.2 Create Execution Context

#### Parameters:

```
status = create_ec (SEL_OBJ sel,   // Created EC
                    SEL_OBJ own,   // Owner PD
                    SEL_MEM utcb,  // UTCB Address (Page Number)
                    UINT    cpu,   // CPU Number
                    UINT    sp,    // Initial Stack Pointer
                    SEL_EVT evt);  // Event Selector Base
```

#### Flags:



#### **Description:**

Creates a new Execution Context (EC).

Prior to the hypercall:

- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own } must refer to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) with permission EC.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sel } must refer to a Null Capability (CAP<sub>0</sub>).

If the hypercall completed successfully:

- V=0 (Thread): A new host Execution Context (EC) has been created with its UTCB mapped at virtual page number utcb and its initial Stack Pointer (SP) set to sp.
  - **T=0** (**Local Thread**): Portals (PTs) can subsequently be bound to that EC and the EC will run whenever any of those bound portals is called.
  - T=1 (Global Thread): The EC will generate a startup exception the first time a Scheduling Context (SC) is bound to it.
- V=1 (Virtual CPU): A new guest Execution Context (EC) has been created. The EC will generate a startup exception the first time a Scheduling Context (SC) is bound to it. The parameters utcb and sp were ignored.
  - T=0: The virtual CPU uses no time adjustment.
  - T=1: The virtual CPU uses time offsetting.
- The created EC will be able to use FPU instructions only if the F-flag is set. Otherwise any FPU access by that EC will generate an exception.
- The created EC is bound to the PD referred to by { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own } on CPU cpu with its Event Selector Base [ARM, x86] (SEL<sub>EVT</sub>) set to evt.
- The resources for the created EC were accounted to the PD referred to by { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own }.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sel } refers to an EC Object Capability (CAP<sub>OBJ<sub>EC</sub></sub>) for the created EC with all defined permissions set.

#### Status:

#### **SUCCESS**

• The hypercall completed successfully.

#### BAD\_CAP

- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own } did not refer to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) or that capability had insufficient permissions.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sel } did not refer to a Null Capability (CAP<sub>0</sub>).

#### **BAD\_CPU**

• The CPU number is invalid.

#### **BAD\_FTR**

• Virtual CPUs are not supported on the machine.

#### BAD\_PAR

• The UTCB region is not free or lies outside the user-accessible memory range.

#### INS\_MEM

• { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own } had insufficient memory resources for EC creation.

#### 5.3.3 Create Scheduling Context

#### Parameters:

#### Flags:



#### **Description:**

Creates a new Scheduling Context (SC).

Prior to the hypercall:

- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own } must refer to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) with permission SC.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> ec } must refer to an EC Object Capability (CAP<sub>OBJ<sub>EC</sub></sub>) with permission BIND<sub>SC</sub>.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sel } must refer to a Null Capability (CAP<sub>0</sub>).

If the hypercall completed successfully:

- A new Scheduling Context (SC) has been created.
- The created SC is bound to the EC referred to by { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> ec } on the CPU of that EC with its scheduling parameters set to budget and priority.
- The resources for the created SC were accounted to the PD referred to by { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own }.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sel } refers to an SC Object Capability (CAP<sub>OBJ<sub>SC</sub></sub>) for the created SC with all defined permissions set.

#### Status:

#### **SUCCESS**

• The hypercall completed successfully.

#### BAD\_CAP

- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own } did not refer to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) or that capability had insufficient permissions.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> ec } did not refer to an EC Object Capability (CAP<sub>OBJ<sub>EC</sub></sub>) or that capability had insufficient permissions.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sel } did not refer to a Null Capability (CAP<sub>0</sub>).
- Binding the SC to the EC failed, e.g. because the EC is a local EC.

#### **BAD\_PAR**

• Scheduling budget or priority was zero.

#### INS\_MEM

• { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own } had insufficient memory resources for SC creation.

#### 5.3.4 Create Portal

#### Parameters:

#### Flags:



#### **Description:**

Creates a new Portal (PT).

Prior to the hypercall:

- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own } must refer to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) with permission PT.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> ec } must refer to an EC Object Capability (CAP<sub>OBJ<sub>EC</sub></sub>) with permission BIND<sub>PT</sub>.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sel } must refer to a Null Capability (CAP<sub>0</sub>).

If the hypercall completed successfully:

- A new Portal (PT) has been created.
- The created PT is bound to the EC referred to by { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> ec } on the CPU of that EC, with its portal Instruction Pointer (IP) set to ip, its initial MTD set to 0 and its initial PID set to 0.
- The resources for the created PT were accounted to the PD referred to by { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own }.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sel } refers to a PT Object Capability (CAP<sub>OBJ<sub>PT</sub></sub>) for the created PT with all defined permissions set.

#### Status:

#### **SUCCESS**

• The hypercall completed successfully.

#### BAD\_CAP

- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own } did not refer to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) or that capability had insufficient permissions.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> ec } did not refer to an EC Object Capability (CAP<sub>OBJ<sub>EC</sub></sub>) or that capability had insufficient permissions.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sel } did not refer to a Null Capability (CAP<sub>0</sub>).
- Binding the PT to the EC failed, e.g. because the EC is not a local EC.

#### INS\_MEM

• { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own } had insufficient memory resources for PT creation.

#### 5.3.5 Create Semaphore

#### Parameters:

#### Flags:



#### **Description:**

Creates a new Semaphore (SM).

Prior to the hypercall:

- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own } must refer to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) with permission SM.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sel } must refer to a Null Capability (CAP<sub>0</sub>).

If the hypercall completed successfully:

- A new Semaphore (SM) has been created.
- The created SM has its initial counter value set to cnt.
- The resources for the created SM were accounted to the PD referred to by { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own }.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sel } refers to an SM Object Capability (CAP<sub>OBJ<sub>SM</sub></sub>) for the created SM with all defined permissions set.

#### Status:

#### **SUCCESS**

• The hypercall completed successfully.

#### **BAD\_CAP**

- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own } did not refer to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) or that capability had insufficient permissions.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sel } did not refer to a Null Capability (CAP<sub>0</sub>).

#### **INS\_MEM**

• { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> own } had insufficient memory resources for SM creation.

## 5.4 Object Control

#### 5.4.1 Control Protection Domain

#### Parameters:

```
status = ctrl_pd (SEL_OBJ  spd,   // Protection Domain: Source
                  SEL_OBJ  dpd,   // Protection Domain: Destination
                  SEL      src,   // Base Selector: Source
                  SEL      dst,   // Base Selector: Destination
                  UINT     ord,   // Order
                  UINT     pmm,   // Permission Mask
                  TYPE_SPC spc,   // Space Type
                  TYPE_ACC acc,   // Access Type
                  ATTR_CA  ca,    // Cacheability Attribute
                  ATTR_SH  sh);   // Shareability Attribute
```

#### Flags:



#### **Description:**

Takes capabilities from the Source Protection Domain (PD) and grants them to the Destination Protection Domain (PD), optionally reducing the permissions of the destination capabilities.

Prior to the hypercall:

- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> spd } must refer to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) with permission CTRL.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> dpd } must refer to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) with permission CTRL.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> dpd } must not refer to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) for PD<sub>NOVA</sub>.
- SEL src and SEL dst must be order-aligned, i.e. src=0 (mod 2<sup>ord</sup>) and dst=0 (mod 2<sup>ord</sup>).
- TYPE<sub>SPC</sub> spc and TYPE<sub>ACC</sub> acc must be valid, i.e. supported by the architecture.
- ATTR<sub>CA</sub> ca and ATTR<sub>SH</sub> sh must be valid, i.e. supported by the architecture.
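The order-alignment precondition can be checked before issuing the hypercall. The following Python sketch shows one way to do so; `order_aligned` and `selector_range` are hypothetical helper names. A selector is order-aligned when its low ord bits are all zero.

```python
def order_aligned(sel, order):
    """True if sel is a multiple of 2**order, i.e. its low order bits are 0."""
    return (sel & ((1 << order) - 1)) == 0


def selector_range(base, order):
    """Return the inclusive selector range base ... base + 2**order - 1."""
    if not order_aligned(base, order):
        raise ValueError("base selector is not order-aligned")
    return base, base + (1 << order) - 1
```

For example, with ord=4 a base selector of 0x40 is aligned and covers 0x40 through 0x4f, whereas 0x48 is not aligned.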

If the hypercall completed successfully:

- If spc=SPC<sub>OBJ</sub>: All CAP<sub>OBJ</sub> and CAP<sub>0</sub> from source SEL range { PD spd, SEL<sub>OBJ</sub> src...src+2<sup>ord</sup>-1 } were delegated to destination SEL range { PD dpd, SEL<sub>OBJ</sub> dst...dst+2<sup>ord</sup>-1 }. Any pre-existing CAP<sub>OBJ</sub> in the destination selector range were revoked. The parameters acc, ca and sh were ignored.
- If spc=SPC<sub>MEM</sub>: All CAP<sub>MEM</sub> and CAP<sub>0</sub> from source SEL range { PD spd, SEL<sub>MEM</sub> src...src+2<sup>ord</sup>-1 } were delegated to destination SEL range { PD dpd, SEL<sub>MEM</sub> dst...dst+2<sup>ord</sup>-1 }. Any pre-existing CAP<sub>MEM</sub> in the destination selector range were revoked.
  - **Delegation of Physical Memory:** If spd refers to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) for PD<sub>NOVA</sub>, then the source selectors are physical page numbers (see 6.1.2) and the cacheability and shareability attributes of each destination capability were *set* to ca and sh respectively.
  - **Delegation of Virtual Memory:** If spd refers to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) for any other PD, then the source selectors are virtual page numbers and the cacheability and shareability attributes of each destination capability were *inherited* from the respective source capability, i.e. the parameters ca and sh were ignored.
- If spc=SPC<sub>PIO</sub>: All CAP<sub>PIO</sub> and CAP<sub>0</sub> from source SEL range { PD spd, SEL<sub>PIO</sub> src...src+2<sup>ord</sup>-1 } were delegated to destination SEL range { PD dpd, SEL<sub>PIO</sub> dst...dst+2<sup>ord</sup>-1 }. Any pre-existing CAP<sub>PIO</sub> in the destination selector range were revoked. The parameters ca and sh were ignored.
- If spc=SPC<sub>MSR</sub>: All CAP<sub>MSR</sub> and CAP<sub>0</sub> from source SEL range { PD spd, SEL<sub>MSR</sub> src...src+2<sup>ord</sup>-1 } were delegated to destination SEL range { PD dpd, SEL<sub>MSR</sub> dst...dst+2<sup>ord</sup>-1 }. Any pre-existing CAP<sub>MSR</sub> in the destination selector range were revoked. The parameters ca and sh were ignored.

- The permissions of each destination capability were masked by computing the logical AND of the permissions of the respective source capability and the permission mask pmm, i.e.
  - for bits set (1) in pmm, the respective permissions were *inherited* from the source capability.
  - for bits clear (0) in pmm, the respective permissions were removed for the destination capability.
- If the source capability was a Null Capability (CAP<sub>0</sub>) or if the destination capability has zero permissions after masking, then the destination capability is now a Null Capability (CAP<sub>0</sub>).
- The resources for storing the granted capabilities were accounted to the PD referred to by { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> dpd }.
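The permission masking rule can be stated compactly in code. The Python sketch below is a model, not the microhypervisor's implementation: it assumes permissions are an integer bitmask and represents a Null Capability as `None`. The function name `masked_permissions` is hypothetical.

```python
def masked_permissions(src_perms, pmm):
    """Return the destination permissions, or None for a Null Capability.

    src_perms is None when the source is a Null Capability (modelling choice).
    """
    if src_perms is None:
        return None
    dst_perms = src_perms & pmm        # logical AND with the permission mask
    return dst_perms if dst_perms != 0 else None
```

Because the mask is applied with AND, a delegation can only keep or remove permissions in this model; it can never add permissions that the source capability lacked.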

#### Status:

#### **SUCCESS**

• The hypercall completed successfully.

#### **BAD\_CAP**

- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> spd } did not refer to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) or that capability had insufficient permissions.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> dpd } did not refer to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) or that capability had insufficient permissions.
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> dpd } referred to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) for PD<sub>NOVA</sub>.

#### BAD\_PAR

- SEL src or SEL dst was not order-aligned.
- SEL src+2<sup>ord</sup>-1 or SEL dst+2<sup>ord</sup>-1 was larger than the maximum selector number.
- If spc=SPC<sub>PIO</sub> or spc=SPC<sub>MSR</sub>: SEL src was not equal to SEL dst.
- TYPE<sub>SPC</sub> spc or TYPE<sub>ACC</sub> acc was not valid, i.e. not supported by the architecture.
- ATTR<sub>CA</sub> ca or ATTR<sub>SH</sub> sh was not valid, i.e. not supported by the architecture.

#### INS\_MEM

• { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> dpd } had insufficient memory resources for allocating the storage required for granting all destination capabilities. This constitutes a partial failure of the operation, because all destination capabilities up to the first allocation failure have been granted.

#### 5.4.2 Control Execution Context

#### Parameters:

```
status = ctrl_ec (SEL_OBJ ec);  // Execution Context
```

#### Flags:



#### **Description:**

Prior to the hypercall:

• { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> ec } must refer to an EC Object Capability (CAP<sub>OBJ<sub>EC</sub></sub>) with permission CTRL.

If the hypercall completed successfully:

- The EC referred to by { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> ec } has been forced to enter the microhypervisor. It will generate a recall exception prior to its next exit from the microhypervisor and will traverse through the respective Portal (PT).
- If **S=0** (**Weak**): the hypercall returns as soon as the recall exception has been *pended*, i.e. the EC may not have entered the microhypervisor yet.
- If **S=1** (**Strong**): the hypercall returns as soon as the recall exception has been *observed*, i.e. the EC will have entered the microhypervisor.

#### Status:

#### **SUCCESS**

• The hypercall completed successfully.

#### BAD\_CAP

• { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> ec } did not refer to an EC Object Capability (CAP<sub>OBJ<sub>EC</sub></sub>) or that capability had insufficient permissions.

#### 5.4.3 Control Scheduling Context

#### Parameters:

#### Flags:



#### **Description:**

Prior to the hypercall:

• { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sc } must refer to an SC Object Capability (CAP<sub>OBJ<sub>SC</sub></sub>) with permission CTRL.

If the hypercall completed successfully:

• The microhypervisor has returned the total consumed execution time as System Time Counter (STC) value for the SC referred to by { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sc }.

#### Status:

#### **SUCCESS**

• The hypercall completed successfully.

#### BAD\_CAP

• { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sc } did not refer to an SC Object Capability (CAP<sub>OBJ<sub>SC</sub></sub>) or that capability had insufficient permissions.

#### 5.4.4 Control Portal

#### Parameters:

#### Flags:



#### **Description:**

Prior to the hypercall:

• { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> pt } must refer to a PT Object Capability (CAP<sub>OBJ<sub>PT</sub></sub>) with permission CTRL.

If the hypercall completed successfully:

- The microhypervisor has set the Portal Identifier (PID) to pid and the Message Transfer Descriptor [ARM, x86] (MTD) to mtd for the Portal referred to by { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> pt }.
- Subsequent portal traversals will use the new MTD and return the new PID.

#### Status:

#### **SUCCESS**

• The hypercall completed successfully.

#### BAD\_CAP

• { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> pt } did not refer to a PT Object Capability (CAP<sub>OBJ<sub>PT</sub></sub>) or that capability had insufficient permissions.

#### 5.4.5 Control Semaphore

#### Parameters:

#### Flags:



#### **Description:**

Prior to the hypercall:

- If **D=0** (**Up**): { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sm } must refer to an SM Object Capability (CAP<sub>OBJ<sub>SM</sub></sub>) with permission CTRL<sub>UP</sub>.
- If **D=1** (**Down**): { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sm } must refer to an SM Object Capability (CAP<sub>OBJ<sub>SM</sub></sub>) with permission CTRL<sub>DN</sub>.

If the hypercall completed successfully:

- If **D=0** (**Up**): if there were **EC**s blocked on the semaphore, then the microhypervisor has released one of those blocked **EC**s. Otherwise, the microhypervisor has incremented the semaphore counter. The timeout value and the Z-flag were ignored.
- If **D=1** (**Down**): if the semaphore counter was larger than zero, then the microhypervisor has decremented the semaphore counter (**Z=0**) or set it to zero (**Z=1**). Otherwise, the microhypervisor has blocked EC<sub>CURRENT</sub> on the semaphore. If the timeout value was non-zero, EC<sub>CURRENT</sub> unblocks with a timeout status when the System Time Counter (STC) reaches or exceeds the specified value.

Blocking and releasing of ECs on a semaphore uses the FIFO queueing discipline.
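The up and down behaviour, including the FIFO discipline, can be illustrated with a toy model. The Python sketch below deliberately omits timeouts, overflow detection and capability checks; the class `SemaphoreModel` and its method names are hypothetical, and ECs are represented by plain strings.

```python
from collections import deque


class SemaphoreModel:
    """Toy model of ctrl_sm up/down without timeouts or overflow handling."""

    def __init__(self, cnt=0):
        self.counter = cnt
        self.blocked = deque()  # FIFO queue of blocked EC names

    def up(self):
        """D=0: release the longest-waiting EC, else increment the counter."""
        if self.blocked:
            return self.blocked.popleft()
        self.counter += 1
        return None

    def down(self, ec, zero=False):
        """D=1: consume the counter if positive, else block ec.

        Returns True if the operation completed without blocking.
        """
        if self.counter > 0:
            self.counter = 0 if zero else self.counter - 1
            return True
        self.blocked.append(ec)
        return False


sm = SemaphoreModel(cnt=3)
sm.down("A", zero=True)   # Z=1: counter goes from 3 to 0
sm.down("B")              # counter is 0, so B blocks
sm.down("C")              # C blocks behind B
released = sm.up()        # FIFO: B is released first
```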

#### Status:

#### **SUCCESS**

• The hypercall completed successfully.

#### **TIMEOUT**

• If **D=1**: Down operation aborted when the timeout triggered.

#### **OVRFLOW**

• If **D=0**: Up operation aborted because the semaphore counter would overflow.

#### BAD\_CAP

• { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sm } did not refer to an SM Object Capability (CAP<sub>OBJ<sub>SM</sub></sub>) or that capability had insufficient permissions.

#### **BAD\_CPU**

• If **D=1** on an interrupt semaphore: Attempt to wait for the interrupt on a different CPU than the CPU to which that interrupt has been routed via assign\_int.

## 5.5 Platform Management

#### 5.5.1 Control Power Management

#### Parameters:

```
status = ctrl_pm (UINT state);  // State Information
```

#### Flags:



#### **Description:**

Transitions the platform to the specified power management state.

Prior to the hypercall:

- PD<sub>CURRENT</sub> must be the Root Protection Domain (PD<sub>ROOT</sub>).
- If OP=1 (S-State Transition):
  - The state parameter uses the following encoding:



The value S designates the state to enter. The values A and B are the first two bytes of the respective \_Sx package in the ACPI root namespace as follows:

| S   | A       | B       | Shallow | Description          |
|-----|---------|---------|---------|----------------------|
| 0x1 | \_S1[0] | \_S1[1] | ✓       | S1: Power-On Suspend |
| 0x2 | \_S2[0] | \_S2[1] | ✓       | S2: Standby          |
| 0x3 | \_S3[0] | \_S3[1] | ✓       | S3: Suspend to RAM   |
| 0x4 | \_S4[0] | \_S4[1] | ×       | S4: Suspend to Disk  |
| 0x5 | \_S5[0] | \_S5[1] | ×       | S5: Soft Off         |
| 0x7 | 0x0     | 0x0     | ×       | Platform Reset       |

- The caller is responsible for invoking the necessary pre-sleep ACPI methods, for transitioning platform devices into a suitable Dx sleep state, and for programming wakeup events.

If the hypercall completed successfully:

- If **OP=1** (**S-State Transition**): The platform enters the specified **ACPI** sleep state or resets.
  - For shallow sleep states, the hypercall returns upon a wakeup event. The caller is responsible for invoking the necessary post-sleep ACPI methods and for transitioning platform devices back into the D0 working state.
  - For deep sleep states or platform reset, the hypercall does not return.
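Whether a caller should expect control to come back follows directly from the Shallow column of the table. The Python sketch below encodes that distinction; the names `SHALLOW_STATES`, `DEEP_OR_RESET` and `hypercall_returns` are hypothetical, and the encoding of the state parameter itself is not modelled.

```python
SHALLOW_STATES = {0x1, 0x2, 0x3}   # S1, S2, S3
DEEP_OR_RESET = {0x4, 0x5, 0x7}    # S4, S5, platform reset


def hypercall_returns(s):
    """True if ctrl_pm is expected to return after entering state s."""
    if s in SHALLOW_STATES:
        return True
    if s in DEEP_OR_RESET:
        return False
    raise ValueError(f"state {s:#x} is not listed in the table")
```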

#### Status:

#### **SUCCESS**

• The hypercall completed successfully.

#### **BAD\_HYP**

• The hypercall was not issued from the Root Protection Domain (PD<sub>ROOT</sub>).

#### **BAD\_PAR**

• The requested operation (OP) is not supported.

#### **BAD\_FTR**

• The requested power management state is not supported.

#### **ABORTED**

• A concurrent power management request prevailed.

#### 5.5.2 Assign Interrupt

#### Parameters:

#### Flags:



#### **Description:**

Configures an interrupt and routes it to the specified CPU.

Prior to the hypercall:

- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sm } must refer to an SM Object Capability (CAP<sub>OBJ<sub>SM</sub></sub>) with permission ASSIGN.
- CAP<sub>OBJ<sub>SM</sub></sub> must refer to an interrupt semaphore and thereby designates the interrupt.

If the hypercall completed successfully:

- The interrupt referred to by { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sm } has been routed to the CPU cpu.
- Mask
  - M=0: The interrupt is now unmasked, i.e. it will be signaled on the semaphore.
  - **M=1**: The interrupt is now masked, i.e. it will not be signaled on the semaphore.
- Trigger
  - **T=0**: The interrupt is now configured for edge-triggered operation.
  - **T=1**: The interrupt is now configured for level-triggered operation.
- Polarity
  - **P=0**: The interrupt is now configured for active-high operation.
  - P=1: The interrupt is now configured for active-low operation.
- Guest
  - **G=0**: The interrupt is now host-owned.
  - **G=1**: The interrupt is now guest-owned (VM pass-through).
- If the interrupt is an MSI, only the PCI device referred to by dev will be authorized to generate that MSI. The device driver must program the returned msi\_addr and msi\_data values into the MSI registers of that device to ensure proper interrupt operation. If the interrupt is pin-based, the parameter dev was ignored and the parameters msi\_addr and msi\_data return 0.

Prior to the first invocation of assign\_int for an interrupt, the state of that interrupt is as follows:

- the interrupt is masked.
- trigger, polarity and ownership are undefined.
- target CPU and authorized device are undefined.

#### Status:

#### **SUCCESS**

• The hypercall completed successfully.

#### **BAD\_CPU**

• The specified CPU number was invalid.

#### BAD\_CAP

- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> sm } did not refer to an SM Object Capability (CAP<sub>OBJ<sub>SM</sub></sub>) or that capability had insufficient permissions.
- CAP<sub>OBJ<sub>SM</sub></sub> did not refer to an interrupt semaphore.

## 5.5.3 Assign Device

#### Parameters:

## Flags:



## **Description:**

Assigns the specified device (\*) to the specified Protection Domain (PD):

- ARM: dev encodes the SID of the device and also the SMMU resources (stream mapping group, translation context) to be used for managing that device.
- x86: dev encodes the BDF of the device. There are no SMMU resources needed.

Prior to the hypercall:

- PD<sub>CURRENT</sub> must be the Root Protection Domain (PD<sub>ROOT</sub>).
- { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> pd } must refer to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) with permission ASSIGN.
- { PD<sub>NOVA</sub>, SEL<sub>MEM</sub> smmu } must refer to the physical address of an SMMU device.
- The SID/BDF and SMMU resources encoded in dev must be supported by the hardware (see 7.6.1).
- TYPE<sub>ACC</sub> acc must refer to a DMA access type.

If the hypercall completed successfully:

- The device, referred to by the SID/BDF in dev, has been assigned to the Protection Domain (PD) referred to by { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> pd }, such that DMA transactions of that device will be translated by the DMA page table corresponding to acc of that PD.
- DMA transactions of that device will be managed using the SMMU resources encoded in dev. Prior users of those SMMU resources have been unconfigured.

## Status:

#### **SUCCESS**

• The hypercall completed successfully.

#### **BAD\_HYP**

• The hypercall was not issued from the Root Protection Domain (PD<sub>ROOT</sub>).

## BAD\_DEV

• { PD<sub>NOVA</sub>, SEL<sub>MEM</sub> smmu } did not refer to the physical address of an SMMU device.

### BAD\_CAP

• { PD<sub>CURRENT</sub>, SEL<sub>OBJ</sub> pd } did not refer to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) or that capability had insufficient permissions.

### **BAD\_PAR**

• At least one of the parameters dev or acc was not valid.

<sup>\*</sup>See the architecture-specific binding for encoding details.

## 6 Booting

## 6.1 Microhypervisor

## 6.1.1 ELF Image Loading

The bootloader must place all loadable (PT\_LOAD) program segments of the NOVA microhypervisor into physical memory (RAM) according to the physical addresses (p\_paddr) and memory sizes (p\_memsz) defined in the NOVA microhypervisor ELF image. The following is an example:

```
readelf -l nova.elf

Elf file type is EXEC (Executable file)
Entry point 0x48000000
There are 2 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x00000000000000b0 0x0000000048000000 0x0000000048000000
                 0x0000000000000268 0x0000000000001000  RWE    0x8
  LOAD           0x0000000000000000 0x0000ff8000001000 0x0000000048001000
                 0x000000000000e960 0x0000000000fff000  RWE    0x800
```

If the physical address range defined in the ELF image is suboptimal for a particular platform, the bootloader may optionally shift all loadable program segments lower or higher in physical memory, by applying an offset, subject to the following constraints:

- The same offset must be applied to each loadable program segment and to the entry point.
- The offset must be a multiple of 2 MiB, i.e. PhysAddr<sub>NEW</sub> = PhysAddr<sub>ELF</sub> ± n × 2 MiB.
- The entire physical memory region occupied by the NOVA microhypervisor must be RAM.
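The constraints above can be sketched as a small bootloader-side helper. This is only an illustration: `struct load_seg` and `nova_relocate` are hypothetical names, and the check that the target range is RAM is platform-specific and therefore left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

#define NOVA_RELOC_ALIGN (UINT64_C(2) << 20)   /* 2 MiB */

/* Hypothetical bootloader view of one loadable (PT_LOAD) segment. */
struct load_seg { uint64_t paddr, memsz; };

/*
 * Shift all segments and the entry point by the same signed offset.
 * Returns false, leaving everything untouched, if the offset is not
 * a multiple of 2 MiB.
 */
static bool nova_relocate(struct load_seg *seg, unsigned n, uint64_t *entry, int64_t offset)
{
    if (offset % (int64_t)NOVA_RELOC_ALIGN != 0)
        return false;

    for (unsigned i = 0; i < n; i++)
        seg[i].paddr += (uint64_t)offset;   /* modular add also handles negative offsets */

    *entry += (uint64_t)offset;
    return true;
}
```

Applied to the example image above with an offset of 4 × 2 MiB, both segments and the entry point move up by 0x800000.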

After loading the NOVA microhypervisor into physical memory, the bootloader must invoke the entry point of the ELF image with architecture-specific preconditions (ARM, x86).

#### 6.1.2 Platform Resource Access

Possession of a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) for PD<sub>NOVA</sub> allows the caller to invoke the ctrl\_pd hypercall to take resources from the NOVA Protection Domain and grant them to another Protection Domain.

The following capabilities can be taken from the NOVA Protection Domain (PD<sub>NOVA</sub>):

#### **Physical Memory**

{ PD<sub>NOVA</sub>, SEL<sub>MEM</sub> 0...PHYS<sub>NUM</sub>-1 } refer to CAP<sub>MEM</sub> for page frames in physical memory, where PHYS<sub>NUM</sub> is the number of page frames supported by the platform. Physical memory regions protected by the NOVA microhypervisor (ARM, x86) cannot be taken.

#### **Interrupt Semaphores**

{ PD<sub>NOVA</sub>, SEL<sub>OBJ</sub> 1024...1024+INT<sub>NUM</sub>-1 } refer to CAP<sub>OBJ<sub>SM</sub></sub> for interrupt semaphores, where INT<sub>NUM</sub> is the number of supported interrupts, as conveyed by the HIP. These capabilities can be used with the ctrl\_sm and assign\_int hypercalls.

### **Console Signaling Semaphore**

{ PD<sub>NOVA</sub>, SEL<sub>OBJ</sub> SEL<sub>NUM</sub>-1 } refers to a CAP<sub>OBJ<sub>SM</sub></sub> for the signaling semaphore of the NOVA memory-buffer console. This capability can be used with the ctrl\_sm hypercall.

## 6.2 Root Protection Domain

After the NOVA microhypervisor has initialized the system, it creates the following initial kernel objects:

- PD<sub>ROOT</sub> the Root Protection Domain
- EC<sub>ROOT</sub> the Root Execution Context (executing in PD<sub>ROOT</sub>)
- SC<sub>ROOT</sub> the Root Scheduling Context (bound to EC<sub>ROOT</sub>)

The Root Protection Domain is responsible for bootstrapping the other components of the user-mode framework by creating additional kernel objects, loading additional images, assigning resources, etc.

## 6.2.1 ELF Image Format

The ELF image of the Root Protection Domain (PD<sub>ROOT</sub>) must be an executable (ET\_EXEC) file that has been compiled for the respective architecture and

- linked such that p\_filesz = p\_memsz
- loaded such that p\_vaddr ≡ LOAD\_ADDR\* + p\_offset (mod PAGE\_SIZE)

holds for each loadable (PT\_LOAD) program segment. These constraints ensure that the NOVA microhypervisor can map all program segments directly from physical into virtual memory without any additional memory allocation or copying. The following is an example:

```
readelf -l root.elf

Elf file type is EXEC (Executable file)
Entry point 0x10000120

There are 2 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000010000000 0x0000000010000000
                 0x0000000000000a75 0x0000000000000a75  R E    0x1000
  LOAD           0x0000000000001000 0x0000000010001000 0x0000000010001000
                 0x000000000000f004 0x000000000000f004  RW     0x1000
```
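The two constraints on each loadable segment can be checked mechanically, for example by a build-time tool. The sketch below assumes a page size of 4 KiB; `struct phdr` is a reduced stand-in for an ELF64 program header, and `root_seg_ok` and `ROOT_PAGE_SIZE` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

#define ROOT_PAGE_SIZE UINT64_C(0x1000)   /* assumption: 4 KiB pages */

/* Subset of an ELF64 program header; field names follow the ELF format. */
struct phdr { uint64_t p_offset, p_vaddr, p_filesz, p_memsz; };

/* Check one PT_LOAD segment against the constraints of Section 6.2.1. */
static bool root_seg_ok(const struct phdr *ph, uint64_t load_addr)
{
    /* Linked such that p_filesz = p_memsz (no zero-filled tail). */
    bool same_size = ph->p_filesz == ph->p_memsz;

    /* Loaded such that p_vaddr and LOAD_ADDR + p_offset agree modulo the page size. */
    bool congruent = (ph->p_vaddr % ROOT_PAGE_SIZE) ==
                     ((load_addr + ph->p_offset) % ROOT_PAGE_SIZE);

    return same_size && congruent;
}
```

Both segments of the example image pass this check for any page-aligned load address. A segment with a trailing .bss (p\_memsz larger than p\_filesz) would be rejected.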

## 6.2.2 Initial Configuration

Prior to invoking the entry point of the Root Protection Domain (PD<sub>ROOT</sub>) ELF image, using the Root Execution Context (EC<sub>ROOT</sub>), the NOVA microhypervisor sets up PD<sub>ROOT</sub> as follows.

#### 6.2.2.1 Object Space

The object space contains the following initial capabilities:

- { PD<sub>ROOT</sub>, SEL<sub>OBJ</sub> SEL<sub>NUM</sub>-1 } refers to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) for PD<sub>NOVA</sub>.
- { PD<sub>ROOT</sub>, SEL<sub>OBJ</sub> SEL<sub>NUM</sub>-2 } refers to a PD Object Capability (CAP<sub>OBJ<sub>PD</sub></sub>) for PD<sub>ROOT</sub>.
- { PD<sub>ROOT</sub>, SEL<sub>OBJ</sub> SEL<sub>NUM</sub>-3 } refers to an EC Object Capability (CAP<sub>OBJ<sub>EC</sub></sub>) for EC<sub>ROOT</sub>.
- { PD<sub>ROOT</sub>, SEL<sub>OBJ</sub> SEL<sub>NUM</sub>-4 } refers to an SC Object Capability (CAP<sub>OBJ<sub>SC</sub></sub>) for SC<sub>ROOT</sub>.

All other { PD<sub>ROOT</sub>, SEL<sub>OBJ</sub> } refer to a Null Capability (CAP<sub>0</sub>).

The value of SEL<sub>NUM</sub> is conveyed in the Hypervisor Information Page [ARM, x86].

<sup>\*</sup>This is the address in physical memory at which the bootloader has placed the ELF image.

## 6.2.2.2 Memory Space

## **ELF Program Segments**

The microhypervisor maps the Root Protection Domain (PD<sub>ROOT</sub>) into virtual memory according to the virtual addresses (p\_vaddr), memory sizes (p\_memsz) and page attributes (p\_flags) of all loadable (PT\_LOAD) program segments defined in the PD<sub>ROOT</sub> ELF image.

## **Hypervisor Information Page**

The microhypervisor maps the Hypervisor Information Page [ARM, x86] read-only into the memory space 4 KiB below the end of user-accessible virtual memory. The virtual address of the HIP is passed to EC<sub>ROOT</sub> at the entry point (ARM, x86).

## **UTCB**

The microhypervisor maps the User Thread Control Block [ARM, x86] of EC<sub>ROOT</sub> into the memory space 4 KiB below the address of the Hypervisor Information Page [ARM, x86].

All other { PD<sub>ROOT</sub>, SEL<sub>MEM</sub> } refer to a Null Capability (CAP<sub>0</sub>).
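As a worked example of the HIP and UTCB placement, the following sketch computes both addresses for aarch64, assuming that "end of user-accessible virtual memory" means one byte past the last user address given in Section 7.4 (0x7fffffffff). The helper names are hypothetical; a real Root Protection Domain should use the HIP address it receives at its entry point rather than a computed constant.

```c
#include <stdint.h>

#define PAGE_4K     UINT64_C(0x1000)

/* Assumption: one past the last user-accessible address on aarch64 (Section 7.4). */
#define USER_VM_END (UINT64_C(0x7fffffffff) + 1)

/* HIP: mapped 4 KiB below the end of user-accessible virtual memory. */
static uint64_t hip_vaddr(void)  { return USER_VM_END - PAGE_4K; }

/* UTCB of the Root Execution Context: mapped 4 KiB below the HIP. */
static uint64_t utcb_vaddr(void) { return hip_vaddr() - PAGE_4K; }
```

Under these assumptions the HIP is at 0x7ffffff000 and the UTCB at 0x7fffffe000.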

## 6.3 Hypervisor Information Page

The Hypervisor Information Page [ARM, x86] (HIP) conveys information about the platform and configuration to the Root Protection Domain (PD<sub>ROOT</sub>) and has the following layout:



All HIP fields are unsigned values, unless stated otherwise, and have the following meaning:

#### **Signature**

The value 0x41564f4e identifies the NOVA microhypervisor.

#### Checksum

The checksum is valid if the 16-bit-wise addition of the entire HIP contents produces a value of 0.
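A minimal sketch of this check is shown below. It assumes that the HIP length (taken from the Length field) is even and that the 16-bit words are read in the native byte order of the CPU; `hip_checksum_ok` is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add up the HIP as 16-bit words; the HIP is valid if the sum wraps to 0. */
static bool hip_checksum_ok(const void *hip, size_t length)
{
    const uint8_t *p = hip;
    uint16_t sum = 0;

    for (size_t i = 0; i + 1 < length; i += 2) {
        uint16_t word;
        memcpy(&word, p + i, sizeof word);   /* alignment-safe load in native byte order */
        sum = (uint16_t)(sum + word);        /* addition modulo 2^16 */
    }

    return sum == 0;
}
```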

#### Length

Length of the entire **HIP** in bytes.

## **NOVA Start/End Address**

Physical start and end address of the NOVA microhypervisor image.

## **MBUF Start/End Address**

Physical start and end address of the memory buffer console region (see C.1).

## **ROOT Start/End Address**

Physical start and end address of the root protection domain image.

## **ACPI RSDP Address**

## **UEFI Memory Map Address**

## **UEFI Memory Map Size**

Total size of the **UEFI** Memory Map (**0** if not present).

## **UEFI Desc Size**

**UEFI** Memory Descriptor Size (0 if not present).

## **UEFI Desc Version**

**UEFI** Memory Descriptor Version (**0** if not present).

#### **STC Frequency**

Frequency of the System Time Counter (STC) in Hz.

## SEL<sub>NUM</sub>

Total number of Capability Selectors in each object space.

## SEL<sub>HST/ARCH</sub>

Number of Capability Selectors required for handling architectural host events. (ARM, x86)

#### SEL<sub>HST/NOVA</sub>

Number of additional Capability Selectors required for handling microhypervisor host events. (ARM, x86)

## SEL<sub>GST/ARCH</sub>

Number of Capability Selectors required for handling architectural guest events. (ARM, x86)

#### SEL<sub>GST/NOVA</sub>

Number of additional Capability Selectors required for handling microhypervisor guest events. (ARM, x86)

## CPU<sub>NUM</sub>

Total number of CPUs that are online.

#### **CPU<sub>BSP</sub>**

The Bootstrap Processor (BSP) on which EC<sub>ROOT</sub> and SC<sub>ROOT</sub> have been created.

## INT<sub>NUM</sub>

Total number of interrupts that can be used via interrupt semaphores.

## **Features**

Supported platform features.

## **Architecture-Dependent**

Architecture-dependent part. (ARM, x86)

# Part IV Application Binary Interface

## 7 ABI aarch64

## 7.1 Boot State

## 7.1.1 NOVA Microhypervisor

The bootloader must set up the CPU register state according to one of the launch types listed below when it transfers control to the NOVA microhypervisor entry point. Furthermore, the following preconditions must be satisfied:

- The CPU must execute in EL2 (hypervisor mode) or in EL3 (monitor mode).
- Paging (MMU) must be disabled (SCTLR\_ELx.M=0) or must use an identity (1:1) mapping.
- Interrupts must be disabled (PSTATE.DAIF=0b1111).
- The physical memory region occupied by the microhypervisor image must be clean to the PoC.
- All DMA activity targeting the physical memory region occupied by the microhypervisor must be quiesced. That physical memory region should also be protected against DMA accesses on systems with an SMMU.

#### 7.1.1.1 Multiboot v2 Launch

Only this launch type supports 64-bit **UEFI** platforms.

| Register | Value / Description                                                                        |
|----------|--------------------------------------------------------------------------------------------|
| IP       | Physical address of the NOVA Protection Domain (PD <sub>NOVA</sub> ) ELF image entry point |
| X0       | Multiboot v2 magic value (0x36d76289) [8]                                                  |
| X1       | Physical address of the Multiboot v2 information structure [8]                             |
| Other    | ~                                                                                          |

The NOVA microhypervisor consumes the following multiboot tags, if present: 1, 3, 12, 20.

#### 7.1.1.2 Multiboot v1 Launch

| Register | Value / Description                                                                        |
|----------|--------------------------------------------------------------------------------------------|
| IP       | Physical address of the NOVA Protection Domain (PD <sub>NOVA</sub> ) ELF image entry point |
| X0       | Multiboot v1 magic value (0x2badb002) [9]                                                  |
| X1       | Physical address of the Multiboot v1 information structure [9]                             |
| Other    | ~                                                                                          |

The NOVA microhypervisor consumes the following multiboot flags, if present: 2, 3.

## 7.1.1.3 Legacy Launch

| Register | Value / Description                                                                             |
|----------|-------------------------------------------------------------------------------------------------|
| IP       | Physical address of the NOVA Protection Domain (PD <sub>NOVA</sub> ) ELF image entry point      |
| X0       | Physical address of the Flattened Device Tree [10] (FDT) for the hardware platform <sup>†</sup> |
| X1       | Physical address of the Root Protection Domain (PD <sub>ROOT</sub> ) ELF image                  |
| Other    | ~                                                                                               |

<sup>†</sup>Due to its alignment constraint, a valid FDT address will never be equal to a Multiboot magic value.
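The three launch types can be told apart by the value in X0 alone. The sketch below illustrates the idea and is not necessarily how the microhypervisor implements it; the enum and function names are hypothetical. The footnote holds, for instance, if the FDT must be 8-byte aligned (an assumption taken from the Devicetree Specification), because neither magic value is a multiple of 8.

```c
#include <stdint.h>

#define MULTIBOOT_V2_MAGIC UINT64_C(0x36d76289)
#define MULTIBOOT_V1_MAGIC UINT64_C(0x2badb002)

enum launch_type { LAUNCH_MULTIBOOT_V2, LAUNCH_MULTIBOOT_V1, LAUNCH_LEGACY };

/* Classify the launch type from the value the bootloader passed in X0. */
static enum launch_type classify_launch(uint64_t x0)
{
    if (x0 == MULTIBOOT_V2_MAGIC)
        return LAUNCH_MULTIBOOT_V2;   /* X1: Multiboot v2 information structure */

    if (x0 == MULTIBOOT_V1_MAGIC)
        return LAUNCH_MULTIBOOT_V1;   /* X1: Multiboot v1 information structure */

    return LAUNCH_LEGACY;             /* X0: FDT address, X1: PD_ROOT ELF image */
}
```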

## 7.1.2 Root Protection Domain

The NOVA microhypervisor sets up the CPU register state as follows when it transfers control to the Root Execution Context (EC<sub>ROOT</sub>):

| Register | Value / Description                                              |
|----------|------------------------------------------------------------------|
| …        | … Root Protection Domain (PD<sub>ROOT</sub>) ELF image entry point |
| …        | … Hypervisor Information Page [ARM, x86] (HIP)                   |

 $<sup>^{\</sup>dagger}$ The register contains the preserved original value from the point when control was transferred from the bootloader to the microhypervisor.

## 7.2 Protected Resources

The following resources are protected by the NOVA microhypervisor and are therefore inaccessible to user-mode applications.

## 7.2.1 Memory Space

Physical memory regions occupied by:

- NOVA microhypervisor conveyed via HIP.
- GICD, GICR, GICC, GICH devices [11, 12] conveyed via ACPI MADT or via FDT.
- SMMU devices [13, 14] conveyed via ACPI IORT or via FDT.
- Firmware runtime services conveyed via UEFI memory map.

## 7.3 Physical Memory

## 7.3.1 Memory Map

The Root Protection Domain (PD<sub>ROOT</sub>) can obtain a list of available/reserved memory regions as follows:

- On platforms using Unified Extensible Firmware Interface [7], by parsing the UEFI memory map.
- On platforms using Flattened Device Tree [10], by parsing the FDT.

## 7.4 Virtual Memory

The accessible virtual memory range for user-mode applications is 0 - 0x7fffffffff.

## 7.4.1 Cacheability Attributes

| Encoding        | ATTR<sub>CA</sub>  | Description                              |
|-----------------|--------------------|------------------------------------------|
| 0x0             | DEV                | Device                                   |
| 0x1             | DEV_E              | Device, Early Ack                        |
| 0x2             | DEV_RE             | Device, Early Ack, Reordering            |
| 0x3             | DEV_GRE            | Device, Early Ack, Reordering, Gathering |
| 0x4             | -                  | reserved                                 |
| 0x5             | MEM_NC             | Memory, Inner/Outer Non-Cacheable        |
| <b>0</b> x6     | MEM_WT             | Memory, Inner/Outer Write-Through        |
| 0x7             | MEM_WB             | Memory, Inner/Outer Write-Back           |

Please refer to [3] for details on the architectural behavior.

## 7.4.2 Shareability Attributes

| Encoding | ATTR <sub>SH</sub> | Description     |
|----------|--------------------|-----------------|
| 0x0      | NONE               | Not Shareable   |
| 0x1      | -                  | reserved        |
| 0x2      | OUTER              | Outer Shareable |
| 0x3      | INNER              | Inner Shareable |

Please refer to [3] for details on the architectural behavior.
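For user-mode code, the two encoding tables above could be captured as C enumerations. This is a sketch only: the `ATTR_CA_` and `ATTR_SH_` prefixes and the validity helpers are hypothetical names, while the numeric values are taken from the tables.

```c
#include <stdbool.h>

/* Cacheability attribute encodings (Section 7.4.1); 0x4 is reserved. */
enum attr_ca {
    ATTR_CA_DEV     = 0x0,   /* Device                                   */
    ATTR_CA_DEV_E   = 0x1,   /* Device, Early Ack                        */
    ATTR_CA_DEV_RE  = 0x2,   /* Device, Early Ack, Reordering            */
    ATTR_CA_DEV_GRE = 0x3,   /* Device, Early Ack, Reordering, Gathering */
    ATTR_CA_MEM_NC  = 0x5,   /* Memory, Inner/Outer Non-Cacheable        */
    ATTR_CA_MEM_WT  = 0x6,   /* Memory, Inner/Outer Write-Through        */
    ATTR_CA_MEM_WB  = 0x7    /* Memory, Inner/Outer Write-Back           */
};

/* Shareability attribute encodings (Section 7.4.2); 0x1 is reserved. */
enum attr_sh {
    ATTR_SH_NONE  = 0x0,     /* Not Shareable   */
    ATTR_SH_OUTER = 0x2,     /* Outer Shareable */
    ATTR_SH_INNER = 0x3      /* Inner Shareable */
};

/* Reject out-of-range and reserved encodings. */
static bool attr_ca_valid(unsigned v) { return v <= 0x7 && v != 0x4; }
static bool attr_sh_valid(unsigned v) { return v <= 0x3 && v != 0x1; }
```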

## 7.5 Event-Specific Capability Selectors

For the delivery of exception/intercept messages, the microhypervisor performs an implicit portal traversal.

The selector for the destination portal (SEL<sub>OBJ</sub>):

- is determined by adding the exception/intercept number to the affected Execution Context's Event Selector Base [ARM, x86] (SEL<sub>EVT</sub>).
- indexes into the Object Space (SPC<sub>OBJ</sub>) of the affected EC's Protection Domain (PD).
- must refer to a PT Object Capability (CAP<sub>OBJ<sub>PT</sub></sub>) with permission EVENT that is bound to an EC on the same core as the affected EC, otherwise the affected EC is killed.

## 7.5.1 Architectural Events

## **Host Exceptions and Guest Intercepts**

| SEL<sub>OBJ</sub>         | Exception / Intercept        | SEL<sub>OBJ</sub>         | Exception / Intercept        |
|---------------------------|------------------------------|---------------------------|------------------------------|
| SEL<sub>EVT</sub> + 0x00  | Unknown Reason               | SEL<sub>EVT</sub> + 0x20  | Instruction Abort (lower EL) |
| SEL<sub>EVT</sub> + 0x01  | Trapped WFI or WFE           | SEL<sub>EVT</sub> + 0x21  | Instruction Abort (same EL)* |
| SEL<sub>EVT</sub> + 0x02  | reserved                     | SEL<sub>EVT</sub> + 0x22  | PC Alignment Fault           |
| SEL<sub>EVT</sub> + 0x03  | Trapped MCR or MRC           | SEL<sub>EVT</sub> + 0x23  | reserved                     |
| SEL<sub>EVT</sub> + 0x04  | Trapped MCRR or MRRC         | SEL<sub>EVT</sub> + 0x24  | Data Abort (lower EL)        |
| SEL<sub>EVT</sub> + 0x05  | Trapped MCR or MRC           | SEL<sub>EVT</sub> + 0x25  | Data Abort (same EL)*        |
| SEL<sub>EVT</sub> + 0x06  | Trapped LDC or STC           | SEL<sub>EVT</sub> + 0x26  | SP Alignment Fault           |
| SEL<sub>EVT</sub> + 0x07  | SME, SVE, SIMD, FPU          | SEL<sub>EVT</sub> + 0x27  | Memory Operation Exception   |
| SEL<sub>EVT</sub> + 0x08  | Trapped VMRS Access          | SEL<sub>EVT</sub> + 0x28  | Trapped FPU (AArch32)        |
| SEL<sub>EVT</sub> + 0x09  | Trapped PAuth Instruction    | SEL<sub>EVT</sub> + 0x29  | reserved                     |
| SEL<sub>EVT</sub> + 0x0a  | Trapped LD64B or ST64B       | SEL<sub>EVT</sub> + 0x2a  | reserved                     |
| SEL<sub>EVT</sub> + 0x0b  | reserved                     | SEL<sub>EVT</sub> + 0x2b  | reserved                     |
| SEL<sub>EVT</sub> + 0x0c  | Trapped MRRC                 | SEL<sub>EVT</sub> + 0x2c  | Trapped FPU (AArch64)        |
| SEL<sub>EVT</sub> + 0x0d  | Branch Target Exception      | SEL<sub>EVT</sub> + 0x2d  | reserved                     |
| SEL<sub>EVT</sub> + 0x0e  | Illegal Execution State      | SEL<sub>EVT</sub> + 0x2e  | reserved                     |
| SEL<sub>EVT</sub> + 0x0f  | reserved                     | SEL<sub>EVT</sub> + 0x2f  | SError                       |
| SEL<sub>EVT</sub> + 0x10  | reserved                     | SEL<sub>EVT</sub> + 0x30  | Breakpoint (lower EL)        |
| SEL<sub>EVT</sub> + 0x11  | SVC (from AArch32 State)     | SEL<sub>EVT</sub> + 0x31  | Breakpoint (same EL)*        |
| SEL<sub>EVT</sub> + 0x12  | HVC (from AArch32 State)     | SEL<sub>EVT</sub> + 0x32  | Software Step (lower EL)     |
| SEL<sub>EVT</sub> + 0x13  | SMC (from AArch32 State)     | SEL<sub>EVT</sub> + 0x33  | Software Step (same EL)*     |
| SEL<sub>EVT</sub> + 0x14  | reserved                     | SEL<sub>EVT</sub> + 0x34  | Watchpoint (lower EL)        |
| SEL<sub>EVT</sub> + 0x15  | SVC (from AArch64 State)*    | SEL<sub>EVT</sub> + 0x35  | Watchpoint (same EL)*        |
| SEL<sub>EVT</sub> + 0x16  | HVC (from AArch64 State)     | SEL<sub>EVT</sub> + 0x36  | reserved                     |
| SEL<sub>EVT</sub> + 0x17  | SMC (from AArch64 State)     | SEL<sub>EVT</sub> + 0x37  | reserved                     |
| SEL<sub>EVT</sub> + 0x18  | Trapped MSR or MRS           | SEL<sub>EVT</sub> + 0x38  | BKPT (AArch32)               |
| SEL<sub>EVT</sub> + 0x19  | Trapped SVE                  | SEL<sub>EVT</sub> + 0x39  | reserved                     |
| SEL<sub>EVT</sub> + 0x1a  | Trapped ERET                 | SEL<sub>EVT</sub> + 0x3a  | Vector Catch (AArch32)       |
| SEL<sub>EVT</sub> + 0x1b  | TSTART Exception             | SEL<sub>EVT</sub> + 0x3b  | reserved                     |
| SEL<sub>EVT</sub> + 0x1c  | PAuth Instruction Failure    | SEL<sub>EVT</sub> + 0x3c  | BRK (AArch64)                |
| SEL<sub>EVT</sub> + 0x1d  | Trapped SME                  | SEL<sub>EVT</sub> + 0x3d  | reserved                     |
| SEL<sub>EVT</sub> + 0x1e  | Granule Protection Exception | SEL<sub>EVT</sub> + 0x3e  | reserved                     |
| SEL<sub>EVT</sub> + 0x1f  | reserved                     | SEL<sub>EVT</sub> + 0x3f  | reserved                     |

Please refer to [3] for more details on each of these events.

<sup>\*</sup>These events may be handled by the microhypervisor, in which case they will not cause portal traversals.

## 7.5.2 Microhypervisor Events

| SEL<sub>OBJ</sub>                              | Event         |
|------------------------------------------------|---------------|
| SEL<sub>EVT</sub> + SEL<sub>ARCH</sub> + 0x0   | Startup       |
| SEL<sub>EVT</sub> + SEL<sub>ARCH</sub> + 0x1   | Recall        |
| SEL<sub>EVT</sub> + SEL<sub>ARCH</sub> + 0x2   | Virtual Timer |

The value of SEL<sub>ARCH</sub> depends on the origin of the event:

- SEL<sub>ARCH</sub> = SEL<sub>HST/ARCH</sub> (0x40) for events that occurred in the host.
- SEL<sub>ARCH</sub> = SEL<sub>GST/ARCH</sub> (0x40) for events that occurred in the guest.
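The selector arithmetic of Sections 7.5.1 and 7.5.2 can be sketched as follows. The function and enum names are hypothetical, and the constant 0x40 is hard-coded here only for illustration; portable code would read SEL<sub>HST/ARCH</sub> and SEL<sub>GST/ARCH</sub> from the HIP.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEL_HST_ARCH UINT64_C(0x40)   /* as stated above; conveyed via the HIP */
#define SEL_GST_ARCH UINT64_C(0x40)   /* as stated above; conveyed via the HIP */

/* Microhypervisor events (Section 7.5.2). */
enum nova_event { EVT_STARTUP = 0x0, EVT_RECALL = 0x1, EVT_VTIMER = 0x2 };

/* Portal selector for an architectural exception or intercept number. */
static uint64_t sel_arch_event(uint64_t sel_evt, unsigned number)
{
    return sel_evt + number;
}

/* Portal selector for a microhypervisor event in the host or in a guest. */
static uint64_t sel_nova_event(uint64_t sel_evt, bool guest, enum nova_event evt)
{
    uint64_t sel_arch = guest ? SEL_GST_ARCH : SEL_HST_ARCH;
    return sel_evt + sel_arch + (uint64_t)evt;
}
```

For an EC with an Event Selector Base of 0x100, a Data Abort from a lower EL is delivered through selector 0x124 and a host Recall event through selector 0x141.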

# 7.6 Architecture-Dependent Structures

## 7.6.1 Hypervisor Information Page



## SMG<sub>NUM</sub>

Number of SMMU stream mapping groups.

#### CTX<sub>NUM</sub>

Number of **SMMU** translation contexts.

## 7.6.2 User Thread Control Block

[Figure: UTCB layout in rows of 16 bytes. Register groups in ascending offset order, starting at +0x000; the order of registers within a group is not implied.]

| Group | Registers |
|-------|-----------|
| EL0   | X0 … X30, SP_EL0, TPIDR_EL0, TPIDRRO_EL0 |
| A32   | SPSR_abt, SPSR_fiq, SPSR_irq, SPSR_und, DACR, IFSR, HSTR |
| EL1   | SP_EL1, TPIDR_EL1, CONTEXTIDR_EL1, ELR_EL1, SPSR_EL1, ESR_EL1, FAR_EL1, AFSR0_EL1, AFSR1_EL1, TTBR0_EL1, TTBR1_EL1, TCR_EL1, MAIR_EL1, AMAIR_EL1, VBAR_EL1, SCTLR_EL1, MDSCR_EL1 |
| EL2   | HCR_EL2, HCRX_EL2, VPIDR_EL2, VMPIDR_EL2, ELR_EL2, SPSR_EL2, ESR_EL2, FAR_EL2, HPFAR_EL2 |
| TMR   | CNTV_CVAL_EL0, CNTV_CTL_EL0, CNTKCTL_EL1, CNTVOFF_EL2 |
| GIC   | LR0 … LR15, AP0R0 … AP0R3, AP1R0 … AP1R3, ELRSR, VMCR |

## 7.6.3 Message Transfer Descriptor

The Message Transfer Descriptor [ARM, x86] (MTD), which controls the subset of the architectural state transferred during exceptions and intercepts, as described in Section 4.4.2, has the following layout:



Each MTD bit controls the transfer of the listed architectural state to/from the respective fields in the UTCB (7.6.2) as follows:

- State with access r can be read from the architectural state into the UTCB.
- State with access w can be written from the UTCB into the architectural state.

| MTD Bit         | Access | <b>Host Exception State</b>    | Guest Intercept State                  |
|-----------------|--------|--------------------------------|----------------------------------------|
| POISON          | W      | Kills the Thread               | Kills the vCPU                         |
| $ICI^{\dagger}$ | W      | Invalidates the entire I-Cache | Invalidates the entire I-Cache         |
| GPR             | rw     | X0 X30                         | X0 X30                                 |
| EL0_SP          | rw     | SP_EL0                         | SP_EL0                                 |
| EL0_IDR         | rw     | TPIDR_EL0, TPIDRRO_EL0         | TPIDR_EL0, TPIDRRO_EL0                 |
| A32_SPSR        | rw     | -                              | SPSR_ABT, SPSR_FIQ, SPSR_IRQ, SPSR_UND |
| A32_DIH         | rw     | _                              | DACR, IFSR, HSTR                       |
| EL1_SP          | rw     | -                              | SP_EL1                                 |
| EL1_IDR         | rw     | _                              | TPIDR_EL1, CONTEXTIDR_EL1              |
| EL1_ELR_SPSR    | rw     | _                              | ELR_EL1, SPSR_EL1                      |
| EL1_ESR_FAR     | rw     | -                              | ESR_EL1, FAR_EL1                       |
| EL1_AFSR        | rw     | _                              | AFSR0_EL1, AFSR1_EL1                   |
| EL1_TTBR        | rw     | -                              | TTBR0_EL1, TTBR1_EL1                   |
| EL1_TCR         | rw     | _                              | TCR_EL1                                |
| EL1_MAIR        | rw     | _                              | MAIR_EL1, AMAIR_EL1                    |
| EL1_VBAR        | rw     | _                              | VBAR_EL1                               |
| EL1_SCTLR       | rw     | _                              | SCTLR_EL1                              |
| EL1_MDSCR       | rw     | _                              | MDSCR_EL1                              |
| EL2_HCR         | rw     | -                              | HCR_EL2, HCRX_EL2                      |
| EL2_IDR         | rw     | -                              | VPIDR_EL2, VMPIDR_EL2                  |
| EL2_ELR_SPSR    | rw     | ELR_EL2, SPSR_EL2              | ELR_EL2, SPSR_EL2                      |
| EL2_ESR_FAR     | r      | ESR_EL2, FAR_EL2               | ESR_EL2, FAR_EL2                       |
| EL2_HPFAR       | r      | _                              | HPFAR_EL2                              |
| TMR             | rw     | -                              | CNTV_CVAL_EL0, CNTV_CTL_EL0            |
|                 | rw     | -                              | CNTKCTL_EL1, CNTVOFF_EL2               |
| GIC             | rw     | -                              | LR0 ... LR15, APxR0 ... APxR3          |
|                 | r      | -                              | ELRSR, VMCR                            |

<sup>†</sup>Only affects a VIPT instruction cache of the local core. Has no effect on PIPT instruction caches, data caches, or caches of other cores.

## 7.7 Calling Convention

The following pages describe the calling convention for each hypercall. An execution context calls into the microhypervisor by loading the hypercall identifier and other parameters into the specified processor registers and then executing the svc #0 instruction [3].

The hypercall identifier consists of the hypercall number and hypercall-specific flags, as illustrated in Figure 7.1.



Figure 7.1: Hypercall Identifier

The status code returned from a hypercall has the format shown in Figure 7.2.



Figure 7.2: Status Code

The assignment of hypercall parameters to general-purpose registers is shown on the left side; the contents of the registers after the hypercall are shown on the right side.

## **IPC Call**

|                         |    | ipc_call |    |           |
|-------------------------|----|----------|----|-----------|
| pt[63-8] hypercall[7-0] | X0 |          |    |           |
| mtd[31-0]               | X1 |          | X1 | mtd[31-0] |
| -                       | IP | svc #0   | IP | IP+4      |
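The following is a minimal sketch of how the first argument register of ipc_call could be assembled from the diagram above: the portal selector occupies bits 63-8 and the hypercall identifier occupies bits 7-0. The helper name and its parameters are hypothetical and not part of the interface. How the 8-bit identifier divides into hypercall number and flags is defined by Figure 7.1 and is not modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: builds X0 for ipc_call as pt[63-8] | hypercall[7-0]. */
static inline uint64_t ipc_call_arg0(uint64_t pt_sel, uint8_t hypercall_id)
{
    /* The selector must fit into the 56 bits above the identifier. */
    assert(pt_sel < (UINT64_C(1) << 56));
    return (pt_sel << 8) | hypercall_id;
}
```

The instruction pointer advances by 4 across the call because svc, like every A64 instruction, is 4 bytes long.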

## **IPC Reply**

## **Create Protection Domain**

## **Create Execution Context**

## **Create Scheduling Context**

## **Create Portal**

## **Create Semaphore**

## **Control Protection Domain**

## **Control Execution Context**

## **Control Scheduling Context**

## **Control Portal**

## **Control Semaphore**

## **Control Power Management**

## **Assign Interrupt**

## **Assign Device**

## 7.8 Supplementary Functionality

This section describes functions that do **not** conform to the calling convention for hypercalls. Because these functions cannot perform capability-based access control, their invocation is restricted to the Root Protection Domain (PD<sub>ROOT</sub>). Invocation of these functions from any other Protection Domain generates an exception.

#### **Secure Monitor Call**

This call is proxy-filtered by the microhypervisor. If the function parameter indicates an **atomic SIP service call**, then the microhypervisor issues the corresponding SMC to the platform firmware on behalf of the caller. Otherwise this function generates an exception. Register allocation conforms to the ARM SMCCC [15].

|                |     | proxy_smc |     |      |
|----------------|-----|-----------|-----|------|
| function[31-0] | X0  |           | X0  | ~    |
| -              | X1  |           | X1  | ~    |
| -              | X2  |           | X2  | ~    |
| -              | X3  |           | X3  | ~    |
| -              | X4  |           | X4  | ~    |
| -              | X5  |           | X5  | ~    |
| -              | X6  |           | X6  | ~    |
| -              | X7  |           | X7  | ~    |
| -              | X8  |           | X8  | ~    |
| -              | X9  |           | X9  | ~    |
| -              | X10 |           | X10 | ~    |
| -              | X11 |           | X11 | ~    |
| -              | X12 |           | X12 | ~    |
| -              | X13 |           | X13 | ~    |
| -              | X14 |           | X14 | ~    |
| -              | X15 |           | X15 | ~    |
| -              | X16 |           | X16 | ~    |
| -              | X17 |           | X17 | ~    |
| -              | IP  | svc #1    | IP  | IP+4 |
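As an illustration of the filter described above, the sketch below classifies an SMCCC function identifier. It assumes that an "atomic SIP service call" corresponds to an SMCCC Fast Call (bit 31 set) whose owning entity field (bits 29-24) selects the SiP service range (value 2). The field layout comes from the SMCCC [15]; the helper and macro names are hypothetical, and the microhypervisor may apply further checks that are not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* SMCCC function identifier fields. */
#define SMCCC_FAST_CALL   (UINT32_C(1) << 31)  /* 1 = fast call, 0 = yielding call */
#define SMCCC_OWNER_SHIFT 24
#define SMCCC_OWNER_MASK  UINT32_C(0x3f)       /* bits 29-24: owning entity */
#define SMCCC_OWNER_SIP   UINT32_C(2)          /* SiP service calls */

/* Assumption: "atomic SIP service call" = fast call owned by the SiP range. */
static inline bool smc_is_atomic_sip(uint32_t function)
{
    bool fast = (function & SMCCC_FAST_CALL) != 0;
    bool sip  = ((function >> SMCCC_OWNER_SHIFT) & SMCCC_OWNER_MASK) == SMCCC_OWNER_SIP;
    return fast && sip;
}
```

Under these assumptions, a caller in the Root Protection Domain can check an identifier before issuing svc #1, since any other function value generates an exception.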

## 8 ABI x86-64

## 8.1 Boot State

## 8.1.1 NOVA Microhypervisor

The bootloader must set up the CPU register state according to one of the launch types listed below when it transfers control to the NOVA microhypervisor entry point. Furthermore, the following preconditions must be satisfied:

- The CPU state must conform to a machine state defined in the Multiboot Specification v2 [8] or v1 [9].
- All DMA activity targeting the physical memory region occupied by the microhypervisor must be quiesced. That physical memory region should also be protected against DMA accesses on systems with an SMMU.

## 8.1.1.1 Multiboot v2 Launch

Only this launch type supports 64-bit **UEFI** platforms.

| Register | Value / Description                                                                        |
|----------|--------------------------------------------------------------------------------------------|
| EIP      | Physical address of the NOVA Protection Domain (PD <sub>NOVA</sub> ) ELF image entry point |
| EAX      | Multiboot v2 magic value (0x36d76289) [8]                                                  |
| EBX      | Physical address of the Multiboot v2 information structure [8]                             |
| Other    | ~                                                                                          |

The NOVA microhypervisor consumes the following multiboot tags, if present: 1, 3, 12, 20.

## 8.1.1.2 Multiboot v1 Launch

| Register | Value / Description                                                                        |
|----------|--------------------------------------------------------------------------------------------|
| EIP      | Physical address of the NOVA Protection Domain (PD <sub>NOVA</sub> ) ELF image entry point |
| EAX      | Multiboot v1 magic value (0x2badb002) [9]                                                  |
| EBX      | Physical address of the Multiboot v1 information structure [9]                             |
| Other    | ~                                                                                          |

The NOVA microhypervisor consumes the following multiboot flags, if present: 2, 3.

## 8.1.2 Root Protection Domain

The NOVA microhypervisor sets up the CPU register state as follows when it transfers control to the Root Execution Context ( $EC_{ROOT}$ ):

| Register | Value / Description                                                                       |
|----------|-------------------------------------------------------------------------------------------|
| RIP      | Virtual address of the Root Protection Domain (PD <sub>ROOT</sub> ) ELF image entry point |
| RSP      | Virtual address of the Hypervisor Information Page [ARM, x86] (HIP)                       |
| RDI      | EAX at boot time †                                                                        |
| RSI      | EBX at boot time †                                                                        |
| Other    | ~                                                                                         |

<sup>†</sup>The register contains the preserved original value from the point when control was transferred from the bootloader to the microhypervisor.
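Because the root execution context receives the bootloader's EAX value in RDI, it can determine which launch type was used and therefore how to interpret the information structure address in RSI. The sketch below shows one way to do this with the magic values from Section 8.1.1. The enum and function names are hypothetical, and truncating the register to 32 bits is a conservative assumption.

```c
#include <assert.h>
#include <stdint.h>

#define MULTIBOOT_V1_MAGIC UINT32_C(0x2badb002)
#define MULTIBOOT_V2_MAGIC UINT32_C(0x36d76289)

enum launch_type { LAUNCH_UNKNOWN, LAUNCH_MULTIBOOT_V1, LAUNCH_MULTIBOOT_V2 };

/* boot_eax: value found in RDI at root entry (EAX at boot time). */
static inline enum launch_type launch_type_from_eax(uint64_t boot_eax)
{
    switch ((uint32_t)boot_eax) {   /* assumption: only the low 32 bits matter */
    case MULTIBOOT_V2_MAGIC: return LAUNCH_MULTIBOOT_V2;
    case MULTIBOOT_V1_MAGIC: return LAUNCH_MULTIBOOT_V1;
    default:                 return LAUNCH_UNKNOWN;
    }
}
```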

## 8.2 Protected Resources

The following resources are protected by the NOVA microhypervisor and are therefore inaccessible to user-mode applications.

## 8.2.1 Memory Space

Physical memory regions occupied by:

- NOVA microhypervisor conveyed via HIP.
- LAPIC, IOAPIC devices conveyed via ACPI MADT.
- IOMMU devices [16, 17] conveyed via ACPI DMAR or IVRS.
- Firmware runtime services conveyed via **UEFI** memory map.

## 8.2.2 I/O Port Space

- ACPI fixed registers PM1a\_CNT, PM1b\_CNT, PM2\_CNT conveyed via ACPI FADT.
- SMI\_CMD port conveyed via ACPI FADT.

## 8.3 Physical Memory

## 8.3.1 Memory Map

The Root Protection Domain (PD<sub>ROOT</sub>) can obtain a list of available/reserved memory regions as follows:

- On platforms using Multiboot v2 (UEFI boot services enabled), by parsing the UEFI memory map [7].
- On platforms using Multiboot v2, by parsing the Multiboot v2 memory map [8].
- On platforms using Multiboot v1, by parsing the Multiboot v1 memory map [9].

## 8.4 Virtual Memory

The accessible virtual memory range for user-mode applications is 0-0x7ffffffffff.

## 8.4.1 Cacheability Attributes

| Encoding | ATTR <sub>CA</sub> | Description        |
|----------|--------------------|--------------------|
| 0x0      | WB                 | Write Back         |
| 0x1      | WT                 | Write Through      |
| 0x2      | WC                 | Write Combining    |
| 0x3      | UC                 | Strong Uncacheable |
| 0x4      | WP                 | Write Protected    |

Please refer to [4, 5] for details on the architectural behavior.

## 8.4.2 Shareability Attributes

| Encoding | ATTR <sub>SH</sub>        | Description           |
|----------|---------------------------|-----------------------|
| 0x0      | UNUSED                    | Always use this value |

## 8.5 Event-Specific Capability Selectors

For the delivery of exception/intercept messages, the microhypervisor performs an implicit portal traversal.

The selector for the destination portal (SEL<sub>OBJ</sub>):

- is determined by adding the exception/intercept number to the affected Execution Context's Event Selector Base [ARM, x86] (SEL<sub>EVT</sub>).
- indexes into the Object Space (SPC<sub>OBJ</sub>) of the affected EC's Protection Domain (PD).
- must refer to a PT Object Capability (CAP<sub>OBJ<sub>PT</sub></sub>) with permission EVENT that is bound to an EC on the same core as the affected EC, otherwise the affected EC is killed.
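The selector arithmetic described above can be sketched as follows. The helper name and the enum constants are hypothetical; the event numbers are the host exception numbers listed in Section 8.5.1.

```c
#include <assert.h>
#include <stdint.h>

/* Host exception numbers taken from the table in Section 8.5.1. */
enum { EXC_GP = 0x0d, EXC_PF = 0x0e };

/* Hypothetical helper: SEL_OBJ = SEL_EVT + exception/intercept number. */
static inline uint64_t event_portal_selector(uint64_t sel_evt, uint32_t event_number)
{
    return sel_evt + event_number;
}
```

For example, an EC created with an Event Selector Base of 0x1000 would have its page-fault messages delivered through the portal at selector 0x100e of its PD's Object Space.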

## 8.5.1 Architectural Events

## **Host Exceptions**

| SEL <sub>OBJ</sub>           | Exception | SEL <sub>OBJ</sub>        | Exception |
|------------------------------|-----------|---------------------------|-----------|
| $SEL_{EVT} + 0x00$           | #DE       | $SEL_{EVT} + 0x10$        | #MF       |
| $SEL_{EVT} + 0x01$           | #DB       | $SEL_{EVT} + 0x11$        | #AC       |
| $SEL_{EVT} + 0x02$           | reserved  | $SEL_{EVT} + 0x12$        | #MC*      |
| $SEL_{EVT} + 0x03$           | #BP       | $SEL_{EVT} + 0x13$        | #XM       |
| $SEL_{EVT} + 0x04$           | #OF       | $SEL_{EVT} + 0x14$        | #VE       |
| $SEL_{EVT} + 0x05$           | #BR       | $SEL_{EVT} + 0x15$        | #CP       |
| $SEL_{EVT} + 0x06$           | #UD       | $SEL_{EVT} + 0x16$        | reserved  |
| $SEL_{EVT} + 0x07$           | #NM*      | $SEL_{EVT} + 0x17$        | reserved  |
| $SEL_{EVT} + 0x08$           | #DF*      | $SEL_{EVT} + 0x18$        | reserved  |
| $SEL_{EVT} + 0x09$           | reserved  | $SEL_{EVT} + 0x19$        | reserved  |
| $SEL_{EVT} + 0x0a$           | #TS*      | $SEL_{EVT} + 0x1a$        | reserved  |
| $SEL_{EVT} + 0x0b$           | #NP       | $SEL_{EVT} + 0x1b$        | reserved  |
| $SEL_{EVT} + 0x0c$           | #SS       | $SEL_{EVT} + 0x1c$        | reserved  |
| $SEL_{EVT} + 0x0d$           | #GP       | $SEL_{EVT} + 0x1d$        | reserved  |
| $SEL_{EVT} + 0x0e$           | #PF       | $SEL_{EVT} + 0x1e$        | reserved  |
| $SEL_{EVT} + 0x0f$           | reserved  | $SEL_{EVT} + 0x1f$        | reserved  |

<sup>\*</sup>These events may be handled by the microhypervisor, in which case they will not cause portal traversals.

<sup>†</sup>These events may be force-enabled by the microhypervisor, in which case they will cause portal traversals.

## **Guest Intercepts (VMX)**

| SEL <sub>OBJ</sub>          | Intercept                             | SEL <sub>OBJ</sub>        | Intercept                   |
|-----------------------------|---------------------------------------|---------------------------|-----------------------------|
| $SEL_{EVT} + 0x00$          | Exception or NMI*                     | $SEL_{EVT} + 0x28$        | PAUSE                       |
| $SEL_{EVT} + 0x01$          | External Interrupt*                   | $SEL_{EVT} + 0x29$        | VM Entry Failure (MCE)      |
| $SEL_{EVT} + 0x02$          | Triple Fault <sup>†</sup>             | $SEL_{EVT} + 0x2a$        | reserved                    |
| $SEL_{EVT} + 0x03$          | INIT <sup>†</sup>                     | $SEL_{EVT} + 0x2b$        | TPR Below Threshold         |
| $SEL_{EVT} + 0x04$          | SIPI <sup>†</sup>                     | $SEL_{EVT} + 0x2c$        | APIC Access                 |
| $SEL_{EVT} + 0x05$          | I/O SMI                               | $SEL_{EVT} + 0x2d$        | Virtualized EOI             |
| $SEL_{EVT} + 0x06$          | Other SMI                             | $SEL_{EVT} + 0x2e$        | GDTR/IDTR Access            |
| $SEL_{EVT} + 0x07$          | Interrupt Window                      | $SEL_{EVT} + 0x2f$        | LDTR/TR Access              |
| $SEL_{EVT} + 0x08$          | NMI Window                            | $SEL_{EVT} + 0x30$        | EPT Violation <sup>†</sup>  |
| $SEL_{EVT} + 0x09$          | Task Switch <sup>†</sup>              | $SEL_{EVT} + 0x31$        | EPT Misconfiguration        |
| $SEL_{EVT} + 0x0a$          | CPUID <sup>†</sup>                    | $SEL_{EVT} + 0x32$        | INVEPT                      |
| $SEL_{EVT} + 0x0b$          | GETSEC <sup>†</sup>                   | $SEL_{EVT} + 0x33$        | RDTSCP                      |
| $SEL_{EVT} + 0x0c$          | HLT <sup>†</sup>                      | $SEL_{EVT} + 0x34$        | Preemption Timer            |
| $SEL_{EVT} + 0x0d$          | INVD <sup>†</sup>                     | $SEL_{EVT} + 0x35$        | INVVPID                     |
| $SEL_{EVT} + 0x0e$          | INVLPG                                | $SEL_{EVT} + 0x36$        | WBINVD, WBNOINVD            |
| $SEL_{EVT} + 0x0f$          | RDPMC                                 | $SEL_{EVT} + 0x37$        | XSETBV                      |
| $SEL_{EVT} + 0x10$          | RDTSC                                 | $SEL_{EVT} + 0x38$        | APIC Write                  |
| $SEL_{EVT} + 0x11$          | RSM                                   | $SEL_{EVT} + 0x39$        | RDRAND                      |
| $SEL_{EVT} + 0x12$          | VMCALL                                | $SEL_{EVT} + 0x3a$        | INVPCID                     |
| $SEL_{EVT} + 0x13$          | VMCLEAR                               | $SEL_{EVT} + 0x3b$        | VMFUNC                      |
| $SEL_{EVT} + 0x14$          | VMLAUNCH                              | $SEL_{EVT} + 0x3c$        | ENCLS                       |
| $SEL_{EVT} + 0x15$          | VMPTRLD                               | $SEL_{EVT} + 0x3d$        | RDSEED                      |
| $SEL_{EVT} + 0x16$          | VMPTRST                               | $SEL_{EVT} + 0x3e$        | PML Log Full                |
| $SEL_{EVT} + 0x17$          | VMREAD                                | $SEL_{EVT} + 0x3f$        | XSAVES                      |
| $SEL_{EVT} + 0x18$          | VMRESUME                              | $SEL_{EVT} + 0x40$        | XRSTORS                     |
| $SEL_{EVT} + 0x19$          | VMWRITE                               | $SEL_{EVT} + 0x41$        | reserved                    |
| $SEL_{EVT} + 0x1a$          | VMXOFF                                | $SEL_{EVT} + 0x42$        | SPP Miss / Misconfiguration |
| $SEL_{EVT} + 0x1b$          | VMXON                                 | $SEL_{EVT} + 0x43$        | UMWAIT                      |
| $SEL_{EVT} + 0x1c$          | CR Access*                            | $SEL_{EVT} + 0x44$        | TPAUSE                      |
| $SEL_{EVT} + 0x1d$          | DR Access                             | $SEL_{EVT} + 0x45$        | LOADIWKEY                   |
| $SEL_{EVT} + 0x1e$          | I/O Access <sup>†</sup>               | $SEL_{EVT} + 0x46$        | reserved                    |
| $SEL_{EVT} + 0x1f$          | RDMSR <sup>†</sup>                    | $SEL_{EVT} + 0x47$        | reserved                    |
| $SEL_{EVT} + 0x20$          | WRMSR <sup>†</sup>                    | $SEL_{EVT} + 0x48$        | ENQCMD PASID Failure        |
| $SEL_{EVT} + 0x21$          | VM Entry Failure (State) <sup>†</sup> | $SEL_{EVT} + 0x49$        | ENQCMDS PASID Failure       |
| $SEL_{EVT} + 0x22$          | VM Entry Failure (MSR)                | $SEL_{EVT} + 0x4a$        | Bus Lock                    |
| $SEL_{EVT} + 0x23$          | reserved                              | $SEL_{EVT} + 0x4b$        | Notify Window               |
| $SEL_{EVT} + 0x24$          | MWAIT                                 | $SEL_{EVT} + 0x4c$        | SEAMCALL                    |
| $SEL_{EVT} + 0x25$          | MTF                                   | $SEL_{EVT} + 0x4d$        | TDCALL                      |
| $SEL_{EVT} + 0x26$          | reserved                              | $SEL_{EVT} + 0x4e$        | reserved                    |
| $SEL_{EVT} + 0x27$          | MONITOR                               | $SEL_{EVT} + 0x4f$        | reserved                    |

Please refer to [4] for more details on each of these events.

## 8.5.2 Microhypervisor Events

| SEL <sub>OBJ</sub>             | Event   |
|--------------------------------|---------|
| $SEL_{EVT} + SEL_{ARCH} + 0x0$ | Startup |
| $SEL_{EVT} + SEL_{ARCH} + 0x1$ | Recall  |

The value of SEL<sub>ARCH</sub> depends on the origin of the event:

- SEL<sub>ARCH</sub> = SEL<sub>HST/ARCH</sub> (0x20) for events that occurred in the host.
- SEL<sub>ARCH</sub> = SEL<sub>GST/ARCH</sub> (0x100) for events that occurred in the guest.
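A small sketch of the resulting selector computation for microhypervisor events is given below. The two offsets are the values stated above; the macro, enum, and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SEL_HST_ARCH UINT64_C(0x20)    /* events that occurred in the host */
#define SEL_GST_ARCH UINT64_C(0x100)   /* events that occurred in the guest */

enum mhv_event { EVT_STARTUP = 0x0, EVT_RECALL = 0x1 };

/* Hypothetical helper: SEL_OBJ = SEL_EVT + SEL_ARCH + event offset. */
static inline uint64_t mhv_event_selector(uint64_t sel_evt, bool guest, enum mhv_event evt)
{
    return sel_evt + (guest ? SEL_GST_ARCH : SEL_HST_ARCH) + (uint64_t)evt;
}
```

With these offsets, the host microhypervisor events start directly after the 32 host exception selectors (0x00 to 0x1f), and the guest microhypervisor events lie above the guest intercept selectors listed in Section 8.5.1.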

## 8.6 Architecture-Dependent Structures

## 8.6.1 Hypervisor Information Page

The architecture-dependent HIP structure is empty.

## 8.6.2 User Thread Control Block

|                     |                    |                        |                                |          |
|---------------------|--------------------|------------------------|--------------------------------|----------|
| -                   |                    | IA32_KERNEL_GS_BASE    |                                |          |
| IA32_FMASK          |                    | IA32_LSTAR             |                                |          |
| IA32_STAR           |                    | IA32_EFER              |                                |          |
| IA32_PAT            |                    | IA32_SYSENTER_EIP      |                                |          |
| IA32_SYSENTER_ESP   |                    | IA32_SYSENTER_CS       |                                |          |
| DR7                 |                    | CR8                    |                                |          |
| CR4                 |                    | CR3                    |                                |          |
| CR2                 |                    | CR0                    |                                |          |
| PDPTE3              |                    | PDPTE2                 |                                |          |
| PDPTE1              |                    | PDPTE0                 |                                |          |
| Base IDTR           |                    | Limit IDTR             |                                | -        |
| Base GDTR           |                    | Limit GDTR             |                                | -        |
| Base LDTR           |                    | Limit LDTR             | AR LDTR*                       | SEL LDTR |
| Base TR             |                    | Limit TR               | AR TR*                         | SEL TR   |
| Base GS             |                    | Limit GS               | AR GS*                         | SEL GS   |
| Base FS             |                    | Limit FS               | AR FS*                         | SEL FS   |
| Base ES             |                    | Limit ES               | AR ES*                         | SEL ES   |
| Base DS             |                    | Limit DS               | AR DS*                         | SEL DS   |
| Base SS             |                    | Limit SS               | AR SS*                         | SEL SS   |
| Base CS             |                    | Limit CS               | AR CS*                         | SEL CS   |
| IDT Vectoring Error | IDT Vectoring Info | Interruption Error     | Interruption Info <sup>†</sup> |          |
| TPR Threshold       | PF Error Match     | PF Error Mask          | EXC Intercepts                 |          |
| CR4 Intercepts      |                    | CR0 Intercepts         |                                |          |
| 3rd Exec Controls   |                    | 2nd Exec Controls      | 1st Exec Controls              |          |
| 2nd Exit Qualification |                 | 1st Exit Qualification |                                |          |
| Activity            | Interruptibility   | Instruction Info       | Instruction Length             |          |
| RIP                 |                    | RFLAGS                 |                                |          |
| R15                 |                    | R14                    |                                |          |
| R13                 |                    | R12                    |                                |          |
| R11                 |                    | R10                    |                                |          |
| R9                  |                    | R8                     |                                |          |
| R7 (RDI)            |                    | R6 (RSI)               |                                |          |
| R5 (RBP)            |                    | R4 (RSP)               |                                |          |
| R3 (RBX)            |                    | R2 (RDX)               |                                |          |
| R1 (RCX)            |                    | R0 (RAX)               |                                |          |

<sup>\*</sup>See Section 8.6.2.1 for encoding details.

<sup>†</sup>See Section 8.6.2.2 for encoding details.

## 8.6.2.1 Encoding: Segment Access Rights

| Field | ~ | U  | G  | D/B | L | AVL | P | DPL | S | Type |
|-------|---|----|----|-----|---|-----|---|-----|---|------|
| Bits  |   | 12 | 11 | 10  | 9 | 8   | 7 | 6-5 | 4 | 3-0  |

| Field | Description                          |
|-------|--------------------------------------|
| U     | 0 = Segment Usable                   |
|       | 1 = Segment Unusable                 |
| G     | Granularity                          |
| D/B   | 0 = 16-bit segment                   |
|       | 1 = 32-bit segment                   |
| L     | 64-bit mode active (CS only)         |
| AVL   | Available for use by system software |
| P     | Segment Present                      |
| DPL   | Descriptor Privilege Level           |
| S     | 0 = System                           |
|       | 1 = Code or Data                     |
| Type  | Segment Type                         |
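To illustrate the encoding, the sketch below unpacks an access rights value into its fields using the bit positions given above. The structure and function names are hypothetical. The value is taken as a 32-bit integer so that no assumption is made about the width of the AR field in the UTCB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct seg_ar {
    uint8_t type;   /* bits 3-0: segment type */
    bool    s;      /* bit 4: 0 = system, 1 = code or data */
    uint8_t dpl;    /* bits 6-5: descriptor privilege level */
    bool    p;      /* bit 7: segment present */
    bool    avl;    /* bit 8: available for system software */
    bool    l;      /* bit 9: 64-bit mode active (CS only) */
    bool    db;     /* bit 10: 0 = 16-bit, 1 = 32-bit segment */
    bool    g;      /* bit 11: granularity */
    bool    u;      /* bit 12: 1 = segment unusable */
};

/* Hypothetical helper: decodes the access rights encoding shown above. */
static inline struct seg_ar seg_ar_decode(uint32_t ar)
{
    struct seg_ar d = {
        .type = (uint8_t)(ar & 0xf),
        .s    = (ar >> 4) & 1,
        .dpl  = (uint8_t)((ar >> 5) & 0x3),
        .p    = (ar >> 7) & 1,
        .avl  = (ar >> 8) & 1,
        .l    = (ar >> 9) & 1,
        .db   = (ar >> 10) & 1,
        .g    = (ar >> 11) & 1,
        .u    = (ar >> 12) & 1,
    };
    return d;
}
```

For instance, the value 0xa9b decodes as a present, ring-0 code segment of type 0xb with L and G set, which is the shape of a typical 64-bit code segment.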

## 8.6.2.2 Encoding: Interruption Information

| Field | V  | ~     | N  | I  | E  | Type | Vector |
|-------|----|-------|----|----|----|------|--------|
| Bits  | 31 | 30-14 | 13 | 12 | 11 | 10-8 | 7-0    |

| Field  | Description                                                              |
|--------|--------------------------------------------------------------------------|
| V      | 0 = Fields E, Type, Vector are invalid                                   |
|        | 1 = Fields E, Type, Vector are valid                                     |
| N      | 0 = Do not request an NMI window                                         |
|        | 1 = Request an NMI window                                                |
| I      | 0 = Do not request an interrupt window                                   |
|        | 1 = Request an interrupt window                                          |
| E      | 0 = Do not deliver the error code from the UTCB Interruption Error field |
|        | 1 = Deliver the error code from the UTCB Interruption Error field        |
| Type   | 0 = External Interrupt                                                   |
|        | 2 = Non-Maskable Interrupt                                               |
|        | 3 = Hardware Exception                                                   |
|        | 4 = Software Interrupt                                                   |
|        | 5 = Privileged Software Exception                                        |
|        | 6 = Software Exception                                                   |
|        | 7 = Other Event (not delivered through IDT)                              |
| Vector | IDT Vector of Interrupt or Exception                                     |
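The following sketch composes an Interruption Info value from the fields above, as a virtual machine monitor might do before replying with the INJ state. All identifiers are hypothetical; only the bit positions and type encodings are taken from this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum intr_type {
    INTR_EXTERNAL          = 0,
    INTR_NMI               = 2,
    INTR_HW_EXCEPTION      = 3,
    INTR_SW_INTERRUPT      = 4,
    INTR_PRIV_SW_EXCEPTION = 5,
    INTR_SW_EXCEPTION      = 6,
    INTR_OTHER             = 7
};

#define INTR_INFO_E (UINT32_C(1) << 11)   /* deliver error code */
#define INTR_INFO_I (UINT32_C(1) << 12)   /* request interrupt window */
#define INTR_INFO_N (UINT32_C(1) << 13)   /* request NMI window */
#define INTR_INFO_V (UINT32_C(1) << 31)   /* E, Type, Vector are valid */

/* Hypothetical helper: encodes an event injection with V = 1. */
static inline uint32_t intr_info_inject(uint8_t vector, enum intr_type type, bool error_code)
{
    return INTR_INFO_V
         | (error_code ? INTR_INFO_E : UINT32_C(0))
         | ((uint32_t)type << 8)
         | vector;
}
```

Injecting #GP (vector 0x0d) as a hardware exception with an error code yields 0x80000b0d. A value of INTR_INFO_I alone (0x1000) requests an interrupt window without injecting an event, because V is clear.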

## 8.6.3 Message Transfer Descriptor

The Message Transfer Descriptor [ARM, x86] (MTD), which controls the subset of the architectural state transferred during exceptions and intercepts, as described in Section 4.4.2, has the following layout:



Each MTD bit controls the transfer of the listed architectural state to/from the respective fields in the UTCB (8.6.2) as follows:

- State with access r can be read from the architectural state into the UTCB.
- State with access w can be written from the UTCB into the architectural state.

| MTD Bit             | Access | Host Exception State              | Guest Intercept State                                                  |
|---------------------|--------|-----------------------------------|------------------------------------------------------------------------|
| POISON              | w      | Kills the Thread                  | Kills the vCPU                                                         |
| GPR <sub>0-7</sub>  | rw     | R0 ... R7                         | R0 ... R7                                                              |
| GPR <sub>8-15</sub> | rw     | R8 ... R15                        | R8 ... R15                                                             |
| RFLAGS              | rw     | RFLAGS*                           | RFLAGS                                                                 |
| RIP                 | rw     | RIP                               | RIP, Instruction Length, Instruction Info                              |
| STA                 | rw     | -                                 | Interruptibility State, Activity State                                 |
| QUAL                | r      | Exit Qualifications <sup>†</sup>  | Exit Qualifications                                                    |
| CTRL                | w      | -                                 | Execution Controls, CR Intercepts, EXC Intercepts, PF Error Mask/Match |
| TPR                 | w      | -                                 | TPR Threshold                                                          |
| INJ                 | rw     | -                                 | Interruption Info, Interruption Error                                  |
|                     | r      | -                                 | IDT Vectoring Info, IDT Vectoring Error                                |
| CS/SS               | rw     | -                                 | CS, SS (Selector, Base, Limit, AR)                                     |
| DS/ES               | rw     | -                                 | DS, ES (Selector, Base, Limit, AR)                                     |
| FS/GS               | rw     | -                                 | FS, GS (Selector, Base, Limit, AR)                                     |
| TR                  | rw     | -                                 | TR (Selector, Base, Limit, AR)                                         |
| LDTR                | rw     | -                                 | LDTR (Selector, Base, Limit, AR)                                       |
| GDTR                | rw     | -                                 | GDTR (Base, Limit)                                                     |
| IDTR                | rw     | -                                 | IDTR (Base, Limit)                                                     |
| PDPTE               | rw     | -                                 | PDPTE0 ... PDPTE3                                                      |
| CR                  | rw     | -                                 | CR0, CR2, CR3, CR4, CR8                                                |
| DR                  | rw     | -                                 | DR7                                                                    |
| SYSENTER            | rw     | -                                 | IA32_SYSENTER_{CS,ESP,EIP}                                             |
| PAT                 | rw     | -                                 | IA32_PAT                                                               |
| EFER                | rw     | -                                 | IA32_EFER                                                              |
| SYSCALL             | rw     | -                                 | IA32_{STAR,LSTAR,FMASK}                                                |
| KERNEL_GS           | rw     | -                                 | IA32_KERNEL_GS_BASE                                                    |
| TLB                 | w      | -                                 | Invalidates the TLB for the vCPU                                       |

<sup>\*</sup>Only the arithmetic flags are writable.

<sup>†</sup>The 1st exit qualification contains the exception error code. The 2nd exit qualification contains the fault address.

## 8.7 Calling Convention

The following pages describe the calling convention for each hypercall. An execution context calls into the microhypervisor by loading the hypercall identifier and other parameters into the specified processor registers and then executing the syscall instruction [4, 5].

The hypercall identifier consists of the hypercall number and hypercall-specific flags, as illustrated in Figure 8.1.



Figure 8.1: Hypercall Identifier

The status code returned from a hypercall has the format shown in Figure 8.2.



Figure 8.2: Status Code

The assignment of hypercall parameters to general-purpose registers is shown on the left side; the contents of the registers after the hypercall are shown on the right side.

## **IPC Call**

|                         |     | ipc_call |     |           |
|-------------------------|-----|----------|-----|-----------|
| pt[63-8] hypercall[7-0] | RDI |          |     |           |
| mtd[31-0]               | RSI |          | RSI | mtd[31-0] |
| -                       | RCX |          | RCX | RIP+2     |
| -                       | R11 |          | R11 | 0x202     |
| -                       | RIP | syscall  | RIP | RIP+2     |
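The RCX and R11 values in the diagram above follow from the architectural behavior of the syscall instruction, which saves the address of the next instruction in RCX and a flags image in R11 [4, 5]. The sketch below spells out the arithmetic: syscall is 2 bytes long, and 0x202 is an RFLAGS value with only the always-set reserved bit 1 and the interrupt enable flag (bit 9) set. The structure and function names are hypothetical, and reading 0x202 as an RFLAGS image is an interpretation rather than a statement from this specification.

```c
#include <assert.h>
#include <stdint.h>

#define SYSCALL_INSN_LEN 2                    /* syscall encodes as the bytes 0f 05 */
#define RFLAGS_FIXED1    (UINT64_C(1) << 1)   /* reserved bit, always set */
#define RFLAGS_IF        (UINT64_C(1) << 9)   /* interrupt enable flag */

/* Registers a caller should treat as clobbered by a hypercall. */
struct syscall_clobbers {
    uint64_t rcx;   /* address of the instruction following syscall */
    uint64_t r11;   /* fixed flags image */
};

/* Hypothetical helper: values expected in RCX and R11 after ipc_call. */
static inline struct syscall_clobbers clobbers_after(uint64_t syscall_rip)
{
    struct syscall_clobbers c = {
        .rcx = syscall_rip + SYSCALL_INSN_LEN,
        .r11 = RFLAGS_FIXED1 | RFLAGS_IF,     /* 0x202 */
    };
    return c;
}
```

In practice this means that inline assembly wrappers around hypercalls should list RCX and R11 as clobbered registers.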

## **IPC Reply**

## **Create Protection Domain**



## **Create Execution Context**



## **Create Scheduling Context**



## **Create Portal**



## **Create Semaphore**



## **Control Protection Domain**



## **Control Execution Context**

## **Control Scheduling Context**



## **Control Portal**

$$
\begin{array}{rl|c|ll}
pt_{[63-8]} \ hypercall_{[7-0]} & RDI & & & \\
pid & RSI & & & \\
mtd_{[31-0]} & RDX & & & \\
- & RCX & ctrl\_pt & RCX & RIP+2 \\
- & R11 & & R11 & 0x202 \\
- & RIP & & RIP & RIP+2
\end{array}
$$

## **Control Semaphore**

## **Control Power Management**

## **Assign Interrupt**



## **Assign Device**



# Part V Appendix

# A Acronyms

ACPI Advanced Configuration and Power Interface [6]

BDF PCI Bus : Device : Function

BSP Bootstrap Processor

CAP Capability

CAP<sub>0</sub> Null Capability

CAP<sub>MEM</sub> Memory Capability

CAP<sub>MSR</sub> MSR Capability

CAP<sub>OBJ</sub> Object Capability

CAP<sub>OBJ<sub>PD</sub></sub> PD Object Capability

CAP<sub>OBJ<sub>EC</sub></sub> EC Object Capability

CAP<sub>OBJ<sub>SC</sub></sub> SC Object Capability

CAP<sub>OBJ<sub>PT</sub></sub> PT Object Capability

CAP<sub>OBJ<sub>SM</sub></sub> SM Object Capability

CAP<sub>PIO</sub> I/O Port Capability

CPU Central Processing Unit

DMA Direct Memory Access

EC Execution Context

EC<sub>CURRENT</sub> Current Execution Context
EC<sub>ROOT</sub> Root Execution Context

ELF Executable and Linkable Format [18]

FDT Flattened Device Tree [10]

FPU Floating Point Unit

GIC Generic Interrupt Controller [11, 12]

GICC GIC CPU Interface
GICD GIC Distributor
GICH GIC HYP Interface
GICR GIC Redistributor

HIP Hypervisor Information Page [ARM, x86]

IOAPIC I/O Advanced Programmable Interrupt Controller

IOMMU I/O Memory Management Unit [16, 17]

IP Instruction Pointer

IPC Inter-Process Communication

LAPIC Local Advanced Programmable Interrupt Controller

MMU Memory Management Unit
MSI PCI Message Signaled Interrupt

MSR Model-Specific Register

MTD Message Transfer Descriptor [ARM, x86]

NOVA NOVA OS Virtualization Architecture [2]

PCI Peripheral Component Interconnect [19, 20]

PD Protection Domain

PD<sub>CURRENT</sub> Current Protection Domain
PD<sub>NOVA</sub> NOVA Protection Domain
PD<sub>ROOT</sub> Root Protection Domain

PID Portal Identifier

PT Portal

SC Scheduling Context

SC<sub>CURRENT</sub> Current Scheduling Context
SC<sub>ROOT</sub> Root Scheduling Context

SEL Capability Selector

SEL\_EVT Event Selector Base [ARM, x86]
SEL\_MEM Memory Capability Selector
SEL\_MSR MSR Capability Selector
SEL\_OBJ Object Capability Selector
SEL\_PIO I/O Port Capability Selector
SID SMMU Stream Identifier

SM Semaphore

SMMU System Memory Management Unit [13, 14]

SP Stack Pointer
SPC<sub>MEM</sub> Memory Space
SPC<sub>MSR</sub> MSR Space
SPC<sub>OBJ</sub> Object Space
SPC<sub>PIO</sub> I/O Port Space

STC System Time Counter

TYPE<sub>SPC</sub> Space Type
TYPE<sub>ACC</sub> Access Type

UART Universal Asynchronous Receiver Transmitter
UEFI Unified Extensible Firmware Interface [7]
UTCB User Thread Control Block [ARM, x86]

VMM Virtual-Machine Monitor

| Name       | Description                                      |
|------------|--------------------------------------------------|
| ipc_call   | Hypercall [ARM, x86]: IPC Call                   |
| ipc_reply  | Hypercall [ARM, x86]: IPC Reply                  |
| create_pd  | Hypercall [ARM, x86]: Create Protection Domain   |
| create_ec  | Hypercall [ARM, x86]: Create Execution Context   |
| create_sc  | Hypercall [ARM, x86]: Create Scheduling Context  |
| create_pt  | Hypercall [ARM, x86]: Create Portal              |
| create_sm  | Hypercall [ARM, x86]: Create Semaphore           |
| ctrl_pd    | Hypercall [ARM, x86]: Control Protection Domain  |
| ctrl_ec    | Hypercall [ARM, x86]: Control Execution Context  |
| ctrl_sc    | Hypercall [ARM, x86]: Control Scheduling Context |
| ctrl_pt    | Hypercall [ARM, x86]: Control Portal             |
| ctrl_sm    | Hypercall [ARM, x86]: Control Semaphore          |
| ctrl_pm    | Hypercall [ARM, x86]: Control Power Management   |
| assign_int | Hypercall [ARM, x86]: Assign Interrupt           |
| assign_dev | Hypercall [ARM, x86]: Assign Device              |

# B Bibliography

- [1] RFC 2119. Internet Engineering Task Force (IETF), 1997. URL https://tools.ietf.org/html/rfc2119.
- [2] Udo Steinberg and Bernhard Kauer. NOVA: A Microhypervisor-Based Secure Virtualization Architecture. In Proceedings of the 5th ACM SIGOPS/EuroSys European Conference on Computer Systems, pages 209–222. ACM, 2010. ISBN 978-1-60558-577-2. URL https://doi.acm.org/10.1145/1755913.1755935.
- [3] ARM Architecture Reference Manual ARMv8, for ARMv8-A Architecture Profile. ARM Limited, 2021. URL https://developer.arm.com/documentation/ddi0487/. Document Number: DDI0487.
- [4] Intel 64 and IA-32 Architectures Software Developer's Manual, Combined Volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D and 4. Intel Corporation, 2021. URL https://software.intel.com/en-us/articles/intel-sdm. Document Number: 325462.
- [5] AMD64 Architecture Programmer's Manual: Volumes 1-5. Advanced Micro Devices, Inc., 2021. URL https://developer.amd.com/resources/developer-guides-manuals. Document Number: 40332.
- [6] Advanced Configuration and Power Interface (ACPI) Specification. UEFI Forum, Inc., 2021. URL https://uefi.org/specifications. Version 6.4.
- [7] Unified Extensible Firmware Interface (UEFI) Specification. UEFI Forum, Inc., 2021. URL https://uefi.org/specifications. Version 2.9.
- [8] Yoshinori K. Okuji, Bryan Ford, Erich Stefan Boleyn, Kunihiro Ishiguro, Vladimir Serbinenko, and Daniel Kiper. The Multiboot2 Specification, 2016. URL https://www.gnu.org/software/grub/manual/multiboot2/multiboot.pdf. Version 2.0.
- [9] Yoshinori K. Okuji, Bryan Ford, Erich Stefan Boleyn, and Kunihiro Ishiguro. The Multiboot Specification, 2010. URL https://www.gnu.org/software/grub/manual/multiboot/multiboot.pdf. Version 0.6.96.
- [10] Devicetree Specification. Linaro Limited, 2020. URL https://www.devicetree.org/specifications. Version 0.3.
- [11] ARM Generic Interrupt Controller Architecture Specification Version 2. ARM Limited, 2013. URL https://developer.arm.com/documentation/ihi0048/. Document Number: IHI0048.
- [12] ARM Generic Interrupt Controller Architecture Specification Version 3 and Version 4. ARM Limited, 2021. URL https://developer.arm.com/documentation/ihi0069/. Document Number: IHI0069.
- [13] ARM System Memory Management Unit Architecture Specification Version 2. ARM Limited, 2016. URL https://developer.arm.com/documentation/ihi0062/. Document Number: IHI0062.
- [14] ARM System Memory Management Unit Architecture Specification Version 3. ARM Limited, 2021. URL https://developer.arm.com/documentation/ihi0070/. Document Number: IHI0070.
- [15] ARM SMC Calling Convention. ARM Limited, 2021. URL https://developer.arm.com/documentation/den0028/. Document Number: DEN0028.
- [16] Intel Virtualization Technology for Directed I/O Architecture Specification. Intel Corporation, 2021. URL https://www.intel.com/content/www/us/en/develop/download/intel-virtualization-technology-for-directed-io-architecture-specification.html. Document Number: D51397.
- [17] AMD I/O Virtualization Technology (IOMMU) Specification. Advanced Micro Devices, Inc., 2021. URL https://www.amd.com/en/support/tech-docs/amd-io-virtualization-technology-iommu-specification. Document Number: 48882.
- [18] Executable and Linking Format (ELF) Specification. TIS Committee, 1995. URL https://refspecs.linuxbase.org/elf/elf.pdf. Version 1.2.
- [19] PCI Local Bus Specification. PCI-SIG, 2004. URL https://pcisig.com/specifications. Revision 3.0.
- [20] PCI Express Base Specification. PCI-SIG, 2019. URL https://pcisig.com/specifications. Revision 5.0.

# C Console

## C.1 Memory-Buffer Console

The NOVA microhypervisor implements a memory-buffer console that provides run-time debug output. The memory-buffer console consists of a signaling semaphore (see 6.1.2) and an in-memory data structure with a header and a buffer as follows:



The start address and end address of the memory-buffer console are conveyed in the HIP.

The buffer size (N characters) can be computed as:

The fields of the header are used as follows:

- RdIdx ranges from 0 ... N-1.
   It points to the next character in the buffer that the console consumer will read and is typically advanced by the console consumer.
- WrIdx ranges from 0 ... N-1.
   It points to the next character in the buffer that the NOVA microhypervisor will write and is only advanced by the NOVA microhypervisor.
- The buffer is empty if RdIdx is equal to WrIdx.
- Otherwise WrIdx is ahead of RdIdx, wrapping around the buffer size N accordingly, i.e. character N+x will be stored in the same buffer slot as character x.
- If the buffer becomes full, the NOVA microhypervisor advances RdIdx, forcing the oldest character to be discarded from the buffer.
- At the end of each line, the NOVA microhypervisor invokes ctrl\_sm (Up) on the signaling semaphore. The console consumer should use ctrl\_sm (Down) on the signaling semaphore instead of polling WrIdx.
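The index rules above can be sketched from the consumer's point of view as follows. This is a minimal sketch under stated assumptions: the header layout (two 32-bit indices, RdIdx first) is assumed because the structure figure is not reproduced here, and the names `console_hdr`, `console_pending` and `console_drain` are hypothetical. The sketch does not model the ctrl\_sm (Down) call that a real consumer would block on before draining, and it ignores memory ordering and the case where the microhypervisor advances RdIdx concurrently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed header layout; the real layout is defined by the specification. */
struct console_hdr {
    volatile uint32_t rd_idx;   /* RdIdx: next character the consumer reads    */
    volatile uint32_t wr_idx;   /* WrIdx: next character the hypervisor writes */
};

/* Number of unread characters, accounting for wrap-around at N. */
static inline uint32_t console_pending(uint32_t rd, uint32_t wr, uint32_t n)
{
    assert(n > 0 && rd < n && wr < n);
    return (wr + n - rd) % n;
}

/* Copy up to 'max' pending characters into 'out' and advance RdIdx.
 * 'buf' is the N-character buffer; returns the number of characters copied. */
static size_t console_drain(struct console_hdr *hdr, const volatile char *buf,
                            uint32_t n, char *out, size_t max)
{
    uint32_t rd = hdr->rd_idx;
    uint32_t wr = hdr->wr_idx;          /* snapshot of the producer index */
    size_t count = 0;

    assert(n > 0);

    while (rd != wr && count < max) {   /* RdIdx == WrIdx means empty */
        out[count++] = buf[rd];
        rd = (rd + 1) % n;              /* wrap around the buffer size N */
    }

    hdr->rd_idx = rd;                   /* publish the new read position */
    return count;
}
```

With N = 8, RdIdx = 6 and WrIdx = 2, for example, the consumer reads slots 6, 7, 0 and 1 in that order and leaves RdIdx equal to WrIdx, which marks the buffer as empty again.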

## C.2 UART Console

Additionally, several different UART consoles can be used to provide boot-time-only debug output from the microhypervisor. UART consoles must be configured for 115200 baud and 8N1 mode.
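To make the 115200 baud, 8N1 setting concrete, the following sketch computes the resulting character throughput and, as an example, the divisor for a 16550-style UART. The 16550-style 16x oversampling and the 1.8432 MHz reference clock used in the usage note are assumptions for illustration only; this specification does not prescribe a particular UART type. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 8N1 framing: 1 start bit + 8 data bits + no parity bit + 1 stop bit. */
enum { UART_8N1_BITS_PER_CHAR = 1 + 8 + 0 + 1 };

/* Characters per second that the line can carry at the given baud rate. */
static inline uint32_t uart_chars_per_sec(uint32_t baud)
{
    return baud / UART_8N1_BITS_PER_CHAR;
}

/* Divisor for a UART with 16x oversampling (16550-style, an assumption). */
static inline uint32_t uart_divisor(uint32_t clock_hz, uint32_t baud)
{
    assert(baud != 0);
    return clock_hz / (16u * baud);
}
```

At 115200 baud, 8N1 framing carries 11520 characters per second. With an assumed 1.8432 MHz reference clock, `uart_divisor(1843200, 115200)` yields a divisor of 1.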

# D Download

The source code of the NOVA microhypervisor and the latest version of this document can be downloaded from GitHub: https://github.com/udosteinberg/NOVA