Merge pull request #8 from simonjbeaumont/housekeeping
Fix up code markdown. Renders better in browser and editor
djs55 committed Dec 2, 2014
2 parents fcfa86a + 56d4b5a commit f5fca1f
Showing 11 changed files with 109 additions and 95 deletions.
2 changes: 2 additions & 0 deletions README.md
@@ -48,6 +48,7 @@ $ gem install jekyll

Then you can host the site locally with the following command from the root
directory of this repository:

```
$ jekyll serve -w --baseurl '/xapi-project'
```
@@ -57,6 +58,7 @@ You will then be able to view the page at `localhost:4000/xapi-project`.
## A note on images
If you are contributing images, consider compressing them to keep this repo as
slim as possible:

```
convert -resize 900 -background white -colors 256 [input.png] [output.png]
```
55 changes: 28 additions & 27 deletions features/HA/HA.md
@@ -100,7 +100,7 @@ continue to fail then xhad will consider the host to have failed and
self-fence.

xhad is configured via a simple config file written on each host in
`/etc/xensource/xhad.conf`. The file must be identical on each host
in the cluster. To make changes to the file, HA must be disabled and then
re-enabled afterwards. Note it may not be possible to re-enable HA depending
on the configuration change (e.g. if a host has been added but that host has
@@ -114,6 +114,7 @@ The xhad.conf file is written in XML and contains
which local network interface and block device to use for heartbeating.

The following is an example xhad.conf file:

```
<?xml version="1.0" encoding="utf-8"?>
<xhad-config version="1.0">
  <!-- ... remainder of the example elided in this diff view ... -->
</xhad-config>
```

@@ -186,29 +187,29 @@ The fields have the following meaning:
xapi's notion of a management network.
- HeartbeatTimeout: if a heartbeat packet is not received for this many
seconds, then xhad considers the heartbeat to have failed. This is
the user-supplied "HA timeout" value, represented below as `T`.
`T` must be bigger than 10; we would normally use 60s.
- StateFileTimeout: if a storage update is not seen for a host for this
many seconds, then xhad considers the storage heartbeat to have failed.
We would normally use the same value as the HeartbeatTimeout `T`.
- HeartbeatInterval: interval between heartbeat packets sent. We would
normally use a value `2 <= t <= 6`, derived from the user-supplied
HA timeout via `t = (T + 10) / 10`
- StateFileInterval: interval between storage updates (also known as
"statefile updates"). This would normally be set to the same value as
HeartbeatInterval.
- HeartbeatWatchdogTimeout: If the host does not send a heartbeat for this
amount of time then the host self-fences via the Xen watchdog. We normally
set this to `T`.
- StateFileWatchdogTimeout: If the host does not update the statefile for
this amount of time then the host self-fences via the Xen watchdog. We
normally set this to `T+15`.
- BootJoinTimeout: When the host is booting and joining the liveset (i.e.
the cluster), consider the join a failure if it takes longer than this
amount of time. We would normally set this to `T+60`.
- EnableJoinTimeout: When the host is enabling HA for the first time,
consider the enable a failure if it takes longer than this amount of time.
We would normally set this to `T+60`.
- XapiHealthCheckInterval: Interval between "health checks" where we run
a script to check whether Xapi is responding or not.
- XapiHealthCheckTimeout: Number of seconds to wait before assuming that
@@ -223,20 +224,20 @@ The fields have the following meaning:
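
The timeout arithmetic above can be made concrete with a short sketch (a hypothetical helper for illustration only, not part of xhad):

```ocaml
(* Derive the usual config values from the user-supplied HA timeout T.
   T must be bigger than 10; the names mirror the fields described above. *)
let derive_timeouts ~t =
  assert (t > 10);
  let interval = (t + 10) / 10 in
  [ "HeartbeatTimeout",          t;
    "StateFileTimeout",          t;
    "HeartbeatInterval",         interval;
    "StateFileInterval",         interval;
    "HeartbeatWatchdogTimeout",  t;
    "StateFileWatchdogTimeout",  t + 15;
    "BootJoinTimeout",           t + 60;
    "EnableJoinTimeout",         t + 60 ]
```
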
In addition to the config file, Xhad exposes a simple control API which
is exposed as scripts:

- `ha_set_pool_state (Init | Invalid)`: sets the global pool state to "Init" (before starting
HA) or "Invalid" (causing all other daemons who can see the statefile to
shut down)
"Invalid"
- `ha_start_daemon`: if the pool state is "Init" then the daemon will
attempt to contact other daemons and enable HA. If the pool state is
"Active" then the host will attempt to join the existing liveset.
- `ha_query_liveset`: returns the current state of the cluster.
- `ha_propose_master`: returns whether the current node has been
elected pool master.
- `ha_stop_daemon`: shuts down the xhad on the local host. Note this
will not disarm the Xen watchdog by itself.
- `ha_disarm_fencing`: disables fencing on the local host.
- `ha_set_excluded`: when a host is being shut down cleanly, record the
fact that the VMs have all been shutdown so that this host can be ignored
in future cluster membership calculations.

@@ -288,7 +289,7 @@ Starting up a host

![Starting up a host](HA.start.svg)

First Xapi starts up the xhad via the `ha_start_daemon` script. The
daemons read their config files and start exchanging heartbeats over
the network and storage. All hosts must be online and all heartbeats must
be working for HA to be enabled -- it is not sensible to enable HA when
@@ -297,11 +298,11 @@ join the liveset then it clears the "excluded" flag which would have
been set if the host had been shut down cleanly before -- this is only
needed when a host is shutdown cleanly and then restarted.

Xapi periodically queries the state of xhad via the `ha_query_liveset`
command. The state will be `Starting` until the liveset is fully
formed at which point the state will be `Online`.
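
As a sketch (with a hypothetical `query` function standing in for the `ha_query_liveset` script), the polling loop looks like:

```ocaml
type liveset_state = Starting | Online

(* Poll until the liveset is fully formed; query wraps ha_query_liveset. *)
let rec wait_until_online query =
  match query () with
  | Starting -> Unix.sleep 1; wait_until_online query
  | Online -> ()
```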

When the `ha_start_daemon` script returns then Xapi will decide
whether to stand for master election or not. Initially when HA is being
enabled and there is a master already, this node will be expected to
stand unopposed. Later when HA notices that the master host has been
@@ -322,9 +323,9 @@ running on it. An excluded host will never allow itself to form part
of a "split brain".

Once a host has given up its master role and shut down any VMs, it is safe
to disable fencing with `ha_disarm_fencing` and stop xhad with
`ha_stop_daemon`. Once the daemon has been stopped the "excluded"
bit can be set in the statefile via `ha_set_excluded` and the
host safely rebooted.
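
A sketch of that ordering, with hypothetical OCaml wrappers named after the control scripts:

```ocaml
(* Clean HA shutdown for one host, in the order described above.  Each
   argument stands for a wrapper around the script of the same name. *)
let shutdown_ha ~ha_disarm_fencing ~ha_stop_daemon ~ha_set_excluded =
  ha_disarm_fencing ();   (* only after the master role and VMs are gone *)
  ha_stop_daemon ();      (* stop xhad; the watchdog is already disarmed *)
  ha_set_excluded ()      (* mark the statefile so the host can be ignored *)
```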

Disabling HA cleanly
--------------------

@@ -334,7 +335,7 @@ Disabling HA cleanly

HA can be shut down cleanly when the statefile is working, i.e. when hosts
are alive because of survival rule 1. First the master Xapi tells the local
Xhad to mark the pool state as "invalid" using `ha_set_pool_state`.
Every xhad instance will notice this state change the next time it performs
a storage heartbeat. The Xhad instances will shutdown and Xapi will notice
that HA has been disabled the next time it attempts to query the liveset.
@@ -353,7 +354,7 @@ on every host. Since all hosts are online thanks to survival rule 2,
the Xapi master is able to tell all Xapi instances to disable their
recovery logic. Once the Xapis have been disabled -- and there is no
possibility of split brain -- each host is asked to disable the watchdog
with `ha_disarm_fencing` and then to stop Xhad with `ha_stop_daemon`.

Add a host to the pool
----------------------
4 changes: 2 additions & 2 deletions features/XSM/XSM.md
@@ -16,8 +16,8 @@ The following diagram shows how XSM works at a high level:
The slowest part of a storage migration is migrating the storage, since virtual
disks can be very large. Xapi starts by taking a snapshot and copying that to
the destination as a background task. Before the datapath connecting the VM
to the disk is re-established, xapi tells `tapdisk` to start mirroring all
writes to a remote `tapdisk` over NBD. From this point on all VM disk writes
are written to both the old and the new disk.
When the background snapshot copy is complete, xapi can migrate the VM memory
across. Once the VM memory image has been received, the destination VM is
34 changes: 17 additions & 17 deletions features/backtraces.md
@@ -41,18 +41,18 @@ let finally f cleanup =
```
let finally f cleanup =
  try
    let result = f () in
    cleanup ();
    result
  with e ->
    cleanup ();
    raise e (* <-- backtrace starts here now *)
```

This function performs some action (i.e. `f ()`) and guarantees to
perform some cleanup action (`cleanup ()`) whether or not an exception
is thrown. This is a common pattern to ensure resources are freed (e.g.
closing a socket or file descriptor). Unfortunately the `raise e` in
the exception handler loses the backtrace context: when the exception
gets to the toplevel, `Printexc.get_backtrace ()` will point at the
`finally` rather than the real cause of the error.

We will use a variant of the solution proposed by
[Jacques-Henri Jourdan](http://gallium.inria.fr/blog/a-library-to-record-ocaml-backtraces/)
where we will record backtraces when we catch exceptions, before the
buffer is reinitialised. Our `finally` function will now look like this:

```
let finally f cleanup =
  try
    let result = f () in
    cleanup ();
    result
  with e ->
    Backtrace.is_important e;
    cleanup ();
    raise e
```

The function `Backtrace.is_important e` associates the exception `e`
with the current backtrace before it gets deleted.

Xapi always has high-level exception handlers or other wrappers around all the
threads it spawns. In particular Xapi tries really hard to associate threads
with active tasks, so it can prefix all log lines with a task id. This helps
admins see the related log lines even when there is lots of concurrent activity.
Xapi also tries very hard to label other threads with names for the same reason
(e.g. `db_gc`). Every thread should end up being wrapped in `with_thread_named`
which allows us to catch exceptions and log stacktraces from `Backtrace.get`
on the way out.

OCaml design guidelines
-----------------------

@@ -85,21 +85,21 @@ Making nice backtraces requires us to think when we write our exception raising
and handling code. In particular:

- If a function handles an exception and re-raises it, you must call
`Backtrace.is_important e` with the exception to capture the backtrace first.
- If a function raises a different exception (e.g. `Not_found` becoming a XenAPI
`INTERNAL_ERROR`) then you must use `Backtrace.reraise <old> <new>` to
ensure the backtrace is preserved (see the sketch after this list).
- All exceptions should be printable -- if the generic printer doesn't do a good
enough job then register a custom printer.
- If you are the last person who will see an exception (because you aren't going
to rethrow it) then you *may* log the backtrace via `Debug.log_backtrace e`
*if and only if* you reasonably expect the resulting backtrace to be helpful
and not spammy.
- If you aren't the last person who will see an exception (because you are going
to rethrow it, or raise another exception), then *do not* log the backtrace; the
next handler will do that.
- All threads should have a final exception handler at the outermost level;
for example `Debug.with_thread_named` will do this for you.
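
For example, a minimal sketch of the re-raising guideline (the `Internal_error` exception and the lookup function are invented for illustration):

```ocaml
exception Internal_error of string

(* Translate a low-level Not_found into an API-level error while keeping
   the original backtrace attached, per the guideline above. *)
let lookup table key =
  try Hashtbl.find table key
  with Not_found as e ->
    Backtrace.reraise e (Internal_error ("no such key: " ^ key))
```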


Backtraces in python
--------------------

@@ -269,7 +269,7 @@ The SMAPIv1 API

Errors in SMAPIv1 are returned as XMLRPC "Faults" containing a code and
a status line. Xapi transforms these into XenAPI exceptions usually of the
form `SR_BACKEND_FAILURE_<code>`. We can extend the SM backends to use the
XenAPI exception type directly: i.e. to marshal exceptions as dictionaries:

```python
# The rest of this example is collapsed in this diff view.  XenAPI failures
# are marshalled as a dictionary of this shape (values are illustrative):
results = {
    "Status": "Failure",
    "ErrorDescription": ["SR_BACKEND_FAILURE", "code", "reason"],
}
```

@@ -281,12 +281,12 @@ XenAPI exception type directly: i.e. to marshal exceptions as dictionaries:

We can then define a new backtrace-carrying error:

- code = `SR_BACKEND_FAILURE_WITH_BACKTRACE`
- param1 = json-encoded backtrace
- param2 = code
- param3 = reason

which is internally transformed into `SR_BACKEND_FAILURE_<code>` and
the backtrace is appended to the current Task backtrace. From the client's
point of view the final exception should look the same, but Xapi will have
a chance to see and log the whole backtrace.
4 changes: 2 additions & 2 deletions getting-started/architecture.md
@@ -19,8 +19,8 @@ running xapi, all sharing some storage:
At any time, at most one host is known as the *pool master* and is responsible
for co-ordination and locking resources within the pool. When a pool is
first created a master host is chosen. The master role can be transferred
- on user request in an orderly fashion (`xe pool-designate-new-master`)
- on user request in an emergency (`xe pool-emergency-transition-to-master`)
- automatically if HA is enabled on the cluster.

All hosts expose an HTTP and XML/RPC interface running on port 80 and with TLS/SSL
6 changes: 3 additions & 3 deletions squeezed/architecture/architecture.md
@@ -13,7 +13,7 @@ The following diagram shows the internals of Squeezed:
At the center of squeezed is an abstract model of a Xen host. The model includes
- the amount of already-used host memory (used by fixed overheads such as Xen
and the crash kernel)
- per-domain memory policy, specifically `dynamic-min` and `dynamic-max`, which
together describe a range, within which the domain's actual used memory should remain
- per-domain calibration data which allows us to compute the necessary balloon target
value to achieve a particular memory usage value.
@@ -32,7 +32,7 @@ their targets. Note that ballooning is fundamentally a co-operative process, so
must handle cases where the domains refuse to obey commands.

The "output" of squeezed is a list of "actions" which include:
- set domain x's `memory/target` to a new value
- set the `maxmem` of a domain to a new value (as a hard limit beyond which the domain
cannot allocate)
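
Expressed as a type, the action list might look like this (names are illustrative rather than squeezed's actual definitions):

```ocaml
type domid = int

type action =
  | Set_memory_target of domid * int64  (* new memory/target for a domain *)
  | Set_maxmem of domid * int64         (* hard limit on further allocation *)
```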

16 changes: 8 additions & 8 deletions xapi/architecture.md
@@ -38,11 +38,11 @@ The APIs are classified into categories:
efficient access to the data.
* emergency: these deal with scenarios where the master is offline

If the incoming API call should be re-sent to the master then a XenAPI `HOST_IS_SLAVE`
error message containing the master's IP is sent to the client.
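
A client-side sketch of that redirection (`Api_error` and the `rpc` function are hypothetical):

```ocaml
exception Api_error of string * string list

(* If the host we contacted is a slave, retry against the master it names. *)
let rec call_with_redirect ~rpc host request =
  try rpc host request
  with Api_error ("HOST_IS_SLAVE", [master_ip]) ->
    call_with_redirect ~rpc master_ip request
```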

Once past the initial checks, API calls enter the "message forwarding" layer which
- locks resources (via the `current_operations` mechanism)
- decides which host should execute the request.

If the request should run locally then a direct function call is used; otherwise
@@ -63,7 +63,7 @@ If the XenAPI call is a storage operation then the "storage access" layer
- invokes the relevant operation in the Storage Manager API (SMAPI) v2 interface;
- uses the SMAPIv2 to SMAPIv1 converter to generate the necessary command-line to talk to
the SMAPIv1 plugin (EXT, NFS, LVM etc) and to execute it
- persists the state of the storage objects (including the result of a `VDI.attach`
call) to persistent storage

Internally the SMAPIv1 plugins use privileged access to the Xapi database to directly
@@ -76,19 +76,19 @@ The SMAPIv1 plugins also rely on Xapi for
The Xapi database contains Host and VM metadata and is shared pool-wide. The master
keeps a copy in memory, and all other nodes remote queries to the master. The database
associates each object with a generation count which is used to implement the XenAPI
`event.next` and `event.from` APIs. The database is routinely asynchronously flushed to disk
in XML format. If the "redo-log" is enabled then all database writes are made synchronously
as deltas to a shared block device. Without the redo-log, recent updates may be lost
if Xapi is killed before a flush.
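
The generation counts enable the familiar event-polling pattern, sketched here with a hypothetical `event_from` binding for the XenAPI call:

```ocaml
(* Poll for events since the last token, feeding each response's token
   back into the next call so no update is missed. *)
let rec watch ~event_from ~handle token =
  let events, next_token = event_from ~token in
  List.iter handle events;
  watch ~event_from ~handle next_token
```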

High-Availability refers to planning for host failure, monitoring host liveness and then
following-through on the plans. Xapi defers to an external host liveness monitor
called `xhad`. When `xhad` confirms that a host has failed -- and has been
isolated from the storage -- then Xapi will restart any VMs which have failed and which
have been marked as "protected" by HA. Xapi can also impose admission control to prevent
the pool becoming too overloaded to cope with `n` arbitrary host failures.

The `xe` CLI is implemented in terms of the XenAPI, but for efficiency the implementation
is linked directly into Xapi. The `xe` program remotes its command-line to Xapi,
and Xapi sends back a series of simple commands (prompt for input; print line; fetch file;
exit etc).
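
Those simple commands could be modelled as a small type (illustrative names only):

```ocaml
type cli_command =
  | Prompt_for_input of string  (* ask the user a question *)
  | Print_line of string        (* emit a line of output *)
  | Fetch_file of string        (* e.g. ask xe to send a local file *)
  | Exit of int                 (* finish with an exit code *)
```
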
2 changes: 1 addition & 1 deletion xapi/design/XenAPI-evolution.md
@@ -87,6 +87,6 @@ The XenAPI documentation will contain its complete lifecycle history for each
XenAPI element. Only the elements described in the documentation are
"official" and supported.

Each object, message and field in `datamodel.ml` will have lifecycle
metadata attached to it, which is a list of transitions (transition type *
release * explanation string) as described above. Release notes are automatically generated from this data.
