NAME

guestfs - Library for accessing and modifying virtual machine images

SYNOPSIS

#include <guestfs.h>

guestfs_h *g = guestfs_create ();
guestfs_add_drive (g, "guest.img");
guestfs_launch (g);
guestfs_mount (g, "/dev/sda1", "/");
guestfs_touch (g, "/hello");
guestfs_umount (g, "/");
guestfs_shutdown (g);
guestfs_close (g);

cc prog.c -o prog -lguestfs
or:
cc prog.c -o prog `pkg-config libguestfs --cflags --libs`

DESCRIPTION

Libguestfs is a library for accessing and modifying disk images and virtual machines. This manual page documents the C API.

If you are looking for an introduction to libguestfs, see the web site: http://libguestfs.org/

Each virt tool has its own man page (for a full list, go to "SEE ALSO" at the end of this file).

The libguestfs FAQ contains many useful answers: guestfs-faq(1).

For examples of using the API from C, see guestfs-examples(3). For examples in other languages, see "USING LIBGUESTFS WITH OTHER PROGRAMMING LANGUAGES" below.

For tips and recipes, see guestfs-recipes(1).

If you are having performance problems, read guestfs-performance(1). To help test libguestfs, read libguestfs-test-tool(1) and guestfs-testing(1). To contribute code to libguestfs, see guestfs-hacking(1). To find out how libguestfs works, see guestfs-internals(1).

For security information, including CVEs affecting libguestfs, see guestfs-security(1).

API OVERVIEW

This section provides a gentler overview of the libguestfs API. We also try to group API calls together, where that may not be obvious from reading about the individual calls in the main section of this manual.

HANDLES

Before you can use libguestfs calls, you have to create a handle. Then you must add at least one disk image to the handle, followed by launching the handle, then performing whatever operations you want, and finally closing the handle. By convention we use the single letter g for the name of the handle variable, although of course you can use any name you want.

The general structure of all libguestfs-using programs looks like this:

guestfs_h *g = guestfs_create ();

/* Call guestfs_add_drive additional times if there are
 * multiple disk images.
 */
guestfs_add_drive (g, "guest.img");

/* Most manipulation calls won't work until you've launched
 * the handle 'g'.  You have to do this _after_ adding drives
 * and _before_ other commands.
 */
guestfs_launch (g);

/* Either: examine what partitions, LVs etc are available: */
char **partitions = guestfs_list_partitions (g);
char **logvols = guestfs_lvs (g);

/* Or: ask libguestfs to find filesystems for you: */
char **filesystems = guestfs_list_filesystems (g);

/* Or: use inspection (see INSPECTION section below). */

/* To access a filesystem in the image, you must mount it. */
guestfs_mount (g, "/dev/sda1", "/");

/* Now you can perform filesystem actions on the guest
 * disk image.
 */
guestfs_touch (g, "/hello");

/* Synchronize the disk.  This is the opposite of guestfs_launch. */
guestfs_shutdown (g);

/* Close and free the handle 'g'. */
guestfs_close (g);

The code above doesn't include any error checking. In real code you should check return values carefully for errors. In general all functions that return integers return -1 on error, and all functions that return pointers return NULL on error. See section "ERROR HANDLING" below for how to handle errors, and consult the documentation for each function call below to see precisely how they return error indications.

The code above does not free(3) the strings and arrays returned from functions. Consult the documentation for each function to find out how to free the return value.
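As a rough illustration of the point above: calls such as "guestfs_list_partitions" return a NULL-terminated array of strings, and the caller frees each string and then the array. The helper names count_strings and free_string_list below are hypothetical and are not part of libguestfs; this is only a sketch of one way to do the cleanup.

```c
#include <stdlib.h>

/* Hypothetical helper: count the entries in a NULL-terminated
 * string list.  A NULL list counts as empty.
 */
size_t
count_strings (char **list)
{
  size_t n = 0;

  while (list != NULL && list[n] != NULL)
    n++;
  return n;
}

/* Hypothetical helper: free every string in the list, then the
 * list itself.  Passing NULL is allowed and does nothing.
 */
void
free_string_list (char **list)
{
  size_t i;

  if (list == NULL)
    return;
  for (i = 0; list[i] != NULL; i++)
    free (list[i]);
  free (list);
}
```

With helpers like these, the result of "guestfs_list_partitions" could be passed to free_string_list once you have finished with it.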

See guestfs-examples(3) for fully worked examples.

DISK IMAGES

The image filename ("guest.img" in the example above) could be a disk image from a virtual machine, a dd(1) copy of a physical hard disk, an actual block device, or simply an empty file of zeroes that you have created through posix_fallocate(3). Libguestfs lets you do useful things to all of these.

The call you should use in modern code for adding drives is "guestfs_add_drive_opts". To add a disk image, allowing writes, and specifying that the format is raw, do:

guestfs_add_drive_opts (g, filename,
                        GUESTFS_ADD_DRIVE_OPTS_FORMAT, "raw",
                        -1);

You can add a disk read-only using:

guestfs_add_drive_opts (g, filename,
                        GUESTFS_ADD_DRIVE_OPTS_FORMAT, "raw",
                        GUESTFS_ADD_DRIVE_OPTS_READONLY, 1,
                        -1);

or by calling the older function "guestfs_add_drive_ro". If you use the readonly flag, libguestfs won't modify the file. (See also "DISK IMAGE FORMATS" below).

Be extremely cautious if the disk image is in use, eg. if it is being used by a virtual machine. Adding it read-write will almost certainly cause disk corruption, but adding it read-only is safe.

You should usually add at least one disk image, and you may add multiple disk images. If adding multiple disk images, they usually have to be "related", ie. from the same guest. In the API, the disk images are usually referred to as /dev/sda (for the first one you added), /dev/sdb (for the second one you added), etc.

Once "guestfs_launch" has been called you cannot add any more images. You can call "guestfs_list_devices" to get a list of the device names, in the order that you added them. See also "BLOCK DEVICE NAMING" below.

There are slightly different rules when hotplugging disks (in libguestfs ≥ 1.20). See "HOTPLUGGING" below.

MOUNTING

Before you can read or write files, create directories and so on in a disk image that contains filesystems, you have to mount those filesystems using "guestfs_mount" or "guestfs_mount_ro". If you already know that a disk image contains (for example) one partition with a filesystem on that partition, then you can mount it directly:

guestfs_mount (g, "/dev/sda1", "/");

where /dev/sda1 means literally the first partition (1) of the first disk image that we added (/dev/sda). If the disk contains Linux LVM2 logical volumes you could refer to those instead (eg. /dev/VG/LV). Note that these are libguestfs virtual devices, and are nothing to do with host devices.

If you are given a disk image and you don't know what it contains then you have to find out. Libguestfs can do that too: use "guestfs_list_partitions" and "guestfs_lvs" to list possible partitions and LVs, and either try mounting each to see what is mountable, or else examine them with "guestfs_vfs_type" or "guestfs_file". To list just filesystems, use "guestfs_list_filesystems".

Libguestfs also has a set of APIs for inspection of unknown disk images (see "INSPECTION" below). You might also want to look at higher level programs built on top of libguestfs, in particular virt-inspector(1).

To mount a filesystem read-only, use "guestfs_mount_ro". There are several other variations of the guestfs_mount_* call.

FILESYSTEM ACCESS AND MODIFICATION

The majority of the libguestfs API consists of fairly low-level calls for accessing and modifying the files, directories, symlinks etc on mounted filesystems. There are over a hundred such calls which you can find listed in detail below in this man page, and we don't even pretend to cover them all in this overview.

Specify filenames as full paths, starting with "/" and including the mount point.

For example, if you mounted a filesystem at "/" and you want to read the file called "etc/passwd" then you could do:

char *data = guestfs_cat (g, "/etc/passwd");

This would return data as a newly allocated buffer containing the full content of that file (with some conditions: see also "DOWNLOADING" below), or NULL if there was an error.

As another example, to create a top-level directory on that filesystem called "var" you would do:

guestfs_mkdir (g, "/var");

To create a symlink you could do:

guestfs_ln_s (g, "/etc/init.d/portmap",
              "/etc/rc3.d/S30portmap");

Libguestfs will reject attempts to use relative paths and there is no concept of a current working directory.

Libguestfs can return errors in many situations: for example if the filesystem isn't writable, or if a file or directory that you requested doesn't exist. If you are using the C API (documented here) you have to check for those error conditions after each call. (Other language bindings turn these errors into exceptions).

File writes are affected by the per-handle umask, set by calling "guestfs_umask" and defaulting to 022. See "UMASK".

Since libguestfs 1.18, it is possible to mount the libguestfs filesystem on a local directory, subject to some restrictions. See "MOUNT LOCAL" below.

PARTITIONING

Libguestfs contains API calls to read, create and modify partition tables on disk images.

In the common case where you want to create a single partition covering the whole disk, you should use the "guestfs_part_disk" call:

const char *parttype = "mbr";
if (disk_is_larger_than_2TB)
  parttype = "gpt";
guestfs_part_disk (g, "/dev/sda", parttype);

Obviously this effectively wipes anything that was on that disk image before.
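The variable disk_is_larger_than_2TB in the fragment above is not defined there. One possible way to derive the partition table type is sketched below, assuming you already have the device size in bytes (for example from "guestfs_blockdev_getsize64") and assuming 512 byte sectors. The function name choose_parttype is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: pick a partition table type from the disk
 * size in bytes.  MBR stores sector numbers in 32 bits, so with
 * 512 byte sectors (an assumption) it cannot describe a disk
 * larger than roughly 2 TiB.
 */
const char *
choose_parttype (int64_t size_bytes)
{
  const int64_t mbr_limit = (int64_t) UINT32_MAX * 512;

  return size_bytes > mbr_limit ? "gpt" : "mbr";
}
```

The returned string can be passed directly as the parttype argument of "guestfs_part_disk".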

LVM2

Libguestfs provides access to a large part of the LVM2 API, such as "guestfs_lvcreate" and "guestfs_vgremove". It won't make much sense unless you familiarize yourself with the concepts of physical volumes, volume groups and logical volumes.

This author strongly recommends reading the LVM HOWTO, online at http://tldp.org/HOWTO/LVM-HOWTO/.

DOWNLOADING

Use "guestfs_cat" to download small, text only files. This call cannot handle files containing any ASCII NUL (\0) characters. However the API is very simple to use.

"guestfs_read_file" can be used to read files which contain arbitrary 8 bit data, since it returns a (pointer, size) pair.

"guestfs_download" can be used to download any file, with no limits on content or size.

To download multiple files, see "guestfs_tar_out" and "guestfs_tgz_out".

UPLOADING

To write a small file with fixed content, use "guestfs_write". To create a file of all zeroes, use "guestfs_truncate_size" (sparse) or "guestfs_fallocate64" (with all disk blocks allocated). There are a variety of other functions for creating test files, for example "guestfs_fill" and "guestfs_fill_pattern".

To upload a single file, use "guestfs_upload". This call has no limits on file content or size.

To upload multiple files, see "guestfs_tar_in" and "guestfs_tgz_in".

However the fastest way to upload large numbers of arbitrary files is to turn them into a squashfs or CD ISO (see mksquashfs(8) and mkisofs(8)), then attach this using "guestfs_add_drive_ro". If you add the drive in a predictable way (eg. adding it last after all other drives) then you can get the device name from "guestfs_list_devices" and mount it directly using "guestfs_mount_ro". Note that squashfs images are sometimes non-portable between kernel versions, and they don't support labels or UUIDs. If you want to pre-build an image or you need to mount it using a label or UUID, use an ISO image instead.

COPYING

There are various different commands for copying between files and devices and in and out of the guest filesystem. These are summarised in the table below.

file to file

Use "guestfs_cp" to copy a single file, or "guestfs_cp_a" to copy directories recursively.

To copy part of a file (offset and size) use "guestfs_copy_file_to_file".

file to device
device to file
device to device

Use "guestfs_copy_file_to_device", "guestfs_copy_device_to_file", or "guestfs_copy_device_to_device".

Example: duplicate the contents of an LV:

guestfs_copy_device_to_device (g,
        "/dev/VG/Original", "/dev/VG/Copy",
        /* -1 marks the end of the list of optional parameters */
        -1);

The destination (/dev/VG/Copy) must be at least as large as the source (/dev/VG/Original). To copy less than the whole source device, use the optional size parameter:

guestfs_copy_device_to_device (g,
        "/dev/VG/Original", "/dev/VG/Copy",
        GUESTFS_COPY_DEVICE_TO_DEVICE_SIZE, 10000,
        -1);

file on the host to file or device

Use "guestfs_upload". See "UPLOADING" above.

file or device to file on the host

Use "guestfs_download". See "DOWNLOADING" above.

UPLOADING AND DOWNLOADING TO PIPES AND FILE DESCRIPTORS

Calls like "guestfs_upload", "guestfs_download", "guestfs_tar_in", "guestfs_tar_out" etc appear to only take filenames as arguments, so it appears you can only upload and download to files. However many Un*x-like hosts let you use the special device files /dev/stdin, /dev/stdout, /dev/stderr and /dev/fd/N to read and write from stdin, stdout, stderr, and arbitrary file descriptor N.

For example, virt-cat(1) writes its output to stdout by doing:

guestfs_download (g, filename, "/dev/stdout");

and you can write tar output to a file descriptor fd by doing:

char devfd[64];
snprintf (devfd, sizeof devfd, "/dev/fd/%d", fd);
guestfs_tar_out (g, "/", devfd);

LISTING FILES

"guestfs_ll" is just designed for humans to read (mainly when using the guestfish(1)-equivalent command ll).

"guestfs_ls" is a quick way to get a list of files in a directory from programs, as a flat list of strings.

"guestfs_readdir" is a programmatic way to get a list of files in a directory, plus additional information about each one. It is more equivalent to using the readdir(3) call on a local filesystem.

"guestfs_find" and "guestfs_find0" can be used to recursively list files.

RUNNING COMMANDS

Although libguestfs is primarily an API for manipulating files inside guest images, we also provide some limited facilities for running commands inside guests.

There are many limitations to this:

  • The kernel version that the command runs under will be different from what it expects.

  • If the command needs to communicate with daemons, then most likely they won't be running.

  • The command will be running in limited memory.

  • The network may not be available unless you enable it (see "guestfs_set_network").

  • Only supports Linux guests (not Windows, BSD, etc).

  • Architecture limitations (eg. won't work for a PPC guest on an X86 host).

  • For SELinux guests, you may need to enable SELinux and load policy first. See "SELINUX" in this manpage.

  • Security: It is not safe to run commands from untrusted, possibly malicious guests. These commands may attempt to exploit your program by sending unexpected output. They could also try to exploit the Linux kernel or qemu provided by the libguestfs appliance. They could use the network provided by the libguestfs appliance to bypass ordinary network partitions and firewalls. They could use the elevated privileges or different SELinux context of your program to their advantage.

    A secure alternative is to use libguestfs to install a "firstboot" script (a script which runs when the guest next boots normally), and to have this script run the commands you want in the normal context of the running guest, network security and so on. For information about other security issues, see guestfs-security(1).

The two main API calls to run commands are "guestfs_command" and "guestfs_sh" (there are also variations).

The difference is that "guestfs_sh" runs commands using the shell, so any shell globs, redirections, etc will work.

CONFIGURATION FILES

To read and write configuration files in Linux guest filesystems, we strongly recommend using Augeas. For example, Augeas understands how to read and write, say, a Linux shadow password file or X.org configuration file, and so avoids you having to write that code.

The main Augeas calls are bound through the guestfs_aug_* APIs. We don't document Augeas itself here because there is excellent documentation on the http://augeas.net/ website.

If you don't want to use Augeas (you fool!) then try calling "guestfs_read_lines" to get the file as a list of lines which you can iterate over.

SYSTEMD JOURNAL FILES

To read the systemd journal from a Linux guest, use the guestfs_journal_* APIs starting with "guestfs_journal_open".

Consult the journal documentation here: sd-journal(3), sd_journal_open(3).

SELINUX

We support SELinux guests. To ensure that labeling happens correctly in SELinux guests, you need to enable SELinux and load the guest's policy:

  1. Before launching, do:

    guestfs_set_selinux (g, 1);

  2. After mounting the guest's filesystem(s), load the policy. This is best done by running the load_policy(8) command in the guest itself:

    guestfs_sh (g, "/usr/sbin/load_policy");

    (Older versions of load_policy require you to specify the name of the policy file).

  3. Optionally, set the security context for the API. The correct security context to use can only be known by inspecting the guest. As an example:

    guestfs_setcon (g, "unconfined_u:unconfined_r:unconfined_t:s0");

This will work for running commands and editing existing files.

When new files are created, you may need to label them explicitly, for example by running the external command restorecon pathname.

UMASK

Certain calls are affected by the current file mode creation mask (the "umask"), in particular those which create files or directories, such as "guestfs_touch", "guestfs_mknod" or "guestfs_mkdir". The umask either determines the default mode that the file is created with or modifies the mode that you supply.

The default umask is 022, so files are created with modes such as 0644 and directories with 0755.
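The arithmetic behind those modes can be sketched as follows. This assumes the usual POSIX convention that a new file is requested with mode 0666 and a new directory with mode 0777 before the mask is applied; the function name effective_mode is hypothetical.

```c
/* Hypothetical helper: the mode that results when 'requested' is
 * filtered through 'mask'.  Bits set in the mask are cleared from
 * the requested mode.
 */
unsigned int
effective_mode (unsigned int requested, unsigned int mask)
{
  return requested & ~mask & 07777;
}
```

Under that assumption, effective_mode (0666, 022) gives 0644 and effective_mode (0777, 022) gives 0755, matching the modes quoted above.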

There are two ways to avoid being affected by umask: either set umask to 0 (call guestfs_umask (g, 0) early after launching), or call "guestfs_chmod" after creating each file or directory.

For more information about umask, see umask(2).

LABELS AND UUIDS

Many filesystems, devices and logical volumes support either labels (short strings like "BOOT" which might not be unique) and/or UUIDs (globally unique IDs).

For filesystems, use "guestfs_vfs_label" or "guestfs_vfs_uuid" to read the label or UUID. Some filesystems let you call "guestfs_set_label" or "guestfs_set_uuid" to change the label or UUID.

You can locate a filesystem by its label or UUID using "guestfs_findfs_label" or "guestfs_findfs_uuid".

For LVM2 (which supports only UUIDs), there is a rich set of APIs for fetching UUIDs, fetching UUIDs of the contained objects, and changing UUIDs. See: "guestfs_lvuuid", "guestfs_vguuid", "guestfs_pvuuid", "guestfs_vglvuuids", "guestfs_vgpvuuids", "guestfs_vgchange_uuid", "guestfs_vgchange_uuid_all", "guestfs_pvchange_uuid", "guestfs_pvchange_uuid_all".

Note when cloning a filesystem, device or whole guest, it is a good idea to set new randomly generated UUIDs on the copy.

ENCRYPTED DISKS

Libguestfs allows you to access Linux guests which have been encrypted using whole disk encryption that conforms to the Linux Unified Key Setup (LUKS) standard. This includes nearly all whole disk encryption systems used by modern Linux guests.

Use "guestfs_vfs_type" to identify LUKS-encrypted block devices (it returns the string crypto_LUKS).

Then open these devices by calling "guestfs_luks_open". Obviously you will require the passphrase!

Opening a LUKS device creates a new device mapper device called /dev/mapper/mapname (where mapname is the string you supply to "guestfs_luks_open"). Reads and writes to this mapper device are decrypted from and encrypted to the underlying block device respectively.

LVM volume groups on the device can be made visible by calling "guestfs_vgscan" followed by "guestfs_vg_activate_all". The logical volume(s) can now be mounted in the usual way.

Use the reverse process to close a LUKS device. Unmount any logical volumes on it, then deactivate the volume groups by calling guestfs_vg_activate (g, 0, ["/dev/VG"]). Finally, close the mapper device by calling "guestfs_luks_close" on the /dev/mapper/mapname device (not the underlying encrypted block device).

MOUNT LOCAL

In libguestfs ≥ 1.18, it is possible to mount the libguestfs filesystem on a local directory and access it using ordinary POSIX calls and programs.

Availability of this is subject to a number of restrictions: it requires FUSE (the Filesystem in USErspace), and libfuse must also have been available when libguestfs was compiled. FUSE may require that a kernel module is loaded, and it may be necessary to add the current user to a special fuse group. See the documentation for your distribution and http://fuse.sf.net for further information.

The call to mount the libguestfs filesystem on a local directory is "guestfs_mount_local" (q.v.) followed by "guestfs_mount_local_run". The latter does not return until you unmount the filesystem. The reason is that the call enters the FUSE main loop and processes kernel requests, turning them into libguestfs calls. An alternative design would have been to create a background thread to do this, but libguestfs doesn't require pthreads. This way is also more flexible: for example the user can create another thread for "guestfs_mount_local_run".

"guestfs_mount_local" needs a certain amount of time to set up the mountpoint. The mountpoint is not ready to use until the call returns. At this point, accesses to the filesystem will block until the main loop is entered (ie. "guestfs_mount_local_run"). So if you need to start another process to access the filesystem, put the fork between "guestfs_mount_local" and "guestfs_mount_local_run".

MOUNT LOCAL COMPATIBILITY

Since local mounting was only added in libguestfs 1.18, and may not be available even in these builds, you should consider writing code so that it doesn't depend on this feature, and can fall back to using libguestfs file system calls.

If libguestfs was compiled without support for "guestfs_mount_local" then calling it will return an error with errno set to ENOTSUP (see "guestfs_last_errno").

MOUNT LOCAL PERFORMANCE

Libguestfs on top of FUSE performs quite poorly. For best performance do not use it. Use ordinary libguestfs filesystem calls, upload, download etc. instead.

HOTPLUGGING

In libguestfs ≥ 1.20, you may add and remove drives after calling "guestfs_launch". This is called hotplugging. There are some restrictions, see below.

Only a subset of the backends support hotplugging (currently only the libvirt backend has support). It also requires that you use libvirt ≥ 0.10.3 and qemu ≥ 1.2.

To hot-add a disk, simply call "guestfs_add_drive_opts" after "guestfs_launch". It is mandatory to specify the label parameter so that the newly added disk has a predictable name. For example:

if (guestfs_launch (g) == -1)
  error ("launch failed");

if (guestfs_add_drive_opts (g, filename,
                            GUESTFS_ADD_DRIVE_OPTS_LABEL, "newdisk",
                            -1) == -1)
  error ("hot-add of disk failed");

if (guestfs_part_disk (g, "/dev/disk/guestfs/newdisk", "mbr") == -1)
  error ("partitioning of hot-added disk failed");

To hot-remove a disk, call "guestfs_remove_drive". You can call this before or after "guestfs_launch". You can only remove disks that were previously added with a label.

Backends that support hotplugging do not require you to add any disks before calling "guestfs_launch".

REMOTE STORAGE

CEPH

Libguestfs can access Ceph (librbd/RBD) disks.

To do this, set the optional protocol and server parameters of "guestfs_add_drive_opts" like this:

char *servers[] = { "ceph1.example.org:3000", /* ... */ NULL };
guestfs_add_drive_opts (g, "pool/image",
                        GUESTFS_ADD_DRIVE_OPTS_FORMAT, "raw",
                        GUESTFS_ADD_DRIVE_OPTS_PROTOCOL, "rbd",
                        GUESTFS_ADD_DRIVE_OPTS_SERVER, servers,
                        GUESTFS_ADD_DRIVE_OPTS_USERNAME, "rbduser",
                        GUESTFS_ADD_DRIVE_OPTS_SECRET, "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==",
                        -1);

servers (the server parameter) is a list of one or more Ceph servers. The server string is documented in "guestfs_add_drive_opts". The username and secret parameters are also optional, and if not given, then no authentication will be used.

FTP, HTTP AND TFTP

Libguestfs can access remote disks over FTP, FTPS, HTTP, HTTPS or TFTP protocols.

To do this, set the optional protocol and server parameters of "guestfs_add_drive_opts" like this:

char *servers[] = { "www.example.org", NULL };
guestfs_add_drive_opts (g, "/disk.img",
                        GUESTFS_ADD_DRIVE_OPTS_FORMAT, "raw",
                        GUESTFS_ADD_DRIVE_OPTS_PROTOCOL, "http",
                        GUESTFS_ADD_DRIVE_OPTS_SERVER, servers,
                        -1);

The protocol can be one of "ftp", "ftps", "http", "https" or "tftp".

servers (the server parameter) is a list which must have a single element. The single element is a string defining the web, FTP or TFTP server. The format of this string is documented in "guestfs_add_drive_opts".

GLUSTER

Libguestfs can access Gluster disks.

To do this, set the optional protocol and server parameters of "guestfs_add_drive_opts" like this:

char *servers[] = { "gluster.example.org:24007", NULL };
guestfs_add_drive_opts (g, "volname/image",
                        GUESTFS_ADD_DRIVE_OPTS_FORMAT, "raw",
                        GUESTFS_ADD_DRIVE_OPTS_PROTOCOL, "gluster",
                        GUESTFS_ADD_DRIVE_OPTS_SERVER, servers,
                        -1);

servers (the server parameter) is a list which must have a single element. The single element is a string defining the Gluster server. The format of this string is documented in "guestfs_add_drive_opts".

Note that gluster usually requires the client process (ie. libguestfs) to run as root and will give unfathomable errors if it is not (eg. "No data available").

ISCSI

Libguestfs can access iSCSI disks remotely.

To do this, set the optional protocol and server parameters like this:

char *server[] = { "iscsi.example.org:3000", NULL };
guestfs_add_drive_opts (g, "target-iqn-name/lun",
                        GUESTFS_ADD_DRIVE_OPTS_FORMAT, "raw",
                        GUESTFS_ADD_DRIVE_OPTS_PROTOCOL, "iscsi",
                        GUESTFS_ADD_DRIVE_OPTS_SERVER, server,
                        -1);

The server parameter is a list which must have a single element. The single element is a string defining the iSCSI server. The format of this string is documented in "guestfs_add_drive_opts".

NETWORK BLOCK DEVICE

Libguestfs can access Network Block Device (NBD) disks remotely.

To do this, set the optional protocol and server parameters of "guestfs_add_drive_opts" like this:

char *server[] = { "nbd.example.org:3000", NULL };
guestfs_add_drive_opts (g, "" /* export name - see below */,
                        GUESTFS_ADD_DRIVE_OPTS_FORMAT, "raw",
                        GUESTFS_ADD_DRIVE_OPTS_PROTOCOL, "nbd",
                        GUESTFS_ADD_DRIVE_OPTS_SERVER, server,
                        -1);

Notes:

  • server is in fact a list of servers. For NBD you must always supply a list with a single element. (Other remote protocols require zero or more than one server, hence the requirement for this parameter to be a list).

  • The server string is documented in "guestfs_add_drive_opts". To connect to a local qemu-nbd instance over a Unix domain socket, use "unix:/path/to/socket".

  • The filename parameter is the NBD export name. Use an empty string to mean the default export. Many NBD servers, including qemu-nbd, do not support export names.

  • If using qemu-nbd as your server, you should always specify the -t option. The reason is that libguestfs may open several connections to the server.

  • The libvirt backend requires that you set the format parameter of "guestfs_add_drive_opts" accurately when you use writable NBD disks.

  • The libvirt backend has a bug that stops Unix domain socket connections from working: https://bugzilla.redhat.com/show_bug.cgi?id=922888

  • The direct backend does not support readonly connections because of a bug in qemu: https://bugs.launchpad.net/qemu/+bug/1155677

SHEEPDOG

Libguestfs can access Sheepdog disks.

To do this, set the optional protocol and server parameters of "guestfs_add_drive_opts" like this:

char *servers[] = { /* optional servers ... */ NULL };
guestfs_add_drive_opts (g, "volume",
                        GUESTFS_ADD_DRIVE_OPTS_FORMAT, "raw",
                        GUESTFS_ADD_DRIVE_OPTS_PROTOCOL, "sheepdog",
                        GUESTFS_ADD_DRIVE_OPTS_SERVER, servers,
                        -1);

The optional list of servers may be zero or more server addresses ("hostname:port"). The format of the server strings is documented in "guestfs_add_drive_opts".

SSH

Libguestfs can access disks over a Secure Shell (SSH) connection.

To do this, set the protocol and server and (optionally) username parameters of "guestfs_add_drive_opts" like this:

char *server[] = { "remote.example.com", NULL };
guestfs_add_drive_opts (g, "/path/to/disk.img",
                        GUESTFS_ADD_DRIVE_OPTS_FORMAT, "raw",
                        GUESTFS_ADD_DRIVE_OPTS_PROTOCOL, "ssh",
                        GUESTFS_ADD_DRIVE_OPTS_SERVER, server,
                        GUESTFS_ADD_DRIVE_OPTS_USERNAME, "remoteuser",
                        -1);

The format of the server string is documented in "guestfs_add_drive_opts".

INSPECTION

Libguestfs has APIs for inspecting an unknown disk image to find out if it contains operating systems, an install CD or a live CD.

Add all disks belonging to the unknown virtual machine and call "guestfs_launch" in the usual way.

Then call "guestfs_inspect_os". This function uses other libguestfs calls and certain heuristics, and returns a list of operating systems that were found. An empty list means none were found. A single element is the root filesystem of the operating system. For dual- or multi-boot guests, multiple roots can be returned, each one corresponding to a separate operating system. (Multi-boot virtual machines are extremely rare in the world of virtualization, but since this scenario can happen, we have built libguestfs to deal with it.)

For each root, you can then call various guestfs_inspect_get_* functions to get additional details about that operating system. For example, call "guestfs_inspect_get_type" to return the string windows or linux for Windows and Linux-based operating systems respectively.

Un*x-like and Linux-based operating systems usually consist of several filesystems which are mounted at boot time (for example, a separate boot partition mounted on /boot). The inspection rules are able to detect how filesystems correspond to mount points. Call guestfs_inspect_get_mountpoints to get this mapping. It might return a hash table like this example:

/boot => /dev/sda1
/     => /dev/vg_guest/lv_root
/usr  => /dev/vg_guest/lv_usr

The caller can then make calls to "guestfs_mount" to mount the filesystems as suggested.

Be careful to mount filesystems in the right order (eg. / before /usr). Sorting the keys of the hash by length, shortest first, should work.
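The sorting suggestion above could be sketched like this. It assumes the hash is held as a flat, NULL-terminated array of strings in which keys (mount points) and values (devices) alternate. The names compare_key_length and sort_mountpoints are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Compare two (key, value) pairs by the length of the key. */
int
compare_key_length (const void *a, const void *b)
{
  const char *const *pa = a;
  const char *const *pb = b;
  size_t la = strlen (pa[0]);
  size_t lb = strlen (pb[0]);

  return (la > lb) - (la < lb);
}

/* Hypothetical helper: sort a flat key/value array in place so
 * that the shortest mount point comes first.  Returns the number
 * of pairs found.
 */
size_t
sort_mountpoints (char **mps)
{
  size_t n = 0;

  while (mps[n] != NULL && mps[n+1] != NULL)
    n += 2;
  /* Each element handed to qsort is one pair of pointers. */
  qsort (mps, n / 2, 2 * sizeof (char *), compare_key_length);
  return n / 2;
}
```

After sorting, you could walk the array two entries at a time and pass each value and key to "guestfs_mount". Note that qsort(3) does not promise any particular order for keys of equal length, which does not matter here because one such key cannot be a parent directory of another.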

Inspection currently only works for some common operating systems. Contributors are welcome to send patches for other operating systems that we currently cannot detect.

Encrypted disks must be opened before inspection. See "ENCRYPTED DISKS" for more details. The "guestfs_inspect_os" function just ignores any encrypted devices.

A note on the implementation: The call "guestfs_inspect_os" performs inspection and caches the results in the guest handle. Subsequent calls to guestfs_inspect_get_* return this cached information, but do not re-read the disks. If you change the content of the guest disks, you can redo inspection by calling "guestfs_inspect_os" again. ("guestfs_inspect_list_applications2" works a little differently from the other calls and does read the disks. See documentation for that function for details).

INSPECTING INSTALL DISKS

Libguestfs (since 1.9.4) can detect some install disks, install CDs, live CDs and more.

Call "guestfs_inspect_get_format" to return the format of the operating system, which currently can be installed (a regular operating system) or installer (some sort of install disk).

Further information is available about the operating system that can be installed using the regular inspection APIs like "guestfs_inspect_get_product_name", "guestfs_inspect_get_major_version" etc.

Some additional information specific to installer disks is also available from the "guestfs_inspect_is_live", "guestfs_inspect_is_netinst" and "guestfs_inspect_is_multipart" calls.

SPECIAL CONSIDERATIONS FOR WINDOWS GUESTS

Libguestfs can mount NTFS partitions. It does this using the http://www.ntfs-3g.org/ driver.

DRIVE LETTERS AND PATHS

DOS and Windows still use drive letters, and the filesystems are always treated as case insensitive by Windows itself, and therefore you might find a Windows configuration file referring to a path like c:\windows\system32. When the filesystem is mounted in libguestfs, that directory might be referred to as /WINDOWS/System32.

Drive letter mappings can be found using inspection (see "INSPECTION" and "guestfs_inspect_get_drive_mappings").

Dealing with separator characters (backslash vs forward slash) is outside the scope of libguestfs, but usually a simple character replacement will work.

To resolve the case insensitivity of paths, call "guestfs_case_sensitive_path".

LONG FILENAMES ON NTFS

NTFS supports filenames up to 255 characters long. "Character" means a 2 byte UTF-16 code unit, which can encode the most common Unicode codepoints.

Most Linux filesystems support filenames up to 255 bytes. This means you may get an error:

File name too long

when you copy a file from NTFS to a Linux filesystem if the name, when reencoded as UTF-8, would exceed 255 bytes in length.

This will most often happen when using non-ASCII names that are longer than ~127 characters (eg. Greek, Cyrillic) or longer than ~85 characters (Asian languages).

A workaround is not to try to store such long filenames on Linux native filesystems. Since the tar(1) format can store unlimited length filenames, keep the files in a tarball.

ACCESSING THE WINDOWS REGISTRY

Libguestfs also provides some help for decoding Windows Registry "hive" files, through a separate C library called hivex(3).

Before libguestfs 1.19.35 you had to download the hive file, operate on it locally using hivex, and upload it again. Since this version, we have included the major hivex APIs directly in the libguestfs API (see "guestfs_hivex_open"). This means that if you have opened a Windows guest, you can read and write the registry directly.

See also virt-win-reg(1).

SYMLINKS ON NTFS-3G FILESYSTEMS

Ntfs-3g tries to rewrite "Junction Points" and NTFS "symbolic links" to provide something which looks like a Linux symlink. The way it tries to do the rewriting is described here:

http://www.tuxera.com/community/ntfs-3g-advanced/junction-points-and-symbolic-links/

The essential problem is that ntfs-3g simply does not have enough information to do a correct job. NTFS links can contain drive letters and references to external device GUIDs that ntfs-3g has no way of resolving. It is almost certainly the case that libguestfs callers should ignore what ntfs-3g does (ie. don't use "guestfs_readlink" on NTFS volumes).

Instead if you encounter a symbolic link on an ntfs-3g filesystem, use "guestfs_lgetxattr" to read the system.ntfs_reparse_data extended attribute, and read the raw reparse data from that (you can find the format documented in various places around the web).

EXTENDED ATTRIBUTES ON NTFS-3G FILESYSTEMS

There are other useful extended attributes that can be read from ntfs-3g filesystems (using "guestfs_getxattr"). See:

http://www.tuxera.com/community/ntfs-3g-advanced/extended-attributes/

WINDOWS HIBERNATION AND WINDOWS 8 FAST STARTUP

Windows guests which have been hibernated (instead of fully shut down) cannot be mounted. This is a limitation of ntfs-3g. You will see an error like this:

The disk contains an unclean file system (0, 0).
Metadata kept in Windows cache, refused to mount.
Failed to mount '/dev/sda2': Operation not permitted
The NTFS partition is in an unsafe state. Please resume
and shutdown Windows fully (no hibernation or fast
restarting), or mount the volume read-only with the
'ro' mount option.

In Windows 8, the shutdown button does not shut down the guest at all. Instead it usually hibernates the guest. This is known as "fast startup".

Some suggested workarounds are:

  • Mount read-only (eg. "guestfs_mount_ro").

  • On Windows 8, turn off fast startup. It is in the Control Panel → Power Options → Choose what the power buttons do → Change settings that are currently unavailable → Turn on fast startup (clear this checkbox).

  • On Windows 7 and earlier, shut the guest off properly instead of hibernating it.

RESIZE2FS ERRORS

The "guestfs_resize2fs", "guestfs_resize2fs_size" and "guestfs_resize2fs_M" calls are used to resize ext2/3/4 filesystems.

The underlying program (resize2fs(8)) requires that the filesystem is clean and recently fsck'd before you can resize it. Also, if the resize operation fails for some reason, then you have to run fsck on the filesystem again to fix it.

In libguestfs < 1.17.14, you usually had to call "guestfs_e2fsck_f" before the resize. However, in libguestfs ≥ 1.17.14, e2fsck(8) is called automatically before the resize, so you no longer need to do this.

The resize2fs(8) program can still fail, in which case it prints an error message similar to:

Please run 'e2fsck -fy <device>' to fix the filesystem
after the aborted resize operation.

You can do this by calling "guestfs_e2fsck" with the forceall option. However in the context of disk images, it is usually better to avoid this situation, eg. by rolling back to an earlier snapshot, or by copying and resizing and on failure going back to the original.

USING LIBGUESTFS WITH OTHER PROGRAMMING LANGUAGES

Although we don't want to discourage you from using the C API, we will mention here that the same API is also available in other languages.

The API is broadly identical in all supported languages. This means that the C call guestfs_add_drive_ro(g,file) is $g->add_drive_ro($file) in Perl, g.add_drive_ro(file) in Python, and g#add_drive_ro file in OCaml. In other words, a straightforward, predictable isomorphism between each language.

Error messages are automatically transformed into exceptions if the language supports it.

We don't try to "object orientify" parts of the API in OO languages, although contributors are welcome to write higher level APIs above what we provide in their favourite languages if they wish.

C++

You can use the guestfs.h header file from C++ programs. The C++ API is identical to the C API. C++ classes and exceptions are not used.

C#

The C# bindings are highly experimental. Please read the warnings at the top of csharp/Libguestfs.cs.

Erlang

See guestfs-erlang(3).

GObject

Experimental GObject bindings (with GObject Introspection support) are available. See the gobject directory in the source.

Go

See guestfs-golang(3).

Haskell

This language binding is working but incomplete:

  • Functions with optional arguments are not bound. Implementing optional arguments in Haskell seems to be very complex.

  • Events are not bound.

  • Functions with the following return types are not bound:

    • Any function returning a struct.

    • Any function returning a list of structs.

    • A few functions that return fixed length buffers (specifically ones declared RBufferOut in the generator).

    • A tiny number of obscure functions that return constant strings (specifically ones declared RConstOptString in the generator).

Java

Full documentation is contained in the Javadoc which is distributed with libguestfs. For examples, see guestfs-java(3).

Lua

See guestfs-lua(3).

OCaml

See guestfs-ocaml(3).

Perl

See guestfs-perl(3) and Sys::Guestfs(3).

PHP

For documentation see README-PHP supplied with libguestfs sources or in the php-libguestfs package for your distribution.

The PHP binding only works correctly on 64 bit machines.

Python

See guestfs-python(3).

Ruby

See guestfs-ruby(3).

For JRuby, use the Java bindings.

shell scripts

See guestfish(1).

LIBGUESTFS GOTCHAS

http://en.wikipedia.org/wiki/Gotcha_(programming): "A feature of a system [...] that works in the way it is documented but is counterintuitive and almost invites mistakes."

Since we developed libguestfs and the associated tools, there are several things we would have designed differently, but are now stuck with for backwards compatibility or other reasons. If there is ever a libguestfs 2.0 release, you can expect these to change. Beware of them.

Read-only should be the default.

In guestfish(1), --ro should be the default, and you should have to specify --rw if you want to make changes to the image.

This would reduce the potential to corrupt live VM images.

Note that many filesystems change the disk when you just mount and unmount, even if you didn't perform any writes. You need to use "guestfs_add_drive_ro" to guarantee that the disk is not changed.

guestfish command line is hard to use.

guestfish disk.img doesn't do what people expect (open disk.img for examination). It tries to run a guestfish command disk.img which doesn't exist, so it fails. In earlier versions of guestfish the error message was also unintuitive, but we have corrected this since. Like the Bourne shell, we should have used guestfish -c command to run commands.

guestfish megabyte modifiers don't work right on all commands

In recent guestfish you can use 1M to mean 1 megabyte (and similarly for other modifiers). What guestfish actually does is to multiply the number part by the modifier part and pass the result to the C API. However this doesn't work for a few APIs which aren't expecting bytes, but are already expecting some other unit (eg. megabytes).

The most common is "guestfs_lvcreate". The guestfish command:

lvcreate LV VG 100M

does not do what you might expect. Instead because "guestfs_lvcreate" is already expecting megabytes, this tries to create a 100 terabyte (100 megabytes * megabytes) logical volume. The error message you get from this is also a little obscure.

This could be fixed in the generator by specially marking parameters and return values which take bytes or other units.

Ambiguity between devices and paths

There is a subtle ambiguity in the API between a device name (eg. /dev/sdb2) and a similar pathname. A file might just happen to be called sdb2 in the directory /dev (consider some non-Unix VM image).

In the current API we usually resolve this ambiguity by having two separate calls, for example "guestfs_checksum" and "guestfs_checksum_device". Some API calls are ambiguous and (incorrectly) resolve the problem by detecting if the path supplied begins with /dev/.

To avoid both the ambiguity and the need to duplicate some calls, we could make paths/devices into structured names. One way to do this would be to use a notation like grub (hd(0,0)), although nobody really likes this aspect of grub. Another way would be to use a structured type, equivalent to this OCaml type:

type path = Path of string | Device of int | Partition of int * int

which would allow you to pass arguments like:

Path "/foo/bar"
Device 1            (* /dev/sdb, or perhaps /dev/sda *)
Partition (1, 2)    (* /dev/sdb2 (or is it /dev/sda2 or /dev/sdb3?) *)
Path "/dev/sdb2"    (* not a device *)

As you can see there are still problems to resolve even with this representation. Also consider how it might work in guestfish.

KEYS AND PASSPHRASES

Certain libguestfs calls take a parameter that contains sensitive key material, passed in as a C string.

In the future we would hope to change the libguestfs implementation so that keys are mlock(2)-ed into physical RAM, and thus can never end up in swap. However this is not done at the moment, because of the complexity of such an implementation.

Therefore you should be aware that any key parameter you pass to libguestfs might end up being written out to the swap partition. If this is a concern, scrub the swap partition or don't use libguestfs on encrypted devices.

MULTIPLE HANDLES AND MULTIPLE THREADS

All high-level libguestfs actions are synchronous. If you want to use libguestfs asynchronously then you must create a thread.

Only use the handle from a single thread. Either use the handle exclusively from one thread, or provide your own mutex so that two threads cannot issue calls on the same handle at the same time. Even apparently innocent functions like "guestfs_get_trace" are not safe to be called from multiple threads without a mutex.

See the graphical program guestfs-browser for one possible architecture for multithreaded programs using libvirt and libguestfs.

Use "guestfs_set_identifier" to make it simpler to identify threads in trace output.

PATH

Libguestfs needs a supermin appliance, which it finds by looking along an internal path.

By default it looks for the appliance in the directory $libdir/guestfs (eg. /usr/local/lib/guestfs or /usr/lib64/guestfs).

Use "guestfs_set_path" or set the environment variable "LIBGUESTFS_PATH" to change the directories that libguestfs will search in. The value is a colon-separated list of paths. The current directory is not searched unless the path contains an empty element or a single dot (.). For example LIBGUESTFS_PATH=:/usr/lib/guestfs would search the current directory and then /usr/lib/guestfs.

QEMU WRAPPERS

If you want to compile your own qemu, run qemu from a non-standard location, or pass extra arguments to qemu, then you can write a shell-script wrapper around qemu.

There is one important rule to remember: you must exec qemu as the last command in the shell script (so that qemu replaces the shell and becomes the direct child of the libguestfs-using program). If you don't do this, then the qemu process won't be cleaned up correctly.

Here is an example of a wrapper, where I have built my own copy of qemu from source:

#!/bin/sh -
qemudir=/home/rjones/d/qemu
exec $qemudir/x86_64-softmmu/qemu-system-x86_64 -L $qemudir/pc-bios "$@"

Save this script as /tmp/qemu.wrapper (or wherever), chmod +x, and then use it by setting the LIBGUESTFS_HV environment variable. For example:

LIBGUESTFS_HV=/tmp/qemu.wrapper guestfish

Note that libguestfs also calls qemu with the -help and -version options in order to determine features.

Wrappers can also be used to edit the options passed to qemu. In the following example, the -machine ... option (-machine and the following argument) is removed from the command line and replaced with -machine pc,accel=tcg. The while loop iterates over the options, skipping the one to remove and putting the remaining options into the args array.

#!/bin/bash -

i=0
while [ $# -gt 0 ]; do
    case "$1" in
    -machine)
        shift 2;;
    *)
        args[i]="$1"
        (( i++ ))
        shift ;;
    esac
done

exec qemu-kvm -machine pc,accel=tcg "${args[@]}"

BACKEND

The backend (previously known as the "attach method") controls how libguestfs creates and/or connects to the backend daemon, eg. by starting qemu directly, or using libvirt to manage an appliance, running User-Mode Linux, or connecting to an already running daemon.

You can set the backend by calling "guestfs_set_backend", or by setting the environment variable LIBGUESTFS_BACKEND.

Possible backends are described below:

direct
appliance

Run qemu directly to launch an appliance.

direct and appliance are synonyms.

This is the ordinary method and normally the default, but see the note below.

libvirt
libvirt:null
libvirt:URI

Use libvirt to launch and manage the appliance.

libvirt causes libguestfs to choose a suitable URI for creating session guests. If using the libvirt backend, you almost always should use this.

libvirt:null causes libguestfs to use the NULL connection URI, which causes libvirt to try to guess what the user meant. You probably don't want to use this.

libvirt:URI uses URI as the libvirt connection URI (see http://libvirt.org/uri.html). The typical libvirt backend with a URI would be libvirt:qemu:///session

The libvirt backend supports more features, including hotplugging (see "HOTPLUGGING") and sVirt.

uml

Run the User-Mode Linux kernel. The location of the kernel is set using $LIBGUESTFS_HV or using the "guestfs_set_qemu" API (note that qemu is not involved, we just reuse the same variable in the handle for convenience).

User-Mode Linux can be much faster, simpler and more lightweight than using a full-blown virtual machine, but it also has some shortcomings. See "USER-MODE LINUX BACKEND" below.

unix:path

Connect to the Unix domain socket path.

This method lets you connect to an existing daemon or (using virtio-serial) to a live guest. For more information, see "ATTACHING TO RUNNING DAEMONS".

direct is usually the default backend. However since libguestfs ≥ 1.19.24, libguestfs can be built with a different default by doing:

./configure --with-default-backend=...

To find out if libguestfs was compiled with a different default backend, do:

unset LIBGUESTFS_BACKEND
guestfish get-backend

BACKEND SETTINGS

Each backend can be configured by passing a list of strings. You can either call "guestfs_set_backend_settings" with a list of strings, or set the LIBGUESTFS_BACKEND_SETTINGS environment variable to a colon-separated list of strings (before creating the handle).

force_tcg

Using:

export LIBGUESTFS_BACKEND_SETTINGS=force_tcg

will force the direct and libvirt backends to use TCG (software emulation) instead of KVM (hardware accelerated virtualization).

gdb

The direct backend supports:

export LIBGUESTFS_BACKEND_SETTINGS=gdb

When this is set, qemu will not start running the appliance immediately. It will wait for you to connect to it using gdb:

$ gdb
(gdb) symbol-file /path/to/vmlinux
(gdb) target remote tcp::1234
(gdb) cont

You can then debug the appliance kernel, which is useful to debug boot failures (especially ones where there are no debug messages printed - tip: look in the kernel log_buf).

On Fedora, install kernel-debuginfo for the vmlinux file (containing symbols). Make sure the symbols precisely match the kernel being used.

network_bridge

The libvirt backend supports:

export LIBGUESTFS_BACKEND_SETTINGS=network_bridge=virbrX

This allows you to override the bridge that is connected to when the network is enabled. The default is virbr0. See also "guestfs_set_network".

ATTACHING TO RUNNING DAEMONS

Note (1): This is highly experimental and has a tendency to eat babies. Use with caution.

Note (2): This section explains how to attach to a running daemon from a low level perspective. For most users, simply using virt tools such as guestfish(1) with the --live option will "just work".

Using guestfs_set_backend

By calling "guestfs_set_backend" you can change how the library connects to the guestfsd daemon in "guestfs_launch" (read "ARCHITECTURE" in guestfs-internals(1) for some background).

The normal backend is direct, where a small appliance is created containing the daemon, and then the library connects to this. libvirt or libvirt:URI are alternatives that use libvirt to start the appliance.

Setting the backend to unix:path (where path is the path of a Unix domain socket) causes "guestfs_launch" to connect to an existing daemon over the Unix domain socket.

The normal use for this is to connect to a running virtual machine that contains a guestfsd daemon, and send commands so you can read and write files inside the live virtual machine.

Using guestfs_add_domain with live flag

"guestfs_add_domain" provides some help for getting the correct backend. If you pass the live option to this function, then (if the virtual machine is running) it will examine the libvirt XML looking for a virtio-serial channel to connect to:

<domain>
  ...
  <devices>
    ...
    <channel type='unix'>
      <source mode='bind' path='/path/to/socket'/>
      <target type='virtio' name='org.libguestfs.channel.0'/>
    </channel>
    ...
  </devices>
</domain>

"guestfs_add_domain" extracts /path/to/socket and sets the backend to unix:/path/to/socket.

Some of the libguestfs tools (including guestfish) support a --live option which is passed through to "guestfs_add_domain" thus allowing you to attach to and modify live virtual machines.

The virtual machine needs to have been set up beforehand so that it has the virtio-serial channel and so that guestfsd is running inside it.

USER-MODE LINUX BACKEND

Setting the following environment variables (or the equivalent in the API) selects the User-Mode Linux backend:

export LIBGUESTFS_BACKEND=uml
export LIBGUESTFS_HV=/path/to/vmlinux

vmlinux (or it may be called linux) is the Linux binary, compiled to run as a userspace process. Note that we reuse the qemu variable in the handle for convenience; qemu is not involved.

User-Mode Linux can be faster and more lightweight than running a full-blown virtual machine as the backend (especially if you are already running libguestfs in a virtual machine or cloud instance), but it also has some shortcomings compared to the usual qemu/KVM-based backend.

BUILDING USER-MODE LINUX FROM SOURCE

Your Linux distro may provide UML in which case you can ignore this section.

These instructions are adapted from: http://user-mode-linux.sourceforge.net/source.html

1. Check out Linux sources

Clone the Linux git repository or download the Linux source tarball.

2. Configure the kernel

Note: All 'make' commands must have ARCH=um added.

make menuconfig ARCH=um

Make sure any filesystem drivers that you need are compiled into the kernel.

Currently, it needs a large amount of extra work to get modules working. It's recommended that you disable module support in the kernel configuration, which will cause everything to be compiled into the image.

3. Build the kernel

make ARCH=um

This will leave a file called linux or vmlinux in the top-level directory. This is the UML kernel. You should set LIBGUESTFS_HV to point to this file.

USER-MODE LINUX DIFFERENCES FROM KVM

UML only supports raw-format images

Only plain raw-format images will work. No qcow2, no backing files.

UML does not support any remote drives

No NBD, etc.

UML only works on ix86 and x86-64

UML is experimental

In particular, support for UML in libguestfs depends on support for UML in the upstream kernel. If UML was ever removed from the upstream Linux kernel, then we might remove it from libguestfs too.

ABI GUARANTEE

We guarantee the libguestfs ABI (binary interface), for public, high-level actions as outlined in this section. Although we will deprecate some actions, for example if they get replaced by newer calls, we will keep the old actions forever. This allows you the developer to program in confidence against the libguestfs API.

BLOCK DEVICE NAMING

In the kernel there is now quite a profusion of schemata for naming block devices (in this context, by block device I mean a physical or virtual hard drive). The original Linux IDE driver used names starting with /dev/hd*. SCSI devices have historically used a different naming scheme, /dev/sd*. When the Linux kernel libata driver became a popular replacement for the old IDE driver (particularly for SATA devices) those devices also used the /dev/sd* scheme. Additionally we now have virtual machines with paravirtualized drivers. This has created several different naming systems, such as /dev/vd* for virtio disks and /dev/xvd* for Xen PV disks.

As discussed above, libguestfs uses a qemu appliance running an embedded Linux kernel to access block devices. We can run a variety of appliances based on a variety of Linux kernels.

This causes a problem for libguestfs because many API calls use device or partition names. Working scripts and the recipe (example) scripts that we make available over the internet could fail if the naming scheme changes.

Therefore libguestfs defines /dev/sd* as the standard naming scheme. Internally /dev/sd* names are translated, if necessary, to other names as required. For example, under RHEL 5 which uses the /dev/hd* scheme, any device parameter /dev/sda2 is translated to /dev/hda2 transparently.

Note that this only applies to parameters. The "guestfs_list_devices", "guestfs_list_partitions" and similar calls return the true names of the devices and partitions as known to the appliance, but see "guestfs_canonical_device_name".

DISK LABELS

In libguestfs ≥ 1.20, you can give a label to a disk when you add it, using the optional label parameter to "guestfs_add_drive_opts". (Note that disk labels are different from and not related to filesystem labels).

Not all versions of libguestfs support setting a disk label, and when it is supported, it is limited to 20 ASCII characters [a-zA-Z].

When you add a disk with a label, it can either be addressed using /dev/sd*, or using /dev/disk/guestfs/label. Partitions on the disk can be addressed using /dev/disk/guestfs/labelpartnum.

Listing devices ("guestfs_list_devices") and partitions ("guestfs_list_partitions") returns the raw block device name. However you can use "guestfs_list_disk_labels" to map disk labels to raw block device and partition names.

ALGORITHM FOR BLOCK DEVICE NAME TRANSLATION

Usually this translation is transparent. However in some (very rare) cases you may need to know the exact algorithm. Such cases include where you use "guestfs_config" to add a mixture of virtio and IDE devices to the qemu-based appliance, so have a mixture of /dev/sd* and /dev/vd* devices.

The algorithm is applied only to parameters which are known to be either device or partition names. Return values from functions such as "guestfs_list_devices" are never changed.

  • Is the string a parameter which is a device or partition name?

  • Does the string begin with /dev/sd?

  • Does the named device exist? If so, we use that device. However if not then we continue with this algorithm.

  • Replace initial /dev/sd string with /dev/hd.

    For example, change /dev/sda2 to /dev/hda2.

    If that named device exists, use it. If not, continue.

  • Replace initial /dev/sd string with /dev/vd.

    If that named device exists, use it. If not, return an error.

PORTABILITY CONCERNS WITH BLOCK DEVICE NAMING

Although the standard naming scheme and automatic translation is useful for simple programs and guestfish scripts, for larger programs it is best not to rely on this mechanism.

Where possible, for maximum future portability, programs using libguestfs should use these future-proof techniques:

  • Use "guestfs_list_devices" or "guestfs_list_partitions" to list actual device names, and then use those names directly.

    Since those device names exist by definition, they will never be translated.

  • Use higher level ways to identify filesystems, such as LVM names, UUIDs and filesystem labels.

NULL DISKS

When adding a disk using, eg., "guestfs_add_drive", you can set the filename to "/dev/null". This string is treated specially by libguestfs, causing it to add a "null disk".

A null disk has the following properties:

  • A null disk will appear as a normal device, eg. in calls to "guestfs_list_devices".

  • You may add "/dev/null" multiple times.

  • You should not try to access a null disk in any way. For example, you shouldn't try to read it or mount it.

Null disks are used for three main purposes:

  1. Performance testing of libguestfs (see guestfs-performance(1)).

  2. The internal test suite.

  3. If you want to use libguestfs APIs that don't refer to disks, since libguestfs requires that at least one disk is added, you should add a null disk.

    For example, to test if a feature is available, use code like this:

    guestfs_h *g;
    char *groups[] = { "btrfs", NULL };
    
    g = guestfs_create ();
    guestfs_add_drive (g, "/dev/null");
    guestfs_launch (g);
    if (guestfs_available (g, groups) == 0) {
      // group(s) are available
    } else {
      // group(s) are not available
    }
    guestfs_close (g);

DISK IMAGE FORMATS

Virtual disks come in a variety of formats. Some common formats are listed below.

Note that libguestfs itself is not responsible for handling the disk format: this is done using qemu(1). If support for a particular format is missing or broken, this has to be fixed in qemu.

COMMON VIRTUAL DISK IMAGE FORMATS

raw

Raw format is simply a dump of the sequential bytes of the virtual hard disk. There is no header, container, compression or processing of any sort.

Since raw format requires no translation to read or write, it is both fast and very well supported by qemu and all other hypervisors. You can consider it to be a universal format that any hypervisor can access.

Raw format files are not compressed and so take up the full space of the original disk image even when they are empty. A variation (on Linux/Unix at least) is to not store ranges of all-zero bytes by storing the file as a sparse file. This "variant format" is sometimes called raw sparse. Many utilities, including virt-sparsify(1), can make raw disk images sparse.

qcow2

Qcow2 is the native disk image format used by qemu. Internally it uses a two-level directory structure so that only blocks containing data are stored in the file. It also has many other features such as compression, snapshots and backing files.

There are at least two distinct variants of this format, although qemu (and hence libguestfs) handles both transparently to the user.

vmdk

VMDK is VMware's native disk image format. There are many variations. Modern qemu (hence libguestfs) supports most variations, but you should be aware that older versions of qemu had some very bad data-corrupting bugs in this area.

Note that VMware ESX exposes files with the name guest-flat.vmdk. These are not VMDK. They are raw format files which happen to have a .vmdk extension.

vdi

VDI is VirtualBox's native disk image format. Qemu (hence libguestfs) has generally good support for this.

vpc
vhd

VPC (old) and VHD (modern) are the native disk image format of Microsoft (and previously, Connectix) Virtual PC and Hyper-V.

Obsolete formats

The following formats are obsolete and should not be used: qcow (aka qcow1), cow, bochs.

DETECTING THE FORMAT OF A DISK IMAGE

Firstly note there is a security issue with auto-detecting the format of a disk image. It may or may not apply in your use case. Read "CVE-2010-3851" below.

Libguestfs offers an API to get the format of a disk image ("guestfs_disk_format"), and it is safest to use this.

Don't be tempted to try parsing the text / human-readable output of qemu-img since it cannot be parsed reliably and securely. Also do not use the file command since the output of that changes over time.

CONNECTION MANAGEMENT

guestfs_h *

guestfs_h is the opaque type representing a connection handle. Create a handle by calling "guestfs_create" or "guestfs_create_flags". Call "guestfs_close" to free the handle and release all resources used.

For information on using multiple handles and threads, see the section "MULTIPLE HANDLES AND MULTIPLE THREADS" above.

guestfs_create

guestfs_h *guestfs_create (void);

Create a connection handle.

On success this returns a non-NULL pointer to a handle. On error it returns NULL.

You have to "configure" the handle after creating it. This includes calling "guestfs_add_drive_opts" (or one of the equivalent calls) on the handle at least once.

After configuring the handle, you have to call "guestfs_launch".

You may also want to configure error handling for the handle. See the "ERROR HANDLING" section below.

guestfs_create_flags

guestfs_h *guestfs_create_flags (unsigned flags [, ...]);

Create a connection handle, supplying extra flags and extra arguments to control how the handle is created.

On success this returns a non-NULL pointer to a handle. On error it returns NULL.

"guestfs_create" is equivalent to calling guestfs_create_flags(0).

The following flags may be logically ORed together. (Currently no extra arguments are used).

GUESTFS_CREATE_NO_ENVIRONMENT

Don't parse any environment variables (such as LIBGUESTFS_DEBUG etc).

You can call "guestfs_parse_environment" or "guestfs_parse_environment_list" afterwards to parse environment variables. Alternatively, don't call these functions if you want the handle to be unaffected by environment variables. See the example below.

The default (if this flag is not given) is to implicitly call "guestfs_parse_environment".

GUESTFS_CREATE_NO_CLOSE_ON_EXIT

Don't try to close the handle in an atexit(3) handler if the program exits without explicitly closing the handle.

The default (if this flag is not given) is to install such an atexit handler.

USING GUESTFS_CREATE_NO_ENVIRONMENT

You might use GUESTFS_CREATE_NO_ENVIRONMENT and an explicit call to "guestfs_parse_environment" like this:

guestfs_h *g;
int r;

g = guestfs_create_flags (GUESTFS_CREATE_NO_ENVIRONMENT);
if (!g) {
  perror ("guestfs_create_flags");
  exit (EXIT_FAILURE);
}
r = guestfs_parse_environment (g);
if (r == -1)
  exit (EXIT_FAILURE);

Or to create a handle which is unaffected by environment variables, omit the call to guestfs_parse_environment from the above code.

The above code has another advantage which is that any errors from parsing the environment are passed through the error handler, whereas guestfs_create prints errors on stderr and ignores them.

guestfs_close

void guestfs_close (guestfs_h *g);

This closes the connection handle and frees up all resources used. If a close callback was set on the handle, then it is called.

The correct way to close the handle is:

if (guestfs_shutdown (g) == -1) {
  /* handle write errors here */
}
guestfs_close (g);

"guestfs_shutdown" is only needed if all of the following are true:

  1. one or more disks were added in read-write mode, and

  2. guestfs_launch was called, and

  3. you made some changes, and

  4. you have a way to handle write errors (eg. by exiting with an error code or reporting something to the user).

ERROR HANDLING

API functions can return errors. For example, almost all functions that return int will return -1 to indicate an error.

Additional information is available for errors: an error message string and optionally an error number (errno) if the thing that failed was a system call.

You can get at the additional information about the last error on the handle by calling "guestfs_last_error", "guestfs_last_errno", and/or by setting up an error handler with "guestfs_set_error_handler".

When the handle is created, a default error handler is installed which prints the error message string to stderr. For small short-running command line programs it is sufficient to do:

if (guestfs_launch (g) == -1)
  exit (EXIT_FAILURE);

since the default error handler will ensure that an error message has been printed to stderr before the program exits.

For other programs the caller will almost certainly want to install an alternate error handler or do error handling in-line as in the example below. The non-C language bindings all install NULL error handlers and turn errors into exceptions using code similar to this:

const char *msg;
int errnum;

/* This disables the default behaviour of printing errors
   on stderr. */
guestfs_set_error_handler (g, NULL, NULL);

if (guestfs_launch (g) == -1) {
  /* Examine the error message and print it, throw it,
     etc. */
  msg = guestfs_last_error (g);
  errnum = guestfs_last_errno (g);

  fprintf (stderr, "%s", msg);
  if (errnum != 0)
    fprintf (stderr, ": %s", strerror (errnum));
  fprintf (stderr, "\n");

  /* ... */
}

"guestfs_create" returns NULL if the handle cannot be created, and because there is no handle if this happens there is no way to get additional error information. Since libguestfs ≥ 1.20, you can use "guestfs_create_flags" to properly deal with errors during handle creation, although the vast majority of programs can continue to use "guestfs_create" and not worry about this situation.

Out of memory errors are handled differently. The default action is to call abort(3). If this is undesirable, then you can set a handler using "guestfs_set_out_of_memory_handler".

guestfs_last_error

const char *guestfs_last_error (guestfs_h *g);

This returns the last error message that happened on g. If there has not been an error since the handle was created, then this returns NULL.

Note the returned string does not have a newline character at the end. Most error messages are single lines. Some are split over multiple lines and contain \n characters within the string but not at the end.

The lifetime of the returned string is until the next error occurs on the same handle, or "guestfs_close" is called. If you need to keep it longer, copy it.

guestfs_last_errno

int guestfs_last_errno (guestfs_h *g);

This returns the last error number (errno) that happened on g.

If successful, an errno integer not equal to zero is returned.

In many cases the special errno ENOTSUP is returned if you tried to call a function or use a feature which is not supported.

If no error number is available, this returns 0. This call can return 0 in three situations:

  1. There has not been any error on the handle.

  2. There has been an error but the errno was meaningless. This corresponds to the case where the error did not come from a failed system call, but for some other reason.

  3. There was an error from a failed system call, but for some reason the errno was not captured and returned. This usually indicates a bug in libguestfs.

Libguestfs tries to convert the errno from inside the appliance into a corresponding errno for the caller (not entirely trivial: the appliance might be running a completely different operating system from the library and error numbers are not standardized across Un*xen). If this could not be done, then the error is translated to EINVAL. In practice this should only happen in very rare circumstances.
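Because "guestfs_last_errno" can return 0 even when an error has occurred, a zero value should not be read as "no error". The following is a minimal sketch of one way to combine the message and the error number into a single string. The helper name format_guestfs_error and the wording of its output are illustrative only and are not part of libguestfs; msg and errnum stand for the values returned by "guestfs_last_error" and "guestfs_last_errno".

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a libguestfs function): write a
   printable description of an error into out. */
static void
format_guestfs_error (char *out, size_t out_size,
                      const char *msg, int errnum)
{
  if (msg == NULL)
    /* guestfs_last_error returns NULL if no error has happened. */
    snprintf (out, out_size, "no error recorded");
  else if (errnum == ENOTSUP)
    snprintf (out, out_size,
              "%s: not supported by this libguestfs or appliance", msg);
  else if (errnum != 0)
    snprintf (out, out_size, "%s: %s", msg, strerror (errnum));
  else
    /* errno 0: the failure did not come from a system call,
       or the errno was not captured. */
    snprintf (out, out_size, "%s", msg);
}
```

After a call has failed, you might use it as format_guestfs_error (buf, sizeof buf, guestfs_last_error (g), guestfs_last_errno (g)).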

guestfs_set_error_handler

typedef void (*guestfs_error_handler_cb) (guestfs_h *g,
                                          void *opaque,
                                          const char *msg);
void guestfs_set_error_handler (guestfs_h *g,
                                guestfs_error_handler_cb cb,
                                void *opaque);

The callback cb will be called if there is an error. The parameters passed to the callback are the handle, an opaque data pointer and the error message string.

errno is not passed to the callback. To get that the callback must call "guestfs_last_errno".

Note that the message string msg is freed as soon as the callback function returns, so if you want to stash it somewhere you must make your own copy.

The default handler prints messages on stderr.

If you set cb to NULL then no handler is called.
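Since msg is freed when the callback returns, a handler that wants to keep the message has to copy it. The following sketch shows one way to do that through the opaque pointer. The names struct saved_error and save_error_cb are hypothetical, and the typedef at the top is only a stand-in so that the fragment compiles without the library; a real program would include <guestfs.h> instead.

```c
#include <stdio.h>

/* Stand-in declaration so this sketch compiles on its own.
   A real program gets this type from <guestfs.h>. */
typedef struct guestfs_h guestfs_h;

/* Hypothetical structure owned by the caller. */
struct saved_error {
  char msg[256];   /* private copy of the most recent message */
  int count;       /* number of errors seen so far */
};

/* Matches the guestfs_error_handler_cb signature. */
static void
save_error_cb (guestfs_h *g, void *opaque, const char *msg)
{
  struct saved_error *saved = opaque;

  (void) g;  /* unused here */

  /* Copy the message (truncating if necessary) because msg is
     not valid after this function returns. */
  snprintf (saved->msg, sizeof saved->msg, "%s", msg);
  saved->count++;
}
```

You would register it with guestfs_set_error_handler (g, save_error_cb, &saved), where saved is a struct saved_error that lives at least as long as the handler is installed.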

guestfs_get_error_handler

guestfs_error_handler_cb guestfs_get_error_handler (guestfs_h *g,
                                                    void **opaque_rtn);

Returns the current error handler callback.

guestfs_push_error_handler

void guestfs_push_error_handler (guestfs_h *g,
                                 guestfs_error_handler_cb cb,
                                 void *opaque);

This is the same as "guestfs_set_error_handler", except that the old error handler is stashed away in a stack inside the handle. You can restore the previous error handler by calling "guestfs_pop_error_handler".

Use the following code to temporarily disable errors around a function:

guestfs_push_error_handler (g, NULL, NULL);
guestfs_mkdir (g, "/foo"); /* We don't care if this fails. */
guestfs_pop_error_handler (g);

guestfs_pop_error_handler

void guestfs_pop_error_handler (guestfs_h *g);

Restore the previous error handler (see "guestfs_push_error_handler").

If you pop the stack too many times, then the default error handler is restored.

guestfs_set_out_of_memory_handler

typedef void (*guestfs_abort_cb) (void);
void guestfs_set_out_of_memory_handler (guestfs_h *g,
                                        guestfs_abort_cb cb);

The callback cb will be called if there is an out of memory situation. Note this callback must not return.

The default is to call abort(3).

You cannot set cb to NULL. You can't ignore out of memory situations.

guestfs_get_out_of_memory_handler

guestfs_abort_cb guestfs_get_out_of_memory_handler (guestfs_h *g);

This returns the current out of memory handler.

API CALLS

__ACTIONS__

STRUCTURES

__STRUCTS__

AVAILABILITY

GROUPS OF FUNCTIONALITY IN THE APPLIANCE

Using "guestfs_available" you can test availability of the following groups of functions. This test queries the appliance to see if the appliance you are currently using supports the functionality.

__AVAILABILITY__

FILESYSTEM AVAILABLE

The "guestfs_filesystem_available" call tests whether a filesystem type is supported by the appliance kernel.

This is mainly useful as a negative test. If this returns true, it doesn't mean that a particular filesystem can be mounted, since mounting can fail for other reasons, such as the filesystem being a later version or having incompatible features.

GUESTFISH supported COMMAND

In guestfish(1) there is a handy interactive command supported which prints out the available groups and whether they are supported by this build of libguestfs. Note however that you have to issue the run command first.

SINGLE CALLS AT COMPILE TIME

Since version 1.5.8, <guestfs.h> defines symbols for each C API function, such as:

#define GUESTFS_HAVE_DD 1

if "guestfs_dd" is available.

Before version 1.5.8, if you needed to test whether a single libguestfs function is available at compile time, we recommended using build tools such as autoconf or cmake. For example in autotools you could use:

AC_CHECK_LIB([guestfs],[guestfs_create])
AC_CHECK_FUNCS([guestfs_dd])

which would result in HAVE_GUESTFS_DD being either defined or not defined in your program.

SINGLE CALLS AT RUN TIME

Testing at compile time doesn't guarantee that a function really exists in the library. The reason is that you might be dynamically linked against a previous libguestfs.so (dynamic library) which doesn't have the call. This situation unfortunately results in a segmentation fault, which is a shortcoming of the C dynamic linking system itself.

You can use dlopen(3) to test if a function is available at run time, as in this example program (note that you still need the compile time check as well):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <dlfcn.h>
#include <guestfs.h>

int
main (void)
{
#ifdef GUESTFS_HAVE_DD
  void *dl;
  int has_function;

  /* Test if the function guestfs_dd is really available. */
  dl = dlopen (NULL, RTLD_LAZY);
  if (!dl) {
    fprintf (stderr, "dlopen: %s\n", dlerror ());
    exit (EXIT_FAILURE);
  }
  has_function = dlsym (dl, "guestfs_dd") != NULL;
  dlclose (dl);

  if (!has_function)
    printf ("this libguestfs.so does NOT have guestfs_dd function\n");
  else {
    printf ("this libguestfs.so has guestfs_dd function\n");
    /* Now it's safe to call
    guestfs_dd (g, "foo", "bar");
    */
  }
#else
  printf ("guestfs_dd function was not found at compile time\n");
#endif
}

You may think the above is an awful lot of hassle, and it is. There are other ways outside of the C linking system to ensure that this kind of incompatibility never arises, such as using package versioning:

Requires: libguestfs >= 1.0.80

CALLS WITH OPTIONAL ARGUMENTS

A recent feature of the API is the introduction of calls which take optional arguments. In C these are declared in three ways. The main way is as a call which takes variable arguments (ie. ...), as in this example:

int guestfs_add_drive_opts (guestfs_h *g, const char *filename, ...);

Call this with a list of optional arguments, terminated by -1. So to call with no optional arguments specified:

guestfs_add_drive_opts (g, filename, -1);

With a single optional argument:

guestfs_add_drive_opts (g, filename,
                        GUESTFS_ADD_DRIVE_OPTS_FORMAT, "qcow2",
                        -1);

With two:

guestfs_add_drive_opts (g, filename,
                        GUESTFS_ADD_DRIVE_OPTS_FORMAT, "qcow2",
                        GUESTFS_ADD_DRIVE_OPTS_READONLY, 1,
                        -1);

and so forth. Don't forget the terminating -1 otherwise Bad Things will happen!

USING va_list FOR OPTIONAL ARGUMENTS

The second variant has the same name with the suffix _va, which works the same way but takes a va_list. See the C manual for details. For the example function, this is declared:

int guestfs_add_drive_opts_va (guestfs_h *g, const char *filename,
                               va_list args);
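If the relationship between a variable-argument call and its va_list variant is unfamiliar, the following self-contained sketch mimics the convention. Nothing in it is libguestfs code: the DEMO_OPT_* keys, struct demo_opts and both functions are hypothetical stand-ins, and the real library may process its arguments differently. It shows a thin variadic wrapper forwarding to a va_list function, which reads key/value pairs until it finds the terminating -1.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical option keys, standing in for the real
   GUESTFS_..._OPTS_* constants. */
enum { DEMO_OPT_READONLY = 0, DEMO_OPT_FORMAT = 1 };
#define DEMO_OPT_READONLY_BITMASK (UINT64_C(1) << DEMO_OPT_READONLY)
#define DEMO_OPT_FORMAT_BITMASK   (UINT64_C(1) << DEMO_OPT_FORMAT)

/* Hypothetical result: which options were given, and their values. */
struct demo_opts {
  uint64_t bitmask;
  int readonly;
  const char *format;
};

/* va_list variant: consume (key, value) pairs up to the -1. */
static int
demo_opts_va (struct demo_opts *opts, va_list args)
{
  int key;

  opts->bitmask = 0;
  while ((key = va_arg (args, int)) != -1) {
    switch (key) {
    case DEMO_OPT_READONLY:
      opts->readonly = va_arg (args, int);
      opts->bitmask |= DEMO_OPT_READONLY_BITMASK;
      break;
    case DEMO_OPT_FORMAT:
      opts->format = va_arg (args, const char *);
      opts->bitmask |= DEMO_OPT_FORMAT_BITMASK;
      break;
    default:
      return -1;  /* unknown key */
    }
  }
  return 0;
}

/* Variadic variant: a thin wrapper around the va_list variant. */
static int
demo_opts_parse (struct demo_opts *opts, ...)
{
  va_list args;
  int r;

  va_start (args, opts);
  r = demo_opts_va (opts, args);
  va_end (args);
  return r;
}
```

A call such as demo_opts_parse (&opts, DEMO_OPT_FORMAT, "qcow2", DEMO_OPT_READONLY, 1, -1) fills in both fields. If the -1 were omitted, the loop would keep reading arguments that were never passed, which is why the terminator matters.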

CONSTRUCTING OPTIONAL ARGUMENTS

The third variant is useful where you need to construct these calls. You pass in a structure where you fill in the optional fields. The structure has a bitmask as the first element which you must set to indicate which fields you have filled in. For our example function the structure and call are declared:

struct guestfs_add_drive_opts_argv {
  uint64_t bitmask;
  int readonly;
  const char *format;
  /* ... */
};
int guestfs_add_drive_opts_argv (guestfs_h *g, const char *filename,
             const struct guestfs_add_drive_opts_argv *optargs);

You could call it like this:

struct guestfs_add_drive_opts_argv optargs = {
  .bitmask = GUESTFS_ADD_DRIVE_OPTS_READONLY_BITMASK |
             GUESTFS_ADD_DRIVE_OPTS_FORMAT_BITMASK,
  .readonly = 1,
  .format = "qcow2"
};

guestfs_add_drive_opts_argv (g, filename, &optargs);

Notes:

  • The _BITMASK suffix on each option name when specifying the bitmask.

  • You do not need to fill in all fields of the structure.

  • There must be a one-to-one correspondence between fields of the structure that are filled in, and bits set in the bitmask.

OPTIONAL ARGUMENTS IN OTHER LANGUAGES

In other languages, optional arguments are expressed in the way that is natural for that language. We refer you to the language-specific documentation for more details on that.

For guestfish, see "OPTIONAL ARGUMENTS" in guestfish(1).

EVENTS

SETTING CALLBACKS TO HANDLE EVENTS

Note: This section documents the generic event mechanism introduced in libguestfs 1.10, which you should use in new code if possible. The old functions guestfs_set_log_message_callback, guestfs_set_subprocess_quit_callback, guestfs_set_launch_done_callback, guestfs_set_close_callback and guestfs_set_progress_callback are no longer documented in this manual page. Because of the ABI guarantee, the old functions continue to work.

Handles generate events when certain things happen, such as log messages being generated, progress messages during long-running operations, or the handle being closed. The API calls described below let you register a callback to be called when events happen. You can register multiple callbacks (for the same, different or overlapping sets of events), and individually remove callbacks. If callbacks are not removed, then they remain in force until the handle is closed.

In the current implementation, events are only generated synchronously: that means that events (and hence callbacks) can only happen while you are in the middle of making another libguestfs call. The callback is called in the same thread.

Events may contain a payload, usually nothing (void), an array of 64 bit unsigned integers, or a message buffer. Payloads are discussed later on.

CLASSES OF EVENTS

GUESTFS_EVENT_CLOSE (payload type: void)

The callback function will be called while the handle is being closed (synchronously from "guestfs_close").

Note that libguestfs installs an atexit(3) handler to try to clean up handles that are open when the program exits. This means that this callback might be called indirectly from exit(3), which can cause unexpected problems in higher-level languages (eg. if your HLL interpreter has already been cleaned up by the time this is called, and if your callback then jumps into some HLL function).

If no callback is registered: the handle is closed without any callback being invoked.

GUESTFS_EVENT_SUBPROCESS_QUIT (payload type: void)

The callback function will be called when the child process quits, either asynchronously or if killed by "guestfs_kill_subprocess". (This corresponds to a transition from any state to the CONFIG state).

If no callback is registered: the event is ignored.

GUESTFS_EVENT_LAUNCH_DONE (payload type: void)

The callback function will be called when the child process becomes ready for the first time after it has been launched. (This corresponds to a transition from LAUNCHING to the READY state).

If no callback is registered: the event is ignored.

GUESTFS_EVENT_PROGRESS (payload type: array of 4 x uint64_t)

Some long-running operations can generate progress messages. If this callback is registered, then it will be called each time a progress message is generated (usually two seconds after the operation started, and three times per second thereafter until it completes, although the frequency may change in future versions).

The callback receives in the payload four unsigned 64 bit numbers which are (in order): proc_nr, serial, position, total.

The units of total are not defined, although for some operations total may relate in some way to the amount of data to be transferred (eg. in bytes or megabytes), and position may be the portion which has been transferred.

The only defined and stable parts of the API are:

  • The callback can display to the user some type of progress bar or indicator which shows the ratio of position:total.

  • 0 <= position <= total

  • If any progress notification is sent during a call, then a final progress notification is always sent when position = total (unless the call fails with an error).

    This is to simplify caller code, so callers can easily set the progress indicator to "100%" at the end of the operation, without requiring special code to detect this case.

  • For some calls we are unable to estimate the progress of the call, but we can still generate progress messages to indicate activity. This is known as "pulse mode", and is directly supported by certain progress bar implementations (eg. GtkProgressBar).

    For these calls, zero or more progress messages are generated with position = 0 and total = 1, followed by a final message with position = total = 1.

    As noted above, if the call fails with an error then the final message may not be generated.

The callback also receives the procedure number (proc_nr) and serial number (serial) of the call. These are only useful for debugging protocol issues, and the callback can normally ignore them. The callback may want to print these numbers in error messages or debugging messages.

If no callback is registered: progress messages are discarded.
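A progress callback usually only needs to turn position and total into something a progress bar can display. The following is a minimal sketch of that calculation, relying only on the guarantees listed above. The helper names progress_percent and payload_percent are hypothetical and not part of libguestfs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: map position:total to a percentage
   in the range 0..100. */
static int
progress_percent (uint64_t position, uint64_t total)
{
  if (total == 0)          /* not expected; avoid dividing by zero */
    return 0;
  if (position >= total)   /* the final notification */
    return 100;
  return (int) (100.0 * (double) position / (double) total);
}

/* Hypothetical helper: read the percentage from the event payload,
   which holds proc_nr, serial, position, total (in that order).
   Returns -1 if the payload is not a progress payload. */
static int
payload_percent (const uint64_t *array, size_t array_len)
{
  if (array == NULL || array_len < 4)
    return -1;
  return progress_percent (array[2], array[3]);
}
```

In an event callback you might call payload_percent (array, array_len) and redraw the indicator with the result. In pulse mode this yields 0 for each activity message and 100 for the final one, so a program that wants a pulsing bar has to handle that case itself.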

GUESTFS_EVENT_APPLIANCE (payload type: message buffer)

The callback function is called whenever a log message is generated by qemu, the appliance kernel, guestfsd (daemon), or utility programs.

If the verbose flag ("guestfs_set_verbose") is set before launch ("guestfs_launch") then additional debug messages are generated.

If no callback is registered: the messages are discarded unless the verbose flag is set in which case they are sent to stderr. You can override the printing of verbose messages to stderr by setting up a callback.

GUESTFS_EVENT_LIBRARY (payload type: message buffer)

The callback function is called whenever a log message is generated by the library part of libguestfs.

If the verbose flag ("guestfs_set_verbose") is set then additional debug messages are generated.

If no callback is registered: the messages are discarded unless the verbose flag is set in which case they are sent to stderr. You can override the printing of verbose messages to stderr by setting up a callback.

GUESTFS_EVENT_WARNING (payload type: message buffer)

The callback function is called whenever a warning message is generated by the library part of libguestfs.

If no callback is registered: the messages are printed to stderr. You can override the printing of warning messages to stderr by setting up a callback.

GUESTFS_EVENT_TRACE (payload type: message buffer)

The callback function is called whenever a trace message is generated. This only applies if the trace flag ("guestfs_set_trace") is set.

If no callback is registered: the messages are sent to stderr. You can override the printing of trace messages to stderr by setting up a callback.

GUESTFS_EVENT_ENTER (payload type: function name)

The callback function is called whenever a libguestfs function is entered.

The payload is a string which contains the name of the function that we are entering (not including guestfs_ prefix).

Note that libguestfs functions can call themselves, so you may see many events from a single call. A few libguestfs functions do not generate this event.

If no callback is registered: the event is ignored.

GUESTFS_EVENT_LIBVIRT_AUTH (payload type: libvirt URI)

For any API function that opens a libvirt connection, this event may be generated to indicate that libvirt demands authentication information. See "LIBVIRT AUTHENTICATION" below.

If no callback is registered: virConnectAuthPtrDefault is used (suitable for command-line programs only).

EVENT API

guestfs_set_event_callback

int guestfs_set_event_callback (guestfs_h *g,
                                guestfs_event_callback cb,
                                uint64_t event_bitmask,
                                int flags,
                                void *opaque);

This function registers a callback (cb) for all event classes in the event_bitmask.

For example, to register for all log message events, you could call this function with the bitmask GUESTFS_EVENT_APPLIANCE|GUESTFS_EVENT_LIBRARY|GUESTFS_EVENT_WARNING. To register a single callback for all possible classes of events, use GUESTFS_EVENT_ALL.

flags should always be passed as 0.

opaque is an opaque pointer which is passed to the callback. You can use it for any purpose.

The return value is the event handle (an integer) which you can use to delete the callback (see below).

If there is an error, this function returns -1, and sets the error in the handle in the usual way (see "guestfs_last_error" etc.)

Callbacks remain in effect until they are deleted, or until the handle is closed.

In the case where multiple callbacks are registered for a particular event class, all of the callbacks are called. The order in which multiple callbacks are called is not defined.

guestfs_delete_event_callback

void guestfs_delete_event_callback (guestfs_h *g, int event_handle);

Delete a callback that was previously registered. event_handle should be the integer that was returned by a previous call to guestfs_set_event_callback on the same handle.

guestfs_event_to_string

char *guestfs_event_to_string (uint64_t event);

event is either a single event or a bitmask of events. This returns a string representation (useful for debugging or printing events).

A single event is returned as the name in lower case, eg. "close".

A bitmask of several events is returned as a comma-separated list, eg. "close,progress".

If zero is passed, then the empty string "" is returned.

On success this returns a string. On error it returns NULL and sets errno.

The returned string must be freed by the caller.

guestfs_event_callback

typedef void (*guestfs_event_callback) (
                 guestfs_h *g,
                 void *opaque,
                 uint64_t event,
                 int event_handle,
                 int flags,
                 const char *buf, size_t buf_len,
                 const uint64_t *array, size_t array_len);

This is the type of the event callback function that you have to provide.

The basic parameters are: the handle (g), the opaque user pointer (opaque), the event class (eg. GUESTFS_EVENT_PROGRESS), the event handle, and flags which in the current API you should ignore.

The remaining parameters contain the event payload (if any). Each event may contain a payload, which usually relates to the event class, but for future proofing your code should be written to handle any payload for any event class.

buf and buf_len contain a message buffer (if buf_len == 0, then there is no message buffer). Note that this message buffer can contain arbitrary 8 bit data, including NUL bytes.

array and array_len describe an array of 64 bit unsigned integers. At the moment this is only used for progress messages.
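Because the message buffer may hold arbitrary bytes, it is safer to treat it as length-delimited data than as a C string. The following sketch shows one possible way to turn it into a printable, NUL-terminated string before logging. The helper name sanitize_message and the choice of '?' as the replacement character are illustrative assumptions, not libguestfs behaviour.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copy at most out_size - 1 bytes of buf into
   out, replacing non-printable bytes (including NUL and newline)
   with '?', and NUL-terminate the result.  Returns the number of
   characters written, not counting the terminator. */
static size_t
sanitize_message (const char *buf, size_t buf_len,
                  char *out, size_t out_size)
{
  size_t i, n = 0;

  if (out_size == 0)
    return 0;

  for (i = 0; i < buf_len && n + 1 < out_size; ++i) {
    unsigned char c = (unsigned char) buf[i];
    out[n++] = isprint (c) ? (char) c : '?';
  }
  out[n] = '\0';
  return n;
}
```

A callback could call sanitize_message (buf, buf_len, line, sizeof line) and then pass line to whatever logging function the program uses.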

EXAMPLE: CAPTURING LOG MESSAGES

A working program demonstrating this can be found in examples/debug-logging.c in the source of libguestfs.

One motivation for the generic event API was to allow GUI programs to capture debug and other messages. In libguestfs ≤ 1.8 these were sent unconditionally to stderr.

Events associated with log messages are: GUESTFS_EVENT_LIBRARY, GUESTFS_EVENT_APPLIANCE, GUESTFS_EVENT_WARNING and GUESTFS_EVENT_TRACE. (Note that error messages are not events; you must capture error messages separately).

Programs have to set up a callback to capture the classes of events of interest:

int eh =
  guestfs_set_event_callback
    (g, message_callback,
     GUESTFS_EVENT_LIBRARY | GUESTFS_EVENT_APPLIANCE |
     GUESTFS_EVENT_WARNING | GUESTFS_EVENT_TRACE,
     0, NULL);
if (eh == -1) {
  // handle error in the usual way
}

The callback can then direct messages to the appropriate place. In this example, messages are directed to syslog:

static void
message_callback (
        guestfs_h *g,
        void *opaque,
        uint64_t event,
        int event_handle,
        int flags,
        const char *buf, size_t buf_len,
        const uint64_t *array, size_t array_len)
{
  const int priority = LOG_USER|LOG_INFO;
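  /* PRIx64 comes from <inttypes.h>.  buf may not be NUL-terminated,
     so limit the output to buf_len bytes. */
  if (buf_len > 0)
    syslog (priority, "event 0x%" PRIx64 ": %.*s",
            event, (int) buf_len, buf);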
}

LIBVIRT AUTHENTICATION

Some libguestfs API calls can open libvirt connections. Currently the only ones are "guestfs_add_domain"; and "guestfs_launch" if the libvirt backend has been selected. Libvirt connections may require authentication, for example if they need to access a remote server or to access root services from non-root. Libvirt authentication happens via a callback mechanism, see http://libvirt.org/guide/html/Application_Development_Guide-Connections.html

You may provide libvirt authentication data by registering a callback for events of type GUESTFS_EVENT_LIBVIRT_AUTH.

If no such event is registered, then libguestfs uses a libvirt function that provides command-line prompts (virConnectAuthPtrDefault). This is only suitable for command-line libguestfs programs.

To provide authentication, first call "guestfs_set_libvirt_supported_credentials" with the list of credentials your program knows how to provide. Second, register a callback for the GUESTFS_EVENT_LIBVIRT_AUTH event. The event handler will be called when libvirt is requesting authentication information.

In the event handler, call "guestfs_get_libvirt_requested_credentials" to get a list of the credentials that libvirt is asking for. You then need to ask (eg. the user) for each credential, and call "guestfs_set_libvirt_requested_credential" with the answer. Note that for each credential, additional information may be available via the calls "guestfs_get_libvirt_requested_credential_prompt", "guestfs_get_libvirt_requested_credential_challenge" or "guestfs_get_libvirt_requested_credential_defresult".

The example program below should make this clearer.

There is also a more substantial working example program supplied with the libguestfs sources, called libvirt-auth.c.

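/* Forward declaration of the event handler defined below. */
static void do_auth (guestfs_h *g, void *opaque, uint64_t event,
                     int event_handle, int flags,
                     const char *buf, size_t buf_len,
                     const uint64_t *array, size_t array_len);

int
main (void)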
{
  guestfs_h *g;
  char *creds[] = { "authname", "passphrase", NULL };
  int r, eh;

  g = guestfs_create ();
  if (!g) exit (EXIT_FAILURE);

  /* Tell libvirt what credentials the program supports. */
  r = guestfs_set_libvirt_supported_credentials (g, creds);
  if (r == -1)
    exit (EXIT_FAILURE);

  /* Set up the event handler. */
  eh = guestfs_set_event_callback (
      g, do_auth,
      GUESTFS_EVENT_LIBVIRT_AUTH, 0, NULL);
  if (eh == -1)
    exit (EXIT_FAILURE);

  /* An example of a call that may ask for credentials. */
  r = guestfs_add_domain (
      g, "dom",
      GUESTFS_ADD_DOMAIN_LIBVIRTURI, "qemu:///system",
      -1);
  if (r == -1)
    exit (EXIT_FAILURE);

  exit (EXIT_SUCCESS);
}

static void
do_auth (guestfs_h *g,
         void *opaque,
         uint64_t event,
         int event_handle,
         int flags,
         const char *buf, size_t buf_len,
         const uint64_t *array, size_t array_len)
{
  char **creds;
  size_t i;
  char *prompt;
  char *reply;
  size_t replylen;
  int r;

  // buf will be the libvirt URI.  buf_len may be ignored.
  printf ("Authentication required for libvirt conn '%s'\n",
          buf);

  // Ask libguestfs what credentials libvirt is demanding.
  creds = guestfs_get_libvirt_requested_credentials (g);
  if (creds == NULL)
    exit (EXIT_FAILURE);

  // Now ask the user for answers.
  for (i = 0; creds[i] != NULL; ++i)
  {
    if (strcmp (creds[i], "authname") == 0 ||
        strcmp (creds[i], "passphrase") == 0)
    {
      prompt =
        guestfs_get_libvirt_requested_credential_prompt (g, i);
      if (prompt && strcmp (prompt, "") != 0)
        printf ("%s: ", prompt);
      free (prompt);

      // Some code here to ask for the credential.
      // ...
      // Put the reply in 'reply', length 'replylen' (bytes).

      r = guestfs_set_libvirt_requested_credential (g, i,
          reply, replylen);
      if (r == -1)
        exit (EXIT_FAILURE);
    }

    free (creds[i]);
  }

  free (creds);
}

CANCELLING LONG TRANSFERS

Some operations can be cancelled by the caller while they are in progress. Currently only operations that involve uploading or downloading data can be cancelled (technically: operations that have FileIn or FileOut parameters in the generator).

To cancel the transfer, call "guestfs_user_cancel". For more information, read the description of "guestfs_user_cancel".

PRIVATE DATA AREA

You can attach named pieces of private data to the libguestfs handle, fetch them by name, and walk over them, for the lifetime of the handle. This is called the private data area and is only available from the C API.

To attach a named piece of data, use the following call:

void guestfs_set_private (guestfs_h *g, const char *key, void *data);

key is the name to associate with this data, and data is an arbitrary pointer (which can be NULL). Any previous item with the same key is overwritten.

You can use any key string you want, but avoid keys beginning with an underscore character (libguestfs uses those for its own internal purposes, such as implementing language bindings). It is recommended that you prefix the key with some unique string to avoid collisions with other users.
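One simple way to follow this advice is to build every key from a fixed application prefix. The following is a minimal sketch; the helper name make_private_key and the "prefix_name" layout are hypothetical conventions chosen for illustration and are not required by libguestfs.

```c
#include <stdio.h>

/* Hypothetical helper: write "<prefix>_<name>" into out.
   Returns 0 on success, or -1 if the prefix is empty, begins with
   an underscore (reserved for libguestfs itself), or the result
   does not fit in out. */
static int
make_private_key (char *out, size_t out_size,
                  const char *prefix, const char *name)
{
  int n;

  if (prefix[0] == '\0' || prefix[0] == '_')
    return -1;

  n = snprintf (out, out_size, "%s_%s", prefix, name);
  if (n < 0 || (size_t) n >= out_size)
    return -1;  /* output error or truncated key */

  return 0;
}
```

For example, make_private_key (key, sizeof key, "myapp", "cache") produces the key "myapp_cache", which can then be passed to guestfs_set_private.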

To retrieve the pointer, use:

void *guestfs_get_private (guestfs_h *g, const char *key);

This function returns NULL if either no data is found associated with key, or if the user previously set the key's data pointer to NULL.

Libguestfs does not try to look at or interpret the data pointer in any way. As far as libguestfs is concerned, it need not be a valid pointer at all. In particular, libguestfs does not try to free the data when the handle is closed. If the data must be freed, then the caller must either free it before calling "guestfs_close" or must set up a close callback to do it (see "GUESTFS_EVENT_CLOSE").

To walk over all entries, use these two functions:

void *guestfs_first_private (guestfs_h *g, const char **key_rtn);

void *guestfs_next_private (guestfs_h *g, const char **key_rtn);

guestfs_first_private returns the first key, pointer pair ("first" does not have any particular meaning -- keys are not returned in any defined order). A pointer to the key is returned in *key_rtn and the corresponding data pointer is returned from the function. NULL is returned if there are no keys stored in the handle.

guestfs_next_private returns the next key, pointer pair. The return value of this function is NULL if there are no further entries to return.

Notes about walking over entries:

  • You must not call guestfs_set_private while walking over the entries.

  • The handle maintains an internal iterator which is reset when you call guestfs_first_private. This internal iterator is invalidated when you call guestfs_set_private.

  • If you have set the data pointer associated with a key to NULL, ie:

    guestfs_set_private (g, key, NULL);

    then that key is not returned when walking.

  • *key_rtn is only valid until the next call to guestfs_first_private, guestfs_next_private or guestfs_set_private.

The following example code shows how to print all keys and data pointers that are associated with the handle g:

const char *key;
void *data = guestfs_first_private (g, &key);
while (data != NULL)
  {
    printf ("key = %s, data = %p\n", key, data);
    data = guestfs_next_private (g, &key);
  }

More commonly you are only interested in keys that begin with an application-specific prefix foo_. Modify the loop like so:

const char *key;
void *data = guestfs_first_private (g, &key);
while (data != NULL)
  {
    if (strncmp (key, "foo_", strlen ("foo_")) == 0)
      printf ("key = %s, data = %p\n", key, data);
    data = guestfs_next_private (g, &key);
  }

If you need to modify keys while walking, then you have to jump back to the beginning of the loop. For example, to delete all keys prefixed with foo_:

const char *key;
void *data;
 again:
data = guestfs_first_private (g, &key);
while (data != NULL)
  {
    if (strncmp (key, "foo_", strlen ("foo_")) == 0)
      {
        guestfs_set_private (g, key, NULL);
        /* note that 'key' pointer is now invalid, and so is
           the internal iterator */
        goto again;
      }
    data = guestfs_next_private (g, &key);
  }

Note that the above loop is guaranteed to terminate because the keys are being deleted, but other manipulations of keys within the loop might not terminate unless you also maintain an indication of which keys have been visited.

SYSTEMTAP

The libguestfs C library can be probed using systemtap or DTrace. This is true of any library, not just libguestfs. However libguestfs also contains static markers to help in probing internal operations.

You can list all the static markers by doing:

stap -l 'process("/usr/lib*/libguestfs.so.0")
             .provider("guestfs").mark("*")'

Note: These static markers are not part of the stable API and may change in future versions.

SYSTEMTAP SCRIPT EXAMPLE

This script contains examples of displaying both the static markers and some ordinary C entry points:

global last;

function display_time () {
      now = gettimeofday_us ();
      delta = 0;
      if (last > 0)
            delta = now - last;
      last = now;

      printf ("%d (+%d):", now, delta);
}

probe begin {
      last = 0;
      printf ("ready\n");
}

/* Display all calls to static markers. */
probe process("/usr/lib*/libguestfs.so.0")
          .provider("guestfs").mark("*") ? {
      display_time();
      printf ("\t%s %s\n", $$name, $$parms);
}

/* Display all calls to guestfs_mkfs* functions. */
probe process("/usr/lib*/libguestfs.so.0")
          .function("guestfs_mkfs*") ? {
      display_time();
      printf ("\t%s %s\n", probefunc(), $$parms);
}

The script above can be saved to /tmp/test.stap and run using the stap(1) program. Note that you either have to be root, or you have to add yourself to several special stap groups. Consult the systemtap documentation for more information.

# stap /tmp/test.stap
ready

In another terminal, run a guestfish command such as this:

guestfish -N fs

In the first terminal, stap trace output similar to this is shown:

1318248056692655 (+0):  launch_start
1318248056692850 (+195):       launch_build_appliance_start
1318248056818285 (+125435):    launch_build_appliance_end
1318248056838059 (+19774):     launch_run_qemu
1318248061071167 (+4233108):   launch_end
1318248061280324 (+209157):    guestfs_mkfs g=0x1024ab0 fstype=0x46116f device=0x1024e60

LIBGUESTFS VERSION NUMBERS

Since April 2010, libguestfs has made separate development and stable releases, along with corresponding branches in our git repository. These separate releases can be identified by version number:

                 even numbers for stable: 1.2.x, 1.4.x, ...
       .-------- odd numbers for development: 1.3.x, 1.5.x, ...
       |
       v
 1  .  3  .  5
 ^           ^
 |           |
 |           `-------- sub-version
 |
 `------ always '1' because we don't change the ABI

Thus "1.3.5" is the 5th update to the development branch "1.3".
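The even/odd rule can be checked mechanically. The following sketch parses a version string of the form shown above and classifies it by the parity of the middle number. The names classify_version and enum branch_kind are hypothetical and exist only for this example.

```c
#include <stdio.h>

enum branch_kind {
  BRANCH_INVALID = -1,
  BRANCH_STABLE = 0,
  BRANCH_DEVELOPMENT = 1
};

/* Hypothetical helper, not part of libguestfs.
   Classify a "major.minor.release" string: an even minor number
   means a stable branch, an odd one a development branch. */
enum branch_kind
classify_version (const char *version)
{
  int major, minor, release;

  if (sscanf (version, "%d.%d.%d", &major, &minor, &release) != 3)
    return BRANCH_INVALID;
  if (major < 0 || minor < 0 || release < 0)
    return BRANCH_INVALID;

  return (minor % 2 == 0) ? BRANCH_STABLE : BRANCH_DEVELOPMENT;
}
```

For example, classify_version ("1.3.5") yields BRANCH_DEVELOPMENT and classify_version ("1.2.0") yields BRANCH_STABLE.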

As time passes we cherry-pick fixes from the development branch and backport them into the stable branch, so the stable branch should get more stable and less buggy over time. The stable releases are therefore ideal for people who don't need new features but would just like the software to work.

Our criteria for backporting changes are:

  • Documentation changes which don't affect any code are backported unless the documentation refers to a future feature which is not in stable.

  • Bug fixes which are not controversial, fix obvious problems, and have been well tested are backported.

  • Simple rearrangements of code which shouldn't affect how it works get backported. This is so that the code in the two branches doesn't get too far out of step, allowing us to backport future fixes more easily.

  • We don't backport new features, new APIs, new tools etc, with one exception: when the new feature is required in order to implement an important bug fix.

A new stable branch starts when we think the new features in development are substantial and compelling enough over the current stable branch to warrant it. When that happens we create new stable and development versions 1.N.0 and 1.(N+1).0 [N is even]. The new dot-oh release won't necessarily be so stable at this point, but by backporting fixes from development, that branch will stabilize over time.

LIMITS

PROTOCOL LIMITS

Internally libguestfs uses a message-based protocol to pass API calls and their responses to and from a small "appliance" (see guestfs-internals(1) for plenty more detail about this). The maximum message size used by the protocol is slightly less than 4 MB. For some API calls you may need to be aware of this limit. The API calls which may be affected are individually documented, with a link back to this section of the documentation.

In libguestfs < 1.19.32, several calls had to encode either their entire argument list or their entire return value (or sometimes both) in a single protocol message, and this gave them an arbitrary limitation on how much data they could handle. For example, "guestfs_cat" could only download a file if it was less than around 4 MB in size. In later versions of libguestfs, some of these limits have been removed. The APIs which were previously limited but are now unlimited (except perhaps by available memory) are listed below. To find out if a specific API is subject to protocol limits, check for the warning in the API documentation which links to this section, and remember to check the version of the documentation that matches the version of libguestfs you are using.

"guestfs_cat", "guestfs_find", "guestfs_read_file", "guestfs_read_lines", "guestfs_write", "guestfs_write_append", "guestfs_lstatlist", "guestfs_lxattrlist", "guestfs_readlinklist", "guestfs_ls".
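If a program has to cope with both older and newer libguestfs, one option is to compare the library version against 1.19.32 before relying on the unlimited behaviour of the calls listed above. The sketch below does this for a version triple. struct lib_version and both function names are hypothetical, and filling in the triple (for example from "guestfs_version") is left to the caller.

```c
#include <stdint.h>

/* Hypothetical type for this example only. */
struct lib_version {
  int64_t major, minor, release;
};

/* Return 1 if *v is at least major.minor.release, else 0. */
int
version_at_least (const struct lib_version *v,
                  int64_t major, int64_t minor, int64_t release)
{
  if (v->major != major)
    return v->major > major;
  if (v->minor != minor)
    return v->minor > minor;
  return v->release >= release;
}

/* Return 1 if the calls listed above are no longer bound by the
   protocol message size, ie. the version is 1.19.32 or later. */
int
listed_calls_unlimited (const struct lib_version *v)
{
  return version_at_least (v, 1, 19, 32);
}
```

This only covers the calls named in the list. Other calls may still carry the limit, so the warning in each call's documentation remains the authoritative source.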

See also "UPLOADING" and "DOWNLOADING" for further information about copying large amounts of data into or out of a filesystem.

MAXIMUM NUMBER OF DISKS

In libguestfs ≥ 1.19.7, you can query the maximum number of disks that may be added by calling "guestfs_max_disks". In earlier versions of libguestfs (ie. where this call is not available) you should assume the maximum is 25.

The rest of this section covers implementation details, which could change in future.

When using virtio-scsi disks (the default if available in qemu) the current limit is 255 disks. When using virtio-blk (the old default) the limit is around 27 disks, but may vary according to implementation details and whether the network is enabled.

Virtio-scsi as used by libguestfs is configured to use one target per disk, and 256 targets are available.

Virtio-blk consumes 1 virtual PCI slot per disk, and PCI is limited to 31 slots, but some of these are used for other purposes.

One virtual disk is used by libguestfs internally.

Before libguestfs 1.19.7, disk names had to be a single character (eg. /dev/sda through /dev/sdz), and since one disk is reserved, that meant the limit was 25. This has been fixed in more recent versions.
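To make the naming arithmetic concrete: single-letter names give 26 disks (/dev/sda through /dev/sdz), and reserving one leaves 25. The sketch below maps a disk index to a name and continues past /dev/sdz with two-letter suffixes (/dev/sdaa, /dev/sdab, ...). That continuation is the usual Linux convention and is assumed here rather than taken from this manual. The function name disk_name is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libguestfs.
   Write "/dev/sd" plus the letter suffix for disk number index
   (0 -> a, 25 -> z, 26 -> aa) into buf.  Returns 0 on success,
   or -1 if buf is too small. */
int
disk_name (size_t index, char *buf, size_t buflen)
{
  static const char prefix[] = "/dev/sd";
  const size_t plen = sizeof prefix - 1;
  char rev[16];                 /* letters, least significant first */
  size_t n = 0, i;

  for (;;) {
    rev[n++] = (char) ('a' + index % 26);
    index /= 26;
    if (index == 0)
      break;
    index--;                    /* bijective base 26: no zero digit */
  }

  if (plen + n + 1 > buflen)
    return -1;

  memcpy (buf, prefix, plen);
  for (i = 0; i < n; i++)
    buf[plen + i] = rev[n - 1 - i];
  buf[plen + n] = '\0';
  return 0;
}
```

Indexes 0 through 25 produce the single-letter names that older versions were restricted to.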

In libguestfs ≥ 1.20 it is possible to hot plug disks. See "HOTPLUGGING".

MAXIMUM NUMBER OF PARTITIONS PER DISK

Virtio limits the maximum number of partitions per disk to 15.

This is because it reserves 4 bits for the minor device number (thus /dev/vda, and /dev/vda1 through /dev/vda15).

If you attach a disk with more than 15 partitions, the extra partitions are ignored by libguestfs.
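The figure of 15 follows from the 4 reserved bits: they give 16 minor numbers per disk, one of which names the whole disk. The sketch below works this out. It assumes that the minor number is formed as the disk index shifted left by 4 bits plus the partition number, which is an illustrative assumption about the driver and not something this manual specifies. The names are hypothetical.

```c
/* Number of minor-number bits reserved per disk (from the text above). */
enum { VIRTIO_PART_BITS = 4 };

/* 2^4 = 16 minors per disk; minor 0 of each disk is the whole
   device, leaving 15 for partitions. */
unsigned
virtio_max_partitions (void)
{
  return (1u << VIRTIO_PART_BITS) - 1;
}

/* Assumed layout: minor = (disk_index << 4) | partnum, where
   partnum 0 means the whole disk.  Returns -1 for a partition
   number that cannot be represented. */
int
virtio_minor (unsigned disk_index, unsigned partnum)
{
  if (partnum > virtio_max_partitions ())
    return -1;
  return (int) ((disk_index << VIRTIO_PART_BITS) | partnum);
}
```

Under this assumption, partition 16 of any disk has no minor number available, which is consistent with extra partitions being ignored.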

MAXIMUM SIZE OF A DISK

Probably the limit is between 2**63-1 and 2**64-1 bytes.

We have tested block devices up to 1 exabyte (2**60 or 1,152,921,504,606,846,976 bytes) using sparse files backed by an XFS host filesystem.

Although libguestfs probably does not impose any limit, the underlying host storage will. If you store disk images on a host ext4 filesystem, then the maximum size will be limited by the maximum ext4 file size (currently 16 TB). If you store disk images as host logical volumes then you are limited by the maximum size of an LV.

For the largest disk image files, we recommend using XFS on the host for storage.

MAXIMUM SIZE OF A PARTITION

The MBR (ie. classic MS-DOS) partitioning scheme uses 32 bit sector numbers. Assuming a 512 byte sector size, this means that MBR cannot address a partition located beyond 2 TB on the disk.

It is recommended that you use GPT partitions on disks which are larger than this size. GPT uses 64 bit sector numbers and so can address partitions which are theoretically larger than the largest disk we could support.
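The 2 TB figure is simply 2**32 sectors multiplied by 512 bytes, which is 2**41 bytes. The sketch below computes the limit for a given sector size and tests whether a disk exceeds it. Both function names are hypothetical.

```c
#include <stdint.h>

/* Bytes addressable with 32 bit sector numbers at the given
   sector size: 2^32 sectors * sector_size. */
uint64_t
mbr_addressable_bytes (uint32_t sector_size)
{
  const uint64_t sectors = UINT64_C (1) << 32;
  return sectors * sector_size;
}

/* Return 1 if a disk of disk_bytes is larger than MBR can
   address, so that GPT is the recommended scheme. */
int
needs_gpt (uint64_t disk_bytes, uint32_t sector_size)
{
  return disk_bytes > mbr_addressable_bytes (sector_size);
}
```

With 512 byte sectors the limit is 2,199,023,255,552 bytes. Devices with 4096 byte sectors would raise it to 2**44 bytes.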

MAXIMUM SIZE OF A FILESYSTEM, FILES, DIRECTORIES

This depends on the filesystem type. Libguestfs itself does not impose any known limit. Consult Wikipedia or the filesystem documentation to find out what these limits are.

MAXIMUM UPLOAD AND DOWNLOAD

The API functions "guestfs_upload", "guestfs_download", "guestfs_tar_in", "guestfs_tar_out" and the like allow unlimited sized uploads and downloads.

INSPECTION LIMITS

The inspection code has several arbitrary limits on things like the size of Windows Registry hive it will read, and the length of product name. These are intended to stop a malicious guest from consuming arbitrary amounts of memory and disk space on the host, and should not be reached in practice. See the source code for more information.

ENVIRONMENT VARIABLES

LIBGUESTFS_APPEND

Pass additional options to the guest kernel.

LIBGUESTFS_ATTACH_METHOD

This is the old way to set LIBGUESTFS_BACKEND.

LIBGUESTFS_BACKEND

Choose the default way to create the appliance. See "guestfs_set_backend" and "BACKEND".

LIBGUESTFS_BACKEND_SETTINGS

A colon-separated list of backend-specific settings. See "BACKEND", "BACKEND SETTINGS".

LIBGUESTFS_CACHEDIR

The location where libguestfs will cache its appliance, when using a supermin appliance. The appliance is cached and shared between all handles which have the same effective user ID.

If LIBGUESTFS_CACHEDIR is not set, then TMPDIR is used. If TMPDIR is not set, then /var/tmp is used.

See also "LIBGUESTFS_TMPDIR", "guestfs_set_cachedir".

LIBGUESTFS_DEBUG

Set LIBGUESTFS_DEBUG=1 to enable verbose messages. This has the same effect as calling guestfs_set_verbose (g, 1).

LIBGUESTFS_HV

Set the default hypervisor (usually qemu) binary that libguestfs uses. If not set, then the qemu which was found at compile time by the configure script is used.

See also "QEMU WRAPPERS" above.

LIBGUESTFS_MEMSIZE

Set the memory allocated to the qemu process, in megabytes. For example:

LIBGUESTFS_MEMSIZE=700

LIBGUESTFS_PATH

Set the path that libguestfs uses to search for a supermin appliance. See the discussion of paths in section "PATH" above.

LIBGUESTFS_QEMU

This is the old way to set LIBGUESTFS_HV.

LIBGUESTFS_TMPDIR

The location where libguestfs will store temporary files used by each handle.

If LIBGUESTFS_TMPDIR is not set, then TMPDIR is used. If TMPDIR is not set, then /tmp is used.

See also "LIBGUESTFS_CACHEDIR", "guestfs_set_tmpdir".

LIBGUESTFS_TRACE

Set LIBGUESTFS_TRACE=1 to enable command traces. This has the same effect as calling guestfs_set_trace (g, 1).

PATH

Libguestfs may run some external programs, and relies on $PATH being set to a reasonable value. If using the libvirt backend, libvirt will not work at all unless $PATH contains the path of qemu/KVM. Note that PHP by default removes $PATH from the environment which tends to break everything.

SUPERMIN_KERNEL
SUPERMIN_KERNEL_VERSION
SUPERMIN_MODULES

These three environment variables allow the kernel that libguestfs uses in the appliance to be selected. If $SUPERMIN_KERNEL is not set, then the most recent host kernel is chosen. For more information about kernel selection, see supermin(1).

TMPDIR

See "LIBGUESTFS_CACHEDIR", "LIBGUESTFS_TMPDIR".
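The fallback order described under "LIBGUESTFS_CACHEDIR" and "LIBGUESTFS_TMPDIR" can be sketched as follows. This models only the environment variables; it does not account for "guestfs_set_cachedir" or "guestfs_set_tmpdir", and it is not the library's own code. All function names are hypothetical.

```c
#include <stdlib.h>

/* Return the first of the three candidates that is set. */
const char *
first_set (const char *specific, const char *tmpdir,
           const char *fallback)
{
  if (specific != NULL)
    return specific;
  if (tmpdir != NULL)
    return tmpdir;
  return fallback;
}

/* LIBGUESTFS_CACHEDIR, else TMPDIR, else /var/tmp. */
const char *
effective_cachedir (void)
{
  return first_set (getenv ("LIBGUESTFS_CACHEDIR"),
                    getenv ("TMPDIR"), "/var/tmp");
}

/* LIBGUESTFS_TMPDIR, else TMPDIR, else /tmp. */
const char *
effective_tmpdir (void)
{
  return first_set (getenv ("LIBGUESTFS_TMPDIR"),
                    getenv ("TMPDIR"), "/tmp");
}
```

Note that setting only TMPDIR moves both locations, whereas the two LIBGUESTFS_ variables let the cache and the per-handle temporary files live in different places.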

SEE ALSO

Examples written in C: guestfs-examples(3).

Language bindings: guestfs-erlang(3), guestfs-golang(3), guestfs-java(3), guestfs-lua(3), guestfs-ocaml(3), guestfs-perl(3), guestfs-python(3), guestfs-ruby(3).

Tools: guestfish(1), guestmount(1), virt-alignment-scan(1), virt-builder(1), virt-cat(1), virt-copy-in(1), virt-copy-out(1), virt-customize(1), virt-df(1), virt-diff(1), virt-edit(1), virt-filesystems(1), virt-format(1), virt-inspector(1), virt-list-filesystems(1), virt-list-partitions(1), virt-log(1), virt-ls(1), virt-make-fs(1), virt-p2v(1), virt-rescue(1), virt-resize(1), virt-sparsify(1), virt-sysprep(1), virt-tar(1), virt-tar-in(1), virt-tar-out(1), virt-v2v(1), virt-win-reg(1).

Other libguestfs topics: guestfs-faq(1), guestfs-hacking(1), guestfs-internals(1), guestfs-performance(1), guestfs-release-notes(1), guestfs-security(1), guestfs-testing(1), libguestfs-test-tool(1), libguestfs-make-fixed-appliance(1).

Related manual pages: supermin(1), qemu(1), hivex(3), stap(1), sd-journal(3).

Website: http://libguestfs.org/

Tools with a similar purpose: fdisk(8), parted(8), kpartx(8), lvm(8), disktype(1).

AUTHORS

Richard W.M. Jones (rjones at redhat dot com)

COPYRIGHT

Copyright (C) 2009-2016 Red Hat Inc.
