kernel-netbook/toi-3.13.patch

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index b9e9bd8..a88912b 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -3343,6 +3343,9 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 					HIGHMEM regardless of setting
 					of CONFIG_HIGHPTE.
 
+	uuid_debug=	(Boolean) whether to enable debugging of TuxOnIce's
+			uuid support.
+
 	vdso=		[X86,SH]
 			vdso=2: enable compat VDSO (default with COMPAT_VDSO)
 			vdso=1: enable VDSO (default)
diff --git a/Documentation/power/tuxonice-internals.txt b/Documentation/power/tuxonice-internals.txt
new file mode 100644
index 0000000..7a96186
--- /dev/null
+++ b/Documentation/power/tuxonice-internals.txt
@@ -0,0 +1,477 @@
+		   TuxOnIce 3.0 Internal Documentation.
+			Updated to 26 March 2009
+
+1.  Introduction.
+
+    TuxOnIce 3.0 is an addition to the Linux Kernel, designed to
+    allow the user to quickly shut down and quickly boot a computer, without
+    needing to close documents or programs. It is equivalent to the
+    hibernate facility in some laptops. This implementation, however,
+    requires no special BIOS or hardware support.
+
+    The code in these files is based upon the original implementation
+    prepared by Gabor Kuti and additional work by Pavel Machek and a
+    host of others. This code has been substantially reworked by Nigel
+    Cunningham, again with the help and testing of many others, not the
+    least of whom is Michael Frank. At its heart, however, the operation is
+    essentially the same as Gabor's version.
+
+2.  Overview of operation.
+
+    The basic sequence of operations is as follows:
+
+	a. Quiesce all other activity.
+	b. Ensure enough memory and storage space are available, and attempt
+	   to free memory/storage if necessary.
+	c. Allocate the required memory and storage space.
+	d. Write the image.
+	e. Power down.
+
+    There are a number of complicating factors which mean that things are
+    not as simple as the above would imply, however...
+
+    o The activity of each process must be stopped at a point where it will
+    not be holding locks necessary for saving the image, or unexpectedly
+    restart operations due to something like a timeout and thereby make
+    our image inconsistent.
+
+    o It is desirable that we sync outstanding I/O to disk before calculating
+    image statistics. This reduces corruption if one should suspend but
+    then not resume, and also makes later parts of the operation safer (see
+    below).
+
+    o We need to get as close as we can to an atomic copy of the data.
+    Inconsistencies in the image will result in inconsistent memory contents at
+    resume time, and thus in instability of the system and/or file system
+    corruption. This would appear to imply a maximum image size of one half of
+    the amount of RAM, but we have a solution... (again, below).
+
+    o In 2.6, we choose to play nicely with the other suspend-to-disk
+    implementations.
+
+3.  Detailed description of internals.
+
+    a. Quiescing activity.
+
+    Safely quiescing the system is achieved using three separate but related
+    aspects.
+
+    First, we note that the vast majority of processes don't need to run during
+    suspend. They can be 'frozen'. We therefore implement a refrigerator
+    routine, which processes enter and in which they remain until the cycle is
+    complete. Processes enter the refrigerator via try_to_freeze() invocations
+    at appropriate places.  A process cannot be frozen in any old place. It
+    must not be holding locks that will be needed for writing the image or
+    freezing other processes. For this reason, userspace processes generally
+    enter the refrigerator via the signal handling code, and kernel threads at
+    the place in their event loops where they drop locks and yield to other
+    processes or sleep.
+
+    The task of freezing processes is complicated by the fact that there can be
+    interdependencies between processes. Freezing process A before process B may
+    mean that process B cannot be frozen, because it stops while waiting for
+    process A rather than in the refrigerator. This issue is seen where
+    userspace waits on freezable kernel threads or fuse filesystem threads. To
+    address this issue, we implement the following algorithm for quiescing
+    activity:
+
+	- Freeze filesystems (including fuse - userspace programs starting
+		new requests are immediately frozen; programs already running
+		requests complete their work before being frozen in the next
+		step)
+	- Freeze userspace
+	- Thaw filesystems (this is safe now that userspace is frozen and no
+		fuse requests are outstanding).
+	- Invoke sys_sync (noop on fuse).
+	- Freeze filesystems
+	- Freeze kernel threads
+
+    If we need to free memory, we thaw kernel threads and filesystems, but not
+    userspace. We can then free caches without worrying about deadlocks due to
+    swap files being on frozen filesystems or such like.
+
+    b. Ensure enough memory & storage are available.
+
+    We have a number of constraints to meet in order to be able to successfully
+    suspend and resume.
+
+    First, the image will be written in two parts, described below. One of these
+    parts needs to have an atomic copy made, which of course implies a maximum
+    size of one half of the amount of system memory. The other part (pageset2)
+    is not atomically copied, and can therefore be as large or small as desired.
+
+    Second, we have constraints on the amount of storage available. In these
+    calculations, we may also consider any compression that will be done. The
+    cryptoapi module allows the user to configure an expected compression ratio.
+
+    Third, the user can specify an arbitrary limit on the image size, in
+    megabytes. This limit is treated as a soft limit, so that we don't fail the
+    attempt to suspend if we cannot meet this constraint.
+
+    c. Allocate the required memory and storage space.
+
+    Having done the initial freeze, we determine whether the above constraints
+    are met, and seek to allocate the metadata for the image. If the constraints
+    are not met, or we fail to allocate the required space for the metadata, we
+    seek to free the amount of memory that we calculate is needed and try again.
+    We allow up to four iterations of this loop before aborting the cycle. If we
+    do fail, it should only be because of a bug in TuxOnIce's calculations.
+
+    These steps are merged together in the prepare_image function, found in
+    prepare_image.c. The functions are merged because of the cyclical nature
+    of the problem of calculating how much memory and storage is needed. Since
+    the data structures containing the information about the image must
+    themselves take memory and use storage, the amount of memory and storage
+    required changes as we prepare the image. Since the changes are not large,
+    only one or two iterations will be required to achieve a solution.
+
+    The recursive nature of the algorithm is minimised by keeping user space
+    frozen while preparing the image, and by the fact that our records of which
+    pages are to be saved and which pageset they are saved in use bitmaps (so
+    that changes in number or fragmentation of the pages to be saved don't
+    feed back via changes in the amount of memory needed for metadata). The
+    recursiveness is thus limited to any extra slab pages allocated to store the
+    extents that record storage used, and the effects of seeking to free memory.
+
+    d. Write the image.
+
+    We previously mentioned the need to create an atomic copy of the data, and
+    the half-of-memory limitation that is implied in this. This limitation is
+    circumvented by dividing the memory to be saved into two parts, called
+    pagesets.
+
+    Pageset2 contains most of the page cache - the pages on the active and
+    inactive LRU lists that aren't needed or modified while TuxOnIce is
+    running, so they can be safely written without an atomic copy. They are
+    therefore saved first and reloaded last. While saving these pages,
+    TuxOnIce carefully ensures that the work of writing the pages doesn't make
+    the image inconsistent. With the support for Kernel (Video) Mode Setting
+    going into the kernel at the time of writing, we need to check for pages
+    on the LRU that are used by KMS, and exclude them from pageset2. They are
+    atomically copied as part of pageset1.
+
+    Once pageset2 has been saved, we prepare to do the atomic copy of remaining
+    memory. As part of the preparation, we power down drivers, thereby providing
+    them with the opportunity to have their state recorded in the image. The
+    amount of memory allocated by drivers for this is usually negligible, but if
+    DRI is in use, video drivers may require significant amounts. Ideally we
+    would be able to query drivers while preparing the image as to the amount of
+    memory they will need. Unfortunately no such mechanism exists at the time of
+    writing. For this reason, TuxOnIce allows the user to set an
+    'extra_pages_allowance', which is used to seek to ensure sufficient memory
+    is available for drivers at this point. TuxOnIce also lets the user set this
+    value to 0. In this case, a test driver suspend is done while preparing the
+    image, and the difference (plus a margin) used instead. TuxOnIce will also
+    automatically restart the hibernation process (twice at most) if it finds
+    that the extra pages allowance is not sufficient. It will then use what was
+    actually needed (plus a margin, again). Failure to hibernate should thus
+    be an extremely rare occurrence.
+
+    Having suspended the drivers, we save the CPU context before making an
+    atomic copy of pageset1, resuming the drivers and saving the atomic copy.
+    After saving the two pagesets, we just need to save our metadata before
+    powering down.
+
+    As we mentioned earlier, the contents of pageset2 pages aren't needed once
+    they've been saved. We therefore use them as the destination of our atomic
+    copy. In the unlikely event that pageset1 is larger, extra pages are
+    allocated while the image is being prepared. This is normally only a real
+    possibility when the system has just been booted and the page cache is
+    small.
+
+    This is where we need to be careful about syncing, however. Pageset2 will
+    probably contain filesystem metadata. If this is overwritten with pageset1
+    and then a sync occurs, the filesystem will be corrupted - at least until
+    resume time and another sync of the restored data. Since there is a
+    possibility that the user might not resume or (may it never be!) that
+    TuxOnIce might oops, we do our utmost to avoid syncing filesystems after
+    copying pageset1.
+
+    e. Power down.
+
+    Powering down uses standard kernel routines. TuxOnIce supports powering down
+    using the ACPI S3, S4 and S5 methods or the kernel's non-ACPI power-off.
+    Supporting suspend to ram (S3) as a power off option might sound strange,
+    but it allows the user to quickly get their system up and running again if
+    the battery doesn't run out (we just need to re-read the overwritten pages)
+    and if the battery does run out (or the user removes power), they can still
+    resume.
+
+4.  Data Structures.
+
+    TuxOnIce uses three main structures to store its metadata and configuration
+    information:
+
+    a) Pageflags bitmaps.
+
+    TuxOnIce records which pages will be in pageset1, pageset2, the destination
+    of the atomic copy and the source of the atomically restored image using
+    bitmaps. The code used is that written for swsusp, with small improvements
+    to match TuxOnIce's requirements.
+
+    The pageset1 bitmap is thus easily stored in the image header for use at
+    resume time.
+
+    As mentioned above, using bitmaps also means that the amount of memory and
+    storage required for recording the above information is constant. This
+    greatly simplifies the work of preparing the image. In earlier versions of
+    TuxOnIce, extents were used to record which pages would be stored. In that
+    case, however, eating memory could result in greater fragmentation of the
+    lists of pages, which in turn required more memory to store the extents and
+    more storage in the image header. These could in turn require further
+    freeing of memory, and another iteration. All of this complexity is removed
+    by having bitmaps.
+
+    Bitmaps also make a lot of sense because TuxOnIce only ever iterates
+    through the lists. There is therefore no cost to not being able to find the
+    nth page in O(1) time. We only need to worry about the cost of finding
+    the n+1th page, given the location of the nth page. Bitwise optimisations
+    help here.
+
+    b) Extents for block data.
+
+    TuxOnIce supports writing the image to multiple block devices. In the case
+    of swap, multiple partitions and/or files may be in use, and we happily use
+    them all (with the exception of compcache pages, which we allocate but do
+    not use). This use of multiple block devices is accomplished as follows:
+
+    Whatever the actual source of the allocated storage, the destination of the
+    image can be viewed in terms of one or more block devices, and on each
+    device, a list of sectors. To simplify matters, we only use contiguous,
+    PAGE_SIZE aligned sectors, like the swap code does.
+
+    Since sector numbers on each bdev may well not start at 0, it makes much
+    more sense to use extents here. Contiguous ranges of pages can thus be
+    represented in the extents by contiguous values.
+
+    Variations in block size are taken account of in transforming this data
+    into the parameters for bio submission.
+
+    We can thus implement a layer of abstraction wherein the core of TuxOnIce
+    doesn't have to worry about which device we're currently writing to or
+    where in the device we are. It simply requests that the next page in the
+    pageset or header be written, leaving the details to this lower layer.
+    The lower layer remembers where in the sequence of devices and blocks each
+    pageset starts. The header always starts at the beginning of the allocated
+    storage.
+
+    So extents are:
+
+    struct extent {
+      unsigned long minimum, maximum;
+      struct extent *next;
+    };
+
+    These are combined into chains of extents for a device:
+
+    struct extent_chain {
+      int size; /* size of the chain, i.e. sum of (max-min+1) */
+      int allocs, frees;
+      char *name;
+      struct extent *first, *last_touched;
+    };
+
+    For each bdev, we need to store a little more info:
+
+    struct suspend_bdev_info {
+       struct block_device *bdev;
+       dev_t dev_t;
+       int bmap_shift;
+       int blocks_per_page;
+    };
+
+    The dev_t is used to identify the device in the stored image. As a result,
+    we expect devices at resume time to have the same major and minor numbers
+    as they had while suspending.  This is primarily a concern where the user
+    utilises LVM for storage, as they will need to dmsetup their partitions in
+    such a way as to maintain this consistency at resume time.
+
+    bmap_shift and blocks_per_page apply the effects of variations in blocks
+    per page settings for the filesystem and underlying bdev. For most
+    filesystems, these are the same, but for xfs, they can have independent
+    values.
+
+    Combining these two structures together, we have everything we need to
+    record what devices and what blocks on each device are being used to
+    store the image, and to submit i/o using submit_bio.
+
+    The last elements in the picture are a means of recording how the storage
+    is being used.
+
+    We do this first and foremost by implementing a layer of abstraction on
+    top of the devices and extent chains which allows us to view however many
+    devices there might be as one long storage tape, with a single 'head' that
+    tracks a 'current position' on the tape:
+
+    struct extent_iterate_state {
+      struct extent_chain *chains;
+      int num_chains;
+      int current_chain;
+      struct extent *current_extent;
+      unsigned long current_offset;
+    };
+
+    That is, *chains points to an array of size num_chains of extent chains.
+    For the filewriter, this is always a single chain. For the swapwriter, the
+    array is of size MAX_SWAPFILES.
+
+    current_chain, current_extent and current_offset thus point to the current
+    index in the chains array (and into a matching array of struct
+    suspend_bdev_info), the current extent in that chain (to optimise access),
+    and the current value in the offset.
+
+    The image is divided into three parts:
+    - The header
+    - Pageset 1
+    - Pageset 2
+
+    The header always starts at the first device and first block. We know its
+    size before we begin to save the image because we carefully account for
+    everything that will be stored in it.
+
+    The second pageset (LRU) is stored first. It begins on the next page after
+    the end of the header.
+
+    The first pageset is stored second. Its start location is only known once
+    pageset2 has been saved, since pageset2 may be compressed as it is written.
+    This location is thus recorded at the end of saving pageset2. It is page
+    aligned also.
+
+    Since this information is needed at resume time, and the location of extents
+    in memory will differ at resume time, this needs to be stored in a portable
+    way:
+
+    struct extent_iterate_saved_state {
+        int chain_num;
+        int extent_num;
+        unsigned long offset;
+    };
+
+    We can thus implement a layer of abstraction wherein the core of TuxOnIce
+    doesn't have to worry about which device we're currently writing to or
+    where in the device we are. It simply requests that the next page in the
+    pageset or header be written, leaving the details to this layer, and
+    invokes the routines to remember and restore the position, without having
+    to worry about the details of how the data is arranged on disk or such like.
+
+    c) Modules
+
+    One aim in designing TuxOnIce was to make it flexible. We wanted to allow
+    for the implementation of different methods of transforming a page to be
+    written to disk and different methods of getting the pages stored.
+
+    In early versions (the betas and perhaps Suspend1), compression support was
+    inlined in the image writing code, and the data structures and code for
+    managing swap were intertwined with the rest of the code. A number of people
+    had expressed interest in implementing image encryption, and alternative
+    methods of storing the image.
+
+    In order to achieve this, TuxOnIce was given a modular design.
+
+    A module is a single file which encapsulates the functionality needed
+    to transform a pageset of data (encryption or compression, for example),
+    or to write the pageset to a device. The former type of module is called
+    a 'page-transformer', the latter a 'writer'.
+
+    Modules are linked together in pipeline fashion. There may be zero or more
+    page transformers in a pipeline, and there is always exactly one writer.
+    The pipeline follows this pattern:
+
+		---------------------------------
+		|          TuxOnIce Core        |
+		---------------------------------
+				|
+				|
+		---------------------------------
+		|	Page transformer 1	|
+		---------------------------------
+				|
+				|
+		---------------------------------
+		|	Page transformer 2	|
+		---------------------------------
+				|
+				|
+		---------------------------------
+		|            Writer		|
+		---------------------------------
+
+    During the writing of an image, the core code feeds pages one at a time
+    to the first module. This module performs whatever transformations it
+    implements on the incoming data, completely consuming the incoming data and
+    feeding output in a similar manner to the next module.
+
+    All routines are SMP safe, and the final result of the transformations is
+    written with an index (provided by the core) and size of the output by the
+    writer. As a result, we can have multithreaded I/O without needing to
+    worry about the sequence in which pages are written (or read).
+
+    During reading, the pipeline works in the reverse direction. The core code
+    calls the first module with the address of a buffer which should be filled.
+    (Note that the buffer size is always PAGE_SIZE at this time). This module
+    will in turn request data from the next module and so on down until the
+    writer is made to read from the stored image.
+
+    Part of the definition of the structure of a module thus looks like this:
+
+        int (*rw_init) (int rw, int stream_number);
+        int (*rw_cleanup) (int rw);
+        int (*write_chunk) (struct page *buffer_page);
+        int (*read_chunk) (struct page *buffer_page, int sync);
+
+    It should be noted that the _cleanup routine may be called before the
+    full stream of data has been read or written. While writing the image,
+    the user may (depending upon settings) choose to abort suspending, and
+    if we are in the midst of writing the last portion of the image, a portion
+    of the second pageset may be reread. This may also happen if an error
+    occurs and we seek to abort the process of writing the image.
+
+    The modular design is also useful in a number of other ways. It provides
+    a means whereby we can add support for:
+
+    - providing overall initialisation and cleanup routines;
+    - serialising configuration information in the image header;
+    - providing debugging information to the user;
+    - determining memory and image storage requirements;
+    - dis/enabling components at run-time;
+    - configuring the module (see below);
+
+    ...and routines for writers specific to their work:
+    - Parsing a resume= location;
+    - Determining whether an image exists;
+    - Marking a resume as having been attempted;
+    - Invalidating an image;
+
+    Since some parts of the core - the user interface and storage manager
+    support - have use for some of these functions, they are registered as
+    'miscellaneous' modules as well.
+
+    d) Sysfs data structures.
+
+    This brings us naturally to support for configuring TuxOnIce. We desired to
+    provide a way to make TuxOnIce as flexible and configurable as possible.
+    The user shouldn't have to reboot just because they want to now hibernate to
+    a file instead of a partition, for example.
+
+    To accomplish this, TuxOnIce implements a very generic means whereby the
+    core and modules can register new sysfs entries. All TuxOnIce entries use
+    a single _store and _show routine, both of which are found in
+    tuxonice_sysfs.c in the kernel/power directory. These routines handle the
+    most common operations - getting and setting the values of bits, integers,
+    longs, unsigned longs and strings in one place, and allow overrides for
+    customised get and set options as well as side-effect routines for all
+    reads and writes.
+
+    When combined with some simple macros, a new sysfs entry can then be defined
+    in just a couple of lines:
+
+        SYSFS_INT("progress_granularity", SYSFS_RW, &progress_granularity, 1,
+                        2048, 0, NULL),
+
+    This defines a sysfs entry named "progress_granularity" which is rw and
+    allows the user to access an integer stored at &progress_granularity, giving
+    it a value between 1 and 2048 inclusive.
+
+    Sysfs entries are registered under /sys/power/tuxonice, and entries for
+    modules are located in a subdirectory named after the module.
+
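Reviewer note on section 4(b) of tuxonice-internals.txt above (commentary
only, it sits between the two file diffs and is not part of either hunk).
The 'storage tape' abstraction can be sketched in plain userspace C. This is
a minimal illustration under stated assumptions, not TuxOnIce's own code: the
three structures are copied from the text, while tape_rewind() and
tape_next() are hypothetical helper names invented here, and each extent
value is treated as one block position on the tape.

```c
#include <assert.h>
#include <stddef.h>

/* Structures as described in the document (userspace copies). */
struct extent {
	unsigned long minimum, maximum;
	struct extent *next;
};

struct extent_chain {
	int size;		/* sum of (maximum - minimum + 1) */
	int allocs, frees;
	char *name;
	struct extent *first, *last_touched;
};

struct extent_iterate_state {
	struct extent_chain *chains;
	int num_chains;
	int current_chain;
	struct extent *current_extent;
	unsigned long current_offset;
};

/*
 * Hypothetical helper: move the head to the first block of the first
 * non-empty chain. Returns 0 on success, -1 if no storage is recorded.
 */
static int tape_rewind(struct extent_iterate_state *s)
{
	assert(s != NULL && s->chains != NULL);

	for (s->current_chain = 0; s->current_chain < s->num_chains;
	     s->current_chain++) {
		struct extent *e = s->chains[s->current_chain].first;

		if (e != NULL) {
			s->current_extent = e;
			s->current_offset = e->minimum;
			return 0;
		}
	}
	s->current_extent = NULL;
	return -1;
}

/*
 * Hypothetical helper: advance the head by one block, crossing extent
 * and chain boundaries as needed. Returns 0, or -1 at the end of tape.
 */
static int tape_next(struct extent_iterate_state *s)
{
	if (s->current_extent == NULL)
		return -1;

	/* Still inside the current extent. */
	if (s->current_offset < s->current_extent->maximum) {
		s->current_offset++;
		return 0;
	}

	/* Next extent in the same chain. */
	if (s->current_extent->next != NULL) {
		s->current_extent = s->current_extent->next;
		s->current_offset = s->current_extent->minimum;
		return 0;
	}

	/* Next non-empty chain, i.e. the next device. */
	for (s->current_chain++; s->current_chain < s->num_chains;
	     s->current_chain++) {
		struct extent *e = s->chains[s->current_chain].first;

		if (e != NULL) {
			s->current_extent = e;
			s->current_offset = e->minimum;
			return 0;
		}
	}
	s->current_extent = NULL;
	return -1;
}
```

Calling tape_rewind() once and then tape_next() until it returns -1 visits
every block of every chain in order. That is all the core needs when it asks
for 'the next page' to be written, which is why it never has to know which
device the head is currently on.
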
diff --git a/Documentation/power/tuxonice.txt b/Documentation/power/tuxonice.txt
new file mode 100644
index 0000000..3bf0575
--- /dev/null
+++ b/Documentation/power/tuxonice.txt
@@ -0,0 +1,948 @@
+	--- TuxOnIce, version 3.0 ---
+
+1.  What is it?
+2.  Why would you want it?
+3.  What do you need to use it?
+4.  Why not just use the version already in the kernel?
+5.  How do you use it?
+6.  What do all those entries in /sys/power/tuxonice do?
+7.  How do you get support?
+8.  I think I've found a bug. What should I do?
+9.  When will XXX be supported?
+10. How does it work?
+11. Who wrote TuxOnIce?
+
+1. What is it?
+
+   Imagine you're sitting at your computer, working away. For some reason, you
+   need to turn off your computer for a while - perhaps it's time to go home
+   for the day. When you come back to your computer next, you're going to want
+   to carry on where you left off. Now imagine that you could push a button and
+   have your computer store the contents of its memory to disk and power down.
+   Then, when you next start up your computer, it loads that image back into
+   memory and you can carry on from where you were, just as if you'd never
+   turned the computer off. Starting up takes far less time, with no reopening
+   of applications or finding what directory you put that file in yesterday.
+   That's what TuxOnIce does.
+
+   TuxOnIce has a long heritage. It began life as work by Gabor Kuti, who,
+   with some help from Pavel Machek, got an early version going in 1999. The
+   project was then taken over by Florent Chabaud while still in alpha version
+   numbers. Nigel Cunningham came on the scene when Florent was unable to
+   continue, moving the project into betas, then 1.0, 2.0 and so on up to
+   the present series. During the 2.0 series, the name was contracted to
+   Suspend2 and the website suspend2.net created. Beginning around July 2007,
+   a transition to calling the software TuxOnIce was made, to seek to help
+   make it clear that TuxOnIce is more concerned with hibernation than suspend
+   to ram.
+
+   Pavel Machek's swsusp code, which was merged around 2.5.17, retains the
+   original name, and was essentially a fork of the beta code until Rafael
+   Wysocki came on the scene in 2005 and began to improve it further.
+
+2. Why would you want it?
+
+   Why wouldn't you want it?
+
+   Being able to save the state of your system and quickly restore it improves
+   your productivity - you get a useful system in far less time than through
+   the normal boot process. You also get to be completely 'green', using zero
+   power, or as close to that as possible (the computer may still provide
+   minimal power to some devices, so they can initiate a power on, but that
+   will be the same amount of power as would be used if you told the computer
+   to shut down).
+
+3. What do you need to use it?
+
+   a. Kernel Support.
+
+   i) The TuxOnIce patch.
+
+   TuxOnIce is part of the Linux Kernel. This version is not part of Linus's
+   2.6 tree at the moment, so you will need to download the kernel source and
+   apply the latest patch. Having done that, enable the appropriate options in
+   make [menu|x]config (under Power Management Options - look for "Enhanced
+   Hibernation"), compile and install your kernel. TuxOnIce works with SMP,
+   Highmem, preemption, fuse filesystems, x86-32, PPC and x86_64.
+
+   TuxOnIce patches are available from http://tuxonice.net.
+
+   ii) Compression support.
+
+   Compression support is implemented via the cryptoapi. You will therefore want
+   to select any Cryptoapi transforms that you want to use on your image from
+   the Cryptoapi menu while configuring your kernel. We recommend the use of the
+   LZO compression method - it is very fast and still achieves good compression.
+
+   You can also tell TuxOnIce to write its image to an encrypted and/or
+   compressed filesystem/swap partition. In that case, you don't need to do
+   anything special for TuxOnIce when it comes to kernel configuration.
+
+   iii) Configuring other options.
+
+   While you're configuring your kernel, try to configure as much as possible
+   to build as modules. We recommend this because there are a number of drivers
+   that are still in the process of implementing proper power management
+   support. In those cases, the best way to work around their current lack is
+   to build them as modules and remove the modules while hibernating. You might
+   also bug the driver authors to get their support up to speed, or even help!
+
+   b. Storage.
+
+   i) Swap.
+
+   TuxOnIce can store the hibernation image in your swap partition, a swap file or
+   a combination thereof. Whichever combination you choose, you will probably
+   want to create enough swap space to store the largest image you could have,
+   plus the space you'd normally use for swap. A good rule of thumb would be
+   to calculate the amount of swap you'd want without using TuxOnIce, and then
+   add the amount of memory you have. This swapspace can be arranged in any way
+   you'd like. It can be in one partition or file, or spread over a number. The
+   only requirement is that they be active when you start a hibernation cycle.
+
+   There is one exception to this requirement. TuxOnIce has the ability to turn
+   on one swap file or partition at the start of hibernating and turn it back off
+   at the end. If you want to ensure you have enough memory to store an image
+   when your memory is fully used, you might want to make one swap partition or
+   file for 'normal' use, and another for TuxOnIce to activate & deactivate
+   automatically. (Further details below).
+
+   ii) Normal files.
+
+   TuxOnIce includes a 'file allocator'. The file allocator can store your
+   image in a simple file. Since Linux has the concept of everything being a
+   file, this is more powerful than it initially sounds. If, for example, you
+   were to set up a network block device file, you could hibernate to a network
+   server. This has been tested and works to a point, but nbd itself isn't
+   stateless enough for our purposes.
+
+   Take extra care when setting up the file allocator. If you just type
+   commands without thinking and then try to hibernate, you could cause
+   irreversible corruption on your filesystems! Make sure you have backups.
+
+   Most people will only want to hibernate to a local file. To achieve that, do
+   something along the lines of:
+
+   echo "TuxOnIce" > /hibernation-file
+   dd if=/dev/zero bs=1M count=512 >> /hibernation-file
+
+   This will create a 512MB file called /hibernation-file. To get TuxOnIce to use
+   it:
+
+   echo /hibernation-file > /sys/power/tuxonice/file/target
+
+   Then
+
+   cat /sys/power/tuxonice/resume
+
+   Put the results of this into your bootloader's configuration (see also step
+   C, below):
+
+   ---EXAMPLE-ONLY-DON'T-COPY-AND-PASTE---
+   # cat /sys/power/tuxonice/resume
+   file:/dev/hda2:0x1e001
+
+   In this example, we would edit the append= line of our lilo.conf|menu.lst
+   so that it included:
+
+   resume=file:/dev/hda2:0x1e001
+   ---EXAMPLE-ONLY-DON'T-COPY-AND-PASTE---
+
+   For those who are thinking 'Could I make the file sparse?', the answer is
+   'No!'. At the moment, there is no way for TuxOnIce to fill in the holes in
+   a sparse file while hibernating. In the longer term (post merge!), I'd like
+   to change things so that the file could be dynamically resized and have
+   holes filled as needed. Right now, however, that's not possible and not a
+   priority.
+
+   c. Bootloader configuration.
+
+   Using TuxOnIce also requires that you add an extra parameter to
+   your lilo.conf or equivalent. Here's an example for a swap partition:
+
+   append="resume=swap:/dev/hda1"
+
+   This would tell TuxOnIce that /dev/hda1 is a swap partition you
+   have. TuxOnIce will use the swap signature of this partition as a
+   pointer to your data when you hibernate. This means that (in this example)
+   /dev/hda1 doesn't need to be _the_ swap partition where all of your data
+   is actually stored. It just needs to be a swap partition that has a
+   valid signature.
+
+   You don't need to have a swap partition for this purpose. TuxOnIce
+   can also use a swap file, but usage is a little more complex. Having made
+   your swap file, turn it on and do
+
+   cat /sys/power/tuxonice/swap/headerlocations
+
+   (this assumes you've already compiled your kernel with TuxOnIce
+   support and booted it). The results of the cat command will tell you
+   what you need to put in lilo.conf:
+
+   For swap partitions like /dev/hda1, simply use resume=/dev/hda1.
+   For swapfile `swapfile`, use resume=swap:/dev/hda2:0x242d.
+
+   If the swapfile changes for any reason (it is moved to a different
+   location, it is deleted and recreated, or the filesystem is
+   defragmented) then you will have to check
+   /sys/power/tuxonice/swap/headerlocations for a new resume_block value.
+
+   Once you've compiled and installed the kernel and adjusted your bootloader
+   configuration, you should only need to reboot for the most basic part
+   of TuxOnIce to be ready.
+
+   If you only compile in the swap allocator, or only compile in the file
+   allocator, you don't need to add the "swap:" part of the resume=
+   parameters above. resume=/dev/hda2:0x242d will work just as well. If you
+   have compiled both and your storage is on swap, you can also use this
+   format (the swap allocator is the default allocator).
+
+   When compiling your kernel, one of the options in the 'Power Management
+   Support' menu, just above the 'Enhanced Hibernation (TuxOnIce)' entry is
+   called 'Default resume partition'. This can be used to set a default value
+   for the resume= parameter.
+
+   d. The hibernate script.
+
+   Since the driver model in 2.6 kernels is still being developed, you may need
+   to do more than just configure TuxOnIce. Users of TuxOnIce usually start the
+   process via a script which prepares for the hibernation cycle, tells the
+   kernel to do its stuff and then restores things afterwards. This script might
+   involve:
+
+   - Switching to a text console and back if X doesn't like the video card
+     status on resume.
+   - Un/reloading drivers that don't play well with hibernation.
+
+   Note that you might not be able to unload some drivers if there are
+   processes using them. You might have to kill off processes that hold
+   devices open. Hint: if your X server accesses a USB mouse, doing a
+   'chvt' to a text console releases the device and you can unload the
+   module.
+
+   Check out the latest script (available on tuxonice.net).
+
+   e. The userspace user interface.
+
+   TuxOnIce has very limited support for displaying status if you only apply
+   the kernel patch - it can printk messages, but that is all. In addition,
+   some of the functions mentioned in this document (such as cancelling a cycle
+   or performing interactive debugging) are unavailable. To utilise these
+   functions, or simply get a nice display, you need the 'userui' component.
+   Userui comes in three flavours: usplash, fbsplash and text. Text should
+   work on any console. Usplash and fbsplash require the appropriate
+   (distro specific?) support.
+
+   To utilise a userui, TuxOnIce just needs to be told where to find the
+   userspace binary:
+
+   echo "/usr/local/sbin/tuxoniceui_fbsplash" > /sys/power/tuxonice/user_interface/program
+
+   The hibernate script can do this for you, and a default value for this
+   setting can be configured when compiling the kernel. This path is also
+   stored in the image header, so if you have an initrd or initramfs, you can
+   use the userui during the first part of resuming (prior to the atomic
+   restore) by putting the binary in the same path in your initrd/ramfs.
+   Alternatively, you can put it in a different location and do an echo
+   similar to the above prior to the echo > do_resume. The value saved in the
+   image header will then be ignored.
+
+4. Why not just use the version already in the kernel?
+
+   The version in the vanilla kernel has a number of drawbacks. The most
+   serious of these are:
+	- it has a maximum image size of 1/2 total memory;
+	- it doesn't allocate storage until after it has snapshotted memory.
+	  This means that you can't be sure hibernating will work until you
+	  see it start to write the image;
+	- it does not allow you to press escape to cancel a cycle;
+	- it does not allow you to press escape to cancel resuming;
+	- it does not allow you to automatically swapon a file when
+	  starting a cycle;
+	- it does not allow you to use multiple swap partitions or files;
+	- it does not allow you to use ordinary files;
+	- it just invalidates an image and continues to boot if you
+	  accidentally boot the wrong kernel after hibernating;
+	- it doesn't support any sort of nice display while hibernating;
+	- it is moving toward requiring that you have an initrd/initramfs
+	  to ever have a hope of resuming (uswsusp). While uswsusp will
+	  address some of the concerns above, it won't address all of them,
+	  and will be more complicated to get set up;
+	- it doesn't have support for suspend-to-both (write a hibernation
+	  image, then suspend to ram; I think this is known as ReadySafe
+	  under M$).
+
+5. How do you use it?
+
+   A hibernation cycle can be started directly by doing:
+
+	echo > /sys/power/tuxonice/do_hibernate
+
+   In practice, though, you'll probably want to use the hibernate script
+   to unload modules, configure the kernel the way you like it and so on.
+   In that case, you'd do (as root):
+
+	hibernate
+
+   See the hibernate script's man page for more details on the options it
+   takes.
+
+   If you're using the text or splash user interface modules, one feature of
+   TuxOnIce that you might find useful is that you can press Escape at any time
+   during hibernating, and the process will be aborted.
+
+   Due to the way hibernation works, this means you'll have your system back and
+   perfectly usable almost instantly. The only exception is when it's at the
+   very end of writing the image. Then it will need to reload a small (usually
+   4-50MB, depending upon the image characteristics) portion first.
+
+   Likewise, when resuming, you can press escape and resuming will be aborted.
+   The computer will then power down again or reboot, according to the
+   powerdown settings in effect at that time.
+
+   You can change the settings for powering down while the image is being
+   written by pressing 'R' to toggle rebooting and 'O' to toggle between
+   suspending to ram and powering down completely.
+
+   If you run into problems with resuming, adding the "noresume" option to
+   the kernel command line will let you skip the resume step and recover your
+   system. This option shouldn't normally be needed, because TuxOnIce modifies
+   the image header prior to the atomic restore, and will thus prompt you
+   if it detects that you've tried to resume an image before (this flag is
+   removed if you press Escape to cancel a resume, so you won't be prompted
+   then).
+
+   Recent kernels (2.6.24 onwards) add support for resuming from a different
+   kernel to the one that was hibernated (thanks to Rafael for his work on
+   this - I've just embraced and enhanced the support for TuxOnIce). This
+   should further reduce the need for you to use the noresume option.
+
+6. What do all those entries in /sys/power/tuxonice do?
+
+   /sys/power/tuxonice is the directory which contains files you can use to
+   tune and configure TuxOnIce to your liking. The exact contents of
+   the directory will depend upon the version of TuxOnIce you're
+   running and the options you selected at compile time. In the following
+   descriptions, names in brackets refer to compile time options.
+   (Note that they're all dependent upon you having selected CONFIG_TUXONICE
+   in the first place!).
+
+   Since the values of these settings can open potential security risks, the
+   writeable ones are accessible only to the root user. You may want to
+   configure sudo to allow you to invoke your hibernate script as an ordinary
+   user.
+
+   - alloc/failure_test
+
+   This debugging option provides a way of testing TuxOnIce's handling of
+   memory allocation failures. Each allocation type that TuxOnIce makes has
+   been given a unique number (see the source code). Echo the appropriate
+   number into this entry, and when TuxOnIce attempts to do that allocation,
+   it will pretend there was a failure and act accordingly.
+
+   - alloc/find_max_mem_allocated
+
+   This debugging option will cause TuxOnIce to find the maximum amount of
+   memory it used during a cycle, and report that information in debugging
+   information at the end of the cycle.
+
+   - alt_resume_param
+
+   Instead of powering down after writing a hibernation image, TuxOnIce
+   supports resuming from a different image. This entry lets you set the
+   location of the signature for that image (the resume= value you'd use
+   for it). Using an alternate image and keep_image mode, you can do things
+   like using an alternate image to power down an uninterruptible power
+   supply.
+
+   - block_io/target_outstanding_io
+
+   This value controls the amount of memory that the block I/O code says it
+   needs when the core code is calculating how much memory is needed for
+   hibernating and for resuming. It doesn't directly control the amount of
+   I/O that is submitted at any one time - that depends on the amount of
+   available memory (we may have more available than we asked for), the
+   throughput that is being achieved and the ability of the CPU to keep up
+   with disk throughput (particularly where we're compressing pages).
+
+   - checksum/enabled
+
+   Use cryptoapi hashing routines to verify that Pageset2 pages don't change
+   while we're saving the first part of the image, and to get any pages that
+   do change resaved in the atomic copy. This should normally not be needed,
+   but if you're seeing issues, please enable this. If your issues stop you
+   being able to resume, enable this option, hibernate and cancel the cycle
+   after the atomic copy is done. If the debugging info shows a non-zero
+   number of pages resaved, please report this to Nigel.
+
+   - compression/algorithm
+
+   Set the cryptoapi algorithm used for compressing the image.
+
+   - compression/expected_compression
+
+   This value allows you to set an expected compression ratio, which TuxOnIce
+   will use in calculating whether it meets constraints on the image size. If
+   this expected compression ratio is not attained, the hibernation cycle will
+   abort, so it is wise to allow some spare. You can see what compression
+   ratio is achieved in the logs after hibernating.
+
+   - debug_info:
+
+   This file returns information about your configuration that may be helpful
+   in diagnosing problems with hibernating.
+
+   - did_suspend_to_both:
+
+   This file can be used when you hibernate with powerdown method 3 (ie suspend
+   to ram after writing the image). There can be two outcomes in this case. We
+   can resume from the suspend-to-ram before the battery runs out, or we can run
+   out of juice and end up resuming like normal. This entry lets you find out,
+   post resume, which way we went. If the value is 1, we resumed from suspend
+   to ram. This can be useful when actions need to be run post suspend-to-ram
+   that don't need to be run if we did the normal resume from power off.
+
+   - do_hibernate:
+
+   When anything is written to this file, the kernel side of TuxOnIce will
+   begin to attempt to write an image to disk and power down. You'll normally
+   want to run the hibernate script instead, to get modules unloaded first.
+
+   - do_resume:
+
+   When anything is written to this file TuxOnIce will attempt to read and
+   restore an image. If there is no image, it will return almost immediately.
+   If an image exists, the echo > will never return. Instead, the original
+   kernel context will be restored and the original echo > do_hibernate will
+   return.
+
+   - */enabled
+
+   These options can be used to temporarily disable various parts of TuxOnIce.
+
+   - extra_pages_allowance
+
+   When TuxOnIce does its atomic copy, it calls the driver model suspend
+   and resume methods. If you have DRI enabled with a driver such as fglrx,
+   this can result in the driver allocating a substantial amount of memory
+   for storing its state. Extra_pages_allowance tells TuxOnIce how much
+   extra memory it should ensure is available for those allocations. If
+   your attempts at hibernating end with a message in dmesg indicating that
+   insufficient extra pages were allowed, you need to increase this value.
+
+   - file/target:
+
+   Read this value to get the current setting. Write to it to point TuxOnIce
+   at a new storage location for the file allocator. See section 3.b.ii above
+   for details of how to set up the file allocator.
+
+   - freezer_test
+
+   This entry can be used to get TuxOnIce to just test the freezer and prepare
+   an image without actually doing a hibernation cycle. It is useful for
+   diagnosing freezing and image preparation issues.
+
+   - full_pageset2
+
+   TuxOnIce divides the pages that are stored in an image into two sets. The
+   difference between the two sets is that pages in pageset 1 are atomically
+   copied, and pages in pageset 2 are written to disk without being copied
+   first. A page CAN be written to disk without being copied first if and only
+   if its contents will not be modified or used at any time after userspace
+   processes are frozen. A page MUST be in pageset 1 if its contents are
+   modified or used at any time after userspace processes have been frozen.
+
+   Normally (ie if this option is enabled), TuxOnIce will put all pages on the
+   per-zone LRUs in pageset2, then remove those pages used by any userspace
+   user interface helper and TuxOnIce storage manager that are running,
+   together with pages used by the GEM memory manager introduced around 2.6.28
+   kernels.
+
+   If this option is disabled, a much more conservative approach will be taken.
+   The only pages in pageset2 will be those belonging to userspace processes,
+   with the exclusion of those belonging to the TuxOnIce userspace helpers
+   mentioned above. This will result in a much smaller pageset2, and will
+   therefore result in smaller images than are possible with this option
+   enabled.
+
+   - ignore_rootfs
+
+   TuxOnIce records which device is mounted as the root filesystem when
+   writing the hibernation image. It will normally check at resume time that
+   this device isn't already mounted - that would be a cause of filesystem
+   corruption. In some particular cases (RAM based root filesystems), you
+   might want to disable this check. This option allows you to do that.
+
+   - image_exists:
+
+   Can be used in a script to determine whether a valid image exists at the
+   location currently pointed to by resume=. Returns up to three lines.
+   The first is whether an image exists (-1 for unsure, otherwise 0 or 1).
+   If an image exists, additional lines will return the machine and version.
+   Echoing anything to this entry removes any current image.
+
+   - image_size_limit:
+
+   The maximum size of hibernation image written to disk, measured in megabytes
+   (1024*1024).
+
+   - last_result:
+
+   The result of the last hibernation cycle, as defined in
+   include/linux/suspend-debug.h with the values SUSPEND_ABORTED to
+   SUSPEND_KEPT_IMAGE. This is a bitmask.
+
+   - late_cpu_hotplug:
+
+   This sysfs entry controls whether cpu hotplugging is done - as normal - just
+   before (unplug) and after (replug) the atomic copy/restore (so that all
+   CPUs/cores are available for multithreaded I/O). The alternative is to
+   unplug all secondary CPUs/cores at the start of hibernating/resuming, and
+   replug them at the end of resuming. No multithreaded I/O will be possible in
+   this configuration, but the odd machine has been reported to require it.
+
+   - lid_file:
+
+   This determines which ACPI button file we look in to determine whether the
+   lid is open or closed after resuming from suspend to disk or power off.
+   If the entry is set to "lid/LID", we'll open /proc/acpi/button/lid/LID/state
+   and check its contents at the appropriate moment. See post_wake_state below
+   for more details on how this entry is used.
+
+   - log_everything (CONFIG_PM_DEBUG):
+
+   Setting this option results in all messages printed being logged. Normally,
+   only a subset are logged, so as to not slow the process and not clutter the
+   logs. Useful for debugging. It can be toggled during a cycle by pressing
+   'L'.
+
+   - no_load_direct:
+
+   This is a debugging option. If, when loading the atomically copied pages of
+   an image, TuxOnIce finds that the destination address for a page is free,
+   it will normally allocate the page, load the data directly into that
+   address and skip it in the atomic restore. If this option is enabled, the
+   page will be loaded somewhere else and atomically restored like other pages.
+
+   - no_flusher_thread:
+
+   When doing multithreaded I/O (see below), the first online CPU can be used
+   to _just_ submit compressed pages when writing the image, rather than
+   compressing and submitting data. This option is normally disabled, but has
+   been included because Nigel would like to see whether it will be more useful
+   as the number of cores/cpus in computers increases.
+
+   - no_multithreaded_io:
+
+   TuxOnIce will normally create one thread per cpu/core on your computer,
+   each of which will then perform I/O. This will generally result in
+   throughput that's the maximum the storage medium can handle. There
+   shouldn't be any reason to disable multithreaded I/O now, but this option
+   has been retained for debugging purposes.
+
+   - no_pageset2
+
+   See the entry for full_pageset2 above for an explanation of pagesets.
+   Enabling this option causes TuxOnIce to do an atomic copy of all pages,
+   thereby limiting the maximum image size to 1/2 of memory, as swsusp does.
+
+   - no_pageset2_if_unneeded
+
+   See the entry for full_pageset2 above for an explanation of pagesets.
+   Enabling this option causes TuxOnIce to act like no_pageset2 was enabled
+   if and only if it isn't needed anyway. This option may still make TuxOnIce
+   less reliable because pageset2 pages are normally used to store the
+   atomic copy - drivers that want to do allocations of larger amounts of
+   memory in one shot will be more likely to find that those amounts aren't
+   available if this option is enabled.
+
+   - pause_between_steps (CONFIG_PM_DEBUG):
+
+   This option is used during debugging, to make TuxOnIce pause between
+   each step of the process. It is ignored when the nice display is on.
+
+   - post_wake_state:
+
+   TuxOnIce provides support for automatically waking after a user-selected
+   delay, and using a different powerdown method if the lid is still closed.
+   (Yes, we're assuming a laptop).  This entry lets you choose what state
+   should be entered next. The values are those described under
+   powerdown_method, below. It can be used to suspend to RAM after hibernating,
+   then power down properly (say) 20 minutes later. It can also be used to
+   power down properly, then wake at (say) 6.30am and suspend to RAM until
+   you're ready to use the machine.
+
+   - powerdown_method:
+
+   Used to select a method by which TuxOnIce should powerdown after writing the
+   image. Currently:
+
+   0: Don't use ACPI to power off.
+   3: Attempt to enter Suspend-to-ram.
+   4: Attempt to enter ACPI S4 mode.
+   5: Attempt to power down via ACPI S5 mode.
+
+   Note that these options are highly dependent upon your hardware & software:
+
+   3: When successful, your machine suspends to ram instead of powering off.
+      The advantage of using this mode is that it doesn't matter whether your
+      battery has enough charge to make it through to your next resume. If it
+      lasts, you will simply resume from suspend to ram (and the image on disk
+      will be discarded). If the battery runs out, you will resume from disk
+      instead. The disadvantage is that it takes longer than a normal
+      suspend-to-ram to enter the state, since the suspend-to-disk image needs
+      to be written first.
+   4/5: When successful, your machine will be off and consume (almost) no power.
+      But it might still react to some external events like opening the lid or
+      traffic on a network or usb device. For the bios, resume is then the same
+      as warm boot, similar to a situation where you used the command `reboot'
+      to reboot your machine. If your machine has problems on warm boot or if
+      you want to protect your machine with the bios password, this is probably
+      not the right choice. Mode 4 may be necessary on some machines where ACPI
+      wake up methods need to be run to properly reinitialise hardware after a
+      hibernation cycle.
+   0: Switch the machine completely off. The only possible wakeup is the power
+      button. For the bios, resume is then the same as a cold boot, in
+      particular you would have to provide your bios boot password if your
+      machine uses that feature for booting.
+
+   - progressbar_granularity_limit:
+
+   This option can be used to limit the granularity of the progress bar
+   displayed with a bootsplash screen. The value is the maximum number of
+   steps. That is, 10 will make the progress bar jump in 10% increments.
+
+   - reboot:
+
+   This option causes TuxOnIce to reboot rather than powering down
+   at the end of saving an image. It can be toggled during a cycle by pressing
+   'R'.
+
+   - resume:
+
+   This sysfs entry can be used to read and set the location in which TuxOnIce
+   will look for the signature of an image - the value set using resume= at
+   boot time or CONFIG_PM_STD_PARTITION ("Default resume partition"). By
+   writing to this file as well as modifying your bootloader's configuration
+   file (eg menu.lst), you can set or reset the location of your image or the
+   method of storing the image without rebooting.
+
+   - replace_swsusp (CONFIG_TOI_REPLACE_SWSUSP):
+
+   This option makes
+
+     echo disk > /sys/power/state
+
+   activate TuxOnIce instead of swsusp. Regardless of whether this option is
+   enabled, any invocation of swsusp's resume time trigger will cause TuxOnIce
+   to check for an image too. This is due to the fact that at resume time, we
+   can't know whether this option was enabled until we see if an image is there
+   for us to resume from. (And when an image exists, we don't care whether we
+   did replace swsusp anyway - we just want to resume).
+
+   - resume_commandline:
+
+   This entry can be read after resuming to see the commandline that was used
+   when resuming began. You might use this to set up two bootloader entries
+   that are the same apart from the fact that one includes an extra append=
+   argument "at_work=1". You could then grep resume_commandline in your
+   post-resume scripts and configure networking (for example) differently
+   depending upon whether you're at home or work. resume_commandline can be
+   set to arbitrary text if you wish to remove sensitive contents.
+
+   - swap/swapfilename:
+
+   This entry is used to specify the swapfile or partition that
+   TuxOnIce will attempt to swapon/swapoff automatically. Thus, if
+   I normally use /dev/hda1 for swap, and want to use /dev/hda2 specifically
+   for my hibernation image, I would
+
+   echo /dev/hda2 > /sys/power/tuxonice/swap/swapfilename
+
+   /dev/hda2 would then be automatically swapon'd and swapoff'd. Note that the
+   swapon and swapoff occur while other processes are frozen (including kswapd)
+   so this swap file will not be used up when attempting to free memory. The
+   partition/file is also given the highest priority, so other swapfiles/partitions
+   will only be used to save the image when this one is filled.
+
+   The value of this file is used by headerlocations along with any currently
+   activated swapfiles/partitions.
+
+   - swap/headerlocations:
+
+   This option tells you the resume= options to use for swap devices you
+   currently have activated. It is particularly useful when you only want to
+   use a swap file to store your image. See above for further details.
+
+   - test_bio
+
+   This is a debugging option. When enabled, TuxOnIce will not hibernate.
+   Instead, when asked to write an image, it will skip the atomic copy,
+   just doing the writing of the image and then returning control to the
+   user at the point where it would have powered off. This is useful for
+   testing throughput in different configurations.
+
+   - test_filter_speed
+
+   This is a debugging option. When enabled, TuxOnIce will not hibernate.
+   Instead, when asked to write an image, it will not write anything or do
+   an atomic copy, but will only run any enabled compression algorithm on the
+   data that would have been written (the source pages of the atomic copy in
+   the case of pageset 1). This is useful for comparing the performance of
+   compression algorithms and for determining the extent to which an upgrade
+   to your storage method would improve hibernation speed.
+
+   - user_interface/debug_sections (CONFIG_PM_DEBUG):
+
+   This value, together with the console log level, controls what debugging
+   information is displayed. The console log level determines the level of
+   detail, and this value determines what detail is displayed. This value is
+   a bit vector, and the meaning of the bits can be found in the kernel tree
+   in include/linux/tuxonice.h. It can be overridden using the kernel's
+   command line option suspend_dbg.
+
+   - user_interface/default_console_level (CONFIG_PM_DEBUG):
+
+   This determines the value of the console log level at the start of a
+   hibernation cycle. If debugging is compiled in, the console log level can be
+   changed during a cycle by pressing the digit keys. Meanings are:
+
+   0: Nice display.
+   1: Nice display plus numerical progress.
+   2: Errors only.
+   3: Low level debugging info.
+   4: Medium level debugging info.
+   5: High level debugging info.
+   6: Verbose debugging info.
+
+   - user_interface/enable_escape:
+
+   Setting this to "1" will enable you to abort a hibernation cycle or resuming
+   by pressing escape; "0" (default) disables this feature. Note that enabling
+   this option means that you cannot initiate a hibernation cycle and then walk
+   away from your computer, expecting it to be secure. With this feature
+   disabled, you can validly have this expectation once TuxOnIce begins to write
+   the image to disk. (Prior to this point, it is possible that TuxOnIce might
+   abort because of failure to freeze all processes or because constraints
+   on its ability to save the image are not met).
+
+   - user_interface/program
+
+   This entry is used to tell TuxOnIce what userspace program to use for
+   providing a user interface while hibernating. The program uses a netlink
+   socket to pass messages back and forth to the kernel, providing all of the
+   functions formerly implemented in the kernel user interface components.
+
+   - version:
+
+   The version of TuxOnIce you have compiled into the currently running kernel.
+
+   - wake_alarm_dir:
+
+   As mentioned above (post_wake_state), TuxOnIce supports automatically waking
+   after some delay. This entry allows you to select which wake alarm to use.
+   It should contain the value "rtc0" if you're wanting to use
+   /sys/class/rtc/rtc0.
+
+   - wake_delay:
+
+   This value determines the delay from the end of writing the image until the
+   wake alarm is triggered. You can set an absolute time by writing the desired
+   time into /sys/class/rtc/<wake_alarm_dir>/wakealarm and leaving this value
+   empty.
+
+   Note that for the wakeup to actually occur, you may need to modify entries
+   in /proc/acpi/wakeup. This is done by echoing the name of the button in the
+   first column (eg PBTN) into the file.
+
+7. How do you get support?
+
+   Glad you asked. TuxOnIce is being actively maintained and supported
+   by Nigel (the guy doing most of the kernel coding at the moment), Bernard
+   (who maintains the hibernate script and userspace user interface components)
+   and its users.
+
+   Resources available include HowTos, FAQs and a Wiki, all available via
+   tuxonice.net.  You can find the mailing lists there.
+
+8. I think I've found a bug. What should I do?
+
+   By far and away, the most common problems people have with TuxOnIce
+   relate to drivers not having adequate power management support. In this
+   case, it is not a bug with TuxOnIce, but we can still help you. As we
+   mentioned above, such issues can usually be worked around by building the
+   functionality as modules and unloading them while hibernating. Please visit
+   the Wiki for up-to-date lists of known issues and work arounds.
+
+   If this information doesn't help, try running:
+
+   hibernate --bug-report
+
+   ..and sending the output to the users mailing list.
+
+   Good information on how to provide us with useful information from an
+   oops is found in the file REPORTING-BUGS, in the top level directory
+   of the kernel tree. If you get an oops, please especially note the
+   information about running what is printed on the screen through ksymoops.
+   The raw information is useless.
+
+9. When will XXX be supported?
+
+   If there's a feature missing from TuxOnIce that you'd like, feel free to
+   ask. We try to be obliging, within reason.
+
+   Patches are welcome. Please send to the list.
+
+10. How does it work?
+
+   TuxOnIce does its work in a number of steps.
+
+   a. Freezing system activity.
+
+   The first main stage in hibernating is to stop all other activity. This is
+   achieved in stages. Processes are considered in four groups, which we will
+   describe in reverse order for clarity's sake: Threads with the PF_NOFREEZE
+   flag, kernel threads without this flag, userspace processes with the
+   PF_SYNCTHREAD flag and all other processes. The first set (PF_NOFREEZE) are
+   untouched by the refrigerator code. They are allowed to run during hibernating
+   and resuming, and are used to support user interaction, storage access or the
+   like. Other kernel threads (those unneeded while hibernating) are frozen last.
+   This leaves us with userspace processes that need to be frozen. When a
+   process enters one of the *_sync system calls, we set a PF_SYNCTHREAD flag on
+   that process for the duration of that call. Processes that have this flag are
+   frozen after processes without it, so that we can seek to ensure that dirty
+   data is synced to disk as quickly as possible in a situation where other
+   processes may be submitting writes at the same time. Freezing the processes
+   that are submitting data stops new I/O from being submitted. Syncthreads can
+   then cleanly finish their work. So the order is:
+
+   - Userspace processes without PF_SYNCTHREAD or PF_NOFREEZE;
+   - Userspace processes with PF_SYNCTHREAD (they won't have NOFREEZE);
+   - Kernel processes without PF_NOFREEZE.
+
+   b. Eating memory.
+
+   For a successful hibernation cycle, you need to have enough disk space to
+   store the image and enough memory for the various limitations of TuxOnIce's
+   algorithm. You can also specify a maximum image size. In order to meet
+   those constraints, TuxOnIce may 'eat' memory. If, after freezing
+   processes, the constraints aren't met, TuxOnIce will thaw all the
+   other processes and begin to eat memory until its calculations indicate
+   the constraints are met. It will then freeze processes again and recheck
+   its calculations.
+
+   c. Allocation of storage.
+
+   Next, TuxOnIce allocates the storage that will be used to save
+   the image.
+
+   The core of TuxOnIce knows nothing about how or where pages are stored. We
+   therefore request the active allocator (remember you might have compiled in
+   more than one!) to allocate enough storage for our expected image size. If
+   this request cannot be fulfilled, we eat more memory and try again. If it
+   is fulfilled, we seek to allocate additional storage, just in case our
+   expected compression ratio (if any) isn't achieved. This time, however, we
+   just continue if we can't allocate enough storage.
+
+   If these calls to our allocator change the characteristics of the image
+   such that we haven't allocated enough memory, we also loop. (The allocator
+   may well need to allocate space for its storage information).
+
+   d. Write the first part of the image.
+
+   TuxOnIce stores the image in two sets of pages called 'pagesets'.
+   Pageset 2 contains pages on the active and inactive lists; essentially
+   the page cache. Pageset 1 contains all other pages, including the kernel.
+   We use two pagesets for one important reason: We need to make an atomic copy
+   of the kernel to ensure consistency of the image. Without a second pageset,
+   that would limit us to an image that was at most half the amount of memory
+   available. Using two pagesets allows us to store a full image. Since pageset
+   2 pages won't be needed in saving pageset 1, we first save pageset 2 pages.
+   We can then make our atomic copy of the remaining pages using both pageset 2
+   pages and any other pages that are free. While saving both pagesets, we are
+   careful not to corrupt the image. Among other things, we use low-level block
+   I/O routines that don't change the pagecache contents.
+
+   The next step, then, is writing pageset 2.
+
+   e. Suspending drivers and storing processor context.
+
+   Having written pageset 2, TuxOnIce calls the power management functions to
+   notify drivers of the hibernation, and saves the processor state in preparation
+   for the atomic copy of memory we are about to make.
+
+   f. Atomic copy.
+
+   At this stage, everything but the TuxOnIce code is halted. Processes
+   are frozen or idling, drivers are quiesced and have stored (ideally and where
+   necessary) their configuration in memory we are about to atomically copy.
+   In our low-level architecture-specific code, we have saved the CPU state.
+   We can therefore now do our atomic copy before resuming drivers etc.
+
+   g. Save the atomic copy (pageset 1).
+
+   TuxOnIce can then write the atomic copy of the remaining pages. Since we
+   have copied the pages into other locations, we can continue to use the
+   normal block I/O routines without fear of corrupting our image.
+
+   h. Save the image header.
+
+   Nearly there! We save our settings and other parameters needed for
+   reloading pageset 1 in an 'image header'. We also tell our allocator to
+   serialise its data at this stage, so that it can reread the image at resume
+   time.
+
+   i. Set the image header.
+
+   Finally, we edit the header at our resume= location. The signature is
+   changed by the allocator to reflect the fact that an image exists, and to
+   point to the start of that data if necessary (swap allocator).
+
+   j. Power down.
+
+   Or reboot if we're debugging and the appropriate option is selected.
+
+   Whew!
+
+   Reloading the image.
+   --------------------
+
+   Reloading the image is essentially the reverse of all the above. We load
+   our copy of pageset 1, being careful to choose locations that aren't going
+   to be overwritten as we copy it back (We start very early in the boot
+   process, so there are no other processes to quiesce here). We then copy
+   pageset 1 back to its original location in memory and restore the processor
+   context. We are now running with the original kernel. Next, we reload the
+   pageset 2 pages, free the memory and swap used by TuxOnIce, restore
+   the pageset header and restart processes. Sounds easy in comparison to
+   hibernating, doesn't it!
+
+   There is of course more to TuxOnIce than this, but this explanation
+   should be a good start. If there's interest, I'll write further
+   documentation on range pages and the low level I/O.
+
+11. Who wrote TuxOnIce?
+
+   (Answer based on the writings of Florent Chabaud, credits in files and
+   Nigel's limited knowledge; apologies to anyone missed out!)
+
+   The main developers of TuxOnIce have been...
+
+   Gabor Kuti
+   Pavel Machek
+   Florent Chabaud
+   Bernard Blackham
+   Nigel Cunningham
+
+   Significant portions of swsusp, the code in the vanilla kernel which
+   TuxOnIce enhances, have been worked on by Rafael Wysocki. Thanks should
+   also be expressed to him.
+
+   The above mentioned developers have been aided in their efforts by a host
+   of hundreds, if not thousands, of testers and people who have submitted bug
+   fixes & suggestions. Of special note are the efforts of Michael Frank, who
+   had his computers repetitively hibernate and resume for literally tens of
+   thousands of cycles and developed scripts to stress the system and test
+   TuxOnIce far beyond the point most of us (Nigel included!) would consider
+   testing. His efforts have contributed as much to TuxOnIce as any of the
+   names above.
diff --git a/MAINTAINERS b/MAINTAINERS
index 6a6e4ac..bc989c4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8809,6 +8809,13 @@ S:	Maintained
 F:	drivers/tc/
 F:	include/linux/tc.h
 
+TUXONICE (ENHANCED HIBERNATION)
+P:	Nigel Cunningham
+M:	nigel@tuxonice.net
+L:	tuxonice-devel@tuxonice.net
+W:	http://tuxonice.net
+S:	Maintained
+
 U14-34F SCSI DRIVER
 M:	Dario Ballabio <ballabio_dario@emc.com>
 L:	linux-scsi@vger.kernel.org
 
diff --git a/arch/arm/mach-pxa/am300epd.c b/arch/arm/mach-pxa/am300epd.c
index e355657..3dfec1e 100644
--- a/arch/arm/mach-pxa/am300epd.c
+++ b/arch/arm/mach-pxa/am300epd.c
@@ -30,7 +30,6 @@
 
 #include <mach/gumstix.h>
 #include <mach/mfp-pxa25x.h>
-#include <mach/irqs.h>
 #include <linux/platform_data/video-pxafb.h>
 
 #include "generic.h"
diff --git a/arch/arm/mach-pxa/include/mach/balloon3.h b/arch/arm/mach-pxa/include/mach/balloon3.h
index 1b08259..954641e 100644
--- a/arch/arm/mach-pxa/include/mach/balloon3.h
+++ b/arch/arm/mach-pxa/include/mach/balloon3.h
@@ -14,8 +14,6 @@
 #ifndef ASM_ARCH_BALLOON3_H
 #define ASM_ARCH_BALLOON3_H
 
-#include "irqs.h" /* PXA_NR_BUILTIN_GPIO */
-
 enum balloon3_features {
 	BALLOON3_FEATURE_OHCI,
 	BALLOON3_FEATURE_MMC,
diff --git a/arch/arm/mach-pxa/include/mach/corgi.h b/arch/arm/mach-pxa/include/mach/corgi.h
index c030d95..f3c3493 100644
--- a/arch/arm/mach-pxa/include/mach/corgi.h
+++ b/arch/arm/mach-pxa/include/mach/corgi.h
@@ -13,7 +13,6 @@
 #ifndef __ASM_ARCH_CORGI_H
 #define __ASM_ARCH_CORGI_H  1
 
-#include "irqs.h" /* PXA_NR_BUILTIN_GPIO */
 
 /*
  * Corgi (Non Standard) GPIO Definitions
diff --git a/arch/arm/mach-pxa/include/mach/csb726.h b/arch/arm/mach-pxa/include/mach/csb726.h
index 00cfbbb..2628e7b 100644
--- a/arch/arm/mach-pxa/include/mach/csb726.h
+++ b/arch/arm/mach-pxa/include/mach/csb726.h
@@ -11,8 +11,6 @@
 #ifndef CSB726_H
 #define CSB726_H
 
-#include "irqs.h" /* PXA_GPIO_TO_IRQ */
-
 #define CSB726_GPIO_IRQ_LAN	52
 #define CSB726_GPIO_IRQ_SM501	53
 #define CSB726_GPIO_MMC_DETECT	100
diff --git a/arch/arm/mach-pxa/include/mach/gumstix.h b/arch/arm/mach-pxa/include/mach/gumstix.h
index f7df27b..dba14b6 100644
--- a/arch/arm/mach-pxa/include/mach/gumstix.h
+++ b/arch/arm/mach-pxa/include/mach/gumstix.h
@@ -6,7 +6,6 @@
  * published by the Free Software Foundation.
  */
 
-#include "irqs.h" /* PXA_GPIO_TO_IRQ */
 
 /* BTRESET - Reset line to Bluetooth module, active low signal. */
 #define GPIO_GUMSTIX_BTRESET          7
diff --git a/arch/arm/mach-pxa/include/mach/idp.h b/arch/arm/mach-pxa/include/mach/idp.h
index 7e63f46..22a96f8 100644
--- a/arch/arm/mach-pxa/include/mach/idp.h
+++ b/arch/arm/mach-pxa/include/mach/idp.h
@@ -23,7 +23,6 @@
  * IDP hardware.
  */
 
-#include "irqs.h" /* PXA_GPIO_TO_IRQ */
 
 #define IDP_FLASH_PHYS		(PXA_CS0_PHYS)
 #define IDP_ALT_FLASH_PHYS	(PXA_CS1_PHYS)
diff --git a/arch/arm/mach-pxa/include/mach/palmld.h b/arch/arm/mach-pxa/include/mach/palmld.h
index b184f29..2c44713 100644
--- a/arch/arm/mach-pxa/include/mach/palmld.h
+++ b/arch/arm/mach-pxa/include/mach/palmld.h
@@ -13,8 +13,6 @@
 #ifndef _INCLUDE_PALMLD_H_
 #define _INCLUDE_PALMLD_H_
 
-#include "irqs.h" /* PXA_GPIO_TO_IRQ */
-
 /** HERE ARE GPIOs **/
 
 /* GPIOs */
diff --git a/arch/arm/mach-pxa/include/mach/palmt5.h b/arch/arm/mach-pxa/include/mach/palmt5.h
index e342c59..0bd4f03 100644
--- a/arch/arm/mach-pxa/include/mach/palmt5.h
+++ b/arch/arm/mach-pxa/include/mach/palmt5.h
@@ -15,8 +15,6 @@
 #ifndef _INCLUDE_PALMT5_H_
 #define _INCLUDE_PALMT5_H_
 
-#include "irqs.h" /* PXA_GPIO_TO_IRQ */
-
 /** HERE ARE GPIOs **/
 
 /* GPIOs */
diff --git a/arch/arm/mach-pxa/include/mach/palmtc.h b/arch/arm/mach-pxa/include/mach/palmtc.h
index 81c727b..c383a21 100644
--- a/arch/arm/mach-pxa/include/mach/palmtc.h
+++ b/arch/arm/mach-pxa/include/mach/palmtc.h
@@ -16,8 +16,6 @@
 #ifndef _INCLUDE_PALMTC_H_
 #define _INCLUDE_PALMTC_H_
 
-#include "irqs.h" /* PXA_GPIO_TO_IRQ */
-
 /** HERE ARE GPIOs **/
 
 /* GPIOs */
diff --git a/arch/arm/mach-pxa/include/mach/palmtx.h b/arch/arm/mach-pxa/include/mach/palmtx.h
index 92bc1f0..f2e5303 100644
--- a/arch/arm/mach-pxa/include/mach/palmtx.h
+++ b/arch/arm/mach-pxa/include/mach/palmtx.h
@@ -16,8 +16,6 @@
 #ifndef _INCLUDE_PALMTX_H_
 #define _INCLUDE_PALMTX_H_
 
-#include "irqs.h" /* PXA_GPIO_TO_IRQ */
-
 /** HERE ARE GPIOs **/
 
 /* GPIOs */
diff --git a/arch/arm/mach-pxa/include/mach/pcm027.h b/arch/arm/mach-pxa/include/mach/pcm027.h
index 86ebd7b..6bf28de 100644
--- a/arch/arm/mach-pxa/include/mach/pcm027.h
+++ b/arch/arm/mach-pxa/include/mach/pcm027.h
@@ -23,8 +23,6 @@
  * Definitions of CPU card resources only
  */
 
-#include "irqs.h" /* PXA_GPIO_TO_IRQ */
-
 /* phyCORE-PXA270 (PCM027) Interrupts */
 #define PCM027_IRQ(x)          (IRQ_BOARD_START + (x))
 #define PCM027_BTDET_IRQ       PCM027_IRQ(0)
diff --git a/arch/arm/mach-pxa/include/mach/pcm990_baseboard.h b/arch/arm/mach-pxa/include/mach/pcm990_baseboard.h
index 7e544c1..0260aaa 100644
--- a/arch/arm/mach-pxa/include/mach/pcm990_baseboard.h
+++ b/arch/arm/mach-pxa/include/mach/pcm990_baseboard.h
@@ -20,7 +20,6 @@
  */
 
 #include <mach/pcm027.h>
-#include "irqs.h" /* PXA_GPIO_TO_IRQ */
 
 /*
  * definitions relevant only when the PCM-990
diff --git a/arch/arm/mach-pxa/include/mach/poodle.h b/arch/arm/mach-pxa/include/mach/poodle.h
index b56b193..f32ff75 100644
--- a/arch/arm/mach-pxa/include/mach/poodle.h
+++ b/arch/arm/mach-pxa/include/mach/poodle.h
@@ -15,8 +15,6 @@
 #ifndef __ASM_ARCH_POODLE_H
 #define __ASM_ARCH_POODLE_H  1
 
-#include "irqs.h" /* PXA_GPIO_TO_IRQ */
-
 /*
  * GPIOs
  */
diff --git a/arch/arm/mach-pxa/include/mach/spitz.h b/arch/arm/mach-pxa/include/mach/spitz.h
index 25c9f62..0bfe650 100644
--- a/arch/arm/mach-pxa/include/mach/spitz.h
+++ b/arch/arm/mach-pxa/include/mach/spitz.h
@@ -15,8 +15,8 @@
 #define __ASM_ARCH_SPITZ_H  1
 #endif
 
-#include "irqs.h" /* PXA_NR_BUILTIN_GPIO, PXA_GPIO_TO_IRQ */
 #include <linux/fb.h>
+#include <linux/gpio.h>
 
 /* Spitz/Akita GPIOs */
 
diff --git a/arch/arm/mach-pxa/include/mach/tosa.h b/arch/arm/mach-pxa/include/mach/tosa.h
index 0497d95..2bb0e86 100644
--- a/arch/arm/mach-pxa/include/mach/tosa.h
+++ b/arch/arm/mach-pxa/include/mach/tosa.h
@@ -13,8 +13,6 @@
 #ifndef _ASM_ARCH_TOSA_H_
 #define _ASM_ARCH_TOSA_H_ 1
 
-#include "irqs.h" /* PXA_NR_BUILTIN_GPIO */
-
 /*  TOSA Chip selects  */
 #define TOSA_LCDC_PHYS		PXA_CS4_PHYS
 /* Internel Scoop */
diff --git a/arch/arm/mach-pxa/include/mach/trizeps4.h b/arch/arm/mach-pxa/include/mach/trizeps4.h
index ae3ca01..d2ca010 100644
--- a/arch/arm/mach-pxa/include/mach/trizeps4.h
+++ b/arch/arm/mach-pxa/include/mach/trizeps4.h
@@ -10,8 +10,6 @@
 #ifndef _TRIPEPS4_H_
 #define _TRIPEPS4_H_
 
-#include "irqs.h" /* PXA_GPIO_TO_IRQ */
-
 /* physical memory regions */
 #define TRIZEPS4_FLASH_PHYS	(PXA_CS0_PHYS)  /* Flash region */
 #define TRIZEPS4_DISK_PHYS	(PXA_CS1_PHYS)  /* Disk On Chip region */
diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
index 015ae55..75d4f73 100644
--- a/arch/powerpc/kernel/machine_kexec.c
+++ b/arch/powerpc/kernel/machine_kexec.c
@@ -196,9 +196,7 @@ int overlaps_crashkernel(unsigned long start, unsigned long size)
 
 /* Values we need to export to the second kernel via the device tree. */
 static phys_addr_t kernel_end;
-static phys_addr_t crashk_base;
 static phys_addr_t crashk_size;
-static unsigned long long mem_limit;
 
 static struct property kernel_end_prop = {
 	.name = "linux,kernel-end",
@@ -209,7 +207,7 @@ static struct property kernel_end_prop = {
 static struct property crashk_base_prop = {
 	.name = "linux,crashkernel-base",
 	.length = sizeof(phys_addr_t),
-	.value = &crashk_base
+	.value = &crashk_res.start,
 };
 
 static struct property crashk_size_prop = {
@@ -221,11 +219,9 @@ static struct property crashk_size_prop = {
 static struct property memory_limit_prop = {
 	.name = "linux,memory-limit",
 	.length = sizeof(unsigned long long),
-	.value = &mem_limit,
+	.value = &memory_limit,
 };
 
-#define cpu_to_be_ulong	__PASTE(cpu_to_be, BITS_PER_LONG)
-
 static void __init export_crashk_values(struct device_node *node)
 {
 	struct property *prop;
@@ -241,9 +237,8 @@ static void __init export_crashk_values(struct device_node *node)
 		of_remove_property(node, prop);
 
 	if (crashk_res.start != 0) {
-		crashk_base = cpu_to_be_ulong(crashk_res.start),
 		of_add_property(node, &crashk_base_prop);
-		crashk_size = cpu_to_be_ulong(resource_size(&crashk_res));
+		crashk_size = resource_size(&crashk_res);
 		of_add_property(node, &crashk_size_prop);
 	}
 
@@ -251,7 +246,6 @@ static void __init export_crashk_values(struct device_node *node)
 	 * memory_limit is required by the kexec-tools to limit the
 	 * crash regions to the actual memory used.
 	 */
-	mem_limit = cpu_to_be_ulong(memory_limit);
 	of_update_property(node, &memory_limit_prop);
 }
 
@@ -270,7 +264,7 @@ static int __init kexec_setup(void)
 		of_remove_property(node, prop);
 
 	/* information needed by userspace when using default_machine_kexec */
-	kernel_end = cpu_to_be_ulong(__pa(_end));
+	kernel_end = __pa(_end);
 	of_add_property(node, &kernel_end_prop);
 
 	export_crashk_values(node);
diff --git a/arch/powerpc/kernel/machine_kexec_64.c b/arch/powerpc/kernel/machine_kexec_64.c
index 59d229a..be4e6d6 100644
--- a/arch/powerpc/kernel/machine_kexec_64.c
+++ b/arch/powerpc/kernel/machine_kexec_64.c
@@ -369,7 +369,6 @@ void default_machine_kexec(struct kimage *image)
 
 /* Values we need to export to the second kernel via the device tree. */
 static unsigned long htab_base;
-static unsigned long htab_size;
 
 static struct property htab_base_prop = {
 	.name = "linux,htab-base",
@@ -380,7 +379,7 @@ static struct property htab_base_prop = {
 static struct property htab_size_prop = {
 	.name = "linux,htab-size",
 	.length = sizeof(unsigned long),
-	.value = &htab_size,
+	.value = &htab_size_bytes,
 };
 
 static int __init export_htab_values(void)
@@ -404,9 +403,8 @@ static int __init export_htab_values(void)
 	if (prop)
 		of_remove_property(node, prop);
 
-	htab_base = cpu_to_be64(__pa(htab_address));
+	htab_base = __pa(htab_address);
 	of_add_property(node, &htab_base_prop);
-	htab_size = cpu_to_be64(htab_size_bytes);
 	of_add_property(node, &htab_size_prop);
 
 	of_node_put(node);
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 5b96017..d0e7c5e 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -436,6 +436,7 @@ void kernel_map_pages(struct page *page, int numpages, int enable)
 
 	change_page_attr(page, numpages, enable ? PAGE_KERNEL : __pgprot(0));
 }
+EXPORT_SYMBOL_GPL(kernel_map_pages);
 #endif /* CONFIG_DEBUG_PAGEALLOC */
 
 static int fixmaps;
diff --git a/arch/powerpc/platforms/83xx/suspend.c b/arch/powerpc/platforms/83xx/suspend.c
index 3d9716c..747015b 100644
--- a/arch/powerpc/platforms/83xx/suspend.c
+++ b/arch/powerpc/platforms/83xx/suspend.c
@@ -265,6 +265,8 @@ static int mpc83xx_suspend_begin(suspend_state_t state)
 
 static int agent_thread_fn(void *data)
 {
+	set_freezable();
+
 	while (1) {
 		wait_event_interruptible(agent_wq, pci_pm_state >= 2);
 		try_to_freeze();
diff --git a/arch/powerpc/platforms/ps3/device-init.c b/arch/powerpc/platforms/ps3/device-init.c
index 3f175e8..b5d59c6 100644
--- a/arch/powerpc/platforms/ps3/device-init.c
+++ b/arch/powerpc/platforms/ps3/device-init.c
@@ -841,6 +841,8 @@ static int ps3_probe_thread(void *data)
 	if (res)
 		goto fail_free_irq;
 
+	set_freezable();
+
 	/* Loop here processing the requested notification events. */
 	do {
 		try_to_freeze();
diff --git a/arch/s390/kernel/head64.S b/arch/s390/kernel/head64.S
index d7c0050..b9e25ae 100644
--- a/arch/s390/kernel/head64.S
+++ b/arch/s390/kernel/head64.S
@@ -59,7 +59,7 @@ ENTRY(startup_continue)
 	.quad	0			# cr12: tracing off
 	.quad	0			# cr13: home space segment table
 	.quad	0xc0000000		# cr14: machine check handling off
-	.quad	.Llinkage_stack		# cr15: linkage stack operations
+	.quad	0			# cr15: linkage stack operations
 .Lpcmsk:.quad	0x0000000180000000
 .L4malign:.quad 0xffffffffffc00000
 .Lscan2g:.quad	0x80000000 + 0x20000 - 8	# 2GB + 128K - 8
@@ -67,15 +67,12 @@ ENTRY(startup_continue)
 .Lparmaddr:
 	.quad	PARMAREA
 	.align	64
-.Lduct: .long	0,.Laste,.Laste,0,.Lduald,0,0,0
+.Lduct: .long	0,0,0,0,.Lduald,0,0,0
 	.long	0,0,0,0,0,0,0,0
-.Laste:	.quad	0,0xffffffffffffffff,0,0,0,0,0,0
 	.align	128
 .Lduald:.rept	8
 	.long	0x80000000,0,0,0	# invalid access-list entries
 	.endr
-.Llinkage_stack:
-	.long	0,0,0x89000000,0,0,0,0x8a000000,0
 
 ENTRY(_ehead)
 
diff --git a/arch/s390/mm/page-states.c b/arch/s390/mm/page-states.c
index 27c50f4..a90d45e 100644
--- a/arch/s390/mm/page-states.c
+++ b/arch/s390/mm/page-states.c
@@ -12,8 +12,6 @@
 #include <linux/mm.h>
 #include <linux/gfp.h>
 #include <linux/init.h>
-#include <asm/setup.h>
-#include <asm/ipl.h>
 
 #define ESSA_SET_STABLE		1
 #define ESSA_SET_UNUSED		2
@@ -43,14 +41,6 @@ void __init cmma_init(void)
 
 	if (!cmma_flag)
 		return;
-	/*
-	 * Disable CMM for dump, otherwise  the tprot based memory
-	 * detection can fail because of unstable pages.
-	 */
-	if (OLDMEM_BASE || ipl_info.type == IPL_TYPE_FCP_DUMP) {
-		cmma_flag = 0;
-		return;
-	}
 	asm volatile(
 		"       .insn rrf,0xb9ab0000,%1,%1,0,0\n"
 		"0:     la      %0,0\n"
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 5ad38ad..bbc8b12 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -445,20 +445,10 @@ static inline int pte_same(pte_t a, pte_t b)
 	return a.pte == b.pte;
 }
 
-static inline int pteval_present(pteval_t pteval)
-{
-	/*
-	 * Yes Linus, _PAGE_PROTNONE == _PAGE_NUMA. Expressing it this
-	 * way clearly states that the intent is that protnone and numa
-	 * hinting ptes are considered present for the purposes of
-	 * pagetable operations like zapping, protection changes, gup etc.
-	 */
-	return pteval & (_PAGE_PRESENT | _PAGE_PROTNONE | _PAGE_NUMA);
-}
-
 static inline int pte_present(pte_t a)
 {
-	return pteval_present(pte_flags(a));
+	return pte_flags(a) & (_PAGE_PRESENT | _PAGE_PROTNONE |
+			       _PAGE_NUMA);
 }
 
 #define pte_accessible pte_accessible
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index fe2bdd0..6abc172 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -284,13 +284,8 @@ static __always_inline void setup_smap(struct cpuinfo_x86 *c)
 	raw_local_save_flags(eflags);
 	BUG_ON(eflags & X86_EFLAGS_AC);
 
-	if (cpu_has(c, X86_FEATURE_SMAP)) {
-#ifdef CONFIG_X86_SMAP
+	if (cpu_has(c, X86_FEATURE_SMAP))
 		set_in_cr4(X86_CR4_SMAP);
-#else
-		clear_in_cr4(X86_CR4_SMAP);
-#endif
-	}
 }
 
 /*
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index e625319..d4bdd25 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -77,7 +77,8 @@ within(unsigned long addr, unsigned long start, unsigned long end)
 	return addr >= start && addr < end;
 }
 
-static unsigned long text_ip_addr(unsigned long ip)
+static int
+do_ftrace_mod_code(unsigned long ip, const void *new_code)
 {
 	/*
 	 * On x86_64, kernel text mappings are mapped read-only with
@@ -90,7 +91,7 @@ static unsigned long text_ip_addr(unsigned long ip)
 	if (within(ip, (unsigned long)_text, (unsigned long)_etext))
 		ip = (unsigned long)__va(__pa_symbol(ip));
 
-	return ip;
+	return probe_kernel_write((void *)ip, new_code, MCOUNT_INSN_SIZE);
 }
 
 static const unsigned char *ftrace_nop_replace(void)
@@ -122,10 +123,8 @@ ftrace_modify_code_direct(unsigned long ip, unsigned const char *old_code,
 	if (memcmp(replaced, old_code, MCOUNT_INSN_SIZE) != 0)
 		return -EINVAL;
 
-	ip = text_ip_addr(ip);
-
 	/* replace the text with the new text */
-	if (probe_kernel_write((void *)ip, new_code, MCOUNT_INSN_SIZE))
+	if (do_ftrace_mod_code(ip, new_code))
 		return -EPERM;
 
 	sync_core();
@@ -222,51 +221,37 @@ int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
 	return -EINVAL;
 }
 
-static unsigned long ftrace_update_func;
-
-static int update_ftrace_func(unsigned long ip, void *new)
+int ftrace_update_ftrace_func(ftrace_func_t func)
 {
-	unsigned char old[MCOUNT_INSN_SIZE];
+	unsigned long ip = (unsigned long)(&ftrace_call);
+	unsigned char old[MCOUNT_INSN_SIZE], *new;
 	int ret;
 
-	memcpy(old, (void *)ip, MCOUNT_INSN_SIZE);
-
-	ftrace_update_func = ip;
-	/* Make sure the breakpoints see the ftrace_update_func update */
-	smp_wmb();
+	memcpy(old, &ftrace_call, MCOUNT_INSN_SIZE);
+	new = ftrace_call_replace(ip, (unsigned long)func);
 
 	/* See comment above by declaration of modifying_ftrace_code */
 	atomic_inc(&modifying_ftrace_code);
 
 	ret = ftrace_modify_code(ip, old, new);
 
-	atomic_dec(&modifying_ftrace_code);
-
-	return ret;
-}
-
-int ftrace_update_ftrace_func(ftrace_func_t func)
-{
-	unsigned long ip = (unsigned long)(&ftrace_call);
-	unsigned char *new;
-	int ret;
-
-	new = ftrace_call_replace(ip, (unsigned long)func);
-	ret = update_ftrace_func(ip, new);
-
 	/* Also update the regs callback function */
 	if (!ret) {
 		ip = (unsigned long)(&ftrace_regs_call);
+		memcpy(old, &ftrace_regs_call, MCOUNT_INSN_SIZE);
 		new = ftrace_call_replace(ip, (unsigned long)func);
-		ret = update_ftrace_func(ip, new);
+		ret = ftrace_modify_code(ip, old, new);
 	}
 
+	atomic_dec(&modifying_ftrace_code);
+
 	return ret;
 }
 
 static int is_ftrace_caller(unsigned long ip)
 {
-	if (ip == ftrace_update_func)
+	if (ip == (unsigned long)(&ftrace_call) ||
+		ip == (unsigned long)(&ftrace_regs_call))
 		return 1;
 
 	return 0;
@@ -692,41 +677,45 @@ int __init ftrace_dyn_arch_init(void *data)
 #ifdef CONFIG_DYNAMIC_FTRACE
 extern void ftrace_graph_call(void);
 
-static unsigned char *ftrace_jmp_replace(unsigned long ip, unsigned long addr)
+static int ftrace_mod_jmp(unsigned long ip,
+			  int old_offset, int new_offset)
 {
-	static union ftrace_code_union calc;
+	unsigned char code[MCOUNT_INSN_SIZE];
 
-	/* Jmp not a call (ignore the .e8) */
-	calc.e8		= 0xe9;
-	calc.offset	= ftrace_calc_offset(ip + MCOUNT_INSN_SIZE, addr);
+	if (probe_kernel_read(code, (void *)ip, MCOUNT_INSN_SIZE))
+		return -EFAULT;
 
-	/*
-	 * ftrace external locks synchronize the access to the static variable.
-	 */
-	return calc.code;
-}
+	if (code[0] != 0xe9 || old_offset != *(int *)(&code[1]))
+		return -EINVAL;
 
-static int ftrace_mod_jmp(unsigned long ip, void *func)
-{
-	unsigned char *new;
+	*(int *)(&code[1]) = new_offset;
 
-	new = ftrace_jmp_replace(ip, (unsigned long)func);
+	if (do_ftrace_mod_code(ip, &code))
+		return -EPERM;
 
-	return update_ftrace_func(ip, new);
+	return 0;
 }
 
 int ftrace_enable_ftrace_graph_caller(void)
 {
 	unsigned long ip = (unsigned long)(&ftrace_graph_call);
+	int old_offset, new_offset;
 
-	return ftrace_mod_jmp(ip, &ftrace_graph_caller);
+	old_offset = (unsigned long)(&ftrace_stub) - (ip + MCOUNT_INSN_SIZE);
+	new_offset = (unsigned long)(&ftrace_graph_caller) - (ip + MCOUNT_INSN_SIZE);
+
+	return ftrace_mod_jmp(ip, old_offset, new_offset);
 }
 
 int ftrace_disable_ftrace_graph_caller(void)
 {
 	unsigned long ip = (unsigned long)(&ftrace_graph_call);
+	int old_offset, new_offset;
+
+	old_offset = (unsigned long)(&ftrace_graph_caller) - (ip + MCOUNT_INSN_SIZE);
+	new_offset = (unsigned long)(&ftrace_stub) - (ip + MCOUNT_INSN_SIZE);
 
-	return ftrace_mod_jmp(ip, &ftrace_stub);
+	return ftrace_mod_jmp(ip, old_offset, new_offset);
 }
 
 #endif /* !CONFIG_DYNAMIC_FTRACE */
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 6dea040..9d591c8 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1001,12 +1001,6 @@ static int fault_in_kernel_space(unsigned long address)
 
 static inline bool smap_violation(int error_code, struct pt_regs *regs)
 {
-	if (!IS_ENABLED(CONFIG_X86_SMAP))
-		return false;
-
-	if (!static_cpu_has(X86_FEATURE_SMAP))
-		return false;
-
 	if (error_code & PF_USER)
 		return false;
 
@@ -1093,9 +1087,11 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code)
 	if (unlikely(error_code & PF_RSVD))
 		pgtable_bad(regs, error_code, address);
 
-	if (unlikely(smap_violation(error_code, regs))) {
-		bad_area_nosemaphore(regs, error_code, address);
-		return;
+	if (static_cpu_has(X86_FEATURE_SMAP)) {
+		if (unlikely(smap_violation(error_code, regs))) {
+			bad_area_nosemaphore(regs, error_code, address);
+			return;
+		}
 	}
 
 	/*
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index bb32480..17fe8e1 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1416,6 +1416,8 @@ void kernel_map_pages(struct page *page, int numpages, int enable)
 	arch_flush_lazy_mmu_mode();
 }
 
+EXPORT_SYMBOL_GPL(kernel_map_pages);
+
 #ifdef CONFIG_HIBERNATION
 
 bool kernel_page_present(struct page *page)
@@ -1429,7 +1431,7 @@ bool kernel_page_present(struct page *page)
 	pte = lookup_address((unsigned long)page_address(page), &level);
 	return (pte_val(*pte) & _PAGE_PRESENT);
 }
-
+EXPORT_SYMBOL_GPL(kernel_page_present);
 #endif /* CONFIG_HIBERNATION */
 
 #endif /* CONFIG_DEBUG_PAGEALLOC */
diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
index 424f4c9..41f9004 100644
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -122,9 +122,7 @@ void save_processor_state(void)
 	__save_processor_state(&saved_context);
 	x86_platform.save_sched_clock_state();
 }
-#ifdef CONFIG_X86_32
 EXPORT_SYMBOL(save_processor_state);
-#endif
 
 static void do_fpu_end(void)
 {
diff --git a/arch/x86/power/hibernate_32.c b/arch/x86/power/hibernate_32.c
index 7d28c88..4f1dd95 100644
--- a/arch/x86/power/hibernate_32.c
+++ b/arch/x86/power/hibernate_32.c
@@ -9,6 +9,7 @@
 #include <linux/gfp.h>
 #include <linux/suspend.h>
 #include <linux/bootmem.h>
+#include <linux/export.h>
 
 #include <asm/page.h>
 #include <asm/pgtable.h>
@@ -161,6 +162,7 @@ int swsusp_arch_resume(void)
 	restore_image();
 	return 0;
 }
+EXPORT_SYMBOL_GPL(swsusp_arch_resume);
 
 /*
  *	pfn_is_nosave - check if given pfn is in the 'nosave' section
diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c
index 304fca2..dd7339a 100644
--- a/arch/x86/power/hibernate_64.c
+++ b/arch/x86/power/hibernate_64.c
@@ -11,8 +11,7 @@
 #include <linux/gfp.h>
 #include <linux/smp.h>
 #include <linux/suspend.h>
-
-#include <asm/init.h>
+#include <linux/export.h>
 #include <asm/proto.h>
 #include <asm/page.h>
 #include <asm/pgtable.h>
@@ -41,21 +40,41 @@ pgd_t *temp_level4_pgt __visible;
 
 void *relocated_restore_code __visible;
 
-static void *alloc_pgt_page(void *context)
+static int res_phys_pud_init(pud_t *pud, unsigned long address, unsigned long end)
 {
-	return (void *)get_safe_page(GFP_ATOMIC);
+	long i, j;
+
+	i = pud_index(address);
+	pud = pud + i;
+	for (; i < PTRS_PER_PUD; pud++, i++) {
+		unsigned long paddr;
+		pmd_t *pmd;
+
+		paddr = address + i*PUD_SIZE;
+		if (paddr >= end)
+			break;
+
+		pmd = (pmd_t *)get_safe_page(GFP_ATOMIC);
+		if (!pmd)
+			return -ENOMEM;
+		set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE));
+		for (j = 0; j < PTRS_PER_PMD; pmd++, j++, paddr += PMD_SIZE) {
+			unsigned long pe;
+
+			if (paddr >= end)
+				break;
+			pe = __PAGE_KERNEL_LARGE_EXEC | paddr;
+			pe &= __supported_pte_mask;
+			set_pmd(pmd, __pmd(pe));
+		}
+	}
+	return 0;
 }
 
 static int set_up_temporary_mappings(void)
 {
-	struct x86_mapping_info info = {
-		.alloc_pgt_page	= alloc_pgt_page,
-		.pmd_flag	= __PAGE_KERNEL_LARGE_EXEC,
-		.kernel_mapping = true,
-	};
-	unsigned long mstart, mend;
-	int result;
-	int i;
+	unsigned long start, end, next;
+	int error;
 
 	temp_level4_pgt = (pgd_t *)get_safe_page(GFP_ATOMIC);
 	if (!temp_level4_pgt)
@@ -66,17 +85,21 @@ static int set_up_temporary_mappings(void)
 		init_level4_pgt[pgd_index(__START_KERNEL_map)]);
 
 	/* Set up the direct mapping from scratch */
-	for (i = 0; i < nr_pfn_mapped; i++) {
-		mstart = pfn_mapped[i].start << PAGE_SHIFT;
-		mend   = pfn_mapped[i].end << PAGE_SHIFT;
-
-		result = kernel_ident_mapping_init(&info, temp_level4_pgt,
-						   mstart, mend);
-
-		if (result)
-			return result;
+	start = (unsigned long)pfn_to_kaddr(0);
+	end = (unsigned long)pfn_to_kaddr(max_pfn);
+
+	for (; start < end; start = next) {
+		pud_t *pud = (pud_t *)get_safe_page(GFP_ATOMIC);
+		if (!pud)
+			return -ENOMEM;
+		next = start + PGDIR_SIZE;
+		if (next > end)
+			next = end;
+		if ((error = res_phys_pud_init(pud, __pa(start), __pa(next))))
+			return error;
+		set_pgd(temp_level4_pgt + pgd_index(start),
+			mk_kernel_pgd(__pa(pud)));
 	}
-
 	return 0;
 }
 
@@ -97,6 +120,7 @@ int swsusp_arch_resume(void)
 	restore_image();
 	return 0;
 }
+EXPORT_SYMBOL_GPL(swsusp_arch_resume);
 
 /*
  *	pfn_is_nosave - check if given pfn is in the 'nosave' section
@@ -147,3 +171,4 @@ int arch_hibernation_header_restore(void *addr)
 	restore_cr3 = rdr->cr3;
 	return (rdr->magic == RESTORE_MAGIC) ? 0 : -EINVAL;
 }
+EXPORT_SYMBOL_GPL(arch_hibernation_header_restore);
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 3c76c3d..ce563be 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -365,7 +365,7 @@ void xen_ptep_modify_prot_commit(struct mm_struct *mm, unsigned long addr,
 /* Assume pteval_t is equivalent to all the other *val_t types. */
 static pteval_t pte_mfn_to_pfn(pteval_t val)
 {
-	if (pteval_present(val)) {
+	if (val & _PAGE_PRESENT) {
 		unsigned long mfn = (val & PTE_PFN_MASK) >> PAGE_SHIFT;
 		unsigned long pfn = mfn_to_pfn(mfn);
 
@@ -381,7 +381,7 @@ static pteval_t pte_mfn_to_pfn(pteval_t val)
 
 static pteval_t pte_pfn_to_mfn(pteval_t val)
 {
-	if (pteval_present(val)) {
+	if (val & _PAGE_PRESENT) {
 		unsigned long pfn = (val & PTE_PFN_MASK) >> PAGE_SHIFT;
 		pteval_t flags = val & PTE_FLAGS_MASK;
 		unsigned long mfn;
diff --git a/block/Makefile b/block/Makefile
index 20645e8..f2c091d 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -7,7 +7,7 @@ obj-$(CONFIG_BLOCK) := elevator.o blk-core.o blk-tag.o blk-sysfs.o \
 			blk-exec.o blk-merge.o blk-softirq.o blk-timeout.o \
 			blk-iopoll.o blk-lib.o blk-mq.o blk-mq-tag.o \
 			blk-mq-sysfs.o blk-mq-cpu.o blk-mq-cpumap.o ioctl.o \
-			genhd.o scsi_ioctl.o partition-generic.o partitions/
+			uuid.o genhd.o scsi_ioctl.o partition-generic.o partitions/
 
 obj-$(CONFIG_BLK_DEV_BSG)	+= bsg.o
 obj-$(CONFIG_BLK_DEV_BSGLIB)	+= bsg-lib.o
diff --git a/block/blk-core.c b/block/blk-core.c
index 8bdd012..24aef0a 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -46,6 +46,9 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(block_unplug);
 
 DEFINE_IDA(blk_queue_ida);
 
+int trap_non_toi_io;
+EXPORT_SYMBOL_GPL(trap_non_toi_io);
+
 /*
  * For the allocated request tables
  */
@@ -1850,6 +1853,9 @@ void submit_bio(int rw, struct bio *bio)
 {
 	bio->bi_rw |= rw;
 
+	if (unlikely(trap_non_toi_io))
+		BUG_ON(!(bio->bi_flags & BIO_TOI));
+
 	/*
 	 * If it's a regular read/write or a barrier with data attached,
 	 * go through the normal accounting stuff before submission.
diff --git a/block/blk-lib.c b/block/blk-lib.c
index 4cb9ee8..9b5b561 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -119,14 +119,6 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 
 		atomic_inc(&bb.done);
 		submit_bio(type, bio);
-
-		/*
-		 * We can loop for a long time in here, if someone does
-		 * full device discards (like mkfs). Be nice and allow
-		 * us to schedule out to avoid softlocking if preempt
-		 * is disabled.
-		 */
-		cond_resched();
 	}
 	blk_finish_plug(&plug);
 
diff --git a/block/blk.h b/block/blk.h
index d23b415..c90e1d8 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -113,7 +113,7 @@ static inline struct request *__elv_next_request(struct request_queue *q)
 			q->flush_queue_delayed = 1;
 			return NULL;
 		}
-		if (unlikely(blk_queue_bypass(q)) ||
+		if (unlikely(blk_queue_dying(q)) ||
 		    !q->elevator->type->ops.elevator_dispatch_fn(q, 0))
 			return NULL;
 	}
diff --git a/block/genhd.c b/block/genhd.c
index 791f419..97985a4 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -17,6 +17,8 @@
 #include <linux/kobj_map.h>
 #include <linux/mutex.h>
 #include <linux/idr.h>
+#include <linux/ctype.h>
+#include <linux/fs_uuid.h>
 #include <linux/log2.h>
 #include <linux/pm_runtime.h>
 
@@ -1375,6 +1377,87 @@ int invalidate_partition(struct gendisk *disk, int partno)
 
 EXPORT_SYMBOL(invalidate_partition);
 
+dev_t blk_lookup_fs_info(struct fs_info *seek)
+{
+	dev_t devt = MKDEV(0, 0);
+	struct class_dev_iter iter;
+	struct device *dev;
+	int best_score = 0;
+
+	class_dev_iter_init(&iter, &block_class, NULL, &disk_type);
+	while (best_score < 3 && (dev = class_dev_iter_next(&iter))) {
+		struct gendisk *disk = dev_to_disk(dev);
+		struct disk_part_iter piter;
+		struct hd_struct *part;
+
+		disk_part_iter_init(&piter, disk, DISK_PITER_INCL_PART0);
+
+		while (best_score < 3 && (part = disk_part_iter_next(&piter))) {
+			int score = part_matches_fs_info(part, seek);
+			if (score > best_score) {
+				devt = part_devt(part);
+				best_score = score;
+			}
+		}
+		disk_part_iter_exit(&piter);
+	}
+	class_dev_iter_exit(&iter);
+	return devt;
+}
+EXPORT_SYMBOL_GPL(blk_lookup_fs_info);
+
+/* Caller passes last == NULL and a key to start. For each match found, we
+ * return a bdev on which we have done blkdev_get, and we do the blkdev_put on
+ * block devices that are passed to us. When no more matches, we return NULL.
+ */
+struct block_device *next_bdev_of_type(struct block_device *last,
+	const char *key)
+{
+	dev_t devt = MKDEV(0, 0);
+	struct class_dev_iter iter;
+	struct device *dev;
+	struct block_device *next = NULL, *bdev;
+	int got_last = 0;
+
+	if (!key)
+		goto out;
+
+	class_dev_iter_init(&iter, &block_class, NULL, &disk_type);
+	while (!devt && (dev = class_dev_iter_next(&iter))) {
+		struct gendisk *disk = dev_to_disk(dev);
+		struct disk_part_iter piter;
+		struct hd_struct *part;
+
+		disk_part_iter_init(&piter, disk, DISK_PITER_INCL_PART0);
+
+		while ((part = disk_part_iter_next(&piter))) {
+			bdev = bdget(part_devt(part));
+			if (last && !got_last) {
+				if (last == bdev)
+					got_last = 1;
+				continue;
+			}
+
+			if (blkdev_get(bdev, FMODE_READ, 0))
+				continue;
+
+			if (bdev_matches_key(bdev, key)) {
+				next = bdev;
+				break;
+			}
+
+			blkdev_put(bdev, FMODE_READ);
+		}
+		disk_part_iter_exit(&piter);
+	}
+	class_dev_iter_exit(&iter);
+out:
+	if (last)
+		blkdev_put(last, FMODE_READ);
+	return next;
+}
+EXPORT_SYMBOL_GPL(next_bdev_of_type);
+
 /*
  * Disk events - monitor disk events like media change and eject request.
  */
diff --git a/block/uuid.c b/block/uuid.c
new file mode 100644
index 0000000..72c5029
--- /dev/null
+++ b/block/uuid.c
@@ -0,0 +1,511 @@
+#include <linux/blkdev.h>
+#include <linux/ctype.h>
+#include <linux/fs_uuid.h>
+#include <linux/slab.h>
+#include <linux/export.h>
+
+static int debug_enabled;
+
+#define PRINTK(fmt, args...) do {					\
+	if (debug_enabled)						\
+		printk(KERN_DEBUG fmt, ## args);			\
+	} while(0)
+
+#define PRINT_HEX_DUMP(v1, v2, v3, v4, v5, v6, v7, v8)			\
+	do {								\
+		if (debug_enabled)					\
+			print_hex_dump(v1, v2, v3, v4, v5, v6, v7, v8);	\
+	} while(0)
+
+/*
+ * Simple UUID translation
+ */
+
+struct uuid_info {
+	const char *key;
+	const char *name;
+	long bkoff;
+	unsigned sboff;
+	unsigned sig_len;
+	const char *magic;
+	int uuid_offset;
+	int last_mount_offset;
+	int last_mount_size;
+};
+
+/*
+ * Based on libuuid's blkid_magic array. Note that I don't
+ * have uuid offsets for all of these yet - missing ones are 0x0.
+ * Further information welcome.
+ *
+ * Rearranged by page of fs signature for optimisation.
+ */
+static struct uuid_info uuid_list[] = {
+ { NULL, "oracleasm", 0, 32, 8, "ORCLDISK", 0x0, 0, 0 },
+ { "ntfs", "ntfs", 0, 3, 8, "NTFS    ", 0x0, 0, 0 },
+ { "vfat", "vfat", 0, 0x52, 5, "MSWIN", 0x0, 0, 0 },
+ { "vfat", "vfat", 0, 0x52, 8, "FAT32   ", 0x0, 0, 0 },
+ { "vfat", "vfat", 0, 0x36, 5, "MSDOS", 0x0, 0, 0 },
+ { "vfat", "vfat", 0, 0x36, 8, "FAT16   ", 0x0, 0, 0 },
+ { "vfat", "vfat", 0, 0x36, 8, "FAT12   ", 0x0, 0, 0 },
+ { "vfat", "vfat", 0, 0, 1, "\353", 0x0, 0, 0 },
+ { "vfat", "vfat", 0, 0, 1, "\351", 0x0, 0, 0 },
+ { "vfat", "vfat", 0, 0x1fe, 2, "\125\252", 0x0, 0, 0 },
+ { "xfs", "xfs", 0, 0, 4, "XFSB", 0x20, 0, 0 },
+ { "romfs", "romfs", 0, 0, 8, "-rom1fs-", 0x0, 0, 0 },
+ { "bfs", "bfs", 0, 0, 4, "\316\372\173\033", 0, 0, 0 },
+ { "cramfs", "cramfs", 0, 0, 4, "E=\315\050", 0x0, 0, 0 },
+ { "qnx4", "qnx4", 0, 4, 6, "QNX4FS", 0, 0, 0 },
+ { NULL, "crypt_LUKS", 0, 0, 6, "LUKS\xba\xbe", 0x0, 0, 0 },
+ { "squashfs", "squashfs", 0, 0, 4, "sqsh", 0, 0, 0 },
+ { "squashfs", "squashfs", 0, 0, 4, "hsqs", 0, 0, 0 },
+ { "ocfs", "ocfs", 0, 8, 9, "OracleCFS", 0x0, 0, 0 },
+ { "lvm2pv", "lvm2pv", 0, 0x018, 8, "LVM2 001", 0x0, 0, 0 },
+ { "sysv", "sysv", 0, 0x3f8, 4, "\020~\030\375", 0, 0, 0 },
+ { "ext", "ext", 1, 0x38, 2, "\123\357", 0x468, 0x42c, 4 },
+ { "minix", "minix", 1, 0x10, 2, "\177\023", 0, 0, 0 },
+ { "minix", "minix", 1, 0x10, 2, "\217\023", 0, 0, 0 },
+ { "minix", "minix", 1, 0x10, 2, "\150\044", 0, 0, 0 },
+ { "minix", "minix", 1, 0x10, 2, "\170\044", 0, 0, 0 },
+ { "lvm2pv", "lvm2pv", 1, 0x018, 8, "LVM2 001", 0x0, 0, 0 },
+ { "vxfs", "vxfs", 1, 0, 4, "\365\374\001\245", 0, 0, 0 },
+ { "hfsplus", "hfsplus", 1, 0, 2, "BD", 0x0, 0, 0 },
+ { "hfsplus", "hfsplus", 1, 0, 2, "H+", 0x0, 0, 0 },
+ { "hfsplus", "hfsplus", 1, 0, 2, "HX", 0x0, 0, 0 },
+ { "hfs", "hfs", 1, 0, 2, "BD", 0x0, 0, 0 },
+ { "ocfs2", "ocfs2", 1, 0, 6, "OCFSV2", 0x0, 0, 0 },
+ { "lvm2pv", "lvm2pv", 0, 0x218, 8, "LVM2 001", 0x0, 0, 0 },
+ { "lvm2pv", "lvm2pv", 1, 0x218, 8, "LVM2 001", 0x0, 0, 0 },
+ { "ocfs2", "ocfs2", 2, 0, 6, "OCFSV2", 0x0, 0, 0 },
+ { "swap", "swap", 0, 0xff6, 10, "SWAP-SPACE", 0x40c, 0, 0 },
+ { "swap", "swap", 0, 0xff6, 10, "SWAPSPACE2", 0x40c, 0, 0 },
+ { "swap", "swsuspend", 0, 0xff6, 9, "S1SUSPEND", 0x40c, 0, 0 },
+ { "swap", "swsuspend", 0, 0xff6, 9, "S2SUSPEND", 0x40c, 0, 0 },
+ { "swap", "swsuspend", 0, 0xff6, 9, "ULSUSPEND", 0x40c, 0, 0 },
+ { "ocfs2", "ocfs2", 4, 0, 6, "OCFSV2", 0x0, 0, 0 },
+ { "ocfs2", "ocfs2", 8, 0, 6, "OCFSV2", 0x0, 0, 0 },
+ { "hpfs", "hpfs", 8, 0, 4, "I\350\225\371", 0, 0, 0 },
+ { "reiserfs", "reiserfs", 8, 0x34, 8, "ReIsErFs", 0x10054, 0, 0 },
+ { "reiserfs", "reiserfs", 8, 20, 8, "ReIsErFs", 0x10054, 0, 0 },
+ { "zfs", "zfs", 8, 0, 8, "\0\0\x02\xf5\xb0\x07\xb1\x0c", 0x0, 0, 0 },
+ { "zfs", "zfs", 8, 0, 8, "\x0c\xb1\x07\xb0\xf5\x02\0\0", 0x0, 0, 0 },
+ { "ufs", "ufs", 8, 0x55c, 4, "T\031\001\000", 0, 0, 0 },
+ { "swap", "swap", 0, 0x1ff6, 10, "SWAP-SPACE", 0x40c, 0, 0 },
+ { "swap", "swap", 0, 0x1ff6, 10, "SWAPSPACE2", 0x40c, 0, 0 },
+ { "swap", "swsuspend", 0, 0x1ff6, 9, "S1SUSPEND", 0x40c, 0, 0 },
+ { "swap", "swsuspend", 0, 0x1ff6, 9, "S2SUSPEND", 0x40c, 0, 0 },
+ { "swap", "swsuspend", 0, 0x1ff6, 9, "ULSUSPEND", 0x40c, 0, 0 },
+ { "reiserfs", "reiserfs", 64, 0x34, 9, "ReIsEr2Fs", 0x10054, 0, 0 },
+ { "reiserfs", "reiserfs", 64, 0x34, 9, "ReIsEr3Fs", 0x10054, 0, 0 },
+ { "reiserfs", "reiserfs", 64, 0x34, 8, "ReIsErFs", 0x10054, 0, 0 },
+ { "reiser4", "reiser4", 64, 0, 7, "ReIsEr4", 0x100544, 0, 0 },
+ { "gfs2", "gfs2", 64, 0, 4, "\x01\x16\x19\x70", 0x0, 0, 0 },
+ { "gfs", "gfs", 64, 0, 4, "\x01\x16\x19\x70", 0x0, 0, 0 },
+ { "btrfs", "btrfs", 64, 0x40, 8, "_BHRfS_M", 0x0, 0, 0 },
+ { "swap", "swap", 0, 0x3ff6, 10, "SWAP-SPACE", 0x40c, 0, 0 },
+ { "swap", "swap", 0, 0x3ff6, 10, "SWAPSPACE2", 0x40c, 0, 0 },
+ { "swap", "swsuspend", 0, 0x3ff6, 9, "S1SUSPEND", 0x40c, 0, 0 },
+ { "swap", "swsuspend", 0, 0x3ff6, 9, "S2SUSPEND", 0x40c, 0, 0 },
+ { "swap", "swsuspend", 0, 0x3ff6, 9, "ULSUSPEND", 0x40c, 0, 0 },
+ { "udf", "udf", 32, 1, 5, "BEA01", 0x0, 0, 0 },
+ { "udf", "udf", 32, 1, 5, "BOOT2", 0x0, 0, 0 },
+ { "udf", "udf", 32, 1, 5, "CD001", 0x0, 0, 0 },
+ { "udf", "udf", 32, 1, 5, "CDW02", 0x0, 0, 0 },
+ { "udf", "udf", 32, 1, 5, "NSR02", 0x0, 0, 0 },
+ { "udf", "udf", 32, 1, 5, "NSR03", 0x0, 0, 0 },
+ { "udf", "udf", 32, 1, 5, "TEA01", 0x0, 0, 0 },
+ { "iso9660", "iso9660", 32, 1, 5, "CD001", 0x0, 0, 0 },
+ { "iso9660", "iso9660", 32, 9, 5, "CDROM", 0x0, 0, 0 },
+ { "jfs", "jfs", 32, 0, 4, "JFS1", 0x88, 0, 0 },
+ { "swap", "swap", 0, 0x7ff6, 10, "SWAP-SPACE", 0x40c, 0, 0 },
+ { "swap", "swap", 0, 0x7ff6, 10, "SWAPSPACE2", 0x40c, 0, 0 },
+ { "swap", "swsuspend", 0, 0x7ff6, 9, "S1SUSPEND", 0x40c, 0, 0 },
+ { "swap", "swsuspend", 0, 0x7ff6, 9, "S2SUSPEND", 0x40c, 0, 0 },
+ { "swap", "swsuspend", 0, 0x7ff6, 9, "ULSUSPEND", 0x40c, 0, 0 },
+ { "swap", "swap", 0, 0xfff6, 10, "SWAP-SPACE", 0x40c, 0, 0 },
+ { "swap", "swap", 0, 0xfff6, 10, "SWAPSPACE2", 0x40c, 0, 0 },
+ { "swap", "swsuspend", 0, 0xfff6, 9, "S1SUSPEND", 0x40c, 0, 0 },
+ { "swap", "swsuspend", 0, 0xfff6, 9, "S2SUSPEND", 0x40c, 0, 0 },
+ { "swap", "swsuspend", 0, 0xfff6, 9, "ULSUSPEND", 0x40c, 0, 0 },
+ { "zfs", "zfs", 264, 0, 8, "\0\0\x02\xf5\xb0\x07\xb1\x0c", 0x0, 0, 0 },
+ { "zfs", "zfs", 264, 0, 8, "\x0c\xb1\x07\xb0\xf5\x02\0\0", 0x0, 0, 0 },
+ { NULL, NULL, 0, 0, 0, NULL, 0x0, 0, 0 }
+};
+
+static int null_uuid(const char *uuid)
+{
+	int i;
+
+	for (i = 0; i < 16 && !uuid[i]; i++);
+
+	return (i == 16);
+}
+
+
+static void uuid_end_bio(struct bio *bio, int err)
+{
+	struct page *page = bio->bi_io_vec[0].bv_page;
+
+	if(!test_bit(BIO_UPTODATE, &bio->bi_flags))
+		SetPageError(page);
+
+	unlock_page(page);
+	bio_put(bio);
+}
+
+
+/**
+ * read_bdev_page - read one page from a block device
+ * @dev: The block device we're using.
+ * @page_num: The page we're reading.
+ *
+ * Based on Patrick Mochell's pmdisk code from long ago: "Straight from the
+ * textbook - allocate and initialize the bio. If we're writing, make sure
+ * the page is marked as dirty. Then submit it and carry on."
+ **/
+static struct page *read_bdev_page(struct block_device *dev, int page_num)
+{
+	struct bio *bio = NULL;
+	struct page *page = alloc_page(GFP_NOFS | __GFP_HIGHMEM);
+
+	if (!page) {
+		printk(KERN_ERR "Failed to allocate a page for reading data "
+				"in UUID checks.\n");
+		return NULL;
+	}
+
+	bio = bio_alloc(GFP_NOFS, 1);
+	bio->bi_bdev = dev;
+	bio->bi_sector = page_num << 3;
+	bio->bi_end_io = uuid_end_bio;
+	bio->bi_flags |= (1 << BIO_TOI);
+
+	PRINTK("Submitting bio on device %lx, page %d using bio %p and page %p.\n",
+			(unsigned long) dev->bd_dev, page_num, bio, page);
+
+	if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) {
+		printk(KERN_DEBUG "ERROR: adding page to bio at %d\n",
+				page_num);
+		bio_put(bio);
+		__free_page(page);
+		printk(KERN_DEBUG "read_bdev_page freed page %p (in error "
+				"path).\n", page);
+		return NULL;
+	}
+
+	lock_page(page);
+	submit_bio(READ | REQ_SYNC, bio);
+
+	wait_on_page_locked(page);
+	if (PageError(page)) {
+		__free_page(page);
+		page = NULL;
+	}
+	return page;
+}
+
+int bdev_matches_key(struct block_device *bdev, const char *key)
+{
+	unsigned char *data = NULL;
+	struct page *data_page = NULL;
+
+	int dev_offset, pg_num, pg_off, i;
+	int last_pg_num = -1;
+	int result = 0;
+	char buf[50];
+
+	if (null_uuid(key)) {
+		PRINTK("Refusing to find a NULL key.\n");
+		return 0;
+	}
+
+	if (!bdev->bd_disk) {
+		bdevname(bdev, buf);
+		PRINTK("bdev %s has no bd_disk.\n", buf);
+		return 0;
+	}
+
+	if (!bdev->bd_disk->queue) {
+		bdevname(bdev, buf);
+		PRINTK("bdev %s has no queue.\n", buf);
+		return 0;
+	}
+
+	for (i = 0; uuid_list[i].name; i++) {
+		struct uuid_info *dat = &uuid_list[i];
+
+		if (!dat->key || strcmp(dat->key, key))
+			continue;
+
+		dev_offset = (dat->bkoff << 10) + dat->sboff;
+		pg_num = dev_offset >> 12;
+		pg_off = dev_offset & 0xfff;
+
+		if ((((pg_num + 1) << 3) - 1) > bdev->bd_part->nr_sects >> 1)
+			continue;
+
+		if (pg_num != last_pg_num) {
+			if (data_page) {
+				kunmap(data_page);
+				__free_page(data_page);
+			}
+			data_page = read_bdev_page(bdev, pg_num);
+			if (!data_page)
+				continue;
+			data = kmap(data_page);
+		}
+
+		last_pg_num = pg_num;
+
+		if (strncmp(&data[pg_off], dat->magic, dat->sig_len))
+			continue;
+
+		result = 1;
+		break;
+	}
+
+	if (data_page) {
+		kunmap(data_page);
+		__free_page(data_page);
+	}
+
+	return result;
+}
+
+/*
+ * part_matches_fs_info - Does the given partition match the details given?
+ *
+ * Returns a score saying how good the match is.
+ * 0 = no UUID match.
+ * 1 = UUID but last mount time differs.
+ * 2 = UUID, last mount time but not dev_t
+ * 3 = perfect match
+ *
+ * This lets us cope elegantly with probing resulting in dev_ts changing
+ * from boot to boot, and with the case where a user copies a partition
+ * (UUID is non unique), and we need to check the last mount time of the
+ * correct partition.
+ */
+int part_matches_fs_info(struct hd_struct *part, struct fs_info *seek)
+{
+	struct block_device *bdev;
+	struct fs_info *got;
+	int result = 0;
+	char buf[50];
+
+	if (null_uuid((char *) &seek->uuid)) {
+		PRINTK("Refusing to find a NULL uuid.\n");
+		return 0;
+	}
+
+	bdev = bdget(part_devt(part));
+
+	PRINTK("part_matches fs info considering %x.\n", part_devt(part));
+
+	if (blkdev_get(bdev, FMODE_READ, 0)) {
+		PRINTK("blkdev_get failed.\n");
+		return 0;
+	}
+
+	if (!bdev->bd_disk) {
+		bdevname(bdev, buf);
+		PRINTK("bdev %s has no bd_disk.\n", buf);
+		goto out;
+	}
+
+	if (!bdev->bd_disk->queue) {
+		bdevname(bdev, buf);
+		PRINTK("bdev %s has no queue.\n", buf);
+		goto out;
+	}
+
+	got = fs_info_from_block_dev(bdev);
+
+	if (got && !memcmp(got->uuid, seek->uuid, 16)) {
+		PRINTK(" Have matching UUID.\n");
+		PRINTK(" Got: LMS %d, LM %p.\n", got->last_mount_size, got->last_mount);
+		PRINTK(" Seek: LMS %d, LM %p.\n", seek->last_mount_size, seek->last_mount);
+		result = 1;
+
+		if (got->last_mount_size == seek->last_mount_size &&
+		    got->last_mount && seek->last_mount &&
+		    !memcmp(got->last_mount, seek->last_mount,
+			    got->last_mount_size)) {
+			result = 2;
+
+			PRINTK(" Matching last mount time.\n");
+
+			if (part_devt(part) == seek->dev_t) {
+				result = 3;
+				PRINTK(" Matching dev_t.\n");
+			} else
+				PRINTK("Dev_ts differ (%x vs %x).\n", part_devt(part), seek->dev_t);
+		}
+	}
+
+	PRINTK(" Score for %x is %d.\n", part_devt(part), result);
+	free_fs_info(got);
+out:
+	blkdev_put(bdev, FMODE_READ);
+	return result;
+}
+
+void free_fs_info(struct fs_info *fs_info)
+{
+	if (!fs_info || IS_ERR(fs_info))
+		return;
+
+	if (fs_info->last_mount)
+		kfree(fs_info->last_mount);
+
+	kfree(fs_info);
+}
+EXPORT_SYMBOL_GPL(free_fs_info);
+
+struct fs_info *fs_info_from_block_dev(struct block_device *bdev)
+{
+	unsigned char *data = NULL;
+	struct page *data_page = NULL;
+
+	int dev_offset, pg_num, pg_off;
+	int uuid_pg_num, uuid_pg_off, i;
+	unsigned char *uuid_data = NULL;
+	struct page *uuid_data_page = NULL;
+
+	int last_pg_num = -1, last_uuid_pg_num = 0;
+	char buf[50];
+	struct fs_info *fs_info = NULL;
+
+	bdevname(bdev, buf);
+
+	PRINTK("fs_info_from_block_dev looking for partition type of %s.\n", buf);
+
+	for (i = 0; uuid_list[i].name; i++) {
+		struct uuid_info *dat = &uuid_list[i];
+		dev_offset = (dat->bkoff << 10) + dat->sboff;
+		pg_num = dev_offset >> 12;
+		pg_off = dev_offset & 0xfff;
+		uuid_pg_num = dat->uuid_offset >> 12;
+		uuid_pg_off = dat->uuid_offset & 0xfff;
+
+		if ((((pg_num + 1) << 3) - 1) > bdev->bd_part->nr_sects >> 1)
+			continue;
+
+		/* Ignore partition types with no UUID offset */
+		if (!dat->uuid_offset)
+			continue;
+
+		if (pg_num != last_pg_num) {
+			if (data_page) {
+				kunmap(data_page);
+				__free_page(data_page);
+			}
+			data_page = read_bdev_page(bdev, pg_num);
+			if (!data_page)
+				continue;
+			data = kmap(data_page);
+		}
+
+		last_pg_num = pg_num;
+
+		if (strncmp(&data[pg_off], dat->magic, dat->sig_len))
+			continue;
+
+		PRINTK("This partition looks like %s.\n", dat->name);
+
+		fs_info = kzalloc(sizeof(struct fs_info), GFP_KERNEL);
+
+		if (!fs_info) {
+			PRINTK("Failed to allocate fs_info struct.\n");
+			fs_info = ERR_PTR(-ENOMEM);
+			break;
+		}
+
+		/* UUID can't be off the end of the disk */
+		if ((uuid_pg_num > bdev->bd_part->nr_sects >> 3) ||
+				!dat->uuid_offset)
+			goto no_uuid;
+
+		if (!uuid_data || uuid_pg_num != last_uuid_pg_num) {
+			/* No need to reread the page from above */
+			if (uuid_pg_num == pg_num && uuid_data)
+				memcpy(uuid_data, data, PAGE_SIZE);
+			else {
+				if (uuid_data_page) {
+					kunmap(uuid_data_page);
+					__free_page(uuid_data_page);
+				}
+				uuid_data_page = read_bdev_page(bdev, uuid_pg_num);
+				if (!uuid_data_page)
+					continue;
+				uuid_data = kmap(uuid_data_page);
+			}
+		}
+
+		last_uuid_pg_num = uuid_pg_num;
+		memcpy(&fs_info->uuid, &uuid_data[uuid_pg_off], 16);
+		fs_info->dev_t = bdev->bd_dev;
+
+no_uuid:
+		PRINT_HEX_DUMP(KERN_EMERG, "fs_info_from_block_dev "
+				"returning uuid ", DUMP_PREFIX_NONE, 16, 1,
+				fs_info->uuid, 16, 0);
+
+		if (dat->last_mount_size) {
+			int pg = dat->last_mount_offset >> 12, sz;
+			int off = dat->last_mount_offset & 0xfff;
+			struct page *last_mount = read_bdev_page(bdev, pg);
+			unsigned char *last_mount_data;
+			char *ptr;
+
+			if (!last_mount) {
+				fs_info = ERR_PTR(-ENOMEM);
+				break;
+			}
+			last_mount_data = kmap(last_mount);
+			sz = dat->last_mount_size;
+			ptr = kmalloc(sz, GFP_KERNEL);
+
+			if (!ptr) {
+				printk(KERN_EMERG "fs_info_from_block_dev "
+					"failed to get memory for last mount "
+					"timestamp.\n");
+				free_fs_info(fs_info);
+				fs_info = ERR_PTR(-ENOMEM);
+			} else {
+				fs_info->last_mount = ptr;
+				fs_info->last_mount_size = sz;
+				memcpy(ptr, &last_mount_data[off], sz);
+			}
+
+			kunmap(last_mount);
+			__free_page(last_mount);
+		}
+		break;
+	}
+
+	if (data_page) {
+		kunmap(data_page);
+		__free_page(data_page);
+	}
+
+	if (uuid_data_page) {
+		kunmap(uuid_data_page);
+		__free_page(uuid_data_page);
+	}
+
+	return fs_info;
+}
+EXPORT_SYMBOL_GPL(fs_info_from_block_dev);
+
+static int __init uuid_debug_setup(char *str)
+{
+	int value;
+
+	if (sscanf(str, "=%d", &value))
+		debug_enabled = value;
+
+	return 1;
+}
+
+__setup("uuid_debug", uuid_debug_setup);
diff --git a/drivers/acpi/acpi_pad.c b/drivers/acpi/acpi_pad.c
index fc6008f..89fdc92 100644
--- a/drivers/acpi/acpi_pad.c
+++ b/drivers/acpi/acpi_pad.c
@@ -154,6 +154,7 @@ static int power_saving_thread(void *data)
 	u64 last_jiffies = 0;
 
 	sched_setscheduler(current, SCHED_RR, &param);
+	set_freezable();
 
 	while (!kthread_should_stop()) {
 		int cpu;
diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 1b41fca..591a970 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -790,6 +790,7 @@ void dpm_resume(pm_message_t state)
 	async_synchronize_full();
 	dpm_show_time(starttime, state, NULL);
 }
+EXPORT_SYMBOL_GPL(dpm_resume);
 
 /**
  * device_complete - Complete a PM transition for given device.
@@ -866,6 +867,7 @@ void dpm_complete(pm_message_t state)
 	list_splice(&list, &dpm_list);
 	mutex_unlock(&dpm_list_mtx);
 }
+EXPORT_SYMBOL_GPL(dpm_complete);
 
 /**
  * dpm_resume_end - Execute "resume" callbacks and complete system transition.
@@ -1294,6 +1296,7 @@ int dpm_suspend(pm_message_t state)
 		dpm_show_time(starttime, state, NULL);
 	return error;
 }
+EXPORT_SYMBOL_GPL(dpm_suspend);
 
 /**
  * device_prepare - Prepare a device for system power transition.
@@ -1398,6 +1401,7 @@ int dpm_prepare(pm_message_t state)
 	mutex_unlock(&dpm_list_mtx);
 	return error;
 }
+EXPORT_SYMBOL_GPL(dpm_prepare);
 
 /**
  * dpm_suspend_start - Prepare devices for PM transition and suspend them.
diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
index 2d56f41..2f530c8 100644
--- a/drivers/base/power/wakeup.c
+++ b/drivers/base/power/wakeup.c
@@ -23,6 +23,7 @@
  * if wakeup events are registered during or immediately before the transition.
  */
 bool events_check_enabled __read_mostly;
+EXPORT_SYMBOL_GPL(events_check_enabled);
 
 /*
  * Combined counters of registered wakeup events and wakeup events in progress.
@@ -715,6 +716,7 @@ bool pm_wakeup_pending(void)
 
 	return ret;
 }
+EXPORT_SYMBOL_GPL(pm_wakeup_pending);
 
 /**
  * pm_get_wakeup_count - Read the number of registered wakeup events.
diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index 6620b73..7f803c9 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -578,6 +578,7 @@ int xen_blkif_schedule(void *arg)
 	int ret;
 
 	xen_blkif_get(blkif);
+	set_freezable();
 
 	while (!kthread_should_stop()) {
 		if (try_to_freeze())
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 629a673..f9c43f9 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -1904,16 +1904,13 @@ static void blkback_changed(struct xenbus_device *dev,
 	case XenbusStateReconfiguring:
 	case XenbusStateReconfigured:
 	case XenbusStateUnknown:
+	case XenbusStateClosed:
 		break;
 
 	case XenbusStateConnected:
 		blkfront_connect(info);
 		break;
 
-	case XenbusStateClosed:
-		if (dev->state == XenbusStateClosed)
-			break;
-		/* Missed the backend's Closing state -- fallthrough */
 	case XenbusStateClosing:
 		blkfront_closing(info);
 		break;
diff --git a/drivers/char/raw.c b/drivers/char/raw.c
index 6e8d65e..f3223aa 100644
--- a/drivers/char/raw.c
+++ b/drivers/char/raw.c
@@ -190,7 +190,7 @@ static int bind_get(int number, dev_t *dev)
 	struct raw_device_data *rawdev;
 	struct block_device *bdev;
 
-	if (number <= 0 || number >= max_raw_minors)
+	if (number <= 0 || number >= MAX_RAW_MINORS)
 		return -EINVAL;
 
 	rawdev = &raw_devices[number];
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 33edd67..e8c9ef0 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -559,8 +559,7 @@ static void edac_mc_workq_function(struct work_struct *work_req)
  *
  *		called with the mem_ctls_mutex held
  */
-static void edac_mc_workq_setup(struct mem_ctl_info *mci, unsigned msec,
-				bool init)
+static void edac_mc_workq_setup(struct mem_ctl_info *mci, unsigned msec)
 {
 	edac_dbg(0, "\n");
 
@@ -568,9 +567,7 @@ static void edac_mc_workq_setup(struct mem_ctl_info *mci, unsigned msec,
 	if (mci->op_state != OP_RUNNING_POLL)
 		return;
 
-	if (init)
-		INIT_DELAYED_WORK(&mci->work, edac_mc_workq_function);
-
+	INIT_DELAYED_WORK(&mci->work, edac_mc_workq_function);
 	mod_delayed_work(edac_workqueue, &mci->work, msecs_to_jiffies(msec));
 }
 
@@ -604,7 +601,7 @@ static void edac_mc_workq_teardown(struct mem_ctl_info *mci)
  *	user space has updated our poll period value, need to
  *	reset our workq delays
  */
-void edac_mc_reset_delay_period(unsigned long value)
+void edac_mc_reset_delay_period(int value)
 {
 	struct mem_ctl_info *mci;
 	struct list_head *item;
@@ -614,7 +611,7 @@ void edac_mc_reset_delay_period(unsigned long value)
 	list_for_each(item, &mc_devices) {
 		mci = list_entry(item, struct mem_ctl_info, link);
 
-		edac_mc_workq_setup(mci, value, false);
+		edac_mc_workq_setup(mci, (unsigned long) value);
 	}
 
 	mutex_unlock(&mem_ctls_mutex);
@@ -785,7 +782,7 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
 		/* This instance is NOW RUNNING */
 		mci->op_state = OP_RUNNING_POLL;
 
-		edac_mc_workq_setup(mci, edac_mc_get_poll_msec(), true);
+		edac_mc_workq_setup(mci, edac_mc_get_poll_msec());
 	} else {
 		mci->op_state = OP_RUNNING_INTERRUPT;
 	}
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index e5bdf21..9f7e0e60 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -52,20 +52,18 @@ int edac_mc_get_poll_msec(void)
 
 static int edac_set_poll_msec(const char *val, struct kernel_param *kp)
 {
-	unsigned long l;
+	long l;
 	int ret;
 
 	if (!val)
 		return -EINVAL;
 
-	ret = kstrtoul(val, 0, &l);
+	ret = kstrtol(val, 0, &l);
 	if (ret)
 		return ret;
-
-	if (l < 1000)
+	if ((int)l != l)
 		return -EINVAL;
-
-	*((unsigned long *)kp->arg) = l;
+	*((int *)kp->arg) = l;
 
 	/* notify edac_mc engine to reset the poll period */
 	edac_mc_reset_delay_period(l);
diff --git a/drivers/edac/edac_module.h b/drivers/edac/edac_module.h
index f2118bf..3d139c6 100644
--- a/drivers/edac/edac_module.h
+++ b/drivers/edac/edac_module.h
@@ -52,7 +52,7 @@ extern void edac_device_workq_setup(struct edac_device_ctl_info *edac_dev,
 extern void edac_device_workq_teardown(struct edac_device_ctl_info *edac_dev);
 extern void edac_device_reset_delay_period(struct edac_device_ctl_info
 					   *edac_dev, unsigned long value);
-extern void edac_mc_reset_delay_period(unsigned long value);
+extern void edac_mc_reset_delay_period(int value);
 
 extern void *edac_align_ptr(void **p, unsigned size, int n_elems);
 
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index cfd7708..7fb46a9 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -131,7 +131,7 @@ int drm_gem_object_init(struct drm_device *dev,
 
 	drm_gem_private_object_init(dev, obj, size);
 
-	filp = shmem_file_setup("drm mm object", size, VM_NORESERVE);
+	filp = shmem_file_setup("drm mm object", size, VM_NORESERVE, 1);
 	if (IS_ERR(filp))
 		return PTR_ERR(filp);
 
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index a052ef9..6705f3b 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -146,10 +146,7 @@ static void i915_error_vprintf(struct drm_i915_error_state_buf *e,
 		va_list tmp;
 
 		va_copy(tmp, args);
-		len = vsnprintf(NULL, 0, f, tmp);
-		va_end(tmp);
-
-		if (!__i915_error_seek(e, len))
+		if (!__i915_error_seek(e, vsnprintf(NULL, 0, f, tmp)))
 			return;
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index a209177..f13d5ed 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -567,7 +567,8 @@ static u32 i915_get_vblank_counter(struct drm_device *dev, int pipe)
 
 		vbl_start = mode->crtc_vblank_start * mode->crtc_htotal;
 	} else {
-		enum transcoder cpu_transcoder = (enum transcoder) pipe;
+		enum transcoder cpu_transcoder =
+			intel_pipe_to_cpu_transcoder(dev_priv, pipe);
 		u32 htotal;
 
 		htotal = ((I915_READ(HTOTAL(cpu_transcoder)) >> 16) & 0x1fff) + 1;
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 2e38546..45e82db 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -1865,12 +1865,10 @@ static void vlv_pre_enable_dp(struct intel_encoder *encoder)
 
 	mutex_unlock(&dev_priv->dpio_lock);
 
-	if (is_edp(intel_dp)) {
-		/* init power sequencer on this pipe and port */
-		intel_dp_init_panel_power_sequencer(dev, intel_dp, &power_seq);
-		intel_dp_init_panel_power_sequencer_registers(dev, intel_dp,
-							      &power_seq);
-	}
+	/* init power sequencer on this pipe and port */
+	intel_dp_init_panel_power_sequencer(dev, intel_dp, &power_seq);
+	intel_dp_init_panel_power_sequencer_registers(dev, intel_dp,
+						      &power_seq);
 
 	intel_enable_dp(encoder);
 
diff --git a/drivers/gpu/drm/radeon/cik_sdma.c b/drivers/gpu/drm/radeon/cik_sdma.c
index e8832b7..d08b83c 100644
--- a/drivers/gpu/drm/radeon/cik_sdma.c
+++ b/drivers/gpu/drm/radeon/cik_sdma.c
@@ -88,35 +88,6 @@ void cik_sdma_ring_ib_execute(struct radeon_device *rdev,
 }
 
 /**
- * cik_sdma_hdp_flush_ring_emit - emit an hdp flush on the DMA ring
- *
- * @rdev: radeon_device pointer
- * @ridx: radeon ring index
- *
- * Emit an hdp flush packet on the requested DMA ring.
- */
-static void cik_sdma_hdp_flush_ring_emit(struct radeon_device *rdev,
-					 int ridx)
-{
-	struct radeon_ring *ring = &rdev->ring[ridx];
-	u32 extra_bits = (SDMA_POLL_REG_MEM_EXTRA_OP(1) |
-			  SDMA_POLL_REG_MEM_EXTRA_FUNC(3)); /* == */
-	u32 ref_and_mask;
-
-	if (ridx == R600_RING_TYPE_DMA_INDEX)
-		ref_and_mask = SDMA0;
-	else
-		ref_and_mask = SDMA1;
-
-	radeon_ring_write(ring, SDMA_PACKET(SDMA_OPCODE_POLL_REG_MEM, 0, extra_bits));
-	radeon_ring_write(ring, GPU_HDP_FLUSH_DONE);
-	radeon_ring_write(ring, GPU_HDP_FLUSH_REQ);
-	radeon_ring_write(ring, ref_and_mask); /* reference */
-	radeon_ring_write(ring, ref_and_mask); /* mask */
-	radeon_ring_write(ring, (0xfff << 16) | 10); /* retry count, poll interval */
-}
-
-/**
  * cik_sdma_fence_ring_emit - emit a fence on the DMA ring
  *
  * @rdev: radeon_device pointer
@@ -140,7 +111,12 @@ void cik_sdma_fence_ring_emit(struct radeon_device *rdev,
 	/* generate an interrupt */
 	radeon_ring_write(ring, SDMA_PACKET(SDMA_OPCODE_TRAP, 0, 0));
 	/* flush HDP */
-	cik_sdma_hdp_flush_ring_emit(rdev, fence->ring);
+	/* We should be using the new POLL_REG_MEM special op packet here
+	 * but it causes sDMA to hang sometimes
+	 */
+	radeon_ring_write(ring, SDMA_PACKET(SDMA_OPCODE_SRBM_WRITE, 0, 0xf000));
+	radeon_ring_write(ring, HDP_MEM_COHERENCY_FLUSH_CNTL >> 2);
+	radeon_ring_write(ring, 0);
 }
 
 /**
@@ -771,7 +747,12 @@ void cik_dma_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm
 	radeon_ring_write(ring, VMID(0));
 
 	/* flush HDP */
-	cik_sdma_hdp_flush_ring_emit(rdev, ridx);
+	/* We should be using the new POLL_REG_MEM special op packet here
+	 * but it causes sDMA to hang sometimes
+	 */
+	radeon_ring_write(ring, SDMA_PACKET(SDMA_OPCODE_SRBM_WRITE, 0, 0xf000));
+	radeon_ring_write(ring, HDP_MEM_COHERENCY_FLUSH_CNTL >> 2);
+	radeon_ring_write(ring, 0);
 
 	/* flush TLB */
 	radeon_ring_write(ring, SDMA_PACKET(SDMA_OPCODE_SRBM_WRITE, 0, 0xf000));
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 51cbb60..260b419 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -3904,10 +3904,6 @@ restart_ih:
 				break;
 			}
 			break;
-		case 124: /* UVD */
-			DRM_DEBUG("IH: UVD int: 0x%08x\n", src_data);
-			radeon_fence_process(rdev, R600_RING_TYPE_UVD_INDEX);
-			break;
 		case 176: /* CP_INT in ring buffer */
 		case 177: /* CP_INT in IB1 */
 		case 178: /* CP_INT in IB2 */
diff --git a/drivers/gpu/drm/radeon/si.c b/drivers/gpu/drm/radeon/si.c
index 6ff3f1a..398d016 100644
--- a/drivers/gpu/drm/radeon/si.c
+++ b/drivers/gpu/drm/radeon/si.c
@@ -6222,10 +6222,6 @@ restart_ih:
 				break;
 			}
 			break;
-		case 124: /* UVD */
-			DRM_DEBUG("IH: UVD int: 0x%08x\n", src_data);
-			radeon_fence_process(rdev, R600_RING_TYPE_UVD_INDEX);
-			break;
 		case 146:
 		case 147:
 			addr = RREG32(VM_CONTEXT1_PROTECTION_FAULT_ADDR);
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 210d503..50a5422 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -337,7 +337,7 @@ int ttm_tt_swapout(struct ttm_tt *ttm, struct file *persistent_swap_storage)
 	if (!persistent_swap_storage) {
 		swap_storage = shmem_file_setup("ttm swap",
 						ttm->num_pages << PAGE_SHIFT,
-						0);
+						0, 0);
 		if (unlikely(IS_ERR(swap_storage))) {
 			pr_err("Failed allocating swap storage\n");
 			return PTR_ERR(swap_storage);
diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index f2d7bf9..af6edf9 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -67,6 +67,7 @@ static int vmbus_negotiate_version(struct vmbus_channel_msginfo *msginfo,
 	int ret = 0;
 	struct vmbus_channel_initiate_contact *msg;
 	unsigned long flags;
+	int t;
 
 	init_completion(&msginfo->waitevent);
 
@@ -77,8 +78,6 @@ static int vmbus_negotiate_version(struct vmbus_channel_msginfo *msginfo,
 	msg->interrupt_page = virt_to_phys(vmbus_connection.int_page);
 	msg->monitor_page1 = virt_to_phys(vmbus_connection.monitor_pages[0]);
 	msg->monitor_page2 = virt_to_phys(vmbus_connection.monitor_pages[1]);
-	if (version == VERSION_WIN8)
-		msg->target_vcpu = hv_context.vp_index[smp_processor_id()];
 
 	/*
 	 * Add to list before we send the request since we may
@@ -101,7 +100,15 @@ static int vmbus_negotiate_version(struct vmbus_channel_msginfo *msginfo,
 	}
 
 	/* Wait for the connection response */
-	wait_for_completion(&msginfo->waitevent);
+	t =  wait_for_completion_timeout(&msginfo->waitevent, 5*HZ);
+	if (t == 0) {
+		spin_lock_irqsave(&vmbus_connection.channelmsg_lock,
+				flags);
+		list_del(&msginfo->msglistentry);
+		spin_unlock_irqrestore(&vmbus_connection.channelmsg_lock,
+					flags);
+		return -ETIMEDOUT;
+	}
 
 	spin_lock_irqsave(&vmbus_connection.channelmsg_lock, flags);
 	list_del(&msginfo->msglistentry);
diff --git a/drivers/hwmon/ntc_thermistor.c b/drivers/hwmon/ntc_thermistor.c
index 8a17f01..8c23203 100644
--- a/drivers/hwmon/ntc_thermistor.c
+++ b/drivers/hwmon/ntc_thermistor.c
@@ -145,7 +145,7 @@ struct ntc_data {
 static int ntc_adc_iio_read(struct ntc_thermistor_platform_data *pdata)
 {
 	struct iio_channel *channel = pdata->chan;
-	s64 result;
+	unsigned int result;
 	int val, ret;
 
 	ret = iio_read_channel_raw(channel, &val);
@@ -155,10 +155,10 @@ static int ntc_adc_iio_read(struct ntc_thermistor_platform_data *pdata)
 	}
 
 	/* unit: mV */
-	result = pdata->pullup_uv * (s64) val;
+	result = pdata->pullup_uv * val;
 	result >>= 12;
 
-	return (int)result;
+	return result;
 }
 
 static const struct of_device_id ntc_match[] = {
diff --git a/drivers/i2c/busses/i2c-mv64xxx.c b/drivers/i2c/busses/i2c-mv64xxx.c
index d52d849..b8c5187 100644
--- a/drivers/i2c/busses/i2c-mv64xxx.c
+++ b/drivers/i2c/busses/i2c-mv64xxx.c
@@ -97,6 +97,7 @@ enum {
 enum {
 	MV64XXX_I2C_ACTION_INVALID,
 	MV64XXX_I2C_ACTION_CONTINUE,
+	MV64XXX_I2C_ACTION_OFFLOAD_SEND_START,
 	MV64XXX_I2C_ACTION_SEND_START,
 	MV64XXX_I2C_ACTION_SEND_RESTART,
 	MV64XXX_I2C_ACTION_OFFLOAD_RESTART,
@@ -203,9 +204,6 @@ static int mv64xxx_i2c_offload_msg(struct mv64xxx_i2c_data *drv_data)
 	unsigned long ctrl_reg;
 	struct i2c_msg *msg = drv_data->msgs;
 
-	if (!drv_data->offload_enabled)
-		return -EOPNOTSUPP;
-
 	drv_data->msg = msg;
 	drv_data->byte_posn = 0;
 	drv_data->bytes_left = msg->len;
@@ -435,7 +433,8 @@ mv64xxx_i2c_do_action(struct mv64xxx_i2c_data *drv_data)
 
 		drv_data->msgs++;
 		drv_data->num_msgs--;
-		if (mv64xxx_i2c_offload_msg(drv_data) < 0) {
+		if (!(drv_data->offload_enabled &&
+				mv64xxx_i2c_offload_msg(drv_data))) {
 			drv_data->cntl_bits |= MV64XXX_I2C_REG_CONTROL_START;
 			writel(drv_data->cntl_bits,
 			drv_data->reg_base + drv_data->reg_offsets.control);
@@ -459,14 +458,15 @@ mv64xxx_i2c_do_action(struct mv64xxx_i2c_data *drv_data)
 			drv_data->reg_base + drv_data->reg_offsets.control);
 		break;
 
+	case MV64XXX_I2C_ACTION_OFFLOAD_SEND_START:
+		if (!mv64xxx_i2c_offload_msg(drv_data))
+			break;
+		else
+			drv_data->action = MV64XXX_I2C_ACTION_SEND_START;
+		/* FALLTHRU */
 	case MV64XXX_I2C_ACTION_SEND_START:
-		/* Can we offload this msg ? */
-		if (mv64xxx_i2c_offload_msg(drv_data) < 0) {
-			/* No, switch to standard path */
-			mv64xxx_i2c_prepare_for_io(drv_data, drv_data->msgs);
-			writel(drv_data->cntl_bits | MV64XXX_I2C_REG_CONTROL_START,
-				drv_data->reg_base + drv_data->reg_offsets.control);
-		}
+		writel(drv_data->cntl_bits | MV64XXX_I2C_REG_CONTROL_START,
+			drv_data->reg_base + drv_data->reg_offsets.control);
 		break;
 
 	case MV64XXX_I2C_ACTION_SEND_ADDR_1:
@@ -625,10 +625,15 @@ mv64xxx_i2c_execute_msg(struct mv64xxx_i2c_data *drv_data, struct i2c_msg *msg,
 	unsigned long	flags;
 
 	spin_lock_irqsave(&drv_data->lock, flags);
+	if (drv_data->offload_enabled) {
+		drv_data->action = MV64XXX_I2C_ACTION_OFFLOAD_SEND_START;
+		drv_data->state = MV64XXX_I2C_STATE_WAITING_FOR_START_COND;
+	} else {
+		mv64xxx_i2c_prepare_for_io(drv_data, msg);
 
-	drv_data->action = MV64XXX_I2C_ACTION_SEND_START;
-	drv_data->state = MV64XXX_I2C_STATE_WAITING_FOR_START_COND;
-
+		drv_data->action = MV64XXX_I2C_ACTION_SEND_START;
+		drv_data->state = MV64XXX_I2C_STATE_WAITING_FOR_START_COND;
+	}
 	drv_data->send_stop = is_last;
 	drv_data->block = 1;
 	mv64xxx_i2c_do_action(drv_data);
diff --git a/drivers/iio/adc/max1363.c b/drivers/iio/adc/max1363.c
index 5edb991..6118dce 100644
--- a/drivers/iio/adc/max1363.c
+++ b/drivers/iio/adc/max1363.c
@@ -1560,7 +1560,7 @@ static int max1363_probe(struct i2c_client *client,
 	st->client = client;
 
 	st->vref_uv = st->chip_info->int_vref_mv * 1000;
-	vref = devm_regulator_get_optional(&client->dev, "vref");
+	vref = devm_regulator_get(&client->dev, "vref");
 	if (!IS_ERR(vref)) {
 		int vref_uv;
 
diff --git a/drivers/iio/imu/adis16400.h b/drivers/iio/imu/adis16400.h
index 0916bf6..2f8f9d6 100644
--- a/drivers/iio/imu/adis16400.h
+++ b/drivers/iio/imu/adis16400.h
@@ -189,7 +189,6 @@ enum {
 	ADIS16300_SCAN_INCLI_X,
 	ADIS16300_SCAN_INCLI_Y,
 	ADIS16400_SCAN_ADC,
-	ADIS16400_SCAN_TIMESTAMP,
 };
 
 #ifdef CONFIG_IIO_BUFFER
diff --git a/drivers/iio/imu/adis16400_core.c b/drivers/iio/imu/adis16400_core.c
index 7c582f7..368660d 100644
--- a/drivers/iio/imu/adis16400_core.c
+++ b/drivers/iio/imu/adis16400_core.c
@@ -632,7 +632,7 @@ static const struct iio_chan_spec adis16400_channels[] = {
 	ADIS16400_MAGN_CHAN(Z, ADIS16400_ZMAGN_OUT, 14),
 	ADIS16400_TEMP_CHAN(ADIS16400_TEMP_OUT, 12),
 	ADIS16400_AUX_ADC_CHAN(ADIS16400_AUX_ADC, 12),
-	IIO_CHAN_SOFT_TIMESTAMP(ADIS16400_SCAN_TIMESTAMP),
+	IIO_CHAN_SOFT_TIMESTAMP(12)
 };
 
 static const struct iio_chan_spec adis16448_channels[] = {
@@ -659,7 +659,7 @@ static const struct iio_chan_spec adis16448_channels[] = {
 		},
 	},
 	ADIS16400_TEMP_CHAN(ADIS16448_TEMP_OUT, 12),
-	IIO_CHAN_SOFT_TIMESTAMP(ADIS16400_SCAN_TIMESTAMP),
+	IIO_CHAN_SOFT_TIMESTAMP(11)
 };
 
 static const struct iio_chan_spec adis16350_channels[] = {
@@ -677,7 +677,7 @@ static const struct iio_chan_spec adis16350_channels[] = {
 	ADIS16400_MOD_TEMP_CHAN(X, ADIS16350_XTEMP_OUT, 12),
 	ADIS16400_MOD_TEMP_CHAN(Y, ADIS16350_YTEMP_OUT, 12),
 	ADIS16400_MOD_TEMP_CHAN(Z, ADIS16350_ZTEMP_OUT, 12),
-	IIO_CHAN_SOFT_TIMESTAMP(ADIS16400_SCAN_TIMESTAMP),
+	IIO_CHAN_SOFT_TIMESTAMP(11)
 };
 
 static const struct iio_chan_spec adis16300_channels[] = {
@@ -690,7 +690,7 @@ static const struct iio_chan_spec adis16300_channels[] = {
 	ADIS16400_AUX_ADC_CHAN(ADIS16300_AUX_ADC, 12),
 	ADIS16400_INCLI_CHAN(X, ADIS16300_PITCH_OUT, 13),
 	ADIS16400_INCLI_CHAN(Y, ADIS16300_ROLL_OUT, 13),
-	IIO_CHAN_SOFT_TIMESTAMP(ADIS16400_SCAN_TIMESTAMP),
+	IIO_CHAN_SOFT_TIMESTAMP(14)
 };
 
 static const struct iio_chan_spec adis16334_channels[] = {
@@ -701,7 +701,7 @@ static const struct iio_chan_spec adis16334_channels[] = {
 	ADIS16400_ACCEL_CHAN(Y, ADIS16400_YACCL_OUT, 14),
 	ADIS16400_ACCEL_CHAN(Z, ADIS16400_ZACCL_OUT, 14),
 	ADIS16400_TEMP_CHAN(ADIS16350_XTEMP_OUT, 12),
-	IIO_CHAN_SOFT_TIMESTAMP(ADIS16400_SCAN_TIMESTAMP),
+	IIO_CHAN_SOFT_TIMESTAMP(8)
 };
 
 static struct attribute *adis16400_attributes[] = {
diff --git a/drivers/iio/magnetometer/ak8975.c b/drivers/iio/magnetometer/ak8975.c
index 0542354..ff284e5 100644
--- a/drivers/iio/magnetometer/ak8975.c
+++ b/drivers/iio/magnetometer/ak8975.c
@@ -85,7 +85,6 @@
 #define AK8975_MAX_CONVERSION_TIMEOUT	500
 #define AK8975_CONVERSION_DONE_POLL_TIME 10
 #define AK8975_DATA_READY_TIMEOUT	((100*HZ)/1000)
-#define RAW_TO_GAUSS(asa) ((((asa) + 128) * 3000) / 256)
 
 /*
  * Per-instance context data for the device.
@@ -266,15 +265,15 @@ static int ak8975_setup(struct i2c_client *client)
  *
  * Since 1uT = 0.01 gauss, our final scale factor becomes:
  *
- * Hadj = H * ((ASA + 128) / 256) * 3/10 * 1/100
- * Hadj = H * ((ASA + 128) * 0.003) / 256
+ * Hadj = H * ((ASA + 128) / 256) * 3/10 * 100
+ * Hadj = H * ((ASA + 128) * 30) / 256
  *
  * Since ASA doesn't change, we cache the resultant scale factor into the
  * device context in ak8975_setup().
  */
-	data->raw_to_gauss[0] = RAW_TO_GAUSS(data->asa[0]);
-	data->raw_to_gauss[1] = RAW_TO_GAUSS(data->asa[1]);
-	data->raw_to_gauss[2] = RAW_TO_GAUSS(data->asa[2]);
+	data->raw_to_gauss[0] = ((data->asa[0] + 128) * 30) >> 8;
+	data->raw_to_gauss[1] = ((data->asa[1] + 128) * 30) >> 8;
+	data->raw_to_gauss[2] = ((data->asa[2] + 128) * 30) >> 8;
 
 	return 0;
 }
@@ -429,9 +428,8 @@ static int ak8975_read_raw(struct iio_dev *indio_dev,
 	case IIO_CHAN_INFO_RAW:
 		return ak8975_read_axis(indio_dev, chan->address, val);
 	case IIO_CHAN_INFO_SCALE:
-		*val = 0;
-		*val2 = data->raw_to_gauss[chan->address];
-		return IIO_VAL_INT_PLUS_MICRO;
+		*val = data->raw_to_gauss[chan->address];
+		return IIO_VAL_INT;
 	}
 	return -EINVAL;
 }
diff --git a/drivers/infiniband/hw/qib/qib_iba7322.c b/drivers/infiniband/hw/qib/qib_iba7322.c
index d1bd213..5bfc02f 100644
--- a/drivers/infiniband/hw/qib/qib_iba7322.c
+++ b/drivers/infiniband/hw/qib/qib_iba7322.c
@@ -2395,11 +2395,6 @@ static int qib_7322_bringup_serdes(struct qib_pportdata *ppd)
 	qib_write_kreg_port(ppd, krp_ibcctrl_a, ppd->cpspec->ibcctrl_a);
 	qib_write_kreg(dd, kr_scratch, 0ULL);
 
-	/* ensure previous Tx parameters are not still forced */
-	qib_write_kreg_port(ppd, krp_tx_deemph_override,
-		SYM_MASK(IBSD_TX_DEEMPHASIS_OVERRIDE_0,
-		reset_tx_deemphasis_override));
-
 	if (qib_compat_ddr_negotiate) {
 		ppd->cpspec->ibdeltainprog = 1;
 		ppd->cpspec->ibsymsnap = read_7322_creg32_port(ppd,
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 369d919..7328793 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -33,6 +33,7 @@
 */
 
 #include <linux/kthread.h>
+#include <linux/freezer.h>
 #include <linux/blkdev.h>
 #include <linux/sysctl.h>
 #include <linux/seq_file.h>
@@ -7393,6 +7394,8 @@ void md_do_sync(struct md_thread *thread)
 	 *
 	 */
 
+	set_freezable();
+
 	do {
 		mddev->curr_resync = 2;
 
@@ -7416,6 +7419,9 @@ void md_do_sync(struct md_thread *thread)
 					 * time 'round when curr_resync == 2
 					 */
 					continue;
+
+				try_to_freeze();
+
 				/* We need to wait 'interruptible' so as not to
 				 * contribute to the load average, and not to
 				 * be caught by 'softlockup'
@@ -7428,6 +7434,7 @@ void md_do_sync(struct md_thread *thread)
 					       " share one or more physical units)\n",
 					       desc, mdname(mddev), mdname(mddev2));
 					mddev_put(mddev2);
+					try_to_freeze();
 					if (signal_pending(current))
 						flush_signals(current);
 					schedule();
@@ -7758,8 +7765,10 @@ no_add:
  */
 void md_check_recovery(struct mddev *mddev)
 {
-	if (mddev->suspended)
+#ifdef CONFIG_FREEZER
+	if (mddev->suspended || unlikely(atomic_read(&system_freezing_cnt)))
 		return;
+#endif
 
 	if (mddev->bitmap)
 		bitmap_daemon_work(mddev);
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 63b2e8d..a49cfcc 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1952,15 +1952,11 @@ static int process_checks(struct r1bio *r1_bio)
 	for (i = 0; i < conf->raid_disks * 2; i++) {
 		int j;
 		int size;
-		int uptodate;
 		struct bio *b = r1_bio->bios[i];
 		if (b->bi_end_io != end_sync_read)
 			continue;
-		/* fixup the bio for reuse, but preserve BIO_UPTODATE */
-		uptodate = test_bit(BIO_UPTODATE, &b->bi_flags);
+		/* fixup the bio for reuse */
 		bio_reset(b);
-		if (!uptodate)
-			clear_bit(BIO_UPTODATE, &b->bi_flags);
 		b->bi_vcnt = vcnt;
 		b->bi_size = r1_bio->sectors << 9;
 		b->bi_sector = r1_bio->sector +
@@ -1993,14 +1989,11 @@ static int process_checks(struct r1bio *r1_bio)
 		int j;
 		struct bio *pbio = r1_bio->bios[primary];
 		struct bio *sbio = r1_bio->bios[i];
-		int uptodate = test_bit(BIO_UPTODATE, &sbio->bi_flags);
 
 		if (sbio->bi_end_io != end_sync_read)
 			continue;
-		/* Now we can 'fixup' the BIO_UPTODATE flag */
-		set_bit(BIO_UPTODATE, &sbio->bi_flags);
 
-		if (uptodate) {
+		if (test_bit(BIO_UPTODATE, &sbio->bi_flags)) {
 			for (j = vcnt; j-- ; ) {
 				struct page *p, *s;
 				p = pbio->bi_io_vec[j].bv_page;
@@ -2015,7 +2008,7 @@ static int process_checks(struct r1bio *r1_bio)
 		if (j >= 0)
 			atomic64_add(r1_bio->sectors, &mddev->resync_mismatches);
 		if (j < 0 || (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)
-			      && uptodate)) {
+			      && test_bit(BIO_UPTODATE, &sbio->bi_flags))) {
 			/* No need to write to this device. */
 			sbio->bi_end_io = NULL;
 			rdev_dec_pending(conf->mirrors[i].rdev, mddev);
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 48cdec8..03f82ab 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -5512,43 +5512,23 @@ raid5_size(struct mddev *mddev, sector_t sectors, int raid_disks)
 	return sectors * (raid_disks - conf->max_degraded);
 }
 
-static void free_scratch_buffer(struct r5conf *conf, struct raid5_percpu *percpu)
-{
-	safe_put_page(percpu->spare_page);
-	kfree(percpu->scribble);
-	percpu->spare_page = NULL;
-	percpu->scribble = NULL;
-}
-
-static int alloc_scratch_buffer(struct r5conf *conf, struct raid5_percpu *percpu)
-{
-	if (conf->level == 6 && !percpu->spare_page)
-		percpu->spare_page = alloc_page(GFP_KERNEL);
-	if (!percpu->scribble)
-		percpu->scribble = kmalloc(conf->scribble_len, GFP_KERNEL);
-
-	if (!percpu->scribble || (conf->level == 6 && !percpu->spare_page)) {
-		free_scratch_buffer(conf, percpu);
-		return -ENOMEM;
-	}
-
-	return 0;
-}
-
 static void raid5_free_percpu(struct r5conf *conf)
 {
+	struct raid5_percpu *percpu;
 	unsigned long cpu;
 
 	if (!conf->percpu)
 		return;
 
+	get_online_cpus();
+	for_each_possible_cpu(cpu) {
+		percpu = per_cpu_ptr(conf->percpu, cpu);
+		safe_put_page(percpu->spare_page);
+		kfree(percpu->scribble);
+	}
 #ifdef CONFIG_HOTPLUG_CPU
 	unregister_cpu_notifier(&conf->cpu_notify);
 #endif
-
-	get_online_cpus();
-	for_each_possible_cpu(cpu)
-		free_scratch_buffer(conf, per_cpu_ptr(conf->percpu, cpu));
 	put_online_cpus();
 
 	free_percpu(conf->percpu);
@@ -5575,7 +5555,15 @@ static int raid456_cpu_notify(struct notifier_block *nfb, unsigned long action,
 	switch (action) {
 	case CPU_UP_PREPARE:
 	case CPU_UP_PREPARE_FROZEN:
-		if (alloc_scratch_buffer(conf, percpu)) {
+		if (conf->level == 6 && !percpu->spare_page)
+			percpu->spare_page = alloc_page(GFP_KERNEL);
+		if (!percpu->scribble)
+			percpu->scribble = kmalloc(conf->scribble_len, GFP_KERNEL);
+
+		if (!percpu->scribble ||
+		    (conf->level == 6 && !percpu->spare_page)) {
+			safe_put_page(percpu->spare_page);
+			kfree(percpu->scribble);
 			pr_err("%s: failed memory allocation for cpu%ld\n",
 			       __func__, cpu);
 			return notifier_from_errno(-ENOMEM);
@@ -5583,7 +5571,10 @@ static int raid456_cpu_notify(struct notifier_block *nfb, unsigned long action,
 		break;
 	case CPU_DEAD:
 	case CPU_DEAD_FROZEN:
-		free_scratch_buffer(conf, per_cpu_ptr(conf->percpu, cpu));
+		safe_put_page(percpu->spare_page);
+		kfree(percpu->scribble);
+		percpu->spare_page = NULL;
+		percpu->scribble = NULL;
 		break;
 	default:
 		break;
@@ -5595,29 +5586,40 @@ static int raid456_cpu_notify(struct notifier_block *nfb, unsigned long action,
 static int raid5_alloc_percpu(struct r5conf *conf)
 {
 	unsigned long cpu;
-	int err = 0;
+	struct page *spare_page;
+	struct raid5_percpu __percpu *allcpus;
+	void *scribble;
+	int err;
 
-	conf->percpu = alloc_percpu(struct raid5_percpu);
-	if (!conf->percpu)
+	allcpus = alloc_percpu(struct raid5_percpu);
+	if (!allcpus)
 		return -ENOMEM;
-
-#ifdef CONFIG_HOTPLUG_CPU
-	conf->cpu_notify.notifier_call = raid456_cpu_notify;
-	conf->cpu_notify.priority = 0;
-	err = register_cpu_notifier(&conf->cpu_notify);
-	if (err)
-		return err;
-#endif
+	conf->percpu = allcpus;
 
 	get_online_cpus();
+	err = 0;
 	for_each_present_cpu(cpu) {
-		err = alloc_scratch_buffer(conf, per_cpu_ptr(conf->percpu, cpu));
-		if (err) {
-			pr_err("%s: failed memory allocation for cpu%ld\n",
-			       __func__, cpu);
+		if (conf->level == 6) {
+			spare_page = alloc_page(GFP_KERNEL);
+			if (!spare_page) {
+				err = -ENOMEM;
+				break;
+			}
+			per_cpu_ptr(conf->percpu, cpu)->spare_page = spare_page;
+		}
+		scribble = kmalloc(conf->scribble_len, GFP_KERNEL);
+		if (!scribble) {
+			err = -ENOMEM;
 			break;
 		}
+		per_cpu_ptr(conf->percpu, cpu)->scribble = scribble;
 	}
+#ifdef CONFIG_HOTPLUG_CPU
+	conf->cpu_notify.notifier_call = raid456_cpu_notify;
+	conf->cpu_notify.priority = 0;
+	if (err == 0)
+		err = register_cpu_notifier(&conf->cpu_notify);
+#endif
 	put_online_cpus();
 
 	return err;
diff --git a/drivers/misc/mei/client.c b/drivers/misc/mei/client.c
index 28f53fe..87c96e4 100644
--- a/drivers/misc/mei/client.c
+++ b/drivers/misc/mei/client.c
@@ -907,6 +907,7 @@ void mei_cl_all_disconnect(struct mei_device *dev)
 	list_for_each_entry_safe(cl, next, &dev->file_list, link) {
 		cl->state = MEI_FILE_DISCONNECTED;
 		cl->mei_flow_ctrl_creds = 0;
+		cl->read_cb = NULL;
 		cl->timer_count = 0;
 	}
 }
@@ -940,16 +941,8 @@ void mei_cl_all_wakeup(struct mei_device *dev)
 void mei_cl_all_write_clear(struct mei_device *dev)
 {
 	struct mei_cl_cb *cb, *next;
-	struct list_head *list;
 
-	list = &dev->write_list.list;
-	list_for_each_entry_safe(cb, next, list, list) {
-		list_del(&cb->list);
-		mei_io_cb_free(cb);
-	}
-
-	list = &dev->write_waiting_list.list;
-	list_for_each_entry_safe(cb, next, list, list) {
+	list_for_each_entry_safe(cb, next, &dev->write_list.list, list) {
 		list_del(&cb->list);
 		mei_io_cb_free(cb);
 	}
diff --git a/drivers/misc/mic/host/mic_virtio.c b/drivers/misc/mic/host/mic_virtio.c
index 7e1ef0e..752ff87 100644
--- a/drivers/misc/mic/host/mic_virtio.c
+++ b/drivers/misc/mic/host/mic_virtio.c
@@ -156,8 +156,7 @@ static int mic_vringh_copy(struct mic_vdev *mvdev, struct vringh_kiov *iov,
 static int _mic_virtio_copy(struct mic_vdev *mvdev,
 	struct mic_copy_desc *copy)
 {
-	int ret = 0;
-	u32 iovcnt = copy->iovcnt;
+	int ret = 0, iovcnt = copy->iovcnt;
 	struct iovec iov;
 	struct iovec __user *u_iov = copy->iov;
 	void __user *ubuf = NULL;
diff --git a/drivers/net/irda/stir4200.c b/drivers/net/irda/stir4200.c
index 876e709..2b6bc28 100644
--- a/drivers/net/irda/stir4200.c
+++ b/drivers/net/irda/stir4200.c
@@ -739,7 +739,9 @@ static int stir_transmit_thread(void *arg)
 	struct net_device *dev = stir->netdev;
 	struct sk_buff *skb;
 
-        while (!kthread_should_stop()) {
+	set_freezable();
+
+        while (!kthread_freezable_should_stop(NULL)) {
 #ifdef CONFIG_PM
 		/* if suspending, then power off and wait */
 		if (unlikely(freezing(current))) {
diff --git a/drivers/net/wireless/ath/ar5523/ar5523.c b/drivers/net/wireless/ath/ar5523/ar5523.c
index 9cc2b91..280fc3d 100644
--- a/drivers/net/wireless/ath/ar5523/ar5523.c
+++ b/drivers/net/wireless/ath/ar5523/ar5523.c
@@ -1765,7 +1765,7 @@ static struct usb_device_id ar5523_id_table[] = {
 	AR5523_DEVICE_UG(0x07d1, 0x3a07),	/* D-Link / WUA-2340 rev A1 */
 	AR5523_DEVICE_UG(0x1690, 0x0712),	/* Gigaset / AR5523 */
 	AR5523_DEVICE_UG(0x1690, 0x0710),	/* Gigaset / SMCWUSBTG */
-	AR5523_DEVICE_UG(0x129b, 0x160b),	/* Gigaset / USB stick 108
+	AR5523_DEVICE_UG(0x129b, 0x160c),	/* Gigaset / USB stick 108
 						   (CyberTAN Technology) */
 	AR5523_DEVICE_UG(0x16ab, 0x7801),	/* Globalsun / AR5523_1 */
 	AR5523_DEVICE_UX(0x16ab, 0x7811),	/* Globalsun / AR5523_2 */
diff --git a/drivers/net/wireless/ath/ath9k/htc_drv_init.c b/drivers/net/wireless/ath/ath9k/htc_drv_init.c
index 50f991c..c3676bf 100644
--- a/drivers/net/wireless/ath/ath9k/htc_drv_init.c
+++ b/drivers/net/wireless/ath/ath9k/htc_drv_init.c
@@ -34,10 +34,6 @@ static int ath9k_htc_btcoex_enable;
 module_param_named(btcoex_enable, ath9k_htc_btcoex_enable, int, 0444);
 MODULE_PARM_DESC(btcoex_enable, "Enable wifi-BT coexistence");
 
-static int ath9k_ps_enable;
-module_param_named(ps_enable, ath9k_ps_enable, int, 0444);
-MODULE_PARM_DESC(ps_enable, "Enable WLAN PowerSave");
-
 #define CHAN2G(_freq, _idx)  { \
 	.center_freq = (_freq), \
 	.hw_value = (_idx), \
@@ -729,14 +725,12 @@ static void ath9k_set_hw_capab(struct ath9k_htc_priv *priv,
 		IEEE80211_HW_SPECTRUM_MGMT |
 		IEEE80211_HW_HAS_RATE_CONTROL |
 		IEEE80211_HW_RX_INCLUDES_FCS |
+		IEEE80211_HW_SUPPORTS_PS |
 		IEEE80211_HW_PS_NULLFUNC_STACK |
 		IEEE80211_HW_REPORTS_TX_ACK_STATUS |
 		IEEE80211_HW_MFP_CAPABLE |
 		IEEE80211_HW_HOST_BROADCAST_PS_BUFFERING;
 
-	if (ath9k_ps_enable)
-		hw->flags |= IEEE80211_HW_SUPPORTS_PS;
-
 	hw->wiphy->interface_modes =
 		BIT(NL80211_IFTYPE_STATION) |
 		BIT(NL80211_IFTYPE_ADHOC) |
diff --git a/drivers/net/wireless/ath/ath9k/htc_drv_main.c b/drivers/net/wireless/ath/ath9k/htc_drv_main.c
index a57af9b..608d739 100644
--- a/drivers/net/wireless/ath/ath9k/htc_drv_main.c
+++ b/drivers/net/wireless/ath/ath9k/htc_drv_main.c
@@ -1315,22 +1315,21 @@ static void ath9k_htc_sta_rc_update(struct ieee80211_hw *hw,
 	struct ath_common *common = ath9k_hw_common(priv->ah);
 	struct ath9k_htc_target_rate trate;
 
-	if (!(changed & IEEE80211_RC_SUPP_RATES_CHANGED))
-		return;
-
 	mutex_lock(&priv->mutex);
 	ath9k_htc_ps_wakeup(priv);
 
-	memset(&trate, 0, sizeof(struct ath9k_htc_target_rate));
-	ath9k_htc_setup_rate(priv, sta, &trate);
-	if (!ath9k_htc_send_rate_cmd(priv, &trate))
-		ath_dbg(common, CONFIG,
-			"Supported rates for sta: %pM updated, rate caps: 0x%X\n",
-			sta->addr, be32_to_cpu(trate.capflags));
-	else
-		ath_dbg(common, CONFIG,
-			"Unable to update supported rates for sta: %pM\n",
-			sta->addr);
+	if (changed & IEEE80211_RC_SUPP_RATES_CHANGED) {
+		memset(&trate, 0, sizeof(struct ath9k_htc_target_rate));
+		ath9k_htc_setup_rate(priv, sta, &trate);
+		if (!ath9k_htc_send_rate_cmd(priv, &trate))
+			ath_dbg(common, CONFIG,
+				"Supported rates for sta: %pM updated, rate caps: 0x%X\n",
+				sta->addr, be32_to_cpu(trate.capflags));
+		else
+			ath_dbg(common, CONFIG,
+				"Unable to update supported rates for sta: %pM\n",
+				sta->addr);
+	}
 
 	ath9k_htc_ps_restore(priv);
 	mutex_unlock(&priv->mutex);
diff --git a/drivers/net/wireless/ath/ath9k/init.c b/drivers/net/wireless/ath/ath9k/init.c
index 9eea982..710192e 100644
--- a/drivers/net/wireless/ath/ath9k/init.c
+++ b/drivers/net/wireless/ath/ath9k/init.c
@@ -57,10 +57,6 @@ static int ath9k_bt_ant_diversity;
 module_param_named(bt_ant_diversity, ath9k_bt_ant_diversity, int, 0444);
 MODULE_PARM_DESC(bt_ant_diversity, "Enable WLAN/BT RX antenna diversity");
 
-static int ath9k_ps_enable;
-module_param_named(ps_enable, ath9k_ps_enable, int, 0444);
-MODULE_PARM_DESC(ps_enable, "Enable WLAN PowerSave");
-
 bool is_ath9k_unloaded;
 /* We use the hw_value as an index into our private channel structure */
 
@@ -894,15 +890,13 @@ void ath9k_set_hw_capab(struct ath_softc *sc, struct ieee80211_hw *hw)
 	hw->flags = IEEE80211_HW_RX_INCLUDES_FCS |
 		IEEE80211_HW_HOST_BROADCAST_PS_BUFFERING |
 		IEEE80211_HW_SIGNAL_DBM |
+		IEEE80211_HW_SUPPORTS_PS |
 		IEEE80211_HW_PS_NULLFUNC_STACK |
 		IEEE80211_HW_SPECTRUM_MGMT |
 		IEEE80211_HW_REPORTS_TX_ACK_STATUS |
 		IEEE80211_HW_SUPPORTS_RC_TABLE |
 		IEEE80211_HW_SUPPORTS_HT_CCK_RATES;
 
-	if (ath9k_ps_enable)
-		hw->flags |= IEEE80211_HW_SUPPORTS_PS;
-
 	if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_HT) {
 		hw->flags |= IEEE80211_HW_AMPDU_AGGREGATION;
 
diff --git a/drivers/net/wireless/iwlwifi/iwl-nvm-parse.c b/drivers/net/wireless/iwlwifi/iwl-nvm-parse.c
index 4a1cf13..b76a9a8 100644
--- a/drivers/net/wireless/iwlwifi/iwl-nvm-parse.c
+++ b/drivers/net/wireless/iwlwifi/iwl-nvm-parse.c
@@ -182,11 +182,6 @@ static int iwl_init_channel_map(struct device *dev, const struct iwl_cfg *cfg,
 
 	for (ch_idx = 0; ch_idx < IWL_NUM_CHANNELS; ch_idx++) {
 		ch_flags = __le16_to_cpup(nvm_ch_flags + ch_idx);
-
-		if (ch_idx >= NUM_2GHZ_CHANNELS &&
-		    !data->sku_cap_band_52GHz_enable)
-			ch_flags &= ~NVM_CHANNEL_VALID;
-
 		if (!(ch_flags & NVM_CHANNEL_VALID)) {
 			IWL_DEBUG_EEPROM(dev,
 					 "Ch. %d Flags %x [%sGHz] - No traffic\n",
diff --git a/drivers/net/wireless/iwlwifi/mvm/mac80211.c b/drivers/net/wireless/iwlwifi/mvm/mac80211.c
index 0bd6fdc..74bc2c8 100644
--- a/drivers/net/wireless/iwlwifi/mvm/mac80211.c
+++ b/drivers/net/wireless/iwlwifi/mvm/mac80211.c
@@ -246,7 +246,7 @@ int iwl_mvm_mac_setup_register(struct iwl_mvm *mvm)
 	else
 		hw->wiphy->flags &= ~WIPHY_FLAG_PS_ON_BY_DEFAULT;
 
-	if (0 && mvm->fw->ucode_capa.flags & IWL_UCODE_TLV_FLAGS_SCHED_SCAN) {
+	if (mvm->fw->ucode_capa.flags & IWL_UCODE_TLV_FLAGS_SCHED_SCAN) {
 		hw->wiphy->flags |= WIPHY_FLAG_SUPPORTS_SCHED_SCAN;
 		hw->wiphy->max_sched_scan_ssids = PROBE_OPTION_MAX;
 		hw->wiphy->max_match_sets = IWL_SCAN_MAX_PROFILES;
diff --git a/drivers/net/wireless/iwlwifi/mvm/scan.c b/drivers/net/wireless/iwlwifi/mvm/scan.c
index 83a5ee1..dff7592 100644
--- a/drivers/net/wireless/iwlwifi/mvm/scan.c
+++ b/drivers/net/wireless/iwlwifi/mvm/scan.c
@@ -325,8 +325,7 @@ int iwl_mvm_scan_request(struct iwl_mvm *mvm,
 
 	iwl_mvm_scan_fill_ssids(cmd, req, basic_ssid ? 1 : 0);
 
-	cmd->tx_cmd.tx_flags = cpu_to_le32(TX_CMD_FLG_SEQ_CTL |
-					   TX_CMD_FLG_BT_DIS);
+	cmd->tx_cmd.tx_flags = cpu_to_le32(TX_CMD_FLG_SEQ_CTL);
 	cmd->tx_cmd.sta_id = mvm->aux_sta.sta_id;
 	cmd->tx_cmd.life_time = cpu_to_le32(TX_CMD_LIFE_TIME_INFINITE);
 	cmd->tx_cmd.rate_n_flags =
diff --git a/drivers/net/wireless/iwlwifi/mvm/utils.c b/drivers/net/wireless/iwlwifi/mvm/utils.c
index e2eda9f..ed69e9b 100644
--- a/drivers/net/wireless/iwlwifi/mvm/utils.c
+++ b/drivers/net/wireless/iwlwifi/mvm/utils.c
@@ -411,8 +411,6 @@ void iwl_mvm_dump_nic_error_log(struct iwl_mvm *mvm)
 			mvm->status, table.valid);
 	}
 
-	IWL_ERR(mvm, "Loaded firmware version: %s\n", mvm->fw->fw_version);
-
 	trace_iwlwifi_dev_ucode_error(trans->dev, table.error_id, table.tsf_low,
 				      table.data1, table.data2, table.data3,
 				      table.blink1, table.blink2, table.ilink1,
diff --git a/drivers/net/wireless/iwlwifi/pcie/drv.c b/drivers/net/wireless/iwlwifi/pcie/drv.c
index db7c0c7..e627254 100644
--- a/drivers/net/wireless/iwlwifi/pcie/drv.c
+++ b/drivers/net/wireless/iwlwifi/pcie/drv.c
@@ -354,25 +354,20 @@ static DEFINE_PCI_DEVICE_TABLE(iwl_hw_card_ids) = {
 /* 7265 Series */
 	{IWL_PCI_DEVICE(0x095A, 0x5010, iwl7265_2ac_cfg)},
 	{IWL_PCI_DEVICE(0x095A, 0x5110, iwl7265_2ac_cfg)},
-	{IWL_PCI_DEVICE(0x095A, 0x5112, iwl7265_2ac_cfg)},
-	{IWL_PCI_DEVICE(0x095A, 0x5100, iwl7265_2ac_cfg)},
-	{IWL_PCI_DEVICE(0x095A, 0x510A, iwl7265_2ac_cfg)},
 	{IWL_PCI_DEVICE(0x095B, 0x5310, iwl7265_2ac_cfg)},
 	{IWL_PCI_DEVICE(0x095B, 0x5302, iwl7265_2ac_cfg)},
 	{IWL_PCI_DEVICE(0x095B, 0x5210, iwl7265_2ac_cfg)},
 	{IWL_PCI_DEVICE(0x095A, 0x5012, iwl7265_2ac_cfg)},
+	{IWL_PCI_DEVICE(0x095A, 0x500A, iwl7265_2ac_cfg)},
 	{IWL_PCI_DEVICE(0x095A, 0x5410, iwl7265_2ac_cfg)},
 	{IWL_PCI_DEVICE(0x095A, 0x5400, iwl7265_2ac_cfg)},
 	{IWL_PCI_DEVICE(0x095A, 0x1010, iwl7265_2ac_cfg)},
 	{IWL_PCI_DEVICE(0x095A, 0x5000, iwl7265_2n_cfg)},
-	{IWL_PCI_DEVICE(0x095A, 0x500A, iwl7265_2n_cfg)},
 	{IWL_PCI_DEVICE(0x095B, 0x5200, iwl7265_2n_cfg)},
 	{IWL_PCI_DEVICE(0x095A, 0x5002, iwl7265_n_cfg)},
 	{IWL_PCI_DEVICE(0x095B, 0x5202, iwl7265_n_cfg)},
 	{IWL_PCI_DEVICE(0x095A, 0x9010, iwl7265_2ac_cfg)},
-	{IWL_PCI_DEVICE(0x095A, 0x9012, iwl7265_2ac_cfg)},
 	{IWL_PCI_DEVICE(0x095A, 0x9110, iwl7265_2ac_cfg)},
-	{IWL_PCI_DEVICE(0x095A, 0x9112, iwl7265_2ac_cfg)},
 	{IWL_PCI_DEVICE(0x095A, 0x9210, iwl7265_2ac_cfg)},
 	{IWL_PCI_DEVICE(0x095A, 0x9510, iwl7265_2ac_cfg)},
 	{IWL_PCI_DEVICE(0x095A, 0x9310, iwl7265_2ac_cfg)},
diff --git a/drivers/of/address.c b/drivers/of/address.c
index 1a54f1f..d3dd41c 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -99,12 +99,11 @@ static unsigned int of_bus_default_get_flags(const __be32 *addr)
 static int of_bus_pci_match(struct device_node *np)
 {
 	/*
- 	 * "pciex" is PCI Express
 	 * "vci" is for the /chaos bridge on 1st-gen PCI powermacs
 	 * "ht" is hypertransport
 	 */
-	return !strcmp(np->type, "pci") || !strcmp(np->type, "pciex") ||
-		!strcmp(np->type, "vci") || !strcmp(np->type, "ht");
+	return !strcmp(np->type, "pci") || !strcmp(np->type, "vci") ||
+		!strcmp(np->type, "ht");
 }
 
 static void of_bus_pci_count_cells(struct device_node *np,
diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index 113a0f5..e864392 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -706,17 +706,6 @@ static unsigned int get_slot_status(struct acpiphp_slot *slot)
 	return (unsigned int)sta;
 }
 
-static inline bool device_status_valid(unsigned int sta)
-{
-	/*
-	 * ACPI spec says that _STA may return bit 0 clear with bit 3 set
-	 * if the device is valid but does not require a device driver to be
-	 * loaded (Section 6.3.7 of ACPI 5.0A).
-	 */
-	unsigned int mask = ACPI_STA_DEVICE_ENABLED | ACPI_STA_DEVICE_FUNCTIONING;
-	return (sta & mask) == mask;
-}
-
 /**
  * trim_stale_devices - remove PCI devices that are not responding.
  * @dev: PCI device to start walking the hierarchy from.
@@ -732,7 +721,7 @@ static void trim_stale_devices(struct pci_dev *dev)
 		unsigned long long sta;
 
 		status = acpi_evaluate_integer(handle, "_STA", NULL, &sta);
-		alive = (ACPI_SUCCESS(status) && device_status_valid(sta))
+		alive = (ACPI_SUCCESS(status) && sta == ACPI_STA_ALL)
 			|| acpiphp_no_hotplug(handle);
 	}
 	if (!alive) {
@@ -775,7 +764,7 @@ static void acpiphp_check_bridge(struct acpiphp_bridge *bridge)
 		mutex_lock(&slot->crit_sect);
 		if (slot_no_hotplug(slot)) {
 			; /* do nothing */
-		} else if (device_status_valid(get_slot_status(slot))) {
+		} else if (get_slot_status(slot) == ACPI_STA_ALL) {
 			/* remove stale devices if any */
 			list_for_each_entry_safe(dev, tmp, &bus->devices,
 						 bus_list)
diff --git a/drivers/power/max17040_battery.c b/drivers/power/max17040_battery.c
index 0fbac86..c7ff6d6 100644
--- a/drivers/power/max17040_battery.c
+++ b/drivers/power/max17040_battery.c
@@ -148,7 +148,7 @@ static void max17040_get_online(struct i2c_client *client)
 {
 	struct max17040_chip *chip = i2c_get_clientdata(client);
 
-	if (chip->pdata && chip->pdata->battery_online)
+	if (chip->pdata->battery_online)
 		chip->online = chip->pdata->battery_online();
 	else
 		chip->online = 1;
@@ -158,8 +158,7 @@ static void max17040_get_status(struct i2c_client *client)
 {
 	struct max17040_chip *chip = i2c_get_clientdata(client);
 
-	if (!chip->pdata || !chip->pdata->charger_online
-			|| !chip->pdata->charger_enable) {
+	if (!chip->pdata->charger_online || !chip->pdata->charger_enable) {
 		chip->status = POWER_SUPPLY_STATUS_UNKNOWN;
 		return;
 	}
diff --git a/drivers/spi/spi-nuc900.c b/drivers/spi/spi-nuc900.c
index 8497df2..e0c32bc 100644
--- a/drivers/spi/spi-nuc900.c
+++ b/drivers/spi/spi-nuc900.c
@@ -363,8 +363,6 @@ static int nuc900_spi_probe(struct platform_device *pdev)
 	init_completion(&hw->done);
 
 	master->mode_bits          = SPI_CPOL | SPI_CPHA | SPI_CS_HIGH;
-	if (hw->pdata->lsb)
-		master->mode_bits |= SPI_LSB_FIRST;
 	master->num_chipselect     = hw->pdata->num_cs;
 	master->bus_num            = hw->pdata->bus_num;
 	hw->bitbang.master         = hw->master;
diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 349ebba..d745f95 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -735,7 +735,9 @@ static void spi_pump_messages(struct kthread_work *work)
 	ret = master->transfer_one_message(master, master->cur_msg);
 	if (ret) {
 		dev_err(&master->dev,
-			"failed to transfer one message from queue\n");
+			"failed to transfer one message from queue: %d\n", ret);
+		master->cur_msg->status = ret;
+		spi_finalize_current_message(master);
 		return;
 	}
 }
diff --git a/drivers/staging/comedi/drivers/adv_pci1710.c b/drivers/staging/comedi/drivers/adv_pci1710.c
index 50d2895..c3fdcab 100644
--- a/drivers/staging/comedi/drivers/adv_pci1710.c
+++ b/drivers/staging/comedi/drivers/adv_pci1710.c
@@ -489,7 +489,6 @@ static int pci171x_insn_write_ao(struct comedi_device *dev,
 				 struct comedi_insn *insn, unsigned int *data)
 {
 	struct pci1710_private *devpriv = dev->private;
-	unsigned int val;
 	int n, chan, range, ofs;
 
 	chan = CR_CHAN(insn->chanspec);
@@ -505,14 +504,11 @@ static int pci171x_insn_write_ao(struct comedi_device *dev,
 		outw(devpriv->da_ranges, dev->iobase + PCI171x_DAREF);
 		ofs = PCI171x_DA1;
 	}
-	val = devpriv->ao_data[chan];
 
-	for (n = 0; n < insn->n; n++) {
-		val = data[n];
-		outw(val, dev->iobase + ofs);
-	}
+	for (n = 0; n < insn->n; n++)
+		outw(data[n], dev->iobase + ofs);
 
-	devpriv->ao_data[chan] = val;
+	devpriv->ao_data[chan] = data[n];
 
 	return n;
 
@@ -678,7 +674,6 @@ static int pci1720_insn_write_ao(struct comedi_device *dev,
 				 struct comedi_insn *insn, unsigned int *data)
 {
 	struct pci1710_private *devpriv = dev->private;
-	unsigned int val;
 	int n, rangereg, chan;
 
 	chan = CR_CHAN(insn->chanspec);
@@ -688,15 +683,13 @@ static int pci1720_insn_write_ao(struct comedi_device *dev,
 		outb(rangereg, dev->iobase + PCI1720_RANGE);
 		devpriv->da_ranges = rangereg;
 	}
-	val = devpriv->ao_data[chan];
 
 	for (n = 0; n < insn->n; n++) {
-		val = data[n];
-		outw(val, dev->iobase + PCI1720_DA0 + (chan << 1));
+		outw(data[n], dev->iobase + PCI1720_DA0 + (chan << 1));
 		outb(0, dev->iobase + PCI1720_SYNCOUT);	/*  update outputs */
 	}
 
-	devpriv->ao_data[chan] = val;
+	devpriv->ao_data[chan] = data[n];
 
 	return n;
 }
diff --git a/drivers/staging/iio/adc/ad799x_core.c b/drivers/staging/iio/adc/ad799x_core.c
index 19a671b..9428be8 100644
--- a/drivers/staging/iio/adc/ad799x_core.c
+++ b/drivers/staging/iio/adc/ad799x_core.c
@@ -393,7 +393,7 @@ static const struct iio_event_spec ad799x_events[] = {
 	}, {
 		.type = IIO_EV_TYPE_THRESH,
 		.dir = IIO_EV_DIR_FALLING,
-		.mask_separate = BIT(IIO_EV_INFO_VALUE) |
+		.mask_separate = BIT(IIO_EV_INFO_VALUE),
 			BIT(IIO_EV_INFO_ENABLE),
 	}, {
 		.type = IIO_EV_TYPE_THRESH,
@@ -588,8 +588,7 @@ static int ad799x_probe(struct i2c_client *client,
 	return 0;
 
 error_free_irq:
-	if (client->irq > 0)
-		free_irq(client->irq, indio_dev);
+	free_irq(client->irq, indio_dev);
 error_cleanup_ring:
 	ad799x_ring_cleanup(indio_dev);
 error_disable_reg:
diff --git a/drivers/staging/iio/impedance-analyzer/ad5933.c b/drivers/staging/iio/impedance-analyzer/ad5933.c
index 2b96665..0a4298b 100644
--- a/drivers/staging/iio/impedance-analyzer/ad5933.c
+++ b/drivers/staging/iio/impedance-analyzer/ad5933.c
@@ -629,7 +629,7 @@ static int ad5933_register_ring_funcs_and_init(struct iio_dev *indio_dev)
 	struct iio_buffer *buffer;
 
 	buffer = iio_kfifo_allocate(indio_dev);
-	if (!buffer)
+	if (buffer)
 		return -ENOMEM;
 
 	iio_device_attach_buffer(indio_dev, buffer);
diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index a4e0472..1f07903 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -1086,7 +1086,7 @@ static int quotactl_ioctl(struct ll_sb_info *sbi, struct if_quotactl *qctl)
 		break;
 	case Q_GETQUOTA:
 		if (((type == USRQUOTA &&
-		      !uid_eq(current_euid(), make_kuid(&init_user_ns, id))) ||
+		      uid_eq(current_euid(), make_kuid(&init_user_ns, id))) ||
 		     (type == GRPQUOTA &&
 		      !in_egroup_p(make_kgid(&init_user_ns, id)))) &&
 		    (!cfs_capable(CFS_CAP_SYS_ADMIN) ||
diff --git a/drivers/staging/rtl8188eu/os_dep/usb_intf.c b/drivers/staging/rtl8188eu/os_dep/usb_intf.c
index 09eb278..7d14779 100644
--- a/drivers/staging/rtl8188eu/os_dep/usb_intf.c
+++ b/drivers/staging/rtl8188eu/os_dep/usb_intf.c
@@ -53,7 +53,7 @@ static struct usb_device_id rtw_usb_id_tbl[] = {
 	{USB_DEVICE(USB_VENDER_ID_REALTEK, 0x0179)}, /* 8188ETV */
 	/*=== Customer ID ===*/
 	/****** 8188EUS ********/
-	{USB_DEVICE(0x07b8, 0x8179)}, /* Abocom - Abocom */
+	{USB_DEVICE(0x8179, 0x07B8)}, /* Abocom - Abocom */
 	{USB_DEVICE(0x2001, 0x330F)}, /* DLink DWA-125 REV D1 */
 	{}	/* Terminating entry */
 };
diff --git a/drivers/target/target_core_pr.c b/drivers/target/target_core_pr.c
index 3013287..2f5d779 100644
--- a/drivers/target/target_core_pr.c
+++ b/drivers/target/target_core_pr.c
@@ -2009,7 +2009,7 @@ core_scsi3_emulate_pro_register(struct se_cmd *cmd, u64 res_key, u64 sa_res_key,
 	struct t10_reservation *pr_tmpl = &dev->t10_pr;
 	unsigned char isid_buf[PR_REG_ISID_LEN], *isid_ptr = NULL;
 	sense_reason_t ret = TCM_NO_SENSE;
-	int pr_holder = 0, type;
+	int pr_holder = 0;
 
 	if (!se_sess || !se_lun) {
 		pr_err("SPC-3 PR: se_sess || struct se_lun is NULL!\n");
@@ -2131,7 +2131,6 @@ core_scsi3_emulate_pro_register(struct se_cmd *cmd, u64 res_key, u64 sa_res_key,
 			ret = TCM_RESERVATION_CONFLICT;
 			goto out;
 		}
-		type = pr_reg->pr_res_type;
 
 		spin_lock(&pr_tmpl->registration_lock);
 		/*
@@ -2162,7 +2161,6 @@ core_scsi3_emulate_pro_register(struct se_cmd *cmd, u64 res_key, u64 sa_res_key,
 		 * Release the calling I_T Nexus registration now..
 		 */
 		__core_scsi3_free_registration(cmd->se_dev, pr_reg, NULL, 1);
-		pr_reg = NULL;
 
 		/*
 		 * From spc4r17, section 5.7.11.3 Unregistering
@@ -2176,8 +2174,8 @@ core_scsi3_emulate_pro_register(struct se_cmd *cmd, u64 res_key, u64 sa_res_key,
 		 * RESERVATIONS RELEASED.
 		 */
 		if (pr_holder &&
-		    (type == PR_TYPE_WRITE_EXCLUSIVE_REGONLY ||
-		     type == PR_TYPE_EXCLUSIVE_ACCESS_REGONLY)) {
+		    (pr_reg->pr_res_type == PR_TYPE_WRITE_EXCLUSIVE_REGONLY ||
+		     pr_reg->pr_res_type == PR_TYPE_EXCLUSIVE_ACCESS_REGONLY)) {
 			list_for_each_entry(pr_reg_p,
 					&pr_tmpl->registration_list,
 					pr_reg_list) {
@@ -2196,8 +2194,7 @@ core_scsi3_emulate_pro_register(struct se_cmd *cmd, u64 res_key, u64 sa_res_key,
 	ret = core_scsi3_update_and_write_aptpl(dev, aptpl);
 
 out:
-	if (pr_reg)
-		core_scsi3_put_pr_reg(pr_reg);
+	core_scsi3_put_pr_reg(pr_reg);
 	return ret;
 }
 
diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c
index 5056090..c0f76da 100644
--- a/drivers/tty/n_gsm.c
+++ b/drivers/tty/n_gsm.c
@@ -1089,7 +1089,6 @@ static void gsm_control_modem(struct gsm_mux *gsm, u8 *data, int clen)
 {
 	unsigned int addr = 0;
 	unsigned int modem = 0;
-	unsigned int brk = 0;
 	struct gsm_dlci *dlci;
 	int len = clen;
 	u8 *dp = data;
@@ -1116,16 +1115,6 @@ static void gsm_control_modem(struct gsm_mux *gsm, u8 *data, int clen)
 		if (len == 0)
 			return;
 	}
-	len--;
-	if (len > 0) {
-		while (gsm_read_ea(&brk, *dp++) == 0) {
-			len--;
-			if (len == 0)
-				return;
-		}
-		modem <<= 7;
-		modem |= (brk & 0x7f);
-	}
 	tty = tty_port_tty_get(&dlci->port);
 	gsm_process_modem(tty, dlci, modem, clen);
 	if (tty) {
diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index 4c10837..34aacaa 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -813,7 +813,8 @@ static void process_echoes(struct tty_struct *tty)
 	struct n_tty_data *ldata = tty->disc_data;
 	size_t echoed;
 
-	if (ldata->echo_mark == ldata->echo_tail)
+	if ((!L_ECHO(tty) && !L_ECHONL(tty)) ||
+	    ldata->echo_mark == ldata->echo_tail)
 		return;
 
 	mutex_lock(&ldata->output_lock);
@@ -1237,8 +1238,7 @@ n_tty_receive_signal_char(struct tty_struct *tty, int signal, unsigned char c)
 	if (L_ECHO(tty)) {
 		echo_char(c, tty);
 		commit_echoes(tty);
-	} else
-		process_echoes(tty);
+	}
 	isig(signal, tty);
 	return;
 }
@@ -1269,7 +1269,7 @@ n_tty_receive_char_special(struct tty_struct *tty, unsigned char c)
 	if (I_IXON(tty)) {
 		if (c == START_CHAR(tty)) {
 			start_tty(tty);
-			process_echoes(tty);
+			commit_echoes(tty);
 			return 0;
 		}
 		if (c == STOP_CHAR(tty)) {
@@ -1821,10 +1821,8 @@ static void n_tty_set_termios(struct tty_struct *tty, struct ktermios *old)
 	 * Fix tty hang when I_IXON(tty) is cleared, but the tty
 	 * been stopped by STOP_CHAR(tty) before it.
 	 */
-	if (!I_IXON(tty) && old && (old->c_iflag & IXON) && !tty->flow_stopped) {
+	if (!I_IXON(tty) && old && (old->c_iflag & IXON) && !tty->flow_stopped)
 		start_tty(tty);
-		process_echoes(tty);
-	}
 
 	/* The termios change make the tty ready for I/O */
 	wake_up_interruptible(&tty->write_wait);
diff --git a/drivers/tty/serial/omap-serial.c b/drivers/tty/serial/omap-serial.c
index 2051581..fa511eb 100644
--- a/drivers/tty/serial/omap-serial.c
+++ b/drivers/tty/serial/omap-serial.c
@@ -738,6 +738,9 @@ static int serial_omap_startup(struct uart_port *port)
 			return retval;
 		}
 		disable_irq(up->wakeirq);
+	} else {
+		dev_info(up->port.dev, "no wakeirq for uart%d\n",
+			 up->port.line);
 	}
 
 	dev_dbg(up->port.dev, "serial_omap_startup+%d\n", up->port.line);
@@ -1684,9 +1687,6 @@ static int serial_omap_probe(struct platform_device *pdev)
 	up->port.iotype = UPIO_MEM;
 	up->port.irq = uartirq;
 	up->wakeirq = wakeirq;
-	if (!up->wakeirq)
-		dev_info(up->port.dev, "no wakeirq for uart%d\n",
-			 up->port.line);
 
 	up->port.regshift = 2;
 	up->port.fifosize = 64;
diff --git a/drivers/tty/serial/sirfsoc_uart.c b/drivers/tty/serial/sirfsoc_uart.c
index 3fd7435..f186a8f 100644
--- a/drivers/tty/serial/sirfsoc_uart.c
+++ b/drivers/tty/serial/sirfsoc_uart.c
@@ -540,10 +540,8 @@ static void sirfsoc_rx_tmo_process_tl(unsigned long param)
 	wr_regl(port, ureg->sirfsoc_rx_dma_io_ctrl,
 			rd_regl(port, ureg->sirfsoc_rx_dma_io_ctrl) |
 			SIRFUART_IO_MODE);
-	spin_unlock_irqrestore(&sirfport->rx_lock, flags);
-	spin_lock(&port->lock);
 	sirfsoc_uart_pio_rx_chars(port, 4 - sirfport->rx_io_count);
-	spin_unlock(&port->lock);
+	spin_unlock_irqrestore(&sirfport->rx_lock, flags);
 	if (sirfport->rx_io_count == 4) {
 		spin_lock_irqsave(&sirfport->rx_lock, flags);
 		sirfport->rx_io_count = 0;
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 23b5d32..e91c395 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -1164,8 +1164,6 @@ static void csi_J(struct vc_data *vc, int vpar)
 			scr_memsetw(vc->vc_screenbuf, vc->vc_video_erase_char,
 				    vc->vc_screenbuf_size >> 1);
 			set_origin(vc);
-			if (CON_IS_VISIBLE(vc))
-				update_screen(vc);
 			/* fall through */
 		case 2: /* erase whole display */
 			count = vc->vc_cols * vc->vc_rows;
@@ -2431,6 +2429,7 @@ int vt_kmsg_redirect(int new)
 	else
 		return kmsg_con;
 }
+EXPORT_SYMBOL_GPL(vt_kmsg_redirect);
 
 /*
  *	Console on virtual terminal
diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
index d39106c..6bffb8c 100644
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -1031,6 +1031,7 @@ static int register_root_hub(struct usb_hcd *hcd)
 					dev_name(&usb_dev->dev), retval);
 			return retval;
 		}
+		usb_dev->lpm_capable = usb_device_supports_lpm(usb_dev);
 	}
 
 	retval = usb_new_device (usb_dev);
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index ebcd3bf..07e6654 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -135,7 +135,7 @@ struct usb_hub *usb_hub_to_struct_hub(struct usb_device *hdev)
 	return usb_get_intfdata(hdev->actconfig->interface[0]);
 }
 
-static int usb_device_supports_lpm(struct usb_device *udev)
+int usb_device_supports_lpm(struct usb_device *udev)
 {
 	/* USB 2.1 (and greater) devices indicate LPM support through
 	 * their USB 2.0 Extended Capabilities BOS descriptor.
@@ -156,6 +156,11 @@ static int usb_device_supports_lpm(struct usb_device *udev)
 				"Power management will be impacted.\n");
 		return 0;
 	}
+
+	/* udev is root hub */
+	if (!udev->parent)
+		return 1;
+
 	if (udev->parent->lpm_capable)
 		return 1;
 
diff --git a/drivers/usb/core/usb.h b/drivers/usb/core/usb.h
index 8238577..c493836 100644
--- a/drivers/usb/core/usb.h
+++ b/drivers/usb/core/usb.h
@@ -35,6 +35,7 @@ extern int usb_get_device_descriptor(struct usb_device *dev,
 		unsigned int size);
 extern int usb_get_bos_descriptor(struct usb_device *dev);
 extern void usb_release_bos_descriptor(struct usb_device *dev);
+extern int usb_device_supports_lpm(struct usb_device *udev);
 extern char *usb_cache_string(struct usb_device *udev, int index);
 extern int usb_set_configuration(struct usb_device *dev, int configuration);
 extern int usb_choose_configuration(struct usb_device *udev);
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 1e2f3f4..64c36fe 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2973,8 +2973,58 @@ static int prepare_ring(struct xhci_hcd *xhci, struct xhci_ring *ep_ring,
 	}
 
 	while (1) {
-		if (room_on_ring(xhci, ep_ring, num_trbs))
-			break;
+		if (room_on_ring(xhci, ep_ring, num_trbs)) {
+			union xhci_trb *trb = ep_ring->enqueue;
+			unsigned int usable = ep_ring->enq_seg->trbs +
+					TRBS_PER_SEGMENT - 1 - trb;
+			u32 nop_cmd;
+
+			/*
+			 * Section 4.11.7.1 TD Fragments states that a link
+			 * TRB must only occur at the boundary between
+			 * data bursts (eg 512 bytes for 480M).
+			 * While it is possible to split a large fragment
+			 * we don't know the size yet.
+			 * Simplest solution is to fill the trb before the
+			 * LINK with nop commands.
+			 */
+			if (num_trbs == 1 || num_trbs <= usable || usable == 0)
+				break;
+
+			if (ep_ring->type != TYPE_BULK)
+				/*
+				 * While isoc transfers might have a buffer that
+				 * crosses a 64k boundary it is unlikely.
+				 * Since we can't add NOPs without generating
+				 * gaps in the traffic just hope it never
+				 * happens at the end of the ring.
+				 * This could be fixed by writing a LINK TRB
+				 * instead of the first NOP - however the
+				 * TRB_TYPE_LINK_LE32() calls would all need
+				 * changing to check the ring length.
+				 */
+				break;
+
+			if (num_trbs >= TRBS_PER_SEGMENT) {
+				xhci_err(xhci, "Too many fragments %d, max %d\n",
+						num_trbs, TRBS_PER_SEGMENT - 1);
+				return -EINVAL;
+			}
+
+			nop_cmd = cpu_to_le32(TRB_TYPE(TRB_TR_NOOP) |
+					ep_ring->cycle_state);
+			ep_ring->num_trbs_free -= usable;
+			do {
+				trb->generic.field[0] = 0;
+				trb->generic.field[1] = 0;
+				trb->generic.field[2] = 0;
+				trb->generic.field[3] = nop_cmd;
+				trb++;
+			} while (--usable);
+			ep_ring->enqueue = trb;
+			if (room_on_ring(xhci, ep_ring, num_trbs))
+				break;
+		}
 
 		if (ep_ring == xhci->cmd_ring) {
 			xhci_err(xhci, "Do not support expand command ring\n");
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 56d488d..cfa5995 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -4716,8 +4716,11 @@ int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t get_quirks)
 	struct device		*dev = hcd->self.controller;
 	int			retval;
 
-	/* Accept arbitrarily long scatter-gather lists */
-	hcd->self.sg_tablesize = ~0;
+	/* Limit the block layer scatter-gather lists to half a segment. */
+	hcd->self.sg_tablesize = TRBS_PER_SEGMENT / 2;
+
+	/* support to build packet from discontinuous buffers */
+	hcd->self.no_sg_constraint = 1;
 
 	/* XHCI controllers don't stop the ep queue on short packets :| */
 	hcd->self.no_stop_on_short = 1;
@@ -4743,14 +4746,6 @@ int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t get_quirks)
 		/* xHCI private pointer was set in xhci_pci_probe for the second
 		 * registered roothub.
 		 */
-		xhci = hcd_to_xhci(hcd);
-		/*
-		 * Support arbitrarily aligned sg-list entries on hosts without
-		 * TD fragment rules (which are currently unsupported).
-		 */
-		if (xhci->hci_version < 0x100)
-			hcd->self.no_sg_constraint = 1;
-
 		return 0;
 	}
 
@@ -4777,9 +4772,6 @@ int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t get_quirks)
 	if (xhci->hci_version > 0x96)
 		xhci->quirks |= XHCI_SPURIOUS_SUCCESS;
 
-	if (xhci->hci_version < 0x100)
-		hcd->self.no_sg_constraint = 1;
-
 	/* Make sure the HC is halted. */
 	retval = xhci_halt(xhci);
 	if (retval)
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 03c74b7..c283cf1 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1260,7 +1260,7 @@ union xhci_trb {
  * since the command ring is 64-byte aligned.
  * It must also be greater than 16.
  */
-#define TRBS_PER_SEGMENT	64
+#define TRBS_PER_SEGMENT	256
 /* Allow two commands + a link TRB, along with any reserved command TRBs */
 #define MAX_RSVD_CMD_TRBS	(TRBS_PER_SEGMENT - 3)
 #define TRB_SEGMENT_SIZE	(TRBS_PER_SEGMENT*16)
diff --git a/drivers/usb/serial/ftdi_sio.c b/drivers/usb/serial/ftdi_sio.c
index 9f4d432..8c204a5 100644
--- a/drivers/usb/serial/ftdi_sio.c
+++ b/drivers/usb/serial/ftdi_sio.c
@@ -153,7 +153,6 @@ static struct usb_device_id id_table_combined [] = {
 	{ USB_DEVICE(FTDI_VID, FTDI_CANUSB_PID) },
 	{ USB_DEVICE(FTDI_VID, FTDI_CANDAPTER_PID) },
 	{ USB_DEVICE(FTDI_VID, FTDI_NXTCAM_PID) },
-	{ USB_DEVICE(FTDI_VID, FTDI_EV3CON_PID) },
 	{ USB_DEVICE(FTDI_VID, FTDI_SCS_DEVICE_0_PID) },
 	{ USB_DEVICE(FTDI_VID, FTDI_SCS_DEVICE_1_PID) },
 	{ USB_DEVICE(FTDI_VID, FTDI_SCS_DEVICE_2_PID) },
@@ -193,8 +192,6 @@ static struct usb_device_id id_table_combined [] = {
 	{ USB_DEVICE(INTERBIOMETRICS_VID, INTERBIOMETRICS_IOBOARD_PID) },
 	{ USB_DEVICE(INTERBIOMETRICS_VID, INTERBIOMETRICS_MINI_IOBOARD_PID) },
 	{ USB_DEVICE(FTDI_VID, FTDI_SPROG_II) },
-	{ USB_DEVICE(FTDI_VID, FTDI_TAGSYS_LP101_PID) },
-	{ USB_DEVICE(FTDI_VID, FTDI_TAGSYS_P200X_PID) },
 	{ USB_DEVICE(FTDI_VID, FTDI_LENZ_LIUSB_PID) },
 	{ USB_DEVICE(FTDI_VID, FTDI_XF_632_PID) },
 	{ USB_DEVICE(FTDI_VID, FTDI_XF_634_PID) },
diff --git a/drivers/usb/serial/ftdi_sio_ids.h b/drivers/usb/serial/ftdi_sio_ids.h
index 1e2d369..a7019d1 100644
--- a/drivers/usb/serial/ftdi_sio_ids.h
+++ b/drivers/usb/serial/ftdi_sio_ids.h
@@ -50,7 +50,6 @@
 #define TI_XDS100V2_PID		0xa6d0
 
 #define FTDI_NXTCAM_PID		0xABB8 /* NXTCam for Mindstorms NXT */
-#define FTDI_EV3CON_PID		0xABB9 /* Mindstorms EV3 Console Adapter */
 
 /* US Interface Navigator (http://www.usinterface.com/) */
 #define FTDI_USINT_CAT_PID	0xb810	/* Navigator CAT and 2nd PTT lines */
@@ -364,12 +363,6 @@
 /* Sprog II (Andrew Crosland's SprogII DCC interface) */
 #define FTDI_SPROG_II		0xF0C8
 
-/*
- * Two of the Tagsys RFID Readers
- */
-#define FTDI_TAGSYS_LP101_PID	0xF0E9	/* Tagsys L-P101 RFID*/
-#define FTDI_TAGSYS_P200X_PID	0xF0EE	/* Tagsys Medio P200x RFID*/
-
 /* an infrared receiver for user access control with IR tags */
 #define FTDI_PIEGROUP_PID	0xF208	/* Product Id */
 
diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
index 216d20a..5c86f57 100644
--- a/drivers/usb/serial/option.c
+++ b/drivers/usb/serial/option.c
@@ -1362,8 +1362,7 @@ static const struct usb_device_id option_ids[] = {
 	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1267, 0xff, 0xff, 0xff) },
 	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1268, 0xff, 0xff, 0xff) },
 	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1269, 0xff, 0xff, 0xff) },
-	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1270, 0xff, 0xff, 0xff),
-	  .driver_info = (kernel_ulong_t)&net_intf5_blacklist },
+	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1270, 0xff, 0xff, 0xff) },
 	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1271, 0xff, 0xff, 0xff) },
 	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1272, 0xff, 0xff, 0xff) },
 	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1273, 0xff, 0xff, 0xff) },
diff --git a/drivers/usb/serial/qcserial.c b/drivers/usb/serial/qcserial.c
index 968a402..c65437c 100644
--- a/drivers/usb/serial/qcserial.c
+++ b/drivers/usb/serial/qcserial.c
@@ -139,9 +139,6 @@ static const struct usb_device_id id_table[] = {
 	{USB_DEVICE_INTERFACE_NUMBER(0x1199, 0x901c, 0)},	/* Sierra Wireless EM7700 Device Management */
 	{USB_DEVICE_INTERFACE_NUMBER(0x1199, 0x901c, 2)},	/* Sierra Wireless EM7700 NMEA */
 	{USB_DEVICE_INTERFACE_NUMBER(0x1199, 0x901c, 3)},	/* Sierra Wireless EM7700 Modem */
-	{USB_DEVICE_INTERFACE_NUMBER(0x1199, 0x9051, 0)},	/* Netgear AirCard 340U Device Management */
-	{USB_DEVICE_INTERFACE_NUMBER(0x1199, 0x9051, 2)},	/* Netgear AirCard 340U NMEA */
-	{USB_DEVICE_INTERFACE_NUMBER(0x1199, 0x9051, 3)},	/* Netgear AirCard 340U Modem */
 
 	{ }				/* Terminating entry */
 };
diff --git a/drivers/usb/serial/usb-serial-simple.c b/drivers/usb/serial/usb-serial-simple.c
index 147f019..52eb91f 100644
--- a/drivers/usb/serial/usb-serial-simple.c
+++ b/drivers/usb/serial/usb-serial-simple.c
@@ -72,8 +72,7 @@ DEVICE(hp4x, HP4X_IDS);
 
 /* Suunto ANT+ USB Driver */
 #define SUUNTO_IDS()			\
-	{ USB_DEVICE(0x0fcf, 0x1008) },	\
-	{ USB_DEVICE(0x0fcf, 0x1009) } /* Dynastream ANT USB-m Stick */
+	{ USB_DEVICE(0x0fcf, 0x1008) }
 DEVICE(suunto, SUUNTO_IDS);
 
 /* Siemens USB/MPI adapter */
diff --git a/drivers/usb/storage/Kconfig b/drivers/usb/storage/Kconfig
index 1dd0604..8470e1b 100644
--- a/drivers/usb/storage/Kconfig
+++ b/drivers/usb/storage/Kconfig
@@ -18,9 +18,7 @@ config USB_STORAGE
 
 	  This option depends on 'SCSI' support being enabled, but you
 	  probably also need 'SCSI device support: SCSI disk support'
-	  (BLK_DEV_SD) for most USB storage devices.  Some devices also
-	  will require 'Probe all LUNs on each SCSI device'
-	  (SCSI_MULTI_LUN).
+	  (BLK_DEV_SD) for most USB storage devices.
 
 	  To compile this driver as a module, choose M here: the
 	  module will be called usb-storage.
diff --git a/drivers/usb/storage/scsiglue.c b/drivers/usb/storage/scsiglue.c
index 9d38ddc..18509e6 100644
--- a/drivers/usb/storage/scsiglue.c
+++ b/drivers/usb/storage/scsiglue.c
@@ -78,8 +78,6 @@ static const char* host_info(struct Scsi_Host *host)
 
 static int slave_alloc (struct scsi_device *sdev)
 {
-	struct us_data *us = host_to_us(sdev->host);
-
 	/*
 	 * Set the INQUIRY transfer length to 36.  We don't use any of
 	 * the extra data and many devices choke if asked for more or
@@ -104,10 +102,6 @@ static int slave_alloc (struct scsi_device *sdev)
 	 */
 	blk_queue_update_dma_alignment(sdev->request_queue, (512 - 1));
 
-	/* Tell the SCSI layer if we know there is more than one LUN */
-	if (us->protocol == USB_PR_BULK && us->max_lun > 0)
-		sdev->sdev_bflags |= BLIST_FORCELUN;
-
 	return 0;
 }
 
diff --git a/drivers/usb/storage/unusual_cypress.h b/drivers/usb/storage/unusual_cypress.h
index 82e8ed0..65a6a75 100644
--- a/drivers/usb/storage/unusual_cypress.h
+++ b/drivers/usb/storage/unusual_cypress.h
@@ -31,7 +31,7 @@ UNUSUAL_DEV(  0x04b4, 0x6831, 0x0000, 0x9999,
 		"Cypress ISD-300LP",
 		USB_SC_CYP_ATACB, USB_PR_DEVICE, NULL, 0),
 
-UNUSUAL_DEV( 0x14cd, 0x6116, 0x0160, 0x0160,
+UNUSUAL_DEV( 0x14cd, 0x6116, 0x0000, 0x0219,
 		"Super Top",
 		"USB 2.0  SATA BRIDGE",
 		USB_SC_CYP_ATACB, USB_PR_DEVICE, NULL, 0),
diff --git a/drivers/usb/storage/unusual_devs.h b/drivers/usb/storage/unusual_devs.h
index adbeb25..ad06255 100644
--- a/drivers/usb/storage/unusual_devs.h
+++ b/drivers/usb/storage/unusual_devs.h
@@ -1455,13 +1455,6 @@ UNUSUAL_DEV( 0x0f88, 0x042e, 0x0100, 0x0100,
 		USB_SC_DEVICE, USB_PR_DEVICE, NULL,
 		US_FL_FIX_CAPACITY ),
 
-/* Reported by Moritz Moeller-Herrmann <moritz-kernel@moeller-herrmann.de> */
-UNUSUAL_DEV(  0x0fca, 0x8004, 0x0201, 0x0201,
-		"Research In Motion",
-		"BlackBerry Bold 9000",
-		USB_SC_DEVICE, USB_PR_DEVICE, NULL,
-		US_FL_MAX_SECTORS_64 ),
-
 /* Reported by Michael Stattmann <michael@stattmann.com> */
 UNUSUAL_DEV(  0x0fce, 0xd008, 0x0000, 0x0000,
 		"Sony Ericsson",
diff --git a/drivers/uwb/uwbd.c b/drivers/uwb/uwbd.c
index bdcb13c..ce8fc9c 100644
--- a/drivers/uwb/uwbd.c
+++ b/drivers/uwb/uwbd.c
@@ -271,6 +271,7 @@ static int uwbd(void *param)
 	struct uwb_event *evt;
 	int should_stop = 0;
 
+	set_freezable();
 	while (1) {
 		wait_event_interruptible_timeout(
 			rc->uwbd.wq,
diff --git a/drivers/vme/bridges/vme_ca91cx42.c b/drivers/vme/bridges/vme_ca91cx42.c
index 0b2fefb..f844857 100644
--- a/drivers/vme/bridges/vme_ca91cx42.c
+++ b/drivers/vme/bridges/vme_ca91cx42.c
@@ -884,7 +884,7 @@ static ssize_t ca91cx42_master_read(struct vme_master_resource *image,
 		if (done == count)
 			goto out;
 	}
-	if ((uintptr_t)(addr + done) & 0x2) {
+	if ((uintptr_t)addr & 0x2) {
 		if ((count - done) < 2) {
 			*(u8 *)(buf + done) = ioread8(addr + done);
 			done += 1;
@@ -938,7 +938,7 @@ static ssize_t ca91cx42_master_write(struct vme_master_resource *image,
 		if (done == count)
 			goto out;
 	}
-	if ((uintptr_t)(addr + done) & 0x2) {
+	if ((uintptr_t)addr & 0x2) {
 		if ((count - done) < 2) {
 			iowrite8(*(u8 *)(buf + done), addr + done);
 			done += 1;
diff --git a/drivers/vme/bridges/vme_tsi148.c b/drivers/vme/bridges/vme_tsi148.c
index 7db4e63..9cf8833 100644
--- a/drivers/vme/bridges/vme_tsi148.c
+++ b/drivers/vme/bridges/vme_tsi148.c
@@ -1289,7 +1289,7 @@ static ssize_t tsi148_master_read(struct vme_master_resource *image, void *buf,
 		if (done == count)
 			goto out;
 	}
-	if ((uintptr_t)(addr + done) & 0x2) {
+	if ((uintptr_t)addr & 0x2) {
 		if ((count - done) < 2) {
 			*(u8 *)(buf + done) = ioread8(addr + done);
 			done += 1;
@@ -1371,7 +1371,7 @@ static ssize_t tsi148_master_write(struct vme_master_resource *image, void *buf,
 		if (done == count)
 			goto out;
 	}
-	if ((uintptr_t)(addr + done) & 0x2) {
+	if ((uintptr_t)addr & 0x2) {
 		if ((count - done) < 2) {
 			iowrite8(*(u8 *)(buf + done), addr + done);
 			done += 1;
diff --git a/drivers/w1/w1.c b/drivers/w1/w1.c
index 66efa96..128ad2c 100644
--- a/drivers/w1/w1.c
+++ b/drivers/w1/w1.c
@@ -1011,8 +1011,9 @@ int w1_process(void *data)
 	 * time can be calculated in jiffies once.
 	 */
 	const unsigned long jtime = msecs_to_jiffies(w1_timeout * 1000);
+	set_freezable();
 
-	while (!kthread_should_stop()) {
+	while (!kthread_freezable_should_stop(NULL)) {
 		if (dev->search_count) {
 			mutex_lock(&dev->mutex);
 			w1_search_process(dev, W1_SEARCH);
diff --git a/fs/bio-integrity.c b/fs/bio-integrity.c
index 6dea2b9..fc60b31 100644
--- a/fs/bio-integrity.c
+++ b/fs/bio-integrity.c
@@ -114,14 +114,6 @@ void bio_integrity_free(struct bio *bio)
 }
 EXPORT_SYMBOL(bio_integrity_free);
 
-static inline unsigned int bip_integrity_vecs(struct bio_integrity_payload *bip)
-{
-	if (bip->bip_slab == BIO_POOL_NONE)
-		return BIP_INLINE_VECS;
-
-	return bvec_nr_vecs(bip->bip_slab);
-}
-
 /**
  * bio_integrity_add_page - Attach integrity metadata
  * @bio:	bio to update
@@ -137,7 +129,7 @@ int bio_integrity_add_page(struct bio *bio, struct page *page,
 	struct bio_integrity_payload *bip = bio->bi_integrity;
 	struct bio_vec *iv;
 
-	if (bip->bip_vcnt >= bip_integrity_vecs(bip)) {
+	if (bip->bip_vcnt >= bvec_nr_vecs(bip->bip_slab)) {
 		printk(KERN_ERR "%s: bip_vec full\n", __func__);
 		return 0;
 	}
diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
index c1e0b0c..2654408 100644
--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -309,6 +309,8 @@ static int worker_loop(void *arg)
 	INIT_LIST_HEAD(&head);
 	INIT_LIST_HEAD(&prio_head);
 
+	set_freezable();
+
 	do {
 again:
 		while (1) {
@@ -345,7 +347,7 @@ again:
 			try_to_freeze();
 		} else {
 			spin_unlock_irq(&worker->lock);
-			if (!kthread_should_stop()) {
+			if (!kthread_freezable_should_stop(NULL)) {
 				cpu_relax();
 				/*
 				 * we've dropped the lock, did someone else
@@ -370,7 +372,7 @@ again:
 				    !list_empty(&worker->prio_pending))
 					continue;
 
-				if (kthread_should_stop())
+				if (kthread_freezable_should_stop(NULL))
 					break;
 
 				/* still no more work?, sleep for real */
@@ -390,7 +392,7 @@ again:
 				worker->working = 0;
 				spin_unlock_irq(&worker->lock);
 
-				if (!kthread_should_stop()) {
+				if (!kthread_freezable_should_stop(NULL)) {
 					schedule_timeout(HZ * 120);
 					if (!worker->working &&
 					    try_worker_shutdown(worker)) {
@@ -400,7 +402,7 @@ again:
 			}
 			__set_current_state(TASK_RUNNING);
 		}
-	} while (!kthread_should_stop());
+	} while (!kthread_freezable_should_stop(NULL));
 	return 0;
 }
 
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 8072cfa..082032d 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1703,6 +1703,8 @@ static int cleaner_kthread(void *arg)
 	struct btrfs_root *root = arg;
 	int again;
 
+	set_freezable();
+
 	do {
 		again = 0;
 
@@ -1734,11 +1736,11 @@ static int cleaner_kthread(void *arg)
 sleep:
 		if (!try_to_freeze() && !again) {
 			set_current_state(TASK_INTERRUPTIBLE);
-			if (!kthread_should_stop())
+			if (!kthread_freezable_should_stop(NULL))
 				schedule();
 			__set_current_state(TASK_RUNNING);
 		}
-	} while (!kthread_should_stop());
+	} while (!kthread_freezable_should_stop(NULL));
 	return 0;
 }
 
@@ -1752,6 +1754,8 @@ static int transaction_kthread(void *arg)
 	unsigned long delay;
 	bool cannot_commit;
 
+	set_freezable();
+
 	do {
 		cannot_commit = false;
 		delay = HZ * root->fs_info->commit_interval;
@@ -1796,13 +1800,13 @@ sleep:
 			btrfs_cleanup_transaction(root);
 		if (!try_to_freeze()) {
 			set_current_state(TASK_INTERRUPTIBLE);
-			if (!kthread_should_stop() &&
+			if (!kthread_freezable_should_stop(NULL) &&
 			    (!btrfs_transaction_blocked(root->fs_info) ||
 			     cannot_commit))
 				schedule_timeout(delay);
 			__set_current_state(TASK_RUNNING);
 		}
-	} while (!kthread_should_stop());
+	} while (!kthread_freezable_should_stop(NULL));
 	return 0;
 }
 
diff --git a/fs/cifs/cifsacl.c b/fs/cifs/cifsacl.c
index 494b683..51f5e0e 100644
--- a/fs/cifs/cifsacl.c
+++ b/fs/cifs/cifsacl.c
@@ -1027,30 +1027,15 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 nmode,
 	__u32 secdesclen = 0;
 	struct cifs_ntsd *pntsd = NULL; /* acl obtained from server */
 	struct cifs_ntsd *pnntsd = NULL; /* modified acl to be sent to server */
-	struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
-	struct tcon_link *tlink = cifs_sb_tlink(cifs_sb);
-	struct cifs_tcon *tcon;
-
-	if (IS_ERR(tlink))
-		return PTR_ERR(tlink);
-	tcon = tlink_tcon(tlink);
 
 	cifs_dbg(NOISY, "set ACL from mode for %s\n", path);
 
 	/* Get the security descriptor */
-
-	if (tcon->ses->server->ops->get_acl == NULL) {
-		cifs_put_tlink(tlink);
-		return -EOPNOTSUPP;
-	}
-
-	pntsd = tcon->ses->server->ops->get_acl(cifs_sb, inode, path,
-						&secdesclen);
+	pntsd = get_cifs_acl(CIFS_SB(inode->i_sb), inode, path, &secdesclen);
 	if (IS_ERR(pntsd)) {
 		rc = PTR_ERR(pntsd);
 		cifs_dbg(VFS, "%s: error %d getting sec desc\n", __func__, rc);
-		cifs_put_tlink(tlink);
-		return rc;
+		goto out;
 	}
 
 	/*
@@ -1063,7 +1048,6 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 nmode,
 	pnntsd = kmalloc(secdesclen, GFP_KERNEL);
 	if (!pnntsd) {
 		kfree(pntsd);
-		cifs_put_tlink(tlink);
 		return -ENOMEM;
 	}
 
@@ -1072,18 +1056,14 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 nmode,
 
 	cifs_dbg(NOISY, "build_sec_desc rc: %d\n", rc);
 
-	if (tcon->ses->server->ops->set_acl == NULL)
-		rc = -EOPNOTSUPP;
-
 	if (!rc) {
 		/* Set the security descriptor */
-		rc = tcon->ses->server->ops->set_acl(pnntsd, secdesclen, inode,
-						     path, aclflag);
+		rc = set_cifs_acl(pnntsd, secdesclen, inode, path, aclflag);
 		cifs_dbg(NOISY, "set_cifs_acl rc: %d\n", rc);
 	}
-	cifs_put_tlink(tlink);
 
 	kfree(pnntsd);
 	kfree(pntsd);
+out:
 	return rc;
 }
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index 579c6d5..f918a99 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -385,16 +385,6 @@ struct smb_version_operations {
 			struct cifsFileInfo *target_file, u64 src_off, u64 len,
 			u64 dest_off);
 	int (*validate_negotiate)(const unsigned int, struct cifs_tcon *);
-	ssize_t (*query_all_EAs)(const unsigned int, struct cifs_tcon *,
-			const unsigned char *, const unsigned char *, char *,
-			size_t, const struct nls_table *, int);
-	int (*set_EA)(const unsigned int, struct cifs_tcon *, const char *,
-			const char *, const void *, const __u16,
-			const struct nls_table *, int);
-	struct cifs_ntsd * (*get_acl)(struct cifs_sb_info *, struct inode *,
-			const char *, u32 *);
-	int (*set_acl)(struct cifs_ntsd *, __u32, struct inode *, const char *,
-			int);
 };
 
 struct smb_version_values {
diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
index 5f8bdff..49719b8 100644
--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -518,15 +518,10 @@ static int cifs_sfu_mode(struct cifs_fattr *fattr, const unsigned char *path,
 		return PTR_ERR(tlink);
 	tcon = tlink_tcon(tlink);
 
-	if (tcon->ses->server->ops->query_all_EAs == NULL) {
-		cifs_put_tlink(tlink);
-		return -EOPNOTSUPP;
-	}
-
-	rc = tcon->ses->server->ops->query_all_EAs(xid, tcon, path,
-			"SETFILEBITS", ea_value, 4 /* size of buf */,
-			cifs_sb->local_nls,
-			cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
+	rc = CIFSSMBQAllEAs(xid, tcon, path, "SETFILEBITS",
+			    ea_value, 4 /* size of buf */, cifs_sb->local_nls,
+			    cifs_sb->mnt_cifs_flags &
+				CIFS_MOUNT_MAP_SPECIAL_CHR);
 	cifs_put_tlink(tlink);
 	if (rc < 0)
 		return (int)rc;
diff --git a/fs/cifs/smb1ops.c b/fs/cifs/smb1ops.c
index ffc9ef9..5f5ba0d 100644
--- a/fs/cifs/smb1ops.c
+++ b/fs/cifs/smb1ops.c
@@ -1011,14 +1011,6 @@ struct smb_version_operations smb1_operations = {
 	.push_mand_locks = cifs_push_mandatory_locks,
 	.query_mf_symlink = open_query_close_cifs_symlink,
 	.is_read_op = cifs_is_read_op,
-#ifdef CONFIG_CIFS_XATTR
-	.query_all_EAs = CIFSSMBQAllEAs,
-	.set_EA = CIFSSMBSetEA,
-#endif /* CIFS_XATTR */
-#ifdef CONFIG_CIFS_ACL
-	.get_acl = get_cifs_acl,
-	.set_acl = set_cifs_acl,
-#endif /* CIFS_ACL */
 };
 
 struct smb_version_values smb1_values = {
diff --git a/fs/cifs/xattr.c b/fs/cifs/xattr.c
index 5ac836a..09afda4 100644
--- a/fs/cifs/xattr.c
+++ b/fs/cifs/xattr.c
@@ -82,11 +82,9 @@ int cifs_removexattr(struct dentry *direntry, const char *ea_name)
 			goto remove_ea_exit;
 
 		ea_name += XATTR_USER_PREFIX_LEN; /* skip past user. prefix */
-		if (pTcon->ses->server->ops->set_EA)
-			rc = pTcon->ses->server->ops->set_EA(xid, pTcon,
-				full_path, ea_name, NULL, (__u16)0,
-				cifs_sb->local_nls, cifs_sb->mnt_cifs_flags &
-					CIFS_MOUNT_MAP_SPECIAL_CHR);
+		rc = CIFSSMBSetEA(xid, pTcon, full_path, ea_name, NULL,
+			(__u16)0, cifs_sb->local_nls,
+			cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
 	}
 remove_ea_exit:
 	kfree(full_path);
@@ -151,22 +149,18 @@ int cifs_setxattr(struct dentry *direntry, const char *ea_name,
 			cifs_dbg(FYI, "attempt to set cifs inode metadata\n");
 
 		ea_name += XATTR_USER_PREFIX_LEN; /* skip past user. prefix */
-		if (pTcon->ses->server->ops->set_EA)
-			rc = pTcon->ses->server->ops->set_EA(xid, pTcon,
-				full_path, ea_name, ea_value, (__u16)value_size,
-				cifs_sb->local_nls, cifs_sb->mnt_cifs_flags &
-					CIFS_MOUNT_MAP_SPECIAL_CHR);
+		rc = CIFSSMBSetEA(xid, pTcon, full_path, ea_name, ea_value,
+			(__u16)value_size, cifs_sb->local_nls,
+			cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
 	} else if (strncmp(ea_name, XATTR_OS2_PREFIX, XATTR_OS2_PREFIX_LEN)
 		   == 0) {
 		if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NO_XATTR)
 			goto set_ea_exit;
 
 		ea_name += XATTR_OS2_PREFIX_LEN; /* skip past os2. prefix */
-		if (pTcon->ses->server->ops->set_EA)
-			rc = pTcon->ses->server->ops->set_EA(xid, pTcon,
-				full_path, ea_name, ea_value, (__u16)value_size,
-				cifs_sb->local_nls, cifs_sb->mnt_cifs_flags &
-					CIFS_MOUNT_MAP_SPECIAL_CHR);
+		rc = CIFSSMBSetEA(xid, pTcon, full_path, ea_name, ea_value,
+			(__u16)value_size, cifs_sb->local_nls,
+			cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
 	} else if (strncmp(ea_name, CIFS_XATTR_CIFS_ACL,
 			strlen(CIFS_XATTR_CIFS_ACL)) == 0) {
 #ifdef CONFIG_CIFS_ACL
@@ -176,12 +170,8 @@ int cifs_setxattr(struct dentry *direntry, const char *ea_name,
 			rc = -ENOMEM;
 		} else {
 			memcpy(pacl, ea_value, value_size);
-			if (pTcon->ses->server->ops->set_acl)
-				rc = pTcon->ses->server->ops->set_acl(pacl,
-						value_size, direntry->d_inode,
-						full_path, CIFS_ACL_DACL);
-			else
-				rc = -EOPNOTSUPP;
+			rc = set_cifs_acl(pacl, value_size,
+				direntry->d_inode, full_path, CIFS_ACL_DACL);
 			if (rc == 0) /* force revalidate of the inode */
 				CIFS_I(direntry->d_inode)->time = 0;
 			kfree(pacl);
@@ -282,21 +272,17 @@ ssize_t cifs_getxattr(struct dentry *direntry, const char *ea_name,
 			/* revalidate/getattr then populate from inode */
 		} /* BB add else when above is implemented */
 		ea_name += XATTR_USER_PREFIX_LEN; /* skip past user. prefix */
-		if (pTcon->ses->server->ops->query_all_EAs)
-			rc = pTcon->ses->server->ops->query_all_EAs(xid, pTcon,
-				full_path, ea_name, ea_value, buf_size,
-				cifs_sb->local_nls, cifs_sb->mnt_cifs_flags &
-					CIFS_MOUNT_MAP_SPECIAL_CHR);
+		rc = CIFSSMBQAllEAs(xid, pTcon, full_path, ea_name, ea_value,
+			buf_size, cifs_sb->local_nls,
+			cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
 	} else if (strncmp(ea_name, XATTR_OS2_PREFIX, XATTR_OS2_PREFIX_LEN) == 0) {
 		if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NO_XATTR)
 			goto get_ea_exit;
 
 		ea_name += XATTR_OS2_PREFIX_LEN; /* skip past os2. prefix */
-		if (pTcon->ses->server->ops->query_all_EAs)
-			rc = pTcon->ses->server->ops->query_all_EAs(xid, pTcon,
-				full_path, ea_name, ea_value, buf_size,
-				cifs_sb->local_nls, cifs_sb->mnt_cifs_flags &
-					CIFS_MOUNT_MAP_SPECIAL_CHR);
+		rc = CIFSSMBQAllEAs(xid, pTcon, full_path, ea_name, ea_value,
+			buf_size, cifs_sb->local_nls,
+			cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
 	} else if (strncmp(ea_name, POSIX_ACL_XATTR_ACCESS,
 			  strlen(POSIX_ACL_XATTR_ACCESS)) == 0) {
 #ifdef CONFIG_CIFS_POSIX
@@ -327,11 +313,8 @@ ssize_t cifs_getxattr(struct dentry *direntry, const char *ea_name,
 			u32 acllen;
 			struct cifs_ntsd *pacl;
 
-			if (pTcon->ses->server->ops->get_acl == NULL)
-				goto get_ea_exit; /* rc already EOPNOTSUPP */
-
-			pacl = pTcon->ses->server->ops->get_acl(cifs_sb,
-					direntry->d_inode, full_path, &acllen);
+			pacl = get_cifs_acl(cifs_sb, direntry->d_inode,
+						full_path, &acllen);
 			if (IS_ERR(pacl)) {
 				rc = PTR_ERR(pacl);
 				cifs_dbg(VFS, "%s: error %zd getting sec desc\n",
@@ -417,12 +400,11 @@ ssize_t cifs_listxattr(struct dentry *direntry, char *data, size_t buf_size)
 	/* if proc/fs/cifs/streamstoxattr is set then
 		search server for EAs or streams to
 		returns as xattrs */
-
-	if (pTcon->ses->server->ops->query_all_EAs)
-		rc = pTcon->ses->server->ops->query_all_EAs(xid, pTcon,
-				full_path, NULL, data, buf_size,
-				cifs_sb->local_nls, cifs_sb->mnt_cifs_flags &
+	rc = CIFSSMBQAllEAs(xid, pTcon, full_path, NULL, data,
+				buf_size, cifs_sb->local_nls,
+				cifs_sb->mnt_cifs_flags &
 					CIFS_MOUNT_MAP_SPECIAL_CHR);
+
 list_ea_exit:
 	kfree(full_path);
 	free_xid(xid);
diff --git a/fs/drop_caches.c b/fs/drop_caches.c
index 9fd702f..089b971 100644
--- a/fs/drop_caches.c
+++ b/fs/drop_caches.c
@@ -8,6 +8,7 @@
 #include <linux/writeback.h>
 #include <linux/sysctl.h>
 #include <linux/gfp.h>
+#include <linux/export.h>
 #include "internal.h"
 
 /* A global variable is a bit ugly, but it keeps the code simple */
@@ -50,6 +51,13 @@ static void drop_slab(void)
 	} while (nr_objects > 10);
 }
 
+/* For TuxOnIce */
+void drop_pagecache(void)
+{
+	iterate_supers(drop_pagecache_sb, NULL);
+}
+EXPORT_SYMBOL_GPL(drop_pagecache);
+
 int drop_caches_sysctl_handler(ctl_table *table, int write,
 	void __user *buffer, size_t *length, loff_t *ppos)
 {
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 1f7784d..7ce861c 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -2921,6 +2921,7 @@ static int ext4_lazyinit_thread(void *arg)
 	unsigned long next_wakeup, cur;
 
 	BUG_ON(NULL == eli);
+	set_freezable();
 
 cont_thread:
 	while (true) {
@@ -2960,7 +2961,7 @@ cont_thread:
 
 		schedule_timeout_interruptible(next_wakeup - cur);
 
-		if (kthread_should_stop()) {
+		if (kthread_freezable_should_stop(NULL)) {
 			ext4_clear_request_list();
 			goto exit_thread;
 		}
diff --git a/fs/file.c b/fs/file.c
index 9de2026..4a78f98 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -34,7 +34,7 @@ static void *alloc_fdmem(size_t size)
 	 * vmalloc() if the allocation size will be considered "large" by the VM.
 	 */
 	if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
-		void *data = kmalloc(size, GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY);
+		void *data = kmalloc(size, GFP_KERNEL|__GFP_NOWARN);
 		if (data != NULL)
 			return data;
 	}
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 9dcb977..a05b789 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -872,7 +872,9 @@ int gfs2_logd(void *data)
 	unsigned long t = 1;
 	DEFINE_WAIT(wait);
 
-	while (!kthread_should_stop()) {
+	set_freezable();
+
+	while (!kthread_freezable_should_stop(NULL)) {
 
 		if (gfs2_jrnl_flush_reqd(sdp) || t == 0) {
 			gfs2_ail1_empty(sdp);
@@ -898,11 +900,11 @@ int gfs2_logd(void *data)
 					TASK_INTERRUPTIBLE);
 			if (!gfs2_ail_flush_reqd(sdp) &&
 			    !gfs2_jrnl_flush_reqd(sdp) &&
-			    !kthread_should_stop())
+			    !kthread_freezable_should_stop(NULL))
 				t = schedule_timeout(t);
 		} while(t && !gfs2_ail_flush_reqd(sdp) &&
 			!gfs2_jrnl_flush_reqd(sdp) &&
-			!kthread_should_stop());
+			!kthread_freezable_should_stop(NULL));
 		finish_wait(&sdp->sd_logd_waitq, &wait);
 	}
 
diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
index 98236d0..579453a 100644
--- a/fs/gfs2/quota.c
+++ b/fs/gfs2/quota.c
@@ -1441,7 +1441,9 @@ int gfs2_quotad(void *data)
 	DEFINE_WAIT(wait);
 	int empty;
 
-	while (!kthread_should_stop()) {
+	set_freezable();
+
+	while (!kthread_freezable_should_stop(NULL)) {
 
 		/* Update the master statfs file */
 		if (sdp->sd_statfs_force_sync) {
diff --git a/fs/jfs/jfs_logmgr.c b/fs/jfs/jfs_logmgr.c
index 360d27c..eb29b79 100644
--- a/fs/jfs/jfs_logmgr.c
+++ b/fs/jfs/jfs_logmgr.c
@@ -2342,6 +2342,8 @@ int jfsIOWait(void *arg)
 {
 	struct lbuf *bp;
 
+	set_freezable();
+
 	do {
 		spin_lock_irq(&log_redrive_lock);
 		while ((bp = log_redrive_list)) {
@@ -2361,7 +2363,7 @@ int jfsIOWait(void *arg)
 			schedule();
 			__set_current_state(TASK_RUNNING);
 		}
-	} while (!kthread_should_stop());
+	} while (!kthread_freezable_should_stop(NULL));
 
 	jfs_info("jfsIOWait being killed!");
 	return 0;
diff --git a/fs/jfs/jfs_txnmgr.c b/fs/jfs/jfs_txnmgr.c
index 564c4f2..0a622bc 100644
--- a/fs/jfs/jfs_txnmgr.c
+++ b/fs/jfs/jfs_txnmgr.c
@@ -2752,6 +2752,8 @@ int jfs_lazycommit(void *arg)
 	unsigned long flags;
 	struct jfs_sb_info *sbi;
 
+	set_freezable();
+
 	do {
 		LAZY_LOCK(flags);
 		jfs_commit_thread_waking = 0;	/* OK to wake another thread */
@@ -2811,7 +2813,7 @@ int jfs_lazycommit(void *arg)
 			__set_current_state(TASK_RUNNING);
 			remove_wait_queue(&jfs_commit_thread_wait, &wq);
 		}
-	} while (!kthread_should_stop());
+	} while (!kthread_freezable_should_stop(NULL));
 
 	if (!list_empty(&TxAnchor.unlock_queue))
 		jfs_err("jfs_lazycommit being killed w/pending transactions!");
@@ -2936,6 +2938,8 @@ int jfs_sync(void *arg)
 	struct jfs_inode_info *jfs_ip;
 	tid_t tid;
 
+	set_freezable();
+
 	do {
 		/*
 		 * write each inode on the anonymous inode list
@@ -2998,7 +3002,7 @@ int jfs_sync(void *arg)
 			schedule();
 			__set_current_state(TASK_RUNNING);
 		}
-	} while (!kthread_should_stop());
+	} while (!kthread_freezable_should_stop(NULL));
 
 	jfs_info("jfs_sync being killed");
 	return 0;
diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index ab798a8..e066a39 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -779,7 +779,6 @@ nlmsvc_grant_blocked(struct nlm_block *block)
 	struct nlm_file		*file = block->b_file;
 	struct nlm_lock		*lock = &block->b_call->a_args.lock;
 	int			error;
-	loff_t			fl_start, fl_end;
 
 	dprintk("lockd: grant blocked lock %p\n", block);
 
@@ -797,16 +796,9 @@ nlmsvc_grant_blocked(struct nlm_block *block)
 	}
 
 	/* Try the lock operation again */
-	/* vfs_lock_file() can mangle fl_start and fl_end, but we need
-	 * them unchanged for the GRANT_MSG
-	 */
 	lock->fl.fl_flags |= FL_SLEEP;
-	fl_start = lock->fl.fl_start;
-	fl_end = lock->fl.fl_end;
 	error = vfs_lock_file(file->f_file, F_SETLK, &lock->fl, NULL);
 	lock->fl.fl_flags &= ~FL_SLEEP;
-	lock->fl.fl_start = fl_start;
-	lock->fl.fl_end = fl_end;
 
 	switch (error) {
 	case 0:
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index c442a74..812154a 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1837,11 +1837,6 @@ int nfs_symlink(struct inode *dir, struct dentry *dentry, const char *symname)
 							GFP_KERNEL)) {
 		SetPageUptodate(page);
 		unlock_page(page);
-		/*
-		 * add_to_page_cache_lru() grabs an extra page refcount.
-		 * Drop it here to avoid leaking this page later.
-		 */
-		page_cache_release(page);
 	} else
 		__free_page(page);
 
diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index a1a1916..7bb1322 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -2449,6 +2449,8 @@ static int nilfs_segctor_thread(void *arg)
 	struct the_nilfs *nilfs = sci->sc_super->s_fs_info;
 	int timeout = 0;
 
+	set_freezable();
+
 	sci->sc_timer.data = (unsigned long)current;
 	sci->sc_timer.function = nilfs_construction_timeout;
 
diff --git a/fs/super.c b/fs/super.c
index e5f6c2c..23aaaf5 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -38,6 +38,8 @@
 
 
 LIST_HEAD(super_blocks);
+EXPORT_SYMBOL_GPL(super_blocks);
+
 DEFINE_SPINLOCK(sb_lock);
 
 static char *sb_writers_name[SB_FREEZE_LEVELS] = {
diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index a728735..a5d9536 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -498,9 +498,10 @@ xfsaild(
 	struct xfs_ail	*ailp = data;
 	long		tout = 0;	/* milliseconds */
 
+	set_freezable();
 	current->flags |= PF_MEMALLOC;
 
-	while (!kthread_should_stop()) {
+	while (!kthread_freezable_should_stop(NULL)) {
 		if (tout && tout <= 20)
 			__set_current_state(TASK_KILLABLE);
 		else
@@ -522,6 +523,7 @@ xfsaild(
 		    ailp->xa_target == ailp->xa_target_prev) {
 			spin_unlock(&ailp->xa_lock);
 			schedule();
+			try_to_freeze();
 			tout = 0;
 			continue;
 		}
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 060ff69..61d4167 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -32,6 +32,8 @@
 /* struct bio, bio_vec and BIO_* flags are defined in blk_types.h */
 #include <linux/blk_types.h>
 
+extern int trap_non_toi_io;
+
 #define BIO_DEBUG
 
 #ifdef BIO_DEBUG
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 238ef0e..af32c77 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -112,13 +112,14 @@ struct bio {
 #define BIO_QUIET	10	/* Make BIO Quiet */
 #define BIO_MAPPED_INTEGRITY 11/* integrity metadata has been remapped */
 #define BIO_SNAP_STABLE	12	/* bio data must be snapshotted during write */
+#define BIO_TOI		13	/* bio was submitted by TuxOnIce */
 
 /*
  * Flags starting here get preserved by bio_reset() - this includes
  * BIO_POOL_IDX()
  */
-#define BIO_RESET_BITS	13
-#define BIO_OWNS_VEC	13	/* bio_free() should free bvec */
+#define BIO_RESET_BITS	14
+#define BIO_OWNS_VEC	14	/* bio_free() should free bvec */
 
 #define bio_flagged(bio, flag)	((bio)->bi_flags & (1 << (flag)))
 
diff --git a/include/linux/compiler-gcc4.h b/include/linux/compiler-gcc4.h
index 2507fd2..ded4299 100644
--- a/include/linux/compiler-gcc4.h
+++ b/include/linux/compiler-gcc4.h
@@ -75,7 +75,11 @@
  *
  * (asm goto is automatically volatile - the naming reflects this.)
  */
-#define asm_volatile_goto(x...)	do { asm goto(x); asm (""); } while (0)
+#if GCC_VERSION <= 40801
+# define asm_volatile_goto(x...)	do { asm goto(x); asm (""); } while (0)
+#else
+# define asm_volatile_goto(x...)	do { asm goto(x); } while (0)
+#endif
 
 #ifdef CONFIG_ARCH_USE_BUILTIN_BSWAP
 #if GCC_VERSION >= 40400
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 121f11f..daa2752 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1639,6 +1639,8 @@ struct super_operations {
 #define S_IMA		1024	/* Inode has an associated IMA struct */
 #define S_AUTOMOUNT	2048	/* Automount/referral quasi-directory */
 #define S_NOSEC		4096	/* no suid or xattr security attributes */
+#define S_ATOMIC_COPY	8192	/* Pages mapped with this inode need to be
+				   atomically copied (gem) */
 
 /*
  * Note that nosuid etc flags are inode-specific: setting some file-system
@@ -2124,6 +2126,13 @@ extern struct super_block *freeze_bdev(struct block_device *);
 extern void emergency_thaw_all(void);
 extern int thaw_bdev(struct block_device *bdev, struct super_block *sb);
 extern int fsync_bdev(struct block_device *);
+extern int fsync_super(struct super_block *);
+extern int fsync_no_super(struct block_device *);
+#define FS_FREEZER_FUSE 1
+#define FS_FREEZER_NORMAL 2
+#define FS_FREEZER_ALL (FS_FREEZER_FUSE | FS_FREEZER_NORMAL)
+void freeze_filesystems(int which);
+void thaw_filesystems(int which);
 extern int sb_is_blkdev_sb(struct super_block *sb);
 #else
 static inline void bd_forget(struct inode *inode) {}
diff --git a/include/linux/fs_uuid.h b/include/linux/fs_uuid.h
new file mode 100644
index 0000000..3234135
--- /dev/null
+++ b/include/linux/fs_uuid.h
@@ -0,0 +1,19 @@
+#include <linux/device.h>
+
+struct hd_struct;
+struct block_device;
+
+struct fs_info {
+	char uuid[16];
+	dev_t dev_t;
+	char *last_mount;
+	int last_mount_size;
+};
+
+int part_matches_fs_info(struct hd_struct *part, struct fs_info *seek);
+dev_t blk_lookup_fs_info(struct fs_info *seek);
+struct fs_info *fs_info_from_block_dev(struct block_device *bdev);
+void free_fs_info(struct fs_info *fs_info);
+int bdev_matches_key(struct block_device *bdev, const char *key);
+struct block_device *next_bdev_of_type(struct block_device *last,
+	const char *key);
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 344883d..15da677 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -875,7 +875,7 @@ struct vmbus_channel_relid_released {
 struct vmbus_channel_initiate_contact {
 	struct vmbus_channel_message_header header;
 	u32 vmbus_version_requested;
-	u32 target_vcpu; /* The VCPU the host should respond to */
+	u32 padding2;
 	u64 interrupt_page;
 	u64 monitor_page1;
 	u64 monitor_page2;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 9fac6dd..b327e91 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1931,6 +1931,7 @@ int drop_caches_sysctl_handler(struct ctl_table *, int,
 unsigned long shrink_slab(struct shrink_control *shrink,
 			  unsigned long nr_pages_scanned,
 			  unsigned long lru_pages);
+void drop_pagecache(void);
 
 #ifndef CONFIG_MMU
 #define randomize_va_space 0
diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 9d55438..5dd737b 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -46,9 +46,10 @@ static inline struct shmem_inode_info *SHMEM_I(struct inode *inode)
 extern int shmem_init(void);
 extern int shmem_fill_super(struct super_block *sb, void *data, int silent);
 extern struct file *shmem_file_setup(const char *name,
-					loff_t size, unsigned long flags);
+					loff_t size, unsigned long flags,
+					int atomic_copy);
 extern struct file *shmem_kernel_file_setup(const char *name, loff_t size,
-					    unsigned long flags);
+					    unsigned long flags, int atomic_copy);
 extern int shmem_zero_setup(struct vm_area_struct *);
 extern int shmem_lock(struct file *file, int lock, struct user_struct *user);
 extern void shmem_unlock_mapping(struct address_space *mapping);
diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index f73cabf..d25dc56 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -419,6 +419,73 @@ extern bool pm_print_times_enabled;
 #define pm_print_times_enabled	(false)
 #endif
 
+enum {
+	TOI_CAN_HIBERNATE,
+	TOI_CAN_RESUME,
+	TOI_RESUME_DEVICE_OK,
+	TOI_NORESUME_SPECIFIED,
+	TOI_SANITY_CHECK_PROMPT,
+	TOI_CONTINUE_REQ,
+	TOI_RESUMED_BEFORE,
+	TOI_BOOT_TIME,
+	TOI_NOW_RESUMING,
+	TOI_IGNORE_LOGLEVEL,
+	TOI_TRYING_TO_RESUME,
+	TOI_LOADING_ALT_IMAGE,
+	TOI_STOP_RESUME,
+	TOI_IO_STOPPED,
+	TOI_NOTIFIERS_PREPARE,
+	TOI_CLUSTER_MODE,
+	TOI_BOOT_KERNEL,
+};
+
+#ifdef CONFIG_TOI
+
+/* Used in init dir files */
+extern unsigned long toi_state;
+#define set_toi_state(bit) (set_bit(bit, &toi_state))
+#define clear_toi_state(bit) (clear_bit(bit, &toi_state))
+#define test_toi_state(bit) (test_bit(bit, &toi_state))
+extern int toi_running;
+
+#define test_action_state(bit) (test_bit(bit, &toi_bkd.toi_action))
+extern int try_tuxonice_hibernate(void);
+
+#else /* !CONFIG_TOI */
+
+#define toi_state		(0)
+#define set_toi_state(bit) do { } while (0)
+#define clear_toi_state(bit) do { } while (0)
+#define test_toi_state(bit) (0)
+#define toi_running (0)
+
+static inline int try_tuxonice_hibernate(void) { return 0; }
+#define test_action_state(bit) (0)
+
+#endif /* CONFIG_TOI */
+
+#ifdef CONFIG_HIBERNATION
+#ifdef CONFIG_TOI
+extern void try_tuxonice_resume(void);
+#else
+#define try_tuxonice_resume() do { } while (0)
+#endif
+
+extern int resume_attempted;
+extern int software_resume(void);
+
+static inline void check_resume_attempted(void)
+{
+	if (resume_attempted)
+		return;
+
+	software_resume();
+}
+#else
+#define check_resume_attempted() do { } while (0)
+#define resume_attempted (0)
+#endif
+
 #ifdef CONFIG_PM_AUTOSLEEP
 
 /* kernel/power/autosleep.c */
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 46ba0c6..0aa55ad 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -265,6 +265,7 @@ extern unsigned long totalram_pages;
 extern unsigned long totalreserve_pages;
 extern unsigned long dirty_balance_reserve;
 extern unsigned long nr_free_buffer_pages(void);
+extern unsigned long nr_unallocated_buffer_pages(void);
 extern unsigned long nr_free_pagecache_pages(void);
 
 /* Definition of global_page_state not available yet */
@@ -314,6 +315,8 @@ extern unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *mem,
 						struct zone *zone,
 						unsigned long *nr_scanned);
 extern unsigned long shrink_all_memory(unsigned long nr_pages);
+extern unsigned long shrink_memory_mask(unsigned long nr_to_reclaim,
+		gfp_t mask);
 extern int vm_swappiness;
 extern int remove_mapping(struct address_space *mapping, struct page *page);
 extern unsigned long vm_total_pages;
@@ -427,13 +430,17 @@ extern void swapcache_free(swp_entry_t, struct page *page);
 extern int free_swap_and_cache(swp_entry_t);
 extern int swap_type_of(dev_t, sector_t, struct block_device **);
 extern unsigned int count_swap_pages(int, int);
+extern sector_t map_swap_entry(swp_entry_t entry, struct block_device **);
 extern sector_t map_swap_page(struct page *, struct block_device **);
 extern sector_t swapdev_block(int, pgoff_t);
+extern struct swap_info_struct *get_swap_info_struct(unsigned);
 extern int page_swapcount(struct page *);
 extern struct swap_info_struct *page_swap_info(struct page *);
 extern int reuse_swap_page(struct page *);
 extern int try_to_free_swap(struct page *);
 struct backing_dev_info;
+extern void get_swap_range_of_type(int type, swp_entry_t *start,
+		swp_entry_t *end, unsigned int limit);
 
 #ifdef CONFIG_MEMCG
 extern void
diff --git a/include/linux/usb.h b/include/linux/usb.h
index 7454865..512ab16 100644
--- a/include/linux/usb.h
+++ b/include/linux/usb.h
@@ -1264,6 +1264,8 @@ typedef void (*usb_complete_t)(struct urb *);
  * @sg: scatter gather buffer list, the buffer size of each element in
  * 	the list (except the last) must be divisible by the endpoint's
  * 	max packet size if no_sg_constraint isn't set in 'struct usb_bus'
+ * 	(FIXME: scatter-gather under xHCI is broken for periodic transfers.
+ * 	Do not use urb->sg for interrupt endpoints for now, only bulk.)
  * @num_mapped_sgs: (internal) number of mapped sg entries
  * @num_sgs: number of entries in the sg list
  * @transfer_buffer_length: How big is transfer_buffer.  The transfer may
diff --git a/include/uapi/linux/mic_ioctl.h b/include/uapi/linux/mic_ioctl.h
index feb0b4c..7fabba5 100644
--- a/include/uapi/linux/mic_ioctl.h
+++ b/include/uapi/linux/mic_ioctl.h
@@ -39,7 +39,7 @@ struct mic_copy_desc {
 #else
 	struct iovec *iov;
 #endif
-	__u32 iovcnt;
+	int iovcnt;
 	__u8 vr_idx;
 	__u8 update_used;
 	__u32 out_len;
diff --git a/include/uapi/linux/netlink.h b/include/uapi/linux/netlink.h
index 1a85940..3025606 100644
--- a/include/uapi/linux/netlink.h
+++ b/include/uapi/linux/netlink.h
@@ -27,6 +27,8 @@
 #define NETLINK_ECRYPTFS	19
 #define NETLINK_RDMA		20
 #define NETLINK_CRYPTO		21	/* Crypto layer */
+#define NETLINK_TOI_USERUI	22	/* TuxOnIce's userui */
+#define NETLINK_TOI_USM		23	/* Userspace storage manager */
 
 #define NETLINK_INET_DIAG	NETLINK_SOCK_DIAG
 
diff --git a/init/do_mounts.c b/init/do_mounts.c
index 8e5addc..05200e3 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -285,6 +285,7 @@ fail:
 done:
 	return res;
 }
+EXPORT_SYMBOL_GPL(name_to_dev_t);
 
 static int __init root_dev_setup(char *line)
 {
@@ -586,6 +587,8 @@ void __init prepare_namespace(void)
 	if (is_floppy && rd_doload && rd_load_disk(0))
 		ROOT_DEV = Root_RAM0;
 
+	check_resume_attempted();
+
 	mount_root();
 out:
 	devtmpfs_mount("dev");
diff --git a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c
index 3e0878e..a49c596 100644
--- a/init/do_mounts_initrd.c
+++ b/init/do_mounts_initrd.c
@@ -15,6 +15,7 @@
 #include <linux/romfs_fs.h>
 #include <linux/initrd.h>
 #include <linux/sched.h>
+#include <linux/suspend.h>
 #include <linux/freezer.h>
 #include <linux/kmod.h>
 
@@ -79,6 +80,11 @@ static void __init handle_initrd(void)
 
 	current->flags &= ~PF_FREEZER_SKIP;
 
+	if (!resume_attempted)
+		printk(KERN_ERR "TuxOnIce: No attempt was made to resume from "
+				"any image that might exist.\n");
+	clear_toi_state(TOI_BOOT_TIME);
+
 	/* move initrd to rootfs' /old */
 	sys_mount("..", ".", NULL, MS_MOVE, NULL);
 	/* switch root and cwd back to / of rootfs */
diff --git a/init/main.c b/init/main.c
index febc511..303b454 100644
--- a/init/main.c
+++ b/init/main.c
@@ -129,6 +129,7 @@ void (*__initdata late_time_init)(void);
 char __initdata boot_command_line[COMMAND_LINE_SIZE];
 /* Untouched saved command line (eg. for /proc) */
 char *saved_command_line;
+EXPORT_SYMBOL_GPL(saved_command_line);
 /* Command line for parameter parsing */
 static char *static_command_line;
 /* Command line for per-initcall parameter parsing */
diff --git a/ipc/shm.c b/ipc/shm.c
index 7a51443..712fa09 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -538,7 +538,7 @@ static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
 		if  ((shmflg & SHM_NORESERVE) &&
 				sysctl_overcommit_memory != OVERCOMMIT_NEVER)
 			acctflag = VM_NORESERVE;
-		file = shmem_file_setup(name, size, acctflag);
+		file = shmem_file_setup(name, size, acctflag, 0);
 	}
 	error = PTR_ERR(file);
 	if (IS_ERR(file))
diff --git a/kernel/cpu.c b/kernel/cpu.c
index deff2e6..dab697e 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -508,6 +508,7 @@ int disable_nonboot_cpus(void)
 	cpu_maps_update_done();
 	return error;
 }
+EXPORT_SYMBOL_GPL(disable_nonboot_cpus);
 
 void __weak arch_enable_nonboot_cpus_begin(void)
 {
@@ -546,6 +547,7 @@ void __ref enable_nonboot_cpus(void)
 out:
 	cpu_maps_update_done();
 }
+EXPORT_SYMBOL_GPL(enable_nonboot_cpus);
 
 static int __init alloc_frozen_cpus(void)
 {
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 8ab8e93..192a302 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -274,7 +274,6 @@ struct irq_desc *irq_to_desc(unsigned int irq)
 {
 	return (irq < NR_IRQS) ? irq_desc + irq : NULL;
 }
-EXPORT_SYMBOL(irq_to_desc);
 
 static void free_desc(unsigned int irq)
 {
diff --git a/kernel/kmod.c b/kernel/kmod.c
index b086006..ef2b0cd 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -461,6 +461,7 @@ void __usermodehelper_set_disable_depth(enum umh_disable_depth depth)
 	wake_up(&usermodehelper_disabled_waitq);
 	up_write(&umhelper_sem);
 }
+EXPORT_SYMBOL_GPL(__usermodehelper_set_disable_depth);
 
 /**
  * __usermodehelper_disable - Prevent new helpers from being started.
@@ -494,6 +495,7 @@ int __usermodehelper_disable(enum umh_disable_depth depth)
 	__usermodehelper_set_disable_depth(UMH_ENABLED);
 	return -EAGAIN;
 }
+EXPORT_SYMBOL_GPL(__usermodehelper_disable);
 
 static void helper_lock(void)
 {
diff --git a/kernel/kthread.c b/kernel/kthread.c
index b5ae3ee..75c0da8 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -550,6 +550,8 @@ int kthread_worker_fn(void *worker_ptr)
 
 	WARN_ON(worker->task);
 	worker->task = current;
+	set_freezable();
+
 repeat:
 	set_current_state(TASK_INTERRUPTIBLE);	/* mb paired w/ kthread_stop */
 
diff --git a/kernel/pid.c b/kernel/pid.c
index 9b9a266..ad91ea4 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -450,6 +450,7 @@ struct task_struct *find_task_by_pid_ns(pid_t nr, struct pid_namespace *ns)
 			   " protection");
 	return pid_task(find_pid_ns(nr, ns), PIDTYPE_PID);
 }
+EXPORT_SYMBOL_GPL(find_task_by_pid_ns);
 
 struct task_struct *find_task_by_vpid(pid_t vnr)
 {
diff --git a/kernel/power/Kconfig b/kernel/power/Kconfig
index 2fac9cc..b5a23c4 100644
--- a/kernel/power/Kconfig
+++ b/kernel/power/Kconfig
@@ -91,6 +91,286 @@ config PM_STD_PARTITION
 	  suspended image to. It will simply pick the first available swap 
 	  device.
 
+menuconfig TOI_CORE
+	tristate "Enhanced Hibernation (TuxOnIce)"
+	depends on HIBERNATION
+	default y
+	---help---
+	  TuxOnIce is the 'new and improved' suspend support.
+
+	  See the TuxOnIce home page (tuxonice.net)
+	  for FAQs, HOWTOs and other documentation.
+
+	comment "Image Storage (you need at least one allocator)"
+		depends on TOI_CORE
+
+	config TOI_FILE
+		tristate "File Allocator"
+		depends on TOI_CORE
+		default y
+		---help---
+		  This option enables support for storing an image in a
+		  simple file. You might want this if your swap is
+		  sometimes full enough that you don't have enough spare
+		  space to store an image.
+
+	config TOI_SWAP
+		tristate "Swap Allocator"
+		depends on TOI_CORE && SWAP
+		default y
+		---help---
+		  This option enables support for storing an image in your
+		  swap space.
+
+	comment "General Options"
+		depends on TOI_CORE
+
+	config TOI_INCREMENTAL
+		tristate "Incremental Image Support"
+		depends on TOI_CORE && CRYPTO && BROKEN
+		select CRYPTO_SHA1
+		default y
+		---help---
+		  This option adds initial support for using hashing algorithms
+		  (a quick, internal implementation of Fletcher16 and SHA1 via
+		  cryptoapi) to discover the number of pages which are
+		  unchanged since the image was last written. It is hoped that
+		  this will be an initial step toward implementing storing just
+		  the differences between consecutive images, which will
+		  increase the amount of storage needed for the image, but also
+		  increase the speed at which writing an image occurs and
+		  reduce the wear and tear on drives.
+
+	comment "No incremental image support available without Cryptoapi support."
+		depends on TOI_CORE && !CRYPTO
+
+	config TOI_PRUNE
+		tristate "Image pruning support"
+		depends on TOI_CORE && CRYPTO && BROKEN
+		default y
+		---help---
+		  This option adds support for using cryptoapi hashing
+		  algorithms to identify pages with the same content. We
+		  then write a much smaller pointer to the first copy of
+		  the data instead of a complete (perhaps compressed)
+		  additional copy.
+
+		  You probably want this, so say Y here.
+
+	comment "No image pruning support available without Cryptoapi support."
+		depends on TOI_CORE && !CRYPTO
+
+	config TOI_CRYPTO
+		tristate "Compression support"
+		depends on TOI_CORE && CRYPTO
+		default y
+		---help---
+		  This option adds support for using cryptoapi compression
+		  algorithms. Compression is particularly useful as it can
+		  more than double your suspend and resume speed (depending
+		  upon how well your image compresses).
+
+		  You probably want this, so say Y here.
+
+	comment "No compression support available without Cryptoapi support."
+		depends on TOI_CORE && !CRYPTO
+
+	config TOI_USERUI
+		tristate "Userspace User Interface support"
+		depends on TOI_CORE && NET && (VT || SERIAL_CONSOLE)
+		default y
+		---help---
+		  This option enables support for a userspace based user interface
+		  to TuxOnIce, which allows you to have a nice display while suspending
+		  and resuming, and also enables features such as pressing escape to
+		  cancel a cycle or interactive debugging.
+
+	config TOI_USERUI_DEFAULT_PATH
+		string "Default userui program location"
+		default "/usr/local/sbin/tuxoniceui_text"
+		depends on TOI_USERUI
+		---help---
+		  This entry allows you to specify a default path to the userui binary.
+
+	config TOI_DEFAULT_IMAGE_SIZE_LIMIT
+		int "Default image size limit"
+		range -2 65536
+		default "-2"
+		depends on TOI_CORE
+		---help---
+		  This entry allows you to specify a default image size limit. It can
+		  be overridden at run-time using /sys/power/tuxonice/image_size_limit.
+
+	config TOI_KEEP_IMAGE
+		bool "Allow Keep Image Mode"
+		depends on TOI_CORE
+		---help---
+		  This option allows you to keep an image and reuse it. It is intended
+		  __ONLY__ for use with systems where all filesystems are mounted read-
+		  only (kiosks, for example). To use it, compile this option in and boot
+		  normally. Set the KEEP_IMAGE flag in /sys/power/tuxonice and suspend.
+		  When you resume, the image will not be removed. You will be unable to turn
+		  off swap partitions (assuming you are using the swap allocator), but future
+		  suspends simply do a power-down. The image can be updated using the
+		  kernel command line parameter suspend_act= to turn off the keep image
+		  bit. Keep image mode is a little less user friendly on purpose - it
+		  should not be used without thought!
+
+	config TOI_REPLACE_SWSUSP
+		bool "Replace swsusp by default"
+		default y
+		depends on TOI_CORE
+		---help---
+		  TuxOnIce can replace swsusp. This option makes that the default state,
+		  requiring you to echo 0 > /sys/power/tuxonice/replace_swsusp if you want
+		  to use the vanilla kernel functionality. Note that your initrd/ramfs will
+		  need to do this before trying to resume, too.
+		  With overriding swsusp enabled, echoing disk to /sys/power/state will
+		  start a TuxOnIce cycle. If resume= doesn't specify an allocator and both
+		  the swap and file allocators are compiled in, the swap allocator will be
+		  used by default.
+
+	config TOI_IGNORE_LATE_INITCALL
+		bool "Wait for initrd/ramfs to run, by default"
+		default n
+		depends on TOI_CORE
+		---help---
+		  When booting, TuxOnIce can check for an image and start to resume prior
+		  to any initrd/ramfs running (via a late initcall).
+
+		  If you don't have an initrd/ramfs, this is what you want to happen -
+		  otherwise you won't be able to safely resume. You should set this option
+		  to 'No'.
+
+		  If, however, you want your initrd/ramfs to run anyway before resuming,
+		  you need to tell TuxOnIce to ignore that earlier opportunity to resume.
+		  This can be done either by using this compile time option, or by
+		  overriding this option with the boot-time parameter toi_initramfs_resume_only=1.
+
+		  Note that if TuxOnIce can't resume at the earlier opportunity, the
+		  value of this option won't matter - the initramfs/initrd (if any) will
+		  run anyway.
+
+	menuconfig TOI_CLUSTER
+		tristate "Cluster support"
+		default n
+		depends on TOI_CORE && NET && BROKEN
+		---help---
+		  Support for linking multiple machines in a cluster so that they suspend
+		  and resume together.
+
+	config TOI_DEFAULT_CLUSTER_INTERFACE
+		string "Default cluster interface"
+		depends on TOI_CLUSTER
+		---help---
+		  The default interface on which to communicate with other nodes in
+		  the cluster.
+
+		  If no value is set here, cluster support will be disabled by default.
+
+	config TOI_DEFAULT_CLUSTER_KEY
+		string "Default cluster key"
+		default "Default"
+		depends on TOI_CLUSTER
+		---help---
+		  The default key used by this node. All nodes in the same cluster
+		  have the same key. Multiple clusters may coexist on the same LAN
+		  by using different values for this key.
+
+	config TOI_CLUSTER_IMAGE_TIMEOUT
+		int "Timeout when checking for image"
+		default 15
+		depends on TOI_CLUSTER
+		---help---
+		  Timeout (seconds) before continuing to boot when waiting to see
+		  whether other nodes might have an image. Set to -1 to wait
+		  indefinitely. If WAIT_UNTIL_NODES is non-zero, we might continue
+		  booting sooner than this timeout.
+
+	config TOI_CLUSTER_WAIT_UNTIL_NODES
+		int "Nodes without image before continuing"
+		default 0
+		depends on TOI_CLUSTER
+		---help---
+		  When booting and no image is found, we wait to see if other nodes
+		  have an image before continuing to boot. This value lets us
+		  continue after seeing a certain number of nodes without an image,
+		  instead of continuing to wait for the timeout. Set to 0 to only
+		  use the timeout.
+
+	config TOI_DEFAULT_CLUSTER_PRE_HIBERNATE
+		string "Default pre-hibernate script"
+		depends on TOI_CLUSTER
+		---help---
+		  The default script to be called when starting to hibernate.
+
+	config TOI_DEFAULT_CLUSTER_POST_HIBERNATE
+		string "Default post-hibernate script"
+		depends on TOI_CLUSTER
+		---help---
+		  The default script to be called after resuming from hibernation.
+
+	config TOI_DEFAULT_WAIT
+		int "Default waiting time for emergency boot messages"
+		default "25"
+		range -1 32768
+		depends on TOI_CORE
+		help
+		  TuxOnIce can display warnings very early in the process of resuming,
+		  if (for example) it appears that you have booted a kernel that doesn't
+		  match an image on disk. It can then give you the opportunity to either
+		  continue booting that kernel, or reboot the machine. This option can be
+		  used to control how long to wait in such circumstances. -1 means wait
+		  forever. 0 means don't wait at all (do the default action, which will
+		  generally be to continue booting and remove the image). Values of 1 or
+		  more indicate a number of seconds (up to 255) to wait before doing the
+		  default.
+
+	config TOI_DEFAULT_EXTRA_PAGES_ALLOWANCE
+		int "Default extra pages allowance"
+		default "2000"
+		range 500 32768
+		depends on TOI_CORE
+		help
+		  This value controls the default for the allowance TuxOnIce makes for
+		  drivers to allocate extra memory during the atomic copy. The default
+		  value of 2000 will be okay in most cases. If you are using
+		  DRI, the easiest way to find what value to use is to try to hibernate
+		  and look at how many pages were actually needed in the sysfs entry
+		  /sys/power/tuxonice/debug_info (first number on the last line), adding
+		  a little extra because the value is not always the same.
+
+	config TOI_CHECKSUM
+		bool "Checksum pageset2"
+		default n
+		depends on TOI_CORE
+		select CRYPTO
+		select CRYPTO_ALGAPI
+		select CRYPTO_MD4
+		---help---
+		  Adds support for checksumming pageset2 pages, to ensure you really get an
+		  atomic copy. Since some filesystems (XFS especially) change metadata even
+		  when there's no other activity, we need this to check for pages that have
+		  been changed while we were saving the page cache. If your debugging output
+		  always says no pages were resaved, you may be able to safely disable this
+		  option.
+
+config TOI
+	bool
+	depends on TOI_CORE!=n
+	default y
+
+config TOI_EXPORTS
+	bool
+	depends on TOI_SWAP=m || TOI_FILE=m || \
+		TOI_CRYPTO=m || TOI_CLUSTER=m || \
+		TOI_USERUI=m || TOI_CORE=m
+	default y
+
+config TOI_ZRAM_SUPPORT
+	def_bool y
+	depends on TOI && ZRAM!=n
+
 config PM_SLEEP
 	def_bool y
 	depends on SUSPEND || HIBERNATE_CALLBACKS
diff --git a/kernel/power/Makefile b/kernel/power/Makefile
index 29472bf..dd5d4f2 100644
--- a/kernel/power/Makefile
+++ b/kernel/power/Makefile
@@ -1,6 +1,37 @@
 
 ccflags-$(CONFIG_PM_DEBUG)	:= -DDEBUG
 
+tuxonice_core-y := tuxonice_modules.o
+
+obj-$(CONFIG_TOI)		+= tuxonice_builtin.o
+
+tuxonice_core-$(CONFIG_PM_DEBUG)	+= tuxonice_alloc.o
+
+# Compile these in after allocation debugging, if used.
+
+tuxonice_core-y += tuxonice_sysfs.o tuxonice_highlevel.o \
+		tuxonice_io.o tuxonice_pagedir.o tuxonice_prepare_image.o \
+		tuxonice_extent.o tuxonice_pageflags.o tuxonice_ui.o \
+		tuxonice_power_off.o tuxonice_atomic_copy.o
+
+tuxonice_core-$(CONFIG_TOI_CHECKSUM)	+= tuxonice_checksum.o
+
+tuxonice_core-$(CONFIG_NET)	+= tuxonice_storage.o tuxonice_netlink.o
+
+obj-$(CONFIG_TOI_CORE)		+= tuxonice_core.o
+obj-$(CONFIG_TOI_PRUNE)		+= tuxonice_prune.o
+obj-$(CONFIG_TOI_INCREMENTAL)	+= tuxonice_incremental.o
+obj-$(CONFIG_TOI_CRYPTO)	+= tuxonice_compress.o
+
+tuxonice_bio-y := tuxonice_bio_core.o tuxonice_bio_chains.o \
+		tuxonice_bio_signature.o
+
+obj-$(CONFIG_TOI_SWAP)		+= tuxonice_bio.o tuxonice_swap.o
+obj-$(CONFIG_TOI_FILE)		+= tuxonice_bio.o tuxonice_file.o
+obj-$(CONFIG_TOI_CLUSTER)	+= tuxonice_cluster.o
+
+obj-$(CONFIG_TOI_USERUI)	+= tuxonice_userui.o
+
 obj-y				+= qos.o
 obj-$(CONFIG_PM)		+= main.o
 obj-$(CONFIG_VT_CONSOLE_SLEEP)	+= console.o
diff --git a/kernel/power/console.c b/kernel/power/console.c
index eacb8bd..867823a 100644
--- a/kernel/power/console.c
+++ b/kernel/power/console.c
@@ -137,6 +137,7 @@ int pm_prepare_console(void)
 	orig_kmsg = vt_kmsg_redirect(SUSPEND_CONSOLE);
 	return 0;
 }
+EXPORT_SYMBOL_GPL(pm_prepare_console);
 
 void pm_restore_console(void)
 {
@@ -148,3 +149,4 @@ void pm_restore_console(void)
 		vt_kmsg_redirect(orig_kmsg);
 	}
 }
+EXPORT_SYMBOL_GPL(pm_restore_console);
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index 0121dab..38d33de 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -29,14 +29,15 @@
 #include <linux/ctype.h>
 #include <linux/genhd.h>
 
-#include "power.h"
+#include "tuxonice.h"
 
 
 static int nocompress;
 static int noresume;
 static int resume_wait;
 static int resume_delay;
-static char resume_file[256] = CONFIG_PM_STD_PARTITION;
+char resume_file[256] = CONFIG_PM_STD_PARTITION;
+EXPORT_SYMBOL_GPL(resume_file);
 dev_t swsusp_resume_device;
 sector_t swsusp_resume_block;
 __visible int in_suspend __nosavedata;
@@ -114,21 +115,23 @@ static int hibernation_test(int level) { return 0; }
  * platform_begin - Call platform to start hibernation.
  * @platform_mode: Whether or not to use the platform driver.
  */
-static int platform_begin(int platform_mode)
+int platform_begin(int platform_mode)
 {
 	return (platform_mode && hibernation_ops) ?
 		hibernation_ops->begin() : 0;
 }
+EXPORT_SYMBOL_GPL(platform_begin);
 
 /**
  * platform_end - Call platform to finish transition to the working state.
  * @platform_mode: Whether or not to use the platform driver.
  */
-static void platform_end(int platform_mode)
+void platform_end(int platform_mode)
 {
 	if (platform_mode && hibernation_ops)
 		hibernation_ops->end();
 }
+EXPORT_SYMBOL_GPL(platform_end);
 
 /**
  * platform_pre_snapshot - Call platform to prepare the machine for hibernation.
@@ -138,11 +141,12 @@ static void platform_end(int platform_mode)
  * if so configured, and return an error code if that fails.
  */
 
-static int platform_pre_snapshot(int platform_mode)
+int platform_pre_snapshot(int platform_mode)
 {
 	return (platform_mode && hibernation_ops) ?
 		hibernation_ops->pre_snapshot() : 0;
 }
+EXPORT_SYMBOL_GPL(platform_pre_snapshot);
 
 /**
  * platform_leave - Call platform to prepare a transition to the working state.
@@ -153,11 +157,12 @@ static int platform_pre_snapshot(int platform_mode)
  *
  * This routine is called on one CPU with interrupts disabled.
  */
-static void platform_leave(int platform_mode)
+void platform_leave(int platform_mode)
 {
 	if (platform_mode && hibernation_ops)
 		hibernation_ops->leave();
 }
+EXPORT_SYMBOL_GPL(platform_leave);
 
 /**
  * platform_finish - Call platform to switch the system to the working state.
@@ -168,11 +173,12 @@ static void platform_leave(int platform_mode)
  *
  * This routine must be called after platform_prepare().
  */
-static void platform_finish(int platform_mode)
+void platform_finish(int platform_mode)
 {
 	if (platform_mode && hibernation_ops)
 		hibernation_ops->finish();
 }
+EXPORT_SYMBOL_GPL(platform_finish);
 
 /**
  * platform_pre_restore - Prepare for hibernate image restoration.
@@ -184,11 +190,12 @@ static void platform_finish(int platform_mode)
  * If the restore fails after this function has been called,
  * platform_restore_cleanup() must be called.
  */
-static int platform_pre_restore(int platform_mode)
+int platform_pre_restore(int platform_mode)
 {
 	return (platform_mode && hibernation_ops) ?
 		hibernation_ops->pre_restore() : 0;
 }
+EXPORT_SYMBOL_GPL(platform_pre_restore);
 
 /**
  * platform_restore_cleanup - Switch to the working state after failing restore.
@@ -201,21 +208,23 @@ static int platform_pre_restore(int platform_mode)
  * function must be called too, regardless of the result of
  * platform_pre_restore().
  */
-static void platform_restore_cleanup(int platform_mode)
+void platform_restore_cleanup(int platform_mode)
 {
 	if (platform_mode && hibernation_ops)
 		hibernation_ops->restore_cleanup();
 }
+EXPORT_SYMBOL_GPL(platform_restore_cleanup);
 
 /**
  * platform_recover - Recover from a failure to suspend devices.
  * @platform_mode: Whether or not to use the platform driver.
  */
-static void platform_recover(int platform_mode)
+void platform_recover(int platform_mode)
 {
 	if (platform_mode && hibernation_ops && hibernation_ops->recover)
 		hibernation_ops->recover();
 }
+EXPORT_SYMBOL_GPL(platform_recover);
 
 /**
  * swsusp_show_speed - Print time elapsed between two events during hibernation.
@@ -573,6 +582,7 @@ int hibernation_platform_enter(void)
 
 	return error;
 }
+EXPORT_SYMBOL_GPL(hibernation_platform_enter);
 
 /**
  * power_down - Shut the machine down for hibernation.
@@ -632,6 +642,9 @@ int hibernate(void)
 {
 	int error;
 
+	if (test_action_state(TOI_REPLACE_SWSUSP))
+		return try_tuxonice_hibernate();
+
 	lock_system_sleep();
 	/* The snapshot device should not be opened while we're running */
 	if (!atomic_add_unless(&snapshot_device_available, -1, 0)) {
@@ -716,11 +729,19 @@ int hibernate(void)
  * attempts to recover gracefully and make the kernel return to the normal mode
  * of operation.
  */
-static int software_resume(void)
+int software_resume(void)
 {
 	int error;
 	unsigned int flags;
 
+	resume_attempted = 1;
+
+	/*
+	 * We can't know (until an image header - if any - is loaded), whether
+	 * we did override swsusp. We therefore ensure that both are tried.
+	 */
+	try_tuxonice_resume();
+
 	/*
 	 * If the user said "noresume".. bail out early.
 	 */
@@ -1095,6 +1116,7 @@ static int __init hibernate_setup(char *str)
 static int __init noresume_setup(char *str)
 {
 	noresume = 1;
+	set_toi_state(TOI_NORESUME_SPECIFIED);
 	return 1;
 }
 
diff --git a/kernel/power/main.c b/kernel/power/main.c
index 1d1bf63..d16f971 100644
--- a/kernel/power/main.c
+++ b/kernel/power/main.c
@@ -19,12 +19,14 @@
 #include "power.h"
 
 DEFINE_MUTEX(pm_mutex);
+EXPORT_SYMBOL_GPL(pm_mutex);
 
 #ifdef CONFIG_PM_SLEEP
 
 /* Routines for PM-transition notifications */
 
-static BLOCKING_NOTIFIER_HEAD(pm_chain_head);
+BLOCKING_NOTIFIER_HEAD(pm_chain_head);
+EXPORT_SYMBOL_GPL(pm_chain_head);
 
 int register_pm_notifier(struct notifier_block *nb)
 {
@@ -44,6 +46,7 @@ int pm_notifier_call_chain(unsigned long val)
 
 	return notifier_to_errno(ret);
 }
+EXPORT_SYMBOL_GPL(pm_notifier_call_chain);
 
 /* If set, devices may be suspended and resumed asynchronously. */
 int pm_async_enabled = 1;
@@ -277,6 +280,7 @@ static inline void pm_print_times_init(void) {}
 #endif /* CONFIG_PM_SLEEP_DEBUG */
 
 struct kobject *power_kobj;
+EXPORT_SYMBOL_GPL(power_kobj);
 
 /**
  *	state - control system power state.
diff --git a/kernel/power/power.h b/kernel/power/power.h
index 7d4b7ff..98b9660 100644
--- a/kernel/power/power.h
+++ b/kernel/power/power.h
@@ -35,8 +35,12 @@ static inline char *check_image_kernel(struct swsusp_info *info)
 	return arch_hibernation_header_restore(info) ?
 			"architecture specific data" : NULL;
 }
+#else
+extern char *check_image_kernel(struct swsusp_info *info);
 #endif /* CONFIG_ARCH_HIBERNATION_HEADER */
+extern int init_header(struct swsusp_info *info);
 
+extern char resume_file[256];
 /*
  * Keep some memory free so that I/O operations can succeed without paging
  * [Might this be more than 4 MB?]
@@ -55,6 +59,7 @@ extern bool freezer_test_done;
 extern int hibernation_snapshot(int platform_mode);
 extern int hibernation_restore(int platform_mode);
 extern int hibernation_platform_enter(void);
+extern void platform_recover(int platform_mode);
 
 #else /* !CONFIG_HIBERNATION */
 
@@ -74,6 +79,8 @@ static struct kobj_attribute _name##_attr = {	\
 	.store	= _name##_store,		\
 }
 
+extern struct pbe *restore_pblist;
+
 /* Preferred image size in bytes (default 500 MB) */
 extern unsigned long image_size;
 /* Size of memory reserved for drivers (default SPARE_PAGES x PAGE_SIZE) */
@@ -268,6 +275,90 @@ static inline void suspend_thaw_processes(void)
 }
 #endif
 
+extern struct page *saveable_page(struct zone *z, unsigned long p);
+#ifdef CONFIG_HIGHMEM
+extern struct page *saveable_highmem_page(struct zone *z, unsigned long p);
+#else
+static
+inline struct page *saveable_highmem_page(struct zone *z, unsigned long p)
+{
+	return NULL;
+}
+#endif
+
+#define PBES_PER_PAGE (PAGE_SIZE / sizeof(struct pbe))
+extern struct list_head nosave_regions;
+
+/**
+ *	This structure represents a range of page frames the contents of which
+ *	should not be saved during the suspend.
+ */
+
+struct nosave_region {
+	struct list_head list;
+	unsigned long start_pfn;
+	unsigned long end_pfn;
+};
+
+#define BM_END_OF_MAP	(~0UL)
+
+#define BM_BITS_PER_BLOCK	(PAGE_SIZE * BITS_PER_BYTE)
+
+struct bm_block {
+	struct list_head hook;		/* hook into a list of bitmap blocks */
+	unsigned long start_pfn;	/* pfn represented by the first bit */
+	unsigned long end_pfn;	/* pfn represented by the last bit plus 1 */
+	unsigned long *data;	/* bitmap representing pages */
+};
+
+/* struct bm_position is used for browsing memory bitmaps */
+
+struct bm_position {
+	struct bm_block *block;
+	int bit;
+};
+
+struct memory_bitmap {
+	struct list_head blocks;	/* list of bitmap blocks */
+	struct linked_page *p_list;	/* list of pages used to store zone
+					 * bitmap objects and bitmap block
+					 * objects
+					 */
+	struct bm_position *states;	/* most recently used bit position */
+	int num_states;			/* number of iterator states (bit
+					 * positions) this bitmap supports
+					 */
+};
+
+extern int memory_bm_create(struct memory_bitmap *bm, gfp_t gfp_mask,
+		int safe_needed);
+extern int memory_bm_create_index(struct memory_bitmap *bm, gfp_t gfp_mask,
+		int safe_needed, int states);
+extern void memory_bm_free(struct memory_bitmap *bm, int clear_nosave_free);
+extern void memory_bm_set_bit(struct memory_bitmap *bm, unsigned long pfn);
+extern void memory_bm_clear_bit(struct memory_bitmap *bm, unsigned long pfn);
+extern void memory_bm_clear_bit_index(struct memory_bitmap *bm, unsigned long pfn, int index);
+extern int memory_bm_test_bit(struct memory_bitmap *bm, unsigned long pfn);
+extern int memory_bm_test_bit_index(struct memory_bitmap *bm, unsigned long pfn, int index);
+extern unsigned long memory_bm_next_pfn(struct memory_bitmap *bm);
+extern unsigned long memory_bm_next_pfn_index(struct memory_bitmap *bm,
+		int index);
+extern void memory_bm_position_reset(struct memory_bitmap *bm);
+extern void memory_bm_clear(struct memory_bitmap *bm);
+extern void memory_bm_copy(struct memory_bitmap *source,
+		struct memory_bitmap *dest);
+extern void memory_bm_dup(struct memory_bitmap *source,
+		struct memory_bitmap *dest);
+extern int memory_bm_set_iterators(struct memory_bitmap *bm, int number);
+
+#ifdef CONFIG_TOI
+struct toi_module_ops;
+extern int memory_bm_read(struct memory_bitmap *bm, int (*rw_chunk)
+	(int rw, struct toi_module_ops *owner, char *buffer, int buffer_size));
+extern int memory_bm_write(struct memory_bitmap *bm, int (*rw_chunk)
+	(int rw, struct toi_module_ops *owner, char *buffer, int buffer_size));
+#endif
+
 #ifdef CONFIG_PM_AUTOSLEEP
 
 /* kernel/power/autosleep.c */
diff --git a/kernel/power/process.c b/kernel/power/process.c
index 06ec886..4004a83 100644
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -143,6 +143,7 @@ int freeze_processes(void)
 		thaw_processes();
 	return error;
 }
+EXPORT_SYMBOL_GPL(freeze_processes);
 
 /**
  * freeze_kernel_threads - Make freezable kernel threads go to the refrigerator.
@@ -169,6 +170,7 @@ int freeze_kernel_threads(void)
 		thaw_kernel_threads();
 	return error;
 }
+EXPORT_SYMBOL_GPL(freeze_kernel_threads);
 
 void thaw_processes(void)
 {
@@ -202,6 +204,7 @@ void thaw_processes(void)
 	schedule();
 	printk("done.\n");
 }
+EXPORT_SYMBOL_GPL(thaw_processes);
 
 void thaw_kernel_threads(void)
 {
@@ -222,3 +225,4 @@ void thaw_kernel_threads(void)
 	schedule();
 	printk("done.\n");
 }
+EXPORT_SYMBOL_GPL(thaw_kernel_threads);
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index b38109e..0bda9d0 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -35,6 +35,8 @@
 #include <asm/io.h>
 
 #include "power.h"
+#include "tuxonice_builtin.h"
+#include "tuxonice_pagedir.h"
 
 static int swsusp_page_is_free(struct page *);
 static void swsusp_set_page_forbidden(struct page *);
@@ -71,6 +73,10 @@ void __init hibernate_image_size_init(void)
  * directly to their "original" page frames.
  */
 struct pbe *restore_pblist;
+EXPORT_SYMBOL_GPL(restore_pblist);
+
+int resume_attempted;
+EXPORT_SYMBOL_GPL(resume_attempted);
 
 /* Pointer to an auxiliary buffer (1 page) */
 static void *buffer;
@@ -113,6 +119,9 @@ static void *get_image_page(gfp_t gfp_mask, int safe_needed)
 
 unsigned long get_safe_page(gfp_t gfp_mask)
 {
+	if (toi_running)
+		return toi_get_nonconflicting_page();
+
 	return (unsigned long)get_image_page(gfp_mask, PG_SAFE);
 }
 
@@ -249,47 +258,53 @@ static void *chain_alloc(struct chain_allocator *ca, unsigned int size)
  *	the represented memory area.
  */
 
-#define BM_END_OF_MAP	(~0UL)
-
-#define BM_BITS_PER_BLOCK	(PAGE_SIZE * BITS_PER_BYTE)
-
-struct bm_block {
-	struct list_head hook;	/* hook into a list of bitmap blocks */
-	unsigned long start_pfn;	/* pfn represented by the first bit */
-	unsigned long end_pfn;	/* pfn represented by the last bit plus 1 */
-	unsigned long *data;	/* bitmap representing pages */
-};
-
 static inline unsigned long bm_block_bits(struct bm_block *bb)
 {
 	return bb->end_pfn - bb->start_pfn;
 }
 
-/* strcut bm_position is used for browsing memory bitmaps */
+/* Functions that operate on memory bitmaps */
 
-struct bm_position {
-	struct bm_block *block;
-	int bit;
-};
+void memory_bm_position_reset_index(struct memory_bitmap *bm, int index)
+{
+	bm->states[index].block = list_entry(bm->blocks.next,
+				struct bm_block, hook);
+	bm->states[index].bit = 0;
+}
+EXPORT_SYMBOL_GPL(memory_bm_position_reset_index);
 
-struct memory_bitmap {
-	struct list_head blocks;	/* list of bitmap blocks */
-	struct linked_page *p_list;	/* list of pages used to store zone
-					 * bitmap objects and bitmap block
-					 * objects
-					 */
-	struct bm_position cur;	/* most recently used bit position */
-};
+void memory_bm_position_reset(struct memory_bitmap *bm)
+{
+	int i;
 
-/* Functions that operate on memory bitmaps */
+	for (i = 0; i < bm->num_states; i++) {
+		bm->states[i].block = list_entry(bm->blocks.next,
+				struct bm_block, hook);
+		bm->states[i].bit = 0;
+	}
+}
+EXPORT_SYMBOL_GPL(memory_bm_position_reset);
 
-static void memory_bm_position_reset(struct memory_bitmap *bm)
+int memory_bm_set_iterators(struct memory_bitmap *bm, int number)
 {
-	bm->cur.block = list_entry(bm->blocks.next, struct bm_block, hook);
-	bm->cur.bit = 0;
-}
+	int bytes = number * sizeof(struct bm_position);
+	struct bm_position *new_states;
+
+	if (number < bm->num_states)
+		return 0;
 
-static void memory_bm_free(struct memory_bitmap *bm, int clear_nosave_free);
+	new_states = kmalloc(bytes, GFP_KERNEL);
+	if (!new_states)
+		return -ENOMEM;
+
+	if (bm->states)
+		kfree(bm->states);
+
+	bm->states = new_states;
+	bm->num_states = number;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(memory_bm_set_iterators);
 
 /**
  *	create_bm_block_list - create a list of block bitmap objects
@@ -397,8 +412,8 @@ static int create_mem_extents(struct list_head *list, gfp_t gfp_mask)
 /**
   *	memory_bm_create - allocate memory for a memory bitmap
   */
-static int
-memory_bm_create(struct memory_bitmap *bm, gfp_t gfp_mask, int safe_needed)
+int memory_bm_create_index(struct memory_bitmap *bm, gfp_t gfp_mask,
+		int safe_needed, int states)
 {
 	struct chain_allocator ca;
 	struct list_head mem_extents;
@@ -442,6 +457,9 @@ memory_bm_create(struct memory_bitmap *bm, gfp_t gfp_mask, int safe_needed)
 		}
 	}
 
+	if (!error)
+		error = memory_bm_set_iterators(bm, states);
+
 	bm->p_list = ca.chain;
 	memory_bm_position_reset(bm);
  Exit:
@@ -453,11 +471,18 @@ memory_bm_create(struct memory_bitmap *bm, gfp_t gfp_mask, int safe_needed)
 	memory_bm_free(bm, PG_UNSAFE_CLEAR);
 	goto Exit;
 }
+EXPORT_SYMBOL_GPL(memory_bm_create_index);
+
+int memory_bm_create(struct memory_bitmap *bm, gfp_t gfp_mask, int safe_needed)
+{
+	return memory_bm_create_index(bm, gfp_mask, safe_needed, 1);
+}
+EXPORT_SYMBOL_GPL(memory_bm_create);
 
 /**
   *	memory_bm_free - free memory occupied by the memory bitmap @bm
   */
-static void memory_bm_free(struct memory_bitmap *bm, int clear_nosave_free)
+void memory_bm_free(struct memory_bitmap *bm, int clear_nosave_free)
 {
 	struct bm_block *bb;
 
@@ -468,15 +493,22 @@ static void memory_bm_free(struct memory_bitmap *bm, int clear_nosave_free)
 	free_list_of_pages(bm->p_list, clear_nosave_free);
 
 	INIT_LIST_HEAD(&bm->blocks);
+
+	if (bm->states) {
+		kfree(bm->states);
+		bm->states = NULL;
+		bm->num_states = 0;
+	}
 }
+EXPORT_SYMBOL_GPL(memory_bm_free);
 
 /**
  *	memory_bm_find_bit - find the bit in the bitmap @bm that corresponds
  *	to given pfn.  The cur_zone_bm member of @bm and the cur_block member
- *	of @bm->cur_zone_bm are updated.
+ *	of @bm->states[state] are updated.
  */
-static int memory_bm_find_bit(struct memory_bitmap *bm, unsigned long pfn,
-				void **addr, unsigned int *bit_nr)
+static int memory_bm_find_bit_index(struct memory_bitmap *bm, unsigned long pfn,
+				void **addr, unsigned int *bit_nr, int state)
 {
 	struct bm_block *bb;
 
@@ -484,7 +516,7 @@ static int memory_bm_find_bit(struct memory_bitmap *bm, unsigned long pfn,
 	 * Check if the pfn corresponds to the current bitmap block and find
 	 * the block where it fits if this is not the case.
 	 */
-	bb = bm->cur.block;
+	bb = bm->states[state].block;
 	if (pfn < bb->start_pfn)
 		list_for_each_entry_continue_reverse(bb, &bm->blocks, hook)
 			if (pfn >= bb->start_pfn)
@@ -499,15 +531,21 @@ static int memory_bm_find_bit(struct memory_bitmap *bm, unsigned long pfn,
 		return -EFAULT;
 
 	/* The block has been found */
-	bm->cur.block = bb;
+	bm->states[state].block = bb;
 	pfn -= bb->start_pfn;
-	bm->cur.bit = pfn + 1;
+	bm->states[state].bit = pfn + 1;
 	*bit_nr = pfn;
 	*addr = bb->data;
 	return 0;
 }
 
-static void memory_bm_set_bit(struct memory_bitmap *bm, unsigned long pfn)
+static int memory_bm_find_bit(struct memory_bitmap *bm, unsigned long pfn,
+				void **addr, unsigned int *bit_nr)
+{
+	return memory_bm_find_bit_index(bm, pfn, addr, bit_nr, 0);
+}
+
+void memory_bm_set_bit(struct memory_bitmap *bm, unsigned long pfn)
 {
 	void *addr;
 	unsigned int bit;
@@ -517,6 +555,7 @@ static void memory_bm_set_bit(struct memory_bitmap *bm, unsigned long pfn)
 	BUG_ON(error);
 	set_bit(bit, addr);
 }
+EXPORT_SYMBOL_GPL(memory_bm_set_bit);
 
 static int mem_bm_set_bit_check(struct memory_bitmap *bm, unsigned long pfn)
 {
@@ -530,27 +569,43 @@ static int mem_bm_set_bit_check(struct memory_bitmap *bm, unsigned long pfn)
 	return error;
 }
 
-static void memory_bm_clear_bit(struct memory_bitmap *bm, unsigned long pfn)
+void memory_bm_clear_bit_index(struct memory_bitmap *bm, unsigned long pfn,
+		int index)
 {
 	void *addr;
 	unsigned int bit;
 	int error;
 
-	error = memory_bm_find_bit(bm, pfn, &addr, &bit);
+	error = memory_bm_find_bit_index(bm, pfn, &addr, &bit, index);
 	BUG_ON(error);
 	clear_bit(bit, addr);
 }
+EXPORT_SYMBOL_GPL(memory_bm_clear_bit_index);
+
+void memory_bm_clear_bit(struct memory_bitmap *bm, unsigned long pfn)
+{
+	memory_bm_clear_bit_index(bm, pfn, 0);
+}
+EXPORT_SYMBOL_GPL(memory_bm_clear_bit);
 
-static int memory_bm_test_bit(struct memory_bitmap *bm, unsigned long pfn)
+int memory_bm_test_bit_index(struct memory_bitmap *bm, unsigned long pfn,
+		int index)
 {
 	void *addr;
 	unsigned int bit;
 	int error;
 
-	error = memory_bm_find_bit(bm, pfn, &addr, &bit);
+	error = memory_bm_find_bit_index(bm, pfn, &addr, &bit, index);
 	BUG_ON(error);
 	return test_bit(bit, addr);
 }
+EXPORT_SYMBOL_GPL(memory_bm_test_bit_index);
+
+int memory_bm_test_bit(struct memory_bitmap *bm, unsigned long pfn)
+{
+	return memory_bm_test_bit_index(bm, pfn, 0);
+}
+EXPORT_SYMBOL_GPL(memory_bm_test_bit);
 
 static bool memory_bm_pfn_present(struct memory_bitmap *bm, unsigned long pfn)
 {
@@ -569,43 +624,184 @@ static bool memory_bm_pfn_present(struct memory_bitmap *bm, unsigned long pfn)
  *	this function.
  */
 
-static unsigned long memory_bm_next_pfn(struct memory_bitmap *bm)
+unsigned long memory_bm_next_pfn_index(struct memory_bitmap *bm, int index)
 {
 	struct bm_block *bb;
 	int bit;
 
-	bb = bm->cur.block;
+	bb = bm->states[index].block;
 	do {
-		bit = bm->cur.bit;
+		bit = bm->states[index].bit;
 		bit = find_next_bit(bb->data, bm_block_bits(bb), bit);
 		if (bit < bm_block_bits(bb))
 			goto Return_pfn;
 
 		bb = list_entry(bb->hook.next, struct bm_block, hook);
-		bm->cur.block = bb;
-		bm->cur.bit = 0;
+		bm->states[index].block = bb;
+		bm->states[index].bit = 0;
 	} while (&bb->hook != &bm->blocks);
 
-	memory_bm_position_reset(bm);
+	memory_bm_position_reset_index(bm, index);
 	return BM_END_OF_MAP;
 
  Return_pfn:
-	bm->cur.bit = bit + 1;
+	bm->states[index].bit = bit + 1;
 	return bb->start_pfn + bit;
 }
+EXPORT_SYMBOL_GPL(memory_bm_next_pfn_index);
 
-/**
- *	This structure represents a range of page frames the contents of which
- *	should not be saved during the suspend.
- */
+unsigned long memory_bm_next_pfn(struct memory_bitmap *bm)
+{
+	return memory_bm_next_pfn_index(bm, 0);
+}
+EXPORT_SYMBOL_GPL(memory_bm_next_pfn);
 
-struct nosave_region {
-	struct list_head list;
-	unsigned long start_pfn;
-	unsigned long end_pfn;
-};
+void memory_bm_clear(struct memory_bitmap *bm)
+{
+	unsigned long pfn;
 
-static LIST_HEAD(nosave_regions);
+	memory_bm_position_reset(bm);
+	pfn = memory_bm_next_pfn(bm);
+	while (pfn != BM_END_OF_MAP) {
+		memory_bm_clear_bit(bm, pfn);
+		pfn = memory_bm_next_pfn(bm);
+	}
+}
+EXPORT_SYMBOL_GPL(memory_bm_clear);
+
+void memory_bm_copy(struct memory_bitmap *source, struct memory_bitmap *dest)
+{
+	unsigned long pfn;
+
+	memory_bm_position_reset(source);
+	pfn = memory_bm_next_pfn(source);
+	while (pfn != BM_END_OF_MAP) {
+		memory_bm_set_bit(dest, pfn);
+		pfn = memory_bm_next_pfn(source);
+	}
+}
+EXPORT_SYMBOL_GPL(memory_bm_copy);
+
+void memory_bm_dup(struct memory_bitmap *source, struct memory_bitmap *dest)
+{
+	memory_bm_clear(dest);
+	memory_bm_copy(source, dest);
+}
+EXPORT_SYMBOL_GPL(memory_bm_dup);
+
+#ifdef CONFIG_TOI
+#define DEFINE_MEMORY_BITMAP(name) \
+struct memory_bitmap *name; \
+EXPORT_SYMBOL_GPL(name)
+
+DEFINE_MEMORY_BITMAP(pageset1_map);
+DEFINE_MEMORY_BITMAP(pageset1_copy_map);
+DEFINE_MEMORY_BITMAP(pageset2_map);
+DEFINE_MEMORY_BITMAP(page_resave_map);
+DEFINE_MEMORY_BITMAP(io_map);
+DEFINE_MEMORY_BITMAP(nosave_map);
+DEFINE_MEMORY_BITMAP(free_map);
+
+int memory_bm_write(struct memory_bitmap *bm, int (*rw_chunk)
+	(int rw, struct toi_module_ops *owner, char *buffer, int buffer_size))
+{
+	int result = 0;
+	unsigned int nr = 0;
+	struct bm_block *bb;
+
+	if (!bm)
+		return result;
+
+	list_for_each_entry(bb, &bm->blocks, hook)
+		nr++;
+
+	result = (*rw_chunk)(WRITE, NULL, (char *) &nr, sizeof(unsigned int));
+	if (result)
+		return result;
+
+	list_for_each_entry(bb, &bm->blocks, hook) {
+		result = (*rw_chunk)(WRITE, NULL, (char *) &bb->start_pfn,
+				2 * sizeof(unsigned long));
+		if (result)
+			return result;
+
+		result = (*rw_chunk)(WRITE, NULL, (char *) bb->data, PAGE_SIZE);
+		if (result)
+			return result;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(memory_bm_write);
+
+int memory_bm_read(struct memory_bitmap *bm, int (*rw_chunk)
+	(int rw, struct toi_module_ops *owner, char *buffer, int buffer_size))
+{
+	int result = 0;
+	unsigned int nr, i;
+	struct bm_block *bb;
+
+	if (!bm)
+		return result;
+
+	result = memory_bm_create(bm, GFP_KERNEL, 0);
+
+	if (result)
+		return result;
+
+	result = (*rw_chunk)(READ, NULL, (char *) &nr, sizeof(unsigned int));
+	if (result)
+		goto Free;
+
+	for (i = 0; i < nr; i++) {
+		unsigned long pfn;
+
+		result = (*rw_chunk)(READ, NULL, (char *) &pfn,
+				sizeof(unsigned long));
+		if (result)
+			goto Free;
+
+		list_for_each_entry(bb, &bm->blocks, hook)
+			if (bb->start_pfn == pfn)
+				break;
+
+		if (&bb->hook == &bm->blocks) {
+			printk(KERN_ERR
+				"TuxOnIce: Failed to load memory bitmap.\n");
+			result = -EINVAL;
+			goto Free;
+		}
+
+		result = (*rw_chunk)(READ, NULL, (char *) &pfn,
+				sizeof(unsigned long));
+		if (result)
+			goto Free;
+
+		if (pfn != bb->end_pfn) {
+			printk(KERN_ERR
+				"TuxOnIce: Failed to load memory bitmap. "
+				"End PFN doesn't match what was saved.\n");
+			result = -EINVAL;
+			goto Free;
+		}
+
+		result = (*rw_chunk)(READ, NULL, (char *) bb->data, PAGE_SIZE);
+
+		if (result)
+			goto Free;
+	}
+
+	return 0;
+
+Free:
+	memory_bm_free(bm, PG_ANY);
+	return result;
+}
+EXPORT_SYMBOL_GPL(memory_bm_read);
+#endif
+
+LIST_HEAD(nosave_regions);
+EXPORT_SYMBOL_GPL(nosave_regions);
 
 /**
  *	register_nosave_region - register a range of page frames the contents
@@ -848,7 +1044,7 @@ static unsigned int count_free_highmem_pages(void)
  *	We should save the page if it isn't Nosave or NosaveFree, or Reserved,
  *	and it isn't a part of a free chunk of pages.
  */
-static struct page *saveable_highmem_page(struct zone *zone, unsigned long pfn)
+struct page *saveable_highmem_page(struct zone *zone, unsigned long pfn)
 {
 	struct page *page;
 
@@ -870,6 +1066,7 @@ static struct page *saveable_highmem_page(struct zone *zone, unsigned long pfn)
 
 	return page;
 }
+EXPORT_SYMBOL_GPL(saveable_highmem_page);
 
 /**
  *	count_highmem_pages - compute the total number of saveable highmem
@@ -895,11 +1092,6 @@ static unsigned int count_highmem_pages(void)
 	}
 	return n;
 }
-#else
-static inline void *saveable_highmem_page(struct zone *z, unsigned long p)
-{
-	return NULL;
-}
 #endif /* CONFIG_HIGHMEM */
 
 /**
@@ -910,7 +1102,7 @@ static inline void *saveable_highmem_page(struct zone *z, unsigned long p)
  *	of pages statically defined as 'unsaveable', and it isn't a part of
  *	a free chunk of pages.
  */
-static struct page *saveable_page(struct zone *zone, unsigned long pfn)
+struct page *saveable_page(struct zone *zone, unsigned long pfn)
 {
 	struct page *page;
 
@@ -935,6 +1127,7 @@ static struct page *saveable_page(struct zone *zone, unsigned long pfn)
 
 	return page;
 }
+EXPORT_SYMBOL_GPL(saveable_page);
 
 /**
  *	count_data_pages - compute the total number of saveable non-highmem
@@ -1589,6 +1782,9 @@ asmlinkage int swsusp_save(void)
 {
 	unsigned int nr_pages, nr_highmem;
 
+	if (toi_running)
+		return toi_post_context_save();
+
 	printk(KERN_INFO "PM: Creating hibernation image:\n");
 
 	drain_local_pages(NULL);
@@ -1629,14 +1825,14 @@ asmlinkage int swsusp_save(void)
 }
 
 #ifndef CONFIG_ARCH_HIBERNATION_HEADER
-static int init_header_complete(struct swsusp_info *info)
+int init_header_complete(struct swsusp_info *info)
 {
 	memcpy(&info->uts, init_utsname(), sizeof(struct new_utsname));
 	info->version_code = LINUX_VERSION_CODE;
 	return 0;
 }
 
-static char *check_image_kernel(struct swsusp_info *info)
+char *check_image_kernel(struct swsusp_info *info)
 {
 	if (info->version_code != LINUX_VERSION_CODE)
 		return "kernel version";
@@ -1650,6 +1846,7 @@ static char *check_image_kernel(struct swsusp_info *info)
 		return "machine";
 	return NULL;
 }
+EXPORT_SYMBOL_GPL(check_image_kernel);
 #endif /* CONFIG_ARCH_HIBERNATION_HEADER */
 
 unsigned long snapshot_get_image_size(void)
@@ -1657,7 +1854,7 @@ unsigned long snapshot_get_image_size(void)
 	return nr_copy_pages + nr_meta_pages + 1;
 }
 
-static int init_header(struct swsusp_info *info)
+int init_header(struct swsusp_info *info)
 {
 	memset(info, 0, sizeof(struct swsusp_info));
 	info->num_physpages = get_num_physpages();
@@ -1667,6 +1864,7 @@ static int init_header(struct swsusp_info *info)
 	info->size <<= PAGE_SHIFT;
 	return init_header_complete(info);
 }
+EXPORT_SYMBOL_GPL(init_header);
 
 /**
  *	pack_pfns - pfns corresponding to the set bits found in the bitmap @bm
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 62ee437..95795ee 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -298,6 +298,7 @@ int suspend_devices_and_enter(suspend_state_t state)
 		suspend_ops->recover();
 	goto Resume_devices;
 }
+EXPORT_SYMBOL_GPL(suspend_devices_and_enter);
 
 /**
  * suspend_finish - Clean up before finishing the suspend sequence.
diff --git a/kernel/power/tuxonice.h b/kernel/power/tuxonice.h
new file mode 100644
index 0000000..6f8d127
--- /dev/null
+++ b/kernel/power/tuxonice.h
@@ -0,0 +1,227 @@
+/*
+ * kernel/power/tuxonice.h
+ *
+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * It contains declarations used throughout swsusp.
+ *
+ */
+
+#ifndef KERNEL_POWER_TOI_H
+#define KERNEL_POWER_TOI_H
+
+#include <linux/delay.h>
+#include <linux/bootmem.h>
+#include <linux/suspend.h>
+#include <linux/fs.h>
+#include <linux/module.h>
+#include <asm/setup.h>
+#include "tuxonice_pageflags.h"
+#include "power.h"
+
+#define TOI_CORE_VERSION "3.3"
+#define	TOI_HEADER_VERSION 3
+#define MY_BOOT_KERNEL_DATA_VERSION 4
+
+struct toi_boot_kernel_data {
+	int version;
+	int size;
+	unsigned long toi_action;
+	unsigned long toi_debug_state;
+	u32 toi_default_console_level;
+	int toi_io_time[2][2];
+	char toi_nosave_commandline[COMMAND_LINE_SIZE];
+	unsigned long pages_used[33];
+	unsigned long incremental_bytes_in;
+	unsigned long incremental_bytes_out;
+	unsigned long compress_bytes_in;
+	unsigned long compress_bytes_out;
+	unsigned long pruned_pages;
+};
+
+extern struct toi_boot_kernel_data toi_bkd;
+
+/* Location of boot kernel data struct in kernel being resumed */
+extern unsigned long boot_kernel_data_buffer;
+
+/*		 == Action states == 		*/
+
+enum {
+	TOI_REBOOT,
+	TOI_PAUSE,
+	TOI_LOGALL,
+	TOI_CAN_CANCEL,
+	TOI_KEEP_IMAGE,
+	TOI_FREEZER_TEST,
+	TOI_SINGLESTEP,
+	TOI_PAUSE_NEAR_PAGESET_END,
+	TOI_TEST_FILTER_SPEED,
+	TOI_TEST_BIO,
+	TOI_NO_PAGESET2,
+	TOI_IGNORE_ROOTFS,
+	TOI_REPLACE_SWSUSP,
+	TOI_PAGESET2_FULL,
+	TOI_ABORT_ON_RESAVE_NEEDED,
+	TOI_NO_MULTITHREADED_IO,
+	TOI_NO_DIRECT_LOAD, /* Obsolete */
+	TOI_LATE_CPU_HOTPLUG,
+	TOI_GET_MAX_MEM_ALLOCD,
+	TOI_NO_FLUSHER_THREAD,
+	TOI_NO_PS2_IF_UNNEEDED,
+	TOI_POST_RESUME_BREAKPOINT,
+	TOI_NO_READAHEAD,
+};
+
+extern unsigned long toi_bootflags_mask;
+
+#define clear_action_state(bit) (test_and_clear_bit(bit, &toi_bkd.toi_action))
+
+/*		 == Result states == 		*/
+
+enum {
+	TOI_ABORTED,
+	TOI_ABORT_REQUESTED,
+	TOI_NOSTORAGE_AVAILABLE,
+	TOI_INSUFFICIENT_STORAGE,
+	TOI_FREEZING_FAILED,
+	TOI_KEPT_IMAGE,
+	TOI_WOULD_EAT_MEMORY,
+	TOI_UNABLE_TO_FREE_ENOUGH_MEMORY,
+	TOI_PM_SEM,
+	TOI_DEVICE_REFUSED,
+	TOI_SYSDEV_REFUSED,
+	TOI_EXTRA_PAGES_ALLOW_TOO_SMALL,
+	TOI_UNABLE_TO_PREPARE_IMAGE,
+	TOI_FAILED_MODULE_INIT,
+	TOI_FAILED_MODULE_CLEANUP,
+	TOI_FAILED_IO,
+	TOI_OUT_OF_MEMORY,
+	TOI_IMAGE_ERROR,
+	TOI_PLATFORM_PREP_FAILED,
+	TOI_CPU_HOTPLUG_FAILED,
+	TOI_ARCH_PREPARE_FAILED, /* Removed Linux-3.0 */
+	TOI_RESAVE_NEEDED,
+	TOI_CANT_SUSPEND,
+	TOI_NOTIFIERS_PREPARE_FAILED,
+	TOI_PRE_SNAPSHOT_FAILED,
+	TOI_PRE_RESTORE_FAILED,
+	TOI_USERMODE_HELPERS_ERR,
+	TOI_CANT_USE_ALT_RESUME,
+	TOI_HEADER_TOO_BIG,
+	TOI_WAKEUP_EVENT,
+	TOI_SYSCORE_REFUSED,
+	TOI_DPM_PREPARE_FAILED,
+	TOI_DPM_SUSPEND_FAILED,
+	TOI_NUM_RESULT_STATES	/* Used in printing debug info only */
+};
+
+extern unsigned long toi_result;
+
+#define set_result_state(bit) (test_and_set_bit(bit, &toi_result))
+#define set_abort_result(bit) (test_and_set_bit(TOI_ABORTED, &toi_result), \
+				test_and_set_bit(bit, &toi_result))
+#define clear_result_state(bit) (test_and_clear_bit(bit, &toi_result))
+#define test_result_state(bit) (test_bit(bit, &toi_result))
+
+/*	 == Debug sections and levels == 	*/
+
+/* debugging levels. */
+enum {
+	TOI_STATUS = 0,
+	TOI_ERROR = 2,
+	TOI_LOW,
+	TOI_MEDIUM,
+	TOI_HIGH,
+	TOI_VERBOSE,
+};
+
+enum {
+	TOI_ANY_SECTION,
+	TOI_EAT_MEMORY,
+	TOI_IO,
+	TOI_HEADER,
+	TOI_WRITER,
+	TOI_MEMORY,
+	TOI_PAGEDIR,
+	TOI_COMPRESS,
+	TOI_BIO,
+};
+
+#define set_debug_state(bit) (test_and_set_bit(bit, &toi_bkd.toi_debug_state))
+#define clear_debug_state(bit) \
+	(test_and_clear_bit(bit, &toi_bkd.toi_debug_state))
+#define test_debug_state(bit) (test_bit(bit, &toi_bkd.toi_debug_state))
+
+/*		== Steps in hibernating ==	*/
+
+enum {
+	STEP_HIBERNATE_PREPARE_IMAGE,
+	STEP_HIBERNATE_SAVE_IMAGE,
+	STEP_HIBERNATE_POWERDOWN,
+	STEP_RESUME_CAN_RESUME,
+	STEP_RESUME_LOAD_PS1,
+	STEP_RESUME_DO_RESTORE,
+	STEP_RESUME_READ_PS2,
+	STEP_RESUME_GO,
+	STEP_RESUME_ALT_IMAGE,
+	STEP_CLEANUP,
+	STEP_QUIET_CLEANUP
+};
+
+/*		== TuxOnIce states ==
+	(see also include/linux/suspend.h)	*/
+
+#define get_toi_state()  (toi_state)
+#define restore_toi_state(saved_state) \
+	do { toi_state = saved_state; } while (0)
+
+/*		== Module support ==		*/
+
+struct toi_core_fns {
+	int (*post_context_save)(void);
+	unsigned long (*get_nonconflicting_page)(void);
+	int (*try_hibernate)(void);
+	void (*try_resume)(void);
+};
+
+extern struct toi_core_fns *toi_core_fns;
+
+/*		== All else ==			*/
+#define KB(x) ((x) << (PAGE_SHIFT - 10))
+#define MB(x) ((x) >> (20 - PAGE_SHIFT))
+
+extern int toi_start_anything(int toi_or_resume);
+extern void toi_finish_anything(int toi_or_resume);
+
+extern int save_image_part1(void);
+extern int toi_atomic_restore(void);
+
+extern int toi_try_hibernate(void);
+extern void toi_try_resume(void);
+
+extern int __toi_post_context_save(void);
+
+extern unsigned int nr_hibernates;
+extern char alt_resume_param[256];
+
+extern void copyback_post(void);
+extern int toi_hibernate(void);
+extern unsigned long extra_pd1_pages_used;
+
+#define SECTOR_SIZE 512
+
+extern void toi_early_boot_message(int can_erase_image, int default_answer,
+	char *warning_reason, ...);
+
+extern int do_check_can_resume(void);
+extern int do_toi_step(int step);
+extern int toi_launch_userspace_program(char *command, int channel_no,
+		int wait, int debug);
+
+extern char tuxonice_signature[9];
+
+extern int toi_start_other_threads(void);
+extern void toi_stop_other_threads(void);
+#endif
diff --git a/kernel/power/tuxonice_alloc.c b/kernel/power/tuxonice_alloc.c
new file mode 100644
index 0000000..675f2b5
--- /dev/null
+++ b/kernel/power/tuxonice_alloc.c
@@ -0,0 +1,314 @@
+/*
+ * kernel/power/tuxonice_alloc.c
+ *
+ * Copyright (C) 2008-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ */
+
+#ifdef CONFIG_PM_DEBUG
+#include <linux/export.h>
+#include <linux/slab.h>
+#include "tuxonice_modules.h"
+#include "tuxonice_alloc.h"
+#include "tuxonice_sysfs.h"
+#include "tuxonice.h"
+
+#define TOI_ALLOC_PATHS 40
+
+static DEFINE_MUTEX(toi_alloc_mutex);
+
+static struct toi_module_ops toi_alloc_ops;
+
+static int toi_fail_num;
+
+static atomic_t toi_alloc_count[TOI_ALLOC_PATHS],
+		toi_free_count[TOI_ALLOC_PATHS],
+		toi_test_count[TOI_ALLOC_PATHS],
+		toi_fail_count[TOI_ALLOC_PATHS];
+static int toi_cur_allocd[TOI_ALLOC_PATHS], toi_max_allocd[TOI_ALLOC_PATHS];
+static int cur_allocd, max_allocd;
+
+static char *toi_alloc_desc[TOI_ALLOC_PATHS] = {
+	"", /* 0 */
+	"get_io_info_struct",
+	"extent",
+	"extent (loading chain)",
+	"userui channel",
+	"userui arg", /* 5 */
+	"attention list metadata",
+	"extra pagedir memory metadata",
+	"bdev metadata",
+	"extra pagedir memory",
+	"header_locations_read", /* 10 */
+	"bio queue",
+	"prepare_readahead",
+	"i/o buffer",
+	"writer buffer in bio_init",
+	"checksum buffer", /* 15 */
+	"compression buffer",
+	"filewriter signature op",
+	"set resume param alloc1",
+	"set resume param alloc2",
+	"debugging info buffer", /* 20 */
+	"check can resume buffer",
+	"write module config buffer",
+	"read module config buffer",
+	"write image header buffer",
+	"read pageset1 buffer", /* 25 */
+	"get_have_image_data buffer",
+	"checksum page",
+	"worker rw loop",
+	"get nonconflicting page",
+	"ps1 load addresses", /* 30 */
+	"remove swap image",
+	"swap image exists",
+	"swap parse sig location",
+	"sysfs kobj",
+	"swap mark resume attempted buffer", /* 35 */
+	"cluster member",
+	"boot kernel data buffer",
+	"setting swap signature",
+	"block i/o bdev struct"
+};
+
+#define MIGHT_FAIL(FAIL_NUM, FAIL_VAL) \
+	do { \
+		BUG_ON(FAIL_NUM >= TOI_ALLOC_PATHS); \
+		\
+		if (FAIL_NUM == toi_fail_num) { \
+			atomic_inc(&toi_test_count[FAIL_NUM]); \
+			toi_fail_num = 0; \
+			return FAIL_VAL; \
+		} \
+	} while (0)
+
+static void alloc_update_stats(int fail_num, void *result, int size)
+{
+	if (!result) {
+		atomic_inc(&toi_fail_count[fail_num]);
+		return;
+	}
+
+	atomic_inc(&toi_alloc_count[fail_num]);
+	if (unlikely(test_action_state(TOI_GET_MAX_MEM_ALLOCD))) {
+		mutex_lock(&toi_alloc_mutex);
+		toi_cur_allocd[fail_num]++;
+		cur_allocd += size;
+		if (unlikely(cur_allocd > max_allocd)) {
+			int i;
+
+			for (i = 0; i < TOI_ALLOC_PATHS; i++)
+				toi_max_allocd[i] = toi_cur_allocd[i];
+			max_allocd = cur_allocd;
+		}
+		mutex_unlock(&toi_alloc_mutex);
+	}
+}
+
+static void free_update_stats(int fail_num, int size)
+{
+	BUG_ON(fail_num >= TOI_ALLOC_PATHS);
+	atomic_inc(&toi_free_count[fail_num]);
+	if (unlikely(atomic_read(&toi_free_count[fail_num]) >
+				atomic_read(&toi_alloc_count[fail_num])))
+		dump_stack();
+	if (unlikely(test_action_state(TOI_GET_MAX_MEM_ALLOCD))) {
+		mutex_lock(&toi_alloc_mutex);
+		cur_allocd -= size;
+		toi_cur_allocd[fail_num]--;
+		mutex_unlock(&toi_alloc_mutex);
+	}
+}
+
+void *toi_kzalloc(int fail_num, size_t size, gfp_t flags)
+{
+	void *result;
+
+	if (toi_alloc_ops.enabled)
+		MIGHT_FAIL(fail_num, NULL);
+	result = kzalloc(size, flags);
+	if (toi_alloc_ops.enabled)
+		alloc_update_stats(fail_num, result, size);
+	if (fail_num == toi_trace_allocs)
+		dump_stack();
+	return result;
+}
+EXPORT_SYMBOL_GPL(toi_kzalloc);
+
+unsigned long toi_get_free_pages(int fail_num, gfp_t mask,
+		unsigned int order)
+{
+	unsigned long result;
+
+	if (toi_alloc_ops.enabled)
+		MIGHT_FAIL(fail_num, 0);
+	result = __get_free_pages(mask, order);
+	if (toi_alloc_ops.enabled)
+		alloc_update_stats(fail_num, (void *) result,
+				PAGE_SIZE << order);
+	if (fail_num == toi_trace_allocs)
+		dump_stack();
+	return result;
+}
+EXPORT_SYMBOL_GPL(toi_get_free_pages);
+
+struct page *toi_alloc_page(int fail_num, gfp_t mask)
+{
+	struct page *result;
+
+	if (toi_alloc_ops.enabled)
+		MIGHT_FAIL(fail_num, NULL);
+	result = alloc_page(mask);
+	if (toi_alloc_ops.enabled)
+		alloc_update_stats(fail_num, (void *) result, PAGE_SIZE);
+	if (fail_num == toi_trace_allocs)
+		dump_stack();
+	return result;
+}
+EXPORT_SYMBOL_GPL(toi_alloc_page);
+
+unsigned long toi_get_zeroed_page(int fail_num, gfp_t mask)
+{
+	unsigned long result;
+
+	if (toi_alloc_ops.enabled)
+		MIGHT_FAIL(fail_num, 0);
+	result = get_zeroed_page(mask);
+	if (toi_alloc_ops.enabled)
+		alloc_update_stats(fail_num, (void *) result, PAGE_SIZE);
+	if (fail_num == toi_trace_allocs)
+		dump_stack();
+	return result;
+}
+EXPORT_SYMBOL_GPL(toi_get_zeroed_page);
+
+void toi_kfree(int fail_num, const void *arg, int size)
+{
+	if (arg && toi_alloc_ops.enabled)
+		free_update_stats(fail_num, size);
+
+	if (fail_num == toi_trace_allocs)
+		dump_stack();
+	kfree(arg);
+}
+EXPORT_SYMBOL_GPL(toi_kfree);
+
+void toi_free_page(int fail_num, unsigned long virt)
+{
+	if (virt && toi_alloc_ops.enabled)
+		free_update_stats(fail_num, PAGE_SIZE);
+
+	if (fail_num == toi_trace_allocs)
+		dump_stack();
+	free_page(virt);
+}
+EXPORT_SYMBOL_GPL(toi_free_page);
+
+void toi__free_page(int fail_num, struct page *page)
+{
+	if (page && toi_alloc_ops.enabled)
+		free_update_stats(fail_num, PAGE_SIZE);
+
+	if (fail_num == toi_trace_allocs)
+		dump_stack();
+	__free_page(page);
+}
+EXPORT_SYMBOL_GPL(toi__free_page);
+
+void toi_free_pages(int fail_num, struct page *page, int order)
+{
+	if (page && toi_alloc_ops.enabled)
+		free_update_stats(fail_num, PAGE_SIZE << order);
+
+	if (fail_num == toi_trace_allocs)
+		dump_stack();
+	__free_pages(page, order);
+}
+
+void toi_alloc_print_debug_stats(void)
+{
+	int i, header_done = 0;
+
+	if (!toi_alloc_ops.enabled)
+		return;
+
+	for (i = 0; i < TOI_ALLOC_PATHS; i++)
+		if (atomic_read(&toi_alloc_count[i]) !=
+		    atomic_read(&toi_free_count[i])) {
+			if (!header_done) {
+				printk(KERN_INFO "Idx  Allocs   Frees   Tests "
+					"  Fails     Max Description\n");
+				header_done = 1;
+			}
+
+			printk(KERN_INFO "%3d %7d %7d %7d %7d %7d %s\n", i,
+				atomic_read(&toi_alloc_count[i]),
+				atomic_read(&toi_free_count[i]),
+				atomic_read(&toi_test_count[i]),
+				atomic_read(&toi_fail_count[i]),
+				toi_max_allocd[i],
+				toi_alloc_desc[i]);
+		}
+}
+EXPORT_SYMBOL_GPL(toi_alloc_print_debug_stats);
+
+static int toi_alloc_initialise(int starting_cycle)
+{
+	int i;
+
+	if (!starting_cycle)
+		return 0;
+
+	if (toi_trace_allocs)
+		dump_stack();
+
+	for (i = 0; i < TOI_ALLOC_PATHS; i++) {
+		atomic_set(&toi_alloc_count[i], 0);
+		atomic_set(&toi_free_count[i], 0);
+		atomic_set(&toi_test_count[i], 0);
+		atomic_set(&toi_fail_count[i], 0);
+		toi_cur_allocd[i] = 0;
+		toi_max_allocd[i] = 0;
+	}
+
+	max_allocd = 0;
+	cur_allocd = 0;
+	return 0;
+}
+
+static struct toi_sysfs_data sysfs_params[] = {
+	SYSFS_INT("failure_test", SYSFS_RW, &toi_fail_num, 0, 99, 0, NULL),
+	SYSFS_INT("trace", SYSFS_RW, &toi_trace_allocs, 0, TOI_ALLOC_PATHS, 0,
+			NULL),
+	SYSFS_BIT("find_max_mem_allocated", SYSFS_RW, &toi_bkd.toi_action,
+			TOI_GET_MAX_MEM_ALLOCD, 0),
+	SYSFS_INT("enabled", SYSFS_RW, &toi_alloc_ops.enabled, 0, 1, 0,
+			NULL)
+};
+
+static struct toi_module_ops toi_alloc_ops = {
+	.type					= MISC_HIDDEN_MODULE,
+	.name					= "allocation debugging",
+	.directory				= "alloc",
+	.module					= THIS_MODULE,
+	.early					= 1,
+	.initialise				= toi_alloc_initialise,
+
+	.sysfs_data		= sysfs_params,
+	.num_sysfs_entries	= sizeof(sysfs_params) /
+		sizeof(struct toi_sysfs_data),
+};
+
+int toi_alloc_init(void)
+{
+	int result = toi_register_module(&toi_alloc_ops);
+	return result;
+}
+
+void toi_alloc_exit(void)
+{
+	toi_unregister_module(&toi_alloc_ops);
+}
+#endif
diff --git a/kernel/power/tuxonice_alloc.h b/kernel/power/tuxonice_alloc.h
new file mode 100644
index 0000000..099ee51
--- /dev/null
+++ b/kernel/power/tuxonice_alloc.h
@@ -0,0 +1,54 @@
+/*
+ * kernel/power/tuxonice_alloc.h
+ *
+ * Copyright (C) 2008-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ */
+
+#include <linux/slab.h>
+#define TOI_WAIT_GFP (GFP_NOFS | __GFP_NOWARN)
+#define TOI_ATOMIC_GFP (GFP_ATOMIC | __GFP_NOWARN)
+
+#ifdef CONFIG_PM_DEBUG
+extern void *toi_kzalloc(int fail_num, size_t size, gfp_t flags);
+extern void toi_kfree(int fail_num, const void *arg, int size);
+
+extern unsigned long toi_get_free_pages(int fail_num, gfp_t mask,
+		unsigned int order);
+#define toi_get_free_page(FAIL_NUM, MASK) toi_get_free_pages(FAIL_NUM, MASK, 0)
+extern unsigned long toi_get_zeroed_page(int fail_num, gfp_t mask);
+extern void toi_free_page(int fail_num, unsigned long buf);
+extern void toi__free_page(int fail_num, struct page *page);
+extern void toi_free_pages(int fail_num, struct page *page, int order);
+extern struct page *toi_alloc_page(int fail_num, gfp_t mask);
+extern int toi_alloc_init(void);
+extern void toi_alloc_exit(void);
+
+extern void toi_alloc_print_debug_stats(void);
+
+#else /* CONFIG_PM_DEBUG */
+
+#define toi_kzalloc(FAIL, SIZE, FLAGS) (kzalloc(SIZE, FLAGS))
+#define toi_kfree(FAIL, ALLOCN, SIZE) (kfree(ALLOCN))
+
+#define toi_get_free_pages(FAIL, FLAGS, ORDER) __get_free_pages(FLAGS, ORDER)
+#define toi_get_free_page(FAIL, FLAGS) __get_free_page(FLAGS)
+#define toi_get_zeroed_page(FAIL, FLAGS) get_zeroed_page(FLAGS)
+#define toi_free_page(FAIL, ALLOCN) do { free_page(ALLOCN); } while (0)
+#define toi__free_page(FAIL, PAGE) __free_page(PAGE)
+#define toi_free_pages(FAIL, PAGE, ORDER) __free_pages(PAGE, ORDER)
+#define toi_alloc_page(FAIL, MASK) alloc_page(MASK)
+static inline int toi_alloc_init(void)
+{
+	return 0;
+}
+
+static inline void toi_alloc_exit(void) { }
+
+static inline void toi_alloc_print_debug_stats(void) { }
+
+#endif
+
+extern int toi_trace_allocs;
diff --git a/kernel/power/tuxonice_atomic_copy.c b/kernel/power/tuxonice_atomic_copy.c
new file mode 100644
index 0000000..c524acb
--- /dev/null
+++ b/kernel/power/tuxonice_atomic_copy.c
@@ -0,0 +1,473 @@
+/*
+ * kernel/power/tuxonice_atomic_copy.c
+ *
+ * Copyright 2004-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * Distributed under GPLv2.
+ *
+ * Routines for doing the atomic save/restore.
+ */
+
+#include <linux/suspend.h>
+#include <linux/highmem.h>
+#include <linux/cpu.h>
+#include <linux/freezer.h>
+#include <linux/console.h>
+#include <linux/syscore_ops.h>
+#include <linux/ftrace.h>
+#include <asm/suspend.h>
+#include "tuxonice.h"
+#include "tuxonice_storage.h"
+#include "tuxonice_power_off.h"
+#include "tuxonice_ui.h"
+#include "tuxonice_io.h"
+#include "tuxonice_prepare_image.h"
+#include "tuxonice_pageflags.h"
+#include "tuxonice_checksum.h"
+#include "tuxonice_builtin.h"
+#include "tuxonice_atomic_copy.h"
+#include "tuxonice_alloc.h"
+#include "tuxonice_modules.h"
+
+unsigned long extra_pd1_pages_used;
+
+/**
+ * free_pbe_list - free page backup entries used by the atomic copy code.
+ * @list:	List to free.
+ * @highmem:	Whether the list is in highmem.
+ *
+ * Normally, this function isn't used. If, however, we need to abort before
+ * doing the atomic copy, we use this to free the pbes previously allocated.
+ **/
+static void free_pbe_list(struct pbe **list, int highmem)
+{
+	while (*list) {
+		int i;
+		struct pbe *free_pbe, *next_page = NULL;
+		struct page *page;
+
+		if (highmem) {
+			page = (struct page *) *list;
+			free_pbe = (struct pbe *) kmap(page);
+		} else {
+			page = virt_to_page(*list);
+			free_pbe = *list;
+		}
+
+		for (i = 0; i < PBES_PER_PAGE; i++) {
+			if (!free_pbe)
+				break;
+			if (highmem)
+				toi__free_page(29, free_pbe->address);
+			else
+				toi_free_page(29,
+					(unsigned long) free_pbe->address);
+			free_pbe = free_pbe->next;
+		}
+
+		if (highmem) {
+			if (free_pbe)
+				next_page = free_pbe;
+			kunmap(page);
+		} else {
+			if (free_pbe)
+				next_page = free_pbe;
+		}
+
+		toi__free_page(29, page);
+		*list = (struct pbe *) next_page;
+	}
+}
+
+/**
+ * copyback_post - post atomic-restore actions
+ *
+ * After doing the atomic restore, we have a few more things to do:
+ *	1) We want to retain some values across the restore, so we now copy
+ *	these from the nosave variables to the normal ones.
+ *	2) Set the status flags.
+ *	3) Resume devices.
+ *	4) Tell userui so it can redraw & restore settings.
+ *	5) Reread the page cache.
+ **/
+void copyback_post(void)
+{
+	struct toi_boot_kernel_data *bkd =
+		(struct toi_boot_kernel_data *) boot_kernel_data_buffer;
+
+	if (toi_activate_storage(1))
+		panic("Failed to reactivate our storage.");
+
+	toi_post_atomic_restore_modules(bkd);
+
+	toi_cond_pause(1, "About to reload secondary pagedir.");
+
+	if (read_pageset2(0))
+		panic("Unable to successfully reread the page cache.");
+
+	/*
+	 * If the user wants to sleep again after resuming from full-off,
+	 * it's most likely to be in order to suspend to ram, so we'll
+	 * do this check after loading pageset2, to give them the fastest
+	 * wakeup when they are ready to use the computer again.
+	 */
+	toi_check_resleep();
+}
+
+/**
+ * toi_copy_pageset1 - do the atomic copy of pageset1
+ *
+ * Make the atomic copy of pageset1. We can't use copy_page (as we once did)
+ * because we can't be sure what side effects it has. On my old Duron, with
+ * 3DNOW, kernel_fpu_begin increments preempt count, making our preempt
+ * count at resume time 4 instead of 3.
+ *
+ * We don't want to call kmap_atomic unconditionally because it has the side
+ * effect of incrementing the preempt count, which will leave it one too high
+ * post resume (the page containing the preempt count will be copied after
+ * it's incremented). This is essentially the same problem.
+ **/
+void toi_copy_pageset1(void)
+{
+	int i;
+	unsigned long source_index, dest_index;
+
+	memory_bm_position_reset(pageset1_map);
+	memory_bm_position_reset(pageset1_copy_map);
+
+	source_index = memory_bm_next_pfn(pageset1_map);
+	dest_index = memory_bm_next_pfn(pageset1_copy_map);
+
+	for (i = 0; i < pagedir1.size; i++) {
+		unsigned long *origvirt, *copyvirt;
+		struct page *origpage, *copypage;
+		int loop = (PAGE_SIZE / sizeof(unsigned long)) - 1,
+		    was_present1, was_present2;
+
+		origpage = pfn_to_page(source_index);
+		copypage = pfn_to_page(dest_index);
+
+		origvirt = PageHighMem(origpage) ?
+			kmap_atomic(origpage) :
+			page_address(origpage);
+
+		copyvirt = PageHighMem(copypage) ?
+			kmap_atomic(copypage) :
+			page_address(copypage);
+
+		was_present1 = kernel_page_present(origpage);
+		if (!was_present1)
+			kernel_map_pages(origpage, 1, 1);
+
+		was_present2 = kernel_page_present(copypage);
+		if (!was_present2)
+			kernel_map_pages(copypage, 1, 1);
+
+		while (loop >= 0) {
+			*(copyvirt + loop) = *(origvirt + loop);
+			loop--;
+		}
+
+		if (!was_present1)
+			kernel_map_pages(origpage, 1, 0);
+
+		if (!was_present2)
+			kernel_map_pages(copypage, 1, 0);
+
+		if (PageHighMem(origpage))
+			kunmap_atomic(origvirt);
+
+		if (PageHighMem(copypage))
+			kunmap_atomic(copyvirt);
+
+		source_index = memory_bm_next_pfn(pageset1_map);
+		dest_index = memory_bm_next_pfn(pageset1_copy_map);
+	}
+}
+
+/**
+ * __toi_post_context_save - steps after saving the cpu context
+ *
+ * Steps taken after saving the CPU state to make the actual
+ * atomic copy.
+ *
+ * Called from swsusp_save in snapshot.c via toi_post_context_save.
+ **/
+int __toi_post_context_save(void)
+{
+	unsigned long old_ps1_size = pagedir1.size;
+
+	check_checksums();
+
+	free_checksum_pages();
+
+	toi_recalculate_image_contents(1);
+
+	extra_pd1_pages_used = pagedir1.size > old_ps1_size ?
+		pagedir1.size - old_ps1_size : 0;
+
+	if (extra_pd1_pages_used > extra_pd1_pages_allowance) {
+		printk(KERN_INFO "Pageset1 has grown by %lu pages. "
+			"extra_pages_allowance is currently only %lu.\n",
+			pagedir1.size - old_ps1_size,
+			extra_pd1_pages_allowance);
+
+		/*
+		 * High-level code will see this, clear the state and
+		 * retry if we haven't already done so twice.
+		 */
+		if (any_to_free(1)) {
+			set_abort_result(TOI_EXTRA_PAGES_ALLOW_TOO_SMALL);
+			return 1;
+		}
+		if (try_allocate_extra_memory()) {
+			printk(KERN_INFO "Failed to allocate the extra memory"
+					" needed. Restarting the process.\n");
+			set_abort_result(TOI_EXTRA_PAGES_ALLOW_TOO_SMALL);
+			return 1;
+		}
+		printk(KERN_INFO "However it looks like there's enough"
+			" free ram and storage to handle this, so"
+			" continuing anyway.\n");
+		/*
+		 * What if try_allocate_extra_memory above calls
+		 * toi_allocate_extra_pagedir_memory and it allocs a new
+		 * slab page via toi_kzalloc which should be in ps1? So...
+		 */
+		toi_recalculate_image_contents(1);
+	}
+
+	if (!test_action_state(TOI_TEST_FILTER_SPEED) &&
+	    !test_action_state(TOI_TEST_BIO))
+		toi_copy_pageset1();
+
+	return 0;
+}
+
+/**
+ * toi_hibernate - high level code for doing the atomic copy
+ *
+ * High-level code which prepares to do the atomic copy. Loosely based
+ * on the swsusp version, but with the following twists:
+ *	- We set toi_running so the swsusp code uses our code paths.
+ *	- We give better feedback regarding what goes wrong if there is a
+ *	  problem.
+ *	- We use an extra function to call the assembly, just in case this code
+ *	  is in a module (return address).
+ **/
+int toi_hibernate(void)
+{
+	int error;
+
+	toi_running = 1; /* For the swsusp code we use :< */
+
+	error = toi_lowlevel_builtin();
+
+	if (!error) {
+		struct toi_boot_kernel_data *bkd =
+			(struct toi_boot_kernel_data *) boot_kernel_data_buffer;
+
+		/*
+		 * The boot kernel's data may be larger (newer version) or
+		 * smaller (older version) than ours. Copy the minimum
+		 * of the two sizes, so that we don't overwrite valid values
+		 * from pre-atomic copy.
+		 */
+
+		memcpy(&toi_bkd, (char *) boot_kernel_data_buffer,
+			min_t(int, sizeof(struct toi_boot_kernel_data),
+				bkd->size));
+	}
+
+	toi_running = 0;
+	return error;
+}
+
+/**
+ * toi_atomic_restore - prepare to do the atomic restore
+ *
+ * Get ready to do the atomic restore. This part gets us into the same
+ * state we are in prior to calling do_toi_lowlevel while
+ * hibernating: hot-unplugging secondary cpus and freezing processes,
+ * before starting the thread that will do the restore.
+ **/
+int toi_atomic_restore(void)
+{
+	int error;
+
+	toi_running = 1;
+
+	toi_prepare_status(DONT_CLEAR_BAR, "Atomic restore.");
+
+	memcpy(&toi_bkd.toi_nosave_commandline, saved_command_line,
+		strlen(saved_command_line));
+
+	toi_pre_atomic_restore_modules(&toi_bkd);
+
+	if (add_boot_kernel_data_pbe())
+		goto Failed;
+
+	toi_prepare_status(DONT_CLEAR_BAR, "Doing atomic copy/restore.");
+
+	if (toi_go_atomic(PMSG_QUIESCE, 0))
+		goto Failed;
+
+	/* We'll ignore saved state, but this gets preempt count (etc) right */
+	save_processor_state();
+
+	error = swsusp_arch_resume();
+	/*
+	 * Code below is only ever reached in case of failure. Otherwise
+	 * execution continues at place where swsusp_arch_suspend was called.
+	 *
+	 * We don't know whether it's safe to continue (this shouldn't happen),
+	 * so let's err on the side of caution.
+	 */
+	BUG();
+
+Failed:
+	free_pbe_list(&restore_pblist, 0);
+#ifdef CONFIG_HIGHMEM
+	free_pbe_list(&restore_highmem_pblist, 1);
+#endif
+	toi_running = 0;
+	return 1;
+}
+
+/**
+ * toi_go_atomic - do the actual atomic copy/restore
+ * @state:	   The state to use for dpm_suspend_start & power_down calls.
+ * @suspend_time:  Whether we're suspending or resuming.
+ **/
+int toi_go_atomic(pm_message_t state, int suspend_time)
+{
+  if (suspend_time) {
+    if (platform_begin(1)) {
+      set_abort_result(TOI_PLATFORM_PREP_FAILED);
+      toi_end_atomic(ATOMIC_STEP_PLATFORM_END, suspend_time, 3);
+      return 1;
+    }
+
+    if (dpm_prepare(PMSG_FREEZE)) {
+      set_abort_result(TOI_DPM_PREPARE_FAILED);
+      dpm_complete(PMSG_RECOVER);
+      toi_end_atomic(ATOMIC_STEP_PLATFORM_END, suspend_time, 3);
+      return 1;
+    }
+  }
+
+	suspend_console();
+	ftrace_stop();
+	pm_restrict_gfp_mask();
+
+  if (suspend_time) {
+    if (dpm_suspend(state)) {
+      set_abort_result(TOI_DPM_SUSPEND_FAILED);
+      toi_end_atomic(ATOMIC_STEP_DEVICE_RESUME, suspend_time, 3);
+      return 1;
+    }
+  } else {
+    if (dpm_suspend_start(state)) {
+      set_abort_result(TOI_DPM_SUSPEND_FAILED);
+      toi_end_atomic(ATOMIC_STEP_DEVICE_RESUME, suspend_time, 3);
+      return 1;
+    }
+  }
+
+	/* At this point, dpm_suspend_start() has been called, but *not*
+	 * dpm_suspend_noirq(). We *must* dpm_suspend_noirq() now.
+	 * Otherwise, drivers for some devices (e.g. interrupt controllers)
+	 * become desynchronized with the actual state of the hardware
+	 * at resume time, and evil weirdness ensues.
+	 */
+
+	if (dpm_suspend_end(state)) {
+		set_abort_result(TOI_DEVICE_REFUSED);
+		toi_end_atomic(ATOMIC_STEP_DEVICE_RESUME, suspend_time, 1);
+		return 1;
+	}
+
+	if (suspend_time) {
+		if (platform_pre_snapshot(1))
+			set_abort_result(TOI_PRE_SNAPSHOT_FAILED);
+	} else {
+		if (platform_pre_restore(1))
+			set_abort_result(TOI_PRE_RESTORE_FAILED);
+	}
+
+	if (test_result_state(TOI_ABORTED)) {
+		toi_end_atomic(ATOMIC_STEP_PLATFORM_FINISH, suspend_time, 1);
+		return 1;
+	}
+
+	if (test_action_state(TOI_LATE_CPU_HOTPLUG)) {
+		if (disable_nonboot_cpus()) {
+			set_abort_result(TOI_CPU_HOTPLUG_FAILED);
+			toi_end_atomic(ATOMIC_STEP_CPU_HOTPLUG,
+					suspend_time, 1);
+			return 1;
+		}
+	}
+
+	local_irq_disable();
+
+	if (syscore_suspend()) {
+		set_abort_result(TOI_SYSCORE_REFUSED);
+		toi_end_atomic(ATOMIC_STEP_IRQS, suspend_time, 1);
+		return 1;
+	}
+
+	if (suspend_time && pm_wakeup_pending()) {
+		set_abort_result(TOI_WAKEUP_EVENT);
+		toi_end_atomic(ATOMIC_STEP_SYSCORE_RESUME, suspend_time, 1);
+		return 1;
+	}
+	return 0;
+}
+
+/**
+ * toi_end_atomic - post atomic copy/restore routines
+ * @stage:		What step to start at.
+ * @suspend_time:	Whether we're suspending or resuming.
+ * @error:		Whether we're recovering from an error.
+ **/
+void toi_end_atomic(int stage, int suspend_time, int error)
+{
+	pm_message_t msg = suspend_time ? (error ? PMSG_RECOVER : PMSG_THAW) :
+		PMSG_RESTORE;
+
+	switch (stage) {
+	case ATOMIC_ALL_STEPS:
+		if (!suspend_time) {
+			events_check_enabled = false;
+			platform_leave(1);
+		}
+	case ATOMIC_STEP_SYSCORE_RESUME:
+		syscore_resume();
+	case ATOMIC_STEP_IRQS:
+		local_irq_enable();
+	case ATOMIC_STEP_CPU_HOTPLUG:
+		if (test_action_state(TOI_LATE_CPU_HOTPLUG))
+			enable_nonboot_cpus();
+	case ATOMIC_STEP_PLATFORM_FINISH:
+		if (!suspend_time && (error & 2))
+			platform_restore_cleanup(1);
+		else
+			platform_finish(1);
+		dpm_resume_start(msg);
+	case ATOMIC_STEP_DEVICE_RESUME:
+		if (suspend_time && (error & 2))
+			platform_recover(1);
+		dpm_resume(msg);
+		if (error || !toi_in_suspend())
+			pm_restore_gfp_mask();
+		ftrace_start();
+		resume_console();
+	case ATOMIC_STEP_DPM_COMPLETE:
+		dpm_complete(msg);
+	case ATOMIC_STEP_PLATFORM_END:
+		platform_end(1);
+
+		toi_prepare_status(DONT_CLEAR_BAR, "Post atomic.");
+	}
+}
diff --git a/kernel/power/tuxonice_atomic_copy.h b/kernel/power/tuxonice_atomic_copy.h
new file mode 100644
index 0000000..6a989c1
--- /dev/null
+++ b/kernel/power/tuxonice_atomic_copy.h
@@ -0,0 +1,23 @@
+/*
+ * kernel/power/tuxonice_atomic_copy.h
+ *
+ * Copyright 2008-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * Distributed under GPLv2.
+ *
+ * Routines for doing the atomic save/restore.
+ */
+
+enum {
+	ATOMIC_ALL_STEPS,
+	ATOMIC_STEP_SYSCORE_RESUME,
+	ATOMIC_STEP_IRQS,
+	ATOMIC_STEP_CPU_HOTPLUG,
+	ATOMIC_STEP_PLATFORM_FINISH,
+	ATOMIC_STEP_DEVICE_RESUME,
+	ATOMIC_STEP_DPM_COMPLETE,
+	ATOMIC_STEP_PLATFORM_END,
+};
+
+int toi_go_atomic(pm_message_t state, int suspend_time);
+void toi_end_atomic(int stage, int suspend_time, int error);
diff --git a/kernel/power/tuxonice_bio.h b/kernel/power/tuxonice_bio.h
new file mode 100644
index 0000000..9627ccc
--- /dev/null
+++ b/kernel/power/tuxonice_bio.h
@@ -0,0 +1,77 @@
+/*
+ * kernel/power/tuxonice_bio.h
+ *
+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * Distributed under GPLv2.
+ *
+ * This file contains declarations for functions exported from
+ * tuxonice_bio.c, which contains low level io functions.
+ */
+
+#include <linux/buffer_head.h>
+#include "tuxonice_extent.h"
+
+void toi_put_extent_chain(struct hibernate_extent_chain *chain);
+int toi_add_to_extent_chain(struct hibernate_extent_chain *chain,
+		unsigned long start, unsigned long end);
+
+struct hibernate_extent_saved_state {
+	int extent_num;
+	struct hibernate_extent *extent_ptr;
+	unsigned long offset;
+};
+
+struct toi_bdev_info {
+	struct toi_bdev_info *next;
+	struct hibernate_extent_chain blocks;
+	struct block_device *bdev;
+	struct toi_module_ops *allocator;
+	int allocator_index;
+	struct hibernate_extent_chain allocations;
+	char name[266]; /* "swap on " or "file " + up to 256 chars */
+
+	/* Saved in header */
+	char uuid[17];
+	dev_t dev_t;
+	int prio;
+	int bmap_shift;
+	int blocks_per_page;
+	unsigned long pages_used;
+	struct hibernate_extent_saved_state saved_state[4];
+};
+
+struct toi_extent_iterate_state {
+	struct toi_bdev_info *current_chain;
+	int num_chains;
+	int saved_chain_number[4];
+	struct toi_bdev_info *saved_chain_ptr[4];
+};
+
+/*
+ * Our exported interface so the swapwriter and filewriter don't
+ * need these functions duplicated.
+ */
+struct toi_bio_ops {
+	int (*bdev_page_io) (int rw, struct block_device *bdev, long pos,
+			struct page *page);
+	int (*register_storage)(struct toi_bdev_info *new);
+	void (*free_storage)(void);
+};
+
+struct toi_allocator_ops {
+	unsigned long (*toi_swap_storage_available) (void);
+};
+
+extern struct toi_bio_ops toi_bio_ops;
+
+extern char *toi_writer_buffer;
+extern int toi_writer_buffer_posn;
+
+struct toi_bio_allocator_ops {
+	int (*register_storage) (void);
+	unsigned long (*storage_available)(void);
+	int (*allocate_storage) (struct toi_bdev_info *, unsigned long);
+	int (*bmap) (struct toi_bdev_info *);
+	void (*free_storage) (struct toi_bdev_info *);
+};
diff --git a/kernel/power/tuxonice_bio_chains.c b/kernel/power/tuxonice_bio_chains.c
new file mode 100644
index 0000000..c214d18
--- /dev/null
+++ b/kernel/power/tuxonice_bio_chains.c
@@ -0,0 +1,1048 @@
+/*
+ * kernel/power/tuxonice_bio_chains.c
+ *
+ * Copyright (C) 2009-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * Distributed under GPLv2.
+ *
+ */
+
+#include <linux/mm_types.h>
+#include "tuxonice_bio.h"
+#include "tuxonice_bio_internal.h"
+#include "tuxonice_alloc.h"
+#include "tuxonice_ui.h"
+#include "tuxonice.h"
+#include "tuxonice_io.h"
+
+static struct toi_bdev_info *prio_chain_head;
+static int num_chains;
+
+/* Pointer to current entry being loaded/saved. */
+struct toi_extent_iterate_state toi_writer_posn;
+
+#define metadata_size (sizeof(struct toi_bdev_info) - \
+		offsetof(struct toi_bdev_info, uuid))
+
+/*
+ * After section 0 (header) comes 2 => next_section[0] = 2
+ */
+static int next_section[3] = { 2, 3, 1 };
+
+/**
+ * dump_block_chains - print the contents of the bdev info array.
+ **/
+void dump_block_chains(void)
+{
+	int i = 0;
+	int j;
+	struct toi_bdev_info *cur_chain = prio_chain_head;
+
+	while (cur_chain) {
+		struct hibernate_extent *this = cur_chain->blocks.first;
+
+		printk(KERN_DEBUG "Chain %d (prio %d):", i, cur_chain->prio);
+
+		while (this) {
+			printk(KERN_CONT " [%lu-%lu]%s", this->start,
+					this->end, this->next ? "," : "");
+			this = this->next;
+		}
+
+		printk(KERN_CONT "\n");
+		cur_chain = cur_chain->next;
+		i++;
+	}
+
+	printk(KERN_DEBUG "Saved states:\n");
+	for (i = 0; i < 4; i++) {
+		printk(KERN_DEBUG "Slot %d: Chain %d.\n",
+			i, toi_writer_posn.saved_chain_number[i]);
+
+		cur_chain = prio_chain_head;
+		j = 0;
+		while (cur_chain) {
+			printk(KERN_DEBUG " Chain %d: Extent %d. Offset %lu.\n",
+					j, cur_chain->saved_state[i].extent_num,
+					cur_chain->saved_state[i].offset);
+			cur_chain = cur_chain->next;
+			j++;
+		}
+		printk(KERN_CONT "\n");
+	}
+}
+
+/**
+ * toi_extent_chain_next - advance the current chain's position by one block
+ **/
+static void toi_extent_chain_next(void)
+{
+	struct toi_bdev_info *this = toi_writer_posn.current_chain;
+
+	if (!this->blocks.current_extent)
+		return;
+
+	if (this->blocks.current_offset == this->blocks.current_extent->end) {
+		if (this->blocks.current_extent->next) {
+			this->blocks.current_extent =
+				this->blocks.current_extent->next;
+			this->blocks.current_offset =
+				this->blocks.current_extent->start;
+		} else {
+			this->blocks.current_extent = NULL;
+			this->blocks.current_offset = 0;
+		}
+	} else
+		this->blocks.current_offset++;
+}
+
+/**
+ * __find_next_chain_same_prio - find the next usable chain of equal priority
+ */
+
+static struct toi_bdev_info *__find_next_chain_same_prio(void)
+{
+	struct toi_bdev_info *start_chain = toi_writer_posn.current_chain;
+	struct toi_bdev_info *this = start_chain;
+	int orig_prio = this->prio;
+
+	do {
+		this = this->next;
+
+		if (!this)
+			this = prio_chain_head;
+
+		/* Back on original chain? Use it again. */
+		if (this == start_chain)
+			return start_chain;
+
+	} while (!this->blocks.current_extent || this->prio != orig_prio);
+
+	return this;
+}
+
+static void find_next_chain(void)
+{
+	struct toi_bdev_info *this;
+
+	this = __find_next_chain_same_prio();
+
+	/*
+	 * If we didn't get another chain of the same priority that we
+	 * can use, look for the next priority.
+	 */
+	while (this && !this->blocks.current_extent)
+		this = this->next;
+
+	toi_writer_posn.current_chain = this;
+}
+
+/**
+ * toi_extent_state_next - go to the next extent
+ * @blocks: The number of values to progress.
+ * @current_stream: Stream being processed; only nonzero streams are striped.
+ *
+ * Given a state, progress to the next valid entry. We may begin in an
+ * invalid state, as we do when invoked after extent_state_goto_start below.
+ *
+ * When using compression and expected_compression > 0, we let the image size
+ * be larger than storage, so we can validly run out of data to return.
+ **/
+static int toi_extent_state_next(int blocks, int current_stream)
+{
+	int i;
+
+	if (!toi_writer_posn.current_chain)
+		return -ENOSPC;
+
+	/* Assume chains always have lengths that are multiples of @blocks */
+	for (i = 0; i < blocks; i++)
+		toi_extent_chain_next();
+
+	/* The header stream is not striped */
+	if (current_stream ||
+	    !toi_writer_posn.current_chain->blocks.current_extent)
+		find_next_chain();
+
+	return toi_writer_posn.current_chain ? 0 : -ENOSPC;
+}
+
+static void toi_insert_chain_in_prio_list(struct toi_bdev_info *this)
+{
+	struct toi_bdev_info **prev_ptr;
+	struct toi_bdev_info *cur;
+
+	/* Loop through the existing chain, finding where to insert it */
+	prev_ptr = &prio_chain_head;
+	cur = prio_chain_head;
+
+	while (cur && cur->prio >= this->prio) {
+		prev_ptr = &cur->next;
+		cur = cur->next;
+	}
+
+	this->next = *prev_ptr;
+	*prev_ptr = this;
+
+	this = prio_chain_head;
+	while (this)
+		this = this->next;
+	num_chains++;
+}
+
+/**
+ * toi_extent_state_goto_start - reinitialize an extent chain iterator
+ * Point every chain at its first extent and make the head chain current.
+ **/
+void toi_extent_state_goto_start(void)
+{
+	struct toi_bdev_info *this = prio_chain_head;
+
+	while (this) {
+		toi_message(TOI_BIO, TOI_VERBOSE, 0,
+			"Setting current extent to %p.", this->blocks.first);
+		this->blocks.current_extent = this->blocks.first;
+		if (this->blocks.current_extent) {
+			toi_message(TOI_BIO, TOI_VERBOSE, 0,
+					"Setting current offset to %lu.",
+					this->blocks.current_extent->start);
+			this->blocks.current_offset =
+				this->blocks.current_extent->start;
+		}
+
+		this = this->next;
+	}
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "Setting current chain to %p.",
+			prio_chain_head);
+	toi_writer_posn.current_chain = prio_chain_head;
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "Leaving extent state goto start.");
+}
+
+/**
+ * toi_extent_state_save - save state of the iterator
+ * @slot:	Index of the saved-state slot to populate
+ *
+ * Record the current position of every chain, plus the number of the
+ * chain we are currently on, in the given slot. Positions are stored as
+ * extent numbers and offsets so they can be used with relocated chains
+ * (at resume time).
+ **/
+void toi_extent_state_save(int slot)
+{
+	struct toi_bdev_info *cur_chain = prio_chain_head;
+	struct hibernate_extent *extent;
+	struct hibernate_extent_saved_state *chain_state;
+	int i = 0;
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "toi_extent_state_save, slot %d.",
+			slot);
+
+	if (!toi_writer_posn.current_chain) {
+		toi_message(TOI_BIO, TOI_VERBOSE, 0, "No current chain => "
+				"chain_num = -1.");
+		toi_writer_posn.saved_chain_number[slot] = -1;
+		return;
+	}
+
+	while (cur_chain) {
+		i++;
+		toi_message(TOI_BIO, TOI_VERBOSE, 0, "Saving chain %d (%p) "
+				"state, slot %d.", i, cur_chain, slot);
+
+		chain_state = &cur_chain->saved_state[slot];
+
+		chain_state->offset = cur_chain->blocks.current_offset;
+
+		if (toi_writer_posn.current_chain == cur_chain) {
+			toi_writer_posn.saved_chain_number[slot] = i;
+			toi_message(TOI_BIO, TOI_VERBOSE, 0, "This is the chain "
+					"we were on => chain_num is %d.", i);
+		}
+
+		if (!cur_chain->blocks.current_extent) {
+			chain_state->extent_num = 0;
+			toi_message(TOI_BIO, TOI_VERBOSE, 0, "No current extent "
+					"for this chain => extent_num %d is 0.",
+					i);
+			cur_chain = cur_chain->next;
+			continue;
+		}
+
+		extent = cur_chain->blocks.first;
+		chain_state->extent_num = 1;
+
+		while (extent != cur_chain->blocks.current_extent) {
+			chain_state->extent_num++;
+			extent = extent->next;
+		}
+
+		toi_message(TOI_BIO, TOI_VERBOSE, 0, "extent num %d is %d.", i,
+				chain_state->extent_num);
+
+		cur_chain = cur_chain->next;
+	}
+	toi_message(TOI_BIO, TOI_VERBOSE, 0,
+			"Completed saving extent state slot %d.", slot);
+}
+
+/**
+ * toi_extent_state_restore - restore the position saved by extent_state_save
+ * @slot:	Index of the saved-state slot to restore the position from.
+ *		Saved chain pointers are refreshed for all slots as well.
+ **/
+void toi_extent_state_restore(int slot)
+{
+	int i = 0;
+	struct toi_bdev_info *cur_chain = prio_chain_head;
+	struct hibernate_extent_saved_state *chain_state;
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0,
+			"toi_extent_state_restore - slot %d.", slot);
+
+	if (toi_writer_posn.saved_chain_number[slot] == -1) {
+		toi_writer_posn.current_chain = NULL;
+		return;
+	}
+
+	while (cur_chain) {
+		int posn;
+		int j;
+		i++;
+		toi_message(TOI_BIO, TOI_VERBOSE, 0, "Restoring chain %d (%p) "
+				"state, slot %d.", i, cur_chain, slot);
+
+		chain_state = &cur_chain->saved_state[slot];
+
+		posn = chain_state->extent_num;
+
+		cur_chain->blocks.current_extent = cur_chain->blocks.first;
+		cur_chain->blocks.current_offset = chain_state->offset;
+
+		if (i == toi_writer_posn.saved_chain_number[slot]) {
+			toi_writer_posn.current_chain = cur_chain;
+			toi_message(TOI_BIO, TOI_VERBOSE, 0,
+					"Found current chain.");
+		}
+
+		for (j = 0; j < 4; j++)
+			if (i == toi_writer_posn.saved_chain_number[j]) {
+				toi_writer_posn.saved_chain_ptr[j] = cur_chain;
+				toi_message(TOI_BIO, TOI_VERBOSE, 0,
+					"Found saved chain ptr %d (%p) (offset"
+					" %lu).", j, cur_chain,
+					cur_chain->saved_state[j].offset);
+			}
+
+		if (posn) {
+			while (--posn)
+				cur_chain->blocks.current_extent =
+					cur_chain->blocks.current_extent->next;
+		} else
+			cur_chain->blocks.current_extent = NULL;
+
+		cur_chain = cur_chain->next;
+	}
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "Done.");
+	if (test_action_state(TOI_LOGALL))
+		dump_block_chains();
+}
+
+/*
+ * Storage needed
+ *
+ * Returns amount of space in the image header required
+ * for the chain data. This ignores the links between
+ * pages, which we factor in when allocating the space.
+ */
+int toi_bio_devinfo_storage_needed(void)
+{
+	int result = sizeof(num_chains);
+	struct toi_bdev_info *chain = prio_chain_head;
+
+	while (chain) {
+		result += metadata_size;
+
+		/* Chain size */
+		result += sizeof(int);
+
+		/* Extents */
+		result += (2 * sizeof(unsigned long) *
+			chain->blocks.num_extents);
+
+		chain = chain->next;
+	}
+
+	result += 4 * sizeof(int);
+	return result;
+}
+
+static unsigned long chain_pages_used(struct toi_bdev_info *chain)
+{
+	struct hibernate_extent *this = chain->blocks.first;
+	struct hibernate_extent_saved_state *state = &chain->saved_state[3];
+	unsigned long size = 0;
+	int extent_idx = 1;
+
+	if (!state->extent_num) {
+		if (!this)
+			return 0;
+		else
+			return chain->blocks.size;
+	}
+
+	while (extent_idx < state->extent_num) {
+		size += (this->end - this->start + 1);
+		this = this->next;
+		extent_idx++;
+	}
+
+	/* We didn't use the one we're sitting on, so don't count it */
+	return size + state->offset - this->start;
+}
+
+/**
+ * toi_serialise_extent_chain - write a chain in the image
+ * @chain:	Chain to write.
+ **/
+static int toi_serialise_extent_chain(struct toi_bdev_info *chain)
+{
+	struct hibernate_extent *this;
+	int ret;
+	int i = 1;
+
+	chain->pages_used = chain_pages_used(chain);
+
+	if (test_action_state(TOI_LOGALL))
+		dump_block_chains();
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "Serialising chain (dev_t %lx).",
+			(unsigned long) chain->dev_t);
+	/* Device info: uuid, dev_t, prio, bmap_shift, blocks/page, positions */
+	ret = toiActiveAllocator->rw_header_chunk(WRITE, &toi_blockwriter_ops,
+			(char *) &chain->uuid, metadata_size);
+	if (ret)
+		return ret;
+
+	/* Num extents */
+	ret = toiActiveAllocator->rw_header_chunk(WRITE, &toi_blockwriter_ops,
+			(char *) &chain->blocks.num_extents, sizeof(int));
+	if (ret)
+		return ret;
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "%d extents.",
+			chain->blocks.num_extents);
+
+	this = chain->blocks.first;
+	while (this) {
+		toi_message(TOI_BIO, TOI_VERBOSE, 0, "Extent %d.", i);
+		ret = toiActiveAllocator->rw_header_chunk(WRITE,
+				&toi_blockwriter_ops,
+				(char *) this, 2 * sizeof(this->start));
+		if (ret)
+			return ret;
+		this = this->next;
+		i++;
+	}
+
+	return ret;
+}
+
+int toi_serialise_extent_chains(void)
+{
+	struct toi_bdev_info *this = prio_chain_head;
+	int result;
+
+	/* Write the number of chains */
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "Write number of chains (%d)",
+			num_chains);
+	result = toiActiveAllocator->rw_header_chunk(WRITE,
+			&toi_blockwriter_ops, (char *) &num_chains,
+			sizeof(int));
+	if (result)
+		return result;
+
+	/* Then the chains themselves */
+	while (this) {
+		result = toi_serialise_extent_chain(this);
+		if (result)
+			return result;
+		this = this->next;
+	}
+
+	/*
+	 * Finally, the chain we should be on at the start of each
+	 * section.
+	 */
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "Saved chain numbers.");
+	result = toiActiveAllocator->rw_header_chunk(WRITE,
+			&toi_blockwriter_ops,
+			(char *) &toi_writer_posn.saved_chain_number[0],
+			4 * sizeof(int));
+
+	return result;
+}
+
+int toi_register_storage_chain(struct toi_bdev_info *new)
+{
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "Inserting chain %p into list.",
+			new);
+	toi_insert_chain_in_prio_list(new);
+	return 0;
+}
+
+static void free_bdev_info(struct toi_bdev_info *chain)
+{
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "Free chain %p.", chain);
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, " - Block extents.");
+	toi_put_extent_chain(&chain->blocks);
+
+	/*
+	 * The allocator may need to do more than just free the chains
+	 * (swap_free, for example). Don't call from boot kernel.
+	 */
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, " - Allocator extents.");
+	if (chain->allocator)
+		chain->allocator->bio_allocator_ops->free_storage(chain);
+
+	/*
+	 * Dropping out of reading atomic copy? Need to undo
+	 * toi_open_bdev.
+	 */
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, " - Bdev.");
+	if (chain->bdev && !IS_ERR(chain->bdev) &&
+			chain->bdev != resume_block_device &&
+			chain->bdev != header_block_device &&
+			test_toi_state(TOI_TRYING_TO_RESUME))
+		toi_close_bdev(chain->bdev);
+
+	/* Poison */
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, " - Struct.");
+	toi_kfree(39, chain, sizeof(*chain));
+
+	if (prio_chain_head == chain)
+		prio_chain_head = NULL;
+
+	num_chains--;
+}
+
+void free_all_bdev_info(void)
+{
+	struct toi_bdev_info *this = prio_chain_head;
+
+	while (this) {
+		struct toi_bdev_info *next = this->next;
+		free_bdev_info(this);
+		this = next;
+	}
+
+	memset((char *) &toi_writer_posn, 0, sizeof(toi_writer_posn));
+	prio_chain_head = NULL;
+}
+
+static void set_up_start_position(void)
+{
+	toi_writer_posn.current_chain = prio_chain_head;
+	go_next_page(0, 0);
+}
+
+/**
+ * toi_load_extent_chain - read back a chain saved in the image
+ * @index:	Index of the chain being loaded
+ * @num_loaded:	Running count of extents loaded so far (updated)
+ *
+ * Rebuild the chain's linked list of extents from the image header.
+ **/
+int toi_load_extent_chain(int index, int *num_loaded)
+{
+	struct toi_bdev_info *chain = toi_kzalloc(39,
+			sizeof(struct toi_bdev_info), GFP_ATOMIC);
+	struct hibernate_extent *this, *last = NULL;
+	int i, ret;
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "Loading extent chain %d.", index);
+	/* Get uuid, dev_t, prio, bmap_shift, blocks per page, positions */
+	ret = toiActiveAllocator->rw_header_chunk_noreadahead(READ, NULL,
+			(char *) &chain->uuid, metadata_size);
+
+	if (ret) {
+		printk(KERN_ERR "Failed to read the extent chain metadata.\n");
+		toi_kfree(39, chain, sizeof(*chain));
+		return 1;
+	}
+
+	toi_bkd.pages_used[index] = chain->pages_used;
+
+	ret = toiActiveAllocator->rw_header_chunk_noreadahead(READ, NULL,
+			(char *) &chain->blocks.num_extents, sizeof(int));
+	if (ret) {
+		printk(KERN_ERR "Failed to read the size of extent chain.\n");
+		toi_kfree(39, chain, sizeof(*chain));
+		return 1;
+	}
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "%d extents.",
+			chain->blocks.num_extents);
+
+	for (i = 0; i < chain->blocks.num_extents; i++) {
+		toi_message(TOI_BIO, TOI_VERBOSE, 0, "Extent %d.", i + 1);
+
+		this = toi_kzalloc(2, sizeof(struct hibernate_extent),
+				TOI_ATOMIC_GFP);
+		if (!this) {
+			printk(KERN_INFO "Failed to allocate a new extent.\n");
+			free_bdev_info(chain);
+			return -ENOMEM;
+		}
+		this->next = NULL;
+		/* Get the next page */
+		ret = toiActiveAllocator->rw_header_chunk_noreadahead(READ,
+				NULL, (char *) this, 2 * sizeof(this->start));
+		if (ret) {
+			printk(KERN_INFO "Failed to read an extent.\n");
+			toi_kfree(2, this, sizeof(struct hibernate_extent));
+			free_bdev_info(chain);
+			return 1;
+		}
+
+		if (last)
+			last->next = this;
+		else {
+			char b1[32], b2[32], b3[32];
+			/*
+			 * Open the bdev
+			 */
+			toi_message(TOI_BIO, TOI_VERBOSE, 0,
+				"Chain dev_t is %s. Resume dev_t is %s. Header"
+				" bdev_t is %s.\n",
+				format_dev_t(b1, chain->dev_t),
+				format_dev_t(b2, resume_dev_t),
+				format_dev_t(b3, toi_sig_data->header_dev_t));
+
+			if (chain->dev_t == resume_dev_t)
+				chain->bdev = resume_block_device;
+			else if (chain->dev_t == toi_sig_data->header_dev_t)
+				chain->bdev = header_block_device;
+			else {
+				chain->bdev = toi_open_bdev(chain->uuid,
+						chain->dev_t, 1);
+				if (IS_ERR(chain->bdev)) {
+					free_bdev_info(chain);
+					return -ENODEV;
+				}
+			}
+
+			toi_message(TOI_BIO, TOI_VERBOSE, 0, "Chain bmap shift "
+					"is %d and blocks per page is %d.",
+					chain->bmap_shift,
+					chain->blocks_per_page);
+
+			chain->blocks.first = this;
+
+			/*
+			 * Couldn't do this earlier, but can't do
+			 * goto_start now - we may have already used blocks
+			 * in the first chain.
+			 */
+			chain->blocks.current_extent = this;
+			chain->blocks.current_offset = this->start;
+
+			/*
+			 * Can't wait until we've read the whole chain
+			 * before we insert it in the list. We might need
+			 * this chain to read the next page in the header
+			 */
+			toi_insert_chain_in_prio_list(chain);
+		}
+
+		/*
+		 * We have to wait until 2 extents are loaded before setting up
+		 * properly because if the first extent has only one page, we
+		 * will need to put the position on the second extent. Sounds
+		 * obvious, but it wasn't!
+		 */
+		(*num_loaded)++;
+		if ((*num_loaded) == 2)
+			set_up_start_position();
+		last = this;
+	}
+
+	/*
+	 * Shouldn't get empty chains, but it's not impossible. Link them in so
+	 * they get freed properly later.
+	 */
+	if (!chain->blocks.num_extents)
+		toi_insert_chain_in_prio_list(chain);
+
+	if (!chain->blocks.current_extent) {
+		chain->blocks.current_extent = chain->blocks.first;
+		if (chain->blocks.current_extent)
+			chain->blocks.current_offset =
+				chain->blocks.current_extent->start;
+	}
+	return 0;
+}
+
+int toi_load_extent_chains(void)
+{
+	int result;
+	int to_load;
+	int i;
+	int extents_loaded = 0;
+
+	result = toiActiveAllocator->rw_header_chunk_noreadahead(READ, NULL,
+			(char *) &to_load,
+			sizeof(int));
+	if (result)
+		return result;
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "%d chains to read.", to_load);
+
+	for (i = 0; i < to_load; i++) {
+		toi_message(TOI_BIO, TOI_VERBOSE, 0, " >> Loading chain %d/%d.",
+				i, to_load);
+		result = toi_load_extent_chain(i, &extents_loaded);
+		if (result)
+			return result;
+	}
+
+	/* If we never got to a second extent, we still need to do this. */
+	if (extents_loaded == 1)
+		set_up_start_position();
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "Loading saved chain numbers.");
+	result = toiActiveAllocator->rw_header_chunk_noreadahead(READ,
+			&toi_blockwriter_ops,
+			(char *) &toi_writer_posn.saved_chain_number[0],
+			4 * sizeof(int));
+
+	return result;
+}
+
+static int toi_end_of_stream(int writing, int section_barrier)
+{
+	struct toi_bdev_info *cur_chain = toi_writer_posn.current_chain;
+	int compare_to = next_section[current_stream];
+	struct toi_bdev_info *compare_chain =
+		toi_writer_posn.saved_chain_ptr[compare_to];
+	int compare_offset = compare_chain ?
+		compare_chain->saved_state[compare_to].offset : 0;
+
+	if (!section_barrier)
+		return 0;
+
+	if (!cur_chain)
+		return 1;
+
+	if (cur_chain == compare_chain &&
+	    cur_chain->blocks.current_offset == compare_offset) {
+		if (writing) {
+			if (!current_stream) {
+				debug_broken_header();
+				return 1;
+			}
+		} else {
+			more_readahead = 0;
+			toi_message(TOI_BIO, TOI_VERBOSE, 0,
+					"Reached the end of stream %d "
+					"(not an error).", current_stream);
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * go_next_page - skip blocks to the start of the next page
+ * @writing: Whether we're reading or writing the image.
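+ * @section_barrier: Whether running out of extents while writing should
+ *	be logged (with a dump of the chains if TOI_LOGALL is set).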
+ **/
+int go_next_page(int writing, int section_barrier)
+{
+	struct toi_bdev_info *cur_chain = toi_writer_posn.current_chain;
+	int max = cur_chain ? cur_chain->blocks_per_page : 1;
+
+	/* Go forward a page - or maybe two. Don't stripe the header,
+	 * so that bad fragmentation doesn't put the extent data containing
+	 * the location of the second page out of the first header page.
+	 */
+	if (toi_extent_state_next(max, current_stream)) {
+		/* Don't complain if readahead falls off the end */
+		if (writing && section_barrier) {
+			toi_message(TOI_BIO, TOI_VERBOSE, 0, "Extent state eof. "
+				"Expected compression ratio too optimistic?");
+			if (test_action_state(TOI_LOGALL))
+				dump_block_chains();
+		}
+		toi_message(TOI_BIO, TOI_VERBOSE, 0, "Ran out of extents to "
+				"read/write. (Not necessarily a fatal error.)");
+		return -ENOSPC;
+	}
+
+	return 0;
+}
+
+int devices_of_same_priority(struct toi_bdev_info *this)
+{
+	struct toi_bdev_info *check = prio_chain_head;
+	int i = 0;
+
+	while (check) {
+		if (check->prio == this->prio)
+			i++;
+		check = check->next;
+	}
+
+	return i;
+}
+
+/**
+ * toi_bio_rw_page - do i/o on the next disk page in the image
+ * @writing: Whether reading or writing.
+ * @page: Page to do i/o on.
+ * @is_readahead: Whether we're doing readahead
+ * @free_group: The group used in allocating the page
+ *
+ * Submit a page for reading or writing, possibly readahead.
+ * Pass the group used in allocating the page as well, as it should
+ * be freed on completion of the bio if we're writing the page.
+ **/
+int toi_bio_rw_page(int writing, struct page *page,
+		int is_readahead, int free_group)
+{
+	int result = toi_end_of_stream(writing, 1);
+	struct toi_bdev_info *dev_info = toi_writer_posn.current_chain;
+
+	if (result) {
+		if (writing)
+			abort_hibernate(TOI_INSUFFICIENT_STORAGE,
+				"Insufficient storage for your image.");
+		else
+			toi_message(TOI_BIO, TOI_VERBOSE, 0, "Seeking to "
+				"read/write another page when stream has "
+				"ended.");
+		return -ENOSPC;
+	}
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0,
+			"%s %lx:%lu", writing ? "Write" : "Read",
+			(unsigned long) dev_info->dev_t,
+			dev_info->blocks.current_offset);
+
+	result = toi_do_io(writing, dev_info->bdev,
+		dev_info->blocks.current_offset << dev_info->bmap_shift,
+		page, is_readahead, 0, free_group);
+
+	/* Ignore the result here - will check end of stream if come in again */
+	go_next_page(writing, 1);
+
+	if (result)
+		printk(KERN_ERR "toi_do_io returned %d.\n", result);
+	return result;
+}
+
+dev_t get_header_dev_t(void)
+{
+	return prio_chain_head->dev_t;
+}
+
+struct block_device *get_header_bdev(void)
+{
+	return prio_chain_head->bdev;
+}
+
+unsigned long get_headerblock(void)
+{
+	return prio_chain_head->blocks.first->start <<
+		prio_chain_head->bmap_shift;
+}
+
+int get_main_pool_phys_params(void)
+{
+	struct toi_bdev_info *this = prio_chain_head;
+	int result;
+
+	while (this) {
+		result = this->allocator->bio_allocator_ops->bmap(this);
+		if (result)
+			return result;
+		this = this->next;
+	}
+
+	return 0;
+}
+
+static int apply_header_reservation(void)
+{
+	int i;
+
+	if (!header_pages_reserved) {
+		toi_message(TOI_BIO, TOI_VERBOSE, 0,
+				"No header pages reserved at the moment.");
+		return 0;
+	}
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "Applying header reservation.");
+
+	/* Apply header space reservation */
+	toi_extent_state_goto_start();
+
+	for (i = 0; i < header_pages_reserved; i++)
+		if (go_next_page(1, 0))
+			return -ENOSPC;
+
+	/* The end of header pages will be the start of pageset 2 */
+	toi_extent_state_save(2);
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0,
+			"Finished applying header reservation.");
+	return 0;
+}
+
+static int toi_bio_register_storage(void)
+{
+	int result = 0;
+	struct toi_module_ops *this_module;
+
+	list_for_each_entry(this_module, &toi_modules, module_list) {
+		if (!this_module->enabled ||
+		    this_module->type != BIO_ALLOCATOR_MODULE)
+			continue;
+		toi_message(TOI_BIO, TOI_VERBOSE, 0,
+				"Registering storage from %s.",
+				this_module->name);
+		result = this_module->bio_allocator_ops->register_storage();
+		if (result)
+			break;
+	}
+
+	return result;
+}
+
+int toi_bio_allocate_storage(unsigned long request)
+{
+	struct toi_bdev_info *chain = prio_chain_head;
+	unsigned long to_get = request;
+	unsigned long extra_pages, needed;
+	int no_free = 0;
+
+	if (!chain) {
+		int result = toi_bio_register_storage();
+		toi_message(TOI_BIO, TOI_VERBOSE, 0, "toi_bio_allocate_storage: "
+			"Registering storage.");
+		if (result)
+			return 0;
+		chain = prio_chain_head;
+		if (!chain) {
+			printk("TuxOnIce: No storage was registered.\n");
+			return 0;
+		}
+	}
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "toi_bio_allocate_storage: "
+			"Request is %lu pages.", request);
+	extra_pages = DIV_ROUND_UP(request * (sizeof(unsigned long)
+			       + sizeof(int)), PAGE_SIZE);
+	needed = request + extra_pages + header_pages_reserved;
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "Adding %lu extra pages and %lu "
+			"for header => %lu.",
+			extra_pages, header_pages_reserved, needed);
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "Already allocated %lu pages.",
+			raw_pages_allocd);
+
+	to_get = needed > raw_pages_allocd ? needed - raw_pages_allocd : 0;
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "Need to get %lu pages.", to_get);
+
+	if (!to_get)
+		return apply_header_reservation();
+
+	while (to_get && chain) {
+		int num_group = devices_of_same_priority(chain);
+		int divisor = num_group - no_free;
+		int i;
+		unsigned long portion = DIV_ROUND_UP(to_get, divisor);
+		unsigned long got = 0;
+		unsigned long got_this_round = 0;
+		struct toi_bdev_info *top = chain;
+
+		toi_message(TOI_BIO, TOI_VERBOSE, 0,
+				" Start of loop. To get is %lu. Divisor is %d.",
+				to_get, divisor);
+		no_free = 0;
+
+		/*
+		 * We're aiming to spread the allocated storage as evenly
+		 * as possible, but we also want to get all the storage we
+		 * can off this priority.
+		 */
+		for (i = 0; i < num_group; i++) {
+			struct toi_bio_allocator_ops *ops =
+				chain->allocator->bio_allocator_ops;
+			toi_message(TOI_BIO, TOI_VERBOSE, 0,
+					" Asking for %lu pages from chain %p.",
+					portion, chain);
+			got = ops->allocate_storage(chain, portion);
+			toi_message(TOI_BIO, TOI_VERBOSE, 0,
+					" Got %lu pages from allocator %p.",
+					got, chain);
+			if (!got)
+				no_free++;
+			got_this_round += got;
+			chain = chain->next;
+		}
+		toi_message(TOI_BIO, TOI_VERBOSE, 0, " Loop finished. Got a "
+				"total of %lu pages from %d allocators.",
+				got_this_round, divisor - no_free);
+
+		raw_pages_allocd += got_this_round;
+		to_get = needed > raw_pages_allocd ? needed - raw_pages_allocd :
+			0;
+
+		/*
+		 * If we got anything from chains of this priority and we
+		 * still have storage to allocate, go over this priority
+		 * again.
+		 */
+		if (got_this_round && to_get)
+			chain = top;
+		else
+			no_free = 0;
+	}
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "Finished allocating. Calling "
+			"get_main_pool_phys_params");
+	/* Now let swap allocator bmap the pages */
+	get_main_pool_phys_params();
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "Done. Reserving header.");
+	return apply_header_reservation();
+}
+
+void toi_bio_chains_post_atomic(struct toi_boot_kernel_data *bkd)
+{
+	int i = 0;
+	struct toi_bdev_info *cur_chain = prio_chain_head;
+
+	while (cur_chain) {
+		cur_chain->pages_used = bkd->pages_used[i];
+		cur_chain = cur_chain->next;
+		i++;
+	}
+}
+
+int toi_bio_chains_debug_info(char *buffer, int size)
+{
+	/* Show what we actually used */
+	struct toi_bdev_info *cur_chain = prio_chain_head;
+	int len = 0;
+
+	while (cur_chain) {
+		len += scnprintf(buffer + len, size - len, "  Used %lu pages "
+				"from %s.\n", cur_chain->pages_used,
+				cur_chain->name);
+		cur_chain = cur_chain->next;
+	}
+
+	return len;
+}
diff --git a/kernel/power/tuxonice_bio_core.c b/kernel/power/tuxonice_bio_core.c
new file mode 100644
index 0000000..4edea73
--- /dev/null
+++ b/kernel/power/tuxonice_bio_core.c
@@ -0,0 +1,1839 @@
+/*
+ * kernel/power/tuxonice_bio_core.c
+ *
+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * Distributed under GPLv2.
+ *
+ * This file contains block io functions for TuxOnIce. These are
+ * used by the swapwriter and it is planned that they will also
+ * be used by the NFSwriter.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/syscalls.h>
+#include <linux/suspend.h>
+#include <linux/ctype.h>
+#include <linux/fs_uuid.h>
+
+#include "tuxonice.h"
+#include "tuxonice_sysfs.h"
+#include "tuxonice_modules.h"
+#include "tuxonice_prepare_image.h"
+#include "tuxonice_bio.h"
+#include "tuxonice_ui.h"
+#include "tuxonice_alloc.h"
+#include "tuxonice_io.h"
+#include "tuxonice_builtin.h"
+#include "tuxonice_bio_internal.h"
+
+#define MEMORY_ONLY 1
+#define THROTTLE_WAIT 2
+
+/* #define MEASURE_MUTEX_CONTENTION */
+#ifndef MEASURE_MUTEX_CONTENTION
+#define my_mutex_lock(index, the_lock) mutex_lock(the_lock)
+#define my_mutex_unlock(index, the_lock) mutex_unlock(the_lock)
+#else
+unsigned long mutex_times[2][2][NR_CPUS];
+#define my_mutex_lock(index, the_lock) do { \
+	int have_mutex; \
+	have_mutex = mutex_trylock(the_lock); \
+	if (!have_mutex) { \
+		mutex_lock(the_lock); \
+		mutex_times[index][0][smp_processor_id()]++; \
+	} else { \
+		mutex_times[index][1][smp_processor_id()]++; \
+	}
+
+#define my_mutex_unlock(index, the_lock) \
+	mutex_unlock(the_lock); \
+} while (0)
+#endif
+
+static int page_idx, reset_idx;
+
+static int target_outstanding_io = 1024;
+static int max_outstanding_writes, max_outstanding_reads;
+
+static struct page *bio_queue_head, *bio_queue_tail;
+static atomic_t toi_bio_queue_size;
+static DEFINE_SPINLOCK(bio_queue_lock);
+
+static int free_mem_throttle, throughput_throttle;
+int more_readahead = 1;
+static struct page *readahead_list_head, *readahead_list_tail;
+
+static struct page *waiting_on;
+
+static atomic_t toi_io_in_progress, toi_io_done;
+static DECLARE_WAIT_QUEUE_HEAD(num_in_progress_wait);
+
+int current_stream;
+/* Not static, so that the allocators can set up and complete
+ * writing the header */
+char *toi_writer_buffer;
+int toi_writer_buffer_posn;
+
+static DEFINE_MUTEX(toi_bio_mutex);
+static DEFINE_MUTEX(toi_bio_readahead_mutex);
+
+static struct task_struct *toi_queue_flusher;
+static int toi_bio_queue_flush_pages(int dedicated_thread);
+
+struct toi_module_ops toi_blockwriter_ops;
+
+#define TOTAL_OUTSTANDING_IO (atomic_read(&toi_io_in_progress) + \
+	       atomic_read(&toi_bio_queue_size))
+
+unsigned long raw_pages_allocd, header_pages_reserved;
+
+/**
+ * set_free_mem_throttle - set the point where we pause to avoid oom.
+ *
+ * Initially, this value is zero, but when we first fail to allocate memory,
+ * we set it (plus a buffer) and thereafter throttle i/o once that limit is
+ * reached.
+ **/
+static void set_free_mem_throttle(void)
+{
+	int new_throttle = nr_free_buffer_pages() + 256;
+
+	if (new_throttle > free_mem_throttle)
+		free_mem_throttle = new_throttle;
+}
+
+#define NUM_REASONS 7
+static atomic_t reasons[NUM_REASONS];
+static char *reason_name[NUM_REASONS] = {
+	"readahead not ready",
+	"bio allocation",
+	"synchronous I/O",
+	"toi_bio_get_new_page",
+	"memory low",
+	"readahead buffer allocation",
+	"throughput_throttle",
+};
+
+/* User Specified Parameters. */
+unsigned long resume_firstblock;
+dev_t resume_dev_t;
+struct block_device *resume_block_device;
+static atomic_t resume_bdev_open_count;
+
+struct block_device *header_block_device;
+
+/**
+ * toi_open_bdev: Open a bdev at resume time.
+ *
+ * index: The swap index. May be MAX_SWAPFILES for the resume_dev_t
+ * (the user can have resume= pointing at a swap partition/file that isn't
+ * swapon'd when they hibernate. MAX_SWAPFILES+1 for the first page of the
+ * header. It will be from a swap partition that was enabled when we hibernated,
+ * but we don't know it's real index until we read that first page.
+ * dev_t: The device major/minor.
+ * display_errs: Whether to try to do this quietly.
+ *
+ * We stored a dev_t in the image header. Open the matching device without
+ * requiring /dev/<whatever> in most cases and record the details needed
+ * to close it later and avoid duplicating work.
+ */
+struct block_device *toi_open_bdev(char *uuid, dev_t default_device,
+		int display_errs)
+{
+	struct block_device *bdev;
+	dev_t device = default_device;
+	char buf[32];
+	int retried = 0;
+
+retry:
+	if (uuid) {
+		struct fs_info seek;
+		strncpy((char *) &seek.uuid, uuid, 16);
+		seek.dev_t = 0;
+		seek.last_mount_size = 0;
+		device = blk_lookup_fs_info(&seek);
+		if (!device) {
+			device = default_device;
+			printk(KERN_DEBUG "Unable to resolve uuid. Falling back"
+					" to dev_t.\n");
+		} else
+			printk(KERN_DEBUG "Resolved uuid to device %s.\n",
+					format_dev_t(buf, device));
+	}
+
+	if (!device) {
+		printk(KERN_ERR "TuxOnIce attempting to open a "
+				"blank dev_t!\n");
+		dump_stack();
+		return NULL;
+	}
+	bdev = toi_open_by_devnum(device);
+
+	if (IS_ERR(bdev) || !bdev) {
+		if (!retried) {
+			retried = 1;
+			wait_for_device_probe();
+			goto retry;
+		}
+		if (display_errs)
+			toi_early_boot_message(1, TOI_CONTINUE_REQ,
+				"Failed to get access to block device "
+				"\"%x\" (error %d).\n Maybe you need "
+				"to run mknod and/or lvmsetup in an "
+				"initrd/ramfs?", device, (int) PTR_ERR(bdev));
+		return ERR_PTR(-EINVAL);
+	}
+	toi_message(TOI_BIO, TOI_VERBOSE, 0,
+			"TuxOnIce got bdev %p for dev_t %x.",
+			bdev, device);
+
+	return bdev;
+}
+
+static void toi_bio_reserve_header_space(unsigned long request)
+{
+	header_pages_reserved = request;
+}
+
+/**
+ * do_bio_wait - wait for some TuxOnIce I/O to complete
+ * @reason: The array index of the reason we're waiting.
+ *
+ * Wait for a particular page of I/O if we're after a particular page.
+ * If we're not after a particular page, wait instead for all in flight
+ * I/O to be completed or for us to have enough free memory to be able
+ * to submit more I/O.
+ *
+ * If we wait, we also update our statistics regarding why we waited.
+ **/
+static void do_bio_wait(int reason)
+{
+	struct page *was_waiting_on = waiting_on;
+
+	/* On SMP, waiting_on can be reset, so we make a copy */
+	if (was_waiting_on) {
+		wait_on_page_locked(was_waiting_on);
+		atomic_inc(&reasons[reason]);
+	} else {
+		atomic_inc(&reasons[reason]);
+
+		wait_event(num_in_progress_wait,
+			!atomic_read(&toi_io_in_progress) ||
+			nr_free_buffer_pages() > free_mem_throttle);
+	}
+}
+
+/**
+ * throttle_if_needed - wait for I/O completion if throttle points are reached
+ * @flags: What to check and how to act.
+ *
+ * Check whether we need to wait for some I/O to complete. We always check
+ * whether we have enough memory available, but may also (depending upon
+ * @flags) check if the throughput throttle limit has been reached.
+ **/
+static int throttle_if_needed(int flags)
+{
+	int free_pages = nr_free_buffer_pages();
+
+	/* Getting low on memory and I/O is in progress? */
+	while (unlikely(free_pages < free_mem_throttle) &&
+			atomic_read(&toi_io_in_progress) &&
+			!test_result_state(TOI_ABORTED)) {
+		if (!(flags & THROTTLE_WAIT))
+			return -ENOMEM;
+		do_bio_wait(4);
+		free_pages = nr_free_buffer_pages();
+	}
+
+	while (!(flags & MEMORY_ONLY) && throughput_throttle &&
+		TOTAL_OUTSTANDING_IO >= throughput_throttle &&
+		!test_result_state(TOI_ABORTED)) {
+		int result = toi_bio_queue_flush_pages(0);
+		if (result)
+			return result;
+		atomic_inc(&reasons[6]);
+		wait_event(num_in_progress_wait,
+			!atomic_read(&toi_io_in_progress) ||
+			TOTAL_OUTSTANDING_IO < throughput_throttle);
+	}
+
+	return 0;
+}
+
+/**
+ * update_throughput_throttle - update the raw throughput throttle
+ * @jif_index: The number of times this function has been called.
+ *
+ * This function is called four times per second by the core, and used to limit
+ * the amount of I/O we submit at once, spreading out our waiting through the
+ * whole job and letting userui get an opportunity to do its work.
+ *
+ * We don't start limiting I/O until 1/4s has gone so that we get a
+ * decent sample for our initial limit, and keep updating it because
+ * throughput may vary (on rotating media, eg) with our block number.
+ *
+ * We throttle to 1/10s worth of I/O.
+ **/
+static void update_throughput_throttle(int jif_index)
+{
+	int done = atomic_read(&toi_io_done);
+	throughput_throttle = done * 2 / 5 / jif_index;
+}
+
+/**
+ * toi_finish_all_io - wait for all outstanding i/o to complete
+ *
+ * Flush any queued but unsubmitted I/O and wait for it all to complete.
+ **/
+static int toi_finish_all_io(void)
+{
+	int result = toi_bio_queue_flush_pages(0);
+	toi_bio_queue_flusher_should_finish = 1;
+	wake_up(&toi_io_queue_flusher);
+	wait_event(num_in_progress_wait, !TOTAL_OUTSTANDING_IO);
+	return result;
+}
+
+/**
+ * toi_end_bio - bio completion function.
+ * @bio: bio that has completed.
+ * @err: Error value. Yes, like end_swap_bio_read, we ignore it.
+ *
+ * Function called by the block driver from interrupt context when I/O is
+ * completed. If we were writing the page, we want to free it and will have
+ * set bio->bi_private to the parameter we should use in telling the page
+ * allocation accounting code what the page was allocated for. If we're
+ * reading the page, it will be in the singly linked list made from
+ * page->private pointers.
+ **/
+static void toi_end_bio(struct bio *bio, int err)
+{
+	struct page *page = bio->bi_io_vec[0].bv_page;
+
+	BUG_ON(!test_bit(BIO_UPTODATE, &bio->bi_flags));
+
+	unlock_page(page);
+	bio_put(bio);
+
+	if (waiting_on == page)
+		waiting_on = NULL;
+
+	put_page(page);
+
+	if (bio->bi_private)
+		toi__free_page((int) ((unsigned long) bio->bi_private), page);
+
+	bio_put(bio);
+
+	atomic_dec(&toi_io_in_progress);
+	atomic_inc(&toi_io_done);
+
+	wake_up(&num_in_progress_wait);
+}
+
+/**
+ * submit - submit BIO request
+ * @writing: READ or WRITE.
+ * @dev: The block device we're using.
+ * @first_block: The first sector we're using.
+ * @page: The page being used for I/O.
+ * @free_group: If writing, the group that was used in allocating the page
+ * 	and which will be used in freeing the page from the completion
+ * 	routine.
+ *
+ * Based on Patrick Mochell's pmdisk code from long ago: "Straight from the
+ * textbook - allocate and initialize the bio. If we're writing, make sure
+ * the page is marked as dirty. Then submit it and carry on."
+ *
+ * If we're just testing the speed of our own code, we fake having done all
+ * the hard work and call toi_end_bio immediately.
+ **/
+static int submit(int writing, struct block_device *dev, sector_t first_block,
+		struct page *page, int free_group)
+{
+	struct bio *bio = NULL;
+	int cur_outstanding_io, result;
+
+	/*
+	 * Shouldn't throttle if reading - can deadlock in the single
+	 * threaded case as pages are only freed when we use the
+	 * readahead.
+	 */
+	if (writing) {
+		result = throttle_if_needed(MEMORY_ONLY | THROTTLE_WAIT);
+		if (result)
+			return result;
+	}
+
+	while (!bio) {
+		bio = bio_alloc(TOI_ATOMIC_GFP, 1);
+		if (!bio) {
+			set_free_mem_throttle();
+			do_bio_wait(1);
+		}
+	}
+
+	bio->bi_bdev = dev;
+	bio->bi_sector = first_block;
+	bio->bi_private = (void *) ((unsigned long) free_group);
+	bio->bi_end_io = toi_end_bio;
+	bio->bi_flags |= (1 << BIO_TOI);
+
+	if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) {
+		printk(KERN_DEBUG "ERROR: adding page to bio at %lld\n",
+				(unsigned long long) first_block);
+		bio_put(bio);
+		return -EFAULT;
+	}
+
+	bio_get(bio);
+
+	cur_outstanding_io = atomic_add_return(1, &toi_io_in_progress);
+	if (writing) {
+		if (cur_outstanding_io > max_outstanding_writes)
+			max_outstanding_writes = cur_outstanding_io;
+	} else {
+		if (cur_outstanding_io > max_outstanding_reads)
+			max_outstanding_reads = cur_outstanding_io;
+	}
+
+
+	/* Still read the header! */
+	if (unlikely(test_action_state(TOI_TEST_BIO) && writing)) {
+		/* Fake having done the hard work */
+		set_bit(BIO_UPTODATE, &bio->bi_flags);
+		toi_end_bio(bio, 0);
+	} else
+		submit_bio(writing | REQ_SYNC, bio);
+
+	return 0;
+}
+
+/**
+ * toi_do_io - prepare to do some i/o on a page and submit or batch it
+ * @writing: Whether reading or writing.
+ * @bdev: The block device which we're using.
+ * @block0: The first sector we're reading or writing.
+ * @page: The page on which I/O is being done.
+ * @is_readahead: Whether to add the page to the readahead list.
+ * @syncio: Whether the i/o is being done synchronously.
+ * @free_group: If writing, the group the page was allocated from.
+ *
+ * Prepare and start a read or write operation.
+ *
+ * Note that we always work with our own page. If writing, we might be given a
+ * compression buffer that will immediately be used to start compressing the
+ * next page. For reading, we do readahead and therefore don't know the final
+ * address where the data needs to go.
+ **/
+int toi_do_io(int writing, struct block_device *bdev, long block0,
+	struct page *page, int is_readahead, int syncio, int free_group)
+{
+	page->private = 0;
+
+	/* Do here so we don't race against toi_bio_get_next_page_read */
+	lock_page(page);
+
+	if (is_readahead) {
+		if (readahead_list_head)
+			readahead_list_tail->private = (unsigned long) page;
+		else
+			readahead_list_head = page;
+
+		readahead_list_tail = page;
+	}
+
+	/* Done before submitting to avoid races. */
+	if (syncio)
+		waiting_on = page;
+
+	/* Submit the page */
+	get_page(page);
+
+	if (submit(writing, bdev, block0, page, free_group))
+		return -EFAULT;
+
+	if (syncio)
+		do_bio_wait(2);
+
+	return 0;
+}
+
+/**
+ * toi_bdev_page_io - simpler interface to do direct i/o on a single page
+ * @writing: Whether reading or writing.
+ * @bdev: Block device on which we're operating.
+ * @pos: Sector at which page to read or write starts.
+ * @page: Page to be read/written.
+ *
+ * A simple interface to submit a page of I/O and wait for its completion.
+ * The caller must free the page used.
+ **/
+static int toi_bdev_page_io(int writing, struct block_device *bdev,
+		long pos, struct page *page)
+{
+	return toi_do_io(writing, bdev, pos, page, 0, 1, 0);
+}
+
+/**
+ * toi_bio_memory_needed - report the amount of memory needed for block i/o
+ *
+ * We want to have at least enough memory so as to have target_outstanding_io
+ * or more transactions on the fly at once. If we can do more, fine.
+ **/
+static int toi_bio_memory_needed(void)
+{
+	return target_outstanding_io * (PAGE_SIZE + sizeof(struct request) +
+				sizeof(struct bio));
+}
+
+/**
+ * toi_bio_print_debug_stats - put out debugging info in the buffer provided
+ * @buffer: A buffer of size @size into which text should be placed.
+ * @size: The size of @buffer.
+ *
+ * Fill a buffer with debugging info. This is used for both our debug_info sysfs
+ * entry and for recording the same info in dmesg.
+ **/
+static int toi_bio_print_debug_stats(char *buffer, int size)
+{
+	int len = 0;
+
+	if (toiActiveAllocator != &toi_blockwriter_ops) {
+		len = scnprintf(buffer, size,
+				"- Block I/O inactive.\n");
+		return len;
+	}
+
+	len = scnprintf(buffer, size, "- Block I/O active.\n");
+
+	len += toi_bio_chains_debug_info(buffer + len, size - len);
+
+	len += scnprintf(buffer + len, size - len,
+			"- Max outstanding reads %d. Max writes %d.\n",
+			max_outstanding_reads, max_outstanding_writes);
+
+	len += scnprintf(buffer + len, size - len,
+		"  Memory_needed: %d x (%lu + %u + %u) = %d bytes.\n",
+		target_outstanding_io,
+		PAGE_SIZE, (unsigned int) sizeof(struct request),
+		(unsigned int) sizeof(struct bio), toi_bio_memory_needed());
+
+#ifdef MEASURE_MUTEX_CONTENTION
+	{
+	int i;
+
+	len += scnprintf(buffer + len, size - len,
+		"  Mutex contention while reading:\n  Contended      Free\n");
+
+	for_each_online_cpu(i)
+		len += scnprintf(buffer + len, size - len,
+		"  %9lu %9lu\n",
+		mutex_times[0][0][i], mutex_times[0][1][i]);
+
+	len += scnprintf(buffer + len, size - len,
+		"  Mutex contention while writing:\n  Contended      Free\n");
+
+	for_each_online_cpu(i)
+		len += scnprintf(buffer + len, size - len,
+		"  %9lu %9lu\n",
+		mutex_times[1][0][i], mutex_times[1][1][i]);
+
+	}
+#endif
+
+	return len + scnprintf(buffer + len, size - len,
+		"  Free mem throttle point reached %d.\n", free_mem_throttle);
+}
+
+static int total_header_bytes;
+static int unowned;
+
+void debug_broken_header(void)
+{
+	printk(KERN_DEBUG "Image header too big for size allocated!\n");
+	print_toi_header_storage_for_modules();
+	printk(KERN_DEBUG "Page flags : %d.\n", toi_pageflags_space_needed());
+	printk(KERN_DEBUG "toi_header : %zu.\n", sizeof(struct toi_header));
+	printk(KERN_DEBUG "Total unowned : %d.\n", unowned);
+	printk(KERN_DEBUG "Total used : %d (%ld pages).\n", total_header_bytes,
+			DIV_ROUND_UP(total_header_bytes, PAGE_SIZE));
+	printk(KERN_DEBUG "Space needed now : %ld.\n",
+			get_header_storage_needed());
+	dump_block_chains();
+	abort_hibernate(TOI_HEADER_TOO_BIG, "Header reservation too small.");
+}
+
+/**
+ * toi_rw_init - prepare to read or write a stream in the image
+ * @writing: Whether reading or writing.
+ * @stream_number: Section of the image being processed.
+ *
+ * Prepare to read or write a section ('stream') in the image.
+ **/
+static int toi_rw_init(int writing, int stream_number)
+{
+	if (stream_number)
+		toi_extent_state_restore(stream_number);
+	else
+		toi_extent_state_goto_start();
+
+	if (writing) {
+		reset_idx = 0;
+		if (!current_stream)
+			page_idx = 0;
+	} else {
+		reset_idx = 1;
+	}
+
+	atomic_set(&toi_io_done, 0);
+	if (!toi_writer_buffer)
+		toi_writer_buffer = (char *) toi_get_zeroed_page(11,
+				TOI_ATOMIC_GFP);
+	toi_writer_buffer_posn = writing ? 0 : PAGE_SIZE;
+
+	current_stream = stream_number;
+
+	more_readahead = 1;
+
+	return toi_writer_buffer ? 0 : -ENOMEM;
+}
+
+/**
+ * toi_bio_queue_write - queue a page for writing
+ * @full_buffer: Pointer to a page to be queued
+ *
+ * Add a page to the queue to be submitted. If we're the queue flusher,
+ * we'll do this once we've dropped toi_bio_mutex, so other threads can
+ * continue to submit I/O while we're on the slow path doing the actual
+ * submission.
+ **/
+static void toi_bio_queue_write(char **full_buffer)
+{
+	struct page *page = virt_to_page(*full_buffer);
+	unsigned long flags;
+
+	*full_buffer = NULL;
+	page->private = 0;
+
+	spin_lock_irqsave(&bio_queue_lock, flags);
+	if (!bio_queue_head)
+		bio_queue_head = page;
+	else
+		bio_queue_tail->private = (unsigned long) page;
+
+	bio_queue_tail = page;
+	atomic_inc(&toi_bio_queue_size);
+
+	spin_unlock_irqrestore(&bio_queue_lock, flags);
+	wake_up(&toi_io_queue_flusher);
+}
+
+/**
+ * toi_rw_cleanup - Cleanup after i/o.
+ * @writing: Whether we were reading or writing.
+ *
+ * Flush all I/O and clean everything up after reading or writing a
+ * section of the image.
+ **/
+static int toi_rw_cleanup(int writing)
+{
+	int i, result = 0;
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "toi_rw_cleanup.");
+	if (writing) {
+		if (toi_writer_buffer_posn && !test_result_state(TOI_ABORTED))
+			toi_bio_queue_write(&toi_writer_buffer);
+
+		while (bio_queue_head && !result)
+			result = toi_bio_queue_flush_pages(0);
+
+		if (result)
+			return result;
+
+		if (current_stream == 2)
+			toi_extent_state_save(1);
+		else if (current_stream == 1)
+			toi_extent_state_save(3);
+	}
+
+	result = toi_finish_all_io();
+
+	while (readahead_list_head) {
+		void *next = (void *) readahead_list_head->private;
+		toi__free_page(12, readahead_list_head);
+		readahead_list_head = next;
+	}
+
+	readahead_list_tail = NULL;
+
+	if (!current_stream)
+		return result;
+
+	for (i = 0; i < NUM_REASONS; i++) {
+		if (!atomic_read(&reasons[i]))
+			continue;
+		printk(KERN_DEBUG "Waited for i/o due to %s %d times.\n",
+				reason_name[i], atomic_read(&reasons[i]));
+		atomic_set(&reasons[i], 0);
+	}
+
+	current_stream = 0;
+	return result;
+}
+
+/**
+ * toi_start_one_readahead - start one page of readahead
+ * @dedicated_thread: Is this a thread dedicated to doing readahead?
+ *
+ * Start one new page of readahead. If this is being called by a thread
+ * whose only job is to submit readahead, don't quit because we failed
+ * to allocate a page.
+ **/
+static int toi_start_one_readahead(int dedicated_thread)
+{
+	char *buffer = NULL;
+	int oom = 0, result;
+
+	result = throttle_if_needed(dedicated_thread ? THROTTLE_WAIT : 0);
+	if (result)
+		return result;
+
+	mutex_lock(&toi_bio_readahead_mutex);
+
+	while (!buffer) {
+		buffer = (char *) toi_get_zeroed_page(12,
+				TOI_ATOMIC_GFP);
+		if (!buffer) {
+			if (oom && !dedicated_thread) {
+				mutex_unlock(&toi_bio_readahead_mutex);
+				return -ENOMEM;
+			}
+
+			oom = 1;
+			set_free_mem_throttle();
+			do_bio_wait(5);
+		}
+	}
+
+	result = toi_bio_rw_page(READ, virt_to_page(buffer), 1, 0);
+	if (result == -ENOSPC)
+		toi__free_page(12, virt_to_page(buffer));
+	mutex_unlock(&toi_bio_readahead_mutex);
+	if (result) {
+		if (result == -ENOSPC)
+			toi_message(TOI_BIO, TOI_VERBOSE, 0,
+					"Last readahead page submitted.");
+		else
+			printk(KERN_DEBUG "toi_bio_rw_page returned %d.\n",
+					result);
+	}
+	return result;
+}
+
+/**
+ * toi_start_new_readahead - start new readahead
+ * @dedicated_thread: Are we dedicated to this task?
+ *
+ * Start readahead of image pages.
+ *
+ * We can be called as a thread dedicated to this task (may be helpful on
+ * systems with lots of CPUs), in which case we don't exit until there's no
+ * more readahead.
+ *
+ * If this is not called by a dedicated thread, we top up our queue until
+ * there's no more readahead to submit, we've submitted the number given
+ * in target_outstanding_io or the number in progress exceeds the target
+ * outstanding I/O value.
+ *
+ * No mutex needed because this is only ever called by the first cpu.
+ **/
+static int toi_start_new_readahead(int dedicated_thread)
+{
+	int last_result, num_submitted = 0;
+
+	/* Start a new readahead? */
+	if (!more_readahead)
+		return 0;
+
+	do {
+		last_result = toi_start_one_readahead(dedicated_thread);
+
+		if (last_result) {
+			if (last_result == -ENOMEM || last_result == -ENOSPC)
+				return 0;
+
+			printk(KERN_DEBUG
+				"Begin read chunk returned %d.\n",
+				last_result);
+		} else
+			num_submitted++;
+
+	} while (more_readahead && !last_result &&
+		 (dedicated_thread ||
+		  (num_submitted < target_outstanding_io &&
+		   atomic_read(&toi_io_in_progress) < target_outstanding_io)));
+
+	return last_result;
+}
+
+/**
+ * bio_io_flusher - start the dedicated I/O flushing routine
+ * @writing: Whether we're writing the image.
+ **/
+static int bio_io_flusher(int writing)
+{
+
+	if (writing)
+		return toi_bio_queue_flush_pages(1);
+	else
+		return toi_start_new_readahead(1);
+}
+
+/**
+ * toi_bio_get_next_page_read - read a disk page, perhaps with readahead
+ * @no_readahead: Whether readahead is disabled, so the read is submitted here.
+ *
+ * Read a page from disk, submitting readahead and cleaning up finished i/o
+ * while we wait for the page we're after.
+ **/
+static int toi_bio_get_next_page_read(int no_readahead)
+{
+	char *virt;
+	struct page *old_readahead_list_head;
+
+	/*
+	 * When reading the second page of the header, we have to
+	 * delay submitting the read until after we've gotten the
+	 * extents out of the first page.
+	 */
+	if (unlikely(no_readahead && toi_start_one_readahead(0))) {
+		printk(KERN_EMERG "No readahead and toi_start_one_readahead "
+				"returned non-zero.\n");
+		return -EIO;
+	}
+
+	if (unlikely(!readahead_list_head)) {
+		/*
+		 * If the last page finishes exactly on the page
+		 * boundary, we will be called one extra time and
+		 * have no data to return. In this case, we should
+		 * not BUG(), like we used to!
+		 */
+		if (!more_readahead) {
+			printk(KERN_EMERG "No more readahead.\n");
+			return -ENOSPC;
+		}
+		if (unlikely(toi_start_one_readahead(0))) {
+			printk(KERN_EMERG "No readahead and "
+			 "toi_start_one_readahead returned non-zero.\n");
+			return -EIO;
+		}
+	}
+
+	if (PageLocked(readahead_list_head)) {
+		waiting_on = readahead_list_head;
+		do_bio_wait(0);
+	}
+
+	virt = page_address(readahead_list_head);
+	memcpy(toi_writer_buffer, virt, PAGE_SIZE);
+
+	mutex_lock(&toi_bio_readahead_mutex);
+	old_readahead_list_head = readahead_list_head;
+	readahead_list_head = (struct page *) readahead_list_head->private;
+	mutex_unlock(&toi_bio_readahead_mutex);
+	toi__free_page(12, old_readahead_list_head);
+	return 0;
+}
+
+/**
+ * toi_bio_queue_flush_pages - flush the queue of pages queued for writing
+ * @dedicated_thread: Whether we're a dedicated thread
+ *
+ * Flush the queue of pages ready to be written to disk.
+ *
+ * If we're a dedicated thread, stay in here until told to leave,
+ * sleeping in wait_event.
+ *
+ * The first thread is normally the only one to come in here. Another
+ * thread can enter this routine too, though, via throttle_if_needed.
+ * Since that's the case, we must be careful to only have one thread
+ * doing this work at a time. Otherwise we have a race and could save
+ * pages out of order.
+ *
+ * If an error occurs, free all remaining pages without submitting them
+ * for I/O.
+ **/
+
+static int toi_bio_queue_flush_pages(int dedicated_thread)
+{
+	unsigned long flags;
+	int result = 0;
+	static DEFINE_MUTEX(busy);
+
+	if (!mutex_trylock(&busy))
+		return 0;
+
+top:
+	spin_lock_irqsave(&bio_queue_lock, flags);
+	while (bio_queue_head) {
+		struct page *page = bio_queue_head;
+		bio_queue_head = (struct page *) page->private;
+		if (bio_queue_tail == page)
+			bio_queue_tail = NULL;
+		atomic_dec(&toi_bio_queue_size);
+		spin_unlock_irqrestore(&bio_queue_lock, flags);
+
+		/* Don't generate more error messages if already had one */
+		if (!result)
+			result = toi_bio_rw_page(WRITE, page, 0, 11);
+		/*
+		 * If writing the page failed, don't drop out.
+		 * Flush the rest of the queue too.
+		 */
+		if (result)
+			toi__free_page(11, page);
+		spin_lock_irqsave(&bio_queue_lock, flags);
+	}
+	spin_unlock_irqrestore(&bio_queue_lock, flags);
+
+	if (dedicated_thread) {
+		wait_event(toi_io_queue_flusher, bio_queue_head ||
+				toi_bio_queue_flusher_should_finish);
+		if (likely(!toi_bio_queue_flusher_should_finish))
+			goto top;
+		toi_bio_queue_flusher_should_finish = 0;
+	}
+
+	mutex_unlock(&busy);
+	return result;
+}
+
+/**
+ * toi_bio_get_new_page - get a new page for I/O
+ * @full_buffer: Pointer to a page to allocate.
+ **/
+static int toi_bio_get_new_page(char **full_buffer)
+{
+	int result = throttle_if_needed(THROTTLE_WAIT);
+	if (result)
+		return result;
+
+	while (!*full_buffer) {
+		*full_buffer = (char *) toi_get_zeroed_page(11, TOI_ATOMIC_GFP);
+		if (!*full_buffer) {
+			set_free_mem_throttle();
+			do_bio_wait(3);
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * toi_rw_buffer - combine smaller buffers into PAGE_SIZE I/O
+ * @writing:		Bool - whether writing (or reading).
+ * @buffer:		The start of the buffer to write or fill.
+ * @buffer_size:	The size of the buffer to write or fill.
+ * @no_readahead:	Don't try to start readahead (when getting extents).
+ **/
+static int toi_rw_buffer(int writing, char *buffer, int buffer_size,
+		int no_readahead)
+{
+	int bytes_left = buffer_size, result = 0;
+
+	while (bytes_left) {
+		char *source_start = buffer + buffer_size - bytes_left;
+		char *dest_start = toi_writer_buffer + toi_writer_buffer_posn;
+		int capacity = PAGE_SIZE - toi_writer_buffer_posn;
+		char *to = writing ? dest_start : source_start;
+		char *from = writing ? source_start : dest_start;
+
+		if (bytes_left <= capacity) {
+			memcpy(to, from, bytes_left);
+			toi_writer_buffer_posn += bytes_left;
+			return 0;
+		}
+
+		/* Complete this page and start a new one */
+		memcpy(to, from, capacity);
+		bytes_left -= capacity;
+
+		if (!writing) {
+			/*
+			 * Perform actual I/O:
+			 * read readahead_list_head into toi_writer_buffer
+			 */
+			result = toi_bio_get_next_page_read(no_readahead);
+			if (result) {
+				printk(KERN_ERR "toi_bio_get_next_page_read "
+						"returned %d.\n", result);
+				return result;
+			}
+		} else {
+			toi_bio_queue_write(&toi_writer_buffer);
+			result = toi_bio_get_new_page(&toi_writer_buffer);
+			if (result) {
+				printk(KERN_ERR "toi_bio_get_new_page returned "
+						"%d.\n", result);
+				return result;
+			}
+		}
+
+		toi_writer_buffer_posn = 0;
+		toi_cond_pause(0, NULL);
+	}
+
+	return 0;
+}
+
+/**
+ * toi_bio_read_page - read a page of the image
+ * @pfn:		The pfn where the data belongs.
+ * @buffer_page:	The page containing the (possibly compressed) data.
+ * @buf_size:		The number of bytes on @buffer_page used (PAGE_SIZE).
+ *
+ * Read a (possibly compressed) page from the image, into buffer_page,
+ * returning its pfn and the buffer size.
+ **/
+static int toi_bio_read_page(unsigned long *pfn, int buf_type,
+		void *buffer_page, unsigned int *buf_size)
+{
+	int result = 0;
+	int this_idx;
+	char *buffer_virt = TOI_MAP(buf_type, buffer_page);
+
+	/*
+	 * Only call start_new_readahead if we don't have a dedicated thread
+	 * and we're the queue flusher.
+	 */
+	if (current == toi_queue_flusher && more_readahead &&
+			!test_action_state(TOI_NO_READAHEAD)) {
+		int result2 = toi_start_new_readahead(0);
+		if (result2) {
+			printk(KERN_DEBUG "Queue flusher and "
+			 "toi_start_new_readahead returned non-zero.\n");
+			result = -EIO;
+			goto out;
+		}
+	}
+
+	my_mutex_lock(0, &toi_bio_mutex);
+
+	/*
+	 * Structure in the image:
+	 *	[page index|destination pfn|page size|page data]
+	 * buf_size is PAGE_SIZE
+	 * We can validly find there's nothing to read in a multithreaded
+	 * situation.
+	 */
+	if (toi_rw_buffer(READ, (char *) &this_idx, sizeof(int), 0) ||
+	    toi_rw_buffer(READ, (char *) pfn, sizeof(unsigned long), 0) ||
+	    toi_rw_buffer(READ, (char *) buf_size, sizeof(int), 0) ||
+	    toi_rw_buffer(READ, buffer_virt, *buf_size, 0)) {
+		result = -ENODATA;
+		goto out_unlock;
+	}
+
+	if (reset_idx) {
+		page_idx = this_idx;
+		reset_idx = 0;
+	} else {
+		page_idx++;
+		if (!this_idx)
+			result = -ENODATA;
+		else if (page_idx != this_idx)
+			printk(KERN_ERR "Got page index %d, expected %d.\n",
+					this_idx, page_idx);
+	}
+
+out_unlock:
+	my_mutex_unlock(0, &toi_bio_mutex);
+out:
+	TOI_UNMAP(buf_type, buffer_page);
+	return result;
+}
+
+/**
+ * toi_bio_write_page - write a page of the image
+ * @pfn:		The pfn where the data belongs.
+ * @buffer_page:	The page containing the (possibly compressed) data.
+ * @buf_size:	The number of bytes on @buffer_page used.
+ *
+ * Write a (possibly compressed) page to the image from the buffer, together
+ * with its index and buffer size.
+ **/
+static int toi_bio_write_page(unsigned long pfn, int buf_type,
+		void *buffer_page, unsigned int buf_size)
+{
+	char *buffer_virt;
+	int result = 0, result2 = 0;
+
+	if (unlikely(test_action_state(TOI_TEST_FILTER_SPEED)))
+		return 0;
+
+	my_mutex_lock(1, &toi_bio_mutex);
+
+	if (test_result_state(TOI_ABORTED)) {
+		my_mutex_unlock(1, &toi_bio_mutex);
+		return 0;
+	}
+
+	buffer_virt = TOI_MAP(buf_type, buffer_page);
+	page_idx++;
+
+	/*
+	 * Structure in the image:
+	 *	[page index|destination pfn|page size|page data]
+	 * buf_size is PAGE_SIZE
+	 */
+	if (toi_rw_buffer(WRITE, (char *) &page_idx, sizeof(int), 0) ||
+	    toi_rw_buffer(WRITE, (char *) &pfn, sizeof(unsigned long), 0) ||
+	    toi_rw_buffer(WRITE, (char *) &buf_size, sizeof(int), 0) ||
+	    toi_rw_buffer(WRITE, buffer_virt, buf_size, 0)) {
+		printk(KERN_DEBUG "toi_rw_buffer returned non-zero to "
+				"toi_bio_write_page.\n");
+		result = -EIO;
+	}
+
+	TOI_UNMAP(buf_type, buffer_page);
+	my_mutex_unlock(1, &toi_bio_mutex);
+
+	if (current == toi_queue_flusher)
+		result2 = toi_bio_queue_flush_pages(0);
+
+	return result ? result : result2;
+}
+
+/**
+ * _toi_rw_header_chunk - read or write a portion of the image header
+ * @writing:		Whether reading or writing.
+ * @owner:		The module for which we're writing.
+ *			Used for confirming that modules
+ *			don't use more header space than they asked for.
+ * @buffer:		Address of the data to write.
+ * @buffer_size:	Size of the data buffer.
+ * @no_readahead:	Don't try to start readahead (when getting extents).
+ *
+ * Perform PAGE_SIZE I/O. Start readahead if needed.
+ **/
+static int _toi_rw_header_chunk(int writing, struct toi_module_ops *owner,
+		char *buffer, int buffer_size, int no_readahead)
+{
+	int result = 0;
+
+	if (owner) {
+		owner->header_used += buffer_size;
+		toi_message(TOI_HEADER, TOI_LOW, 1,
+			"Header: %s : %d bytes (%d/%d) from offset %d.",
+			owner->name,
+			buffer_size, owner->header_used,
+			owner->header_requested,
+			toi_writer_buffer_posn);
+		if (owner->header_used > owner->header_requested && writing) {
+			printk(KERN_EMERG "TuxOnIce module %s is using more "
+				"header space (%u) than it requested (%u).\n",
+				owner->name,
+				owner->header_used,
+				owner->header_requested);
+			return buffer_size;
+		}
+	} else {
+		unowned += buffer_size;
+		toi_message(TOI_HEADER, TOI_LOW, 1,
+			"Header: (No owner): %d bytes (%d total so far) from "
+			"offset %d.", buffer_size, unowned,
+			toi_writer_buffer_posn);
+	}
+
+	if (!writing && !no_readahead && more_readahead) {
+		result = toi_start_new_readahead(0);
+		toi_message(TOI_BIO, TOI_VERBOSE, 0, "Start new readahead "
+				"returned %d.", result);
+	}
+
+	if (!result) {
+		result = toi_rw_buffer(writing, buffer, buffer_size,
+				no_readahead);
+		toi_message(TOI_BIO, TOI_VERBOSE, 0, "rw_buffer returned "
+				"%d.", result);
+	}
+
+	total_header_bytes += buffer_size;
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "_toi_rw_header_chunk returning "
+			"%d.", result);
+	return result;
+}
+
+static int toi_rw_header_chunk(int writing, struct toi_module_ops *owner,
+		char *buffer, int size)
+{
+	return _toi_rw_header_chunk(writing, owner, buffer, size, 1);
+}
+
+static int toi_rw_header_chunk_noreadahead(int writing,
+		struct toi_module_ops *owner, char *buffer, int size)
+{
+	return _toi_rw_header_chunk(writing, owner, buffer, size, 1);
+}
+
+/**
+ * toi_bio_storage_needed - get the amount of storage needed for my fns
+ **/
+static int toi_bio_storage_needed(void)
+{
+	return sizeof(int) + PAGE_SIZE + toi_bio_devinfo_storage_needed();
+}
+
+/**
+ * toi_bio_save_config_info - save block I/O config to image header
+ * @buf:	PAGE_SIZE'd buffer into which data should be saved.
+ **/
+static int toi_bio_save_config_info(char *buf)
+{
+	int *ints = (int *) buf;
+	ints[0] = target_outstanding_io;
+	return sizeof(int);
+}
+
+/**
+ * toi_bio_load_config_info - restore block I/O config
+ * @buf:	Data to be reloaded.
+ * @size:	Size of the buffer saved.
+ **/
+static void toi_bio_load_config_info(char *buf, int size)
+{
+	int *ints = (int *) buf;
+	target_outstanding_io = ints[0];
+}
+
+void close_resume_dev_t(int force)
+{
+	if (!resume_block_device)
+		return;
+
+	if (force)
+		atomic_set(&resume_bdev_open_count, 0);
+	else
+		atomic_dec(&resume_bdev_open_count);
+
+	if (!atomic_read(&resume_bdev_open_count)) {
+		toi_close_bdev(resume_block_device);
+		resume_block_device = NULL;
+	}
+}
+
+int open_resume_dev_t(int force, int quiet)
+{
+	if (force) {
+		close_resume_dev_t(1);
+		atomic_set(&resume_bdev_open_count, 1);
+	} else
+		atomic_inc(&resume_bdev_open_count);
+
+	if (resume_block_device)
+		return 0;
+
+	resume_block_device = toi_open_bdev(NULL, resume_dev_t, 0);
+	if (IS_ERR(resume_block_device)) {
+		if (!quiet)
+			toi_early_boot_message(1, TOI_CONTINUE_REQ,
+				"Failed to open device %x, where"
+				" the header should be found.",
+				resume_dev_t);
+		resume_block_device = NULL;
+		atomic_set(&resume_bdev_open_count, 0);
+		return 1;
+	}
+
+	return 0;
+}
+
+/**
+ * toi_bio_initialise - initialise bio code at start of some action
+ * @starting_cycle:	Whether starting a hibernation cycle, or just reading or
+ *			writing a sysfs value.
+ **/
+static int toi_bio_initialise(int starting_cycle)
+{
+	int result;
+
+	if (!starting_cycle || !resume_dev_t)
+		return 0;
+
+	max_outstanding_writes = 0;
+	max_outstanding_reads = 0;
+	current_stream = 0;
+	toi_queue_flusher = current;
+#ifdef MEASURE_MUTEX_CONTENTION
+	{
+		int i, j, k;
+
+		for (i = 0; i < 2; i++)
+			for (j = 0; j < 2; j++)
+				for_each_online_cpu(k)
+					mutex_times[i][j][k] = 0;
+	}
+#endif
+	result = open_resume_dev_t(0, 1);
+
+	if (result)
+		return result;
+
+	return get_signature_page();
+}
+
+static unsigned long raw_to_real(unsigned long raw)
+{
+	unsigned long extra;
+
+	extra = (raw * (sizeof(unsigned long) + sizeof(int)) +
+		(PAGE_SIZE + sizeof(unsigned long) + sizeof(int) + 1)) /
+		(PAGE_SIZE + sizeof(unsigned long) + sizeof(int));
+
+	return raw > extra ? raw - extra : 0;
+}
+
+static unsigned long toi_bio_storage_available(void)
+{
+	unsigned long sum = 0;
+	struct toi_module_ops *this_module;
+
+	list_for_each_entry(this_module, &toi_modules, module_list) {
+		if (!this_module->enabled ||
+		    this_module->type != BIO_ALLOCATOR_MODULE)
+			continue;
+		toi_message(TOI_BIO, TOI_VERBOSE, 0, "Seeking storage "
+				"available from %s.", this_module->name);
+		sum += this_module->bio_allocator_ops->storage_available();
+	}
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "Total storage available is %lu "
+			"pages (%d header pages).", sum, header_pages_reserved);
+
+	return sum > header_pages_reserved ?
+		raw_to_real(sum - header_pages_reserved) : 0;
+
+}
+
+static unsigned long toi_bio_storage_allocated(void)
+{
+	return raw_pages_allocd > header_pages_reserved ?
+		raw_to_real(raw_pages_allocd - header_pages_reserved) : 0;
+}
+
+/*
+ * If we have read part of the image, we might have filled memory with
+ * data that should be zeroed out.
+ */
+static void toi_bio_noresume_reset(void)
+{
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "toi_bio_noresume_reset.");
+	toi_rw_cleanup(READ);
+	free_all_bdev_info();
+}
+
+/**
+ * toi_bio_cleanup - cleanup after some action
+ * @finishing_cycle:	Whether completing a cycle.
+ **/
+static void toi_bio_cleanup(int finishing_cycle)
+{
+	if (!finishing_cycle)
+		return;
+
+	if (toi_writer_buffer) {
+		toi_free_page(11, (unsigned long) toi_writer_buffer);
+		toi_writer_buffer = NULL;
+	}
+
+	forget_signature_page();
+
+	if (header_block_device && toi_sig_data &&
+			toi_sig_data->header_dev_t != resume_dev_t)
+		toi_close_bdev(header_block_device);
+
+	header_block_device = NULL;
+
+	close_resume_dev_t(0);
+}
+
+static int toi_bio_write_header_init(void)
+{
+	int result;
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "toi_bio_write_header_init");
+	toi_rw_init(WRITE, 0);
+	toi_writer_buffer_posn = 0;
+
+	/* Info needed to bootstrap goes at the start of the header.
+	 * First we save the positions and devinfo, including the number
+	 * of header pages. Then we save the structs containing data needed
+	 * for reading the header pages back.
+	 * Note that even if header pages take more than one page, when we
+	 * read back the info, we will have restored the location of the
+	 * next header page by the time we go to use it.
+	 */
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "serialise extent chains.");
+	result = toi_serialise_extent_chains();
+
+	if (result)
+		return result;
+
+	/*
+	 * Signature page hasn't been modified at this point. Write it in
+	 * the header so we can restore it later.
+	 */
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "serialise signature page.");
+	return toi_rw_header_chunk_noreadahead(WRITE, &toi_blockwriter_ops,
+			(char *) toi_cur_sig_page,
+			PAGE_SIZE);
+}
+
+static int toi_bio_write_header_cleanup(void)
+{
+	int result = 0;
+
+	if (toi_writer_buffer_posn)
+		toi_bio_queue_write(&toi_writer_buffer);
+
+	result = toi_finish_all_io();
+
+	unowned = 0;
+	total_header_bytes = 0;
+
+	/* Set signature to say we have an image */
+	if (!result)
+		result = toi_bio_mark_have_image();
+
+	return result;
+}
+
+/*
+ * toi_bio_read_header_init()
+ *
+ * Description:
+ * 1. Attempt to read the device specified with resume=.
+ * 2. Check the contents of the swap header for our signature.
+ * 3. Warn, ignore, reset and/or continue as appropriate.
+ * 4. If continuing, read the toi_swap configuration section
+ *    of the header and set up block device info so we can read
+ *    the rest of the header & image.
+ *
+ * Returns:
+ * May not return if the user chooses to reboot at a warning.
+ * -EINVAL if cannot resume at this time. Booting should continue
+ * normally.
+ */
+
+static int toi_bio_read_header_init(void)
+{
+	int result = 0;
+	char buf[32];
+
+	toi_writer_buffer_posn = 0;
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "toi_bio_read_header_init");
+
+	if (!toi_sig_data) {
+		printk(KERN_INFO "toi_bio_read_header_init called when we "
+				"haven't verified there is an image!\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * If the header is not on the resume_dev_t, get the resume device
+	 * first.
+	 */
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "Header dev_t is %x.",
+			toi_sig_data->header_dev_t);
+	if (toi_sig_data->have_uuid) {
+		struct fs_info seek;
+		dev_t device;
+
+		memcpy(seek.uuid, toi_sig_data->header_uuid, 16);
+		seek.dev_t = toi_sig_data->header_dev_t;
+		seek.last_mount_size = 0;
+		device = blk_lookup_fs_info(&seek);
+		if (device) {
+			printk(KERN_INFO "Using dev_t %s, returned by blk_lookup_fs_info.\n",
+					format_dev_t(buf, device));
+			toi_sig_data->header_dev_t = device;
+		}
+	}
+	if (toi_sig_data->header_dev_t != resume_dev_t) {
+		header_block_device = toi_open_bdev(NULL,
+				toi_sig_data->header_dev_t, 1);
+
+		if (IS_ERR(header_block_device))
+			return PTR_ERR(header_block_device);
+	} else
+		header_block_device = resume_block_device;
+
+	if (!toi_writer_buffer)
+		toi_writer_buffer = (char *) toi_get_zeroed_page(11,
+				TOI_ATOMIC_GFP);
+	more_readahead = 1;
+
+	/*
+	 * Read toi_swap configuration.
+	 * Headerblock size taken into account already.
+	 */
+	result = toi_bio_ops.bdev_page_io(READ, header_block_device,
+			toi_sig_data->first_header_block,
+			virt_to_page((unsigned long) toi_writer_buffer));
+	if (result)
+		return result;
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "load extent chains.");
+	result = toi_load_extent_chains();
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "load original signature page.");
+	toi_orig_sig_page = (char *) toi_get_zeroed_page(38, TOI_ATOMIC_GFP);
+	if (!toi_orig_sig_page) {
+		printk(KERN_ERR "Failed to allocate memory for the original"
+			" image signature.\n");
+		return -ENOMEM;
+	}
+
+	return toi_rw_header_chunk_noreadahead(READ, &toi_blockwriter_ops,
+			(char *) toi_orig_sig_page,
+			PAGE_SIZE);
+}
+
+static int toi_bio_read_header_cleanup(void)
+{
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "toi_bio_read_header_cleanup.");
+	return toi_rw_cleanup(READ);
+}
+
+/* Works only for digits and letters, but small and fast */
+#define TOLOWER(x) ((x) | 0x20)
+
+/*
+ * UUID must be 32 chars long. It may have dashes, but nothing
+ * else.
+ */
+char *uuid_from_commandline(char *commandline)
+{
+	int low = 0;
+	char *result = NULL, *output, *ptr;
+
+	if (strncmp(commandline, "UUID=", 5))
+		return NULL;
+
+	result = kzalloc(17, GFP_KERNEL);
+	if (!result) {
+		printk(KERN_ERR "Failed to kzalloc UUID text memory.\n");
+		return NULL;
+	}
+
+	ptr = commandline + 5;
+	output = result;
+
+	while (*ptr && (output - result) < 16) {
+		if (isxdigit(*ptr)) {
+			int value = isdigit(*ptr) ? *ptr - '0' :
+				TOLOWER(*ptr) - 'a' + 10;
+			if (low) {
+				*output += value;
+				output++;
+			} else {
+				*output = value << 4;
+			}
+			low = !low;
+		} else if (*ptr != '-')
+			break;
+		ptr++;
+	}
+
+	if ((output - result) < 16 || *ptr) {
+		printk(KERN_DEBUG "Found resume=UUID=, but the value looks "
+				"invalid.\n");
+		kfree(result);
+		result = NULL;
+	}
+
+	return result;
+}
+
+#define retry_if_fails(command) \
+do { \
+	command; \
+	if (!resume_dev_t && !waited_for_device_probe) { \
+		wait_for_device_probe(); \
+		command; \
+		waited_for_device_probe = 1; \
+	} \
+} while (0)
+
+/**
+ * try_to_open_resume_device - Try to parse and open resume=
+ *
+ * Any "swap:" has been stripped away and we just have the path to deal with.
+ * We attempt to do name_to_dev_t, open and stat the file. Having opened the
+ * file, get the struct block_device * to match.
+ */
+static int try_to_open_resume_device(char *commandline, int quiet)
+{
+	struct kstat stat;
+	int error = 0;
+	char *uuid = uuid_from_commandline(commandline);
+	int waited_for_device_probe = 0;
+
+	resume_dev_t = MKDEV(0, 0);
+
+	if (!strlen(commandline))
+		retry_if_fails(toi_bio_scan_for_image(quiet));
+
+	if (uuid) {
+		struct fs_info seek;
+		memcpy(&seek.uuid, uuid, 16);
+		seek.dev_t = resume_dev_t;
+		seek.last_mount_size = 0;
+		retry_if_fails(resume_dev_t = blk_lookup_fs_info(&seek));
+		kfree(uuid);
+	}
+
+	if (!resume_dev_t)
+		retry_if_fails(resume_dev_t = name_to_dev_t(commandline));
+
+	if (!resume_dev_t) {
+		struct file *file = filp_open(commandline,
+				O_RDONLY|O_LARGEFILE, 0);
+
+		if (!IS_ERR(file) && file) {
+			error = vfs_getattr(&file->f_path, &stat);
+			filp_close(file, NULL);
+		} else
+			error = vfs_stat(commandline, &stat);
+		if (!error)
+			resume_dev_t = stat.rdev;
+	}
+
+	if (!resume_dev_t) {
+		if (quiet)
+			return 1;
+
+		if (test_toi_state(TOI_TRYING_TO_RESUME))
+			toi_early_boot_message(1, toi_translate_err_default,
+			  "Failed to translate \"%s\" into a device id.\n",
+			  commandline);
+		else
+			printk("TuxOnIce: Can't translate \"%s\" into a device "
+					"id yet.\n", commandline);
+		return 1;
+	}
+
+	return open_resume_dev_t(1, quiet);
+}
+
+/*
+ * Parse Image Location
+ *
+ * Attempt to parse a resume= parameter.
+ * Swap Writer accepts:
+ * resume=[swap:|file:]DEVNAME[:FIRSTBLOCK][@BLOCKSIZE]
+ *
+ * Where:
+ * DEVNAME is convertible to a dev_t by name_to_dev_t
+ * FIRSTBLOCK is the location of the first block in the swap file
+ * (specifying for a swap partition is nonsensical but not prohibited).
+ * Data is validated by attempting to read a swap header from the
+ * location given. Failure will result in toi_swap refusing to
+ * save an image, and a reboot with correct parameters will be
+ * necessary.
+ */
+static int toi_bio_parse_sig_location(char *commandline,
+		int only_allocator, int quiet)
+{
+	char *thischar, *devstart, *colon = NULL;
+	int signature_found, result = -EINVAL, temp_result = 0;
+
+	if (strncmp(commandline, "swap:", 5) &&
+	    strncmp(commandline, "file:", 5)) {
+		/*
+		 * Failing swap:, we'll take a simple resume=/dev/hda2, or a
+		 * blank value (scan) but fall through to other allocators
+		 * if /dev/ or UUID= isn't matched.
+		 */
+		if (strncmp(commandline, "/dev/", 5) &&
+		    strncmp(commandline, "UUID=", 5) &&
+		    strlen(commandline))
+			return 1;
+	} else
+		commandline += 5;
+
+	devstart = commandline;
+	thischar = commandline;
+	while ((*thischar != ':') && (*thischar != '@') &&
+		((thischar - commandline) < 250) && (*thischar))
+		thischar++;
+
+	if (*thischar == ':') {
+		colon = thischar;
+		*colon = 0;
+		thischar++;
+	}
+
+	while ((thischar - commandline) < 250 && *thischar)
+		thischar++;
+
+	if (colon) {
+		unsigned long block;
+		temp_result = strict_strtoul(colon + 1, 0, &block);
+		if (!temp_result)
+			resume_firstblock = (int) block;
+	} else
+		resume_firstblock = 0;
+
+	clear_toi_state(TOI_CAN_HIBERNATE);
+	clear_toi_state(TOI_CAN_RESUME);
+
+	if (!temp_result)
+		temp_result = try_to_open_resume_device(devstart, quiet);
+
+	if (colon)
+		*colon = ':';
+
+	/* No error if we only scanned */
+	if (temp_result)
+		return strlen(commandline) ? -EINVAL : 1;
+
+	signature_found = toi_bio_image_exists(quiet);
+
+	if (signature_found != -1) {
+		result = 0;
+		/*
+		 * TODO: If only file storage, CAN_HIBERNATE should only be
+		 * set if file allocator's target is valid.
+		 */
+		set_toi_state(TOI_CAN_HIBERNATE);
+		set_toi_state(TOI_CAN_RESUME);
+	} else
+		if (!quiet)
+			printk(KERN_ERR "TuxOnIce: Block I/O: No "
+				"signature found at %s.\n", devstart);
+
+	return result;
+}
+
+static void toi_bio_release_storage(void)
+{
+	header_pages_reserved = 0;
+	raw_pages_allocd = 0;
+
+	free_all_bdev_info();
+}
+
+/*
+ * toi_bio_remove_image - restore the original signature and free storage
+ */
+static int toi_bio_remove_image(void)
+{
+	int result;
+
+	toi_message(TOI_BIO, TOI_VERBOSE, 0, "toi_bio_remove_image.");
+
+	result = toi_bio_restore_original_signature();
+
+	/*
+	 * We don't do a sanity check here: we want to restore the swap
+	 * whatever version of kernel made the hibernate image.
+	 *
+	 * We need to write swap, but swap may not be enabled so
+	 * we write the device directly
+	 *
+	 * If we don't have a current signature page, we didn't
+	 * read an image header, so don't change anything.
+	 */
+
+	toi_bio_release_storage();
+
+	return result;
+}
+
+struct toi_bio_ops toi_bio_ops = {
+	.bdev_page_io = toi_bdev_page_io,
+	.register_storage = toi_register_storage_chain,
+	.free_storage = toi_bio_release_storage,
+};
+EXPORT_SYMBOL_GPL(toi_bio_ops);
+
+static struct toi_sysfs_data sysfs_params[] = {
+	SYSFS_INT("target_outstanding_io", SYSFS_RW, &target_outstanding_io,
+			0, 16384, 0, NULL),
+};
+
+struct toi_module_ops toi_blockwriter_ops = {
+	.type				= WRITER_MODULE,
+	.name				= "block i/o",
+	.directory			= "block_io",
+	.module				= THIS_MODULE,
+	.memory_needed			= toi_bio_memory_needed,
+	.print_debug_info		= toi_bio_print_debug_stats,
+	.storage_needed			= toi_bio_storage_needed,
+	.save_config_info		= toi_bio_save_config_info,
+	.load_config_info		= toi_bio_load_config_info,
+	.initialise			= toi_bio_initialise,
+	.cleanup			= toi_bio_cleanup,
+	.post_atomic_restore		= toi_bio_chains_post_atomic,
+
+	.rw_init			= toi_rw_init,
+	.rw_cleanup			= toi_rw_cleanup,
+	.read_page			= toi_bio_read_page,
+	.write_page			= toi_bio_write_page,
+	.rw_header_chunk		= toi_rw_header_chunk,
+	.rw_header_chunk_noreadahead	= toi_rw_header_chunk_noreadahead,
+	.io_flusher			= bio_io_flusher,
+	.update_throughput_throttle	= update_throughput_throttle,
+	.finish_all_io			= toi_finish_all_io,
+
+	.noresume_reset			= toi_bio_noresume_reset,
+	.storage_available		= toi_bio_storage_available,
+	.storage_allocated		= toi_bio_storage_allocated,
+	.reserve_header_space		= toi_bio_reserve_header_space,
+	.allocate_storage		= toi_bio_allocate_storage,
+	.image_exists			= toi_bio_image_exists,
+	.mark_resume_attempted		= toi_bio_mark_resume_attempted,
+	.write_header_init		= toi_bio_write_header_init,
+	.write_header_cleanup		= toi_bio_write_header_cleanup,
+	.read_header_init		= toi_bio_read_header_init,
+	.read_header_cleanup		= toi_bio_read_header_cleanup,
+	.get_header_version		= toi_bio_get_header_version,
+	.remove_image			= toi_bio_remove_image,
+	.parse_sig_location		= toi_bio_parse_sig_location,
+
+	.sysfs_data			= sysfs_params,
+	.num_sysfs_entries		= sizeof(sysfs_params) /
+		sizeof(struct toi_sysfs_data),
+};
+
+/**
+ * toi_block_io_load - load time routine for block I/O module
+ *
+ * Register block i/o ops and sysfs entries.
+ **/
+static __init int toi_block_io_load(void)
+{
+	return toi_register_module(&toi_blockwriter_ops);
+}
+
+#ifdef MODULE
+static __exit void toi_block_io_unload(void)
+{
+	toi_unregister_module(&toi_blockwriter_ops);
+}
+
+module_init(toi_block_io_load);
+module_exit(toi_block_io_unload);
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Nigel Cunningham");
+MODULE_DESCRIPTION("TuxOnIce block io functions");
+#else
+late_initcall(toi_block_io_load);
+#endif
diff --git a/kernel/power/tuxonice_bio_internal.h b/kernel/power/tuxonice_bio_internal.h
new file mode 100644
index 0000000..58c2481
--- /dev/null
+++ b/kernel/power/tuxonice_bio_internal.h
@@ -0,0 +1,86 @@
+/*
+ * kernel/power/tuxonice_bio_internal.h
+ *
+ * Copyright (C) 2009-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * Distributed under GPLv2.
+ *
+ * This file contains declarations for functions exported from
+ * tuxonice_bio.c, which contains low level io functions.
+ */
+
+/* Extent chains */
+void toi_extent_state_goto_start(void);
+void toi_extent_state_save(int slot);
+int go_next_page(int writing, int section_barrier);
+void toi_extent_state_restore(int slot);
+void free_all_bdev_info(void);
+int devices_of_same_priority(struct toi_bdev_info *this);
+int toi_register_storage_chain(struct toi_bdev_info *new);
+int toi_serialise_extent_chains(void);
+int toi_load_extent_chains(void);
+int toi_bio_rw_page(int writing, struct page *page, int is_readahead,
+		int free_group);
+int toi_bio_restore_original_signature(void);
+int toi_bio_devinfo_storage_needed(void);
+unsigned long get_headerblock(void);
+dev_t get_header_dev_t(void);
+struct block_device *get_header_bdev(void);
+int toi_bio_allocate_storage(unsigned long request);
+
+/* Signature functions */
+#define HaveImage "HaveImage"
+#define NoImage "TuxOnIce"
+#define sig_size (sizeof(HaveImage))
+
+struct sig_data {
+	char sig[sig_size];
+	int have_image;
+	int resumed_before;
+
+	char have_uuid;
+	char header_uuid[17];
+	dev_t header_dev_t;
+	unsigned long first_header_block;
+
+	/* Repeat the signature to be sure we have a header version */
+	char sig2[sig_size];
+	int header_version;
+};
+
+void forget_signature_page(void);
+int toi_check_for_signature(void);
+int toi_bio_image_exists(int quiet);
+int get_signature_page(void);
+int toi_bio_mark_resume_attempted(int);
+extern char *toi_cur_sig_page;
+extern char *toi_orig_sig_page;
+int toi_bio_mark_have_image(void);
+extern struct sig_data *toi_sig_data;
+extern dev_t resume_dev_t;
+extern struct block_device *resume_block_device;
+extern struct block_device *header_block_device;
+extern unsigned long resume_firstblock;
+
+struct block_device *open_bdev(dev_t device, int display_errs);
+extern int current_stream;
+extern int more_readahead;
+int toi_do_io(int writing, struct block_device *bdev, long block0,
+	struct page *page, int is_readahead, int syncio, int free_group);
+int get_main_pool_phys_params(void);
+
+void toi_close_bdev(struct block_device *bdev);
+struct block_device *toi_open_bdev(char *uuid, dev_t default_device,
+		int display_errs);
+
+extern struct toi_module_ops toi_blockwriter_ops;
+void dump_block_chains(void);
+void debug_broken_header(void);
+extern unsigned long raw_pages_allocd, header_pages_reserved;
+int toi_bio_chains_debug_info(char *buffer, int size);
+void toi_bio_chains_post_atomic(struct toi_boot_kernel_data *bkd);
+int toi_bio_scan_for_image(int quiet);
+int toi_bio_get_header_version(void);
+
+void close_resume_dev_t(int force);
+int open_resume_dev_t(int force, int quiet);
diff --git a/kernel/power/tuxonice_bio_signature.c b/kernel/power/tuxonice_bio_signature.c
new file mode 100644
index 0000000..244f333
--- /dev/null
+++ b/kernel/power/tuxonice_bio_signature.c
@@ -0,0 +1,403 @@
+/*
+ * kernel/power/tuxonice_bio_signature.c
+ *
+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * Distributed under GPLv2.
+ *
+ */
+
+#include <linux/fs_uuid.h>
+
+#include "tuxonice.h"
+#include "tuxonice_sysfs.h"
+#include "tuxonice_modules.h"
+#include "tuxonice_prepare_image.h"
+#include "tuxonice_bio.h"
+#include "tuxonice_ui.h"
+#include "tuxonice_alloc.h"
+#include "tuxonice_io.h"
+#include "tuxonice_builtin.h"
+#include "tuxonice_bio_internal.h"
+
+struct sig_data *toi_sig_data;
+
+/* Struct of swap header pages */
+
+struct old_sig_data {
+	dev_t device;
+	unsigned long sector;
+	int resume_attempted;
+	int orig_sig_type;
+};
+
+union diskpage {
+	union swap_header swh;	/* swh.magic is the only member used */
+	struct sig_data sig_data;
+	struct old_sig_data old_sig_data;
+};
+
+union p_diskpage {
+	union diskpage *pointer;
+	char *ptr;
+	unsigned long address;
+};
+
+char *toi_cur_sig_page;
+char *toi_orig_sig_page;
+int have_image;
+int have_old_image;
+
+int get_signature_page(void)
+{
+	if (!toi_cur_sig_page) {
+		toi_message(TOI_IO, TOI_VERBOSE, 0,
+				"Allocating current signature page.");
+		toi_cur_sig_page = (char *) toi_get_zeroed_page(38,
+			TOI_ATOMIC_GFP);
+		if (!toi_cur_sig_page) {
+			printk(KERN_ERR "Failed to allocate memory for the "
+				"current image signature.\n");
+			return -ENOMEM;
+		}
+
+		toi_sig_data = (struct sig_data *) toi_cur_sig_page;
+	}
+
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "Reading signature from dev %x,"
+			" sector %lu.",
+			resume_block_device->bd_dev, resume_firstblock);
+
+	return toi_bio_ops.bdev_page_io(READ, resume_block_device,
+		resume_firstblock, virt_to_page(toi_cur_sig_page));
+}
+
+void forget_signature_page(void)
+{
+	if (toi_cur_sig_page) {
+		toi_sig_data = NULL;
+		toi_message(TOI_IO, TOI_VERBOSE, 0, "Freeing toi_cur_sig_page"
+				" (%p).", toi_cur_sig_page);
+		toi_free_page(38, (unsigned long) toi_cur_sig_page);
+		toi_cur_sig_page = NULL;
+	}
+
+	if (toi_orig_sig_page) {
+		toi_message(TOI_IO, TOI_VERBOSE, 0, "Freeing toi_orig_sig_page"
+				" (%p).", toi_orig_sig_page);
+		toi_free_page(38, (unsigned long) toi_orig_sig_page);
+		toi_orig_sig_page = NULL;
+	}
+}
+
+/*
+ * We need to ensure we use the signature page that's currently on disk,
+ * so as to not remove the image header. Post-atomic-restore, the orig sig
+ * page will be empty, so we can use that as our method of knowing that we
+ * need to load the on-disk signature and not use the non-image sig in
+ * memory. (We're going to powerdown after writing the change, so it's safe.)
+ */
+int toi_bio_mark_resume_attempted(int flag)
+{
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "Make resume attempted = %d.",
+			flag);
+	if (!toi_orig_sig_page) {
+		forget_signature_page();
+		get_signature_page();
+	}
+	toi_sig_data->resumed_before = flag;
+	return toi_bio_ops.bdev_page_io(WRITE, resume_block_device,
+		resume_firstblock, virt_to_page(toi_cur_sig_page));
+}
+
+int toi_bio_mark_have_image(void)
+{
+	int result = 0;
+	char buf[32];
+	struct fs_info *fs_info;
+
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "Recording that an image exists.");
+	memcpy(toi_sig_data->sig, tuxonice_signature,
+			sizeof(tuxonice_signature));
+	toi_sig_data->have_image = 1;
+	toi_sig_data->resumed_before = 0;
+	toi_sig_data->header_dev_t = get_header_dev_t();
+	toi_sig_data->have_uuid = 0;
+
+	fs_info = fs_info_from_block_dev(get_header_bdev());
+	if (fs_info && !IS_ERR(fs_info)) {
+		memcpy(toi_sig_data->header_uuid, &fs_info->uuid, 16);
+		free_fs_info(fs_info);
+	} else
+		result = fs_info ? (int) PTR_ERR(fs_info) : -ENOENT;
+
+	if (!result) {
+		toi_message(TOI_IO, TOI_VERBOSE, 0, "Got uuid for dev_t %s.",
+				format_dev_t(buf, get_header_dev_t()));
+		toi_sig_data->have_uuid = 1;
+	} else
+		toi_message(TOI_IO, TOI_VERBOSE, 0, "Could not get uuid for "
+				"dev_t %s.",
+				format_dev_t(buf, get_header_dev_t()));
+
+	toi_sig_data->first_header_block = get_headerblock();
+	have_image = 1;
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "header dev_t is %x. First block "
+			"is %lu.", toi_sig_data->header_dev_t,
+			toi_sig_data->first_header_block);
+
+	memcpy(toi_sig_data->sig2, tuxonice_signature,
+			sizeof(tuxonice_signature));
+	toi_sig_data->header_version = TOI_HEADER_VERSION;
+
+	return toi_bio_ops.bdev_page_io(WRITE, resume_block_device,
+		resume_firstblock, virt_to_page(toi_cur_sig_page));
+}
+
+int remove_old_signature(void)
+{
+	union p_diskpage swap_header_page = (union p_diskpage) toi_cur_sig_page;
+	char *orig_sig;
+	char *header_start = (char *) toi_get_zeroed_page(38, TOI_ATOMIC_GFP);
+	int result;
+	struct block_device *header_bdev;
+	struct old_sig_data *old_sig_data =
+		&swap_header_page.pointer->old_sig_data;
+
+	header_bdev = toi_open_bdev(NULL, old_sig_data->device, 1);
+	result = toi_bio_ops.bdev_page_io(READ, header_bdev,
+			old_sig_data->sector, virt_to_page(header_start));
+
+	if (result)
+		goto out;
+
+	/*
+	 * TODO: Get the original contents of the first bytes of the swap
+	 * header page.
+	 */
+	if (!old_sig_data->orig_sig_type)
+		orig_sig = "SWAP-SPACE";
+	else
+		orig_sig = "SWAPSPACE2";
+
+	memcpy(swap_header_page.pointer->swh.magic.magic, orig_sig, 10);
+	memcpy(swap_header_page.ptr, header_start, 10);
+
+	result = toi_bio_ops.bdev_page_io(WRITE, resume_block_device,
+		resume_firstblock, virt_to_page(swap_header_page.ptr));
+
+out:
+	toi_close_bdev(header_bdev);
+	have_old_image = 0;
+	toi_free_page(38, (unsigned long) header_start);
+	return result;
+}
+
+/*
+ * toi_bio_restore_original_signature - restore the original signature
+ *
+ * At boot time (aborting pre atomic-restore), toi_orig_sig_page gets used.
+ * It will have the original signature page contents, stored in the image
+ * header. Post atomic-restore, we use toi_cur_sig_page, which will contain
+ * the contents that were loaded when we started the cycle.
+ */
+int toi_bio_restore_original_signature(void)
+{
+	char *use = toi_orig_sig_page ? toi_orig_sig_page : toi_cur_sig_page;
+
+	if (have_old_image)
+		return remove_old_signature();
+
+	if (!use) {
+		printk("toi_bio_restore_original_signature: No signature "
+				"page loaded.\n");
+		return 0;
+	}
+
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "Recording that no image exists.");
+	have_image = 0;
+	toi_sig_data->have_image = 0;
+	return toi_bio_ops.bdev_page_io(WRITE, resume_block_device,
+		resume_firstblock, virt_to_page(use));
+}
+
+/*
+ * toi_check_for_signature - See whether we have an image.
+ *
+ * Returns 0 (none), 1 (image), 2 (swsusp), 3 (old TuxOnIce) or -1 (unknown).
+ */
+int toi_check_for_signature(void)
+{
+	union p_diskpage swap_header_page;
+	int type;
+	const char *normal_sigs[] = {"SWAP-SPACE", "SWAPSPACE2" };
+	const char *swsusp_sigs[] = {"S1SUSP", "S2SUSP", "S1SUSPEND" };
+	char *swap_header;
+
+	if (!toi_cur_sig_page) {
+		int result = get_signature_page();
+
+		if (result)
+			return result;
+	}
+
+	/*
+	 * Start by looking for the binary header.
+	 */
+	if (!memcmp(tuxonice_signature, toi_cur_sig_page,
+				sizeof(tuxonice_signature))) {
+		have_image = toi_sig_data->have_image;
+		toi_message(TOI_IO, TOI_VERBOSE, 0, "Have binary signature. "
+				"Have image is %d.", have_image);
+		if (have_image)
+			toi_message(TOI_IO, TOI_VERBOSE, 0, "header dev_t is "
+					"%x. First block is %lu.",
+					toi_sig_data->header_dev_t,
+					toi_sig_data->first_header_block);
+		return toi_sig_data->have_image;
+	}
+
+	/*
+	 * Failing that, try old file allocator headers.
+	 */
+
+	if (!memcmp(HaveImage, toi_cur_sig_page, strlen(HaveImage))) {
+		have_image = 1;
+		return 1;
+	}
+
+	have_image = 0;
+
+	if (!memcmp(NoImage, toi_cur_sig_page, strlen(NoImage)))
+		return 0;
+
+	/*
+	 * Nope? How about swap?
+	 */
+	swap_header_page = (union p_diskpage) toi_cur_sig_page;
+	swap_header = swap_header_page.pointer->swh.magic.magic;
+
+	/* Normal swapspace? */
+	for (type = 0; type < 2; type++)
+		if (!memcmp(normal_sigs[type], swap_header,
+					strlen(normal_sigs[type])))
+			return 0;
+
+	/* Swsusp or uswsusp? */
+	for (type = 0; type < 3; type++)
+		if (!memcmp(swsusp_sigs[type], swap_header,
+					strlen(swsusp_sigs[type])))
+			return 2;
+
+	/* Old TuxOnIce version? */
+	if (!memcmp(tuxonice_signature, swap_header,
+				sizeof(tuxonice_signature) - 1)) {
+		toi_message(TOI_IO, TOI_VERBOSE, 0, "Found old TuxOnIce "
+				"signature.");
+		have_old_image = 1;
+		return 3;
+	}
+
+	return -1;
+}
+
+/*
+ * toi_bio_image_exists - See whether the resume device holds an image.
+ *
+ * Returns -1 if unknown, else 0 (no), 1 (yes), 2 (swsusp) or 3 (old TuxOnIce).
+ */
+int toi_bio_image_exists(int quiet)
+{
+	int result;
+	char *msg = NULL;
+
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "toi_bio_image_exists.");
+
+	if (!resume_dev_t) {
+		if (!quiet)
+			printk(KERN_INFO "Not even trying to read header "
+				"because resume_dev_t is not set.\n");
+		return -1;
+	}
+
+	if (open_resume_dev_t(0, quiet))
+		return -1;
+
+	result = toi_check_for_signature();
+
+	clear_toi_state(TOI_RESUMED_BEFORE);
+	if (toi_sig_data->resumed_before)
+		set_toi_state(TOI_RESUMED_BEFORE);
+
+	if (quiet || result == -ENOMEM)
+		return result;
+
+	if (result == -1)
+		msg = "TuxOnIce: Unable to find a signature."
+				" Could you have moved a swap file?\n";
+	else if (!result)
+		msg = "TuxOnIce: No image found.\n";
+	else if (result == 1)
+		msg = "TuxOnIce: Image found.\n";
+	else if (result == 2)
+		msg = "TuxOnIce: uswsusp or swsusp image found.\n";
+	else if (result == 3)
+		msg = "TuxOnIce: Old implementation's signature found.\n";
+
+	printk(KERN_INFO "%s", msg);
+
+	return result;
+}
+
+int toi_bio_scan_for_image(int quiet)
+{
+	struct block_device *bdev;
+	char default_name[255] = "";
+
+	if (!quiet)
+		printk(KERN_DEBUG "Scanning swap devices for TuxOnIce "
+				"signature...\n");
+	for (bdev = next_bdev_of_type(NULL, "swap"); bdev;
+				bdev = next_bdev_of_type(bdev, "swap")) {
+		int result;
+		char name[255] = "";
+		sprintf(name, "%u:%u", MAJOR(bdev->bd_dev),
+				MINOR(bdev->bd_dev));
+		if (!quiet)
+			printk(KERN_DEBUG "- Trying %s.\n", name);
+		resume_block_device = bdev;
+		resume_dev_t = bdev->bd_dev;
+
+		result = toi_check_for_signature();
+
+		resume_block_device = NULL;
+		resume_dev_t = MKDEV(0, 0);
+
+		if (!default_name[0])
+			strcpy(default_name, name);
+
+		if (result == 1) {
+			/* Got one! */
+			strcpy(resume_file, name);
+			next_bdev_of_type(bdev, NULL);
+			if (!quiet)
+				printk(KERN_DEBUG " ==> Image found on %s.\n",
+						resume_file);
+			return 1;
+		}
+		forget_signature_page();
+	}
+
+	if (!quiet)
+		printk(KERN_DEBUG "TuxOnIce scan: No image found.\n");
+	strcpy(resume_file, default_name);
+	return 0;
+}
+
+int toi_bio_get_header_version(void)
+{
+	return (memcmp(toi_sig_data->sig2, tuxonice_signature,
+				sizeof(tuxonice_signature))) ?
+		0 : toi_sig_data->header_version;
+
+}
diff --git a/kernel/power/tuxonice_builtin.c b/kernel/power/tuxonice_builtin.c
new file mode 100644
index 0000000..62b5d14
--- /dev/null
+++ b/kernel/power/tuxonice_builtin.c
@@ -0,0 +1,445 @@
+/*
+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ */
+#include <linux/resume-trace.h>
+#include <linux/kernel.h>
+#include <linux/swap.h>
+#include <linux/syscalls.h>
+#include <linux/bio.h>
+#include <linux/root_dev.h>
+#include <linux/freezer.h>
+#include <linux/reboot.h>
+#include <linux/writeback.h>
+#include <linux/tty.h>
+#include <linux/crypto.h>
+#include <linux/cpu.h>
+#include <linux/ctype.h>
+#include "tuxonice_io.h"
+#include "tuxonice.h"
+#include "tuxonice_extent.h"
+#include "tuxonice_netlink.h"
+#include "tuxonice_prepare_image.h"
+#include "tuxonice_ui.h"
+#include "tuxonice_sysfs.h"
+#include "tuxonice_pagedir.h"
+#include "tuxonice_modules.h"
+#include "tuxonice_builtin.h"
+#include "tuxonice_power_off.h"
+#include "tuxonice_alloc.h"
+
+unsigned long toi_bootflags_mask;
+EXPORT_SYMBOL_GPL(toi_bootflags_mask);
+
+/*
+ * Highmem related functions (x86 only).
+ */
+
+#ifdef CONFIG_HIGHMEM
+
+/**
+ * copyback_high: Restore highmem pages.
+ *
+ * Highmem data and pbe lists are/can be stored in highmem.
+ * The format is slightly different to the lowmem pbe lists
+ * used for the assembly code: the last pbe in each page is
+ * a struct page * instead of struct pbe *, pointing to the
+ * next page where pbes are stored (or NULL if it happens to be
+ * the end of the list). Since we don't want to generate
+ * unnecessary deltas against swsusp code, we use a cast
+ * instead of a union.
+ **/
+
+static void copyback_high(void)
+{
+	struct page *pbe_page = (struct page *) restore_highmem_pblist;
+	struct pbe *this_pbe, *first_pbe;
+	unsigned long *origpage, *copypage;
+	int pbe_index = 1;
+
+	if (!pbe_page)
+		return;
+
+	this_pbe = (struct pbe *) kmap_atomic(pbe_page);
+	first_pbe = this_pbe;
+
+	while (this_pbe) {
+		int loop = (PAGE_SIZE / sizeof(unsigned long)) - 1;
+
+		origpage = kmap_atomic(pfn_to_page((unsigned long) this_pbe->orig_address));
+		copypage = kmap_atomic((struct page *) this_pbe->address);
+
+		while (loop >= 0) {
+			*(origpage + loop) = *(copypage + loop);
+			loop--;
+		}
+
+		kunmap_atomic(origpage);
+		kunmap_atomic(copypage);
+
+		if (!this_pbe->next)
+			break;
+
+		if (pbe_index < PBES_PER_PAGE) {
+			this_pbe++;
+			pbe_index++;
+		} else {
+			pbe_page = (struct page *) this_pbe->next;
+			kunmap_atomic(first_pbe);
+			if (!pbe_page)
+				return;
+			this_pbe = (struct pbe *) kmap_atomic(pbe_page);
+			first_pbe = this_pbe;
+			pbe_index = 1;
+		}
+	}
+	kunmap_atomic(first_pbe);
+}
+
+#else /* CONFIG_HIGHMEM */
+static void copyback_high(void) { }
+#endif
+
+char toi_wait_for_keypress_dev_console(int timeout)
+{
+	int fd, this_timeout = 255;
+	char key = '\0';
+	struct termios t, t_backup;
+
+	/* We should be guaranteed /dev/console exists after populate_rootfs()
+	 * in init/main.c.
+	 */
+	fd = sys_open("/dev/console", O_RDONLY, 0);
+	if (fd < 0) {
+		printk(KERN_INFO "Couldn't open /dev/console.\n");
+		return key;
+	}
+
+	if (sys_ioctl(fd, TCGETS, (long)&t) < 0)
+		goto out_close;
+
+	memcpy(&t_backup, &t, sizeof(t));
+
+	t.c_lflag &= ~(ISIG|ICANON|ECHO);
+	t.c_cc[VMIN] = 0;
+
+new_timeout:
+	if (timeout > 0) {
+		this_timeout = timeout < 26 ? timeout : 25;
+		timeout -= this_timeout;
+		this_timeout *= 10;
+	}
+
+	t.c_cc[VTIME] = this_timeout;
+
+	if (sys_ioctl(fd, TCSETS, (long)&t) < 0)
+		goto out_restore;
+
+	while (1) {
+		if (sys_read(fd, &key, 1) <= 0) {
+			if (timeout)
+				goto new_timeout;
+			key = '\0';
+			break;
+		}
+		key = tolower(key);
+		if (test_toi_state(TOI_SANITY_CHECK_PROMPT)) {
+			if (key == 'c') {
+				set_toi_state(TOI_CONTINUE_REQ);
+				break;
+			} else if (key == ' ')
+				break;
+		} else
+			break;
+	}
+
+out_restore:
+	sys_ioctl(fd, TCSETS, (long)&t_backup);
+out_close:
+	sys_close(fd);
+
+	return key;
+}
+EXPORT_SYMBOL_GPL(toi_wait_for_keypress_dev_console);
+
+struct toi_boot_kernel_data toi_bkd __nosavedata
+		__attribute__((aligned(PAGE_SIZE))) = {
+	MY_BOOT_KERNEL_DATA_VERSION,
+	0,
+#ifdef CONFIG_TOI_REPLACE_SWSUSP
+	(1 << TOI_REPLACE_SWSUSP) |
+#endif
+	(1 << TOI_NO_FLUSHER_THREAD) |
+	(1 << TOI_PAGESET2_FULL) | (1 << TOI_LATE_CPU_HOTPLUG),
+};
+EXPORT_SYMBOL_GPL(toi_bkd);
+
+struct block_device *toi_open_by_devnum(dev_t dev)
+{
+	struct block_device *bdev = bdget(dev);
+	int err = -ENOMEM;
+	if (bdev)
+		err = blkdev_get(bdev, FMODE_READ | FMODE_NDELAY, NULL);
+	return err ? ERR_PTR(err) : bdev;
+}
+EXPORT_SYMBOL_GPL(toi_open_by_devnum);
+
+/**
+ * toi_close_bdev: Close a swap bdev.
+ *
+ * @bdev: The block device to close.
+ */
+void toi_close_bdev(struct block_device *bdev)
+{
+	blkdev_put(bdev, FMODE_READ | FMODE_NDELAY);
+}
+EXPORT_SYMBOL_GPL(toi_close_bdev);
+
+int toi_wait = CONFIG_TOI_DEFAULT_WAIT;
+EXPORT_SYMBOL_GPL(toi_wait);
+
+struct toi_core_fns *toi_core_fns;
+EXPORT_SYMBOL_GPL(toi_core_fns);
+
+unsigned long toi_result;
+EXPORT_SYMBOL_GPL(toi_result);
+
+struct pagedir pagedir1 = {1};
+EXPORT_SYMBOL_GPL(pagedir1);
+
+unsigned long toi_get_nonconflicting_page(void)
+{
+	return toi_core_fns->get_nonconflicting_page();
+}
+
+int toi_post_context_save(void)
+{
+	return toi_core_fns->post_context_save();
+}
+
+int try_tuxonice_hibernate(void)
+{
+	if (!toi_core_fns)
+		return -ENODEV;
+
+	return toi_core_fns->try_hibernate();
+}
+
+static int num_resume_calls;
+#ifdef CONFIG_TOI_IGNORE_LATE_INITCALL
+static int ignore_late_initcall = 1;
+#else
+static int ignore_late_initcall;
+#endif
+
+int toi_translate_err_default = TOI_CONTINUE_REQ;
+EXPORT_SYMBOL_GPL(toi_translate_err_default);
+
+void try_tuxonice_resume(void)
+{
+	/* Don't let it wrap around eventually */
+	if (num_resume_calls < 2)
+		num_resume_calls++;
+
+	if (num_resume_calls == 1 && ignore_late_initcall) {
+		printk(KERN_INFO "TuxOnIce: Ignoring late initcall, as requested.\n");
+		return;
+	}
+
+	if (toi_core_fns)
+		toi_core_fns->try_resume();
+	else
+		printk(KERN_INFO "TuxOnIce core not loaded yet.\n");
+}
+
+int toi_lowlevel_builtin(void)
+{
+	int error = 0;
+
+	save_processor_state();
+	error = swsusp_arch_suspend();
+	if (error)
+		printk(KERN_ERR "Error %d hibernating\n", error);
+
+	/* Restore control flow appears here */
+	if (!toi_in_hibernate) {
+		copyback_high();
+		set_toi_state(TOI_NOW_RESUMING);
+	}
+
+	restore_processor_state();
+	return error;
+}
+EXPORT_SYMBOL_GPL(toi_lowlevel_builtin);
+
+unsigned long toi_compress_bytes_in;
+EXPORT_SYMBOL_GPL(toi_compress_bytes_in);
+
+unsigned long toi_compress_bytes_out;
+EXPORT_SYMBOL_GPL(toi_compress_bytes_out);
+
+int toi_in_suspend(void)
+{
+	return in_suspend;
+}
+EXPORT_SYMBOL_GPL(toi_in_suspend);
+
+unsigned long toi_state = ((1 << TOI_BOOT_TIME) |
+		(1 << TOI_IGNORE_LOGLEVEL) |
+		(1 << TOI_IO_STOPPED));
+EXPORT_SYMBOL_GPL(toi_state);
+
+/* The number of hibernates we have started (some may have been cancelled) */
+unsigned int nr_hibernates;
+EXPORT_SYMBOL_GPL(nr_hibernates);
+
+int toi_running;
+EXPORT_SYMBOL_GPL(toi_running);
+
+__nosavedata int toi_in_hibernate;
+EXPORT_SYMBOL_GPL(toi_in_hibernate);
+
+__nosavedata struct pbe *restore_highmem_pblist;
+EXPORT_SYMBOL_GPL(restore_highmem_pblist);
+
+int toi_trace_allocs;
+EXPORT_SYMBOL_GPL(toi_trace_allocs);
+
+void toi_read_lock_tasklist(void)
+{
+	read_lock(&tasklist_lock);
+}
+EXPORT_SYMBOL_GPL(toi_read_lock_tasklist);
+
+void toi_read_unlock_tasklist(void)
+{
+	read_unlock(&tasklist_lock);
+}
+EXPORT_SYMBOL_GPL(toi_read_unlock_tasklist);
+
+#ifdef CONFIG_TOI_ZRAM_SUPPORT
+int (*toi_flag_zram_disks) (void);
+EXPORT_SYMBOL_GPL(toi_flag_zram_disks);
+
+int toi_do_flag_zram_disks(void)
+{
+	return toi_flag_zram_disks ? (*toi_flag_zram_disks)() : 0;
+}
+EXPORT_SYMBOL_GPL(toi_do_flag_zram_disks);
+#endif
+
+static int __init toi_wait_setup(char *str)
+{
+	int value;
+
+	if (sscanf(str, "=%d", &value)) {
+		if (value < -1 || value > 255)
+			printk(KERN_INFO "toi_wait outside range -1 to "
+					"255.\n");
+		else
+			toi_wait = value;
+	}
+
+	return 1;
+}
+
+__setup("toi_wait", toi_wait_setup);
+
+static int __init toi_translate_retry_setup(char *str)
+{
+	toi_translate_err_default = 0;
+	return 1;
+}
+
+__setup("toi_translate_retry", toi_translate_retry_setup);
+
+static int __init toi_debug_setup(char *str)
+{
+	toi_bkd.toi_action |= (1 << TOI_LOGALL);
+	toi_bootflags_mask |= (1 << TOI_LOGALL);
+	toi_bkd.toi_debug_state = 255;
+	toi_bkd.toi_default_console_level = 7;
+	return 1;
+}
+
+__setup("toi_debug_setup", toi_debug_setup);
+
+static int __init toi_pause_setup(char *str)
+{
+	toi_bkd.toi_action |= (1 << TOI_PAUSE);
+	toi_bootflags_mask |= (1 << TOI_PAUSE);
+	return 1;
+}
+
+__setup("toi_pause", toi_pause_setup);
+
+#ifdef CONFIG_PM_DEBUG
+static int __init toi_trace_allocs_setup(char *str)
+{
+	int value;
+
+	if (sscanf(str, "=%d", &value))
+		toi_trace_allocs = value;
+
+	return 1;
+}
+__setup("toi_trace_allocs", toi_trace_allocs_setup);
+#endif
+
+static int __init toi_ignore_late_initcall_setup(char *str)
+{
+	int value;
+
+	if (sscanf(str, "=%d", &value))
+		ignore_late_initcall = value;
+
+	return 1;
+}
+
+__setup("toi_initramfs_resume_only", toi_ignore_late_initcall_setup);
+
+static int __init toi_force_no_multithreaded_setup(char *str)
+{
+	int value;
+
+	toi_bkd.toi_action &= ~(1 << TOI_NO_MULTITHREADED_IO);
+	toi_bootflags_mask |= (1 << TOI_NO_MULTITHREADED_IO);
+
+	if (sscanf(str, "=%d", &value) && value)
+		toi_bkd.toi_action |= (1 << TOI_NO_MULTITHREADED_IO);
+
+	return 1;
+}
+
+__setup("toi_no_multithreaded", toi_force_no_multithreaded_setup);
+
+#ifdef CONFIG_KGDB
+static int __init toi_post_resume_breakpoint_setup(char *str)
+{
+	int value;
+
+	toi_bkd.toi_action &= ~(1 << TOI_POST_RESUME_BREAKPOINT);
+	toi_bootflags_mask |= (1 << TOI_POST_RESUME_BREAKPOINT);
+	if (sscanf(str, "=%d", &value) && value)
+		toi_bkd.toi_action |= (1 << TOI_POST_RESUME_BREAKPOINT);
+
+	return 1;
+}
+
+__setup("toi_post_resume_break", toi_post_resume_breakpoint_setup);
+#endif
+
+static int __init toi_disable_readahead_setup(char *str)
+{
+	int value;
+
+	toi_bkd.toi_action &= ~(1 << TOI_NO_READAHEAD);
+	toi_bootflags_mask |= (1 << TOI_NO_READAHEAD);
+	if (sscanf(str, "=%d", &value) && value)
+		toi_bkd.toi_action |= (1 << TOI_NO_READAHEAD);
+
+	return 1;
+}
+
+__setup("toi_no_readahead", toi_disable_readahead_setup);
diff --git a/kernel/power/tuxonice_builtin.h b/kernel/power/tuxonice_builtin.h
new file mode 100644
index 0000000..eea0155
--- /dev/null
+++ b/kernel/power/tuxonice_builtin.h
@@ -0,0 +1,39 @@
+/*
+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ */
+#include <asm/setup.h>
+
+extern struct toi_core_fns *toi_core_fns;
+extern unsigned long toi_compress_bytes_in, toi_compress_bytes_out;
+extern unsigned int nr_hibernates;
+extern int toi_in_hibernate;
+
+extern __nosavedata struct pbe *restore_highmem_pblist;
+
+int toi_lowlevel_builtin(void);
+
+#ifdef CONFIG_HIGHMEM
+extern __nosavedata struct zone_data *toi_nosave_zone_list;
+extern __nosavedata unsigned long toi_nosave_max_pfn;
+#endif
+
+extern unsigned long toi_get_nonconflicting_page(void);
+extern int toi_post_context_save(void);
+
+extern char toi_wait_for_keypress_dev_console(int timeout);
+extern struct block_device *toi_open_by_devnum(dev_t dev);
+extern void toi_close_bdev(struct block_device *bdev);
+extern int toi_wait;
+extern int toi_translate_err_default;
+extern int toi_force_no_multithreaded;
+extern void toi_read_lock_tasklist(void);
+extern void toi_read_unlock_tasklist(void);
+extern int toi_in_suspend(void);
+
+#ifdef CONFIG_TOI_ZRAM_SUPPORT
+extern int toi_do_flag_zram_disks(void);
+#else
+#define toi_do_flag_zram_disks() (0)
+#endif
diff --git a/kernel/power/tuxonice_checksum.c b/kernel/power/tuxonice_checksum.c
new file mode 100644
index 0000000..006e68b
--- /dev/null
+++ b/kernel/power/tuxonice_checksum.c
@@ -0,0 +1,384 @@
+/*
+ * kernel/power/tuxonice_checksum.c
+ *
+ * Copyright (C) 2006-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * This file contains data checksum routines for TuxOnIce,
+ * using cryptoapi. They are used to locate any modifications
+ * made to pageset 2 while we're saving it.
+ */
+
+#include <linux/suspend.h>
+#include <linux/highmem.h>
+#include <linux/vmalloc.h>
+#include <linux/crypto.h>
+#include <linux/scatterlist.h>
+
+#include "tuxonice.h"
+#include "tuxonice_modules.h"
+#include "tuxonice_sysfs.h"
+#include "tuxonice_io.h"
+#include "tuxonice_pageflags.h"
+#include "tuxonice_checksum.h"
+#include "tuxonice_pagedir.h"
+#include "tuxonice_alloc.h"
+#include "tuxonice_ui.h"
+
+static struct toi_module_ops toi_checksum_ops;
+
+/* Constant at the mo, but I might allow tuning later */
+static char toi_checksum_name[32] = "md4";
+/* Bytes per checksum */
+#define CHECKSUM_SIZE (16)
+
+#define CHECKSUMS_PER_PAGE ((PAGE_SIZE - sizeof(void *)) / CHECKSUM_SIZE)
+
+struct cpu_context {
+	struct crypto_hash *transform;
+	struct hash_desc desc;
+	struct scatterlist sg[2];
+	char *buf;
+};
+
+static DEFINE_PER_CPU(struct cpu_context, contexts);
+static int pages_allocated;
+static unsigned long page_list;
+
+static int toi_num_resaved;
+
+static unsigned long this_checksum, next_page;
+static int checksum_index;
+
+static inline int checksum_pages_needed(void)
+{
+	return DIV_ROUND_UP(pagedir2.size, CHECKSUMS_PER_PAGE);
+}
+
+/* ---- Local buffer management ---- */
+
+/*
+ * toi_checksum_cleanup
+ *
+ * Frees memory allocated for our labours.
+ */
+static void toi_checksum_cleanup(int ending_cycle)
+{
+	int cpu;
+
+	if (ending_cycle) {
+		for_each_online_cpu(cpu) {
+			struct cpu_context *this = &per_cpu(contexts, cpu);
+			if (this->transform) {
+				crypto_free_hash(this->transform);
+				this->transform = NULL;
+				this->desc.tfm = NULL;
+			}
+
+			if (this->buf) {
+				toi_free_page(27, (unsigned long) this->buf);
+				this->buf = NULL;
+			}
+		}
+	}
+}
+
+/*
+ * toi_checksum_initialise
+ *
+ * Prepare to do some work by allocating buffers and transforms.
+ * Returns: Int: Zero on success, 1 if the algorithm name is unset or a
+ * transform or buffer can't be allocated.
+ */
+static int toi_checksum_initialise(int starting_cycle)
+{
+	int cpu;
+
+	if (!(starting_cycle & SYSFS_HIBERNATE) || !toi_checksum_ops.enabled)
+		return 0;
+
+	if (!*toi_checksum_name) {
+		printk(KERN_INFO "TuxOnIce: No checksum algorithm name set.\n");
+		return 1;
+	}
+
+	for_each_online_cpu(cpu) {
+		struct cpu_context *this = &per_cpu(contexts, cpu);
+		struct page *page;
+
+		this->transform = crypto_alloc_hash(toi_checksum_name, 0, 0);
+		if (IS_ERR(this->transform)) {
+			printk(KERN_INFO "TuxOnIce: Failed to initialise the "
+				"%s checksum algorithm: %ld.\n",
+				toi_checksum_name, (long) this->transform);
+			this->transform = NULL;
+			return 1;
+		}
+
+		this->desc.tfm = this->transform;
+		this->desc.flags = 0;
+
+		page = toi_alloc_page(27, GFP_KERNEL);
+		if (!page)
+			return 1;
+		this->buf = page_address(page);
+		sg_init_one(&this->sg[0], this->buf, PAGE_SIZE);
+	}
+	return 0;
+}
+
+/*
+ * toi_checksum_print_debug_stats
+ * @buffer: Pointer to a buffer into which the debug info will be printed.
+ * @size: Size of the buffer.
+ *
+ * Print information to be recorded for debugging purposes into a buffer.
+ * Returns: Number of characters written to the buffer.
+ */
+
+static int toi_checksum_print_debug_stats(char *buffer, int size)
+{
+	int len;
+
+	if (!toi_checksum_ops.enabled)
+		return scnprintf(buffer, size,
+			"- Checksumming disabled.\n");
+
+	len = scnprintf(buffer, size, "- Checksum method is '%s'.\n",
+			toi_checksum_name);
+	len += scnprintf(buffer + len, size - len,
+		"  %d pages resaved in atomic copy.\n", toi_num_resaved);
+	return len;
+}
+
+static int toi_checksum_memory_needed(void)
+{
+	return toi_checksum_ops.enabled ?
+		checksum_pages_needed() << PAGE_SHIFT : 0;
+}
+
+static int toi_checksum_storage_needed(void)
+{
+	if (toi_checksum_ops.enabled)
+		return strlen(toi_checksum_name) + sizeof(int) + 1;
+	else
+		return 0;
+}
+
+/*
+ * toi_checksum_save_config_info
+ * @buffer: Pointer to a buffer of size PAGE_SIZE.
+ *
+ * Save information needed when reloading the image at resume time.
+ * Returns: Number of bytes used for saving our data.
+ */
+static int toi_checksum_save_config_info(char *buffer)
+{
+	int namelen = strlen(toi_checksum_name) + 1;
+	int total_len;
+
+	*((unsigned int *) buffer) = namelen;
+	strncpy(buffer + sizeof(unsigned int), toi_checksum_name, namelen);
+	total_len = sizeof(unsigned int) + namelen;
+	return total_len;
+}
+
+/* toi_checksum_load_config_info
+ * @buffer: Pointer to the start of the data.
+ * @size: Number of bytes that were saved.
+ *
+ * Description:	Reload information needed for checksumming the image at
+ * resume time.
+ */
+static void toi_checksum_load_config_info(char *buffer, int size)
+{
+	int namelen;
+
+	namelen = min_t(unsigned int, *((unsigned int *) buffer),
+			sizeof(toi_checksum_name) - 1);
+	strncpy(toi_checksum_name, buffer + sizeof(unsigned int), namelen);
+	toi_checksum_name[namelen] = '\0';
+}
+
+/*
+ * Free Checksum Memory
+ */
+
+void free_checksum_pages(void)
+{
+	while (pages_allocated) {
+		unsigned long next = *((unsigned long *) page_list);
+		ClearPageNosave(virt_to_page(page_list));
+		toi_free_page(15, (unsigned long) page_list);
+		page_list = next;
+		pages_allocated--;
+	}
+}
+
+/*
+ * Allocate Checksum Memory
+ */
+
+int allocate_checksum_pages(void)
+{
+	int pages_needed = checksum_pages_needed();
+
+	if (!toi_checksum_ops.enabled)
+		return 0;
+
+	while (pages_allocated < pages_needed) {
+		unsigned long *new_page =
+		  (unsigned long *) toi_get_zeroed_page(15, TOI_ATOMIC_GFP);
+		if (!new_page) {
+			printk(KERN_ERR "Unable to allocate checksum pages.\n");
+			return -ENOMEM;
+		}
+		SetPageNosave(virt_to_page(new_page));
+		(*new_page) = page_list;
+		page_list = (unsigned long) new_page;
+		pages_allocated++;
+	}
+
+	next_page = (unsigned long) page_list;
+	checksum_index = 0;
+
+	return 0;
+}
+
+char *tuxonice_get_next_checksum(void)
+{
+	if (!toi_checksum_ops.enabled)
+		return NULL;
+
+	if (checksum_index % CHECKSUMS_PER_PAGE)
+		this_checksum += CHECKSUM_SIZE;
+	else {
+		this_checksum = next_page + sizeof(void *);
+		next_page = *((unsigned long *) next_page);
+	}
+
+	checksum_index++;
+	return (char *) this_checksum;
+}
+
+int tuxonice_calc_checksum(struct page *page, char *checksum_locn)
+{
+	char *pa;
+	int result, cpu = smp_processor_id();
+	struct cpu_context *ctx = &per_cpu(contexts, cpu);
+
+	if (!toi_checksum_ops.enabled)
+		return 0;
+
+	pa = kmap(page);
+	memcpy(ctx->buf, pa, PAGE_SIZE);
+	kunmap(page);
+	result = crypto_hash_digest(&ctx->desc, ctx->sg, PAGE_SIZE,
+						checksum_locn);
+	if (result)
+		printk(KERN_ERR "TuxOnIce checksumming: crypto_hash_digest "
+				"returned %d.\n", result);
+	return result;
+}
+/*
+ * Verify checksums, flagging modified pageset 2 pages for resaving
+ */
+
+void check_checksums(void)
+{
+	int pfn, index = 0, cpu = smp_processor_id();
+	char current_checksum[CHECKSUM_SIZE];
+	struct cpu_context *ctx = &per_cpu(contexts, cpu);
+
+	if (!toi_checksum_ops.enabled) {
+		toi_message(TOI_IO, TOI_VERBOSE, 0, "Checksumming disabled.");
+		return;
+	}
+
+	next_page = (unsigned long) page_list;
+
+	toi_num_resaved = 0;
+	this_checksum = 0;
+
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "Verifying checksums.");
+	memory_bm_position_reset(pageset2_map);
+	for (pfn = memory_bm_next_pfn(pageset2_map); pfn != BM_END_OF_MAP;
+			pfn = memory_bm_next_pfn(pageset2_map)) {
+		int ret;
+		char *pa;
+		struct page *page = pfn_to_page(pfn);
+
+		if (index % CHECKSUMS_PER_PAGE) {
+			this_checksum += CHECKSUM_SIZE;
+		} else {
+			this_checksum = next_page + sizeof(void *);
+			next_page = *((unsigned long *) next_page);
+		}
+
+		/* Done when IRQs disabled so must be atomic */
+		pa = kmap_atomic(page);
+		memcpy(ctx->buf, pa, PAGE_SIZE);
+		kunmap_atomic(pa);
+		ret = crypto_hash_digest(&ctx->desc, ctx->sg, PAGE_SIZE,
+							current_checksum);
+
+		if (ret) {
+			printk(KERN_INFO "Digest failed. Returned %d.\n", ret);
+			return;
+		}
+
+		if (memcmp(current_checksum, (char *) this_checksum,
+							CHECKSUM_SIZE)) {
+			toi_message(TOI_IO, TOI_VERBOSE, 0, "Resaving %d.",
+					pfn);
+			SetPageResave(pfn_to_page(pfn));
+			toi_num_resaved++;
+			if (test_action_state(TOI_ABORT_ON_RESAVE_NEEDED))
+				set_abort_result(TOI_RESAVE_NEEDED);
+		}
+
+		index++;
+	}
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "Checksum verification complete.");
+}
+
+static struct toi_sysfs_data sysfs_params[] = {
+	SYSFS_INT("enabled", SYSFS_RW, &toi_checksum_ops.enabled, 0, 1, 0,
+			NULL),
+	SYSFS_BIT("abort_if_resave_needed", SYSFS_RW, &toi_bkd.toi_action,
+			TOI_ABORT_ON_RESAVE_NEEDED, 0)
+};
+
+/*
+ * Ops structure.
+ */
+static struct toi_module_ops toi_checksum_ops = {
+	.type			= MISC_MODULE,
+	.name			= "checksumming",
+	.directory		= "checksum",
+	.module			= THIS_MODULE,
+	.initialise		= toi_checksum_initialise,
+	.cleanup		= toi_checksum_cleanup,
+	.print_debug_info	= toi_checksum_print_debug_stats,
+	.save_config_info	= toi_checksum_save_config_info,
+	.load_config_info	= toi_checksum_load_config_info,
+	.memory_needed		= toi_checksum_memory_needed,
+	.storage_needed		= toi_checksum_storage_needed,
+
+	.sysfs_data		= sysfs_params,
+	.num_sysfs_entries	= sizeof(sysfs_params) /
+		sizeof(struct toi_sysfs_data),
+};
+
+/* ---- Registration ---- */
+int toi_checksum_init(void)
+{
+	int result = toi_register_module(&toi_checksum_ops);
+	return result;
+}
+
+void toi_checksum_exit(void)
+{
+	toi_unregister_module(&toi_checksum_ops);
+}
diff --git a/kernel/power/tuxonice_checksum.h b/kernel/power/tuxonice_checksum.h
new file mode 100644
index 0000000..0f2812e
--- /dev/null
+++ b/kernel/power/tuxonice_checksum.h
@@ -0,0 +1,31 @@
+/*
+ * kernel/power/tuxonice_checksum.h
+ *
+ * Copyright (C) 2006-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * This file contains data checksum routines for TuxOnIce,
+ * using cryptoapi. They are used to locate any modifications
+ * made to pageset 2 while we're saving it.
+ */
+
+#if defined(CONFIG_TOI_CHECKSUM)
+extern int toi_checksum_init(void);
+extern void toi_checksum_exit(void);
+void check_checksums(void);
+int allocate_checksum_pages(void);
+void free_checksum_pages(void);
+char *tuxonice_get_next_checksum(void);
+int tuxonice_calc_checksum(struct page *page, char *checksum_locn);
+#else
+static inline int toi_checksum_init(void) { return 0; }
+static inline void toi_checksum_exit(void) { }
+static inline void check_checksums(void) { }
+static inline int allocate_checksum_pages(void) { return 0; }
+static inline void free_checksum_pages(void) { }
+static inline char *tuxonice_get_next_checksum(void) { return NULL; }
+static inline int tuxonice_calc_checksum(struct page *page, char *checksum_locn)
+	{ return 0; }
+#endif
+
diff --git a/kernel/power/tuxonice_cluster.c b/kernel/power/tuxonice_cluster.c
new file mode 100644
index 0000000..0e5a262
--- /dev/null
+++ b/kernel/power/tuxonice_cluster.c
@@ -0,0 +1,1069 @@
+/*
+ * kernel/power/tuxonice_cluster.c
+ *
+ * Copyright (C) 2006-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * This file contains routines for cluster hibernation support.
+ *
+ * Based on ip autoconfiguration code in net/ipv4/ipconfig.c.
+ *
+ * How does it work?
+ *
+ * There is no 'master' node that tells everyone else what to do. All nodes
+ * send messages to the broadcast address/port, maintain a list of peers
+ * and figure out when to progress to the next step in hibernating or resuming.
+ * This makes us more fault tolerant when it comes to nodes coming and going
+ * (which may be more of an issue if we're hibernating when power supplies
+ * are being unreliable).
+ *
+ * At boot time, we start a ktuxonice thread that handles communication with
+ * other nodes. This thread maintains a state machine that controls our
+ * progress through hibernating and resuming, keeping us in step with other
+ * nodes. Nodes are identified by their hw address.
+ *
+ * On startup, the node sends CLUSTER_PING on the configured interface's
+ * broadcast address, port $toi_cluster_port (see below) and begins to listen
+ * for other broadcast messages. CLUSTER_PING messages are repeated at
+ * intervals of 5 minutes, with a random offset to spread traffic out.
+ *
+ * A hibernation cycle is initiated from any node via
+ *
+ * echo > /sys/power/tuxonice/do_hibernate
+ *
+ * and (possibly) the hibernate script. At each step of the process, the node
+ * completes its work, and waits for all other nodes to signal completion of
+ * their work (or timeout) before progressing to the next step.
+ *
+ * Request/state  Action before reply	Possible reply	Next state
+ * HIBERNATE	  capable, pre-script	HIBERNATE|ACK	NODE_PREP
+ * 					HIBERNATE|NACK	INIT_0
+ *
+ * PREP		  prepare_image		PREP|ACK	IMAGE_WRITE
+ *		 			PREP|NACK	INIT_0
+ * 					ABORT		RUNNING
+ *
+ * IO		  write image		IO|ACK		power off
+ * 					ABORT		POST_RESUME
+ *
+ * (Boot time)	  check for image	IMAGE|ACK	RESUME_PREP
+ * 					(Note 1)
+ * 					IMAGE|NACK	(Note 2)
+ *
+ * PREP		  prepare read image	PREP|ACK	IMAGE_READ
+ * 					PREP|NACK	(As NACK_IMAGE)
+ *
+ * IO		  read image		IO|ACK		POST_RESUME
+ *
+ * POST_RESUME	  thaw, post-script			RUNNING
+ *
+ * INIT_0	  init 0
+ *
+ * Other messages:
+ *
+ * - PING: Request for all other live nodes to send a PONG. Used at startup to
+ *   announce presence, when a node is suspected dead and periodically, in case
+ *   segments of the network are [un]plugged.
+ *
+ * - PONG: Response to a PING.
+ *
+ * - ABORT: Request to cancel writing an image.
+ *
+ * - BYE: Notification that this node is shutting down.
+ *
+ * Note 1: Repeated at 3s intervals until we continue to boot/resume, so that
+ * nodes which are slower to start up can get state synchronised. If a node
+ * starting up sees other nodes sending RESUME_PREP or IMAGE_READ, it may send
+ * ACK_IMAGE and they will wait for it to catch up. If it sees ACK_READ, it
+ * must invalidate its image (if any) and boot normally.
+ *
+ * Note 2: May occur when one node lost power or powered off while others
+ * hibernated. This node waits for others to complete resuming (ACK_READ)
+ * before completing its boot, so that it appears as a failed node restarting.
+ *
+ * If any node has an image, then it also has a list of nodes that hibernated
+ * in synchronisation with it. The node will wait for other nodes to appear
+ * or timeout before beginning its restoration.
+ *
+ * If a node has no image, it needs to wait, in case other nodes which do have
+ * an image are going to resume, but are taking longer to announce their
+ * presence. For this reason, the user can specify a timeout value and a number
+ * of nodes detected before we just continue. (We might want to assume in a
+ * cluster of, say, 15 nodes, if 8 others have booted without finding an image,
+ * the remaining nodes will too. This might help in situations where some nodes
+ * are much slower to boot, or more subject to hardware failures or such like).
+ */
+
+#include <linux/suspend.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/if.h>
+#include <linux/rtnetlink.h>
+#include <linux/ip.h>
+#include <linux/udp.h>
+#include <linux/in.h>
+#include <linux/if_arp.h>
+#include <linux/kthread.h>
+#include <linux/wait.h>
+#include <linux/netdevice.h>
+#include <net/ip.h>
+
+#include "tuxonice.h"
+#include "tuxonice_modules.h"
+#include "tuxonice_sysfs.h"
+#include "tuxonice_alloc.h"
+#include "tuxonice_io.h"
+
+#if 1
+#define PRINTK(a, b...) do { printk(a, ##b); } while (0)
+#else
+#define PRINTK(a, b...) do { } while (0)
+#endif
+
+static int loopback_mode;
+static int num_local_nodes = 1;
+#define MAX_LOCAL_NODES 8
+#define SADDR (loopback_mode ? b->sid : h->saddr)
+
+#define MYNAME "TuxOnIce Clustering"
+
+enum cluster_message {
+	MSG_ACK = 1,
+	MSG_NACK = 2,
+	MSG_PING = 4,
+	MSG_ABORT = 8,
+	MSG_BYE = 16,
+	MSG_HIBERNATE = 32,
+	MSG_IMAGE = 64,
+	MSG_IO = 128,
+	MSG_RUNNING = 256
+};
+
+static char *str_message(int message)
+{
+	switch (message) {
+	case MSG_PING:
+		return "Ping";
+	case MSG_ABORT:
+		return "Abort";
+	case MSG_ABORT | MSG_ACK:
+		return "Abort acked";
+	case MSG_ABORT | MSG_NACK:
+		return "Abort nacked";
+	case MSG_BYE:
+		return "Bye";
+	case MSG_BYE | MSG_ACK:
+		return "Bye acked";
+	case MSG_BYE | MSG_NACK:
+		return "Bye nacked";
+	case MSG_HIBERNATE:
+		return "Hibernate request";
+	case MSG_HIBERNATE | MSG_ACK:
+		return "Hibernate ack";
+	case MSG_HIBERNATE | MSG_NACK:
+		return "Hibernate nack";
+	case MSG_IMAGE:
+		return "Image exists?";
+	case MSG_IMAGE | MSG_ACK:
+		return "Image does exist";
+	case MSG_IMAGE | MSG_NACK:
+		return "No image here";
+	case MSG_IO:
+		return "I/O";
+	case MSG_IO | MSG_ACK:
+		return "I/O okay";
+	case MSG_IO | MSG_NACK:
+		return "I/O failed";
+	case MSG_RUNNING:
+		return "Running";
+	default:
+		printk(KERN_ERR "Unrecognised message %d.\n", message);
+		return "Unrecognised message (see dmesg)";
+	}
+}
+
+#define MSG_ACK_MASK (MSG_ACK | MSG_NACK)
+#define MSG_STATE_MASK (~MSG_ACK_MASK)
+
+struct node_info {
+	struct list_head member_list;
+	wait_queue_head_t member_events;
+	spinlock_t member_list_lock;
+	spinlock_t receive_lock;
+	int peer_count, ignored_peer_count;
+	struct toi_sysfs_data sysfs_data;
+	enum cluster_message current_message;
+};
+
+struct node_info node_array[MAX_LOCAL_NODES];
+
+struct cluster_member {
+	__be32 addr;
+	enum cluster_message message;
+	struct list_head list;
+	int ignore;
+};
+
+#define toi_cluster_port_send 3501
+#define toi_cluster_port_recv 3502
+
+static struct net_device *net_dev;
+static struct toi_module_ops toi_cluster_ops;
+
+static int toi_recv(struct sk_buff *skb, struct net_device *dev,
+		struct packet_type *pt, struct net_device *orig_dev);
+
+static struct packet_type toi_cluster_packet_type = {
+	.type =	__constant_htons(ETH_P_IP),
+	.func =	toi_recv,
+};
+
+struct toi_pkt {		/* BOOTP packet format */
+	struct iphdr iph;	/* IP header */
+	struct udphdr udph;	/* UDP header */
+	u8 htype;		/* HW address type */
+	u8 hlen;		/* HW address length */
+	__be32 xid;		/* Transaction ID */
+	__be16 secs;		/* Seconds since we started */
+	__be16 flags;		/* Just what it says */
+	u8 hw_addr[16];		/* Sender's HW address */
+	u16 message;		/* Message */
+	unsigned long sid;	/* Source ID for loopback testing */
+};
+
+static char toi_cluster_iface[IFNAMSIZ] = CONFIG_TOI_DEFAULT_CLUSTER_INTERFACE;
+
+static int added_pack;
+
+static int others_have_image;
+
+/* Key used to allow multiple clusters on the same lan */
+static char toi_cluster_key[32] = CONFIG_TOI_DEFAULT_CLUSTER_KEY;
+static char pre_hibernate_script[255] =
+	CONFIG_TOI_DEFAULT_CLUSTER_PRE_HIBERNATE;
+static char post_hibernate_script[255] =
+	CONFIG_TOI_DEFAULT_CLUSTER_POST_HIBERNATE;
+
+/*			Timeouts (in jiffies)			*/
+static unsigned long continue_delay = 5 * HZ;
+static unsigned long cluster_message_timeout = 3 * HZ;
+
+/* 		=== Membership list === 	*/
+
+static void print_member_info(int index)
+{
+	struct cluster_member *this;
+
+	printk(KERN_INFO "==> Dumping node %d.\n", index);
+
+	list_for_each_entry(this, &node_array[index].member_list, list)
+		printk(KERN_INFO "%d.%d.%d.%d last message %s. %s\n",
+				NIPQUAD(this->addr),
+				str_message(this->message),
+				this->ignore ? "(Ignored)" : "");
+	printk(KERN_INFO "== Done ==\n");
+}
+
+static struct cluster_member *__find_member(int index, __be32 addr)
+{
+	struct cluster_member *this;
+
+	list_for_each_entry(this, &node_array[index].member_list, list) {
+		if (this->addr != addr)
+			continue;
+
+		return this;
+	}
+
+	return NULL;
+}
+
+static void set_ignore(int index, __be32 addr, struct cluster_member *this)
+{
+	if (this->ignore) {
+		PRINTK("Node %d already ignoring %d.%d.%d.%d.\n",
+				index, NIPQUAD(addr));
+		return;
+	}
+
+	PRINTK("Node %d sees node %d.%d.%d.%d now being ignored.\n",
+				index, NIPQUAD(addr));
+	this->ignore = 1;
+	node_array[index].ignored_peer_count++;
+}
+
+static int __add_update_member(int index, __be32 addr, int message)
+{
+	struct cluster_member *this;
+
+	this = __find_member(index, addr);
+	if (this) {
+		if (this->message != message) {
+			this->message = message;
+			if ((message & MSG_NACK) &&
+			    (message & (MSG_HIBERNATE | MSG_IMAGE | MSG_IO)))
+				set_ignore(index, addr, this);
+			PRINTK("Node %d sees node %d.%d.%d.%d now sending "
+					"%s.\n", index, NIPQUAD(addr),
+					str_message(message));
+			wake_up(&node_array[index].member_events);
+		}
+		return 0;
+	}
+
+	this = (struct cluster_member *) toi_kzalloc(36,
+			sizeof(struct cluster_member), GFP_ATOMIC);
+
+	if (!this)
+		return -1;
+
+	this->addr = addr;
+	this->message = message;
+	this->ignore = 0;
+	INIT_LIST_HEAD(&this->list);
+
+	node_array[index].peer_count++;
+
+	PRINTK("Node %d sees node %d.%d.%d.%d sending %s.\n", index,
+			NIPQUAD(addr), str_message(message));
+
+	if ((message & MSG_NACK) &&
+	    (message & (MSG_HIBERNATE | MSG_IMAGE | MSG_IO)))
+		set_ignore(index, addr, this);
+	list_add_tail(&this->list, &node_array[index].member_list);
+	return 1;
+}
+
+static int add_update_member(int index, __be32 addr, int message)
+{
+	int result;
+	unsigned long flags;
+	spin_lock_irqsave(&node_array[index].member_list_lock, flags);
+	result = __add_update_member(index, addr, message);
+	spin_unlock_irqrestore(&node_array[index].member_list_lock, flags);
+
+	print_member_info(index);
+
+	wake_up(&node_array[index].member_events);
+
+	return result;
+}
+
+static void del_member(int index, __be32 addr)
+{
+	struct cluster_member *this;
+	unsigned long flags;
+
+	spin_lock_irqsave(&node_array[index].member_list_lock, flags);
+	this = __find_member(index, addr);
+
+	if (this) {
+		list_del_init(&this->list);
+		toi_kfree(36, this, sizeof(*this));
+		node_array[index].peer_count--;
+	}
+
+	spin_unlock_irqrestore(&node_array[index].member_list_lock, flags);
+}
+
+/* 		=== Message transmission ===	*/
+
+static void toi_send_if(int message, unsigned long my_id);
+
+/*
+ *  Process received TOI packet.
+ */
+static int toi_recv(struct sk_buff *skb, struct net_device *dev,
+		struct packet_type *pt, struct net_device *orig_dev)
+{
+	struct toi_pkt *b;
+	struct iphdr *h;
+	int len, result, index;
+	unsigned long addr, message, ack;
+
+	/* Perform verifications before taking the lock.  */
+	if (skb->pkt_type == PACKET_OTHERHOST)
+		goto drop;
+
+	if (dev != net_dev)
+		goto drop;
+
+	skb = skb_share_check(skb, GFP_ATOMIC);
+	if (!skb)
+		return NET_RX_DROP;
+
+	if (!pskb_may_pull(skb,
+			   sizeof(struct iphdr) +
+			   sizeof(struct udphdr)))
+		goto drop;
+
+	b = (struct toi_pkt *)skb_network_header(skb);
+	h = &b->iph;
+
+	if (h->ihl != 5 || h->version != 4 || h->protocol != IPPROTO_UDP)
+		goto drop;
+
+	/* Fragments are not supported */
+	if (h->frag_off & htons(IP_OFFSET | IP_MF)) {
+		if (net_ratelimit())
+			printk(KERN_ERR "TuxOnIce: Ignoring fragmented "
+			       "cluster message.\n");
+		goto drop;
+	}
+
+	if (skb->len < ntohs(h->tot_len))
+		goto drop;
+
+	if (ip_fast_csum((char *) h, h->ihl))
+		goto drop;
+
+	if (b->udph.source != htons(toi_cluster_port_send) ||
+	    b->udph.dest != htons(toi_cluster_port_recv))
+		goto drop;
+
+	if (ntohs(h->tot_len) < ntohs(b->udph.len) + sizeof(struct iphdr))
+		goto drop;
+
+	len = ntohs(b->udph.len) - sizeof(struct udphdr);
+
+	/* Ok the front looks good, make sure we can get at the rest.  */
+	if (!pskb_may_pull(skb, skb->len))
+		goto drop;
+
+	b = (struct toi_pkt *)skb_network_header(skb);
+	h = &b->iph;
+
+	addr = SADDR;
+	PRINTK(">>> Message %s received from " NIPQUAD_FMT ".\n",
+			str_message(b->message), NIPQUAD(addr));
+
+	message = b->message & MSG_STATE_MASK;
+	ack = b->message & MSG_ACK_MASK;
+
+	for (index = 0; index < num_local_nodes; index++) {
+		int new_message = node_array[index].current_message,
+		    old_message = new_message;
+
+		if (index == SADDR || !old_message) {
+			PRINTK("Ignoring node %d (offline or self).\n", index);
+			continue;
+		}
+
+		/* One message at a time, please. */
+		spin_lock(&node_array[index].receive_lock);
+
+		result = add_update_member(index, SADDR, b->message);
+		if (result == -1) {
+			printk(KERN_INFO "Failed to add new cluster member "
+					NIPQUAD_FMT ".\n",
+					NIPQUAD(addr));
+			goto drop_unlock;
+		}
+
+		switch (b->message & MSG_STATE_MASK) {
+		case MSG_PING:
+			break;
+		case MSG_ABORT:
+			break;
+		case MSG_BYE:
+			break;
+		case MSG_HIBERNATE:
+			/* Can I hibernate? */
+			new_message = MSG_HIBERNATE |
+				((index & 1) ? MSG_NACK : MSG_ACK);
+			break;
+		case MSG_IMAGE:
+			/* Can I resume? */
+			new_message = MSG_IMAGE |
+				((index & 1) ? MSG_NACK : MSG_ACK);
+			if (new_message != old_message)
+				printk(KERN_ERR "Setting whether I can resume "
+						"to %d.\n", new_message);
+			break;
+		case MSG_IO:
+			new_message = MSG_IO | MSG_ACK;
+			break;
+		case MSG_RUNNING:
+			break;
+		default:
+			if (net_ratelimit())
+				printk(KERN_ERR "Unrecognised TuxOnIce cluster"
+					" message %d from " NIPQUAD_FMT ".\n",
+					b->message, NIPQUAD(addr));
+		}
+
+		if (old_message != new_message) {
+			node_array[index].current_message = new_message;
+			printk(KERN_INFO ">>> Sending new message for node "
+					"%d.\n", index);
+			toi_send_if(new_message, index);
+		} else if (!ack) {
+			printk(KERN_INFO ">>> Resending message for node %d.\n",
+					index);
+			toi_send_if(new_message, index);
+		}
+drop_unlock:
+		spin_unlock(&node_array[index].receive_lock);
+	}
+
+drop:
+	/* Throw the packet out. */
+	kfree_skb(skb);
+
+	return 0;
+}
+
+/*
+ *  Send cluster message to single interface.
+ */
+static void toi_send_if(int message, unsigned long my_id)
+{
+	struct sk_buff *skb;
+	struct toi_pkt *b;
+	int hh_len = LL_RESERVED_SPACE(net_dev);
+	struct iphdr *h;
+
+	/* Allocate packet (atomic: toi_recv calls us with a spinlock held) */
+	skb = alloc_skb(sizeof(struct toi_pkt) + hh_len + 15, GFP_ATOMIC);
+	if (!skb)
+		return;
+	skb_reserve(skb, hh_len);
+	b = (struct toi_pkt *) skb_put(skb, sizeof(struct toi_pkt));
+	memset(b, 0, sizeof(struct toi_pkt));
+
+	/* Construct IP header */
+	skb_reset_network_header(skb);
+	h = ip_hdr(skb);
+	h->version = 4;
+	h->ihl = 5;
+	h->tot_len = htons(sizeof(struct toi_pkt));
+	h->frag_off = htons(IP_DF);
+	h->ttl = 64;
+	h->protocol = IPPROTO_UDP;
+	h->daddr = htonl(INADDR_BROADCAST);
+	h->check = ip_fast_csum((unsigned char *) h, h->ihl);
+
+	/* Construct UDP header */
+	b->udph.source = htons(toi_cluster_port_send);
+	b->udph.dest = htons(toi_cluster_port_recv);
+	b->udph.len = htons(sizeof(struct toi_pkt) - sizeof(struct iphdr));
+	/* UDP checksum not calculated -- explicitly allowed in BOOTP RFC */
+
+	/* Construct message */
+	b->message = message;
+	b->sid = my_id;
+	b->htype = net_dev->type; /* u16 truncated to u8; types >= 256 mangled */
+	b->hlen = net_dev->addr_len;
+	memcpy(b->hw_addr, net_dev->dev_addr, net_dev->addr_len);
+	b->secs = htons(3); /* 3 seconds */
+
+	/* Chain packet down the line... */
+	skb->dev = net_dev;
+	skb->protocol = htons(ETH_P_IP);
+	if ((dev_hard_header(skb, net_dev, ntohs(skb->protocol),
+		     net_dev->broadcast, net_dev->dev_addr, skb->len) < 0) ||
+			dev_queue_xmit(skb) < 0)
+		printk(KERN_ERR MYNAME ": Failed to send message.\n");
+}
+
+/*	=========================================		*/
+
+/*			kTOICluster			*/
+
+static atomic_t num_cluster_threads;
+static DECLARE_WAIT_QUEUE_HEAD(clusterd_events);
+
+static int kTOICluster(void *data)
+{
+	unsigned long my_id;
+
+	my_id = atomic_add_return(1, &num_cluster_threads) - 1;
+	node_array[my_id].current_message = (unsigned long) data;
+
+	PRINTK("kTOICluster daemon %lu starting.\n", my_id);
+
+	current->flags |= PF_NOFREEZE;
+
+	while (node_array[my_id].current_message) {
+		toi_send_if(node_array[my_id].current_message, my_id);
+		sleep_on_timeout(&clusterd_events,
+				cluster_message_timeout);
+		PRINTK("Link state %lu is %d.\n", my_id,
+				node_array[my_id].current_message);
+	}
+
+	toi_send_if(MSG_BYE, my_id);
+	atomic_dec(&num_cluster_threads);
+	wake_up(&clusterd_events);
+
+	PRINTK("kTOICluster daemon %lu exiting.\n", my_id);
+	__set_current_state(TASK_RUNNING);
+	return 0;
+}
+
+static void kill_clusterd(void)
+{
+	int i;
+
+	for (i = 0; i < num_local_nodes; i++) {
+		if (node_array[i].current_message) {
+			PRINTK("Seeking to kill clusterd %d.\n", i);
+			node_array[i].current_message = 0;
+		}
+	}
+	wait_event(clusterd_events,
+			!atomic_read(&num_cluster_threads));
+	PRINTK("All cluster daemons have exited.\n");
+}
+
+static int peers_not_in_message(int index, int message, int precise)
+{
+	struct cluster_member *this;
+	unsigned long flags;
+	int result = 0;
+
+	spin_lock_irqsave(&node_array[index].member_list_lock, flags);
+	list_for_each_entry(this, &node_array[index].member_list, list) {
+		if (this->ignore)
+			continue;
+
+		PRINTK("Peer %d.%d.%d.%d sending %s. "
+			"Seeking %s.\n",
+			NIPQUAD(this->addr),
+			str_message(this->message), str_message(message));
+		if ((precise ? this->message :
+					this->message & MSG_STATE_MASK) !=
+					message)
+			result++;
+	}
+	spin_unlock_irqrestore(&node_array[index].member_list_lock, flags);
+	PRINTK("%d peers not in sought message.\n", result);
+	return result;
+}
+
+static void reset_ignored(int index)
+{
+	struct cluster_member *this;
+	unsigned long flags;
+
+	spin_lock_irqsave(&node_array[index].member_list_lock, flags);
+	list_for_each_entry(this, &node_array[index].member_list, list)
+		this->ignore = 0;
+	node_array[index].ignored_peer_count = 0;
+	spin_unlock_irqrestore(&node_array[index].member_list_lock, flags);
+}
+
+static int peers_in_message(int index, int message, int precise)
+{
+	return node_array[index].peer_count -
+		node_array[index].ignored_peer_count -
+		peers_not_in_message(index, message, precise);
+}
+
+static int time_to_continue(int index, unsigned long start, int message)
+{
+	int first = peers_not_in_message(index, message, 0);
+	int second = peers_in_message(index, message, 1);
+
+	PRINTK("First part returns %d, second returns %d.\n", first, second);
+
+	if (!first && !second) {
+		PRINTK("All peers answered message %d.\n",
+			message);
+		return 1;
+	}
+
+	if (time_after(jiffies, start + continue_delay)) {
+		PRINTK("Timeout reached.\n");
+		return 1;
+	}
+
+	PRINTK("Not time to continue yet (%lu < %lu).\n", jiffies,
+			start + continue_delay);
+	return 0;
+}
+
+void toi_initiate_cluster_hibernate(void)
+{
+	int result;
+	unsigned long start;
+
+	result = do_toi_step(STEP_HIBERNATE_PREPARE_IMAGE);
+	if (result)
+		return;
+
+	toi_send_if(MSG_HIBERNATE, 0);
+
+	start = jiffies;
+	wait_event(node_array[0].member_events,
+			time_to_continue(0, start, MSG_HIBERNATE));
+
+	if (test_action_state(TOI_FREEZER_TEST)) {
+		toi_send_if(MSG_ABORT, 0);
+
+		start = jiffies;
+		wait_event(node_array[0].member_events,
+			time_to_continue(0, start, MSG_RUNNING));
+
+		do_toi_step(STEP_QUIET_CLEANUP);
+		return;
+	}
+
+	toi_send_if(MSG_IO, 0);
+
+	result = do_toi_step(STEP_HIBERNATE_SAVE_IMAGE);
+	if (result)
+		return;
+
+	/* This code runs at resume time too! */
+	if (toi_in_hibernate)
+		result = do_toi_step(STEP_HIBERNATE_POWERDOWN);
+}
+EXPORT_SYMBOL_GPL(toi_initiate_cluster_hibernate);
+
+/* toi_cluster_print_debug_stats
+ *
+ * Description:	Print information to be recorded for debugging purposes into a
+ * 		buffer.
+ * Arguments:	buffer: Pointer to a buffer into which the debug info will be
+ * 			printed.
+ * 		size:	Size of the buffer.
+ * Returns:	Number of characters written to the buffer.
+ */
+static int toi_cluster_print_debug_stats(char *buffer, int size)
+{
+	int len;
+
+	if (strlen(toi_cluster_iface))
+		len = scnprintf(buffer, size,
+				"- Cluster interface is '%s'.\n",
+				toi_cluster_iface);
+	else
+		len = scnprintf(buffer, size,
+				"- Cluster support is disabled.\n");
+	return len;
+}
+
+/* toi_cluster_memory_needed
+ *
+ * Description:	Tell the caller how much memory we need to operate during
+ * 		hibernate/resume.
+ * Returns:	Int. Maximum number of bytes of memory required for
+ * 		operation.
+ */
+static int toi_cluster_memory_needed(void)
+{
+	return 0;
+}
+
+static int toi_cluster_storage_needed(void)
+{
+	return 1 + strlen(toi_cluster_iface);
+}
+
+/* toi_cluster_save_config_info
+ *
+ * Description:	Save information needed when reloading the image at resume time.
+ * Arguments:	Buffer:		Pointer to a buffer of size PAGE_SIZE.
+ * Returns:	Number of bytes used for saving our data.
+ */
+static int toi_cluster_save_config_info(char *buffer)
+{
+	strcpy(buffer, toi_cluster_iface);
+	return strlen(toi_cluster_iface) + 1;
+}
+
+/* toi_cluster_load_config_info
+ *
+ * Description:	Reload information needed for declustering the image at
+ * 		resume time.
+ * Arguments:	Buffer:		Pointer to the start of the data.
+ *		Size:		Number of bytes that were saved.
+ */
+static void toi_cluster_load_config_info(char *buffer, int size)
+{
+	strncpy(toi_cluster_iface, buffer, size);
+	return;
+}
+
+static void cluster_startup(void)
+{
+	int have_image = do_check_can_resume(), i;
+	unsigned long start = jiffies, initial_message;
+	struct task_struct *p;
+
+	initial_message = MSG_IMAGE;
+
+	have_image = 1;
+
+	for (i = 0; i < num_local_nodes; i++) {
+		PRINTK("Starting ktoiclusterd %d.\n", i);
+		p = kthread_create(kTOICluster, (void *) initial_message,
+				"ktoiclusterd/%d", i);
+		if (IS_ERR(p)) {
+			printk(KERN_ERR "Failed to start ktoiclusterd.\n");
+			return;
+		}
+
+		wake_up_process(p);
+	}
+
+	/* Wait for delay or someone else sending first message */
+	wait_event(node_array[0].member_events, time_to_continue(0, start,
+				MSG_IMAGE));
+
+	others_have_image = peers_in_message(0, MSG_IMAGE | MSG_ACK, 1);
+
+	printk(KERN_INFO "Continuing. I %shave an image. Peers with image:"
+		" %d.\n", have_image ? "" : "don't ", others_have_image);
+
+	if (have_image) {
+		int result;
+
+		/* Start to resume */
+		printk(KERN_INFO "  === Starting to resume ===  \n");
+		node_array[0].current_message = MSG_IO;
+		toi_send_if(MSG_IO, 0);
+
+		/* result = do_toi_step(STEP_RESUME_LOAD_PS1); */
+		result = 0;
+
+		if (!result) {
+			/*
+			 * Atomic restore - we'll come back in the hibernation
+			 * path.
+			 */
+
+			/* result = do_toi_step(STEP_RESUME_DO_RESTORE); */
+			result = 0;
+
+			/* do_toi_step(STEP_QUIET_CLEANUP); */
+		}
+
+		node_array[0].current_message |= MSG_NACK;
+
+		/* For debugging - disable for real life? */
+		wait_event(node_array[0].member_events,
+				time_to_continue(0, start, MSG_IO));
+	}
+
+	if (others_have_image) {
+		/* Wait for them to resume */
+		printk(KERN_INFO "Waiting for other nodes to resume.\n");
+		start = jiffies;
+		wait_event(node_array[0].member_events,
+				time_to_continue(0, start, MSG_RUNNING));
+		if (peers_not_in_message(0, MSG_RUNNING, 0))
+			printk(KERN_INFO "Timed out while waiting for other "
+					"nodes to resume.\n");
+	}
+
+	/* Find out whether an image exists here. Send ACK_IMAGE or NACK_IMAGE
+	 * as appropriate.
+	 *
+	 * If we don't have an image:
+	 * - Wait until someone else says they have one, or conditions are met
+	 *   for continuing to boot (n machines or t seconds).
+	 * - If anyone has an image, wait for them to resume before continuing
+	 *   to boot.
+	 *
+	 * If we have an image:
+	 * - Wait until conditions are met before continuing to resume (n
+	 *   machines or t seconds). Send RESUME_PREP and freeze processes.
+	 *   NACK_PREP if freezing fails (shouldn't) and follow logic for
+	 *   us having no image above. On success, wait for [N]ACK_PREP from
+	 *   other machines. Read image (including atomic restore) until done.
+	 *   Wait for ACK_READ from others (should never fail). Thaw processes
+	 *   and do post-resume. (The section after the atomic restore is done
+	 *   via the code for hibernating).
+	 */
+
+	node_array[0].current_message = MSG_RUNNING;
+}
+
+/* toi_cluster_open_iface
+ *
+ * Description:	Prepare to use an interface.
+ */
+
+static int toi_cluster_open_iface(void)
+{
+	struct net_device *dev;
+
+	rtnl_lock();
+
+	for_each_netdev(&init_net, dev) {
+		if (/* dev == &init_net.loopback_dev || */
+		    strcmp(dev->name, toi_cluster_iface))
+			continue;
+
+		net_dev = dev;
+		break;
+	}
+
+	rtnl_unlock();
+
+	if (!net_dev) {
+		printk(KERN_ERR MYNAME ": Device %s not found.\n",
+				toi_cluster_iface);
+		return -ENODEV;
+	}
+
+	dev_add_pack(&toi_cluster_packet_type);
+	added_pack = 1;
+
+	loopback_mode = (net_dev == init_net.loopback_dev);
+	num_local_nodes = loopback_mode ? 8 : 1;
+
+	PRINTK("Loopback mode is %s. Number of local nodes is %d.\n",
+			loopback_mode ? "on" : "off", num_local_nodes);
+
+	cluster_startup();
+	return 0;
+}
+
+/* toi_cluster_close_iface
+ *
+ * Description: Stop using an interface.
+ */
+
+static int toi_cluster_close_iface(void)
+{
+	kill_clusterd();
+	if (added_pack) {
+		dev_remove_pack(&toi_cluster_packet_type);
+		added_pack = 0;
+	}
+	return 0;
+}
+
+static void write_side_effect(void)
+{
+	if (toi_cluster_ops.enabled) {
+		toi_cluster_open_iface();
+		set_toi_state(TOI_CLUSTER_MODE);
+	} else {
+		toi_cluster_close_iface();
+		clear_toi_state(TOI_CLUSTER_MODE);
+	}
+}
+
+static void node_write_side_effect(void)
+{
+}
+
+/*
+ * data for our sysfs entries.
+ */
+static struct toi_sysfs_data sysfs_params[] = {
+	SYSFS_STRING("interface", SYSFS_RW, toi_cluster_iface, IFNAMSIZ, 0,
+			NULL),
+	SYSFS_INT("enabled", SYSFS_RW, &toi_cluster_ops.enabled, 0, 1, 0,
+			write_side_effect),
+	SYSFS_STRING("cluster_name", SYSFS_RW, toi_cluster_key, 32, 0, NULL),
+	SYSFS_STRING("pre-hibernate-script", SYSFS_RW, pre_hibernate_script,
+			256, 0, NULL),
+	SYSFS_STRING("post-hibernate-script", SYSFS_RW, post_hibernate_script,
+			256, 0, NULL),
+	SYSFS_UL("continue_delay", SYSFS_RW, &continue_delay, HZ / 2, 60 * HZ,
+			0)
+};
+
+/*
+ * Ops structure.
+ */
+
+static struct toi_module_ops toi_cluster_ops = {
+	.type			= FILTER_MODULE,
+	.name			= "Cluster",
+	.directory		= "cluster",
+	.module			= THIS_MODULE,
+	.memory_needed 		= toi_cluster_memory_needed,
+	.print_debug_info	= toi_cluster_print_debug_stats,
+	.save_config_info	= toi_cluster_save_config_info,
+	.load_config_info	= toi_cluster_load_config_info,
+	.storage_needed		= toi_cluster_storage_needed,
+
+	.sysfs_data		= sysfs_params,
+	.num_sysfs_entries	= sizeof(sysfs_params) /
+		sizeof(struct toi_sysfs_data),
+};
+
+/* ---- Registration ---- */
+
+#ifdef MODULE
+#define INIT static __init
+#define EXIT static __exit
+#else
+#define INIT
+#define EXIT
+#endif
+
+INIT int toi_cluster_init(void)
+{
+	int temp = toi_register_module(&toi_cluster_ops), i;
+	struct kobject *kobj = toi_cluster_ops.dir_kobj;
+
+	for (i = 0; i < MAX_LOCAL_NODES; i++) {
+		node_array[i].current_message = 0;
+		INIT_LIST_HEAD(&node_array[i].member_list);
+		init_waitqueue_head(&node_array[i].member_events);
+		spin_lock_init(&node_array[i].member_list_lock);
+		spin_lock_init(&node_array[i].receive_lock);
+
+		/* Set up sysfs entry */
+		node_array[i].sysfs_data.attr.name = toi_kzalloc(8,
+				sizeof("node_0"),	/* not sizeof(pointer) */
+				GFP_KERNEL);
+		sprintf((char *) node_array[i].sysfs_data.attr.name, "node_%d",
+				i);
+		node_array[i].sysfs_data.attr.mode = SYSFS_RW;
+		node_array[i].sysfs_data.type = TOI_SYSFS_DATA_INTEGER;
+		node_array[i].sysfs_data.flags = 0;
+		node_array[i].sysfs_data.data.integer.variable =
+			(int *) &node_array[i].current_message;
+		node_array[i].sysfs_data.data.integer.minimum = 0;
+		node_array[i].sysfs_data.data.integer.maximum = INT_MAX;
+		node_array[i].sysfs_data.write_side_effect =
+			node_write_side_effect;
+		toi_register_sysfs_file(kobj, &node_array[i].sysfs_data);
+	}
+
+	toi_cluster_ops.enabled = (strlen(toi_cluster_iface) > 0);
+
+	if (toi_cluster_ops.enabled)
+		toi_cluster_open_iface();
+
+	return temp;
+}
+
+EXIT void toi_cluster_exit(void)
+{
+	int i;
+	toi_cluster_close_iface();
+
+	for (i = 0; i < MAX_LOCAL_NODES; i++)
+		toi_unregister_sysfs_file(toi_cluster_ops.dir_kobj,
+				&node_array[i].sysfs_data);
+	toi_unregister_module(&toi_cluster_ops);
+}
+
+static int __init toi_cluster_iface_setup(char *iface)
+{
+	toi_cluster_ops.enabled = (*iface && strcmp(iface, "off"));
+
+	if (toi_cluster_ops.enabled)
+		strlcpy(toi_cluster_iface, iface, IFNAMSIZ);
+	return 1;
+}
+
+__setup("toi_cluster=", toi_cluster_iface_setup);
+
+#ifdef MODULE
+MODULE_LICENSE("GPL");
+module_init(toi_cluster_init);
+module_exit(toi_cluster_exit);
+MODULE_AUTHOR("Nigel Cunningham");
+MODULE_DESCRIPTION("Cluster Support for TuxOnIce");
+#endif
diff --git a/kernel/power/tuxonice_cluster.h b/kernel/power/tuxonice_cluster.h
new file mode 100644
index 0000000..051feb3
--- /dev/null
+++ b/kernel/power/tuxonice_cluster.h
@@ -0,0 +1,18 @@
+/*
+ * kernel/power/tuxonice_cluster.h
+ *
+ * Copyright (C) 2006-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ */
+
+#ifdef CONFIG_TOI_CLUSTER
+extern int toi_cluster_init(void);
+extern void toi_cluster_exit(void);
+extern void toi_initiate_cluster_hibernate(void);
+#else
+static inline int toi_cluster_init(void) { return 0; }
+static inline void toi_cluster_exit(void) { }
+static inline void toi_initiate_cluster_hibernate(void) { }
+#endif
+
diff --git a/kernel/power/tuxonice_compress.c b/kernel/power/tuxonice_compress.c
new file mode 100644
index 0000000..2d89c4c
--- /dev/null
+++ b/kernel/power/tuxonice_compress.c
@@ -0,0 +1,465 @@
+/*
+ * kernel/power/tuxonice_compress.c
+ *
+ * Copyright (C) 2003-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * This file contains data compression routines for TuxOnIce,
+ * using cryptoapi.
+ */
+
+#include <linux/suspend.h>
+#include <linux/highmem.h>
+#include <linux/vmalloc.h>
+#include <linux/crypto.h>
+
+#include "tuxonice_builtin.h"
+#include "tuxonice.h"
+#include "tuxonice_modules.h"
+#include "tuxonice_sysfs.h"
+#include "tuxonice_io.h"
+#include "tuxonice_ui.h"
+#include "tuxonice_alloc.h"
+
+static int toi_expected_compression;
+
+static struct toi_module_ops toi_compression_ops;
+static struct toi_module_ops *next_driver;
+
+static char toi_compressor_name[32] = "lzo";
+
+static DEFINE_MUTEX(stats_lock);
+
+struct cpu_context {
+	u8 *page_buffer;
+	struct crypto_comp *transform;
+	unsigned int len;
+	u8 *buffer_start;
+	u8 *output_buffer;
+};
+
+#define OUT_BUF_SIZE (2 * PAGE_SIZE)
+
+static DEFINE_PER_CPU(struct cpu_context, contexts);
+
+/*
+ * toi_compress_crypto_prepare
+ *
+ * Prepare to do some work by allocating buffers and transforms.
+ */
+static int toi_compress_crypto_prepare(void)
+{
+	int cpu;
+
+	if (!*toi_compressor_name) {
+		printk(KERN_INFO "TuxOnIce: Compression enabled but no "
+				"compressor name set.\n");
+		return 1;
+	}
+
+	for_each_online_cpu(cpu) {
+		struct cpu_context *this = &per_cpu(contexts, cpu);
+		this->transform = crypto_alloc_comp(toi_compressor_name, 0, 0);
+		if (IS_ERR(this->transform)) {
+			printk(KERN_INFO "TuxOnIce: Failed to initialise the "
+					"%s compression transform.\n",
+					toi_compressor_name);
+			this->transform = NULL;
+			return 1;
+		}
+
+		this->page_buffer =
+			(char *) toi_get_zeroed_page(16, TOI_ATOMIC_GFP);
+
+		if (!this->page_buffer) {
+			printk(KERN_ERR
+			  "Failed to allocate a page buffer for TuxOnIce "
+			  "compression driver.\n");
+			return -ENOMEM;
+		}
+
+		this->output_buffer =
+			(char *) vmalloc_32(OUT_BUF_SIZE);
+
+		if (!this->output_buffer) {
+			printk(KERN_ERR
+			  "Failed to allocate an output buffer for TuxOnIce "
+			  "compression driver.\n");
+			return -ENOMEM;
+		}
+	}
+
+	return 0;
+}
+
+static int toi_compress_rw_cleanup(int writing)
+{
+	int cpu;
+
+	for_each_online_cpu(cpu) {
+		struct cpu_context *this = &per_cpu(contexts, cpu);
+		if (this->transform) {
+			crypto_free_comp(this->transform);
+			this->transform = NULL;
+		}
+
+		if (this->page_buffer)
+			toi_free_page(16, (unsigned long) this->page_buffer);
+
+		this->page_buffer = NULL;
+
+		if (this->output_buffer)
+			vfree(this->output_buffer);
+
+		this->output_buffer = NULL;
+	}
+
+	return 0;
+}
+
+/*
+ * toi_compress_init
+ */
+
+static int toi_compress_init(int toi_or_resume)
+{
+	if (!toi_or_resume)
+		return 0;
+
+	toi_compress_bytes_in = 0;
+	toi_compress_bytes_out = 0;
+
+	next_driver = toi_get_next_filter(&toi_compression_ops);
+
+	return next_driver ? 0 : -ECHILD;
+}
+
+/*
+ * toi_compress_rw_init()
+ */
+
+static int toi_compress_rw_init(int rw, int stream_number)
+{
+	if (toi_compress_crypto_prepare()) {
+		printk(KERN_ERR "Failed to initialise compression "
+				"algorithm.\n");
+		if (rw == READ) {
+			printk(KERN_INFO "Unable to read the image.\n");
+			return -ENODEV;
+		} else {
+			printk(KERN_INFO "Continuing without "
+				"compressing the image.\n");
+			toi_compression_ops.enabled = 0;
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * toi_compress_write_page()
+ *
+ * Compress a page of data, buffering output and passing on filled
+ * pages to the next module in the pipeline.
+ *
+ * Buffer_page:	Pointer to a buffer of size PAGE_SIZE, containing
+ * data to be compressed.
+ *
+ * Returns:	0 on success. Otherwise the error returned by the compression
+ * 		transform or, if compression succeeded, by later modules in
+ * 		the pipeline.
+ */
+static int toi_compress_write_page(unsigned long index, int buf_type,
+		void *buffer_page, unsigned int buf_size)
+{
+	int ret = 0, cpu = smp_processor_id();
+	struct cpu_context *ctx = &per_cpu(contexts, cpu);
+	u8 *output_buffer = buffer_page;
+	int output_len = buf_size;
+	int out_buf_type = buf_type;
+
+	if (ctx->transform) {
+
+		ctx->buffer_start = TOI_MAP(buf_type, buffer_page);
+		ctx->len = OUT_BUF_SIZE;
+
+		ret = crypto_comp_compress(ctx->transform,
+			ctx->buffer_start, buf_size,
+			ctx->output_buffer, &ctx->len);
+
+		TOI_UNMAP(buf_type, buffer_page);
+
+		toi_message(TOI_COMPRESS, TOI_VERBOSE, 0,
+				"CPU %d, index %lu: %d bytes",
+				cpu, index, ctx->len);
+
+		if (!ret && ctx->len < buf_size) { /* some compression */
+			output_buffer = ctx->output_buffer;
+			output_len = ctx->len;
+			out_buf_type = TOI_VIRT;
+		}
+
+	}
+
+	mutex_lock(&stats_lock);
+
+	toi_compress_bytes_in += buf_size;
+	toi_compress_bytes_out += output_len;
+
+	mutex_unlock(&stats_lock);
+
+	if (!ret)
+		ret = next_driver->write_page(index, out_buf_type,
+				output_buffer, output_len);
+
+	return ret;
+}
+
+/*
+ * toi_compress_read_page()
+ * @buffer_page: struct page *. Pointer to a buffer of size PAGE_SIZE.
+ *
+ * Retrieve data from later modules and decompress it until the input buffer
+ * is filled.
+ * Returns zero if successful, or the error from this module or a later one.
+ */
+static int toi_compress_read_page(unsigned long *index, int buf_type,
+		void *buffer_page, unsigned int *buf_size)
+{
+	int ret, cpu = smp_processor_id();
+	unsigned int len;
+	unsigned int outlen = PAGE_SIZE;
+	char *buffer_start;
+	struct cpu_context *ctx = &per_cpu(contexts, cpu);
+
+	if (!ctx->transform)
+		return next_driver->read_page(index, TOI_PAGE, buffer_page,
+				buf_size);
+
+	/*
+	 * All our reads must be synchronous - we can't decompress
+	 * data that hasn't been read yet.
+	 */
+
+	ret = next_driver->read_page(index, TOI_VIRT, ctx->page_buffer, &len);
+
+	buffer_start = TOI_MAP(buf_type, buffer_page);
+
+	/* Error or uncompressed data */
+	if (ret || len == PAGE_SIZE) {
+		memcpy(buffer_start, ctx->page_buffer, len);
+		goto out;
+	}
+
+	ret = crypto_comp_decompress(
+			ctx->transform,
+			ctx->page_buffer,
+			len, buffer_start, &outlen);
+
+	toi_message(TOI_COMPRESS, TOI_VERBOSE, 0,
+			"CPU %d, index %lu: %d=>%d (%d).",
+			cpu, *index, len, outlen, ret);
+
+	if (ret)
+		abort_hibernate(TOI_FAILED_IO,
+			"Compress_read returned %d.\n", ret);
+	else if (outlen != PAGE_SIZE) {
+		abort_hibernate(TOI_FAILED_IO,
+			"Decompression yielded %d bytes instead of %ld.\n",
+			outlen, PAGE_SIZE);
+		printk(KERN_ERR "Decompression yielded %d bytes instead of "
+				"%ld.\n", outlen, PAGE_SIZE);
+		ret = -EIO;
+		*buf_size = outlen;
+	}
+out:
+	TOI_UNMAP(buf_type, buffer_page);
+	return ret;
+}
+
+/*
+ * toi_compress_print_debug_stats
+ * @buffer: Pointer to a buffer into which the debug info will be printed.
+ * @size: Size of the buffer.
+ *
+ * Print information to be recorded for debugging purposes into a buffer.
+ * Returns: Number of characters written to the buffer.
+ */
+
+static int toi_compress_print_debug_stats(char *buffer, int size)
+{
+	unsigned long pages_in = toi_compress_bytes_in >> PAGE_SHIFT,
+		      pages_out = toi_compress_bytes_out >> PAGE_SHIFT;
+	int len;
+
+	/* Output the compression ratio achieved. */
+	if (*toi_compressor_name)
+		len = scnprintf(buffer, size, "- Compressor is '%s'.\n",
+				toi_compressor_name);
+	else
+		len = scnprintf(buffer, size, "- Compressor is not set.\n");
+
+	if (pages_in)
+		len += scnprintf(buffer+len, size - len, "  Compressed "
+			"%lu bytes into %lu (%ld percent compression).\n",
+		  toi_compress_bytes_in,
+		  toi_compress_bytes_out,
+		  (pages_in - pages_out) * 100 / pages_in);
+	return len;
+}
+
+/*
+ * toi_compress_memory_needed
+ *
+ * Tell the caller how much memory we need to operate during hibernate/resume.
+ * Returns: Int. Maximum number of bytes of memory required for
+ * operation.
+ */
+static int toi_compress_memory_needed(void)
+{
+	return 2 * PAGE_SIZE;
+}
+
+static int toi_compress_storage_needed(void)
+{
+	return 2 * sizeof(unsigned long) + 2 * sizeof(int) +
+		strlen(toi_compressor_name) + 1;
+}
+
+/*
+ * toi_compress_save_config_info
+ * @buffer: Pointer to a buffer of size PAGE_SIZE.
+ *
+ * Save information needed when reloading the image at resume time.
+ * Returns: Number of bytes used for saving our data.
+ */
+static int toi_compress_save_config_info(char *buffer)
+{
+	int len = strlen(toi_compressor_name) + 1, offset = 0;
+
+	*((unsigned long *) buffer) = toi_compress_bytes_in;
+	offset += sizeof(unsigned long);
+	*((unsigned long *) (buffer + offset)) = toi_compress_bytes_out;
+	offset += sizeof(unsigned long);
+	*((int *) (buffer + offset)) = toi_expected_compression;
+	offset += sizeof(int);
+	*((int *) (buffer + offset)) = len;
+	offset += sizeof(int);
+	strncpy(buffer + offset, toi_compressor_name, len);
+	return offset + len;
+}
+
+/* toi_compress_load_config_info
+ * @buffer: Pointer to the start of the data.
+ * @size: Number of bytes that were saved.
+ *
+ * Description:	Reload information needed for decompressing the image at
+ * resume time.
+ */
+static void toi_compress_load_config_info(char *buffer, int size)
+{
+	int len, offset = 0;
+
+	toi_compress_bytes_in = *((unsigned long *) buffer);
+	offset += sizeof(unsigned long);
+	toi_compress_bytes_out = *((unsigned long *) (buffer + offset));
+	offset += sizeof(unsigned long);
+	toi_expected_compression = *((int *) (buffer + offset));
+	offset += sizeof(int);
+	len = *((int *) (buffer + offset));
+	offset += sizeof(int);
+	strncpy(toi_compressor_name, buffer + offset, len);
+}
+
+static void toi_compress_pre_atomic_restore(struct toi_boot_kernel_data *bkd)
+{
+	bkd->compress_bytes_in = toi_compress_bytes_in;
+	bkd->compress_bytes_out = toi_compress_bytes_out;
+}
+
+static void toi_compress_post_atomic_restore(struct toi_boot_kernel_data *bkd)
+{
+	toi_compress_bytes_in = bkd->compress_bytes_in;
+	toi_compress_bytes_out = bkd->compress_bytes_out;
+}
+
+/*
+ * toi_compress_expected_ratio
+ *
+ * Description:	Returns the expected ratio between data passed into this module
+ * 		and the amount of data output when writing.
+ * Returns:	100 if the module is disabled. Otherwise 100 minus the expected
+ * 		compression percentage set by the user via our sysfs entry.
+ */
+
+static int toi_compress_expected_ratio(void)
+{
+	if (!toi_compression_ops.enabled)
+		return 100;
+	else
+		return 100 - toi_expected_compression;
+}
+
+/*
+ * data for our sysfs entries.
+ */
+static struct toi_sysfs_data sysfs_params[] = {
+	SYSFS_INT("expected_compression", SYSFS_RW, &toi_expected_compression,
+			0, 99, 0, NULL),
+	SYSFS_INT("enabled", SYSFS_RW, &toi_compression_ops.enabled, 0, 1, 0,
+			NULL),
+	SYSFS_STRING("algorithm", SYSFS_RW, toi_compressor_name, 31, 0, NULL),
+};
+
+/*
+ * Ops structure.
+ */
+static struct toi_module_ops toi_compression_ops = {
+	.type			= FILTER_MODULE,
+	.name			= "compression",
+	.directory		= "compression",
+	.module			= THIS_MODULE,
+	.initialise		= toi_compress_init,
+	.memory_needed 		= toi_compress_memory_needed,
+	.print_debug_info	= toi_compress_print_debug_stats,
+	.save_config_info	= toi_compress_save_config_info,
+	.load_config_info	= toi_compress_load_config_info,
+	.storage_needed		= toi_compress_storage_needed,
+	.expected_compression	= toi_compress_expected_ratio,
+
+	.pre_atomic_restore	= toi_compress_pre_atomic_restore,
+	.post_atomic_restore	= toi_compress_post_atomic_restore,
+
+	.rw_init		= toi_compress_rw_init,
+	.rw_cleanup		= toi_compress_rw_cleanup,
+
+	.write_page		= toi_compress_write_page,
+	.read_page		= toi_compress_read_page,
+
+	.sysfs_data		= sysfs_params,
+	.num_sysfs_entries	= sizeof(sysfs_params) /
+		sizeof(struct toi_sysfs_data),
+};
+
+/* ---- Registration ---- */
+
+static __init int toi_compress_load(void)
+{
+	return toi_register_module(&toi_compression_ops);
+}
+
+#ifdef MODULE
+static __exit void toi_compress_unload(void)
+{
+	toi_unregister_module(&toi_compression_ops);
+}
+
+module_init(toi_compress_load);
+module_exit(toi_compress_unload);
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Nigel Cunningham");
+MODULE_DESCRIPTION("Compression Support for TuxOnIce");
+#else
+late_initcall(toi_compress_load);
+#endif
diff --git a/kernel/power/tuxonice_extent.c b/kernel/power/tuxonice_extent.c
new file mode 100644
index 0000000..e84572c
--- /dev/null
+++ b/kernel/power/tuxonice_extent.c
@@ -0,0 +1,123 @@
+/*
+ * kernel/power/tuxonice_extent.c
+ *
+ * Copyright (C) 2003-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * Distributed under GPLv2.
+ *
+ * These functions encapsulate the manipulation of storage metadata.
+ */
+
+#include <linux/suspend.h>
+#include "tuxonice_modules.h"
+#include "tuxonice_extent.h"
+#include "tuxonice_alloc.h"
+#include "tuxonice_ui.h"
+#include "tuxonice.h"
+
+/**
+ * toi_get_extent - return a free extent
+ *
+ * May fail, returning NULL instead.
+ **/
+static struct hibernate_extent *toi_get_extent(void)
+{
+	return (struct hibernate_extent *) toi_kzalloc(2,
+			sizeof(struct hibernate_extent), TOI_ATOMIC_GFP);
+}
+
+/**
+ * toi_put_extent_chain - free a whole chain of extents
+ * @chain:	Chain to free.
+ **/
+void toi_put_extent_chain(struct hibernate_extent_chain *chain)
+{
+	struct hibernate_extent *this;
+
+	this = chain->first;
+
+	while (this) {
+		struct hibernate_extent *next = this->next;
+		toi_kfree(2, this, sizeof(*this));
+		chain->num_extents--;
+		this = next;
+	}
+
+	chain->first = NULL;
+	chain->last_touched = NULL;
+	chain->current_extent = NULL;
+	chain->size = 0;
+}
+EXPORT_SYMBOL_GPL(toi_put_extent_chain);
+
+/**
+ * toi_add_to_extent_chain - add an extent to an existing chain
+ * @chain:	Chain to which the extent should be added
+ * @start:	Start of the extent (first physical block)
+ * @end:	End of the extent (last physical block)
+ *
+ * The chain information is updated if the insertion is successful.
+ **/
+int toi_add_to_extent_chain(struct hibernate_extent_chain *chain,
+		unsigned long start, unsigned long end)
+{
+	struct hibernate_extent *new_ext = NULL, *cur_ext = NULL;
+
+	toi_message(TOI_IO, TOI_VERBOSE, 0,
+		"Adding extent %lu-%lu to chain %p.\n", start, end, chain);
+
+	/* Find the right place in the chain */
+	if (chain->last_touched && chain->last_touched->start < start)
+		cur_ext = chain->last_touched;
+	else if (chain->first && chain->first->start < start)
+		cur_ext = chain->first;
+
+	if (cur_ext) {
+		while (cur_ext->next && cur_ext->next->start < start)
+			cur_ext = cur_ext->next;
+
+		if (cur_ext->end == (start - 1)) {
+			struct hibernate_extent *next_ext = cur_ext->next;
+			cur_ext->end = end;
+
+			/* Merge with the following one? */
+			if (next_ext && cur_ext->end + 1 == next_ext->start) {
+				cur_ext->end = next_ext->end;
+				cur_ext->next = next_ext->next;
+				toi_kfree(2, next_ext, sizeof(*next_ext));
+				chain->num_extents--;
+			}
+
+			chain->last_touched = cur_ext;
+			chain->size += (end - start + 1);
+
+			return 0;
+		}
+	}
+
+	new_ext = toi_get_extent();
+	if (!new_ext) {
+		printk(KERN_INFO "Error unable to append a new extent to the "
+				"chain.\n");
+		return -ENOMEM;
+	}
+
+	chain->num_extents++;
+	chain->size += (end - start + 1);
+	new_ext->start = start;
+	new_ext->end = end;
+
+	chain->last_touched = new_ext;
+
+	if (cur_ext) {
+		new_ext->next = cur_ext->next;
+		cur_ext->next = new_ext;
+	} else {
+		if (chain->first)
+			new_ext->next = chain->first;
+		chain->first = new_ext;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(toi_add_to_extent_chain);
diff --git a/kernel/power/tuxonice_extent.h b/kernel/power/tuxonice_extent.h
new file mode 100644
index 0000000..157446cf
--- /dev/null
+++ b/kernel/power/tuxonice_extent.h
@@ -0,0 +1,44 @@
+/*
+ * kernel/power/tuxonice_extent.h
+ *
+ * Copyright (C) 2003-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * It contains declarations related to extents. Extents are
+ * TuxOnIce's method of storing some of the metadata for the image.
+ * See tuxonice_extent.c for more info.
+ *
+ */
+
+#include "tuxonice_modules.h"
+
+#ifndef EXTENT_H
+#define EXTENT_H
+
+struct hibernate_extent {
+	unsigned long start, end;
+	struct hibernate_extent *next;
+};
+
+struct hibernate_extent_chain {
+	unsigned long size; /* size of the chain ie sum (max-min+1) */
+	int num_extents;
+	struct hibernate_extent *first, *last_touched;
+	struct hibernate_extent *current_extent;
+	unsigned long current_offset;
+};
+
+/* Simplify iterating through all the values in an extent chain */
+#define toi_extent_for_each(extent_chain, extentpointer, value) \
+if ((extent_chain)->first) \
+	for ((extentpointer) = (extent_chain)->first, (value) = \
+			(extentpointer)->start; \
+	     ((extentpointer) && ((extentpointer)->next || (value) <= \
+				 (extentpointer)->end)); \
+	     (((value) == (extentpointer)->end) ? \
+		((extentpointer) = (extentpointer)->next, (value) = \
+		 ((extentpointer) ? (extentpointer)->start : 0)) : \
+			(value)++))
+
+#endif
diff --git a/kernel/power/tuxonice_file.c b/kernel/power/tuxonice_file.c
new file mode 100644
index 0000000..4b817c4
--- /dev/null
+++ b/kernel/power/tuxonice_file.c
@@ -0,0 +1,497 @@
+/*
+ * kernel/power/tuxonice_file.c
+ *
+ * Copyright (C) 2005-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * Distributed under GPLv2.
+ *
+ * This file encapsulates functions for usage of a simple file as a
+ * backing store. It is based upon the swapallocator, and shares the
+ * same basic working. Here, though, we have nothing to do with
+ * swapspace, and only one device to worry about.
+ *
+ * The user can just
+ *
+ * echo TuxOnIce > /path/to/my_file
+ *
+ * dd if=/dev/zero bs=1M count=<file_size_desired> >> /path/to/my_file
+ *
+ * and
+ *
+ * echo /path/to/my_file > /sys/power/tuxonice/file/target
+ *
+ * then put what they find in /sys/power/tuxonice/resume
+ * as their resume= parameter in lilo.conf (and rerun lilo if using it).
+ *
+ * Having done this, they're ready to hibernate and resume.
+ *
+ * TODO:
+ * - File resizing.
+ */
+
+#include <linux/blkdev.h>
+#include <linux/mount.h>
+#include <linux/fs.h>
+#include <linux/fs_uuid.h>
+
+#include "tuxonice.h"
+#include "tuxonice_modules.h"
+#include "tuxonice_bio.h"
+#include "tuxonice_alloc.h"
+#include "tuxonice_builtin.h"
+#include "tuxonice_sysfs.h"
+#include "tuxonice_ui.h"
+#include "tuxonice_io.h"
+
+#define target_is_normal_file() (S_ISREG(target_inode->i_mode))
+
+static struct toi_module_ops toi_fileops;
+
+static struct file *target_file;
+static struct block_device *toi_file_target_bdev;
+static unsigned long pages_available, pages_allocated;
+static char toi_file_target[256];
+static struct inode *target_inode;
+static int file_target_priority;
+static int used_devt;
+static int target_claim;
+static dev_t toi_file_dev_t;
+static int sig_page_index;
+
+/* For test_toi_file_target */
+static struct toi_bdev_info *file_chain;
+
+static int has_contiguous_blocks(struct toi_bdev_info *dev_info, int page_num)
+{
+	int j;
+	sector_t last = 0;
+
+	for (j = 0; j < dev_info->blocks_per_page; j++) {
+		sector_t this = bmap(target_inode,
+				page_num * dev_info->blocks_per_page + j);
+
+		if (!this || (last && (last + 1) != this))
+			break;
+
+		last = this;
+	}
+
+	return j == dev_info->blocks_per_page;
+}
+
+static unsigned long get_usable_pages(struct toi_bdev_info *dev_info)
+{
+	unsigned long result = 0;
+	struct block_device *bdev = dev_info->bdev;
+	int i;
+
+	switch (target_inode->i_mode & S_IFMT) {
+	case S_IFSOCK:
+	case S_IFCHR:
+	case S_IFIFO: /* Socket, Char, Fifo */
+		return -1;
+	case S_IFREG: /* Regular file: current size - holes + free
+			 space on part */
+		for (i = 0; i < (target_inode->i_size >> PAGE_SHIFT) ; i++) {
+			if (has_contiguous_blocks(dev_info, i))
+				result++;
+		}
+		break;
+	case S_IFBLK: /* Block device */
+		if (!bdev->bd_disk) {
+			toi_message(TOI_IO, TOI_VERBOSE, 0,
+					"bdev->bd_disk null.");
+			return 0;
+		}
+
+		result = (bdev->bd_part ?
+			bdev->bd_part->nr_sects :
+			get_capacity(bdev->bd_disk)) >> (PAGE_SHIFT - 9);
+	}
+
+
+	return result;
+}
+
+static int toi_file_register_storage(void)
+{
+	struct toi_bdev_info *devinfo;
+	int result = 0;
+	struct fs_info *fs_info;
+
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "toi_file_register_storage.");
+	if (!strlen(toi_file_target)) {
+		toi_message(TOI_IO, TOI_VERBOSE, 0, "Register file storage: "
+				"No target filename set.");
+		return 0;
+	}
+
+	target_file = filp_open(toi_file_target, O_RDONLY|O_LARGEFILE, 0);
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "filp_open %s returned %p.",
+			toi_file_target, target_file);
+
+	if (IS_ERR(target_file) || !target_file) {
+		target_file = NULL;
+		toi_file_dev_t = name_to_dev_t(toi_file_target);
+		if (!toi_file_dev_t) {
+			struct kstat stat;
+			int error = vfs_stat(toi_file_target, &stat);
+			printk(KERN_INFO "Open file %s returned %p and "
+					"name_to_devt failed.\n",
+					toi_file_target, target_file);
+			if (error) {
+				printk(KERN_INFO "Stating the file also failed."
+					" Nothing more we can do.\n");
+				return 0;
+			} else
+				toi_file_dev_t = stat.rdev;
+		}
+
+		toi_file_target_bdev = toi_open_by_devnum(toi_file_dev_t);
+		if (IS_ERR(toi_file_target_bdev)) {
+			printk(KERN_INFO "Got a dev_num (%lx) but failed to "
+					"open it.\n",
+					(unsigned long) toi_file_dev_t);
+			toi_file_target_bdev = NULL;
+			return 0;
+		}
+		used_devt = 1;
+		target_inode = toi_file_target_bdev->bd_inode;
+	} else
+		target_inode = target_file->f_mapping->host;
+
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "Succeeded in opening the target.");
+	if (S_ISLNK(target_inode->i_mode) || S_ISDIR(target_inode->i_mode) ||
+	    S_ISSOCK(target_inode->i_mode) || S_ISFIFO(target_inode->i_mode)) {
+		printk(KERN_INFO "File support works with regular files,"
+				" character files and block devices.\n");
+		/* Cleanup routine will undo the above */
+		return 0;
+	}
+
+	if (!used_devt) {
+		if (S_ISBLK(target_inode->i_mode)) {
+			toi_file_target_bdev = I_BDEV(target_inode);
+			if (!blkdev_get(toi_file_target_bdev, FMODE_WRITE |
+						FMODE_READ, NULL))
+				target_claim = 1;
+		} else
+			toi_file_target_bdev = target_inode->i_sb->s_bdev;
+		if (!toi_file_target_bdev) {
+			printk(KERN_INFO "%s is not a valid file allocator "
+					"target.\n", toi_file_target);
+			return 0;
+		}
+		toi_file_dev_t = toi_file_target_bdev->bd_dev;
+	}
+
+	devinfo = toi_kzalloc(39, sizeof(struct toi_bdev_info), GFP_ATOMIC);
+	if (!devinfo) {
+		printk(KERN_ERR "Failed to allocate a toi_bdev_info struct for the file allocator.\n");
+		return -ENOMEM;
+	}
+
+	devinfo->bdev = toi_file_target_bdev;
+	devinfo->allocator = &toi_fileops;
+	devinfo->allocator_index = 0;
+
+	fs_info = fs_info_from_block_dev(toi_file_target_bdev);
+	if (fs_info && !IS_ERR(fs_info)) {
+		memcpy(devinfo->uuid, &fs_info->uuid, 16);
+		free_fs_info(fs_info);
+	} else
+		result = (int) PTR_ERR(fs_info);
+
+	/* Unlike swap code, only complain if fs_info_from_block_dev returned
+	 * -ENOMEM. The 'file' might be a full partition, so might validly not
+	 * have an identifiable type, UUID etc.
+	 */
+	if (result)
+		printk(KERN_DEBUG "Failed to get fs_info for file device (%d).\n",
+				result);
+	devinfo->dev_t = toi_file_dev_t;
+	devinfo->prio = file_target_priority;
+	devinfo->bmap_shift = target_inode->i_blkbits - 9;
+	devinfo->blocks_per_page =
+		(1 << (PAGE_SHIFT - target_inode->i_blkbits));
+	sprintf(devinfo->name, "file %s", toi_file_target);
+	file_chain = devinfo;
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "Dev_t is %lx. Prio is %d. Bmap "
+			"shift is %d. Blocks per page %d.",
+			(unsigned long) devinfo->dev_t, devinfo->prio,
+			devinfo->bmap_shift, devinfo->blocks_per_page);
+
+	/* Keep one aside for the signature */
+	pages_available = get_usable_pages(devinfo) - 1;
+
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "Registering file storage, %lu "
+			"pages.", pages_available);
+
+	toi_bio_ops.register_storage(devinfo);
+	return 0;
+}
+
+static unsigned long toi_file_storage_available(void)
+{
+	return pages_available;
+}
+
+static int toi_file_allocate_storage(struct toi_bdev_info *chain,
+		unsigned long request)
+{
+	unsigned long available = pages_available - pages_allocated;
+	unsigned long to_add = min(available, request);
+
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "Pages available is %lu. Allocated "
+		"is %lu. Allocating %lu pages from file.",
+		pages_available, pages_allocated, to_add);
+	pages_allocated += to_add;
+
+	return to_add;
+}
+
+/**
+ * __populate_block_list - add an extent to the chain
+ * @min:	Start of the extent (first physical block = sector)
+ * @max:	End of the extent (last physical block = sector)
+ *
+ * If TOI_TEST_BIO is set, print a debug message, outputting the min and max
+ * fs block numbers.
+ **/
+static int __populate_block_list(struct toi_bdev_info *chain, int min, int max)
+{
+	if (test_action_state(TOI_TEST_BIO))
+		toi_message(TOI_IO, TOI_VERBOSE, 0, "Adding extent %d-%d.",
+			min << chain->bmap_shift,
+			((max + 1) << chain->bmap_shift) - 1);
+
+	return toi_add_to_extent_chain(&chain->blocks, min, max);
+}
+
+static int get_main_pool_phys_params(struct toi_bdev_info *chain)
+{
+	int i, extent_min = -1, extent_max = -1, result = 0, have_sig_page = 0;
+	unsigned long pages_mapped = 0;
+
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "Getting file allocator blocks.");
+
+	if (chain->blocks.first)
+		toi_put_extent_chain(&chain->blocks);
+
+	if (!target_is_normal_file()) {
+		result = (pages_available > 0) ?
+			__populate_block_list(chain, chain->blocks_per_page,
+				(pages_allocated + 1) *
+				chain->blocks_per_page - 1) : 0;
+		return result;
+	}
+
+	/*
+	 * FIXME: We are assuming the first page is contiguous. Is that
+	 * assumption always right?
+	 */
+
+	for (i = 0; i < (target_inode->i_size >> PAGE_SHIFT); i++) {
+		sector_t new_sector;
+
+		if (!has_contiguous_blocks(chain, i))
+			continue;
+
+		if (!have_sig_page) {
+			have_sig_page = 1;
+			sig_page_index = i;
+			continue;
+		}
+
+		pages_mapped++;
+
+		/* Ignore first page - it has the header */
+		if (pages_mapped == 1)
+			continue;
+
+		new_sector = bmap(target_inode, (i * chain->blocks_per_page));
+
+		/*
+		 * I'd love to be able to fill in holes and resize
+		 * files, but not yet...
+		 */
+
+		if (new_sector == extent_max + 1)
+			extent_max += chain->blocks_per_page;
+		else {
+			if (extent_min > -1) {
+				result = __populate_block_list(chain,
+						extent_min, extent_max);
+				if (result)
+					return result;
+			}
+
+			extent_min = new_sector;
+			extent_max = extent_min +
+				chain->blocks_per_page - 1;
+		}
+
+		if (pages_mapped == pages_allocated)
+			break;
+	}
+
+	if (extent_min > -1) {
+		result = __populate_block_list(chain, extent_min, extent_max);
+		if (result)
+			return result;
+	}
+
+	return 0;
+}
+
+static void toi_file_free_storage(struct toi_bdev_info *chain)
+{
+	pages_allocated = 0;
+	file_chain = NULL;
+}
+
+/**
+ * toi_file_print_debug_stats - print debug info
+ * @buffer:	Buffer to populate with data
+ * @size:	Size of the buffer
+ **/
+static int toi_file_print_debug_stats(char *buffer, int size)
+{
+	int len = scnprintf(buffer, size, "- File Allocator active.\n");
+
+	len += scnprintf(buffer+len, size-len, "  Storage available for "
+			"image: %lu pages.\n", pages_available);
+
+	return len;
+}
+
+static void toi_file_cleanup(int finishing_cycle)
+{
+	if (toi_file_target_bdev) {
+		if (target_claim) {
+			blkdev_put(toi_file_target_bdev, FMODE_WRITE | FMODE_READ);
+			target_claim = 0;
+		}
+
+		if (used_devt) {
+			blkdev_put(toi_file_target_bdev,
+					FMODE_READ | FMODE_NDELAY);
+			used_devt = 0;
+		}
+		toi_file_target_bdev = NULL;
+		target_inode = NULL;
+	}
+
+	if (target_file) {
+		filp_close(target_file, NULL);
+		target_file = NULL;
+	}
+
+	pages_available = 0;
+}
+
+/**
+ * test_toi_file_target - sysfs callback for /sys/power/tuxonice/file/target
+ *
+ * Test whether the target file is valid for hibernating.
+ **/
+static void test_toi_file_target(void)
+{
+	int result = toi_file_register_storage();
+	sector_t sector;
+	char buf[50];
+	struct fs_info *fs_info;
+
+	if (result || !file_chain)
+		return;
+
+	/* This doesn't mean we're in business. Is any storage available? */
+	if (!pages_available)
+		goto out;
+
+	toi_file_allocate_storage(file_chain, 1);
+	result = get_main_pool_phys_params(file_chain);
+	if (result)
+		goto out;
+
+
+	sector = bmap(target_inode, sig_page_index *
+			file_chain->blocks_per_page) << file_chain->bmap_shift;
+
+	/* Use the uuid, or the dev_t if that fails */
+	fs_info = fs_info_from_block_dev(toi_file_target_bdev);
+	if (!fs_info || IS_ERR(fs_info)) {
+		bdevname(toi_file_target_bdev, buf);
+		sprintf(resume_file, "/dev/%s:%llu", buf,
+				(unsigned long long) sector);
+	} else {
+		int i;
+		hex_dump_to_buffer(fs_info->uuid, 16, 32, 1, buf, 50, 0);
+
+		/* Remove the spaces */
+		for (i = 1; i < 16; i++) {
+			buf[2 * i] = buf[3 * i];
+			buf[2 * i + 1] = buf[3 * i + 1];
+		}
+		buf[32] = 0;
+		sprintf(resume_file, "UUID=%s:0x%llx", buf,
+				(unsigned long long) sector);
+		free_fs_info(fs_info);
+	}
+
+	toi_attempt_to_parse_resume_device(0);
+out:
+	toi_file_free_storage(file_chain);
+	toi_bio_ops.free_storage();
+}
+
+static struct toi_sysfs_data sysfs_params[] = {
+	SYSFS_STRING("target", SYSFS_RW, toi_file_target, 256,
+		SYSFS_NEEDS_SM_FOR_WRITE, test_toi_file_target),
+	SYSFS_INT("enabled", SYSFS_RW, &toi_fileops.enabled, 0, 1, 0, NULL),
+	SYSFS_INT("priority", SYSFS_RW, &file_target_priority, -4095,
+			4096, 0, NULL),
+};
+
+static struct toi_bio_allocator_ops toi_bio_fileops = {
+	.register_storage			= toi_file_register_storage,
+	.storage_available			= toi_file_storage_available,
+	.allocate_storage			= toi_file_allocate_storage,
+	.bmap					= get_main_pool_phys_params,
+	.free_storage				= toi_file_free_storage,
+};
+
+static struct toi_module_ops toi_fileops = {
+	.type					= BIO_ALLOCATOR_MODULE,
+	.name					= "file storage",
+	.directory				= "file",
+	.module					= THIS_MODULE,
+	.print_debug_info			= toi_file_print_debug_stats,
+	.cleanup				= toi_file_cleanup,
+	.bio_allocator_ops			= &toi_bio_fileops,
+
+	.sysfs_data		= sysfs_params,
+	.num_sysfs_entries	= sizeof(sysfs_params) /
+		sizeof(struct toi_sysfs_data),
+};
+
+/* ---- Registration ---- */
+static __init int toi_file_load(void)
+{
+	return toi_register_module(&toi_fileops);
+}
+
+#ifdef MODULE
+static __exit void toi_file_unload(void)
+{
+	toi_unregister_module(&toi_fileops);
+}
+
+module_init(toi_file_load);
+module_exit(toi_file_unload);
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Nigel Cunningham");
+MODULE_DESCRIPTION("TuxOnIce FileAllocator");
+#else
+late_initcall(toi_file_load);
+#endif
diff --git a/kernel/power/tuxonice_highlevel.c b/kernel/power/tuxonice_highlevel.c
new file mode 100644
index 0000000..33dabda
--- /dev/null
+++ b/kernel/power/tuxonice_highlevel.c
@@ -0,0 +1,1344 @@
+/*
+ * kernel/power/tuxonice_highlevel.c
+ */
+/** \mainpage TuxOnIce.
+ *
+ * TuxOnIce provides support for saving and restoring an image of
+ * system memory to an arbitrary storage device, either on the local computer,
+ * or across some network. The support is entirely OS based, so TuxOnIce
+ * works without requiring BIOS, APM or ACPI support. The vast majority of the
+ * code is also architecture independent, so it should be very easy to port
+ * the code to new architectures. TuxOnIce includes support for SMP, 4G HighMem
+ * and preemption. Initramfses and initrds are also supported.
+ *
+ * TuxOnIce uses a modular design, in which the method of storing the image is
+ * completely abstracted from the core code, as are transformations on the data
+ * such as compression and/or encryption (multiple 'modules' can be used to
+ * provide arbitrary combinations of functionality). The user interface is also
+ * modular, so that arbitrarily simple or complex interfaces can be used to
+ * provide anything from debugging information through to eye candy.
+ *
+ * \section Copyright
+ *
+ * TuxOnIce is released under the GPLv2.
+ *
+ * Copyright (C) 1998-2001 Gabor Kuti <seasons@fornax.hu><BR>
+ * Copyright (C) 1998,2001,2002 Pavel Machek <pavel@suse.cz><BR>
+ * Copyright (C) 2002-2003 Florent Chabaud <fchabaud@free.fr><BR>
+ * Copyright (C) 2002-2010 Nigel Cunningham (nigel at tuxonice net)<BR>
+ *
+ * \section Credits
+ *
+ * Nigel would like to thank the following people for their work:
+ *
+ * Bernard Blackham <bernard@blackham.com.au><BR>
+ * Web page & Wiki administration, some coding. A person without whom
+ * TuxOnIce would not be where it is.
+ *
+ * Michael Frank <mhf@linuxmail.org><BR>
+ * Extensive testing and help with improving stability. I was constantly
+ * amazed by the quality and quantity of Michael's help.
+ *
+ * Pavel Machek <pavel@ucw.cz><BR>
+ * Modifications, defectiveness pointing, being with Gabor at the very
+ * beginning, suspend to swap space, stop all tasks. Port to 2.4.18-ac and
+ * 2.5.17. Even though Pavel and I disagree on the direction suspend to
+ * disk should take, I appreciate the valuable work he did in helping Gabor
+ * get the concept working.
+ *
+ * ..and of course the myriads of TuxOnIce users who have helped diagnose
+ * and fix bugs, made suggestions on how to improve the code, proofread
+ * documentation, and donated time and money.
+ *
+ * Thanks also to corporate sponsors:
+ *
+ * <B>Redhat.</B> Sometime employer from May 2006 (my fault, not Redhat's!).
+ *
+ * <B>Cyclades.com.</B> Nigel's employers from Dec 2004 until May 2006, who
+ * allowed him to work on TuxOnIce and PM related issues on company time.
+ *
+ * <B>LinuxFund.org.</B> Sponsored Nigel's work on TuxOnIce for four months Oct
+ * 2003 to Jan 2004.
+ *
+ * <B>LAC Linux.</B> Donated P4 hardware that enabled development and ongoing
+ * maintenance of SMP and Highmem support.
+ *
+ * <B>OSDL.</B> Provided access to various hardware configurations, made
+ * occasional small donations to the project.
+ */
+
+#include <linux/suspend.h>
+#include <linux/freezer.h>
+#include <generated/utsrelease.h>
+#include <linux/cpu.h>
+#include <linux/console.h>
+#include <linux/writeback.h>
+#include <linux/uaccess.h> /* for get/set_fs & KERNEL_DS on i386 */
+#include <linux/bio.h>
+#include <linux/kgdb.h>
+
+#include "tuxonice.h"
+#include "tuxonice_modules.h"
+#include "tuxonice_sysfs.h"
+#include "tuxonice_prepare_image.h"
+#include "tuxonice_io.h"
+#include "tuxonice_ui.h"
+#include "tuxonice_power_off.h"
+#include "tuxonice_storage.h"
+#include "tuxonice_checksum.h"
+#include "tuxonice_builtin.h"
+#include "tuxonice_atomic_copy.h"
+#include "tuxonice_alloc.h"
+#include "tuxonice_cluster.h"
+
+/*! Pageset metadata. */
+struct pagedir pagedir2 = {2};
+EXPORT_SYMBOL_GPL(pagedir2);
+
+static mm_segment_t oldfs;
+static DEFINE_MUTEX(tuxonice_in_use);
+static int block_dump_save;
+
+/* Binary signature if an image is present */
+char tuxonice_signature[9] = "\xed\xc3\x02\xe9\x98\x56\xe5\x0c";
+EXPORT_SYMBOL_GPL(tuxonice_signature);
+
+unsigned long boot_kernel_data_buffer;
+
+static char *result_strings[] = {
+	"Hibernation was aborted",
+	"The user requested that we cancel the hibernation",
+	"No storage was available",
+	"Insufficient storage was available",
+	"Freezing filesystems and/or tasks failed",
+	"A pre-existing image was used",
+	"We would free memory, but image size limit doesn't allow this",
+	"Unable to free enough memory to hibernate",
+	"Unable to obtain the Power Management Semaphore",
+	"A device suspend/resume returned an error",
+	"A system device suspend/resume returned an error",
+	"The extra pages allowance is too small",
+	"We were unable to successfully prepare an image",
+	"TuxOnIce module initialisation failed",
+	"TuxOnIce module cleanup failed",
+	"I/O errors were encountered",
+	"Ran out of memory",
+	"An error was encountered while reading the image",
+	"Platform preparation failed",
+	"CPU Hotplugging failed",
+	"Architecture specific preparation failed",
+	"Pages needed resaving, but we were told to abort if this happens",
+	"We can't hibernate at the moment (invalid resume= or filewriter "
+		"target?)",
+	"A hibernation preparation notifier chain member cancelled the "
+		"hibernation",
+	"Pre-snapshot preparation failed",
+	"Pre-restore preparation failed",
+	"Failed to disable usermode helpers",
+	"Can't resume from alternate image",
+	"Header reservation too small",
+	"Device Power Management Preparation failed",
+};
+
+/**
+ * toi_finish_anything - cleanup after doing anything
+ * @hibernate_or_resume:	Whether finishing a cycle or attempt at
+ *				resuming.
+ *
+ * This is our basic clean-up routine, matching start_anything below. We
+ * call cleanup routines, drop module references and restore process fs and
+ * cpus allowed masks, together with the global block_dump variable's value.
+ **/
+void toi_finish_anything(int hibernate_or_resume)
+{
+	toi_cleanup_modules(hibernate_or_resume);
+	toi_put_modules();
+	if (hibernate_or_resume) {
+		block_dump = block_dump_save;
+		set_cpus_allowed_ptr(current, cpu_all_mask);
+		toi_alloc_print_debug_stats();
+		atomic_inc(&snapshot_device_available);
+		unlock_system_sleep();
+	}
+
+	set_fs(oldfs);
+	mutex_unlock(&tuxonice_in_use);
+}
+
+/**
+ * toi_start_anything - basic initialisation for TuxOnIce
+ * @hibernate_or_resume:	Whether starting a cycle or attempt at resuming.
+ *
+ * Our basic initialisation routine. Take references on modules, use the
+ * kernel segment, recheck resume= if no active allocator is set, initialise
+ * modules, save and reset block_dump and pin us to the first online CPU.
+ **/
+int toi_start_anything(int hibernate_or_resume)
+{
+	mutex_lock(&tuxonice_in_use);
+
+	oldfs = get_fs();
+	set_fs(KERNEL_DS);
+
+	if (hibernate_or_resume) {
+		lock_system_sleep();
+
+		if (!atomic_add_unless(&snapshot_device_available, -1, 0))
+			goto snapshotdevice_unavailable;
+	}
+
+	if (hibernate_or_resume == SYSFS_HIBERNATE)
+		toi_print_modules();
+
+	if (toi_get_modules()) {
+		printk(KERN_INFO "TuxOnIce: Get modules failed!\n");
+		goto prehibernate_err;
+	}
+
+	if (hibernate_or_resume) {
+		block_dump_save = block_dump;
+		block_dump = 0;
+		set_cpus_allowed_ptr(current,
+				cpumask_of(cpumask_first(cpu_online_mask)));
+	}
+
+	if (toi_initialise_modules_early(hibernate_or_resume))
+		goto early_init_err;
+
+	if (!toiActiveAllocator)
+		toi_attempt_to_parse_resume_device(!hibernate_or_resume);
+
+	if (!toi_initialise_modules_late(hibernate_or_resume))
+		return 0;
+
+	toi_cleanup_modules(hibernate_or_resume);
+early_init_err:
+	if (hibernate_or_resume) {
+		block_dump = block_dump_save;
+		set_cpus_allowed_ptr(current, cpu_all_mask);
+	}
+	toi_put_modules();
+prehibernate_err:
+	if (hibernate_or_resume)
+		atomic_inc(&snapshot_device_available);
+snapshotdevice_unavailable:
+	if (hibernate_or_resume)
+		unlock_system_sleep();
+	set_fs(oldfs);
+	mutex_unlock(&tuxonice_in_use);
+	return -EBUSY;
+}
+
+/*
+ * Nosave page tracking.
+ *
+ * Here rather than in prepare_image because we want to do it once only at the
+ * start of a cycle.
+ */
+
+/**
+ * mark_nosave_pages - set up our Nosave bitmap
+ *
+ * Build a bitmap of Nosave pages from the list. The bitmap allows faster
+ * use when preparing the image.
+ **/
+static void mark_nosave_pages(void)
+{
+	struct nosave_region *region;
+
+	list_for_each_entry(region, &nosave_regions, list) {
+		unsigned long pfn;
+
+		for (pfn = region->start_pfn; pfn < region->end_pfn; pfn++)
+			if (pfn_valid(pfn))
+				SetPageNosave(pfn_to_page(pfn));
+	}
+}
+
+static int toi_alloc_bitmap(struct memory_bitmap **bm)
+{
+	int result = 0;
+
+	*bm = kzalloc(sizeof(struct memory_bitmap), GFP_KERNEL);
+	if (!*bm) {
+		printk(KERN_ERR "Failed to kzalloc memory for a bitmap.\n");
+		return -ENOMEM;
+	}
+
+	result = memory_bm_create(*bm, GFP_KERNEL, 0);
+
+	if (result) {
+		printk(KERN_ERR "Failed to create a bitmap.\n");
+		kfree(*bm);
+		*bm = NULL;
+	}
+
+	return result;
+}
+
+/**
+ * allocate_bitmaps - allocate bitmaps used to record page states
+ *
+ * Allocate the bitmaps we use to record the various TuxOnIce related
+ * page states.
+ **/
+static int allocate_bitmaps(void)
+{
+	if (toi_alloc_bitmap(&pageset1_map) ||
+	    toi_alloc_bitmap(&pageset1_copy_map) ||
+	    toi_alloc_bitmap(&pageset2_map) ||
+	    toi_alloc_bitmap(&io_map) ||
+	    toi_alloc_bitmap(&nosave_map) ||
+	    toi_alloc_bitmap(&free_map) ||
+	    toi_alloc_bitmap(&page_resave_map))
+		return 1;
+
+	return 0;
+}
+
+static void toi_free_bitmap(struct memory_bitmap **bm)
+{
+	if (!*bm)
+		return;
+
+	memory_bm_free(*bm, 0);
+	kfree(*bm);
+	*bm = NULL;
+}
+
+/**
+ * free_bitmaps - free the bitmaps used to record page states
+ *
+ * Free the bitmaps allocated above. It is not an error to call
+ * toi_free_bitmap on a bitmap that isn't currently allocated.
+ **/
+static void free_bitmaps(void)
+{
+	toi_free_bitmap(&pageset1_map);
+	toi_free_bitmap(&pageset1_copy_map);
+	toi_free_bitmap(&pageset2_map);
+	toi_free_bitmap(&io_map);
+	toi_free_bitmap(&nosave_map);
+	toi_free_bitmap(&free_map);
+	toi_free_bitmap(&page_resave_map);
+}
+
+/**
+ * io_MB_per_second - return the number of MB/s read or written
+ * @write:	Index into toi_io_time[]: 0 selects writes, 1 selects reads.
+ *
+ * Calculate the number of megabytes per second that were read or written.
+ **/
+static int io_MB_per_second(int write)
+{
+	return (toi_bkd.toi_io_time[write][1]) ?
+		MB((unsigned long) toi_bkd.toi_io_time[write][0]) * HZ /
+		toi_bkd.toi_io_time[write][1] : 0;
+}
+
+#define SNPRINTF(a...) 	do { len += scnprintf(((char *) buffer) + len, \
+		count - len - 1, ## a); } while (0)
+
+/**
+ * get_toi_debug_info - fill a buffer with debugging information
+ * @buffer:	The buffer to be filled.
+ * @count:	The size of the buffer, in bytes.
+ *
+ * Fill a (usually PAGE_SIZEd) buffer with the debugging info that we will
+ * either printk or return via sysfs.
+ **/
+static int get_toi_debug_info(const char *buffer, int count)
+{
+	int len = 0, i, first_result = 1;
+
+	SNPRINTF("TuxOnIce debugging info:\n");
+	SNPRINTF("- TuxOnIce core  : " TOI_CORE_VERSION "\n");
+	SNPRINTF("- Kernel Version : " UTS_RELEASE "\n");
+	SNPRINTF("- Compiler vers. : %d.%d\n", __GNUC__, __GNUC_MINOR__);
+	SNPRINTF("- Attempt number : %d\n", nr_hibernates);
+	SNPRINTF("- Parameters     : %ld %ld %ld %d %ld %ld\n",
+			toi_result,
+			toi_bkd.toi_action,
+			toi_bkd.toi_debug_state,
+			toi_bkd.toi_default_console_level,
+			image_size_limit,
+			toi_poweroff_method);
+	SNPRINTF("- Overall expected compression percentage: %d.\n",
+			100 - toi_expected_compression_ratio());
+	len += toi_print_module_debug_info(((char *) buffer) + len,
+			count - len - 1);
+	if (toi_bkd.toi_io_time[0][1]) {
+		if ((io_MB_per_second(0) < 5) || (io_MB_per_second(1) < 5)) {
+			SNPRINTF("- I/O speed: Write %ld KB/s",
+			  (KB((unsigned long) toi_bkd.toi_io_time[0][0]) * HZ /
+			  toi_bkd.toi_io_time[0][1]));
+			if (toi_bkd.toi_io_time[1][1])
+				SNPRINTF(", Read %ld KB/s",
+				  (KB((unsigned long)
+				      toi_bkd.toi_io_time[1][0]) * HZ /
+				  toi_bkd.toi_io_time[1][1]));
+		} else {
+			SNPRINTF("- I/O speed: Write %ld MB/s",
+			 (MB((unsigned long) toi_bkd.toi_io_time[0][0]) * HZ /
+			  toi_bkd.toi_io_time[0][1]));
+			if (toi_bkd.toi_io_time[1][1])
+				SNPRINTF(", Read %ld MB/s",
+				 (MB((unsigned long)
+				     toi_bkd.toi_io_time[1][0]) * HZ /
+				  toi_bkd.toi_io_time[1][1]));
+		}
+		SNPRINTF(".\n");
+	} else
+		SNPRINTF("- No I/O speed stats available.\n");
+	SNPRINTF("- Extra pages    : %lu used/%lu.\n",
+			extra_pd1_pages_used, extra_pd1_pages_allowance);
+
+	for (i = 0; i < TOI_NUM_RESULT_STATES; i++)
+		if (test_result_state(i)) {
+			SNPRINTF("%s: %s.\n", first_result ?
+					"- Result         " :
+					"                 ",
+					result_strings[i]);
+			first_result = 0;
+		}
+	if (first_result)
+		SNPRINTF("- Result         : %s.\n", nr_hibernates ?
+			"Succeeded" :
+			"No hibernation attempts so far");
+	return len;
+}
+
+/**
+ * do_cleanup - cleanup after attempting to hibernate or resume
+ * @get_debug_info:	Whether to allocate and return debugging info.
+ * @restarting:	Non-zero if we will retry; keeps threads, console and storage.
+ *
+ * Clean up after a cycle, optionally getting debugging info as we do so.
+ **/
+static void do_cleanup(int get_debug_info, int restarting)
+{
+	int i = 0;
+	char *buffer = NULL;
+
+	trap_non_toi_io = 0;
+
+	if (get_debug_info)
+		toi_prepare_status(DONT_CLEAR_BAR, "Cleaning up...");
+
+	free_checksum_pages();
+
+	if (get_debug_info)
+		buffer = (char *) toi_get_zeroed_page(20, TOI_ATOMIC_GFP);
+
+	if (buffer)
+		i = get_toi_debug_info(buffer, PAGE_SIZE);
+
+	toi_free_extra_pagedir_memory();
+
+	pagedir1.size = 0;
+	pagedir2.size = 0;
+	set_highmem_size(pagedir1, 0);
+	set_highmem_size(pagedir2, 0);
+
+	if (boot_kernel_data_buffer) {
+		if (!test_toi_state(TOI_BOOT_KERNEL))
+			toi_free_page(37, boot_kernel_data_buffer);
+		boot_kernel_data_buffer = 0;
+	}
+
+	clear_toi_state(TOI_BOOT_KERNEL);
+	if (current->flags & PF_SUSPEND_TASK)
+		thaw_processes();
+
+	if (!restarting)
+		toi_stop_other_threads();
+
+	if (test_action_state(TOI_KEEP_IMAGE) &&
+	    !test_result_state(TOI_ABORTED)) {
+		toi_message(TOI_ANY_SECTION, TOI_LOW, 1,
+			"TuxOnIce: Not invalidating the image due "
+			"to Keep Image being enabled.");
+		set_result_state(TOI_KEPT_IMAGE);
+	} else
+		if (toiActiveAllocator)
+			toiActiveAllocator->remove_image();
+
+	free_bitmaps();
+	usermodehelper_enable();
+
+	if (test_toi_state(TOI_NOTIFIERS_PREPARE)) {
+		pm_notifier_call_chain(PM_POST_HIBERNATION);
+		clear_toi_state(TOI_NOTIFIERS_PREPARE);
+	}
+
+	if (buffer && i) {
+		/* Printk can only handle 1023 bytes, including
+		 * its level mangling. */
+		for (i = 0; i < 3; i++)
+			printk(KERN_ERR "%s", buffer + (1023 * i));
+		toi_free_page(20, (unsigned long) buffer);
+	}
+
+	if (!test_action_state(TOI_LATE_CPU_HOTPLUG))
+		enable_nonboot_cpus();
+
+	if (!restarting)
+		toi_cleanup_console();
+
+	free_attention_list();
+
+	if (!restarting)
+		toi_deactivate_storage(0);
+
+	clear_toi_state(TOI_IGNORE_LOGLEVEL);
+	clear_toi_state(TOI_TRYING_TO_RESUME);
+	clear_toi_state(TOI_NOW_RESUMING);
+}
+
+/**
+ * check_still_keeping_image - we kept an image; check whether to reuse it.
+ *
+ * We enter this routine when we have kept an image. If the user has said they
+ * want to still keep it, all we need to do is powerdown. If powering down
+ * means hibernating to ram and the power doesn't run out, we'll return 1.
+ * If we do power off properly or the battery runs out, we'll resume via the
+ * normal paths.
+ *
+ * If the user has said they want to remove the previously kept image, we
+ * remove it, and return 0. We'll then store a new image.
+ **/
+static int check_still_keeping_image(void)
+{
+	if (test_action_state(TOI_KEEP_IMAGE)) {
+		printk(KERN_INFO "Image already stored: powering down "
+				"immediately.\n");
+		do_toi_step(STEP_HIBERNATE_POWERDOWN);
+		return 1;	/* Just in case we're using S3 */
+	}
+
+	printk(KERN_INFO "Invalidating previous image.\n");
+	toiActiveAllocator->remove_image();
+
+	return 0;
+}
+
+/**
+ * toi_init - prepare to hibernate to disk
+ * @restarting:	Non-zero on a retry: skip console and thread setup.
+ *
+ * Initialise variables and data structures before hibernating to disk.
+ **/
+static int toi_init(int restarting)
+{
+	int result, i, j;
+
+	toi_result = 0;
+
+	printk(KERN_INFO "Initiating a hibernation cycle.\n");
+
+	nr_hibernates++;
+
+	for (i = 0; i < 2; i++)
+		for (j = 0; j < 2; j++)
+			toi_bkd.toi_io_time[i][j] = 0;
+
+	if (!test_toi_state(TOI_CAN_HIBERNATE) ||
+	    allocate_bitmaps())
+		return 1;
+
+	mark_nosave_pages();
+
+	if (!restarting)
+		toi_prepare_console();
+
+	result = pm_notifier_call_chain(PM_HIBERNATION_PREPARE);
+	if (result) {
+		set_result_state(TOI_NOTIFIERS_PREPARE_FAILED);
+		return 1;
+	}
+	set_toi_state(TOI_NOTIFIERS_PREPARE);
+
+	if (!restarting) {
+		printk(KERN_ERR "Starting other threads.\n");
+		toi_start_other_threads();
+	}
+
+	result = usermodehelper_disable();
+	if (result) {
+		printk(KERN_ERR "TuxOnIce: Failed to disable usermode "
+				"helpers\n");
+		set_result_state(TOI_USERMODE_HELPERS_ERR);
+		return 1;
+	}
+
+	boot_kernel_data_buffer = toi_get_zeroed_page(37, TOI_ATOMIC_GFP);
+	if (!boot_kernel_data_buffer) {
+		printk(KERN_ERR "TuxOnIce: Failed to allocate "
+				"boot_kernel_data_buffer.\n");
+		set_result_state(TOI_OUT_OF_MEMORY);
+		return 1;
+	}
+
+	if (!test_action_state(TOI_LATE_CPU_HOTPLUG) &&
+			disable_nonboot_cpus()) {
+		set_abort_result(TOI_CPU_HOTPLUG_FAILED);
+		return 1;
+	}
+
+	return 0;
+}
+
+/**
+ * can_hibernate - perform basic 'Can we hibernate?' tests
+ *
+ * Perform basic tests that must pass if we're going to be able to hibernate:
+ * Is resume= valid (we need to know where to write the image header)? If an
+ * alternate resume parameter is set, can it still be parsed?
+ **/
+static int can_hibernate(void)
+{
+	if (!test_toi_state(TOI_CAN_HIBERNATE))
+		toi_attempt_to_parse_resume_device(0);
+
+	if (!test_toi_state(TOI_CAN_HIBERNATE)) {
+		printk(KERN_INFO "TuxOnIce: Hibernation is disabled.\n"
+			"This may be because you haven't put something along "
+			"the lines of\n\nresume=swap:/dev/hda1\n\n"
+			"in lilo.conf or equivalent. (Where /dev/hda1 is your "
+			"swap partition).\n");
+		set_abort_result(TOI_CANT_SUSPEND);
+		return 0;
+	}
+
+	if (strlen(alt_resume_param)) {
+		attempt_to_parse_alt_resume_param();
+
+		if (!strlen(alt_resume_param)) {
+			printk(KERN_INFO "Alternate resume parameter now "
+					"invalid. Aborting.\n");
+			set_abort_result(TOI_CANT_USE_ALT_RESUME);
+			return 0;
+		}
+	}
+
+	return 1;
+}
+
+/**
+ * do_post_image_write - having written an image, figure out what to do next
+ *
+ * After writing an image, we might load an alternate image or power down.
+ * Powering down might involve hibernating to ram, in which case we also
+ * need to handle reloading pageset2.
+ **/
+static int do_post_image_write(void)
+{
+	/* If switching images fails, do normal powerdown */
+	if (alt_resume_param[0])
+		do_toi_step(STEP_RESUME_ALT_IMAGE);
+
+	toi_power_down();
+
+	barrier();
+	mb();
+	return 0;
+}
+
+/**
+ * __save_image - do the hard work of saving the image
+ *
+ * High level routine for getting the image saved. The key assumptions made
+ * are that processes have been frozen and sufficient memory is available.
+ *
+ * We also exit through here at resume time, coming back from toi_hibernate
+ * after the atomic restore. This is the reason for the toi_in_hibernate
+ * test.
+ **/
+static int __save_image(void)
+{
+	int temp_result, did_copy = 0;
+
+	toi_prepare_status(DONT_CLEAR_BAR, "Starting to save the image..");
+
+	toi_message(TOI_ANY_SECTION, TOI_LOW, 1,
+		" - Final values: %d and %d.",
+		pagedir1.size, pagedir2.size);
+
+	toi_cond_pause(1, "About to write pagedir2.");
+
+	temp_result = write_pageset(&pagedir2);
+
+	if (temp_result == -1 || test_result_state(TOI_ABORTED))
+		return 1;
+
+	toi_cond_pause(1, "About to copy pageset 1.");
+
+	if (test_result_state(TOI_ABORTED))
+		return 1;
+
+	toi_deactivate_storage(1);
+
+	toi_prepare_status(DONT_CLEAR_BAR, "Doing atomic copy/restore.");
+
+	toi_in_hibernate = 1;
+
+	if (toi_go_atomic(PMSG_FREEZE, 1))
+		goto Failed;
+
+	temp_result = toi_hibernate();
+
+#ifdef CONFIG_KGDB
+	if (test_action_state(TOI_POST_RESUME_BREAKPOINT))
+		kgdb_breakpoint();
+#endif
+
+	if (!temp_result)
+		did_copy = 1;
+
+	/* We return here at resume time too! */
+	toi_end_atomic(ATOMIC_ALL_STEPS, toi_in_hibernate, temp_result);
+
+Failed:
+	if (toi_activate_storage(1))
+		panic("Failed to reactivate our storage.");
+
+	/* Resume time? */
+	if (!toi_in_hibernate) {
+		copyback_post();
+		return 0;
+	}
+
+	/* Nope. Hibernating. So, see if we can save the image... */
+
+	if (temp_result || test_result_state(TOI_ABORTED)) {
+		if (did_copy)
+			goto abort_reloading_pagedir_two;
+		else
+			return 1;
+	}
+
+	toi_update_status(pagedir2.size, pagedir1.size + pagedir2.size,
+			NULL);
+
+	if (test_result_state(TOI_ABORTED))
+		goto abort_reloading_pagedir_two;
+
+	toi_cond_pause(1, "About to write pageset1.");
+
+	toi_message(TOI_ANY_SECTION, TOI_LOW, 1, "-- Writing pageset1");
+
+	temp_result = write_pageset(&pagedir1);
+
+	/* We didn't overwrite any memory, so no reread needs to be done. */
+	if (test_action_state(TOI_TEST_FILTER_SPEED) ||
+	    test_action_state(TOI_TEST_BIO))
+		return 1;
+
+	if (temp_result == 1 || test_result_state(TOI_ABORTED))
+		goto abort_reloading_pagedir_two;
+
+	toi_cond_pause(1, "About to write header.");
+
+	if (test_result_state(TOI_ABORTED))
+		goto abort_reloading_pagedir_two;
+
+	temp_result = write_image_header();
+
+	if (!temp_result && !test_result_state(TOI_ABORTED))
+		return 0;
+
+abort_reloading_pagedir_two:
+	temp_result = read_pageset2(1);
+
+	/* If that failed, we're sunk. Panic! */
+	if (temp_result)
+		panic("Attempt to reload pagedir 2 while aborting "
+				"a hibernate failed.");
+
+	return 1;
+}
+
+static void map_ps2_pages(int enable)
+{
+	unsigned long pfn = 0;
+
+	pfn = memory_bm_next_pfn(pageset2_map);
+
+	while (pfn != BM_END_OF_MAP) {
+		struct page *page = pfn_to_page(pfn);
+		kernel_map_pages(page, 1, enable);
+		pfn = memory_bm_next_pfn(pageset2_map);
+	}
+}
+
+/**
+ * do_save_image - save the image and handle the result
+ *
+ * Save the prepared image. If we fail or we're in the path returning
+ * from the atomic restore, cleanup.
+ **/
+static int do_save_image(void)
+{
+	int result;
+	map_ps2_pages(0);
+	result = __save_image();
+	map_ps2_pages(1);
+	return result;
+}
+
+/**
+ * do_prepare_image - try to prepare an image
+ *
+ * Seek to initialise and prepare an image to be saved. On failure,
+ * cleanup.
+ **/
+static int do_prepare_image(void)
+{
+	int restarting = test_result_state(TOI_EXTRA_PAGES_ALLOW_TOO_SMALL);
+
+	if (!restarting && toi_activate_storage(0))
+		return 1;
+
+	/*
+	 * If kept image and still keeping image and hibernating to RAM, we will
+	 * return 1 after hibernating and resuming (provided the power doesn't
+	 * run out). In that case, we skip directly to cleaning up and exiting.
+	 */
+
+	if (!can_hibernate() ||
+	    (test_result_state(TOI_KEPT_IMAGE) &&
+	     check_still_keeping_image()))
+		return 1;
+
+	if (toi_init(restarting) || toi_prepare_image() ||
+			test_result_state(TOI_ABORTED))
+		return 1;
+
+	trap_non_toi_io = 1;
+
+	return 0;
+}
+
+/**
+ * do_check_can_resume - find out whether an image has been stored
+ *
+ * Read whether an image exists. We use the same routine as the
+ * image_exists sysfs entry, and just look to see whether the
+ * first character in the resulting buffer is a '1'.
+ **/
+int do_check_can_resume(void)
+{
+	int result = -1;
+
+	if (toi_activate_storage(0))
+		return -1;
+
+	if (!test_toi_state(TOI_RESUME_DEVICE_OK))
+		toi_attempt_to_parse_resume_device(1);
+
+	if (toiActiveAllocator)
+		result = toiActiveAllocator->image_exists(1);
+
+	toi_deactivate_storage(0);
+	return result;
+}
+EXPORT_SYMBOL_GPL(do_check_can_resume);
+
+/**
+ * do_load_atomic_copy - load the first part of an image, if it exists
+ *
+ * Check whether we have an image. If one exists, do sanity checking
+ * (possibly invalidating the image or even rebooting if the user
+ * requests that) before loading it into memory in preparation for the
+ * atomic restore.
+ *
+ * If and only if we have an image loaded and ready to restore, we return 0.
+ **/
+static int do_load_atomic_copy(void)
+{
+	int read_image_result = 0;
+
+	if (sizeof(swp_entry_t) != sizeof(long)) {
+		printk(KERN_WARNING "TuxOnIce: The size of swp_entry_t != size"
+			" of long. Please report this!\n");
+		return 1;
+	}
+
+	if (!resume_file[0])
+		printk(KERN_WARNING "TuxOnIce: "
+			"You need to use a resume= command line parameter to "
+			"tell TuxOnIce where to look for an image.\n");
+
+	toi_activate_storage(0);
+
+	if (!(test_toi_state(TOI_RESUME_DEVICE_OK)) &&
+		!toi_attempt_to_parse_resume_device(0)) {
+		/*
+		 * Without a usable storage device we can do nothing -
+		 * even if noresume is given
+		 */
+
+		if (!toiNumAllocators)
+			printk(KERN_ALERT "TuxOnIce: "
+			  "No storage allocators have been registered.\n");
+		else
+			printk(KERN_ALERT "TuxOnIce: "
+				"Missing or invalid storage location "
+				"(resume= parameter). Please correct and "
+				"rerun lilo (or equivalent) before "
+				"hibernating.\n");
+		toi_deactivate_storage(0);
+		return 1;
+	}
+
+	if (allocate_bitmaps())
+		return 1;
+
+	read_image_result = read_pageset1(); /* non fatal error ignored */
+
+	if (test_toi_state(TOI_NORESUME_SPECIFIED))
+		clear_toi_state(TOI_NORESUME_SPECIFIED);
+
+	toi_deactivate_storage(0);
+
+	if (read_image_result)
+		return 1;
+
+	return 0;
+}
+
+/**
+ * prepare_restore_load_alt_image - save/restore pageset1 maps for an alt image
+ * @prepare:	Non-zero to save the pageset1 maps before loading an alternate
+ *		image; zero to free its maps and restore the saved ones.
+ **/
+static void prepare_restore_load_alt_image(int prepare)
+{
+	static struct memory_bitmap *pageset1_map_save, *pageset1_copy_map_save;
+
+	if (prepare) {
+		pageset1_map_save = pageset1_map;
+		pageset1_map = NULL;
+		pageset1_copy_map_save = pageset1_copy_map;
+		pageset1_copy_map = NULL;
+		set_toi_state(TOI_LOADING_ALT_IMAGE);
+		toi_reset_alt_image_pageset2_pfn();
+	} else {
+		memory_bm_free(pageset1_map, 0);
+		pageset1_map = pageset1_map_save;
+		memory_bm_free(pageset1_copy_map, 0);
+		pageset1_copy_map = pageset1_copy_map_save;
+		clear_toi_state(TOI_NOW_RESUMING);
+		clear_toi_state(TOI_LOADING_ALT_IMAGE);
+	}
+}
+
+/**
+ * do_toi_step - perform a step in hibernating or resuming
+ *
+ * Perform a step in hibernating or resuming an image. This abstraction
+ * is in preparation for implementing cluster support, and perhaps replacing
+ * uswsusp too (haven't looked whether that's possible yet).
+ **/
+int do_toi_step(int step)
+{
+	switch (step) {
+	case STEP_HIBERNATE_PREPARE_IMAGE:
+		return do_prepare_image();
+	case STEP_HIBERNATE_SAVE_IMAGE:
+		return do_save_image();
+	case STEP_HIBERNATE_POWERDOWN:
+		return do_post_image_write();
+	case STEP_RESUME_CAN_RESUME:
+		return do_check_can_resume();
+	case STEP_RESUME_LOAD_PS1:
+		return do_load_atomic_copy();
+	case STEP_RESUME_DO_RESTORE:
+		/*
+		 * If we succeed, this doesn't return.
+		 * Instead, we return from do_save_image() in the
+		 * hibernated kernel.
+		 */
+		return toi_atomic_restore();
+	case STEP_RESUME_ALT_IMAGE:
+		printk(KERN_INFO "Trying to resume alternate image.\n");
+		toi_in_hibernate = 0;
+		save_restore_alt_param(SAVE, NOQUIET);
+		prepare_restore_load_alt_image(1);
+		if (!do_check_can_resume()) {
+			printk(KERN_INFO "Nothing to resume from.\n");
+			goto out;
+		}
+		if (!do_load_atomic_copy())
+			toi_atomic_restore();
+
+		printk(KERN_INFO "Failed to load image.\n");
+out:
+		prepare_restore_load_alt_image(0);
+		save_restore_alt_param(RESTORE, NOQUIET);
+		break;
+	case STEP_CLEANUP:
+		do_cleanup(1, 0);
+		break;
+	case STEP_QUIET_CLEANUP:
+		do_cleanup(0, 0);
+		break;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(do_toi_step);
+
+/* -- Functions for kickstarting a hibernate or resume --- */
+
+/**
+ * toi_try_resume - try to do the steps in resuming
+ *
+ * Check if we have an image and if so try to resume. Clear the status
+ * flags too.
+ **/
+void toi_try_resume(void)
+{
+	set_toi_state(TOI_TRYING_TO_RESUME);
+	resume_attempted = 1;
+
+	current->flags |= PF_MEMALLOC;
+	toi_start_other_threads();
+
+	if (do_toi_step(STEP_RESUME_CAN_RESUME) &&
+			!do_toi_step(STEP_RESUME_LOAD_PS1))
+		do_toi_step(STEP_RESUME_DO_RESTORE);
+
+	toi_stop_other_threads();
+	do_cleanup(0, 0);
+
+	current->flags &= ~PF_MEMALLOC;
+
+	clear_toi_state(TOI_IGNORE_LOGLEVEL);
+	clear_toi_state(TOI_TRYING_TO_RESUME);
+	clear_toi_state(TOI_NOW_RESUMING);
+}
+
+/**
+ * toi_sys_power_disk_try_resume - wrapper calling toi_try_resume
+ *
+ * Wrapper for when toi_try_resume is called from swsusp resume path,
+ * rather than from echo > /sys/power/tuxonice/do_resume.
+ **/
+static void toi_sys_power_disk_try_resume(void)
+{
+	resume_attempted = 1;
+
+	/*
+	 * There's a comment in kernel/power/disk.c that indicates
+	 * we should be able to use mutex_lock_nested below. That
+	 * doesn't seem to cut it, though, so let's just turn lockdep
+	 * off for now.
+	 */
+	lockdep_off();
+
+	if (toi_start_anything(SYSFS_RESUMING))
+		goto out;
+
+	toi_try_resume();
+
+	/*
+	 * For initramfs, we have to clear the boot time
+	 * flag after trying to resume
+	 */
+	clear_toi_state(TOI_BOOT_TIME);
+
+	toi_finish_anything(SYSFS_RESUMING);
+out:
+	lockdep_on();
+}
+
+/**
+ * toi_try_hibernate - try to start a hibernation cycle
+ *
+ * Start a hibernation cycle, coming in from either
+ * echo > /sys/power/tuxonice/do_suspend
+ *
+ * or
+ *
+ * echo disk > /sys/power/state
+ *
+ * In the latter case, we come in without pm_sem taken; in the
+ * former, it has been taken.
+ **/
+int toi_try_hibernate(void)
+{
+	int result = 0, sys_power_disk = 0, retries = 0;
+
+	if (!mutex_is_locked(&tuxonice_in_use)) {
+		/* Came in via /sys/power/disk */
+		if (toi_start_anything(SYSFS_HIBERNATING))
+			return -EBUSY;
+		sys_power_disk = 1;
+	}
+
+	current->flags |= PF_MEMALLOC;
+
+	if (test_toi_state(TOI_CLUSTER_MODE)) {
+		toi_initiate_cluster_hibernate();
+		goto out;
+	}
+
+prepare:
+	result = do_toi_step(STEP_HIBERNATE_PREPARE_IMAGE);
+
+	if (result)
+		goto out;
+
+	if (test_action_state(TOI_FREEZER_TEST))
+		goto out_restore_gfp_mask;
+
+	result = do_toi_step(STEP_HIBERNATE_SAVE_IMAGE);
+
+	if (test_result_state(TOI_EXTRA_PAGES_ALLOW_TOO_SMALL)) {
+		if (retries < 2) {
+			do_cleanup(0, 1);
+			retries++;
+			clear_result_state(TOI_ABORTED);
+			extra_pd1_pages_allowance = extra_pd1_pages_used + 500;
+			printk(KERN_INFO "Automatically adjusting the extra"
+				" pages allowance to %ld and restarting.\n",
+				extra_pd1_pages_allowance);
+			pm_restore_gfp_mask();
+			goto prepare;
+		}
+
+		printk(KERN_INFO "Adjusted extra pages allowance twice and "
+			"still couldn't hibernate successfully. Giving up.\n");
+	}
+
+	/* This code runs at resume time too! */
+	if (!result && toi_in_hibernate)
+		result = do_toi_step(STEP_HIBERNATE_POWERDOWN);
+
+out_restore_gfp_mask:
+	pm_restore_gfp_mask();
+out:
+	do_cleanup(1, 0);
+	current->flags &= ~PF_MEMALLOC;
+
+	if (sys_power_disk)
+		toi_finish_anything(SYSFS_HIBERNATING);
+
+	return result;
+}
+
+/*
+ * channel_no: If !0, -c<channel_no> is added to args (userui).
+ */
+int toi_launch_userspace_program(char *command, int channel_no,
+		int wait, int debug)
+{
+	int retval;
+	static char *envp[] = {
+			"HOME=/",
+			"TERM=linux",
+			"PATH=/sbin:/usr/sbin:/bin:/usr/bin",
+			NULL };
+	static char *argv[] = { NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL
+		};
+	char *channel = NULL;
+	int arg = 0, size;
+	char test_read[255];
+	char *orig_posn = command;
+
+	if (!strlen(orig_posn))
+		return 1;
+
+	if (channel_no) {
+		channel = toi_kzalloc(4, 6, GFP_KERNEL);
+		if (!channel) {
+			printk(KERN_INFO "Failed to allocate memory in "
+				"preparing to launch userspace program.\n");
+			return 1;
+		}
+	}
+
+	/* Up to 6 args supported */
+	while (arg < 6) {
+		sscanf(orig_posn, "%254s", test_read);
+		size = strlen(test_read);
+		if (!(size))
+			break;
+		argv[arg] = toi_kzalloc(5, size + 1, TOI_ATOMIC_GFP);
+		strcpy(argv[arg], test_read);
+		orig_posn += orig_posn[size] ? size + 1 : size;
+		*test_read = 0;
+		arg++;
+	}
+
+	if (channel_no) {
+		snprintf(channel, 6, "-c%d", channel_no);
+		argv[arg] = channel;
+	} else
+		arg--;
+
+	if (debug) {
+		argv[++arg] = toi_kzalloc(5, 8, TOI_ATOMIC_GFP);
+		strcpy(argv[arg], "--debug");
+	}
+
+	retval = call_usermodehelper(argv[0], argv, envp, wait);
+
+	/*
+	 * If the program reports an error, retval = 256. Don't complain
+	 * about that here.
+	 */
+	if (retval && retval != 256)
+		printk(KERN_ERR "Failed to launch userspace program '%s': "
+				"Error %d\n", command, retval);
+
+	/* Free every entry, including the last, and clear the static argv. */
+	for (; arg >= 0; arg--) {
+		if (argv[arg] && argv[arg] != channel)
+			toi_kfree(5, argv[arg], sizeof(*argv[arg]));
+		argv[arg] = NULL;
+	}
+
+	toi_kfree(4, channel, sizeof(*channel));
+
+	return retval;
+}
+
+/*
+ * This array contains entries that are automatically registered at
+ * boot. Modules and the console code register their own entries separately.
+ */
+static struct toi_sysfs_data sysfs_params[] = {
+	SYSFS_LONG("extra_pages_allowance", SYSFS_RW,
+			&extra_pd1_pages_allowance, 0, LONG_MAX, 0),
+	SYSFS_CUSTOM("image_exists", SYSFS_RW, image_exists_read,
+			image_exists_write, SYSFS_NEEDS_SM_FOR_BOTH, NULL),
+	SYSFS_STRING("resume", SYSFS_RW, resume_file, 255,
+			SYSFS_NEEDS_SM_FOR_WRITE,
+			attempt_to_parse_resume_device2),
+	SYSFS_STRING("alt_resume_param", SYSFS_RW, alt_resume_param, 255,
+			SYSFS_NEEDS_SM_FOR_WRITE,
+			attempt_to_parse_alt_resume_param),
+	SYSFS_CUSTOM("debug_info", SYSFS_READONLY, get_toi_debug_info, NULL, 0,
+			NULL),
+	SYSFS_BIT("ignore_rootfs", SYSFS_RW, &toi_bkd.toi_action,
+			TOI_IGNORE_ROOTFS, 0),
+	SYSFS_LONG("image_size_limit", SYSFS_RW, &image_size_limit, -2,
+			INT_MAX, 0),
+	SYSFS_UL("last_result", SYSFS_RW, &toi_result, 0, 0, 0),
+	SYSFS_BIT("no_multithreaded_io", SYSFS_RW, &toi_bkd.toi_action,
+			TOI_NO_MULTITHREADED_IO, 0),
+	SYSFS_BIT("no_flusher_thread", SYSFS_RW, &toi_bkd.toi_action,
+			TOI_NO_FLUSHER_THREAD, 0),
+	SYSFS_BIT("full_pageset2", SYSFS_RW, &toi_bkd.toi_action,
+			TOI_PAGESET2_FULL, 0),
+	SYSFS_BIT("reboot", SYSFS_RW, &toi_bkd.toi_action, TOI_REBOOT, 0),
+	SYSFS_BIT("replace_swsusp", SYSFS_RW, &toi_bkd.toi_action,
+			TOI_REPLACE_SWSUSP, 0),
+	SYSFS_STRING("resume_commandline", SYSFS_RW,
+			toi_bkd.toi_nosave_commandline, COMMAND_LINE_SIZE, 0,
+			NULL),
+	SYSFS_STRING("version", SYSFS_READONLY, TOI_CORE_VERSION, 0, 0, NULL),
+	SYSFS_BIT("freezer_test", SYSFS_RW, &toi_bkd.toi_action,
+			TOI_FREEZER_TEST, 0),
+	SYSFS_BIT("test_bio", SYSFS_RW, &toi_bkd.toi_action, TOI_TEST_BIO, 0),
+	SYSFS_BIT("test_filter_speed", SYSFS_RW, &toi_bkd.toi_action,
+			TOI_TEST_FILTER_SPEED, 0),
+	SYSFS_BIT("no_pageset2", SYSFS_RW, &toi_bkd.toi_action,
+			TOI_NO_PAGESET2, 0),
+	SYSFS_BIT("no_pageset2_if_unneeded", SYSFS_RW, &toi_bkd.toi_action,
+			TOI_NO_PS2_IF_UNNEEDED, 0),
+	SYSFS_BIT("late_cpu_hotplug", SYSFS_RW, &toi_bkd.toi_action,
+			TOI_LATE_CPU_HOTPLUG, 0),
+	SYSFS_STRING("binary_signature", SYSFS_READONLY,
+			tuxonice_signature, 9, 0, NULL),
+	SYSFS_INT("max_workers", SYSFS_RW, &toi_max_workers, 0, NR_CPUS, 0,
+			NULL),
+#ifdef CONFIG_KGDB
+	SYSFS_BIT("post_resume_breakpoint", SYSFS_RW, &toi_bkd.toi_action,
+			TOI_POST_RESUME_BREAKPOINT, 0),
+#endif
+	SYSFS_BIT("no_readahead", SYSFS_RW, &toi_bkd.toi_action,
+			TOI_NO_READAHEAD, 0),
+#ifdef CONFIG_TOI_KEEP_IMAGE
+	SYSFS_BIT("keep_image", SYSFS_RW , &toi_bkd.toi_action, TOI_KEEP_IMAGE,
+			0),
+#endif
+};
+
+static struct toi_core_fns my_fns = {
+	.get_nonconflicting_page = __toi_get_nonconflicting_page,
+	.post_context_save = __toi_post_context_save,
+	.try_hibernate = toi_try_hibernate,
+	.try_resume = toi_sys_power_disk_try_resume,
+};
+
+/**
+ * core_load - initialisation of TuxOnIce core
+ *
+ * Initialise the core, beginning with sysfs. Checksum and so on are part of
+ * the core, but have their own initialisation routines because they either
+ * aren't compiled in all the time or have their own subdirectories.
+ **/
+static __init int core_load(void)
+{
+	int i,
+	    numfiles = sizeof(sysfs_params) / sizeof(struct toi_sysfs_data);
+
+	printk(KERN_INFO "TuxOnIce " TOI_CORE_VERSION
+			" (http://tuxonice.net)\n");
+
+	if (toi_sysfs_init())
+		return 1;
+
+	for (i = 0; i < numfiles; i++)
+		toi_register_sysfs_file(tuxonice_kobj, &sysfs_params[i]);
+
+	toi_core_fns = &my_fns;
+
+	if (toi_alloc_init())
+		return 1;
+	if (toi_checksum_init())
+		return 1;
+	if (toi_usm_init())
+		return 1;
+	if (toi_ui_init())
+		return 1;
+	if (toi_poweroff_init())
+		return 1;
+	if (toi_cluster_init())
+		return 1;
+
+	return 0;
+}
+
+#ifdef MODULE
+/**
+ * core_unload - prepare to unload the core code
+ **/
+static __exit void core_unload(void)
+{
+	int i,
+	    numfiles = sizeof(sysfs_params) / sizeof(struct toi_sysfs_data);
+
+	toi_alloc_exit();
+	toi_checksum_exit();
+	toi_poweroff_exit();
+	toi_ui_exit();
+	toi_usm_exit();
+	toi_cluster_exit();
+
+	for (i = 0; i < numfiles; i++)
+		toi_unregister_sysfs_file(tuxonice_kobj, &sysfs_params[i]);
+
+	toi_core_fns = NULL;
+
+	toi_sysfs_exit();
+}
+MODULE_LICENSE("GPL");
+module_init(core_load);
+module_exit(core_unload);
+#else
+late_initcall(core_load);
+#endif
diff --git a/kernel/power/tuxonice_incremental.c b/kernel/power/tuxonice_incremental.c
new file mode 100644
index 0000000..16d58fb
--- /dev/null
+++ b/kernel/power/tuxonice_incremental.c
@@ -0,0 +1,383 @@
+/*
+ * kernel/power/tuxonice_incremental.c
+ *
+ * Copyright (C) 2012 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * This file contains routines related to storing incremental images - that
+ * is, retaining an image after an initial cycle and then storing incremental
+ * changes on subsequent hibernations.
+ */
+
+#include <linux/suspend.h>
+#include <linux/highmem.h>
+#include <linux/vmalloc.h>
+#include <linux/crypto.h>
+#include <linux/scatterlist.h>
+
+#include "tuxonice_builtin.h"
+#include "tuxonice.h"
+#include "tuxonice_modules.h"
+#include "tuxonice_sysfs.h"
+#include "tuxonice_io.h"
+#include "tuxonice_ui.h"
+#include "tuxonice_alloc.h"
+
+static struct toi_module_ops toi_incremental_ops;
+static struct toi_module_ops *next_driver;
+static unsigned long toi_incremental_bytes_in, toi_incremental_bytes_out;
+
+static char toi_incremental_slow_cmp_name[32] = "sha1";
+static int toi_incremental_digestsize;
+
+static DEFINE_MUTEX(stats_lock);
+
+struct cpu_context {
+	u8 *buffer_start;
+	struct hash_desc desc;
+	struct scatterlist sg[1];
+	unsigned char *digest;
+};
+
+#define OUT_BUF_SIZE (2 * PAGE_SIZE)
+
+static DEFINE_PER_CPU(struct cpu_context, contexts);
+
+/*
+ * toi_incremental_crypto_prepare
+ *
+ * Prepare to do some work by allocating buffers and transforms.
+ */
+static int toi_incremental_crypto_prepare(void)
+{
+	int cpu, digestsize = toi_incremental_digestsize;
+
+	if (!*toi_incremental_slow_cmp_name) {
+		printk(KERN_INFO "TuxOnIce: Incremental image support enabled but no "
+				"hash algorithm set.\n");
+		return 1;
+	}
+
+	for_each_online_cpu(cpu) {
+		struct cpu_context *this = &per_cpu(contexts, cpu);
+		this->desc.tfm = crypto_alloc_hash(toi_incremental_slow_cmp_name, 0, 0);
+		if (IS_ERR(this->desc.tfm)) {
+			printk(KERN_INFO "TuxOnIce: Failed to initialise the "
+					"%s hashing transform.\n",
+					toi_incremental_slow_cmp_name);
+			this->desc.tfm = NULL;
+			return 1;
+		}
+
+		if (!digestsize) {
+			digestsize = crypto_hash_digestsize(this->desc.tfm);
+			toi_incremental_digestsize = digestsize;
+		}
+
+		this->digest = toi_kzalloc(16, digestsize, GFP_KERNEL);
+		if (!this->digest)
+			return -ENOMEM;
+
+		this->desc.flags = CRYPTO_TFM_REQ_MAY_SLEEP;
+	}
+
+	return 0;
+}
+
+static int toi_incremental_rw_cleanup(int writing)
+{
+	int cpu;
+
+	for_each_online_cpu(cpu) {
+		struct cpu_context *this = &per_cpu(contexts, cpu);
+		if (this->desc.tfm) {
+			crypto_free_hash(this->desc.tfm);
+			this->desc.tfm = NULL;
+		}
+
+		if (this->digest) {
+			toi_kfree(16, this->digest, toi_incremental_digestsize);
+			this->digest = NULL;
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * toi_incremental_init
+ */
+
+static int toi_incremental_init(int hibernate_or_resume)
+{
+	if (!hibernate_or_resume)
+		return 0;
+
+	next_driver = toi_get_next_filter(&toi_incremental_ops);
+
+	return next_driver ? 0 : -ECHILD;
+}
+
+/*
+ * toi_incremental_rw_init()
+ */
+
+static int toi_incremental_rw_init(int rw, int stream_number)
+{
+	if (rw == WRITE && toi_incremental_crypto_prepare()) {
+		printk(KERN_ERR "Failed to initialise hashing "
+				"algorithm.\n");
+		if (rw == READ) {
+			printk(KERN_INFO "Unable to read the image.\n");
+			return -ENODEV;
+		} else {
+			printk(KERN_INFO "Continuing without "
+				"calculating an incremental image.\n");
+			toi_incremental_ops.enabled = 0;
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * toi_incremental_write_page()
+ *
+ * Decide whether to write a page to the image. Calculate the SHA1 (or something
+ * else if the user changes the hashing algo) of the page and compare it to the
+ * previous value (if any). If there was no previous value or the values are
+ * different, write the page. Otherwise, skip the write.
+ *
+ * @TODO: Clear hashes for pages that are no longer in the image!
+ *
+ * Buffer_page:	Pointer to a buffer of size PAGE_SIZE, containing
+ * data to be written.
+ *
+ * Returns:	0 on success. Otherwise the error is that returned by the
+ *		hashing transform or by later modules in the pipeline
+ *		(-ECHILD if the pipeline is broken).
+ */
+static int toi_incremental_write_page(unsigned long index, int buf_type,
+		void *buffer_page, unsigned int buf_size)
+{
+	int ret = 0, cpu = smp_processor_id();
+	struct cpu_context *ctx = &per_cpu(contexts, cpu);
+	int to_write = true;
+
+	if (ctx->desc.tfm) {
+		// char *old_hash;
+
+		ctx->buffer_start = TOI_MAP(buf_type, buffer_page);
+
+		sg_init_one(&ctx->sg[0], ctx->buffer_start, buf_size);
+
+		ret = crypto_hash_digest(&ctx->desc, &ctx->sg[0], ctx->sg[0].length, ctx->digest);
+		// old_hash = get_old_hash(index);
+
+		TOI_UNMAP(buf_type, buffer_page);
+
+#if 0
+		if (!ret && new_hash == old_hash) {
+			to_write = false;
+		} else
+			store_hash(ctx, index, new_hash);
+#endif
+	}
+
+	mutex_lock(&stats_lock);
+
+	toi_incremental_bytes_in += buf_size;
+	if (ret || to_write)
+		toi_incremental_bytes_out += buf_size;
+
+	mutex_unlock(&stats_lock);
+
+	if (ret || to_write) {
+		int ret2 = next_driver->write_page(index, buf_type,
+				buffer_page, buf_size);
+		if (!ret)
+			ret = ret2;
+	}
+
+	return ret;
+}
+
+/*
+ * toi_incremental_read_page()
+ * @buffer_page: struct page *. Pointer to a buffer of size PAGE_SIZE.
+ *
+ * Nothing extra to do here.
+ */
+static int toi_incremental_read_page(unsigned long *index, int buf_type,
+		void *buffer_page, unsigned int *buf_size)
+{
+	return next_driver->read_page(index, TOI_PAGE, buffer_page, buf_size);
+}
+
+/*
+ * toi_incremental_print_debug_stats
+ * @buffer: Pointer to a buffer into which the debug info will be printed.
+ * @size: Size of the buffer.
+ *
+ * Print information to be recorded for debugging purposes into a buffer.
+ * Returns: Number of characters written to the buffer.
+ */
+
+static int toi_incremental_print_debug_stats(char *buffer, int size)
+{
+	unsigned long pages_in = toi_incremental_bytes_in >> PAGE_SHIFT,
+		      pages_out = toi_incremental_bytes_out >> PAGE_SHIFT;
+	int len;
+
+	/* Output the size of the incremental image. */
+	if (*toi_incremental_slow_cmp_name)
+		len = scnprintf(buffer, size, "- Hash algorithm is '%s'.\n",
+				toi_incremental_slow_cmp_name);
+	else
+		len = scnprintf(buffer, size, "- Hash algorithm is not set.\n");
+
+	if (pages_in)
+		len += scnprintf(buffer+len, size - len, "  Incremental image "
+			"%lu of %lu bytes (%lu percent).\n",
+		  toi_incremental_bytes_out,
+		  toi_incremental_bytes_in,
+		  pages_out * 100 / pages_in);
+	return len;
+}
+
+/*
+ * toi_incremental_memory_needed
+ *
+ * Tell the caller how much memory we need to operate during hibernate/resume.
+ * Returns: Maximum number of bytes of memory required for
+ * operation.
+ */
+static int toi_incremental_memory_needed(void)
+{
+	return 2 * PAGE_SIZE;
+}
+
+static int toi_incremental_storage_needed(void)
+{
+	return 2 * sizeof(unsigned long) + sizeof(int) +
+		strlen(toi_incremental_slow_cmp_name) + 1;
+}
+
+/*
+ * toi_incremental_save_config_info
+ * @buffer: Pointer to a buffer of size PAGE_SIZE.
+ *
+ * Save information needed when reloading the image at resume time.
+ * Returns: Number of bytes used for saving our data.
+ */
+static int toi_incremental_save_config_info(char *buffer)
+{
+	int len = strlen(toi_incremental_slow_cmp_name) + 1, offset = 0;
+
+	*((unsigned long *) buffer) = toi_incremental_bytes_in;
+	offset += sizeof(unsigned long);
+	*((unsigned long *) (buffer + offset)) = toi_incremental_bytes_out;
+	offset += sizeof(unsigned long);
+	*((int *) (buffer + offset)) = len;
+	offset += sizeof(int);
+	strncpy(buffer + offset, toi_incremental_slow_cmp_name, len);
+	return offset + len;
+}
+
+/* toi_incremental_load_config_info
+ * @buffer: Pointer to the start of the data.
+ * @size: Number of bytes that were saved.
+ *
+ * Description:	Reload information to be retained for debugging info.
+ */
+static void toi_incremental_load_config_info(char *buffer, int size)
+{
+	int len, offset = 0;
+
+	toi_incremental_bytes_in = *((unsigned long *) buffer);
+	offset += sizeof(unsigned long);
+	toi_incremental_bytes_out = *((unsigned long *) (buffer + offset));
+	offset += sizeof(unsigned long);
+	len = *((int *) (buffer + offset));
+	offset += sizeof(int);
+	strncpy(toi_incremental_slow_cmp_name, buffer + offset, len);
+}
+
+static void toi_incremental_pre_atomic_restore(struct toi_boot_kernel_data *bkd)
+{
+	bkd->incremental_bytes_in = toi_incremental_bytes_in;
+	bkd->incremental_bytes_out = toi_incremental_bytes_out;
+}
+
+static void toi_incremental_post_atomic_restore(struct toi_boot_kernel_data *bkd)
+{
+	toi_incremental_bytes_in = bkd->incremental_bytes_in;
+	toi_incremental_bytes_out = bkd->incremental_bytes_out;
+}
+
+static void toi_incremental_algo_change(void)
+{
+	/* Reset so it's gotten from crypto_hash_digestsize afresh */
+	toi_incremental_digestsize = 0;
+}
+
+/*
+ * data for our sysfs entries.
+ */
+static struct toi_sysfs_data sysfs_params[] = {
+	SYSFS_INT("enabled", SYSFS_RW, &toi_incremental_ops.enabled, 0, 1, 0,
+			NULL),
+	SYSFS_STRING("algorithm", SYSFS_RW, toi_incremental_slow_cmp_name, 31, 0, toi_incremental_algo_change),
+};
+
+/*
+ * Ops structure.
+ */
+static struct toi_module_ops toi_incremental_ops = {
+	.type			= FILTER_MODULE,
+	.name			= "incremental",
+	.directory		= "incremental",
+	.module			= THIS_MODULE,
+	.initialise		= toi_incremental_init,
+	.memory_needed		= toi_incremental_memory_needed,
+	.print_debug_info	= toi_incremental_print_debug_stats,
+	.save_config_info	= toi_incremental_save_config_info,
+	.load_config_info	= toi_incremental_load_config_info,
+	.storage_needed		= toi_incremental_storage_needed,
+
+	.pre_atomic_restore	= toi_incremental_pre_atomic_restore,
+	.post_atomic_restore	= toi_incremental_post_atomic_restore,
+
+	.rw_init		= toi_incremental_rw_init,
+	.rw_cleanup		= toi_incremental_rw_cleanup,
+
+	.write_page		= toi_incremental_write_page,
+	.read_page		= toi_incremental_read_page,
+
+	.sysfs_data		= sysfs_params,
+	.num_sysfs_entries	= sizeof(sysfs_params) /
+		sizeof(struct toi_sysfs_data),
+};
+
+/* ---- Registration ---- */
+
+static __init int toi_incremental_load(void)
+{
+	return toi_register_module(&toi_incremental_ops);
+}
+
+#ifdef MODULE
+static __exit void toi_incremental_unload(void)
+{
+	toi_unregister_module(&toi_incremental_ops);
+}
+
+module_init(toi_incremental_load);
+module_exit(toi_incremental_unload);
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Nigel Cunningham");
+MODULE_DESCRIPTION("Incremental Image Support for TuxOnIce");
+#else
+late_initcall(toi_incremental_load);
+#endif
diff --git a/kernel/power/tuxonice_io.c b/kernel/power/tuxonice_io.c
new file mode 100644
index 0000000..901f1c9
--- /dev/null
+++ b/kernel/power/tuxonice_io.c
@@ -0,0 +1,1936 @@
+/*
+ * kernel/power/tuxonice_io.c
+ *
+ * Copyright (C) 1998-2001 Gabor Kuti <seasons@fornax.hu>
+ * Copyright (C) 1998,2001,2002 Pavel Machek <pavel@suse.cz>
+ * Copyright (C) 2002-2003 Florent Chabaud <fchabaud@free.fr>
+ * Copyright (C) 2002-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * It contains high level IO routines for hibernating.
+ *
+ */
+
+#include <linux/suspend.h>
+#include <linux/version.h>
+#include <linux/utsname.h>
+#include <linux/mount.h>
+#include <linux/highmem.h>
+#include <linux/kthread.h>
+#include <linux/cpu.h>
+#include <linux/fs_struct.h>
+#include <linux/bio.h>
+#include <linux/fs_uuid.h>
+#include <asm/tlbflush.h>
+
+#include "tuxonice.h"
+#include "tuxonice_modules.h"
+#include "tuxonice_pageflags.h"
+#include "tuxonice_io.h"
+#include "tuxonice_ui.h"
+#include "tuxonice_storage.h"
+#include "tuxonice_prepare_image.h"
+#include "tuxonice_extent.h"
+#include "tuxonice_sysfs.h"
+#include "tuxonice_builtin.h"
+#include "tuxonice_checksum.h"
+#include "tuxonice_alloc.h"
+char alt_resume_param[256];
+
+/* Version read from image header at resume */
+static int toi_image_header_version;
+
+#define read_if_version(VERS, VAR, DESC, ERR_ACT) do {					\
+	if (likely(toi_image_header_version >= VERS))				\
+		if (toiActiveAllocator->rw_header_chunk(READ, NULL,		\
+					(char *) &VAR, sizeof(VAR))) {		\
+			abort_hibernate(TOI_FAILED_IO, "Failed to read " DESC ".");	\
+			ERR_ACT;					\
+		}								\
+} while (0)
+
+/* Variables shared between threads and updated under the mutex */
+static int io_write, io_finish_at, io_base, io_barmax, io_pageset, io_result;
+static int io_index, io_nextupdate, io_pc, io_pc_step;
+static DEFINE_MUTEX(io_mutex);
+static DEFINE_PER_CPU(struct page *, last_sought);
+static DEFINE_PER_CPU(struct page *, last_high_page);
+static DEFINE_PER_CPU(char *, checksum_locn);
+static DEFINE_PER_CPU(struct pbe *, last_low_page);
+static atomic_t io_count;
+atomic_t toi_io_workers;
+EXPORT_SYMBOL_GPL(toi_io_workers);
+
+static int using_flusher;
+
+DECLARE_WAIT_QUEUE_HEAD(toi_io_queue_flusher);
+EXPORT_SYMBOL_GPL(toi_io_queue_flusher);
+
+int toi_bio_queue_flusher_should_finish;
+EXPORT_SYMBOL_GPL(toi_bio_queue_flusher_should_finish);
+
+int toi_max_workers;
+
+static char *image_version_error = "The image header version is newer than " \
+	"this kernel supports.";
+
+struct toi_module_ops *first_filter;
+
+static atomic_t toi_num_other_threads;
+static DECLARE_WAIT_QUEUE_HEAD(toi_worker_wait_queue);
+enum toi_worker_commands {
+	TOI_IO_WORKER_STOP,
+	TOI_IO_WORKER_RUN,
+	TOI_IO_WORKER_EXIT
+};
+static enum toi_worker_commands toi_worker_command;
+
+/**
+ * toi_attempt_to_parse_resume_device - determine if we can hibernate
+ *
+ * Can we hibernate, using the current resume= parameter?
+ **/
+int toi_attempt_to_parse_resume_device(int quiet)
+{
+	struct list_head *Allocator;
+	struct toi_module_ops *thisAllocator;
+	int result, returning = 0;
+
+	if (toi_activate_storage(0))
+		return 0;
+
+	toiActiveAllocator = NULL;
+	clear_toi_state(TOI_RESUME_DEVICE_OK);
+	clear_toi_state(TOI_CAN_RESUME);
+	clear_result_state(TOI_ABORTED);
+
+	if (!toiNumAllocators) {
+		if (!quiet)
+			printk(KERN_INFO "TuxOnIce: No storage allocators have "
+				"been registered. Hibernating will be "
+				"disabled.\n");
+		goto cleanup;
+	}
+
+	list_for_each(Allocator, &toiAllocators) {
+		thisAllocator = list_entry(Allocator, struct toi_module_ops,
+								type_list);
+
+		/*
+		 * Not sure why you'd want to disable an allocator, but
+		 * we should honour the flag if we're providing it
+		 */
+		if (!thisAllocator->enabled)
+			continue;
+
+		result = thisAllocator->parse_sig_location(
+				resume_file, (toiNumAllocators == 1),
+				quiet);
+
+		switch (result) {
+		case -EINVAL:
+			/* For this allocator, but not a valid
+			 * configuration. Error already printed. */
+			goto cleanup;
+
+		case 0:
+			/* For this allocator and valid. */
+			toiActiveAllocator = thisAllocator;
+
+			set_toi_state(TOI_RESUME_DEVICE_OK);
+			set_toi_state(TOI_CAN_RESUME);
+			returning = 1;
+			goto cleanup;
+		}
+	}
+	if (!quiet)
+		printk(KERN_INFO "TuxOnIce: No matching enabled allocator "
+				"found. Resuming disabled.\n");
+cleanup:
+	toi_deactivate_storage(0);
+	return returning;
+}
+EXPORT_SYMBOL_GPL(toi_attempt_to_parse_resume_device);
+
+void attempt_to_parse_resume_device2(void)
+{
+	toi_prepare_usm();
+	toi_attempt_to_parse_resume_device(0);
+	toi_cleanup_usm();
+}
+EXPORT_SYMBOL_GPL(attempt_to_parse_resume_device2);
+
+void save_restore_alt_param(int replace, int quiet)
+{
+	static char resume_param_save[256];
+	static unsigned long toi_state_save;
+
+	if (replace) {
+		toi_state_save = toi_state;
+		strcpy(resume_param_save, resume_file);
+		strcpy(resume_file, alt_resume_param);
+	} else {
+		strcpy(resume_file, resume_param_save);
+		toi_state = toi_state_save;
+	}
+	toi_attempt_to_parse_resume_device(quiet);
+}
+
+void attempt_to_parse_alt_resume_param(void)
+{
+	int ok = 0;
+
+	/* Temporarily set resume_param to the poweroff value */
+	if (!strlen(alt_resume_param))
+		return;
+
+	printk(KERN_INFO "=== Trying Poweroff Resume2 ===\n");
+	save_restore_alt_param(SAVE, NOQUIET);
+	if (test_toi_state(TOI_CAN_RESUME))
+		ok = 1;
+
+	printk(KERN_INFO "=== Done ===\n");
+	save_restore_alt_param(RESTORE, QUIET);
+
+	/* If not ok, clear the string */
+	if (ok)
+		return;
+
+	printk(KERN_INFO "Can't resume from that location; clearing "
+			"alt_resume_param.\n");
+	alt_resume_param[0] = '\0';
+}
+
+/**
+ * noresume_reset_modules - reset data structures when not resuming
+ *
+ * When we read the start of an image, modules (and especially the
+ * active allocator) might need to reset data structures if we
+ * decide to remove the image rather than resuming from it.
+ **/
+static void noresume_reset_modules(void)
+{
+	struct toi_module_ops *this_filter;
+
+	list_for_each_entry(this_filter, &toi_filters, type_list)
+		if (this_filter->noresume_reset)
+			this_filter->noresume_reset();
+
+	if (toiActiveAllocator && toiActiveAllocator->noresume_reset)
+		toiActiveAllocator->noresume_reset();
+}
+
+/**
+ * fill_toi_header - fill the hibernate header structure
+ * @sh: Header data structure to be filled.
+ **/
+static int fill_toi_header(struct toi_header *sh)
+{
+	int i, error;
+
+	error = init_header((struct swsusp_info *) sh);
+	if (error)
+		return error;
+
+	sh->pagedir = pagedir1;
+	sh->pageset_2_size = pagedir2.size;
+	sh->param0 = toi_result;
+	sh->param1 = toi_bkd.toi_action;
+	sh->param2 = toi_bkd.toi_debug_state;
+	sh->param3 = toi_bkd.toi_default_console_level;
+	sh->root_fs = current->fs->root.mnt->mnt_sb->s_dev;
+	for (i = 0; i < 4; i++)
+		sh->io_time[i/2][i%2] = toi_bkd.toi_io_time[i/2][i%2];
+	sh->bkd = boot_kernel_data_buffer;
+	return 0;
+}
+
+/**
+ * rw_init_modules - initialize modules
+ * @rw:		Whether we are reading or writing an image.
+ * @which:	Section of the image being processed.
+ *
+ * Iterate over modules, preparing the ones that will be used to read or write
+ * data.
+ **/
+static int rw_init_modules(int rw, int which)
+{
+	struct toi_module_ops *this_module;
+	/* Initialise page transformers */
+	list_for_each_entry(this_module, &toi_filters, type_list) {
+		if (!this_module->enabled)
+			continue;
+		if (this_module->rw_init && this_module->rw_init(rw, which)) {
+			abort_hibernate(TOI_FAILED_MODULE_INIT,
+				"Failed to initialize the %s filter.",
+				this_module->name);
+			return 1;
+		}
+	}
+
+	/* Initialise allocator */
+	if (toiActiveAllocator->rw_init(rw, which)) {
+		abort_hibernate(TOI_FAILED_MODULE_INIT,
+				"Failed to initialise the allocator.");
+		return 1;
+	}
+
+	/* Initialise other modules */
+	list_for_each_entry(this_module, &toi_modules, module_list) {
+		if (!this_module->enabled ||
+		    this_module->type == FILTER_MODULE ||
+		    this_module->type == WRITER_MODULE)
+			continue;
+		if (this_module->rw_init && this_module->rw_init(rw, which)) {
+			set_abort_result(TOI_FAILED_MODULE_INIT);
+			printk(KERN_INFO "Setting aborted flag due to module "
+					"init failure.\n");
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * rw_cleanup_modules - cleanup modules
+ * @rw:	Whether we are reading or writing an image.
+ *
+ * Cleanup components after reading or writing a set of pages.
+ * Only the allocator may fail.
+ **/
+static int rw_cleanup_modules(int rw)
+{
+	struct toi_module_ops *this_module;
+	int result = 0;
+
+	/* Cleanup other modules */
+	list_for_each_entry(this_module, &toi_modules, module_list) {
+		if (!this_module->enabled ||
+		    this_module->type == FILTER_MODULE ||
+		    this_module->type == WRITER_MODULE)
+			continue;
+		if (this_module->rw_cleanup)
+			result |= this_module->rw_cleanup(rw);
+	}
+
+	/* Flush data and cleanup */
+	list_for_each_entry(this_module, &toi_filters, type_list) {
+		if (!this_module->enabled)
+			continue;
+		if (this_module->rw_cleanup)
+			result |= this_module->rw_cleanup(rw);
+	}
+
+	result |= toiActiveAllocator->rw_cleanup(rw);
+
+	return result;
+}
+
+static struct page *copy_page_from_orig_page(struct page *orig_page, int is_high)
+{
+	int index, min, max;
+	struct page *high_page = NULL,
+		    **my_last_high_page = &__get_cpu_var(last_high_page),
+		    **my_last_sought = &__get_cpu_var(last_sought);
+	struct pbe *this, **my_last_low_page = &__get_cpu_var(last_low_page);
+	void *compare;
+
+	if (is_high) {
+		if (*my_last_sought && *my_last_high_page &&
+				*my_last_sought < orig_page)
+			high_page = *my_last_high_page;
+		else
+			high_page = (struct page *) restore_highmem_pblist;
+		this = (struct pbe *) kmap(high_page);
+		compare = orig_page;
+	} else {
+		if (*my_last_sought && *my_last_low_page &&
+				*my_last_sought < orig_page)
+			this = *my_last_low_page;
+		else
+			this = restore_pblist;
+		compare = page_address(orig_page);
+	}
+
+	*my_last_sought = orig_page;
+
+	/* Locate page containing pbe */
+	while (this[PBES_PER_PAGE - 1].next &&
+			this[PBES_PER_PAGE - 1].orig_address < compare) {
+		if (is_high) {
+			struct page *next_high_page = (struct page *)
+				this[PBES_PER_PAGE - 1].next;
+			kunmap(high_page);
+			this = kmap(next_high_page);
+			high_page = next_high_page;
+		} else
+			this = this[PBES_PER_PAGE - 1].next;
+	}
+
+	/* Do a binary search within the page */
+	min = 0;
+	max = PBES_PER_PAGE;
+	index = PBES_PER_PAGE / 2;
+	while (max - min) {
+		if (!this[index].orig_address ||
+		    this[index].orig_address > compare)
+			max = index;
+		else if (this[index].orig_address == compare) {
+			if (is_high) {
+				struct page *page = this[index].address;
+				*my_last_high_page = high_page;
+				kunmap(high_page);
+				return page;
+			}
+			*my_last_low_page = this;
+			return virt_to_page(this[index].address);
+		} else
+			min = index;
+		index = ((max + min) / 2);
+	}
+
+	abort_hibernate(TOI_FAILED_IO, "Failed to get destination page for"
+		" orig page %p. this[index].orig_address=%p.\n", orig_page,
+		this[index].orig_address);
+
+	if (is_high)
+		kunmap(high_page);
+	return NULL;
+}
+
+/**
+ * write_next_page - write the next page in a pageset
+ * @data_pfn: The pfn where the next data to write is located.
+ * @my_io_index: The index of the page in the pageset.
+ * @write_pfn: The pfn number to write in the image (where the data belongs).
+ *
+ * Get the pfn of the next page to write, map the page if necessary and do the
+ * write.
+ **/
+static int write_next_page(unsigned long *data_pfn, int *my_io_index,
+		unsigned long *write_pfn)
+{
+	struct page *page;
+	char **my_checksum_locn = &__get_cpu_var(checksum_locn);
+	int result = 0, was_present;
+
+	*data_pfn = memory_bm_next_pfn(io_map);
+
+	/* Another thread could have beaten us to it. */
+	if (*data_pfn == BM_END_OF_MAP) {
+		if (atomic_read(&io_count)) {
+			printk(KERN_INFO "Ran out of pfns but io_count is "
+					"still %d.\n", atomic_read(&io_count));
+			BUG();
+		}
+		mutex_unlock(&io_mutex);
+		return -ENODATA;
+	}
+
+	*my_io_index = io_finish_at - atomic_sub_return(1, &io_count);
+
+	memory_bm_clear_bit(io_map, *data_pfn);
+	page = pfn_to_page(*data_pfn);
+
+	was_present = kernel_page_present(page);
+	if (!was_present)
+		kernel_map_pages(page, 1, 1);
+
+	if (io_pageset == 1)
+		*write_pfn = memory_bm_next_pfn(pageset1_map);
+	else {
+		*write_pfn = *data_pfn;
+		*my_checksum_locn = tuxonice_get_next_checksum();
+	}
+
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "Write %d:%ld.", *my_io_index, *write_pfn);
+
+	mutex_unlock(&io_mutex);
+
+	if (io_pageset == 2 && tuxonice_calc_checksum(page, *my_checksum_locn))
+		return 1;
+
+	result = first_filter->write_page(*write_pfn, TOI_PAGE, page,
+			PAGE_SIZE);
+
+	if (!was_present)
+		kernel_map_pages(page, 1, 0);
+
+	return result;
+}
+
+/**
+ * read_next_page - read the next page in a pageset
+ * @my_io_index: The index of the page in the pageset.
+ * @write_pfn: The pfn in which the data belongs.
+ *
+ * Read a page of the image into our buffer. It can happen (here and in the
+ * write routine) that threads don't get run until after other CPUs have done
+ * all the work. This was the cause of the long-standing issue with
+ * occasionally getting -ENODATA errors at the end of reading the image. We
+ * therefore need to check there's actually a page to read before trying to
+ * retrieve one.
+ **/
+
+static int read_next_page(int *my_io_index, unsigned long *write_pfn,
+		struct page *buffer)
+{
+	unsigned int buf_size = PAGE_SIZE;
+	unsigned long left = atomic_read(&io_count);
+
+	if (!left)
+		return -ENODATA;
+
+	/* Start off assuming the page we read isn't resaved */
+	*my_io_index = io_finish_at - atomic_sub_return(1, &io_count);
+
+	mutex_unlock(&io_mutex);
+
+	/*
+	 * Are we aborting? If so, don't submit any more I/O as
+	 * resetting the resume_attempted flag (from ui.c) will
+	 * clear the bdev flags, making this thread oops.
+	 */
+	if (unlikely(test_toi_state(TOI_STOP_RESUME))) {
+		atomic_dec(&toi_io_workers);
+		if (!atomic_read(&toi_io_workers)) {
+			/*
+			 * So we can be sure we'll have memory for
+			 * marking that we haven't resumed.
+			 */
+			rw_cleanup_modules(READ);
+			set_toi_state(TOI_IO_STOPPED);
+		}
+		while (1)
+			schedule();
+	}
+
+	/*
+	 * See toi_bio_read_page in tuxonice_bio.c:
+	 * read the next page in the image.
+	 */
+	return first_filter->read_page(write_pfn, TOI_PAGE, buffer, &buf_size);
+}
+
+static void use_read_page(unsigned long write_pfn, struct page *buffer)
+{
+	struct page *final_page = pfn_to_page(write_pfn),
+		    *copy_page = final_page;
+	char *virt, *buffer_virt;
+	int was_present, cpu = smp_processor_id();
+	unsigned long idx = 0;
+
+	if (io_pageset == 1 && (!pageset1_copy_map ||
+			!memory_bm_test_bit_index(pageset1_copy_map, write_pfn, cpu))) {
+		int is_high = PageHighMem(final_page);
+		copy_page = copy_page_from_orig_page(is_high ? (void *) write_pfn : final_page, is_high);
+	}
+
+	if (!memory_bm_test_bit_index(io_map, write_pfn, cpu)) {
+		toi_message(TOI_IO, TOI_VERBOSE, 0, "Discard %ld.", write_pfn);
+		mutex_lock(&io_mutex);
+		idx = atomic_add_return(1, &io_count);
+		mutex_unlock(&io_mutex);
+		return;
+	}
+
+	virt = kmap(copy_page);
+	buffer_virt = kmap(buffer);
+	was_present = kernel_page_present(copy_page);
+	if (!was_present)
+		kernel_map_pages(copy_page, 1, 1);
+	memcpy(virt, buffer_virt, PAGE_SIZE);
+	if (!was_present)
+		kernel_map_pages(copy_page, 1, 0);
+	kunmap(copy_page);
+	kunmap(buffer);
+	memory_bm_clear_bit_index(io_map, write_pfn, cpu);
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "Read %lu:%lu", idx, write_pfn);
+}
+
+static unsigned long status_update(int writing, unsigned long done,
+		unsigned long ticks)
+{
+	int cs_index = writing ? 0 : 1;
+	unsigned long ticks_so_far = toi_bkd.toi_io_time[cs_index][1] + ticks;
+	unsigned long msec = jiffies_to_msecs(abs(ticks_so_far));
+	unsigned long pgs_per_s, estimate = 0, pages_left;
+
+	if (msec) {
+		pages_left = io_barmax - done;
+		pgs_per_s = 1000 * done / msec;
+		if (pgs_per_s)
+			estimate = DIV_ROUND_UP(pages_left, pgs_per_s);
+	}
+
+	if (estimate && ticks > HZ / 2)
+		return toi_update_status(done, io_barmax,
+			" %d/%d MB (%lu sec left)",
+			MB(done+1), MB(io_barmax), estimate);
+
+	return toi_update_status(done, io_barmax, " %d/%d MB",
+		MB(done+1), MB(io_barmax));
+}
+
+/**
+ * worker_rw_loop - main loop to read/write pages
+ *
+ * The main I/O loop for reading or writing pages. The io_map bitmap is used to
+ * track the pages to read/write.
+ * If we are reading, the pages are loaded to their final (mapped) pfn.
+ * Data is non-zero iff this is a thread started via toi_start_other_threads.
+ * In that case, we stay in here until told to quit.
+ **/
+static int worker_rw_loop(void *data)
+{
+	unsigned long data_pfn, write_pfn, next_jiffies = jiffies + HZ / 4,
+		      jif_index = 1, start_time = jiffies, thread_num;
+	int result = 0, my_io_index = 0, last_worker;
+	struct page *buffer = toi_alloc_page(28, TOI_ATOMIC_GFP);
+	cpumask_var_t orig_mask;
+
+	if (!alloc_cpumask_var(&orig_mask, GFP_KERNEL)) {
+		printk(KERN_EMERG "Failed to allocate cpumask for TuxOnIce I/O thread %ld.\n", (unsigned long) data);
+		return -ENOMEM;
+	}
+
+	cpumask_copy(orig_mask, tsk_cpus_allowed(current));
+
+	current->flags |= PF_NOFREEZE;
+
+top:
+	mutex_lock(&io_mutex);
+	thread_num = atomic_read(&toi_io_workers);
+
+	cpumask_copy(tsk_cpus_allowed(current), orig_mask);
+	schedule();
+
+	atomic_inc(&toi_io_workers);
+
+	while (atomic_read(&io_count) >= atomic_read(&toi_io_workers) &&
+		!(io_write && test_result_state(TOI_ABORTED)) &&
+		toi_worker_command == TOI_IO_WORKER_RUN) {
+		if (!thread_num && jiffies > next_jiffies) {
+			next_jiffies += HZ / 4;
+			if (toiActiveAllocator->update_throughput_throttle)
+				toiActiveAllocator->update_throughput_throttle(
+						jif_index);
+			jif_index++;
+		}
+
+		/*
+		 * What page to use? If reading, don't know yet which page's
+		 * data will be read, so always use the buffer. If writing,
+		 * use the copy (Pageset1) or original page (Pageset2), but
+		 * always write the pfn of the original page.
+		 */
+		if (io_write)
+			result = write_next_page(&data_pfn, &my_io_index,
+					&write_pfn);
+		else /* Reading */
+			result = read_next_page(&my_io_index, &write_pfn,
+					buffer);
+
+		if (result) {
+			mutex_lock(&io_mutex);
+			/* Nothing to do? */
+			if (result == -ENODATA) {
+				toi_message(TOI_IO, TOI_VERBOSE, 0,
+					"Thread %d has no more work.",
+					smp_processor_id());
+				break;
+			}
+
+			io_result = result;
+
+			if (io_write) {
+				printk(KERN_INFO "Write chunk returned %d.\n",
+						result);
+				abort_hibernate(TOI_FAILED_IO,
+					"Failed to write a chunk of the "
+					"image.");
+				break;
+			}
+
+			if (io_pageset == 1) {
+				printk(KERN_ERR "\nBreaking out of I/O loop "
+					"because of result code %d.\n", result);
+				break;
+			}
+			panic("Read chunk returned (%d)", result);
+		}
+
+		/*
+		 * Discard reads of resaved pages while reading ps2
+		 * and unwanted pages while rereading ps2 when aborting.
+		 */
+		if (!io_write) {
+			if (!PageResave(pfn_to_page(write_pfn)))
+				use_read_page(write_pfn, buffer);
+			else {
+				mutex_lock(&io_mutex);
+				toi_message(TOI_IO, TOI_VERBOSE, 0,
+						"Resaved %ld.", write_pfn);
+				atomic_inc(&io_count);
+				mutex_unlock(&io_mutex);
+			}
+		}
+
+		if (!thread_num) {
+			if (my_io_index + io_base > io_nextupdate)
+				io_nextupdate = status_update(io_write,
+						my_io_index + io_base,
+						jiffies - start_time);
+
+			if (my_io_index > io_pc) {
+				printk(KERN_CONT "...%d%%", 20 * io_pc_step);
+				io_pc_step++;
+				io_pc = io_finish_at * io_pc_step / 5;
+			}
+		}
+
+		toi_cond_pause(0, NULL);
+
+		/*
+		 * Subtle: If there's less I/O still to be done than threads
+		 * running, quit. This stops us doing I/O beyond the end of
+		 * the image when reading.
+		 *
+		 * Possible race condition. Two threads could do the test at
+		 * the same time; one should exit and one should continue.
+		 * Therefore we take the mutex before comparing and exiting.
+		 */
+
+		mutex_lock(&io_mutex);
+	}
+
+	last_worker = atomic_dec_and_test(&toi_io_workers);
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "%d workers left.", atomic_read(&toi_io_workers));
+	mutex_unlock(&io_mutex);
+
+	if ((unsigned long) data && toi_worker_command != TOI_IO_WORKER_EXIT) {
+		/* Were we the last thread and we're using a flusher thread? */
+		if (last_worker && using_flusher) {
+			toiActiveAllocator->finish_all_io();
+		}
+		/* First, if we're doing I/O, wait for it to finish */
+		wait_event(toi_worker_wait_queue, toi_worker_command != TOI_IO_WORKER_RUN);
+		/* Then wait to be told what to do next */
+		wait_event(toi_worker_wait_queue, toi_worker_command != TOI_IO_WORKER_STOP);
+		if (toi_worker_command == TOI_IO_WORKER_RUN)
+			goto top;
+	}
+
+	if (thread_num)
+		atomic_dec(&toi_num_other_threads);
+
+	toi_message(TOI_IO, TOI_LOW, 0, "Thread %lu exiting.", thread_num);
+	toi__free_page(28, buffer);
+	free_cpumask_var(orig_mask);
+
+	return result;
+}
+
+int toi_start_other_threads(void)
+{
+	int cpu;
+	struct task_struct *p;
+	int to_start = (toi_max_workers ? toi_max_workers : num_online_cpus()) - 1;
+	unsigned long num_started = 0;
+
+	if (test_action_state(TOI_NO_MULTITHREADED_IO))
+		return 0;
+
+	toi_worker_command = TOI_IO_WORKER_STOP;
+
+	for_each_online_cpu(cpu) {
+		if (num_started == to_start)
+			break;
+
+		if (cpu == smp_processor_id())
+			continue;
+
+		p = kthread_create_on_node(worker_rw_loop, (void *) (num_started + 1),
+				cpu_to_node(cpu), "ktoi_io/%d", cpu);
+		if (IS_ERR(p)) {
+			printk(KERN_ERR "ktoi_io for %i failed\n", cpu);
+			continue;
+		}
+		kthread_bind(p, cpu);
+		p->flags |= PF_MEMALLOC;
+		wake_up_process(p);
+		num_started++;
+		atomic_inc(&toi_num_other_threads);
+	}
+
+	toi_message(TOI_IO, TOI_LOW, 0, "Started %lu threads.", num_started);
+	return num_started;
+}
+
+void toi_stop_other_threads(void)
+{
+	toi_message(TOI_IO, TOI_LOW, 0, "Stopping other threads.");
+	toi_worker_command = TOI_IO_WORKER_EXIT;
+	wake_up(&toi_worker_wait_queue);
+}
+
+/**
+ * do_rw_loop - main high-level function for reading or writing pages
+ *
+ * Create the io_map bitmap and call worker_rw_loop to perform I/O operations.
+ **/
+static int do_rw_loop(int write, int finish_at, struct memory_bitmap *pageflags,
+		int base, int barmax, int pageset)
+{
+	int index = 0, cpu, result = 0, workers_started;
+	unsigned long pfn;
+
+	first_filter = toi_get_next_filter(NULL);
+
+	if (!finish_at)
+		return 0;
+
+	io_write = write;
+	io_finish_at = finish_at;
+	io_base = base;
+	io_barmax = barmax;
+	io_pageset = pageset;
+	io_index = 0;
+	io_pc = io_finish_at / 5;
+	io_pc_step = 1;
+	io_result = 0;
+	io_nextupdate = base + 1;
+	toi_bio_queue_flusher_should_finish = 0;
+
+	for_each_online_cpu(cpu) {
+		per_cpu(last_sought, cpu) = NULL;
+		per_cpu(last_low_page, cpu) = NULL;
+		per_cpu(last_high_page, cpu) = NULL;
+	}
+
+	/* Ensure all bits clear */
+	memory_bm_clear(io_map);
+
+	/* Set the bits for the pages to write */
+	memory_bm_position_reset(pageflags);
+
+	pfn = memory_bm_next_pfn(pageflags);
+
+	while (pfn != BM_END_OF_MAP && index < finish_at) {
+		memory_bm_set_bit(io_map, pfn);
+		pfn = memory_bm_next_pfn(pageflags);
+		index++;
+	}
+
+	BUG_ON(index < finish_at);
+
+	atomic_set(&io_count, finish_at);
+
+	memory_bm_position_reset(pageset1_map);
+
+	mutex_lock(&io_mutex);
+
+	clear_toi_state(TOI_IO_STOPPED);
+
+	using_flusher = (atomic_read(&toi_num_other_threads) &&
+			 toiActiveAllocator->io_flusher &&
+			 !test_action_state(TOI_NO_FLUSHER_THREAD));
+
+	workers_started = atomic_read(&toi_num_other_threads);
+
+	memory_bm_set_iterators(io_map, atomic_read(&toi_num_other_threads) + 1);
+	memory_bm_position_reset(io_map);
+
+	memory_bm_set_iterators(pageset1_copy_map, atomic_read(&toi_num_other_threads) + 1);
+	memory_bm_position_reset(pageset1_copy_map);
+
+	toi_worker_command = TOI_IO_WORKER_RUN;
+	wake_up(&toi_worker_wait_queue);
+
+	mutex_unlock(&io_mutex);
+
+	if (using_flusher)
+		result = toiActiveAllocator->io_flusher(write);
+	else
+		worker_rw_loop(NULL);
+
+	while (atomic_read(&toi_io_workers))
+		schedule();
+
+	printk(KERN_CONT "\n");
+
+	toi_worker_command = TOI_IO_WORKER_STOP;
+	wake_up(&toi_worker_wait_queue);
+
+	if (unlikely(test_toi_state(TOI_STOP_RESUME))) {
+		if (!atomic_read(&toi_io_workers)) {
+			rw_cleanup_modules(READ);
+			set_toi_state(TOI_IO_STOPPED);
+		}
+		while (1)
+			schedule();
+	}
+	set_toi_state(TOI_IO_STOPPED);
+
+	if (!io_result && !result && !test_result_state(TOI_ABORTED)) {
+		unsigned long next;
+
+		toi_update_status(io_base + io_finish_at, io_barmax,
+				" %d/%d MB ",
+				MB(io_base + io_finish_at), MB(io_barmax));
+
+		memory_bm_position_reset(io_map);
+		next = memory_bm_next_pfn(io_map);
+		if (next != BM_END_OF_MAP) {
+			printk(KERN_INFO "Finished I/O loop but still work to "
+					"do?\nFinish at = %d. io_count = %d.\n",
+					finish_at, atomic_read(&io_count));
+			printk(KERN_INFO "I/O bitmap still records work to do: "
+					"%lu.\n", next);
+			BUG();
+			do {
+				cpu_relax();
+			} while (0);
+		}
+	}
+
+	return io_result ? io_result : result;
+}
+
+/**
+ * write_pageset - write a pageset to disk.
+ * @pagedir:	Which pagedir to write.
+ *
+ * Returns:
+ *	Zero on success, nonzero on failure.
+ **/
+int write_pageset(struct pagedir *pagedir)
+{
+	int finish_at, base = 0;
+	int barmax = pagedir1.size + pagedir2.size;
+	long error = 0;
+	struct memory_bitmap *pageflags;
+	unsigned long start_time, end_time;
+
+	/*
+	 * Even if there is nothing to read or write, the allocator
+	 * may need the init/cleanup for its housekeeping.  (eg:
+	 * Pageset1 may start where pageset2 ends when writing).
+	 */
+	finish_at = pagedir->size;
+
+	if (pagedir->id == 1) {
+		toi_prepare_status(DONT_CLEAR_BAR,
+				"Writing kernel & process data...");
+		base = pagedir2.size;
+		if (test_action_state(TOI_TEST_FILTER_SPEED) ||
+		    test_action_state(TOI_TEST_BIO))
+			pageflags = pageset1_map;
+		else
+			pageflags = pageset1_copy_map;
+	} else {
+		toi_prepare_status(DONT_CLEAR_BAR, "Writing caches...");
+		pageflags = pageset2_map;
+	}
+
+	start_time = jiffies;
+
+	if (rw_init_modules(1, pagedir->id)) {
+		abort_hibernate(TOI_FAILED_MODULE_INIT,
+				"Failed to initialise modules for writing.");
+		error = 1;
+	}
+
+	if (!error)
+		error = do_rw_loop(1, finish_at, pageflags, base, barmax,
+				pagedir->id);
+
+	if (rw_cleanup_modules(WRITE) && !error) {
+		abort_hibernate(TOI_FAILED_MODULE_CLEANUP,
+				"Failed to cleanup after writing.");
+		error = 1;
+	}
+
+	end_time = jiffies;
+
+	if ((end_time - start_time) && (!test_result_state(TOI_ABORTED))) {
+		toi_bkd.toi_io_time[0][0] += finish_at;
+		toi_bkd.toi_io_time[0][1] += (end_time - start_time);
+	}
+
+	return error;
+}
+
+/**
+ * read_pageset - high-level function to read a pageset from disk
+ * @pagedir:			pageset to read
+ * @overwrittenpagesonly:	Whether to read the whole pageset or
+ *				only part of it.
+ *
+ * Returns:
+ *	Zero on success, nonzero on failure.
+ **/
+static int read_pageset(struct pagedir *pagedir, int overwrittenpagesonly)
+{
+	int result = 0, base = 0;
+	int finish_at = pagedir->size;
+	int barmax = pagedir1.size + pagedir2.size;
+	struct memory_bitmap *pageflags;
+	unsigned long start_time, end_time;
+
+	if (pagedir->id == 1) {
+		toi_prepare_status(DONT_CLEAR_BAR,
+				"Reading kernel & process data...");
+		pageflags = pageset1_map;
+	} else {
+		toi_prepare_status(DONT_CLEAR_BAR, "Reading caches...");
+		if (overwrittenpagesonly) {
+			barmax = min(pagedir1.size, pagedir2.size);
+			finish_at = min(pagedir1.size, pagedir2.size);
+		} else
+			base = pagedir1.size;
+		pageflags = pageset2_map;
+	}
+
+	start_time = jiffies;
+
+	if (rw_init_modules(0, pagedir->id)) {
+		toiActiveAllocator->remove_image();
+		result = 1;
+	} else
+		result = do_rw_loop(0, finish_at, pageflags, base, barmax,
+				pagedir->id);
+
+	if (rw_cleanup_modules(READ) && !result) {
+		abort_hibernate(TOI_FAILED_MODULE_CLEANUP,
+				"Failed to cleanup after reading.");
+		result = 1;
+	}
+
+	/* Statistics */
+	end_time = jiffies;
+
+	if ((end_time - start_time) && (!test_result_state(TOI_ABORTED))) {
+		toi_bkd.toi_io_time[1][0] += finish_at;
+		toi_bkd.toi_io_time[1][1] += (end_time - start_time);
+	}
+
+	return result;
+}
+
+/**
+ * write_module_configs - store the modules configuration
+ *
+ * The configuration for each module is stored in the image header.
+ * Returns: Int
+ *	Zero on success, Error value otherwise.
+ **/
+static int write_module_configs(void)
+{
+	struct toi_module_ops *this_module;
+	char *buffer = (char *) toi_get_zeroed_page(22, TOI_ATOMIC_GFP);
+	int len, index = 1;
+	struct toi_module_header toi_module_header;
+
+	if (!buffer) {
+		printk(KERN_INFO "Failed to allocate a buffer for saving "
+				"module configuration info.\n");
+		return -ENOMEM;
+	}
+
+	/*
+	 * We have to know which data goes with which module, so we at
+	 * least write a length of zero for a module. Note that we are
+	 * also assuming every module's config data takes <= PAGE_SIZE.
+	 */
+
+	/* For each module (in registration order) */
+	list_for_each_entry(this_module, &toi_modules, module_list) {
+		if (!this_module->enabled || !this_module->storage_needed ||
+		    (this_module->type == WRITER_MODULE &&
+		     toiActiveAllocator != this_module))
+			continue;
+
+		/* Get the data from the module */
+		len = 0;
+		if (this_module->save_config_info)
+			len = this_module->save_config_info(buffer);
+
+		/* Save the details of the module */
+		toi_module_header.enabled = this_module->enabled;
+		toi_module_header.type = this_module->type;
+		toi_module_header.index = index++;
+		strncpy(toi_module_header.name, this_module->name,
+					sizeof(toi_module_header.name));
+		toiActiveAllocator->rw_header_chunk(WRITE,
+				this_module,
+				(char *) &toi_module_header,
+				sizeof(toi_module_header));
+
+		/* Save the size of the data and any data returned */
+		toiActiveAllocator->rw_header_chunk(WRITE,
+				this_module,
+				(char *) &len, sizeof(int));
+		if (len)
+			toiActiveAllocator->rw_header_chunk(
+				WRITE, this_module, buffer, len);
+	}
+
+	/* Write a blank header to terminate the list */
+	toi_module_header.name[0] = '\0';
+	toiActiveAllocator->rw_header_chunk(WRITE, NULL,
+			(char *) &toi_module_header, sizeof(toi_module_header));
+
+	toi_free_page(22, (unsigned long) buffer);
+	return 0;
+}
+
+/**
+ * read_one_module_config - read and configure one module
+ *
+ * Read the configuration for one module, and configure the module
+ * to match if it is loaded.
+ *
+ * Returns: Int
+ *	Zero on success, Error value otherwise.
+ **/
+static int read_one_module_config(struct toi_module_header *header)
+{
+	struct toi_module_ops *this_module;
+	int result, len;
+	char *buffer;
+
+	/* Find the module */
+	this_module = toi_find_module_given_name(header->name);
+
+	if (!this_module) {
+		if (header->enabled) {
+			toi_early_boot_message(1, TOI_CONTINUE_REQ,
+				"It looks like we need module %s for reading "
+				"the image but it hasn't been registered.\n",
+				header->name);
+			if (!(test_toi_state(TOI_CONTINUE_REQ)))
+				return -EINVAL;
+		} else
+			printk(KERN_INFO "Module %s configuration data found, "
+				"but the module hasn't registered. Looks like "
+				"it was disabled, so we're ignoring its data.\n",
+				header->name);
+	}
+
+	/* Get the length of the data (if any) */
+	result = toiActiveAllocator->rw_header_chunk(READ, NULL, (char *) &len,
+			sizeof(int));
+	if (result) {
+		printk(KERN_ERR "Failed to read the length of the module %s's"
+				" configuration data.\n",
+				header->name);
+		return -EINVAL;
+	}
+
+	/* Read any data and pass to the module (if we found one) */
+	if (!len)
+		return 0;
+
+	buffer = (char *) toi_get_zeroed_page(23, TOI_ATOMIC_GFP);
+
+	if (!buffer) {
+		printk(KERN_ERR "Failed to allocate a buffer for reloading "
+				"module configuration info.\n");
+		return -ENOMEM;
+	}
+
+	toiActiveAllocator->rw_header_chunk(READ, NULL, buffer, len);
+
+	if (!this_module)
+		goto out;
+
+	if (!this_module->load_config_info)
+		printk(KERN_ERR "Huh? Module %s appears to have a "
+				"save_config_info, but not a load_config_info "
+				"function!\n", this_module->name);
+	else
+		this_module->load_config_info(buffer, len);
+
+	/*
+	 * Now move this module to the tail of its lists. This will put it in
+	 * order. Any new modules will end up at the top of the lists. They
+	 * should have been set to disabled when loaded (people will
+	 * normally not edit an initrd to load a new module and then hibernate
+	 * without using it!).
+	 */
+
+	toi_move_module_tail(this_module);
+
+	this_module->enabled = header->enabled;
+
+out:
+	toi_free_page(23, (unsigned long) buffer);
+	return 0;
+}
+
+/**
+ * read_module_configs - reload module configurations from the image header.
+ *
+ * Returns: Int
+ *	Zero on success or an error code.
+ **/
+static int read_module_configs(void)
+{
+	int result = 0;
+	struct toi_module_header toi_module_header;
+	struct toi_module_ops *this_module;
+
+	/* All modules are initially disabled. That way, if we have a module
+	 * loaded now that wasn't loaded when we hibernated, it won't be used
+	 * in trying to read the data.
+	 */
+	list_for_each_entry(this_module, &toi_modules, module_list)
+		this_module->enabled = 0;
+
+	/* Get the first module header */
+	result = toiActiveAllocator->rw_header_chunk(READ, NULL,
+			(char *) &toi_module_header,
+			sizeof(toi_module_header));
+	if (result) {
+		printk(KERN_ERR "Failed to read the next module header.\n");
+		return -EINVAL;
+	}
+
+	/* For each module (in registration order) */
+	while (toi_module_header.name[0]) {
+		result = read_one_module_config(&toi_module_header);
+
+		if (result)
+			return -EINVAL;
+
+		/* Get the next module header */
+		result = toiActiveAllocator->rw_header_chunk(READ, NULL,
+				(char *) &toi_module_header,
+				sizeof(toi_module_header));
+
+		if (result) {
+			printk(KERN_ERR "Failed to read the next module "
+					"header.\n");
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
+static inline int save_fs_info(struct fs_info *fs, struct block_device *bdev)
+{
+	return (!fs || IS_ERR(fs) || !fs->last_mount_size) ? 0 : 1;
+}
+
+int fs_info_space_needed(void)
+{
+	const struct super_block *sb;
+	int result = sizeof(int);
+
+	list_for_each_entry(sb, &super_blocks, s_list) {
+		struct fs_info *fs;
+
+		if (!sb->s_bdev)
+			continue;
+
+		fs = fs_info_from_block_dev(sb->s_bdev);
+		if (save_fs_info(fs, sb->s_bdev))
+			result += 16 + sizeof(dev_t) + sizeof(int) +
+				fs->last_mount_size;
+		free_fs_info(fs);
+	}
+	return result;
+}
+
+static int fs_info_num_to_save(void)
+{
+	const struct super_block *sb;
+	int to_save = 0;
+
+	list_for_each_entry(sb, &super_blocks, s_list) {
+		struct fs_info *fs;
+
+		if (!sb->s_bdev)
+			continue;
+
+		fs = fs_info_from_block_dev(sb->s_bdev);
+		if (save_fs_info(fs, sb->s_bdev))
+			to_save++;
+		free_fs_info(fs);
+	}
+
+	return to_save;
+}
+
+static int fs_info_save(void)
+{
+	const struct super_block *sb;
+	int to_save = fs_info_num_to_save();
+
+	if (toiActiveAllocator->rw_header_chunk(WRITE, NULL, (char *) &to_save,
+				sizeof(int))) {
+		abort_hibernate(TOI_FAILED_IO, "Failed to write num fs_info"
+				" to save.");
+		return -EIO;
+	}
+
+	list_for_each_entry(sb, &super_blocks, s_list) {
+		struct fs_info *fs;
+
+		if (!sb->s_bdev)
+			continue;
+
+		fs = fs_info_from_block_dev(sb->s_bdev);
+		if (save_fs_info(fs, sb->s_bdev)) {
+			if (toiActiveAllocator->rw_header_chunk(WRITE, NULL,
+					&fs->uuid[0], 16)) {
+				abort_hibernate(TOI_FAILED_IO, "Failed to "
+						"write uuid.");
+				return -EIO;
+			}
+			if (toiActiveAllocator->rw_header_chunk(WRITE, NULL,
+					(char *) &fs->dev_t, sizeof(dev_t))) {
+				abort_hibernate(TOI_FAILED_IO, "Failed to "
+						"write dev_t.");
+				return -EIO;
+			}
+			if (toiActiveAllocator->rw_header_chunk(WRITE, NULL,
+					(char *) &fs->last_mount_size, sizeof(int))) {
+				abort_hibernate(TOI_FAILED_IO, "Failed to "
+						"write last mount length.");
+				return -EIO;
+			}
+			if (toiActiveAllocator->rw_header_chunk(WRITE, NULL,
+					fs->last_mount, fs->last_mount_size)) {
+				abort_hibernate(TOI_FAILED_IO, "Failed to "
+						"write last mount.");
+				return -EIO;
+			}
+		}
+		free_fs_info(fs);
+	}
+	return 0;
+}
+
+static int fs_info_load_and_check_one(void)
+{
+	char uuid[16], *last_mount;
+	int result = 0, ln;
+	dev_t dev_t;
+	struct block_device *dev;
+	struct fs_info *fs_info, seek;
+
+	if (toiActiveAllocator->rw_header_chunk(READ, NULL, uuid, 16)) {
+		abort_hibernate(TOI_FAILED_IO, "Failed to read uuid.");
+		return -EIO;
+	}
+
+	read_if_version(3, dev_t, "uuid dev_t field", return -EIO);
+
+	if (toiActiveAllocator->rw_header_chunk(READ, NULL, (char *) &ln,
+				sizeof(int))) {
+		abort_hibernate(TOI_FAILED_IO,
+				"Failed to read last mount size.");
+		return -EIO;
+	}
+
+	last_mount = kzalloc(ln, GFP_KERNEL);
+
+	if (!last_mount)
+		return -ENOMEM;
+
+	if (toiActiveAllocator->rw_header_chunk(READ, NULL, last_mount,	ln)) {
+		abort_hibernate(TOI_FAILED_IO,
+				"Failed to read last mount timestamp.");
+		result = -EIO;
+		goto out_lmt;
+	}
+
+	memcpy((char *) &seek.uuid, uuid, 16);
+	seek.dev_t = dev_t;
+	seek.last_mount_size = ln;
+	seek.last_mount = last_mount;
+	dev_t = blk_lookup_fs_info(&seek);
+	if (!dev_t)
+		goto out_lmt;
+
+	dev = toi_open_by_devnum(dev_t);
+
+	fs_info = fs_info_from_block_dev(dev);
+	if (fs_info && !IS_ERR(fs_info)) {
+		if (ln != fs_info->last_mount_size) {
+			printk(KERN_EMERG "Found matching uuid but last mount "
+					"time lengths differ?! "
+					"(%d vs %d).\n", ln,
+					fs_info->last_mount_size);
+			result = -EINVAL;
+		} else {
+			char buf[BDEVNAME_SIZE];
+			result = !!memcmp(fs_info->last_mount, last_mount, ln);
+			if (result)
+				printk(KERN_EMERG "Last mount time for %s has "
+					"changed!\n", bdevname(dev, buf));
+		}
+	}
+	toi_close_bdev(dev);
+	free_fs_info(fs_info);
+out_lmt:
+	kfree(last_mount);
+	return result;
+}
+
+static int fs_info_load_and_check(void)
+{
+	int to_do, result = 0;
+
+	if (toiActiveAllocator->rw_header_chunk(READ, NULL, (char *) &to_do,
+				sizeof(int))) {
+		abort_hibernate(TOI_FAILED_IO, "Failed to read num fs_info "
+				"to load.");
+		return -EIO;
+	}
+
+	while (to_do--)
+		result |= fs_info_load_and_check_one();
+
+	return result;
+}
+
+/**
+ * write_image_header - write the image header after writing the image proper
+ *
+ * Returns: Int
+ *	Zero on success, error value otherwise.
+ **/
+int write_image_header(void)
+{
+	int ret;
+	int total = pagedir1.size + pagedir2.size + 2;
+	char *header_buffer = NULL;
+
+	/* Now prepare to write the header */
+	ret = toiActiveAllocator->write_header_init();
+	if (ret) {
+		abort_hibernate(TOI_FAILED_MODULE_INIT,
+				"Active allocator's write_header_init"
+				" function failed.");
+		goto write_image_header_abort;
+	}
+
+	/* Get a buffer */
+	header_buffer = (char *) toi_get_zeroed_page(24, TOI_ATOMIC_GFP);
+	if (!header_buffer) {
+		abort_hibernate(TOI_OUT_OF_MEMORY,
+			"Out of memory when trying to get page for header!");
+		goto write_image_header_abort;
+	}
+
+	/* Write hibernate header */
+	if (fill_toi_header((struct toi_header *) header_buffer)) {
+		abort_hibernate(TOI_OUT_OF_MEMORY,
+			"Failure to fill header information!");
+		goto write_image_header_abort;
+	}
+
+	if (toiActiveAllocator->rw_header_chunk(WRITE, NULL,
+			header_buffer, sizeof(struct toi_header))) {
+		abort_hibernate(TOI_OUT_OF_MEMORY,
+			"Failure to write header info.");
+		goto write_image_header_abort;
+	}
+
+	if (toiActiveAllocator->rw_header_chunk(WRITE, NULL,
+			(char *) &toi_max_workers, sizeof(toi_max_workers))) {
+		abort_hibernate(TOI_OUT_OF_MEMORY,
+			"Failure to write number of workers to use.");
+		goto write_image_header_abort;
+	}
+
+	/* Write filesystem info */
+	if (fs_info_save())
+		goto write_image_header_abort;
+
+	/* Write module configurations */
+	ret = write_module_configs();
+	if (ret) {
+		abort_hibernate(TOI_FAILED_IO,
+				"Failed to write module configs.");
+		goto write_image_header_abort;
+	}
+
+	if (memory_bm_write(pageset1_map,
+				toiActiveAllocator->rw_header_chunk)) {
+		abort_hibernate(TOI_FAILED_IO,
+				"Failed to write bitmaps.");
+		goto write_image_header_abort;
+	}
+
+	/* Flush data and let allocator cleanup */
+	if (toiActiveAllocator->write_header_cleanup()) {
+		abort_hibernate(TOI_FAILED_IO,
+				"Failed to cleanup writing header.");
+		goto write_image_header_abort_no_cleanup;
+	}
+
+	if (test_result_state(TOI_ABORTED))
+		goto write_image_header_abort_no_cleanup;
+
+	toi_update_status(total, total, NULL);
+
+out:
+	if (header_buffer)
+		toi_free_page(24, (unsigned long) header_buffer);
+	return ret;
+
+write_image_header_abort:
+	toiActiveAllocator->write_header_cleanup();
+write_image_header_abort_no_cleanup:
+	ret = -1;
+	goto out;
+}
+
+/**
+ * sanity_check - check the header
+ * @sh:	the header which was saved at hibernate time.
+ *
+ * Perform a few checks, seeking to ensure that the kernel being
+ * booted matches the one hibernated. They need to match so we can
+ * be _sure_ things will work. It is not absolutely impossible for
+ * resuming from a different kernel to work, just not assured.
+ **/
+static char *sanity_check(struct toi_header *sh)
+{
+	char *reason = check_image_kernel((struct swsusp_info *) sh);
+
+	if (reason)
+		return reason;
+
+	if (!test_action_state(TOI_IGNORE_ROOTFS)) {
+		const struct super_block *sb;
+		list_for_each_entry(sb, &super_blocks, s_list) {
+			if ((!(sb->s_flags & MS_RDONLY)) &&
+			    (sb->s_type->fs_flags & FS_REQUIRES_DEV))
+				return "Device backed fs has been mounted "
+					"rw prior to resume or initrd/ramfs "
+					"is mounted rw.";
+		}
+	}
+
+	return NULL;
+}
+
+static DECLARE_WAIT_QUEUE_HEAD(freeze_wait);
+
+#define FREEZE_IN_PROGRESS (~0)
+
+static int freeze_result;
+
+static void do_freeze(struct work_struct *dummy)
+{
+	freeze_result = freeze_processes();
+	wake_up(&freeze_wait);
+	trap_non_toi_io = 1;
+}
+
+static DECLARE_WORK(freeze_work, do_freeze);
+
+/**
+ * __read_pageset1 - test for the existence of an image and attempt to load it
+ *
+ * Returns:	Int
+ *	Zero if image found and pageset1 successfully loaded.
+ *	Error if no image found or loaded.
+ **/
+static int __read_pageset1(void)
+{
+	int i, result = 0;
+	char *header_buffer = (char *) toi_get_zeroed_page(25, TOI_ATOMIC_GFP),
+	     *sanity_error = NULL;
+	struct toi_header *toi_header;
+
+	if (!header_buffer) {
+		printk(KERN_INFO "Unable to allocate a page for reading the "
+				"signature.\n");
+		return -ENOMEM;
+	}
+
+	/* Check for an image */
+	result = toiActiveAllocator->image_exists(1);
+	if (result == 3) {
+		result = -ENODATA;
+		toi_early_boot_message(1, 0, "The signature from an older "
+				"version of TuxOnIce has been detected.");
+		goto out_remove_image;
+	}
+
+	if (result != 1) {
+		result = -ENODATA;
+		noresume_reset_modules();
+		printk(KERN_INFO "TuxOnIce: No image found.\n");
+		goto out;
+	}
+
+	/*
+	 * Prepare the active allocator for reading the image header. The
+	 * active allocator might read its own configuration.
+	 *
+	 * NB: This call may never return: if we find a signature for a
+	 * different image, we warn the user and they may choose to reboot.
+	 * This can happen if the device ids look erroneous (2.4 vs 2.6) or
+	 * if the image location is unavailable (e.g. it was stored on a
+	 * network connection).
+	 */
+
+	result = toiActiveAllocator->read_header_init();
+	if (result) {
+		printk(KERN_INFO "TuxOnIce: Failed to initialise, reading the "
+				"image header.\n");
+		goto out_remove_image;
+	}
+
+	/* Check for noresume command line option */
+	if (test_toi_state(TOI_NORESUME_SPECIFIED)) {
+		printk(KERN_INFO "TuxOnIce: Noresume on command line. Removed "
+				"image.\n");
+		goto out_remove_image;
+	}
+
+	/* Check whether we've resumed before */
+	if (test_toi_state(TOI_RESUMED_BEFORE)) {
+		toi_early_boot_message(1, 0, NULL);
+		if (!(test_toi_state(TOI_CONTINUE_REQ))) {
+			printk(KERN_INFO "TuxOnIce: Tried to resume before: "
+					"Invalidated image.\n");
+			goto out_remove_image;
+		}
+	}
+
+	clear_toi_state(TOI_CONTINUE_REQ);
+
+	toi_image_header_version = toiActiveAllocator->get_header_version();
+
+	if (unlikely(toi_image_header_version > TOI_HEADER_VERSION)) {
+		toi_early_boot_message(1, 0, image_version_error);
+		if (!(test_toi_state(TOI_CONTINUE_REQ))) {
+			printk(KERN_INFO "TuxOnIce: Header version too new: "
+					"Invalidated image.\n");
+			goto out_remove_image;
+		}
+	}
+
+	/* Read hibernate header */
+	result = toiActiveAllocator->rw_header_chunk(READ, NULL,
+			header_buffer, sizeof(struct toi_header));
+	if (result < 0) {
+		printk(KERN_ERR "TuxOnIce: Failed to read the image "
+				"signature.\n");
+		goto out_remove_image;
+	}
+
+	toi_header = (struct toi_header *) header_buffer;
+
+	/*
+	 * NB: This call may also result in a reboot rather than returning.
+	 */
+
+	sanity_error = sanity_check(toi_header);
+	if (sanity_error) {
+		toi_early_boot_message(1, TOI_CONTINUE_REQ,
+				sanity_error);
+		printk(KERN_INFO "TuxOnIce: Sanity check failed.\n");
+		goto out_remove_image;
+	}
+
+	/*
+	 * We have an image and it looks like it will load okay.
+	 *
+	 * Get metadata from header. Don't override commandline parameters.
+	 *
+	 * We don't need to save the image size limit because it's not used
+	 * during resume and will be restored with the image anyway.
+	 */
+
+	memcpy((char *) &pagedir1,
+		(char *) &toi_header->pagedir, sizeof(pagedir1));
+	toi_result = toi_header->param0;
+	if (!toi_bkd.toi_debug_state) {
+		toi_bkd.toi_action =
+			(toi_header->param1 & ~toi_bootflags_mask) |
+			(toi_bkd.toi_action & toi_bootflags_mask);
+		toi_bkd.toi_debug_state = toi_header->param2;
+		toi_bkd.toi_default_console_level = toi_header->param3;
+	}
+	clear_toi_state(TOI_IGNORE_LOGLEVEL);
+	pagedir2.size = toi_header->pageset_2_size;
+	for (i = 0; i < 4; i++)
+		toi_bkd.toi_io_time[i/2][i%2] =
+			toi_header->io_time[i/2][i%2];
+
+	set_toi_state(TOI_BOOT_KERNEL);
+	boot_kernel_data_buffer = toi_header->bkd;
+
+	read_if_version(1, toi_max_workers, "TuxOnIce max workers",
+			goto out_remove_image);
+
+	/* Read filesystem info */
+	if (fs_info_load_and_check()) {
+		printk(KERN_EMERG "TuxOnIce: File system mount time checks "
+			"failed. Refusing to corrupt your filesystems!\n");
+		goto out_remove_image;
+	}
+
+	/* Read module configurations */
+	result = read_module_configs();
+	if (result) {
+		pagedir1.size = 0;
+		pagedir2.size = 0;
+		printk(KERN_INFO "TuxOnIce: Failed to read TuxOnIce module "
+				"configurations.\n");
+		clear_action_state(TOI_KEEP_IMAGE);
+		goto out_remove_image;
+	}
+
+	toi_prepare_console();
+
+	set_toi_state(TOI_NOW_RESUMING);
+
+	if (!test_action_state(TOI_LATE_CPU_HOTPLUG)) {
+		toi_prepare_status(DONT_CLEAR_BAR, "Disable nonboot cpus.");
+		if (disable_nonboot_cpus()) {
+			set_abort_result(TOI_CPU_HOTPLUG_FAILED);
+			goto out_reset_console;
+		}
+	}
+
+	result = pm_notifier_call_chain(PM_RESTORE_PREPARE);
+	if (result)
+		goto out_notifier_call_chain;
+
+	if (usermodehelper_disable())
+		goto out_enable_nonboot_cpus;
+
+	current->flags |= PF_NOFREEZE;
+	freeze_result = FREEZE_IN_PROGRESS;
+
+	schedule_work_on(cpumask_first(cpu_online_mask), &freeze_work);
+
+	toi_cond_pause(1, "About to read original pageset1 locations.");
+
+	/*
+	 * See _toi_rw_header_chunk in tuxonice_bio.c:
+	 * Initialize pageset1_map by reading the map from the image.
+	 */
+	if (memory_bm_read(pageset1_map, toiActiveAllocator->rw_header_chunk))
+		goto out_thaw;
+
+	/*
+	 * See toi_rw_cleanup in tuxonice_bio.c:
+	 * Clean up after reading the header.
+	 */
+	result = toiActiveAllocator->read_header_cleanup();
+	if (result) {
+		printk(KERN_ERR "TuxOnIce: Failed to cleanup after reading the "
+				"image header.\n");
+		goto out_thaw;
+	}
+
+	toi_cond_pause(1, "About to read pagedir.");
+
+	/*
+	 * Get the addresses of pages into which we will load the kernel to
+	 * be copied back and check if they conflict with the ones we are using.
+	 */
+	if (toi_get_pageset1_load_addresses()) {
+		printk(KERN_INFO "TuxOnIce: Failed to get load addresses for "
+				"pageset1.\n");
+		goto out_thaw;
+	}
+
+	/* Read the original kernel back */
+	toi_cond_pause(1, "About to read pageset 1.");
+
+	/* Given the pagemap, read back the data from disk */
+	if (read_pageset(&pagedir1, 0)) {
+		toi_prepare_status(DONT_CLEAR_BAR, "Failed to read pageset 1.");
+		result = -EIO;
+		goto out_thaw;
+	}
+
+	toi_cond_pause(1, "About to restore original kernel.");
+	result = 0;
+
+	if (!test_action_state(TOI_KEEP_IMAGE) &&
+	    toiActiveAllocator->mark_resume_attempted)
+		toiActiveAllocator->mark_resume_attempted(1);
+
+	wait_event(freeze_wait, freeze_result != FREEZE_IN_PROGRESS);
+out:
+	current->flags &= ~PF_NOFREEZE;
+	toi_free_page(25, (unsigned long) header_buffer);
+	return result;
+
+out_thaw:
+	wait_event(freeze_wait, freeze_result != FREEZE_IN_PROGRESS);
+	trap_non_toi_io = 0;
+	thaw_processes();
+	usermodehelper_enable();
+out_enable_nonboot_cpus:
+	enable_nonboot_cpus();
+out_notifier_call_chain:
+	pm_notifier_call_chain(PM_POST_RESTORE);
+out_reset_console:
+	toi_cleanup_console();
+out_remove_image:
+	result = -EINVAL;
+	if (!test_action_state(TOI_KEEP_IMAGE))
+		toiActiveAllocator->remove_image();
+	toiActiveAllocator->read_header_cleanup();
+	noresume_reset_modules();
+	goto out;
+}
+
+/**
+ * read_pageset1 - high-level function to read the saved pages
+ *
+ * Attempt to read the header and pageset1 of a hibernate image.
+ * Handle the outcome, complaining where appropriate.
+ **/
+int read_pageset1(void)
+{
+	int error;
+
+	error = __read_pageset1();
+
+	if (error && error != -ENODATA && error != -EINVAL &&
+					!test_result_state(TOI_ABORTED))
+		abort_hibernate(TOI_IMAGE_ERROR,
+			"TuxOnIce: Error %d resuming\n", error);
+
+	return error;
+}
+
+/**
+ * get_have_image_data - check the image header
+ **/
+static char *get_have_image_data(void)
+{
+	char *output_buffer = (char *) toi_get_zeroed_page(26, TOI_ATOMIC_GFP);
+	struct toi_header *toi_header;
+
+	if (!output_buffer) {
+		printk(KERN_INFO "Output buffer null.\n");
+		return NULL;
+	}
+
+	/* Check for an image */
+	if (!toiActiveAllocator->image_exists(1) ||
+	    toiActiveAllocator->read_header_init() ||
+	    toiActiveAllocator->rw_header_chunk(READ, NULL,
+			output_buffer, sizeof(struct toi_header))) {
+		sprintf(output_buffer, "0\n");
+		/*
+		 * From an initrd/ramfs, catting have_image and
+		 * getting a result of 0 is sufficient.
+		 */
+		clear_toi_state(TOI_BOOT_TIME);
+		goto out;
+	}
+
+	toi_header = (struct toi_header *) output_buffer;
+
+	sprintf(output_buffer, "1\n%s\n%s\n",
+			toi_header->uts.machine,
+			toi_header->uts.version);
+
+	/* Check whether we've resumed before */
+	if (test_toi_state(TOI_RESUMED_BEFORE))
+		strcat(output_buffer, "Resumed before.\n");
+
+out:
+	noresume_reset_modules();
+	return output_buffer;
+}
+
+/**
+ * read_pageset2 - read second part of the image
+ * @overwrittenpagesonly:	Read only pages which would have been
+ *				overwritten by pageset1?
+ *
+ * Read in part or all of pageset2 of an image, depending upon
+ * whether we are hibernating and have only overwritten a portion
+ * with pageset1 pages, or are resuming and need to read them
+ * all.
+ *
+ * Returns: Int
+ *	Zero if no error, otherwise the error value.
+ **/
+int read_pageset2(int overwrittenpagesonly)
+{
+	int result = 0;
+
+	if (!pagedir2.size)
+		return 0;
+
+	result = read_pageset(&pagedir2, overwrittenpagesonly);
+
+	toi_cond_pause(1, "Pagedir 2 read.");
+
+	return result;
+}
+
+/**
+ * image_exists_read - has an image been found?
+ * @page:	Output buffer
+ *
+ * Store 0 or 1 in page, depending on whether an image is found.
+ * Incoming buffer is PAGE_SIZE and result is guaranteed
+ * to be far less than that, so we don't worry about
+ * overflow.
+ **/
+int image_exists_read(const char *page, int count)
+{
+	int len = 0;
+	char *result;
+
+	if (toi_activate_storage(0))
+		return count;
+
+	if (!test_toi_state(TOI_RESUME_DEVICE_OK))
+		toi_attempt_to_parse_resume_device(0);
+
+	if (!toiActiveAllocator) {
+		len = sprintf((char *) page, "-1\n");
+	} else {
+		result = get_have_image_data();
+		if (result) {
+			len = sprintf((char *) page, "%s", result);
+			toi_free_page(26, (unsigned long) result);
+		}
+	}
+
+	toi_deactivate_storage(0);
+
+	return len;
+}
+
+/**
+ * image_exists_write - invalidate an image if one exists
+ **/
+int image_exists_write(const char *buffer, int count)
+{
+	if (toi_activate_storage(0))
+		return count;
+
+	if (toiActiveAllocator && toiActiveAllocator->image_exists(1))
+		toiActiveAllocator->remove_image();
+
+	toi_deactivate_storage(0);
+
+	clear_result_state(TOI_KEPT_IMAGE);
+
+	return count;
+}
diff --git a/kernel/power/tuxonice_io.h b/kernel/power/tuxonice_io.h
new file mode 100644
index 0000000..fe37713
--- /dev/null
+++ b/kernel/power/tuxonice_io.h
@@ -0,0 +1,74 @@
+/*
+ * kernel/power/tuxonice_io.h
+ *
+ * Copyright (C) 2005-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * It contains high level IO routines for hibernating.
+ *
+ */
+
+#include <linux/utsname.h>
+#include "tuxonice_pagedir.h"
+
+/* Non-module data saved in our image header */
+struct toi_header {
+	/*
+	 * Mirror struct swsusp_info, but without
+	 * the page aligned attribute
+	 */
+	struct new_utsname uts;
+	u32 version_code;
+	unsigned long num_physpages;
+	int cpus;
+	unsigned long image_pages;
+	unsigned long pages;
+	unsigned long size;
+
+	/* Our own data */
+	unsigned long orig_mem_free;
+	int page_size;
+	int pageset_2_size;
+	int param0;
+	int param1;
+	int param2;
+	int param3;
+	int progress0;
+	int progress1;
+	int progress2;
+	int progress3;
+	int io_time[2][2];
+	struct pagedir pagedir;
+	dev_t root_fs;
+	unsigned long bkd; /* Boot kernel data locn */
+};
+
+extern int write_pageset(struct pagedir *pagedir);
+extern int write_image_header(void);
+extern int read_pageset1(void);
+extern int read_pageset2(int overwrittenpagesonly);
+
+extern int toi_attempt_to_parse_resume_device(int quiet);
+extern void attempt_to_parse_resume_device2(void);
+extern void attempt_to_parse_alt_resume_param(void);
+int image_exists_read(const char *page, int count);
+int image_exists_write(const char *buffer, int count);
+extern void save_restore_alt_param(int replace, int quiet);
+extern atomic_t toi_io_workers;
+
+/* Args to save_restore_alt_param */
+#define RESTORE 0
+#define SAVE 1
+
+#define NOQUIET 0
+#define QUIET 1
+
+extern dev_t name_to_dev_t(char *line);
+
+extern wait_queue_head_t toi_io_queue_flusher;
+extern int toi_bio_queue_flusher_should_finish;
+
+int fs_info_space_needed(void);
+
+extern int toi_max_workers;
diff --git a/kernel/power/tuxonice_modules.c b/kernel/power/tuxonice_modules.c
new file mode 100644
index 0000000..4cc24a9
--- /dev/null
+++ b/kernel/power/tuxonice_modules.c
@@ -0,0 +1,522 @@
+/*
+ * kernel/power/tuxonice_modules.c
+ *
+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ */
+
+#include <linux/suspend.h>
+#include "tuxonice.h"
+#include "tuxonice_modules.h"
+#include "tuxonice_sysfs.h"
+#include "tuxonice_ui.h"
+
+LIST_HEAD(toi_filters);
+LIST_HEAD(toiAllocators);
+
+LIST_HEAD(toi_modules);
+EXPORT_SYMBOL_GPL(toi_modules);
+
+struct toi_module_ops *toiActiveAllocator;
+EXPORT_SYMBOL_GPL(toiActiveAllocator);
+
+static int toi_num_filters;
+int toiNumAllocators, toi_num_modules;
+
+/*
+ * toi_header_storage_for_modules
+ *
+ * Returns the amount of space needed to store configuration
+ * data needed by the modules prior to copying back the original
+ * kernel. We can exclude data for pageset2 because it will be
+ * available anyway once the kernel is copied back.
+ */
+long toi_header_storage_for_modules(void)
+{
+	struct toi_module_ops *this_module;
+	int bytes = 0;
+
+	list_for_each_entry(this_module, &toi_modules, module_list) {
+		if (!this_module->enabled ||
+		    (this_module->type == WRITER_MODULE &&
+		     toiActiveAllocator != this_module))
+			continue;
+		if (this_module->storage_needed) {
+			int this = this_module->storage_needed() +
+				sizeof(struct toi_module_header) +
+				sizeof(int);
+			this_module->header_requested = this;
+			bytes += this;
+		}
+	}
+
+	/* One more for the empty terminator */
+	return bytes + sizeof(struct toi_module_header);
+}
+
+void print_toi_header_storage_for_modules(void)
+{
+	struct toi_module_ops *this_module;
+	int bytes = 0;
+
+	printk(KERN_DEBUG "Header storage:\n");
+	list_for_each_entry(this_module, &toi_modules, module_list) {
+		if (!this_module->enabled ||
+		    (this_module->type == WRITER_MODULE &&
+		     toiActiveAllocator != this_module))
+			continue;
+		if (this_module->storage_needed) {
+			int this = this_module->storage_needed() +
+				sizeof(struct toi_module_header) +
+				sizeof(int);
+			this_module->header_requested = this;
+			bytes += this;
+			printk(KERN_DEBUG "+ %16s : %-4d/%d.\n",
+					this_module->name,
+					this_module->header_used, this);
+		}
+	}
+
+	printk(KERN_DEBUG "+ empty terminator : %zu.\n",
+			sizeof(struct toi_module_header));
+	printk(KERN_DEBUG "                     ====\n");
+	printk(KERN_DEBUG "                     %zu\n",
+			bytes + sizeof(struct toi_module_header));
+}
+EXPORT_SYMBOL_GPL(print_toi_header_storage_for_modules);
+
+/*
+ * toi_memory_for_modules
+ *
+ * Returns the amount of memory requested by modules for
+ * doing their work during the cycle.
+ */
+
+long toi_memory_for_modules(int print_parts)
+{
+	long bytes = 0, result;
+	struct toi_module_ops *this_module;
+
+	if (print_parts)
+		printk(KERN_INFO "Memory for modules:\n===================\n");
+	list_for_each_entry(this_module, &toi_modules, module_list) {
+		int this;
+		if (!this_module->enabled)
+			continue;
+		if (this_module->memory_needed) {
+			this = this_module->memory_needed();
+			if (print_parts)
+				printk(KERN_INFO "%10d bytes (%5ld pages) for "
+						"module '%s'.\n", this,
+						DIV_ROUND_UP(this, PAGE_SIZE),
+						this_module->name);
+			bytes += this;
+		}
+	}
+
+	result = DIV_ROUND_UP(bytes, PAGE_SIZE);
+	if (print_parts)
+		printk(KERN_INFO " => %ld bytes, %ld pages.\n", bytes, result);
+
+	return result;
+}
+
+/*
+ * toi_expected_compression_ratio
+ *
+ * Returns the compression ratio expected when saving the image.
+ */
+
+int toi_expected_compression_ratio(void)
+{
+	int ratio = 100;
+	struct toi_module_ops *this_module;
+
+	list_for_each_entry(this_module, &toi_modules, module_list) {
+		if (!this_module->enabled)
+			continue;
+		if (this_module->expected_compression)
+			ratio = ratio * this_module->expected_compression()
+				/ 100;
+	}
+
+	return ratio;
+}
+
+/* toi_find_module_given_dir
+ * Functionality :	Return a module (if found), given a pointer
+ * 			to its directory name
+ */
+
+static struct toi_module_ops *toi_find_module_given_dir(char *name)
+{
+	struct toi_module_ops *this_module, *found_module = NULL;
+
+	list_for_each_entry(this_module, &toi_modules, module_list) {
+		if (!strcmp(name, this_module->directory)) {
+			found_module = this_module;
+			break;
+		}
+	}
+
+	return found_module;
+}
+
+/* toi_find_module_given_name
+ * Functionality :	Return a module (if found), given a pointer
+ * 			to its name
+ */
+
+struct toi_module_ops *toi_find_module_given_name(char *name)
+{
+	struct toi_module_ops *this_module, *found_module = NULL;
+
+	list_for_each_entry(this_module, &toi_modules, module_list) {
+		if (!strcmp(name, this_module->name)) {
+			found_module = this_module;
+			break;
+		}
+	}
+
+	return found_module;
+}
+
+/*
+ * toi_print_module_debug_info
+ * Functionality   : Get debugging info from modules into a buffer.
+ */
+int toi_print_module_debug_info(char *buffer, int buffer_size)
+{
+	struct toi_module_ops *this_module;
+	int len = 0;
+
+	list_for_each_entry(this_module, &toi_modules, module_list) {
+		if (!this_module->enabled)
+			continue;
+		if (this_module->print_debug_info) {
+			int result;
+			result = this_module->print_debug_info(buffer + len,
+					buffer_size - len);
+			len += result;
+		}
+	}
+
+	/* Ensure null terminated */
+	buffer[buffer_size - 1] = 0;
+
+	return len;
+}
+
+/*
+ * toi_register_module
+ *
+ * Register a module.
+ */
+int toi_register_module(struct toi_module_ops *module)
+{
+	int i;
+	struct kobject *kobj;
+
+	module->enabled = 1;
+
+	if (toi_find_module_given_name(module->name)) {
+		printk(KERN_INFO "TuxOnIce: Trying to load module %s,"
+				" which is already registered.\n",
+				module->name);
+		return -EBUSY;
+	}
+
+	switch (module->type) {
+	case FILTER_MODULE:
+		list_add_tail(&module->type_list, &toi_filters);
+		toi_num_filters++;
+		break;
+	case WRITER_MODULE:
+		list_add_tail(&module->type_list, &toiAllocators);
+		toiNumAllocators++;
+		break;
+	case MISC_MODULE:
+	case MISC_HIDDEN_MODULE:
+	case BIO_ALLOCATOR_MODULE:
+		break;
+	default:
+		printk(KERN_ERR "Hmmm. Module '%s' has an invalid type."
+			" It has been ignored.\n", module->name);
+		return -EINVAL;
+	}
+	list_add_tail(&module->module_list, &toi_modules);
+	toi_num_modules++;
+
+	if ((!module->directory && !module->shared_directory) ||
+			!module->sysfs_data || !module->num_sysfs_entries)
+		return 0;
+
+	/*
+	 * Modules may share a directory, but those with shared_dir
+	 * set must be loaded (via symbol dependencies) after parents
+	 * and unloaded beforehand.
+	 */
+	if (module->shared_directory) {
+		struct toi_module_ops *shared =
+			toi_find_module_given_dir(module->shared_directory);
+		if (!shared) {
+			printk(KERN_ERR "TuxOnIce: Module %s wants to share "
+					"%s's directory but %s isn't loaded.\n",
+					module->name, module->shared_directory,
+					module->shared_directory);
+			toi_unregister_module(module);
+			return -ENODEV;
+		}
+		kobj = shared->dir_kobj;
+	} else {
+		if (!strncmp(module->directory, "[ROOT]", 6))
+			kobj = tuxonice_kobj;
+		else
+			kobj = make_toi_sysdir(module->directory);
+	}
+	module->dir_kobj = kobj;
+	for (i = 0; i < module->num_sysfs_entries; i++) {
+		int result = toi_register_sysfs_file(kobj,
+				&module->sysfs_data[i]);
+		if (result)
+			return result;
+	}
+	return 0;
+}
+EXPORT_SYMBOL_GPL(toi_register_module);
+
+/*
+ * toi_unregister_module
+ *
+ * Remove a module.
+ */
+void toi_unregister_module(struct toi_module_ops *module)
+{
+	int i;
+
+	if (module->dir_kobj)
+		for (i = 0; i < module->num_sysfs_entries; i++)
+			toi_unregister_sysfs_file(module->dir_kobj,
+					&module->sysfs_data[i]);
+
+	if (!module->shared_directory && module->directory &&
+			strncmp(module->directory, "[ROOT]", 6))
+		remove_toi_sysdir(module->dir_kobj);
+
+	switch (module->type) {
+	case FILTER_MODULE:
+		list_del(&module->type_list);
+		toi_num_filters--;
+		break;
+	case WRITER_MODULE:
+		list_del(&module->type_list);
+		toiNumAllocators--;
+		if (toiActiveAllocator == module) {
+			toiActiveAllocator = NULL;
+			clear_toi_state(TOI_CAN_RESUME);
+			clear_toi_state(TOI_CAN_HIBERNATE);
+		}
+		break;
+	case MISC_MODULE:
+	case MISC_HIDDEN_MODULE:
+	case BIO_ALLOCATOR_MODULE:
+		break;
+	default:
+		printk(KERN_ERR "Module '%s' has an invalid type."
+			" It has been ignored.\n", module->name);
+		return;
+	}
+	list_del(&module->module_list);
+	toi_num_modules--;
+}
+EXPORT_SYMBOL_GPL(toi_unregister_module);
+
+/*
+ * toi_move_module_tail
+ *
+ * Rearrange modules when reloading the config.
+ */
+void toi_move_module_tail(struct toi_module_ops *module)
+{
+	switch (module->type) {
+	case FILTER_MODULE:
+		if (toi_num_filters > 1)
+			list_move_tail(&module->type_list, &toi_filters);
+		break;
+	case WRITER_MODULE:
+		if (toiNumAllocators > 1)
+			list_move_tail(&module->type_list, &toiAllocators);
+		break;
+	case MISC_MODULE:
+	case MISC_HIDDEN_MODULE:
+	case BIO_ALLOCATOR_MODULE:
+		break;
+	default:
+		printk(KERN_ERR "Module '%s' has an invalid type."
+			" It has been ignored.\n", module->name);
+		return;
+	}
+	if ((toi_num_filters + toiNumAllocators) > 1)
+		list_move_tail(&module->module_list, &toi_modules);
+}
+
+/*
+ * toi_initialise_modules
+ *
+ * Initialise the enabled early or late modules for a cycle.
+ */
+int toi_initialise_modules(int starting_cycle, int early)
+{
+	struct toi_module_ops *this_module;
+	int result;
+
+	list_for_each_entry(this_module, &toi_modules, module_list) {
+		this_module->header_requested = 0;
+		this_module->header_used = 0;
+		if (!this_module->enabled)
+			continue;
+		if (this_module->early != early)
+			continue;
+		if (this_module->initialise) {
+			result = this_module->initialise(starting_cycle);
+			if (result) {
+				toi_cleanup_modules(starting_cycle);
+				return result;
+			}
+			this_module->initialised = 1;
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * toi_cleanup_modules
+ *
+ * Tell modules the work is done.
+ */
+void toi_cleanup_modules(int finishing_cycle)
+{
+	struct toi_module_ops *this_module;
+
+	list_for_each_entry(this_module, &toi_modules, module_list) {
+		if (!this_module->enabled || !this_module->initialised)
+			continue;
+		if (this_module->cleanup)
+			this_module->cleanup(finishing_cycle);
+		this_module->initialised = 0;
+	}
+}
+
+/*
+ * toi_pre_atomic_restore_modules
+ *
+ * Call each enabled module's pre_atomic_restore hook.
+ */
+void toi_pre_atomic_restore_modules(struct toi_boot_kernel_data *bkd)
+{
+	struct toi_module_ops *this_module;
+
+	list_for_each_entry(this_module, &toi_modules, module_list) {
+		if (this_module->enabled && this_module->pre_atomic_restore)
+			this_module->pre_atomic_restore(bkd);
+	}
+}
+
+/*
+ * toi_post_atomic_restore_modules
+ *
+ * Call each enabled module's post_atomic_restore hook.
+ */
+void toi_post_atomic_restore_modules(struct toi_boot_kernel_data *bkd)
+{
+	struct toi_module_ops *this_module;
+
+	list_for_each_entry(this_module, &toi_modules, module_list) {
+		if (this_module->enabled && this_module->post_atomic_restore)
+			this_module->post_atomic_restore(bkd);
+	}
+}
+
+/*
+ * toi_get_next_filter
+ *
+ * Get the next filter in the pipeline.
+ */
+struct toi_module_ops *toi_get_next_filter(struct toi_module_ops *filter_sought)
+{
+	struct toi_module_ops *last_filter = NULL, *this_filter = NULL;
+
+	list_for_each_entry(this_filter, &toi_filters, type_list) {
+		if (!this_filter->enabled)
+			continue;
+		if ((last_filter == filter_sought) || (!filter_sought))
+			return this_filter;
+		last_filter = this_filter;
+	}
+
+	return toiActiveAllocator;
+}
+EXPORT_SYMBOL_GPL(toi_get_next_filter);
+
+/**
+ * toi_print_modules - Print which modules are loaded.
+ */
+void toi_print_modules(void)
+{
+	struct toi_module_ops *this_module;
+	int prev = 0;
+
+	printk(KERN_INFO "TuxOnIce " TOI_CORE_VERSION ", with support for");
+
+	list_for_each_entry(this_module, &toi_modules, module_list) {
+		if (this_module->type == MISC_HIDDEN_MODULE)
+			continue;
+		printk("%s %s%s%s", prev ? "," : "",
+				this_module->enabled ? "" : "[",
+				this_module->name,
+				this_module->enabled ? "" : "]");
+		prev = 1;
+	}
+
+	printk(".\n");
+}
+
+/* toi_get_modules
+ *
+ * Take a reference to modules so they can't go away under us.
+ */
+
+int toi_get_modules(void)
+{
+	struct toi_module_ops *this_module;
+
+	list_for_each_entry(this_module, &toi_modules, module_list) {
+		struct toi_module_ops *this_module2;
+
+		if (try_module_get(this_module->module))
+			continue;
+
+		/* Failed! Reverse gets and return error */
+		list_for_each_entry(this_module2, &toi_modules,
+				module_list) {
+			if (this_module == this_module2)
+				return -EINVAL;
+			module_put(this_module2->module);
+		}
+	}
+	return 0;
+}
+
+/* toi_put_modules
+ *
+ * Release our references to modules we used.
+ */
+
+void toi_put_modules(void)
+{
+	struct toi_module_ops *this_module;
+
+	list_for_each_entry(this_module, &toi_modules, module_list)
+		module_put(this_module->module);
+}
diff --git a/kernel/power/tuxonice_modules.h b/kernel/power/tuxonice_modules.h
new file mode 100644
index 0000000..bf5d749
--- /dev/null
+++ b/kernel/power/tuxonice_modules.h
@@ -0,0 +1,211 @@
+/*
+ * kernel/power/tuxonice_modules.h
+ *
+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * It contains declarations for modules. Modules are additions to
+ * TuxOnIce that provide facilities such as image compression or
+ * encryption, backends for storage of the image and user interfaces.
+ *
+ */
+
+#ifndef TOI_MODULES_H
+#define TOI_MODULES_H
+
+/* This is the maximum size we store in the image header for a module name */
+#define TOI_MAX_MODULE_NAME_LENGTH 30
+
+struct toi_boot_kernel_data;
+
+/* Per-module metadata */
+struct toi_module_header {
+	char name[TOI_MAX_MODULE_NAME_LENGTH];
+	int enabled;
+	int type;
+	int index;
+	int data_length;
+	unsigned long signature;
+};
+
+enum {
+	FILTER_MODULE,
+	WRITER_MODULE,
+	BIO_ALLOCATOR_MODULE,
+	MISC_MODULE,
+	MISC_HIDDEN_MODULE,
+};
+
+enum {
+	TOI_ASYNC,
+	TOI_SYNC
+};
+
+enum {
+	TOI_VIRT,
+	TOI_PAGE,
+};
+
+#define TOI_MAP(type, addr) \
+ ((type) == TOI_PAGE ? kmap(addr) : (addr))
+
+#define TOI_UNMAP(type, addr) \
+ do { \
+   if ((type) == TOI_PAGE) \
+     kunmap(addr); \
+ } while (0)
+
+struct toi_module_ops {
+	/* Functions common to all modules */
+	int type;
+	char *name;
+	char *directory;
+	char *shared_directory;
+	struct kobject *dir_kobj;
+	struct module *module;
+	int enabled, early, initialised;
+	struct list_head module_list;
+
+	/* List of filters or allocators */
+	struct list_head list, type_list;
+
+	/*
+	 * Requirements for memory and storage in
+	 * the image header.
+	 */
+	int (*memory_needed) (void);
+	int (*storage_needed) (void);
+
+	int header_requested, header_used;
+
+	int (*expected_compression) (void);
+
+	/*
+	 * Debug info
+	 */
+	int (*print_debug_info) (char *buffer, int size);
+	int (*save_config_info) (char *buffer);
+	void (*load_config_info) (char *buffer, int len);
+
+	/*
+	 * Initialise & cleanup - general routines called
+	 * at the start and end of a cycle.
+	 */
+	int (*initialise) (int starting_cycle);
+	void (*cleanup) (int finishing_cycle);
+
+	void (*pre_atomic_restore) (struct toi_boot_kernel_data *bkd);
+	void (*post_atomic_restore) (struct toi_boot_kernel_data *bkd);
+
+	/*
+	 * Calls for allocating storage (allocators only).
+	 *
+	 * Header space is requested separately and cannot fail, but the
+	 * reservation is only applied when main storage is allocated.
+	 * The header space reservation is thus always set prior to
+	 * requesting the allocation of storage - and prior to querying
+	 * how much storage is available.
+	 */
+
+	unsigned long (*storage_available) (void);
+	void (*reserve_header_space) (unsigned long space_requested);
+	int (*register_storage) (void);
+	int (*allocate_storage) (unsigned long space_requested);
+	unsigned long (*storage_allocated) (void);
+
+	/*
+	 * Routines used in image I/O.
+	 */
+	int (*rw_init) (int rw, int stream_number);
+	int (*rw_cleanup) (int rw);
+	int (*write_page) (unsigned long index, int buf_type, void *buf,
+			unsigned int buf_size);
+	int (*read_page) (unsigned long *index, int buf_type, void *buf,
+			unsigned int *buf_size);
+	int (*io_flusher) (int rw);
+
+	/* Reset module if image exists but reading aborted */
+	void (*noresume_reset) (void);
+
+	/* Read and write the metadata */
+	int (*write_header_init) (void);
+	int (*write_header_cleanup) (void);
+
+	int (*read_header_init) (void);
+	int (*read_header_cleanup) (void);
+
+	/* To be called after read_header_init */
+	int (*get_header_version) (void);
+
+	int (*rw_header_chunk) (int rw, struct toi_module_ops *owner,
+			char *buffer_start, int buffer_size);
+
+	int (*rw_header_chunk_noreadahead) (int rw,
+			struct toi_module_ops *owner, char *buffer_start,
+			int buffer_size);
+
+	/* Attempt to parse an image location */
+	int (*parse_sig_location) (char *buffer, int only_writer, int quiet);
+
+	/* Throttle I/O according to throughput */
+	void (*update_throughput_throttle) (int jif_index);
+
+	/* Flush outstanding I/O */
+	int (*finish_all_io) (void);
+
+	/* Determine whether image exists that we can restore */
+	int (*image_exists) (int quiet);
+
+	/* Mark the image as having tried to resume */
+	int (*mark_resume_attempted) (int);
+
+	/* Destroy image if one exists */
+	int (*remove_image) (void);
+
+	/* Sysfs Data */
+	struct toi_sysfs_data *sysfs_data;
+	int num_sysfs_entries;
+
+	/* Block I/O allocator */
+	struct toi_bio_allocator_ops *bio_allocator_ops;
+};
+
+extern int toi_num_modules, toiNumAllocators;
+
+extern struct toi_module_ops *toiActiveAllocator;
+extern struct list_head toi_filters, toiAllocators, toi_modules;
+
+extern void toi_prepare_console_modules(void);
+extern void toi_cleanup_console_modules(void);
+
+extern struct toi_module_ops *toi_find_module_given_name(char *name);
+extern struct toi_module_ops *toi_get_next_filter(struct toi_module_ops *);
+
+extern int toi_register_module(struct toi_module_ops *module);
+extern void toi_move_module_tail(struct toi_module_ops *module);
+
+extern long toi_header_storage_for_modules(void);
+extern long toi_memory_for_modules(int print_parts);
+extern void print_toi_header_storage_for_modules(void);
+extern int toi_expected_compression_ratio(void);
+
+extern int toi_print_module_debug_info(char *buffer, int buffer_size);
+extern int toi_register_module(struct toi_module_ops *module);
+extern void toi_unregister_module(struct toi_module_ops *module);
+
+extern int toi_initialise_modules(int starting_cycle, int early);
+#define toi_initialise_modules_early(starting) \
+	toi_initialise_modules(starting, 1)
+#define toi_initialise_modules_late(starting) \
+	toi_initialise_modules(starting, 0)
+extern void toi_cleanup_modules(int finishing_cycle);
+
+extern void toi_post_atomic_restore_modules(struct toi_boot_kernel_data *bkd);
+extern void toi_pre_atomic_restore_modules(struct toi_boot_kernel_data *bkd);
+
+extern void toi_print_modules(void);
+
+int toi_get_modules(void);
+void toi_put_modules(void);
+#endif
diff --git a/kernel/power/tuxonice_netlink.c b/kernel/power/tuxonice_netlink.c
new file mode 100644
index 0000000..75b4aa9
--- /dev/null
+++ b/kernel/power/tuxonice_netlink.c
@@ -0,0 +1,329 @@
+/*
+ * kernel/power/tuxonice_netlink.c
+ *
+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * Functions for communicating with a userspace helper via netlink.
+ */
+
+
+#include <linux/suspend.h>
+#include <linux/sched.h>
+#include "tuxonice_netlink.h"
+#include "tuxonice.h"
+#include "tuxonice_modules.h"
+#include "tuxonice_alloc.h"
+#include "tuxonice_builtin.h"
+
+static struct user_helper_data *uhd_list;
+
+/*
+ * Refill our pool of SKBs for use in emergencies (eg, when eating memory and
+ * none can be allocated).
+ */
+static void toi_fill_skb_pool(struct user_helper_data *uhd)
+{
+	while (uhd->pool_level < uhd->pool_limit) {
+		struct sk_buff *new_skb =
+			alloc_skb(NLMSG_SPACE(uhd->skb_size), TOI_ATOMIC_GFP);
+
+		if (!new_skb)
+			break;
+
+		new_skb->next = uhd->emerg_skbs;
+		uhd->emerg_skbs = new_skb;
+		uhd->pool_level++;
+	}
+}
+
+/*
+ * Try to allocate a single skb. If we can't get one, try to use one from
+ * our pool.
+ */
+static struct sk_buff *toi_get_skb(struct user_helper_data *uhd)
+{
+	struct sk_buff *skb =
+		alloc_skb(NLMSG_SPACE(uhd->skb_size), TOI_ATOMIC_GFP);
+
+	if (skb)
+		return skb;
+
+	skb = uhd->emerg_skbs;
+	if (skb) {
+		uhd->pool_level--;
+		uhd->emerg_skbs = skb->next;
+		skb->next = NULL;
+	}
+
+	return skb;
+}
+
+void toi_send_netlink_message(struct user_helper_data *uhd,
+		int type, void *params, size_t len)
+{
+	struct sk_buff *skb;
+	struct nlmsghdr *nlh;
+	void *dest;
+	struct task_struct *t;
+
+	if (uhd->pid == -1)
+		return;
+
+	if (uhd->debug)
+		printk(KERN_ERR "toi_send_netlink_message: Send "
+				"message type %d.\n", type);
+
+	skb = toi_get_skb(uhd);
+	if (!skb) {
+		printk(KERN_INFO "toi_netlink: Can't allocate skb!\n");
+		return;
+	}
+
+	nlh = nlmsg_put(skb, 0, uhd->sock_seq, type, len, 0);
+	uhd->sock_seq++;
+
+	dest = NLMSG_DATA(nlh);
+	if (params && len > 0)
+		memcpy(dest, params, len);
+
+	netlink_unicast(uhd->nl, skb, uhd->pid, 0);
+
+	toi_read_lock_tasklist();
+	t = find_task_by_pid_ns(uhd->pid, &init_pid_ns);
+	if (!t) {
+		toi_read_unlock_tasklist();
+		if (uhd->pid > -1)
+			printk(KERN_INFO "Hmm. Can't find the userspace task"
+				" %d.\n", uhd->pid);
+		return;
+	}
+	wake_up_process(t);
+	toi_read_unlock_tasklist();
+
+	yield();
+}
+EXPORT_SYMBOL_GPL(toi_send_netlink_message);
+
+static void send_whether_debugging(struct user_helper_data *uhd)
+{
+	static u8 is_debugging = 1;
+
+	toi_send_netlink_message(uhd, NETLINK_MSG_IS_DEBUGGING,
+			&is_debugging, sizeof(u8));
+}
+
+/*
+ * Set the PF_NOFREEZE flag on the given process to ensure it can run whilst we
+ * are hibernating.
+ */
+static int nl_set_nofreeze(struct user_helper_data *uhd, __u32 pid)
+{
+	struct task_struct *t;
+
+	if (uhd->debug)
+		printk(KERN_ERR "nl_set_nofreeze for pid %d.\n", pid);
+
+	toi_read_lock_tasklist();
+	t = find_task_by_pid_ns(pid, &init_pid_ns);
+	if (!t) {
+		toi_read_unlock_tasklist();
+		printk(KERN_INFO "Strange. Can't find the userspace task %d.\n",
+				pid);
+		return -EINVAL;
+	}
+
+	t->flags |= PF_NOFREEZE;
+
+	toi_read_unlock_tasklist();
+	uhd->pid = pid;
+
+	toi_send_netlink_message(uhd, NETLINK_MSG_NOFREEZE_ACK, NULL, 0);
+
+	return 0;
+}
+
+/*
+ * Called when the userspace process has informed us that it's ready to roll.
+ */
+static int nl_ready(struct user_helper_data *uhd, u32 version)
+{
+	if (version != uhd->interface_version) {
+		printk(KERN_INFO "%s userspace process using invalid interface"
+				" version (%d - kernel wants %d). Trying to "
+				"continue without it.\n",
+				uhd->name, version, uhd->interface_version);
+		if (uhd->not_ready)
+			uhd->not_ready();
+		return -EINVAL;
+	}
+
+	complete(&uhd->wait_for_process);
+
+	return 0;
+}
+
+void toi_netlink_close_complete(struct user_helper_data *uhd)
+{
+	if (uhd->nl) {
+		netlink_kernel_release(uhd->nl);
+		uhd->nl = NULL;
+	}
+
+	while (uhd->emerg_skbs) {
+		struct sk_buff *next = uhd->emerg_skbs->next;
+		kfree_skb(uhd->emerg_skbs);
+		uhd->emerg_skbs = next;
+	}
+
+	uhd->pid = -1;
+}
+EXPORT_SYMBOL_GPL(toi_netlink_close_complete);
+
+static int toi_nl_gen_rcv_msg(struct user_helper_data *uhd,
+		struct sk_buff *skb, struct nlmsghdr *nlh)
+{
+	int type = nlh->nlmsg_type;
+	int *data;
+	int err;
+
+	if (uhd->debug)
+		printk(KERN_ERR "toi_user_rcv_skb: Received message %d.\n",
+				type);
+
+	/* Let the more specific handler go first. It returns
+	 * 1 for valid messages that it doesn't know. */
+	err = uhd->rcv_msg(skb, nlh);
+	if (err != 1)
+		return err;
+
+	/* Only allow one task to receive NOFREEZE privileges */
+	if (type == NETLINK_MSG_NOFREEZE_ME && uhd->pid != -1) {
+		printk(KERN_INFO "Received extra nofreeze me requests.\n");
+		return -EBUSY;
+	}
+
+	data = NLMSG_DATA(nlh);
+
+	switch (type) {
+	case NETLINK_MSG_NOFREEZE_ME:
+		return nl_set_nofreeze(uhd, nlh->nlmsg_pid);
+	case NETLINK_MSG_GET_DEBUGGING:
+		send_whether_debugging(uhd);
+		return 0;
+	case NETLINK_MSG_READY:
+		if (nlh->nlmsg_len != NLMSG_LENGTH(sizeof(u32))) {
+			printk(KERN_INFO "Invalid ready message.\n");
+			if (uhd->not_ready)
+				uhd->not_ready();
+			return -EINVAL;
+		}
+		return nl_ready(uhd, (u32) *data);
+	case NETLINK_MSG_CLEANUP:
+		toi_netlink_close_complete(uhd);
+		return 0;
+	}
+
+	return -EINVAL;
+}
+
+static void toi_user_rcv_skb(struct sk_buff *skb)
+{
+	int err;
+	struct nlmsghdr *nlh;
+	struct user_helper_data *uhd = uhd_list;
+
+	while (uhd && uhd->netlink_id != skb->sk->sk_protocol)
+		uhd = uhd->next;
+
+	if (!uhd)
+		return;
+
+	while (skb->len >= NLMSG_SPACE(0)) {
+		u32 rlen;
+
+		nlh = (struct nlmsghdr *) skb->data;
+		if (nlh->nlmsg_len < sizeof(*nlh) || skb->len < nlh->nlmsg_len)
+			return;
+
+		rlen = NLMSG_ALIGN(nlh->nlmsg_len);
+		if (rlen > skb->len)
+			rlen = skb->len;
+
+		err = toi_nl_gen_rcv_msg(uhd, skb, nlh);
+		if (err)
+			netlink_ack(skb, nlh, err);
+		else if (nlh->nlmsg_flags & NLM_F_ACK)
+			netlink_ack(skb, nlh, 0);
+		skb_pull(skb, rlen);
+	}
+}
+
+static int netlink_prepare(struct user_helper_data *uhd)
+{
+	struct netlink_kernel_cfg cfg = {
+		.groups = 0,
+		.input = toi_user_rcv_skb,
+	};
+
+	uhd->next = uhd_list;
+	uhd_list = uhd;
+
+	uhd->sock_seq = 0x42c0ffee;
+	uhd->nl = netlink_kernel_create(&init_net, uhd->netlink_id, &cfg);
+	if (!uhd->nl) {
+		printk(KERN_INFO "Failed to allocate netlink socket for %s.\n",
+				uhd->name);
+		return -ENOMEM;
+	}
+
+	toi_fill_skb_pool(uhd);
+
+	return 0;
+}
+
+void toi_netlink_close(struct user_helper_data *uhd)
+{
+	struct task_struct *t;
+
+	toi_read_lock_tasklist();
+	t = find_task_by_pid_ns(uhd->pid, &init_pid_ns);
+	if (t)
+		t->flags &= ~PF_NOFREEZE;
+	toi_read_unlock_tasklist();
+
+	toi_send_netlink_message(uhd, NETLINK_MSG_CLEANUP, NULL, 0);
+}
+EXPORT_SYMBOL_GPL(toi_netlink_close);
+
+int toi_netlink_setup(struct user_helper_data *uhd)
+{
+	/* In case userui didn't clean up properly on us */
+	toi_netlink_close_complete(uhd);
+
+	if (netlink_prepare(uhd) < 0) {
+		printk(KERN_INFO "Netlink prepare failed.\n");
+		return 1;
+	}
+
+	if (toi_launch_userspace_program(uhd->program, uhd->netlink_id,
+				UMH_WAIT_EXEC, uhd->debug) < 0) {
+		printk(KERN_INFO "Launch userspace program failed.\n");
+		toi_netlink_close_complete(uhd);
+		return 1;
+	}
+
+	/* Wait 2 seconds for the userspace process to make contact */
+	wait_for_completion_timeout(&uhd->wait_for_process, 2*HZ);
+
+	if (uhd->pid == -1) {
+		printk(KERN_INFO "%s: Failed to contact userspace process.\n",
+				uhd->name);
+		toi_netlink_close_complete(uhd);
+		return 1;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(toi_netlink_setup);
diff --git a/kernel/power/tuxonice_netlink.h b/kernel/power/tuxonice_netlink.h
new file mode 100644
index 0000000..b8ef06e
--- /dev/null
+++ b/kernel/power/tuxonice_netlink.h
@@ -0,0 +1,62 @@
+/*
+ * kernel/power/tuxonice_netlink.h
+ *
+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * Declarations for functions for communicating with a userspace helper
+ * via netlink.
+ */
+
+#include <linux/netlink.h>
+#include <net/sock.h>
+
+#define NETLINK_MSG_BASE 0x10
+
+#define NETLINK_MSG_READY 0x10
+#define NETLINK_MSG_NOFREEZE_ME 0x16
+#define NETLINK_MSG_GET_DEBUGGING 0x19
+#define NETLINK_MSG_CLEANUP 0x24
+#define NETLINK_MSG_NOFREEZE_ACK 0x27
+#define NETLINK_MSG_IS_DEBUGGING 0x28
+
+struct user_helper_data {
+	int (*rcv_msg) (struct sk_buff *skb, struct nlmsghdr *nlh);
+	void (*not_ready) (void);
+	struct sock *nl;
+	u32 sock_seq;
+	pid_t pid;
+	char *comm;
+	char program[256];
+	int pool_level;
+	int pool_limit;
+	struct sk_buff *emerg_skbs;
+	int skb_size;
+	int netlink_id;
+	char *name;
+	struct user_helper_data *next;
+	struct completion wait_for_process;
+	u32 interface_version;
+	int must_init;
+	int debug;
+};
+
+#ifdef CONFIG_NET
+int toi_netlink_setup(struct user_helper_data *uhd);
+void toi_netlink_close(struct user_helper_data *uhd);
+void toi_send_netlink_message(struct user_helper_data *uhd,
+		int type, void *params, size_t len);
+void toi_netlink_close_complete(struct user_helper_data *uhd);
+#else
+static inline int toi_netlink_setup(struct user_helper_data *uhd)
+{
+	return 0;
+}
+
+static inline void toi_netlink_close(struct user_helper_data *uhd) { }
+static inline void toi_send_netlink_message(struct user_helper_data *uhd,
+		int type, void *params, size_t len) { }
+static inline void toi_netlink_close_complete(struct user_helper_data *uhd)
+	{ }
+#endif
diff --git a/kernel/power/tuxonice_pagedir.c b/kernel/power/tuxonice_pagedir.c
new file mode 100644
index 0000000..ce0d38c
--- /dev/null
+++ b/kernel/power/tuxonice_pagedir.c
@@ -0,0 +1,346 @@
+/*
+ * kernel/power/tuxonice_pagedir.c
+ *
+ * Copyright (C) 1998-2001 Gabor Kuti <seasons@fornax.hu>
+ * Copyright (C) 1998,2001,2002 Pavel Machek <pavel@suse.cz>
+ * Copyright (C) 2002-2003 Florent Chabaud <fchabaud@free.fr>
+ * Copyright (C) 2006-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * Routines for handling pagesets.
+ * Note that pbes aren't actually stored as such. They're stored as
+ * bitmaps and extents.
+ */
+
+#include <linux/suspend.h>
+#include <linux/highmem.h>
+#include <linux/bootmem.h>
+#include <linux/hardirq.h>
+#include <linux/sched.h>
+#include <linux/cpu.h>
+#include <asm/tlbflush.h>
+
+#include "tuxonice_pageflags.h"
+#include "tuxonice_ui.h"
+#include "tuxonice_pagedir.h"
+#include "tuxonice_prepare_image.h"
+#include "tuxonice.h"
+#include "tuxonice_builtin.h"
+#include "tuxonice_alloc.h"
+
+static int ptoi_pfn;
+static struct pbe *this_low_pbe;
+static struct pbe **last_low_pbe_ptr;
+
+void toi_reset_alt_image_pageset2_pfn(void)
+{
+	memory_bm_position_reset(pageset2_map);
+}
+
+static struct page *first_conflicting_page;
+
+/*
+ * free_conflicting_pages
+ */
+
+static void free_conflicting_pages(void)
+{
+	while (first_conflicting_page) {
+		struct page *next =
+			*((struct page **) kmap(first_conflicting_page));
+		kunmap(first_conflicting_page);
+		toi__free_page(29, first_conflicting_page);
+		first_conflicting_page = next;
+	}
+}
+
+/* ___toi_get_nonconflicting_page
+ *
+ * Description: Gets order zero pages that won't be overwritten
+ *		while copying the original pages.
+ */
+
+struct page *___toi_get_nonconflicting_page(int can_be_highmem)
+{
+	struct page *page;
+	gfp_t flags = TOI_ATOMIC_GFP;
+	if (can_be_highmem)
+		flags |= __GFP_HIGHMEM;
+
+
+	if (test_toi_state(TOI_LOADING_ALT_IMAGE) &&
+			pageset2_map &&
+			(ptoi_pfn != BM_END_OF_MAP)) {
+		do {
+			ptoi_pfn = memory_bm_next_pfn(pageset2_map);
+			if (ptoi_pfn != BM_END_OF_MAP) {
+				page = pfn_to_page(ptoi_pfn);
+				if (!PagePageset1(page) &&
+				    (can_be_highmem || !PageHighMem(page)))
+					return page;
+			}
+		} while (ptoi_pfn != BM_END_OF_MAP);
+	}
+
+	do {
+		page = toi_alloc_page(29, flags);
+		if (!page) {
+			printk(KERN_INFO "Failed to get nonconflicting "
+					"page.\n");
+			return NULL;
+		}
+		if (PagePageset1(page)) {
+			struct page **next = (struct page **) kmap(page);
+			*next = first_conflicting_page;
+			first_conflicting_page = page;
+			kunmap(page);
+		}
+	} while (PagePageset1(page));
+
+	return page;
+}
+
+unsigned long __toi_get_nonconflicting_page(void)
+{
+	struct page *page = ___toi_get_nonconflicting_page(0);
+	return page ? (unsigned long) page_address(page) : 0;
+}
+
+static struct pbe *get_next_pbe(struct page **page_ptr, struct pbe *this_pbe,
+		int highmem)
+{
+	if (((((unsigned long) this_pbe) & (PAGE_SIZE - 1))
+		     + 2 * sizeof(struct pbe)) > PAGE_SIZE) {
+		struct page *new_page =
+			___toi_get_nonconflicting_page(highmem);
+		if (!new_page)
+			return ERR_PTR(-ENOMEM);
+		this_pbe = (struct pbe *) kmap(new_page);
+		memset(this_pbe, 0, PAGE_SIZE);
+		*page_ptr = new_page;
+	} else
+		this_pbe++;
+
+	return this_pbe;
+}
+
+/**
+ * toi_get_pageset1_load_addresses - generate pbes for conflicting pages
+ *
+ * We check here that pagedir & pages it points to won't collide
+ * with pages where we're going to restore from the loaded pages
+ * later.
+ *
+ * Returns:
+ *	Zero on success, a negative errno (e.g. -ENOMEM) if we couldn't
+ *	find enough pages (shouldn't happen).
+ **/
+int toi_get_pageset1_load_addresses(void)
+{
+	int pfn, highallocd = 0, lowallocd = 0;
+	int low_needed = pagedir1.size - get_highmem_size(pagedir1);
+	int high_needed = get_highmem_size(pagedir1);
+	int low_pages_for_highmem = 0;
+	gfp_t flags = GFP_ATOMIC | __GFP_NOWARN | __GFP_HIGHMEM;
+	struct page *page, *high_pbe_page = NULL, *last_high_pbe_page = NULL,
+		    *low_pbe_page, *last_low_pbe_page = NULL;
+	struct pbe **last_high_pbe_ptr = &restore_highmem_pblist,
+		   *this_high_pbe = NULL;
+	unsigned long orig_low_pfn, orig_high_pfn;
+	int high_pbes_done = 0, low_pbes_done = 0;
+	int low_direct = 0, high_direct = 0, result = 0, i;
+	int high_page = 1, high_offset = 0, low_page = 1, low_offset = 0;
+
+	memory_bm_set_iterators(pageset1_map, 3);
+	memory_bm_position_reset(pageset1_map);
+
+	memory_bm_set_iterators(pageset1_copy_map, 2);
+	memory_bm_position_reset(pageset1_copy_map);
+
+	last_low_pbe_ptr = &restore_pblist;
+
+	/* First, allocate pages for the start of our pbe lists. */
+	if (high_needed) {
+		high_pbe_page = ___toi_get_nonconflicting_page(1);
+		if (!high_pbe_page) {
+			result = -ENOMEM;
+			goto out;
+		}
+		this_high_pbe = (struct pbe *) kmap(high_pbe_page);
+		memset(this_high_pbe, 0, PAGE_SIZE);
+	}
+
+	low_pbe_page = ___toi_get_nonconflicting_page(0);
+	if (!low_pbe_page) {
+		result = -ENOMEM;
+		goto out;
+	}
+	this_low_pbe = (struct pbe *) page_address(low_pbe_page);
+
+	/*
+	 * Next, allocate the number of pages we need.
+	 */
+
+	i = low_needed + high_needed;
+
+	do {
+		int is_high;
+
+		if (i == low_needed)
+			flags &= ~__GFP_HIGHMEM;
+
+		page = toi_alloc_page(30, flags);
+		BUG_ON(!page);
+
+		SetPagePageset1Copy(page);
+		is_high = PageHighMem(page);
+
+		if (PagePageset1(page)) {
+			if (is_high)
+				high_direct++;
+			else
+				low_direct++;
+		} else {
+			if (is_high)
+				highallocd++;
+			else
+				lowallocd++;
+		}
+	} while (--i);
+
+	high_needed -= high_direct;
+	low_needed -= low_direct;
+
+	/*
+	 * Do we need to use some lowmem pages for the copies of highmem
+	 * pages?
+	 */
+	if (high_needed > highallocd) {
+		low_pages_for_highmem = high_needed - highallocd;
+		high_needed -= low_pages_for_highmem;
+		low_needed += low_pages_for_highmem;
+	}
+
+	/*
+	 * Now generate our pbes (which will be used for the atomic restore),
+	 * and free unneeded pages.
+	 */
+	memory_bm_position_reset(pageset1_copy_map);
+	for (pfn = memory_bm_next_pfn_index(pageset1_copy_map, 1); pfn != BM_END_OF_MAP;
+			pfn = memory_bm_next_pfn_index(pageset1_copy_map, 1)) {
+		int is_high;
+		page = pfn_to_page(pfn);
+		is_high = PageHighMem(page);
+
+		if (PagePageset1(page))
+			continue;
+
+		/* Nope. We're going to use this page. Add a pbe. */
+		if (is_high || low_pages_for_highmem) {
+			struct page *orig_page;
+			high_pbes_done++;
+			if (!is_high)
+				low_pages_for_highmem--;
+			do {
+				orig_high_pfn = memory_bm_next_pfn_index(pageset1_map, 1);
+				BUG_ON(orig_high_pfn == BM_END_OF_MAP);
+				orig_page = pfn_to_page(orig_high_pfn);
+			} while (!PageHighMem(orig_page) ||
+					PagePageset1Copy(orig_page));
+
+			this_high_pbe->orig_address = (void *) orig_high_pfn;
+			this_high_pbe->address = page;
+			this_high_pbe->next = NULL;
+			toi_message(TOI_PAGEDIR, TOI_VERBOSE, 0, "High pbe %d/%d: %p(%lu)=>%p",
+					high_page, high_offset, page, orig_high_pfn, orig_page);
+			if (last_high_pbe_page != high_pbe_page) {
+				*last_high_pbe_ptr =
+					(struct pbe *) high_pbe_page;
+				if (last_high_pbe_page) {
+					kunmap(last_high_pbe_page);
+					high_page++;
+					high_offset = 0;
+				} else
+					high_offset++;
+				last_high_pbe_page = high_pbe_page;
+			} else {
+				*last_high_pbe_ptr = this_high_pbe;
+				high_offset++;
+			}
+			last_high_pbe_ptr = &this_high_pbe->next;
+			this_high_pbe = get_next_pbe(&high_pbe_page,
+					this_high_pbe, 1);
+			if (IS_ERR(this_high_pbe)) {
+				printk(KERN_INFO
+						"This high pbe is an error.\n");
+				return -ENOMEM;
+			}
+		} else {
+			struct page *orig_page;
+			low_pbes_done++;
+			do {
+				orig_low_pfn = memory_bm_next_pfn_index(pageset1_map, 2);
+				BUG_ON(orig_low_pfn == BM_END_OF_MAP);
+				orig_page = pfn_to_page(orig_low_pfn);
+			} while (PageHighMem(orig_page) ||
+					PagePageset1Copy(orig_page));
+
+			this_low_pbe->orig_address = page_address(orig_page);
+			this_low_pbe->address = page_address(page);
+			this_low_pbe->next = NULL;
+			toi_message(TOI_PAGEDIR, TOI_VERBOSE, 0, "Low pbe %d/%d: %p(%lu)=>%p",
+					low_page, low_offset, this_low_pbe->orig_address,
+					orig_low_pfn, this_low_pbe->address);
+			*last_low_pbe_ptr = this_low_pbe;
+			last_low_pbe_ptr = &this_low_pbe->next;
+			this_low_pbe = get_next_pbe(&low_pbe_page,
+					this_low_pbe, 0);
+			if (low_pbe_page != last_low_pbe_page) {
+				if (last_low_pbe_page) {
+					low_page++;
+					low_offset = 0;
+				}
+				last_low_pbe_page = low_pbe_page;
+			} else
+				low_offset++;
+			if (IS_ERR(this_low_pbe)) {
+				printk(KERN_INFO "this_low_pbe is an error.\n");
+				return -ENOMEM;
+			}
+		}
+	}
+
+	if (high_pbe_page)
+		kunmap(high_pbe_page);
+
+	if (last_high_pbe_page != high_pbe_page) {
+		if (last_high_pbe_page)
+			kunmap(last_high_pbe_page);
+		toi__free_page(29, high_pbe_page);
+	}
+
+	free_conflicting_pages();
+
+out:
+	memory_bm_set_iterators(pageset1_map, 1);
+	memory_bm_set_iterators(pageset1_copy_map, 1);
+	return result;
+}
+
+int add_boot_kernel_data_pbe(void)
+{
+	this_low_pbe->address = (char *) __toi_get_nonconflicting_page();
+	if (!this_low_pbe->address) {
+		printk(KERN_INFO "Failed to get bkd atomic restore buffer.\n");
+		return -ENOMEM;
+	}
+
+	toi_bkd.size = sizeof(toi_bkd);
+	memcpy(this_low_pbe->address, &toi_bkd, sizeof(toi_bkd));
+
+	*last_low_pbe_ptr = this_low_pbe;
+	this_low_pbe->orig_address = (char *) boot_kernel_data_buffer;
+	this_low_pbe->next = NULL;
+	return 0;
+}
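Note on pbe packing (illustrative only, not part of the patch): get_next_pbe() above keeps handing out pbes from the current page until one more would not fit, then switches to a fresh page. The following is a minimal user-space sketch of that fit test. struct sketch_pbe, SKETCH_PAGE_SIZE and both helper names are hypothetical stand-ins, and a 4 KiB page is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in with the three fields the patch fills in. */
struct sketch_pbe {
	void *address;
	void *orig_address;
	struct sketch_pbe *next;
};

/*
 * Mirrors the test in get_next_pbe(): given the address of the current
 * pbe, report whether the following pbe would run past the end of the
 * page. The current pbe occupies one slot, so two slots must fit after
 * the current offset for the next one to stay on this page.
 */
static int sketch_next_pbe_needs_new_page(uintptr_t this_pbe_addr)
{
	size_t offset = this_pbe_addr & (SKETCH_PAGE_SIZE - 1);

	return offset + 2 * sizeof(struct sketch_pbe) > SKETCH_PAGE_SIZE;
}

/* Number of whole pbes that fit in one page. */
static size_t sketch_pbes_per_page(void)
{
	return SKETCH_PAGE_SIZE / sizeof(struct sketch_pbe);
}
```

With this rule, the last slot that fits in a page is the one that triggers the move to a new page, so no pbe ever straddles a page boundary.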
diff --git a/kernel/power/tuxonice_pagedir.h b/kernel/power/tuxonice_pagedir.h
new file mode 100644
index 0000000..d08e4b1
--- /dev/null
+++ b/kernel/power/tuxonice_pagedir.h
@@ -0,0 +1,50 @@
+/*
+ * kernel/power/tuxonice_pagedir.h
+ *
+ * Copyright (C) 2006-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * Declarations for routines for handling pagesets.
+ */
+
+#ifndef KERNEL_POWER_PAGEDIR_H
+#define KERNEL_POWER_PAGEDIR_H
+
+/* Pagedir
+ *
+ * Contains the metadata for a set of pages saved in the image.
+ */
+
+struct pagedir {
+	int id;
+	unsigned long size;
+#ifdef CONFIG_HIGHMEM
+	unsigned long size_high;
+#endif
+};
+
+#ifdef CONFIG_HIGHMEM
+#define get_highmem_size(pagedir) ((pagedir).size_high)
+#define set_highmem_size(pagedir, sz) do { (pagedir).size_high = (sz); } while (0)
+#define inc_highmem_size(pagedir) do { (pagedir).size_high++; } while (0)
+#define get_lowmem_size(pagedir) ((pagedir).size - (pagedir).size_high)
+#else
+#define get_highmem_size(pagedir) (0)
+#define set_highmem_size(pagedir, sz) do { } while (0)
+#define inc_highmem_size(pagedir) do { } while (0)
+#define get_lowmem_size(pagedir) ((pagedir).size)
+#endif
+
+extern struct pagedir pagedir1, pagedir2;
+
+extern void toi_copy_pageset1(void);
+
+extern int toi_get_pageset1_load_addresses(void);
+
+extern unsigned long __toi_get_nonconflicting_page(void);
+struct page *___toi_get_nonconflicting_page(int can_be_highmem);
+
+extern void toi_reset_alt_image_pageset2_pfn(void);
+extern int add_boot_kernel_data_pbe(void);
+#endif
diff --git a/kernel/power/tuxonice_pageflags.c b/kernel/power/tuxonice_pageflags.c
new file mode 100644
index 0000000..77fab4f
--- /dev/null
+++ b/kernel/power/tuxonice_pageflags.c
@@ -0,0 +1,29 @@
+/*
+ * kernel/power/tuxonice_pageflags.c
+ *
+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * Routines for serialising and relocating pageflags in which we
+ * store our image metadata.
+ */
+
+#include <linux/list.h>
+#include <linux/module.h>
+#include "tuxonice_pageflags.h"
+#include "power.h"
+
+int toi_pageflags_space_needed(void)
+{
+	int total = 0;
+	struct bm_block *bb;
+
+	total = sizeof(unsigned int);
+
+	list_for_each_entry(bb, &pageset1_map->blocks, hook)
+		total += 2 * sizeof(unsigned long) + PAGE_SIZE;
+
+	return total;
+}
+EXPORT_SYMBOL_GPL(toi_pageflags_space_needed);
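Note on the size estimate (illustrative only, not part of the patch): toi_pageflags_space_needed() above charges one unsigned int up front, then two unsigned longs plus a full page for every bitmap block. A minimal user-space sketch of the same arithmetic is shown here. The helper name and SKETCH_PAGE_SIZE are hypothetical, a 4 KiB page is assumed, and the block count is passed in instead of being found by walking a list.

```c
#include <stddef.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Bytes reserved for a bitmap made of nr_blocks blocks, following the
 * formula in toi_pageflags_space_needed(): a fixed unsigned int prefix,
 * then two unsigned longs and one page of bits per block.
 */
static size_t sketch_pageflags_space(size_t nr_blocks)
{
	size_t total = sizeof(unsigned int);

	total += nr_blocks * (2 * sizeof(unsigned long) + SKETCH_PAGE_SIZE);
	return total;
}
```

The result is in bytes; in the patch it is later added to the other header components and rounded up to whole pages.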
diff --git a/kernel/power/tuxonice_pageflags.h b/kernel/power/tuxonice_pageflags.h
new file mode 100644
index 0000000..d5aa7b1
--- /dev/null
+++ b/kernel/power/tuxonice_pageflags.h
@@ -0,0 +1,72 @@
+/*
+ * kernel/power/tuxonice_pageflags.h
+ *
+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ */
+
+#ifndef KERNEL_POWER_TUXONICE_PAGEFLAGS_H
+#define KERNEL_POWER_TUXONICE_PAGEFLAGS_H
+
+extern struct memory_bitmap *pageset1_map;
+extern struct memory_bitmap *pageset1_copy_map;
+extern struct memory_bitmap *pageset2_map;
+extern struct memory_bitmap *page_resave_map;
+extern struct memory_bitmap *io_map;
+extern struct memory_bitmap *nosave_map;
+extern struct memory_bitmap *free_map;
+
+#define PagePageset1(page) \
+	(memory_bm_test_bit(pageset1_map, page_to_pfn(page)))
+#define SetPagePageset1(page) \
+	(memory_bm_set_bit(pageset1_map, page_to_pfn(page)))
+#define ClearPagePageset1(page) \
+	(memory_bm_clear_bit(pageset1_map, page_to_pfn(page)))
+
+#define PagePageset1Copy(page) \
+	(memory_bm_test_bit(pageset1_copy_map, page_to_pfn(page)))
+#define SetPagePageset1Copy(page) \
+	(memory_bm_set_bit(pageset1_copy_map, page_to_pfn(page)))
+#define ClearPagePageset1Copy(page) \
+	(memory_bm_clear_bit(pageset1_copy_map, page_to_pfn(page)))
+
+#define PagePageset2(page) \
+	(memory_bm_test_bit(pageset2_map, page_to_pfn(page)))
+#define SetPagePageset2(page) \
+	(memory_bm_set_bit(pageset2_map, page_to_pfn(page)))
+#define ClearPagePageset2(page) \
+	(memory_bm_clear_bit(pageset2_map, page_to_pfn(page)))
+
+#define PageWasRW(page) \
+	(memory_bm_test_bit(pageset2_map, page_to_pfn(page)))
+#define SetPageWasRW(page) \
+	(memory_bm_set_bit(pageset2_map, page_to_pfn(page)))
+#define ClearPageWasRW(page) \
+	(memory_bm_clear_bit(pageset2_map, page_to_pfn(page)))
+
+#define PageResave(page) (page_resave_map ? \
+	memory_bm_test_bit(page_resave_map, page_to_pfn(page)) : 0)
+#define SetPageResave(page) \
+	(memory_bm_set_bit(page_resave_map, page_to_pfn(page)))
+#define ClearPageResave(page) \
+	(memory_bm_clear_bit(page_resave_map, page_to_pfn(page)))
+
+#define PageNosave(page) (nosave_map ? \
+		memory_bm_test_bit(nosave_map, page_to_pfn(page)) : 0)
+#define SetPageNosave(page) \
+	(memory_bm_set_bit(nosave_map, page_to_pfn(page)))
+#define ClearPageNosave(page) \
+	(memory_bm_clear_bit(nosave_map, page_to_pfn(page)))
+
+#define PageNosaveFree(page) (free_map ? \
+		memory_bm_test_bit(free_map, page_to_pfn(page)) : 0)
+#define SetPageNosaveFree(page) \
+	(memory_bm_set_bit(free_map, page_to_pfn(page)))
+#define ClearPageNosaveFree(page) \
+	(memory_bm_clear_bit(free_map, page_to_pfn(page)))
+
+extern void save_pageflags(struct memory_bitmap *pagemap);
+extern int load_pageflags(struct memory_bitmap *pagemap);
+extern int toi_pageflags_space_needed(void);
+#endif
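Note on the flag macros (illustrative only, not part of the patch): every Page*/SetPage*/ClearPage* macro above is a test, set or clear of one bit, indexed by pfn, in a per-purpose bitmap, and some of them tolerate a bitmap that has not been allocated yet. The sketch below shows that pattern in user space with a flat, fixed-size bitmap. All sketch_* names and SKETCH_MAX_PFN are hypothetical, and the kernel's struct memory_bitmap is organised differently.

```c
#include <limits.h>
#include <string.h>

/* Assumption for this sketch: a small, fixed range of pfns. */
#define SKETCH_MAX_PFN 1024UL
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_WORDS \
	((SKETCH_MAX_PFN + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

struct sketch_bitmap {
	unsigned long words[SKETCH_WORDS];
};

/* Clear every bit, as memory_bm_clear() does for a real bitmap. */
static void sketch_bm_clear_all(struct sketch_bitmap *bm)
{
	memset(bm->words, 0, sizeof(bm->words));
}

static void sketch_bm_set_bit(struct sketch_bitmap *bm, unsigned long pfn)
{
	bm->words[pfn / SKETCH_BITS_PER_LONG] |=
		1UL << (pfn % SKETCH_BITS_PER_LONG);
}

static void sketch_bm_clear_bit(struct sketch_bitmap *bm, unsigned long pfn)
{
	bm->words[pfn / SKETCH_BITS_PER_LONG] &=
		~(1UL << (pfn % SKETCH_BITS_PER_LONG));
}

static int sketch_bm_test_bit(const struct sketch_bitmap *bm,
		unsigned long pfn)
{
	return (bm->words[pfn / SKETCH_BITS_PER_LONG] >>
			(pfn % SKETCH_BITS_PER_LONG)) & 1UL;
}

/* NULL-tolerant test, following the PageNosave()/PageResave() pattern. */
#define SketchPageNosave(map, pfn) \
	((map) ? sketch_bm_test_bit((map), (pfn)) : 0)
```

Keeping the flags in side bitmaps, as the header does, means no bits in struct page are consumed by the hibernation code.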
diff --git a/kernel/power/tuxonice_power_off.c b/kernel/power/tuxonice_power_off.c
new file mode 100644
index 0000000..1604a95
--- /dev/null
+++ b/kernel/power/tuxonice_power_off.c
@@ -0,0 +1,287 @@
+/*
+ * kernel/power/tuxonice_power_off.c
+ *
+ * Copyright (C) 2006-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * Support for powering down.
+ */
+
+#include <linux/device.h>
+#include <linux/suspend.h>
+#include <linux/mm.h>
+#include <linux/pm.h>
+#include <linux/reboot.h>
+#include <linux/cpu.h>
+#include <linux/console.h>
+#include <linux/fs.h>
+#include "tuxonice.h"
+#include "tuxonice_ui.h"
+#include "tuxonice_power_off.h"
+#include "tuxonice_sysfs.h"
+#include "tuxonice_modules.h"
+#include "tuxonice_io.h"
+
+unsigned long toi_poweroff_method; /* 0 - Kernel power off */
+EXPORT_SYMBOL_GPL(toi_poweroff_method);
+
+static int wake_delay;
+static char lid_state_file[256], wake_alarm_dir[256];
+static struct file *lid_file, *alarm_file, *epoch_file;
+static int post_wake_state = -1;
+
+static int did_suspend_to_both;
+
+/*
+ * __toi_power_down
+ * Functionality   : Powers down or reboots the computer once the image
+ *                   has been written to disk.
+ * Key Assumptions : Either the reboot/power down calls succeed, or the
+ *                   warning emitted when they fail is visible to the
+ *                   user (ie printk resumes devices).
+ */
+
+static void __toi_power_down(int method)
+{
+	int error;
+
+	toi_cond_pause(1, test_action_state(TOI_REBOOT) ? "Ready to reboot." :
+			"Powering down.");
+
+	if (test_result_state(TOI_ABORTED))
+		goto out;
+
+	if (test_action_state(TOI_REBOOT))
+		kernel_restart(NULL);
+
+	switch (method) {
+	case 0:
+		break;
+	case 3:
+		/*
+		 * Re-read the overwritten part of pageset2 to make post-resume
+		 * faster.
+		 */
+		if (read_pageset2(1))
+			panic("Attempt to reload pagedir 2 failed. "
+					"Try rebooting.");
+
+		pm_prepare_console();
+
+		error = pm_notifier_call_chain(PM_SUSPEND_PREPARE);
+		if (!error) {
+			pm_restore_gfp_mask();
+			error = suspend_devices_and_enter(PM_SUSPEND_MEM);
+			pm_restrict_gfp_mask();
+			if (!error)
+				did_suspend_to_both = 1;
+		}
+		pm_notifier_call_chain(PM_POST_SUSPEND);
+		pm_restore_console();
+
+		/* Success - we're now post-resume-from-ram */
+		if (did_suspend_to_both)
+			return;
+
+		/* Failed to suspend to ram - do normal power off */
+		break;
+	case 4:
+		/*
+		 * If this succeeds, it doesn't return. If it fails, do a
+		 * simple powerdown.
+		 */
+		hibernation_platform_enter();
+		break;
+	case 5:
+		/* Historic entry only now */
+		break;
+	}
+
+	if (method && method != 5)
+		toi_cond_pause(1,
+			"Falling back to alternate power off method.");
+
+	if (test_result_state(TOI_ABORTED))
+		goto out;
+
+	kernel_power_off();
+	kernel_halt();
+	toi_cond_pause(1, "Powerdown failed.");
+	while (1)
+		cpu_relax();
+
+out:
+	if (read_pageset2(1))
+		panic("Attempt to reload pagedir 2 failed. Try rebooting.");
+	return;
+}
+
+#define CLOSE_FILE(file) \
+	do { \
+		if (file) { filp_close(file, NULL); file = NULL; } \
+	} while (0)
+
+static void powerdown_cleanup(int toi_or_resume)
+{
+	if (!toi_or_resume)
+		return;
+
+	CLOSE_FILE(lid_file);
+	CLOSE_FILE(alarm_file);
+	CLOSE_FILE(epoch_file);
+}
+
+static void open_file(char *format, char *arg, struct file **var, int mode,
+		char *desc)
+{
+	char buf[256];
+
+	if (strlen(arg)) {
+		snprintf(buf, sizeof(buf), format, arg);
+		*var = filp_open(buf, mode, 0);
+		if (IS_ERR(*var) || !*var) {
+			printk(KERN_INFO "Failed to open %s file '%s' (%p).\n",
+				desc, buf, *var);
+			*var = NULL;
+		}
+	}
+}
+
+static int powerdown_init(int toi_or_resume)
+{
+	if (!toi_or_resume)
+		return 0;
+
+	did_suspend_to_both = 0;
+
+	open_file("/proc/acpi/button/%s/state", lid_state_file, &lid_file,
+			O_RDONLY, "lid");
+
+	if (strlen(wake_alarm_dir)) {
+		open_file("/sys/class/rtc/%s/wakealarm", wake_alarm_dir,
+				&alarm_file, O_WRONLY, "alarm");
+
+		open_file("/sys/class/rtc/%s/since_epoch", wake_alarm_dir,
+				&epoch_file, O_RDONLY, "epoch");
+	}
+
+	return 0;
+}
+
+static int lid_closed(void)
+{
+	char array[25] = "";
+	ssize_t size;
+	loff_t pos = 0;
+
+	if (!lid_file)
+		return 0;
+
+	size = vfs_read(lid_file, (char __user *) array, 24, &pos);
+	if ((int) size < 1) {
+		printk(KERN_INFO "Failed to read lid state file (%d).\n",
+			(int) size);
+		return 0;
+	}
+
+	if (!strcmp(array, "state:      closed\n"))
+		return 1;
+
+	return 0;
+}
+
+static void write_alarm_file(int value)
+{
+	ssize_t size;
+	char buf[40];
+	loff_t pos = 0;
+
+	if (!alarm_file)
+		return;
+
+	sprintf(buf, "%d\n", value);
+
+	size = vfs_write(alarm_file, (char __user *)buf, strlen(buf), &pos);
+
+	if (size < 0)
+		printk(KERN_INFO "Error %d writing alarm value %s.\n",
+				(int) size, buf);
+}
+
+/**
+ * toi_check_resleep: See whether to powerdown again after waking.
+ *
+ * After waking, check whether we should powerdown again in a (usually
+ * different) way. We only do this if the lid switch is still closed.
+ */
+void toi_check_resleep(void)
+{
+	/* We only return if we suspended to ram and woke. */
+	if (lid_closed() && post_wake_state >= 0)
+		__toi_power_down(post_wake_state);
+}
+
+void toi_power_down(void)
+{
+	if (alarm_file && epoch_file && wake_delay) {
+		char array[25] = "";
+		loff_t pos = 0;
+		ssize_t size = vfs_read(epoch_file, (char __user *) array, 24,
+				&pos);
+
+		if (((int) size) < 1)
+			printk(KERN_INFO "Failed to read epoch file (%d).\n",
+					(int) size);
+		else {
+			unsigned long since_epoch;
+			if (!strict_strtoul(array, 0, &since_epoch)) {
+				/* Clear any wakeup time. */
+				write_alarm_file(0);
+
+				/* Set new wakeup time. */
+				write_alarm_file(since_epoch + wake_delay);
+			}
+		}
+	}
+
+	__toi_power_down(toi_poweroff_method);
+
+	toi_check_resleep();
+}
+EXPORT_SYMBOL_GPL(toi_power_down);
+
+static struct toi_sysfs_data sysfs_params[] = {
+#if defined(CONFIG_ACPI)
+	SYSFS_STRING("lid_file", SYSFS_RW, lid_state_file, 256, 0, NULL),
+	SYSFS_INT("wake_delay", SYSFS_RW, &wake_delay, 0, INT_MAX, 0, NULL),
+	SYSFS_STRING("wake_alarm_dir", SYSFS_RW, wake_alarm_dir, 256, 0, NULL),
+	SYSFS_INT("post_wake_state", SYSFS_RW, &post_wake_state, -1, 5, 0,
+			NULL),
+	SYSFS_UL("powerdown_method", SYSFS_RW, &toi_poweroff_method, 0, 5, 0),
+	SYSFS_INT("did_suspend_to_both", SYSFS_READONLY, &did_suspend_to_both,
+		0, 0, 0, NULL)
+#endif
+};
+
+static struct toi_module_ops powerdown_ops = {
+	.type				= MISC_HIDDEN_MODULE,
+	.name				= "poweroff",
+	.initialise			= powerdown_init,
+	.cleanup			= powerdown_cleanup,
+	.directory			= "[ROOT]",
+	.module				= THIS_MODULE,
+	.sysfs_data			= sysfs_params,
+	.num_sysfs_entries		= sizeof(sysfs_params) /
+		sizeof(struct toi_sysfs_data),
+};
+
+int toi_poweroff_init(void)
+{
+	return toi_register_module(&powerdown_ops);
+}
+
+void toi_poweroff_exit(void)
+{
+	toi_unregister_module(&powerdown_ops);
+}
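Note on the lid check (illustrative only, not part of the patch): lid_closed() above compares the bytes read from the lid state file against a fixed string with strcmp(), which is only well defined when the buffer is NUL-terminated. The sketch below shows one way to do that comparison in user space with an explicit bound and terminator. The helper name and SKETCH_LID_BUF are hypothetical, and the data is passed in directly instead of being read from a file.

```c
#include <string.h>

/* Same buffer size as the array used in lid_closed(). */
#define SKETCH_LID_BUF 25

/*
 * Return 1 if the first len bytes of data spell the closed state,
 * 0 otherwise. At most SKETCH_LID_BUF - 1 bytes are copied so that
 * there is always room for the terminating NUL.
 */
static int sketch_lid_closed(const char *data, size_t len)
{
	char buf[SKETCH_LID_BUF];

	if (len < 1)
		return 0;

	if (len > sizeof(buf) - 1)
		len = sizeof(buf) - 1;

	memcpy(buf, data, len);
	buf[len] = '\0';

	return strcmp(buf, "state:      closed\n") == 0;
}
```

Because the expected string is 19 bytes long, it fits comfortably inside the 24 usable bytes of the buffer.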
diff --git a/kernel/power/tuxonice_power_off.h b/kernel/power/tuxonice_power_off.h
new file mode 100644
index 0000000..9aa0ea8
--- /dev/null
+++ b/kernel/power/tuxonice_power_off.h
@@ -0,0 +1,24 @@
+/*
+ * kernel/power/tuxonice_power_off.h
+ *
+ * Copyright (C) 2006-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * Support for powering down.
+ */
+
+int toi_pm_state_finish(void);
+void toi_power_down(void);
+extern unsigned long toi_poweroff_method;
+int toi_poweroff_init(void);
+void toi_poweroff_exit(void);
+void toi_check_resleep(void);
+
+extern int platform_begin(int platform_mode);
+extern int platform_pre_snapshot(int platform_mode);
+extern void platform_leave(int platform_mode);
+extern void platform_end(int platform_mode);
+extern void platform_finish(int platform_mode);
+extern int platform_pre_restore(int platform_mode);
+extern void platform_restore_cleanup(int platform_mode);
diff --git a/kernel/power/tuxonice_prepare_image.c b/kernel/power/tuxonice_prepare_image.c
new file mode 100644
index 0000000..a2d4259
--- /dev/null
+++ b/kernel/power/tuxonice_prepare_image.c
@@ -0,0 +1,1115 @@
+/*
+ * kernel/power/tuxonice_prepare_image.c
+ *
+ * Copyright (C) 2003-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * We need to eat memory until we can:
+ * 1. Perform the save without changing anything (RAM_NEEDED < #pages)
+ * 2. Fit it all in available space (toiActiveAllocator->available_space() >=
+ *    main_storage_needed())
+ * 3. Reload the pagedir and pageset1 to places that don't collide with their
+ *    final destinations, not knowing to what extent the resumed kernel will
+ *    overlap with the one loaded at boot time. I think the resumed kernel
+ *    should overlap completely, but I don't want to rely on this as it is
+ *    an unproven assumption. We therefore assume there will be no overlap at
+ *    all (worst case).
+ * 4. Meet the user's requested limit (if any) on the size of the image.
+ *    The limit is in MB, so pages/256 (assuming 4K pages).
+ *
+ */
+
+#include <linux/highmem.h>
+#include <linux/freezer.h>
+#include <linux/hardirq.h>
+#include <linux/mmzone.h>
+#include <linux/console.h>
+
+#include "tuxonice_pageflags.h"
+#include "tuxonice_modules.h"
+#include "tuxonice_io.h"
+#include "tuxonice_ui.h"
+#include "tuxonice_prepare_image.h"
+#include "tuxonice.h"
+#include "tuxonice_extent.h"
+#include "tuxonice_checksum.h"
+#include "tuxonice_sysfs.h"
+#include "tuxonice_alloc.h"
+#include "tuxonice_atomic_copy.h"
+#include "tuxonice_builtin.h"
+
+static unsigned long num_nosave, main_storage_allocated, storage_limit,
+	    header_storage_needed;
+unsigned long extra_pd1_pages_allowance =
+	CONFIG_TOI_DEFAULT_EXTRA_PAGES_ALLOWANCE;
+long image_size_limit = CONFIG_TOI_DEFAULT_IMAGE_SIZE_LIMIT;
+static int no_ps2_needed;
+
+struct attention_list {
+	struct task_struct *task;
+	struct attention_list *next;
+};
+
+static struct attention_list *attention_list;
+
+#define PAGESET1 0
+#define PAGESET2 1
+
+void free_attention_list(void)
+{
+	struct attention_list *last = NULL;
+
+	while (attention_list) {
+		last = attention_list;
+		attention_list = attention_list->next;
+		toi_kfree(6, last, sizeof(*last));
+	}
+}
+
+static int build_attention_list(void)
+{
+	int i, task_count = 0;
+	struct task_struct *p;
+	struct attention_list *next;
+
+	/*
+	 * Count all processes marked PF_NOFREEZE, plus the current task.
+	 */
+	toi_read_lock_tasklist();
+	for_each_process(p)
+		if ((p->flags & PF_NOFREEZE) || p == current)
+			task_count++;
+	toi_read_unlock_tasklist();
+
+	/*
+	 * Allocate attention list structs.
+	 */
+	for (i = 0; i < task_count; i++) {
+		struct attention_list *this =
+			toi_kzalloc(6, sizeof(struct attention_list),
+					TOI_WAIT_GFP);
+		if (!this) {
+			printk(KERN_INFO "Failed to allocate slab for "
+					"attention list.\n");
+			free_attention_list();
+			return 1;
+		}
+		this->next = NULL;
+		if (attention_list)
+			this->next = attention_list;
+		attention_list = this;
+	}
+
+	next = attention_list;
+	toi_read_lock_tasklist();
+	for_each_process(p)
+		if ((p->flags & PF_NOFREEZE) || p == current) {
+			next->task = p;
+			next = next->next;
+		}
+	toi_read_unlock_tasklist();
+	return 0;
+}
+
+static void pageset2_full(void)
+{
+	struct zone *zone;
+	struct page *page;
+	unsigned long flags;
+	int i;
+
+	for_each_populated_zone(zone) {
+		spin_lock_irqsave(&zone->lru_lock, flags);
+		for_each_lru(i) {
+			if (!zone_page_state(zone, NR_LRU_BASE + i))
+				continue;
+
+			list_for_each_entry(page, &zone->lruvec.lists[i], lru) {
+				struct address_space *mapping;
+
+				mapping = page_mapping(page);
+				if (!mapping || !mapping->host ||
+				    !(mapping->host->i_flags & S_ATOMIC_COPY))
+					SetPagePageset2(page);
+			}
+		}
+		spin_unlock_irqrestore(&zone->lru_lock, flags);
+	}
+}
+
+/*
+ * toi_mark_task_as_pageset
+ * Functionality   : Marks all the saveable pages belonging to a given process
+ * 		     as belonging to a particular pageset.
+ */
+
+static void toi_mark_task_as_pageset(struct task_struct *t, int pageset2)
+{
+	struct vm_area_struct *vma;
+	struct mm_struct *mm;
+
+	mm = t->active_mm;
+
+	if (!mm || !mm->mmap)
+		return;
+
+	if (!irqs_disabled())
+		down_read(&mm->mmap_sem);
+
+	for (vma = mm->mmap; vma; vma = vma->vm_next) {
+		unsigned long posn;
+
+		if (!vma->vm_start ||
+		    vma->vm_flags & (VM_IO | VM_DONTDUMP | VM_PFNMAP))
+			continue;
+
+		for (posn = vma->vm_start; posn < vma->vm_end;
+				posn += PAGE_SIZE) {
+			struct page *page = follow_page(vma, posn, 0);
+			struct address_space *mapping;
+
+			if (!page || !pfn_valid(page_to_pfn(page)))
+				continue;
+
+			mapping = page_mapping(page);
+			if (mapping && mapping->host &&
+			    mapping->host->i_flags & S_ATOMIC_COPY)
+				continue;
+
+			if (pageset2)
+				SetPagePageset2(page);
+			else {
+				ClearPagePageset2(page);
+				SetPagePageset1(page);
+			}
+		}
+	}
+
+	if (!irqs_disabled())
+		up_read(&mm->mmap_sem);
+}
+
+static void mark_tasks(int pageset)
+{
+	struct task_struct *p;
+
+	toi_read_lock_tasklist();
+	for_each_process(p) {
+		if (!p->mm)
+			continue;
+
+		if (p->flags & PF_KTHREAD)
+			continue;
+
+		toi_mark_task_as_pageset(p, pageset);
+	}
+	toi_read_unlock_tasklist();
+
+}
+
+/* toi_mark_pages_for_pageset2
+ *
+ * Description:	Mark unshared pages in processes not needed for hibernate as
+ * 		being able to be written out in a separate pagedir.
+ * 		HighMem pages are simply marked as pageset2. They won't be
+ * 		needed during hibernate.
+ */
+
+static void toi_mark_pages_for_pageset2(void)
+{
+	struct attention_list *this = attention_list;
+
+	memory_bm_clear(pageset2_map);
+
+	if (test_action_state(TOI_NO_PAGESET2) || no_ps2_needed)
+		return;
+
+	if (test_action_state(TOI_PAGESET2_FULL))
+		pageset2_full();
+	else
+		mark_tasks(PAGESET2);
+
+	/*
+	 * Because the tasks in attention_list are ones related to hibernating,
+	 * we know that they won't go away under us.
+	 */
+
+	while (this) {
+		if (!test_result_state(TOI_ABORTED))
+			toi_mark_task_as_pageset(this->task, PAGESET1);
+		this = this->next;
+	}
+}
+
+/*
+ * The atomic copy of pageset1 is stored in pageset2 pages.
+ * But if pageset1 is larger (normally only just after boot),
+ * we need to allocate extra pages to store the atomic copy.
+ * The following data struct and functions are used to handle
+ * the allocation and freeing of that memory.
+ */
+
+static unsigned long extra_pages_allocated;
+
+struct extras {
+	struct page *page;
+	int order;
+	struct extras *next;
+};
+
+static struct extras *extras_list;
+
+/* toi_free_extra_pagedir_memory
+ *
+ * Description:	Free previously allocated extra pagedir memory.
+ */
+void toi_free_extra_pagedir_memory(void)
+{
+	/* Free allocated pages */
+	while (extras_list) {
+		struct extras *this = extras_list;
+		int i;
+
+		extras_list = this->next;
+
+		for (i = 0; i < (1 << this->order); i++)
+			ClearPageNosave(this->page + i);
+
+		toi_free_pages(9, this->page, this->order);
+		toi_kfree(7, this, sizeof(*this));
+	}
+
+	extra_pages_allocated = 0;
+}
+
+/* toi_allocate_extra_pagedir_memory
+ *
+ * Description:	Allocate memory for making the atomic copy of pagedir1 in the
+ * 		case where it is bigger than pagedir2.
+ * Arguments:	int	extra_pages_needed: Total number of extra pages needed.
+ * Result:	int. 	Number of extra pages we now have allocated.
+ */
+static int toi_allocate_extra_pagedir_memory(int extra_pages_needed)
+{
+	int j, order, num_to_alloc = extra_pages_needed - extra_pages_allocated;
+	gfp_t flags = TOI_ATOMIC_GFP;
+
+	if (num_to_alloc < 1)
+		return 0;
+
+	order = fls(num_to_alloc);
+	if (order >= MAX_ORDER)
+		order = MAX_ORDER - 1;
+
+	while (num_to_alloc) {
+		struct page *newpage;
+		unsigned long virt;
+		struct extras *extras_entry;
+
+		while ((1 << order) > num_to_alloc)
+			order--;
+
+		extras_entry = (struct extras *) toi_kzalloc(7,
+			sizeof(struct extras), TOI_ATOMIC_GFP);
+
+		if (!extras_entry)
+			return extra_pages_allocated;
+
+		virt = toi_get_free_pages(9, flags, order);
+		while (!virt && order) {
+			order--;
+			virt = toi_get_free_pages(9, flags, order);
+		}
+
+		if (!virt) {
+			toi_kfree(7, extras_entry, sizeof(*extras_entry));
+			return extra_pages_allocated;
+		}
+
+		newpage = virt_to_page(virt);
+
+		extras_entry->page = newpage;
+		extras_entry->order = order;
+		extras_entry->next = extras_list;
+
+		extras_list = extras_entry;
+
+		for (j = 0; j < (1 << order); j++) {
+			SetPageNosave(newpage + j);
+			SetPagePageset1Copy(newpage + j);
+		}
+
+		extra_pages_allocated += (1 << order);
+		num_to_alloc -= (1 << order);
+	}
+
+	return extra_pages_allocated;
+}
+
+/*
+ * real_nr_free_pages: Count free pages, including those on the pcp lists,
+ * in the zones selected by zone_idx_mask (bit n set = include zone_idx n).
+ */
+unsigned long real_nr_free_pages(unsigned long zone_idx_mask)
+{
+	struct zone *zone;
+	int result = 0, cpu;
+
+	/* PCP lists */
+	for_each_populated_zone(zone) {
+		if (!(zone_idx_mask & (1 << zone_idx(zone))))
+			continue;
+
+		for_each_online_cpu(cpu) {
+			struct per_cpu_pageset *pset =
+				per_cpu_ptr(zone->pageset, cpu);
+			struct per_cpu_pages *pcp = &pset->pcp;
+			result += pcp->count;
+		}
+
+		result += zone_page_state(zone, NR_FREE_PAGES);
+	}
+	return result;
+}
+EXPORT_SYMBOL_GPL(real_nr_free_pages);
+
+/*
+ * Discover how much extra memory will be required by the drivers
+ * when they're asked to hibernate. We can then ensure that amount
+ * of memory is available when we really want it.
+ */
+static void get_extra_pd1_allowance(void)
+{
+	unsigned long orig_num_free = real_nr_free_pages(all_zones_mask), final;
+
+	toi_prepare_status(CLEAR_BAR, "Finding allowance for drivers.");
+
+	if (toi_go_atomic(PMSG_FREEZE, 1))
+		return;
+
+	final = real_nr_free_pages(all_zones_mask);
+	toi_end_atomic(ATOMIC_ALL_STEPS, 1, 0);
+
+	extra_pd1_pages_allowance = (orig_num_free > final) ?
+		orig_num_free - final + MIN_EXTRA_PAGES_ALLOWANCE :
+		MIN_EXTRA_PAGES_ALLOWANCE;
+}
+
+/*
+ * Amount of storage needed, possibly taking into account the
+ * expected compression ratio and possibly also ignoring our
+ * allowance for extra pages.
+ */
+static unsigned long main_storage_needed(int use_ecr,
+		int ignore_extra_pd1_allow)
+{
+	return (pagedir1.size + pagedir2.size +
+	  (ignore_extra_pd1_allow ? 0 : extra_pd1_pages_allowance)) *
+	 (use_ecr ? toi_expected_compression_ratio() : 100) / 100;
+}
+
+/*
+ * Storage needed for the image header: summed in bytes, returned in pages.
+ */
+unsigned long get_header_storage_needed(void)
+{
+	unsigned long bytes = sizeof(struct toi_header) +
+			toi_header_storage_for_modules() +
+			toi_pageflags_space_needed() +
+			fs_info_space_needed();
+
+	return DIV_ROUND_UP(bytes, PAGE_SIZE);
+}
+EXPORT_SYMBOL_GPL(get_header_storage_needed);
+
+/*
+ * When freeing memory, pages from either pageset might be freed.
+ *
+ * When seeking to free memory to be able to hibernate, for every ps1 page
+ * freed, we need 2 fewer pages for the atomic copy because there is one less
+ * page to copy and one more page into which data can be copied.
+ *
+ * Freeing ps2 pages saves us nothing directly. No more memory is available
+ * for the atomic copy. Indirectly, a ps1 page might be freed (slab?), but
+ * that's too much work to figure out.
+ *
+ * => ps1_to_free functions
+ *
+ * Of course, if we just want to reduce the image size because of storage
+ * limitations or an image size limit, either pageset will do.
+ *
+ * => any_to_free function
+ */
+
+static unsigned long lowpages_usable_for_highmem_copy(void)
+{
+	unsigned long needed = get_lowmem_size(pagedir1) +
+			extra_pd1_pages_allowance + MIN_FREE_RAM +
+			toi_memory_for_modules(0),
+		available = get_lowmem_size(pagedir2) +
+			 real_nr_free_low_pages() + extra_pages_allocated;
+
+	return available > needed ? available - needed : 0;
+}
+
+static unsigned long highpages_ps1_to_free(void)
+{
+	unsigned long need = get_highmem_size(pagedir1),
+		      available = get_highmem_size(pagedir2) +
+			      real_nr_free_high_pages() +
+			      lowpages_usable_for_highmem_copy();
+
+	return need > available ? DIV_ROUND_UP(need - available, 2) : 0;
+}
+
+static unsigned long lowpages_ps1_to_free(void)
+{
+	unsigned long needed = get_lowmem_size(pagedir1) +
+			extra_pd1_pages_allowance + MIN_FREE_RAM +
+			toi_memory_for_modules(0),
+		available = get_lowmem_size(pagedir2) +
+			 real_nr_free_low_pages() + extra_pages_allocated;
+
+	return needed > available ? DIV_ROUND_UP(needed - available, 2) : 0;
+}
+
+static unsigned long current_image_size(void)
+{
+	return pagedir1.size + pagedir2.size + header_storage_needed;
+}
+
+static unsigned long storage_still_required(void)
+{
+	unsigned long needed = main_storage_needed(1, 1);
+	return needed > storage_limit ? needed - storage_limit : 0;
+}
+
+static unsigned long ram_still_required(void)
+{
+	unsigned long needed = MIN_FREE_RAM + toi_memory_for_modules(0) +
+		2 * extra_pd1_pages_allowance,
+		  available = real_nr_free_low_pages() + extra_pages_allocated;
+	return needed > available ? needed - available : 0;
+}
+
+unsigned long any_to_free(int use_image_size_limit)
+{
+	int use_soft_limit = use_image_size_limit && image_size_limit > 0;
+	unsigned long current_size = current_image_size(),
+		      soft_limit = use_soft_limit ? (image_size_limit << 8) : 0,
+		      to_free = use_soft_limit ? (current_size > soft_limit ?
+				      current_size - soft_limit : 0) : 0,
+		      storage_limit = storage_still_required(),
+		      ram_limit = ram_still_required(),
+		      first_max = max(to_free, storage_limit);
+
+	return max(first_max, ram_limit);
+}
+
+static int need_pageset2(void)
+{
+	return (real_nr_free_low_pages() + extra_pages_allocated -
+		2 * extra_pd1_pages_allowance - MIN_FREE_RAM -
+		 toi_memory_for_modules(0) - pagedir1.size) < pagedir2.size;
+}
+
+/* amount_needed
+ *
+ * Calculates the amount by which the image size needs to be reduced to meet
+ * our constraints.
+ */
+static unsigned long amount_needed(int use_image_size_limit)
+{
+	return max(highpages_ps1_to_free() + lowpages_ps1_to_free(),
+			any_to_free(use_image_size_limit));
+}
+
+static int image_not_ready(int use_image_size_limit)
+{
+	toi_message(TOI_EAT_MEMORY, TOI_LOW, 1,
+		"Amount still needed (%lu) > 0:%u,"
+		" Storage allocd: %lu < %lu: %u.\n",
+			amount_needed(use_image_size_limit),
+			(amount_needed(use_image_size_limit) > 0),
+			main_storage_allocated,
+			main_storage_needed(1, 1),
+			main_storage_allocated < main_storage_needed(1, 1));
+
+	toi_cond_pause(0, NULL);
+
+	return (amount_needed(use_image_size_limit) > 0) ||
+		 main_storage_allocated < main_storage_needed(1, 1);
+}
+
+static void display_failure_reason(int tries_exceeded)
+{
+	unsigned long storage_required = storage_still_required(),
+	    ram_required = ram_still_required(),
+	    high_ps1 = highpages_ps1_to_free(),
+	    low_ps1 = lowpages_ps1_to_free();
+
+	printk(KERN_INFO "Failed to prepare the image because...\n");
+
+	if (!storage_limit) {
+		printk(KERN_INFO "- You need some storage available to be "
+				"able to hibernate.\n");
+		return;
+	}
+
+	if (tries_exceeded)
+		printk(KERN_INFO "- The maximum number of iterations was "
+				"reached without successfully preparing the "
+				"image.\n");
+
+	if (storage_required) {
+		printk(KERN_INFO " - We need at least %lu pages of storage "
+				"(ignoring the header), but only have %lu.\n",
+				main_storage_needed(1, 1),
+				main_storage_allocated);
+		set_abort_result(TOI_INSUFFICIENT_STORAGE);
+	}
+
+	if (ram_required) {
+		printk(KERN_INFO " - We need %lu more free pages of low "
+				"memory.\n", ram_required);
+		printk(KERN_INFO "     Minimum free     : %8d\n", MIN_FREE_RAM);
+		printk(KERN_INFO "   + Reqd. by modules : %8lu\n",
+				toi_memory_for_modules(0));
+		printk(KERN_INFO "   + 2 * extra allow  : %8lu\n",
+				2 * extra_pd1_pages_allowance);
+		printk(KERN_INFO "   - Currently free   : %8lu\n",
+				real_nr_free_low_pages());
+		printk(KERN_INFO "   - Pages allocd     : %8lu\n",
+				extra_pages_allocated);
+		printk(KERN_INFO "                      : ========\n");
+		printk(KERN_INFO "     Still needed     : %8lu\n",
+				ram_required);
+
+		/* Print breakdown of memory needed for modules */
+		toi_memory_for_modules(1);
+		set_abort_result(TOI_UNABLE_TO_FREE_ENOUGH_MEMORY);
+	}
+
+	if (high_ps1) {
+		printk(KERN_INFO "- We need to free %lu highmem pageset 1 "
+				"pages.\n", high_ps1);
+		set_abort_result(TOI_UNABLE_TO_FREE_ENOUGH_MEMORY);
+	}
+
+	if (low_ps1) {
+		printk(KERN_INFO " - We need to free %lu lowmem pageset 1 "
+				"pages.\n", low_ps1);
+		set_abort_result(TOI_UNABLE_TO_FREE_ENOUGH_MEMORY);
+	}
+}
+
+static void display_stats(int always, int sub_extra_pd1_allow)
+{
+	char buffer[255];
+	snprintf(buffer, 254,
+		"Free:%lu(%lu). Sets:%lu(%lu),%lu(%lu). "
+		"Nosave:%lu-%lu=%lu. Storage:%lu/%lu(%lu=>%lu). "
+		"Needed:%lu,%lu,%lu(%u,%lu,%lu,%ld) (PS2:%s)\n",
+
+		/* Free */
+		real_nr_free_pages(all_zones_mask),
+		real_nr_free_low_pages(),
+
+		/* Sets */
+		pagedir1.size, pagedir1.size - get_highmem_size(pagedir1),
+		pagedir2.size, pagedir2.size - get_highmem_size(pagedir2),
+
+		/* Nosave */
+		num_nosave, extra_pages_allocated,
+		num_nosave - extra_pages_allocated,
+
+		/* Storage */
+		main_storage_allocated,
+		storage_limit,
+		main_storage_needed(1, sub_extra_pd1_allow),
+		main_storage_needed(1, 1),
+
+		/* Needed */
+		lowpages_ps1_to_free(), highpages_ps1_to_free(),
+		any_to_free(1),
+		MIN_FREE_RAM, toi_memory_for_modules(0),
+		extra_pd1_pages_allowance,
+		image_size_limit,
+
+		need_pageset2() ? "yes" : "no");
+
+	if (always)
+		printk("%s", buffer);
+	else
+		toi_message(TOI_EAT_MEMORY, TOI_MEDIUM, 1, "%s", buffer);
+}
+
+/* generate_free_page_map
+ *
+ * Description:	This routine generates a bitmap of free pages from the
+ * 		lists used by the memory manager. We then use the bitmap
+ * 		to quickly calculate which pages to save and in which
+ * 		pagesets.
+ */
+static void generate_free_page_map(void)
+{
+	int order, cpu, t;
+	unsigned long flags, i;
+	struct zone *zone;
+	struct list_head *curr;
+	unsigned long pfn;
+	struct page *page;
+
+	for_each_populated_zone(zone) {
+
+		if (!zone->spanned_pages)
+			continue;
+
+		spin_lock_irqsave(&zone->lock, flags);
+
+		for (i = 0; i < zone->spanned_pages; i++) {
+			pfn = zone->zone_start_pfn + i;
+
+			if (!pfn_valid(pfn))
+				continue;
+
+			page = pfn_to_page(pfn);
+
+			ClearPageNosaveFree(page);
+		}
+
+		for_each_migratetype_order(order, t) {
+			list_for_each(curr,
+					&zone->free_area[order].free_list[t]) {
+				unsigned long j;
+
+				pfn = page_to_pfn(list_entry(curr, struct page,
+							lru));
+				for (j = 0; j < (1UL << order); j++)
+					SetPageNosaveFree(pfn_to_page(pfn + j));
+			}
+		}
+
+		for_each_online_cpu(cpu) {
+			struct per_cpu_pageset *pset =
+				per_cpu_ptr(zone->pageset, cpu);
+			struct per_cpu_pages *pcp = &pset->pcp;
+			struct page *page;
+			int t;
+
+			for (t = 0; t < MIGRATE_PCPTYPES; t++)
+				list_for_each_entry(page, &pcp->lists[t], lru)
+					SetPageNosaveFree(page);
+		}
+
+		spin_unlock_irqrestore(&zone->lock, flags);
+	}
+}
+
+/* size_of_free_region
+ *
+ * Description:	Return the number of pages that are free, beginning with and
+ * 		including this one.
+ */
+static int size_of_free_region(struct zone *zone, unsigned long start_pfn)
+{
+	unsigned long this_pfn = start_pfn,
+		      end_pfn = zone->zone_start_pfn + zone->spanned_pages - 1;
+
+	while (pfn_valid(this_pfn) && this_pfn <= end_pfn && PageNosaveFree(pfn_to_page(this_pfn)))
+		this_pfn++;
+
+	return this_pfn - start_pfn;
+}
+
+/* flag_image_pages
+ *
+ * This routine generates our lists of pages to be stored in each
+ * pageset. Since we store the data using extents, and adding new
+ * extents might allocate a new extent page, this routine may well
+ * be called more than once.
+ */
+static void flag_image_pages(int atomic_copy)
+{
+	int num_free = 0;
+	unsigned long loop;
+	struct zone *zone;
+
+	pagedir1.size = 0;
+	pagedir2.size = 0;
+
+	set_highmem_size(pagedir1, 0);
+	set_highmem_size(pagedir2, 0);
+
+	num_nosave = 0;
+
+	memory_bm_clear(pageset1_map);
+
+	generate_free_page_map();
+
+	/*
+	 * Pages not to be saved are marked Nosave irrespective of being
+	 * reserved.
+	 */
+	for_each_populated_zone(zone) {
+		int highmem = is_highmem(zone);
+
+		for (loop = 0; loop < zone->spanned_pages; loop++) {
+			unsigned long pfn = zone->zone_start_pfn + loop;
+			struct page *page;
+			int chunk_size;
+
+			if (!pfn_valid(pfn))
+				continue;
+
+			chunk_size = size_of_free_region(zone, pfn);
+			if (chunk_size) {
+				num_free += chunk_size;
+				loop += chunk_size - 1;
+				continue;
+			}
+
+			page = pfn_to_page(pfn);
+
+			if (PageNosave(page)) {
+				num_nosave++;
+				continue;
+			}
+
+			page = highmem ? saveable_highmem_page(zone, pfn) :
+				saveable_page(zone, pfn);
+
+			if (!page) {
+				num_nosave++;
+				continue;
+			}
+
+			if (PagePageset2(page)) {
+				pagedir2.size++;
+				if (PageHighMem(page))
+					inc_highmem_size(pagedir2);
+				else
+					SetPagePageset1Copy(page);
+				if (PageResave(page)) {
+					SetPagePageset1(page);
+					ClearPagePageset1Copy(page);
+					pagedir1.size++;
+					if (PageHighMem(page))
+						inc_highmem_size(pagedir1);
+				}
+			} else {
+				pagedir1.size++;
+				SetPagePageset1(page);
+				if (PageHighMem(page))
+					inc_highmem_size(pagedir1);
+			}
+		}
+	}
+
+	if (!atomic_copy)
+		toi_message(TOI_EAT_MEMORY, TOI_MEDIUM, 0,
+			"Count data pages: Set1 (%lu) + Set2 (%lu) + Nosave (%lu)"
+						" + NumFree (%d) = %lu.\n",
+			pagedir1.size, pagedir2.size, num_nosave, num_free,
+			pagedir1.size + pagedir2.size + num_nosave + num_free);
+}
+
+void toi_recalculate_image_contents(int atomic_copy)
+{
+	memory_bm_clear(pageset1_map);
+	if (!atomic_copy) {
+		unsigned long pfn;
+		memory_bm_position_reset(pageset2_map);
+		for (pfn = memory_bm_next_pfn(pageset2_map);
+				pfn != BM_END_OF_MAP;
+				pfn = memory_bm_next_pfn(pageset2_map))
+			ClearPagePageset1Copy(pfn_to_page(pfn));
+		/* Need to call this before getting pageset1_size! */
+		toi_mark_pages_for_pageset2();
+	}
+	flag_image_pages(atomic_copy);
+
+	if (!atomic_copy) {
+		storage_limit = toiActiveAllocator->storage_available();
+		display_stats(0, 0);
+	}
+}
+
+int try_allocate_extra_memory(void)
+{
+	unsigned long wanted = pagedir1.size + extra_pd1_pages_allowance -
+		get_lowmem_size(pagedir2);
+	if (wanted > extra_pages_allocated) {
+		unsigned long got = toi_allocate_extra_pagedir_memory(wanted);
+		if (wanted < got) {
+			toi_message(TOI_EAT_MEMORY, TOI_LOW, 1,
+				"Want %lu extra pages for pageset1, got %lu.\n",
+				wanted, got);
+			return 1;
+		}
+	}
+	return 0;
+}
+
+
+/* update_image
+ *
+ * Allocate [more] memory and storage for the image.
+ */
+static void update_image(int ps2_recalc)
+{
+	int old_header_req;
+	unsigned long seek;
+
+	if (try_allocate_extra_memory())
+		return;
+
+	if (ps2_recalc)
+		goto recalc;
+
+	thaw_kernel_threads();
+
+	/*
+	 * Allocate remaining storage space, if possible, up to the
+	 * maximum we know we'll need. It's okay to allocate the
+	 * maximum if the writer is the swapwriter, but
+	 * we don't want to grab all available space on an NFS share.
+	 * We therefore ignore the expected compression ratio here,
+	 * thereby trying to allocate the maximum image size we could
+	 * need (assuming compression doesn't expand the image), but
+	 * don't complain if we can't get the full amount we're after.
+	 */
+
+	do {
+		int result;
+
+		old_header_req = header_storage_needed;
+		toiActiveAllocator->reserve_header_space(header_storage_needed);
+
+		/* How much storage is free with the reservation applied? */
+		storage_limit = toiActiveAllocator->storage_available();
+		seek = min(storage_limit, main_storage_needed(0, 0));
+
+		result = toiActiveAllocator->allocate_storage(seek);
+		if (result)
+			printk("Failed to allocate storage (%d).\n", result);
+
+		main_storage_allocated =
+			toiActiveAllocator->storage_allocated();
+
+		/* Need more header because more storage allocated? */
+		header_storage_needed = get_header_storage_needed();
+
+	} while (header_storage_needed > old_header_req);
+
+	if (freeze_kernel_threads())
+		set_abort_result(TOI_FREEZING_FAILED);
+
+recalc:
+	toi_recalculate_image_contents(0);
+}
+
+/* attempt_to_freeze
+ *
+ * Try to freeze processes.
+ */
+
+static int attempt_to_freeze(void)
+{
+	int result;
+
+	/* Stop processes before checking again */
+	toi_prepare_status(CLEAR_BAR, "Freezing processes & syncing "
+			"filesystems.");
+	result = freeze_processes();
+
+	if (result)
+		set_abort_result(TOI_FREEZING_FAILED);
+
+	result = freeze_kernel_threads();
+
+	if (result)
+		set_abort_result(TOI_FREEZING_FAILED);
+
+	return result;
+}
+
+/* eat_memory
+ *
+ * Try to free some memory, either to meet hard or soft constraints on the image
+ * characteristics.
+ *
+ * Hard constraints:
+ * - Pageset1 must be < half of memory;
+ * - We must have enough memory free at resume time to have pageset1
+ *   be able to be loaded in pages that don't conflict with where it has to
+ *   be restored.
+ * Soft constraints:
+ * - User-specified image size limit.
+ */
+static void eat_memory(void)
+{
+	unsigned long amount_wanted = 0;
+	int did_eat_memory = 0;
+
+	/*
+	 * Note that if we have enough storage space and enough free memory, we
+	 * may exit without eating anything. We give up when the last 10
+	 * iterations ate no extra pages because we're not going to get much
+	 * more anyway, but the few pages we get will take a lot of time.
+	 *
+	 * We freeze processes before beginning, and then unfreeze them if we
+	 * need to eat memory until we think we have enough. If our attempts
+	 * to freeze fail, we give up and abort.
+	 */
+
+	amount_wanted = amount_needed(1);
+
+	switch (image_size_limit) {
+	case -1: /* Don't eat any memory */
+		if (amount_wanted > 0) {
+			set_abort_result(TOI_WOULD_EAT_MEMORY);
+			return;
+		}
+		break;
+	case -2:  /* Free caches only */
+		drop_pagecache();
+		toi_recalculate_image_contents(0);
+		amount_wanted = amount_needed(1);
+		break;
+	default:
+		break;
+	}
+
+	if (amount_wanted > 0 && !test_result_state(TOI_ABORTED) &&
+			image_size_limit != -1) {
+		unsigned long request = amount_wanted;
+		unsigned long high_req = max(highpages_ps1_to_free(),
+				any_to_free(1));
+		unsigned long low_req = lowpages_ps1_to_free();
+		unsigned long got = 0;
+
+		toi_prepare_status(CLEAR_BAR,
+				"Seeking to free %ldMB of memory.",
+				MB(amount_wanted));
+
+		thaw_kernel_threads();
+
+		/*
+		 * Ask for too many because shrink_memory_mask doesn't
+		 * currently return enough most of the time.
+		 */
+
+		if (low_req)
+			got = shrink_memory_mask(low_req, GFP_KERNEL);
+		if (high_req)
+			shrink_memory_mask(high_req - got, GFP_HIGHUSER);
+
+		did_eat_memory = 1;
+
+		toi_recalculate_image_contents(0);
+
+		amount_wanted = amount_needed(1);
+
+		printk(KERN_DEBUG "Asked shrink_memory_mask for %ld low pages &"
+				" %ld pages from anywhere, got %ld.\n",
+				low_req, high_req,
+				request - amount_wanted);
+
+		toi_cond_pause(0, NULL);
+
+		if (freeze_kernel_threads())
+			set_abort_result(TOI_FREEZING_FAILED);
+	}
+
+	if (did_eat_memory)
+		toi_recalculate_image_contents(0);
+}
+
+/* toi_prepare_image
+ *
+ * Entry point to the whole image preparation section.
+ *
+ * We do four things:
+ * - Freeze processes;
+ * - Ensure image size constraints are met;
+ * - Complete all the preparation for saving the image,
+ *   including allocation of storage. The only memory
+ *   that should be needed when we're finished is that
+ *   for actually storing the image (and we know how
+ *   much is needed for that because the modules tell
+ *   us).
+ * - Make sure that all dirty buffers are written out.
+ */
+#define MAX_TRIES 2
+int toi_prepare_image(void)
+{
+	int result = 1, tries = 1;
+
+	main_storage_allocated = 0;
+	no_ps2_needed = 0;
+
+	if (attempt_to_freeze())
+		return 1;
+
+	if (!extra_pd1_pages_allowance)
+		get_extra_pd1_allowance();
+
+	storage_limit = toiActiveAllocator->storage_available();
+
+	if (!storage_limit) {
+		printk(KERN_INFO "No storage available. Didn't try to prepare "
+				"an image.\n");
+		display_failure_reason(0);
+		set_abort_result(TOI_NOSTORAGE_AVAILABLE);
+		return 1;
+	}
+
+	if (build_attention_list()) {
+		abort_hibernate(TOI_UNABLE_TO_PREPARE_IMAGE,
+				"Unable to successfully prepare the image.\n");
+		return 1;
+	}
+
+	toi_recalculate_image_contents(0);
+
+	do {
+		toi_prepare_status(CLEAR_BAR,
+				"Preparing Image. Try %d.", tries);
+
+		eat_memory();
+
+		if (test_result_state(TOI_ABORTED))
+			break;
+
+		update_image(0);
+
+		tries++;
+
+	} while (image_not_ready(1) && tries <= MAX_TRIES &&
+			!test_result_state(TOI_ABORTED));
+
+	result = image_not_ready(0);
+
+	if (!test_result_state(TOI_ABORTED)) {
+		if (result) {
+			display_stats(1, 0);
+			display_failure_reason(tries > MAX_TRIES);
+			abort_hibernate(TOI_UNABLE_TO_PREPARE_IMAGE,
+				"Unable to successfully prepare the image.\n");
+		} else {
+			/* Pageset 2 needed? */
+			if (!need_pageset2() &&
+				  test_action_state(TOI_NO_PS2_IF_UNNEEDED)) {
+				no_ps2_needed = 1;
+				toi_recalculate_image_contents(0);
+				update_image(1);
+			}
+
+			toi_cond_pause(1, "Image preparation complete.");
+		}
+	}
+
+	return result ? result : allocate_checksum_pages();
+}
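Note on the sizing helpers in kernel/power/tuxonice_prepare_image.c above:
the following is a minimal userspace sketch of the shortfall arithmetic
behind any_to_free(), not the kernel code itself. The struct image_state
fields and the names shortfall(), max_ul() and pages_to_free() are
hypothetical stand-ins for the globals and helpers in the patch, and the
shift by 8 converts megabytes to pages only under the assumption of 4 KiB
pages.

```c
#include <assert.h>

/* Hypothetical snapshot of counters the patch keeps in globals. */
struct image_state {
	unsigned long image_pages;       /* pagedir1 + pagedir2 + header */
	long size_limit_mb;              /* soft limit in MB; <= 0 means none */
	unsigned long storage_needed;    /* pages of storage the image needs */
	unsigned long storage_available; /* pages of storage on offer */
	unsigned long ram_needed;        /* low-memory pages that must be free */
	unsigned long ram_available;     /* low-memory pages free right now */
};

/* Saturating subtraction: 0 instead of wrapping when there is enough. */
static unsigned long shortfall(unsigned long needed, unsigned long available)
{
	return needed > available ? needed - available : 0;
}

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* Pages to free: the largest of the soft-limit, storage and RAM shortfalls. */
unsigned long pages_to_free(const struct image_state *s, int use_soft_limit)
{
	unsigned long over_limit = 0;

	assert(s);
	if (use_soft_limit && s->size_limit_mb > 0)
		over_limit = shortfall(s->image_pages,
				(unsigned long) s->size_limit_mb << 8);

	return max_ul(max_ul(over_limit,
			shortfall(s->storage_needed, s->storage_available)),
			shortfall(s->ram_needed, s->ram_available));
}
```

The sketch takes the maximum of the three shortfalls rather than their sum,
as the patch does, presumably because one freed page can count towards all
three constraints at once.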
diff --git a/kernel/power/tuxonice_prepare_image.h b/kernel/power/tuxonice_prepare_image.h
new file mode 100644
index 0000000..2a2ca0b
--- /dev/null
+++ b/kernel/power/tuxonice_prepare_image.h
@@ -0,0 +1,38 @@
+/*
+ * kernel/power/tuxonice_prepare_image.h
+ *
+ * Copyright (C) 2003-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ */
+
+#include <asm/sections.h>
+
+extern int toi_prepare_image(void);
+extern void toi_recalculate_image_contents(int atomic_copy);
+extern unsigned long real_nr_free_pages(unsigned long zone_idx_mask);
+extern long image_size_limit;
+extern void toi_free_extra_pagedir_memory(void);
+extern unsigned long extra_pd1_pages_allowance;
+extern void free_attention_list(void);
+
+#define MIN_FREE_RAM 100
+#define MIN_EXTRA_PAGES_ALLOWANCE 500
+
+#define all_zones_mask ((unsigned long) ((1 << MAX_NR_ZONES) - 1))
+#ifdef CONFIG_HIGHMEM
+#define real_nr_free_high_pages() (real_nr_free_pages(1 << ZONE_HIGHMEM))
+#define real_nr_free_low_pages() (real_nr_free_pages(all_zones_mask - \
+						(1 << ZONE_HIGHMEM)))
+#else
+#define real_nr_free_high_pages() (0)
+#define real_nr_free_low_pages() (real_nr_free_pages(all_zones_mask))
+
+/* For eat_memory function */
+#define ZONE_HIGHMEM (MAX_NR_ZONES + 1)
+#endif
+
+unsigned long get_header_storage_needed(void);
+unsigned long any_to_free(int use_image_size_limit);
+int try_allocate_extra_memory(void);
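Note on the zone masks in kernel/power/tuxonice_prepare_image.h above: a
minimal userspace sketch of how a bitmask of zone indices can select which
zones are counted, in the style of real_nr_free_pages(). The three-zone
enum, the per-zone array and free_pages_in_zones() are illustrative
assumptions; the real function reads the kernel's zone structures.

```c
#include <assert.h>

/* Illustrative zone layout; the kernel's set depends on configuration. */
enum sketch_zone { SK_ZONE_DMA, SK_ZONE_NORMAL, SK_ZONE_HIGHMEM, SK_NR_ZONES };

#define SK_ALL_ZONES_MASK ((1UL << SK_NR_ZONES) - 1)
#define SK_HIGH_ZONES_MASK (1UL << SK_ZONE_HIGHMEM)
#define SK_LOW_ZONES_MASK (SK_ALL_ZONES_MASK - SK_HIGH_ZONES_MASK)

/* Sum the free-page counts of every zone whose bit is set in mask. */
unsigned long free_pages_in_zones(const unsigned long *free_per_zone,
		unsigned long mask)
{
	unsigned long total = 0;
	int i;

	assert(free_per_zone);
	for (i = 0; i < SK_NR_ZONES; i++)
		if (mask & (1UL << i))
			total += free_per_zone[i];

	return total;
}
```

With one bit per zone, the low-memory mask is simply the all-zones mask with
the highmem bit removed, which is what the subtraction in the header does.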
diff --git a/kernel/power/tuxonice_prune.c b/kernel/power/tuxonice_prune.c
new file mode 100644
index 0000000..9a9444d
--- /dev/null
+++ b/kernel/power/tuxonice_prune.c
@@ -0,0 +1,419 @@
+/*
+ * kernel/power/tuxonice_prune.c
+ *
+ * Copyright (C) 2012 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * This file implements a TuxOnIce module that seeks to prune the
+ * amount of data written to disk. It builds a table of hashes
+ * of the uncompressed data, and writes the pfn of the previous page
+ * with the same contents instead of repeating the data when a match
+ * is found.
+ */
+
+#include <linux/suspend.h>
+#include <linux/highmem.h>
+#include <linux/vmalloc.h>
+#include <linux/crypto.h>
+#include <linux/scatterlist.h>
+#include <crypto/hash.h>
+
+#include "tuxonice_builtin.h"
+#include "tuxonice.h"
+#include "tuxonice_modules.h"
+#include "tuxonice_sysfs.h"
+#include "tuxonice_io.h"
+#include "tuxonice_ui.h"
+#include "tuxonice_alloc.h"
+
+/*
+ * We never write a page bigger than PAGE_SIZE, so use a large number
+ * to indicate that data is a PFN.
+ */
+#define PRUNE_DATA_IS_PFN (PAGE_SIZE + 100)
+
+static unsigned long toi_pruned_pages;
+
+static struct toi_module_ops toi_prune_ops;
+static struct toi_module_ops *next_driver;
+
+static char toi_prune_hash_algo_name[32] = "sha1";
+
+static DEFINE_MUTEX(stats_lock);
+
+struct cpu_context {
+	struct shash_desc desc;
+	char *digest;
+};
+
+#define OUT_BUF_SIZE (2 * PAGE_SIZE)
+
+static DEFINE_PER_CPU(struct cpu_context, contexts);
+
+/*
+ * toi_prune_crypto_prepare
+ *
+ * Prepare to do some work by allocating buffers and transforms.
+ */
+static int toi_prune_crypto_prepare(void)
+{
+	int cpu, ret, digestsize = 0;
+
+	if (!*toi_prune_hash_algo_name) {
+		printk(KERN_INFO "TuxOnIce: Pruning enabled but no "
+				"hash algorithm set.\n");
+		return 1;
+	}
+
+	for_each_online_cpu(cpu) {
+		struct cpu_context *this = &per_cpu(contexts, cpu);
+		this->desc.tfm = crypto_alloc_shash(toi_prune_hash_algo_name, 0, 0);
+		if (IS_ERR(this->desc.tfm)) {
+			printk(KERN_INFO "TuxOnIce: Failed to allocate the "
+					"%s prune hash algorithm.\n",
+					toi_prune_hash_algo_name);
+			this->desc.tfm = NULL;
+			return 1;
+		}
+
+		if (!digestsize)
+			digestsize = crypto_shash_digestsize(this->desc.tfm);
+
+		this->digest = kmalloc(digestsize, GFP_KERNEL);
+		if (!this->digest) {
+			printk(KERN_INFO "TuxOnIce: Failed to allocate space "
+					"for digest output.\n");
+			crypto_free_shash(this->desc.tfm);
+			this->desc.tfm = NULL;
+			return 1;
+		}
+		this->desc.flags = 0;
+
+		ret = crypto_shash_init(&this->desc);
+		if (ret < 0) {
+			printk(KERN_INFO "TuxOnIce: Failed to initialise the "
+					"%s prune hash algorithm.\n",
+					toi_prune_hash_algo_name);
+			kfree(this->digest);
+			this->digest = NULL;
+			crypto_free_shash(this->desc.tfm);
+			this->desc.tfm = NULL;
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
+static int toi_prune_rw_cleanup(int writing)
+{
+	int cpu;
+
+	for_each_online_cpu(cpu) {
+		struct cpu_context *this = &per_cpu(contexts, cpu);
+		if (this->desc.tfm) {
+			crypto_free_shash(this->desc.tfm);
+			this->desc.tfm = NULL;
+		}
+
+		if (this->digest) {
+			kfree(this->digest);
+			this->digest = NULL;
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * toi_prune_init
+ */
+
+static int toi_prune_init(int toi_or_resume)
+{
+	if (!toi_or_resume)
+		return 0;
+
+	toi_pruned_pages = 0;
+
+	next_driver = toi_get_next_filter(&toi_prune_ops);
+
+	return next_driver ? 0 : -ECHILD;
+}
+
+/*
+ * toi_prune_rw_init()
+ */
+
+static int toi_prune_rw_init(int rw, int stream_number)
+{
+	if (toi_prune_crypto_prepare()) {
+		printk(KERN_ERR "Failed to initialise prune "
+				"algorithm.\n");
+		if (rw == READ) {
+			printk(KERN_INFO "Unable to read the image.\n");
+			return -ENODEV;
+		} else {
+			printk(KERN_INFO "Continuing without "
+				"pruning the image.\n");
+			toi_prune_ops.enabled = 0;
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * toi_prune_write_page()
+ *
+ * Hash a page of data, then pass the page on to the next module in
+ * the pipeline.
+ *
+ * Buffer_page:	Pointer to a buffer of size PAGE_SIZE, containing
+ * data to be checked.
+ *
+ * Returns:	0 on success. Otherwise the error is that returned by later
+ * 		modules. A failure to calculate the digest is logged but
+ * 		does not stop the page being written.
+ */
+static int toi_prune_write_page(unsigned long index, int buf_type,
+		void *buffer_page, unsigned int buf_size)
+{
+	int ret = 0, cpu = smp_processor_id(), write_data = 1;
+	struct cpu_context *ctx = &per_cpu(contexts, cpu);
+	u8 *output_buffer = buffer_page;
+	int output_len = buf_size;
+	int out_buf_type = buf_type;
+	void *buffer_start;
+	u32 buf[4];
+
+	if (ctx->desc.tfm) {
+
+		buffer_start = TOI_MAP(buf_type, buffer_page);
+
+		ret = crypto_shash_digest(&ctx->desc, buffer_start, buf_size,
+				(u8 *) ctx->digest);
+		if (ret) {
+			printk(KERN_INFO "TuxOnIce: Failed to calculate digest (%d).\n", ret);
+		} else {
+			mutex_lock(&stats_lock);
+
+			toi_pruned_pages++;
+
+			mutex_unlock(&stats_lock);
+
+		}
+
+		TOI_UNMAP(buf_type, buffer_page);
+	}
+
+	if (write_data)
+		ret = next_driver->write_page(index, out_buf_type,
+				output_buffer, output_len);
+	else
+		ret = next_driver->write_page(index, out_buf_type,
+				output_buffer, output_len);
+
+	return ret;
+}
+
+/*
+ * toi_prune_read_page()
+ * @buffer_page: struct page *. Pointer to a buffer of size PAGE_SIZE.
+ *
+ * Retrieve data from later modules or from a previously loaded page and
+ * fill the input buffer.
+ * Zero if successful. Error condition from me or from downstream on failure.
+ */
+static int toi_prune_read_page(unsigned long *index, int buf_type,
+		void *buffer_page, unsigned int *buf_size)
+{
+	int ret, cpu = smp_processor_id();
+	unsigned int len;
+	char *buffer_start;
+	struct cpu_context *ctx = &per_cpu(contexts, cpu);
+
+	if (!ctx->desc.tfm)
+		return next_driver->read_page(index, TOI_PAGE, buffer_page,
+				buf_size);
+
+	/*
+	 * All our reads must be synchronous - we can't handle
+	 * data that hasn't been read yet.
+	 */
+
+	ret = next_driver->read_page(index, buf_type, buffer_page, &len);
+
+	/* No PFN records are written yet, so every page holds real data. */
+	if (!ret)
+		*buf_size = len;
+
+	return ret;
+}
+
+/*
+ * toi_prune_print_debug_stats
+ * @buffer: Pointer to a buffer into which the debug info will be printed.
+ * @size: Size of the buffer.
+ *
+ * Print information to be recorded for debugging purposes into a buffer.
+ * Returns: Number of characters written to the buffer.
+ */
+
+static int toi_prune_print_debug_stats(char *buffer, int size)
+{
+	int len;
+
+	/* Output the hash algorithm and the number of pages pruned. */
+	if (*toi_prune_hash_algo_name)
+		len = scnprintf(buffer, size, "- Prune hash is '%s'.\n",
+				toi_prune_hash_algo_name);
+	else
+		len = scnprintf(buffer, size, "- Prune hash is not set.\n");
+
+	if (toi_pruned_pages)
+		len += scnprintf(buffer+len, size - len, "  Pruned "
+			"%lu pages.\n",
+		  toi_pruned_pages);
+	return len;
+}
+
+/*
+ * toi_prune_memory_needed
+ *
+ * Tell the caller how much memory we need to operate during hibernate/resume.
+ * Returns: The maximum number of bytes of memory required for
+ * operation.
+ */
+static int toi_prune_memory_needed(void)
+{
+	return 2 * PAGE_SIZE;
+}
+
+static int toi_prune_storage_needed(void)
+{
+	return 2 * sizeof(unsigned long) + 2 * sizeof(int) +
+		strlen(toi_prune_hash_algo_name) + 1;
+}
+
+/*
+ * toi_prune_save_config_info
+ * @buffer: Pointer to a buffer of size PAGE_SIZE.
+ *
+ * Save information needed when reloading the image at resume time.
+ * Returns: Number of bytes used for saving our data.
+ */
+static int toi_prune_save_config_info(char *buffer)
+{
+	int len = strlen(toi_prune_hash_algo_name) + 1, offset = 0;
+
+	*((unsigned long *) buffer) = toi_pruned_pages;
+	offset += sizeof(unsigned long);
+	*((int *) (buffer + offset)) = len;
+	offset += sizeof(int);
+	strncpy(buffer + offset, toi_prune_hash_algo_name, len);
+	return offset + len;
+}
+
+/* toi_prune_load_config_info
+ * @buffer: Pointer to the start of the data.
+ * @size: Number of bytes that were saved.
+ *
+ * Description:	Reload information needed for passing back to the
+ * resumed kernel.
+ */
+static void toi_prune_load_config_info(char *buffer, int size)
+{
+	int len, offset = 0;
+
+	toi_pruned_pages = *((unsigned long *) buffer);
+	offset += sizeof(unsigned long);
+	len = *((int *) (buffer + offset));
+	offset += sizeof(int);
+	strncpy(toi_prune_hash_algo_name, buffer + offset, len);
+}
+
+static void toi_prune_pre_atomic_restore(struct toi_boot_kernel_data *bkd)
+{
+	bkd->pruned_pages = toi_pruned_pages;
+}
+
+static void toi_prune_post_atomic_restore(struct toi_boot_kernel_data *bkd)
+{
+	toi_pruned_pages = bkd->pruned_pages;
+}
+
+/*
+ * toi_prune_expected_ratio
+ *
+ * Description:	Returns the expected ratio between data passed into this module
+ * 		and the amount of data output when writing.
+ * Returns:	100 - we have no idea how many pages will be pruned.
+ */
+
+static int toi_prune_expected_ratio(void)
+{
+	return 100;
+}
+
+/*
+ * data for our sysfs entries.
+ */
+static struct toi_sysfs_data sysfs_params[] = {
+	SYSFS_INT("enabled", SYSFS_RW, &toi_prune_ops.enabled, 0, 1, 0,
+			NULL),
+	SYSFS_STRING("algorithm", SYSFS_RW, toi_prune_hash_algo_name, 31, 0, NULL),
+};
+
+/*
+ * Ops structure.
+ */
+static struct toi_module_ops toi_prune_ops = {
+	.type			= FILTER_MODULE,
+	.name			= "prune",
+	.directory		= "prune",
+	.module			= THIS_MODULE,
+	.initialise		= toi_prune_init,
+	.memory_needed 		= toi_prune_memory_needed,
+	.print_debug_info	= toi_prune_print_debug_stats,
+	.save_config_info	= toi_prune_save_config_info,
+	.load_config_info	= toi_prune_load_config_info,
+	.storage_needed		= toi_prune_storage_needed,
+	.expected_compression	= toi_prune_expected_ratio,
+
+	.pre_atomic_restore	= toi_prune_pre_atomic_restore,
+	.post_atomic_restore	= toi_prune_post_atomic_restore,
+
+	.rw_init		= toi_prune_rw_init,
+	.rw_cleanup		= toi_prune_rw_cleanup,
+
+	.write_page		= toi_prune_write_page,
+	.read_page		= toi_prune_read_page,
+
+	.sysfs_data		= sysfs_params,
+	.num_sysfs_entries	= sizeof(sysfs_params) /
+		sizeof(struct toi_sysfs_data),
+};
+
+/* ---- Registration ---- */
+
+static __init int toi_prune_load(void)
+{
+	return toi_register_module(&toi_prune_ops);
+}
+
+#ifdef MODULE
+static __exit void toi_prune_unload(void)
+{
+	toi_unregister_module(&toi_prune_ops);
+}
+
+module_init(toi_prune_load);
+module_exit(toi_prune_unload);
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Nigel Cunningham");
+MODULE_DESCRIPTION("Image Pruning Support for TuxOnIce");
+#else
+late_initcall(toi_prune_load);
+#endif
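Note on kernel/power/tuxonice_prune.c above: its header comment describes
replacing a repeated page with a reference to the earlier page that has the
same contents. A minimal userspace sketch of that lookup follows. It swaps
in an FNV-1a hash and an open-addressed table instead of the kernel crypto
API, and SK_PAGE_SIZE, SK_SLOTS, struct dedup_table, fnv1a() and
dedup_lookup_or_insert() are hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_PAGE_SIZE 64   /* tiny "page" to keep the sketch small */
#define SK_SLOTS 128      /* must be a power of two */

struct dedup_table {
	const unsigned char *page[SK_SLOTS]; /* first page seen per slot */
	long index[SK_SLOTS];                /* that page's index (its "pfn") */
};

/* 32-bit FNV-1a, used here only as a cheap stand-in for sha1. */
static uint32_t fnv1a(const unsigned char *data, size_t len)
{
	uint32_t hash = 2166136261u;
	size_t i;

	for (i = 0; i < len; i++) {
		hash ^= data[i];
		hash *= 16777619u;
	}
	return hash;
}

/*
 * Return the index of an earlier page with identical contents, or -1 if
 * this page is new (in which case it is recorded for later lookups).
 */
long dedup_lookup_or_insert(struct dedup_table *t, const unsigned char *page,
		long index)
{
	uint32_t start, probe;

	assert(t && page);
	start = fnv1a(page, SK_PAGE_SIZE) & (SK_SLOTS - 1);

	for (probe = 0; probe < SK_SLOTS; probe++) {
		uint32_t slot = (start + probe) & (SK_SLOTS - 1);

		if (!t->page[slot]) {
			t->page[slot] = page;
			t->index[slot] = index;
			return -1;
		}
		/* Slots can collide, so confirm with a full compare. */
		if (!memcmp(t->page[slot], page, SK_PAGE_SIZE))
			return t->index[slot];
	}
	return -1; /* table full: treat the page as new */
}
```

A writer built on this would emit the returned index in place of the page
data whenever the result is not -1; the module as posted only computes the
digest and always passes the page through.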
diff --git a/kernel/power/tuxonice_storage.c b/kernel/power/tuxonice_storage.c
new file mode 100644
index 0000000..dcf83f4
--- /dev/null
+++ b/kernel/power/tuxonice_storage.c
@@ -0,0 +1,283 @@
+/*
+ * kernel/power/tuxonice_storage.c
+ *
+ * Copyright (C) 2005-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * Routines for talking to a userspace program that manages storage.
+ *
+ * The kernel side:
+ * - starts the userspace program;
+ * - sends messages telling it when to open and close the connection;
+ * - tells it when to quit;
+ *
+ * The user space side:
+ * - passes messages regarding status;
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/suspend.h>
+#include <linux/freezer.h>
+
+#include "tuxonice_sysfs.h"
+#include "tuxonice_modules.h"
+#include "tuxonice_netlink.h"
+#include "tuxonice_storage.h"
+#include "tuxonice_ui.h"
+
+static struct user_helper_data usm_helper_data;
+static struct toi_module_ops usm_ops;
+static int message_received, usm_prepare_count;
+static int storage_manager_last_action, storage_manager_action;
+
+static int usm_user_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
+{
+	int type;
+	int *data;
+
+	type = nlh->nlmsg_type;
+
+	/* Control messages: ignore them */
+	if (type < NETLINK_MSG_BASE)
+		return 0;
+
+	/* Unknown message: reply with EINVAL */
+	if (type >= USM_MSG_MAX)
+		return -EINVAL;
+
+	/* All operations require privileges, even GET */
+	if (!capable(CAP_NET_ADMIN))
+		return -EPERM;
+
+	/* Only allow one task to receive NOFREEZE privileges */
+	if (type == NETLINK_MSG_NOFREEZE_ME && usm_helper_data.pid != -1)
+		return -EBUSY;
+
+	data = (int *) NLMSG_DATA(nlh);
+
+	switch (type) {
+	case USM_MSG_SUCCESS:
+	case USM_MSG_FAILED:
+		message_received = type;
+		complete(&usm_helper_data.wait_for_process);
+		break;
+	default:
+		printk(KERN_INFO "Storage manager doesn't recognise "
+				"message %d.\n", type);
+	}
+
+	return 1;
+}
+
+#ifdef CONFIG_NET
+static int activations;
+
+int toi_activate_storage(int force)
+{
+	int tries = 1;
+
+	if (usm_helper_data.pid == -1 || !usm_ops.enabled)
+		return 0;
+
+	message_received = 0;
+	activations++;
+
+	if (activations > 1 && !force)
+		return 0;
+
+	while ((!message_received || message_received == USM_MSG_FAILED) &&
+			tries < 2) {
+		toi_prepare_status(DONT_CLEAR_BAR, "Activate storage attempt "
+				"%d.\n", tries);
+
+		init_completion(&usm_helper_data.wait_for_process);
+
+		toi_send_netlink_message(&usm_helper_data,
+			USM_MSG_CONNECT,
+			NULL, 0);
+
+		/* Wait 2 seconds for the userspace process to make contact */
+		wait_for_completion_timeout(&usm_helper_data.wait_for_process,
+				2*HZ);
+
+		tries++;
+	}
+
+	return 0;
+}
+
+int toi_deactivate_storage(int force)
+{
+	if (usm_helper_data.pid == -1 || !usm_ops.enabled)
+		return 0;
+
+	message_received = 0;
+	activations--;
+
+	if (activations && !force)
+		return 0;
+
+	init_completion(&usm_helper_data.wait_for_process);
+
+	toi_send_netlink_message(&usm_helper_data,
+			USM_MSG_DISCONNECT,
+			NULL, 0);
+
+	wait_for_completion_timeout(&usm_helper_data.wait_for_process, 2*HZ);
+
+	if (!message_received || message_received == USM_MSG_FAILED) {
+		printk(KERN_INFO "Returning failure disconnecting storage.\n");
+		return 1;
+	}
+
+	return 0;
+}
+#endif
+
+static void storage_manager_simulate(void)
+{
+	printk(KERN_INFO "--- Storage manager simulate ---\n");
+	toi_prepare_usm();
+	schedule();
+	printk(KERN_INFO "--- Activate storage 1 ---\n");
+	toi_activate_storage(1);
+	schedule();
+	printk(KERN_INFO "--- Deactivate storage 1 ---\n");
+	toi_deactivate_storage(1);
+	schedule();
+	printk(KERN_INFO "--- Cleanup usm ---\n");
+	toi_cleanup_usm();
+	schedule();
+	printk(KERN_INFO "--- Storage manager simulate ends ---\n");
+}
+
+static int usm_storage_needed(void)
+{
+	return sizeof(int) + strlen(usm_helper_data.program) + 1;
+}
+
+static int usm_save_config_info(char *buf)
+{
+	*((int *) buf) = strlen(usm_helper_data.program) + 1;
+	memcpy(buf + sizeof(int), usm_helper_data.program, *((int *) buf));
+	return sizeof(int) + *((int *) buf);
+}
+
+static void usm_load_config_info(char *buf, int size)
+{
+	/* Don't load the saved path if one has already been set */
+	if (usm_helper_data.program[0])
+		return;
+
+	memcpy(usm_helper_data.program, buf + sizeof(int), *((int *) buf));
+}
+
+static int usm_memory_needed(void)
+{
+	/* ball park figure of 32 pages */
+	return 32 * PAGE_SIZE;
+}
+
+/* toi_prepare_usm
+ */
+int toi_prepare_usm(void)
+{
+	usm_prepare_count++;
+
+	if (usm_prepare_count > 1 || !usm_ops.enabled)
+		return 0;
+
+	usm_helper_data.pid = -1;
+
+	if (!*usm_helper_data.program)
+		return 0;
+
+	toi_netlink_setup(&usm_helper_data);
+
+	if (usm_helper_data.pid == -1)
+		printk(KERN_INFO "TuxOnIce Storage Manager wanted, but couldn't"
+				" start it.\n");
+
+	toi_activate_storage(0);
+
+	return usm_helper_data.pid != -1;
+}
+
+void toi_cleanup_usm(void)
+{
+	usm_prepare_count--;
+
+	if (usm_helper_data.pid > -1 && !usm_prepare_count) {
+		toi_deactivate_storage(0);
+		toi_netlink_close(&usm_helper_data);
+	}
+}
+
+static void storage_manager_activate(void)
+{
+	if (storage_manager_action == storage_manager_last_action)
+		return;
+
+	if (storage_manager_action)
+		toi_prepare_usm();
+	else
+		toi_cleanup_usm();
+
+	storage_manager_last_action = storage_manager_action;
+}
+
+/*
+ * Storage manager specific /sys/power/tuxonice entries.
+ */
+
+static struct toi_sysfs_data sysfs_params[] = {
+	SYSFS_NONE("simulate_atomic_copy", storage_manager_simulate),
+	SYSFS_INT("enabled", SYSFS_RW, &usm_ops.enabled, 0, 1, 0, NULL),
+	SYSFS_STRING("program", SYSFS_RW, usm_helper_data.program, 254, 0,
+		NULL),
+	SYSFS_INT("activate_storage", SYSFS_RW , &storage_manager_action, 0, 1,
+			0, storage_manager_activate)
+};
+
+static struct toi_module_ops usm_ops = {
+	.type				= MISC_MODULE,
+	.name				= "usm",
+	.directory			= "storage_manager",
+	.module				= THIS_MODULE,
+	.storage_needed			= usm_storage_needed,
+	.save_config_info		= usm_save_config_info,
+	.load_config_info		= usm_load_config_info,
+	.memory_needed			= usm_memory_needed,
+
+	.sysfs_data			= sysfs_params,
+	.num_sysfs_entries		= sizeof(sysfs_params) /
+		sizeof(struct toi_sysfs_data),
+};
+
+/* toi_usm_init
+ * Description: Boot time initialisation for the userspace storage manager.
+ */
+int toi_usm_init(void)
+{
+	usm_helper_data.nl = NULL;
+	usm_helper_data.program[0] = '\0';
+	usm_helper_data.pid = -1;
+	usm_helper_data.skb_size = 0;
+	usm_helper_data.pool_limit = 6;
+	usm_helper_data.netlink_id = NETLINK_TOI_USM;
+	usm_helper_data.name = "userspace storage manager";
+	usm_helper_data.rcv_msg = usm_user_rcv_msg;
+	usm_helper_data.interface_version = 2;
+	usm_helper_data.must_init = 0;
+	init_completion(&usm_helper_data.wait_for_process);
+
+	return toi_register_module(&usm_ops);
+}
+
+void toi_usm_exit(void)
+{
+	toi_netlink_close_complete(&usm_helper_data);
+	toi_unregister_module(&usm_ops);
+}
diff --git a/kernel/power/tuxonice_storage.h b/kernel/power/tuxonice_storage.h
new file mode 100644
index 0000000..8c6b5a7
--- /dev/null
+++ b/kernel/power/tuxonice_storage.h
@@ -0,0 +1,45 @@
+/*
+ * kernel/power/tuxonice_storage.h
+ *
+ * Copyright (C) 2005-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ */
+
+#ifdef CONFIG_NET
+int toi_prepare_usm(void);
+void toi_cleanup_usm(void);
+
+int toi_activate_storage(int force);
+int toi_deactivate_storage(int force);
+extern int toi_usm_init(void);
+extern void toi_usm_exit(void);
+#else
+static inline int toi_usm_init(void) { return 0; }
+static inline void toi_usm_exit(void) { }
+
+static inline int toi_activate_storage(int force)
+{
+	return 0;
+}
+
+static inline int toi_deactivate_storage(int force)
+{
+	return 0;
+}
+
+static inline int toi_prepare_usm(void) { return 0; }
+static inline void toi_cleanup_usm(void) { }
+#endif
+
+enum {
+	USM_MSG_BASE = 0x10,
+
+	/* Kernel -> Userspace */
+	USM_MSG_CONNECT = 0x30,
+	USM_MSG_DISCONNECT = 0x31,
+	USM_MSG_SUCCESS = 0x40,	/* Userspace -> Kernel */
+	USM_MSG_FAILED = 0x41,	/* Userspace -> Kernel */
+
+	USM_MSG_MAX,
+};
diff --git a/kernel/power/tuxonice_swap.c b/kernel/power/tuxonice_swap.c
new file mode 100644
index 0000000..a6c0d76
--- /dev/null
+++ b/kernel/power/tuxonice_swap.c
@@ -0,0 +1,463 @@
+/*
+ * kernel/power/tuxonice_swap.c
+ *
+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * Distributed under GPLv2.
+ *
+ * This file encapsulates functions for usage of swap space as a
+ * backing store.
+ */
+
+#include <linux/suspend.h>
+#include <linux/blkdev.h>
+#include <linux/swapops.h>
+#include <linux/swap.h>
+#include <linux/syscalls.h>
+#include <linux/fs_uuid.h>
+
+#include "tuxonice.h"
+#include "tuxonice_sysfs.h"
+#include "tuxonice_modules.h"
+#include "tuxonice_io.h"
+#include "tuxonice_ui.h"
+#include "tuxonice_extent.h"
+#include "tuxonice_bio.h"
+#include "tuxonice_alloc.h"
+#include "tuxonice_builtin.h"
+
+static struct toi_module_ops toi_swapops;
+
+/* For swapfile automatically swapon/off'd. */
+static char swapfilename[255] = "";
+static int toi_swapon_status;
+
+/* Swap Pages */
+static unsigned long swap_allocated;
+
+static struct sysinfo swapinfo;
+
+static int is_ram_backed(struct swap_info_struct *si)
+{
+	if (!strncmp(si->bdev->bd_disk->disk_name, "ram", 3) ||
+	    !strncmp(si->bdev->bd_disk->disk_name, "zram", 4))
+		return 1;
+
+	return 0;
+}
+
+/**
+ * enable_swapfile: Swapon the user specified swapfile prior to hibernating.
+ *
+ * Activate the given swapfile if it wasn't already enabled. Remember whether
+ * we really did swapon it for swapoffing later.
+ */
+static void enable_swapfile(void)
+{
+	int activateswapresult = -EINVAL;
+
+	if (swapfilename[0]) {
+		/* Attempt to swap on with maximum priority */
+		activateswapresult = sys_swapon(swapfilename, 0xFFFF);
+		if (activateswapresult && activateswapresult != -EBUSY)
+			printk(KERN_ERR "TuxOnIce: The swapfile/partition "
+				"specified by /sys/power/tuxonice/swap/"
+				"swapfilename (%s) could not be turned on "
+				"(error %d). Attempting to continue.\n",
+				swapfilename, activateswapresult);
+		if (!activateswapresult)
+			toi_swapon_status = 1;
+	}
+}
+
+/**
+ * disable_swapfile: Swapoff any file swaponed at the start of the cycle.
+ *
+ * If we did successfully swapon a file at the start of the cycle, swapoff
+ * it now (finishing up).
+ */
+static void disable_swapfile(void)
+{
+	if (!toi_swapon_status)
+		return;
+
+	sys_swapoff(swapfilename);
+	toi_swapon_status = 0;
+}
+
+static int add_blocks_to_extent_chain(struct toi_bdev_info *chain,
+		unsigned long start, unsigned long end)
+{
+	if (test_action_state(TOI_TEST_BIO))
+		toi_message(TOI_IO, TOI_VERBOSE, 0, "Adding extent %lu-%lu to "
+				"chain %p.", start << chain->bmap_shift,
+				end << chain->bmap_shift, chain);
+
+	return toi_add_to_extent_chain(&chain->blocks, start, end);
+}
+
+
+static int get_main_pool_phys_params(struct toi_bdev_info *chain)
+{
+	struct hibernate_extent *extentpointer = NULL;
+	unsigned long address, extent_min = 0, extent_max = 0;
+	int empty = 1;
+
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "get main pool phys params for "
+			"chain %d.", chain->allocator_index);
+
+	if (!chain->allocations.first)
+		return 0;
+
+	if (chain->blocks.first)
+		toi_put_extent_chain(&chain->blocks);
+
+	toi_extent_for_each(&chain->allocations, extentpointer, address) {
+		swp_entry_t swap_address = (swp_entry_t) { address };
+		struct block_device *bdev;
+		sector_t new_sector = map_swap_entry(swap_address, &bdev);
+
+		if (empty) {
+			empty = 0;
+			extent_min = extent_max = new_sector;
+			continue;
+		}
+
+		if (new_sector == extent_max + 1) {
+			extent_max++;
+			continue;
+		}
+
+		if (add_blocks_to_extent_chain(chain, extent_min, extent_max)) {
+			printk(KERN_ERR "Out of memory while making block "
+					"chains.\n");
+			return -ENOMEM;
+		}
+
+		extent_min = new_sector;
+		extent_max = new_sector;
+	}
+
+	if (!empty &&
+	    add_blocks_to_extent_chain(chain, extent_min, extent_max)) {
+		printk(KERN_ERR "Out of memory while making block chains.\n");
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+/*
+ * Like si_swapinfo, except that we don't include ram backed swap (compcache!)
+ * and don't need to use the spinlocks (userspace is stopped when this
+ * function is called).
+ */
+void si_swapinfo_no_compcache(void)
+{
+	unsigned int i;
+
+	si_swapinfo(&swapinfo);
+	swapinfo.freeswap = 0;
+	swapinfo.totalswap = 0;
+
+	for (i = 0; i < MAX_SWAPFILES; i++) {
+		struct swap_info_struct *si = get_swap_info_struct(i);
+		if (si && (si->flags & SWP_WRITEOK) && !is_ram_backed(si)) {
+			swapinfo.totalswap += si->inuse_pages;
+			swapinfo.freeswap += si->pages - si->inuse_pages;
+		}
+	}
+}
+/*
+ * We can't just remember the value from allocation time, because other
+ * processes might have allocated swap in the mean time.
+ */
+static unsigned long toi_swap_storage_available(void)
+{
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "In toi_swap_storage_available.");
+	si_swapinfo_no_compcache();
+	return swapinfo.freeswap + swap_allocated;
+}
+
+static int toi_swap_initialise(int starting_cycle)
+{
+	if (!starting_cycle)
+		return 0;
+
+	enable_swapfile();
+	return 0;
+}
+
+static void toi_swap_cleanup(int ending_cycle)
+{
+	if (!ending_cycle)
+		return;
+
+	disable_swapfile();
+}
+
+static void toi_swap_free_storage(struct toi_bdev_info *chain)
+{
+	/* Free swap entries */
+	struct hibernate_extent *extentpointer;
+	unsigned long extentvalue;
+
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "Freeing storage for chain %p.",
+			chain);
+
+	swap_allocated -= chain->allocations.size;
+	toi_extent_for_each(&chain->allocations, extentpointer, extentvalue)
+		swap_free((swp_entry_t) { extentvalue });
+
+	toi_put_extent_chain(&chain->allocations);
+}
+
+static void free_swap_range(unsigned long min, unsigned long max)
+{
+	unsigned long j;
+
+	for (j = min; j <= max; j++)
+		swap_free((swp_entry_t) { j });
+	swap_allocated -= (max - min + 1);
+}
+
+/*
+ * Allocation of a single swap type. Swap priorities are handled at the higher
+ * level.
+ */
+static int toi_swap_allocate_storage(struct toi_bdev_info *chain,
+		unsigned long request)
+{
+	unsigned long gotten = 0;
+
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "  Swap allocate storage: Asked to"
+			" allocate %lu pages from device %d.", request,
+			chain->allocator_index);
+
+	while (gotten < request) {
+		swp_entry_t start, end;
+		get_swap_range_of_type(chain->allocator_index, &start, &end,
+				request - gotten + 1);
+		if (start.val) {
+			int added = end.val - start.val + 1;
+			if (toi_add_to_extent_chain(&chain->allocations,
+						start.val, end.val)) {
+				printk(KERN_INFO "Failed to allocate extent for "
+					"%lu-%lu.\n", start.val, end.val);
+				free_swap_range(start.val, end.val);
+				break;
+			}
+			gotten += added;
+			swap_allocated += added;
+		} else
+			break;
+	}
+
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "  Allocated %lu pages.", gotten);
+	return gotten;
+}
+
+static int toi_swap_register_storage(void)
+{
+	int i, result = 0;
+
+	toi_message(TOI_IO, TOI_VERBOSE, 0, "toi_swap_register_storage.");
+	for (i = 0; i < MAX_SWAPFILES; i++) {
+		struct swap_info_struct *si = get_swap_info_struct(i);
+		struct toi_bdev_info *devinfo;
+		unsigned char *p;
+		unsigned char buf[256];
+		struct fs_info *fs_info;
+
+		if (!si || !(si->flags & SWP_WRITEOK) || is_ram_backed(si))
+			continue;
+
+		devinfo = toi_kzalloc(39, sizeof(struct toi_bdev_info),
+				GFP_ATOMIC);
+		if (!devinfo) {
+			printk(KERN_ERR "Failed to allocate devinfo struct "
+					"for swap device %d.\n", i);
+			return -ENOMEM;
+		}
+
+		devinfo->bdev = si->bdev;
+		devinfo->allocator = &toi_swapops;
+		devinfo->allocator_index = i;
+
+		fs_info = fs_info_from_block_dev(si->bdev);
+		if (fs_info && !IS_ERR(fs_info)) {
+			memcpy(devinfo->uuid, &fs_info->uuid, 16);
+			free_fs_info(fs_info);
+		} else
+			result = (int) PTR_ERR(fs_info);
+
+		if (!fs_info)
+			printk("fs_info from block dev returned %d.\n", result);
+		devinfo->dev_t = si->bdev->bd_dev;
+		devinfo->prio = si->prio;
+		devinfo->bmap_shift = 3;
+		devinfo->blocks_per_page = 1;
+
+		p = d_path(&si->swap_file->f_path, buf, sizeof(buf));
+		sprintf(devinfo->name, "swap on %s", p);
+
+		toi_message(TOI_IO, TOI_VERBOSE, 0, "Registering swap storage:"
+				" Device %d (%lx), prio %d.", i,
+				(unsigned long) devinfo->dev_t, devinfo->prio);
+		toi_bio_ops.register_storage(devinfo);
+	}
+
+	return 0;
+}
+
+/*
+ * toi_swap_memory_needed
+ *
+ * Description:
+ * Returns the number of bytes of RAM needed for this
+ * code to do its work. (Used when calculating whether
+ * we have enough memory to be able to hibernate & resume).
+ *
+ */
+static int toi_swap_memory_needed(void)
+{
+	return 1;
+}
+
+/*
+ * toi_swap_print_debug_stats
+ *
+ * Description: Write swap allocator status to buffer; return bytes written.
+ */
+static int toi_swap_print_debug_stats(char *buffer, int size)
+{
+	int len = 0;
+
+	len = scnprintf(buffer, size, "- Swap Allocator enabled.\n");
+	if (swapfilename[0])
+		len += scnprintf(buffer+len, size-len,
+			"  Attempting to automatically swapon: %s.\n",
+			swapfilename);
+
+	si_swapinfo_no_compcache();
+
+	len += scnprintf(buffer+len, size-len,
+			"  Swap available for image: %lu pages.\n",
+			swapinfo.freeswap + swap_allocated);
+
+	return len;
+}
+
+static int header_locations_read_sysfs(const char *page, int count)
+{
+	int i, printedpartitionsmessage = 0, len = 0, haveswap = 0;
+	struct inode *swapf = NULL;
+	int zone;
+	char *path_page = (char *) toi_get_free_page(10, GFP_KERNEL);
+	char *path, *output = (char *) page;
+	int path_len;
+
+	if (!page)
+		return 0;
+
+	for (i = 0; i < MAX_SWAPFILES; i++) {
+		struct swap_info_struct *si =  get_swap_info_struct(i);
+
+		if (!si || !(si->flags & SWP_WRITEOK))
+			continue;
+
+		if (S_ISBLK(si->swap_file->f_mapping->host->i_mode)) {
+			haveswap = 1;
+			if (!printedpartitionsmessage) {
+				len += sprintf(output + len,
+					"For swap partitions, simply use the "
+					"format: resume=swap:/dev/hda1.\n");
+				printedpartitionsmessage = 1;
+			}
+		} else {
+			path_len = 0;
+
+			path = d_path(&si->swap_file->f_path, path_page,
+					PAGE_SIZE);
+			path_len = snprintf(path_page, PAGE_SIZE, "%s", path);
+
+			haveswap = 1;
+			swapf = si->swap_file->f_mapping->host;
+			zone = bmap(swapf, 0);
+			if (!zone) {
+				len += sprintf(output + len,
+					"Swapfile %s has been corrupted. Rerun"
+					" mkswap on it and try again.\n",
+					path_page);
+			} else {
+				char name_buffer[BDEVNAME_SIZE];
+				len += sprintf(output + len,
+					"For swapfile `%s`,"
+					" use resume=swap:/dev/%s:0x%x.\n",
+					path_page,
+					bdevname(si->bdev, name_buffer),
+					zone << (swapf->i_blkbits - 9));
+			}
+		}
+	}
+
+	if (!haveswap)
+		len = sprintf(output, "You need to turn on swap partitions "
+				"before examining this file.\n");
+
+	toi_free_page(10, (unsigned long) path_page);
+	return len;
+}
+
+static struct toi_sysfs_data sysfs_params[] = {
+	SYSFS_STRING("swapfilename", SYSFS_RW, swapfilename, 255, 0, NULL),
+	SYSFS_CUSTOM("headerlocations", SYSFS_READONLY,
+			header_locations_read_sysfs, NULL, 0, NULL),
+	SYSFS_INT("enabled", SYSFS_RW, &toi_swapops.enabled, 0, 1, 0,
+			attempt_to_parse_resume_device2),
+};
+
+static struct toi_bio_allocator_ops toi_bio_swapops = {
+	.register_storage			= toi_swap_register_storage,
+	.storage_available			= toi_swap_storage_available,
+	.allocate_storage			= toi_swap_allocate_storage,
+	.bmap					= get_main_pool_phys_params,
+	.free_storage				= toi_swap_free_storage,
+};
+
+static struct toi_module_ops toi_swapops = {
+	.type					= BIO_ALLOCATOR_MODULE,
+	.name					= "swap storage",
+	.directory				= "swap",
+	.module					= THIS_MODULE,
+	.memory_needed				= toi_swap_memory_needed,
+	.print_debug_info			= toi_swap_print_debug_stats,
+	.initialise				= toi_swap_initialise,
+	.cleanup				= toi_swap_cleanup,
+	.bio_allocator_ops			= &toi_bio_swapops,
+
+	.sysfs_data		= sysfs_params,
+	.num_sysfs_entries	= sizeof(sysfs_params) /
+		sizeof(struct toi_sysfs_data),
+};
+
+/* ---- Registration ---- */
+static __init int toi_swap_load(void)
+{
+	return toi_register_module(&toi_swapops);
+}
+
+#ifdef MODULE
+static __exit void toi_swap_unload(void)
+{
+	toi_unregister_module(&toi_swapops);
+}
+
+module_init(toi_swap_load);
+module_exit(toi_swap_unload);
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Nigel Cunningham");
+MODULE_DESCRIPTION("TuxOnIce SwapAllocator");
+#else
+late_initcall(toi_swap_load);
+#endif
diff --git a/kernel/power/tuxonice_sysfs.c b/kernel/power/tuxonice_sysfs.c
new file mode 100644
index 0000000..0088409
--- /dev/null
+++ b/kernel/power/tuxonice_sysfs.c
@@ -0,0 +1,335 @@
+/*
+ * kernel/power/tuxonice_sysfs.c
+ *
+ * Copyright (C) 2002-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * This file contains support for sysfs entries for tuning TuxOnIce.
+ *
+ * We have a generic handler that deals with the most common cases, and
+ * hooks for special handlers to use.
+ */
+
+#include <linux/suspend.h>
+
+#include "tuxonice_sysfs.h"
+#include "tuxonice.h"
+#include "tuxonice_storage.h"
+#include "tuxonice_alloc.h"
+
+static int toi_sysfs_initialised;
+
+static void toi_initialise_sysfs(void);
+
+static struct toi_sysfs_data sysfs_params[];
+
+#define to_sysfs_data(_attr) container_of(_attr, struct toi_sysfs_data, attr)
+
+static void toi_main_wrapper(void)
+{
+	toi_try_hibernate();
+}
+
+static ssize_t toi_attr_show(struct kobject *kobj, struct attribute *attr,
+			      char *page)
+{
+	struct toi_sysfs_data *sysfs_data = to_sysfs_data(attr);
+	int len = 0;
+	int full_prep = sysfs_data->flags & SYSFS_NEEDS_SM_FOR_READ;
+
+	if (full_prep && toi_start_anything(0))
+		return -EBUSY;
+
+	if (sysfs_data->flags & SYSFS_NEEDS_SM_FOR_READ)
+		toi_prepare_usm();
+
+	switch (sysfs_data->type) {
+	case TOI_SYSFS_DATA_CUSTOM:
+		len = (sysfs_data->data.special.read_sysfs) ?
+			(sysfs_data->data.special.read_sysfs)(page, PAGE_SIZE)
+			: 0;
+		break;
+	case TOI_SYSFS_DATA_BIT:
+		len = sprintf(page, "%d\n",
+			!!test_bit(sysfs_data->data.bit.bit,
+				sysfs_data->data.bit.bit_vector));
+		break;
+	case TOI_SYSFS_DATA_INTEGER:
+		len = sprintf(page, "%d\n",
+			*(sysfs_data->data.integer.variable));
+		break;
+	case TOI_SYSFS_DATA_LONG:
+		len = sprintf(page, "%ld\n",
+			*(sysfs_data->data.a_long.variable));
+		break;
+	case TOI_SYSFS_DATA_UL:
+		len = sprintf(page, "%lu\n",
+			*(sysfs_data->data.ul.variable));
+		break;
+	case TOI_SYSFS_DATA_STRING:
+		len = sprintf(page, "%s\n",
+			sysfs_data->data.string.variable);
+		break;
+	}
+
+	if (sysfs_data->flags & SYSFS_NEEDS_SM_FOR_READ)
+		toi_cleanup_usm();
+
+	if (full_prep)
+		toi_finish_anything(0);
+
+	return len;
+}
+
+#define BOUND(_variable, _type) do { \
+	if (*_variable < sysfs_data->data._type.minimum) \
+		*_variable = sysfs_data->data._type.minimum; \
+	else if (*_variable > sysfs_data->data._type.maximum) \
+		*_variable = sysfs_data->data._type.maximum; \
+} while (0)
+
+static ssize_t toi_attr_store(struct kobject *kobj, struct attribute *attr,
+		const char *my_buf, size_t count)
+{
+	int assigned_temp_buffer = 0, result = count;
+	struct toi_sysfs_data *sysfs_data = to_sysfs_data(attr);
+
+	if (toi_start_anything((sysfs_data->flags & SYSFS_HIBERNATE_OR_RESUME)))
+		return -EBUSY;
+
+	((char *) my_buf)[count] = 0;
+
+	if (sysfs_data->flags & SYSFS_NEEDS_SM_FOR_WRITE)
+		toi_prepare_usm();
+
+	switch (sysfs_data->type) {
+	case TOI_SYSFS_DATA_CUSTOM:
+		if (sysfs_data->data.special.write_sysfs)
+			result = (sysfs_data->data.special.write_sysfs)(my_buf,
+					count);
+		break;
+	case TOI_SYSFS_DATA_BIT:
+		{
+		unsigned long value;
+		result = strict_strtoul(my_buf, 0, &value);
+		if (result)
+			break;
+		if (value)
+			set_bit(sysfs_data->data.bit.bit,
+				(sysfs_data->data.bit.bit_vector));
+		else
+			clear_bit(sysfs_data->data.bit.bit,
+				(sysfs_data->data.bit.bit_vector));
+		}
+		break;
+	case TOI_SYSFS_DATA_INTEGER:
+		{
+			long temp;
+			result = strict_strtol(my_buf, 0, &temp);
+			if (result)
+				break;
+			*(sysfs_data->data.integer.variable) = (int) temp;
+			BOUND(sysfs_data->data.integer.variable, integer);
+			break;
+		}
+	case TOI_SYSFS_DATA_LONG:
+		{
+			long *variable =
+				sysfs_data->data.a_long.variable;
+			result = strict_strtol(my_buf, 0, variable);
+			if (result)
+				break;
+			BOUND(variable, a_long);
+			break;
+		}
+	case TOI_SYSFS_DATA_UL:
+		{
+			unsigned long *variable =
+				sysfs_data->data.ul.variable;
+			result = strict_strtoul(my_buf, 0, variable);
+			if (result)
+				break;
+			BOUND(variable, ul);
+			break;
+		}
+		break;
+	case TOI_SYSFS_DATA_STRING:
+		{
+			int copy_len = count;
+			char *variable =
+				sysfs_data->data.string.variable;
+
+			if (sysfs_data->data.string.max_length &&
+			    (copy_len > sysfs_data->data.string.max_length))
+				copy_len = sysfs_data->data.string.max_length;
+
+			if (!variable) {
+				variable = (char *) toi_get_zeroed_page(31,
+						TOI_ATOMIC_GFP);
+				sysfs_data->data.string.variable = variable;
+				assigned_temp_buffer = 1;
+			}
+			strncpy(variable, my_buf, copy_len);
+			if (copy_len && my_buf[copy_len - 1] == '\n')
+				variable[copy_len - 1] = 0;
+			variable[copy_len] = 0;
+		}
+		break;
+	}
+
+	if (!result)
+		result = count;
+
+	/* Side effect routine? */
+	if (result == count && sysfs_data->write_side_effect)
+		sysfs_data->write_side_effect();
+
+	/* Free temporary buffers */
+	if (assigned_temp_buffer) {
+		toi_free_page(31,
+			(unsigned long) sysfs_data->data.string.variable);
+		sysfs_data->data.string.variable = NULL;
+	}
+
+	if (sysfs_data->flags & SYSFS_NEEDS_SM_FOR_WRITE)
+		toi_cleanup_usm();
+
+	toi_finish_anything(sysfs_data->flags & SYSFS_HIBERNATE_OR_RESUME);
+
+	return result;
+}
+
+static struct sysfs_ops toi_sysfs_ops = {
+	.show	= &toi_attr_show,
+	.store	= &toi_attr_store,
+};
+
+static struct kobj_type toi_ktype = {
+	.sysfs_ops	= &toi_sysfs_ops,
+};
+
+struct kobject *tuxonice_kobj;
+
+/* Non-module sysfs entries.
+ *
+ * This array contains entries that are automatically registered at
+ * boot. Modules and the console code register their own entries separately.
+ */
+
+static struct toi_sysfs_data sysfs_params[] = {
+	SYSFS_CUSTOM("do_hibernate", SYSFS_WRITEONLY, NULL, NULL,
+		SYSFS_HIBERNATING, toi_main_wrapper),
+	SYSFS_CUSTOM("do_resume", SYSFS_WRITEONLY, NULL, NULL,
+		SYSFS_RESUMING, toi_try_resume)
+};
+
+void remove_toi_sysdir(struct kobject *kobj)
+{
+	if (!kobj)
+		return;
+
+	kobject_put(kobj);
+}
+
+struct kobject *make_toi_sysdir(char *name)
+{
+	struct kobject *kobj = kobject_create_and_add(name, tuxonice_kobj);
+
+	if (!kobj) {
+		printk(KERN_INFO "TuxOnIce: Can't allocate kobject for sysfs "
+				"dir!\n");
+		return NULL;
+	}
+
+	kobj->ktype = &toi_ktype;
+
+	return kobj;
+}
+
+/* toi_register_sysfs_file
+ *
+ * Helper for registering a new /sys/power/tuxonice entry.
+ */
+
+int toi_register_sysfs_file(
+		struct kobject *kobj,
+		struct toi_sysfs_data *toi_sysfs_data)
+{
+	int result;
+
+	if (!toi_sysfs_initialised)
+		toi_initialise_sysfs();
+
+	result = sysfs_create_file(kobj, &toi_sysfs_data->attr);
+	if (result)
+		printk(KERN_INFO "TuxOnIce: sysfs_create_file for %s "
+			"returned %d.\n",
+			toi_sysfs_data->attr.name, result);
+	kobj->ktype = &toi_ktype;
+
+	return result;
+}
+EXPORT_SYMBOL_GPL(toi_register_sysfs_file);
+
+/* toi_unregister_sysfs_file
+ *
+ * Helper for removing unwanted /sys/power/tuxonice entries.
+ *
+ */
+void toi_unregister_sysfs_file(struct kobject *kobj,
+		struct toi_sysfs_data *toi_sysfs_data)
+{
+	sysfs_remove_file(kobj, &toi_sysfs_data->attr);
+}
+EXPORT_SYMBOL_GPL(toi_unregister_sysfs_file);
+
+void toi_cleanup_sysfs(void)
+{
+	int i,
+	    numfiles = sizeof(sysfs_params) / sizeof(struct toi_sysfs_data);
+
+	if (!toi_sysfs_initialised)
+		return;
+
+	for (i = 0; i < numfiles; i++)
+		toi_unregister_sysfs_file(tuxonice_kobj, &sysfs_params[i]);
+
+	kobject_put(tuxonice_kobj);
+	toi_sysfs_initialised = 0;
+}
+
+/* toi_initialise_sysfs
+ *
+ * Initialise the /sys/power/tuxonice directory.
+ */
+
+static void toi_initialise_sysfs(void)
+{
+	int i;
+	int numfiles = sizeof(sysfs_params) / sizeof(struct toi_sysfs_data);
+
+	if (toi_sysfs_initialised)
+		return;
+
+	/* Make our TuxOnIce directory a child of /sys/power */
+	tuxonice_kobj = kobject_create_and_add("tuxonice", power_kobj);
+	if (!tuxonice_kobj)
+		return;
+
+	toi_sysfs_initialised = 1;
+
+	for (i = 0; i < numfiles; i++)
+		toi_register_sysfs_file(tuxonice_kobj, &sysfs_params[i]);
+}
+
+int toi_sysfs_init(void)
+{
+	toi_initialise_sysfs();
+	return 0;
+}
+
+void toi_sysfs_exit(void)
+{
+	toi_cleanup_sysfs();
+}
diff --git a/kernel/power/tuxonice_sysfs.h b/kernel/power/tuxonice_sysfs.h
new file mode 100644
index 0000000..4185c6d
--- /dev/null
+++ b/kernel/power/tuxonice_sysfs.h
@@ -0,0 +1,137 @@
+/*
+ * kernel/power/tuxonice_sysfs.h
+ *
+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ */
+
+#include <linux/sysfs.h>
+
+struct toi_sysfs_data {
+	struct attribute attr;
+	int type;
+	int flags;
+	union {
+		struct {
+			unsigned long *bit_vector;
+			int bit;
+		} bit;
+		struct {
+			int *variable;
+			int minimum;
+			int maximum;
+		} integer;
+		struct {
+			long *variable;
+			long minimum;
+			long maximum;
+		} a_long;
+		struct {
+			unsigned long *variable;
+			unsigned long minimum;
+			unsigned long maximum;
+		} ul;
+		struct {
+			char *variable;
+			int max_length;
+		} string;
+		struct {
+			int (*read_sysfs) (const char *buffer, int count);
+			int (*write_sysfs) (const char *buffer, int count);
+			void *data;
+		} special;
+	} data;
+
+	/* Side effects routine. Used, e.g., for reparsing the
+	 * resume= entry when it changes */
+	void (*write_side_effect) (void);
+	struct list_head sysfs_data_list;
+};
+
+enum {
+	TOI_SYSFS_DATA_NONE = 1,
+	TOI_SYSFS_DATA_CUSTOM,
+	TOI_SYSFS_DATA_BIT,
+	TOI_SYSFS_DATA_INTEGER,
+	TOI_SYSFS_DATA_UL,
+	TOI_SYSFS_DATA_LONG,
+	TOI_SYSFS_DATA_STRING
+};
+
+#define SYSFS_WRITEONLY 0200
+#define SYSFS_READONLY 0444
+#define SYSFS_RW 0644
+
+#define SYSFS_BIT(_name, _mode, _ul, _bit, _flags) { \
+	.attr = {.name  = _name , .mode   = _mode }, \
+	.type = TOI_SYSFS_DATA_BIT, \
+	.flags = _flags, \
+	.data = { .bit = { .bit_vector = _ul, .bit = _bit } } }
+
+#define SYSFS_INT(_name, _mode, _int, _min, _max, _flags, _wse) { \
+	.attr = {.name  = _name , .mode   = _mode }, \
+	.type = TOI_SYSFS_DATA_INTEGER, \
+	.flags = _flags, \
+	.data = { .integer = { .variable = _int, .minimum = _min, \
+			.maximum = _max } }, \
+	.write_side_effect = _wse }
+
+#define SYSFS_UL(_name, _mode, _ul, _min, _max, _flags) { \
+	.attr = {.name  = _name , .mode   = _mode }, \
+	.type = TOI_SYSFS_DATA_UL, \
+	.flags = _flags, \
+	.data = { .ul = { .variable = _ul, .minimum = _min, \
+			.maximum = _max } } }
+
+#define SYSFS_LONG(_name, _mode, _long, _min, _max, _flags) { \
+	.attr = {.name  = _name , .mode   = _mode }, \
+	.type = TOI_SYSFS_DATA_LONG, \
+	.flags = _flags, \
+	.data = { .a_long = { .variable = _long, .minimum = _min, \
+			.maximum = _max } } }
+
+#define SYSFS_STRING(_name, _mode, _string, _max_len, _flags, _wse) { \
+	.attr = {.name  = _name , .mode   = _mode }, \
+	.type = TOI_SYSFS_DATA_STRING, \
+	.flags = _flags, \
+	.data = { .string = { .variable = _string, .max_length = _max_len } }, \
+	.write_side_effect = _wse }
+
+#define SYSFS_CUSTOM(_name, _mode, _read, _write, _flags, _wse) { \
+	.attr = {.name  = _name , .mode   = _mode }, \
+	.type = TOI_SYSFS_DATA_CUSTOM, \
+	.flags = _flags, \
+	.data = { .special = { .read_sysfs = _read, .write_sysfs = _write } }, \
+	.write_side_effect = _wse }
+
+#define SYSFS_NONE(_name, _wse) { \
+	.attr = {.name  = _name , .mode   = SYSFS_WRITEONLY }, \
+	.type = TOI_SYSFS_DATA_NONE, \
+	.write_side_effect = _wse, \
+}
+
+/* Flags */
+#define SYSFS_NEEDS_SM_FOR_READ 1
+#define SYSFS_NEEDS_SM_FOR_WRITE 2
+#define SYSFS_HIBERNATE 4
+#define SYSFS_RESUME 8
+#define SYSFS_HIBERNATE_OR_RESUME (SYSFS_HIBERNATE | SYSFS_RESUME)
+#define SYSFS_HIBERNATING (SYSFS_HIBERNATE | SYSFS_NEEDS_SM_FOR_WRITE)
+#define SYSFS_RESUMING (SYSFS_RESUME | SYSFS_NEEDS_SM_FOR_WRITE)
+#define SYSFS_NEEDS_SM_FOR_BOTH \
+ (SYSFS_NEEDS_SM_FOR_READ | SYSFS_NEEDS_SM_FOR_WRITE)
+
+int toi_register_sysfs_file(struct kobject *kobj,
+		struct toi_sysfs_data *toi_sysfs_data);
+void toi_unregister_sysfs_file(struct kobject *kobj,
+		struct toi_sysfs_data *toi_sysfs_data);
+
+extern struct kobject *tuxonice_kobj;
+
+struct kobject *make_toi_sysdir(char *name);
+void remove_toi_sysdir(struct kobject *obj);
+extern void toi_cleanup_sysfs(void);
+
+extern int toi_sysfs_init(void);
+extern void toi_sysfs_exit(void);
diff --git a/kernel/power/tuxonice_ui.c b/kernel/power/tuxonice_ui.c
new file mode 100644
index 0000000..452b3db
--- /dev/null
+++ b/kernel/power/tuxonice_ui.c
@@ -0,0 +1,250 @@
+/*
+ * kernel/power/tuxonice_ui.c
+ *
+ * Copyright (C) 1998-2001 Gabor Kuti <seasons@fornax.hu>
+ * Copyright (C) 1998,2001,2002 Pavel Machek <pavel@suse.cz>
+ * Copyright (C) 2002-2003 Florent Chabaud <fchabaud@free.fr>
+ * Copyright (C) 2002-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * Routines for TuxOnIce's user interface.
+ *
+ * The user interface code talks to a userspace program via a
+ * netlink socket.
+ *
+ * The kernel side:
+ * - starts the userui program;
+ * - sends text messages and progress bar status.
+ *
+ * The user space side:
+ * - passes messages regarding user requests (abort, toggle reboot, etc.).
+ *
+ */
+
+#define __KERNEL_SYSCALLS__
+
+#include <linux/reboot.h>
+
+#include "tuxonice_sysfs.h"
+#include "tuxonice_modules.h"
+#include "tuxonice.h"
+#include "tuxonice_ui.h"
+#include "tuxonice_netlink.h"
+#include "tuxonice_power_off.h"
+#include "tuxonice_builtin.h"
+
+static char local_printf_buf[1024];	/* Same as printk - should be safe */
+struct ui_ops *toi_current_ui;
+EXPORT_SYMBOL_GPL(toi_current_ui);
+
+/**
+ * toi_wait_for_keypress - Wait for keypress via userui or /dev/console.
+ *
+ * @timeout: Maximum time to wait.
+ *
+ * Wait for a keypress, either from userui or from /dev/console if userui
+ * isn't available. The non-userui path is intended mainly for boot time,
+ * before userui has been started, when we have an important warning to give
+ * to the user.
+ */
+static char toi_wait_for_keypress(int timeout)
+{
+	if (toi_current_ui && toi_current_ui->wait_for_key(timeout))
+		return ' ';
+
+	return toi_wait_for_keypress_dev_console(timeout);
+}
+
+/* toi_early_boot_message()
+ * Description:	Handle errors early in the process of booting.
+ * 		The user may press C to continue booting, perhaps
+ * 		invalidating the image, or space to reboot.
+ * 		This works from either the serial console or normally
+ * 		attached keyboard.
+ *
+ * 		Note that we come in here from init, while the kernel is
+ * 		locked. If we want to get events from the serial console,
+ * 		we need to temporarily unlock the kernel.
+ *
+ * 		toi_early_boot_message may also be called post-boot.
+ * 		In this case, it simply printks the message and returns.
+ *
+ * Arguments:	int	message_detail. Whether we are able to erase the image.
+ * 		int	default_answer. What to do when we time out. This
+ * 			will normally be continue, but the user might
+ * 			provide command line options (__setup) to override
+ * 			particular cases.
+ * 		char *	warning_reason. printf-style reason for the warning.
+ */
+
+#define say(message, a...) printk(KERN_EMERG message, ##a)
+
+void toi_early_boot_message(int message_detail, int default_answer,
+	char *warning_reason, ...)
+{
+#if defined(CONFIG_VT) || defined(CONFIG_SERIAL_CONSOLE)
+	unsigned long orig_state = get_toi_state(), continue_req = 0;
+	unsigned long orig_loglevel = console_loglevel;
+	int can_ask = 1;
+#else
+	int can_ask = 0;
+#endif
+
+	va_list args;
+	int printed_len;
+
+	if (!toi_wait) {
+		set_toi_state(TOI_CONTINUE_REQ);
+		can_ask = 0;
+	}
+
+	if (warning_reason) {
+		va_start(args, warning_reason);
+		printed_len = vsnprintf(local_printf_buf,
+				sizeof(local_printf_buf),
+				warning_reason,
+				args);
+		va_end(args);
+	}
+
+	if (!test_toi_state(TOI_BOOT_TIME)) {
+		printk("TuxOnIce: %s\n", local_printf_buf);
+		return;
+	}
+
+	if (!can_ask) {
+		continue_req = !!default_answer;
+		goto post_ask;
+	}
+
+#if defined(CONFIG_VT) || defined(CONFIG_SERIAL_CONSOLE)
+	console_loglevel = 7;
+
+	say("=== TuxOnIce ===\n\n");
+	if (warning_reason) {
+		say("BIG FAT WARNING!! %s\n\n", local_printf_buf);
+		switch (message_detail) {
+		case 0:
+			say("If you continue booting, note that any image WILL "
+				"NOT BE REMOVED.\nTuxOnIce is unable to do so "
+				"because the appropriate modules aren't\n"
+				"loaded. You should manually remove the image "
+				"to avoid any\npossibility of corrupting your "
+				"filesystem(s) later.\n");
+			break;
+		case 1:
+			say("If you want to use the current TuxOnIce image, "
+				"reboot and try\nagain with the same kernel "
+				"that you hibernated from. If you want\n"
+				"to forget that image, continue and the image "
+				"will be erased.\n");
+			break;
+		}
+		say("Press SPACE to reboot or C to continue booting with "
+			"this kernel\n\n");
+		if (toi_wait > 0)
+			say("Default action if you don't select one in %d "
+				"seconds is: %s.\n",
+				toi_wait,
+				default_answer ?
+				"continue booting" : "reboot");
+	} else {
+		say("BIG FAT WARNING!!\n\n"
+			"You have tried to resume from this image before.\n"
+			"If it failed once, it may well fail again.\n"
+			"Would you like to remove the image and boot "
+			"normally?\nThis will be equivalent to entering "
+			"noresume on the\nkernel command line.\n\n"
+			"Press SPACE to remove the image or C to continue "
+			"resuming.\n\n");
+		if (toi_wait > 0)
+			say("Default action if you don't select one in %d "
+				"seconds is: %s.\n", toi_wait,
+				!!default_answer ?
+				"continue resuming" : "remove the image");
+	}
+	console_loglevel = orig_loglevel;
+
+	set_toi_state(TOI_SANITY_CHECK_PROMPT);
+	clear_toi_state(TOI_CONTINUE_REQ);
+
+	if (toi_wait_for_keypress(toi_wait) == 0) /* We timed out */
+		continue_req = !!default_answer;
+	else
+		continue_req = test_toi_state(TOI_CONTINUE_REQ);
+
+#endif /* CONFIG_VT or CONFIG_SERIAL_CONSOLE */
+
+post_ask:
+	if ((warning_reason) && (!continue_req))
+		kernel_restart(NULL);
+
+	restore_toi_state(orig_state);
+	if (continue_req)
+		set_toi_state(TOI_CONTINUE_REQ);
+}
+EXPORT_SYMBOL_GPL(toi_early_boot_message);
+#undef say
+
+/*
+ * User interface specific /sys/power/tuxonice entries.
+ */
+
+static struct toi_sysfs_data sysfs_params[] = {
+#if defined(CONFIG_NET) && defined(CONFIG_SYSFS)
+	SYSFS_INT("default_console_level", SYSFS_RW,
+			&toi_bkd.toi_default_console_level, 0, 7, 0, NULL),
+	SYSFS_UL("debug_sections", SYSFS_RW, &toi_bkd.toi_debug_state, 0,
+			1 << 30, 0),
+	SYSFS_BIT("log_everything", SYSFS_RW, &toi_bkd.toi_action, TOI_LOGALL,
+			0)
+#endif
+};
+
+static struct toi_module_ops userui_ops = {
+	.type				= MISC_HIDDEN_MODULE,
+	.name				= "printk ui",
+	.directory			= "user_interface",
+	.module				= THIS_MODULE,
+	.sysfs_data			= sysfs_params,
+	.num_sysfs_entries		= sizeof(sysfs_params) /
+		sizeof(struct toi_sysfs_data),
+};
+
+int toi_register_ui_ops(struct ui_ops *this_ui)
+{
+	if (toi_current_ui) {
+		printk(KERN_INFO "Only one TuxOnIce user interface module can "
+				"be loaded at a time.\n");
+		return -EBUSY;
+	}
+
+	toi_current_ui = this_ui;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(toi_register_ui_ops);
+
+void toi_remove_ui_ops(struct ui_ops *this_ui)
+{
+	if (toi_current_ui != this_ui)
+		return;
+
+	toi_current_ui = NULL;
+}
+EXPORT_SYMBOL_GPL(toi_remove_ui_ops);
+
+/* toi_ui_init
+ * Description: Boot time initialisation for user interface.
+ */
+
+int toi_ui_init(void)
+{
+	return toi_register_module(&userui_ops);
+}
+
+void toi_ui_exit(void)
+{
+	toi_unregister_module(&userui_ops);
+}
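Reviewer note (commentary only, not part of either neighbouring diff): the
single-slot registration used by toi_register_ui_ops() and
toi_remove_ui_ops() above can be sketched in userspace as follows. This is a
minimal illustration under stated assumptions, not the TuxOnIce code: every
demo_* name is hypothetical, and the "printk ui" fallback string only mirrors
the idea that the built-in printk path is used when no UI module is loaded.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ui_ops; only a name is kept here. */
struct demo_ui_ops {
	const char *name;
};

/* At most one user interface may be registered at a time. */
static struct demo_ui_ops *demo_current_ui;

/* Claim the single slot; refuse with -EBUSY if it is already taken. */
int demo_register_ui_ops(struct demo_ui_ops *ui)
{
	assert(ui != NULL);

	if (demo_current_ui)
		return -EBUSY;

	demo_current_ui = ui;
	return 0;
}

/* Release the slot, but only if the caller is the current owner. */
void demo_remove_ui_ops(struct demo_ui_ops *ui)
{
	if (demo_current_ui != ui)
		return;

	demo_current_ui = NULL;
}

/* Callers fall back to a built-in path when nothing is registered. */
const char *demo_current_ui_name(void)
{
	return demo_current_ui ? demo_current_ui->name : "printk ui";
}
```

The ownership check in the remove path is what lets a module that failed to
register call the remove function unconditionally during cleanup without
evicting the interface that is actually active.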
diff --git a/kernel/power/tuxonice_ui.h b/kernel/power/tuxonice_ui.h
new file mode 100644
index 0000000..4ced165
--- /dev/null
+++ b/kernel/power/tuxonice_ui.h
@@ -0,0 +1,97 @@
+/*
+ * kernel/power/tuxonice_ui.h
+ *
+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net)
+ */
+
+enum {
+	DONT_CLEAR_BAR,
+	CLEAR_BAR
+};
+
+enum {
+	/* Userspace -> Kernel */
+	USERUI_MSG_ABORT = 0x11,
+	USERUI_MSG_SET_STATE = 0x12,
+	USERUI_MSG_GET_STATE = 0x13,
+	USERUI_MSG_GET_DEBUG_STATE = 0x14,
+	USERUI_MSG_SET_DEBUG_STATE = 0x15,
+	USERUI_MSG_SPACE = 0x18,
+	USERUI_MSG_GET_POWERDOWN_METHOD = 0x1A,
+	USERUI_MSG_SET_POWERDOWN_METHOD = 0x1B,
+	USERUI_MSG_GET_LOGLEVEL = 0x1C,
+	USERUI_MSG_SET_LOGLEVEL = 0x1D,
+	USERUI_MSG_PRINTK = 0x1E,
+
+	/* Kernel -> Userspace */
+	USERUI_MSG_MESSAGE = 0x21,
+	USERUI_MSG_PROGRESS = 0x22,
+	USERUI_MSG_POST_ATOMIC_RESTORE = 0x25,
+
+	USERUI_MSG_MAX,
+};
+
+struct userui_msg_params {
+	u32 a, b, c, d;
+	char text[255];
+};
+
+struct ui_ops {
+	char (*wait_for_key) (int timeout);
+	u32 (*update_status) (u32 value, u32 maximum, const char *fmt, ...);
+	void (*prepare_status) (int clearbar, const char *fmt, ...);
+	void (*cond_pause) (int pause, char *message);
+	void (*abort)(int result_code, const char *fmt, ...);
+	void (*prepare)(void);
+	void (*cleanup)(void);
+	void (*message)(u32 section, u32 level, u32 normally_logged,
+			const char *fmt, ...);
+};
+
+extern struct ui_ops *toi_current_ui;
+
+#define toi_update_status(val, max, fmt, args...) \
+ (toi_current_ui ? (toi_current_ui->update_status) (val, max, fmt, ##args) : \
+	max)
+
+#define toi_prepare_console(void) \
+	do { if (toi_current_ui) \
+		(toi_current_ui->prepare)(); \
+	} while (0)
+
+#define toi_cleanup_console(void) \
+	do { if (toi_current_ui) \
+		(toi_current_ui->cleanup)(); \
+	} while (0)
+
+#define abort_hibernate(result, fmt, args...) \
+	do { if (toi_current_ui) \
+		(toi_current_ui->abort)(result, fmt, ##args); \
+	     else { \
+		set_abort_result(result); \
+	     } \
+	} while (0)
+
+#define toi_cond_pause(pause, message) \
+	do { if (toi_current_ui) \
+		(toi_current_ui->cond_pause)(pause, message); \
+	} while (0)
+
+#define toi_prepare_status(clear, fmt, args...) \
+	do { if (toi_current_ui) \
+		(toi_current_ui->prepare_status)(clear, fmt, ##args); \
+	     else \
+		printk(KERN_INFO fmt "%s", ##args, "\n"); \
+	} while (0)
+
+#define toi_message(sn, lev, log, fmt, a...) \
+do { \
+	if (toi_current_ui && (!sn || test_debug_state(sn))) \
+		toi_current_ui->message(sn, lev, log, fmt, ##a); \
+} while (0)
+
+__exit void toi_ui_cleanup(void);
+extern int toi_ui_init(void);
+extern void toi_ui_exit(void);
+extern int toi_register_ui_ops(struct ui_ops *this_ui);
+extern void toi_remove_ui_ops(struct ui_ops *this_ui);
diff --git a/kernel/power/tuxonice_userui.c b/kernel/power/tuxonice_userui.c
new file mode 100644
index 0000000..bc74672
--- /dev/null
+++ b/kernel/power/tuxonice_userui.c
@@ -0,0 +1,667 @@
+/*
+ * kernel/power/tuxonice_userui.c
+ *
+ * Copyright (C) 2005-2007 Bernard Blackham
+ * Copyright (C) 2002-2010 Nigel Cunningham (nigel at tuxonice net)
+ *
+ * This file is released under the GPLv2.
+ *
+ * Routines for TuxOnIce's user interface.
+ *
+ * The user interface code talks to a userspace program via a
+ * netlink socket.
+ *
+ * The kernel side:
+ * - starts the userui program;
+ * - sends text messages and progress bar status;
+ *
+ * The user space side:
+ * - passes messages regarding user requests (abort, toggle reboot etc)
+ *
+ */
+
+#define __KERNEL_SYSCALLS__
+
+#include <linux/suspend.h>
+#include <linux/freezer.h>
+#include <linux/console.h>
+#include <linux/ctype.h>
+#include <linux/tty.h>
+#include <linux/vt_kern.h>
+#include <linux/reboot.h>
+#include <linux/security.h>
+#include <linux/syscalls.h>
+#include <linux/vt.h>
+
+#include "tuxonice_sysfs.h"
+#include "tuxonice_modules.h"
+#include "tuxonice.h"
+#include "tuxonice_ui.h"
+#include "tuxonice_netlink.h"
+#include "tuxonice_power_off.h"
+
+static char local_printf_buf[1024];	/* Same as printk - should be safe */
+
+static struct user_helper_data ui_helper_data;
+static struct toi_module_ops userui_ops;
+static int orig_kmsg;
+
+static char lastheader[512];
+static int lastheader_message_len;
+static int ui_helper_changed; /* Used at resume time so we don't overwrite a
+				value set from initrd/ramfs. */
+
+/* Number of distinct progress amounts that userspace can display */
+static int progress_granularity = 30;
+
+static DECLARE_WAIT_QUEUE_HEAD(userui_wait_for_key);
+
+/**
+ * ui_nl_set_state - Update toi_action based on a message from userui.
+ *
+ * @n: New action flags; only the bits in toi_action_mask are applied.
+ */
+static void ui_nl_set_state(int n)
+{
+	/* Only let them change certain settings */
+	static const u32 toi_action_mask =
+		(1 << TOI_REBOOT) | (1 << TOI_PAUSE) |
+		(1 << TOI_LOGALL) |
+		(1 << TOI_SINGLESTEP) |
+		(1 << TOI_PAUSE_NEAR_PAGESET_END);
+	static unsigned long new_action;
+
+	new_action = (toi_bkd.toi_action & (~toi_action_mask)) |
+		(n & toi_action_mask);
+
+	printk(KERN_DEBUG "n is %x. Action flags being changed from %lx "
+			"to %lx.\n", n, toi_bkd.toi_action, new_action);
+	toi_bkd.toi_action = new_action;
+
+	if (!test_action_state(TOI_PAUSE) &&
+			!test_action_state(TOI_SINGLESTEP))
+		wake_up_interruptible(&userui_wait_for_key);
+}
+
+/**
+ * userui_post_atomic_restore - Tell userui that atomic restore just happened.
+ *
+ * Tell userui that atomic restore just occurred, so that it can do things like
+ * redrawing the screen, re-getting settings and so on.
+ */
+static void userui_post_atomic_restore(struct toi_boot_kernel_data *bkd)
+{
+	toi_send_netlink_message(&ui_helper_data,
+			USERUI_MSG_POST_ATOMIC_RESTORE, NULL, 0);
+}
+
+/**
+ * userui_storage_needed - Report how much memory in image header is needed.
+ */
+static int userui_storage_needed(void)
+{
+	return sizeof(ui_helper_data.program) + 1 + sizeof(int);
+}
+
+/**
+ * userui_save_config_info - Fill buffer with config info for image header.
+ *
+ * @buf: Buffer into which to put the config info we want to save.
+ */
+static int userui_save_config_info(char *buf)
+{
+	*((int *) buf) = progress_granularity;
+	memcpy(buf + sizeof(int), ui_helper_data.program,
+			sizeof(ui_helper_data.program));
+	return sizeof(ui_helper_data.program) + sizeof(int) + 1;
+}
+
+/**
+ * userui_load_config_info - Restore config info from buffer.
+ *
+ * @buf: Buffer containing header info loaded.
+ * @size: Size of data loaded for this module.
+ */
+static void userui_load_config_info(char *buf, int size)
+{
+	progress_granularity = *((int *) buf);
+	size -= sizeof(int);
+
+	/* Don't load the saved path if one has already been set */
+	if (ui_helper_changed)
+		return;
+
+	if (size > sizeof(ui_helper_data.program))
+		size = sizeof(ui_helper_data.program);
+
+	memcpy(ui_helper_data.program, buf + sizeof(int), size);
+	ui_helper_data.program[sizeof(ui_helper_data.program)-1] = '\0';
+}
+
+/**
+ * set_ui_program_set - Record that userui program was changed.
+ *
+ * Side effect routine for when the userui program is set. In an initrd or
+ * ramfs, the user may set a location for the userui program. If this happens,
+ * we don't want to reload the value that was saved in the image header. This
+ * routine allows us to flag that we shouldn't restore the program name from
+ * the image header.
+ */
+static void set_ui_program_set(void)
+{
+	ui_helper_changed = 1;
+}
+
+/**
+ * userui_memory_needed - Tell core how much memory to reserve for us.
+ */
+static int userui_memory_needed(void)
+{
+	/* ball park figure of 128 pages */
+	return 128 * PAGE_SIZE;
+}
+
+/**
+ * userui_update_status - Update the progress bar and (if on) in-bar message.
+ *
+ * @value: Current progress percentage numerator.
+ * @maximum: Current progress percentage denominator.
+ * @fmt: Message to be displayed in the middle of the progress bar.
+ *
+ * Note that a NULL message does not mean that any previous message is erased!
+ * For that, you need toi_prepare_status with clearbar on.
+ *
+ * Returns a u32, being the next numerator (as determined by the
+ * maximum and progress granularity) where status needs to be updated.
+ * This is to reduce unnecessary calls to update_status.
+ */
+static u32 userui_update_status(u32 value, u32 maximum, const char *fmt, ...)
+{
+	static u32 last_step = 9999;
+	struct userui_msg_params msg;
+	u32 this_step, next_update;
+	int bitshift;
+
+	if (ui_helper_data.pid == -1)
+		return 0;
+
+	if ((!maximum) || (!progress_granularity))
+		return maximum;
+
+	/* value is unsigned, so it cannot be negative; only the upper
+	 * bound needs clamping. */
+
+	if (value > maximum)
+		value = maximum;
+
+	/* Try to avoid math problems - we can't do 64 bit math here
+	 * (and shouldn't need it - anyone got screen resolution
+	 * of 65536 pixels or more?) */
+	bitshift = fls(maximum) - 16;
+	if (bitshift > 0) {
+		u32 temp_maximum = maximum >> bitshift;
+		u32 temp_value = value >> bitshift;
+		this_step = (u32)
+			(temp_value * progress_granularity / temp_maximum);
+		next_update = (((this_step + 1) * temp_maximum /
+					progress_granularity) + 1) << bitshift;
+	} else {
+		this_step = (u32) (value * progress_granularity / maximum);
+		next_update = ((this_step + 1) * maximum /
+				progress_granularity) + 1;
+	}
+
+	if (this_step == last_step)
+		return next_update;
+
+	memset(&msg, 0, sizeof(msg));
+
+	msg.a = this_step;
+	msg.b = progress_granularity;
+
+	if (fmt) {
+		va_list args;
+		va_start(args, fmt);
+		vsnprintf(msg.text, sizeof(msg.text), fmt, args);
+		va_end(args);
+		msg.text[sizeof(msg.text)-1] = '\0';
+	}
+
+	toi_send_netlink_message(&ui_helper_data, USERUI_MSG_PROGRESS,
+			&msg, sizeof(msg));
+	last_step = this_step;
+
+	return next_update;
+}
+
+/**
+ * userui_message - Display a message without necessarily logging it.
+ *
+ * @section: Type of message. Messages can be filtered by type.
+ * @level: Degree of importance of the message. Lower values = higher priority.
+ * @normally_logged: Whether logged even if log_everything is off.
+ * @fmt: Message (and parameters).
+ *
+ * This function is intended to do the same job as printk, but without normally
+ * logging what is printed. The point is to be able to get debugging info on
+ * screen without filling the logs with "1/534^M 2/534^M 3/534^M".
+ *
+ * It may be called from an interrupt context - can't sleep!
+ */
+static void userui_message(u32 section, u32 level, u32 normally_logged,
+		const char *fmt, ...)
+{
+	struct userui_msg_params msg;
+
+	if ((level) && (level > console_loglevel))
+		return;
+
+	memset(&msg, 0, sizeof(msg));
+
+	msg.a = section;
+	msg.b = level;
+	msg.c = normally_logged;
+
+	if (fmt) {
+		va_list args;
+		va_start(args, fmt);
+		vsnprintf(msg.text, sizeof(msg.text), fmt, args);
+		va_end(args);
+		msg.text[sizeof(msg.text)-1] = '\0';
+	}
+
+	if (test_action_state(TOI_LOGALL))
+		printk(KERN_INFO "%s\n", msg.text);
+
+	toi_send_netlink_message(&ui_helper_data, USERUI_MSG_MESSAGE,
+			&msg, sizeof(msg));
+}
+
+/**
+ * wait_for_key_via_userui - Wait for userui to receive a keypress.
+ */
+static void wait_for_key_via_userui(void)
+{
+	DECLARE_WAITQUEUE(wait, current);
+
+	add_wait_queue(&userui_wait_for_key, &wait);
+	set_current_state(TASK_INTERRUPTIBLE);
+
+	schedule();
+
+	set_current_state(TASK_RUNNING);
+	remove_wait_queue(&userui_wait_for_key, &wait);
+}
+
+/**
+ * userui_prepare_status - Display high level messages.
+ *
+ * @clearbar: Whether to clear the progress bar.
+ * @fmt...: New message for the title.
+ *
+ * Prepare the 'nice display', drawing the header and version, along with the
+ * current action and perhaps also resetting the progress bar.
+ */
+static void userui_prepare_status(int clearbar, const char *fmt, ...)
+{
+	va_list args;
+
+	if (fmt) {
+		va_start(args, fmt);
+		lastheader_message_len = vsnprintf(lastheader, sizeof(lastheader), fmt, args);
+		va_end(args);
+	}
+
+	if (clearbar)
+		toi_update_status(0, 1, NULL);
+
+	if (ui_helper_data.pid == -1)
+		printk(KERN_EMERG "%s\n", lastheader);
+	else
+		toi_message(0, TOI_STATUS, 1, "%s", lastheader);
+}
+
+/**
+ * userui_wait_for_keypress - Wait for keypress via userui.
+ *
+ * @timeout: Maximum time to wait.
+ *
+ * Wait for a keypress from userui.
+ *
+ * FIXME: Implement timeout?
+ */
+static char userui_wait_for_keypress(int timeout)
+{
+	char key = '\0';
+
+	if (ui_helper_data.pid != -1) {
+		wait_for_key_via_userui();
+		key = ' ';
+	}
+
+	return key;
+}
+
+/**
+ * userui_abort_hibernate - Abort a cycle & tell user if they didn't request it.
+ *
+ * @result_code: Reason why we're aborting (1 << bit).
+ * @fmt: Message to display if telling the user what's going on.
+ *
+ * Abort a cycle. If this wasn't at the user's request (and we're displaying
+ * output), tell the user why and wait for them to acknowledge the message.
+ */
+static void userui_abort_hibernate(int result_code, const char *fmt, ...)
+{
+	va_list args;
+	int printed_len = 0;
+
+	set_result_state(result_code);
+
+	if (test_result_state(TOI_ABORTED))
+		return;
+
+	set_result_state(TOI_ABORTED);
+
+	if (test_result_state(TOI_ABORT_REQUESTED))
+		return;
+
+	va_start(args, fmt);
+	printed_len = vsnprintf(local_printf_buf, sizeof(local_printf_buf),
+			fmt, args);
+	va_end(args);
+	if (ui_helper_data.pid != -1)
+		strlcat(local_printf_buf, " (Press SPACE to continue)",
+				sizeof(local_printf_buf));
+
+	toi_prepare_status(CLEAR_BAR, "%s", local_printf_buf);
+
+	if (ui_helper_data.pid != -1)
+		userui_wait_for_keypress(0);
+}
+
+/**
+ * request_abort_hibernate - Abort hibernating or resuming at user request.
+ *
+ * Handle the user requesting the cancellation of a hibernation or resume by
+ * pressing escape.
+ */
+static void request_abort_hibernate(void)
+{
+	if (test_result_state(TOI_ABORT_REQUESTED) ||
+	   !test_action_state(TOI_CAN_CANCEL))
+		return;
+
+	if (test_toi_state(TOI_NOW_RESUMING)) {
+		toi_prepare_status(CLEAR_BAR, "Escape pressed. "
+					"Powering down again.");
+		set_toi_state(TOI_STOP_RESUME);
+		while (!test_toi_state(TOI_IO_STOPPED))
+			schedule();
+		if (toiActiveAllocator->mark_resume_attempted)
+			toiActiveAllocator->mark_resume_attempted(0);
+		toi_power_down();
+	}
+
+	toi_prepare_status(CLEAR_BAR, "--- ESCAPE PRESSED :"
+					" ABORTING HIBERNATION ---");
+	set_abort_result(TOI_ABORT_REQUESTED);
+	wake_up_interruptible(&userui_wait_for_key);
+}
+
+/**
+ * userui_user_rcv_msg - Receive a netlink message from userui.
+ *
+ * @skb: skb received.
+ * @nlh: Netlink header received.
+ */
+static int userui_user_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
+{
+	int type;
+	int *data;
+
+	type = nlh->nlmsg_type;
+
+	/* Control messages: ignore them */
+	if (type < NETLINK_MSG_BASE)
+		return 0;
+
+	/* Unknown message: reply with EINVAL */
+	if (type >= USERUI_MSG_MAX)
+		return -EINVAL;
+
+	/* All operations require privileges, even GET */
+	if (!capable(CAP_NET_ADMIN))
+		return -EPERM;
+
+	/* Only allow one task to receive NOFREEZE privileges */
+	if (type == NETLINK_MSG_NOFREEZE_ME && ui_helper_data.pid != -1) {
+		printk(KERN_INFO "Got NOFREEZE_ME request when "
+			"ui_helper_data.pid is %d.\n", ui_helper_data.pid);
+		return -EBUSY;
+	}
+
+	data = (int *) NLMSG_DATA(nlh);
+
+	switch (type) {
+	case USERUI_MSG_ABORT:
+		request_abort_hibernate();
+		return 0;
+	case USERUI_MSG_GET_STATE:
+		toi_send_netlink_message(&ui_helper_data,
+				USERUI_MSG_GET_STATE, &toi_bkd.toi_action,
+				sizeof(toi_bkd.toi_action));
+		return 0;
+	case USERUI_MSG_GET_DEBUG_STATE:
+		toi_send_netlink_message(&ui_helper_data,
+				USERUI_MSG_GET_DEBUG_STATE,
+				&toi_bkd.toi_debug_state,
+				sizeof(toi_bkd.toi_debug_state));
+		return 0;
+	case USERUI_MSG_SET_STATE:
+		if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(int)))
+			return -EINVAL;
+		ui_nl_set_state(*data);
+		return 0;
+	case USERUI_MSG_SET_DEBUG_STATE:
+		if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(int)))
+			return -EINVAL;
+		toi_bkd.toi_debug_state = (*data);
+		return 0;
+	case USERUI_MSG_SPACE:
+		wake_up_interruptible(&userui_wait_for_key);
+		return 0;
+	case USERUI_MSG_GET_POWERDOWN_METHOD:
+		toi_send_netlink_message(&ui_helper_data,
+				USERUI_MSG_GET_POWERDOWN_METHOD,
+				&toi_poweroff_method,
+				sizeof(toi_poweroff_method));
+		return 0;
+	case USERUI_MSG_SET_POWERDOWN_METHOD:
+		if (nlh->nlmsg_len != NLMSG_LENGTH(sizeof(char)))
+			return -EINVAL;
+		toi_poweroff_method = (unsigned long)(*(unsigned char *) data);
+		return 0;
+	case USERUI_MSG_GET_LOGLEVEL:
+		toi_send_netlink_message(&ui_helper_data,
+				USERUI_MSG_GET_LOGLEVEL,
+				&toi_bkd.toi_default_console_level,
+				sizeof(toi_bkd.toi_default_console_level));
+		return 0;
+	case USERUI_MSG_SET_LOGLEVEL:
+		if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(int)))
+			return -EINVAL;
+		toi_bkd.toi_default_console_level = (*data);
+		return 0;
+	case USERUI_MSG_PRINTK:
+		printk(KERN_INFO "%s", (char *) data);
+		return 0;
+	}
+
+	/* Unhandled here */
+	return 1;
+}
+
+/**
+ * userui_cond_pause - Possibly pause at user request.
+ *
+ * @pause: Whether to pause or just display the message.
+ * @message: Message to display at the start of pausing.
+ *
+ * Potentially pause and wait for the user to tell us to continue. We normally
+ * only pause when @pause is set. While paused, the user can do things like
+ * changing the loglevel, toggling the display of debugging sections and such
+ * like.
+ */
+static void userui_cond_pause(int pause, char *message)
+{
+	int displayed_message = 0, last_key = 0;
+
+	while (last_key != 32 &&
+		ui_helper_data.pid != -1 &&
+		((test_action_state(TOI_PAUSE) && pause) ||
+		 (test_action_state(TOI_SINGLESTEP)))) {
+		if (!displayed_message) {
+			toi_prepare_status(DONT_CLEAR_BAR,
+			   "%s Press SPACE to continue.%s",
+			   message ? message : "",
+			   (test_action_state(TOI_SINGLESTEP)) ?
+			   " Single step on." : "");
+			displayed_message = 1;
+		}
+		last_key = userui_wait_for_keypress(0);
+	}
+	schedule();
+}
+
+/**
+ * userui_prepare_console - Prepare the console for use.
+ *
+ * Prepare a console for use, saving current kmsg settings and attempting to
+ * start userui. Console loglevel changes are handled by userui.
+ */
+static void userui_prepare_console(void)
+{
+	orig_kmsg = vt_kmsg_redirect(fg_console + 1);
+
+	ui_helper_data.pid = -1;
+
+	if (!userui_ops.enabled) {
+		printk(KERN_INFO "TuxOnIce: Userui disabled.\n");
+		return;
+	}
+
+	if (*ui_helper_data.program)
+		toi_netlink_setup(&ui_helper_data);
+	else
+		printk(KERN_INFO "TuxOnIce: Userui program not configured.\n");
+}
+
+/**
+ * userui_cleanup_console - Cleanup after a cycle.
+ *
+ * Tell userui to cleanup, and restore kmsg_redirect to its original value.
+ */
+
+static void userui_cleanup_console(void)
+{
+	if (ui_helper_data.pid > -1)
+		toi_netlink_close(&ui_helper_data);
+
+	vt_kmsg_redirect(orig_kmsg);
+}
+
+/*
+ * User interface specific /sys/power/tuxonice entries.
+ */
+
+static struct toi_sysfs_data sysfs_params[] = {
+#if defined(CONFIG_NET) && defined(CONFIG_SYSFS)
+	SYSFS_BIT("enable_escape", SYSFS_RW, &toi_bkd.toi_action,
+			TOI_CAN_CANCEL, 0),
+	SYSFS_BIT("pause_between_steps", SYSFS_RW, &toi_bkd.toi_action,
+			TOI_PAUSE, 0),
+	SYSFS_INT("enabled", SYSFS_RW, &userui_ops.enabled, 0, 1, 0, NULL),
+	SYSFS_INT("progress_granularity", SYSFS_RW, &progress_granularity, 1,
+			2048, 0, NULL),
+	SYSFS_STRING("program", SYSFS_RW, ui_helper_data.program, 255, 0,
+			set_ui_program_set),
+	SYSFS_INT("debug", SYSFS_RW, &ui_helper_data.debug, 0, 1, 0, NULL)
+#endif
+};
+
+static struct toi_module_ops userui_ops = {
+	.type				= MISC_MODULE,
+	.name				= "userui",
+	.shared_directory		= "user_interface",
+	.module				= THIS_MODULE,
+	.storage_needed			= userui_storage_needed,
+	.save_config_info		= userui_save_config_info,
+	.load_config_info		= userui_load_config_info,
+	.memory_needed			= userui_memory_needed,
+	.post_atomic_restore		= userui_post_atomic_restore,
+	.sysfs_data			= sysfs_params,
+	.num_sysfs_entries		= sizeof(sysfs_params) /
+		sizeof(struct toi_sysfs_data),
+};
+
+static struct ui_ops my_ui_ops = {
+	.update_status			= userui_update_status,
+	.message			= userui_message,
+	.prepare_status			= userui_prepare_status,
+	.abort				= userui_abort_hibernate,
+	.cond_pause			= userui_cond_pause,
+	.prepare			= userui_prepare_console,
+	.cleanup			= userui_cleanup_console,
+	.wait_for_key			= userui_wait_for_keypress,
+};
+
+/**
+ * toi_user_ui_init - Boot time initialisation for user interface.
+ *
+ * Invoked from the core init routine.
+ */
+static __init int toi_user_ui_init(void)
+{
+	int result;
+
+	ui_helper_data.nl = NULL;
+	strncpy(ui_helper_data.program, CONFIG_TOI_USERUI_DEFAULT_PATH, 255);
+	ui_helper_data.pid = -1;
+	ui_helper_data.skb_size = sizeof(struct userui_msg_params);
+	ui_helper_data.pool_limit = 6;
+	ui_helper_data.netlink_id = NETLINK_TOI_USERUI;
+	ui_helper_data.name = "userspace ui";
+	ui_helper_data.rcv_msg = userui_user_rcv_msg;
+	ui_helper_data.interface_version = 8;
+	ui_helper_data.must_init = 0;
+	ui_helper_data.not_ready = userui_cleanup_console;
+	init_completion(&ui_helper_data.wait_for_process);
+	result = toi_register_module(&userui_ops);
+	if (result)
+		return result;
+	result = toi_register_ui_ops(&my_ui_ops);
+	if (result)
+		toi_unregister_module(&userui_ops);
+	return result;
+}
+
+#ifdef MODULE
+/**
+ * toi_user_ui_exit - Cleanup code for when the core is unloaded.
+ */
+static __exit void toi_user_ui_exit(void)
+{
+	toi_netlink_close_complete(&ui_helper_data);
+	toi_remove_ui_ops(&my_ui_ops);
+	toi_unregister_module(&userui_ops);
+}
+
+module_init(toi_user_ui_init);
+module_exit(toi_user_ui_exit);
+MODULE_AUTHOR("Nigel Cunningham");
+MODULE_DESCRIPTION("TuxOnIce Userui Support");
+MODULE_LICENSE("GPL");
+#else
+late_initcall(toi_user_ui_init);
+#endif
diff --git a/kernel/power/user.c b/kernel/power/user.c
index 98d3575..0c50ed1 100644
--- a/kernel/power/user.c
+++ b/kernel/power/user.c
@@ -12,6 +12,7 @@
 #include <linux/suspend.h>
 #include <linux/syscalls.h>
 #include <linux/reboot.h>
+#include <linux/export.h>
 #include <linux/string.h>
 #include <linux/device.h>
 #include <linux/miscdevice.h>
@@ -43,6 +44,7 @@ static struct snapshot_data {
 } snapshot_state;
 
 atomic_t snapshot_device_available = ATOMIC_INIT(1);
+EXPORT_SYMBOL_GPL(snapshot_device_available);
 
 static int snapshot_open(struct inode *inode, struct file *filp)
 {
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index be7c86b..690662b 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -34,6 +34,7 @@
 #include <linux/memblock.h>
 #include <linux/aio.h>
 #include <linux/syscalls.h>
+#include <linux/suspend.h>
 #include <linux/kexec.h>
 #include <linux/kdb.h>
 #include <linux/ratelimit.h>
@@ -67,6 +68,7 @@ int console_printk[4] = {
 	MINIMUM_CONSOLE_LOGLEVEL,	/* minimum_console_loglevel */
 	DEFAULT_CONSOLE_LOGLEVEL,	/* default_console_loglevel */
 };
+EXPORT_SYMBOL_GPL(console_printk);
 
 /*
  * Low level drivers may need that to know if they can schedule in
@@ -1883,6 +1885,7 @@ void suspend_console(void)
 	console_suspended = 1;
 	up(&console_sem);
 }
+EXPORT_SYMBOL_GPL(suspend_console);
 
 void resume_console(void)
 {
@@ -1892,6 +1895,7 @@ void resume_console(void)
 	console_suspended = 0;
 	console_unlock();
 }
+EXPORT_SYMBOL_GPL(resume_console);
 
 /**
  * console_cpu_notify - print deferred console messages after CPU hotplug
diff --git a/kernel/time/jiffies.c b/kernel/time/jiffies.c
index a6a5bf5..7a925ba 100644
--- a/kernel/time/jiffies.c
+++ b/kernel/time/jiffies.c
@@ -51,13 +51,7 @@
  * HZ shrinks, so values greater than 8 overflow 32bits when
  * HZ=100.
  */
-#if HZ < 34
-#define JIFFIES_SHIFT	6
-#elif HZ < 67
-#define JIFFIES_SHIFT	7
-#else
 #define JIFFIES_SHIFT	8
-#endif
 
 static cycle_t jiffies_read(struct clocksource *cs)
 {
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 61f17fa..9532690 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -756,7 +756,6 @@ out:
 static void tick_broadcast_clear_oneshot(int cpu)
 {
 	cpumask_clear_cpu(cpu, tick_broadcast_oneshot_mask);
-	cpumask_clear_cpu(cpu, tick_broadcast_pending_mask);
 }
 
 static void tick_broadcast_init_next_event(struct cpumask *mask,
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 0e337ee..cc2f66f 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -2397,13 +2397,6 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
 	write &= RB_WRITE_MASK;
 	tail = write - length;
 
-	/*
-	 * If this is the first commit on the page, then it has the same
-	 * timestamp as the page itself.
-	 */
-	if (!tail)
-		delta = 0;
-
 	/* See if we shot pass the end of this buffer page */
 	if (unlikely(write > BUF_PAGE_SIZE))
 		return rb_move_tail(cpu_buffer, length, tail,
diff --git a/mm/highmem.c b/mm/highmem.c
index b32b70c..db3d6ea 100644
--- a/mm/highmem.c
+++ b/mm/highmem.c
@@ -66,6 +66,7 @@ unsigned int nr_free_highpages (void)
 
 	return pages;
 }
+EXPORT_SYMBOL_GPL(nr_free_highpages);
 
 static int pkmap_count[LAST_PKMAP];
 static unsigned int last_pkmap_nr;
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 90977ac..6420be5 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -945,10 +945,8 @@ static int hwpoison_user_mappings(struct page *p, unsigned long pfn,
 			 * to it. Similarly, page lock is shifted.
 			 */
 			if (hpage != p) {
-				if (!(flags & MF_COUNT_INCREASED)) {
-					put_page(hpage);
-					get_page(p);
-				}
+				put_page(hpage);
+				get_page(p);
 				lock_page(p);
 				unlock_page(hpage);
 				*hpagep = p;
diff --git a/mm/memory.c b/mm/memory.c
index 6768ce9..a11002f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1634,6 +1634,7 @@ no_page_table:
 		return ERR_PTR(-EFAULT);
 	return page;
 }
+EXPORT_SYMBOL_GPL(follow_page_mask);
 
 static inline int stack_guard_page(struct vm_area_struct *vma, unsigned long addr)
 {
diff --git a/mm/mmzone.c b/mm/mmzone.c
index bf34fb8..0990dd2 100644
--- a/mm/mmzone.c
+++ b/mm/mmzone.c
@@ -8,11 +8,13 @@
 #include <linux/stddef.h>
 #include <linux/mm.h>
 #include <linux/mmzone.h>
+#include <linux/export.h>
 
 struct pglist_data *first_online_pgdat(void)
 {
 	return NODE_DATA(first_online_node);
 }
+EXPORT_SYMBOL_GPL(first_online_pgdat);
 
 struct pglist_data *next_online_pgdat(struct pglist_data *pgdat)
 {
@@ -22,6 +24,7 @@ struct pglist_data *next_online_pgdat(struct pglist_data *pgdat)
 		return NULL;
 	return NODE_DATA(nid);
 }
+EXPORT_SYMBOL_GPL(next_online_pgdat);
 
 /*
  * next_zone - helper magic for for_each_zone()
@@ -41,6 +44,7 @@ struct zone *next_zone(struct zone *zone)
 	}
 	return zone;
 }
+EXPORT_SYMBOL_GPL(next_zone);
 
 static inline int zref_in_nodemask(struct zoneref *zref, nodemask_t *nodes)
 {
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 7106cb1..6bb49a0 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -111,6 +111,7 @@ unsigned int dirty_expire_interval = 30 * 100; /* centiseconds */
  * Flag that makes the machine dump writes/reads and block dirtyings.
  */
 int block_dump;
+EXPORT_SYMBOL_GPL(block_dump);
 
 /*
  * Flag that puts the machine in "laptop mode". Doubles as a timeout in jiffies:
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5248fe0..0b7ac40 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -141,6 +141,7 @@ void pm_restore_gfp_mask(void)
 		saved_gfp_mask = 0;
 	}
 }
+EXPORT_SYMBOL_GPL(pm_restore_gfp_mask);
 
 void pm_restrict_gfp_mask(void)
 {
@@ -149,6 +150,7 @@ void pm_restrict_gfp_mask(void)
 	saved_gfp_mask = gfp_allowed_mask;
 	gfp_allowed_mask &= ~GFP_IOFS;
 }
+EXPORT_SYMBOL_GPL(pm_restrict_gfp_mask);
 
 bool pm_suspended_storage(void)
 {
diff --git a/mm/shmem.c b/mm/shmem.c
index 902a148..de6111e 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1362,7 +1362,7 @@ static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
 }
 
 static struct inode *shmem_get_inode(struct super_block *sb, const struct inode *dir,
-				     umode_t mode, dev_t dev, unsigned long flags)
+				     umode_t mode, dev_t dev, unsigned long flags, int atomic_copy)
 {
 	struct inode *inode;
 	struct shmem_inode_info *info;
@@ -1383,6 +1383,8 @@ static struct inode *shmem_get_inode(struct super_block *sb, const struct inode
 		memset(info, 0, (char *)inode - (char *)info);
 		spin_lock_init(&info->lock);
 		info->flags = flags & VM_NORESERVE;
+		if (atomic_copy)
+			inode->i_flags |= S_ATOMIC_COPY;
 		INIT_LIST_HEAD(&info->swaplist);
 		simple_xattrs_init(&info->xattrs);
 		cache_no_acl(inode);
@@ -1935,7 +1937,7 @@ shmem_mknod(struct inode *dir, struct dentry *dentry, umode_t mode, dev_t dev)
 	struct inode *inode;
 	int error = -ENOSPC;
 
-	inode = shmem_get_inode(dir->i_sb, dir, mode, dev, VM_NORESERVE);
+	inode = shmem_get_inode(dir->i_sb, dir, mode, dev, VM_NORESERVE, 0);
 	if (inode) {
 #ifdef CONFIG_TMPFS_POSIX_ACL
 		error = generic_acl_init(inode, dir);
@@ -1969,7 +1971,7 @@ shmem_tmpfile(struct inode *dir, struct dentry *dentry, umode_t mode)
 	struct inode *inode;
 	int error = -ENOSPC;
 
-	inode = shmem_get_inode(dir->i_sb, dir, mode, 0, VM_NORESERVE);
+	inode = shmem_get_inode(dir->i_sb, dir, mode, 0, VM_NORESERVE, 0);
 	if (inode) {
 		error = security_inode_init_security(inode, dir,
 						     NULL,
@@ -2105,7 +2107,7 @@ static int shmem_symlink(struct inode *dir, struct dentry *dentry, const char *s
 	if (len > PAGE_CACHE_SIZE)
 		return -ENAMETOOLONG;
 
-	inode = shmem_get_inode(dir->i_sb, dir, S_IFLNK|S_IRWXUGO, 0, VM_NORESERVE);
+	inode = shmem_get_inode(dir->i_sb, dir, S_IFLNK|S_IRWXUGO, 0, VM_NORESERVE, 0);
 	if (!inode)
 		return -ENOSPC;
 
@@ -2649,7 +2651,7 @@ int shmem_fill_super(struct super_block *sb, void *data, int silent)
 	sb->s_flags |= MS_POSIXACL;
 #endif
 
-	inode = shmem_get_inode(sb, NULL, S_IFDIR | sbinfo->mode, 0, VM_NORESERVE);
+	inode = shmem_get_inode(sb, NULL, S_IFDIR | sbinfo->mode, 0, VM_NORESERVE, 0);
 	if (!inode)
 		goto failed;
 	inode->i_uid = sbinfo->uid;
@@ -2906,7 +2908,7 @@ EXPORT_SYMBOL_GPL(shmem_truncate_range);
 
 #define shmem_vm_ops				generic_file_vm_ops
 #define shmem_file_operations			ramfs_file_operations
-#define shmem_get_inode(sb, dir, mode, dev, flags)	ramfs_get_inode(sb, dir, mode, dev)
+#define shmem_get_inode(sb, dir, mode, dev, flags, atomic_copy)	ramfs_get_inode(sb, dir, mode, dev)
 #define shmem_acct_size(flags, size)		0
 #define shmem_unacct_size(flags, size)		do {} while (0)
 
@@ -2919,7 +2921,8 @@ static struct dentry_operations anon_ops = {
 };
 
 static struct file *__shmem_file_setup(const char *name, loff_t size,
-				       unsigned long flags, unsigned int i_flags)
+				       unsigned long flags, unsigned int i_flags,
+				       int atomic_copy)
 {
 	struct file *res;
 	struct inode *inode;
@@ -2948,7 +2951,7 @@ static struct file *__shmem_file_setup(const char *name, loff_t size,
 	path.mnt = mntget(shm_mnt);
 
 	res = ERR_PTR(-ENOSPC);
-	inode = shmem_get_inode(sb, NULL, S_IFREG | S_IRWXUGO, 0, flags);
+	inode = shmem_get_inode(sb, NULL, S_IFREG | S_IRWXUGO, 0, flags, atomic_copy);
 	if (!inode)
 		goto put_dentry;
 
@@ -2984,9 +2987,9 @@ put_memory:
  * @size: size to be set for the file
  * @flags: VM_NORESERVE suppresses pre-accounting of the entire object size
  */
-struct file *shmem_kernel_file_setup(const char *name, loff_t size, unsigned long flags)
+struct file *shmem_kernel_file_setup(const char *name, loff_t size, unsigned long flags, int atomic_copy)
 {
-	return __shmem_file_setup(name, size, flags, S_PRIVATE);
+	return __shmem_file_setup(name, size, flags, S_PRIVATE, atomic_copy);
 }
 
 /**
@@ -2995,9 +2998,9 @@ struct file *shmem_kernel_file_setup(const char *name, loff_t size, unsigned lon
  * @size: size to be set for the file
  * @flags: VM_NORESERVE suppresses pre-accounting of the entire object size
  */
-struct file *shmem_file_setup(const char *name, loff_t size, unsigned long flags)
+struct file *shmem_file_setup(const char *name, loff_t size, unsigned long flags, int atomic_copy)
 {
-	return __shmem_file_setup(name, size, flags, 0);
+	return __shmem_file_setup(name, size, flags, 0, atomic_copy);
 }
 EXPORT_SYMBOL_GPL(shmem_file_setup);
 
@@ -3010,7 +3013,7 @@ int shmem_zero_setup(struct vm_area_struct *vma)
 	struct file *file;
 	loff_t size = vma->vm_end - vma->vm_start;
 
-	file = shmem_file_setup("dev/zero", size, vma->vm_flags);
+	file = shmem_file_setup("dev/zero", size, vma->vm_flags, 0);
 	if (IS_ERR(file))
 		return PTR_ERR(file);
 
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 461fce2..733ffe8 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -9,6 +9,7 @@
 #include <linux/hugetlb.h>
 #include <linux/mman.h>
 #include <linux/slab.h>
+#include <linux/export.h>
 #include <linux/kernel_stat.h>
 #include <linux/swap.h>
 #include <linux/vmalloc.h>
@@ -43,7 +44,6 @@
 static bool swap_count_continued(struct swap_info_struct *, pgoff_t,
 				 unsigned char);
 static void free_swap_count_continuations(struct swap_info_struct *);
-static sector_t map_swap_entry(swp_entry_t, struct block_device**);
 
 DEFINE_SPINLOCK(swap_lock);
 static unsigned int nr_swapfiles;
@@ -728,6 +728,62 @@ swp_entry_t get_swap_page_of_type(int type)
 	spin_unlock(&si->lock);
 	return (swp_entry_t) {0};
 }
+EXPORT_SYMBOL_GPL(get_swap_page_of_type);
+
+static unsigned int find_next_to_unuse(struct swap_info_struct *si,
+					unsigned int prev, bool frontswap);
+
+void get_swap_range_of_type(int type, swp_entry_t *start, swp_entry_t *end,
+		unsigned int limit)
+{
+	struct swap_info_struct *si;
+	pgoff_t start_at;
+	unsigned int i;
+
+	*start = swp_entry(0, 0);
+	*end = swp_entry(0, 0);
+	si = swap_info[type];
+	spin_lock(&si->lock);
+	if (si && (si->flags & SWP_WRITEOK)) {
+		atomic_long_dec(&nr_swap_pages);
+		/* This is called for allocating swap entry, not cache */
+		start_at = scan_swap_map(si, 1);
+		if (start_at) {
+			unsigned long stop_at = find_next_to_unuse(si, start_at, 0);
+			if (stop_at > start_at)
+				stop_at--;
+			else
+				stop_at = si->max - 1;
+			if (stop_at - start_at + 1 > limit)
+				stop_at = min_t(unsigned int,
+						start_at + limit - 1,
+						si->max - 1);
+			/* Mark them used */
+			for (i = start_at; i <= stop_at; i++)
+				si->swap_map[i] = 1;
+			/* first page already done above */
+			si->inuse_pages += stop_at - start_at;
+
+			atomic_long_sub(stop_at - start_at, &nr_swap_pages);
+			if (start_at == si->lowest_bit)
+				si->lowest_bit = stop_at + 1;
+			if (stop_at == si->highest_bit)
+				si->highest_bit = start_at - 1;
+			if (si->inuse_pages == si->pages) {
+				si->lowest_bit = si->max;
+				si->highest_bit = 0;
+			}
+			for (i = start_at; i <= stop_at; i++)
+				inc_cluster_info_page(si, si->cluster_info, i);
+			si->cluster_next = stop_at + 1;
+			*start = swp_entry(type, start_at);
+			*end = swp_entry(type, stop_at);
+		} else
+			atomic_long_inc(&nr_swap_pages);
+	}
+	spin_unlock(&si->lock);
+}
+EXPORT_SYMBOL_GPL(get_swap_range_of_type);
 
 static struct swap_info_struct *swap_info_get(swp_entry_t entry)
 {
@@ -875,6 +931,7 @@ void swapcache_free(swp_entry_t entry, struct page *page)
 		spin_unlock(&p->lock);
 	}
 }
+EXPORT_SYMBOL_GPL(swap_free);
 
 /*
  * How many references to page are currently swapped out?
@@ -1596,7 +1653,7 @@ static void drain_mmlist(void)
  * Note that the type of this function is sector_t, but it returns page offset
  * into the bdev, not sector offset.
  */
-static sector_t map_swap_entry(swp_entry_t entry, struct block_device **bdev)
+sector_t map_swap_entry(swp_entry_t entry, struct block_device **bdev)
 {
 	struct swap_info_struct *sis;
 	struct swap_extent *start_se;
@@ -1623,6 +1680,7 @@ static sector_t map_swap_entry(swp_entry_t entry, struct block_device **bdev)
 		BUG_ON(se == start_se);		/* It *must* be present */
 	}
 }
+EXPORT_SYMBOL_GPL(map_swap_entry);
 
 /*
  * Returns the page offset into bdev for the specified page's swap entry.
@@ -1967,6 +2025,7 @@ out:
 	putname(pathname);
 	return err;
 }
+EXPORT_SYMBOL_GPL(sys_swapoff);
 
 #ifdef CONFIG_PROC_FS
 static unsigned swaps_poll(struct file *file, poll_table *wait)
@@ -2573,6 +2632,7 @@ out:
 		mutex_unlock(&inode->i_mutex);
 	return error;
 }
+EXPORT_SYMBOL_GPL(sys_swapon);
 
 void si_swapinfo(struct sysinfo *val)
 {
@@ -2590,6 +2650,7 @@ void si_swapinfo(struct sysinfo *val)
 	val->totalswap = total_swap_pages + nr_to_be_unused;
 	spin_unlock(&swap_lock);
 }
+EXPORT_SYMBOL_GPL(si_swapinfo);
 
 /*
  * Verify that a swap entry is valid and increment its swap map count.
@@ -2734,8 +2795,15 @@ pgoff_t __page_file_index(struct page *page)
 	VM_BUG_ON(!PageSwapCache(page));
 	return swp_offset(swap);
 }
+
 EXPORT_SYMBOL_GPL(__page_file_index);
 
+struct swap_info_struct *get_swap_info_struct(unsigned type)
+{
+	return swap_info[type];
+}
+EXPORT_SYMBOL_GPL(get_swap_info_struct);
+
 /*
  * add_swap_count_continuation - called when a swap count is duplicated
  * beyond SWAP_MAP_MAX, it allocates a new page and links that to the entry's
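Note on get_swap_range_of_type() above: the function claims a contiguous
run of free swap slots, capped both by the caller's limit and by the last
valid slot. The following is only a user-space sketch of that clamping
logic, not the kernel code. The names claim_range and struct slot_range
are hypothetical, and a plain byte array stands in for si->swap_map. It
also assumes slot 0 is reserved, as the patch's "if (start_at)" test
implies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model: map[i] != 0 means slot i is in use. */
struct slot_range {
	unsigned int start;
	unsigned int end;	/* inclusive */
	int ok;			/* 0 if nothing could be claimed */
};

/*
 * Claim a contiguous run of free slots beginning at start_at, capped at
 * "limit" slots and at the last valid index (max - 1).
 */
static struct slot_range claim_range(unsigned char *map, unsigned int max,
				     unsigned int start_at, unsigned int limit)
{
	struct slot_range r = { 0, 0, 0 };
	unsigned int stop_at = start_at;
	unsigned int i;

	assert(map != NULL);

	/* Assumption: slot 0 is reserved and never handed out. */
	if (start_at == 0 || start_at >= max || limit == 0 || map[start_at])
		return r;

	/* Extend while the next slot is free and the cap is not reached. */
	while (stop_at + 1 < max && !map[stop_at + 1] &&
	       stop_at - start_at + 1 < limit)
		stop_at++;

	/* Mark the whole run as used. */
	for (i = start_at; i <= stop_at; i++)
		map[i] = 1;

	r.start = start_at;
	r.end = stop_at;
	r.ok = 1;
	return r;
}
```

With a map of { 1, 0, 0, 0, 1, 0, 0, 0 }, claim_range(map, 8, 1, 16) stops
at the used slot 4 and yields the range 1 to 3. A later call with
start_at 5 and limit 2 is cut short by the limit and yields 5 to 6.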
diff --git a/mm/util.c b/mm/util.c
index 808f375..b6be426 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -403,6 +403,7 @@ struct address_space *page_mapping(struct page *page)
 		mapping = NULL;
 	return mapping;
 }
+EXPORT_SYMBOL_GPL(page_mapping);
 
 /*
  * Committed memory limit enforced when OVERCOMMIT_NEVER policy is used
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 05e6095..8c55ebed 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1327,7 +1327,7 @@ static int too_many_isolated(struct zone *zone, int file,
 {
 	unsigned long inactive, isolated;
 
-	if (current_is_kswapd())
+	if (current_is_kswapd() || sc->hibernation_mode)
 		return 0;
 
 	if (!global_reclaim(sc))
@@ -2118,6 +2118,9 @@ static inline bool should_continue_reclaim(struct zone *zone,
 	unsigned long pages_for_compaction;
 	unsigned long inactive_lru_pages;
 
+	if (nr_reclaimed && nr_scanned && sc->nr_to_reclaim >= sc->nr_reclaimed)
+		return true;
+
 	/* If not in reclaim/compaction mode, stop */
 	if (!in_reclaim_compaction(sc))
 		return false;
@@ -2303,7 +2306,7 @@ static bool shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
 			if (sc->priority != DEF_PRIORITY &&
 			    !zone_reclaimable(zone))
 				continue;	/* Let kswapd poll it */
-			if (IS_ENABLED(CONFIG_COMPACTION)) {
+			if (IS_ENABLED(CONFIG_COMPACTION) && !sc->hibernation_mode) {
 				/*
 				 * If we already have plenty of memory free for
 				 * compaction in this zone, don't free any more.
@@ -2386,6 +2389,11 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 	unsigned long writeback_threshold;
 	bool aborted_reclaim;
 
+#ifdef CONFIG_FREEZER
+	if (unlikely(pm_freezing && !sc->hibernation_mode))
+		return 0;
+#endif
+
 	delayacct_freepages_start();
 
 	if (global_reclaim(sc))
@@ -3281,6 +3289,11 @@ void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx)
 	if (!populated_zone(zone))
 		return;
 
+#ifdef CONFIG_FREEZER
+	if (pm_freezing)
+		return;
+#endif
+
 	if (!cpuset_zone_allowed_hardwall(zone, GFP_KERNEL))
 		return;
 	pgdat = zone->zone_pgdat;
@@ -3306,11 +3319,11 @@ void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx)
  * LRU order by reclaiming preferentially
  * inactive > active > active referenced > active mapped
  */
-unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
+unsigned long shrink_memory_mask(unsigned long nr_to_reclaim, gfp_t mask)
 {
 	struct reclaim_state reclaim_state;
 	struct scan_control sc = {
-		.gfp_mask = GFP_HIGHUSER_MOVABLE,
+		.gfp_mask = mask,
 		.may_swap = 1,
 		.may_unmap = 1,
 		.may_writepage = 1,
@@ -3339,6 +3352,13 @@ unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
 
 	return nr_reclaimed;
 }
+EXPORT_SYMBOL_GPL(shrink_memory_mask);
+
+unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
+{
+	return shrink_memory_mask(nr_to_reclaim, GFP_HIGHUSER_MOVABLE);
+}
+EXPORT_SYMBOL_GPL(shrink_all_memory);
 #endif /* CONFIG_HIBERNATION */
 
 /* It's optimal to keep kswapds on the same CPUs as their memory, but
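Note on the mm/vmscan.c change above: shrink_all_memory() is kept as a
thin wrapper that calls the new shrink_memory_mask() with the old
hard-coded mask, so existing callers are unaffected. A minimal sketch of
that refactoring pattern follows. All names here (mask_t, DEFAULT_MASK,
reclaim_with_mask, reclaim_all, last_mask) are hypothetical stand-ins,
and the worker only pretends to reclaim pages.

```c
#include <assert.h>

typedef unsigned int mask_t;	/* hypothetical stand-in for gfp_t */
#define DEFAULT_MASK 0x7u	/* hypothetical stand-in for the old fixed mask */

static mask_t last_mask;	/* records which mask the worker last saw */

/* Generalised worker: the caller chooses the allocation mask. */
static unsigned long reclaim_with_mask(unsigned long nr, mask_t mask)
{
	assert(mask != 0);
	last_mask = mask;
	return nr;		/* pretend everything requested was reclaimed */
}

/* Old entry point, kept as a wrapper that supplies the previous default. */
static unsigned long reclaim_all(unsigned long nr)
{
	return reclaim_with_mask(nr, DEFAULT_MASK);
}
```

Callers that need a different mask call the worker directly; everyone
else keeps calling the old name with the old behaviour.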
diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c
index b4b61b2..364ce0c 100644
--- a/net/mac80211/cfg.c
+++ b/net/mac80211/cfg.c
@@ -995,10 +995,8 @@ static int ieee80211_start_ap(struct wiphy *wiphy, struct net_device *dev,
 					IEEE80211_P2P_OPPPS_ENABLE_BIT;
 
 	err = ieee80211_assign_beacon(sdata, &params->beacon);
-	if (err < 0) {
-		ieee80211_vif_release_channel(sdata);
+	if (err < 0)
 		return err;
-	}
 	changed |= err;
 
 	err = drv_start_ap(sdata->local, sdata);
@@ -1007,7 +1005,6 @@ static int ieee80211_start_ap(struct wiphy *wiphy, struct net_device *dev,
 		if (old)
 			kfree_rcu(old, rcu_head);
 		RCU_INIT_POINTER(sdata->u.ap.beacon, NULL);
-		ieee80211_vif_release_channel(sdata);
 		return err;
 	}
 
@@ -2611,24 +2608,6 @@ static int ieee80211_start_roc_work(struct ieee80211_local *local,
 	INIT_DELAYED_WORK(&roc->work, ieee80211_sw_roc_work);
 	INIT_LIST_HEAD(&roc->dependents);
 
-	/*
-	 * cookie is either the roc cookie (for normal roc)
-	 * or the SKB (for mgmt TX)
-	 */
-	if (!txskb) {
-		/* local->mtx protects this */
-		local->roc_cookie_counter++;
-		roc->cookie = local->roc_cookie_counter;
-		/* wow, you wrapped 64 bits ... more likely a bug */
-		if (WARN_ON(roc->cookie == 0)) {
-			roc->cookie = 1;
-			local->roc_cookie_counter++;
-		}
-		*cookie = roc->cookie;
-	} else {
-		*cookie = (unsigned long)txskb;
-	}
-
 	/* if there's one pending or we're scanning, queue this one */
 	if (!list_empty(&local->roc_list) ||
 	    local->scanning || local->radar_detect_enabled)
@@ -2763,6 +2742,24 @@ static int ieee80211_start_roc_work(struct ieee80211_local *local,
 	if (!queued)
 		list_add_tail(&roc->list, &local->roc_list);
 
+	/*
+	 * cookie is either the roc cookie (for normal roc)
+	 * or the SKB (for mgmt TX)
+	 */
+	if (!txskb) {
+		/* local->mtx protects this */
+		local->roc_cookie_counter++;
+		roc->cookie = local->roc_cookie_counter;
+		/* wow, you wrapped 64 bits ... more likely a bug */
+		if (WARN_ON(roc->cookie == 0)) {
+			roc->cookie = 1;
+			local->roc_cookie_counter++;
+		}
+		*cookie = roc->cookie;
+	} else {
+		*cookie = (unsigned long)txskb;
+	}
+
 	return 0;
 }
 
diff --git a/net/mac80211/ibss.c b/net/mac80211/ibss.c
index d40e0e1..27a39de 100644
--- a/net/mac80211/ibss.c
+++ b/net/mac80211/ibss.c
@@ -687,9 +687,12 @@ static void ieee80211_ibss_disconnect(struct ieee80211_sub_if_data *sdata)
 	struct cfg80211_bss *cbss;
 	struct beacon_data *presp;
 	struct sta_info *sta;
+	int active_ibss;
 	u16 capability;
 
-	if (!is_zero_ether_addr(ifibss->bssid)) {
+	active_ibss = ieee80211_sta_active_ibss(sdata);
+
+	if (!active_ibss && !is_zero_ether_addr(ifibss->bssid)) {
 		capability = WLAN_CAPABILITY_IBSS;
 
 		if (ifibss->privacy)
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index d298f32..ca7fa7f 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -874,7 +874,7 @@ static int ieee80211_fragment(struct ieee80211_tx_data *tx,
 	}
 
 	/* adjust first fragment's length */
-	skb_trim(skb, hdrlen + per_fragm);
+	skb_trim(skb, hdrlen + per_fragm);
 	return 0;
 }
 
diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index 16616f1..138dc3b 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -1677,10 +1677,9 @@ static int nl80211_dump_wiphy(struct sk_buff *skb, struct netlink_callback *cb)
 				 * We can then retry with the larger buffer.
 				 */
 				if ((ret == -ENOBUFS || ret == -EMSGSIZE) &&
-				    !skb->len && !state->split &&
+				    !skb->len &&
 				    cb->min_dump_alloc < 4096) {
 					cb->min_dump_alloc = 4096;
-					state->split_start = 0;
 					rtnl_unlock();
 					return 1;
 				}
diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c
index 25e5cb0..2370863 100644
--- a/scripts/mod/file2alias.c
+++ b/scripts/mod/file2alias.c
@@ -210,8 +210,8 @@ static void do_usb_entry(void *symval,
 				range_lo < 0x9 ? "[%X-9" : "[%X",
 				range_lo);
 			sprintf(alias + strlen(alias),
-				range_hi > 0xA ? "A-%X]" : "%X]",
-				range_hi);
+				range_hi > 0xA ? "A-%X]" : "%X]",
+				range_hi);
 		}
 	}
 	if (bcdDevice_initial_digits < (sizeof(bcdDevice_lo) * 2 - 1))
diff --git a/security/keys/big_key.c b/security/keys/big_key.c
index 8137b27..e2436f9 100644
--- a/security/keys/big_key.c
+++ b/security/keys/big_key.c
@@ -70,7 +70,7 @@ int big_key_instantiate(struct key *key, struct key_preparsed_payload *prep)
 		 *
 		 * TODO: Encrypt the stored data with a temporary key.
 		 */
-		file = shmem_kernel_file_setup("", datalen, 0);
+		file = shmem_kernel_file_setup("", datalen, 0, 0);
 		if (IS_ERR(file)) {
 			ret = PTR_ERR(file);
 			goto err_quota;
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index c3eac77..63b8291 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -4319,7 +4319,6 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
 	SND_PCI_QUIRK(0x1043, 0x8398, "ASUS P1005", ALC269_FIXUP_STEREO_DMIC),
 	SND_PCI_QUIRK(0x1043, 0x83ce, "ASUS P1005", ALC269_FIXUP_STEREO_DMIC),
 	SND_PCI_QUIRK(0x1043, 0x8516, "ASUS X101CH", ALC269_FIXUP_ASUS_X101),
-	SND_PCI_QUIRK(0x104d, 0x90b5, "Sony VAIO Pro 11", ALC286_FIXUP_SONY_MIC_NO_PRESENCE),
 	SND_PCI_QUIRK(0x104d, 0x90b6, "Sony VAIO Pro 13", ALC286_FIXUP_SONY_MIC_NO_PRESENCE),
 	SND_PCI_QUIRK(0x104d, 0x9073, "Sony VAIO", ALC275_FIXUP_SONY_VAIO_GPIO2),
 	SND_PCI_QUIRK(0x104d, 0x907b, "Sony VAIO", ALC275_FIXUP_SONY_HWEQ),
@@ -5094,7 +5093,6 @@ static const struct snd_pci_quirk alc662_fixup_tbl[] = {
 	SND_PCI_QUIRK(0x1025, 0x038b, "Acer Aspire 8943G", ALC662_FIXUP_ASPIRE),
 	SND_PCI_QUIRK(0x1028, 0x05d8, "Dell", ALC668_FIXUP_DELL_MIC_NO_PRESENCE),
 	SND_PCI_QUIRK(0x1028, 0x05db, "Dell", ALC668_FIXUP_DELL_MIC_NO_PRESENCE),
-	SND_PCI_QUIRK(0x1028, 0x060a, "Dell XPS 13", ALC668_FIXUP_DELL_MIC_NO_PRESENCE),
 	SND_PCI_QUIRK(0x1028, 0x0623, "Dell", ALC668_FIXUP_AUTO_MUTE),
 	SND_PCI_QUIRK(0x1028, 0x0624, "Dell", ALC668_FIXUP_AUTO_MUTE),
 	SND_PCI_QUIRK(0x1028, 0x0625, "Dell", ALC668_FIXUP_DELL_MIC_NO_PRESENCE),
diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
index 00d8642..88b2fe3 100644
--- a/virt/kvm/coalesced_mmio.c
+++ b/virt/kvm/coalesced_mmio.c
@@ -154,13 +154,17 @@ int kvm_vm_ioctl_register_coalesced_mmio(struct kvm *kvm,
 	list_add_tail(&dev->list, &kvm->coalesced_zones);
 	mutex_unlock(&kvm->slots_lock);
 
-	return 0;
+	return ret;
 
 out_free_dev:
 	mutex_unlock(&kvm->slots_lock);
+
 	kfree(dev);
 
-	return ret;
+	if (dev == NULL)
+		return -ENXIO;
+
+	return ret;
 }
 
 int kvm_vm_ioctl_unregister_coalesced_mmio(struct kvm *kvm,