
Hierarchical File System (HFS)

Summary

The Hierarchical File System (HFS) is mainly used on the Apple Macintosh platform. It evolved from the Macintosh File System (MFS). The current variant of HFS is known as HFS+. This document is based on the Apple documentation of HFS and HFS+ and was enhanced by analyzing test data.

This document is intended as a working document for the HFS and HFS+ specification.

Document information

Author(s):

Joachim Metz <joachim.metz@gmail.com>

Abstract:

This document contains information about the Hierarchical File System (HFS).

Classification:

Public

Keywords:

Hierarchical File System, HFS, HFS+, HFSX, Macintosh File System, MFS

License

Copyright (C) 2009-2021, Joachim Metz <joachim.metz@gmail.com>.
Permission is granted to copy, distribute and/or modify this document under the
terms of the GNU Free Documentation License, Version 1.3 or any later version
published by the Free Software Foundation; with no Invariant Sections, no
Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included
in the section entitled "GNU Free Documentation License".

Revision history

| Version | Author | Date | Comments |
| --- | --- | --- | --- |
| 0.0.1 | J.B. Metz | February 2009 | Initial version. |
| 0.0.2 | J.B. Metz | October 2010 | Fixed several typos. Email change. |
| 0.0.3 | J.B. Metz | July 2012 | Email change. |
| 0.0.4 | J.B. Metz | March 2015 | Switched to asciidoc format. |
| 0.0.5 | J.B. Metz | February 2016 | Textual changes. |
| 0.0.6 | J.B. Metz | June 2016 | Textual changes. |
| 0.0.7 | J.B. Metz | December 2017 | Additional information about HFS+ volume attribute flags and textual changes. |
| 0.0.8 | J.B. Metz | January 2017 | Textual changes. |
| 0.0.9 | J.B. Metz | August 2020 | Changes to formatting. |
| 0.0.10 | J.B. Metz | October 2020 | Textual changes and changes to formatting. |
| 0.0.11 | J.B. Metz | July 2021 | Additional information about extended attributes. |
| 0.0.12 | J.B. Metz | August 2021 | Additional information about classic HFS. |

1. Overview

The Hierarchical File System (HFS) is mainly used on the Macintosh platform. The Macintosh File System (MFS) is the predecessor of HFS.

| Version | Introduced in |
| --- | --- |
| MFS | 400 KiB floppies |
| HFS | Early Mac OS |
| HFS+ 8.10 | Mac OS 8.1 to 9.2.2 |
| HFS+ 10.0 | Mac OS 10.0 |
| HFSX | Mac OS 10.3 |

| Characteristics | Description |
| --- | --- |
| Byte order | big-endian |
| Date and time values | HFS/HFS+ date and time |
| Character strings | ASCII strings are single-byte character (SBC) or multi-byte character (MBC) strings stored with a codepage. Though this may be technically incorrect, this document uses the term (extended) ASCII string. Unicode strings are stored in UTF-16 big-endian without the byte order mark (BOM). |

1.1. HFS and HFS+/HFSX

| Feature | HFS | HFS+/HFSX |
| --- | --- | --- |
| Maximum file size | 2^31 (2 GiB) | 2^63 (8 EiB) |
| Maximum filename length | 31 characters | 255 characters |
| Maximum allocation blocks | 2^16 - 1 (65535 blocks) | 2^32 (4294967296 blocks) |
| Character set | extended ASCII with codepage | Unicode UTF-16 big-endian |
| Time stamps | In local time | In UTC (GMT) |
| Catalog B-tree file node size | 512 bytes | 4096 bytes |
| File attributes | none | Basic and extended |

Note
Unicode strings are stored in Normalization Form Canonical Decomposition (NFD) according to Unicode 3.2 rules, with exclusions. Unicode values in the ranges U+2000 - U+2FFF, U+F900 - U+FAFF and U+2F800 - U+2FAFF are not decomposed. For Mac OS 8.1 through 10.2.x decomposition is based on Unicode 2.1 rules.
Note
Based on observations on Mac OS 10.15.7 the range U+1D000 - U+1D1FF appears to be excluded from decomposition as well.
Note
Based on observations on Mac OS 10.15.7, U+2400 appears to be replaced by U+0000.
Note
HFS+ allows the / character in file names. In Finder on Mac OS such a character is shown as / but in Terminal it is shown as :, since / is used there as the path segment separator. Conversely, a file name containing : created in Terminal is shown with / in Finder. Finder does not allow the creation of a file with : in its name. A symbolic link created in Terminal to a file with : in its name will not have the : character converted in the link target data. The Linux HFS+ implementation appears to apply conversion logic similar to that of Terminal.

1.2. HFSX

HFSX (sometimes referred to as HFS/X) is an extension to HFS+ that allows additional features that are incompatible with HFS+. An HFSX volume cannot be wrapped in an HFS volume.

One such feature is case-sensitive filenames. An HFSX volume may be either case-sensitive or case-insensitive. Case sensitivity (or lack thereof) is global to the volume; the setting applies to all file and directory names on the volume.

2. HFS timestamp

In HFS+ date and time values are stored in an unsigned 32-bit integer containing the number of seconds since January 1, 1904 at 00:00:00 (midnight) UTC (GMT). This is slightly different from HFS, where date and time values are stored using the local time. This document will refer to both forms as HFS timestamp.

The maximum representable date is February 6, 2040 at 06:28:15 UTC (GMT).

The date values do not account for leap seconds. They do include a leap day in every year that is evenly divisible by four. This is sufficient given that the range of representable dates does not contain 1900 or 2100, neither of which have leap days.
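The conversion described above can be sketched in Python as follows. This is a minimal illustration, not part of any reference implementation; the names `HFS_EPOCH` and `hfs_timestamp_to_datetime` are hypothetical. It treats the value as UTC, as HFS+ does; for classic HFS the same arithmetic applies, but the result must be read as local time.

```python
from datetime import datetime, timedelta, timezone

# HFS timestamps count seconds since January 1, 1904 at 00:00:00.
HFS_EPOCH = datetime(1904, 1, 1, tzinfo=timezone.utc)


def hfs_timestamp_to_datetime(value):
    """Convert an unsigned 32-bit HFS+ timestamp to a UTC datetime."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("HFS timestamp must fit in an unsigned 32-bit integer")
    # Leap seconds are ignored, which matches the on-disk definition.
    return HFS_EPOCH + timedelta(seconds=value)


if __name__ == "__main__":
    # The largest representable value.
    print(hfs_timestamp_to_datetime(0xFFFFFFFF).isoformat())
```

Passing the largest 32-bit value yields the maximum representable date given above.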

3. B-tree file

Both HFS and HFS+ use B-tree files. A B-tree file consists of fixed-size nodes:

  • header node

  • map nodes

  • index (root and branch) nodes

  • leaf nodes

Note
The node size is determined when the B-tree is created. In a HFS+ B-tree the node size is determined by the header node.
| Feature | HFS | HFS+/HFSX |
| --- | --- | --- |
| Node size | 512 bytes | 512 - 32768 bytes. The size value must be a power of 2. |

HFS+ uses the following default node sizes:

| Feature | HFS | HFS+/HFSX |
| --- | --- | --- |
| catalog file | 512 | 4 KiB (8 KiB in Mac OS X) |
| extents (overflow) file | 512 | 1 KiB (4 KiB in Mac OS X) |
| attributes file | N/A | 4 KiB |

The size of a B-tree file can be calculated in the following manner:

size = number of nodes x node size
Note
The data fork of the B-tree is used. The resource fork of a B-tree file is unused.

3.1. The B-tree (file) node

A B-tree file consists of nodes. Each node has the same structure and consists of three main parts:

  • the node descriptor

  • the node records

  • the node record offsets

The first node in the file is referenced by node number 0.

The node offset relative to the start of the file given a node number can be calculated in the following manner:

node offset = node number x node size

3.1.1. The B-tree node descriptor

The node descriptor (BTNodeDescriptor) contains information about the node, like the forward and backward links to other nodes.

The B-tree node descriptor is 14 bytes of size and consists of:

| Offset | Size | Value | Description |
| --- | --- | --- | --- |
| 0 | 4 | | The next tree node number (forward link). Contains 0 if empty. |
| 4 | 4 | | The previous tree node number (backward link). Contains 0 if empty. |
| 8 | 1 | | The node type. Signed 8-bit integer. See section: B-tree node type |
| 9 | 1 | | The node level. Signed 8-bit integer. The root node level is 0, with a maximum depth of 8. |
| 10 | 2 | | The number of records |
| 12 | 2 | 0 | Unknown (Reserved). Should contain 0-byte values. |

B-tree node type
| Value | Identifier | Description |
| --- | --- | --- |
| -1 | kBTLeafNode | leaf node |
| 0 | kBTIndexNode | index node |
| 1 | kBTHeaderNode | header node |
| 2 | kBTMapNode | map node |
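As an illustration of the layout above, the following Python sketch computes a node offset and unpacks the 14-byte node descriptor. The names `NodeDescriptor`, `node_offset` and `read_node_descriptor` are hypothetical and chosen for this example only.

```python
import struct
from collections import namedtuple

NodeDescriptor = namedtuple(
    "NodeDescriptor",
    "next_node previous_node node_type level number_of_records reserved",
)

# Big-endian: two 32-bit links, two signed 8-bit values, two 16-bit values.
NODE_DESCRIPTOR = struct.Struct(">IIbbHH")

NODE_TYPES = {-1: "leaf", 0: "index", 1: "header", 2: "map"}


def node_offset(node_number, node_size):
    """Return the offset of a node relative to the start of the B-tree file."""
    return node_number * node_size


def read_node_descriptor(node_data):
    """Unpack the node descriptor stored in the first 14 bytes of a node."""
    return NodeDescriptor(*NODE_DESCRIPTOR.unpack_from(node_data, 0))


if __name__ == "__main__":
    # Fabricated sample node: a leaf node linking to node 5 with 3 records.
    sample = struct.pack(">IIbbHH", 5, 0, -1, 1, 3, 0) + bytes(512 - 14)
    descriptor = read_node_descriptor(sample)
    print(descriptor, NODE_TYPES[descriptor.node_type])
```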

3.1.2. The B-tree node record

The B-tree node record contains (leaf) data or a reference to an index node. The B-tree node record consists of:

  • key data

  • record data

3.1.3. The B-tree record offsets

The B-tree record offsets are an array of 16-bit integers, each relative to the start of the B-tree node. The first record offset is found at node size - 2, e.g. 512 - 2 = 510, the second 2 bytes before that, e.g. 508, etc. An additional record offset is added at the end to signify the start of the free space.

Note
The record offsets are not necessarily stored in linear order.
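A minimal Python sketch of reading the record offsets follows. It assumes the offsets are relative to the start of the node and that the node data passed in is exactly one node in size; the function name `read_record_offsets` is hypothetical.

```python
import struct


def read_record_offsets(node_data, number_of_records):
    """Return the record offsets of a node, followed by the free space offset.

    The offsets are read backwards from the end of the node: the first
    record offset is stored in the last 2 bytes, the second in the 2 bytes
    before that, and so on.
    """
    node_size = len(node_data)
    offsets = []
    # One additional offset marks the start of the free space.
    for index in range(number_of_records + 1):
        position = node_size - 2 * (index + 1)
        (offset,) = struct.unpack_from(">H", node_data, position)
        offsets.append(offset)
    return offsets


if __name__ == "__main__":
    # Fabricated 512-byte node with 2 records at offsets 14 and 120.
    node = bytearray(512)
    struct.pack_into(">H", node, 510, 14)
    struct.pack_into(">H", node, 508, 120)
    struct.pack_into(">H", node, 506, 248)
    print(read_record_offsets(bytes(node), 2))
```

Because the offsets are not necessarily stored in linear order, a reader should not derive a record size by subtracting neighbouring offsets without first validating them.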

3.2. The B-tree header node

The B-tree header node is stored in the first node of the B-tree file and contains 3 records:

  • the B-tree header record;

  • the user data record, which consists of 128 bytes (reserved within HFS);

  • the B-tree map record.

Note
The records in the B-tree header node do not have keys.

3.2.1. The B-tree header record

The B-tree header record (BTHeaderRec) contains information about the beginning of the tree, as well as the size of the tree.

The B-tree header record is 106 bytes of size and consists of:

| Offset | Size | Value | Description |
| --- | --- | --- | --- |
| 0 | 2 | | Depth of the tree |
| 2 | 4 | | Root node number |
| 6 | 4 | | Number of data records contained in leaf nodes (Does this equal the number of leaf nodes?) |
| 10 | 4 | | First leaf node number |
| 14 | 4 | | Last leaf node number |
| 18 | 2 | | The node size. Contains number of bytes. |
| 20 | 2 | | Maximum key size. Contains number of bytes. |
| 22 | 4 | | Number of nodes |
| 26 | 4 | | Number of free nodes |
| | | | Introduced in HFS+ |
| 30 | 2 | | Unknown (Reserved) |
| 32 | 4 | | Clump size |
| 36 | 1 | | B-tree file type. See section: File type |
| 37 | 1 | | Key compare type. See section: Key compare type |
| 38 | 4 | | Attributes. See section: Attributes |
| 42 | ( 16 x 4 ) = 64 | | Unknown (Reserved) |

File type
| Value | Identifier | Description |
| --- | --- | --- |
| 0x00 | | Control file |
| 0x80 | | First user B-tree type |
| 0xff | | Reserved B-tree type |

Key compare type
| Value | Identifier | Description |
| --- | --- | --- |
| 0xbc | | Binary compare (case-sensitive) |
| 0xcf | | Case folding (case-insensitive) |

Attributes

The bits in the attributes value have the following meaning:

| Value | Identifier | Description |
| --- | --- | --- |
| 0x00000001 | kBTBadCloseMask | Bad close. This bit indicates that the B-tree was not closed properly and should be checked for consistency. This bit is not used for HFS+ B-trees. |
| 0x00000002 | kBTBigKeysMask | Big keys. If this bit is set, the key size value of the keys in index and leaf nodes is a 16-bit integer; otherwise, it is an 8-bit integer. This bit must be set for all HFS+ B-trees. |
| 0x00000004 | kBTVariableIndexKeysMask | Variable-size index keys. If this bit is set, the keys in index nodes occupy the number of bytes indicated by their key size; otherwise, the keys in index nodes always occupy maximum key size. This bit must be set for the HFS+ Catalog B-tree, and cleared for the HFS+ Extents B-tree. |
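The header record layout can be unpacked with a sketch such as the one below. It assumes the header record directly follows the 14-byte node descriptor of the header node; that offset, and the names `BTreeHeaderRecord` and `read_header_record`, are assumptions made for this example.

```python
import struct
from collections import namedtuple

BTreeHeaderRecord = namedtuple(
    "BTreeHeaderRecord",
    "depth root_node number_of_records first_leaf_node last_leaf_node "
    "node_size maximum_key_size number_of_nodes number_of_free_nodes "
    "reserved1 clump_size file_type key_compare_type attributes reserved2",
)

# Big-endian layout of the 106-byte header record.
HEADER_RECORD = struct.Struct(">HIIIIHHIIHIBBI64s")

BIG_KEYS_MASK = 0x00000002
VARIABLE_INDEX_KEYS_MASK = 0x00000004


def read_header_record(header_node_data, offset=14):
    """Unpack the B-tree header record from the header node.

    The default offset of 14 assumes the record starts right after the
    node descriptor.
    """
    return BTreeHeaderRecord(*HEADER_RECORD.unpack_from(header_node_data, offset))


if __name__ == "__main__":
    # Fabricated header node for illustration only.
    record = HEADER_RECORD.pack(
        2, 3, 10, 4, 7, 4096, 516, 64, 50, 0, 0, 0, 0xCF, 0x6, bytes(64)
    )
    node = bytes(14) + record + bytes(4096 - 14 - len(record))
    header = read_header_record(node)
    print(header.node_size, bool(header.attributes & BIG_KEYS_MASK))
```

The file size formula given earlier then follows as `header.number_of_nodes * header.node_size`.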

3.2.2. The B-tree map record

The B-tree map record consists of a bitmap that indicates which nodes in the B-tree file are used and which are not. The bits are interpreted in exactly the same way as the bits in the volume bitmap: if a bit in the map record is set, then the corresponding node in the B-tree file is being used.

The bitmap is 256 bytes of size and can therefore contain information about 2048 nodes at most. If more nodes are needed a map node is used to store additional mapping information.
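Testing whether a node is in use could look like the following sketch. The bit order within a byte is not stated in this section; the example assumes that the most significant bit of the first byte corresponds to node 0, which is how Apple documents the HFS+ allocation bitmap. The function name `is_node_used` is hypothetical.

```python
def is_node_used(map_record, node_number):
    """Return True if the bit for node_number is set in the map record.

    Assumption: within each byte the most significant bit describes the
    lowest numbered node.
    """
    byte_index, bit_index = divmod(node_number, 8)
    if byte_index >= len(map_record):
        raise IndexError("node number is not covered by this map record")
    return bool(map_record[byte_index] & (0x80 >> bit_index))


if __name__ == "__main__":
    # Fabricated 256-byte map record in which nodes 0 and 2 are used.
    map_record = bytes([0b10100000]) + bytes(255)
    print([is_node_used(map_record, n) for n in range(4)])
```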

3.3. The map node

If a B-tree file contains more than 2048 nodes, which is enough for about 8000 files, a map node is used to store additional node-mapping information.

The next tree node value in the B-tree node descriptor of the header node is used to refer to the first map node.

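A map node consists of a B-tree node descriptor and one B-tree map record. The map record is 494 bytes of size ( 512 - ( 14 + 2 x 2 ) ), since the node also holds two 2-byte record offsets (one for the map record and one for the start of the free space), and can therefore contain mapping information for 3952 nodes.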

If a B-tree contains more than 6000 nodes (enough for about 25000 files) a second map node is needed. The next tree node value in the B-tree node descriptor of the first map node is used to refer to the second. If more map nodes are required, each additional map node is similarly linked to the previous one.

3.4. The root node

The root node is the start of the B-tree structure; usually the root node is an index node, but it might be a leaf node if there are no index nodes.

The root node number is stored in the B-tree header record.

3.5. The index node

The records stored in an index node are called pointer records. A pointer record consists of a key followed by the node number of the corresponding node. The size of the key varies according to the type of B-tree file.

  • In a catalog file, the search key is a combination of the file or directory name and the parent identifier of that file or directory.

  • In an extents (overflow) file, the search key is a combination of that file’s type, its file identifier and the index of the first allocation block in the extent.

The immediate descendants of an index node are called the children of the index node. An index node can have from 1 to 15 children, depending on the size of the pointer records that the index node contains.

TODO size of the node number is 32-bit

3.6. The leaf node

The leaf nodes contain data records. The structure of the leaf node data records varies according to the type of B-tree.

  • In an extents (overflow) file, the leaf node data records consist of a key and an extent record.

  • In a catalog file, the leaf node data records can be any one of four kinds of records.

4. The HFS volume

The information on all block-formatted volumes is organized in logical blocks. These logical blocks are referred to as allocation blocks and contain a number of bytes of standard information (512 bytes on Macintosh-initialized volumes).

The allocation block size is a volume parameter whose value is set when the volume is initialized. To promote file contiguity and avoid fragmentation, space is allocated to files in groups of allocation blocks, or clumps. The clump size is always a multiple of the allocation block size, and it is the minimum number of bytes to allocate.

Each HFS volume begins with two boot blocks. The boot blocks on the startup volume are read at system startup time and contain booting instructions and other important information such as the name of the System file and the Finder. Following the boot blocks are two additional structures:

  • the master directory block, which contains information about the volume, such as the date and time of the volume’s creation and the number of files on the volume;

  • the volume bitmap, which contains a record of which blocks in the volume are currently in use.

All the areas on a volume are of fixed size and location, except for the catalog file and the extents (overflow) file. These two files can appear anywhere between the volume bitmap and the alternate master directory block (MDB). They can appear in any order and are not necessarily contiguous. The catalog and extents (overflow) files are both organized as B-trees.

The last block (512 bytes) was used during Apple’s CPU manufacturing process.

4.1. Boot blocks

The first two logical blocks on every Macintosh volume are boot blocks. These blocks contain system startup information: instructions and information necessary to start up (or "boot") a Macintosh computer. This information consists of certain configurable system parameters (such as the capacity of the event queue, the number of open files allowed, and so forth) and is contained in a boot block header. The system startup information also includes actual machine-language instructions that could be used to load and execute the System file. Usually these instructions follow immediately after the boot block header. Generally, however, the boot code stored on disk is ignored in favor of boot code stored in a resource in the System file.

Note that there are two boot block header formats. The current format includes two fields at the end that are not contained in the older format. These fields allow the Operating System to size the System heap relative to the amount of available physical RAM. A boot block header that conforms to the older format sets the size of the System heap absolutely, using values specified in the header itself. You can determine whether a boot block header uses the current or the older format by inspecting a bit in the high-order byte of the version value.

The boot block header is 141 bytes of size and consists of:

| Offset | Size | Value | Description |
| --- | --- | --- | --- |
| 0 | 2 | "LK" ("\x4c\x4b") | The boot block signature |
| 2 | 4 | | Boot code entry point |
| 6 | 2 | | Boot blocks version number |
| 8 | 2 | | Page flags (used internally) |
| 10 | 15 | | System filename. ASCII string |
| 25 | 15 | | Shell or Finder filename. ASCII string typically "Finder" |
| 40 | 15 | | Debugger 1 filename. ASCII string typically "Macsbug" |
| 55 | 15 | | Debugger 2 filename. ASCII string typically "Disassembler" |
| 70 | 15 | | The name of the startup screen. ASCII string typically "StartUpScreen" |
| 85 | 15 | | The name of the startup program. ASCII string typically "Finder" |
| 100 | 15 | | The scrap filename. ASCII string typically "Clipboard" |
| 115 | 2 | | The (initial) number of allocated file control blocks (FCBs) |
| 117 | 2 | | The maximum number of event queue elements. This number determines the maximum number of events that the Event Manager can store at any one time. Usually this field contains the value 20. |
| 119 | 4 | | The system heap size on 128K Mac. The size of the System heap on a Macintosh computer having 128 KiB of RAM. |
| 123 | 4 | | The system heap size on 256K Mac. The size of the System heap on a Macintosh computer having 256 KiB of RAM. |
| 127 | 4 | | The system heap size on all machines. The size of the System heap on a Macintosh computer having 512 KiB or more of RAM. |
| 131 | 2 | | Filler (used internally) |
| 133 | 4 | | Additional system heap space |
| 137 | 4 | | Fraction of available RAM for the system heap |

4.1.1. Boot code entry point

The boot code entry point contains machine-language instructions that translate to:

BRA.S *+ 0x90

Or for older versions of the boot block header:

BRA.S *+ 0x88

This instruction jumps to the main boot code following the boot block header.

This field is ignored, however, if bit 6 is clear in the high-order byte of the boot block version number or if the low-order byte contains 0x0d.

4.1.2. Boot blocks version number

The boot blocks version number consists of a flag byte (high order) and a version byte (low order).

TODO determine MSB and LSB

The bits in the flag byte have the following meaning:

| Bit(s) | Description |
| --- | --- |
| 0 - 4 | Unknown (Reserved), must be 0 |
| 5 | Use relative system heap sizing |
| 6 | Execute boot code |
| 7 | Newer boot block header used |

If bit 7 of the flag byte is clear, then bits 5 and 6 are ignored and the version number is found in the version byte.

If the version byte is:

  • less than 0x15, the values in the system heap size on 128K Mac and 256K Mac should be ignored and the value in system heap size on all machines should be used.

  • 0x0d, the boot code should be executed using the value in boot code entry point.

  • greater than or equal to 0x15, the value in system heap size on all machines should be used.

If bit 7 of the flag byte is set:

  • bit 6 should be used to determine whether to execute the boot code using the value in boot code entry point.

  • bit 5 should be used to determine whether to use relative System heap sizing. If bit 5 is:

    • clear, the value in system heap size on all machines should be used.

    • set, the System heap is extended by the value in the additional system heap space plus the fraction of available RAM for the system heap.

4.2. Master directory block (MDB)

The master directory block (MDB), also known as the volume information block (VIB), contains information about the data in the volume. The MDB starts at offset 1024 of the volume.

The MDB is 162 bytes of size and consists of:

| Offset | Size | Value | Description |
| --- | --- | --- | --- |
| 0 | 2 | "BD" ("\x42\x44") | The volume signature (kHFSSigWord). For Macintosh File System (MFS) volumes the signature contains "\xd2\xd7". |
| 2 | 4 | | Volume creation date and time. Contains a HFS timestamp in local time. The date and time when the volume was created. |
| 6 | 4 | | Volume modification date and time. Contains a HFS timestamp in local time. The date and time when the volume was last modified. This is not necessarily the date and time when the volume was last flushed. |
| 10 | 2 | | Volume attribute flags. See section: Volume attribute flags |
| 12 | 2 | | The number of files in the root directory |
| 14 | 2 | | Volume bitmap block number. Contains an allocation block number relative from the start of the volume, where 0 is the first block number. Typically has a value of 3 |
| 16 | 2 | | Unknown (Start of the next allocation search). The (allocation or volume block) index of the allocation block at which the next allocation search will begin. |
| 18 | 2 | | Number of (allocation) blocks. A volume can contain at most 65535 blocks. |
| 20 | 4 | | Allocation block size. Contains number of bytes and must be a multiple of 512 bytes. |
| 24 | 4 | | Default clump size |
| 28 | 2 | | Extents start block number. Contains an allocation block number relative from the start of the volume, where 0 is the first block number. |
| 30 | 4 | | Next available catalog node identifier (CNID). Can be a directory or file record identifier. |
| 34 | 2 | | Number of unused (allocation) blocks |
| 36 | 1 | | The volume label size. The maximum size is 27 |
| 37 | 27 | | The volume label. Contains an ASCII string |
| 64 | 4 | | Backup date and time. Contains a HFS timestamp in local time. The date and time when the volume was last backed up. |
| 68 | 2 | | Backup sequence number |
| 70 | 4 | | Volume write count. Contains the number of times the volume has been written to. |
| 74 | 4 | | Clump size for extents (overflow) file |
| 78 | 4 | | Clump size for catalog file |
| 82 | 2 | | The number of sub directories in the root directory |
| 84 | 4 | | Total number of files. It should equal the number of file records found in the catalog file. |
| 88 | 4 | | Total number of directories (folders). The value does not include the root folder. It should equal the number of folder records in the catalog file minus one. |
| 92 | 32 | | Finder information. See section: Finder information |
| 124 | 2 | | Embedded volume signature (formerly drVCSize) |
| 126 | 4 | | Embedded volume extent descriptor (formerly drVBMCSize and drCtlCSize). Contains a single HFS extent descriptor |
| 130 | 4 | | Extents (overflow) file size |
| 134 | ( 3 x 4 ) = 12 | | Extents (overflow) extents record. See section: The HFS extents record |
| 146 | 4 | | Catalog file size |
| 150 | ( 3 x 4 ) = 12 | | Catalog file extents record. See section: The HFS extents record |

Notes:

drVCSize => Volume cache (allocation) block size (16-bit)
drVBMCSize => Volume bitmap cache (allocation) block size (16-bit)
drCtlCSize => Common volume cache (allocation) block size (16-bit)
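To illustrate the MDB layout, the sketch below unpacks the first 64 bytes (signature through volume label) from a volume image held in memory. The function name `read_mdb_prefix`, the dictionary keys and the use of the Mac OS Roman codepage for the label are assumptions made for this example; the actual codepage depends on the system that wrote the volume.

```python
import struct

MDB_OFFSET = 1024

# Big-endian layout of MDB offsets 0 through 63.
MDB_PREFIX = struct.Struct(">2sIIHHHHHIIHIHB27s")


def read_mdb_prefix(volume_data):
    """Unpack the first 64 bytes of the master directory block."""
    (signature, creation_time, modification_time, attribute_flags,
     root_file_count, bitmap_block, next_allocation, number_of_blocks,
     block_size, clump_size, extents_start_block, next_cnid,
     unused_blocks, label_size, label) = MDB_PREFIX.unpack_from(
         volume_data, MDB_OFFSET)
    if signature != b"BD":
        raise ValueError("not an HFS master directory block")
    if label_size > 27:
        raise ValueError("volume label size exceeds 27 bytes")
    return {
        "creation_time": creation_time,
        "modification_time": modification_time,
        "attribute_flags": attribute_flags,
        "number_of_blocks": number_of_blocks,
        "block_size": block_size,
        "next_cnid": next_cnid,
        "unused_blocks": unused_blocks,
        # Assumption: the label is encoded in Mac OS Roman.
        "volume_label": label[:label_size].decode("mac_roman"),
    }


if __name__ == "__main__":
    # Fabricated volume image for illustration only.
    image = bytearray(2048)
    MDB_PREFIX.pack_into(image, MDB_OFFSET, b"BD", 0, 0, 0x0100, 1, 3, 0,
                         1600, 512, 2048, 0, 16, 1500, 4, b"Test")
    print(read_mdb_prefix(bytes(image)))
```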

4.2.1. Alternate master directory block (MDB)

A copy of the master directory block (MDB) is maintained in the alternate MDB. This copy is updated when the extents (overflow) file or the catalog file grows larger. The alternate MDB is intended solely for use by disk utilities.

4.3. Volume bitmap

The volume bitmap is used to keep track of block allocation. The bitmap contains one bit for each allocation block in the volume. If a bit is set, the corresponding allocation block is currently in use by some file. If a bit is clear, the corresponding allocation block is not currently in use by any file and is available for allocation.

The volume bitmap does not indicate which files occupy which blocks. The actual file-mapping information is maintained in two locations:

  • in each file’s catalog entry;

  • in the extents (overflow) file.

The size of the volume bitmap depends on the number of allocation blocks in the volume. The number of allocation blocks depends both on the number of physical blocks in the volume and the size of the volume’s allocation blocks (the number of physical blocks per allocation block). The size of the volume bitmap is rounded up so that the volume bitmap occupies an integral number of physical blocks.

A floppy disk that can hold 800 KiB of data and has an allocation block size of one physical block (512 bytes) has a volume bitmap size of:

( 800 x 1024 ) / 512 = 1600 bits (200 bytes).

A volume containing 32 MiB of data and having an allocation block size of one physical block has a volume bitmap size of:

( 32 x 1024 x 1024 ) / 512 = 65536 bits (8192 bytes).

Because the number of allocation blocks in the MDB is stored as a 16-bit value, no more than 65535 allocation blocks can be addressed. The volume bitmap is therefore never larger than 8192 bytes (or 16 physical blocks). For volumes containing more than 32 MiB of space, the allocation block size must be increased.

A volume containing 40 MiB of space must have an allocation block size that is at least 2 physical blocks (2 x 512 bytes).

A volume containing 80 MiB of space must have an allocation block size that is at least 3 physical blocks (3 x 512 bytes).
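The arithmetic above can be reproduced with the following sketch. It is a simplification: it treats the whole volume as allocation blocks and ignores the space taken by the boot blocks, the MDB and the bitmap itself. The function names are hypothetical.

```python
PHYSICAL_BLOCK_SIZE = 512
MAX_HFS_ALLOCATION_BLOCKS = 65535


def volume_bitmap_size(number_of_blocks):
    """Return (size in bytes, size in physical blocks) of the volume bitmap."""
    size_in_bytes = (number_of_blocks + 7) // 8
    # The bitmap occupies an integral number of physical blocks.
    physical_blocks = (
        size_in_bytes + PHYSICAL_BLOCK_SIZE - 1) // PHYSICAL_BLOCK_SIZE
    return size_in_bytes, physical_blocks


def minimum_allocation_block_size(volume_size):
    """Return the smallest allocation block size, in bytes, for a volume.

    The size is increased in steps of one physical block until no more
    than 65535 allocation blocks are needed.
    """
    block_size = PHYSICAL_BLOCK_SIZE
    while volume_size // block_size > MAX_HFS_ALLOCATION_BLOCKS:
        block_size += PHYSICAL_BLOCK_SIZE
    return block_size


if __name__ == "__main__":
    print(volume_bitmap_size(800 * 1024 // 512))
    print(minimum_allocation_block_size(40 * 1024 * 1024))
    print(minimum_allocation_block_size(80 * 1024 * 1024))
```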

5. The HFS+/HFSX volume

In HFS+ the boot blocks have been removed, therefore the first two blocks are reserved (unused).

5.1. Volume header

The volume header (HFSPlusVolumeHeader) replaces the master directory block (MDB). The volume header starts at offset 1024 of the volume.

The allocation blocks containing the first 1536 bytes (reserved space plus volume header) are marked as used in the allocation file.

The volume header is 512 bytes of size and consists of:

| Offset | Size | Value | Description |
| --- | --- | --- | --- |
| 0 | 2 | "H+" ("\x48\x2b") or "HX" ("\x48\x58") | The volume signature. Where "H+" (kHFSPlusSigWord) is used for HFS+ and "HX" (kHFSXSigWord) for HFSX |
| 2 | 2 | | The volume version. Where 4 (kHFSPlusVersion) is used for HFS+ and 5 (kHFSXVersion) for HFSX |
| 4 | 4 | | The volume attribute flags. See section: Volume attribute flags |
| 8 | 4 | | Last mounted version. "8.10" ⇒ used by Mac OS 8.1 to 9.2.2; "10.0" (kHFSPlusMountVersion) ⇒ used by Mac OS X; "HFSJ" (kHFSJMountVersion) ⇒ used by journaled HFS+/HFSX; "fsck", "FSK!" ⇒ used by fsck_hfs on Mac OS X |
| 12 | 4 | | Journal information block number. This field is used if the volume journaled bit has been set in the volume attribute flags. The allocation block number of the allocation block which contains the journal information block of the volume’s journal. |
| 16 | 4 | | Creation date and time. Contains a HFS timestamp in UTC. The date and time when the volume was created. |
| 20 | 4 | | Modification date and time. Contains a HFS timestamp in UTC. The date and time when the volume was last modified. |
| 24 | 4 | | Backup date and time. Contains a HFS timestamp in UTC. The date and time when the volume was last backed up. |
| 28 | 4 | | Checked date and time. Contains a HFS timestamp in UTC. The date and time when the volume was last checked for consistency. |
| 32 | 4 | | Total number of files. The value does not include the special files. It should equal the number of file records found in the catalog file. |
| 36 | 4 | | Total number of directories (folders). The value does not include the root folder. It should equal the number of folder records in the catalog file minus one. |
| 40 | 4 | | The (allocation) block size. Contains number of bytes |
| 44 | 4 | | Total number of (allocation) blocks |
| 48 | 4 | | Number of unused (allocation) blocks |
| 52 | 4 | | Next available (allocation) block number. The (allocation or volume block) index of the allocation block at which the next allocation search will begin. |
| 56 | 4 | | Default resource fork clump size. The default clump size for resource forks. Contains number of bytes |
| 60 | 4 | | Default data fork clump size. The default clump size for data forks. Contains number of bytes |
| 64 | 4 | | Next available catalog node identifier (CNID). Can be a directory or file record identifier. |
| 68 | 4 | | Volume write count. Contains the number of times the volume has been written to. |
| 72 | 8 | | Encodings bitmap. This field keeps track of the text encodings used in the file and folder names on the volume. See section: Text encoding |
| 80 | 32 | | Finder information. See section: Finder information |
| 112 | 80 | | Allocation file fork descriptor. Information about the location and size of the allocation file. See section: HFS+ fork descriptor structure |
| 192 | 80 | | Extents (overflow) file fork descriptor. Information about the location and size of the extents (overflow) file. See section: HFS+ fork descriptor structure |
| 272 | 80 | | Catalog file fork descriptor. Information about the location and size of the catalog file. See section: HFS+ fork descriptor structure |
| 352 | 80 | | Attributes file fork descriptor. Information about the location and size of the attributes file. See section: HFS+ fork descriptor structure |
| 432 | 80 | | Startup file fork descriptor. Information about the location and size of the startup file. See section: HFS+ fork descriptor structure |
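The fixed-size fields at the start of the volume header (offsets 0 through 79) can be unpacked as sketched below. The Finder information and the fork descriptors are left out. The function name `read_volume_header` and the field names are hypothetical and chosen for this example.

```python
import struct

VOLUME_HEADER_OFFSET = 1024

# Big-endian layout of volume header offsets 0 through 79.
VOLUME_HEADER_PREFIX = struct.Struct(">2sHI4s15IQ")

FIELD_NAMES = (
    "signature", "version", "attribute_flags", "last_mounted_version",
    "journal_info_block", "creation_time", "modification_time",
    "backup_time", "checked_time", "number_of_files",
    "number_of_directories", "block_size", "number_of_blocks",
    "number_of_unused_blocks", "next_allocation_block",
    "resource_clump_size", "data_clump_size", "next_cnid",
    "write_count", "encodings_bitmap",
)


def read_volume_header(volume_data):
    """Unpack the first 80 bytes of the HFS+/HFSX volume header."""
    values = VOLUME_HEADER_PREFIX.unpack_from(volume_data, VOLUME_HEADER_OFFSET)
    header = dict(zip(FIELD_NAMES, values))
    if header["signature"] not in (b"H+", b"HX"):
        raise ValueError("not an HFS+ or HFSX volume header")
    return header


if __name__ == "__main__":
    # Fabricated volume image for illustration only.
    image = bytearray(2048)
    VOLUME_HEADER_PREFIX.pack_into(
        image, VOLUME_HEADER_OFFSET, b"H+", 4, 0x00000100, b"10.0",
        *range(15), 0)
    header = read_volume_header(bytes(image))
    print(header["signature"], header["version"], header["last_mounted_version"])
```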

5.1.1. Total number of allocation blocks

For a disk whose size is an even multiple of the allocation block size, all areas on the disk are included in an allocation block, including the volume header and alternate volume header. For a disk whose size is not an even multiple of the allocation block size, only the allocation blocks that will fit entirely on the disk are counted here. The remaining space at the end of the disk is not used by the volume format (except for storing the alternate volume header, see section: Alternate volume header).

5.1.2. Volume attribute flags

The volume attribute flags are specified as follows.

TODO: determine MSB and LSB

| Value | Identifier | Description |
| --- | --- | --- |
| 0x00000080 | kHFSVolumeHardwareLockBit | Volume hardware lock. This bit is set if the volume is write-protected due to a hardware setting. |
| 0x00000100 | kHFSVolumeUnmountedBit | Volume unmounted. This bit is set if the volume was correctly flushed before being unmounted or ejected. |
| 0x00000200 | kHFSVolumeSparedBlocksBit | Volume spared blocks. This bit is set if there are any records in the extents (overflow) file for bad blocks. |
| 0x00000400 | kHFSVolumeNoCacheRequiredBit | Volume no cache required. This bit is set if the blocks from this volume should not be cached. |
| 0x00000800 | kHFSBootVolumeInconsistentBit | Boot volume inconsistent. This bit is set if the volume was mounted for writing. |
| 0x00001000 | kHFSCatalogNodeIDsReusedBit | Catalog node identifiers reused. This bit is set when the next catalog identifier value overflows 32 bits, forcing smaller catalog node identifiers to be reused. |
| 0x00002000 | kHFSVolumeJournaledBit | Volume journaled. If this bit is set, the volume has a journal. |
| 0x00004000 | kHFSVolumeInconsistentBit | Unknown (Reserved) |
| 0x00008000 | kHFSVolumeSoftwareLockBit | Volume software lock. This bit is set if the volume is write-protected due to a software setting. |
| 0x40000000 | kHFSContentProtectionBit | Unknown (Reserved) |
| 0x80000000 | kHFSUnusedNodeFixBit | Unknown (Reserved) |
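A small sketch of decoding the attribute flags follows. It applies the values from the table as bit masks to the 32-bit volume attribute flags value once that value has been read as a big-endian integer; the function name `decode_volume_attribute_flags` is hypothetical.

```python
VOLUME_ATTRIBUTE_FLAGS = {
    0x00000080: "kHFSVolumeHardwareLockBit",
    0x00000100: "kHFSVolumeUnmountedBit",
    0x00000200: "kHFSVolumeSparedBlocksBit",
    0x00000400: "kHFSVolumeNoCacheRequiredBit",
    0x00000800: "kHFSBootVolumeInconsistentBit",
    0x00001000: "kHFSCatalogNodeIDsReusedBit",
    0x00002000: "kHFSVolumeJournaledBit",
    0x00004000: "kHFSVolumeInconsistentBit",
    0x00008000: "kHFSVolumeSoftwareLockBit",
    0x40000000: "kHFSContentProtectionBit",
    0x80000000: "kHFSUnusedNodeFixBit",
}


def decode_volume_attribute_flags(flags):
    """Return the identifiers of all known flags set in the value."""
    return [
        name
        for mask, name in sorted(VOLUME_ATTRIBUTE_FLAGS.items())
        if flags & mask
    ]


if __name__ == "__main__":
    # A cleanly unmounted, journaled volume.
    print(decode_volume_attribute_flags(0x00002100))
```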

5.1.3. Alternate volume header

A copy of the volume header, the alternate volume header, is stored starting 1024 bytes before the end of the volume. The alternate volume header is intended for use solely by disk repair utilities.

In order to accommodate the alternate volume header and the reserved space following it, the last allocation block is also marked as used in the allocation file.

The alternate volume header is always stored at offset 1024 bytes from the end of the volume. If the disk size is not an even multiple of the allocation block size, this area may lie beyond the last allocation block. However, the last allocation block (or two allocation blocks for a volume formatted with 512-byte allocation blocks) is still reserved even if the alternate volume header is not stored there.

5.2. Metadata zone

5.2.1. Notes

Mac OS X version 10.3 introduced a new policy for determining where to allocate
space for files, which improves performance for most users. This policy places
the volume metadata and frequently used small files ("hot files") near each
other on disk, which reduces the seek time for typical accesses. This area on
disk is known as the metadata zone.

The volume metadata are the structures that let the file system manage the
contents of the volume. It includes the allocation bitmap file, extents
(overflow) file, and the catalog file, and the journal file. The volume header
and alternate volume header are also metadata, but they have fixed locations
within the volume, so they are not located in the hot file area. Mac OS X may
use a quota users file and quota groups file to manage disk space quotas on a
volume. These files aren't strictly metadata, but they are included in the
metadata zone because of their heavy use by the OS and they are too large to be
considered ordinary hot files.

Implementations are encouraged not to interfere with the metadata zone policy.
For example, a disk optimizer should avoid moving files into the metadata zone
unless that file is known to be frequently accessed, in which case it may be
added to the "hot file" list. Similarly, files in the metadata zone should not
be moved elsewhere on disk unless they are also removed from the hot file list.

This policy is only applied to volumes whose size is at least 10GB, and which
have journaling enabled. The metadata zone is established when the volume is
mounted. The size of the zone is based upon the following sizes:

Item 	Contribution to the Metadata Zone size
Allocation Bitmap File 	Physical size (totalBlocks times the volume's allocation block size) of the allocation bitmap file.
Extents Overflow File 	4MB, plus 4MB per 100GB (up to 128MB maximum)
Journal File 	8MB, plus 8MB per 100GB (up to 512MB maximum)
Catalog File 	10 bytes per KB (1GB minimum)
Hot Files 	5 bytes per KB (10MB minimum; 512MB maximum)
Quota Users File 	Described below
Quota Groups File 	Described below

In Mac OS X version 10.3, the amount of space reserved for the allocation file
is actually the minimum allocation file size for the volume (the total number
of allocation blocks, divided by 8, rounded up to a multiple of the allocation
block size). If the allocation file is larger than that (which is sometimes
done to allow a volume to be more easily grown at a later time), then there
will be less space available for other metadata or hot files in the metadata
zone. This is a bug (r. 3522516).

The amount of space reserved for each type of metadata (except for the
allocation bitmap file) is based on the total size of the volume. For the
purposes of these computations, the total size of the volume is the allocation
block size multiplied by the total number of allocation blocks.

The sizes reserved for quota users and groups files are the result of complex
calculations. In each case, the size reserved is a value of the form (items +
1) * 64 bytes, where items is based on the size of the volume in gigabytes,
rounded down. For the quota users file, items is 256 per gigabyte, rounded up
to a power of 2, with a minimum of 2048, and a maximum of 2097152 (2M). For the
quota groups file, items is 32 per gigabyte, rounded up to a power of 2, with a
minimum of 2048, and a maximum of 262144 (256K). The quota files are considered
hot files, and occupy the hot file area, even though they are larger than the
maximum file size normally eligible to be a hot file.
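
The quota file calculation described above can be sketched as follows. This is a minimal illustration and not Apple's implementation: the function names are hypothetical, a gigabyte is assumed to be 2^30 bytes, and the minimum and maximum are assumed to be applied after rounding up to a power of 2.

```c
#include <stdint.h>

/* Round value up to the nearest power of 2 (returns 1 for 0 or 1). */
uint64_t round_up_to_power_of_2(uint64_t value)
{
    uint64_t result = 1;

    while (result < value) {
        result <<= 1;
    }
    return result;
}

/* Reserved size in bytes: (items + 1) * 64, where items is clamped to
 * [min_items, max_items] after rounding up to a power of 2 (assumed order). */
uint64_t quota_file_reserve(uint64_t volume_size, uint64_t items_per_gigabyte,
                            uint64_t min_items, uint64_t max_items)
{
    uint64_t gigabytes = volume_size >> 30;  /* whole gigabytes, rounded down */
    uint64_t items = round_up_to_power_of_2(gigabytes * items_per_gigabyte);

    if (items < min_items) items = min_items;
    if (items > max_items) items = max_items;
    return (items + 1) * 64;
}

/* Quota users file: 256 items per gigabyte, 2048 to 2097152 items. */
uint64_t quota_users_reserve(uint64_t volume_size)
{
    return quota_file_reserve(volume_size, 256, 2048, 2097152);
}

/* Quota groups file: 32 items per gigabyte, 2048 to 262144 items. */
uint64_t quota_groups_reserve(uint64_t volume_size)
{
    return quota_file_reserve(volume_size, 32, 2048, 262144);
}
```

Under these assumptions a 10 GB volume yields 2560 user items, rounded up to 4096, so (4096 + 1) * 64 = 262208 bytes are reserved for the quota users file.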

The total size of the metadata zone is the sum of the above sizes, rounded up
so that the metadata zone is represented by a whole number of allocation blocks
within the volume bitmap. That is, the start and end of the metadata zone fall
on allocation block boundaries in the volume bitmap. That means that the size
of the metadata zone is rounded up to a multiple of 8 times the square of the
allocation block size. In Mac OS X version 10.3, the extra space due to the
round up of the metadata zone is split up between the catalog and the hot file
area (2/3 and 1/3, respectively).

The calculations for the extents (overflow) file and journal file divide the
total size of the volume by 100GB, rounding down. Then they add one (to
compensate for any remainder lost as part of the rounding). The result is then
multiplied by 4MB or 8MB, respectively. If the volume's total size is not a
multiple of 100GB, this is equivalent to 4MB (or 8MB) per 100GB, rounded up.
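
The extents (overflow) file and journal file calculation can be sketched as below. The helper names are hypothetical, and the sketch assumes that MB and GB are binary units (2^20 and 2^30 bytes) and that the maxima from the table are applied last.

```c
#include <stdint.h>

#define MEBIBYTE ((uint64_t)1 << 20)
#define GIBIBYTE ((uint64_t)1 << 30)

/* (volume size / 100GB, rounded down, plus one) * unit, capped at maximum. */
uint64_t scaled_reserve(uint64_t volume_size, uint64_t unit, uint64_t maximum)
{
    uint64_t reserve = (volume_size / (100 * GIBIBYTE) + 1) * unit;

    return reserve > maximum ? maximum : reserve;
}

/* Extents (overflow) file: 4MB per 100GB, up to 128MB. */
uint64_t extents_reserve(uint64_t volume_size)
{
    return scaled_reserve(volume_size, 4 * MEBIBYTE, 128 * MEBIBYTE);
}

/* Journal file: 8MB per 100GB, up to 512MB. */
uint64_t journal_reserve(uint64_t volume_size)
{
    return scaled_reserve(volume_size, 8 * MEBIBYTE, 512 * MEBIBYTE);
}
```

For a 250GB volume this gives 250 / 100 = 2, plus one, so 3 * 4MB = 12MB for the extents (overflow) file and 3 * 8MB = 24MB for the journal file.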

In Mac OS X version 10.3, the metadata zone is located at the start of the
volume, following the volume header. The hot file area is located towards the
end of the metadata zone.

When performing normal file allocations, the allocator will skip over the
metadata zone. This ensures that the metadata will be less fragmented, and all
of the metadata will be located in the same area on the disk. If the area
outside the metadata zone is exhausted, the allocator will then use space
inside the metadata zone for normal file allocations. Similarly, when
allocating space for metadata, the allocator will use space inside the metadata
zone first. If all of the metadata zone is in use, then metadata allocations
will use space outside the metadata zone.

5.3. Text encoding

HFS+ includes features specifically designed to help Mac OS handle the conversion between Mac OS-encoded strings and Unicode.

The first feature is the text encoding value of the file and folder catalog records. The value refers to a specific encoding type.

Encoding type 	Value 	Encodings bitmap number
MacRoman 	0 	0
MacJapanese 	1 	1
MacChineseTrad 	2 	2
MacKorean 	3 	3
MacArabic 	4 	4
MacHebrew 	5 	5
MacGreek 	6 	6
MacCyrillic 	7 	7
MacDevanagari 	9 	9
MacGurmukhi 	10 	10
MacGujarati 	11 	11
MacOriya 	12 	12
MacBengali 	13 	13
MacTamil 	14 	14
MacTelugu 	15 	15
MacKannada 	16 	16
MacMalayalam 	17 	17
MacSinhalese 	18 	18
MacBurmese 	19 	19
MacKhmer 	20 	20
MacThai 	21 	21
MacLaotian 	22 	22
MacGeorgian 	23 	23
MacArmenian 	24 	24
MacChineseSimp 	25 	25
MacTibetan 	26 	26
MacMongolian 	27 	27
MacEthiopic 	28 	28
MacCentralEurRoman 	29 	29
MacVietnamese 	30 	30
MacExtArabic 	31 	31
MacSymbol 	33 	33
MacDingbats 	34 	34
MacTurkish 	35 	35
MacCroatian 	36 	36
MacIcelandic 	37 	37
MacRomanian 	38 	38
MacFarsi 	140 	49
MacUkrainian 	152 	48

The second use of text encodings in HFS+ is the encodings bitmap value of the volume header. For each encoding used by a catalog node on the volume, the corresponding bit in the encodings bitmap field must be set.

The text encoding value is used as the number of the bit to set in encodings bitmap to indicate that the encoding is used on the volume. However, encodings bitmap is only 64 bits long, and thus the text encoding values for MacFarsi (140) and MacUkrainian (152) cannot be used as bit numbers. Instead, bit 49 is used for MacFarsi and bit 48 for MacUkrainian, as listed in the table above.

It is acceptable for a bit in this bitmap to be set even though no names on the volume use that encoding. This means that when an implementation deletes or renames an object, it does not have to clear the encoding bit if that was the last name to use the given encoding.
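
The mapping from text encoding value to encodings bitmap bit can be sketched as follows. The function names are hypothetical. The sketch maps every value below 64 to the bit of the same number, including values that the table above does not list.

```c
#include <stdint.h>

/* Returns the encodings bitmap bit number for a text encoding value,
 * or -1 if the value cannot be represented in the 64-bit bitmap. */
int hfsplus_encoding_to_bit(uint32_t text_encoding)
{
    if (text_encoding == 140) return 49;  /* MacFarsi */
    if (text_encoding == 152) return 48;  /* MacUkrainian */
    if (text_encoding < 64) return (int)text_encoding;
    return -1;
}

/* Returns the bitmap with the bit for text_encoding set; the bitmap is
 * returned unchanged if the encoding has no bit number. */
uint64_t hfsplus_mark_encoding_used(uint64_t encodings_bitmap,
                                    uint32_t text_encoding)
{
    int bit = hfsplus_encoding_to_bit(text_encoding);

    if (bit < 0) return encodings_bitmap;
    return encodings_bitmap | ((uint64_t)1 << bit);
}
```

Because a set bit is allowed even when no name uses the encoding, such a helper only ever needs to set bits and never to clear them.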

TODO: add text about classic HFS

HFS+ supports both hard links and symbolic links.

Hard links to directories are not supported (allowed).

Hard links in HFS+ are represented by two types of file records:

  • one indirect node file record, named "iNode#", where # is the link reference. This file contains the content of the file shared by the hard links.

  • one or more hard link file records, that reference the indirect node file record.

Indirect node files are stored in a special (invisible) directory, the metadata directory, named "/\u2400\u2400\u2400\u2400HFS+ Private Data".

Note
TN1150 claims that a new link reference is randomly chosen from the range 100 to 1073741923. However, link references that fall outside of this range have been observed, such as "iNode20".

The special permission data of the hard link file records contains the link reference if:

  • the catalog file record flag kHFSHasLinkChainMask is set;

  • and the first 8 bytes of the file information contain "hlnkhfs+"

enum {
    kHardLinkFileType = 0x686C6E6B,  /* 'hlnk' */
    kHFSPlusCreator   = 0x6866732B   /* 'hfs+' */
};
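
The two conditions above, and the naming of the indirect node file, can be sketched as below. The function names are hypothetical. The sketch assumes the caller has already read the 16-bit flags and the 16-byte file information from an HFS+ catalog file record.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define HAS_LINK_CHAIN_MASK 0x0020  /* kHFSHasLinkChainMask */

/* Returns 1 if the record looks like a hard link file record, that is,
 * the link chain flag is set and the file type and creator, stored as
 * big-endian four-character codes, read "hlnk" and "hfs+". */
int is_hard_link_record(uint16_t record_flags,
                        const uint8_t file_information[16])
{
    return (record_flags & HAS_LINK_CHAIN_MASK) != 0 &&
           memcmp(file_information, "hlnkhfs+", 8) == 0;
}

/* Writes the name of the indirect node file ("iNode" followed by the
 * decimal link reference) into name. Returns the snprintf result. */
int format_indirect_node_name(uint32_t link_reference, char *name,
                              size_t name_size)
{
    return snprintf(name, name_size, "iNode%lu",
                    (unsigned long)link_reference);
}
```

With a link reference of 20 taken from the special permission data, the second helper produces "iNode20", the name to look up in the metadata directory.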
Notes
The fileType and fileCreator fields of the userInfo in the catalog record of a
hard link file must be set to kHardLinkFileType and kHFSPlusCreator,
respectively. The hard link file's creation date should be set to the creation
date of the metadata directory. The hard link file's creation date may also be
set to the creation date of the volume's root directory (if it differs from the
creation date of the metadata directory), though this is deprecated.

For better compatibility with older versions of the Mac OS Finder, the
kHasBeenInited flag should be set in the Finder flags. The other Finder
information, and other dates in the catalog record are reserved.

POSIX semantics allow an open file to be unlinked (deleted). These open but
unlinked files are stored on HFS+ volumes much like a hard link. When the open
file is deleted, it is renamed and moved into the metadata directory. The new
name is the string "temp" followed by the catalog node ID converted to decimal
text. When the file is eventually closed, this temporary file may be removed.
All such temporary files may be removed when repairing an unmounted HFS+ volume.

Repairing the Metadata Directory

When repairing an HFS+ volume with hard links or a metadata directory, there
are several conditions that might need to be repaired:

* Opened but deleted files (which are now orphaned).
* Orphaned indirect node files (no hard links refer to them).
* Broken hard link (hard link exists, but indirect node file does not).
* Incorrect link count.
* Link reference is 0.

Opened but deleted files are files whose names start with "temp", and are in
the metadata directory. If the volume is not in use (not mounted, and not being
used by any other utility), then these files can be deleted. Volumes with a
journal, even one with no active transactions, may have opened but undeleted
files that need to be deleted.

Detecting an orphaned indirect node file, broken hard link, or incorrect link
count requires finding all hard link files in the catalog, and comparing the
number of found hard links for each link reference with the link count of the
corresponding indirect node file.

A hard link with a link reference equal to 0 is invalid. Such a hard link may
be the result of a hard link being copied or restored by an implementation or
utility that does not use the permissions in catalog records. It may be
possible to repair the hard link by determining the proper link reference.
Otherwise, the hard link should be deleted.

The data fork of a symbolic link contains the path of the directory or file it refers to.

On HFS+ the path is a POSIX pathname, as used by the Mac OS BSD and Cocoa programming interfaces. It is not a traditional Mac OS, or Carbon, path. The path is stored as a UTF-8 encoded string without an end-of-string character. The length of the path should be 1024 bytes or less. The path may be full or partial, with or without a leading forward slash.

The first 8 bytes of the file information should contain "slnkrhap".

enum {
    kSymLinkFileType  = 0x736C6E6B, /* 'slnk' */
    kSymLinkCreator   = 0x72686170  /* 'rhap' */
};
Notes
On a HFS+ volume, a symbolic link is stored as an ordinary file with special
values in some of the fields of its catalog record. The pathname of the file
being referred to is stored in the data fork. The file type in the fileMode
field of the permissions is set to S_IFLNK. For compatibility with Carbon and
Classic applications, the file type of a symbolic link is set to
kSymLinkFileType, and the creator code is set to kSymLinkCreator. The resource
fork of the symbolic link has zero length and is reserved.

6. The HFS wrapper

An HFS+ volume can be wrapped in a HFS volume.

Mac OS does not use the startup file to boot from HFS+ disks. Instead, it uses the HFS wrapper, as described later in this document.

When a HFS+ volume is embedded within a HFS wrapper the space used by the HFS+ volume is marked as part of the bad block file within the HFS wrapper itself.

6.1. Notes

An HFS+ volume may be contained within a HFS volume in a way that makes the
volume look like a HFS volume to systems without HFS+ support. This has two
important advantages:

1. It allows a computer with HFS (but no HFS+) support in ROM to start up from a HFS+ volume. When creating the wrapper, Mac OS includes a System file containing the minimum code to locate and mount the embedded HFS+ volume and continue booting from its System file.
2. It improves the user experience when a HFS+ volume is inserted in a computer that has HFS support but no HFS+ support. On such a computer, the HFS wrapper will be mounted as a volume, which prevents error dialogs that might confuse the user into thinking the volume is empty, damaged, or unreadable. The HFS wrapper may also contain a Read Me document to explain the steps the user should take to access their files.

The rest of this section describes how the HFS wrapper is laid out and how the HFS+ volume is embedded within the wrapper.

IMPORTANT:
This section does not describe the HFS+ volume format; instead, it describes additions to the HFS volume format that allow a HFS+ volume (or some other volume) to be embedded in a HFS volume. However, as all Mac OS volumes are formatted with a HFS wrapper, all implementations should be able to parse the wrapper to find the embedded HFS+ volume.

Note:
An HFS+ volume is not required to have a HFS wrapper. In that case, the volume will start at the beginning of the disk, and the volume header will be at offset 1024 bytes. However, Apple software currently initializes all HFS+ volumes with a HFS wrapper.

HFS Master Directory Block

An HFS volume always contains a Master Directory Block (MDB), at offset 1024 bytes. The MDB is similar to a HFS+ volume header. In order to support volumes embedded within a HFS volume, several unused fields of the MDB have been changed, and are now used to indicate the type, location, and size of the embedded volume.

What was formerly the drVCSize field (at offset 0x7C) is now named drEmbedSigWord. This two-byte field contains a unique value that identifies the type of embedded volume. When a HFS+ volume is embedded, drEmbedSigWord must be kHFSPlusSigWord ('H+'), the same value stored in the signature field of a HFS+ volume header.

What were formerly the drVBMCSize and drCtlCSize fields (at offset 0x7E) have been combined into a single field occupying four bytes. The new structure is named drEmbedExtent and is of type HFSExtentDescriptor. It contains the starting allocation block number (startBlock) where the embedded volume begins and the number of allocation blocks (blockCount) the embedded volume occupies. The embedded volume must be contiguous. Both of these values are in terms of the HFS wrapper's allocation blocks, not HFS+ allocation blocks.

Note:
The description of the HFS volume format in Inside Macintosh: Files describes these fields as being used to store the size of various caches, and labels each one as "used internally".

To actually find the embedded volume's location on disk, an implementation must use the drAlBlkSiz and drAlBlSt fields of the MDB. The drAlBlkSiz field contains the size (in bytes) of the HFS allocation blocks. The drAlBlSt field contains the offset, in 512-byte blocks, of the wrapper's allocation block 0 relative to the start of the volume.

IMPORTANT:
This embedding introduces a transform between HFS+ volume offsets and disk offsets. The HFS+ volume exists on a virtual disk embedded within the real disk. When accessing a HFS+ structure on an embedded disk, an implementation must add the offset of the embedded disk to the HFS+ location. Listing 2 shows how one might do this, assuming 512-byte sectors.

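static UInt32 HFSPlusSectorToDiskSector(UInt32 hfsPlusSector)
{
    UInt32 embeddedDiskOffset;

    /* Both terms are in 512-byte sectors: drAlBlSt already is, and the
       start block is scaled by the number of sectors per wrapper block. */
    embeddedDiskOffset = gMDB.drAlBlSt +
                         gMDB.drEmbedExtent.startBlock * (gMDB.drAlBlkSiz / 512);
    return embeddedDiskOffset + hfsPlusSector;
}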

Listing 2. Sector transform for embedded volumes.
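
The same transform can also be expressed in bytes, as in the self-contained sketch below. The structure and function names are hypothetical. The caller is assumed to have read the four values from the master directory block and converted them from big-endian.

```c
#include <stdint.h>

/* Values read from the HFS master directory block (hypothetical struct). */
typedef struct {
    uint32_t allocation_block_size;  /* drAlBlkSiz, in bytes */
    uint16_t first_block_sector;     /* drAlBlSt, in 512-byte sectors */
    uint16_t embed_start_block;      /* drEmbedExtent.startBlock */
    uint16_t embed_block_count;      /* drEmbedExtent.blockCount */
} hfs_wrapper_info;

/* Byte offset of the embedded volume relative to the start of the wrapper. */
uint64_t embedded_volume_offset(const hfs_wrapper_info *info)
{
    return (uint64_t)info->first_block_sector * 512 +
           (uint64_t)info->embed_start_block * info->allocation_block_size;
}

/* Size of the embedded volume in bytes. */
uint64_t embedded_volume_size(const hfs_wrapper_info *info)
{
    return (uint64_t)info->embed_block_count * info->allocation_block_size;
}

/* The HFS+ volume header is at offset 1024 within the embedded volume. */
uint64_t embedded_volume_header_offset(const hfs_wrapper_info *info)
{
    return embedded_volume_offset(info) + 1024;
}
```

Working in bytes avoids the integer division by 512 and makes it straightforward to check the "H+" signature at the computed header offset.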

In order to prevent accidentally changing the files in the HFS wrapper, the wrapper volume must be marked as software-write-protected by setting kHFSVolumeSoftwareLockBit in the drAtrb (volume attributes) field of the MDB. All correct HFS implementations will prevent any changes to the wrapper volume.

To improve performance of HFS+ volumes, the size of the wrapper's allocation blocks should be a multiple of the size of the HFS+ volume's allocation blocks. In addition, the wrapper's allocation block start (drAlBlSt) should be a multiple of the HFS+ volume's allocation block size (or perhaps 4 KB, if the HFS+ allocation blocks are larger). If these recommendations are followed, the HFS+ allocation blocks will be properly aligned on the disk. And, if the HFS+ allocation block size is a multiple of the sector size, then blocking and deblocking at the device driver level will be minimized.

Allocating Space for the Embedded Volume

The space occupied by the embedded volume must be marked as allocated in the HFS wrapper's volume bitmap (similar to the HFS+ allocation file) and placed in the HFS wrapper's bad block file (similar to the HFS+ bad block file). This doesn't mean the blocks are actually bad; it merely prevents the HFS+ volume from being overwritten by newly created files in the HFS wrapper, being deleted accidentally, or being marked as free, usable space by HFS disk repair utilities.

The kHFSVolumeSparedBlocksMask bit of the drAtrb (volume attributes) field of the MDB must be set to indicate that the volume has a bad blocks file.

Read Me and System Files

IMPORTANT:
This section is not part of the HFS+ volume format. It describes how the existing Mac OS implementation of HFS+ creates HFS wrappers. It is provided for your information only.

As initialized by the Mac OS Disk Initialization Package, the HFS wrapper volume contains five files in the root folder.

    * Read Me -- The Read Me file, whose name is actually "Where_have_all_my_files_gone?", contains text explaining that this volume is really a HFS+ volume but the contents cannot be accessed because HFS+ is not currently installed on the computer. It also describes the steps needed to install HFS+ support. Localized system software will also create a localized version of the file with localized file name and text content.
    * System and Finder (invisible) -- The System file contains the minimum code to locate and mount the embedded HFS+ volume, and to continue booting from the System file in the embedded volume. The Finder file is empty; it is there to prevent older versions of the Finder from de-blessing the wrapper's root directory, which would prevent booting from the volume.
    * Desktop DB and Desktop DF (invisible) -- The Desktop DB and Desktop DF files are an artifact of the way the files on the wrapper volume are created.

In addition, the root folder is set as the blessed folder by placing its folder ID in the first SInt32 of the drFndrInfo (Finder information) field of the MDB.

7. The catalog file

The catalog file is a B-tree file used to maintain information about the hierarchy of files and directories of a volume.

The allocation block number of the first file extent of the catalog file (the header node) is stored in the master directory block (HFS) or the volume header (HFS+). The B-tree structure is described in section: B-tree file.

Each node in the catalog file is assigned a unique catalog node identifier (CNID). The CNID is used for both directory and file identifiers. For any given file or directory the parent identifier is the CNID of the parent directory. The first 16 CNIDs are reserved for use by Apple and include the following standard assignments:

CNID 	Identifier 	Assignment
0 	 	Unknown (Reserved)
1 	kHFSRootParentID 	Parent identifier of the root directory (folder)
2 	kHFSRootFolderID 	Directory identifier of the root directory (folder)
3 	kHFSExtentsFileID 	The extents (overflow) file
4 	kHFSCatalogFileID 	The catalog file
5 	kHFSBadBlockFileID 	The bad allocation block file
6 	kHFSAllocationFileID 	The allocation file (HFS+)
7 	kHFSStartupFileID 	The startup file (HFS+)
8 	kHFSAttributesFileID 	The attributes file (HFS+)
14 	kHFSRepairCatalogFileID 	Used temporarily by fsck_hfs when rebuilding the catalog file.
15 	kHFSBogusExtentFileID 	The bogus extent file. Used temporarily during exchange files operations.
16 	kHFSFirstUserCatalogNodeID 	The first available CNID for user's files and folders

7.1. Catalog file index keys

In a catalog file the search key consists of:

  • parent directory identifier

  • file or directory name

The volume reference number is not included in the search key.

7.1.1. HFS catalog index key

The HFS catalog index key is variable in size and consists of:

Offset 	Size 	Value 	Description
0 	1 	 	The key data size. Signed 8-bit integer. Contains number of bytes
If key data size > 0
1 	1 	 	Unknown (Reserved)
2 	4 	 	The parent identifier. Contains a CNID
6 	1 	 	Number of characters in the name string. The end-of-string character is not included
7 	… 	 	Name string. Contains an ASCII string with end-of-string character. Contains the name of the file or directory
… 	… 	 	Unknown (Padding)

The key data size may contain values from 7 to 37. A deleted record is indicated by a key data size of 0.

In an index node, the catalog node name is always stored as 32 bytes and therefore the key data size should be 37. In a leaf node the catalog node name varies in size.

HFS catalog index keys in a leaf node must be stored 16-bit aligned within the node data. The size of the alignment padding is not included in the key data size.
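
Reading an HFS catalog index key according to the layout above can be sketched as follows. The structure and function names are hypothetical. The sketch is deliberately lenient and accepts any key data size from 6 (the fixed fields only) to 37. It copies the name bytes as-is without interpreting their encoding.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t parent_identifier;  /* CNID of the parent directory */
    uint8_t name_length;         /* number of characters in the name */
    char name[32];               /* NUL-terminated copy of the name bytes */
} hfs_catalog_key;

/* Returns 0 on success, 1 for a deleted record (key data size 0) and
 * -1 if the data is too short or inconsistent. */
int hfs_parse_catalog_key(const uint8_t *data, size_t data_size,
                          hfs_catalog_key *key)
{
    size_t key_data_size;

    if (data_size < 1) return -1;
    key_data_size = data[0];
    if (key_data_size == 0) return 1;
    /* 6 bytes of fixed fields follow the size byte; 37 is the maximum. */
    if (key_data_size < 6 || key_data_size > 37) return -1;
    if (data_size < 1 + key_data_size) return -1;

    /* The parent identifier is stored big-endian at offset 2. */
    key->parent_identifier = ((uint32_t)data[2] << 24) |
                             ((uint32_t)data[3] << 16) |
                             ((uint32_t)data[4] << 8) |
                             (uint32_t)data[5];
    key->name_length = data[6];
    if (key->name_length > 31 ||
        (size_t)key->name_length + 6 > key_data_size) return -1;

    memcpy(key->name, &data[7], key->name_length);
    key->name[key->name_length] = '\0';
    return 0;
}
```

A caller walking a leaf node would additionally skip the alignment padding after the key, since that padding is not counted in the key data size.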

7.1.2. HFS+ catalog index key

The HFS+ catalog index key is variable in size and consists of:

Offset 	Size 	Value 	Description
0 	2 	 	The key data size. Contains number of bytes
If key data size > 0
2 	4 	 	The parent identifier. Contains a CNID
If key data size > 6
6 	2 	 	Number of characters in the name string
8 	… 	 	Name string. UTF-16 big-endian string without end-of-string character. Contains the name of the file or directory

The name string contains at most 255 UTF-16 code units.
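
Reading an HFS+ catalog index key can be sketched as below. The structure and function names are hypothetical. The sketch assumes a limit of 255 UTF-16 code units for the name and treats a key data size between 1 and 5 as invalid.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t parent_identifier;  /* CNID of the parent directory */
    uint16_t name_length;        /* number of UTF-16 code units */
    uint16_t name[255];          /* name converted to host byte order */
} hfsplus_catalog_key;

static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 on success, 1 for an empty key (key data size 0) and
 * -1 if the data is too short or inconsistent. */
int hfsplus_parse_catalog_key(const uint8_t *data, size_t data_size,
                              hfsplus_catalog_key *key)
{
    size_t key_data_size;
    uint16_t index;

    if (data_size < 2) return -1;
    key_data_size = read_be16(data);
    if (key_data_size == 0) return 1;
    /* The parent identifier and name length take 6 bytes. */
    if (key_data_size < 6 || data_size < 2 + key_data_size) return -1;

    key->parent_identifier = read_be32(&data[2]);
    key->name_length = read_be16(&data[6]);
    if (key->name_length > 255 ||
        6 + 2 * (size_t)key->name_length > key_data_size) return -1;

    for (index = 0; index < key->name_length; index++) {
        key->name[index] = read_be16(&data[8 + 2 * (size_t)index]);
    }
    return 0;
}
```

The name is kept as UTF-16 code units because comparing keys requires the HFS+ or HFSX comparison rules, which are outside the scope of this sketch.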

7.2. The catalog data

A catalog leaf node can contain four different types of records:

  • a directory record, which contains information about a single directory.

  • a file record, which contains information about a single file.

  • a directory thread record, which provides a link between a directory and its parent directory.

  • a file thread record, which provides a link between a file and its parent directory.

The thread records are used to find the name and directory identifier of the parent of a given file or directory.

Each catalog data record consists of:

  • the catalog data record header;

  • the catalog data record data.

7.2.1. The catalog data record header

The HFS catalog data record header

The HFS catalog data record header is 2 bytes in size and consists of:

Offset 	Size 	Value 	Description
0 	1 	 	The record type. Signed 8-bit integer. See section: Record types
1 	1 	0x00 	Unknown (Reserved). Signed 8-bit integer

Note
To distinguish between HFS and HFS+ record types, the record type should be treated as a 16-bit big-endian value.

The HFS+ catalog data record header

The HFS+ catalog data record header is 2 bytes in size and consists of:

Offset 	Size 	Value 	Description
0 	2 	 	The record type. See section: Record types

The catalog data record types
Value 	Identifier 	Description
0x0001 	kHFSPlusFolderRecord 	HFS+ Directory record
0x0002 	kHFSPlusFileRecord 	HFS+ File record
0x0003 	kHFSPlusFolderThreadRecord 	HFS+ Directory thread record
0x0004 	kHFSPlusFileThreadRecord 	HFS+ File thread record
0x0100 	kHFSFolderRecord 	HFS Directory record
0x0200 	kHFSFileRecord 	HFS File record
0x0300 	kHFSFolderThreadRecord 	HFS Directory thread record
0x0400 	kHFSFileThreadRecord 	HFS File thread record
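
Reading the record type as a 16-bit big-endian value makes it possible to tell HFS and HFS+ records apart, as in the sketch below. The enum and function names are hypothetical.

```c
#include <stdint.h>

typedef enum {
    RECORD_KIND_DIRECTORY = 1,
    RECORD_KIND_FILE = 2,
    RECORD_KIND_DIRECTORY_THREAD = 3,
    RECORD_KIND_FILE_THREAD = 4
} record_kind;

/* Classifies the 2-byte catalog data record header. Returns 0 on success
 * and -1 for an unknown record type. HFS+ types are 0x0001 to 0x0004,
 * HFS types are 0x0100 to 0x0400. */
int classify_catalog_record(const uint8_t header[2], int *is_hfs_plus,
                            record_kind *kind)
{
    uint16_t record_type = (uint16_t)((header[0] << 8) | header[1]);
    uint8_t high = (uint8_t)(record_type >> 8);
    uint8_t low = (uint8_t)(record_type & 0xff);

    if (high == 0 && low >= 1 && low <= 4) {
        *is_hfs_plus = 1;
        *kind = (record_kind)low;
        return 0;
    }
    if (low == 0 && high >= 1 && high <= 4) {
        *is_hfs_plus = 0;
        *kind = (record_kind)high;
        return 0;
    }
    return -1;
}
```

In practice the volume type is already known from the volume header or master directory block, so a check like this is mainly useful for validating a record.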

7.2.2. The catalog directory record

The HFS catalog directory record

The HFS catalog directory record (cdrDirRec, kHFSFolderRecord) is 70 bytes in size and consists of:

Offset 	Size 	Value 	Description
0 	2 	0x0100 	The record type
2 	2 	 	Directory (folder) flags. See section: directory record flags
4 	2 	 	Number of directory entries (valence)
6 	4 	 	The identifier. Contains a CNID
10 	4 	 	Creation date and time. Contains a HFS timestamp in local time
14 	4 	 	(content) modification date and time. Contains a HFS timestamp in local time
18 	4 	 	Backup date and time. Contains a HFS timestamp in local time
22 	16 	 	Folder information. See section: HFS folder information
38 	16 	 	Extended folder information. See section: HFS extended folder information
54 	( 4 x 4 ) = 16 	 	Unknown (Reserved). Array of 32-bit integer values

HFS catalog directory record flags

Not defined. The HFS catalog directory record appears to always have a corresponding folder thread record.

The HFS+ catalog directory record

The HFS+ catalog directory record (HFSPlusCatalogFolder) is 88 bytes in size and consists of:

Offset 	Size 	Value 	Description
0 	2 	0x0001 	The record type
2 	2 	 	Directory (folder) flags. See section: file record flags
4 	4 	 	Number of directory entries (valence)
8 	4 	 	The identifier. Contains a CNID
12 	4 	 	Creation date and time. Contains a HFS timestamp in UTC
16 	4 	 	(content) modification date and time. Contains a HFS timestamp in UTC
20 	4 	 	Entry (or attribute) modification date and time. Contains a HFS timestamp in UTC
24 	4 	 	Access date and time. Contains a HFS timestamp in UTC
28 	4 	 	Backup date and time. Contains a HFS timestamp in UTC
Permissions
32 	4 	 	Owner identifier
36 	4 	 	Group identifier
40 	1 	 	Administration flags. BSD like flags settable by the super-user only. Also see: Administration flags
41 	1 	 	Owner flags. BSD like flags settable by the owner. Also see: Owner flags
42 	2 	 	File mode. Also see: File mode
44 	4 	 	Special permission data
Folder information
48 	16 	 	Folder information. See section: HFS+ folder information
Extended folder information
64 	16 	 	Extended folder information. See section: HFS+ extended folder information
80 	4 	 	Text encoding hint. See section: Text encoding
84 	4 	0x00 	Unknown (Reserved)
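
The date and time values in these records can be converted to POSIX time as sketched below. This relies on the HFS timestamp being an unsigned 32-bit count of seconds since January 1, 1904, which is defined outside this section. The function name is hypothetical. For classic HFS the result is in local time, for HFS+ it is in UTC.

```c
#include <stdint.h>

/* Seconds between 1904-01-01 00:00:00 and 1970-01-01 00:00:00. */
#define HFS_TO_POSIX_EPOCH_DELTA 2082844800LL

/* Converts an HFS timestamp to seconds relative to the POSIX epoch.
 * The result is negative for dates before 1970. */
int64_t hfs_timestamp_to_posix(uint32_t hfs_timestamp)
{
    return (int64_t)hfs_timestamp - HFS_TO_POSIX_EPOCH_DELTA;
}
```

A 64-bit signed result is used so that the full unsigned 32-bit range of the on-disk value, including dates before 1970, can be represented.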

7.2.3. The catalog file record

The HFS catalog file record

The HFS catalog file record (cdrFilRec, kHFSFileRecord) is 102 bytes in size and consists of:

Offset 	Size 	Value 	Description
0 	2 	0x0200 	The record type
2 	1 	 	Flags. Signed 8-bit integer. See section: file record flags
3 	1 	 	File type. Signed 8-bit integer. This field should always contain 0.
4 	16 	 	File information. See section: HFS file information
20 	4 	 	The identifier. Contains a CNID
24 	2 	 	Data fork block number (not used?)
26 	4 	 	Data fork size
30 	4 	 	Data fork allocated size
34 	2 	 	Resource fork block number (not used?)
36 	4 	 	Resource fork size
40 	4 	 	Resource fork allocated size
44 	4 	 	Creation date and time. Contains a HFS timestamp in local time
48 	4 	 	(content) modification date and time. Contains a HFS timestamp in local time
52 	4 	 	Backup date and time. Contains a HFS timestamp in local time
56 	16 	 	Extended file information
72 	2 	 	The clump size
74 	( 3 x 4 ) = 12 	 	The first data fork extents record. See section: The HFS extents record
86 	( 3 x 4 ) = 12 	 	The first resource fork extents record. See section: The HFS extents record
98 	4 	0x00 	Unknown (Reserved)

HFS catalog file record flags
Value 	Identifier 	Description
0x0001 	 	File is locked and cannot be written to
0x0002 	 	Has thread record
0x0080 	kHFSHasDateAddedMask 	Has added date and time

The HFS+ catalog file record

The HFS+ catalog file record (kHFSPlusFileRecord) is 248 bytes in size and consists of:

Offset 	Size 	Value 	Description
0 	2 	0x0002 	The record type
2 	2 	 	Flags. See section: file record flags
4 	4 	0x00 	Unknown (Reserved)
8 	4 	 	The identifier. Contains a CNID
12 	4 	 	Creation date and time. Contains a HFS timestamp in UTC
16 	4 	 	(content) modification date and time. Contains a HFS timestamp in UTC
20 	4 	 	Entry (or attribute) modification date and time. Contains a HFS timestamp in UTC
24 	4 	 	Access date and time. Contains a HFS timestamp in UTC
28 	4 	 	Backup date and time. Contains a HFS timestamp in UTC
Permissions
32 	4 	 	Owner identifier
36 	4 	 	Group identifier
40 	1 	 	Administration flags. BSD like flags settable by the super-user only. Also see: Administration flags
41 	1 	 	Owner flags. BSD like flags settable by the owner. Also see: Owner flags
42 	2 	 	File mode. Also see: File mode
44 	4 	 	Special permission data. Consists of one of: hard link reference, number of (hard) links, raw device number
File information
48 	16 	 	File information (or user information). See section: HFS+ file information
Extended file information
64 	16 	 	Extended file information (or finder information). See section: HFS+ extended file information
80 	4 	 	Text encoding hint. See section: Text encoding
84 	4 	0x00 	Unknown (Reserved)
88 	80 	 	Data fork. See section: HFS+ fork descriptor structure
168 	80 	 	Resource fork. See section: HFS+ fork descriptor structure

HFS+ catalog file record flags
Value 	Identifier 	Description
0x0001 	kHFSFileLockedMask 	File is locked and cannot be written to
0x0002 	kHFSThreadExistsMask 	Has thread record. This should always be set for files on HFS+/HFSX
0x0004 	kHFSHasAttributesMask 	Has extended attributes
0x0008 	kHFSHasSecurityMask 	Has ACLs
0x0010 	kHFSHasFolderCountMask 	Has number of sub-folders
0x0020 	kHFSHasLinkChainMask 	Has a hard link target (link chain). The CNID of the hard link target is stored in the special permission data
0x0040 	kHFSHasChildLinkMask 	Has a child that is a directory link
0x0080 	kHFSHasDateAddedMask 	Has added date and time. The extended folder or file information contains the date and time the folder or file was added (date_added)
0x0100 	kHFSFastDevPinnedMask 	Unknown
0x0200 	kHFSDoNotFastDevPinMask 	Unknown
0x0400 	kHFSFastDevCandidateMask 	Unknown
0x0800 	kHFSAutoCandidateMask 	Unknown

7.2.4. The catalog thread record

The file thread record is similar to the directory thread record except that it refers to a file, instead of a directory.

The HFS catalog file thread record

The HFS catalog thread record (cdrThdRec, cdrFThdRec, HFSCatalogThread) is variable in size and consists of:

Offset 	Size 	Value 	Description
0 	2 	0x0300 or 0x0400 	The record type
2 	( 2 x 4 ) = 8 	0x00 	Unknown (Reserved). Array of 32-bit integer values
10 	4 	 	The parent identifier. Contains a CNID
14 	1 	 	Number of characters in the name string
15 	… 	 	Name string. ASCII string. Contains the name of the associated file or directory

The HFS+ catalog file thread record

The HFS+ catalog thread record (HFSPlusCatalogThread) is variable in size and consists of:

Offset 	Size 	Value 	Description
0 	2 	0x0003 or 0x0004 	The record type
2 	2 	0x00 	Unknown (Reserved). Unsigned 16-bit integer
4 	4 	 	The parent identifier. Contains a CNID
8 	2 	 	Number of characters in the name string
10 	… 	 	Name string. UTF-16 big-endian string without end-of-string character. Contains the name of the associated file or directory

The name string contains at most 255 UTF-16 code units.

7.3. Permissions

For each file and folder, HFS+ maintains a basic access permissions record. These permissions are similar to basic Unix file permissions.

TODO: add note about permissions on HFS

7.3.1. Owner and group identifier

The owner identifier contains the Mac OS X user ID of the owner of the file or folder. Mac OS X versions prior to 10.3 treat user ID 99 as if it were the user ID of the user currently logged in to the console. If no user is logged in to the console, user ID 99 is treated as user ID 0 (root). Mac OS X version 10.3 treats user ID 99 as if it were the user ID of the process making the call (in effect, making it owned by everyone simultaneously). These substitutions happen at run-time. The actual user ID on disk is not changed.

The group identifier contains the Mac OS X group ID of the group associated with the file or folder. Mac OS X typically maps group ID 99 to the group named "unknown." There is no run-time substitution of group IDs in Mac OS X.

7.3.2. Administration flags

Value 	Identifier 	Description
0x01 	SF_ARCHIVED 	File has been archived
0x02 	SF_IMMUTABLE 	File is immutable and may not be changed
0x04 	SF_APPEND 	Writes to file may only append

7.3.3. Owner flags

Value 	Identifier 	Description
0x01 	UF_NODUMP 	Do not backup (dump) this file
0x02 	UF_IMMUTABLE 	File is immutable and may not be changed
0x04 	UF_APPEND 	Writes to file may only append
0x08 	UF_OPAQUE 	Directory is opaque

7.3.4. File mode

Value 	Identifier 	Description
0xf000 (0170000) 	S_IFMT 	File type bitmask
0x1000 (0010000) 	S_IFIFO 	Named pipe
0x2000 (0020000) 	S_IFCHR 	Character-special file (Character device)
0x4000 (0040000) 	S_IFDIR 	Directory
0x6000 (0060000) 	S_IFBLK 	Block-special file (Block device)
0x8000 (0100000) 	S_IFREG 	Regular file
0xa000 (0120000) 	S_IFLNK 	Symbolic link
0xc000 (0140000) 	S_IFSOCK 	Socket
0xe000 (0160000) 	S_IFWHT 	Whiteout. A whiteout is a file entry that covers up all entries of a particular name from lower branches

HFS+ uses the BSD file type and mode bits. Note that the constants from the header shown below are in octal (base eight), not hexadecimal.

Octal value Identifier Description

0004000

S_ISUID

Set user identifier on execution

0002000

S_ISGID

Set group identifier on execution

0001000

S_ISTXT

Sticky bit

0000700

S_IRWXU

Read, write and execute access for owner

0000400

S_IRUSR

Read access for owner

0000200

S_IWUSR

Write access for owner

0000100

S_IXUSR

Execute access for owner

0000070

S_IRWXG

Read, write and execute access for group

0000040

S_IRGRP

Read access for group

0000020

S_IWGRP

Write access for group

0000010

S_IXGRP

Execute access for group

0000007

S_IRWXO

Read, write and execute access for other

0000004

S_IROTH

Read access for other

0000002

S_IWOTH

Write access for other

0000001

S_IXOTH

Execute access for other
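
The two tables above can be combined to decode a file mode value. The following is a minimal sketch, not part of the original specification; the function and macro names (hfs_file_type, hfs_permission_string, HFS_S_IF*) are hypothetical and only the numeric values come from the tables.

```c
#include <stdint.h>

/* File type values copied from the file mode table above. */
#define HFS_S_IFMT  0xf000u
#define HFS_S_IFDIR 0x4000u
#define HFS_S_IFREG 0x8000u
#define HFS_S_IFLNK 0xa000u

/* Returns the file type bits (upper 4 bits) of a file mode value. */
static uint16_t hfs_file_type(uint16_t file_mode)
{
    return (uint16_t)(file_mode & HFS_S_IFMT);
}

/* Writes the nine rwx permission bits as text, e.g. "rw-r--r--".
 * out must have room for 10 characters (9 symbols plus the terminator).
 * The set-user-ID, set-group-ID and sticky bits are not rendered here. */
static void hfs_permission_string(uint16_t file_mode, char out[10])
{
    static const char symbols[] = "rwxrwxrwx";
    int i;

    for (i = 0; i < 9; i++) {
        /* 0400 is S_IRUSR; each shift selects the next lower permission bit. */
        out[i] = (file_mode & (0400u >> i)) ? symbols[i] : '-';
    }
    out[9] = '\0';
}
```

For a file mode of 0100644 (octal) this sketch reports a regular file with the permissions "rw-r--r--".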

Note
If the sticky bit is set for a directory, then Mac OS restricts movement, deletion, and renaming of files in that directory. Files may be removed or renamed only if the user has write access to the directory and is the owner of the file or the directory, or is the super-user.
Notes
special
    This field is used only for certain special kinds of files. For directories, and most files, this field is unused and reserved. When used, this field is used as one of the following:
iNodeNum
    For hard link files, this field contains the link reference number. See the Hard Links section for more information.
linkCount
    For indirect node files, this field contains the number of hard links that point at this indirect node file. See the Hard Links section for more information.
rawDevice
    For block and character special devices files (when the S_IFMT field contains S_IFCHR or S_IFBLK), this field contains the device number.

WARNING:
Mac OS 8 and 9 treat the permissions as reserved.

Note:
The S_IFWHT and UF_OPAQUE values are used when the file system is mounted as
part of a union mount. A union mount presents the combination (union) of
several file systems as a single file system. Conceptually, these file systems
are layered, one on top of another. If a file or directory appears in multiple
layers, the one in the top most layer is used. All changes are made to the top
most file system only; the others are read-only. To delete a file or directory
that appears in a layer other than the top layer, a whiteout entry (file type
S_IFWHT) is created in the top layer. If a directory that appears in a layer
other than the top layer is deleted and later recreated, the contents in the
lower layer must be hidden by setting the UF_OPAQUE flag in the directory in
the top layer. Both S_IFWHT and UF_OPAQUE hide corresponding names in lower
layers by preventing a union mount from accessing the same file or directory
name in a lower layer.

Note:
If the S_IFMT field (upper 4 bits) of the fileMode field is zero, then Mac OS X
assumes that the permissions structure is uninitialized, and internally uses
default values for all of the fields. The default user and group IDs are 99,
but can be changed at the time the volume is mounted. This default ownerID is
then subject to substitution as described above.

This means that files created by Mac OS 8 and 9, or any other implementation
that sets the permissions fields to zeroes, will behave as if the "ignore
ownership" option is enabled for those files, even if "ignore ownership" is
disabled for the volume as a whole.
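
The fallback described in the note above could be sketched as follows. This is an illustration only: the structure and function names are hypothetical, and the default of 99 is hard-coded although the text says it can be changed when the volume is mounted.

```c
#include <stdint.h>

#define HFS_S_IFMT     0xf000u
#define HFS_DEFAULT_ID 99u /* default user and group ID named in the text */

struct hfs_owner {
    uint32_t user_id;
    uint32_t group_id;
};

/* Returns the owner to use for a catalog record. If the file type bits of
 * the file mode are zero the permissions are treated as uninitialized and
 * the defaults are used instead of the values stored on disk. */
static struct hfs_owner hfs_effective_owner(uint32_t owner_id,
                                            uint32_t group_id,
                                            uint16_t file_mode)
{
    struct hfs_owner owner;

    if ((file_mode & HFS_S_IFMT) == 0) {
        owner.user_id = HFS_DEFAULT_ID;
        owner.group_id = HFS_DEFAULT_ID;
    } else {
        owner.user_id = owner_id;
        owner.group_id = group_id;
    }
    return owner;
}
```

The run-time substitution of user ID 99 described in the owner identifier section would be applied after this step and is not modelled here.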

7.4. File system hierarchy

File and directory (folder) records have a search key with a non-empty name string. In thread records the name string in the search key is empty. E.g. to list the file entries in a directory:

  • find all the file or directory records given the parent CNID

Finding a file or directory by its CNID is a two-step process:

  1. use the CNID to look up the thread record for the file or directory

  2. use the thread record to look up the file or directory record

7.5. File forks

Forks in HFS and HFS+ can be compared to data streams in NTFS. In HFS+ the fork values are grouped in a separate fork descriptor structure. HFS+ also defines extended attributes (named forks). These are not stored in the catalog file but in the attributes file.

7.5.1. HFS+ fork descriptor structure

HFS+ maintains information about file contents using the HFS+ fork descriptor structure (HFSPlusForkData).

The fork descriptor structure is 80 bytes of size and consists of:

Offset Size Value Description

0

8

Logical size
Contains number of bytes

8

4

Clump size
Contains number of bytes

12

4

Number of (allocation) blocks
The total number of allocation blocks used by all the extents in this fork.

16

( 8 x ( 4 + 4 ) ) = 64

The extent (data) record
See section: The HFS+ extents record

Clump size

For fork descriptor structures:

  • in the volume header this is the fork’s clump size, which is used in preference to the default clump size in the volume header.

  • in a catalog record, this value was intended to store a per-fork clump size to override the default clump size in the volume header. However, Apple implementations prior to Mac OS X version 10.3 ignored this field. As of Mac OS X version 10.3, this field is used to keep track of the number of blocks actually read from the fork.
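
As an illustration of the layout above, the following sketch reads an 80-byte fork descriptor from a byte buffer. It assumes the integers are stored big-endian, as elsewhere in HFS+; the structure and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define FORK_DESCRIPTOR_SIZE 80
#define EXTENTS_PER_RECORD   8

struct fork_extent {
    uint32_t start_block;
    uint32_t block_count;
};

struct fork_descriptor {
    uint64_t logical_size;  /* offset 0, in bytes */
    uint32_t clump_size;    /* offset 8 */
    uint32_t total_blocks;  /* offset 12 */
    struct fork_extent extents[EXTENTS_PER_RECORD]; /* offset 16 */
};

static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint64_t read_be64(const uint8_t *p)
{
    return ((uint64_t)read_be32(p) << 32) | read_be32(p + 4);
}

/* Returns 0 on success, -1 if the arguments are invalid or the buffer
 * is smaller than one fork descriptor. */
static int parse_fork_descriptor(const uint8_t *data, size_t size,
                                 struct fork_descriptor *out)
{
    size_t i;

    if (data == NULL || out == NULL || size < FORK_DESCRIPTOR_SIZE) {
        return -1;
    }
    out->logical_size = read_be64(data);
    out->clump_size = read_be32(data + 8);
    out->total_blocks = read_be32(data + 12);

    for (i = 0; i < EXTENTS_PER_RECORD; i++) {
        const uint8_t *extent = data + 16 + (i * 8);
        out->extents[i].start_block = read_be32(extent);
        out->extents[i].block_count = read_be32(extent + 4);
    }
    return 0;
}
```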

8. The extents (overflow) file

In HFS and HFS+, extents (contiguous ranges of allocation blocks) are used to track which blocks belong to a file. The first three (HFS) or eight (HFS+) extents of a fork are stored in the catalog file. Additional extents are stored in the extents (overflow) file.

The structure of an extents (overflow) file is relatively simple compared to that of a catalog file. The function of the extents (overflow) file is to store those file extents that are not contained in the master directory block (MDB) or volume header and the catalog file.

Note
The file system B-tree files can have additional extents in the extents (overflow) file. This has been observed with the attributes file. It is currently unknown if the extents (overflow) file itself can have overflow extents.

8.1. The extent key (record)

Disks initialized using the enhanced Disk Initialization Manager introduced in system software version might contain extent records for some blocks that do not belong to any actual file in the file system. These extent records have been marked as a bad block (CNID 5). See the chapter "Disk Initialization Manager" in this book for details on bad block sparing.

The key has been selected so that the extent records for a particular fork are grouped together in the B-tree, right next to all the extent records for the other fork of the file. The fork offset of the preceding extent record is needed to determine the key of the next extent record.

In an extents (overflow) file the search key consists of:

  • fork type

  • file identifier

  • first allocation block in the extent

8.1.1. The HFS extent key (record)

The HFS extent key (record) is 8 bytes of size and consists of:

Offset Size Value Description

0

1

7

Key byte size
Signed 8-bit integer

1

1

Fork type
Signed 8-bit integer
See section: HFS fork types

2

4

File identifier
Contains a CNID

6

2

Start block
The first allocation block index described by the corresponding extent record

The first three extents in a fork are held in its catalog file record. Each extent record in the extents (overflow) file holds three more, so the number of extent records for a fork is ( ( number of extents - 3 + 2 ) / 3 ), using integer division.

8.1.2. The HFS+ extent key (record)

The HFS+ extent key (record) is 12 bytes of size and consists of:

Offset Size Value Description

0

2

10

Key byte size
Unsigned 16-bit integer

2

1

Fork type
Signed 8-bit integer
See section: HFS fork types

3

1

0x00

Unknown (Padding)

4

4

File identifier
Contains a CNID

8

4

Start block
The first allocation block index described by the corresponding extent record

The first eight extents in a fork are held in its catalog file record. So the number of extent records for a fork is:

( ( number of extents - 8 + 7 ) / 8 )
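
The calculation is a rounded-up integer division of the extents that do not fit in the catalog record by the number of extents per record (3 for HFS, 8 for HFS+). A small sketch, with a hypothetical function name, that also guards against forks with no overflow extents:

```c
#include <stdint.h>

/* Returns the number of extent records needed in the extents (overflow)
 * file for a fork with number_of_extents extents, where
 * extents_per_record is 3 for HFS and 8 for HFS+. The first
 * extents_per_record extents live in the catalog record. */
static uint32_t overflow_record_count(uint32_t number_of_extents,
                                      uint32_t extents_per_record)
{
    if (extents_per_record == 0 ||
        number_of_extents <= extents_per_record) {
        return 0;
    }
    return (number_of_extents - extents_per_record +
            (extents_per_record - 1)) / extents_per_record;
}
```

For example, an HFS+ fork with 17 extents keeps 8 in the catalog record and needs 2 overflow records for the remaining 9.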

8.1.3. HFS fork types

Value Identifier Description

-1 (0xff)

Resource fork

0 (0x00)

Data fork

8.2. The extent (data) record

An extent is a contiguous range of allocation blocks that have been allocated to some file. An extent is represented by an extent descriptor.

An unused extent descriptor in an extent record would have both the start block and number of blocks set to zero.

8.2.1. The HFS extents record

The HFS extents record (HFSExtentRecord) consists of an array of 3 HFS extent descriptors. The size of the HFS extents record is 3 x 4 = 12 bytes.

An individual HFS extent descriptor (HFSExtentDescriptor) is 4 bytes of size and consists of:

Offset Size Value Description

0

2

The start (allocation) block of the extent

2

2

The number of (allocation) blocks in the extent

The extents in a HFS extents record are relative to the extents start block number defined in the master directory block (MDB).

offset = ( extents start block number + extent block number ) x allocation block size;

8.2.2. The HFS+ extents record

The HFS+ extents record (HFSPlusExtentRecord) consists of an array of 8 HFS+ extent descriptors. The size of the HFS+ extents record is 8 x 8 = 64 bytes.

An individual HFS+ extent descriptor (HFSPlusExtentDescriptor) is 8 bytes of size and consists of:

Offset Size Value Description

0

4

The start (allocation) block of the extent

4

4

The number of (allocation) blocks in the extent
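
To illustrate, the sketch below converts a 64-byte HFS+ extents record into byte ranges on the volume. It assumes big-endian integers, that an HFS+ extent starts at (start block x allocation block size) relative to the start of the volume, and that the first unused descriptor ends the list. These are assumptions of this sketch, and the names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define EXTENTS_PER_RECORD 8

struct extent_range {
    uint64_t offset; /* byte offset relative to the start of the volume */
    uint64_t size;   /* size in bytes */
};

static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Fills ranges with one entry per used extent descriptor and returns the
 * number of entries. An unused descriptor has both values set to zero. */
static size_t extents_record_to_ranges(const uint8_t record[64],
                                       uint32_t block_size,
                                       struct extent_range ranges[8])
{
    size_t count = 0;
    size_t i;

    for (i = 0; i < EXTENTS_PER_RECORD; i++) {
        uint32_t start_block = read_be32(record + (i * 8));
        uint32_t block_count = read_be32(record + (i * 8) + 4);

        if (start_block == 0 && block_count == 0) {
            break;
        }
        ranges[count].offset = (uint64_t)start_block * block_size;
        ranges[count].size = (uint64_t)block_count * block_size;
        count++;
    }
    return count;
}
```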

8.3. Bad Block File

The extents (overflow) file is also used to hold information about bad blocks; this is referred to as the bad block file. The bad block file is used to mark areas on the disk as bad and unable to be used for storing data, typically to map out bad sectors on the storage medium.

Typically, allocation blocks are larger than sectors. If a single sector is found to be bad, the entire allocation block is unusable. The bad block file is sometimes used to mark blocks as unusable when they are not bad, e.g. in the HFS wrapper.

Bad block extent records are always assumed to reference the data fork (fork type of 0).

9. The HFS+ allocation (bitmap) file

HFS+ uses an allocation file to keep track of whether each allocation block in a volume is currently allocated to some file system structure or not. The contents of the allocation file is a bitmap. The bitmap contains one bit for each allocation block in the volume.

  • If a bit is set, the corresponding allocation block is currently in use by some file system structure.

  • If a bit is clear, the corresponding allocation block is not currently in use, and is available for allocation.

The size of the allocation file depends on the number of allocation blocks in the volume, which in turn depends both on the size of the disk and on the size of the volume’s allocation blocks. For example, a volume on a 1 GB disk and having an allocation block size of 4 KB needs an allocation file size of 256 Kbits (32 KB, or 8 allocation blocks). Since the allocation file itself is allocated using allocation blocks, it always occupies an integral number of allocation blocks (its size may be rounded up).
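
The arithmetic in the example above can be written out as follows. This is a sketch with hypothetical function names; it computes the minimum size only, since the allocation file may be larger than strictly required.

```c
#include <stdint.h>

/* Number of bytes needed to hold one bit per allocation block. */
static uint64_t allocation_bitmap_bytes(uint64_t volume_size,
                                        uint32_t block_size)
{
    uint64_t number_of_blocks;

    if (block_size == 0) {
        return 0;
    }
    number_of_blocks = volume_size / block_size;
    return (number_of_blocks + 7) / 8;
}

/* Minimum allocation file size in bytes, rounded up to a whole number
 * of allocation blocks. */
static uint64_t allocation_file_size(uint64_t volume_size,
                                     uint32_t block_size)
{
    uint64_t bitmap_bytes;

    if (block_size == 0) {
        return 0;
    }
    bitmap_bytes = allocation_bitmap_bytes(volume_size, block_size);
    return ((bitmap_bytes + block_size - 1) / block_size) * block_size;
}
```

With a 1 GB volume and 4 KB allocation blocks both functions return 32768 bytes, which matches the 8 allocation blocks given in the text.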

The allocation file may be larger than the minimum number of bits required for the given volume size. Any unused bits in the bitmap must be set to zero.

Each byte in the allocation file holds the state of eight allocation blocks. The byte at offset X into the file contains the allocation state of allocation blocks (X * 8) through (X * 8 + 7). Within each byte, the most significant bit holds information about the allocation block with the lowest number, and the least significant bit holds information about the allocation block with the highest number. Listing 1 shows how you would test whether an allocation block is in use, assuming that you’ve read the entire allocation file into memory.

static Boolean IsAllocationBlockUsed(UInt32 thisAllocationBlock,
                                     UInt8 *allocationFileContents)
{
    UInt8 thisByte;

    thisByte = allocationFileContents[thisAllocationBlock / 8];
    return (thisByte & (1 << (7 - (thisAllocationBlock % 8)))) != 0;
}

Listing 1 Determining whether an allocation block is in use.

10. The HFS+ attributes file

The HFS+ attributes file is a B-tree file. The location of the attributes file can be found in the volume header. The HFS+ attributes file is intended to store extended attributes.

10.1. HFS+ attributes index keys

The HFS+ attributes index key is variable in size and consists of:

Offset Size Value Description

0

2

The key data size
Contains number of bytes

If key data size > 0

2

2

Unknown

4

4

The identifier
Contains a CNID

8

4

Unknown

12

2

Number of characters in the name string

14

…​

Name string
UTF-16 big-endian string without end-of-string character
Contains the name of the extended attribute

Note
The name of an extended attribute appears to be case sensitive even on a case insensitive file system.

10.2. The attributes file data

The attributes file defines two types of attributes:

  1. Fork data attributes, which are used for attributes whose data is large. The attribute’s data is stored in extents on the volume and the attribute merely contains a reference to those extents.

  2. Extension attributes, which are used to augment the fork descriptor structure, allowing a fork to have more than eight extents.

10.2.1. The HFS+ attributes file data record header

Each attributes file record starts with a type value, which describes the type of attribute data record.

The HFS+ attributes file data record header is 4 bytes of size and consists of:

Offset Size Value Description

0

4

The record type
See section: The attributes data record types

The attributes data record types
Value Identifier Description

0x00000010

kHFSPlusAttrInlineData

Attribute record with inline data

0x00000020

kHFSPlusAttrForkData

Attribute record with fork descriptor

0x00000030

kHFSPlusAttrExtents

Attribute record with extents (overflow)

Note
At the moment it is unclear when an attribute record of type kHFSPlusAttrExtents is created and how it should be handled.

10.2.2. The inline data attribute record

The HFS+ attributes file inline data attribute record is variable in size and consists of:

Offset Size Value Description

0

4

0x00000010

The record type

4

4

0

Unknown (Reserved)

8

4

Unknown

12

4

Attribute data size

16

…​

Attribute data
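
A minimal sketch of reading this record, assuming big-endian integers and that the caller has already located the record data in a B-tree leaf node. The function name is hypothetical and the unknown field at offset 8 is skipped.

```c
#include <stddef.h>
#include <stdint.h>

#define ATTRIBUTE_RECORD_TYPE_INLINE_DATA 0x00000010u
#define INLINE_DATA_HEADER_SIZE           16

static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* On success returns 0 and sets *data and *data_size to the attribute
 * data inside record. Returns -1 if the record is too small, is not an
 * inline data record, or claims more data than the record contains. */
static int parse_inline_attribute(const uint8_t *record, size_t record_size,
                                  const uint8_t **data, uint32_t *data_size)
{
    uint32_t size;

    if (record == NULL || data == NULL || data_size == NULL ||
        record_size < INLINE_DATA_HEADER_SIZE) {
        return -1;
    }
    if (read_be32(record) != ATTRIBUTE_RECORD_TYPE_INLINE_DATA) {
        return -1;
    }
    size = read_be32(record + 12);
    if (size > record_size - INLINE_DATA_HEADER_SIZE) {
        return -1;
    }
    *data = record + INLINE_DATA_HEADER_SIZE;
    *data_size = size;
    return 0;
}
```

Checking the attribute data size against the record size before using it avoids reading past the end of the B-tree record on a corrupted volume.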

10.2.3. The fork descriptor attribute record

The HFS+ attributes file fork descriptor attribute record is 88 bytes of size and consists of:

Offset Size Value Description

0

4

0x00000020

The record type

4

4

0

Unknown (Reserved)

8

80

Attribute fork descriptor
See section: HFS+ fork descriptor structure

10.2.4. The extents attribute record

The HFS+ attributes file extents attribute record is 72 bytes of size and consists of:

Offset Size Value Description

0

4

0x00000030

The record type

4

4

0

Unknown (Reserved)

8

( 8 x 8 ) = 64

Attribute extent data
See section: The HFS+ extents record

10.3. Compressed data extended attribute

The compressed extended attribute is named "com.apple.decmpfs" and consists of:

  • compressed data header

  • optional compressed data

10.3.1. Compressed data header

The compressed data header is 16 bytes of size and consists of:

Offset Size Value Description

0

4

"fpmc"

Signature

4

4

Compression method
See section: Compression method

8

8

Uncompressed data size

Note
The signature is likely stored in little-endian and represents "cmpf".
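
The following sketch reads the compressed data header under two assumptions that go beyond the table: all values are stored little-endian (the note above only says this is likely for the signature), and the 8 bytes at offset 8 hold the uncompressed data size. The names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define COMPRESSED_DATA_HEADER_SIZE 16

struct compressed_data_header {
    uint32_t method;            /* offset 4, see the compression methods */
    uint64_t uncompressed_size; /* offset 8, assumed meaning */
};

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint64_t read_le64(const uint8_t *p)
{
    return (uint64_t)read_le32(p) | ((uint64_t)read_le32(p + 4) << 32);
}

/* Returns 0 on success, -1 if the buffer is too small or the signature
 * bytes are not "fpmc". */
static int parse_compressed_data_header(const uint8_t *data, size_t size,
                                        struct compressed_data_header *out)
{
    if (data == NULL || out == NULL || size < COMPRESSED_DATA_HEADER_SIZE) {
        return -1;
    }
    if (data[0] != 'f' || data[1] != 'p' || data[2] != 'm' || data[3] != 'c') {
        return -1;
    }
    out->method = read_le32(data + 4);
    out->uncompressed_size = read_le64(data + 8);
    return 0;
}
```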

10.3.2. Compression method

Value Identifier Description

1

CMP_Type1

Unknown (uncompressed extended attribute data)

3

ZLIB (DEFLATE) compressed extended attribute data
The compressed data is stored in the extended attribute after the compressed data header

4

64k chunked ZLIB (DEFLATE) compressed resource fork
The compressed data is stored in the resource fork

5

Unknown (sparse compressed extended attribute data)
Uncompressed data contains 0-byte values
According to [APPLE04] this method specifies de-dup within the generation store.

6

Unknown (unused)

7

LZVN compressed extended attribute data
The compressed data is stored in the extended attribute after the compressed data header

8

64k chunked LZVN compressed resource fork
The compressed data is stored in the resource fork

9

Unknown (uncompressed extended attribute data, different than CMP_Type1)

10

Unknown (64k chunked uncompressed data resource fork)
The data is stored in the resource fork

11

LZFSE compressed extended attribute data
The compressed data is stored in the extended attribute after the compressed data header

12

64k chunked LZFSE compressed resource fork
The compressed data is stored in the resource fork

0x80000001

Unknown (faulting file)

Note
If the ZLIB (DEFLATE) compressed data starts with 0xff the data is stored uncompressed after the first compressed data byte.
Note
If the LZVN compressed data starts with 0x06 (end of stream opcode) the data is stored uncompressed after the first compressed data byte.
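
The two notes above can be expressed as a check that runs before invoking a decompressor. The sketch below applies it only to methods 3 and 7, where the compressed data follows the compressed data header; whether the same markers apply to the chunked resource fork methods is not stated here. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define COMPRESSION_METHOD_ZLIB_ATTRIBUTE 3
#define COMPRESSION_METHOD_LZVN_ATTRIBUTE 7

/* payload is the data that follows the 16-byte compressed data header.
 * Returns 1 if the payload is stored uncompressed after its first byte,
 * 0 if it must be decompressed or the method is not covered. */
static int payload_is_stored_uncompressed(uint32_t method,
                                          const uint8_t *payload,
                                          size_t payload_size)
{
    if (payload == NULL || payload_size == 0) {
        return 0;
    }
    switch (method) {
    case COMPRESSION_METHOD_ZLIB_ATTRIBUTE:
        return payload[0] == 0xff;
    case COMPRESSION_METHOD_LZVN_ATTRIBUTE:
        return payload[0] == 0x06; /* LZVN end of stream opcode */
    default:
        return 0;
    }
}
```

When the function returns 1 the uncompressed data is payload + 1 with a length of payload_size - 1.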

11. The HFS+ startup file

The startup file is a special file intended to hold information needed when booting a system that does not have built-in (ROM) support for HFS+. A boot loader can find the startup file without full knowledge of the HFS+ volume format using the first eight extents of the startup file located in the volume header.

Format-wise it is valid for the startup file to contain more than eight extents, but in doing so the purpose of the startup file is defeated.

12. The HFS+ Hot file

12.1. Notes

Hot Files

Most files on a disk are rarely, if ever, accessed. Most frequently accessed
(hot) files are small. To improve performance of these small, frequently accessed
files, they are moved near the volume's metadata, into the metadata zone. This
reduces seek times for most accesses. As files are moved into the metadata
zone, they are also defragmented (allocated in a single extent), which further
improves performance. This process is known as adaptive hot file clustering.

The relative importance of a frequently used (hot) file is called its
temperature. Files with the hottest (largest) temperatures are the ones
actually moved into the metadata zone. In Mac OS X version 10.3, a file's
temperature is computed as the number of bytes read from the file during the
recording period divided by the file's size in bytes. This is a measure of how
often the file is read.

This section describes the on-disk structures used for tracking hot files. The
algorithms used at run-time are subject to change, and are not documented here.

Migration of files into or out of the hot file area of the metadata zone is a
gradual process, based upon the user's actual file access patterns. The
migration happens in several phases:

Recording
    Watch file accesses to determine which files are used most
Evaluation
    Merge recently used hot files with previously found hot files
Eviction
    Move older and less frequently used hot files out of metadata zone to make room for newer, hotter files
Adoption
    Move newer and hotter files into the metadata zone

Hot File B-tree

A B-tree is used to keep track of the files that currently occupy the hot file
area of the metadata zone. The hot file B-tree is an ordinary file on the
volume (that is, it has records in the catalog). It is a file named
".hotfiles.btree" in the root directory. To avoid accidental manipulation of
this file, the kIsInvisible and kNameLocked bits in the finderFlags field of
the Finder info should be set.

The node size of the hot file B-tree is at least 512 bytes, and is typically
the same as the volume's allocation block size. Like other B-trees on an
HFS+ volume, the key length field is 16 bits, and kBTBigKeysMask is set in the
B-tree header's attributes. The btreeType in the header record must be set to
kUserBTreeType.

The B-tree's user data record contains information about hot file recording.
The format of the user data is described by the HotFilesInfo structure:

#define HFC_MAGIC   0xFF28FF26
#define HFC_VERSION 1
#define HFC_DEFAULT_DURATION     (3600 * 60)
#define HFC_MINIMUM_TEMPERATURE  16
#define HFC_MAXIMUM_FILESIZE     (10 * 1024 * 1024)
char hfc_tag[] = "CLUSTERED HOT FILES B-TREE     ";

struct HotFilesInfo {
    UInt32  magic;
    UInt32  version;
    UInt32  duration;    /* duration of sample period */
    UInt32  timebase;    /* recording period start time */
    UInt32  timeleft;    /* recording period stop time */
    UInt32  threshold;
    UInt32  maxfileblks;
    UInt32  maxfilecnt;
    UInt8   tag[32];
};
typedef struct HotFilesInfo HotFilesInfo;

The fields have the following meaning:

magic
    Must contain the value HFC_MAGIC (0xFF28FF26).
version
    Contains the version of the HotFilesInfo structure. Version 1 of the structure is described here. If your implementation encounters any other version number, it should not read or modify the hot file B-tree.
duration
    Contains the duration of the current recording phase, in seconds. In Mac OS X 10.3, this value is typically HFC_DEFAULT_DURATION (60 hours).
timebase
    Contains the time that the current recording phase began, in seconds since Jan 1, 1970 GMT.
timeleft
    Contains the time remaining in the current recording phase, in seconds.
threshold
    Contains the minimum temperature for a file to be eligible to be moved into the hot file area. Files whose temperature is less than this value will be moved out of the hot file area.
maxfileblks
    Contains the maximum file size, in allocation blocks, for a file to be eligible to be moved into the hot file area. Files larger than this size will not be moved into the hot file area. In Mac OS X 10.3, this value is typically HFC_MAXIMUM_FILESIZE divided by the volume's allocation block size.
maxfilecnt
    Contains the maximum number of files to place into the hot file area. Note that the hot file area may actually contain more than this number of files, especially if they previously existed in the hot file area before the beginning of the recording phase. This number represents the number of files that the hot file recording code intends to track and eventually place into the hot file area.
tag
    Contains the null-terminated (C-style) string containing the ASCII text "CLUSTERED HOT FILES B-TREE     " (not including the quotes). Note that the last six bytes are five spaces and the null (zero) byte. This field exists to make it easier to recognize the hot file B-tree when debugging or using a disk editor. An implementation should not attempt to verify or change this field.
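
A sketch of how an implementation might read and validate the user data record, assuming the values are stored big-endian like other HFS+ structures. The parsed structure and function names are hypothetical. The tag is deliberately not read, since the text says it should not be verified.

```c
#include <stddef.h>
#include <stdint.h>

#define HFC_MAGIC           0xFF28FF26u
#define HFC_VERSION         1u
#define HOT_FILES_INFO_SIZE 64 /* 8 x 4 bytes plus the 32-byte tag */

struct hot_files_info {
    uint32_t duration;
    uint32_t timebase;
    uint32_t timeleft;
    uint32_t threshold;
    uint32_t maxfileblks;
    uint32_t maxfilecnt;
};

static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the data is too small or the magic value
 * does not match, and -2 if the version is not 1, in which case the hot
 * file B-tree should not be read or modified. */
static int parse_hot_files_info(const uint8_t *data, size_t size,
                                struct hot_files_info *out)
{
    if (data == NULL || out == NULL || size < HOT_FILES_INFO_SIZE) {
        return -1;
    }
    if (read_be32(data) != HFC_MAGIC) {
        return -1;
    }
    if (read_be32(data + 4) != HFC_VERSION) {
        return -2;
    }
    out->duration = read_be32(data + 8);
    out->timebase = read_be32(data + 12);
    out->timeleft = read_be32(data + 16);
    out->threshold = read_be32(data + 20);
    out->maxfileblks = read_be32(data + 24);
    out->maxfilecnt = read_be32(data + 28);
    return 0;
}
```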

Hot File Record Key

A key in the hot file B-tree is of type HotFileKey.

struct HotFileKey {
    UInt16   keyLength;
    UInt8    forkType;
    UInt8    pad;
    UInt32   temperature;
    UInt32   fileID;
};
typedef struct HotFileKey HotFileKey;

#define HFC_LOOKUPTAG   0xFFFFFFFF
#define HFC_KEYLENGTH   (sizeof(HotFileKey) - sizeof(UInt32))

The fields have the following meaning:

keyLength
    The length of a hot file key, not including the keyLength field itself. Hot file keys are of fixed size. This field must contain the value 10.
forkType
    Indicates whether the fork being tracked is a data fork (value 0x00) or a resource fork (value 0xFF). In Mac OS X version 10.3, only data forks are eligible for placement into the hot file area.
pad
    An implementation must treat this as a pad field.
temperature
    The fork's temperature. For hot file thread records, this field contains the value HFC_LOOKUPTAG (0xFFFFFFFF).
fileID
    The catalog node ID of the file being tracked.

Hot file keys are compared first by temperature, then fileID, and lastly by forkType. All of these comparisons are unsigned.

Hot File Records

Much like the catalog file, the hot file B-tree stores two kinds of records:
hot file records and thread records. Every fork in the hot file area has both a
hot file record and a thread record in the hot file B-tree. Hot file records
are used to find hot files based on their temperature. Thread records are used
to find hot files based on their catalog node ID and fork type.

Thread records in the hot file B-tree use a special value (HFC_LOOKUPTAG) in
the temperature field of the key. The data for a thread record is the
temperature of that fork, stored as a UInt32. So, given a catalog node ID and
fork type, it is possible to construct a key for the fork's thread record. If a
thread record exists, you can get the temperature from the thread's data to
construct the key for the hot file record. If a thread record does not exist,
then the fork is not being tracked as a hot file.
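
The key ordering and the construction of a thread record key can be sketched as follows. The structure mirrors HotFileKey with fixed-width types; the function names are hypothetical, and the key length of 10 is the value required by the field description.

```c
#include <stdint.h>

#define HFC_LOOKUPTAG       0xFFFFFFFFu
#define HOT_FILE_KEY_LENGTH 10

struct hot_file_key {
    uint16_t key_length;
    uint8_t  fork_type;
    uint8_t  pad;
    uint32_t temperature;
    uint32_t file_id;
};

/* Three-way unsigned comparison: -1, 0 or 1. */
static int compare_u32(uint32_t a, uint32_t b)
{
    return (a > b) - (a < b);
}

/* Compares by temperature, then file identifier, then fork type. */
static int compare_hot_file_keys(const struct hot_file_key *a,
                                 const struct hot_file_key *b)
{
    int result = compare_u32(a->temperature, b->temperature);

    if (result == 0) {
        result = compare_u32(a->file_id, b->file_id);
    }
    if (result == 0) {
        result = compare_u32(a->fork_type, b->fork_type);
    }
    return result;
}

/* Builds the key of the thread record for a fork. */
static struct hot_file_key make_thread_key(uint32_t file_id,
                                           uint8_t fork_type)
{
    struct hot_file_key key;

    key.key_length = HOT_FILE_KEY_LENGTH;
    key.fork_type = fork_type;
    key.pad = 0;
    key.temperature = HFC_LOOKUPTAG;
    key.file_id = file_id;
    return key;
}
```

Because HFC_LOOKUPTAG is the largest unsigned 32-bit value, thread records sort at or after every hot file record under this ordering.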

Hot file records use all of the key fields as described above. The data for a
hot file record is 4 bytes. The data in a hot file record is not meaningful. To
aid in debugging, Mac OS X version 10.3 typically stores the first four bytes
of the file name (encoded in UTF-8), or the ASCII text "????".

When an implementation changes a hot file's temperature, the old hot file
record must be removed, a new hot file record with the new temperature must be
inserted, and the thread record's data must be changed to contain the new
temperature.

Recording Hot File Temperatures

The recording phase gathers information about file usage over time. In order to
gather useful statistics, the recording phase may last longer than the duration
of a single mount. Therefore, information about file usage is stored on disk so
that it can accumulate over time.

The clumpSize field of the fork descriptor structure is used to record the
amount of data actually read from a fork. Since the field is only 32 bits long,
it stores the number of allocation blocks read from the file. The fork's
temperature can be computed by dividing its clumpSize by its totalBlocks.
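
As a worked example of the division described above, combined with the threshold and maxfileblks fields of HotFilesInfo. The function names are hypothetical, and the use of truncating integer division is an assumption of this sketch.

```c
#include <stdint.h>

/* Temperature of a fork: allocation blocks read (kept in the clump size
 * field) divided by the total number of allocation blocks of the fork. */
static uint32_t hot_file_temperature(uint32_t clump_size,
                                     uint32_t total_blocks)
{
    if (total_blocks == 0) {
        return 0;
    }
    return clump_size / total_blocks;
}

/* A fork is eligible for the hot file area if its temperature is not
 * below the threshold and it is not larger than maxfileblks blocks. */
static int hot_file_is_eligible(uint32_t temperature, uint32_t total_blocks,
                                uint32_t threshold, uint32_t maxfileblks)
{
    return temperature >= threshold && total_blocks <= maxfileblks;
}
```

A 10-block fork from which 640 blocks were read during the recording period has a temperature of 64, which is above HFC_MINIMUM_TEMPERATURE (16).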

13. The HFS+ journal

An HFS+ volume may have an optional journal to speed recovery when mounting a volume that was not unmounted safely. The purpose of the journal is to ensure that when a group of related changes is being made, either all of those changes are actually made or none of them are. The journal makes it quick and easy to restore the volume structures to a consistent state, without having to scan all of the structures. The journal is used only for the volume structures and metadata; it does not protect the contents of a fork.

The volume header specifies if journalling is activated.

The journal data structures consist of:

  • a journal information block, which contains the location and size of the journal header and journal buffer;

  • a journal header, which describes which part of the journal buffer is active and contains transactions waiting to be committed;

  • a journal buffer, a cyclic buffer that holds the file system metadata transactions.

On HFS+ volumes, the journal information block is stored as a file. The name of that file is ".journal_info_block" and it is stored in the volume’s root directory.

The journal header and journal buffer are stored together in a different file named ".journal", also in the volume’s root directory. Each of these files is contiguous on disk: it occupies exactly one extent.

The volume header contains the extent of the journal information block file. The journal information block contains the location of the journal file.

13.1. The journal information block

The journal information block describes where the journal header and journal buffer are stored. The journal information block is stored at the start of the allocation block referred to by the volume header.

The HFS+ journal information block is 180 bytes of size and consists of:

Offset Size Value Description

0

4

Journal flags

4

( 8 x 4 ) = 32

Unknown (Reserved)
Device signature

36

8

Journal header offset
The offset in bytes to the start of the journal header.

44

8

The journal size
This includes the journal header and the journal buffer and not the journal information block.

52

( 32 x 4 ) = 128

0x00

Unknown (Reserved)
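
A sketch of reading the fields needed to locate the journal. It assumes the journal information block is stored big-endian like the other HFS+ volume structures, and it only requires the first 52 bytes (up to and including the journal size). The names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define JOURNAL_FLAG_IN_FILE_SYSTEM      0x00000001u
#define JOURNAL_FLAG_ON_OTHER_DEVICE     0x00000002u
#define JOURNAL_FLAG_NEED_INITIALIZATION 0x00000004u

struct journal_info {
    uint32_t flags;         /* offset 0 */
    uint64_t header_offset; /* offset 36, in bytes */
    uint64_t journal_size;  /* offset 44, in bytes */
};

static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint64_t read_be64(const uint8_t *p)
{
    return ((uint64_t)read_be32(p) << 32) | read_be32(p + 4);
}

/* Returns 0 on success, -1 if the buffer does not cover the fields read. */
static int parse_journal_info_block(const uint8_t *data, size_t size,
                                    struct journal_info *out)
{
    if (data == NULL || out == NULL || size < 52) {
        return -1;
    }
    out->flags = read_be32(data);
    out->header_offset = read_be64(data + 36);
    out->journal_size = read_be64(data + 44);
    return 0;
}

/* Non-zero if the journal header offset is relative to the volume start. */
static int journal_is_in_file_system(const struct journal_info *info)
{
    return (info->flags & JOURNAL_FLAG_IN_FILE_SYSTEM) != 0;
}
```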

13.1.1. Journal flags

The journal flags consist of the following values:

Value(s) Description

0x00000001

In file system
The journal resides on the volume
The journal header offset is relative to the start of the volume.

0x00000002

On other device
The journal resides on another device.
The device signature value describes the device containing the journal.
The journal header offset is relative to the start of the device.
Journals stored on a separate device are not currently supported. The format of the device signature value is not yet defined.

0x00000004

Need initialization
The journal header is invalid (there are no valid transactions in the journal) and needs to be initialized.

13.2. The journal header

The journal begins with a journal header, whose main purpose is to describe the location of transactions in the journal buffer. The journal header is stored using the journal_header data type.

The HFS+ journal header is 44 bytes of size and consists of:

Offset Size Value Description

0

4

"\x4a\x4e\x4c\x78"

Signature
Used to verify the integrity of the journal header.

4

4

"\x12\x34\x56\x78"

Endian signature
Used to verify the integrity of the journal header.

8

8

First transaction start offset

16

8

Next transaction start offset

24

8

Journal (byte) size
The size includes the journal header and the journal buffer.
This value must be equal to the size in the journal information block.

32

4

Journal block header (byte) size
Typically ranges from 4096 to 16384

36

4

Journal checksum
See section: Journal checksums

40

4

Journal header (byte) size
Typically the size of one sector

13.2.1. First and next transaction offset

The first transaction offset contains the offset in bytes from the start of the journal header to the start of the first (oldest) transaction.

The next transaction offset contains the offset in bytes from the start of the journal header to the end of the last (newest) transaction. Note that this field may be less than the start field, indicating that the transactions wrap around the end of the journal’s circular buffer. If end equals start, then the journal is empty, and there are no transactions that need to be replayed.
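
The sketch below reads the journal header and tests whether the journal is empty. It assumes that the endian signature is the integer 0x12345678 written in the byte order of the system that created the journal, and that the signature and offsets use that same byte order; the names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define JOURNAL_HEADER_SIGNATURE 0x4a4e4c78u /* "JNLx" */
#define JOURNAL_ENDIAN_SIGNATURE 0x12345678u
#define JOURNAL_HEADER_MIN_SIZE  44

struct journal_header_info {
    int      little_endian; /* 1 if the journal is stored little-endian */
    uint64_t start;         /* first transaction start offset */
    uint64_t end;           /* next transaction start offset */
    uint64_t size;          /* journal size in bytes */
};

static uint32_t read_u32(const uint8_t *p, int little_endian)
{
    if (little_endian) {
        return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
               ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    }
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint64_t read_u64(const uint8_t *p, int little_endian)
{
    uint64_t first = read_u32(p, little_endian);
    uint64_t second = read_u32(p + 4, little_endian);

    return little_endian ? (second << 32) | first : (first << 32) | second;
}

/* Returns 0 on success, -1 if either signature does not match. */
static int parse_journal_header(const uint8_t *data, size_t size,
                                struct journal_header_info *out)
{
    if (data == NULL || out == NULL || size < JOURNAL_HEADER_MIN_SIZE) {
        return -1;
    }
    if (read_u32(data + 4, 0) == JOURNAL_ENDIAN_SIGNATURE) {
        out->little_endian = 0;
    } else if (read_u32(data + 4, 1) == JOURNAL_ENDIAN_SIGNATURE) {
        out->little_endian = 1;
    } else {
        return -1;
    }
    if (read_u32(data, out->little_endian) != JOURNAL_HEADER_SIGNATURE) {
        return -1;
    }
    out->start = read_u64(data + 8, out->little_endian);
    out->end = read_u64(data + 16, out->little_endian);
    out->size = read_u64(data + 24, out->little_endian);
    return 0;
}

/* The journal is empty when the start and end offsets are equal. */
static int journal_is_empty(const struct journal_header_info *header)
{
    return header->start == header->end;
}
```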

13.3. Journal transactions

A single transaction is stored in the journal as several blocks. These blocks include both the data to be written and the location where that data is to be written. This is represented on storage medium by a block list header, which describes the number and sizes of the blocks, immediately followed by the contents of those blocks.

Since block list headers are of limited size, a single transaction may consist of several block list headers and their associated block contents. If the next value in the first block information structure is non-zero, then the next block list header is a continuation of the same transaction.

The journal buffer is treated as a circular buffer. When reading or writing the journal buffer, the I/O operation must stop at the end of the journal buffer and resume (wrap around) immediately following the journal header. Block list headers or the contents of blocks may wrap around in this way. Only a portion of the journal buffer is active at any given time; this portion is indicated by the start and end fields of the journal header. The part of the journal buffer that is not active contains no meaningful data, and must be ignored.
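
The wrap-around rule can be sketched as a read helper that operates on a journal held in memory, where offset 0 is the start of the journal header. The function name and the in-memory representation are assumptions of this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies length bytes starting at *offset into dest. When the end of the
 * journal is reached the read continues directly after the journal
 * header. On success *offset is advanced (and wrapped) and 0 is
 * returned; -1 is returned for invalid arguments. */
static int journal_read(const uint8_t *journal, uint64_t journal_size,
                        uint32_t journal_header_size, uint64_t *offset,
                        uint8_t *dest, size_t length)
{
    uint64_t position;
    size_t copied = 0;

    if (journal == NULL || offset == NULL || dest == NULL) {
        return -1;
    }
    if (journal_header_size >= journal_size ||
        *offset < journal_header_size || *offset >= journal_size ||
        length > journal_size - journal_header_size) {
        return -1;
    }
    position = *offset;

    while (copied < length) {
        uint64_t until_end = journal_size - position;
        size_t chunk = length - copied;

        if (chunk > until_end) {
            chunk = (size_t)until_end;
        }
        memcpy(dest + copied, journal + position, chunk);
        copied += chunk;
        position += chunk;

        if (position == journal_size) {
            position = journal_header_size; /* wrap around */
        }
    }
    *offset = position;
    return 0;
}
```

A caller would use the same helper for block list headers and for block contents, since both may wrap.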

To prevent ambiguity when start equals end, the journal is never allowed to be perfectly full (all of the journal buffer used by block lists and blocks). If the journal was perfectly full, and start was not equal to jhdr_size, then end would be equal to start. You would then be unable to differentiate between an empty and full journal.

When the journal is not empty (contains transactions), it must be replayed to be sure the volume is consistent. That is, the data from each of the transactions must be written to the correct blocks on disk.

13.4. The journal block list header

The block list header describes a list of blocks included in a transaction. A transaction may include several block lists if it modifies more blocks than can be represented in a single block list. The block list header is stored in a structure of type block_list_header.

The HFS+ journal block list header is 16 bytes of size and consists of:

Offset Size Value Description

0

2

Unknown (Reserved)
Is used in memory for the maximum number of journal blocks

2

2

The number of journal blocks following the journal block header
Typically 1

4

4

The block list (byte) size
The block list size contains the number of bytes used for the block list, including the header and the data in each block.

8

4

Checksum
See section: Journal checksums

12

4

0x00

Unknown (Padding)
Used for alignment

16

…​

Journal block information array

Note
The number of journal blocks includes the first journal block. The first journal block is reserved for chaining multiple block lists, so the number of journal blocks that actually contain data is the number of journal blocks minus one (- 1).

13.5. Journal block information

The HFS+ journal block information is 16 bytes of size and consists of:

Offset Size Value Description

0

8

Unknown (Reserved)
Is used in memory for the sector number where the block should be written
Only used in the first journal block information

8

4

Size
The number of bytes to be copied from the journal buffer to the sector number.
Only used in the first journal block information

12

4

Next journal block
Is used in memory to refer to the next journal block information
When stored a value of 0 indicates the end of the journal block list.
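
A minimal sketch of reading the block list header together with the first journal block information follows. The byte order is passed in by the caller, on the assumption that it has been determined from the journal header; the type and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t number_of_blocks;  /* includes the reserved first entry */
    uint32_t block_list_size;   /* header plus the data of each block */
    uint32_t checksum;
    uint32_t first_next;        /* next value of the first block information */
} block_list_header_t;

static uint16_t read_u16( const uint8_t *d, int big_endian )
{
    return( big_endian ? (uint16_t) ( ( d[ 0 ] << 8 ) | d[ 1 ] )
                       : (uint16_t) ( ( d[ 1 ] << 8 ) | d[ 0 ] ) );
}

static uint32_t read_u32( const uint8_t *d, int big_endian )
{
    if( big_endian )
        return( ( (uint32_t) d[ 0 ] << 24 ) | ( (uint32_t) d[ 1 ] << 16 )
              | ( (uint32_t) d[ 2 ] << 8 ) | (uint32_t) d[ 3 ] );

    return( ( (uint32_t) d[ 3 ] << 24 ) | ( (uint32_t) d[ 2 ] << 16 )
          | ( (uint32_t) d[ 1 ] << 8 ) | (uint32_t) d[ 0 ] );
}

/* Parses the 16-byte header and the first 16-byte block information.
 * Returns 0 on success or -1 on invalid input. */
int block_list_header_parse(
     const uint8_t *data,
     size_t data_size,
     int big_endian,
     block_list_header_t *header )
{
    if( ( data == NULL ) || ( header == NULL ) || ( data_size < 32 ) )
        return( -1 );

    header->number_of_blocks = read_u16( &( data[ 2 ] ), big_endian );
    header->block_list_size  = read_u32( &( data[ 4 ] ), big_endian );
    header->checksum         = read_u32( &( data[ 8 ] ), big_endian );
    header->first_next       = read_u32( &( data[ 16 + 12 ] ), big_endian );

    /* The reserved first entry is always counted */
    if( header->number_of_blocks == 0 )
        return( -1 );

    return( 0 );
}

/* Number of journal blocks that actually contain data */
uint16_t block_list_data_block_count( const block_list_header_t *header )
{
    return( (uint16_t) ( header->number_of_blocks - 1 ) );
}

/* Non-zero if the next block list header continues the same transaction */
int block_list_is_continued( const block_list_header_t *header )
{
    return( header->first_next != 0 );
}
```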

13.6. Journal checksums

The journal header and block list header both contain checksum values. The checksums are verified as part of a basic consistency check of these journal data structures. To verify the checksum, temporarily set the checksum field to zero and then call the hfs_plus_calculate_checksum routine as specified below.

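#include <stddef.h>
#include <stdint.h>

uint32_t hfs_plus_calculate_checksum(
          const uint8_t *buffer,
          size_t buffer_size )
{
    size_t buffer_offset = 0;
    uint32_t checksum    = 0;

    /* Fold every byte of the buffer into the running checksum */
    for( buffer_offset = 0;
         buffer_offset < buffer_size;
         buffer_offset++ )
    {
        checksum = ( checksum << 8 ) ^ ( checksum + buffer[ buffer_offset ] );
    }
    /* The result is the one's complement of the running checksum */
    return( ~checksum );
}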
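
The verification step can be sketched without modifying the data that was read: the bytes of the checksum field are treated as zero while the checksum is calculated. This is an illustrative variant of the routine above, not code from an Apple source; the function names and the checksum_offset parameter are assumptions, and the stored checksum is expected to have been converted to host byte order by the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Calculates the journal checksum of buffer while treating the 4 bytes
 * at checksum_offset as zero. Names are illustrative. */
uint32_t journal_checksum_with_zeroed_field(
          const uint8_t *buffer,
          size_t buffer_size,
          size_t checksum_offset )
{
    size_t buffer_offset = 0;
    uint32_t checksum    = 0;
    uint8_t byte_value   = 0;

    for( buffer_offset = 0;
         buffer_offset < buffer_size;
         buffer_offset++ )
    {
        byte_value = buffer[ buffer_offset ];

        /* Bytes inside the checksum field count as zero */
        if( ( buffer_offset >= checksum_offset )
         && ( ( buffer_offset - checksum_offset ) < 4 ) )
        {
            byte_value = 0;
        }
        checksum = ( checksum << 8 ) ^ ( checksum + byte_value );
    }
    return( ~checksum );
}

/* Returns 1 if the stored checksum matches the calculated one, 0 otherwise */
int journal_checksum_matches(
     const uint8_t *buffer,
     size_t buffer_size,
     size_t checksum_offset,
     uint32_t stored_checksum )
{
    return( journal_checksum_with_zeroed_field(
             buffer, buffer_size, checksum_offset ) == stored_checksum );
}
```

For a block list header the checksum field is at offset 8, as listed in the table in section 13.4.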

14. Application specific data structures

Both HFS and HFS+ contain application specific data structures. These structures are defined in this chapter.

14.1. Finder information

The finder information in the master directory block (MDB) and volume header consists of an array of 32-bit values. This array contains information used by the Mac OS Finder and the system software boot process.

Array entry Description

0

Contains the directory identifier of the directory containing the bootable system. I.e. "System Folder" in Mac OS 8 or 9, or "/System/Library/CoreServices" in Mac OS X.
It is zero if there is no bootable system on the volume.
Typically this value equals the value in entry 3 or 5.

1

Contains the parent identifier of the startup application, i.e. "Finder". The value is zero if the volume is not bootable.

2

Contains the directory identifier of a directory whose window should be displayed in the Finder when the volume is mounted, or zero if no directory window should be opened.
In classic Mac OS, this is the first in a linked list of windows to open; the frOpenChain field of the directory’s Finder Info contains the next directory ID in the list. The open window list is deprecated. The Mac OS X Finder will open this directory’s window, but ignores the rest of the open window list. The Mac OS X Finder does not modify this field.

3

Contains the directory identifier of a bootable Mac OS 8 or 9 System Folder, or zero if not available.

4

Unknown (Reserved)

5

Contains the directory identifier of a bootable Mac OS X system, the "/System/Library/CoreServices" directory, or zero if not available.

6 and 7

Used by Mac OS X to store a unique 64-bit volume identifier.
This identifier is used for tracking whether a given volume’s ownership (user identifier) information should be honored.
These elements may be zero if no such identifier has been created for the volume.
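
A minimal sketch of reading the array follows. It assumes the eight entries are stored as big-endian 32-bit values, like the other HFS structures, and that entry 6 holds the upper half of the volume identifier; the latter is an assumption made for illustration. The type and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t boot_directory_id;           /* entry 0 */
    uint32_t startup_parent_id;           /* entry 1 */
    uint32_t open_directory_id;           /* entry 2 */
    uint32_t classic_system_directory_id; /* entry 3 */
    uint32_t macosx_system_directory_id;  /* entry 5 */
    uint64_t volume_identifier;           /* entries 6 and 7 */
} finder_information_t;

static uint32_t read_u32_be( const uint8_t *d )
{
    return( ( (uint32_t) d[ 0 ] << 24 ) | ( (uint32_t) d[ 1 ] << 16 )
          | ( (uint32_t) d[ 2 ] << 8 ) | (uint32_t) d[ 3 ] );
}

/* Parses the 32 bytes of finder information.
 * Returns 0 on success or -1 on invalid input. */
int finder_information_parse(
     const uint8_t *data,
     size_t data_size,
     finder_information_t *info )
{
    uint32_t entries[ 8 ];
    int entry_index = 0;

    if( ( data == NULL ) || ( info == NULL ) || ( data_size < 32 ) )
        return( -1 );

    for( entry_index = 0; entry_index < 8; entry_index++ )
        entries[ entry_index ] = read_u32_be( &( data[ entry_index * 4 ] ) );

    info->boot_directory_id           = entries[ 0 ];
    info->startup_parent_id           = entries[ 1 ];
    info->open_directory_id           = entries[ 2 ];
    info->classic_system_directory_id = entries[ 3 ];
    info->macosx_system_directory_id  = entries[ 5 ];

    /* Assumption: entry 6 is the upper and entry 7 the lower 32 bits */
    info->volume_identifier = ( (uint64_t) entries[ 6 ] << 32 ) | entries[ 7 ];

    return( 0 );
}
```

A volume without a bootable system is then recognized by boot_directory_id being zero.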

14.2. File information

14.2.1. HFS file information

The HFS file information is 16 bytes of size and consists of:

Offset Size Value Description

0

( 4 x 1 ) = 4

File type
Array of unsigned 8-bit integers

4

( 4 x 1 ) = 4

File creator
Array of unsigned 8-bit integers

8

2

Finder flags
See section: Finder flags

10

4

Location within the parent
Contains x and y-coordinate values
If set to {0, 0}, the Finder will place the item automatically

14

2

File icon window
The window in which the file’s icon appears.
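
The layout above can be read with a short sketch. It assumes big-endian storage and that the location follows the Point structure shown in section 14.5.1, with the vertical coordinate first; the type and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    char file_type[ 5 ];     /* four characters plus a terminating NUL */
    char file_creator[ 5 ];  /* four characters plus a terminating NUL */
    uint16_t finder_flags;
    int16_t location_v;      /* vertical coordinate within the parent */
    int16_t location_h;      /* horizontal coordinate within the parent */
    uint16_t window;         /* raw value of the last 2 bytes */
} hfs_file_information_t;

static uint16_t read_u16_be( const uint8_t *d )
{
    return( (uint16_t) ( ( d[ 0 ] << 8 ) | d[ 1 ] ) );
}

/* Parses the 16 bytes of HFS file information.
 * Returns 0 on success or -1 on invalid input. */
int hfs_file_information_parse(
     const uint8_t *data,
     size_t data_size,
     hfs_file_information_t *info )
{
    if( ( data == NULL ) || ( info == NULL ) || ( data_size < 16 ) )
        return( -1 );

    memcpy( info->file_type, &( data[ 0 ] ), 4 );
    info->file_type[ 4 ] = 0;

    memcpy( info->file_creator, &( data[ 4 ] ), 4 );
    info->file_creator[ 4 ] = 0;

    info->finder_flags = read_u16_be( &( data[ 8 ] ) );
    info->location_v   = (int16_t) read_u16_be( &( data[ 10 ] ) );
    info->location_h   = (int16_t) read_u16_be( &( data[ 12 ] ) );
    info->window       = read_u16_be( &( data[ 14 ] ) );

    return( 0 );
}
```

The same offsets apply to the HFS+ file information, where the last 2 bytes are reserved.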

14.2.2. HFS extended file information

The HFS extended file information is 16 bytes of size and consists of:

Offset Size Value Description

0

2

Icon identifier
An identifier, assigned by the Finder, of the file’s icon.

2

( 3 x 2 ) = 6

Unknown (Reserved)
Array of signed 16-bit integers

8

1

Extended finder script code flags
These flags are used if the script code flag is set.

9

1

Extended finder flags
See section: Extended finder flags

10

2

Comment
Signed 16-bit integer
If the high-bit is clear, an identifier, assigned by the Finder, for the comment that is displayed in the information window when the user selects a file and chooses the Get Info command from the File menu.

12

4

Put away folder identifier
Contains a CNID

14.2.3. HFS+ file information

The HFS+ file information (FileInfo) is 16 bytes of size and consists of:

Offset Size Value Description

0

( 4 x 1 ) = 4

File type
Array of unsigned 8-bit integers

4

( 4 x 1 ) = 4

File creator
Array of unsigned 8-bit integers

8

2

Finder flags
See section: Finder flags

10

4

Location within the parent
Contains x and y-coordinate values
If set to {0, 0}, the Finder will place the item automatically

14

2

Unknown (Reserved)

14.2.4. HFS+ extended file information

The HFS+ extended file information (ExtendedFileInfo) is 16 bytes of size and consists of:

Offset Size Value Description

0

4

Unknown (Reserved)

If kHFSHasDateAddedMask is not set

4

4

Unknown (Reserved)

If kHFSHasDateAddedMask is set

4

4

Added date and time
Contains a POSIX timestamp in UTC

Common

8

2

Extended Finder flags
See section: Extended finder flags

10

2

Unknown (Reserved)
Signed 16-bit integer

12

4

Put away folder identifier
Contains a CNID
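
The conditional field at offset 4 can be handled as in the following sketch. Whether kHFSHasDateAddedMask is set is passed in as a boolean, since the flag is stored outside this structure; big-endian storage is assumed and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Retrieves the added date and time from 16 bytes of HFS+ extended file
 * information. has_date_added tells whether kHFSHasDateAddedMask is set.
 * Returns 1 if a timestamp was read, 0 if the field is reserved and
 * -1 on invalid input. */
int hfs_plus_extended_information_get_added_time(
     const uint8_t *data,
     size_t data_size,
     int has_date_added,
     uint32_t *posix_time )
{
    if( ( data == NULL ) || ( posix_time == NULL ) || ( data_size < 16 ) )
        return( -1 );

    /* Without the flag the 4 bytes at offset 4 are reserved */
    if( has_date_added == 0 )
        return( 0 );

    *posix_time = ( (uint32_t) data[ 4 ] << 24 )
                | ( (uint32_t) data[ 5 ] << 16 )
                | ( (uint32_t) data[ 6 ] << 8 )
                | (uint32_t) data[ 7 ];

    return( 1 );
}
```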

14.3. Folder information

14.3.1. HFS folder information

The HFS folder information is 16 bytes of size and consists of:

Offset Size Value Description

0

8

Window boundaries
The position and dimension of the folder’s window
Contains top, left, bottom, right-coordinate values

8

2

Finder flags
See section: Finder flags

10

4

Location within the parent
Contains x and y-coordinate values
If set to {0, 0}, the Finder will place the item automatically

14

2

Folder view
The manner in which folders are displayed.

14.3.2. HFS extended folder information

The HFS extended folder information is 16 bytes of size and consists of:

Offset Size Value Description

0

4

Scroll position
The scroll position for icon views
Contains x and y-coordinate values

If kHFSHasDateAddedMask is not set

4

4

Open directory identifier chain
Signed 32-bit integer
Chain of directory identifiers for open folders.

If kHFSHasDateAddedMask is set

4

4

Added date and time
Contains a POSIX timestamp in UTC

Common

8

1

Extended finder script code flags
These flags are used if the script code flag is set.

9

1

Extended Finder flags
See section: Extended finder flags

10

2

Comment
Signed 16-bit integer
If the high-bit is clear, an identifier, assigned by the Finder, for the comment that is displayed in the information window when the user selects a folder and chooses the Get Info command from the File menu.

12

4

Put away folder identifier
Contains a CNID

14.3.3. HFS+ folder information

The HFS+ folder information is 16 bytes of size and consists of:

Offset Size Value Description

0

8

Window boundaries
The position and dimension of the folder’s window
Contains top, left, bottom, right-coordinate values

8

2

Finder flags
See section: Finder flags

10

4

Location within the parent
Contains x and y-coordinate values
If set to {0, 0}, the Finder will place the item automatically

14

2

Unknown (Reserved)

14.3.4. HFS+ extended folder information

The HFS+ extended folder information is 16 bytes of size and consists of:

Offset Size Value Description

0

4

Scroll position
The scroll position for icon views
Contains x and y-coordinate values

4

4

Unknown (Reserved)
Signed 32-bit integer

8

2

Extended Finder flags
See section: Extended finder flags

10

2

Unknown (Reserved)
Signed 16-bit integer

12

4

Put away folder identifier
Contains a CNID

14.4. Finder flags

The finder flags consist of the following values:

Value(s) Description

0x0001

Is on desk
(used for files and folders)

0x000e

Color
(used for files and folders)

0x0040

Is shared
If clear, the application needs to write to its resource fork, and therefore cannot be shared on a server
(used for files)

0x0080

Has no inits
(used for files)

0x0100

Has been inited
Clear if the file contains desktop database resources that have not been added yet.
(used for files)

0x0400

Has custom icon
(used for files and folders)

0x0800

Is stationery
(used for files)

0x1000

Name locked
(used for files and folders)

0x2000

Has bundle
(used for files)

0x4000

Is invisible
(used for files and folders)

0x8000

Is alias
(used for files)
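
A small sketch of decoding the flags follows. The color value spans three bits (mask 0x000e), so it is shifted down by one bit to obtain a value between 0 and 7. The constant and function names are hypothetical; only the numeric values come from the table above.

```c
#include <stdint.h>

/* Illustrative names for the values in the table above */
enum {
    FINDER_FLAG_IS_ON_DESK      = 0x0001,
    FINDER_FLAG_COLOR_MASK      = 0x000e,
    FINDER_FLAG_IS_SHARED       = 0x0040,
    FINDER_FLAG_HAS_NO_INITS    = 0x0080,
    FINDER_FLAG_HAS_BEEN_INITED = 0x0100,
    FINDER_FLAG_HAS_CUSTOM_ICON = 0x0400,
    FINDER_FLAG_IS_STATIONERY   = 0x0800,
    FINDER_FLAG_NAME_LOCKED     = 0x1000,
    FINDER_FLAG_HAS_BUNDLE      = 0x2000,
    FINDER_FLAG_IS_INVISIBLE    = 0x4000,
    FINDER_FLAG_IS_ALIAS        = 0x8000
};

/* Returns the color as a value between 0 and 7 */
unsigned int finder_flags_get_color( uint16_t finder_flags )
{
    return( (unsigned int) ( finder_flags & FINDER_FLAG_COLOR_MASK ) >> 1 );
}

/* Returns 1 if all bits of flag are set, 0 otherwise */
int finder_flags_is_set( uint16_t finder_flags, uint16_t flag )
{
    return( ( finder_flags & flag ) == flag );
}
```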

14.5. Extended finder flags

The extended finder flags consist of the following values:

Value(s) Description

0x0004

Has routing information
The file contains routing info resource

0x0100

Has custom badge
The file or folder has a badge resource.

0x8000

Extended flags are invalid
If set the other extended flags should be ignored
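
Because the 0x8000 value invalidates the other flags, a reader would check it first. The following is a minimal sketch of that order of checks for the 16-bit extended finder flags; the function names are hypothetical.

```c
#include <stdint.h>

/* Returns 1 if the extended finder flags may be interpreted, 0 otherwise */
int extended_finder_flags_are_valid( uint16_t extended_flags )
{
    return( ( extended_flags & 0x8000 ) == 0 );
}

/* Returns 1 if the file or folder has a badge resource, 0 otherwise */
int extended_finder_flags_has_custom_badge( uint16_t extended_flags )
{
    if( !extended_finder_flags_are_valid( extended_flags ) )
        return( 0 );

    return( ( extended_flags & 0x0100 ) != 0 );
}

/* Returns 1 if the file contains a routing info resource, 0 otherwise */
int extended_finder_flags_has_routing_information( uint16_t extended_flags )
{
    if( !extended_finder_flags_are_valid( extended_flags ) )
        return( 0 );

    return( ( extended_flags & 0x0004 ) != 0 );
}
```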

14.5.1. Notes

struct Point {
  SInt16              v;
  SInt16              h;
};
typedef struct Point  Point;

struct Rect {
  SInt16              top;
  SInt16              left;
  SInt16              bottom;
  SInt16              right;
};
typedef struct Rect   Rect;

/* OSType is a 32-bit value made by packing four 1-byte characters
   together. */
typedef UInt32        FourCharCode;
typedef FourCharCode  OSType;
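
As an illustration of the packing described in the comment above, the following sketch converts between four characters and the 32-bit value, with the first character in the most significant byte. The function names are hypothetical.

```c
#include <stdint.h>

/* Packs four 1-byte characters into a 32-bit value, the first character
 * ends up in the most significant byte */
uint32_t four_char_code_pack( const char characters[ 4 ] )
{
    return( ( (uint32_t) (uint8_t) characters[ 0 ] << 24 )
          | ( (uint32_t) (uint8_t) characters[ 1 ] << 16 )
          | ( (uint32_t) (uint8_t) characters[ 2 ] << 8 )
          | (uint32_t) (uint8_t) characters[ 3 ] );
}

/* Unpacks a 32-bit value into four characters and a terminating NUL */
void four_char_code_unpack( uint32_t code, char characters[ 5 ] )
{
    characters[ 0 ] = (char) ( ( code >> 24 ) & 0xff );
    characters[ 1 ] = (char) ( ( code >> 16 ) & 0xff );
    characters[ 2 ] = (char) ( ( code >> 8 ) & 0xff );
    characters[ 3 ] = (char) ( code & 0xff );
    characters[ 4 ] = 0;
}
```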

15. File content

HFS supports multiple ways to store file content:

  • Data fork

  • Compressed data extended attribute

  • Compressed data extended attribute with resource fork

  • Resource fork

  • Extended attribute (named fork)

15.1. Data fork

The file content size is stored in the data fork descriptor of the catalog file record.

The extents of the file content are stored in the fork descriptor and extents (overflow) file.

15.2. Compressed data extended attribute

Compression method should be 3, 5 or 7.

The file content size is stored in the compressed data header of a "com.apple.decmpfs" extended attribute.

For compression method 3 or 7 the file content data is stored in a "com.apple.decmpfs" extended attribute after the compressed data header.

For compression method 5 the file content data contains 0-byte values. There are 12 bytes stored after the compressed data header that contain:

Offset Size Value Description

0

4

Unknown
Seen: 1

4

4

Unknown

8

4

Unknown
Seen: 0

15.3. Compressed data extended attribute with resource fork

Compression method should be 4 or 8.

The file content size is stored in the compressed data header of a "com.apple.decmpfs" extended attribute.

The file content data is stored in a "com.apple.ResourceFork" extended attribute.

The compressed data starts with metadata that contains the offsets of the compressed data blocks.
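
The relation between the compression method and the location of the file content data, as described in this section and the previous one, can be summarized in a small sketch. The enumeration and function names are hypothetical; only the method numbers come from this document.

```c
#include <stdint.h>

typedef enum {
    CONTENT_LOCATION_UNSUPPORTED = 0,
    CONTENT_LOCATION_ATTRIBUTE,      /* inside "com.apple.decmpfs" */
    CONTENT_LOCATION_ZERO_FILLED,    /* content consists of 0-byte values */
    CONTENT_LOCATION_RESOURCE_FORK   /* inside "com.apple.ResourceFork" */
} content_location_t;

/* Maps a compression method to the location of the file content data */
content_location_t compression_method_get_content_location(
                    uint32_t compression_method )
{
    switch( compression_method )
    {
        case 3:
        case 7:
            return( CONTENT_LOCATION_ATTRIBUTE );

        case 5:
            return( CONTENT_LOCATION_ZERO_FILLED );

        case 4:
        case 8:
            return( CONTENT_LOCATION_RESOURCE_FORK );

        default:
            return( CONTENT_LOCATION_UNSUPPORTED );
    }
}
```

Treating any other method as unsupported is a design choice of this sketch; the document only lists the methods named above.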

15.3.1. ZLIB (DEFLATE) compressed data

  • ZLIB (DEFLATE) compressed header

  • Unknown (empty values)

  • ZLIB (DEFLATE) compressed data block offsets and sizes

  • ZLIB (DEFLATE) compressed data blocks

  • ZLIB (DEFLATE) compressed footer

ZLIB (DEFLATE) compressed header

The ZLIB (DEFLATE) compressed header is 16 bytes of size and consists of:

Offset Size Value Description

0

4

Compressed data block descriptors offset
The offset is relative from the start of the ZLIB (DEFLATE) compressed data

4

4

Compressed footer offset
The offset is relative from the start of the ZLIB (DEFLATE) compressed data

8

4

Compressed data block descriptors and data size

12

4

Compressed footer size

Note
The values in the ZLIB (DEFLATE) compressed header are stored in big-endian.
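
A minimal sketch of reading the 16-byte compressed header with big-endian values follows. The type and function names are hypothetical and no consistency checks between the fields are attempted.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t descriptors_offset;  /* relative to the start of the data */
    uint32_t footer_offset;       /* relative to the start of the data */
    uint32_t descriptors_and_data_size;
    uint32_t footer_size;
} zlib_compressed_header_t;

static uint32_t zlib_header_read_u32_be( const uint8_t *d )
{
    return( ( (uint32_t) d[ 0 ] << 24 ) | ( (uint32_t) d[ 1 ] << 16 )
          | ( (uint32_t) d[ 2 ] << 8 ) | (uint32_t) d[ 3 ] );
}

/* Parses the ZLIB (DEFLATE) compressed header.
 * Returns 0 on success or -1 on invalid input. */
int zlib_compressed_header_parse(
     const uint8_t *data,
     size_t data_size,
     zlib_compressed_header_t *header )
{
    if( ( data == NULL ) || ( header == NULL ) || ( data_size < 16 ) )
        return( -1 );

    header->descriptors_offset        = zlib_header_read_u32_be( &( data[ 0 ] ) );
    header->footer_offset             = zlib_header_read_u32_be( &( data[ 4 ] ) );
    header->descriptors_and_data_size = zlib_header_read_u32_be( &( data[ 8 ] ) );
    header->footer_size               = zlib_header_read_u32_be( &( data[ 12 ] ) );

    return( 0 );
}
```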

ZLIB (DEFLATE) compressed data block descriptors

The ZLIB (DEFLATE) compressed data block descriptors are variable in size and consist of:

Offset Size Value Description

0

4

Compressed data size

4

4

Number of compressed data block offset and size tuples

8

8 x …​

Array of compressed data block descriptors

ZLIB (DEFLATE) compressed data block descriptor

The ZLIB (DEFLATE) compressed data block descriptor is 8 bytes of size and consists of:

Offset Size Value Description

0

4

Compressed block offset
The offset is relative from the start of the ZLIB (DEFLATE) compressed data + 20

4

4

Compressed block size

ZLIB (DEFLATE) compressed footer

The ZLIB (DEFLATE) compressed footer is 50 bytes of size and consists of:

Offset Size Value Description

0

24

Unknown (empty values)

24

2

Unknown

26

2

Unknown

28

2

Unknown

30

2

Unknown

32

4

"cmpf"

Unknown (signature)

36

4

Unknown

40

4

Unknown

44

6

Unknown (empty values)

Note
The values in the ZLIB (DEFLATE) compressed footer are stored in big-endian.

15.3.2. LZVN compressed data

Offset Size Value Description

0

4 x …​

Array of compressed data block offsets
The offset is relative from the start of the LZVN compressed data

…​

…​

LZVN compressed data blocks

Note
The compressed data block contains a maximum of 65536 bytes of data. The compressed data block therefore should not exceed 65537 bytes of size.
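
The limits in the note above lead to two simple calculations, sketched here under the assumption that every block except the last one holds exactly 65536 bytes of uncompressed data. The function names are hypothetical.

```c
#include <stdint.h>

/* Returns the number of compressed data blocks needed for
 * uncompressed_size bytes, at 65536 bytes of data per block */
uint64_t compressed_data_get_number_of_blocks( uint64_t uncompressed_size )
{
    uint64_t number_of_blocks = uncompressed_size / 65536;

    /* A partially filled last block still needs a block */
    if( ( uncompressed_size % 65536 ) != 0 )
        number_of_blocks += 1;

    return( number_of_blocks );
}

/* Returns 1 if the size of a compressed data block is within bounds */
int compressed_data_block_size_is_valid( uint32_t compressed_block_size )
{
    return( ( compressed_block_size > 0 )
         && ( compressed_block_size <= 65537 ) );
}
```

A reader can use the second check to reject corrupt block offsets before allocating buffers.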

15.4. Resource fork

TODO: complete this section.

15.5. Extended attribute (named fork)

Extended attributes, also referred to as named forks, are stored in the HFS+ attributes file.

16. Notes

16.1. Master directory block and volume header

The nextAllocation field is used by Mac OS as a hint for where to start searching for free allocation blocks when allocating space for a file. It contains the allocation block number where the search should begin.

Traditional Mac OS implementations typically set it to the first allocation block of the extent most recently allocated. It is not set to the allocation block immediately following the most recently allocated extent, because that extent is likely to be shortened when the file is closed: a whole clump may have been allocated but not actually used.
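
How such a hint might be used can be sketched as a search through an allocation bitmap that starts at the hinted block and wraps around to block 0. This is an illustration only: it assumes a bitmap in memory where a set bit marks a used block and the most significant bit of each byte is the lowest block number, and the wrap-around behavior is an assumption, not a description of the Mac OS implementation. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Searches for a free allocation block, starting at the hint.
 * Returns the block number or -1 if no free block was found. */
int64_t allocation_bitmap_find_free_block(
         const uint8_t *bitmap,
         uint32_t number_of_blocks,
         uint32_t next_allocation_hint )
{
    uint64_t block_number = 0;
    uint32_t block_index  = 0;

    if( ( bitmap == NULL ) || ( number_of_blocks == 0 ) )
        return( -1 );

    /* Ignore a hint that points outside the volume */
    if( next_allocation_hint >= number_of_blocks )
        next_allocation_hint = 0;

    for( block_index = 0; block_index < number_of_blocks; block_index++ )
    {
        block_number = (uint64_t) next_allocation_hint + block_index;

        /* Wrap around to the first allocation block */
        if( block_number >= number_of_blocks )
            block_number -= number_of_blocks;

        /* A clear bit marks a free allocation block */
        if( ( ( bitmap[ block_number / 8 ] >> ( 7 - ( block_number % 8 ) ) ) & 1 ) == 0 )
            return( (int64_t) block_number );
    }
    return( -1 );
}
```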

16.1.2. The default clump size for resource/data forks

The default clump size for resource/data forks, in bytes. This is a hint to the implementation as to the size by which a growing file should be extended. All Apple implementations to date ignore the rsrcClumpSize and use dataClumpSize for both data and resource forks.
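
One way an implementation could apply the clump size as a growth hint is to round the number of bytes to add up to a whole number of clumps. The sketch below shows only that arithmetic; it is an assumption about usage, not a description of the Apple implementations, and the function name is hypothetical.

```c
#include <stdint.h>

/* Rounds size up to the next multiple of clump_size.
 * A clump size of 0 leaves the size unchanged. */
uint64_t clump_round_up( uint64_t size, uint32_t clump_size )
{
    uint64_t remainder = 0;

    if( clump_size == 0 )
        return( size );

    remainder = size % clump_size;

    if( remainder == 0 )
        return( size );

    return( size + ( clump_size - remainder ) );
}
```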

16.2. File Manager

16.2.1. Volume Control Blocks

Whenever the volume is mounted, the File Manager reads the information in the MDB and copies some of that information into a volume control block (VCB). A VCB is a private data structure maintained in memory by the File Manager (in the VCB queue). The structure of a VCB is described in the "Volume Control Blocks" section of Inside Macintosh: Files [APPLE96].

When the File Manager needs to find a data record, it begins searching at the root node (which is an index node, unless the tree has only one level), moving from one record to the next until it finds the record with the highest key that is less than or equal to the search key. The pointer of that record leads to another node, one level down in the tree. This process continues until the File Manager reaches a leaf node; then the records of that leaf node are examined until the desired key is found. At that point, the desired data has also been found.
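
The step that is repeated in every index node, finding the record with the highest key that is less than or equal to the search key, can be sketched as follows. Keys are simplified to 32-bit integers held in ascending order, whereas real catalog keys are compound; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the record with the highest key that is less than
 * or equal to search_key, or -1 if every key is greater than search_key.
 * keys must be sorted in ascending order. */
int node_find_record(
     const uint32_t *keys,
     int number_of_keys,
     uint32_t search_key )
{
    int found_index  = -1;
    int record_index = 0;

    if( keys == NULL )
        return( -1 );

    /* Move from one record to the next until a key exceeds the search key */
    for( record_index = 0; record_index < number_of_keys; record_index++ )
    {
        if( keys[ record_index ] > search_key )
            break;

        found_index = record_index;
    }
    return( found_index );
}
```

In an index node the pointer of the record found leads to the node one level down; in a leaf node the key of the record found still has to be compared for equality with the search key.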

16.4. Determining the Amount of Free Space on a Volume

Appendix A: References

Title: hfs_format.h

URL:

https://opensource.apple.com/source/xnu/xnu-2050.18.24/bsd/hfs/hfs_format.h

[APPLE96]

Title: Inside Macintosh: Files - Data Organization on Volumes

URL:

https://developer.apple.com/library/archive/documentation/mac/Files/Files-99.html

[APPLE04]

Title: Technical Note TN1150: HFS plus volume format

URL:

https://developer.apple.com/library/archive/technotes/tn/tn1150.html

Appendix B: GNU Free Documentation License

Version 1.3, 3 November 2008 Copyright © 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc. http://fsf.org/

Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

0. PREAMBLE

The purpose of this License is to make a manual, textbook, or other functional and useful document "free" in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others.

This License is a kind of "copyleft", which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software.

We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference.

1. APPLICABILITY AND DEFINITIONS

This License applies to any manual or other work, in any medium, that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. Such a notice grants a world-wide, royalty-free license, unlimited in duration, to use that work under the conditions stated herein. The "Document", below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as "you". You accept the license if you copy, modify or distribute the work in a way requiring permission under copyright law.

A "Modified Version" of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language.

A "Secondary Section" is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document’s overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. (Thus, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them.

The "Invariant Sections" are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License. If a section does not fit the above definition of Secondary then it is not allowed to be designated as Invariant. The Document may contain zero Invariant Sections. If the Document does not identify any Invariant Sections then there are none.

The "Cover Texts" are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at most 25 words.

A "Transparent" copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, that is suitable for revising the document straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup, or absence of markup, has been arranged to thwart or discourage subsequent modification by readers is not Transparent. An image format is not Transparent if used for any substantial amount of text. A copy that is not "Transparent" is called "Opaque".

Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML, PostScript or PDF designed for human modification. Examples of transparent image formats include PNG, XCF and JPG. Opaque formats include proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML, PostScript or PDF produced by some word processors for output purposes only.

The "Title Page" means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. For works in formats which do not have any title page as such, "Title Page" means the text near the most prominent appearance of the work’s title, preceding the beginning of the body of the text.

The "publisher" means any person or entity that distributes copies of the Document to the public.

A section "Entitled XYZ" means a named subunit of the Document whose title either is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another language. (Here XYZ stands for a specific section name mentioned below, such as "Acknowledgements", "Dedications", "Endorsements", or "History".) To "Preserve the Title" of such a section when you modify the Document means that it remains a section "Entitled XYZ" according to this definition.

The Document may include Warranty Disclaimers next to the notice which states that this License applies to the Document. These Warranty Disclaimers are considered to be included by reference in this License, but only as regards disclaiming warranties: any other implication that these Warranty Disclaimers may have is void and has no effect on the meaning of this License.

2. VERBATIM COPYING

You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3.

You may also lend copies, under the same conditions stated above, and you may publicly display copies.

3. COPYING IN QUANTITY

If you publish printed copies (or copies in media that commonly have printed covers) of the Document, numbering more than 100, and the Document’s license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects.

If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.

If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public.

It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document.

4. MODIFICATIONS

You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:

  1. Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which should, if there were any, be listed in the History section of the Document). You may use the same title as a previous version if the original publisher of that version gives permission.

  2. List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement.

  3. State on the Title page the name of the publisher of the Modified Version, as the publisher.

  4. Preserve all the copyright notices of the Document.

  5. Add an appropriate copyright notice for your modifications adjacent to the other copyright notices.

  6. Include, immediately after the copyright notices, a license notice giving the public permission to use the Modified Version under the terms of this License, in the form shown in the Addendum below.

  7. Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in the Document’s license notice.

  8. Include an unaltered copy of this License.

  9. Preserve the section Entitled "History", Preserve its Title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there is no section Entitled "History" in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence.

  10. Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on. These may be placed in the "History" section. You may omit a network location for a work that was published at least four years before the Document itself, or if the original publisher of the version it refers to gives permission.

  11. For any section Entitled "Acknowledgements" or "Dedications", Preserve the Title of the section, and preserve in the section all the substance and tone of each of the contributor acknowledgements and/or dedications given therein.

  12. Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section numbers or the equivalent are not considered part of the section titles.

  13. Delete any section Entitled "Endorsements". Such a section may not be included in the Modified Version.

  14. Do not retitle any existing section to be Entitled "Endorsements" or to conflict in title with any Invariant Section.

  15. Preserve any Warranty Disclaimers.

If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version’s license notice. These titles must be distinct from any other section titles.

You may add a section Entitled "Endorsements", provided it contains nothing but endorsements of your Modified Version by various parties—for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard.

You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one.

The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version.

5. COMBINING DOCUMENTS

You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers.

The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.

In the combination, you must combine any sections Entitled "History" in the various original documents, forming one section Entitled "History"; likewise combine any sections Entitled "Acknowledgements", and any sections Entitled "Dedications". You must delete all sections Entitled "Endorsements".

6. COLLECTIONS OF DOCUMENTS

You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects.

You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document.

7. AGGREGATION WITH INDEPENDENT WORKS

A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, is called an "aggregate" if the copyright resulting from the compilation is not used to limit the legal rights of the compilation’s users beyond what the individual works permit. When the Document is included in an aggregate, this License does not apply to the other works in the aggregate which are not themselves derivative works of the Document.

If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one half of the entire aggregate, the Document’s Cover Texts may be placed on covers that bracket the Document within the aggregate, or the electronic equivalent of covers if the Document is in electronic form. Otherwise they must appear on printed covers that bracket the whole aggregate.

8. TRANSLATION

Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License, and all the license notices in the Document, and any Warranty Disclaimers, provided that you also include the original English version of this License and the original versions of those notices and disclaimers. In case of a disagreement between the translation and the original version of this License or a notice or disclaimer, the original version will prevail.

If a section in the Document is Entitled "Acknowledgements", "Dedications", or "History", the requirement (section 4) to Preserve its Title (section 1) will typically require changing the actual title.

9. TERMINATION

You may not copy, modify, sublicense, or distribute the Document except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense, or distribute it is void, and will automatically terminate your rights under this License.

However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation.

Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice.

Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, receipt of a copy of some or all of the same material does not give you any rights to use it.

10. FUTURE REVISIONS OF THIS LICENSE

The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.

Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License "or any later version" applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation. If the Document specifies that a proxy can decide which future versions of this License can be used, that proxy’s public statement of acceptance of a version permanently authorizes you to choose that version for the Document.

11. RELICENSING

"Massive Multiauthor Collaboration Site" (or "MMC Site") means any World Wide Web server that publishes copyrightable works and also provides prominent facilities for anybody to edit those works. A public wiki that anybody can edit is an example of such a server. A "Massive Multiauthor Collaboration" (or "MMC") contained in the site means any set of copyrightable works thus published on the MMC site.

"CC-BY-SA" means the Creative Commons Attribution-Share Alike 3.0 license published by Creative Commons Corporation, a not-for-profit corporation with a principal place of business in San Francisco, California, as well as future copyleft versions of that license published by that same organization.

"Incorporate" means to publish or republish a Document, in whole or in part, as part of another Document.

An MMC is "eligible for relicensing" if it is licensed under this License, and if all works that were first published under this License somewhere other than this MMC, and subsequently incorporated in whole or in part into the MMC, (1) had no cover texts or invariant sections, and (2) were thus incorporated prior to November 1, 2008.

The operator of an MMC Site may republish an MMC contained in the site under CC-BY-SA on the same site at any time before August 1, 2009, provided the MMC is eligible for relicensing.