= Nickel .ini File Format =
The file format is documented very thoroughly in Nickel's external header file, [source:/branches/nickel/include/bohr/ni.h bohr/ni.h]. I'll copy the relevant description here as well, to be cleaned up and/or augmented later.
== Encoding ==
Files should be UTF-8 (or plain ol' ASCII) encoded, though Nickel can handle
arbitrary binary data too.
== Newlines ==
On reading, Nickel treats all newline flavors (at least, CR, LF, and CRLF) as
simply LF (`\n` in C). This applies '''even inside quoted strings''' (see [#Quotes below]).
If you need a CR (`\r` in C) character in a node's name/value, you must escape
it (see [#EscapeSequences below]), that is, put a literal `\r` (backslash, lower r).
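The newline folding described above can be sketched in a few lines of C. This is only an illustration of the rule (CR, LF, and CRLF all become LF); the function name `normalize_newlines` is made up for this page and is not part of Nickel's API.

```c
#include <stddef.h>

/* Rewrites buf in place so that CR, LF, and CRLF each become a single LF.
 * Returns the new length. Hypothetical helper, not Nickel's actual code. */
size_t normalize_newlines(char *buf, size_t len)
{
    size_t in = 0, out = 0;

    while (in < len) {
        char c = buf[in++];
        if (c == '\r') {
            /* CRLF counts as one newline: skip the LF that follows a CR. */
            if (in < len && buf[in] == '\n')
                in++;
            c = '\n';
        }
        buf[out++] = c;
    }
    return out;
}
```

The output is never longer than the input, which is why the rewrite can safely happen in place.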
== Identifying Signature ==
Though this is not enforced, the first few characters of a Nickel-compatible
(as per version 1 of the API) .ini file should be `;Ni1`
(that's semicolon, upper N, lower i, digit 1) followed by a newline. This
gives a machine-parsable sequence that future Nickel versions or other
parsers can use to identify the file format, while being a .ini comment (see
below) and thus ignored by anything that shouldn't care. Though it's not
currently checked in Nickel (it'll try to parse anything you throw at it), it
may be in future versions.
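Since Nickel itself doesn't check the signature, a tool that wants to identify the format has to do so on its own. A minimal sketch, assuming the signature is exactly semicolon, upper N, lower i, digit 1 followed by any newline flavor; `has_ni1_signature` is a hypothetical name, not a Nickel function.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf begins with the 4 signature characters followed by a
 * newline (LF, CR, or the CR of a CRLF), 0 otherwise.
 * Hypothetical helper, not part of Nickel. */
int has_ni1_signature(const char *buf, size_t len)
{
    static const char sig[] = ";Ni1";
    const size_t sig_len = sizeof sig - 1;

    /* Need the signature plus at least one more byte for the newline. */
    if (len <= sig_len || memcmp(buf, sig, sig_len) != 0)
        return 0;
    return buf[sig_len] == '\n' || buf[sig_len] == '\r';
}
```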
== The Parser ==
There are two kinds of data Nickel looks for in .ini files: names and values.
A name is a sequence of characters inside a number of `[]` (square brackets),
which denotes a section heading, OR before an `=` (equals sign), which denotes
that the value for the node follows. Whichever way a name is encountered, a
node with that name will be created (or, if the named node already exists,
its value or children will be modified, depending on what comes next).
=== Quotes ===
Both names and values can have surrounding `""` (double quotes). When the
parser encounters an opening `"` (double quote), it parses all data until a
closing `"` (double quote), even spanning newlines, in the context of the name
or value. The parser supports string concatenation whereby:
"some" "thing"
is equivalent to `"something"`.
Whitespace must be the only characters between closing `"` (double quote) and
re-opening `"` (double quote) for this to work (or, in values, there can be a
`\` (backslash) and a newline--see [#EscapeSequences escaping], below).
=== Comments ===
Comments are initiated with the `;` (semicolon). Any `;` (semicolon) outside of
double quotes begins a comment (except when escaped, see below). Comments go
until the end of the line (and cannot be continued onto the next line with `\`
(backslash), as in some languages).
Any line deemed invalid (e.g. blank or without an `=` (equals sign) or `[]`
(square brackets)) is ignored as a comment.
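To make the comment rule concrete, here is a rough C sketch that finds where a comment starts on one line: the first `;` that is neither inside double quotes nor preceded by a backslash. It is a simplification written for this page (`comment_start` is a hypothetical name), and it assumes the line doesn't begin inside a quoted string carried over from a previous line.

```c
#include <stddef.h>

/* Returns the index of the ';' that begins a comment in line[0..len),
 * or len if the line has no comment. Hypothetical helper, not Nickel's
 * actual parser. */
size_t comment_start(const char *line, size_t len)
{
    int in_quotes = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        char c = line[i];
        if (c == '\\' && i + 1 < len) {
            i++;                    /* skip the escaped character */
        } else if (c == '"') {
            in_quotes = !in_quotes; /* unescaped quote opens or closes */
        } else if (c == ';' && !in_quotes) {
            return i;
        }
    }
    return len;
}
```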
=== Escape Sequences ===
Nickel understands escape sequences similar to those in C. Nickel translates
these escape sequences regardless of whether it's inside quotes or not. It
treats the following sequences the same as C:
\a \b \f \n \r \t \v \' \" \? \\ \ooo and \xhh
where `ooo` is up to 3 octal digits (0-7), `hh` is up to 2 hex digits (0-9, a-f,
or A-F), and all other characters are literally as they're written there.
The hex escape (`\xhh`) differs from C's: C lets the run of hex digits after
`\x` be arbitrarily long, whereas Nickel reads at most 2. Octal escapes are
capped at 3 digits, as in C, but where C compilers generally flag an octal
escape bigger than 255 in an ordinary string as out of range, Nickel takes the
whole octal value and uses the result modulo 256.
Nickel also understands the following escape sequences, which exist only to
allow .ini reserved characters in names and values in the file:
\; \[ \] \=
These are simply replaced with the character following the backslash.
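The rules above fit in one small decoding function. The following is a sketch of those rules, not Nickel's implementation: `decode_escape` and `hex_val` are hypothetical names, and two behaviors are assumptions because the text doesn't specify them: what happens for `\x` with no hex digits after it, and what happens for a backslash before an unlisted character.

```c
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit (assumes ASCII). */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* s points at the character right after a backslash and must not point at
 * the terminating NUL. Stores the decoded byte in *out and returns how many
 * characters after the backslash were consumed. */
size_t decode_escape(const char *s, unsigned char *out)
{
    unsigned value = 0;
    size_t n;

    switch (s[0]) {
    case 'a': *out = '\a'; return 1;
    case 'b': *out = '\b'; return 1;
    case 'f': *out = '\f'; return 1;
    case 'n': *out = '\n'; return 1;
    case 'r': *out = '\r'; return 1;
    case 't': *out = '\t'; return 1;
    case 'v': *out = '\v'; return 1;
    case 'x':
        /* At most 2 hex digits, unlike C. */
        for (n = 1; n <= 2 && hex_val(s[n]) >= 0; n++)
            value = value * 16 + (unsigned)hex_val(s[n]);
        if (n == 1) {           /* ASSUMPTION: no digits means a plain 'x' */
            *out = 'x';
            return 1;
        }
        *out = (unsigned char)value;
        return n;
    default:
        if (s[0] >= '0' && s[0] <= '7') {
            /* Up to 3 octal digits, result taken modulo 256. */
            for (n = 0; n < 3 && s[n] >= '0' && s[n] <= '7'; n++)
                value = value * 8 + (unsigned)(s[n] - '0');
            *out = (unsigned char)(value % 256);
            return n;
        }
        /* \' \" \? \\ \; \[ \] \= yield the character itself.
         * ASSUMPTION: any other character is treated the same way. */
        *out = (unsigned char)s[0];
        return 1;
    }
}
```

With this sketch, `\777` decodes to byte 255 (511 modulo 256), and `\x414` decodes to `A` with the trailing `4` left for the caller as an ordinary character.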
Note that escape sequences aren't always necessary, but are always parsed if
present. Thus the following two lines are equivalent:
poo = \\ \=\"\[ ; here, the = and [ are escaped
poo = "\\ =\"[" ; here, double quotes are used and \ and " are escaped
and indicate the node has a value of: `\` (backslash), space, `=` (equals sign),
`"` (double quote), and `[` (opening square bracket). In the example, the double
quotes aren't necessary on the second line, but are there to illustrate that
the escape sequences are the same inside and outside of quotes (except that
in quotes, if a `"` (double quote) is encountered unescaped, it obviously ends
the quote).
In values only, it is possible to escape a literal newline to instruct Nickel
to continue consuming the next line as part of the value. This escape
consists of a `\` (backslash) as the last non-whitespace character on a line,
ignoring comments. If following a quoted string, the `\` (backslash) should
come after the closing `"` (quote). If the string immediately before the `\`
(backslash) is quoted, the `\` (backslash)-newline pair is simply omitted, but
if the previous string was unquoted, the pair is replaced by a single space,
regardless of how much intervening space there actually was. Thus, the
following lines all create equivalent nodes:
poo = abc def ; just the string "abc def"
poo = abc \ ; no quotes, lots of spaces, and continue on the next line
"def" ; quotes here just for shits and giggles--not necessary
poo = "abc " \ ; first line with quotes, then continue on the next line
def ; which could also have quotes ("def"), doesn't matter
with the 7-character value `abc def`.
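The joining rule for continued lines comes down to one decision: whether the piece before the `\` (backslash) was quoted. A small C sketch of just that decision, assuming both pieces have already been parsed and stripped of surrounding whitespace as described; `join_continued` is a hypothetical name, not a Nickel function.

```c
#include <stdio.h>

/* Joins two already-parsed pieces of a value that were separated by a
 * backslash-newline. If the first piece was unquoted, the pair becomes a
 * single space; if it was quoted, nothing is inserted.
 * Returns what snprintf returns: the length of the joined string (which may
 * exceed cap - 1 if dst was too small). Hypothetical helper. */
int join_continued(char *dst, size_t cap,
                   const char *first, int first_quoted,
                   const char *second)
{
    const char *sep = first_quoted ? "" : " ";
    return snprintf(dst, cap, "%s%s%s", first, sep, second);
}
```

Joining unquoted `abc` with `def`, or quoted `"abc "` with `def`, gives the same 7-character value `abc def` in both cases, matching the examples above.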
=== Spaces ===
Surrounding names and values, whitespace is eaten by the parser. This
happens after comments are removed (see the above example). If you really
want a node whose name or value begins or ends with spaces, enclose it in
quotes. Inside names and values, whitespace is maintained. The exception to
this is with the line-continuation escape, where, if the string immediately
preceding the `\` (backslash) was unquoted, spaces trailing on the top line or
beginning the second line are all replaced with a single space, or if the
preceding string was quoted, no space is inserted and initial space on the
following line is ignored. See the above example for clarification on this.
=== Hierarchies ===
Lines giving values to named nodes (e.g. `name = value`) become children of the
node given in the most recent previous section (e.g. `[name]`), or the root
node if there have been no sections yet. You can define subsections (e.g.
`[[name]]`) by specifying increasing numbers of `[` (opening square bracket)
characters. Sections with the same number of brackets become children of the
most recently preceding parent defined with fewer brackets; sections with
more brackets become deeper children, and sections with fewer brackets are
closer to the root. The following example shows the root gaining two
children, `child1` (with value `teh suz` and child `sub_child1`) and `child2`
(with its own child `sub_child2`):
child1 = teh suz ; create child1 as a child of the root, give it a value
[ child2 ] ; another root child, anything below is child2's child
[[sub_child2 ]] ; sub_child2, having more [, is under child2
[child1] ; back up to root's child child1
sub_child1= ; this is a child of the previous section, i.e., child1
(Spaces were added and the order was slightly obfuscated to show these things
are unimportant.) Note that the parser ignores the number of `]` (closing
square brackets) for the sake of keeping insignificant typos from causing
errors, but to stay compatible with future versions of Nickel or other
parsers, you should close each section with the same number of `]` (closing
square brackets) as it was opened with.
Also note that if you skip steps in your subsections, e.g.:
[[[ sub_sub_child1 ]]]
empty children are inserted to make up the difference (here, `child1` has an
unnamed child, which has a child `sub_sub_child1`).
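The bracket-counting rule can be sketched as two tiny C functions: one that reads a heading's depth from its leading `[` characters, and one that works out how many unnamed nodes are implied when a heading skips levels. Both names are hypothetical, and the sketch assumes the brackets are contiguous (no spaces between them), which the text doesn't address.

```c
#include <stddef.h>

/* Returns the depth of a section heading: the number of '[' characters after
 * any leading spaces or tabs. Returns 0 if the line isn't a heading.
 * Closing brackets are deliberately not counted, mirroring the parser's
 * leniency. Hypothetical helper, not Nickel's code. */
size_t section_depth(const char *line)
{
    size_t depth = 0;

    while (*line == ' ' || *line == '\t')
        line++;
    while (*line == '[') {
        depth++;
        line++;
    }
    return depth;
}

/* Number of empty, unnamed nodes implied when moving from a section at depth
 * `current` to a heading at depth `next`. */
size_t implied_empty_levels(size_t current, size_t next)
{
    return next > current + 1 ? next - current - 1 : 0;
}
```

Going from `[child1]` (depth 1) straight to `[[[ sub_sub_child1 ]]]` (depth 3) implies one unnamed node in between.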
=== Uniqueness of Names ===
If a name is given more than once under the same parent in a file, the last
instance overrides all previous instances. Thus:
[child1]
sub = asdf
[child1]
sub = again
gives `child1` a child called `sub`, and gives it the value `again`. Note
that there is only one `child1`, and the value `asdf` is overwritten by
`again` for `sub`.
=== Empty Names ===
It's valid to specify empty names. Here:
[]
=
creates a node with an empty name as the child of the root, and gives it a
child with an empty name and empty value.
=== On the Strictness of the Parser ===
One more note: the parser is pretty lax--it will glean as much valid
information from a line as it can, ignoring errors if it can still get
something useful. This can lead to weird results. Consider the following:
[section] child = value
# comments = cool!
Here, because anything other than a comment after the `]` (closing square
bracket) is invalid, the parser ignores `child = value` on the first line.
Also, because `#` (hash mark) isn't the standard for initiating a comment, the
second line creates a child of `section` named `# comments`, with the value
`cool!`. Keep your files valid and standard and you won't have issues.
== Compatibility ==
While Nickel should be able to parse any Windows-compatible .ini file, it
won't necessarily output to a format that Windows will understand. Some
circumstances in which it will generate an incompatible file are:
* The tree is more than one level deep.
* A node's name or value contains a special character: `[` `=` `\` `;` etc. (as
these may be escaped by Nickel)
* A node's name or value has initial or trailing whitespace (as Nickel will
put such a string inside quotes)
Nickel can always parse a file it generates; it's just that Windows doesn't
understand some of Nickel's syntax, I believe, like nested headings, escaped
characters, and quotes. Also, if you read a Windows-compatible .ini file
into memory and write it right back out to disk, you'll get a Windows-compatible .ini file as long as no special characters are in any node's
value. Nickel is fairly good about escaping special characters only when
necessary (as of [milestone:"Nickel v1.1.0" v1.1.0], anyway), but if maximum compatibility with other
parsers is your goal, avoid special characters in names and values.
== File Generation ==
It's also worth noting that since Nickel builds an in-memory map based on the
.ini file, and writing files uses the in-memory representation of the data,
comments in the file are entirely lost. This is unfortunate, but means that
Nickel runs quickly enough to be used inside, say, a game, since accessing
the data doesn't require costly disk reads.
It's also unfortunate that order is lost between reading and writing. This
is due to hashing mixing things up. The speed gain from hashing entries
greatly outweighs maintaining order, in my mind, and keeping an index of the
order nodes were added would take more overhead than I believe it's worth.