Glk 071 draft spec changes

erkyrath edited this page Jan 30, 2011 · 16 revisions

This is a dump of the new and changed paragraphs in the Glk 0.7.1 spec. (See Todo.) Chunks from different chapters of the spec are glommed together here, so you'll see some redundancy.

Line input echoing

By default, when the player finishes his line of input, the library will display the input text at the end of the buffer text (if it wasn't there already). It will be followed by a newline, so that the next text you print will start a new line (paragraph) after the input.

However, this default behavior can be changed with the glk_set_echo_line_event() call. If the default echoing is disabled, the library will not display the input text (plus newline) after input is either completed or cancelled. The buffer will end with whatever prompt you displayed before requesting input. If you want the traditional input behavior, it is then your responsibility to print the text, using the Input text style, followed by a newline (in the original style).

The glk_set_echo_line_event() call has no effect in grid windows.

void glk_set_echo_line_event(winid_t win, glui32 val);

Normally, after line input is completed or cancelled in a buffer window, the library ensures that the complete input line (or its latest state, after cancelling) is displayed at the end of the buffer, followed by a newline. This call allows you to suppress this behavior. If the val argument is zero, all subsequent line input requests in the given window will leave the buffer unchanged after the input is completed or cancelled; the player's input will not be printed. If val is nonzero, subsequent input requests will have the normal printing behavior.

Note that this feature is unrelated to the window's echo stream.

res = glk_gestalt(gestalt_LineInputEcho, 0);

Not all libraries support this feature. This returns 1 if glk_set_echo_line_event() is supported, and 0 if it is not. Remember that if it is not supported, the behavior is always the default, which is line echoing enabled.

If you turn off line input echoing, you can reproduce the standard input behavior by following each line input event (or line input cancellation) by printing the input line, in the Input style, followed by a newline in the original style.

The glk_set_echo_line_event() call does not affect a pending line input request. It also has no effect in non-buffer windows. In a grid window, the game can overwrite the input area at will, so there is no need for this distinction.
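As a rough illustration of the manual echo described above, here is a minimal sketch. The helper name echo_input_line() is hypothetical, and the block defines small stand-ins for the Glk output calls (recording into a capture buffer) only so that it compiles without a Glk library. The style numbers are assumptions to be checked against glk.h. A real program would include glk.h and delete the stand-ins.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* --- Stand-ins so this sketch compiles without a Glk library. ---
   In a real program, include glk.h and delete this part. The style
   numbers and the capture buffer are illustrative assumptions. */
typedef uint32_t glui32;
enum { style_Normal = 0, style_Input = 8 };

static char captured[256];   /* records what the "library" was asked to print */

static void capture(const char *s, size_t n)
{
    size_t used = strlen(captured);
    if (n > sizeof captured - 1 - used)
        n = sizeof captured - 1 - used;
    memcpy(captured + used, s, n);
    captured[used + n] = '\0';
}

static void glk_set_style(glui32 style)
{
    char tag[16];
    int n = snprintf(tag, sizeof tag, "<%u>", (unsigned)style);
    capture(tag, (size_t)n);
}

static void glk_put_buffer(char *buf, glui32 len) { capture(buf, len); }

static void glk_put_char(unsigned char ch)
{
    char c = (char)ch;
    capture(&c, 1);
}
/* --- End of stand-ins. --- */

/* Hypothetical helper: reproduce the default echo by hand.
   buf and count are the line buffer and the event's val1.
   original_style is whatever style the game was printing in before
   input; the game has to remember this itself. */
static void echo_input_line(char *buf, glui32 count, glui32 original_style)
{
    glk_set_style(style_Input);      /* the input text uses the Input style */
    glk_put_buffer(buf, count);
    glk_set_style(original_style);   /* the newline is in the original style */
    glk_put_char('\n');
}
```

The helper would be called once after each line input event or cancellation, and only when echoing has been turned off with glk_set_echo_line_event(win, 0). It writes to the current output stream, so it assumes the input window's stream is current.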

Line input terminators

void glk_set_terminators_line_event(winid_t win, glui32 *keycodes, glui32 count);

If a window has a pending request for line input, the player can generally hit the enter key (in that window) to complete line input. The details will depend on the platform's native user interface.

It is possible to request that other keystrokes complete line input as well. (This allows a game to intercept function keys or other special keys during line input.) To do this, call glk_set_terminators_line_event(), and pass an array of count keycodes. These must all be special keycodes (see "Character Input"). Do not include regular printable characters in the array, nor keycode_Return (which represents the default enter key and will always be recognized). To return to the default behavior, pass a NULL or empty array.

The glk_set_terminators_line_event() call affects subsequent line input requests in the given window. It does not affect a pending line input request. This distinction makes life easier for interpreters that set up UI callbacks only at the start of input.

A library may not support this feature; if it does, it may not support all special keys as terminators. (Some keystrokes are reserved for OS or interpreter control.)

res = glk_gestalt(gestalt_LineTerminators, 0);

This returns 1 if glk_set_terminators_line_event() is supported, and 0 if it is not.

res = glk_gestalt(gestalt_LineTerminatorKey, ch);

This returns 1 if the keycode ch can be passed to glk_set_terminators_line_event(). If it returns 0, that keycode will be ignored as a line terminator. Printable characters and keycode_Return will always return 0.
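The two gestalt tests might be combined as in the sketch below, which keeps only the keycodes the library reports as usable and installs those. Filtering is optional, since unsupported keycodes are ignored anyway, but it tells the game which keys it can rely on. The helper install_terminators() is hypothetical. The numeric selector and keycode values are assumptions that should be taken from glk.h, and the glk_gestalt() and glk_set_terminators_line_event() bodies are stand-ins for a pretend library that accepts only Escape and F1.

```c
#include <stddef.h>
#include <stdint.h>

/* --- Stand-ins so this sketch compiles without a Glk library. ---
   In a real program, include glk.h and delete this part. All numeric
   values below are assumptions; use the ones in glk.h. */
typedef uint32_t glui32;
typedef struct glk_window_struct *winid_t;

#define gestalt_LineTerminators   18u
#define gestalt_LineTerminatorKey 19u
#define keycode_Escape 0xfffffff8u
#define keycode_Func1  0xffffffefu
#define keycode_Func2  0xffffffeeu

/* Pretend library: supports the feature, but only Escape and F1. */
static glui32 glk_gestalt(glui32 sel, glui32 val)
{
    if (sel == gestalt_LineTerminators)
        return 1;
    if (sel == gestalt_LineTerminatorKey)
        return (val == keycode_Escape || val == keycode_Func1) ? 1 : 0;
    return 0;
}

static glui32 installed[8];      /* what the pretend library was given */
static glui32 installed_count;

static void glk_set_terminators_line_event(winid_t win, glui32 *keycodes,
                                           glui32 count)
{
    glui32 i;
    (void)win;
    installed_count = count > 8 ? 8 : count;
    for (i = 0; i < installed_count; i++)
        installed[i] = keycodes[i];
}
/* --- End of stand-ins. --- */

/* Hypothetical helper: install whichever of the wanted keycodes the
   library accepts. scratch must have room for nwanted entries.
   Returns how many terminators were installed (0 if unsupported). */
static glui32 install_terminators(winid_t win, const glui32 *wanted,
                                  glui32 nwanted, glui32 *scratch)
{
    glui32 i, n = 0;

    if (!glk_gestalt(gestalt_LineTerminators, 0))
        return 0;                       /* feature not available at all */

    for (i = 0; i < nwanted; i++) {
        if (glk_gestalt(gestalt_LineTerminatorKey, wanted[i]))
            scratch[n++] = wanted[i];   /* keep only usable keys */
    }

    /* An empty list restores the default (Enter only). */
    glk_set_terminators_line_event(win, n ? scratch : NULL, n);
    return n;
}
```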

When line input is completed, glk_select() will return an event whose type is evtype_LineInput. Once this happens, the request is complete; it is no longer pending. You must call glk_request_line_event() if you want another line of text from that window.

In the event structure, win tells what window the event came from. val1 tells how many characters were entered. val2 will be 0 unless input was ended by a special terminator key, in which case val2 will be the keycode (one of the values passed to glk_set_terminators_line_event()).

The characters themselves are stored in the buffer specified in the original glk_request_line_event() or glk_request_line_event_uni() call. There is no null terminator or newline stored in the buffer.
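Because the buffer carries no terminator, code that wants a C string has to rely on val1 for the length. The sketch below shows one way to do that for a non-Unicode request. The helper names line_to_cstring() and ended_by_terminator() are hypothetical, and glui32 is typedef'd locally only so the block stands alone.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t glui32;   /* stand-in; normally provided by glk.h */

/* Copy the val1 characters of a completed line into dest and add the
   terminating NUL that the Glk buffer itself does not contain.
   Returns the number of characters copied, which may be fewer than
   val1 if dest is too small. */
static size_t line_to_cstring(const char *buf, glui32 val1,
                              char *dest, size_t destsize)
{
    size_t n = val1;

    if (destsize == 0)
        return 0;
    if (n > destsize - 1)
        n = destsize - 1;       /* leave room for the NUL */
    memcpy(dest, buf, n);
    dest[n] = '\0';
    return n;
}

/* Nonzero when the line was ended by a special terminator key
   (val2 holds its keycode) rather than by the enter key. */
static int ended_by_terminator(glui32 val2)
{
    return val2 != 0;
}
```

Both helpers would be fed the val1 and val2 fields of the evtype_LineInput event returned by glk_select().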

Inter-window borders

(In the glk_window_open() call, method argument:)

  • winmethod_Border, winmethod_NoBorder: There should or should not be a visible window border between the new window and its sibling. (This is a hint to the library; you might specify NoBorder between two graphics windows that should form a single image.)

The way windows are displayed is, of course, entirely up to the Glk library; it depends on what is natural for the player's machine. The borders between windows may be black lines, 3-D bars, rows of "#" characters; there may even be no borders at all. The library may not support the Border/NoBorder hint, in which case every pair of windows will have a visible border -- or no border -- between them.

The Border/NoBorder hint was introduced in Glk 0.7.1. Prior to that, all games used the Border hint, and this remains the default. However, as noted, not all implementations display window borders. Therefore, for existing implementations, "Border" may be understood as "your normal style of window display"; "NoBorder" may be understood as "suppress any inter-window borders you may have".

There may be decorations within the windows as well. A text buffer window will often have a scroll bar. The library (or player) may prefer wide margins around each text window. And so on.

Unicode decompose/normalize

res = glk_gestalt(gestalt_UnicodeNorm, 0);

This returns 1 if the Unicode normalization functions are available. If it returns 0, you should not try to call them. The Unicode normalization functions include glk_buffer_canon_decompose_uni and glk_buffer_canon_normalize_uni.

The equivalent preprocessor test for these functions is GLK_MODULE_UNICODE_NORM.

Comparing Unicode strings is difficult, because there can be several ways to represent a piece of text as a Unicode string. For example, the one-character string "è" (an accented "e") will be displayed the same as the two-character string containing "e" followed by Unicode character 0x0300 (COMBINING GRAVE ACCENT). These strings should be considered equal.

Therefore, a Glk program that accepts line input should convert its text to a normalized form before parsing it. These functions offer those conversions. The algorithms are defined by the Unicode spec (chapter 3.7) and Unicode Standard Annex #15.

glui32 glk_buffer_canon_decompose_uni(glui32 *buf, glui32 len, glui32 numchars);

This transforms a string into its canonical decomposition ("Normalization Form D"). Effectively, this takes apart multipart characters into their individual parts. For example, it would convert "è" (character 0xE8, an accented "e") into the two-character string containing "e" followed by Unicode character 0x0300 (COMBINING GRAVE ACCENT). If a single character has multiple accent marks, they are also rearranged into a standard order.

glui32 glk_buffer_canon_normalize_uni(glui32 *buf, glui32 len, glui32 numchars);

This transforms a string into its canonical decomposition and recomposition ("Normalization Form C"). Effectively, this takes apart multipart characters, and then puts them back together in a standard way. For example, this would convert the two-character string containing "e" followed by Unicode character 0x0300 (COMBINING GRAVE ACCENT) into the one-character string "è" (character 0xE8, an accented "e").

The canon_normalize function includes decomposition as part of its implementation. You never have to call both functions on the same string.

Both of these functions are idempotent.

These functions provide two length arguments because a string of Unicode characters may expand when it is transformed. The len argument is the available length of the buffer; numchars is the number of characters in the buffer initially. (So numchars must be less than or equal to len. The contents of the buffer after numchars do not affect the operation.)

The functions return the number of characters after transformation. If this is greater than len, the characters in the array will be safely truncated at len, but the true count will be returned. (The contents of the buffer after the returned count are undefined.)
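To make the two-length contract concrete, here is a toy function with the same signature and the same truncate-but-return-the-true-count behavior. It is not the library's algorithm: toy_decompose() is a hypothetical stand-in that knows only two characters and does no reordering of accent marks. The helper usable_count() shows how a caller might work out how many characters in the buffer are valid afterwards.

```c
#include <stdint.h>

typedef uint32_t glui32;   /* stand-in; normally provided by glk.h */

/* Toy decomposition table (an assumption for illustration only):
   U+00E8 -> 'e' + U+0300, U+00E9 -> 'e' + U+0301. */
static int toy_split(glui32 ch, glui32 *base, glui32 *mark)
{
    switch (ch) {
    case 0xE8: *base = 'e'; *mark = 0x0300; return 1;
    case 0xE9: *base = 'e'; *mark = 0x0301; return 1;
    default:   return 0;
    }
}

/* Same contract as the Glk functions: len is the buffer capacity,
   numchars the initial character count. The result is truncated at
   len, but the full untruncated count is returned. */
static glui32 toy_decompose(glui32 *buf, glui32 len, glui32 numchars)
{
    glui32 base, mark, total = 0, out, i;

    /* Pass 1: how long would the full result be? */
    for (i = 0; i < numchars; i++)
        total += toy_split(buf[i], &base, &mark) ? 2 : 1;

    /* Pass 2: fill from the end backwards, so that unread input is
       never overwritten. Positions at or beyond len are skipped. */
    out = total;
    for (i = numchars; i-- > 0; ) {
        if (toy_split(buf[i], &base, &mark)) {
            if (--out < len) buf[out] = mark;
            if (--out < len) buf[out] = base;
        } else {
            glui32 ch = buf[i];
            if (--out < len) buf[out] = ch;
        }
    }
    return total;
}

/* Number of valid characters in the buffer after a call. */
static glui32 usable_count(glui32 returned, glui32 len)
{
    return returned > len ? len : returned;
}
```

A caller that sees a return value greater than len knows the text was cut off, and can retry with a larger buffer if the full result matters.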

The Unicode spec also defines stronger forms of these functions, called "compatibility decomposition and recomposition" ("Normalization Form KD" and "Normalization Form KC"). These do all of the accent-mangling described above, but they also transform many other obscure Unicode characters into more familiar forms. For example, they split ligatures apart into separate letters. They also convert Unicode display variations such as script letters, circled letters, and half-width letters into their common forms.

The Glk spec does not currently provide these stronger transformations. Glk's expected use of Unicode normalization is for line input, and an OS facility for line input will generally not produce these alternate character forms (unless the user goes out of his way to type them). Therefore, the need for these transformations does not seem to be worth the extra data table space.

A Note on Unicode Case-Folding and Normalization

With all of these Unicode transformations hovering about, an author might reasonably ask about the right way to handle line input. Our recommendation is: call glk_buffer_to_lower_case_uni(), followed by glk_buffer_canon_normalize_uni(), and then parse the result. The parsing process should of course match against strings that have been put through the same process.

The Unicode spec (chapter 3.13) gives a different, three-step process: decomposition, case-folding, and decomposition again. Our recommendation comes through a series of practical compromises:

  • The initial decomposition is only necessary because of a historical error in the Unicode spec: character 0x0345 (COMBINING GREEK YPOGEGRAMMENI) behaves inconsistently. We ignore this case, and skip this step.
  • Case-folding is a slightly different operation from lower-casing. (Case-folding splits some combined characters, so that, for example, "ß" can match both "ss" and "SS".) However, Glk does not currently offer a case-folding function. We substitute glk_buffer_to_lower_case_uni().
  • I'm not sure why the spec recommends decomposition (glk_buffer_canon_decompose_uni()) rather than glk_buffer_canon_normalize_uni(). However, composed characters are the norm in source code, and therefore in compiled Inform game files. If we specified decomposition, the compiler would have to do extra work; also, the standard Inform dictionary table (with its fixed word length) would store fewer useful characters. Therefore, we substitute glk_buffer_canon_normalize_uni().

We may revisit these recommendations in future versions of the spec.
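The recommended sequence (lower-case, then normalize, then parse) might look like the sketch below. The helper prepare_for_parsing() is hypothetical. The two glk_buffer_* bodies are toy stand-ins with the real signatures, handling only ASCII letters and the single pair "e" plus 0x0300, so that the block compiles without a Glk library. A real program would include glk.h and delete them.

```c
#include <stdint.h>

/* --- Stand-ins so this sketch compiles without a Glk library. ---
   In a real program, include glk.h and delete this part. These toy
   versions cover only ASCII case and one composed character. */
typedef uint32_t glui32;

static glui32 glk_buffer_to_lower_case_uni(glui32 *buf, glui32 len,
                                           glui32 numchars)
{
    glui32 i;
    (void)len;
    for (i = 0; i < numchars; i++) {
        if (buf[i] >= 'A' && buf[i] <= 'Z')
            buf[i] += 'a' - 'A';
    }
    return numchars;
}

static glui32 glk_buffer_canon_normalize_uni(glui32 *buf, glui32 len,
                                             glui32 numchars)
{
    glui32 in = 0, out = 0;
    (void)len;
    while (in < numchars) {
        if (buf[in] == 'e' && in + 1 < numchars && buf[in + 1] == 0x0300) {
            buf[out++] = 0xE8;      /* compose "e" + grave into U+00E8 */
            in += 2;
        } else {
            buf[out++] = buf[in++];
        }
    }
    return out;
}
/* --- End of stand-ins. --- */

/* Hypothetical helper: put a line of input into the form recommended
   for parsing. Returns the number of valid characters left in buf. */
static glui32 prepare_for_parsing(glui32 *buf, glui32 len, glui32 numchars)
{
    numchars = glk_buffer_to_lower_case_uni(buf, len, numchars);
    if (numchars > len)
        numchars = len;     /* result was truncated to fit the buffer */

    numchars = glk_buffer_canon_normalize_uni(buf, len, numchars);
    if (numchars > len)
        numchars = len;

    return numchars;
}
```

Dictionary words or other strings that the parser compares against would need to go through the same helper, so that both sides of the comparison are in the same form.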

If possible, the library should return fully composed Unicode characters, rather than strings of base and composition characters.

Fully-composed characters are the norm for Unicode text, so an implementation that ignores this issue will probably produce the right result. However, the game may not want to rely on that. Another factor is that case-folding can (occasionally) produce non-normalized text. Therefore, to cover all its bases, a game should call glk_buffer_to_lower_case_uni(), followed by glk_buffer_canon_normalize_uni(), before parsing.

Earlier versions of this spec said that line input must always be in Unicode Normalization Form C. However, this has not been universally implemented. It is also somewhat redundant, for the reasons noted above. Therefore, we now merely recommend that line input be fully composed. The game is ultimately responsible for all case-folding and normalization. See "Unicode String Normalization".