# Multitiers

Multitiers are a novel way of representing linguistic data for historical investigation and language comparison, mainly in terms of regular correspondences. They can be conceived as an extension of alignments that incorporates information beyond segments or sound classes, in a form to which most current machine learning methods can be applied directly or with little preparation, particularly methods that yield white-box results, such as decision trees.

As mentioned, multitiers start from aligned data. Consider the words "house" /haʊs/ in English, "huis" /ɦœy̯s/ in Dutch, and "hús" /huːs/ in Icelandic, all cognates descending from Proto-Germanic "\*hūsą". Their alignment is straightforward and can be done either manually or with recommended tools such as LingPy:

  | Language  | 1 | 2  | 3 | 4 |
  |-----------|---|----|---|---|
  | English   | h | a  | ʊ | s |
  | Dutch     | ɦ | œ  | y | s |
  | Icelandic | h | uː | - | s |
  
The rows labelled with language names are the essential and most important tiers, and for clarity we should state what information they carry (the segments). The indexes, on the other hand, constitute another kind of information: a positional tier, here counting left to right. We can turn this alignment into a more explicit multitier system by labelling the segment tiers as such and adding positional tiers that count both left to right and right to left.

  | Tier Name          |   |    |   |   |
  |--------------------|---|----|---|---|
  | Index              | 1 | 2  | 3 | 4 |
  | RIndex             | 4 | 3  | 2 | 1 |
  | Segments_English   | h | a  | ʊ | s |
  | Segments_Dutch     | ɦ | œ  | y | s |
  | Segments_Icelandic | h | uː | - | s |
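
As an illustration, the tiers above could be derived from the alignment with a few lines of plain Python. This is only a minimal sketch: the `alignment` dictionary and the `build_tiers` helper are hypothetical names introduced here and are not part of any library.

```python
# Aligned segments per language; "-" marks a gap in the alignment.
alignment = {
    "English": ["h", "a", "ʊ", "s"],
    "Dutch": ["ɦ", "œ", "y", "s"],
    "Icelandic": ["h", "uː", "-", "s"],
}


def build_tiers(alignment):
    """Return a dict mapping tier names to lists with one value per site."""
    lengths = {len(segments) for segments in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    size = lengths.pop()

    # Positional tiers: left-to-right and right-to-left, both 1-based.
    tiers = {
        "Index": list(range(1, size + 1)),
        "RIndex": list(range(size, 0, -1)),
    }
    # One segment tier per language.
    for language, segments in alignment.items():
        tiers[f"Segments_{language}"] = list(segments)
    return tiers


tiers = build_tiers(alignment)
for name, values in tiers.items():
    print(name, values)
```

Storing each tier as a list of equal length keeps the tiers aligned by position, so any new tier only has to supply one value per alignment site.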
  
Each tier is in fact a variable whose value a given observation (that is, an alignment site in a cognate set) can take. This is easier to see if we transpose the table, following common conventions of relational databases, which also lets us give a unique ID to each position (here, "P" and a number, to distinguish it from the index):

  | ID | Index | RIndex | Segment_ENG | Segment_DUT | Segment_ICE |
  |----|-------|--------|-------------|-------------|-------------|
  | P0 | 1     | 4      | h           | ɦ           | h           |
  | P1 | 2     | 3      | a           | œ           | uː          |
  | P2 | 3     | 2      | ʊ           | y           | -           |
  | P3 | 4     | 1      | s           | s           | s           |
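
The transposition can be sketched in Python as turning a dictionary of tiers into a list of records, one per alignment site. The `tiers_to_rows` helper and the `prefix` parameter are assumptions made for this example only.

```python
# Tiers stored column-wise: tier name -> one value per alignment site.
tiers = {
    "Index": [1, 2, 3, 4],
    "RIndex": [4, 3, 2, 1],
    "Segment_ENG": ["h", "a", "ʊ", "s"],
    "Segment_DUT": ["ɦ", "œ", "y", "s"],
    "Segment_ICE": ["h", "uː", "-", "s"],
}


def tiers_to_rows(tiers, prefix="P"):
    """Transpose tiers into one record (dict) per alignment site."""
    names = list(tiers)
    rows = []
    # zip(*...) walks all tiers in parallel, yielding one tuple per site.
    for number, values in enumerate(zip(*tiers.values())):
        row = {"ID": f"{prefix}{number}"}  # IDs start at P0, as in the table
        row.update(zip(names, values))
        rows.append(row)
    return rows


rows = tiers_to_rows(tiers)
for row in rows:
    print(row)
```

Each record now corresponds to one observation, which is the shape most tabular and machine learning tools expect.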
  
This layout makes it easy to add more tiers. We can, for example, incorporate information on the sound class of each site, for each language:

  | ID | Index | RIndex | Segment_ENG | SC_ENG | Segment_DUT | SC_DUT | Segment_ICE | SC_ICE |
  |----|-------|--------|-------------|--------|-------------|--------|-------------|--------|
  | P0 | 1     | 4      | h           | H      | ɦ           | H      | h           | H      |
  | P1 | 2     | 3      | a           | A      | œ           | U      | uː          | U      |
  | P2 | 3     | 2      | ʊ           | U      | y           | Y      | -           | -      |
  | P3 | 4     | 1      | s           | S      | s           | S      | s           | S      |
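
Adding such a tier amounts to deriving a new column from an existing one. The sketch below assumes a toy `SOUND_CLASSES` mapping whose values are simply copied from the table above; it is not a real sound class model, and `add_sound_classes` is a hypothetical helper.

```python
# Toy mapping covering only the segments in this example. The values are
# copied from the table above and do not represent a complete model.
SOUND_CLASSES = {
    "h": "H", "ɦ": "H", "a": "A", "œ": "U", "uː": "U",
    "ʊ": "U", "y": "Y", "s": "S", "-": "-",
}

rows = [
    {"ID": "P0", "Segment_ENG": "h", "Segment_DUT": "ɦ", "Segment_ICE": "h"},
    {"ID": "P1", "Segment_ENG": "a", "Segment_DUT": "œ", "Segment_ICE": "uː"},
    {"ID": "P2", "Segment_ENG": "ʊ", "Segment_DUT": "y", "Segment_ICE": "-"},
    {"ID": "P3", "Segment_ENG": "s", "Segment_DUT": "s", "Segment_ICE": "s"},
]


def add_sound_classes(rows, languages, mapping):
    """Return copies of the rows with an added SC_<language> tier each."""
    extended = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows stay untouched
        for language in languages:
            segment = row[f"Segment_{language}"]
            new_row[f"SC_{language}"] = mapping[segment]
        extended.append(new_row)
    return extended


rows_sc = add_sound_classes(rows, ["ENG", "DUT", "ICE"], SOUND_CLASSES)
for row in rows_sc:
    print(row)
```

In practice the mapping would come from whichever sound class model is in use; only the lookup step is shown here.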
  
This can be extended with essentially any information at hand. For example, we can add, for each language, the sound class one position before (to the left, L1) and one position after (to the right, R1) each alignment site.
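
A minimal sketch of such context tiers follows. Several details are assumptions, since the text does not fix them: the tier names `SC_<language>_L1` and `SC_<language>_R1`, the `#` symbol used at word boundaries, and the choice to take the literally adjacent alignment site even when it holds a gap.

```python
rows = [
    {"ID": "P0", "SC_ENG": "H", "SC_DUT": "H", "SC_ICE": "H"},
    {"ID": "P1", "SC_ENG": "A", "SC_DUT": "U", "SC_ICE": "U"},
    {"ID": "P2", "SC_ENG": "U", "SC_DUT": "Y", "SC_ICE": "-"},
    {"ID": "P3", "SC_ENG": "S", "SC_DUT": "S", "SC_ICE": "S"},
]


def add_context_tiers(rows, languages, boundary="#"):
    """Return copies of the rows with left (L1) and right (R1) context tiers.

    `boundary` is an assumed placeholder for positions outside the word.
    """
    extended = [dict(row) for row in rows]
    for language in languages:
        classes = [row[f"SC_{language}"] for row in rows]
        last = len(classes) - 1
        for i, row in enumerate(extended):
            # Sound class one site to the left, or the boundary symbol.
            row[f"SC_{language}_L1"] = classes[i - 1] if i > 0 else boundary
            # Sound class one site to the right, or the boundary symbol.
            row[f"SC_{language}_R1"] = classes[i + 1] if i < last else boundary
    return extended


rows_ctx = add_context_tiers(rows, ["ENG", "DUT", "ICE"])
for row in rows_ctx:
    print(row)
```

Whether gaps should be skipped when looking for a neighbour is a modelling decision; this sketch does not skip them, so the Icelandic L1 value at the last site is the gap symbol.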
  
