# Navigating an `AlignedTextGrid`

This documentation covers reading in the output from the Montreal Forced Aligner using the `Word` and `Phone` classes from `aligned_textgrid`, but everything will generalize to custom classes.

In [1]:
from aligned_textgrid.aligned_textgrid import AlignedTextGrid
from aligned_textgrid.sequences.word_and_phone import Word, Phone

## Reading in a TextGrid

To read in a [one-speaker TextGrid](../resources/josef-fruehwald_speaker.TextGrid), either give `AlignedTextGrid()` the path to the file, or a textgrid that has already been read in with `praatio.textgrid.openTextgrid()`.

You also need to specify the sequence classes of each tier in the order they appear. For MFA output, the top tier is `Word` and the bottom tier is `Phone`, but if these were reversed, you would have to pass `[Phone, Word]` to `entry_classes`. The information about which class is the superset and which is the subset is [encoded in the class information](../../02_Sequences/02_sequence_properties/#class-strictness), and is automatically handled.

In [2]:
one_speaker = AlignedTextGrid(
    textgrid_path = "../resources/josef-fruehwald_speaker.TextGrid", 
    entry_classes = [Word, Phone]
)

With a [TextGrid containing two or more speakers](../resources/KY25A_1.TextGrid), you can either pass `entry_classes` a single list of interval classes to reuse for each speaker (for example, `[Word, Phone]`), or an explicit nested list with one list of classes per speaker (for example, `[[Word, Phone], [Word, Phone]]`).


In [3]:
two_speaker = AlignedTextGrid(
    textgrid_path = "../resources/KY25A_1.TextGrid",
    entry_classes = [Word, Phone]
)
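
The relationship between the flat and nested forms of `entry_classes` can be pictured with a small sketch. This is only an illustration of the idea, not the library's actual implementation: `expand_entry_classes` is a hypothetical helper, and the `Word` and `Phone` classes here are empty stand-ins so the snippet runs without `aligned_textgrid` installed.

```python
# Empty stand-ins for illustration only; real code would import
# Word and Phone from aligned_textgrid.
class Word: ...
class Phone: ...

def expand_entry_classes(entry_classes, n_speakers):
    """Hypothetical helper: return one list of classes per speaker."""
    # Already nested: keep each speaker's list as given.
    if entry_classes and all(isinstance(c, (list, tuple)) for c in entry_classes):
        return [list(group) for group in entry_classes]
    # Flat list: reuse the same classes for every speaker.
    return [list(entry_classes) for _ in range(n_speakers)]

flat = expand_entry_classes([Word, Phone], n_speakers=2)
nested = expand_entry_classes([[Word, Phone], [Word, Phone]], n_speakers=2)
print(flat == nested)
```

Under this sketch, both ways of writing `entry_classes` describe the same per-speaker layout, which is why the flat form is a convenient shorthand when every speaker has the same tiers.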

## Navigating the `AlignedTextGrid` object

Every `AlignedTextGrid` object contains at least one `TierGroup`, which in turn contains at least one `SequenceTier`.

![aligned-textgrid](../assets/alignedtextgrid.svg)

This information is available if you print the object:

In [4]:
print(two_speaker)

AlignedTextGrid with 2 groups, each with [2, 2] tiers. [['Word', 'Phone'], ['Word', 'Phone']]


Or if you compare the `len()` of the one-speaker and two-speaker TextGrids.

In [5]:
print(len(one_speaker))
print(len(two_speaker))

1
2


To get the Word tier of the first speaker in `one_speaker`, we can index it with `[0][0]`.

In [6]:
one_speaker[0][0]

Sequence tier of Word; .superset_class: Top; .subset_class: Phone

If you'd prefer to write more verbose but explicit code, you can also access tiers via the `.tier_groups` and `.tier_list` attributes.

In [7]:
one_speaker.tier_groups[0].tier_list[0]

Sequence tier of Word; .superset_class: Top; .subset_class: Phone

To access the individual sequence intervals in a tier, you can also use indexing.

In [8]:
one_speaker[0][0][3]

Class Word, label: sunlight, .superset_class: Top, .super_instance, None, .subset_class: Phone, .subset_list: ['S', 'AH1', 'N', 'L', 'AY2', 'T']

Tiers are also iterable.

In [9]:
for i in range(5):
    print(one_speaker[0][0][i].label)


when
the
sunlight
strikes


Once you've gotten to a sequence interval, indexing goes into its [`.subset_list`](../../02_sequences/02_sequence_structure/#moving-downwards).

The `len()` of a tier returns how many sequence intervals it contains.

In [10]:
[len(one_speaker[0][0]), len(one_speaker[0][1])]

[377, 1191]
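
The indexing, iteration, and `len()` behaviour described in this section can be mimicked with a toy container. The sketch below is an assumption-laden illustration rather than how `aligned_textgrid` is written: `ToyTier` and `ToyGroup` are hypothetical classes, and the labels are made-up data loosely modelled on the output above.

```python
class ToyTier:
    """Hypothetical stand-in for a tier: an ordered container of labels."""
    def __init__(self, labels):
        self.labels = list(labels)

    def __getitem__(self, idx):
        return self.labels[idx]

    def __len__(self):
        return len(self.labels)

    def __iter__(self):
        return iter(self.labels)


class ToyGroup:
    """Hypothetical stand-in for a tier group: an ordered container of tiers."""
    def __init__(self, tiers):
        self.tier_list = list(tiers)

    def __getitem__(self, idx):
        return self.tier_list[idx]

    def __len__(self):
        return len(self.tier_list)


# Illustrative labels only; the empty string stands for a silent interval.
words = ToyTier(["", "when", "the", "sunlight", "strikes"])
phones = ToyTier(["", "W", "EH1", "N"])
group = ToyGroup([words, phones])

print(group[0][3])                  # index into the group, then the tier
print(len(group), len(group[0]))    # number of tiers, number of intervals
non_silent = [label for label in group[0] if label]  # iterate directly
print(non_silent)
```

Because the toy tier defines `__iter__`, it can be looped over directly instead of through `range()`, which is the same convenience the text describes for real tiers.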

## Get interval at time

The "Get interval at time" functionality from Praat has been implemented for each level of TextGrid representation.

In [11]:
speaker_one = two_speaker[0]
speaker_one_word = speaker_one[0]

In [12]:
speaker_one_word.get_interval_at_time(11)

1

This is the index for the word that appears at 11 seconds.
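
As a rough picture of what a time lookup like this involves, the sketch below finds the index of the interval containing a given time with a binary search over sorted start times. It is a minimal illustration under stated assumptions, not the library's method: `interval_index_at_time` is a hypothetical function, the boundaries are invented, and intervals are treated as half-open `[start, end)`.

```python
from bisect import bisect_right

def interval_index_at_time(starts, ends, time):
    """Hypothetical lookup: index of the interval containing `time`, else None.

    Assumes `starts` is sorted and intervals are half-open: [start, end).
    """
    i = bisect_right(starts, time) - 1  # last interval starting at or before `time`
    if i >= 0 and time < ends[i]:
        return i
    return None

# Invented boundaries (in seconds) for three consecutive intervals.
starts = [0.0, 10.5, 11.2]
ends = [10.5, 11.2, 12.0]

idx = interval_index_at_time(starts, ends, 11)
print(idx)
```

Returning an index rather than the interval itself mirrors the behaviour shown above, where the result is a position in the tier that can then be used for indexing.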

In [13]:
speaker_one.get_intervals_at_time(11)

[1, 2]

These are the indices of the intervals on the word and phone tiers at 11 seconds.

In [14]:
two_speaker.get_intervals_at_time(11)

[[1, 2], [39, 96]]

These are the indices of the intervals on the word and phone tiers for both speakers at 11 seconds.

### Nested indexing

You can use the nested indices returned by `.get_intervals_at_time()` to get the actual sequence intervals as well.

In [16]:
eleven_seconds = two_speaker.get_intervals_at_time(11)
two_speaker[eleven_seconds]

[[Class Word, label: yeah, .superset_class: Top, .super_instance, None, .subset_class: Phone, .subset_list: ['Y', 'AE1'],
  Class Phone, label: AE1, .superset_class: Word, .super_instance: yeah, .subset_class: Bottom],
 [Class Word, label: after, .superset_class: Top, .super_instance, None, .subset_class: Phone, .subset_list: ['AE1', 'F', 'T', 'ER0'],
  Class Phone, label: F, .superset_class: Word, .super_instance: after, .subset_class: Bottom]]
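
The idea behind nested indexing can be sketched with plain lists: each inner list of indices is paired with one group, and each index within it is paired with one tier of that group. This is an illustrative sketch only; `resolve_nested` is a hypothetical function and the nested lists hold made-up labels, not real `Word` or `Phone` objects.

```python
def resolve_nested(groups, nested_indices):
    """Hypothetical helper: pick one item per tier, following nested indices."""
    return [
        [tier[i] for tier, i in zip(group, indices)]
        for group, indices in zip(groups, nested_indices)
    ]

# Made-up data: two groups, each with a word tier and a phone tier.
toy_groups = [
    [["", "yeah", "so"], ["", "Y", "AE1", "S", "OW1"]],
    [["a", "b"], ["x", "y", "z"]],
]

picked = resolve_nested(toy_groups, [[1, 2], [0, 1]])
print(picked)
```

The result has the same shape as the index list that produced it, which matches the shape of the output shown above: one inner list per speaker, one item per tier.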