<a href="https://colab.research.google.com/github/annashell/python_mag/blob/main/mag_python_advanced_praat.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<h2>Working with Praat annotation files</h2>

<p>The TextGridTools library handles Praat annotation files (the .TextGrid format). It can be installed from the command line with pip:</p>

In [None]:
!pip install tgt



In [None]:
import tgt

TextGridTools documentation:

https://textgridtools.readthedocs.io/en/stable/api.html

<h4>Reading an annotation file</h4>

In [None]:
# The /blob/ page URL returns GitHub's HTML viewer rather than the audio, so both files are fetched from raw.githubusercontent.com
!wget https://raw.githubusercontent.com/annashell/python_mag/refs/heads/main/txt_files/av18s.wav
!wget https://raw.githubusercontent.com/annashell/python_mag/refs/heads/main/txt_files/av18s.TextGrid

--2025-09-10 20:01:29--  https://raw.githubusercontent.com/annashell/python_mag/refs/heads/main/txt_files/av18s.TextGrid
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 59614 (58K) [text/plain]
Saving to: ‘av18s.TextGrid.1’


2025-09-10 20:01:29 (4.13 MB/s) - ‘av18s.TextGrid.1’ saved [59614/59614]



In [None]:
grid = tgt.io.read_textgrid("av18s.TextGrid")

In [None]:
print(grid)

<tgt.core.TextGrid object at 0x7b343d685430>


Let's see which public attributes and methods the resulting object has:

In [None]:
[i for i in dir(grid) if not i.startswith("_")]

['add_tier',
 'add_tiers',
 'delete_tier',
 'delete_tiers',
 'end_time',
 'extract_part',
 'filename',
 'get_tier_by_name',
 'get_tier_names',
 'get_tiers_by_name',
 'has_tier',
 'insert_tier',
 'start_time',
 'tiers']

Tier names:

In [None]:
grid.get_tier_names()

['acoustic',
 'perception',
 'ideal',
 'syllables',
 'stress',
 'syntagma',
 'melody',
 'hesitation']

All tiers at once:

In [None]:
grid.tiers

[IntervalTier(start_time=0.0, end_time=9.03071875, name="acoustic", objects=[Interval(0.0, 0.08803125, "d"), Interval(0.08803125, 0.299656249999999, "a"), Interval(0.299656249999999, 0.6634375, "pause"), Interval(0.6634375, 0.770875, "d"), Interval(0.770875, 0.87184375, "a"), Interval(0.87184375, 0.94334375, "e"), Interval(0.94334375, 1.01834375, "t"), Interval(1.01834375, 1.04153125, "\sw"), Interval(1.04153125, 1.122749999999999, "b"), Interval(1.122749999999999, 1.153, "\hs"), Interval(1.153, 1.198874999999999, "\l~"), Interval(1.198874999999999, 1.23853125, "\at"), Interval(1.23853125, 1.287281249999999, "f"), Interval(1.287281249999999, 1.37375, "s'"), Interval(1.37375, 1.41565625, "\oe"), Interval(1.41565625, 1.462968749999999, "v"), Interval(1.462968749999999, 1.53065625, "\vt"), Interval(1.53065625, 1.579312499999999, "d"), Interval(1.579312499999999, 1.61796875, "n"), Interval(1.61796875, 1.65721875, "\o~"), Interval(1.65721875, 1.71178125, "m"), Interval(1.71178125, 1.7938125

Tiers one by one:

In [None]:
for tier in grid:
    print(tier)

IntervalTier(start_time=0.0, end_time=9.03071875, name="acoustic", objects=[Interval(0.0, 0.08803125, "d"), Interval(0.08803125, 0.299656249999999, "a"), Interval(0.299656249999999, 0.6634375, "pause"), Interval(0.6634375, 0.770875, "d"), Interval(0.770875, 0.87184375, "a"), Interval(0.87184375, 0.94334375, "e"), Interval(0.94334375, 1.01834375, "t"), Interval(1.01834375, 1.04153125, "\sw"), Interval(1.04153125, 1.122749999999999, "b"), Interval(1.122749999999999, 1.153, "\hs"), Interval(1.153, 1.198874999999999, "\l~"), Interval(1.198874999999999, 1.23853125, "\at"), Interval(1.23853125, 1.287281249999999, "f"), Interval(1.287281249999999, 1.37375, "s'"), Interval(1.37375, 1.41565625, "\oe"), Interval(1.41565625, 1.462968749999999, "v"), Interval(1.462968749999999, 1.53065625, "\vt"), Interval(1.53065625, 1.579312499999999, "d"), Interval(1.579312499999999, 1.61796875, "n"), Interval(1.61796875, 1.65721875, "\o~"), Interval(1.65721875, 1.71178125, "m"), Interval(1.71178125, 1.7938125,

A single tier by name:

In [None]:
grid.get_tier_by_name("ideal")

IntervalTier(start_time=0.0, end_time=9.03071875, name="ideal", objects=[Interval(0.0, 0.08803125, "d"), Interval(0.08803125, 0.299656249999999, "\as"), Interval(0.299656249999999, 0.6634375, "pause"), Interval(0.6634375, 0.770875, "d"), Interval(0.770875, 0.87184375, "\as"), Interval(0.87184375, 0.94334375, "e"), Interval(0.94334375, 1.01834375, "t"), Interval(1.01834375, 1.04153125, "\as"), Interval(1.04153125, 1.122749999999999, "b"), Interval(1.122749999999999, 1.153, "\i-"), Interval(1.153, 1.198874999999999, "\l~"), Interval(1.198874999999999, 1.23853125, "\as"), Interval(1.23853125, 1.287281249999999, "f"), Interval(1.287281249999999, 1.37375, "s'"), Interval(1.37375, 1.41565625, "o"), Interval(1.41565625, 1.462968749999999, "v"), Interval(1.462968749999999, 1.53065625, "\as"), Interval(1.53065625, 1.579312499999999, "d"), Interval(1.579312499999999, 1.61796875, "n"), Interval(1.61796875, 1.65721875, "o"), Interval(1.65721875, 1.71178125, "m"), Interval(1.71178125, 1.7938125, "z

In [None]:
ideal_tier = grid.get_tier_by_name("ideal")
for interval in ideal_tier:
    print(interval)

Interval(0.0, 0.08803125, "d")
Interval(0.08803125, 0.299656249999999, "\as")
Interval(0.299656249999999, 0.6634375, "pause")
Interval(0.6634375, 0.770875, "d")
Interval(0.770875, 0.87184375, "\as")
Interval(0.87184375, 0.94334375, "e")
Interval(0.94334375, 1.01834375, "t")
Interval(1.01834375, 1.04153125, "\as")
Interval(1.04153125, 1.122749999999999, "b")
Interval(1.122749999999999, 1.153, "\i-")
Interval(1.153, 1.198874999999999, "\l~")
Interval(1.198874999999999, 1.23853125, "\as")
Interval(1.23853125, 1.287281249999999, "f")
Interval(1.287281249999999, 1.37375, "s'")
Interval(1.37375, 1.41565625, "o")
Interval(1.41565625, 1.462968749999999, "v")
Interval(1.462968749999999, 1.53065625, "\as")
Interval(1.53065625, 1.579312499999999, "d")
Interval(1.579312499999999, 1.61796875, "n")
Interval(1.61796875, 1.65721875, "o")
Interval(1.65721875, 1.71178125, "m")
Interval(1.71178125, 1.7938125, "z")
Interval(1.7938125, 1.85753125, "d")
Interval(1.85753125, 1.966468749999999, "\as")
Interva

In [None]:
print(ideal_tier.name)
print(ideal_tier.start_time)
print(ideal_tier.end_time)

ideal
0.0
9.03071875


In [None]:
print(ideal_tier[3].start_time)
print(ideal_tier[3].end_time)
print(ideal_tier[3].text)

0.6634375
0.770875
d


Exercise 1.

Take the ideal tier and print each interval's text and start time as a pair, one pair per line.
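
One possible approach to this exercise, as a minimal sketch: iterate over the tier and format each interval's `text` and `start_time` into one line. The `Interval` named tuple and the `text_start_pairs` helper are hypothetical names introduced here, not part of tgt. The stand-in class only lets the snippet run without a TextGrid file. With real data, `ideal_tier` from the cells above could be passed instead of `sample`.

```python
from typing import Iterable, List, NamedTuple


class Interval(NamedTuple):
    """Hypothetical stand-in exposing the same attributes as a tgt interval."""
    start_time: float
    end_time: float
    text: str


def text_start_pairs(intervals: Iterable) -> List[str]:
    """Return one 'text<TAB>start_time' line per interval."""
    return [f"{interval.text}\t{interval.start_time}" for interval in intervals]


# First three intervals of the "ideal" tier, copied from the output above.
sample = [
    Interval(0.0, 0.08803125, "d"),
    Interval(0.08803125, 0.299656249999999, "\\as"),
    Interval(0.299656249999999, 0.6634375, "pause"),
]

for line in text_start_pairs(sample):
    print(line)
```

The helper relies only on the `text` and `start_time` attributes, so any iterable of objects that expose them should work.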


<h4>Creating a new tier</h4>

In [None]:
grid = tgt.core.TextGrid()

new_tier = tgt.core.IntervalTier(name="my interval tier")
grid.add_tier(new_tier)

In [None]:
new_tier.add_interval(tgt.core.Interval(0, 0.5, "new interval"))
new_tier

IntervalTier(start_time=0.0, end_time=0.5, name="my interval tier", objects=[Interval(0.0, 0.5, "new interval")])

In [None]:
new_point_tier = tgt.core.PointTier(name="new point tier")
grid.add_tier(new_point_tier)
new_point_tier.add_point(tgt.core.Point(1.5, "new point"))
new_point_tier

PointTier(start_time=0.0, end_time=1.5, name="new point tier", objects=[Point(1.5, "new point")])

In [None]:
tgt.io.write_to_file(grid, "new_grid_short.TextGrid", format="short")
tgt.io.write_to_file(grid, "new_grid_long.TextGrid", format="long")

Task 1:

For a chosen tier (for example, "ideal"), compute and print:

* The total duration of all labeled intervals (skipping empty ones, i.e. those where text == "").

* The mean duration of the intervals in this tier.

* The shortest and the longest interval (print their text and duration).

In [None]:
# Read the file downloaded above from the working directory
grid = tgt.io.read_textgrid("av18s.TextGrid")

ideal_tier = grid.get_tier_by_name("ideal")


def duration(interval):
    """Length of an interval in seconds."""
    return interval.end_time - interval.start_time


# Skip empty intervals (as the task asks) and pauses, which are not speech sounds
labeled = [inter_ for inter_ in ideal_tier.intervals if inter_.text not in ("", "pause")]
durations = [duration(inter_) for inter_ in labeled]

print("Total duration:", sum(durations))
print("Avg duration:", sum(durations) / len(durations))

max_duration_interval = max(labeled, key=duration)
min_duration_interval = min(labeled, key=duration)

print("Longest interval:", max_duration_interval.text, duration(max_duration_interval))
print("Shortest interval:", min_duration_interval.text, duration(min_duration_interval))



Task 2:

In the "ideal" tier, find and print all intervals labeled "d" (or any other sound of your choice).

For each interval found, print its start time, end time, and duration.

In [None]:
d_intervals = [interval for interval in ideal_tier.intervals if interval.text == 'd']

for interval in d_intervals:
    print(interval.start_time, interval.end_time, interval.end_time - interval.start_time)
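
The same filter can be extended to every label at once by grouping durations by text. The following is a sketch under assumptions: `Interval` is a hypothetical stand-in for tgt intervals, `durations_by_label` is a helper name invented for this example, and the sample values are made up rather than taken from av18s.TextGrid.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Interval(NamedTuple):
    """Hypothetical stand-in exposing the same attributes as a tgt interval."""
    start_time: float
    end_time: float
    text: str


def durations_by_label(intervals: Iterable) -> Dict[str, List[float]]:
    """Map each label to the list of durations of its intervals."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for interval in intervals:
        grouped[interval.text].append(interval.end_time - interval.start_time)
    return dict(grouped)


# Made-up values for illustration only, not taken from av18s.TextGrid.
sample = [
    Interval(0.0, 0.25, "d"),
    Interval(0.25, 0.5, "a"),
    Interval(0.5, 1.0, "d"),
]

# Print label, number of occurrences, and mean duration.
for label, durations in durations_by_label(sample).items():
    mean = sum(durations) / len(durations)
    print(label, len(durations), mean)
```

With real data, `ideal_tier.intervals` could be passed in place of `sample`, since the helper only reads `start_time`, `end_time`, and `text`.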

Task 3:

Load av18s.TextGrid.

In the "ideal" tier, find all intervals labeled "pause" and change their label to "пауза".

Create a new file modified_av18s.TextGrid and save the modified version to it.

In [None]:
# Rename every "pause" label in place
for interval in ideal_tier.intervals:
    if interval.text == 'pause':
        interval.text = "пауза"

# Print all tiers to check the result
for tier in grid.tiers:
    print(tier)

# Save under the file name the task asks for
tgt.io.write_to_file(grid, "modified_av18s.TextGrid", format="long")
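
When several labels need renaming, the loop can be wrapped in a small helper that takes a mapping and reports how many labels it changed, which gives a quick sanity check before saving. This is only a sketch: `relabel` is a hypothetical helper, and the mutable `Interval` dataclass is a stand-in for tgt intervals so that the snippet runs without a file.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Interval:
    """Hypothetical mutable stand-in with the same attributes as a tgt interval."""
    start_time: float
    end_time: float
    text: str


def relabel(intervals: Iterable, mapping: Dict[str, str]) -> int:
    """Rename labels in place according to mapping; return how many changed."""
    changed = 0
    for interval in intervals:
        if interval.text in mapping:
            interval.text = mapping[interval.text]
            changed += 1
    return changed


# Made-up values for illustration only.
sample = [
    Interval(0.0, 0.5, "pause"),
    Interval(0.5, 0.75, "d"),
    Interval(0.75, 1.0, "pause"),
]

print(relabel(sample, {"pause": "пауза"}))
print([interval.text for interval in sample])
```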