<h2>Parselmouth</h2>

In [None]:
!pip install praat-parselmouth

In [None]:
!wget https://raw.githubusercontent.com/phonetics-spbu/phonetics-spbu.github.io/main/public/courses/linear_models/files/cta0001.wav

In [None]:
import parselmouth
sound = parselmouth.Sound("cta0001.wav")

Public attributes and methods of a parselmouth Sound object:

In [None]:
[i for i in dir(sound) if not i.startswith("_")]

In [None]:
t_max = sound.tmax
t_max

In [None]:
ch_num = sound.get_number_of_channels()
ch_num

In [None]:
intensity = sound.to_intensity()
print(intensity.get_value(0.1))

In [None]:
intensity.values

In [None]:
import matplotlib.pyplot as plt
plt.plot(sound.values.reshape(-1))
plt.xlabel("Samples")
plt.ylabel("Normalized amplitude")
plt.show()

In [None]:
pitch = sound.to_pitch(pitch_floor=75, pitch_ceiling=400)

In [None]:
pitch.get_value_at_time(1)

In [None]:
pitch.get_value_at_time(0.5)

Exercise 1. Plot F0 (fundamental frequency) against time with a step of 0.05 s.
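As a starting point for the exercise, here is a minimal sketch of the time grid and of dropping unvoiced frames. It assumes that pitch queries return NaN where no F0 is defined, which is typically the case; `time_grid` and the F0 values are hypothetical and only illustrate the idea.

```python
import math

def time_grid(t_max, step=0.05):
    """Return the times 0, step, 2*step, ... that do not exceed t_max."""
    n_steps = int(t_max / step + 1e-9)  # small tolerance against float rounding
    return [round(i * step, 10) for i in range(n_steps + 1)]

times = time_grid(0.2)

# Hypothetical F0 values for these times; NaN marks an unvoiced frame.
f0 = [float("nan"), 180.0, 185.5, float("nan"), 190.0]

# Keep only the frames where F0 is defined before plotting.
voiced = [(t, f) for t, f in zip(times, f0) if not math.isnan(f)]
print(voiced)
```

In the notebook, the hypothetical `f0` list would be replaced by values queried from `pitch` at each time in the grid.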

In [None]:
spectrogram = sound.to_spectrogram()

In [None]:
spectrogram.get_power_at(0.24, 100) # t, f

In [None]:
slice_1 = spectrogram.to_spectrum_slice(0.24)
num_bins = slice_1.get_number_of_bins()
num_bins

In [None]:
import numpy as np

freqs = [slice_1.get_frequency_from_bin_number(i) for i in range(1, num_bins + 1)]
vals = [np.log10(abs(slice_1.get_value_in_bin(i))) for i in range(1, num_bins + 1)]
plt.plot(freqs, vals)
plt.show()

In [None]:
slice_2 = spectrogram.to_spectrum_slice(0.6)
num_bins = slice_2.get_number_of_bins()

freqs = [slice_2.get_frequency_from_bin_number(i) for i in range(1, num_bins + 1)]
vals = [np.log10(abs(slice_2.get_value_in_bin(i))) for i in range(1, num_bins + 1)]
plt.plot(freqs, vals)
plt.show()

In [None]:
formants = sound.to_formant_burg()

In [None]:
time = 0.24
for f in range(1, 5):
    print(f"F{f}: {formants.get_value_at_time(f, time)}")

In [None]:
time = 0.6
for f in range(1, 5):
    print(f"F{f}: {formants.get_value_at_time(f, time)}")

<h3>Reading a TextGrid</h3>

In [None]:
!wget https://raw.githubusercontent.com/phonetics-spbu/phonetics-spbu.github.io/refs/heads/main/public/courses/linear_models/files/av18s.TextGrid
tg = parselmouth.read("av18s.TextGrid")
tg

In [None]:
[i for i in dir(tg) if not i.startswith("_")]

Convert to the TextGridTools format (the tgt library):

In [None]:
!pip install tgt
import tgt

In [None]:
tgt_grid = tg.to_tgt()
print(" ".join([i.text for i in tgt_grid.get_tier_by_name("ideal")]))

In [None]:
parselmouth.praat.call(tg, "Convert to Unicode")
tgt_grid_uni = tg.to_tgt()
print(" ".join([i.text for i in tgt_grid_uni.get_tier_by_name("ideal")]))

Saving the file:

In [None]:
tg.save("av18s_unicode.TextGrid")

In the Praat interface, commands that take arguments end in "...". When such a command is called through parselmouth.praat.call, the "..." is omitted and the arguments are passed in the order in which they appear in the Praat dialog:

In [None]:
start_time, end_time = 0.0, 3.0  # s
pres_times = False
new_tg = parselmouth.praat.call(tg, "Extract part", start_time, end_time, pres_times)
tgt_grid_part = new_tg.to_tgt()
print(" ".join([i.text for i in tgt_grid_part.get_tier_by_name("ideal")]))
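The naming rule for commands with arguments can be sketched as a small helper. `praat_command_name` is a hypothetical function, not part of parselmouth; it only shows how a menu label maps to the string passed to `call`.

```python
def praat_command_name(menu_label):
    """Drop the trailing "..." that Praat shows on commands that open a dialog."""
    label = menu_label.strip()
    if label.endswith("..."):
        label = label[:-3].rstrip()
    return label

print(praat_command_name("Extract part..."))     # label of a command with arguments
print(praat_command_name("Convert to Unicode"))  # label of a command without arguments
```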

Exercise 2. Duplicate the "ideal" tier into the last position (Duplicate tier; arguments: tier number, position of the new tier, name of the new tier). In the new tier, replace "pause" with "пауза" (Set interval text; arguments: tier number, interval number, new text). Save the result to a new TextGrid file.
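Praat numbers tiers and intervals from 1, so the interval numbers passed to Set interval text are 1-based. A minimal sketch of finding the intervals to change, using a hypothetical list of interval labels rather than a real tier:

```python
# Hypothetical interval texts of one tier, in order.
labels = ["pause", "hello", "pause", "world"]

# enumerate(..., start=1) yields Praat-style 1-based interval numbers.
pause_numbers = [n for n, text in enumerate(labels, start=1) if text == "pause"]
print(pause_numbers)
```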

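When a .sbl file is read as raw 16-bit audio, the sampling frequency defaults to 16 kHz, so it is overridden below with the rate used for this recording (22050 Hz):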

In [None]:
!wget https://raw.githubusercontent.com/phonetics-spbu/phonetics-spbu.github.io/refs/heads/main/public/courses/linear_models/files/cta0001.sbl

In [None]:
sound_sbl = parselmouth.praat.call("Read Sound from raw 16-bit Little Endian file", "cta0001.sbl")

In [None]:
from IPython.display import Audio
sound_sbl.override_sampling_frequency(22050)
Audio(sound_sbl.values, rate=22050)
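The sketch below illustrates why the sampling frequency has to be supplied separately. It assumes that a .sbl file is headerless 16-bit little-endian PCM, as the Praat command above suggests; the helper names and the sample values are hypothetical.

```python
import struct

def decode_pcm16_le(raw_bytes):
    """Decode headerless 16-bit little-endian PCM into floats in [-1, 1)."""
    n_samples = len(raw_bytes) // 2
    ints = struct.unpack("<%dh" % n_samples, raw_bytes[: n_samples * 2])
    return [v / 32768.0 for v in ints]

def duration(n_samples, sampling_frequency):
    """Duration in seconds implied by a sample count and a sampling frequency."""
    return n_samples / sampling_frequency

# Four hypothetical samples packed the way a raw file would store them.
raw = struct.pack("<4h", 0, 16384, -32768, 32767)
samples = decode_pcm16_le(raw)
print(samples)

# The same 22050 samples last longer if they are assumed to be at 16 kHz.
print(duration(22050, 16000), duration(22050, 22050))
```

Because the raw data carries no header, nothing in the file records the rate; a wrong assumption changes both the duration and the perceived pitch on playback.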

Manipulation: first a Manipulation object is created and its PitchTier is extracted. That tier is then modified as needed. The Replace pitch tier command swaps the Manipulation's PitchTier for the modified one, after which Get resynthesis (overlap-add) produces the new sound.

In [None]:
step, min_f0, max_f0 = 0.01, 75, 600
manipulation = parselmouth.praat.call(sound_sbl, "To Manipulation", step, min_f0, max_f0)

In [None]:
pitch_tier = parselmouth.praat.call(manipulation, "Extract pitch tier")
freq_shift = -60
parselmouth.praat.call(pitch_tier, "Shift frequencies", sound_sbl.xmin, sound_sbl.xmax, freq_shift, "Hertz") # time range, shift, unit
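What the shift does to the tier can be sketched in plain Python, with the tier represented as a list of (time, F0) pairs. `shift_frequencies` and the point values are hypothetical; unlike the Praat command, which changes the tier in place, this sketch returns a new list.

```python
def shift_frequencies(points, t_start, t_end, shift_hz):
    """Shift F0 by shift_hz for points whose time lies in [t_start, t_end]."""
    return [
        (t, f0 + shift_hz) if t_start <= t <= t_end else (t, f0)
        for t, f0 in points
    ]

# Hypothetical pitch points: (time in s, F0 in Hz).
points = [(0.1, 200.0), (0.5, 220.0), (0.9, 180.0)]
shifted = shift_frequencies(points, 0.0, 0.6, -60)
print(shifted)
```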

When an action in Praat is performed with several objects selected, those objects are passed to call as a list:

In [None]:
parselmouth.praat.call([manipulation, pitch_tier], "Replace pitch tier")

In [None]:
new_sound_sbl = parselmouth.praat.call(manipulation, "Get resynthesis (overlap-add)")
new_sound_sbl.save("cta0001_mod.wav", "WAV")

In [None]:
from IPython.display import Audio
Audio(new_sound_sbl.values, rate=new_sound_sbl.sampling_frequency)

Information about tiers can also be retrieved with call:

In [None]:
num_points = parselmouth.praat.call(pitch_tier, "Get number of points")
num_points

In [None]:
print(parselmouth.praat.call(pitch_tier, "Get value at index", 1))
print(parselmouth.praat.call(pitch_tier, "Get time from index", 1))

Exercise 3. Using the commands above, plot the F0 contour for cta0001.


In [None]:
import matplotlib.pyplot as plt

time_values = []
f0_values = []

step, min_f0, max_f0 = 0.01, 75, 350
manipulation = parselmouth.praat.call(sound_sbl, "To Manipulation", step, min_f0, max_f0)
pitch_tier = parselmouth.praat.call(manipulation, "Extract pitch tier")
num_points = parselmouth.praat.call(pitch_tier, "Get number of points")

# your code


An empty PitchTier:

In [None]:
start_time = sound_sbl.xmin
end_time = sound_sbl.xmax

new_pitch_tier = parselmouth.praat.call("Create PitchTier", "new_pitch_tier", start_time, end_time)

In [None]:
parselmouth.praat.call(new_pitch_tier, "Add point", end_time * 0.25, 300)
parselmouth.praat.call(new_pitch_tier, "Add point", end_time * 0.75, 150)

In [None]:
parselmouth.praat.call([manipulation, new_pitch_tier], "Replace pitch tier")
simple_sound = parselmouth.praat.call(manipulation, "Get resynthesis (overlap-add)")
Audio(simple_sound.values, rate=simple_sound.sampling_frequency)
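With only two points in the tier, the resynthesized contour is determined by how values between and around the points are filled in. The sketch below assumes linear interpolation between points and constant values outside them, which is how a PitchTier is usually described; `pitch_tier_value` and the point times are hypothetical.

```python
def pitch_tier_value(points, t):
    """F0 at time t: linear between points, constant before the first and after the last."""
    points = sorted(points)
    if t <= points[0][0]:
        return points[0][1]
    if t >= points[-1][0]:
        return points[-1][1]
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if t0 <= t <= t1:
            return f0 + (f1 - f0) * (t - t0) / (t1 - t0)

# Two hypothetical points, analogous to the 300 Hz and 150 Hz points above.
two_points = [(0.5, 300.0), (1.5, 150.0)]
print(pitch_tier_value(two_points, 0.0))  # before the first point
print(pitch_tier_value(two_points, 1.0))  # halfway between the points
print(pitch_tier_value(two_points, 2.0))  # after the last point
```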

In [None]:
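# Count the points of new_pitch_tier itself; num_points refers to the original pitch_tier
num_new_points = int(parselmouth.praat.call(new_pitch_tier, "Get number of points"))

x = []
y = []
for i in range(1, num_new_points + 1):
    x.append(parselmouth.praat.call(new_pitch_tier, "Get time from index", i))
    y.append(parselmouth.praat.call(new_pitch_tier, "Get value at index", i))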

plt.plot(x, y, linestyle="-", marker="o")
plt.xlabel("Time, s")
plt.ylabel("F0, Hz")
plt.show()

Transplant an F0 contour with random values:

In [None]:
new_pitch_tier_random = parselmouth.praat.call("Create PitchTier", "new_pitch_tier", start_time, end_time)

In [None]:
# f0_values is filled in Exercise 3 and must not be empty here
min_f0 = min(f0_values)
max_f0 = max(f0_values)

min_f0, max_f0

In [None]:
import random

for i in range(1, num_points + 1):
    time = parselmouth.praat.call(pitch_tier, "Get time from index", i)
    parselmouth.praat.call(new_pitch_tier_random, "Add point", time, random.uniform(min_f0, max_f0))
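If the random contour needs to be reproducible between runs, one option is a seeded generator. This is a sketch, not part of the original notebook; `random_contour`, the times, and the F0 range are hypothetical.

```python
import random

def random_contour(times, f0_min, f0_max, seed=0):
    """Pair each time with a uniformly random F0 drawn from a seeded generator."""
    rng = random.Random(seed)  # local generator, so the global random state is untouched
    return [(t, rng.uniform(f0_min, f0_max)) for t in times]

# Hypothetical point times and F0 range.
times = [0.1, 0.2, 0.3, 0.4]
contour = random_contour(times, 75.0, 350.0, seed=42)
print(contour)
```

Calling the function again with the same seed returns the same contour, which makes the resynthesized sound repeatable.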

In [None]:
parselmouth.praat.call([manipulation, new_pitch_tier_random], "Replace pitch tier")
simple_sound = parselmouth.praat.call(manipulation, "Get resynthesis (overlap-add)")
Audio(simple_sound.values, rate=simple_sound.sampling_frequency)

In [None]:
time_values = []
f0_values = []

step, min_f0, max_f0 = 0.01, 75, 350
manipulation = parselmouth.praat.call(sound_sbl, "To Manipulation", step, min_f0, max_f0)
pitch_tier = parselmouth.praat.call(manipulation, "Extract pitch tier")
num_points = parselmouth.praat.call(new_pitch_tier_random, "Get number of points")

# your code
for i in range(1, num_points + 1):
    time_values.append(parselmouth.praat.call(new_pitch_tier_random, "Get time from index", i))
    f0_values.append(parselmouth.praat.call(new_pitch_tier_random, "Get value at index", i))

plt.plot(time_values, f0_values)
plt.xlabel("Time, s")
plt.ylabel("F0, Hz")
plt.show()