# Praat scripting with Parselmouth

The [Parselmouth library](https://github.com/YannickJadoul/Parselmouth) provides a Python interface to Praat objects and algorithms. In essence, it is a useful way to write Praat scripts without leaving Python.

Parselmouth is preinstalled on the BPM. To use it, import the library.

In [1]:
import parselmouth

## Load a Sound object

To load an audio file, instantiate a `Sound` object:

In [2]:
snd = parselmouth.Sound('../resources/two_plus_two_22.wav')
snd

<parselmouth.Sound at 0x7fbb39aaa7d8>

The commands normally found in the Praat Sound menu are mapped to Python methods and attributes of the `Sound` object. The next cell lists its public names.

In [3]:
print('\n'.join([name for name in dir(snd) if not name.startswith('_')]))

FileFormat
ToHarmonicityMethod
ToPitchMethod
add
as_array
at_xy
autocorrelate
centre_time
class_name
combine_to_stereo
concatenate
convert_to_mono
convert_to_stereo
convolve
copy
cross_correlate
de_emphasize
deepen_band_modulation
divide
dt
duration
dx
dy
end_time
extract_all_channels
extract_channel
extract_left_channel
extract_part
extract_part_for_overlap
extract_right_channel
formula
frame_number_to_time
full_name
get_column_distance
get_end_time
get_energy
get_energy_in_air
get_frame_number_from_time
get_highest_x
get_highest_y
get_index_from_time
get_intensity
get_lowest_x
get_lowest_y
get_maximum
get_minimum
get_nearest_zero_crossing
get_number_of_channels
get_number_of_columns
get_number_of_frames
get_number_of_rows
get_number_of_samples
get_power
get_power_in_air
get_rms
get_root_mean_square
get_row_distance
get_sampling_frequency
get_sampling_period
get_start_time
get_sum
get_time_from_frame_number
get_time_from_index
get_time_step
get_total_duration
get_value
get_value_at_xy
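The listing above mixes callable methods (such as `get_rms`) with plain attributes (such as `duration`). As a minimal sketch of how the two could be separated, the helper below splits an object's public names by whether they are callable. `split_public_names` and `DemoSound` are hypothetical names introduced only for this illustration; `DemoSound` is a stand-in so the example runs without an audio file.

```python
def split_public_names(obj):
    """Split an object's public names into (methods, attributes)."""
    methods, attributes = [], []
    for name in dir(obj):
        if name.startswith('_'):
            continue
        try:
            value = getattr(obj, name)
        except Exception:
            # Some properties may raise when read; list them as attributes.
            attributes.append(name)
            continue
        (methods if callable(value) else attributes).append(name)
    return methods, attributes


class DemoSound:
    """Hypothetical stand-in for a Sound-like object (not part of Parselmouth)."""
    duration = 1.5

    def get_rms(self):
        return 0.1


methods, attributes = split_public_names(DemoSound())
print('methods:', methods)
print('attributes:', attributes)
```

The same helper could be called on `snd`. Note that nested classes are callable as well, so on a real object names like `FileFormat` would presumably land in the first list alongside ordinary methods.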

Some of these methods return other kinds of Praat object, such as Spectrum.

In [4]:
spec = snd.to_spectrum()
spec

<parselmouth.Spectrum at 0x7fbb39aaaf48>

The `Spectrum` object likewise exposes Praat's Spectrum commands as Python methods.

In [5]:
print('\n'.join([name for name in dir(spec) if not name.startswith('_')]))

FileFormat
as_array
at_xy
bin_width
cepstral_smoothing
class_name
copy
df
dx
dy
fmax
fmin
formula
full_name
get_band_density
get_band_density_difference
get_band_energy
get_band_energy_difference
get_bin_number_from_frequency
get_bin_width
get_center_of_gravity
get_central_moment
get_centre_of_gravity
get_column_distance
get_frequency_from_bin_number
get_highest_frequency
get_highest_x
get_highest_y
get_imaginary_value_in_bin
get_kurtosis
get_lowest_frequency
get_lowest_x
get_lowest_y
get_maximum
get_minimum
get_number_of_bins
get_number_of_columns
get_number_of_rows
get_real_value_in_bin
get_row_distance
get_skewness
get_standard_deviation
get_sum
get_value_at_xy
get_value_in_bin
get_value_in_cell
get_x_of_column
get_y_of_row
highest_frequency
lowest_frequency
lpc_smoothing
n_bins
n_columns
n_rows
name
nf
nx
ny
read
save
save_as_binary_file
save_as_headerless_spreadsheet_file
save_as_matrix_text_file
save_as_short_text_file
save_as_text_file
scale_x_by
scale_x_to
set_real_value_in_bin

## A short demo

Here is a short example that demonstrates how to extract a portion of audio and calculate the spectral center of gravity for that portion.

First we create a dataframe that contains (some of) the phones in the audio file already loaded.

In [6]:
import pandas as pd
phdf = pd.DataFrame(
    {
        't1': [0.28, 0.72, 0.90],
        't2': [0.35, 0.84, 0.99],
        'label': ['t', 's', 't']
    }
)
phdf

     t1    t2 label
0  0.28  0.35     t
1  0.72  0.84     s
2  0.90  0.99     t


The `t1` times mark the beginning of each stop burst or fricative, and the `t2` times mark the end. The next cell extracts center of gravity measures from the stop bursts of the [t] tokens.

In [7]:
def get_cog(phone, s):
    '''Extract audio from s based on phone's t1 and t2 and return spectral
    center of gravity.'''
    return s.extract_part(phone.t1, phone.t2) \
            .to_spectrum() \
            .get_center_of_gravity()

meas = phdf[phdf.label == 't'].apply(lambda ph: get_cog(ph, snd), axis=1)
meas

0    2457.509169
2    3380.582391
dtype: float64
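For intuition about what `get_center_of_gravity` reports: Praat defines the spectral centre of gravity as the mean frequency weighted by the spectrum magnitude raised to a power (2 by default). The following is a rough, pure-Python sketch of that idea using a naive discrete Fourier transform. `spectral_cog` is a hypothetical helper written for this illustration, not Praat's implementation, and it should not be expected to reproduce Praat's values exactly.

```python
import cmath
import math


def spectral_cog(samples, sampling_frequency, power=2.0):
    """Return an approximate spectral centre of gravity in Hz.

    Computes the mean of the bin frequencies, weighted by |X(f)|**power,
    over the non-negative frequency bins of a naive DFT.
    """
    n = len(samples)
    bin_width = sampling_frequency / n
    weighted_sum = 0.0
    total_weight = 0.0
    for k in range(n // 2 + 1):
        # Naive O(n^2) DFT coefficient for bin k.
        x_k = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        weight = abs(x_k) ** power
        weighted_sum += k * bin_width * weight
        total_weight += weight
    return weighted_sum / total_weight


# A 1000 Hz sine sampled at 8000 Hz; 64 samples hold exactly 8 periods.
fs = 8000.0
tone = [math.sin(2 * math.pi * 1000.0 * i / fs) for i in range(64)]
cog = spectral_cog(tone, fs)
print(cog)
```

For a pure tone, all of the spectral weight sits at the tone's frequency, so the printed value should be very close to 1000. Stop bursts and fricatives spread energy across many frequencies, and the centre of gravity summarizes where that energy is concentrated.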

Merge the measurements back into the phone dataframe. Because `apply` preserves the row index of the rows it was given, we can simply `assign` the result as a new column. The values are aligned by index, and rows without a measurement (here the [s] token) receive NaN.

In [8]:
phdf = phdf.assign(cog=meas)
phdf

     t1    t2 label          cog
0  0.28  0.35     t  2457.509169
1  0.72  0.84     s          NaN
2  0.90  0.99     t  3380.582391
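The index alignment performed by `assign` can be seen in isolation with a small sketch that needs no audio. The dataframe `phones` and the numbers in `values` are made up for illustration and are not measurements from the file above.

```python
import pandas as pd

phones = pd.DataFrame({'label': ['t', 's', 't']})

# Hypothetical values for the rows labelled 't' only (index 0 and 2).
values = pd.Series([1.0, 2.0], index=[0, 2])

# assign aligns the Series on the index; index 1 has no value and gets NaN.
aligned = phones.assign(value=values)
print(aligned)

# Rows without a measurement can be dropped afterwards if desired.
measured = aligned.dropna(subset=['value'])
print(measured)
```

Keeping the NaN rows preserves the full phone list for later steps, while `dropna` is convenient when only the measured tokens are of interest.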
