## Setup

Importing libraries:
- `codecs` for escaping non-ASCII characters in filenames (necessary for fasttrackpy)
- `csv` for exporting the results into a csv file
- `os`, `os.path` and `pathlib` for file handling
- `re` for regular expressions
- `numpy`, `pandas` and `tqdm` for data handling
- **`fasttrackpy`** for extracting formants

**Warning!** As of Feb 2025, `fasttrackpy` requires an older version of `numpy` (`numpy<2.0.0,>=1.26.1`). To install a compatible version, run `pip install "numpy<2.0.0,>=1.26.1"` before running the imports in the next cell:

In [1]:
import codecs, csv, os, os.path, re
from pathlib import Path
import numpy as np
import pandas as pd
from tqdm import tqdm

from fasttrackpy import process_audio_file, \
    process_directory, \
    process_audio_textgrid,\
    process_corpus
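A quick way to confirm that the installed `numpy` falls inside the range given in the warning above is to compare version tuples. The helper below is only a sketch: the name `numpy_version_ok` is hypothetical (not part of any library), and it assumes plain `X.Y.Z` version strings without suffixes such as `rc1`.

```python
def numpy_version_ok(version: str) -> bool:
    """Return True if `version` satisfies numpy<2.0.0,>=1.26.1 (plain X.Y.Z strings only)."""
    # Keep the first three numeric components, e.g. "1.26.4" -> (1, 26, 4)
    parts = tuple(int(p) for p in version.split(".")[:3])
    return (1, 26, 1) <= parts < (2, 0, 0)


# Hypothetical usage in the notebook: numpy_version_ok(np.__version__)
print(numpy_version_ok("1.26.4"))  # True
print(numpy_version_ok("2.1.0"))   # False
```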

**The folder** should be organized as follows:
```
├── fasttrackpy-F1-F2-extraction.ipynb   # this file
├── kr_MGM                               # folder with audio and TextGrids for speaker MGM
│   ├── ZOOM0003_sə́tva.WAV
│   ├── ZOOM0003_sə́tva.TextGrid
│   ├── ZOOM0004_soldaṭuyta.WAV
│   ├── ZOOM0004_soldaṭuyta.TextGrid
├── kr_ASR_2024                          # folder with audio and TextGrids for speaker ASR
│   ├── ...
├── ...
```

**Folder names** should follow this pattern: a prefix, then an underscore `_`, then the speaker code. The folder name can end after the speaker code, or it can continue with another underscore followed by anything else:

```
<PREFIX>_<SPEAKER>_…
```
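In code, the speaker code is simply the second underscore-delimited field of the folder name. A minimal sketch, using the hypothetical helper name `speaker_from_folder` and the folder names from the tree above:

```python
def speaker_from_folder(folder_name: str) -> str:
    """Return the speaker code: the second underscore-delimited field, upper-cased."""
    return folder_name.split("_")[1].upper()


print(speaker_from_folder("kr_MGM"))       # MGM
print(speaker_from_folder("kr_ASR_2024"))  # ASR
```

Anything after a second underscore (here `2024`) is ignored, so it can hold a year, a session label, or similar.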

**Audio file names** should follow this pattern: a prefix, then an underscore `_`, then the stimulus. The file name can end after the stimulus, or it can continue with another underscore followed by anything else:

```
<PREFIX>_<STIMULUS>_…
```

If your file names do not contain the stimulus, set `INCLUDE_STIMULUS` to `False` below; the part after the prefix can then be anything.
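The stimulus can be recovered the same way, from the second underscore-delimited field of the file name without its extension. The sketch below uses a hypothetical helper, `stimulus_from_filename`; the second file name is made up to illustrate a trailing suffix.

```python
from pathlib import Path


def stimulus_from_filename(filename: str) -> str:
    """Return the stimulus: the second underscore-delimited field of the file stem."""
    return Path(filename).stem.split("_")[1]


print(stimulus_from_filename("ZOOM0004_soldaṭuyta.WAV"))   # soldaṭuyta
print(stimulus_from_filename("ZOOM0005_word_take2.WAV"))   # word (hypothetical file name)
```

A file name without any underscore would raise an `IndexError` here, which is why stimulus extraction should be switched off for such files.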
_____

User variables:
- `SEGMENTS_TO_EXTRACT`: a regular expression that captures all the segment labels that need to be extracted (if you don’t know regular expressions, just include a list of labels as follows: e.g. `"(a|e|i|ow)"` will capture labels `"a"`, `"e"`, `"i"` and `"ow"`)
- `TARGET_TIER`: the name of the tier with the segments that need to be extracted
- `SPEAKER_SETTINGS`: a dictionary where every key is the name of a speaker’s folder and the value is another dictionary with the following settings:
    - `min_max_formant`: the lowest maximum-formant value (in Hz) to try
    - `max_max_formant`: the highest maximum-formant value (in Hz) to try
    - `n_formants`: number of formants to be extracted
    - `min_duration`: minimum duration of a segment (shorter segments won’t be extracted)
- `RESULTS_FILENAME`: the name of the final csv file with the formants (without the extension)

- `INCLUDE_STIMULUS`: should be `True` if your audio file names contain stimulus, else `False`
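Before running the extraction, it can help to check which tier labels a candidate pattern would capture. The sketch below uses the example pattern from the list above and a made-up list of labels; it assumes the whole label has to match (`re.fullmatch`), which may differ from how `fasttrackpy` applies `target_labels` internally.

```python
import re

SEGMENTS = "(a|e|i|ow)"  # example pattern from the description above

# Hypothetical labels, as they might appear on the target tier
labels = ["a", "ow", "o", "aw", "sp"]

# Keep only labels that the pattern matches in full
matched = [lab for lab in labels if re.fullmatch(SEGMENTS, lab)]
print(matched)  # ['a', 'ow']
```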

In [2]:
SEGMENTS_TO_EXTRACT = "[+]*[aəeiou][́]*[_]*"
TARGET_TIER = "v"

SPEAKER_SETTINGS = {
    # each key must match the name of a speaker's folder
    "kr_MGM": {"min_max_formant": 4000, "max_max_formant": 7000, "n_formants": 5, "min_duration": 0},
}

RESULTS_FILENAME = "results_kr"

INCLUDE_STIMULUS = True

## Extraction

In [3]:
def convert_string(stroka, to_ascii: bool):
    """Escape non-ASCII characters as "@<code point>" (to_ascii=True) or undo the escaping.

    Note: the escapes have no terminator, so a digit that directly follows
    a non-ASCII character makes decoding ambiguous.
    """
    if to_ascii:
        # Replace every non-ASCII character with "@" + its decimal code point
        return "".join("@" + str(ord(s)) if ord(s) >= 128 else s for s in stroka)

    # Decode each escape exactly once, so "@60" can never clobber part of "@601"
    return re.sub(r"@(\d+)", lambda m: chr(int(m.group(1))), stroka)


def rename_files_in_folder(folder_path, to_ascii: bool):
    """Rename every file in a folder, escaping (or restoring) non-ASCII characters in its name"""
    for filename in os.listdir(folder_path):
        old_file_path = os.path.join(folder_path, filename)
        if os.path.isfile(old_file_path):
            new_filename = convert_string(filename, to_ascii=to_ascii)
            new_file_path = os.path.join(folder_path, new_filename)
            os.rename(old_file_path, new_file_path)


def get_stimulus_from_filename(filename):
    """Return the stimulus: the second underscore-delimited field of the file name"""
    return filename.split("_")[1]
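To make the escaping scheme concrete, here is a self-contained round trip on one of the stimuli. The helper names `escape_non_ascii` and `unescape` are hypothetical stand-ins that mirror the `@<code point>` scheme of `convert_string`; this is a sketch, not a replacement for it.

```python
import re


def escape_non_ascii(text: str) -> str:
    """Replace each non-ASCII character with "@" + its decimal code point."""
    return "".join(f"@{ord(ch)}" if ord(ch) >= 128 else ch for ch in text)


def unescape(text: str) -> str:
    """Turn every "@<digits>" escape back into the character it encodes."""
    return re.sub(r"@(\d+)", lambda m: chr(int(m.group(1))), text)


name = "s\u0259\u0301tva"  # "sə́tva": schwa (U+0259) + combining acute (U+0301)
escaped = escape_non_ascii(name)
print(escaped)                    # s@601@769tva
print(unescape(escaped) == name)  # True

# Limitation: a digit right after a non-ASCII character merges into the escape
tricky = "\u00e91"  # "é1" becomes "@2331", which decodes to a different character
print(unescape(escape_non_ascii(tricky)) == tricky)  # False
```

The scheme therefore round-trips safely only when no non-ASCII character is immediately followed by a digit and no original name already contains `@` followed by digits.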

In [4]:
results = []

for speaker, settings in SPEAKER_SETTINGS.items():
    speaker_folder = Path(speaker)
    # Escape non-ASCII characters in file names for the duration of the run
    rename_files_in_folder(speaker_folder, to_ascii=True)
    try:
        speaker_results = process_corpus(
            speaker_folder, entry_classes = [TARGET_TIER],
            target_tier = TARGET_TIER,
            target_labels = SEGMENTS_TO_EXTRACT,
            min_max_formant = settings["min_max_formant"],
            max_max_formant = settings["max_max_formant"],
            n_formants = settings["n_formants"],
            min_duration = settings["min_duration"]
        )

        speaker_res_list = []
        for r in speaker_results:
            # One row per frame of the winning candidate track
            speaker_r = r.to_df(which="winner").to_pandas()
            N = len(speaker_r)
            for formant in ("F1", "F2", "F3", "F4", "F1_s", "F2_s", "F3_s", "F4_s"):
                if N > 4:
                    # Average over frames N//4 .. N//4*3, roughly the middle half of the segment
                    speaker_r[formant] = np.mean(speaker_r[formant][N//4:N//4*3])
                else:
                    # Too few frames to average
                    speaker_r[formant] = np.nan
            # Keep one row per segment; all other columns keep their first-frame values
            speaker_res_list.append(speaker_r.iloc[[0]])

        speaker_results = pd.concat(speaker_res_list)

        # Move the identifying columns to the front
        speaker_results.insert(0, "speaker", str(speaker_folder).split("_")[1].upper())
        speaker_results.insert(1, "filename", speaker_results.pop("file_name"))
        speaker_results.insert(2, "label", speaker_results.pop("label"))
        # Undo the escaping in the reported file names
        speaker_results["filename"] = [convert_string(s, to_ascii=False) for s in speaker_results["filename"]]

        if INCLUDE_STIMULUS:
            speaker_results.insert(2, "stimulus", speaker_results["filename"].apply(get_stimulus_from_filename))

        results.append(speaker_results)
    finally:
        # Restore the original file names even if the extraction fails
        rename_files_in_folder(speaker_folder, to_ascii=False)

100%|███████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 499.44it/s]
100%|██████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 2998.07it/s]
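The averaging step in the loop above reduces each formant track to a single value by taking the mean over frames `N//4` to `N//4*3`. The standalone sketch below reproduces that slicing on plain lists; the name `middle_half_mean` is hypothetical, and the input values are made up.

```python
from statistics import mean


def middle_half_mean(values):
    """Mean of values[N//4 : N//4*3]; None when there are 4 or fewer values."""
    n = len(values)
    if n <= 4:
        return None
    return mean(values[n // 4 : n // 4 * 3])


print(middle_half_mean([1, 2, 3, 4, 5, 6, 7, 8]))  # 4.5 (mean of 3, 4, 5, 6)
print(middle_half_mean([1, 2, 3, 4]))              # None
print(middle_half_mean([1, 2, 3, 4, 5, 6, 7]))     # 2.5 (mean of 2, 3)
```

When the number of frames is not a multiple of 4, the window is not exactly centred: with 7 frames only the second and third frames are averaged.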


In [5]:
results = pd.concat(results)
results.reset_index(drop=True, inplace=True)

results.to_csv(f"{RESULTS_FILENAME}.csv", index=True)
results

| | speaker | filename | stimulus | label | F1 | F2 | F3 | F4 | F5 | F1_s | ... | B3 | B4 | B5 | error | time | max_formant | n_formant | smooth_method | id | group |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | MGM | ZOOM0003_sə́tva | sə́tva | ə́ | 446.08588 | 816.052887 | 1360.085238 | 2546.005621 | 3065.702995 | 446.091902 | ... | 124.210444 | 432.865323 | 411.036279 | 7e-05 | 0.02532 | 4000.0 | 5 | dct_smooth_regression | 0-0-1 | group_0 |
| 1 | MGM | ZOOM0003_sə́tva | sə́tva | ə́ | 454.674971 | 1390.354549 | 2957.689467 | 4358.909726 | 5709.305115 | 454.657558 | ... | 618.386174 | 630.290365 | 469.816599 | 7e-06 | 0.025186 | 6842.105263 | 5 | dct_smooth_regression | 0-0-3 | group_0 |
| 2 | MGM | ZOOM0003_sə́tva | sə́tva | ə́ | 429.672051 | 1410.727254 | 2631.269092 | 3874.380473 | 5028.63183 | 429.679657 | ... | 166.244719 | 545.453974 | 920.817259 | 1e-06 | 0.025288 | 6210.526316 | 5 | dct_smooth_regression | 0-0-5 | group_0 |
| 3 | MGM | ZOOM0003_sə́tva | sə́tva | ə́ | NaN | NaN | NaN | NaN | 3387.946531 | NaN | ... | 160.052885 | 183.333931 | 451.318557 | 0.0 | 0.025685 | 4315.789474 | 5 | dct_smooth_regression | 0-0-7 | group_0 |
| 4 | MGM | ZOOM0003_sə́tva | sə́tva | ə́ | 290.248515 | 1390.817893 | 2754.679731 | 3539.444382 | 4514.224426 | 290.136264 | ... | 541.955594 | 1016.391981 | 591.779794 | 8.1e-05 | 0.025125 | 4947.368421 | 5 | dct_smooth_regression | 0-0-9 | group_0 |
| 5 | MGM | ZOOM0004_soldaṭuyta | soldaṭuyta | a | 664.844389 | 1295.870926 | 2696.388867 | 3137.384262 | 6184.198129 | 666.084035 | ... | 66.458733 | 866.761791 | 690.287056 | 0.000674 | 0.025245 | 7000.0 | 5 | dct_smooth_regression | 0-0-1 | group_0 |
| 6 | MGM | ZOOM0004_soldaṭuyta | soldaṭuyta | a | 626.685939 | 1214.977126 | 2651.841752 | 3258.496087 | 5888.672978 | 627.897313 | ... | 148.693902 | 957.05768 | 630.040297 | 0.000625 | 0.025481 | 7000.0 | 5 | dct_smooth_regression | 0-0-3 | group_0 |
| 7 | MGM | ZOOM0004_soldaṭuyta | soldaṭuyta | a | 599.887584 | 1245.219708 | 2648.56748 | 3210.35651 | 5351.945946 | 600.540714 | ... | 134.148828 | 481.258861 | 486.838743 | 0.000349 | 0.025166 | 7000.0 | 5 | dct_smooth_regression | 0-0-5 | group_0 |
| 8 | MGM | ZOOM0004_soldaṭuyta | soldaṭuyta | a | 663.653249 | 1325.143538 | 2616.685642 | 3762.078551 | 5124.150214 | 665.088089 | ... | 177.544809 | 225.885496 | 372.11927 | 0.001813 | 0.025896 | 6842.105263 | 5 | dct_smooth_regression | 0-0-7 | group_0 |
| 9 | MGM | ZOOM0004_soldaṭuyta | soldaṭuyta | a | 593.21684 | 1156.8306 | 2528.170656 | 2843.544263 | 3989.539052 | 593.237458 | ... | 164.782176 | 449.72104 | 362.536516 | 0.000214 | 0.025029 | 4947.368421 | 5 | dct_smooth_regression | 0-0-9 | group_0 |
