5/12 (Sun) | UF Measures

# UF Measure Calculation Based on Automatic Annotation

## 1. Introduction

This notebook calculates utterance fluency (UF) measures from the results of automatic temporal feature annotation.
Before the calculation starts, the following code block loads the required packages and defines the global variables.

In [1]:
from typing import List, Tuple, Generator
import sys
from pathlib import Path

import pandas as pd
import pickle as pkl
from textgrids import TextGrid

sys.path.append(
    "/home/matsuura/Development/app/feature_extraction_api/app/modules"
)

from fluency import Turn, UtteranceFluencyFeatureExtractor

DATA_DIR = Path("/home/matsuura/Development/app/feature_extraction_api/experiment/data")

TASK = ["Arg_Oly", "Cartoon", "RtSwithoutRAA", "RtSwithRAA", "WoZ_Interview"]

---

## 2. Define Functions

This section defines the functions used to calculate UF measures.
The following code block defines a generator that yields the file paths of each pickled Turn object and its corresponding TextGrid. Files whose names end in `_long` are skipped.

In [2]:
def turn_textgrid_path_generator(task: str) -> Generator[Tuple[Path, Path], None, None]:
    load_dir = DATA_DIR / f"{task}/08_Auto_Annotation"

    for turn_path in load_dir.glob("*.pkl"):
        if turn_path.stem.endswith("_long"):
            continue
        
        textgrid_path = load_dir / f"{turn_path.stem}.TextGrid"

        yield turn_path, textgrid_path
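
The pairing logic can be tried without the real data. The sketch below is a minimal illustration that assumes each `.pkl` file has a same-named `.TextGrid` next to it. The helper `pair_paths` and the file names are hypothetical, and, unlike the function above, it sorts the paths so that the iteration order is reproducible.

```python
import tempfile
from pathlib import Path
from typing import Generator, Tuple


def pair_paths(load_dir: Path) -> Generator[Tuple[Path, Path], None, None]:
    """Yield (pickle path, TextGrid path) pairs, skipping "*_long" files."""
    # sorted() makes the order deterministic; glob() alone gives no such guarantee.
    for turn_path in sorted(load_dir.glob("*.pkl")):
        if turn_path.stem.endswith("_long"):
            continue

        textgrid_path = load_dir / f"{turn_path.stem}.TextGrid"

        yield turn_path, textgrid_path


# Build a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    for name in [
        "001_01.pkl",
        "001_01.TextGrid",
        "001_01_long.pkl",
        "002_01.pkl",
        "002_01.TextGrid",
    ]:
        (tmp_dir / name).touch()

    # Consume the generator while the directory still exists.
    pairs = [(p.name, t.name) for p, t in pair_paths(tmp_dir)]

print(pairs)
```

Only the two regular recordings are paired here; the `_long` pickle is filtered out before a TextGrid path is built for it.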

The following code block defines a function to load a Turn object and its TextGrid.

In [3]:
def load_turn_and_textgrid(turn_path: Path, textgrid_path: Path) -> Tuple[Turn, TextGrid]:
    with open(turn_path, "rb") as f:
        turn = pkl.load(f)

    textgrid = TextGrid(str(textgrid_path))

    return turn, textgrid
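
The loader above assumes that both files exist. A more defensive variant could check the paths first so that a missing TextGrid is reported before any parsing starts. The sketch below is one possible way to do that, not part of the original pipeline: `load_pickle_checked` is a hypothetical helper, a plain dict stands in for the real `Turn` object, and the `TextGrid` parsing is left out so that the example runs without the `textgrids` package.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any


def load_pickle_checked(turn_path: Path, textgrid_path: Path) -> Any:
    """Unpickle turn_path after confirming that both input files exist."""
    for path in (turn_path, textgrid_path):
        if not path.is_file():
            raise FileNotFoundError(f"Expected input file not found: {path}")

    # Only unpickle files from a trusted source: pickle can run arbitrary code.
    with open(turn_path, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    turn_path = Path(tmp) / "001_01.pkl"
    textgrid_path = Path(tmp) / "001_01.TextGrid"

    # A dict stands in for the real Turn object (hypothetical content).
    with open(turn_path, "wb") as f:
        pickle.dump({"uid": "001_01"}, f)

    try:
        load_pickle_checked(turn_path, textgrid_path)
        missing_detected = False
    except FileNotFoundError:
        missing_detected = True  # the TextGrid has not been created yet

    # Once the companion file exists, loading succeeds.
    textgrid_path.touch()
    restored = load_pickle_checked(turn_path, textgrid_path)

print(missing_detected, restored)
```

Note that unpickling a real `Turn` additionally requires the `Turn` class to be importable, because `pickle` stores a reference to the class rather than its definition.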

The following code block defines a function to calculate UF measures.

In [4]:
def extract(turn: Turn, textgrid: TextGrid, pruning: bool = True) -> Tuple[list, list]:
    extractor = UtteranceFluencyFeatureExtractor()

    measures = extractor.extract_by_turn(turn, textgrid, pruning)
    measure_names = extractor.check_feature_names()

    return measures, measure_names

The following code block defines a function to save the calculated UF measures as a CSV file.

In [5]:
def save_measures(
    measure_list: List[list],
    measure_names: list,
    task: str,
    pruning: bool = True,
) -> None:
    columns = ["uid"] + measure_names
    df_measures = pd.DataFrame(measure_list, columns=columns)
    df_measures = df_measures.sort_values("uid").reset_index(drop=True)

    if pruning:
        filename = "uf_measures_auto_pruned.csv"
    else:
        filename = "uf_measures_auto_unpruned.csv"

    save_path = DATA_DIR / f"{task}/09_UF_Measures/{filename}"

    df_measures.to_csv(save_path, index=False)
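
To make the resulting table layout concrete, the following sketch builds the same kind of frame from two made-up rows. The measure names `speech_rate` and `mean_pause_duration` and all values are placeholders, not output of `UtteranceFluencyFeatureExtractor`, and the CSV is written to an in-memory buffer instead of the `09_UF_Measures` directory.

```python
import io

import pandas as pd

# Hypothetical measure names and values, for illustration only.
measure_names = ["speech_rate", "mean_pause_duration"]
measure_list = [
    ["002_01", 3.1, 0.62],
    ["001_01", 2.4, 0.85],
]

# Each row starts with the uid, followed by one value per measure name.
columns = ["uid"] + measure_names
df_measures = pd.DataFrame(measure_list, columns=columns)
df_measures = df_measures.sort_values("uid").reset_index(drop=True)

# Write to a string buffer so that nothing touches the file system.
buffer = io.StringIO()
df_measures.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

print(csv_text)
```

Each row must have exactly one more element than `measure_names`, since the first column holds the `uid`; otherwise `pd.DataFrame` raises an error about mismatched column counts.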

---

## 3. Calculate UF Measures

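The following code block calculates UF measures for every task, once with pruning and once without, and saves each set of results to its own CSV file.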

In [6]:
for task in TASK:
    measure_list_pruned = []
    measure_list_unpruned = []

    for turn_path, textgrid_path in turn_textgrid_path_generator(task):
        turn, textgrid = load_turn_and_textgrid(turn_path, textgrid_path)

        measures_pruned, measure_names = extract(turn, textgrid, pruning=True)
        measures_unpruned, _ = extract(turn, textgrid, pruning=False)

        uid = turn_path.stem

        measures_pruned = [uid] + measures_pruned
        measures_unpruned = [uid] + measures_unpruned

        measure_list_pruned.append(measures_pruned)
        measure_list_unpruned.append(measures_unpruned)

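    # measure_names is only assigned inside the inner loop, so skip tasks
    # for which no input files were found.
    if not measure_list_pruned:
        continue

    save_measures(measure_list_pruned, measure_names, task, pruning=True)
    save_measures(measure_list_unpruned, measure_names, task, pruning=False)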