5/14 (Tue) | Experiment

# Preliminary Analyses of Annotation

## 1. Introduction

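This notebook conducts preliminary analyses of the manual and automatic annotation results.
The goal of the current analyses is to fill in the following table, where MCP and ECP denote mid-clause and end-clause pauses.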

| Task | WER | N disfluency (Manual / Automatic) | N MCP (Manual / Automatic) | N ECP (Manual / Automatic) |
| - | - | - | - | - |
| Arg_Oly |  |  |  |  |
| Cartoon |  |  |  |  |
| RtSwithoutRAA |  |  |  |  |
| RtSwithRAA |  |  |  |  |
| Monologue |  |  |  |  |
| WoZ_Interview |  |  |  |  |
| ALL |  |  |  |  |

Before starting the analyses, the following code block loads the required packages and defines global variables.

In [1]:
from typing import List, Tuple, Dict, Generator, Optional
from pathlib import Path

import numpy as np
import pandas as pd
from jiwer import wer

from utils.mfr import logit_2_rating

DATA_DIR = Path("/home/matsuura/Development/app/feature_extraction_api/experiment/data")

MONOLOGUE_TASK = ["Arg_Oly", "Cartoon", "RtSwithoutRAA", "RtSwithRAA"]
DIALOGUE_TASK = ["WoZ_Interview"]

FILLER = {"uh", "ah", "um", "mm", "hmm", "oh", "mm-hmm", "er", "mhm", "uh-huh", "erm", "huh", "uhu", "mmhmm", "uhhuh"}

---

## 2. Define Functions

This section defines functions for the preliminary analyses.
The following code block defines two functions: one generates the CSV file paths of the manual and automatic annotation results, and the other loads them.

In [2]:
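def annotation_result_csv_path_generator(task: str, rating_filter: Optional[List[int]] = None) -> Generator[Tuple[Path, Path], None, None]:
    """Yield (manual, automatic) CSV path pairs for a task, optionally keeping only speakers whose rating is in rating_filter. The automatic path is not checked for existence here."""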
    load_dir = DATA_DIR / f"{task}/10_SCTK_Inputs"

    if rating_filter is None:
        for manu_csv_path in load_dir.glob("*_manu.csv"):
            filename = manu_csv_path.stem.removesuffix("_manu")
            auto_csv_path = load_dir / f"{filename}_auto_bert.csv"

            yield manu_csv_path, auto_csv_path
    else:
        pf_path = DATA_DIR / f"{task}/12_PF_Rating/pf_rating.csv"
        df_pf = pd.read_csv(pf_path)
        uid_list = df_pf["uid"].to_numpy()

        logit_path = pf_path.parent / "logit.csv"
        threshold_path = logit_path.parent / "threshold.csv"
        
        df_logit = pd.read_csv(logit_path, index_col=0)
        rating_list = logit_2_rating(df_logit["theta"], threshold_path)

        mask = np.full(rating_list.shape, False, dtype=bool)
        for rating in rating_filter:
            mask = mask | (rating_list == rating)
        
        uid_list = uid_list[mask]

        for uid in uid_list:
            if task == "WoZ_Interview":
                uid = str(int(uid)).zfill(3)

            filename_pattern = f"{uid}*_manu.csv"
            for manu_csv_path in load_dir.glob(filename_pattern):
                filename = manu_csv_path.stem.removesuffix("_manu")
                auto_csv_path = load_dir / f"{filename}_auto_bert.csv"

                yield manu_csv_path, auto_csv_path

def load_dataset(
        rating_filter_monologue: Optional[List[int]] = None,
        rating_filter_dialogue: Optional[List[int]] = None,
) -> Dict[str, Dict[str, List[Dict[str, pd.DataFrame]]]]:
    dataset = {
        "monologue": {},
        "dialogue": {}
    }
    
    for monologue_task in MONOLOGUE_TASK:
        dataset["monologue"][monologue_task] = []
        
        for manu_csv_path, auto_csv_path in annotation_result_csv_path_generator(monologue_task, rating_filter=rating_filter_monologue):
            df_manu = pd.read_csv(manu_csv_path)
            df_auto = pd.DataFrame([], columns=["text"])
            if auto_csv_path.exists():
                df_auto = pd.read_csv(auto_csv_path)

            dataset["monologue"][monologue_task].append({
                "manual": df_manu,
                "automatic": df_auto
            })

    for dialogue_task in DIALOGUE_TASK:
        dataset["dialogue"][dialogue_task] = []

        for manu_csv_path, auto_csv_path in annotation_result_csv_path_generator(dialogue_task, rating_filter=rating_filter_dialogue):
            df_manu = pd.read_csv(manu_csv_path, na_values=["", " "], keep_default_na=False)
            df_auto = pd.DataFrame([], columns=["text"])
            if auto_csv_path.exists():
                df_auto = pd.read_csv(auto_csv_path, na_values=["", " "], keep_default_na=False)

            dataset["dialogue"][dialogue_task].append({
                "manual": df_manu,
                "automatic": df_auto
            })

    return dataset
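
The pairing of manual and automatic files above relies on two small string operations: stripping the `_manu` suffix from the file stem, and zero-padding `WoZ_Interview` uids to three digits before globbing. The sketch below isolates both steps using only the standard library. The helper names `paired_auto_path` and `woz_uid_pattern` and the filename `spk01_manu.csv` are hypothetical and serve only as illustration. It also shows, under the assumption that some uids could be unpadded integers, why a prefix pattern such as `1*_manu.csv` may match more files than intended.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def paired_auto_path(manu_csv_path: PurePosixPath) -> PurePosixPath:
    """Derive the automatic-annotation path from a manual-annotation path."""
    filename = manu_csv_path.stem.removesuffix("_manu")  # requires Python 3.9+
    return manu_csv_path.parent / f"{filename}_auto_bert.csv"


def woz_uid_pattern(uid) -> str:
    """Build the glob pattern used for WoZ_Interview, with a zero-padded uid."""
    return f"{str(int(uid)).zfill(3)}*_manu.csv"


# Hypothetical filename, used only to show the suffix replacement.
manu = PurePosixPath("Arg_Oly/10_SCTK_Inputs/spk01_manu.csv")
auto = paired_auto_path(manu)

# A uid read from a CSV may arrive as a float such as 7.0.
pattern = woz_uid_pattern(7.0)

# An unpadded prefix also matches longer ids that start with the same digits.
unpadded_collides = fnmatchcase("10_a_manu.csv", "1*_manu.csv")
padded_collides = fnmatchcase("010_a_manu.csv", "001*_manu.csv")

print(auto, pattern, unpadded_collides, padded_collides)
```

Whether such a collision can occur in the monologue tasks depends on how their uids are formatted, which this notebook does not show.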

The following code block defines a function to calculate WER.

In [3]:
def calculate_wer(annotation_results: List[Dict[str, pd.DataFrame]], remove_filer: bool =False) -> float:
    ref = []
    hyp = []

    for annotation_result in annotation_results:
        df_manu = annotation_result["manual"]
        df_auto = annotation_result["automatic"]

        # Drop annotation tag rows (e.g. <CI>, <CE>, <DISFLUENCY>), which all end with ">".
        mask_tag_manu = df_manu["text"].astype(str).str.endswith(">")
        mask_tag_auto = df_auto["text"].astype(str).str.endswith(">")

        df_manu = df_manu[~mask_tag_manu]
        df_auto = df_auto[~mask_tag_auto]

        if remove_filer:
            # Drop filler words from both transcripts before scoring.
            df_manu = df_manu[~df_manu["text"].isin(FILLER)]
            df_auto = df_auto[~df_auto["text"].isin(FILLER)]

        text_manu = " ".join(df_manu["text"].astype(str))
        text_auto = " ".join(df_auto["text"].astype(str))

        if len(text_manu) == 0 or len(text_auto) == 0:
            continue

        ref.append(text_manu)
        hyp.append(text_auto)

    return wer(ref, hyp)
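
Because `calculate_wer` passes lists of transcripts to `wer`, the score is, to our understanding of `jiwer`, pooled over all transcripts (total word errors divided by total reference words) rather than averaged per transcript. The sketch below illustrates that distinction with a plain-Python word-level edit distance. It is not the `jiwer` implementation, and `FILLERS`, `clean`, `pooled_wer`, and `mean_wer` are hypothetical names introduced here.

```python
from typing import List

FILLERS = {"uh", "um", "er"}  # illustrative subset of FILLER


def clean(tokens: List[str]) -> List[str]:
    """Drop tag rows such as <CI> and filler words, mirroring calculate_wer."""
    return [t for t in tokens if not t.endswith(">") and t not in FILLERS]


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions, deletions, insertions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def pooled_wer(refs: List[List[str]], hyps: List[List[str]]) -> float:
    """Total errors over total reference words (pooled across transcripts)."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return errors / sum(len(r) for r in refs)


def mean_wer(refs: List[List[str]], hyps: List[List[str]]) -> float:
    """Unweighted mean of per-transcript WER."""
    per_file = [edit_distance(r, h) / len(r) for r, h in zip(refs, hyps)]
    return sum(per_file) / len(per_file)


# Toy transcripts, not taken from the dataset.
manual = [["i", "uh", "agree", "<CI>", "this", "statement", "<CE>"], ["yes"]]
automatic = [["i", "agree", "<CI>", "the", "statement", "<CE>"], ["yeah"]]

refs = [clean(t) for t in manual]
hyps = [clean(t) for t in automatic]

print(pooled_wer(refs, hyps), mean_wer(refs, hyps))
```

In the toy data the short second transcript dominates the unweighted mean but contributes only one of five reference words to the pooled score, which is why a pooled task-level WER need not equal the mean of its parts.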

The following code block defines a function to count the number of tags.

In [4]:
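def count_tags(annotation_results: List[Dict[str, pd.DataFrame]], target_tag: str) -> Tuple[List[int], List[int]]:
    """Count rows whose text equals target_tag, per annotation result, in the manual and automatic annotations."""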
    n_tag_manu = []
    n_tag_auto = []

    for annotation_result in annotation_results:
        df_manu = annotation_result["manual"]
        df_auto = annotation_result["automatic"]

        mask_tag_manu = (df_manu["text"] == target_tag)
        mask_tag_auto = (df_auto["text"] == target_tag)

        n_tag_manu.append(mask_tag_manu.sum())
        n_tag_auto.append(mask_tag_auto.sum())

    return n_tag_manu, n_tag_auto
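
`count_tags` scans the `text` column once per tag. As a possible alternative, the sketch below counts all three tags in a single pass with `value_counts`. The helper `count_all_tags` and the toy rows are hypothetical; the empty frame mirrors the placeholder that `load_dataset` creates when an automatic CSV file is missing.

```python
import pandas as pd

TAGS = ["<DISFLUENCY>", "<CI>", "<CE>"]


def count_all_tags(df: pd.DataFrame) -> dict:
    """Count every tag of interest in one pass over the text column."""
    counts = df["text"].value_counts()
    # Tags that never occur are absent from `counts`, so default to 0.
    return {tag: int(counts.get(tag, 0)) for tag in TAGS}


# Toy annotation result, not taken from the dataset.
toy = pd.DataFrame(
    {"text": ["i", "<CI>", "agree", "<DISFLUENCY>", "this", "<CI>", "statement", "<CE>"]}
)
toy_counts = count_all_tags(toy)

# Placeholder frame used when no automatic annotation exists.
empty_counts = count_all_tags(pd.DataFrame([], columns=["text"]))

print(toy_counts, empty_counts)
```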

---

## 3. Preliminary Analyses

This section conducts the preliminary analyses.
The following code block loads the entire dataset.

In [5]:
dataset = load_dataset()

In [6]:
dataset["monologue"]["Arg_Oly"][0]["manual"].head()

Unnamed: 0,start_time,end_time,type,text
0,0.22,0.240063,01_text,i
1,0.300063,0.660125,01_text,agree
2,0.820187,1.04025,01_text,this
3,1.260312,2.080563,01_text,statement
4,2.132178,2.76438,02_pause,<CE>


In [7]:
dataset["monologue"]["Arg_Oly"][0]["automatic"].head()

Unnamed: 0,start_time,end_time,type,text
0,0.22,0.240063,01_text,i
1,0.300063,0.660125,01_text,agree
2,0.820187,1.04025,01_text,this
3,1.260312,2.080563,01_text,statement
4,2.080563,2.74075,02_pause,<CI>


### 3.1. WER

The following code block calculates the WER of the monologue tasks.

In [8]:
monologue_data = []
for monologue_task in MONOLOGUE_TASK:
    annotation_results = dataset["monologue"][monologue_task]

    res = calculate_wer(annotation_results, remove_filer=True)

    print(f"WER of {monologue_task} = {res}")

    monologue_data += annotation_results

res = calculate_wer(monologue_data, remove_filer=True)
print(f"WER of monologue task = {res}")

WER of Arg_Oly = 0.1602270395951583
WER of Cartoon = 0.13862017646433508
WER of RtSwithoutRAA = 0.1987206157686707
WER of RtSwithRAA = 0.20201547971533945
WER of monologue task = 0.17605261694198787


The following code block calculates the WER of the dialogue task.

In [9]:
dialogue_data = []
for dialogue_task in DIALOGUE_TASK:
    annotation_results = dataset["dialogue"][dialogue_task]

    res = calculate_wer(annotation_results, remove_filer=True)

    print(f"WER of {dialogue_task} = {res}")

    dialogue_data += annotation_results

WER of WoZ_Interview = 0.15021711724331138


The following code block calculates the WER across all tasks.

In [10]:
all_task_data = monologue_data + dialogue_data

res = calculate_wer(all_task_data, remove_filer=True)
print(f"WER of all tasks = {res}")

WER of all tasks = 0.1674828781444276


### 3.2. Count Disfluency

The following code block counts the number of disfluency words in monologue tasks.

In [11]:
for monologue_task in MONOLOGUE_TASK:
    annotation_results = dataset["monologue"][monologue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<DISFLUENCY>")

    print(f"[Manual] N_disfluency of {monologue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_disfluency of {monologue_task} = {sum(n_tags_auto)}")

n_tags_manu, n_tags_auto = count_tags(monologue_data, "<DISFLUENCY>")
print(f"[Manual] N_disfluency of monologue task = {sum(n_tags_manu)}")
print(f"[Automatic] N_disfluency of monologue task = {sum(n_tags_auto)}")

[Manual] N_disfluency of Arg_Oly = 1876
[Automatic] N_disfluency of Arg_Oly = 1265
[Manual] N_disfluency of Cartoon = 2813
[Automatic] N_disfluency of Cartoon = 1964
[Manual] N_disfluency of RtSwithoutRAA = 2949
[Automatic] N_disfluency of RtSwithoutRAA = 1812
[Manual] N_disfluency of RtSwithRAA = 2887
[Automatic] N_disfluency of RtSwithRAA = 1687
[Manual] N_disfluency of monologue task = 10525
[Automatic] N_disfluency of monologue task = 6728


The following code block counts the number of disfluency words in dialogue tasks.

In [12]:
for dialogue_task in DIALOGUE_TASK:
    annotation_results = dataset["dialogue"][dialogue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<DISFLUENCY>")

    print(f"[Manual] N_disfluency of {dialogue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_disfluency of {dialogue_task} = {sum(n_tags_auto)}")

[Manual] N_disfluency of WoZ_Interview = 3935
[Automatic] N_disfluency of WoZ_Interview = 2577


The following code block counts the number of disfluency words across all tasks.

In [13]:
n_tags_manu, n_tags_auto = count_tags(all_task_data, "<DISFLUENCY>")
print(f"[Manual] N_disfluency of all tasks = {sum(n_tags_manu)}")
print(f"[Automatic] N_disfluency of all tasks = {sum(n_tags_auto)}")

[Manual] N_disfluency of all tasks = 14460
[Automatic] N_disfluency of all tasks = 9305


### 3.3. Count Mid-Clause Pauses

The following code block counts the number of MCP in monologue tasks.

In [14]:
for monologue_task in MONOLOGUE_TASK:
    annotation_results = dataset["monologue"][monologue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<CI>")

    print(f"[Manual] N_MCP of {monologue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_MCP of {monologue_task} = {sum(n_tags_auto)}")

n_tags_manu, n_tags_auto = count_tags(monologue_data, "<CI>")
print(f"[Manual] N_MCP of monologue task = {sum(n_tags_manu)}")
print(f"[Automatic] N_MCP of monologue task = {sum(n_tags_auto)}")

[Manual] N_MCP of Arg_Oly = 4302
[Automatic] N_MCP of Arg_Oly = 4879
[Manual] N_MCP of Cartoon = 5559
[Automatic] N_MCP of Cartoon = 6675
[Manual] N_MCP of RtSwithoutRAA = 6169
[Automatic] N_MCP of RtSwithoutRAA = 7302
[Manual] N_MCP of RtSwithRAA = 6332
[Automatic] N_MCP of RtSwithRAA = 7441
[Manual] N_MCP of monologue task = 22362
[Automatic] N_MCP of monologue task = 26297


The following code block counts the number of MCP in dialogue tasks.

In [15]:
for dialogue_task in DIALOGUE_TASK:
    annotation_results = dataset["dialogue"][dialogue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<CI>")

    print(f"[Manual] N_MCP of {dialogue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_MCP of {dialogue_task} = {sum(n_tags_auto)}")

[Manual] N_MCP of WoZ_Interview = 7574
[Automatic] N_MCP of WoZ_Interview = 10288


The following code block counts the number of MCP across all tasks.

In [16]:
n_tags_manu, n_tags_auto = count_tags(all_task_data, "<CI>")
print(f"[Manual] N_MCP of all tasks = {sum(n_tags_manu)}")
print(f"[Automatic] N_MCP of all tasks = {sum(n_tags_auto)}")

[Manual] N_MCP of all tasks = 29936
[Automatic] N_MCP of all tasks = 36585


### 3.4. Count End-Clause Pauses

The following code block counts the number of ECP in monologue tasks.

In [17]:
for monologue_task in MONOLOGUE_TASK:
    annotation_results = dataset["monologue"][monologue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<CE>")

    print(f"[Manual] N_ECP of {monologue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_ECP of {monologue_task} = {sum(n_tags_auto)}")

n_tags_manu, n_tags_auto = count_tags(monologue_data, "<CE>")
print(f"[Manual] N_ECP of monologue task = {sum(n_tags_manu)}")
print(f"[Automatic] N_ECP of monologue task = {sum(n_tags_auto)}")

[Manual] N_ECP of Arg_Oly = 1142
[Automatic] N_ECP of Arg_Oly = 1045
[Manual] N_ECP of Cartoon = 1985
[Automatic] N_ECP of Cartoon = 1874
[Manual] N_ECP of RtSwithoutRAA = 1835
[Automatic] N_ECP of RtSwithoutRAA = 1692
[Manual] N_ECP of RtSwithRAA = 1842
[Automatic] N_ECP of RtSwithRAA = 1634
[Manual] N_ECP of monologue task = 6804
[Automatic] N_ECP of monologue task = 6245


The following code block counts the number of ECP in dialogue tasks.

In [18]:
for dialogue_task in DIALOGUE_TASK:
    annotation_results = dataset["dialogue"][dialogue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<CE>")

    print(f"[Manual] N_ECP of {dialogue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_ECP of {dialogue_task} = {sum(n_tags_auto)}")

[Manual] N_ECP of WoZ_Interview = 2414
[Automatic] N_ECP of WoZ_Interview = 2680


The following code block counts the number of ECP across all tasks.

In [19]:
n_tags_manu, n_tags_auto = count_tags(all_task_data, "<CE>")
print(f"[Manual] N_ECP of all tasks = {sum(n_tags_manu)}")
print(f"[Automatic] N_ECP of all tasks = {sum(n_tags_auto)}")

[Manual] N_ECP of all tasks = 9218
[Automatic] N_ECP of all tasks = 8925


---

## 4. Summary

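As a result, the following table was obtained. The Auto_BERT counts are the [Automatic] values printed in Section 3; the Auto_RoBERTa counts are not produced by the code shown in this notebook.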

| Task | WER | N disfluency (Manual / Auto_RoBERTa / Auto_BERT) | N MCP (Manual / Auto_RoBERTa / Auto_BERT) | N ECP (Manual / Auto_RoBERTa / Auto_BERT) |
| - | - | - | - | - |
| Arg_Oly | 16.1% | 1,876 / 1,793 / 1,265 | 4,302 / 4,923 / 4,879 | 1,142 / 1,001 / 1,045 |
| Cartoon | 13.9% | 2,813 / 2,517 / 1,964 | 5,559 / 6,739 / 6,675 | 1,985 / 1,810 / 1,874 |
| RtSwithoutRAA | 19.9% | 2,949 / 2,542 / 1,812 | 6,169 / 7,367 / 7,302 | 1,835 / 1,627 / 1,692 |
| RtSwithRAA | 20.2% | 2,887 / 2,556 / 1,687 | 6,332 / 7,502 / 7,441 | 1,842 / 1,573 / 1,634 |
| Monologue | 17.6% | 10,525 / 9,408 / 6,728 | 22,362 / 26,531 / 26,297 | 6,804 / 6,011 / 6,245 |
| WoZ_Interview | 15.0% | 3,935 / 3,514 / 2,577 | 7,574 / 10,322 / 10,288 | 2,414 / 2,646 / 2,680 |
| ALL | 16.8% | 14,460 / 12,922 / 9,305 | 29,936 / 36,853 / 36,585 | 9,218 / 8,657 / 8,925 |
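
To read the ALL row at a glance, the automatic count can be expressed as a fraction of the manual count for each tag. The sketch below does this for the BERT-based counts printed in Section 3; the dictionary layout is an assumption made for illustration. Note that a ratio of totals only compares how many tags each annotation contains. It does not measure agreement on individual tags, which would require aligning the two annotations.

```python
# (manual, automatic) totals for all tasks, copied from the Section 3 outputs.
COUNTS = {
    "disfluency": {"manual": 14460, "automatic": 9305},
    "MCP": {"manual": 29936, "automatic": 36585},
    "ECP": {"manual": 9218, "automatic": 8925},
}

# Ratio below 1: fewer automatic tags than manual; above 1: more.
ratios = {tag: c["automatic"] / c["manual"] for tag, c in COUNTS.items()}

for tag, ratio in ratios.items():
    print(f"{tag}: automatic / manual = {ratio:.3f}")
```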

## 5. Additional Analyses

This section conducts the same analyses for each PF group.
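
The rating bands passed to `load_dataset` in the subsections below differ between monologue and dialogue tasks, so the same numeric rating can fall into different groups. The following sketch collects the bands in one place; `PF_GROUPS` and `group_of` are hypothetical names introduced for illustration, while the band values are copied from the `load_dataset` calls in Sections 5.1 to 5.3.

```python
from typing import Dict, List, Optional

# Rating bands copied from the load_dataset(...) calls in Sections 5.1-5.3.
PF_GROUPS: Dict[str, Dict[str, List[int]]] = {
    "beginner": {"monologue": [0, 1, 2], "dialogue": [0, 1]},
    "intermediate": {"monologue": [3, 4, 5], "dialogue": [2, 3]},
    "advanced": {"monologue": [6, 7, 8], "dialogue": [4, 5]},
}


def group_of(rating: int, mode: str) -> Optional[str]:
    """Return the PF group whose band contains `rating`, or None if none does."""
    for group, bands in PF_GROUPS.items():
        if rating in bands[mode]:
            return group
    return None


# The same rating maps to different groups depending on the task type.
print(group_of(4, "monologue"), group_of(4, "dialogue"))
```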

### 5.1. Beginners

The following code block loads beginners' speech.

In [5]:
beginner_dataset = load_dataset(rating_filter_monologue=[0, 1, 2], rating_filter_dialogue=[0, 1])

The following code block calculates WER of beginners' speech.

In [6]:
monologue_data = []
for monologue_task in MONOLOGUE_TASK:
    annotation_results = beginner_dataset["monologue"][monologue_task]

    res = calculate_wer(annotation_results, remove_filer=True)

    print(f"WER of {monologue_task} = {res}")

    monologue_data += annotation_results

res = calculate_wer(monologue_data, remove_filer=True)
print(f"WER of monologue task = {res}")

dialogue_data = []
for dialogue_task in DIALOGUE_TASK:
    annotation_results = beginner_dataset["dialogue"][dialogue_task]

    res = calculate_wer(annotation_results, remove_filer=True)

    print(f"WER of {dialogue_task} = {res}")

    dialogue_data += annotation_results

all_task_data = monologue_data + dialogue_data

res = calculate_wer(all_task_data, remove_filer=True)
print(f"WER of all tasks = {res}")

WER of Arg_Oly = 0.20163060520539355
WER of Cartoon = 0.2280612244897959
WER of RtSwithoutRAA = 0.24742773150416464
WER of RtSwithRAA = 0.27688442211055275
WER of monologue task = 0.23625345334640407
WER of WoZ_Interview = 0.2485251852972319
WER of all tasks = 0.2408030506953791


The following code block counts disfluency tags in beginners' speech.

In [7]:
for monologue_task in MONOLOGUE_TASK:
    annotation_results = beginner_dataset["monologue"][monologue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<DISFLUENCY>")

    print(f"[Manual] N_disfluency of {monologue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_disfluency of {monologue_task} = {sum(n_tags_auto)}")

n_tags_manu, n_tags_auto = count_tags(monologue_data, "<DISFLUENCY>")
print(f"[Manual] N_disfluency of monologue task = {sum(n_tags_manu)}")
print(f"[Automatic] N_disfluency of monologue task = {sum(n_tags_auto)}")

for dialogue_task in DIALOGUE_TASK:
    annotation_results = beginner_dataset["dialogue"][dialogue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<DISFLUENCY>")

    print(f"[Manual] N_disfluency of {dialogue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_disfluency of {dialogue_task} = {sum(n_tags_auto)}")

[Manual] N_disfluency of Arg_Oly = 609
[Automatic] N_disfluency of Arg_Oly = 403
[Manual] N_disfluency of Cartoon = 386
[Automatic] N_disfluency of Cartoon = 290
[Manual] N_disfluency of RtSwithoutRAA = 817
[Automatic] N_disfluency of RtSwithoutRAA = 545
[Manual] N_disfluency of RtSwithRAA = 374
[Automatic] N_disfluency of RtSwithRAA = 249
[Manual] N_disfluency of monologue task = 2186
[Automatic] N_disfluency of monologue task = 1487
[Manual] N_disfluency of WoZ_Interview = 955
[Automatic] N_disfluency of WoZ_Interview = 573


The following code block counts MCP tags in beginners' speech.

In [8]:
for monologue_task in MONOLOGUE_TASK:
    annotation_results = beginner_dataset["monologue"][monologue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<CI>")

    print(f"[Manual] N_MCP of {monologue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_MCP of {monologue_task} = {sum(n_tags_auto)}")

n_tags_manu, n_tags_auto = count_tags(monologue_data, "<CI>")
print(f"[Manual] N_MCP of monologue task = {sum(n_tags_manu)}")
print(f"[Automatic] N_MCP of monologue task = {sum(n_tags_auto)}")

for dialogue_task in DIALOGUE_TASK:
    annotation_results = beginner_dataset["dialogue"][dialogue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<CI>")

    print(f"[Manual] N_MCP of {dialogue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_MCP of {dialogue_task} = {sum(n_tags_auto)}")

[Manual] N_MCP of Arg_Oly = 1256
[Automatic] N_MCP of Arg_Oly = 1269
[Manual] N_MCP of Cartoon = 812
[Automatic] N_MCP of Cartoon = 824
[Manual] N_MCP of RtSwithoutRAA = 1700
[Automatic] N_MCP of RtSwithoutRAA = 1921
[Manual] N_MCP of RtSwithRAA = 914
[Automatic] N_MCP of RtSwithRAA = 995
[Manual] N_MCP of monologue task = 4682
[Automatic] N_MCP of monologue task = 5009
[Manual] N_MCP of WoZ_Interview = 1913
[Automatic] N_MCP of WoZ_Interview = 2061


The following code block counts ECP tags in beginners' speech.

In [9]:
for monologue_task in MONOLOGUE_TASK:
    annotation_results = beginner_dataset["monologue"][monologue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<CE>")

    print(f"[Manual] N_ECP of {monologue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_ECP of {monologue_task} = {sum(n_tags_auto)}")

n_tags_manu, n_tags_auto = count_tags(monologue_data, "<CE>")
print(f"[Manual] N_ECP of monologue task = {sum(n_tags_manu)}")
print(f"[Automatic] N_ECP of monologue task = {sum(n_tags_auto)}")

for dialogue_task in DIALOGUE_TASK:
    annotation_results = beginner_dataset["dialogue"][dialogue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<CE>")

    print(f"[Manual] N_ECP of {dialogue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_ECP of {dialogue_task} = {sum(n_tags_auto)}")

[Manual] N_ECP of Arg_Oly = 268
[Automatic] N_ECP of Arg_Oly = 231
[Manual] N_ECP of Cartoon = 232
[Automatic] N_ECP of Cartoon = 205
[Manual] N_ECP of RtSwithoutRAA = 402
[Automatic] N_ECP of RtSwithoutRAA = 387
[Manual] N_ECP of RtSwithRAA = 211
[Automatic] N_ECP of RtSwithRAA = 182
[Manual] N_ECP of monologue task = 1113
[Automatic] N_ECP of monologue task = 1005
[Manual] N_ECP of WoZ_Interview = 472
[Automatic] N_ECP of WoZ_Interview = 451


### 5.2. Intermediate

The following code block loads intermediate group's speech.

In [10]:
intemediate_dataset = load_dataset(rating_filter_monologue=[3, 4, 5], rating_filter_dialogue=[2, 3])

The following code block calculates WER of intermediate learners' speech.

In [11]:
monologue_data = []
for monologue_task in MONOLOGUE_TASK:
    annotation_results = intemediate_dataset["monologue"][monologue_task]

    res = calculate_wer(annotation_results, remove_filer=True)

    print(f"WER of {monologue_task} = {res}")

    monologue_data += annotation_results

res = calculate_wer(monologue_data, remove_filer=True)
print(f"WER of monologue task = {res}")

dialogue_data = []
for dialogue_task in DIALOGUE_TASK:
    annotation_results = intemediate_dataset["dialogue"][dialogue_task]

    res = calculate_wer(annotation_results, remove_filer=True)

    print(f"WER of {dialogue_task} = {res}")

    dialogue_data += annotation_results

all_task_data = monologue_data + dialogue_data

res = calculate_wer(all_task_data, remove_filer=True)
print(f"WER of all tasks = {res}")

WER of Arg_Oly = 0.15636908002177463
WER of Cartoon = 0.13635947652603642
WER of RtSwithoutRAA = 0.20490001047010784
WER of RtSwithRAA = 0.20781302733715828
WER of monologue task = 0.17790715257825496
WER of WoZ_Interview = 0.14034321645342998
WER of all tasks = 0.16416050112657601


The following code block counts disfluency tags in intermediate learners' speech.

In [12]:
for monologue_task in MONOLOGUE_TASK:
    annotation_results = intemediate_dataset["monologue"][monologue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<DISFLUENCY>")

    print(f"[Manual] N_disfluency of {monologue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_disfluency of {monologue_task} = {sum(n_tags_auto)}")

n_tags_manu, n_tags_auto = count_tags(monologue_data, "<DISFLUENCY>")
print(f"[Manual] N_disfluency of monologue task = {sum(n_tags_manu)}")
print(f"[Automatic] N_disfluency of monologue task = {sum(n_tags_auto)}")

for dialogue_task in DIALOGUE_TASK:
    annotation_results = intemediate_dataset["dialogue"][dialogue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<DISFLUENCY>")

    print(f"[Manual] N_disfluency of {dialogue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_disfluency of {dialogue_task} = {sum(n_tags_auto)}")

[Manual] N_disfluency of Arg_Oly = 958
[Automatic] N_disfluency of Arg_Oly = 668
[Manual] N_disfluency of Cartoon = 1729
[Automatic] N_disfluency of Cartoon = 1233
[Manual] N_disfluency of RtSwithoutRAA = 1446
[Automatic] N_disfluency of RtSwithoutRAA = 837
[Manual] N_disfluency of RtSwithRAA = 1801
[Automatic] N_disfluency of RtSwithRAA = 1049
[Manual] N_disfluency of monologue task = 5934
[Automatic] N_disfluency of monologue task = 3787
[Manual] N_disfluency of WoZ_Interview = 2662
[Automatic] N_disfluency of WoZ_Interview = 1790


The following code block counts MCP tags in intermediate learners' speech.

In [13]:
for monologue_task in MONOLOGUE_TASK:
    annotation_results = intemediate_dataset["monologue"][monologue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<CI>")

    print(f"[Manual] N_MCP of {monologue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_MCP of {monologue_task} = {sum(n_tags_auto)}")

n_tags_manu, n_tags_auto = count_tags(monologue_data, "<CI>")
print(f"[Manual] N_MCP of monologue task = {sum(n_tags_manu)}")
print(f"[Automatic] N_MCP of monologue task = {sum(n_tags_auto)}")

for dialogue_task in DIALOGUE_TASK:
    annotation_results = intemediate_dataset["dialogue"][dialogue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<CI>")

    print(f"[Manual] N_MCP of {dialogue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_MCP of {dialogue_task} = {sum(n_tags_auto)}")

[Manual] N_MCP of Arg_Oly = 2268
[Automatic] N_MCP of Arg_Oly = 2623
[Manual] N_MCP of Cartoon = 3622
[Automatic] N_MCP of Cartoon = 4378
[Manual] N_MCP of RtSwithoutRAA = 3061
[Automatic] N_MCP of RtSwithoutRAA = 3671
[Manual] N_MCP of RtSwithRAA = 4001
[Automatic] N_MCP of RtSwithRAA = 4707
[Manual] N_MCP of monologue task = 12952
[Automatic] N_MCP of monologue task = 15379
[Manual] N_MCP of WoZ_Interview = 4911
[Automatic] N_MCP of WoZ_Interview = 6996


The following code block counts ECP tags in intermediate learners' speech.

In [14]:
for monologue_task in MONOLOGUE_TASK:
    annotation_results = intemediate_dataset["monologue"][monologue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<CE>")

    print(f"[Manual] N_ECP of {monologue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_ECP of {monologue_task} = {sum(n_tags_auto)}")

n_tags_manu, n_tags_auto = count_tags(monologue_data, "<CE>")
print(f"[Manual] N_ECP of monologue task = {sum(n_tags_manu)}")
print(f"[Automatic] N_ECP of monologue task = {sum(n_tags_auto)}")

for dialogue_task in DIALOGUE_TASK:
    annotation_results = intemediate_dataset["dialogue"][dialogue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<CE>")

    print(f"[Manual] N_ECP of {dialogue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_ECP of {dialogue_task} = {sum(n_tags_auto)}")

[Manual] N_ECP of Arg_Oly = 608
[Automatic] N_ECP of Arg_Oly = 555
[Manual] N_ECP of Cartoon = 1186
[Automatic] N_ECP of Cartoon = 1119
[Manual] N_ECP of RtSwithoutRAA = 955
[Automatic] N_ECP of RtSwithoutRAA = 858
[Manual] N_ECP of RtSwithRAA = 1154
[Automatic] N_ECP of RtSwithRAA = 1044
[Manual] N_ECP of monologue task = 3903
[Automatic] N_ECP of monologue task = 3576
[Manual] N_ECP of WoZ_Interview = 1682
[Automatic] N_ECP of WoZ_Interview = 1839


### 5.3. Advanced

The following code block loads advanced learners' speech.

In [15]:
advanced_dataset = load_dataset(rating_filter_monologue=[6, 7, 8], rating_filter_dialogue=[4, 5])

The following code block calculates WER of advanced learners' speech.

In [16]:
monologue_data = []
for monologue_task in MONOLOGUE_TASK:
    annotation_results = advanced_dataset["monologue"][monologue_task]

    res = calculate_wer(annotation_results, remove_filer=True)

    print(f"WER of {monologue_task} = {res}")

    monologue_data += annotation_results

res = calculate_wer(monologue_data, remove_filer=True)
print(f"WER of monologue task = {res}")

dialogue_data = []
for dialogue_task in DIALOGUE_TASK:
    annotation_results = advanced_dataset["dialogue"][dialogue_task]

    res = calculate_wer(annotation_results, remove_filer=True)

    print(f"WER of {dialogue_task} = {res}")

    dialogue_data += annotation_results

all_task_data = monologue_data + dialogue_data

res = calculate_wer(all_task_data, remove_filer=True)
print(f"WER of all tasks = {res}")

WER of Arg_Oly = 0.13485070974057758
WER of Cartoon = 0.11321073055508689
WER of RtSwithoutRAA = 0.15263628239499552
WER of RtSwithRAA = 0.16176742466259939
WER of monologue task = 0.14041014416900605
WER of WoZ_Interview = 0.081675562024907
WER of all tasks = 0.12705882352941175


The following code block counts disfluency tags in advanced learners' speech.

In [17]:
for monologue_task in MONOLOGUE_TASK:
    annotation_results = advanced_dataset["monologue"][monologue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<DISFLUENCY>")

    print(f"[Manual] N_disfluency of {monologue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_disfluency of {monologue_task} = {sum(n_tags_auto)}")

n_tags_manu, n_tags_auto = count_tags(monologue_data, "<DISFLUENCY>")
print(f"[Manual] N_disfluency of monologue task = {sum(n_tags_manu)}")
print(f"[Automatic] N_disfluency of monologue task = {sum(n_tags_auto)}")

for dialogue_task in DIALOGUE_TASK:
    annotation_results = advanced_dataset["dialogue"][dialogue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<DISFLUENCY>")

    print(f"[Manual] N_disfluency of {dialogue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_disfluency of {dialogue_task} = {sum(n_tags_auto)}")

[Manual] N_disfluency of Arg_Oly = 309
[Automatic] N_disfluency of Arg_Oly = 194
[Manual] N_disfluency of Cartoon = 698
[Automatic] N_disfluency of Cartoon = 441
[Manual] N_disfluency of RtSwithoutRAA = 686
[Automatic] N_disfluency of RtSwithoutRAA = 430
[Manual] N_disfluency of RtSwithRAA = 712
[Automatic] N_disfluency of RtSwithRAA = 389
[Manual] N_disfluency of monologue task = 2405
[Automatic] N_disfluency of monologue task = 1454
[Manual] N_disfluency of WoZ_Interview = 318
[Automatic] N_disfluency of WoZ_Interview = 214


The following code block counts MCP tags in advanced learners' speech.

In [18]:
for monologue_task in MONOLOGUE_TASK:
    annotation_results = advanced_dataset["monologue"][monologue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<CI>")

    print(f"[Manual] N_MCP of {monologue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_MCP of {monologue_task} = {sum(n_tags_auto)}")

n_tags_manu, n_tags_auto = count_tags(monologue_data, "<CI>")
print(f"[Manual] N_MCP of monologue task = {sum(n_tags_manu)}")
print(f"[Automatic] N_MCP of monologue task = {sum(n_tags_auto)}")

for dialogue_task in DIALOGUE_TASK:
    annotation_results = advanced_dataset["dialogue"][dialogue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<CI>")

    print(f"[Manual] N_MCP of {dialogue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_MCP of {dialogue_task} = {sum(n_tags_auto)}")

[Manual] N_MCP of Arg_Oly = 778
[Automatic] N_MCP of Arg_Oly = 987
[Manual] N_MCP of Cartoon = 1125
[Automatic] N_MCP of Cartoon = 1473
[Manual] N_MCP of RtSwithoutRAA = 1408
[Automatic] N_MCP of RtSwithoutRAA = 1710
[Manual] N_MCP of RtSwithRAA = 1417
[Automatic] N_MCP of RtSwithRAA = 1739
[Manual] N_MCP of monologue task = 4728
[Automatic] N_MCP of monologue task = 5909
[Manual] N_MCP of WoZ_Interview = 750
[Automatic] N_MCP of WoZ_Interview = 1231


The following code block counts ECP tags in advanced learners' speech.

In [19]:
for monologue_task in MONOLOGUE_TASK:
    annotation_results = advanced_dataset["monologue"][monologue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<CE>")

    print(f"[Manual] N_ECP of {monologue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_ECP of {monologue_task} = {sum(n_tags_auto)}")

n_tags_manu, n_tags_auto = count_tags(monologue_data, "<CE>")
print(f"[Manual] N_ECP of monologue task = {sum(n_tags_manu)}")
print(f"[Automatic] N_ECP of monologue task = {sum(n_tags_auto)}")

for dialogue_task in DIALOGUE_TASK:
    annotation_results = advanced_dataset["dialogue"][dialogue_task]

    n_tags_manu, n_tags_auto = count_tags(annotation_results, "<CE>")

    print(f"[Manual] N_ECP of {dialogue_task} = {sum(n_tags_manu)}")
    print(f"[Automatic] N_ECP of {dialogue_task} = {sum(n_tags_auto)}")

[Manual] N_ECP of Arg_Oly = 266
[Automatic] N_ECP of Arg_Oly = 259
[Manual] N_ECP of Cartoon = 567
[Automatic] N_ECP of Cartoon = 550
[Manual] N_ECP of RtSwithoutRAA = 478
[Automatic] N_ECP of RtSwithoutRAA = 447
[Manual] N_ECP of RtSwithRAA = 477
[Automatic] N_ECP of RtSwithRAA = 408
[Manual] N_ECP of monologue task = 1788
[Automatic] N_ECP of monologue task = 1664
[Manual] N_ECP of WoZ_Interview = 260
[Automatic] N_ECP of WoZ_Interview = 390
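
As a sanity check on the grouping, the three PF groups should partition the speakers, so the group counts should add up to the counts obtained for all learners in Section 3. The sketch below verifies this for the monologue and `WoZ_Interview` totals. All numbers are copied from the printed outputs above; the dictionary layout and the names `ALL_LEARNERS`, `BY_GROUP`, and `mismatches` are assumptions made for illustration.

```python
# (manual, automatic) counts copied from the printed outputs in this notebook.
ALL_LEARNERS = {
    ("monologue", "disfluency"): (10525, 6728),
    ("monologue", "MCP"): (22362, 26297),
    ("monologue", "ECP"): (6804, 6245),
    ("WoZ_Interview", "disfluency"): (3935, 2577),
    ("WoZ_Interview", "MCP"): (7574, 10288),
    ("WoZ_Interview", "ECP"): (2414, 2680),
}

BY_GROUP = {
    "beginner": {
        ("monologue", "disfluency"): (2186, 1487),
        ("monologue", "MCP"): (4682, 5009),
        ("monologue", "ECP"): (1113, 1005),
        ("WoZ_Interview", "disfluency"): (955, 573),
        ("WoZ_Interview", "MCP"): (1913, 2061),
        ("WoZ_Interview", "ECP"): (472, 451),
    },
    "intermediate": {
        ("monologue", "disfluency"): (5934, 3787),
        ("monologue", "MCP"): (12952, 15379),
        ("monologue", "ECP"): (3903, 3576),
        ("WoZ_Interview", "disfluency"): (2662, 1790),
        ("WoZ_Interview", "MCP"): (4911, 6996),
        ("WoZ_Interview", "ECP"): (1682, 1839),
    },
    "advanced": {
        ("monologue", "disfluency"): (2405, 1454),
        ("monologue", "MCP"): (4728, 5909),
        ("monologue", "ECP"): (1788, 1664),
        ("WoZ_Interview", "disfluency"): (318, 214),
        ("WoZ_Interview", "MCP"): (750, 1231),
        ("WoZ_Interview", "ECP"): (260, 390),
    },
}

# Collect every (task, tag) whose group counts do not sum to the overall count.
mismatches = []
for key, total in ALL_LEARNERS.items():
    summed = tuple(sum(BY_GROUP[g][key][i] for g in BY_GROUP) for i in range(2))
    if summed != total:
        mismatches.append((key, summed, total))

print("mismatches:", mismatches)
```

An empty list indicates that no file was counted in more than one group or dropped by the rating filter for these totals.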
