# 04_eval – Final Eval Run: Final Configuration for the Submission

## Context

This notebook is the **final eval run**: it generates the submission VTTs
on the unseen eval set, using the optimal configuration found in `02k_`
(identical to `04_dev_final_results`, except `SPLIT = "EVAL"`).

## Technical Note: OOM Aborts

The eval set is larger than the dev set. The first run failed with an
out-of-memory (OOM) error at `session_129`. The missing sessions were caught
up in **three further runs**:

| Run | Cells | Sessions attempted | Outcome | Cause / change |
|-----|-------|--------------------|---------|----------------|
| 1 | 10 | All sessions | Aborted at session_129 | OOM |
| 2 | 11–12 | 38 missing sessions | session_148 and session_80 failed | OOM |
| 3 | 13–14 | session_148, session_80 | session_148 failed | OOM |
| 4 | 15–19 | session_148 | Success | `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` |

The fourth run sets `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True`, which
lets PyTorch's CUDA caching allocator expand existing memory segments and
makes it less prone to fragmentation.
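
A minimal sketch of how this allocator option could be set from inside a notebook, assuming it runs in a fresh kernel before `torch` is imported, so that the allocator sees the variable when it initializes. The helper name `enable_expandable_segments` is hypothetical and not part of the original notebook.

```python
import os


def enable_expandable_segments(env=os.environ):
    """Add expandable_segments:True to PYTORCH_CUDA_ALLOC_CONF, keeping other options.

    Hypothetical helper; call it before `import torch` in a fresh kernel.
    """
    key = "PYTORCH_CUDA_ALLOC_CONF"
    # The variable holds a comma-separated list of option:value pairs.
    options = [opt for opt in env.get(key, "").split(",") if opt]
    # Drop any previous expandable_segments entry so the setting is not duplicated.
    options = [opt for opt in options if not opt.startswith("expandable_segments:")]
    options.append("expandable_segments:True")
    env[key] = ",".join(options)
    return env[key]


print(enable_expandable_segments())
```

Keeping existing options intact matters when other allocator settings are already configured in the environment.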

## Configuration

Identical to `04_dev_final_results` (min_on=1.0, min_off=1.2, beam=12, len=20, BL4).
No WER evaluation: the eval set has no labels, so results only become known after submission.

## 1 – GPU Check

In [1]:
!nvidia-smi

Fri Feb  6 02:27:45 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA A100-SXM4-80GB          On  |   00000000:01:00.0 Off |                    0 |
| N/A   41C    P0             84W /  500W |   25069MiB /  81920MiB |      6%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A100-SXM4-80GB          On  |   00

In [1]:
import os

# Select the physical GPU (must be set before CUDA is initialized)
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "3"
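
With `CUDA_DEVICE_ORDER=PCI_BUS_ID`, the indices in `CUDA_VISIBLE_DEVICES` follow the PCI bus ordering that `nvidia-smi` uses, and the visible devices are renumbered from 0 inside the process. The sketch below only illustrates that renumbering for plain integer indices; `logical_to_physical` is a hypothetical helper and does not query CUDA.

```python
def logical_to_physical(visible_devices: str) -> dict:
    """Map in-process device indices to the physical indices listed in CUDA_VISIBLE_DEVICES.

    Illustration only: handles plain integer indices, not GPU UUIDs.
    """
    physical = [int(tok) for tok in visible_devices.split(",") if tok.strip()]
    return dict(enumerate(physical))


# With the setting used in this notebook, physical GPU 3 becomes device 0.
print(logical_to_physical("3"))
```

This is why the verification cell reports a single CUDA device and addresses it as device 0.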

## 2 – CUDA Verification

In [9]:
import torch, gc, glob
from pathlib import Path
import pandas as pd

In [3]:
print("CUDA devices:", torch.cuda.device_count())
print("Device 0 name (should be A100):", torch.cuda.get_device_name(0))

CUDA devices: 1
Device 0 name (should be A100): NVIDIA A100-SXM4-80GB


## 3 – Setup: Working Directory & Imports

In [2]:
project_baseline_path = "/home/josch080/Projektgruppe/mcorec_baseline"
os.chdir(project_baseline_path)

import sys
if project_baseline_path not in sys.path:
    sys.path.append(project_baseline_path)

from script.pg_utils_experiments import run_inference_for_experiment, append_eval_results_for_experiments



## 4 – Segmentation Patch (identical to `04_dev_`)

In [3]:
from contextlib import contextmanager
import script.inference as inf  # the module that pg_utils_experiments uses

@contextmanager
def patch_avsr_segmentation(min_on=None, min_off=None):
    """Temporarily override the ASD segmentation parameters of InferenceEngine.chunk_video."""
    orig = inf.InferenceEngine.chunk_video

    # Nothing to override: leave the original method in place.
    if min_on is None and min_off is None:
        yield
        return

    def patched(self, video_path, asd_path=None, max_length=15):
        if asd_path is not None:
            with open(asd_path, "r") as f:
                asd = inf.json.load(f)

            # The ASD keys are frame indices stored as strings.
            frames = sorted([int(k) for k in asd.keys()])
            if not frames:
                return []
            min_frame = min(frames)

            params = {"max_chunk_size": max_length}
            if min_on is not None:
                params["min_duration_on"] = float(min_on)
            if min_off is not None:
                params["min_duration_off"] = float(min_off)

            segments_by_frames = inf.segment_by_asd(asd, params)
            # Convert frame indices to seconds relative to the first frame (divides by 25 frames per second).
            return [((seg[0] - min_frame) / 25, (seg[-1] - min_frame) / 25) for seg in segments_by_frames]

        # Without ASD data, fall back to the original chunking.
        return orig(self, video_path, asd_path, max_length)

    inf.InferenceEngine.chunk_video = patched
    try:
        yield
    finally:
        # Always restore the original method, even if inference raises.
        inf.InferenceEngine.chunk_video = orig

def _tag(x: float) -> str:
    return str(x).replace(".", "p")
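
The patched `chunk_video` turns each frame-index segment into a `(start, end)` pair in seconds, measured from the first ASD frame. A minimal standalone sketch of that arithmetic, assuming each segment is a list of frame indices and a frame rate of 25 fps (the constant used in the patch); the function name `segments_to_seconds` is hypothetical:

```python
def segments_to_seconds(segments_by_frames, min_frame, fps=25):
    """Turn lists of frame indices into (start_s, end_s) tuples relative to min_frame."""
    return [
        ((seg[0] - min_frame) / fps, (seg[-1] - min_frame) / fps)
        for seg in segments_by_frames
    ]


# Two toy segments; the first ASD frame is 100.
print(segments_to_seconds([[100, 101, 125], [150, 200]], min_frame=100))
# → [(0.0, 1.0), (2.0, 4.0)]
```

Only the first and last frame of each segment enter the result, so gaps inside a segment do not affect the chunk boundaries.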

## 5 – Split Configuration

`SPLIT = "EVAL"` → `data-bin/eval` (the difference from `04_dev_`).

In [4]:
SPLIT = "EVAL"
DATA_ROOT = "data-bin/dev" if SPLIT == "DEV" else "data-bin/eval"  # adjust if needed

## 6 – Experiment Definition

Identical to `04_dev_`, except for the `SPLIT` prefix.

In [5]:
beam_size  = 12
max_length = 20
best_min_on  = 1.0
best_min_off = 1.2

exp_name = f"{SPLIT}_final_bugfix_mdOn{_tag(best_min_on)}_mdOff{_tag(best_min_off)}_bs{beam_size}_len{max_length}"
BASE_MODELS = {
    "avsr_cocktail_finetuned": {
        "model_type": "avsr_cocktail",
        "chkpt": "model-bin/avsr_cocktail_mcorec_finetune",
    },
}
EXPERIMENTS = {
    exp_name: {
        "base_model": "avsr_cocktail_finetuned",
        "beam_size": beam_size,
        "max_length": max_length,
        "comment": f"{SPLIT} FINAL: AVSR override min_on={best_min_on}, min_off={best_min_off}",
    }
}

## 7 – Session Discovery

In [13]:
# Find the sessions for the split
session_dirs = sorted([p for p in glob.glob(f"{DATA_ROOT}/session_*")
                       if Path(p, "metadata.json").exists()])
session_ids = [Path(p).name for p in session_dirs]
print("Split:", SPLIT, "| Sessions:", len(session_ids))
print(session_ids)

Split: EVAL | Sessions: 67
['session_05', 'session_06', 'session_07', 'session_08', 'session_09', 'session_10', 'session_107', 'session_108', 'session_109', 'session_11', 'session_110', 'session_111', 'session_112', 'session_113', 'session_114', 'session_115', 'session_116', 'session_117', 'session_118', 'session_119', 'session_12', 'session_120', 'session_121', 'session_122', 'session_123', 'session_124', 'session_125', 'session_126', 'session_127', 'session_128', 'session_129', 'session_13', 'session_130', 'session_131', 'session_14', 'session_142', 'session_143', 'session_144', 'session_145', 'session_146', 'session_147', 'session_148', 'session_149', 'session_15', 'session_151', 'session_16', 'session_17', 'session_18', 'session_19', 'session_31', 'session_32', 'session_34', 'session_35', 'session_36', 'session_37', 'session_38', 'session_39', 'session_73', 'session_74', 'session_75', 'session_76', 'session_77', 'session_78', 'session_79', 'session_80', 'session_81', 'session_82']
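
The list above is sorted lexicographically, so `session_107` comes before `session_11`, and the inference loop visits the sessions in that order. If numeric order is preferred, for example to make it easier to reason about which sessions come after a failed one, a key function could be used. This is an optional sketch, not what the notebook does; `session_number` is a hypothetical helper:

```python
def session_number(session_id: str) -> int:
    """Extract the trailing integer from an ID such as 'session_107'."""
    return int(session_id.rsplit("_", 1)[1])


ids = ["session_05", "session_107", "session_11", "session_129", "session_13"]
print(sorted(ids))                      # lexicographic order, as in the notebook
print(sorted(ids, key=session_number))  # numeric order
```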


## 8 – Inference: First Run (All Sessions)

The run aborted at `session_129` with an OOM error. All sessions from
`session_122` onward have to be caught up in the second run.
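
Because the loop below has no error handling, a single OOM error stops the whole run. A hedged sketch of an alternative that records failed sessions and continues with the rest; `run_all` and `run_one` are hypothetical names, and the OOM check assumes the error surfaces as a `RuntimeError` whose message contains "out of memory" (PyTorch's `torch.cuda.OutOfMemoryError` is a `RuntimeError` subclass):

```python
def run_all(session_dirs, run_one):
    """Run `run_one(session_dir)` for each session and return the sessions that hit OOM."""
    failed = []
    for session_dir in session_dirs:
        try:
            run_one(session_dir)
        except RuntimeError as err:
            # Only swallow OOM errors; anything else should still stop the run.
            if "out of memory" not in str(err).lower():
                raise
            failed.append(session_dir)
    return failed


# Toy stand-in for the real inference call: one session "runs out of memory".
def fake_run(session_dir):
    if session_dir.endswith("session_148"):
        raise RuntimeError("CUDA out of memory.")


print(run_all(["data-bin/eval/session_147", "data-bin/eval/session_148"], fake_run))
# → ['data-bin/eval/session_148']
```

In the notebook, `run_one` would wrap the `patch_avsr_segmentation` block together with `torch.cuda.empty_cache()` and `gc.collect()`, and the returned list could feed the next catch-up run.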

In [10]:
# ======= Inference per session =======
for session_dir in session_dirs:
    print("\nRUN:", SPLIT, Path(session_dir).name)
    with patch_avsr_segmentation(min_on=best_min_on, min_off=best_min_off):
        run_inference_for_experiment(
            exp_name=exp_name,
            base_models=BASE_MODELS,
            experiments=EXPERIMENTS,
            session_dir=session_dir,
        )
    torch.cuda.empty_cache()
    gc.collect()


RUN: EVAL session_05

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_05
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...




Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_05


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:   0%|          | 0/30 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]
Processing speaker spk_2 track 1 of 2: 100%|██████████| 8/8 [00:08<00:00,  1.12s/it]
Processing speaker spk_2 track 2 of 2:   0%|          | 0/21 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 1:   0%|          | 0/23 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:   0%|          | 0/26 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]


RUN: EVAL session_06

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_06
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_06


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:   0%|          | 0/28 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
Processing speaker spk_2 track 1 of 1:   0%|          | 0/23 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:   0%|          | 0/24 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 1:   0%|          | 0/14 [00:00<?, ?it/s]


RUN: EVAL session_07

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_07
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_07


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:   0%|          | 0/23 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:   0%|          | 0/22 [00:00<?, ?it/s]
Processing speaker spk_2 track 1 of 1:   0%|          | 0/23 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 3: 0it [00:00, ?it/s]
Processing speaker spk_3 track 2 of 3:   0%|          | 0/15 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:   0%|          | 0/26 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 1:   0%|          | 0/30 [00:00<?, ?it/s]


RUN: EVAL session_08

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_08
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_08


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 1:   0%|          | 0/23 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 1:   4%|▍         | 1/23 [00:05<01:50,  5.02s/it]
[Acessing speaker spk_0 track 1 of 1:   9%|▊         | 2/23 [00:10<01:56,  5.54s/it]
[Acessing speaker spk_0 track 1 of 1:  13%|█▎        | 3/23 [00:13<01:24,  4.25s/it]
[Acessing speaker spk_0 track 1 of 1:  17%|█▋        | 4/23 [00:17<01:17,  4.10s/it]
[Acessing speaker spk_0 track 1 of 1:  22%|██▏       | 5/23 [00:19<01:03,  3.51s/it]
[Acessing speaker spk_0 track 1 of 1:  26%|██▌       | 6/23 [00:23<00:58,  3.43s/it]
[Acessing speaker spk_0 track 1 of 1:  30%|███       | 7/23 [00:29<01:09,  4.36s/it]
[Acessing speaker spk_0 track 1 of 1:  35%|███▍      | 8/23 [00:37<01:24,  5.60s/it]
[Acessing speaker spk_0 track 1 of 1:  39%|███▉      | 9/23 [00:43<01:20,  5.75s/it]
[Acessing speaker spk_0 track 1 of 1:  43%|████▎     | 10/23 [00:47<01:08,  5.23s/it]
[Acessing speaker spk_0 track 1 of 1:  48%|████▊     | 11/2





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/21 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   5%|▍         | 1/21 [00:02<00:51,  2.55s/it]
[Acessing speaker spk_1 track 1 of 1:  10%|▉         | 2/21 [00:03<00:25,  1.34s/it]
[Acessing speaker spk_1 track 1 of 1:  14%|█▍        | 3/21 [00:10<01:11,  3.97s/it]
[Acessing speaker spk_1 track 1 of 1:  19%|█▉        | 4/21 [00:13<01:01,  3.60s/it]
[Acessing speaker spk_1 track 1 of 1:  24%|██▍       | 5/21 [00:14<00:43,  2.73s/it]
[Acessing speaker spk_1 track 1 of 1:  29%|██▊       | 6/21 [00:15<00:35,  2.34s/it]
[Acessing speaker spk_1 track 1 of 1:  33%|███▎      | 7/21 [00:17<00:28,  2.07s/it]
[Acessing speaker spk_1 track 1 of 1:  38%|███▊      | 8/21 [00:18<00:22,  1.75s/it]
[Acessing speaker spk_1 track 1 of 1:  43%|████▎     | 9/21 [00:21<00:25,  2.14s/it]
[Acessing speaker spk_1 track 1 of 1:  48%|████▊     | 10/21 [00:26<00:31,  2.90s/it]
[Acessing speaker spk_1 track 1 of 1:  52%|█████▏    | 11/2





[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   3%|▎         | 1/31 [00:05<02:34,  5.14s/it]
[Acessing speaker spk_2 track 1 of 1:   6%|▋         | 2/31 [00:05<01:14,  2.56s/it]
[Acessing speaker spk_2 track 1 of 1:  10%|▉         | 3/31 [00:06<00:48,  1.74s/it]
[Acessing speaker spk_2 track 1 of 1:  13%|█▎        | 4/31 [00:07<00:35,  1.31s/it]
[Acessing speaker spk_2 track 1 of 1:  16%|█▌        | 5/31 [00:08<00:28,  1.11s/it]
[Acessing speaker spk_2 track 1 of 1:  19%|█▉        | 6/31 [00:08<00:26,  1.04s/it]
[Acessing speaker spk_2 track 1 of 1:  23%|██▎       | 7/31 [00:09<00:23,  1.02it/s]
[Acessing speaker spk_2 track 1 of 1:  26%|██▌       | 8/31 [00:11<00:30,  1.33s/it]
[Acessing speaker spk_2 track 1 of 1:  29%|██▉       | 9/31 [00:12<00:25,  1.14s/it]
[Acessing speaker spk_2 track 1 of 1:  32%|███▏      | 10/31 [00:13<00:22,  1.05s/it]
[Acessing speaker spk_2 track 1 of 1:  35%|███▌      | 11/3





Processing speaker spk_3 track 1 of 1:  42%|████▏     | 10/24 [00:14<00:21,  1.53s/it]
Processing speaker spk_3 track 1 of 1:  46%|████▌     | 11/2





Processing speaker spk_4 track 1 of 1:  36%|███▌      | 10/28 [00:21<01:00,  3.35s/it]
Processing speaker spk_4 track 1 of 1:  39%|███▉      | 11/2





Processing speaker spk_5 track 1 of 1:  38%|███▊      | 10/26 [00:13<00:21,  1.33s/it]
Processing speaker spk_5 track 1 of 1:  42%|████▏     | 11/2


RUN: EVAL session_09

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_09
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_09


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  36%|███▌      | 10/28 [00:23<00:54,  3.04s/it]
Processing speaker spk_0 track 1 of 1:  39%|███▉      | 11/2





Processing speaker spk_1 track 1 of 1:  36%|███▌      | 10/28 [00:26<00:48,  2.69s/it]
Processing speaker spk_1 track 1 of 1:  39%|███▉      | 11/2





Processing speaker spk_2 track 1 of 1:  33%|███▎      | 10/30 [00:15<00:30,  1.55s/it]
Processing speaker spk_2 track 1 of 1:  37%|███▋      | 11/3





Processing speaker spk_3 track 1 of 1:  43%|████▎     | 10/23 [00:26<00:34,  2.63s/it]
Processing speaker spk_3 track 1 of 1:  48%|████▊     | 11/2





Processing speaker spk_4 track 1 of 1:  62%|██████▎   | 10/16 [00:13<00:07,  1.21s/it]
Processing speaker spk_4 track 1 of 1:  69%|██████▉   | 11/1





Processing speaker spk_5 track 1 of 1:  50%|█████     | 10/20 [00:54<00:52,  5.24s/it]
Processing speaker spk_5 track 1 of 1:  55%|█████▌    | 11/2


RUN: EVAL session_10

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_10
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_10


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 2:  83%|████████▎ | 10/12 [00:13<00:02,  1.47s/it]
Processing speaker spk_0 track 1 of 2:  92%|█████████▏| 11/1





Processing speaker spk_1 track 1 of 1:  34%|███▍      | 10/29 [00:30<01:29,  4.69s/it]
Processing speaker spk_1 track 1 of 1:  38%|███▊      | 11/2





Processing speaker spk_2 track 1 of 1:  42%|████▏     | 10/24 [00:33<00:32,  2.29s/it]
Processing speaker spk_2 track 1 of 1:  46%|████▌     | 11/2





Processing speaker spk_3 track 1 of 1:  38%|███▊      | 10/26 [00:20<00:59,  3.71s/it]
Processing speaker spk_3 track 1 of 1:  42%|████▏     | 11/2





Processing speaker spk_4 track 1 of 1:  43%|████▎     | 10/23 [00:17<00:22,  1.74s/it]
Processing speaker spk_4 track 1 of 1:  48%|████▊     | 11/2





Processing speaker spk_5 track 1 of 1:  33%|███▎      | 10/30 [00:11<00:33,  1.67s/it]
Processing speaker spk_5 track 1 of 1:  37%|███▋      | 11/3


RUN: EVAL session_107

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_107
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_107


Processing speakers:   0%|          | 0/2 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  34%|███▍      | 10/29 [00:19<00:36,  1.90s/it]
Processing speaker spk_0 track 1 of 1:  38%|███▊      | 11/2





Processing speaker spk_1 track 1 of 1:  37%|███▋      | 10/27 [00:27<01:20,  4.71s/it]
Processing speaker spk_1 track 1 of 1:  41%|████      | 11/2


RUN: EVAL session_108

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_108
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_108


Processing speakers:   0%|          | 0/2 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  30%|███       | 10/33 [00:11<00:32,  1.41s/it]
Processing speaker spk_0 track 1 of 1:  33%|███▎      | 11/3





Processing speaker spk_1 track 1 of 1:  36%|███▌      | 10/28 [00:31<00:59,  3.31s/it]
Processing speaker spk_1 track 1 of 1:  39%|███▉      | 11/2


RUN: EVAL session_109

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_109
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_109


Processing speakers:   0%|          | 0/2 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  30%|███       | 10/33 [00:08<00:16,  1.43it/s]
Processing speaker spk_0 track 1 of 1:  33%|███▎      | 11/3





Processing speaker spk_1 track 1 of 1:  29%|██▉       | 10/34 [00:24<00:58,  2.45s/it]
Processing speaker spk_1 track 1 of 1:  32%|███▏      | 11/3


RUN: EVAL session_11

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_11
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_11


Processing speakers:   0%|          | 0/4 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  43%|████▎     | 10/23 [00:30<00:39,  3.06s/it]
Processing speaker spk_0 track 1 of 1:  48%|████▊     | 11/2





Processing speaker spk_1 track 1 of 1:  50%|█████     | 10/20 [00:27<00:25,  2.58s/it]
Processing speaker spk_1 track 1 of 1:  55%|█████▌    | 11/2





Processing speaker spk_2 track 1 of 1:  32%|███▏      | 10/31 [00:23<01:10,  3.38s/it]
Processing speaker spk_2 track 1 of 1:  35%|███▌      | 11/3





Processing speaker spk_3 track 1 of 1:  40%|████      | 10/25 [00:11<00:11,  1.30it/s]
Processing speaker spk_3 track 1 of 1:  44%|████▍     | 11/2


RUN: EVAL session_110

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_110
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_110


Processing speakers:   0%|          | 0/2 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 2: 100%|██████████| 10/10 [00:17<00:00,  1.79s/it]
Processing speaker spk_0 track 2 of 2:   0%|          | 0/1





Processing speaker spk_1 track 1 of 1:  30%|███       | 10/33 [00:33<01:23,  3.63s/it]
Processing speaker spk_1 track 1 of 1:  33%|███▎      | 11/3


RUN: EVAL session_111

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_111
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_111


Processing speakers:   0%|          | 0/2 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  38%|███▊      | 10/26 [00:22<00:40,  2.56s/it]
Processing speaker spk_0 track 1 of 1:  42%|████▏     | 11/2





Processing speaker spk_1 track 1 of 1:  36%|███▌      | 10/28 [00:17<00:18,  1.03s/it]
Processing speaker spk_1 track 1 of 1:  39%|███▉      | 11/2


RUN: EVAL session_112

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_112
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_112


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  26%|██▌       | 10/39 [00:23<01:29,  3.08s/it]
Processing speaker spk_0 track 1 of 1:  28%|██▊       | 11/3





Processing speaker spk_1 track 1 of 1:  37%|███▋      | 10/27 [00:25<00:39,  2.35s/it]
Processing speaker spk_1 track 1 of 1:  41%|████      | 11/2





Processing speaker spk_2 track 1 of 2: 100%|██████████| 6/6 [00:04<00:00,  1.27it/s]

Processing speaker spk_2 track 2 of 2:  19%|█▉        | 3/16 [00:02<00:10,  1.29it/s]
Processing speaker spk_2 track 2 of 2:  25%|██▌       | 4/16 [00:02<00:08, 





Processing speaker spk_3 track 1 of 1:  34%|███▍      | 10/29 [00:22<00:49,  2.63s/it]
Processing speaker spk_3 track 1 of 1:  38%|███▊      | 11/2





Processing speaker spk_4 track 1 of 1:  28%|██▊       | 10/36 [00:22<01:09,  2.69s/it]
Processing speaker spk_4 track 1 of 1:  31%|███       | 11/3





Processing speaker spk_5 track 1 of 1:  45%|████▌     | 10/22 [00:09<00:13,  1.13s/it]
Processing speaker spk_5 track 1 of 1:  50%|█████     | 11/2


RUN: EVAL session_113

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_113
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_113


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  32%|███▏      | 10/31 [00:26<00:48,  2.30s/it]
Processing speaker spk_0 track 1 of 1:  35%|███▌      | 11/3





Processing speaker spk_1 track 1 of 1:  37%|███▋      | 10/27 [00:35<01:25,  5.03s/it]
Processing speaker spk_1 track 1 of 1:  41%|████      | 11/2





Processing speaker spk_2 track 1 of 1:  33%|███▎      | 10/30 [00:09<00:16,  1.25it/s]
Processing speaker spk_2 track 1 of 1:  37%|███▋      | 11/3





Processing speaker spk_3 track 1 of 1:  30%|███       | 10/33 [00:21<01:33,  4.08s/it]
Processing speaker spk_3 track 1 of 1:  33%|███▎      | 11/3





Processing speaker spk_4 track 1 of 1:  29%|██▊       | 10/35 [00:13<00:33,  1.36s/it]
Processing speaker spk_4 track 1 of 1:  31%|███▏      | 11/3





Processing speaker spk_5 track 1 of 1:  59%|█████▉    | 10/17 [00:09<00:06,  1.15it/s]
Processing speaker spk_5 track 1 of 1:  65%|██████▍   | 11/1


RUN: EVAL session_114

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_114
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_114


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  34%|███▍      | 10/29 [00:31<01:26,  4.55s/it]
Processing speaker spk_0 track 1 of 1:  38%|███▊      | 11/2





Processing speaker spk_1 track 1 of 1:  30%|███       | 10/33 [00:17<01:03,  2.75s/it]
Processing speaker spk_1 track 1 of 1:  33%|███▎      | 11/3





Processing speaker spk_2 track 1 of 1:  32%|███▏      | 10/31 [00:28<01:19,  3.79s/it]
Processing speaker spk_2 track 1 of 1:  35%|███▌      | 11/3





Processing speaker spk_4 track 1 of 1:  38%|███▊      | 10/26 [00:24<00:23,  1.49s/it]
Processing speaker spk_4 track 1 of 1:  42%|████▏     | 11/2





Processing speaker spk_5 track 1 of 1:  43%|████▎     | 10/23 [00:17<00:21,  1.65s/it]
Processing speaker spk_5 track 1 of 1:  48%|████▊     | 11/2


RUN: EVAL session_115

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_115
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_115


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  32%|███▏      | 10/31 [00:25<01:33,  4.43s/it]
Processing speaker spk_0 track 1 of 1:  35%|███▌      | 11/3





Processing speaker spk_1 track 1 of 1:  33%|███▎      | 10/30 [00:18<00:31,  1.58s/it]
Processing speaker spk_1 track 1 of 1:  37%|███▋      | 11/3





Processing speaker spk_2 track 1 of 3: 100%|██████████| 1/1 [00:00<00:00,  2.03it/s]

Processing speaker spk_2 track 2 of 3: 100%|██████████| 5/5 [00:05<00:00,  1.11s/it]

Processing speaker spk_2 track 3 of 3:  11%|█         | 2/19 [00:01<00:12,  1.38it/s]
Processing speaker spk_2 track 3 of 3:  16%|█▌        | 3/19 [00:01<00:10,  1.59it/





Processing speaker spk_3 track 1 of 1:  32%|███▏      | 10/31 [00:30<01:35,  4.56s/it]
Processing speaker spk_3 track 1 of 1:  35%|███▌      | 11/3





Processing speaker spk_4 track 1 of 1:  67%|██████▋   | 10/15 [00:16<00:09,  1.89s/it]
Processing speaker spk_4 track 1 of 1:  73%|███████▎  | 11/1





Processing speaker spk_5 track 1 of 1:  30%|███       | 10/33 [00:23<01:42,  4.47s/it]
Processing speaker spk_5 track 1 of 1:  33%|███▎      | 11/3


RUN: EVAL session_116

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_116
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_116


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 2: 100%|██████████| 9/9 [00:15<00:00,  1.67s/it]

Processing speaker spk_0 track 2 of 2:   0%|          | 0/29 [00:00<?, ?it/s]
Processing speaker spk_0 track 2 of 2:   3%|▎         | 1/29 [00:02<01:19,  2.





Processing speaker spk_1 track 1 of 2: 100%|██████████| 9/9 [00:08<00:00,  1.08it/s]

Processing speaker spk_1 track 2 of 2:   0%|          | 0/13 [00:00<?, ?it/s]
Processing speaker spk_1 track 2 of 2:   8%|▊         | 1/13 [00:01<00:13,  1.





Processing speaker spk_2 track 1 of 2:  33%|███▎      | 10/30 [00:13<00:34,  1.70s/it]
Processing speaker spk_2 track 1 of 2:  37%|███▋      | 11/3





Processing speaker spk_3 track 1 of 1:  33%|███▎      | 10/30 [00:27<01:33,  4.67s/it]
Processing speaker spk_3 track 1 of 1:  37%|███▋      | 11/3





Processing speaker spk_4 track 1 of 1:  48%|████▊     | 10/21 [00:08<00:09,  1.16it/s]
Processing speaker spk_4 track 1 of 1:  52%|█████▏    | 11/2





Processing speaker spk_5 track 1 of 1:  37%|███▋      | 10/27 [00:47<01:12,  4.25s/it]
Processing speaker spk_5 track 1 of 1:  41%|████      | 11/2


RUN: EVAL session_117

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_117
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_117


Processing speakers:   0%|          | 0/4 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  29%|██▉       | 10/34 [00:24<00:43,  1.82s/it]
Processing speaker spk_0 track 1 of 1:  32%|███▏      | 11/3





Processing speaker spk_1 track 1 of 1:  32%|███▏      | 10/31 [00:09<00:18,  1.14it/s]
Processing speaker spk_1 track 1 of 1:  35%|███▌      | 11/3





Processing speaker spk_2 track 1 of 1:  40%|████      | 10/25 [00:39<01:13,  4.89s/it]
Processing speaker spk_2 track 1 of 1:  44%|████▍     | 11/2





Processing speaker spk_3 track 1 of 1:  42%|████▏     | 10/24 [00:09<00:13,  1.08it/s]
Processing speaker spk_3 track 1 of 1:  46%|████▌     | 11/2


RUN: EVAL session_118

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_118
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_118


Processing speakers:   0%|          | 0/2 [00:00<?, ?it/s]





Processing speaker spk_1 track 1 of 1:  37%|███▋      | 10/27 [00:48<01:20,  4.75s/it]
Processing speaker spk_1 track 1 of 1:  41%|████      | 11/2





Processing speaker spk_2 track 1 of 2: 100%|██████████| 2/2 [00:01<00:00,  1.56it/s]

Processing speaker spk_2 track 2 of 2:  23%|██▎       | 7/31 [00:05<00:19,  1.25it/s]
Processing speaker spk_2 track 2 of 2:  26%|██▌       | 8/31 [00:08<00:


RUN: EVAL session_119

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_119
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_119


Processing speakers:   0%|          | 0/4 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  27%|██▋       | 10/37 [00:30<01:31,  3.37s/it]
Processing speaker spk_0 track 1 of 1:  30%|██▉       | 11/3





Processing speaker spk_1 track 1 of 1:  32%|███▏      | 10/31 [00:10<00:30,  1.44s/it]
Processing speaker spk_1 track 1 of 1:  35%|███▌      | 11/3





Processing speaker spk_2 track 1 of 1:  29%|██▊       | 10/35 [00:40<02:11,  5.27s/it]
Processing speaker spk_2 track 1 of 1:  31%|███▏      | 11/3





Processing speaker spk_3 track 1 of 2: 0it [00:00, ?it/s]

Processing speaker spk_3 track 2 of 2:  38%|███▊      | 10/26 [00:09<00:12,  1.24it/s]



RUN: EVAL session_12

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_12
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_12


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 2:  62%|██████▎   | 10/16 [00:39<00:39,  6.58s/it]
Processing speaker spk_0 track 1 of 2:  69%|██████▉   | 11/1





Processing speaker spk_1 track 1 of 1:  53%|█████▎    | 10/19 [00:18<00:09,  1.04s/it]
Processing speaker spk_1 track 1 of 1:  58%|█████▊    | 11/1





Processing speaker spk_2 track 1 of 1:  42%|████▏     | 10/24 [00:10<00:13,  1.02it/s]
Processing speaker spk_2 track 1 of 1:  46%|████▌     | 11/2





Processing speaker spk_3 track 1 of 1:  48%|████▊     | 10/21 [00:46<00:45,  4.16s/it]
Processing speaker spk_3 track 1 of 1:  52%|█████▏    | 11/2





Processing speaker spk_4 track 1 of 1:  42%|████▏     | 10/24 [00:14<00:29,  2.08s/it]
Processing speaker spk_4 track 1 of 1:  46%|████▌     | 11/2





Processing speaker spk_5 track 1 of 1:  48%|████▊     | 10/21 [00:55<01:16,  7.00s/it]
Processing speaker spk_5 track 1 of 1:  52%|█████▏    | 11/2


RUN: EVAL session_120

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_120
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_120


Processing speakers:   0%|          | 0/3 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 2:  37%|███▋      | 10/27 [00:33<00:50,  3.00s/it]
Processing speaker spk_0 track 1 of 2:  41%|████      | 11/2





Processing speaker spk_1 track 1 of 1:  43%|████▎     | 10/23 [00:08<00:10,  1.30it/s]
Processing speaker spk_1 track 1 of 1:  48%|████▊     | 11/2





Processing speaker spk_2 track 1 of 1:  33%|███▎      | 10/30 [00:33<00:57,  2.89s/it]
Processing speaker spk_2 track 1 of 1:  37%|███▋      | 11/3


RUN: EVAL session_121

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_121
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_121


Processing speakers:   0%|          | 0/2 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 3:  71%|███████▏  | 10/14 [00:40<00:15,  3.94s/it]
Processing speaker spk_0 track 1 of 3:  79%|███████▊  | 11/1





Processing speaker spk_2 track 1 of 1:  32%|███▏      | 10/31 [00:23<01:00,  2.86s/it]
Processing speaker spk_2 track 1 of 1:  35%|███▌      | 11/3


RUN: EVAL session_122

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_122
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_122


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 2: 100%|██████████| 1/1 [00:00<00:00,  1.73it/s]

Processing speaker spk_0 track 2 of 2:  80%|████████  | 8/10 [00:07<00:01,  1.03it/s]
Processing speaker spk_0 track 2 of 2:  90%|█████████ | 9/10 [00:08<00





Processing speaker spk_1 track 1 of 1:  29%|██▉       | 10/34 [00:17<01:11,  2.98s/it]
Processing speaker spk_1 track 1 of 1:  32%|███▏      | 11/3





Processing speaker spk_2 track 1 of 1:  38%|███▊      | 10/26 [00:17<00:21,  1.36s/it]
Processing speaker spk_2 track 1 of 1:  42%|████▏     | 11/2





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/33 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   3%|▎         | 1/33 [00:02<01:23,  2.60s/it]
[Acessing speaker spk_3 track 1 of 1:   6%|▌         | 2/33 [00:03<00:49,  1.60s/it]
[Acessing speaker spk_3 track 1 of 1:   9%|▉         | 3/33 [00:03<00:32,  1.09s/it]
[Acessing speaker spk_3 track 1 of 1:  12%|█▏        | 4/33 [00:04<00:26,  1.11it/s]
[Acessing speaker spk_3 track 1 of 1:  15%|█▌        | 5/33 [00:10<01:11,  2.56s/it]
[Acessing speaker spk_3 track 1 of 1:  18%|█▊        | 6/33 [00:11<01:01,  2.26s/it]
[Acessing speaker spk_3 track 1 of 1:  21%|██        | 7/33 [00:14<01:01,  2.35s/it]
[Acessing speaker spk_3 track 1 of 1:  24%|██▍       | 8/33 [00:15<00:53,  2.12s/it]
[Acessing speaker spk_3 track 1 of 1:  27%|██▋       | 9/33 [00:16<00:38,  1.61s/it]
[Acessing speaker spk_3 track 1 of 1:  30%|███       | 10/33 [00:17<00:32,  1.40s/it]
[Acessing speaker spk_3 track 1 of 1:  33%|███▎      | 11/3





[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   3%|▎         | 1/31 [00:00<00:20,  1.49it/s]
[Acessing speaker spk_4 track 1 of 1:   6%|▋         | 2/31 [00:01<00:19,  1.47it/s]
[Acessing speaker spk_4 track 1 of 1:  10%|▉         | 3/31 [00:01<00:17,  1.64it/s]
[Acessing speaker spk_4 track 1 of 1:  13%|█▎        | 4/31 [00:04<00:42,  1.57s/it]
[Acessing speaker spk_4 track 1 of 1:  16%|█▌        | 5/31 [00:07<00:54,  2.09s/it]
[Acessing speaker spk_4 track 1 of 1:  19%|█▉        | 6/31 [00:08<00:42,  1.71s/it]
[Acessing speaker spk_4 track 1 of 1:  23%|██▎       | 7/31 [00:13<01:01,  2.55s/it]
[Acessing speaker spk_4 track 1 of 1:  26%|██▌       | 8/31 [00:17<01:10,  3.07s/it]
[Acessing speaker spk_4 track 1 of 1:  29%|██▉       | 9/31 [00:23<01:25,  3.88s/it]
[Acessing speaker spk_4 track 1 of 1:  32%|███▏      | 10/31 [00:24<01:06,  3.14s/it]
[Acessing speaker spk_4 track 1 of 1:  35%|███▌      | 11/3


RUN: EVAL session_123

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_123
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_123


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  34%|███▍      | 10/29 [00:17<01:03,  3.34s/it]
Processing speaker spk_1 track 1 of 1:  45%|████▌     | 10/22 [00:20<00:30,  2.57s/it]
Processing speaker spk_2 track 1 of 1:  48%|████▊     | 10/21 [00:10<00:11,  1.01s/it]
Processing speaker spk_3 track 1 of 1:  38%|███▊      | 10/26 [00:21<00:30,  1.90s/it]
Processing speaker spk_4 track 1 of 1:  30%|███       | 10/33 [00:28<00:48,  2.11s/it]


RUN: EVAL session_124

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_124
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_124


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1: 100%|██████████| 8/8 [00:09<00:00,  1.13s/it]
Processing speakers:  20%|██        | 1/5 [00:09<00:36,  9.04s/it]
Processing speaker spk_1 track 1 of 1:  42%|████▏     | 10/24 [00:18<00:28,  2.04s/it]
Processing speaker spk_2 track 1 of 2: 100%|██████████| 1/1 [00:00<00:00,  2.11it/s]
Processing speaker spk_2 track 2 of 2:  24%|██▎       | 8/34 [00:19<01:33,  3.60s/it]
Processing speaker spk_3 track 1 of 2: 100%|██████████| 2/2 [00:01<00:00,  1.56it/s]
Processing speaker spk_3 track 2 of 2:  64%|██████▎   | 7/11 [00:13<00:06,  1.57s/it]
Processing speaker spk_4 track 1 of 2: 100%|██████████| 1/1 [00:00<00:00,  1.37it/s]
Processing speaker spk_4 track 2 of 2:  30%|██▉       | 8/27 [00:19<01:09,  3.68s/it]


RUN: EVAL session_125

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_125
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_125


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  43%|████▎     | 10/23 [00:11<00:17,  1.33s/it]
Processing speaker spk_1 track 1 of 2: 0it [00:00, ?it/s]
Processing speaker spk_1 track 2 of 2:  50%|█████     | 10/20 [00:11<00:12,  1.26s/it]
Processing speaker spk_2 track 1 of 1:  36%|███▌      | 10/28 [00:31<00:47,  2.63s/it]
Processing speaker spk_3 track 1 of 1:  36%|███▌      | 10/28 [00:32<00:58,  3.27s/it]
Processing speaker spk_4 track 1 of 1:  38%|███▊      | 10/26 [00:14<00:22,  1.38s/it]


RUN: EVAL session_126

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_126
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_126


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  48%|████▊     | 10/21 [00:10<00:12,  1.15s/it]
Processing speaker spk_1 track 1 of 1:  38%|███▊      | 10/26 [00:09<00:11,  1.37it/s]
Processing speaker spk_2 track 1 of 3:  62%|██████▎   | 10/16 [00:27<00:10,  1.82s/it]
Processing speaker spk_3 track 1 of 1:  50%|█████     | 10/20 [00:33<00:24,  2.49s/it]
Processing speaker spk_4 track 1 of 1:  43%|████▎     | 10/23 [00:38<00:30,  2.38s/it]


RUN: EVAL session_127

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_127
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_127


Processing speakers:   0%|          | 0/4 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  34%|███▍      | 10/29 [00:24<00:55,  2.94s/it]
Processing speaker spk_1 track 1 of 3:  71%|███████▏  | 10/14 [00:10<00:04,  1.12s/it]
Processing speaker spk_2 track 1 of 1:  37%|███▋      | 10/27 [00:26<00:31,  1.88s/it]
Processing speaker spk_3 track 1 of 1:  67%|██████▋   | 10/15 [00:19<00:10,  2.18s/it]


RUN: EVAL session_128

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_128
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_128


Processing speakers:   0%|          | 0/4 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  33%|███▎      | 10/30 [00:18<01:06,  3.33s/it]
Processing speaker spk_1 track 1 of 1:  37%|███▋      | 10/27 [00:12<00:27,  1.60s/it]
Processing speaker spk_2 track 1 of 1:  37%|███▋      | 10/27 [00:15<00:29,  1.73s/it]
Processing speaker spk_3 track 1 of 1:  50%|█████     | 10/20 [00:11<00:12,  1.25s/it]


RUN: EVAL session_129

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_129
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_129


Processing speakers:   0%|          | 0/4 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 2: 100%|██████████| 5/5 [00:12<00:00,  2.46s/it]
Processing speaker spk_0 track 2 of 2:  17%|█▋        | 4/24 [00:12<00:49,  2.46s/it]
Processing speaker spk_1 track 1 of 2: 100%|██████████| 7/7 [00:05<00:00,  1.35it/s]
Processing speaker spk_1 track 2 of 2:   0%|          | 0/22 [00:00<?, ?it/s]
Processing speakers:  25%|██▌       | 1/4 [01:51<05:33, 111.30s/it]


RuntimeError: Could not open input file: data-bin/eval/session_129/speakers/spk_1/central_crops/track_01_lip.av.mp4 Invalid data found when processing input

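## 9 – Resume list for run 2

The first run aborted at `session_129`. The notebook attributes this to OOM,
although the last error in the log above is a `RuntimeError` raised while
opening a video file. The sessions still missing output are listed explicitly
below, 38 in total.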

In [11]:
# The first run aborted at session 129, so the missing sessions are rerun
missing_session_ids = [
    "session_122",
    "session_129",
    "session_13",
    "session_130",
    "session_131",
    "session_14",
    "session_142",
    "session_143",
    "session_144",
    "session_145",
    "session_146",
    "session_147",
    "session_148",
    "session_149",
    "session_15",
    "session_151",
    "session_16",
    "session_17",
    "session_18",
    "session_19",
    "session_31",
    "session_32",
    "session_34",
    "session_35",
    "session_36",
    "session_37",
    "session_38",
    "session_39",
    "session_73",
    "session_74",
    "session_75",
    "session_76",
    "session_77",
    "session_78",
    "session_79",
    "session_80",
    "session_81",
    "session_82",
]
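
The list above is written out by hand. As a minimal sketch, it could instead be derived from the files on disk. This assumes that a finished session contains at least one `.vtt` file in `<session_dir>/<output_dir_name>/`; that layout and the helper name `find_missing_sessions` are assumptions for illustration and are not taken from the notebook.

```python
from pathlib import Path


def find_missing_sessions(data_root, output_dir_name):
    """Return the IDs of sessions that have no .vtt file in their output directory.

    Assumed layout (hypothetical): <data_root>/<session_id>/<output_dir_name>/*.vtt
    """
    missing = []
    # sorted() orders the directory names as strings, e.g. session_129 before session_13
    for session_dir in sorted(Path(data_root).glob("session_*")):
        if not session_dir.is_dir():
            continue
        out_dir = session_dir / output_dir_name
        has_output = out_dir.is_dir() and any(out_dir.glob("*.vtt"))
        if not has_output:
            missing.append(session_dir.name)
    return missing


# Possible use in this notebook (names taken from the cells above):
# missing_session_ids = find_missing_sessions(
#     DATA_ROOT, "output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20"
# )
```

A session that crashed halfway may already have written some files, so a check like this is only a first filter; comparing the number of output files with the number of speakers would be stricter.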


## 10 – Resume run 2

Each session is wrapped in `try/except/finally`: if a session fails (OOM or any
other exception), it is appended to the `failed` list and the loop continues
instead of aborting the whole run.

In [12]:
import gc
from pathlib import Path

import torch

missing_session_dirs = [str(Path(DATA_ROOT) / sid) for sid in missing_session_ids]

failed = []
for session_dir in missing_session_dirs:
    sid = Path(session_dir).name
    print("\nRUN:", SPLIT, sid)
    try:
        with patch_avsr_segmentation(min_on=best_min_on, min_off=best_min_off):
            run_inference_for_experiment(
                exp_name=exp_name,
                base_models=BASE_MODELS,
                experiments=EXPERIMENTS,
                session_dir=session_dir,
            )
    except Exception as e:
        print("FAILED:", sid, "->", repr(e))
        failed.append(sid)  # remember the session instead of aborting
    finally:
        # Collect unreachable Python objects first so their GPU blocks can be released
        gc.collect()
        torch.cuda.empty_cache()

print("Failed:", failed)  # result of this run: ['session_148', 'session_80']
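
The loop above records only which sessions failed, not why. The sketch below is a torch-free variant of the same pattern that also stores a coarse failure reason, so OOM failures can be told apart from, for example, unreadable input files. The names `classify_failure` and `run_sessions` are hypothetical, and matching on the text "out of memory" is a heuristic that relies on the wording of CUDA OOM messages.

```python
def classify_failure(exc):
    """Label an exception as 'oom' or 'other' based on its message (heuristic)."""
    return "oom" if "out of memory" in str(exc).lower() else "other"


def run_sessions(session_ids, run_one, cleanup=lambda: None):
    """Run run_one(sid) for every session and keep going after failures.

    Returns a dict mapping each failed session ID to its failure label.
    cleanup() is called after every session, whether it succeeded or not.
    """
    failed = {}
    for sid in session_ids:
        try:
            run_one(sid)
        except Exception as exc:  # deliberately broad: one bad session must not stop the run
            print("FAILED:", sid, "->", repr(exc))
            failed[sid] = classify_failure(exc)
        finally:
            cleanup()
    return failed
```

In this notebook, `run_one` would wrap the `patch_avsr_segmentation` / `run_inference_for_experiment` call and `cleanup` would free GPU memory. Sessions labelled `"oom"` are candidates for a rerun with different allocator settings, while `"other"` failures probably need a look at the input data.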



RUN: EVAL session_122

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_122
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_122


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 2: 100%|██████████| 1/1 [00:00<00:00,  1.79it/s]
Processing speaker spk_0 track 2 of 2:  80%|████████  | 8/10 [00:07<00:01,  1.05it/s]
Processing speaker spk_1 track 1 of 1:  29%|██▉       | 10/34 [00:17<01:11,  2.97s/it]
Processing speaker spk_2 track 1 of 1:  38%|███▊      | 10/26 [00:19<00:22,  1.41s/it]
Processing speaker spk_3 track 1 of 1:  30%|███       | 10/33 [00:15<00:30,  1.35s/it]
Processing speaker spk_4 track 1 of 1:  32%|███▏      | 10/31 [00:26<01:09,  3.33s/it]


RUN: EVAL session_129

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_129
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_129


Processing speakers:   0%|          | 0/4 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 2: 100%|██████████| 5/5 [00:10<00:00,  2.19s/it]
Processing speaker spk_0 track 2 of 2:  17%|█▋        | 4/24 [00:12<00:48,  2.44s/it]
Processing speaker spk_1 track 1 of 2: 100%|██████████| 7/7 [00:06<00:00,  1.05it/s]
Processing speaker spk_1 track 2 of 2:   9%|▉         | 2/22 [00:02<00:26,  1.30s/it]
Processing speaker spk_2 track 1 of 1:  26%|██▋       | 10/38 [00:14<00:43,  1.57s/it]
Processing speaker spk_3 track 1 of 1:  34%|███▍      | 10/29 [00:32<01:10,  3.70s/it]


RUN: EVAL session_13

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_13
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_13


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  42%|████▏     | 10/24 [00:23<00:30,  2.20s/it]
Processing speaker spk_1 track 1 of 1:  50%|█████     | 10/20 [00:37<00:57,  5.71s/it]
Processing speaker spk_2 track 1 of 1:  48%|████▊     | 10/21 [00:26<00:23,  2.11s/it]
Processing speaker spk_3 track 1 of 1:  53%|█████▎    | 10/19 [00:33<00:22,  2.47s/it]
Processing speaker spk_4 track 1 of 1:  37%|███▋      | 10/27 [00:21<00:40,  2.40s/it]


RUN: EVAL session_130

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_130
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_130


Processing speakers:   0%|          | 0/4 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  27%|██▋       | 10/37 [00:20<01:17,  2.87s/it]
Processing speaker spk_1 track 1 of 3: 100%|██████████| 5/5 [00:07<00:00,  1.51s/it]
Processing speaker spk_1 track 2 of 3:  27%|██▋       | 4/15 [00:03<00:08,  1.23it/s]
Processing speaker spk_2 track 1 of 1:  59%|█████▉    | 10/17 [00:22<00:18,  2.58s/it]
Processing speaker spk_3 track 1 of 1:  50%|█████     | 10/20 [00:29<00:30,  3.05s/it]


RUN: EVAL session_131

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_131
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_131


Processing speakers:   0%|          | 0/4 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 3:   0%|          | 0/7 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 3:  14%|█▍        | 1/7 [00:00<00:03,  1.62it/s]
[Acessing speaker spk_0 track 1 of 3:  29%|██▊       | 2/7 [00:01<00:02,  1.81it/s]
[Acessing speaker spk_0 track 1 of 3:  43%|████▎     | 3/7 [00:02<00:02,  1.40it/s]
[Acessing speaker spk_0 track 1 of 3:  57%|█████▋    | 4/7 [00:02<00:02,  1.34it/s]
[Acessing speaker spk_0 track 1 of 3:  71%|███████▏  | 5/7 [00:04<00:02,  1.04s/it]
[Acessing speaker spk_0 track 1 of 3:  86%|████████▌ | 6/7 [00:05<00:01,  1.15s/it]
Processing speaker spk_0 track 1 of 3: 100%|██████████| 7/7 [00:07<00:00,  1.06s/it]

[Acessing speaker spk_0 track 2 of 3:   0%|          | 0/17 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 2 of 3:   6%|▌         | 1/17 [00:00<00:12,  1.24it/s]
[Acessing speaker spk_0 track 2 of 3:  12%|█▏        | 2/17 [00:01<00:10,  1.49it/s]
[Acessing speaker spk_0 track 2 of 3:  18%|█▊        | 3/17 [00:03<00:20,  





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/28 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   4%|▎         | 1/28 [00:01<00:37,  1.38s/it]
[Acessing speaker spk_1 track 1 of 1:   7%|▋         | 2/28 [00:02<00:34,  1.33s/it]
[Acessing speaker spk_1 track 1 of 1:  11%|█         | 3/28 [00:05<00:50,  2.02s/it]
[Acessing speaker spk_1 track 1 of 1:  14%|█▍        | 4/28 [00:08<01:02,  2.59s/it]
[Acessing speaker spk_1 track 1 of 1:  18%|█▊        | 5/28 [00:09<00:43,  1.88s/it]
[Acessing speaker spk_1 track 1 of 1:  21%|██▏       | 6/28 [00:14<01:05,  2.98s/it]
[Acessing speaker spk_1 track 1 of 1:  25%|██▌       | 7/28 [00:25<01:55,  5.49s/it]
[Acessing speaker spk_1 track 1 of 1:  29%|██▊       | 8/28 [00:31<01:51,  5.55s/it]
[Acessing speaker spk_1 track 1 of 1:  32%|███▏      | 9/28 [00:31<01:17,  4.10s/it]
[Acessing speaker spk_1 track 1 of 1:  36%|███▌      | 10/28 [00:32<00:55,  3.07s/it]
[Acessing speaker spk_1 track 1 of 1:  39%|███▉      | 11/2





Processing speaker spk_2 track 1 of 1:  38%|███▊      | 11/2





Processing speaker spk_3 track 1 of 1:  58%|█████▊    | 11/1


RUN: EVAL session_14

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_14
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_14


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  37%|███▋      | 11/3





Processing speaker spk_1 track 1 of 1:  52%|█████▏    | 11/2





Processing speaker spk_2 track 1 of 1:  41%|████      | 11/2





Processing speaker spk_3 track 1 of 1:  50%|█████     | 11/2





Processing speaker spk_4 track 1 of 1:  48%|████▊     | 11/2





Processing speaker spk_5 track 1 of 1:  44%|████▍     | 11/2


RUN: EVAL session_142

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_142
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_142


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  52%|█████▏    | 11/2





Processing speaker spk_1 track 1 of 2: 100%|██████████| 6/6 [00:08<00:00,  1.45s/it]

Processing speaker spk_1 track 2 of 2:  31%|███       | 4/13 [00:04<00:08,





Processing speaker spk_2 track 1 of 1:  79%|███████▊  | 11/1





Processing speaker spk_3 track 1 of 1:  39%|███▉      | 11/2





Processing speaker spk_4 track 1 of 1:  92%|█████████▏| 11/1





Processing speaker spk_5 track 1 of 1:  42%|████▏     | 11/2


RUN: EVAL session_143

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_143
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_143


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  79%|███████▊  | 11/1





Processing speaker spk_1 track 1 of 1: 100%|██████████| 7/7 [00:25<00:00,  3.59s/it]
Processing speakers:  33%|███▎      | 2/6 [00:49<01:39, 24.97s/it]





Processing speaker spk_2 track 1 of 1:  58%|█████▊    | 11/1





Processing speaker spk_3 track 1 of 1:  35%|███▌      | 11/3





Processing speaker spk_4 track 1 of 1:  32%|███▏      | 11/3





Processing speaker spk_5 track 1 of 1: 100%|██████████| 10/10 [00:10<00:00,  1.06s/it]
Processing speakers: 100%|██████████| 6/6 [04:13<00:00, 42.2


RUN: EVAL session_144

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_144
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_144


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  42%|████▏     | 11/2





Processing speaker spk_1 track 1 of 2: 100%|██████████| 4/4 [00:03<00:00,  1.16it/s]

Processing speaker spk_1 track 2 of 2:  75%|███████▌  | 6/8 [00:13<00:04,  2.09





Processing speaker spk_2 track 1 of 1:  50%|█████     | 11/2





Processing speaker spk_3 track 1 of 1:  31%|███       | 11/3





Processing speaker spk_4 track 1 of 1:  42%|████▏     | 11/2





Processing speaker spk_5 track 1 of 1: 100%|██████████| 8/8 [00:10<00:00,  1.32s/it]
Processing speakers: 100%|██████████| 6/6 [04:15<00:00, 42.57s/it]



RUN: EVAL session_145

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_145
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_145


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  42%|████▏     | 11/2





Processing speaker spk_1 track 1 of 1:  61%|██████    | 11/1





Processing speaker spk_2 track 1 of 1: 100%|██████████| 9/9 [00:06<00:00,  1.41it/s]
Processing speakers:  50%|█████     | 3/6 [01:53<01:34, 31.39s/it]





Processing speaker spk_3 track 1 of 1:  91%|█████████ | 10/11 [00:12<00:01,  1.58s/it]
Processing speaker spk_3 track 1 of 1: 100%|██████████| 11/1
Processing speaker spk_4 track 1 of 1:  62%|██████▎   | 10/16 [00:12<00:07,  1.26s/it]
Processing speaker spk_5 track 1 of 1:  59%|█████▉    | 10/17 [00:14<00:10,  1.54s/it]


RUN: EVAL session_146

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_146
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_146


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  37%|███▋      | 10/27 [00:09<00:14,  1.16it/s]
Processing speaker spk_1 track 1 of 1:  45%|████▌     | 10/22 [00:21<00:16,  1.39s/it]
Processing speaker spk_2 track 1 of 1:  71%|███████▏  | 10/14 [00:29<00:06,  1.71s/it]
Processing speaker spk_3 track 1 of 1:  34%|███▍      | 10/29 [00:12<00:43,  2.30s/it]
Processing speaker spk_4 track 1 of 1:  56%|█████▌    | 10/18 [00:29<00:35,  4.46s/it]
Processing speaker spk_5 track 1 of 1:  56%|█████▌    | 10/18 [00:14<00:17,  2.19s/it]


RUN: EVAL session_147

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_147
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_147


Processing speakers:   0%|          | 0/8 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  77%|███████▋  | 10/13 [00:09<00:02,  1.27it/s]
Processing speaker spk_1 track 1 of 1:  38%|███▊      | 10/26 [00:17<00:27,  1.69s/it]
Processing speaker spk_2 track 1 of 3:  67%|██████▋   | 10/15 [00:25<00:11,  2.33s/it]
Processing speaker spk_3 track 1 of 1:  29%|██▊       | 10/35 [00:09<00:21,  1.16it/s]
Processing speaker spk_4 track 1 of 5: 100%|██████████| 10/10 [00:37<00:00,  3.71s/it]
Processing speaker spk_5 track 1 of 1:  30%|███       | 10/33 [00:10<00:33,  1.47s/it]
Processing speaker spk_6 track 1 of 1:  37%|███▋      | 10/27 [00:28<00:55,  3.26s/it]
Processing speaker spk_7 track 1 of 1:  45%|████▌     | 10/22 [00:43<01:25,  7.11s/it]


RUN: EVAL session_148

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_148
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_148


Processing speakers:   0%|          | 0/8 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  30%|███       | 10/33 [00:13<00:26,  1.17s/it]
Processing speaker spk_1 track 1 of 5: 100%|██████████| 8/8 [00:06<00:00,  1.28it/s]
Processing speaker spk_1 track 2 of 5:  11%|█         | 1/9 [00:00<00:06,  1.17it/s]
Processing speaker spk_2 track 1 of 2:  36%|███▌      | 10/28 [00:21<00:33,  1.86s/it]
Processing speaker spk_3 track 1 of 1:   0%|          | 0/24 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 1:   4%|▍         | 1/24 [00:01<00:24,  1.07s/it]
Processing speaker spk_3 track 1 of 1:   8%|▊         | 2/24 [00:07<01:27,  4.00s/it]
Processing speaker spk_3 track 1 of 1:  12%|█▎        | 3/24 [00:13<01:46,  5.09s/it]
Processing speaker spk_3 track 1 of 1:  17%|█▋        | 4/24 [00:21<02:03,  6.18s/it]
Processing speaker spk_3 track 1 of 1:  21%|██        | 5/24 [00:29<02:10,  6.88s/it]
Processing speaker spk_3 track 1 of 1:  25%|██▌       | 6/24 [00:39<02:23,  7.95s/it]
Processing speaker spk_3 track 1 of 1:  29%|██▉       | 7/24 [00:47<02:14,  7.91s/it]
Processing speaker spk_3 track 1 of 1:  33%|███▎      | 8/24 [00:56<02:14,  8.41s/it]
Processing speaker spk_3 track 1 of 1:  38%|███▊      | 9/24 [01:20<02:14,  8.98s/it]
Processing speakers:  38%|███▊      | 3/8 [04:26<07:23, 88.75s/it]

Error during inference for segment {'video': 'data-bin/eval/session_148/speakers/spk_3/central_crops/track_00_lip.av.mp4', 'start_time': 167.0, 'end_time': 186.16}
FAILED: session_148 -> OutOfMemoryError('CUDA out of memory. Tried to allocate 24.00 MiB. GPU 0 has a total capacity of 79.14 GiB of which 13.50 MiB is free. Process 3077613 has 79.11 GiB memory in use. Of the allocated memory 74.41 GiB is allocated by PyTorch, and 4.21 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)')
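
The error message above itself recommends `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` against fragmentation. A minimal sketch of how a rerun could be prepared, assuming the log format shown here (`FAILED: session_N -> ErrorClass(...)`). The helper name `failed_sessions` is hypothetical and not part of this notebook, and the allocator setting is applied before `torch` is imported so that it is in place before the first CUDA allocation:

```python
import os
import re

# Set the allocator option before importing torch / touching CUDA.
# setdefault keeps any value that was already exported in the shell.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True")

# Matches lines such as: FAILED: session_148 -> OutOfMemoryError('CUDA out of memory. ...')
# (pattern is an assumption based on the log lines in this output)
FAILED_RE = re.compile(r"^FAILED: (session_\d+) -> (\w+)")


def failed_sessions(log_text):
    """Return (session, error_class) tuples for every FAILED line in the log."""
    matches = (FAILED_RE.match(line) for line in log_text.splitlines())
    return [m.groups() for m in matches if m]


if __name__ == "__main__":
    sample = (
        "RUN: EVAL session_148\n"
        "FAILED: session_148 -> OutOfMemoryError('CUDA out of memory.')\n"
        "RUN: EVAL session_149\n"
    )
    print(failed_sessions(sample))
```

The returned session names can then be fed into a follow-up run that processes only the sessions that failed, rather than repeating the whole eval set.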






RUN: EVAL session_149

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_149
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_149


Processing speakers:   0%|          | 0/7 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  71%|███████▏  | 10/14 [00:06<00:02,  1.55it/s]
Processing speaker spk_1 track 1 of 1:  53%|█████▎    | 10/19 [00:15<00:13,  1.46s/it]
Processing speaker spk_2 track 1 of 1:  37%|███▋      | 10/27 [00:31<01:04,  3.80s/it]
Processing speaker spk_3 track 1 of 5: 100%|██████████| 5/5 [00:10<00:00,  2.09s/it]
Processing speaker spk_3 track 2 of 5: 100%|██████████| 1/1 [00:04<00:00,  4.59s/it]
Processing speaker spk_3 track 3 of 5:  38%|███▊      | 3/8 [00:08<00:17,  3.44s/it]
Processing speaker spk_4 track 1 of 1:  30%|███       | 10/33 [00:26<01:38,  4.27s/it]
Processing speaker spk_5 track 1 of 1:  34%|███▍      | 10/29 [00:11<00:20,  1.07s/it]
Processing speaker spk_6 track 1 of 2: 100%|██████████| 7/7 [00:28<00:00,  4.12s/it]
Processing speaker spk_6 track 2 of 2:  10%|█         | 2/20 [00:04<00:39,  2.20s/it]


RUN: EVAL session_15

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_15
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_15


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  38%|███▊      | 10/26 [00:27<00:54,  3.39s/it]
Processing speaker spk_0 track 1 of 1:  42%|████▏     | 11/2





Processing speaker spk_1 track 1 of 1:  38%|███▊      | 10/26 [00:11<00:19,  1.23s/it]
Processing speaker spk_1 track 1 of 1:  42%|████▏     | 11/2





Processing speaker spk_2 track 1 of 1:  53%|█████▎    | 10/19 [00:33<00:18,  2.06s/it]
Processing speaker spk_2 track 1 of 1:  58%|█████▊    | 11/1





Processing speaker spk_3 track 1 of 1:  56%|█████▌    | 10/18 [00:30<00:37,  4.68s/it]
Processing speaker spk_3 track 1 of 1:  61%|██████    | 11/1





Processing speaker spk_4 track 1 of 1:  53%|█████▎    | 10/19 [01:15<01:09,  7.67s/it]
Processing speaker spk_4 track 1 of 1:  58%|█████▊    | 11/1





Processing speaker spk_5 track 1 of 1:  38%|███▊      | 10/26 [00:11<00:20,  1.27s/it]
Processing speaker spk_5 track 1 of 1:  42%|████▏     | 11/2


RUN: EVAL session_151

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_151
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_151


Processing speakers:   0%|          | 0/8 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 3: 0it [00:00, ?it/s]

Processing speaker spk_0 track 2 of 3: 100%|██████████| 6/6 [00:08<00:00,  1.41s/it]

Processing speaker spk_0 track 3 of 3:  43%|████▎     | 3/7 [00:02<00:03,  1.26it/s]
Processing speaker s





Processing speaker spk_1 track 1 of 2: 100%|██████████| 10/10 [00:28<00:00,  2.89s/it]

Processing speaker spk_1 track 2 of 2:   0%|          | 0/1





Processing speaker spk_2 track 1 of 4: 100%|██████████| 1/1 [00:00<00:00,  1.31it/s]

Processing speaker spk_2 track 2 of 4: 0it [00:00, ?it/s]

Processing speaker spk_2 track 3 of 4: 100%|██████████| 1/1 [00:02<00:00,  2.68s/it]

Processing speaker spk_2 track 4 of 4:  50%|█████     | 6/12 [00:04<00:04,  1.40it/s]
Processing speaker s





Processing speaker spk_3 track 1 of 7: 100%|██████████| 8/8 [00:12<00:00,  1.51s/it]

Processing speaker spk_3 track 2 of 7:  11%|█         | 1/9 [00:02<00:20,  2.53s/it]
Processing speaker spk_3 track 2 of 7:  22%|██▏       | 2/9 [00:05<00:20,  2.86





Processing speaker spk_4 track 1 of 1:  38%|███▊      | 10/26 [00:25<00:29,  1.84s/it]
Processing speaker spk_4 track 1 of 1:  42%|████▏     | 11/2





Processing speaker spk_5 track 1 of 1:  32%|███▏      | 10/31 [00:10<00:20,  1.00it/s]
Processing speaker spk_5 track 1 of 1:  35%|███▌      | 11/3





Processing speaker spk_6 track 1 of 1:  30%|███       | 10/33 [00:49<02:00,  5.22s/it]
Processing speaker spk_6 track 1 of 1:  33%|███▎      | 11/3





Processing speaker spk_7 track 1 of 1:  28%|██▊       | 10/36 [00:17<01:14,  2.86s/it]
Processing speaker spk_7 track 1 of 1:  31%|███       | 11/3


RUN: EVAL session_16

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_16
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_16


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  33%|███▎      | 10/30 [00:14<00:26,  1.33s/it]
Processing speaker spk_0 track 1 of 1:  37%|███▋      | 11/3





Processing speaker spk_1 track 1 of 2:  45%|████▌     | 10/22 [00:09<00:12,  1.02s/it]
Processing speaker spk_1 track 1 of 2:  50%|█████     | 11/2





Processing speaker spk_2 track 1 of 1:  59%|█████▉    | 10/17 [00:50<00:31,  4.53s/it]
Processing speaker spk_2 track 1 of 1:  65%|██████▍   | 11/1





Processing speaker spk_3 track 1 of 1:  91%|█████████ | 10/11 [00:46<00:03,  3.36s/it]
Processing speaker spk_3 track 1 of 1: 100%|██████████| 11/1





Processing speaker spk_4 track 1 of 1:  38%|███▊      | 10/26 [00:44<01:10,  4.39s/it]
Processing speaker spk_4 track 1 of 1:  42%|████▏     | 11/2





Processing speaker spk_5 track 1 of 1:  34%|███▍      | 10/29 [00:20<00:54,  2.88s/it]
Processing speaker spk_5 track 1 of 1:  38%|███▊      | 11/2


RUN: EVAL session_17

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_17
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_17


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  33%|███▎      | 10/30 [00:25<01:08,  3.44s/it]
Processing speaker spk_0 track 1 of 1:  37%|███▋      | 11/3





Processing speaker spk_1 track 1 of 2: 100%|██████████| 9/9 [00:06<00:00,  1.44it/s]

Processing speaker spk_1 track 2 of 2:   0%|          | 0/5 [00:00<?, ?it/s]
Processing speaker spk_1 track 2 of 2:  20%|██        | 1/5 [00:01<00:04,  1.03





Processing speaker spk_2 track 1 of 1:  45%|████▌     | 10/22 [00:37<00:34,  2.91s/it]
Processing speaker spk_2 track 1 of 1:  50%|█████     | 11/2





Processing speaker spk_3 track 1 of 1:  67%|██████▋   | 10/15 [00:37<00:20,  4.07s/it]
Processing speaker spk_3 track 1 of 1:  73%|███████▎  | 11/1





Processing speaker spk_4 track 1 of 1:  34%|███▍      | 10/29 [00:23<00:41,  2.19s/it]
Processing speaker spk_4 track 1 of 1:  38%|███▊      | 11/2





Processing speaker spk_5 track 1 of 1:  43%|████▎     | 10/23 [00:37<01:11,  5.49s/it]
Processing speaker spk_5 track 1 of 1:  48%|████▊     | 11/2


RUN: EVAL session_18

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_18
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_18


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  36%|███▌      | 10/28 [00:13<00:25,  1.39s/it]
Processing speaker spk_0 track 1 of 1:  39%|███▉      | 11/2





Processing speaker spk_1 track 1 of 1:  36%|███▌      | 10/28 [00:10<00:20,  1.13s/it]
Processing speaker spk_1 track 1 of 1:  39%|███▉      | 11/2





Processing speaker spk_2 track 1 of 1:  45%|████▌     | 10/22 [00:24<00:44,  3.67s/it]
Processing speaker spk_2 track 1 of 1:  50%|█████     | 11/2





Processing speaker spk_3 track 1 of 1:  50%|█████     | 10/20 [00:42<00:26,  2.70s/it]

Processing speaker spk_4 track 1 of 1:  48%|████▊     | 10/21 [00:42<00:38,  3.46s/it]

Processing speaker spk_5 track 1 of 1:  40%|████      | 10/25 [00:32<01:16,  5.08s/it]


RUN: EVAL session_19

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_19
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_19


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 4: 100%|██████████| 4/4 [00:03<00:00,  1.10it/s]

Processing speaker spk_0 track 2 of 4: 100%|██████████| 6/6 [00:04<00:00,  1.23

Processing speaker spk_1 track 1 of 3: 100%|██████████| 3/3 [00:04<00:00,  1.37s/it]

Processing speaker spk_1 track 2 of 3: 100%|██████████| 4/4 [00:09<00:00,  2.41s/it]

Processing speaker spk_1 track 3 of 3:   4%|▎         | 1/27 [00:05<02:27,  5.66s/it]

Processing speaker spk_2 track 1 of 1:  62%|██████▎   | 10/16 [00:27<00:07,  1.21s/it]

Processing speaker spk_3 track 1 of 1:  71%|███████▏  | 10/14 [01:14<00:26,  6.62s/it]

Processing speaker spk_4 track 1 of 1:  31%|███▏      | 10/32 [00:07<00:17,  1.28it/s]

Processing speaker spk_5 track 1 of 1:  56%|█████▌    | 10/18 [00:39<00:21,  2.75s/it]


RUN: EVAL session_31

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_31
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_31


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  29%|██▉       | 10/34 [00:24<01:12,  3.01s/it]

Processing speaker spk_1 track 1 of 2:  33%|███▎      | 10/30 [00:15<00:34,  1.71s/it]

Processing speaker spk_2 track 1 of 1:  26%|██▋       | 10/38 [00:24<01:24,  3.00s/it]

Processing speaker spk_3 track 1 of 1:  29%|██▊       | 10/35 [00:08<00:21,  1.16it/s]

Processing speaker spk_4 track 1 of 1:  32%|███▏      | 10/31 [00:32<00:59,  2.83s/it]

Processing speaker spk_5 track 1 of 1:  33%|███▎      | 10/30 [00:21<01:24,  4.23s/it]


RUN: EVAL session_32

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_32
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_32


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  37%|███▋      | 10/27 [00:19<00:41,  2.47s/it]

Processing speaker spk_1 track 1 of 1:  36%|███▌      | 10/28 [00:40<01:52,  6.27s/it]

Processing speaker spk_2 track 1 of 1:  36%|███▌      | 10/28 [00:14<00:24,  1.35s/it]

Processing speaker spk_3 track 1 of 2: 100%|██████████| 2/2 [00:01<00:00,  1.14it/s]

Processing speaker spk_3 track 2 of 2:  27%|██▋       | 7/26 [00:11<00:39,  2.07s/it]

Processing speaker spk_4 track 1 of 1:  34%|███▍      | 10/29 [00:18<00:54,  2.84s/it]

Processing speaker spk_5 track 1 of 1:  30%|███       | 10/33 [00:35<01:53,  4.95s/it]


RUN: EVAL session_34

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_34
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_34


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 2:  45%|████▌     | 10/22 [00:36<00:24,  2.03s/it]

Processing speaker spk_1 track 1 of 1:  32%|███▏      | 10/31 [00:11<00:31,  1.48s/it]

Processing speaker spk_2 track 1 of 1:  34%|███▍      | 10/29 [00:19<01:16,  4.03s/it]

Processing speaker spk_3 track 1 of 1:  29%|██▉       | 10/34 [00:57<01:56,  4.86s/it]

Processing speaker spk_4 track 1 of 2: 100%|██████████| 2/2 [00:01<00:00,  1.82it/s]

Processing speaker spk_4 track 2 of 2:  25%|██▌       | 7/28 [00:24<01:18,  3.72s/it]

Processing speaker spk_5 track 1 of 1:  31%|███▏      | 10/32 [00:23<01:02,  2.84s/it]


RUN: EVAL session_35

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_35
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_35


Processing speakers:   0%|          | 0/4 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  27%|██▋       | 10/37 [00:24<00:57,  2.13s/it]

Processing speaker spk_1 track 1 of 2:  53%|█████▎    | 10/19 [00:25<00:22,  2.52s/it]
Processing speaker spk_1 track 1 of 2:  58%|█████▊    | 11/1





Processing speaker spk_2 track 1 of 1:  29%|██▉       | 10/34 [00:27<00:54,  2.28s/it]
Processing speaker spk_2 track 1 of 1:  32%|███▏      | 11/3





Processing speaker spk_3 track 1 of 1:  32%|███▏      | 10/31 [00:21<00:46,  2.22s/it]
Processing speaker spk_3 track 1 of 1:  35%|███▌      | 11/3


RUN: EVAL session_36

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_36
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_36


Processing speakers:   0%|          | 0/4 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 5: 100%|██████████| 6/6 [00:23<00:00,  3.92s/it]

Processing speaker spk_0 track 2 of 5:  43%|████▎     | 3/7 [00:04<00:05,  1.38s/it]
Processing speaker spk_0 track 2 of 5:  57%|█████▋    | 4/7 [00:06<00:05,  1.86





Processing speaker spk_1 track 1 of 1:  28%|██▊       | 10/36 [00:12<00:24,  1.04it/s]
Processing speaker spk_1 track 1 of 1:  31%|███       | 11/3





Processing speaker spk_2 track 1 of 1:  32%|███▏      | 10/31 [00:10<00:19,  1.07it/s]
Processing speaker spk_2 track 1 of 1:  35%|███▌      | 11/3





Processing speaker spk_3 track 1 of 3: 100%|██████████| 6/6 [00:25<00:00,  4.17s/it]

Processing speaker spk_3 track 2 of 3:  60%|██████    | 3/5 [00:16<00:10,  5.10s/it]
Processing speaker spk_3 track 2 of 3:  80%|████████  | 4/5 [00:18<00:03,  3.94


RUN: EVAL session_37

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_37
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_37


Processing speakers:   0%|          | 0/4 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  25%|██▌       | 10/40 [00:08<00:24,  1.23it/s]
Processing speaker spk_0 track 1 of 1:  28%|██▊       | 11/4





Processing speaker spk_1 track 1 of 2: 100%|██████████| 6/6 [00:18<00:00,  3.01s/it]

Processing speaker spk_1 track 2 of 2:  14%|█▎        | 3/22 [00:05<00:31,  1.68s/it]
Processing speaker spk_1 track 2 of 2:  18%|█▊        | 4/22 [00:05<00:21, 





Processing speaker spk_2 track 1 of 1:  26%|██▌       | 10/39 [00:10<00:34,  1.18s/it]
Processing speaker spk_2 track 1 of 1:  28%|██▊       | 11/3





Processing speaker spk_3 track 1 of 3: 100%|██████████| 7/7 [00:39<00:00,  5.64s/it]

Processing speaker spk_3 track 2 of 3:  18%|█▊        | 2/11 [00:08<00:37,  4.18s/it]
Processing speaker spk_3 track 2 of 3:  27%|██▋       | 3/11 [00:09<00:20,  


RUN: EVAL session_38

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_38
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_38


Processing speakers:   0%|          | 0/4 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 2:  53%|█████▎    | 10/19 [00:45<00:29,  3.29s/it]
Processing speaker spk_0 track 1 of 2:  58%|█████▊    | 11/1





Processing speaker spk_1 track 1 of 1:  30%|███       | 10/33 [00:26<00:51,  2.23s/it]
Processing speaker spk_1 track 1 of 1:  33%|███▎      | 11/3





Processing speaker spk_2 track 1 of 1:  26%|██▋       | 10/38 [00:13<00:41,  1.48s/it]
Processing speaker spk_2 track 1 of 1:  29%|██▉       | 11/3





Processing speaker spk_3 track 1 of 1:  34%|███▍      | 10/29 [00:32<01:12,  3.84s/it]
Processing speaker spk_3 track 1 of 1:  38%|███▊      | 11/2


RUN: EVAL session_39

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_39
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_39


Processing speakers:   0%|          | 0/4 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  33%|███▎      | 10/30 [00:25<00:42,  2.13s/it]
Processing speaker spk_0 track 1 of 1:  37%|███▋      | 11/3





Processing speaker spk_1 track 1 of 3: 100%|██████████| 8/8 [00:07<00:00,  1.10it/s]

Processing speaker spk_1 track 2 of 3:   6%|▌         | 1/18 [00:01<00:30,  1.82s/it]
Processing speaker spk_1 track 2 of 3:  11%|█         | 2/18 [00:02<00:21,  1





Processing speaker spk_2 track 1 of 2:  24%|██▍       | 10/41 [00:07<00:22,  1.41it/s]
Processing speaker spk_2 track 1 of 2:  27%|██▋       | 11/4





Processing speaker spk_3 track 1 of 3:  32%|███▏      | 10/31 [00:26<01:03,  3.04s/it]
Processing speaker spk_3 track 1 of 3:  35%|███▌      | 11/3


RUN: EVAL session_73

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_73
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_73


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  30%|███       | 10/33 [00:25<00:58,  2.54s/it]
Processing speaker spk_0 track 1 of 1:  33%|███▎      | 11/3





Processing speaker spk_1 track 1 of 1:  40%|████      | 10/25 [00:08<00:13,  1.15it/s]
Processing speaker spk_1 track 1 of 1:  44%|████▍     | 11/2





Processing speaker spk_2 track 1 of 1:  27%|██▋       | 10/37 [00:20<01:13,  2.72s/it]
Processing speaker spk_2 track 1 of 1:  30%|██▉       | 11/3





Processing speaker spk_3 track 1 of 1:  33%|███▎      | 10/30 [00:09<00:15,  1.28it/s]
Processing speaker spk_3 track 1 of 1:  37%|███▋      | 11/3





Processing speaker spk_4 track 1 of 2: 100%|██████████| 2/2 [00:01<00:00,  1.47it/s]

Processing speaker spk_4 track 2 of 2:  78%|███████▊  | 7/9 [00:05<00:01,  1.48it/s]
Processing speaker spk_4 track 2 of 2:  89%|████████▉ | 8/9 [00:05<00:00,  1.46


RUN: EVAL session_74

Starting inference for experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_74
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_74


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  31%|███▏      | 10/32 [00:10<00:23,  1.09s/it]
Processing speaker spk_0 track 1 of 1:  34%|███▍      | 11/3





Processing speaker spk_1 track 1 of 1:  40%|████      | 10/25 [00:12<00:14,  1.06it/s]
Processing speaker spk_1 track 1 of 1:  44%|████▍     | 11/2





Processing speaker spk_2 track 1 of 1:  36%|███▌      | 10/28 [00:46<01:38,  5.45s/it]
Processing speaker spk_2 track 1 of 1:  39%|███▉      | 11/2





Processing speaker spk_3 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]





Processing speaker spk_4 track 1 of 2:   0%|          | 0/15 [00:00<?, ?it/s]


RUN: EVAL session_75

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_75
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_75


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]





Processing speaker spk_1 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]





Processing speaker spk_2 track 1 of 1:   0%|          | 0/33 [00:00<?, ?it/s]





Processing speaker spk_3 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]





Processing speaker spk_4 track 1 of 2: 0it [00:00, ?it/s]

Processing speaker spk_4 track 2 of 2:   0%|          | 0/27 [00:00<?, ?it/s]



RUN: EVAL session_76

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_76
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_76


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/17 [00:00<?, ?it/s]





Processing speaker spk_1 track 1 of 1:   0%|          | 0/28 [00:00<?, ?it/s]





Processing speaker spk_2 track 1 of 1:   0%|          | 0/26 [00:00<?, ?it/s]





Processing speaker spk_3 track 1 of 4:   0%|          | 0/13 [00:00<?, ?it/s]





Processing speaker spk_4 track 1 of 1:   0%|          | 0/16 [00:00<?, ?it/s]


RUN: EVAL session_77

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_77
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_77


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/28 [00:00<?, ?it/s]





Processing speaker spk_1 track 1 of 1:   0%|          | 0/19 [00:00<?, ?it/s]





Processing speaker spk_2 track 1 of 1:   0%|          | 0/23 [00:00<?, ?it/s]





Processing speaker spk_3 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]





Processing speaker spk_4 track 1 of 1:   0%|          | 0/36 [00:00<?, ?it/s]


RUN: EVAL session_78

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_78
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_78


Processing speakers:   0%|          | 0/2 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 2: 0it [00:00, ?it/s]

Processing speaker spk_0 track 2 of 2:   0%|          | 0/30 [00:00<?, ?it/s]






Processing speaker spk_1 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]


RUN: EVAL session_79

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_79
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_79


Processing speakers:   0%|          | 0/2 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/30 [00:00<?, ?it/s]





Processing speaker spk_1 track 1 of 2:   0%|          | 0/20 [00:00<?, ?it/s]


RUN: EVAL session_80

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_80
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_80


Processing speakers:   0%|          | 0/2 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 3: 100%|██████████| 7/7 [00:05<00:00,  1.27it/s]

Processing speaker spk_0 track 2 of 3:   0%|          | 0/8 [00:00<?, ?it/s]





Processing speaker spk_1 track 1 of 3:   0%|          | 0/18 [00:00<?, ?it/s]

FAILED: session_80 -> RuntimeError('Could not open input file: data-bin/eval/session_80/speakers/spk_1/central_crops/track_01_lip.av.mp4 Invalid data found when processing input')

RUN: EVAL session_81

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_81
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_81


Processing speakers:   0%|          | 0/2 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/24 [00:00<?, ?it/s]





Processing speaker spk_1 track 1 of 1:   0%|          | 0/26 [00:00<?, ?it/s]


RUN: EVAL session_82

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_82
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_82


Processing speakers:   0%|          | 0/2 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 3: 0it [00:00, ?it/s]

Processing speaker spk_0 track 2 of 3:   0%|          | 0/23 [00:00<?, ?it/s]






[Acessing speaker spk_1 track 1 of 5:   0%|          | 0/2 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 5:  50%|█████     | 1/2 [00:00<00:00,  1.50it/s]
Processing speaker spk_1 track 1 of 5: 100%|██████████| 2/2 [00:01<00:00,  1.75it/s]

[Acessing speaker spk_1 track 2 of 5:   0%|          | 0/8 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 2 of 5:  12%|█▎        | 1/8 [00:00<00:04,  1.69it/s]
[Acessing speaker spk_1 track 2 of 5:  25%|██▌       | 2/8 [00:01<00:02,  2.01it/s]
[Acessing speaker spk_1 track 2 of 5:  38%|███▊      | 3/8 [00:01<00:02,  2.09it/s]
[Acessing speaker spk_1 track 2 of 5:  50%|█████     | 4/8 [00:02<00:02,  1.73it/s]
[Acessing speaker spk_1 track 2 of 5:  62%|██████▎   | 5/8 [00:03<00:02,  1.46it/s]
[Acessing speaker spk_1 track 2 of 5:  75%|███████▌  | 6/8 [00:03<00:01,  1.60it/s]
[Acessing speaker spk_1 track 2 of 5:  88%|████████▊ | 7/8 [00:03<00:00,  1.82it/s]
Processing speaker spk_1 track 2 of 5: 100%|██████████| 8/8 [00:04<00:00,  1.75

Failed: ['session_148', 'session_80']


## 11 – Resume list for run 3

Only `session_148` and `session_80` are still failing (OOM, caught by the try/except).

In [17]:
# Sessions 148 and 80 failed in the second run, so they are retried here
missing_session_ids = [
    "session_148",
    "session_80",
]
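
The resume lists in this notebook are typed by hand from the `Failed:` output of the previous run. As a minimal sketch, such a list could also be derived from disk. This assumes, hypothetically, that a processed session ends up with at least one `.vtt` file under `<data_root>/<session_id>/<output_dir_name>/`. The helper name `find_missing_sessions` and that directory layout are illustrative assumptions, not confirmed behavior of the inference pipeline.

```python
from pathlib import Path


def find_missing_sessions(data_root, output_dir_name):
    """Return session IDs under data_root that have no .vtt output yet.

    Assumed (hypothetical) layout: <data_root>/<session_id>/<output_dir_name>/*.vtt
    """
    missing = []
    for session_dir in sorted(Path(data_root).glob("session_*")):
        if not session_dir.is_dir():
            continue
        out_dir = session_dir / output_dir_name
        # A session counts as done only if its output folder exists and holds a VTT file.
        has_vtt = out_dir.is_dir() and any(out_dir.glob("*.vtt"))
        if not has_vtt:
            missing.append(session_dir.name)
    return missing
```

The result is sorted lexicographically by directory name, so `session_148` is listed before `session_80`. A session that aborted halfway may still have partial output, so a list produced this way would need to be cross-checked against the `Failed:` output.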


## 12 – Resume run 3 (identical to run 2)

In [18]:
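import gc
from pathlib import Path

import torch

# Build full session paths from the resume list (DATA_ROOT is defined earlier in the notebook)
missing_session_dirs = [str(Path(DATA_ROOT) / sid) for sid in missing_session_ids]

failed = []
for session_dir in missing_session_dirs:
    sid = Path(session_dir).name
    print("\nRUN:", SPLIT, sid)
    try:
        # Override the AVSR segmentation thresholds while this session runs
        with patch_avsr_segmentation(min_on=best_min_on, min_off=best_min_off):
            run_inference_for_experiment(
                exp_name=exp_name,
                base_models=BASE_MODELS,
                experiments=EXPERIMENTS,
                session_dir=session_dir,
            )
    except Exception as e:
        # Record the failure (including CUDA OOM) and continue with the next session
        print("FAILED:", sid, "->", repr(e))
        failed.append(sid)
    finally:
        # Collect unreachable Python objects first so their CUDA blocks can be released
        gc.collect()
        torch.cuda.empty_cache()

print("Failed:", failed)  # Result of this run: ['session_148']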



RUN: EVAL session_148

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_148
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_148


Processing speakers:   0%|          | 0/8 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 1:   0%|          | 0/33 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 1:   3%|▎         | 1/33 [00:01<00:34,  1.09s/it]
[Acessing speaker spk_0 track 1 of 1:   6%|▌         | 2/33 [00:01<00:25,  1.23it/s]
[Acessing speaker spk_0 track 1 of 1:   9%|▉         | 3/33 [00:03<00:34,  1.17s/it]
[Acessing speaker spk_0 track 1 of 1:  12%|█▏        | 4/33 [00:06<01:02,  2.16s/it]
[Acessing speaker spk_0 track 1 of 1:  15%|█▌        | 5/33 [00:08<00:50,  1.81s/it]
[Acessing speaker spk_0 track 1 of 1:  18%|█▊        | 6/33 [00:08<00:37,  1.37s/it]
[Acessing speaker spk_0 track 1 of 1:  21%|██        | 7/33 [00:10<00:38,  1.48s/it]
[Acessing speaker spk_0 track 1 of 1:  24%|██▍       | 8/33 [00:10<00:30,  1.20s/it]
[Acessing speaker spk_0 track 1 of 1:  27%|██▋       | 9/33 [00:12<00:31,  1.32s/it]
[Acessing speaker spk_0 track 1 of 1:  30%|███       | 10/33 [00:13<00:27,  1.21s/it]
[Acessing speaker spk_0 track 1 of 1:  33%|███▎      | 11/3





[Acessing speaker spk_1 track 1 of 5:   0%|          | 0/8 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 5:  12%|█▎        | 1/8 [00:00<00:03,  2.08it/s]
[Acessing speaker spk_1 track 1 of 5:  25%|██▌       | 2/8 [00:01<00:03,  1.85it/s]
[Acessing speaker spk_1 track 1 of 5:  38%|███▊      | 3/8 [00:01<00:02,  1.77it/s]
[Acessing speaker spk_1 track 1 of 5:  50%|█████     | 4/8 [00:02<00:02,  1.70it/s]
[Acessing speaker spk_1 track 1 of 5:  62%|██████▎   | 5/8 [00:03<00:02,  1.09it/s]
[Acessing speaker spk_1 track 1 of 5:  75%|███████▌  | 6/8 [00:04<00:01,  1.14it/s]
[Acessing speaker spk_1 track 1 of 5:  88%|████████▊ | 7/8 [00:05<00:00,  1.09it/s]
Processing speaker spk_1 track 1 of 5: 100%|██████████| 8/8 [00:06<00:00,  1.28it/s]

[Acessing speaker spk_1 track 2 of 5:   0%|          | 0/9 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 2 of 5:  11%|█         | 1/9 [00:00<00:06,  1.32it/s]
[Acessing speaker spk_1 track 2 of 5:  22%|██▏       | 2/9 [00:02<00:08,  1.27





[Acessing speaker spk_2 track 1 of 2:   0%|          | 0/28 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 2:   4%|▎         | 1/28 [00:00<00:22,  1.21it/s]
[Acessing speaker spk_2 track 1 of 2:   7%|▋         | 2/28 [00:03<00:55,  2.12s/it]
[Acessing speaker spk_2 track 1 of 2:  11%|█         | 3/28 [00:10<01:43,  4.13s/it]
[Acessing speaker spk_2 track 1 of 2:  14%|█▍        | 4/28 [00:11<01:05,  2.75s/it]
[Acessing speaker spk_2 track 1 of 2:  18%|█▊        | 5/28 [00:12<00:50,  2.18s/it]
[Acessing speaker spk_2 track 1 of 2:  21%|██▏       | 6/28 [00:16<01:03,  2.89s/it]
[Acessing speaker spk_2 track 1 of 2:  25%|██▌       | 7/28 [00:18<00:54,  2.59s/it]
[Acessing speaker spk_2 track 1 of 2:  29%|██▊       | 8/28 [00:19<00:39,  1.98s/it]
[Acessing speaker spk_2 track 1 of 2:  32%|███▏      | 9/28 [00:19<00:30,  1.59s/it]
[Acessing speaker spk_2 track 1 of 2:  36%|███▌      | 10/28 [00:22<00:33,  1.86s/it]
[Acessing speaker spk_2 track 1 of 2:  39%|███▉      | 11/2





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/24 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   4%|▍         | 1/24 [00:00<00:19,  1.17it/s]
[Acessing speaker spk_3 track 1 of 1:   8%|▊         | 2/24 [00:06<01:26,  3.94s/it]
[Acessing speaker spk_3 track 1 of 1:  12%|█▎        | 3/24 [00:13<01:48,  5.16s/it]
[Acessing speaker spk_3 track 1 of 1:  17%|█▋        | 4/24 [00:21<02:03,  6.17s/it]
[Acessing speaker spk_3 track 1 of 1:  21%|██        | 5/24 [00:29<02:11,  6.93s/it]
[Acessing speaker spk_3 track 1 of 1:  25%|██▌       | 6/24 [00:39<02:22,  7.90s/it]
[Acessing speaker spk_3 track 1 of 1:  29%|██▉       | 7/24 [00:47<02:14,  7.90s/it]
[Acessing speaker spk_3 track 1 of 1:  33%|███▎      | 8/24 [00:57<02:18,  8.66s/it]
Processing speaker spk_3 track 1 of 1:  38%|███▊      | 9/24 [01:21<02:16,  9.07s/it]
Processing speakers:  38%|███▊      | 3/8 [04:28<07:28, 89.65s/it]

Error during inference for segment {'video': 'data-bin/eval/session_148/speakers/spk_3/central_crops/track_00_lip.av.mp4', 'start_time': 167.0, 'end_time': 186.16}
FAILED: session_148 -> OutOfMemoryError('CUDA out of memory. Tried to allocate 24.00 MiB. GPU 0 has a total capacity of 79.14 GiB of which 19.50 MiB is free. Process 3077613 has 79.11 GiB memory in use. Of the allocated memory 74.26 GiB is allocated by PyTorch, and 4.35 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)')






RUN: EVAL session_80

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_80
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_80


Processing speakers:   0%|          | 0/2 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 3:   0%|          | 0/7 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 3:  14%|█▍        | 1/7 [00:00<00:02,  2.04it/s]
[Acessing speaker spk_0 track 1 of 3:  29%|██▊       | 2/7 [00:01<00:02,  1.94it/s]
[Acessing speaker spk_0 track 1 of 3:  43%|████▎     | 3/7 [00:03<00:04,  1.22s/it]
[Acessing speaker spk_0 track 1 of 3:  57%|█████▋    | 4/7 [00:03<00:03,  1.01s/it]
[Acessing speaker spk_0 track 1 of 3:  71%|███████▏  | 5/7 [00:04<00:01,  1.21it/s]
[Acessing speaker spk_0 track 1 of 3:  86%|████████▌ | 6/7 [00:04<00:00,  1.42it/s]
Processing speaker spk_0 track 1 of 3: 100%|██████████| 7/7 [00:05<00:00,  1.32it/s]

[Acessing speaker spk_0 track 2 of 3:   0%|          | 0/8 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 2 of 3:  12%|█▎        | 1/8 [00:00<00:03,  1.99it/s]
[Acessing speaker spk_0 track 2 of 3:  25%|██▌       | 2/8 [00:00<00:02,  2.05it/s]
[Acessing speaker spk_0 track 2 of 3:  38%|███▊      | 3/8 [00:01<00:02,  1.88





[Acessing speaker spk_1 track 1 of 3:   0%|          | 0/18 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 3:   6%|▌         | 1/18 [00:09<02:48,  9.92s/it]
[Acessing speaker spk_1 track 1 of 3:  11%|█         | 2/18 [00:13<01:39,  6.23s/it]
[Acessing speaker spk_1 track 1 of 3:  17%|█▋        | 3/18 [00:14<00:55,  3.68s/it]
[Acessing speaker spk_1 track 1 of 3:  22%|██▏       | 4/18 [00:14<00:33,  2.42s/it]
[Acessing speaker spk_1 track 1 of 3:  28%|██▊       | 5/18 [00:15<00:23,  1.79s/it]
[Acessing speaker spk_1 track 1 of 3:  33%|███▎      | 6/18 [00:16<00:17,  1.44s/it]
[Acessing speaker spk_1 track 1 of 3:  39%|███▉      | 7/18 [00:18<00:19,  1.74s/it]
[Acessing speaker spk_1 track 1 of 3:  44%|████▍     | 8/18 [00:19<00:13,  1.36s/it]
[Acessing speaker spk_1 track 1 of 3:  50%|█████     | 9/18 [00:21<00:15,  1.69s/it]
[Acessing speaker spk_1 track 1 of 3:  56%|█████▌    | 10/18 [00:22<00:11,  1.44s/it]
[Acessing speaker spk_1 track 1 of 3:  61%|██████    | 11/1

Failed: ['session_148']
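
The OOM message for `session_148` above contains the figures needed to judge whether fragmentation is a plausible cause. The following sketch extracts them with regular expressions and compares the failed request against the memory that PyTorch had reserved but not allocated. The helper name `parse_oom_message` is hypothetical, and the patterns are written against the wording of the message in this log, which may differ between PyTorch versions.

```python
import re

# Conversion factors to GiB for the units that appear in the message.
_UNITS = {"MiB": 1 / 1024, "GiB": 1.0}


def parse_oom_message(message):
    """Extract memory figures (in GiB) from a PyTorch CUDA OOM message."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"total capacity of ([\d.]+) (MiB|GiB)",
        "free": r"of which ([\d.]+) (MiB|GiB) is free",
        "allocated": r"([\d.]+) (MiB|GiB) is allocated by PyTorch",
        "reserved_unallocated": r"([\d.]+) (MiB|GiB) is reserved by PyTorch but unallocated",
    }
    figures = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            figures[key] = float(match.group(1)) * _UNITS[match.group(2)]
    return figures


# Message text copied from the session_148 failure in run 3.
message = (
    "CUDA out of memory. Tried to allocate 24.00 MiB. GPU 0 has a total capacity "
    "of 79.14 GiB of which 19.50 MiB is free. Process 3077613 has 79.11 GiB memory "
    "in use. Of the allocated memory 74.26 GiB is allocated by PyTorch, and 4.35 GiB "
    "is reserved by PyTorch but unallocated."
)
figures = parse_oom_message(message)

# The failed request is far smaller than the reserved-but-unallocated pool,
# which suggests fragmentation rather than a plain lack of memory.
likely_fragmented = figures["requested"] < figures["reserved_unallocated"]
```

Here a 24 MiB request failed although 4.35 GiB were reserved but unallocated. This is the situation for which the error message itself recommends `expandable_segments:True`.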


## 13 – CUDA configuration for run 4

`PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` lets PyTorch's caching allocator
create CUDA memory segments that can be grown later, instead of reserving many fixed-size segments.
This reduces fragmentation and makes it possible to process `session_148`,
which contains particularly many or long video chunks.

In [6]:
import os

# expandable_segments: lets PyTorch manage CUDA memory in a more fragmentation-resistant way
# -> allows session_148 to be processed, which runs out of memory without this flag
# The variables are set before importing torch so they are in place when CUDA is initialized.

os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "3"
import torch
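
The order in this cell matters: the environment variables are written first and `torch` is imported afterwards. The sketch below shows one way to make that order explicit by reporting whether `torch` had already been imported when the variable was set. The helper name `set_alloc_conf` and its `env` and `loaded_modules` parameters are hypothetical and exist only for illustration. Whether a late setting is still honored depends on when the allocator reads it, so the safe assumption is to set it before the import.

```python
import os
import sys


def set_alloc_conf(value, env=os.environ, loaded_modules=sys.modules):
    """Set PYTORCH_CUDA_ALLOC_CONF and report whether torch was imported before.

    Returns True if the variable was set before torch was imported, and False
    if torch was already loaded (the setting may then come too late).
    """
    torch_already_loaded = "torch" in loaded_modules
    env["PYTORCH_CUDA_ALLOC_CONF"] = value
    return not torch_already_loaded
```

In a notebook, a return value of `False` would be a hint to restart the kernel and run the configuration cell first.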


In [7]:
import torch
print(torch.cuda.get_device_name(0))


NVIDIA A100-SXM4-80GB


In [8]:
from tqdm import tqdm

## 14 – Resume list for run 4

Only `session_148` remains.

In [10]:
# Session 148 failed in the third run, so it is retried here
missing_session_ids = [
    "session_148",
]


## 15 – Resume run 4 (session_148 with expandable_segments)

With `expandable_segments:True`, PyTorch manages CUDA memory in a way that is more
resistant to fragmentation, and `session_148` is processed successfully (`Failed: []`).

In [11]:
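import gc
from pathlib import Path

# Build full session paths from the resume list (DATA_ROOT is defined earlier in the notebook)
missing_session_dirs = [str(Path(DATA_ROOT) / sid) for sid in missing_session_ids]

failed = []
for session_dir in missing_session_dirs:
    sid = Path(session_dir).name
    print("\nRUN:", SPLIT, sid)
    try:
        # Override the AVSR segmentation thresholds while this session runs
        with patch_avsr_segmentation(min_on=best_min_on, min_off=best_min_off):
            run_inference_for_experiment(
                exp_name=exp_name,
                base_models=BASE_MODELS,
                experiments=EXPERIMENTS,
                session_dir=session_dir,
            )
    except Exception as e:
        # Record the failure (including CUDA OOM) and continue with the next session
        print("FAILED:", sid, "->", repr(e))
        failed.append(sid)
    finally:
        # Collect unreachable Python objects first so their CUDA blocks can be released
        gc.collect()
        torch.cuda.empty_cache()

print("Failed:", failed)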



RUN: EVAL session_148

Starte Inference für Experiment: EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/eval/session_148
  comment         = EVAL FINAL: AVSR override min_on=1.0, min_off=1.2
Loading avsr_cocktail model...




Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_148


Processing speakers:   0%|          | 0/8 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 1:   0%|          | 0/33 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 1:   3%|▎         | 1/33 [00:01<00:55,  1.72s/it]
[Acessing speaker spk_0 track 1 of 1:   6%|▌         | 2/33 [00:02<00:30,  1.01it/s]
[Acessing speaker spk_0 track 1 of 1:   9%|▉         | 3/33 [00:02<00:21,  1.39it/s]
[Acessing speaker spk_0 track 1 of 1:  12%|█▏        | 4/33 [00:04<00:35,  1.22s/it]
[Acessing speaker spk_0 track 1 of 1:  15%|█▌        | 5/33 [00:05<00:29,  1.07s/it]
[Acessing speaker spk_0 track 1 of 1:  18%|█▊        | 6/33 [00:05<00:22,  1.19it/s]
[Acessing speaker spk_0 track 1 of 1:  21%|██        | 7/33 [00:07<00:25,  1.03it/s]
[Acessing speaker spk_0 track 1 of 1:  24%|██▍       | 8/33 [00:07<00:19,  1.26it/s]
[Acessing speaker spk_0 track 1 of 1:  27%|██▋       | 9/33 [00:08<00:21,  1.11it/s]
[Acessing speaker spk_0 track 1 of 1:  30%|███       | 10/33 [00:09<00:19,  1.19it/s]
[Acessing speaker spk_0 track 1 of 1:  33%|███▎      | 11/3





[Acessing speaker spk_1 track 1 of 5:   0%|          | 0/8 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 5:  12%|█▎        | 1/8 [00:00<00:02,  2.83it/s]
[Acessing speaker spk_1 track 1 of 5:  25%|██▌       | 2/8 [00:00<00:02,  2.51it/s]
[Acessing speaker spk_1 track 1 of 5:  38%|███▊      | 3/8 [00:01<00:02,  2.49it/s]
[Acessing speaker spk_1 track 1 of 5:  50%|█████     | 4/8 [00:01<00:01,  2.36it/s]
[Acessing speaker spk_1 track 1 of 5:  62%|██████▎   | 5/8 [00:02<00:01,  1.58it/s]
[Acessing speaker spk_1 track 1 of 5:  75%|███████▌  | 6/8 [00:03<00:01,  1.64it/s]
[Acessing speaker spk_1 track 1 of 5:  88%|████████▊ | 7/8 [00:03<00:00,  1.55it/s]
Processing speaker spk_1 track 1 of 5: 100%|██████████| 8/8 [00:04<00:00,  1.82it/s]

[Acessing speaker spk_1 track 2 of 5:   0%|          | 0/9 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 2 of 5:  11%|█         | 1/9 [00:00<00:04,  1.97it/s]
[Acessing speaker spk_1 track 2 of 5:  22%|██▏       | 2/9 [00:01<00:06,  1.15





[Acessing speaker spk_2 track 1 of 2:   0%|          | 0/28 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 2:   4%|▎         | 1/28 [00:00<00:14,  1.80it/s]
[Acessing speaker spk_2 track 1 of 2:   7%|▋         | 2/28 [00:02<00:37,  1.45s/it]
[Acessing speaker spk_2 track 1 of 2:  11%|█         | 3/28 [00:05<00:54,  2.17s/it]
[Acessing speaker spk_2 track 1 of 2:  14%|█▍        | 4/28 [00:06<00:34,  1.45s/it]
[Acessing speaker spk_2 track 1 of 2:  18%|█▊        | 5/28 [00:06<00:28,  1.25s/it]
[Acessing speaker spk_2 track 1 of 2:  21%|██▏       | 6/28 [00:09<00:41,  1.87s/it]
[Acessing speaker spk_2 track 1 of 2:  25%|██▌       | 7/28 [00:11<00:35,  1.71s/it]
[Acessing speaker spk_2 track 1 of 2:  29%|██▊       | 8/28 [00:11<00:26,  1.30s/it]
[Acessing speaker spk_2 track 1 of 2:  32%|███▏      | 9/28 [00:12<00:20,  1.06s/it]
[Acessing speaker spk_2 track 1 of 2:  36%|███▌      | 10/28 [00:14<00:22,  1.27s/it]
[Acessing speaker spk_2 track 1 of 2:  39%|███▉      | 11/2





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/24 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   4%|▍         | 1/24 [00:00<00:14,  1.61it/s]
[Acessing speaker spk_3 track 1 of 1:   8%|▊         | 2/24 [00:04<01:01,  2.78s/it]
[Acessing speaker spk_3 track 1 of 1:  12%|█▎        | 3/24 [00:09<01:15,  3.60s/it]
[Acessing speaker spk_3 track 1 of 1:  17%|█▋        | 4/24 [00:14<01:26,  4.35s/it]
[Acessing speaker spk_3 track 1 of 1:  21%|██        | 5/24 [00:20<01:32,  4.89s/it]
[Acessing speaker spk_3 track 1 of 1:  25%|██▌       | 6/24 [00:27<01:40,  5.56s/it]
[Acessing speaker spk_3 track 1 of 1:  29%|██▉       | 7/24 [00:33<01:34,  5.53s/it]
[Acessing speaker spk_3 track 1 of 1:  33%|███▎      | 8/24 [00:39<01:30,  5.68s/it]
[Acessing speaker spk_3 track 1 of 1:  38%|███▊      | 9/24 [00:47<01:35,  6.39s/it]
[Acessing speaker spk_3 track 1 of 1:  42%|████▏     | 10/24 [01:01<02:02,  8.79s/it]
[Acessing speaker spk_3 track 1 of 1:  46%|████▌     | 11/2





[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/25 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   4%|▍         | 1/25 [00:01<00:35,  1.49s/it]
[Acessing speaker spk_4 track 1 of 1:   8%|▊         | 2/25 [00:03<00:43,  1.90s/it]
[Acessing speaker spk_4 track 1 of 1:  12%|█▏        | 3/25 [00:04<00:30,  1.38s/it]
[Acessing speaker spk_4 track 1 of 1:  16%|█▌        | 4/25 [00:06<00:31,  1.48s/it]
[Acessing speaker spk_4 track 1 of 1:  20%|██        | 5/25 [00:06<00:22,  1.10s/it]
[Acessing speaker spk_4 track 1 of 1:  24%|██▍       | 6/25 [00:07<00:18,  1.05it/s]
[Acessing speaker spk_4 track 1 of 1:  28%|██▊       | 7/25 [00:07<00:13,  1.30it/s]
[Acessing speaker spk_4 track 1 of 1:  32%|███▏      | 8/25 [00:07<00:10,  1.56it/s]
[Acessing speaker spk_4 track 1 of 1:  36%|███▌      | 9/25 [00:10<00:18,  1.15s/it]
[Acessing speaker spk_4 track 1 of 1:  40%|████      | 10/25 [00:10<00:14,  1.03it/s]
[Acessing speaker spk_4 track 1 of 1:  44%|████▍     | 11/2





[Acessing speaker spk_5 track 1 of 2:   0%|          | 0/7 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 2:  14%|█▍        | 1/7 [00:01<00:06,  1.03s/it]
[Acessing speaker spk_5 track 1 of 2:  29%|██▊       | 2/7 [00:01<00:04,  1.02it/s]
[Acessing speaker spk_5 track 1 of 2:  43%|████▎     | 3/7 [00:02<00:03,  1.21it/s]
[Acessing speaker spk_5 track 1 of 2:  57%|█████▋    | 4/7 [00:07<00:07,  2.46s/it]
[Acessing speaker spk_5 track 1 of 2:  71%|███████▏  | 5/7 [00:09<00:04,  2.09s/it]
[Acessing speaker spk_5 track 1 of 2:  86%|████████▌ | 6/7 [00:14<00:03,  3.21s/it]
Processing speaker spk_5 track 1 of 2: 100%|██████████| 7/7 [00:19<00:00,  2.82s/it]

[Acessing speaker spk_5 track 2 of 2:   0%|          | 0/21 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 2 of 2:   5%|▍         | 1/21 [00:01<00:35,  1.78s/it]
[Acessing speaker spk_5 track 2 of 2:  10%|▉         | 2/21 [00:02<00:21,  1.11s/it]
[Acessing speaker spk_5 track 2 of 2:  14%|█▍        | 3/21 [00:02<00:15,  





[Acessing speaker spk_6 track 1 of 2:   0%|          | 0/16 [00:00<?, ?it/s]
[Acessing speaker spk_6 track 1 of 2:   6%|▋         | 1/16 [00:01<00:20,  1.34s/it]
[Acessing speaker spk_6 track 1 of 2:  12%|█▎        | 2/16 [00:02<00:19,  1.41s/it]
[Acessing speaker spk_6 track 1 of 2:  19%|█▉        | 3/16 [00:04<00:19,  1.50s/it]
[Acessing speaker spk_6 track 1 of 2:  25%|██▌       | 4/16 [00:04<00:12,  1.07s/it]
[Acessing speaker spk_6 track 1 of 2:  31%|███▏      | 5/16 [00:05<00:09,  1.21it/s]
[Acessing speaker spk_6 track 1 of 2:  38%|███▊      | 6/16 [00:05<00:06,  1.46it/s]
[Acessing speaker spk_6 track 1 of 2:  44%|████▍     | 7/16 [00:06<00:06,  1.45it/s]
[Acessing speaker spk_6 track 1 of 2:  50%|█████     | 8/16 [00:06<00:04,  1.71it/s]
[Acessing speaker spk_6 track 1 of 2:  56%|█████▋    | 9/16 [00:07<00:03,  1.81it/s]
[Acessing speaker spk_6 track 1 of 2:  62%|██████▎   | 10/16 [00:08<00:04,  1.47it/s]
[Acessing speaker spk_6 track 1 of 2:  69%|██████▉   | 11/1





[Acessing speaker spk_7 track 1 of 2:   0%|          | 0/11 [00:00<?, ?it/s]
[Acessing speaker spk_7 track 1 of 2:   9%|▉         | 1/11 [00:00<00:07,  1.28it/s]
[Acessing speaker spk_7 track 1 of 2:  18%|█▊        | 2/11 [00:02<00:10,  1.17s/it]
[Acessing speaker spk_7 track 1 of 2:  27%|██▋       | 3/11 [00:02<00:06,  1.19it/s]
[Acessing speaker spk_7 track 1 of 2:  36%|███▋      | 4/11 [00:03<00:04,  1.49it/s]
[Acessing speaker spk_7 track 1 of 2:  45%|████▌     | 5/11 [00:03<00:03,  1.72it/s]
[Acessing speaker spk_7 track 1 of 2:  55%|█████▍    | 6/11 [00:04<00:03,  1.60it/s]
[Acessing speaker spk_7 track 1 of 2:  64%|██████▎   | 7/11 [00:04<00:02,  1.60it/s]
[Acessing speaker spk_7 track 1 of 2:  73%|███████▎  | 8/11 [00:05<00:01,  1.73it/s]
[Acessing speaker spk_7 track 1 of 2:  82%|████████▏ | 9/11 [00:05<00:01,  1.92it/s]
[Acessing speaker spk_7 track 1 of 2:  91%|█████████ | 10/11 [00:10<00:01,  1.92s/it]
Processing speaker spk_7 track 1 of 2: 100%|██████████| 11/1

Failed: []


## 16 – Note: no evaluation on the eval set

The eval set has no labels, so `evaluate.py` cannot be run.
The generated VTT files (`output_EVAL_final_bugfix_mdOn1p0_mdOff1p2_bs12_len20`)
are used directly as the submission.
The final WER metrics on the eval set will only be announced by the
CHiME organizers after submission.
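
Because no WER can be computed locally, a structural check of the generated files is the only verification available before submission. The following sketch flags output folders without `.vtt` files, files that do not start with the `WEBVTT` header, and files without any cue timing line. The helper name `check_vtt_files` is hypothetical. A file without cues is not necessarily wrong (a speaker may have no detected speech), so that case is only reported for manual inspection.

```python
from pathlib import Path


def check_vtt_files(output_dir):
    """Return a list of problems found among the .vtt files in output_dir."""
    problems = []
    vtt_files = sorted(Path(output_dir).glob("*.vtt"))
    if not vtt_files:
        problems.append(f"no .vtt files in {output_dir}")
    for vtt in vtt_files:
        # utf-8-sig tolerates an optional byte order mark at the start of the file.
        text = vtt.read_text(encoding="utf-8-sig")
        if not text.startswith("WEBVTT"):
            # A WebVTT file must begin with the string "WEBVTT".
            problems.append(f"{vtt.name}: missing WEBVTT header")
        elif "-->" not in text:
            # Cue timing lines contain "-->"; none means no transcribed segment.
            problems.append(f"{vtt.name}: no cues")
    return problems
```

An empty return value means that every file in the folder passed these structural checks. It says nothing about transcription quality.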
