# 02k – Experiment 12: Post-Bugfix Grid Search over min_duration_on × min_duration_off

## Context: Why this grid search?

`02j_` showed that the bugfix in `segmentation.py` made the WER *worse* (beam=12, len=20: +0.029). The corrected
`min_duration_off` default changes how speech pauses are handled.
The goal of this notebook is to bring the WER back below the
pre-bugfix level through a targeted parameter search.

### The bug (as a reminder)

```python
# BUGGY: min_duration_off read its fallback from min_duration_ON
min_duration_off_frames = int(parameters.get("min_duration_off",
    CENTRAL_ASD_CHUNKING_PARAMETERS["min_duration_on"]) * 25)

# CORRECT:
min_duration_off_frames = int(parameters.get("min_duration_off",
    CENTRAL_ASD_CHUNKING_PARAMETERS["min_duration_off"]) * 25)
```
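The effect of the wrong fallback key can be reproduced with a plain dictionary. The sketch below is illustrative only: the values in `DEFAULTS` are hypothetical placeholders (the real `CENTRAL_ASD_CHUNKING_PARAMETERS` values are not shown in this notebook), and `FPS = 25` mirrors the `* 25` factor above.

```python
FPS = 25  # frames per second, mirrors the "* 25" factor in segmentation.py

# Hypothetical defaults, chosen only to make the difference visible
DEFAULTS = {"min_duration_on": 1.0, "min_duration_off": 0.5}


def off_frames_buggy(parameters):
    # Bug: falls back to the min_duration_ON default
    return int(parameters.get("min_duration_off", DEFAULTS["min_duration_on"]) * FPS)


def off_frames_fixed(parameters):
    # Fix: falls back to the min_duration_OFF default
    return int(parameters.get("min_duration_off", DEFAULTS["min_duration_off"]) * FPS)


# Without an explicit override the two versions disagree ...
print(off_frames_buggy({}), off_frames_fixed({}))

# ... but with an explicit override the fallback is never consulted
override = {"min_duration_off": 1.2}
print(off_frames_buggy(override) == off_frames_fixed(override))
```

The bug therefore only matters for runs that do not pass `min_duration_off` explicitly, which is exactly the default case.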

### Parameter semantics

| Parameter | Meaning | Effect of a larger value |
|-----------|---------|--------------------------|
| `min_duration_on` | Minimum duration of a speech region | Short utterances are discarded → fewer segments |
| `min_duration_off` | Maximum pause that is still merged | Longer pauses are bridged → fewer, longer segments |
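
To make the two parameters concrete, the following sketch applies them to a list of speech regions given as `(start, end)` tuples in seconds. It is not the project's `segment_by_asd` implementation: the function name `smooth_regions` and the order of operations (bridge pauses first, then drop short regions) are assumptions for illustration.

```python
def smooth_regions(regions, min_duration_on=0.0, min_duration_off=0.0):
    """Bridge short pauses, then drop short speech regions (illustrative sketch)."""
    merged = []
    for start, end in sorted(regions):
        # Bridge the pause if it is not longer than min_duration_off
        if merged and start - merged[-1][1] <= min_duration_off:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    # Discard regions shorter than min_duration_on
    return [(s, e) for s, e in merged if e - s >= min_duration_on]


regions = [(0.0, 0.3), (2.0, 3.0), (3.5, 5.0), (7.0, 7.5)]

# A larger min_duration_off bridges the 0.5 s pause between 3.0 and 3.5
print(smooth_regions(regions, min_duration_off=0.5))

# A larger min_duration_on drops the two short regions
print(smooth_regions(regions, min_duration_on=1.0))
```

In both cases the number of segments shrinks, which matches the last column of the table.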

## Technical approach: monkey-patching

Because `run_inference_for_experiment` does not accept the segmentation parameters
directly, `InferenceEngine.chunk_video` is overridden at runtime
(`patch_avsr_segmentation`). The patch is implemented as a context manager:
after the experiment, the original method is restored cleanly.
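
The patch-and-restore pattern can be sketched independently of the project code. In the example below, `Engine` and `patched_method` are hypothetical stand-ins for `InferenceEngine` and `patch_avsr_segmentation`. The point is only that the `finally` clause restores the original method even when the body of the `with` block raises.

```python
from contextlib import contextmanager


class Engine:
    # Hypothetical stand-in for the class whose method gets patched
    def chunk(self):
        return "original"


@contextmanager
def patched_method(cls, name, replacement):
    original = getattr(cls, name)      # keep a reference to the original
    setattr(cls, name, replacement)    # install the patch on the class
    try:
        yield
    finally:
        setattr(cls, name, original)   # always restore, even on exceptions


engine = Engine()
try:
    with patched_method(Engine, "chunk", lambda self: "patched"):
        inside = engine.chunk()        # the patch is active here
        raise RuntimeError("simulated failure during inference")
except RuntimeError:
    pass

after = engine.chunk()                 # the original is back in place
print(inside, after)
```

Patching the class attribute (rather than a single instance) matters here, because the inference helper creates its own engine instances internally.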

## Result (preview)

**Best combination: min_on=1.0 s, min_off=1.2 s** (E72)

| Configuration | WER | Joint Error |
|--------------|-----|-------------|
| E56 (bugfix default, no override) | 0.5245 | 0.3384 |
| **E72 (min_on=1.0, min_off=1.2)** | **0.4902** | **0.3213** |
| Δ | **−0.034** | **−0.017** |

E72 thus even falls below the pre-bugfix WER (0.4954) and is used
as the final configuration for `04_dev_final_results`.
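
The deltas in the table follow directly from the reported values. The short calculation below only re-derives them. All numbers are taken from the table and from the pre-bugfix WER quoted above; the relative change is an additional derived figure.

```python
wer_e56, wer_e72 = 0.5245, 0.4902      # WER from the table
joint_e56, joint_e72 = 0.3384, 0.3213  # joint error from the table
wer_pre_bugfix = 0.4954                # pre-bugfix WER

delta_wer = wer_e72 - wer_e56
delta_joint = joint_e72 - joint_e56
delta_vs_pre = wer_e72 - wer_pre_bugfix
relative_wer = delta_wer / wer_e56     # relative change with respect to E56

print(f"Delta WER vs. E56:         {delta_wer:+.4f}")
print(f"Delta joint error vs. E56: {delta_joint:+.4f}")
print(f"Delta WER vs. pre-bugfix:  {delta_vs_pre:+.4f}")
print(f"Relative WER change:       {relative_wer:+.1%}")
```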

## 1 – GPU check & selection

In [1]:
!nvidia-smi

Wed Feb  4 18:35:07 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA A100-SXM4-80GB          On  |   00000000:01:00.0 Off |                    0 |
| N/A   30C    P0             79W /  500W |   10041MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A100-SXM4-80GB          On  |   00

In [2]:
import os

# Physical GPU selection: index 1, i.e. the second GPU listed by nvidia-smi
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
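
The selection can be wrapped in a small helper. This is a sketch with a hypothetical function name, `select_gpu`. It relies on two facts: `CUDA_VISIBLE_DEVICES` has to be set before the process initialises CUDA, and the selected physical GPU then appears as device 0 inside the process.

```python
import os


def select_gpu(physical_index: int) -> None:
    # Order devices by PCI bus ID so the index matches the nvidia-smi listing
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # Expose only the chosen GPU; inside the process it becomes device 0
    os.environ["CUDA_VISIBLE_DEVICES"] = str(physical_index)


select_gpu(1)  # second GPU in the nvidia-smi listing
print(os.environ["CUDA_VISIBLE_DEVICES"])
```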


## 2 – CUDA verification

In [3]:
import torch

In [4]:
print("CUDA devices:", torch.cuda.device_count())
print("Device 0 name (sollte A100 sein):", torch.cuda.get_device_name(0))

CUDA devices: 1
Device 0 name (sollte A100 sein): NVIDIA A100-SXM4-80GB


## 3 – Setup: working directory & imports

In [5]:
import os, sys
import pandas as pd

# Set the working directory to the repo root (required for all relative paths)
project_baseline_path = "/home/josch080/Projektgruppe/mcorec_baseline"
os.chdir(project_baseline_path)

# Add the repo root to sys.path so that project-internal modules can be imported
if project_baseline_path not in sys.path:
    sys.path.append(project_baseline_path)

from script.pg_utils_experiments import run_inference_for_experiment, run_eval_and_log, append_eval_results_for_experiments



## 4 – Monkey patch: overriding segmentation parameters at runtime

`run_inference_for_experiment` does not accept `min_duration_on`/`min_duration_off` directly.
The workaround: a context manager temporarily replaces `InferenceEngine.chunk_video`
with a patched version that passes the desired parameters to
`segment_by_asd()`. After the `with` block the original method is restored,
even if an exception is raised (`finally` clause).

In [6]:
import json
from contextlib import contextmanager

import script.inference as inf  # the same module that pg_utils_experiments uses

@contextmanager
def patch_avsr_segmentation(min_on=None, min_off=None):
    # Context manager: overrides InferenceEngine.chunk_video for the ASD branch.
    # Allows setting min_duration_on/off without changing pg_utils_experiments.
    # After the with block the original method is restored (even on exceptions).
    orig = inf.InferenceEngine.chunk_video  # keep a reference to the original method

    # Nothing to override -> leave the original behaviour untouched
    if min_on is None and min_off is None:
        yield
        return

    def patched(self, video_path, asd_path=None, max_length=15):
        if asd_path is not None:
            with open(asd_path, "r") as fh:
                asd = json.load(fh)

            # ASD keys are frame indices stored as strings
            frames = sorted(int(frame) for frame in asd.keys())
            if not frames:
                return []

            min_frame = min(frames)

            # Build the parameter dict for segment_by_asd
            params = {"max_chunk_size": max_length}
            if min_on is not None:
                params["min_duration_on"] = float(min_on)
            if min_off is not None:
                params["min_duration_off"] = float(min_off)

            segments_by_frames = inf.segment_by_asd(asd, params)

            # Convert frame indices to seconds (25 fps → / 25)
            return [((seg[0] - min_frame) / 25, (seg[-1] - min_frame) / 25) for seg in segments_by_frames]

        # Without ASD: original behaviour
        return orig(self, video_path, asd_path, max_length)

    inf.InferenceEngine.chunk_video = patched
    try:
        yield
    finally:
        inf.InferenceEngine.chunk_video = orig  # always restore


def _tag(x: float) -> str:
    # Converts a float into a filename-safe string: 1.2 → '1p2'
    return str(x).replace(".", "p")


## 5 – Experiment grid (E56–E72)

Beam size and `max_length` are fixed (best combination from `02j_`).
The 4×4 grid over `min_duration_on` × `min_duration_off` yields
16 experiments (E57–E72), plus one reference run without override (E56).

In [7]:
# Fixed decoding parameters (best combination from 02j_)
GRID_BEAM = 12
GRID_LEN  = 20

# Grid axes for the segmentation parameters (in seconds)
MIN_ON_LIST  = [0.4, 0.6, 0.8, 1.0]     # minimum duration of a speech region
MIN_OFF_LIST = [0.5, 0.8, 1.0, 1.2]     # maximum pause for gap filling

EXPERIMENTS = {}

# E56: reference without parameter override (bugfix default)
EXPERIMENTS["E56_bugfix_default_bs12_len20"] = {
    "base_model": "avsr_cocktail_finetuned",
    "beam_size": GRID_BEAM,
    "max_length": GRID_LEN,
    "comment": "Bugfix-default segmentation (kein Override von min_duration)",
}

# E57–E72: 4×4 grid
eid = 57
for min_on in MIN_ON_LIST:
    for min_off in MIN_OFF_LIST:
        exp = f"E{eid}_bugfix_mdOn{_tag(min_on)}_mdOff{_tag(min_off)}_bs{GRID_BEAM}_len{GRID_LEN}"
        EXPERIMENTS[exp] = {
            "base_model": "avsr_cocktail_finetuned",
            "beam_size": GRID_BEAM,
            "max_length": GRID_LEN,
            "min_duration_on": float(min_on),
            "min_duration_off": float(min_off),
            "comment": f"AVSR-Override: min_on={min_on}s, min_off={min_off}s (nur ASD-Chunks)",
        }
        eid += 1

len(EXPERIMENTS), list(EXPERIMENTS.keys())


(17,
 ['E56_bugfix_default_bs12_len20',
  'E57_bugfix_mdOn0p4_mdOff0p5_bs12_len20',
  'E58_bugfix_mdOn0p4_mdOff0p8_bs12_len20',
  'E59_bugfix_mdOn0p4_mdOff1p0_bs12_len20',
  'E60_bugfix_mdOn0p4_mdOff1p2_bs12_len20',
  'E61_bugfix_mdOn0p6_mdOff0p5_bs12_len20',
  'E62_bugfix_mdOn0p6_mdOff0p8_bs12_len20',
  'E63_bugfix_mdOn0p6_mdOff1p0_bs12_len20',
  'E64_bugfix_mdOn0p6_mdOff1p2_bs12_len20',
  'E65_bugfix_mdOn0p8_mdOff0p5_bs12_len20',
  'E66_bugfix_mdOn0p8_mdOff0p8_bs12_len20',
  'E67_bugfix_mdOn0p8_mdOff1p0_bs12_len20',
  'E68_bugfix_mdOn0p8_mdOff1p2_bs12_len20',
  'E69_bugfix_mdOn1p0_mdOff0p5_bs12_len20',
  'E70_bugfix_mdOn1p0_mdOff0p8_bs12_len20',
  'E71_bugfix_mdOn1p0_mdOff1p0_bs12_len20',
  'E72_bugfix_mdOn1p0_mdOff1p2_bs12_len20'])
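
The numbering can be cross-checked with `itertools.product`, which iterates in the same order as the nested loops (outer `min_on`, inner `min_off`). The sketch below only rebuilds the mapping from experiment ID to parameters; `grid_ids` is a hypothetical helper name.

```python
from itertools import product

MIN_ON_LIST = [0.4, 0.6, 0.8, 1.0]
MIN_OFF_LIST = [0.5, 0.8, 1.0, 1.2]


def grid_ids(first_id=57):
    # Map experiment IDs to (min_on, min_off) in nested-loop order
    return {
        f"E{first_id + i}": pair
        for i, pair in enumerate(product(MIN_ON_LIST, MIN_OFF_LIST))
    }


ids = grid_ids()
print(len(ids), ids["E57"], ids["E72"])
```

This confirms that E72 is the last grid point, i.e. the largest value on both axes.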

## 6 – Model definitions

Only `avsr_cocktail_finetuned` (BL4) is used in the experiments.

In [8]:
BASE_MODELS = {
    "auto_avsr": {
        "model_type": "auto_avsr",
        "chkpt": "model-bin/auto_avsr/avsr_trlrwlr2lrs3vox2avsp_base.pth",
    },
    "avsr_cocktail": {
        "model_type": "avsr_cocktail",
        "chkpt": "model-bin/avsr_cocktail",
    },
    "avsr_cocktail_finetuned": {
        "model_type": "avsr_cocktail",
        "chkpt": "model-bin/avsr_cocktail_mcorec_finetune",
    },
    "muavic_en": {
        "model_type": "muavic_en",
        "chkpt": "nguyenvulebinh/AV-HuBERT-MuAViC-en",
    },
}


## 7 – Sessions

In [9]:
SESSION_IDS = ["session_40", "session_43", "session_49", "session_50", "session_54"]

## 8 – Inference with the monkey patch

Each experiment runs inside the `patch_avsr_segmentation` context manager.
The patch is active for exactly one `run_inference_for_experiment` call;
afterwards the original method is restored.

The CUDA cache and the garbage collector are cleared after every experiment
to prevent memory leaks across many consecutive GPU runs.
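
One way to package the cleanup step is a small helper that runs the garbage collector first, so that unreferenced tensors are released before the CUDA caching allocator is asked to return its unused blocks. The helper name `release_gpu_memory` is hypothetical, and the `torch` import is guarded so that the sketch also runs on a machine without PyTorch.

```python
import gc


def release_gpu_memory() -> int:
    # Collect unreachable Python objects (including tensors in reference cycles)
    collected = gc.collect()
    try:
        import torch
    except ImportError:
        return collected  # nothing else to do without PyTorch
    if torch.cuda.is_available():
        # Return cached, currently unused blocks to the CUDA driver
        torch.cuda.empty_cache()
    return collected


print(release_gpu_memory())
```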

In [10]:
import gc, torch

for sid in SESSION_IDS:
    session_dir = f"data-bin/dev/{sid}"
    print(f"\n########## Starte Grid-Experimente für {sid} ##########")

    for exp_name, cfg in EXPERIMENTS.items():
        min_on  = cfg.get("min_duration_on", None)
        min_off = cfg.get("min_duration_off", None)

        # The patch is only active inside this with block
        with patch_avsr_segmentation(min_on=min_on, min_off=min_off):
            run_inference_for_experiment(
                exp_name=exp_name,
                base_models=BASE_MODELS,
                experiments=EXPERIMENTS,
                session_dir=session_dir,
            )

        # Free memory after each experiment: collect garbage first so that
        # released tensors can actually be returned by empty_cache()
        gc.collect()
        torch.cuda.empty_cache()



########## Starte Grid-Experimente für session_40 ##########

Starte Inference für Experiment: E56_bugfix_default_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E56_bugfix_default_bs12_len20
  session_dir     = data-bin/dev/session_40
  comment         = Bugfix-default segmentation (kein Override von min_duration)
Loading avsr_cocktail model...




Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_40


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:   0%|          | 0/41 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:   0%|          | 0/41 [00:00<?, ?it/s]
Processing speaker spk_2 track 1 of 3: 0it [00:00, ?it/s]
Processing speaker spk_2 track 2 of 3:   0%|          | 0/16 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 2:   0%|          | 0/19 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:   0%|          | 0/26 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 1:   0%|          | 0/43 [00:00<?, ?it/s]


Starte Inference für Experiment: E57_bugfix_mdOn0p4_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E57_bugfix_mdOn0p4_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_40
  comment         = AVSR-Override: min_on=0.4s, min_off=0.5s (nur ASD-Chunks)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_40


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:   0%|          | 0/44 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:   0%|          | 0/44 [00:00<?, ?it/s]
Processing speaker spk_2 track 1 of 3: 0it [00:00, ?it/s]
Processing speaker spk_2 track 2 of 3:   0%|          | 0/17 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 2:   0%|          | 0/19 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:   0%|          | 0/28 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 1:   0%|          | 0/46 [00:00<?, ?it/s]


Starte Inference für Experiment: E58_bugfix_mdOn0p4_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E58_bugfix_mdOn0p4_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_40
  comment         = AVSR-Override: min_on=0.4s, min_off=0.8s (nur ASD-Chunks)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_40


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:   0%|          | 0/41 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:   0%|          | 0/44 [00:00<?, ?it/s]
Processing speaker spk_2 track 1 of 3: 0it [00:00, ?it/s]
Processing speaker spk_2 track 2 of 3:   0%|          | 0/17 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 2:   0%|          | 0/18 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:   0%|          | 0/28 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 1:   0%|          | 0/39 [00:00<?, ?it/s]


Starte Inference für Experiment: E59_bugfix_mdOn0p4_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E59_bugfix_mdOn0p4_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_40
  comment         = AVSR-Override: min_on=0.4s, min_off=1.0s (nur ASD-Chunks)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_40


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/37 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  27%|██▋       | 10/37 [00:12<00:39,  1.46s/it]

Processing speaker spk_1 track 1 of 1:   0%|          | 0/43 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:  23%|██▎       | 10/43 [00:07<00:22,  1.44it/s]

Processing speaker spk_2 track 1 of 3: 0it [00:00, ?it/s]

Processing speaker spk_2 track 2 of 3:   0%|          | 0/14 [00:00<?, ?it/s]
Processing speaker spk_2 track 2 of 3:  71%|███████▏  | 10/14 [00:23<00:11,  2.85s/it]

Processing speaker spk_3 track 1 of 2:   0%|          | 0/18 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 2:  56%|█████▌    | 10/18 [00:26<00:20,  2.50s/it]

Processing speaker spk_4 track 1 of 1:   0%|          | 0/28 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:  36%|███▌      | 10/28 [00:34<00:36,  2.04s/it]

Processing speaker spk_5 track 1 of 1:   0%|          | 0/33 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 1:  30%|███       | 10/33 [00:19<00:41,  1.81s/it]


Starting inference for experiment: E60_bugfix_mdOn0p4_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E60_bugfix_mdOn0p4_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_40
  comment         = AVSR override: min_on=0.4s, min_off=1.2s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_40


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  31%|███▏      | 10/32 [00:11<00:34,  1.57s/it]

Processing speaker spk_1 track 1 of 1:   0%|          | 0/41 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:  24%|██▍       | 10/41 [00:11<00:39,  1.26s/it]

Processing speaker spk_2 track 1 of 3: 0it [00:00, ?it/s]

Processing speaker spk_2 track 2 of 3:   0%|          | 0/14 [00:00<?, ?it/s]
Processing speaker spk_2 track 2 of 3:  71%|███████▏  | 10/14 [00:23<00:11,  2.83s/it]

Processing speaker spk_3 track 1 of 2:   0%|          | 0/18 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 2:  56%|█████▌    | 10/18 [00:26<00:19,  2.49s/it]

Processing speaker spk_4 track 1 of 1:   0%|          | 0/28 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:  36%|███▌      | 10/28 [00:35<00:36,  2.04s/it]

Processing speaker spk_5 track 1 of 1:   0%|          | 0/33 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 1:  30%|███       | 10/33 [00:17<00:39,  1.71s/it]


Starting inference for experiment: E61_bugfix_mdOn0p6_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E61_bugfix_mdOn0p6_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_40
  comment         = AVSR override: min_on=0.6s, min_off=0.5s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_40


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/43 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  23%|██▎       | 10/43 [00:07<00:36,  1.11s/it]

Processing speaker spk_1 track 1 of 1:   0%|          | 0/42 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:  24%|██▍       | 10/42 [00:07<00:22,  1.44it/s]

Processing speaker spk_2 track 1 of 3: 0it [00:00, ?it/s]

Processing speaker spk_2 track 2 of 3:   0%|          | 0/16 [00:00<?, ?it/s]
Processing speaker spk_2 track 2 of 3:  62%|██████▎   | 10/16 [00:22<00:15,  2.52s/it]

Processing speaker spk_3 track 1 of 2:   0%|          | 0/19 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 2:  53%|█████▎    | 10/19 [00:26<00:27,  3.11s/it]

Processing speaker spk_4 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:  37%|███▋      | 10/27 [00:34<00:40,  2.41s/it]

Processing speaker spk_5 track 1 of 1:   0%|          | 0/46 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 1:  22%|██▏       | 10/46 [00:14<00:57,  1.60s/it]


Starting inference for experiment: E62_bugfix_mdOn0p6_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E62_bugfix_mdOn0p6_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_40
  comment         = AVSR override: min_on=0.6s, min_off=0.8s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_40


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/40 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  25%|██▌       | 10/40 [00:07<00:32,  1.09s/it]

Processing speaker spk_1 track 1 of 1:   0%|          | 0/42 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:  24%|██▍       | 10/42 [00:07<00:21,  1.46it/s]

Processing speaker spk_2 track 1 of 3: 0it [00:00, ?it/s]

Processing speaker spk_2 track 2 of 3:   0%|          | 0/16 [00:00<?, ?it/s]
Processing speaker spk_2 track 2 of 3:  62%|██████▎   | 10/16 [00:21<00:14,  2.50s/it]

Processing speaker spk_3 track 1 of 2:   0%|          | 0/18 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 2:  56%|█████▌    | 10/18 [00:28<00:20,  2.52s/it]

Processing speaker spk_4 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:  37%|███▋      | 10/27 [00:33<00:39,  2.32s/it]

Processing speaker spk_5 track 1 of 1:   0%|          | 0/39 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 1:  26%|██▌       | 10/39 [00:16<00:41,  1.44s/it]


Starting inference for experiment: E63_bugfix_mdOn0p6_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E63_bugfix_mdOn0p6_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_40
  comment         = AVSR override: min_on=0.6s, min_off=1.0s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_40


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/36 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  28%|██▊       | 10/36 [00:10<00:32,  1.23s/it]

Processing speaker spk_1 track 1 of 1:   0%|          | 0/41 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:  24%|██▍       | 10/41 [00:07<00:21,  1.47it/s]

Processing speaker spk_2 track 1 of 3: 0it [00:00, ?it/s]

Processing speaker spk_2 track 2 of 3:   0%|          | 0/13 [00:00<?, ?it/s]
Processing speaker spk_2 track 2 of 3:  77%|███████▋  | 10/13 [00:27<00:10,  3.58s/it]

Processing speaker spk_3 track 1 of 2:   0%|          | 0/18 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 2:  56%|█████▌    | 10/18 [00:27<00:19,  2.48s/it]

Processing speaker spk_4 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:  37%|███▋      | 10/27 [00:34<00:39,  2.35s/it]

Processing speaker spk_5 track 1 of 1:   0%|          | 0/33 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 1:  30%|███       | 10/33 [00:17<00:39,  1.70s/it]


Starting inference for experiment: E64_bugfix_mdOn0p6_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E64_bugfix_mdOn0p6_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_40
  comment         = AVSR override: min_on=0.6s, min_off=1.2s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_40


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:   0%|          | 0/39 [00:00<?, ?it/s]
Processing speaker spk_2 track 1 of 3: 0it [00:00, ?it/s]

Processing speaker spk_2 track 2 of 3:   0%|          | 0/13 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 2:   0%|          | 0/18 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 1:   0%|          | 0/33 [00:00<?, ?it/s]


Starting inference for experiment: E65_bugfix_mdOn0p8_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E65_bugfix_mdOn0p8_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_40
  comment         = AVSR override: min_on=0.8s, min_off=0.5s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_40


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/42 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:   0%|          | 0/41 [00:00<?, ?it/s]
Processing speaker spk_2 track 1 of 3: 0it [00:00, ?it/s]

Processing speaker spk_2 track 2 of 3:   0%|          | 0/16 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 2:   0%|          | 0/19 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 1:   0%|          | 0/44 [00:00<?, ?it/s]


Starting inference for experiment: E66_bugfix_mdOn0p8_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E66_bugfix_mdOn0p8_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_40
  comment         = AVSR override: min_on=0.8s, min_off=0.8s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_40


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/40 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:   0%|          | 0/41 [00:00<?, ?it/s]
Processing speaker spk_2 track 1 of 3: 0it [00:00, ?it/s]

Processing speaker spk_2 track 2 of 3:   0%|          | 0/16 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 2:   0%|          | 0/18 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 1:   0%|          | 0/38 [00:00<?, ?it/s]


Starting inference for experiment: E67_bugfix_mdOn0p8_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E67_bugfix_mdOn0p8_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_40
  comment         = AVSR override: min_on=0.8s, min_off=1.0s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_40


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/36 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:   0%|          | 0/40 [00:00<?, ?it/s]
Processing speaker spk_2 track 1 of 3: 0it [00:00, ?it/s]

Processing speaker spk_2 track 2 of 3:   0%|          | 0/13 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 2:   0%|          | 0/18 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]


Starting inference for experiment: E68_bugfix_mdOn0p8_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E68_bugfix_mdOn0p8_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_40
  comment         = AVSR override: min_on=0.8s, min_off=1.2s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_40


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:   0%|          | 0/38 [00:00<?, ?it/s]
Processing speaker spk_2 track 1 of 3: 0it [00:00, ?it/s]

Processing speaker spk_2 track 2 of 3:   0%|          | 0/13 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 2:   0%|          | 0/18 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]





Processing speaker spk_5 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]


Starte Inference für Experiment: E69_bugfix_mdOn1p0_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E69_bugfix_mdOn1p0_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_40
  comment         = AVSR-Override: min_on=1.0s, min_off=0.5s (nur ASD-Chunks)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_40


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/41 [00:00<?, ?it/s]





Processing speaker spk_1 track 1 of 1:   0%|          | 0/41 [00:00<?, ?it/s]





Processing speaker spk_2 track 1 of 3: 0it [00:00, ?it/s]

Processing speaker spk_2 track 2 of 3:   0%|          | 0/16 [00:00<?, ?it/s]






Processing speaker spk_3 track 1 of 2:   0%|          | 0/19 [00:00<?, ?it/s]





Processing speaker spk_4 track 1 of 1:   0%|          | 0/26 [00:00<?, ?it/s]





Processing speaker spk_5 track 1 of 1:   0%|          | 0/43 [00:00<?, ?it/s]


Starte Inference für Experiment: E70_bugfix_mdOn1p0_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E70_bugfix_mdOn1p0_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_40
  comment         = AVSR-Override: min_on=1.0s, min_off=0.8s (nur ASD-Chunks)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_40


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/39 [00:00<?, ?it/s]





Processing speaker spk_1 track 1 of 1:   0%|          | 0/41 [00:00<?, ?it/s]





Processing speaker spk_2 track 1 of 3: 0it [00:00, ?it/s]

Processing speaker spk_2 track 2 of 3:   0%|          | 0/16 [00:00<?, ?it/s]






Processing speaker spk_3 track 1 of 2:   0%|          | 0/18 [00:00<?, ?it/s]





Processing speaker spk_4 track 1 of 1:   0%|          | 0/26 [00:00<?, ?it/s]





Processing speaker spk_5 track 1 of 1:   0%|          | 0/37 [00:00<?, ?it/s]


Starte Inference für Experiment: E71_bugfix_mdOn1p0_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E71_bugfix_mdOn1p0_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_40
  comment         = AVSR-Override: min_on=1.0s, min_off=1.0s (nur ASD-Chunks)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_40


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/35 [00:00<?, ?it/s]





Processing speaker spk_1 track 1 of 1:   0%|          | 0/40 [00:00<?, ?it/s]





Processing speaker spk_2 track 1 of 3: 0it [00:00, ?it/s]

Processing speaker spk_2 track 2 of 3:   0%|          | 0/13 [00:00<?, ?it/s]






Processing speaker spk_3 track 1 of 2:   0%|          | 0/18 [00:00<?, ?it/s]





Processing speaker spk_4 track 1 of 1:   0%|          | 0/26 [00:00<?, ?it/s]





Processing speaker spk_5 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]


Starte Inference für Experiment: E72_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E72_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_40
  comment         = AVSR-Override: min_on=1.0s, min_off=1.2s (nur ASD-Chunks)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_40


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]





Processing speaker spk_1 track 1 of 1:   0%|          | 0/38 [00:00<?, ?it/s]





Processing speaker spk_2 track 1 of 3: 0it [00:00, ?it/s]

Processing speaker spk_2 track 2 of 3:   0%|          | 0/13 [00:00<?, ?it/s]






Processing speaker spk_3 track 1 of 2:   0%|          | 0/18 [00:00<?, ?it/s]





Processing speaker spk_4 track 1 of 1:   0%|          | 0/26 [00:00<?, ?it/s]





Processing speaker spk_5 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]


########## Starte Grid-Experimente für session_43 ##########

Starte Inference für Experiment: E56_bugfix_default_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E56_bugfix_default_bs12_len20
  session_dir     = data-bin/dev/session_43
  comment         = Bugfix-default segmentation (kein Override von min_duration)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_43


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 2:   0%|          | 0/30 [00:00<?, ?it/s]





Processing speaker spk_1 track 1 of 1:   0%|          | 0/39 [00:00<?, ?it/s]





Processing speaker spk_2 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/35 [00:00<?, ?it/s]





[Acessing speaker spk_4 track 1 of 2:   0%|          | 0/17 [00:00<?, ?it/s]





[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/43 [00:00<?, ?it/s]


Starting inference for experiment: E57_bugfix_mdOn0p4_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E57_bugfix_mdOn0p4_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_43
  comment         = AVSR override: min_on=0.4s, min_off=0.5s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_43


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/33 [00:00<?, ?it/s]





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/39 [00:00<?, ?it/s]





[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/34 [00:00<?, ?it/s]





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/37 [00:00<?, ?it/s]





[Acessing speaker spk_4 track 1 of 2:   0%|          | 0/17 [00:00<?, ?it/s]





[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/48 [00:00<?, ?it/s]


Starting inference for experiment: E58_bugfix_mdOn0p4_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E58_bugfix_mdOn0p4_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_43
  comment         = AVSR override: min_on=0.4s, min_off=0.8s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_43


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/32 [00:00<?, ?it/s]





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/38 [00:00<?, ?it/s]





[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/34 [00:00<?, ?it/s]





[Acessing speaker spk_4 track 1 of 2:   0%|          | 0/17 [00:00<?, ?it/s]





[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/43 [00:00<?, ?it/s]


Starting inference for experiment: E59_bugfix_mdOn0p4_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E59_bugfix_mdOn0p4_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_43
  comment         = AVSR override: min_on=0.4s, min_off=1.0s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_43


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/30 [00:00<?, ?it/s]





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]





[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]





[Acessing speaker spk_4 track 1 of 2:   0%|          | 0/16 [00:00<?, ?it/s]





[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/41 [00:00<?, ?it/s]


Starting inference for experiment: E60_bugfix_mdOn0p4_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E60_bugfix_mdOn0p4_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_43
  comment         = AVSR override: min_on=0.4s, min_off=1.2s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_43


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/29 [00:00<?, ?it/s]





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]





[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]





[Acessing speaker spk_4 track 1 of 2:   0%|          | 0/15 [00:00<?, ?it/s]





[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/40 [00:00<?, ?it/s]


Starting inference for experiment: E61_bugfix_mdOn0p6_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E61_bugfix_mdOn0p6_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_43
  comment         = AVSR override: min_on=0.6s, min_off=0.5s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_43


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/32 [00:00<?, ?it/s]





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/39 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   3%|▎         | 1/39 [00:00<00:30,  1.24it/s]
[Acessing speaker spk_1 track 1 of 1:   5%|▌         | 2/39 [00:01<00:28,  1.29it/s]
[Acessing speaker spk_1 track 1 of 1:   8%|▊         | 3/39 [00:02<00:24,  1.49it/s]
[Acessing speaker spk_1 track 1 of 1:  10%|█         | 4/39 [00:05<00:57,  1.65s/it]
[Acessing speaker spk_1 track 1 of 1:  13%|█▎        | 5/39 [00:06<00:45,  1.33s/it]
[Acessing speaker spk_1 track 1 of 1:  15%|█▌        | 6/39 [00:06<00:35,  1.08s/it]
[Acessing speaker spk_1 track 1 of 1:  18%|█▊        | 7/39 [00:07<00:34,  1.07s/it]
[Acessing speaker spk_1 track 1 of 1:  21%|██        | 8/39 [00:08<00:28,  1.07it/s]
[Acessing speaker spk_1 track 1 of 1:  23%|██▎       | 9/39 [00:09<00:28,  1.05it/s]
[Acessing speaker spk_1 track 1 of 1:  26%|██▌       | 10/39 [00:11<00:34,  1.21s/it]
[Acessing speaker spk_1 track 1 of 1:  28%|██▊       | 11/3





[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/33 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   3%|▎         | 1/33 [00:00<00:22,  1.44it/s]
[Acessing speaker spk_2 track 1 of 1:   6%|▌         | 2/33 [00:01<00:15,  1.94it/s]
[Acessing speaker spk_2 track 1 of 1:   9%|▉         | 3/33 [00:01<00:16,  1.85it/s]
[Acessing speaker spk_2 track 1 of 1:  12%|█▏        | 4/33 [00:02<00:17,  1.61it/s]
[Acessing speaker spk_2 track 1 of 1:  15%|█▌        | 5/33 [00:03<00:17,  1.56it/s]
[Acessing speaker spk_2 track 1 of 1:  18%|█▊        | 6/33 [00:03<00:16,  1.64it/s]
[Acessing speaker spk_2 track 1 of 1:  21%|██        | 7/33 [00:04<00:18,  1.44it/s]
[Acessing speaker spk_2 track 1 of 1:  24%|██▍       | 8/33 [00:04<00:15,  1.62it/s]
[Acessing speaker spk_2 track 1 of 1:  27%|██▋       | 9/33 [00:05<00:14,  1.64it/s]
[Acessing speaker spk_2 track 1 of 1:  30%|███       | 10/33 [00:09<00:35,  1.56s/it]
[Acessing speaker spk_2 track 1 of 1:  33%|███▎      | 11/3





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/37 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   3%|▎         | 1/37 [00:00<00:20,  1.75it/s]
[Acessing speaker spk_3 track 1 of 1:   5%|▌         | 2/37 [00:01<00:21,  1.64it/s]
[Acessing speaker spk_3 track 1 of 1:   8%|▊         | 3/37 [00:02<00:23,  1.44it/s]
[Acessing speaker spk_3 track 1 of 1:  11%|█         | 4/37 [00:02<00:20,  1.61it/s]
[Acessing speaker spk_3 track 1 of 1:  14%|█▎        | 5/37 [00:03<00:28,  1.12it/s]
[Acessing speaker spk_3 track 1 of 1:  16%|█▌        | 6/37 [00:06<00:42,  1.39s/it]
[Acessing speaker spk_3 track 1 of 1:  19%|█▉        | 7/37 [00:06<00:34,  1.14s/it]
[Acessing speaker spk_3 track 1 of 1:  22%|██▏       | 8/37 [00:09<00:47,  1.65s/it]
[Acessing speaker spk_3 track 1 of 1:  24%|██▍       | 9/37 [00:10<00:39,  1.40s/it]
[Acessing speaker spk_3 track 1 of 1:  27%|██▋       | 10/37 [00:17<01:23,  3.11s/it]
[Acessing speaker spk_3 track 1 of 1:  30%|██▉       | 11/3





[Acessing speaker spk_4 track 1 of 2:   0%|          | 0/17 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 2:   6%|▌         | 1/17 [00:00<00:12,  1.25it/s]
[Acessing speaker spk_4 track 1 of 2:  12%|█▏        | 2/17 [00:02<00:20,  1.40s/it]
[Acessing speaker spk_4 track 1 of 2:  18%|█▊        | 3/17 [00:05<00:31,  2.27s/it]
[Acessing speaker spk_4 track 1 of 2:  24%|██▎       | 4/17 [00:08<00:30,  2.33s/it]
[Acessing speaker spk_4 track 1 of 2:  29%|██▉       | 5/17 [00:09<00:21,  1.77s/it]
[Acessing speaker spk_4 track 1 of 2:  35%|███▌      | 6/17 [00:09<00:15,  1.42s/it]
[Acessing speaker spk_4 track 1 of 2:  41%|████      | 7/17 [00:10<00:12,  1.21s/it]
[Acessing speaker spk_4 track 1 of 2:  47%|████▋     | 8/17 [00:11<00:09,  1.00s/it]
[Acessing speaker spk_4 track 1 of 2:  53%|█████▎    | 9/17 [00:12<00:07,  1.06it/s]
[Acessing speaker spk_4 track 1 of 2:  59%|█████▉    | 10/17 [00:12<00:05,  1.17it/s]
[Acessing speaker spk_4 track 1 of 2:  65%|██████▍   | 11/1





[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/46 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 1:   2%|▏         | 1/46 [00:00<00:28,  1.56it/s]
[Acessing speaker spk_5 track 1 of 1:   4%|▍         | 2/46 [00:01<00:24,  1.78it/s]
[Acessing speaker spk_5 track 1 of 1:   7%|▋         | 3/46 [00:03<01:07,  1.56s/it]
[Acessing speaker spk_5 track 1 of 1:   9%|▊         | 4/46 [00:04<00:56,  1.34s/it]
[Acessing speaker spk_5 track 1 of 1:  11%|█         | 5/46 [00:06<00:57,  1.41s/it]
[Acessing speaker spk_5 track 1 of 1:  13%|█▎        | 6/46 [00:07<00:56,  1.41s/it]
[Acessing speaker spk_5 track 1 of 1:  15%|█▌        | 7/46 [00:08<00:48,  1.25s/it]
[Acessing speaker spk_5 track 1 of 1:  17%|█▋        | 8/46 [00:10<00:52,  1.38s/it]
[Acessing speaker spk_5 track 1 of 1:  20%|█▉        | 9/46 [00:11<00:43,  1.18s/it]
[Acessing speaker spk_5 track 1 of 1:  22%|██▏       | 10/46 [00:12<00:43,  1.21s/it]
[Acessing speaker spk_5 track 1 of 1:  24%|██▍       | 11/4


Starting inference for experiment: E62_bugfix_mdOn0p6_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E62_bugfix_mdOn0p6_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_43
  comment         = AVSR override: min_on=0.6s, min_off=0.8s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_43


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 2:   3%|▎         | 1/31 [00:00<00:18,  1.66it/s]
[Acessing speaker spk_0 track 1 of 2:   6%|▋         | 2/31 [00:01<00:20,  1.45it/s]
[Acessing speaker spk_0 track 1 of 2:  10%|▉         | 3/31 [00:02<00:26,  1.07it/s]
[Acessing speaker spk_0 track 1 of 2:  13%|█▎        | 4/31 [00:03<00:20,  1.31it/s]
[Acessing speaker spk_0 track 1 of 2:  16%|█▌        | 5/31 [00:04<00:22,  1.14it/s]
[Acessing speaker spk_0 track 1 of 2:  19%|█▉        | 6/31 [00:06<00:36,  1.46s/it]
[Acessing speaker spk_0 track 1 of 2:  23%|██▎       | 7/31 [00:14<01:27,  3.66s/it]
[Acessing speaker spk_0 track 1 of 2:  26%|██▌       | 8/31 [00:21<01:48,  4.70s/it]
[Acessing speaker spk_0 track 1 of 2:  29%|██▉       | 9/31 [00:22<01:15,  3.45s/it]
[Acessing speaker spk_0 track 1 of 2:  32%|███▏      | 10/31 [00:25<01:05,  3.14s/it]
[Acessing speaker spk_0 track 1 of 2:  35%|███▌      | 11/3





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/38 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   3%|▎         | 1/38 [00:00<00:29,  1.27it/s]
[Acessing speaker spk_1 track 1 of 1:   5%|▌         | 2/38 [00:01<00:27,  1.29it/s]
[Acessing speaker spk_1 track 1 of 1:   8%|▊         | 3/38 [00:02<00:23,  1.48it/s]
[Acessing speaker spk_1 track 1 of 1:  11%|█         | 4/38 [00:05<00:56,  1.66s/it]
[Acessing speaker spk_1 track 1 of 1:  13%|█▎        | 5/38 [00:06<00:44,  1.35s/it]
[Acessing speaker spk_1 track 1 of 1:  16%|█▌        | 6/38 [00:06<00:34,  1.09s/it]
[Acessing speaker spk_1 track 1 of 1:  18%|█▊        | 7/38 [00:07<00:33,  1.07s/it]
[Acessing speaker spk_1 track 1 of 1:  21%|██        | 8/38 [00:09<00:37,  1.26s/it]
[Acessing speaker spk_1 track 1 of 1:  24%|██▎       | 9/38 [00:10<00:39,  1.36s/it]
[Acessing speaker spk_1 track 1 of 1:  26%|██▋       | 10/38 [00:11<00:30,  1.10s/it]
[Acessing speaker spk_1 track 1 of 1:  29%|██▉       | 11/3





[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   3%|▎         | 1/31 [00:00<00:20,  1.45it/s]
[Acessing speaker spk_2 track 1 of 1:   6%|▋         | 2/31 [00:01<00:17,  1.70it/s]
[Acessing speaker spk_2 track 1 of 1:  10%|▉         | 3/31 [00:01<00:16,  1.74it/s]
[Acessing speaker spk_2 track 1 of 1:  13%|█▎        | 4/31 [00:02<00:17,  1.55it/s]
[Acessing speaker spk_2 track 1 of 1:  16%|█▌        | 5/31 [00:03<00:17,  1.52it/s]
[Acessing speaker spk_2 track 1 of 1:  19%|█▉        | 6/31 [00:03<00:15,  1.61it/s]
[Acessing speaker spk_2 track 1 of 1:  23%|██▎       | 7/31 [00:04<00:16,  1.42it/s]
[Acessing speaker spk_2 track 1 of 1:  26%|██▌       | 8/31 [00:05<00:14,  1.60it/s]
[Acessing speaker spk_2 track 1 of 1:  29%|██▉       | 9/31 [00:05<00:13,  1.64it/s]
[Acessing speaker spk_2 track 1 of 1:  32%|███▏      | 10/31 [00:12<00:53,  2.56s/it]
[Acessing speaker spk_2 track 1 of 1:  35%|███▌      | 11/3





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/34 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   3%|▎         | 1/34 [00:00<00:18,  1.78it/s]
[Acessing speaker spk_3 track 1 of 1:   6%|▌         | 2/34 [00:01<00:19,  1.65it/s]
[Acessing speaker spk_3 track 1 of 1:   9%|▉         | 3/34 [00:01<00:21,  1.44it/s]
[Acessing speaker spk_3 track 1 of 1:  12%|█▏        | 4/34 [00:02<00:18,  1.61it/s]
[Acessing speaker spk_3 track 1 of 1:  15%|█▍        | 5/34 [00:03<00:23,  1.23it/s]
[Acessing speaker spk_3 track 1 of 1:  18%|█▊        | 6/34 [00:06<00:37,  1.33s/it]
[Acessing speaker spk_3 track 1 of 1:  21%|██        | 7/34 [00:06<00:29,  1.11s/it]
[Acessing speaker spk_3 track 1 of 1:  24%|██▎       | 8/34 [00:11<00:56,  2.17s/it]
[Acessing speaker spk_3 track 1 of 1:  26%|██▋       | 9/34 [00:18<01:32,  3.70s/it]
[Acessing speaker spk_3 track 1 of 1:  29%|██▉       | 10/34 [00:22<01:32,  3.84s/it]
[Acessing speaker spk_3 track 1 of 1:  32%|███▏      | 11/3





[Acessing speaker spk_4 track 1 of 2:   0%|          | 0/17 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 2:   6%|▌         | 1/17 [00:00<00:15,  1.02it/s]
[Acessing speaker spk_4 track 1 of 2:  12%|█▏        | 2/17 [00:02<00:22,  1.50s/it]
[Acessing speaker spk_4 track 1 of 2:  18%|█▊        | 3/17 [00:06<00:32,  2.34s/it]
[Acessing speaker spk_4 track 1 of 2:  24%|██▎       | 4/17 [00:08<00:30,  2.37s/it]
[Acessing speaker spk_4 track 1 of 2:  29%|██▉       | 5/17 [00:09<00:21,  1.80s/it]
[Acessing speaker spk_4 track 1 of 2:  35%|███▌      | 6/17 [00:10<00:15,  1.44s/it]
[Acessing speaker spk_4 track 1 of 2:  41%|████      | 7/17 [00:10<00:12,  1.23s/it]
[Acessing speaker spk_4 track 1 of 2:  47%|████▋     | 8/17 [00:11<00:09,  1.02s/it]
[Acessing speaker spk_4 track 1 of 2:  53%|█████▎    | 9/17 [00:12<00:07,  1.05it/s]
[Acessing speaker spk_4 track 1 of 2:  59%|█████▉    | 10/17 [00:12<00:06,  1.17it/s]
[Acessing speaker spk_4 track 1 of 2:  65%|██████▍   | 11/1





[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/42 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 1:   2%|▏         | 1/42 [00:00<00:26,  1.54it/s]
[Acessing speaker spk_5 track 1 of 1:   5%|▍         | 2/42 [00:01<00:22,  1.74it/s]
[Acessing speaker spk_5 track 1 of 1:   7%|▋         | 3/42 [00:05<01:34,  2.43s/it]
[Acessing speaker spk_5 track 1 of 1:  10%|▉         | 4/42 [00:07<01:20,  2.12s/it]
[Acessing speaker spk_5 track 1 of 1:  12%|█▏        | 5/42 [00:08<01:09,  1.87s/it]
[Acessing speaker spk_5 track 1 of 1:  14%|█▍        | 6/42 [00:09<00:55,  1.55s/it]
[Acessing speaker spk_5 track 1 of 1:  17%|█▋        | 7/42 [00:11<00:55,  1.59s/it]
[Acessing speaker spk_5 track 1 of 1:  19%|█▉        | 8/42 [00:12<00:44,  1.31s/it]
[Acessing speaker spk_5 track 1 of 1:  21%|██▏       | 9/42 [00:13<00:42,  1.30s/it]
[Acessing speaker spk_5 track 1 of 1:  24%|██▍       | 10/42 [00:13<00:33,  1.06s/it]
[Acessing speaker spk_5 track 1 of 1:  26%|██▌       | 11/4


Starting inference for experiment: E63_bugfix_mdOn0p6_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E63_bugfix_mdOn0p6_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_43
  comment         = AVSR override: min_on=0.6s, min_off=1.0s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_43


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/29 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 2:   3%|▎         | 1/29 [00:00<00:16,  1.67it/s]
[Acessing speaker spk_0 track 1 of 2:   7%|▋         | 2/29 [00:01<00:18,  1.46it/s]
[Acessing speaker spk_0 track 1 of 2:  10%|█         | 3/29 [00:02<00:24,  1.07it/s]
[Acessing speaker spk_0 track 1 of 2:  14%|█▍        | 4/29 [00:03<00:19,  1.31it/s]
[Acessing speaker spk_0 track 1 of 2:  17%|█▋        | 5/29 [00:04<00:21,  1.12it/s]
[Acessing speaker spk_0 track 1 of 2:  21%|██        | 6/29 [00:06<00:34,  1.50s/it]
[Acessing speaker spk_0 track 1 of 2:  24%|██▍       | 7/29 [00:15<01:24,  3.85s/it]
[Acessing speaker spk_0 track 1 of 2:  28%|██▊       | 8/29 [00:22<01:41,  4.84s/it]
[Acessing speaker spk_0 track 1 of 2:  31%|███       | 9/29 [00:23<01:10,  3.54s/it]
[Acessing speaker spk_0 track 1 of 2:  34%|███▍      | 10/29 [00:25<01:00,  3.20s/it]
[Acessing speaker spk_0 track 1 of 2:  38%|███▊      | 11/2





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   3%|▎         | 1/32 [00:00<00:24,  1.25it/s]
[Acessing speaker spk_1 track 1 of 1:   6%|▋         | 2/32 [00:01<00:23,  1.28it/s]
[Acessing speaker spk_1 track 1 of 1:   9%|▉         | 3/32 [00:02<00:19,  1.48it/s]
[Acessing speaker spk_1 track 1 of 1:  12%|█▎        | 4/32 [00:05<00:46,  1.66s/it]
[Acessing speaker spk_1 track 1 of 1:  16%|█▌        | 5/32 [00:06<00:38,  1.42s/it]
[Acessing speaker spk_1 track 1 of 1:  19%|█▉        | 6/32 [00:06<00:29,  1.14s/it]
[Acessing speaker spk_1 track 1 of 1:  22%|██▏       | 7/32 [00:07<00:27,  1.10s/it]
[Acessing speaker spk_1 track 1 of 1:  25%|██▌       | 8/32 [00:09<00:30,  1.28s/it]
[Acessing speaker spk_1 track 1 of 1:  28%|██▊       | 9/32 [00:11<00:31,  1.37s/it]
[Acessing speaker spk_1 track 1 of 1:  31%|███▏      | 10/32 [00:11<00:26,  1.20s/it]
[Acessing speaker spk_1 track 1 of 1:  34%|███▍      | 11/3





[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   3%|▎         | 1/29 [00:00<00:19,  1.43it/s]
[Acessing speaker spk_2 track 1 of 1:   7%|▋         | 2/29 [00:01<00:15,  1.69it/s]
[Acessing speaker spk_2 track 1 of 1:  10%|█         | 3/29 [00:01<00:15,  1.72it/s]
[Acessing speaker spk_2 track 1 of 1:  14%|█▍        | 4/29 [00:02<00:16,  1.54it/s]
[Acessing speaker spk_2 track 1 of 1:  17%|█▋        | 5/29 [00:03<00:15,  1.51it/s]
[Acessing speaker spk_2 track 1 of 1:  21%|██        | 6/29 [00:03<00:14,  1.59it/s]
[Acessing speaker spk_2 track 1 of 1:  24%|██▍       | 7/29 [00:04<00:15,  1.41it/s]
[Acessing speaker spk_2 track 1 of 1:  28%|██▊       | 8/29 [00:05<00:13,  1.58it/s]
[Acessing speaker spk_2 track 1 of 1:  31%|███       | 9/29 [00:05<00:12,  1.61it/s]
[Acessing speaker spk_2 track 1 of 1:  34%|███▍      | 10/29 [00:12<00:49,  2.62s/it]
[Acessing speaker spk_2 track 1 of 1:  38%|███▊      | 11/2





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   3%|▎         | 1/32 [00:00<00:17,  1.74it/s]
[Acessing speaker spk_3 track 1 of 1:   6%|▋         | 2/32 [00:01<00:18,  1.63it/s]
[Acessing speaker spk_3 track 1 of 1:   9%|▉         | 3/32 [00:02<00:20,  1.42it/s]
[Acessing speaker spk_3 track 1 of 1:  12%|█▎        | 4/32 [00:02<00:17,  1.59it/s]
[Acessing speaker spk_3 track 1 of 1:  16%|█▌        | 5/32 [00:03<00:22,  1.21it/s]
[Acessing speaker spk_3 track 1 of 1:  19%|█▉        | 6/32 [00:06<00:35,  1.36s/it]
[Acessing speaker spk_3 track 1 of 1:  22%|██▏       | 7/32 [00:06<00:28,  1.13s/it]
[Acessing speaker spk_3 track 1 of 1:  25%|██▌       | 8/32 [00:14<01:14,  3.11s/it]
[Acessing speaker spk_3 track 1 of 1:  28%|██▊       | 9/32 [00:25<02:11,  5.72s/it]
[Acessing speaker spk_3 track 1 of 1:  31%|███▏      | 10/32 [00:26<01:36,  4.38s/it]
[Acessing speaker spk_3 track 1 of 1:  34%|███▍      | 11/3





[Acessing speaker spk_4 track 1 of 2:   0%|          | 0/16 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 2:   6%|▋         | 1/16 [00:00<00:12,  1.24it/s]
[Acessing speaker spk_4 track 1 of 2:  12%|█▎        | 2/16 [00:02<00:19,  1.41s/it]
[Acessing speaker spk_4 track 1 of 2:  19%|█▉        | 3/16 [00:05<00:29,  2.29s/it]
[Acessing speaker spk_4 track 1 of 2:  25%|██▌       | 4/16 [00:08<00:28,  2.35s/it]
[Acessing speaker spk_4 track 1 of 2:  31%|███▏      | 5/16 [00:09<00:19,  1.79s/it]
[Acessing speaker spk_4 track 1 of 2:  38%|███▊      | 6/16 [00:09<00:14,  1.43s/it]
[Acessing speaker spk_4 track 1 of 2:  44%|████▍     | 7/16 [00:10<00:11,  1.23s/it]
[Acessing speaker spk_4 track 1 of 2:  50%|█████     | 8/16 [00:11<00:08,  1.01s/it]
[Acessing speaker spk_4 track 1 of 2:  56%|█████▋    | 9/16 [00:12<00:06,  1.05it/s]
[Acessing speaker spk_4 track 1 of 2:  62%|██████▎   | 10/16 [00:14<00:07,  1.33s/it]
[Acessing speaker spk_4 track 1 of 2:  69%|██████▉   | 11/1





[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/41 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 1:   2%|▏         | 1/41 [00:00<00:26,  1.52it/s]
[Acessing speaker spk_5 track 1 of 1:   5%|▍         | 2/41 [00:01<00:22,  1.72it/s]
[Acessing speaker spk_5 track 1 of 1:   7%|▋         | 3/41 [00:06<01:36,  2.54s/it]
[Acessing speaker spk_5 track 1 of 1:  10%|▉         | 4/41 [00:07<01:21,  2.20s/it]
[Acessing speaker spk_5 track 1 of 1:  12%|█▏        | 5/41 [00:09<01:08,  1.91s/it]
[Acessing speaker spk_5 track 1 of 1:  15%|█▍        | 6/41 [00:10<00:55,  1.58s/it]
[Acessing speaker spk_5 track 1 of 1:  17%|█▋        | 7/41 [00:11<00:54,  1.61s/it]
[Acessing speaker spk_5 track 1 of 1:  20%|█▉        | 8/41 [00:12<00:43,  1.32s/it]
[Acessing speaker spk_5 track 1 of 1:  22%|██▏       | 9/41 [00:13<00:41,  1.30s/it]
[Acessing speaker spk_5 track 1 of 1:  24%|██▍       | 10/41 [00:14<00:32,  1.06s/it]
[Acessing speaker spk_5 track 1 of 1:  27%|██▋       | 11/4


Starting inference for experiment: E64_bugfix_mdOn0p6_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E64_bugfix_mdOn0p6_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_43
  comment         = AVSR override: min_on=0.6s, min_off=1.2s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_43


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/29 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 2:   3%|▎         | 1/29 [00:00<00:16,  1.69it/s]
[Acessing speaker spk_0 track 1 of 2:   7%|▋         | 2/29 [00:01<00:18,  1.45it/s]
[Acessing speaker spk_0 track 1 of 2:  10%|█         | 3/29 [00:02<00:24,  1.07it/s]
[Acessing speaker spk_0 track 1 of 2:  14%|█▍        | 4/29 [00:03<00:19,  1.31it/s]
[Acessing speaker spk_0 track 1 of 2:  17%|█▋        | 5/29 [00:04<00:21,  1.14it/s]
[Acessing speaker spk_0 track 1 of 2:  21%|██        | 6/29 [00:06<00:33,  1.47s/it]
[Acessing speaker spk_0 track 1 of 2:  24%|██▍       | 7/29 [00:15<01:21,  3.69s/it]
[Acessing speaker spk_0 track 1 of 2:  28%|██▊       | 8/29 [00:21<01:39,  4.72s/it]
[Acessing speaker spk_0 track 1 of 2:  31%|███       | 9/29 [00:22<01:09,  3.46s/it]
[Acessing speaker spk_0 track 1 of 2:  34%|███▍      | 10/29 [00:25<00:59,  3.15s/it]
[Acessing speaker spk_0 track 1 of 2:  38%|███▊      | 11/2





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   3%|▎         | 1/32 [00:00<00:25,  1.23it/s]
[Acessing speaker spk_1 track 1 of 1:   6%|▋         | 2/32 [00:01<00:23,  1.27it/s]
[Acessing speaker spk_1 track 1 of 1:   9%|▉         | 3/32 [00:02<00:30,  1.06s/it]
[Acessing speaker spk_1 track 1 of 1:  12%|█▎        | 4/32 [00:07<01:02,  2.24s/it]
[Acessing speaker spk_1 track 1 of 1:  16%|█▌        | 5/32 [00:07<00:46,  1.71s/it]
[Acessing speaker spk_1 track 1 of 1:  19%|█▉        | 6/32 [00:08<00:34,  1.33s/it]
[Acessing speaker spk_1 track 1 of 1:  22%|██▏       | 7/32 [00:09<00:30,  1.23s/it]
[Acessing speaker spk_1 track 1 of 1:  25%|██▌       | 8/32 [00:11<00:33,  1.38s/it]
[Acessing speaker spk_1 track 1 of 1:  28%|██▊       | 9/32 [00:12<00:32,  1.43s/it]
[Acessing speaker spk_1 track 1 of 1:  31%|███▏      | 10/32 [00:13<00:27,  1.24s/it]
[Acessing speaker spk_1 track 1 of 1:  34%|███▍      | 11/3





[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   3%|▎         | 1/29 [00:00<00:19,  1.45it/s]
[Acessing speaker spk_2 track 1 of 1:   7%|▋         | 2/29 [00:01<00:15,  1.70it/s]
[Acessing speaker spk_2 track 1 of 1:  10%|█         | 3/29 [00:01<00:14,  1.75it/s]
[Acessing speaker spk_2 track 1 of 1:  14%|█▍        | 4/29 [00:02<00:16,  1.55it/s]
[Acessing speaker spk_2 track 1 of 1:  17%|█▋        | 5/29 [00:03<00:15,  1.51it/s]
[Acessing speaker spk_2 track 1 of 1:  21%|██        | 6/29 [00:03<00:14,  1.60it/s]
[Acessing speaker spk_2 track 1 of 1:  24%|██▍       | 7/29 [00:04<00:15,  1.42it/s]
[Acessing speaker spk_2 track 1 of 1:  28%|██▊       | 8/29 [00:05<00:13,  1.59it/s]
[Acessing speaker spk_2 track 1 of 1:  31%|███       | 9/29 [00:05<00:12,  1.63it/s]
[Acessing speaker spk_2 track 1 of 1:  34%|███▍      | 10/29 [00:12<00:49,  2.60s/it]
[Acessing speaker spk_2 track 1 of 1:  38%|███▊      | 11/2





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   3%|▎         | 1/32 [00:00<00:17,  1.74it/s]
[Acessing speaker spk_3 track 1 of 1:   6%|▋         | 2/32 [00:01<00:18,  1.62it/s]
[Acessing speaker spk_3 track 1 of 1:   9%|▉         | 3/32 [00:02<00:20,  1.43it/s]
[Acessing speaker spk_3 track 1 of 1:  12%|█▎        | 4/32 [00:02<00:17,  1.60it/s]
[Acessing speaker spk_3 track 1 of 1:  16%|█▌        | 5/32 [00:03<00:22,  1.21it/s]
[Acessing speaker spk_3 track 1 of 1:  19%|█▉        | 6/32 [00:06<00:34,  1.34s/it]
[Acessing speaker spk_3 track 1 of 1:  22%|██▏       | 7/32 [00:06<00:27,  1.11s/it]
[Acessing speaker spk_3 track 1 of 1:  25%|██▌       | 8/32 [00:15<01:26,  3.62s/it]
[Acessing speaker spk_3 track 1 of 1:  28%|██▊       | 9/32 [00:28<02:28,  6.46s/it]
[Acessing speaker spk_3 track 1 of 1:  31%|███▏      | 10/32 [00:29<01:49,  4.95s/it]
[Acessing speaker spk_3 track 1 of 1:  34%|███▍      | 11/3





[Acessing speaker spk_4 track 1 of 2:   0%|          | 0/15 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 2:   7%|▋         | 1/15 [00:00<00:12,  1.12it/s]
[Acessing speaker spk_4 track 1 of 2:  13%|█▎        | 2/15 [00:02<00:20,  1.57s/it]
[Acessing speaker spk_4 track 1 of 2:  20%|██        | 3/15 [00:06<00:32,  2.69s/it]
[Acessing speaker spk_4 track 1 of 2:  27%|██▋       | 4/15 [00:09<00:29,  2.70s/it]
[Acessing speaker spk_4 track 1 of 2:  33%|███▎      | 5/15 [00:10<00:20,  2.03s/it]
[Acessing speaker spk_4 track 1 of 2:  40%|████      | 6/15 [00:11<00:14,  1.61s/it]
[Acessing speaker spk_4 track 1 of 2:  47%|████▋     | 7/15 [00:12<00:10,  1.36s/it]
[Acessing speaker spk_4 track 1 of 2:  53%|█████▎    | 8/15 [00:12<00:07,  1.11s/it]
[Acessing speaker spk_4 track 1 of 2:  60%|██████    | 9/15 [00:13<00:06,  1.03s/it]
[Acessing speaker spk_4 track 1 of 2:  67%|██████▋   | 10/15 [00:14<00:05,  1.08s/it]
[Acessing speaker spk_4 track 1 of 2:  73%|███████▎  | 11/1





[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/40 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 1:   2%|▎         | 1/40 [00:00<00:27,  1.42it/s]
[Acessing speaker spk_5 track 1 of 1:   5%|▌         | 2/40 [00:01<00:23,  1.65it/s]
[Acessing speaker spk_5 track 1 of 1:   8%|▊         | 3/40 [00:06<01:34,  2.56s/it]
[Acessing speaker spk_5 track 1 of 1:  10%|█         | 4/40 [00:07<01:20,  2.24s/it]
[Acessing speaker spk_5 track 1 of 1:  12%|█▎        | 5/40 [00:09<01:11,  2.05s/it]
[Acessing speaker spk_5 track 1 of 1:  15%|█▌        | 6/40 [00:10<00:57,  1.69s/it]
[Acessing speaker spk_5 track 1 of 1:  18%|█▊        | 7/40 [00:12<00:56,  1.70s/it]
[Acessing speaker spk_5 track 1 of 1:  20%|██        | 8/40 [00:13<00:44,  1.39s/it]
[Acessing speaker spk_5 track 1 of 1:  22%|██▎       | 9/40 [00:14<00:42,  1.37s/it]
[Acessing speaker spk_5 track 1 of 1:  25%|██▌       | 10/40 [00:14<00:33,  1.11s/it]
[Acessing speaker spk_5 track 1 of 1:  28%|██▊       | 11/4


Starting inference for experiment: E65_bugfix_mdOn0p8_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E65_bugfix_mdOn0p8_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_43
  comment         = AVSR override: min_on=0.8s, min_off=0.5s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_43


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 2:   3%|▎         | 1/31 [00:00<00:18,  1.59it/s]
[Acessing speaker spk_0 track 1 of 2:   6%|▋         | 2/31 [00:01<00:21,  1.37it/s]
[Acessing speaker spk_0 track 1 of 2:  10%|▉         | 3/31 [00:02<00:27,  1.00it/s]
[Acessing speaker spk_0 track 1 of 2:  13%|█▎        | 4/31 [00:03<00:21,  1.23it/s]
[Acessing speaker spk_0 track 1 of 2:  16%|█▌        | 5/31 [00:04<00:24,  1.07it/s]
[Acessing speaker spk_0 track 1 of 2:  19%|█▉        | 6/31 [00:07<00:38,  1.56s/it]
[Acessing speaker spk_0 track 1 of 2:  23%|██▎       | 7/31 [00:16<01:35,  3.98s/it]
[Acessing speaker spk_0 track 1 of 2:  26%|██▌       | 8/31 [00:23<01:56,  5.09s/it]
[Acessing speaker spk_0 track 1 of 2:  29%|██▉       | 9/31 [00:24<01:21,  3.73s/it]
[Acessing speaker spk_0 track 1 of 2:  32%|███▏      | 10/31 [00:26<01:10,  3.38s/it]
[Acessing speaker spk_0 track 1 of 2:  35%|███▌      | 11/3





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/39 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   3%|▎         | 1/39 [00:00<00:33,  1.15it/s]
[Acessing speaker spk_1 track 1 of 1:   5%|▌         | 2/39 [00:01<00:31,  1.19it/s]
[Acessing speaker spk_1 track 1 of 1:   8%|▊         | 3/39 [00:02<00:25,  1.40it/s]
[Acessing speaker spk_1 track 1 of 1:  10%|█         | 4/39 [00:05<01:01,  1.75s/it]
[Acessing speaker spk_1 track 1 of 1:  13%|█▎        | 5/39 [00:06<00:48,  1.42s/it]
[Acessing speaker spk_1 track 1 of 1:  15%|█▌        | 6/39 [00:07<00:38,  1.15s/it]
[Acessing speaker spk_1 track 1 of 1:  18%|█▊        | 7/39 [00:08<00:35,  1.12s/it]
[Acessing speaker spk_1 track 1 of 1:  21%|██        | 8/39 [00:08<00:30,  1.02it/s]
[Acessing speaker spk_1 track 1 of 1:  23%|██▎       | 9/39 [00:09<00:29,  1.01it/s]
[Acessing speaker spk_1 track 1 of 1:  26%|██▌       | 10/39 [00:11<00:34,  1.20s/it]
[Acessing speaker spk_1 track 1 of 1:  28%|██▊       | 11/3





[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   3%|▎         | 1/32 [00:00<00:21,  1.46it/s]
[Acessing speaker spk_2 track 1 of 1:   6%|▋         | 2/32 [00:01<00:18,  1.63it/s]
[Acessing speaker spk_2 track 1 of 1:   9%|▉         | 3/32 [00:01<00:19,  1.48it/s]
[Acessing speaker spk_2 track 1 of 1:  12%|█▎        | 4/32 [00:02<00:18,  1.48it/s]
[Acessing speaker spk_2 track 1 of 1:  16%|█▌        | 5/32 [00:03<00:17,  1.58it/s]
[Acessing speaker spk_2 track 1 of 1:  19%|█▉        | 6/32 [00:04<00:18,  1.41it/s]
[Acessing speaker spk_2 track 1 of 1:  22%|██▏       | 7/32 [00:04<00:15,  1.59it/s]
[Acessing speaker spk_2 track 1 of 1:  25%|██▌       | 8/32 [00:05<00:14,  1.63it/s]
[Acessing speaker spk_2 track 1 of 1:  28%|██▊       | 9/32 [00:08<00:36,  1.59s/it]
[Acessing speaker spk_2 track 1 of 1:  31%|███▏      | 10/32 [00:10<00:33,  1.51s/it]
[Acessing speaker spk_2 track 1 of 1:  34%|███▍      | 11/3





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/35 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   3%|▎         | 1/35 [00:00<00:19,  1.73it/s]
[Acessing speaker spk_3 track 1 of 1:   6%|▌         | 2/35 [00:01<00:20,  1.62it/s]
[Acessing speaker spk_3 track 1 of 1:   9%|▊         | 3/35 [00:02<00:22,  1.43it/s]
[Acessing speaker spk_3 track 1 of 1:  11%|█▏        | 4/35 [00:02<00:19,  1.60it/s]
[Acessing speaker spk_3 track 1 of 1:  14%|█▍        | 5/35 [00:03<00:25,  1.20it/s]
[Acessing speaker spk_3 track 1 of 1:  17%|█▋        | 6/35 [00:06<00:39,  1.36s/it]
[Acessing speaker spk_3 track 1 of 1:  20%|██        | 7/35 [00:06<00:31,  1.12s/it]
[Acessing speaker spk_3 track 1 of 1:  23%|██▎       | 8/35 [00:09<00:46,  1.70s/it]
[Acessing speaker spk_3 track 1 of 1:  26%|██▌       | 9/35 [00:10<00:37,  1.44s/it]
[Acessing speaker spk_3 track 1 of 1:  29%|██▊       | 10/35 [00:17<01:18,  3.16s/it]
[Acessing speaker spk_3 track 1 of 1:  31%|███▏      | 11/3





[Acessing speaker spk_4 track 1 of 2:   0%|          | 0/17 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 2:   6%|▌         | 1/17 [00:00<00:13,  1.20it/s]
[Acessing speaker spk_4 track 1 of 2:  12%|█▏        | 2/17 [00:02<00:21,  1.41s/it]
[Acessing speaker spk_4 track 1 of 2:  18%|█▊        | 3/17 [00:05<00:32,  2.29s/it]
[Acessing speaker spk_4 track 1 of 2:  24%|██▎       | 4/17 [00:08<00:30,  2.34s/it]
[Acessing speaker spk_4 track 1 of 2:  29%|██▉       | 5/17 [00:09<00:21,  1.77s/it]
[Acessing speaker spk_4 track 1 of 2:  35%|███▌      | 6/17 [00:09<00:15,  1.43s/it]
[Acessing speaker spk_4 track 1 of 2:  41%|████      | 7/17 [00:10<00:12,  1.22s/it]
[Acessing speaker spk_4 track 1 of 2:  47%|████▋     | 8/17 [00:11<00:09,  1.01s/it]
[Acessing speaker spk_4 track 1 of 2:  53%|█████▎    | 9/17 [00:12<00:07,  1.05it/s]
[Acessing speaker spk_4 track 1 of 2:  59%|█████▉    | 10/17 [00:12<00:06,  1.16it/s]
[Acessing speaker spk_4 track 1 of 2:  65%|██████▍   | 11/1





[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/44 [00:00<?, ?it/s]

Starting inference for experiment: E66_bugfix_mdOn0p8_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E66_bugfix_mdOn0p8_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_43
  comment         = AVSR override: min_on=0.8s, min_off=0.8s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_43


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/30 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/38 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 2:   0%|          | 0/17 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/40 [00:00<?, ?it/s]

Starting inference for experiment: E67_bugfix_mdOn0p8_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E67_bugfix_mdOn0p8_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_43
  comment         = AVSR override: min_on=0.8s, min_off=1.0s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_43


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/28 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 2:   0%|          | 0/16 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/39 [00:00<?, ?it/s]

Starting inference for experiment: E68_bugfix_mdOn0p8_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E68_bugfix_mdOn0p8_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_43
  comment         = AVSR override: min_on=0.8s, min_off=1.2s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_43


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/28 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 2:   0%|          | 0/15 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/38 [00:00<?, ?it/s]

Starting inference for experiment: E69_bugfix_mdOn1p0_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E69_bugfix_mdOn1p0_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_43
  comment         = AVSR override: min_on=1.0s, min_off=0.5s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_43


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/30 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/39 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/35 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 2:   0%|          | 0/17 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/43 [00:00<?, ?it/s]

Starting inference for experiment: E70_bugfix_mdOn1p0_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E70_bugfix_mdOn1p0_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_43
  comment         = AVSR override: min_on=1.0s, min_off=0.8s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_43


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/29 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/38 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 2:   0%|          | 0/17 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 2:   6%|▌         | 1/17 [00:00<00:13,  1.22it/s]
[Acessing speaker spk_4 track 1 of 2:  12%|█▏        | 2/17 [00:02<00:21,  1.43s/it]
[Acessing speaker spk_4 track 1 of 2:  18%|█▊        | 3/17 [00:06<00:32,  2.32s/it]
[Acessing speaker spk_4 track 1 of 2:  24%|██▎       | 4/17 [00:08<00:31,  2.43s/it]
[Acessing speaker spk_4 track 1 of 2:  29%|██▉       | 5/17 [00:09<00:22,  1.84s/it]
[Acessing speaker spk_4 track 1 of 2:  35%|███▌      | 6/17 [00:10<00:16,  1.47s/it]
[Acessing speaker spk_4 track 1 of 2:  41%|████      | 7/17 [00:11<00:12,  1.26s/it]
[Acessing speaker spk_4 track 1 of 2:  47%|████▋     | 8/17 [00:11<00:09,  1.05s/it]
[Acessing speaker spk_4 track 1 of 2:  53%|█████▎    | 9/17 [00:12<00:07,  1.01it/s]
[Acessing speaker spk_4 track 1 of 2:  59%|█████▉    | 10/17 [00:13<00:06,  1.10it/s]
[Acessing speaker spk_4 track 1 of 2:  65%|██████▍   | 11/1





[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/38 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 1:   3%|▎         | 1/38 [00:00<00:23,  1.55it/s]
[Acessing speaker spk_5 track 1 of 1:   5%|▌         | 2/38 [00:01<00:20,  1.76it/s]
[Acessing speaker spk_5 track 1 of 1:   8%|▊         | 3/38 [00:06<01:29,  2.55s/it]
[Acessing speaker spk_5 track 1 of 1:  11%|█         | 4/38 [00:07<01:14,  2.20s/it]
[Acessing speaker spk_5 track 1 of 1:  13%|█▎        | 5/38 [00:09<01:03,  1.92s/it]
[Acessing speaker spk_5 track 1 of 1:  16%|█▌        | 6/38 [00:10<00:50,  1.59s/it]
[Acessing speaker spk_5 track 1 of 1:  18%|█▊        | 7/38 [00:11<00:50,  1.62s/it]
[Acessing speaker spk_5 track 1 of 1:  21%|██        | 8/38 [00:12<00:39,  1.33s/it]
[Acessing speaker spk_5 track 1 of 1:  24%|██▎       | 9/38 [00:13<00:38,  1.31s/it]
[Acessing speaker spk_5 track 1 of 1:  26%|██▋       | 10/38 [00:14<00:30,  1.08s/it]
[Acessing speaker spk_5 track 1 of 1:  29%|██▉       | 11/3


Starting inference for experiment: E71_bugfix_mdOn1p0_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E71_bugfix_mdOn1p0_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_43
  comment         = AVSR override: min_on=1.0s, min_off=1.0s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_43


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 2:  37%|███▋      | 10/27 [00:26<00:56,  3.33s/it]
Processing speaker spk_1 track 1 of 1:  31%|███▏      | 10/32 [00:11<00:26,  1.20s/it]
Processing speaker spk_2 track 1 of 1:  34%|███▍      | 10/29 [00:14<00:50,  2.68s/it]
Processing speaker spk_3 track 1 of 1:  32%|███▏      | 10/31 [00:26<01:31,  4.34s/it]
Processing speaker spk_4 track 1 of 2:  62%|██████▎   | 10/16 [00:12<00:05,  1.16it/s]
Processing speaker spk_5 track 1 of 1:  27%|██▋       | 10/37 [00:14<00:29,  1.09s/it]


Starting inference for experiment: E72_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E72_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_43
  comment         = AVSR override: min_on=1.0s, min_off=1.2s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_43


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 2:  37%|███▋      | 10/27 [00:26<00:54,  3.20s/it]
Processing speaker spk_1 track 1 of 1:  31%|███▏      | 10/32 [00:11<00:26,  1.19s/it]
Processing speaker spk_2 track 1 of 1:  34%|███▍      | 10/29 [00:12<00:50,  2.67s/it]
Processing speaker spk_3 track 1 of 1:  32%|███▏      | 10/31 [00:27<01:32,  4.41s/it]
Processing speaker spk_4 track 1 of 2:  67%|██████▋   | 10/15 [00:14<00:05,  1.10s/it]
Processing speaker spk_5 track 1 of 1:  28%|██▊       | 10/36 [00:15<00:29,  1.14s/it]


########## Starting grid experiments for session_49 ##########

Starting inference for experiment: E56_bugfix_default_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E56_bugfix_default_bs12_len20
  session_dir     = data-bin/dev/session_49
  comment         = Bugfix-default segmentation (no override of min_duration)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_49


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  83%|████████▎ | 10/12 [00:09<00:02,  1.26s/it]
Processing speaker spk_1 track 1 of 1:  62%|██████▎   | 10/16 [00:13<00:10,  1.75s/it]
Processing speaker spk_2 track 1 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.53it/s]
Processing speaker spk_2 track 2 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.75it/s]
Processing speaker spk_2 track 3 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.85it/s]
Processing speaker spk_2 track 4 of 8: 100%|██████████| 3/3 [00:16<00:00,  5.58s/it]
Processing speaker spk_2 track 5 of 8:  50%|█████     | 1/2 [00:07<00:07,  7.84s/it]
Processing speaker spk_3 track 1 of 1:  43%|████▎     | 10/23 [00:21<00:24,  1.87s/it]
Processing speaker spk_4 track 1 of 1:  36%|███▌      | 10/28 [00:29<00:38,  2.13s/it]
Processing speaker spk_5 track 1 of 2:  40%|████      | 10/25 [00:10<00:21,  1.45s/it]


Starting inference for experiment: E57_bugfix_mdOn0p4_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E57_bugfix_mdOn0p4_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_49
  comment         = AVSR override: min_on=0.4s, min_off=0.5s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_49


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:  77%|███████▋  | 10/13 [00:09<00:04,  1.43s/it]
Processing speaker spk_1 track 1 of 1:  59%|█████▉    | 10/17 [00:12<00:12,  1.72s/it]
Processing speaker spk_2 track 1 of 8: 100%|██████████| 2/2 [00:00<00:00,  2.07it/s]
Processing speaker spk_2 track 2 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.31it/s]
Processing speaker spk_2 track 3 of 8: 100%|██████████| 2/2 [00:00<00:00,  2.57it/s]
Processing speaker spk_2 track 4 of 8: 100%|██████████| 3/3 [00:18<00:00,  6.16s/it]
Processing speaker spk_3 track 1 of 1:  42%|████▏     | 10/24 [00:20<00:31,  2.27s/it]
Processing speaker spk_4 track 1 of 1:  30%|███       | 10/33 [00:30<00:48,  2.09s/it]
Processing speaker spk_5 track 1 of 2:  40%|████      | 10/25 [00:12<00:29,  1.99s/it]


Starting inference for experiment: E58_bugfix_mdOn0p4_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E58_bugfix_mdOn0p4_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_49
  comment         = AVSR override: min_on=0.4s, min_off=0.8s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_49


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 1:   0%|          | 0/13 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 1:   8%|▊         | 1/13 [00:01<00:12,  1.06s/it]
[Acessing speaker spk_0 track 1 of 1:  15%|█▌        | 2/13 [00:02<00:11,  1.01s/it]
[Acessing speaker spk_0 track 1 of 1:  23%|██▎       | 3/13 [00:02<00:07,  1.31it/s]
[Acessing speaker spk_0 track 1 of 1:  31%|███       | 4/13 [00:02<00:05,  1.56it/s]
[Acessing speaker spk_0 track 1 of 1:  38%|███▊      | 5/13 [00:03<00:05,  1.49it/s]
[Acessing speaker spk_0 track 1 of 1:  46%|████▌     | 6/13 [00:04<00:04,  1.47it/s]
[Acessing speaker spk_0 track 1 of 1:  54%|█████▍    | 7/13 [00:04<00:03,  1.55it/s]
[Acessing speaker spk_0 track 1 of 1:  62%|██████▏   | 8/13 [00:05<00:03,  1.47it/s]
[Acessing speaker spk_0 track 1 of 1:  69%|██████▉   | 9/13 [00:06<00:02,  1.64it/s]
[Acessing speaker spk_0 track 1 of 1:  77%|███████▋  | 10/13 [00:11<00:05,  1.98s/it]
[Acessing speaker spk_0 track 1 of 1:  85%|████████▍ | 11/1





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/16 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   6%|▋         | 1/16 [00:01<00:15,  1.04s/it]
[Acessing speaker spk_1 track 1 of 1:  12%|█▎        | 2/16 [00:03<00:24,  1.74s/it]
[Acessing speaker spk_1 track 1 of 1:  19%|█▉        | 3/16 [00:04<00:18,  1.39s/it]
[Acessing speaker spk_1 track 1 of 1:  25%|██▌       | 4/16 [00:04<00:12,  1.08s/it]
[Acessing speaker spk_1 track 1 of 1:  31%|███▏      | 5/16 [00:05<00:09,  1.11it/s]
[Acessing speaker spk_1 track 1 of 1:  38%|███▊      | 6/16 [00:06<00:08,  1.18it/s]
[Acessing speaker spk_1 track 1 of 1:  44%|████▍     | 7/16 [00:07<00:08,  1.06it/s]
[Acessing speaker spk_1 track 1 of 1:  50%|█████     | 8/16 [00:08<00:08,  1.04s/it]
[Acessing speaker spk_1 track 1 of 1:  56%|█████▋    | 9/16 [00:09<00:07,  1.08s/it]
[Acessing speaker spk_1 track 1 of 1:  62%|██████▎   | 10/16 [00:12<00:10,  1.67s/it]
[Acessing speaker spk_1 track 1 of 1:  69%|██████▉   | 11/1





Processing speaker spk_2 track 1 of 8: 100%|██████████| 2/2 [00:00<00:00,  2.04it/s]
Processing speaker spk_2 track 2 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.90it/s]
Processing speaker spk_2 track 3 of 8: 100%|██████████| 2/2 [00:00<00:00,  2.58it/s]
Processing speaker spk_2 track 4 of 8: 100%|██████████| 3/3 [00:16<00:00,  5.46s/it]

[Acess





Processing speaker spk_3 track 1 of 1:   0%|          | 0/24 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 1:  46%|████▌     | 11/2





Processing speaker spk_4 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:  38%|███▊      | 11/2





Processing speaker spk_5 track 1 of 2:   0%|          | 0/24 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 2:  46%|████▌     | 11/2


Starting inference for experiment: E59_bugfix_mdOn0p4_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E59_bugfix_mdOn0p4_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_49
  comment         = AVSR override: min_on=0.4s, min_off=1.0s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_49


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/13 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  85%|████████▍ | 11/1





Processing speaker spk_1 track 1 of 1:   0%|          | 0/14 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:  79%|███████▊  | 11/1





Processing speaker spk_2 track 1 of 8: 100%|██████████| 2/2 [00:00<00:00,  2.07it/s]
Processing speaker spk_2 track 2 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.90it/s]
Processing speaker spk_2 track 3 of 8: 100%|██████████| 2/2 [00:00<00:00,  2.61it/s]
Processing speaker spk_2 track 4 of 8: 100%|██████████| 3/3 [00:16<00:00,  5.37s/it]

[Acess





Processing speaker spk_3 track 1 of 1:   0%|          | 0/22 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 1:  50%|█████     | 11/2





Processing speaker spk_4 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:  41%|████      | 11/2





Processing speaker spk_5 track 1 of 2:   0%|          | 0/21 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 2:  52%|█████▏    | 11/2


Starting inference for experiment: E60_bugfix_mdOn0p4_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E60_bugfix_mdOn0p4_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_49
  comment         = AVSR override: min_on=0.4s, min_off=1.2s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_49


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/13 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  85%|████████▍ | 11/1





Processing speaker spk_1 track 1 of 1:   0%|          | 0/12 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:  92%|█████████▏| 11/1





Processing speaker spk_2 track 1 of 8: 100%|██████████| 2/2 [00:00<00:00,  2.06it/s]
Processing speaker spk_2 track 2 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.89it/s]
Processing speaker spk_2 track 3 of 8: 100%|██████████| 2/2 [00:00<00:00,  2.58it/s]
Processing speaker spk_2 track 4 of 8: 100%|██████████| 3/3 [00:16<00:00,  5.52s/it]

[Acess





Processing speaker spk_3 track 1 of 1:   0%|          | 0/22 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 1:  50%|█████     | 11/2





Processing speaker spk_4 track 1 of 1:   0%|          | 0/25 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:  44%|████▍     | 11/2





Processing speaker spk_5 track 1 of 2:   0%|          | 0/20 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 2:  55%|█████▌    | 11/2


Starting inference for experiment: E61_bugfix_mdOn0p6_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E61_bugfix_mdOn0p6_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_49
  comment         = AVSR override: min_on=0.6s, min_off=0.5s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_49


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/12 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  92%|█████████▏| 11/1





Processing speaker spk_1 track 1 of 1:   0%|          | 0/17 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:  65%|██████▍   | 11/1





Processing speaker spk_2 track 1 of 8: 100%|██████████| 2/2 [00:00<00:00,  2.03it/s]
Processing speaker spk_2 track 2 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.90it/s]
Processing speaker spk_2 track 3 of 8: 100%|██████████| 2/2 [00:00<00:00,  2.54it/s]
Processing speaker spk_2 track 4 of 8: 100%|██████████| 3/3 [00:16<00:00,  5.58s/it]

[Acess





Processing speaker spk_3 track 1 of 1:   0%|          | 0/23 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 1:  48%|████▊     | 11/2





Processing speaker spk_4 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:  35%|███▌      | 11/3





Processing speaker spk_5 track 1 of 2:   0%|          | 0/25 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 2:  44%|████▍     | 11/2


Starting inference for experiment: E62_bugfix_mdOn0p6_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E62_bugfix_mdOn0p6_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_49
  comment         = AVSR override: min_on=0.6s, min_off=0.8s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_49


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/12 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  92%|█████████▏| 11/1





Processing speaker spk_1 track 1 of 1:   0%|          | 0/16 [00:00<?, ?it/s]
Processing speaker spk_1 track 1 of 1:  69%|██████▉   | 11/1





Processing speaker spk_2 track 1 of 8: 100%|██████████| 2/2 [00:00<00:00,  2.05it/s]
Processing speaker spk_2 track 2 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.90it/s]
Processing speaker spk_2 track 3 of 8: 100%|██████████| 2/2 [00:00<00:00,  2.58it/s]
Processing speaker spk_2 track 4 of 8: 100%|██████████| 3/3 [00:17<00:00,  5.87s/it]

[Acess





Processing speaker spk_3 track 1 of 1:   0%|          | 0/23 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 1:  48%|████▊     | 11/2





Processing speaker spk_4 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
Processing speaker spk_4 track 1 of 1:  41%|████      | 11/2





Processing speaker spk_5 track 1 of 2:   0%|          | 0/24 [00:00<?, ?it/s]
Processing speaker spk_5 track 1 of 2:  46%|████▌     | 11/2


Starting inference for experiment: E63_bugfix_mdOn0p6_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E63_bugfix_mdOn0p6_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_49
  comment         = AVSR override: min_on=0.6s, min_off=1.0s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_49


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





Processing speaker spk_0 track 1 of 1:   0%|          | 0/12 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  92%|█████████▏| 11/1





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/14 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   7%|▋         | 1/14 [00:06<01:21,  6.24s/it]
[Acessing speaker spk_1 track 1 of 1:  14%|█▍        | 2/14 [00:07<00:39,  3.26s/it]
[Acessing speaker spk_1 track 1 of 1:  21%|██▏       | 3/14 [00:08<00:22,  2.04s/it]
[Acessing speaker spk_1 track 1 of 1:  29%|██▊       | 4/14 [00:08<00:14,  1.47s/it]
[Acessing speaker spk_1 track 1 of 1:  36%|███▌      | 5/14 [00:09<00:10,  1.20s/it]
[Acessing speaker spk_1 track 1 of 1:  43%|████▎     | 6/14 [00:10<00:08,  1.11s/it]
[Acessing speaker spk_1 track 1 of 1:  50%|█████     | 7/14 [00:12<00:10,  1.44s/it]
[Acessing speaker spk_1 track 1 of 1:  57%|█████▋    | 8/14 [00:13<00:08,  1.36s/it]
[Acessing speaker spk_1 track 1 of 1:  64%|██████▍   | 9/14 [00:16<00:09,  1.88s/it]
[Acessing speaker spk_1 track 1 of 1:  71%|███████▏  | 10/14 [00:26<00:17,  4.42s/it]
[Acessing speaker spk_1 track 1 of 1:  79%|███████▊  | 11/1





[Acessing speaker spk_2 track 1 of 8:   0%|          | 0/2 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 8:  50%|█████     | 1/2 [00:00<00:00,  1.92it/s]
Processing speaker spk_2 track 1 of 8: 100%|██████████| 2/2 [00:00<00:00,  2.05it/s]

[Acessing speaker spk_2 track 2 of 8:   0%|          | 0/1 [00:00<?, ?it/s]
Processing speaker spk_2 track 2 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.91it/s]

[Acessing speaker spk_2 track 3 of 8:   0%|          | 0/2 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 3 of 8:  50%|█████     | 1/2 [00:00<00:00,  3.06it/s]
Processing speaker spk_2 track 3 of 8: 100%|██████████| 2/2 [00:00<00:00,  2.56it/s]

[Acessing speaker spk_2 track 4 of 8:   0%|          | 0/3 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 4 of 8:  33%|███▎      | 1/3 [00:04<00:08,  4.42s/it]
[Acessing speaker spk_2 track 4 of 8:  67%|██████▋   | 2/3 [00:08<00:04,  4.48s/it]
Processing speaker spk_2 track 4 of 8: 100%|██████████| 3/3 [00:16<00:00,  5.45s/it]

[Acess





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/21 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   5%|▍         | 1/21 [00:00<00:14,  1.35it/s]
[Acessing speaker spk_3 track 1 of 1:  10%|▉         | 2/21 [00:02<00:19,  1.05s/it]
[Acessing speaker spk_3 track 1 of 1:  14%|█▍        | 3/21 [00:04<00:33,  1.83s/it]
[Acessing speaker spk_3 track 1 of 1:  19%|█▉        | 4/21 [00:08<00:43,  2.56s/it]
[Acessing speaker spk_3 track 1 of 1:  24%|██▍       | 5/21 [00:10<00:38,  2.41s/it]
[Acessing speaker spk_3 track 1 of 1:  29%|██▊       | 6/21 [00:11<00:28,  1.87s/it]
[Acessing speaker spk_3 track 1 of 1:  33%|███▎      | 7/21 [00:15<00:34,  2.47s/it]
[Acessing speaker spk_3 track 1 of 1:  38%|███▊      | 8/21 [00:18<00:35,  2.76s/it]
[Acessing speaker spk_3 track 1 of 1:  43%|████▎     | 9/21 [00:19<00:27,  2.32s/it]
[Acessing speaker spk_3 track 1 of 1:  48%|████▊     | 10/21 [00:21<00:24,  2.23s/it]
[Acessing speaker spk_3 track 1 of 1:  52%|█████▏    | 11/2





[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/25 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   4%|▍         | 1/25 [00:04<01:59,  4.98s/it]
[Acessing speaker spk_4 track 1 of 1:   8%|▊         | 2/25 [00:12<02:30,  6.52s/it]
[Acessing speaker spk_4 track 1 of 1:  12%|█▏        | 3/25 [00:20<02:42,  7.39s/it]
[Acessing speaker spk_4 track 1 of 1:  16%|█▌        | 4/25 [00:24<02:03,  5.90s/it]
[Acessing speaker spk_4 track 1 of 1:  20%|██        | 5/25 [00:28<01:45,  5.28s/it]
[Acessing speaker spk_4 track 1 of 1:  24%|██▍       | 6/25 [00:31<01:26,  4.57s/it]
[Acessing speaker spk_4 track 1 of 1:  28%|██▊       | 7/25 [00:32<00:59,  3.28s/it]
[Acessing speaker spk_4 track 1 of 1:  32%|███▏      | 8/25 [00:33<00:42,  2.49s/it]
[Acessing speaker spk_4 track 1 of 1:  36%|███▌      | 9/25 [00:33<00:29,  1.85s/it]
[Acessing speaker spk_4 track 1 of 1:  40%|████      | 10/25 [00:34<00:23,  1.56s/it]
[Acessing speaker spk_4 track 1 of 1:  44%|████▍     | 11/2





[Acessing speaker spk_5 track 1 of 2:   0%|          | 0/21 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 2:   5%|▍         | 1/21 [00:00<00:16,  1.19it/s]
[Acessing speaker spk_5 track 1 of 2:  10%|▉         | 2/21 [00:01<00:13,  1.38it/s]
[Acessing speaker spk_5 track 1 of 2:  14%|█▍        | 3/21 [00:03<00:20,  1.13s/it]
[Acessing speaker spk_5 track 1 of 2:  19%|█▉        | 4/21 [00:03<00:16,  1.06it/s]
[Acessing speaker spk_5 track 1 of 2:  24%|██▍       | 5/21 [00:04<00:14,  1.13it/s]
[Acessing speaker spk_5 track 1 of 2:  29%|██▊       | 6/21 [00:05<00:12,  1.18it/s]
[Acessing speaker spk_5 track 1 of 2:  33%|███▎      | 7/21 [00:05<00:10,  1.32it/s]
[Acessing speaker spk_5 track 1 of 2:  38%|███▊      | 8/21 [00:07<00:11,  1.14it/s]
[Acessing speaker spk_5 track 1 of 2:  43%|████▎     | 9/21 [00:07<00:10,  1.19it/s]
[Acessing speaker spk_5 track 1 of 2:  48%|████▊     | 10/21 [00:10<00:16,  1.48s/it]
[Acessing speaker spk_5 track 1 of 2:  52%|█████▏    | 11/2


Starting inference for experiment: E64_bugfix_mdOn0p6_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E64_bugfix_mdOn0p6_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_49
  comment         = AVSR-Override: min_on=0.6s, min_off=1.2s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_49


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  83%|████████▎ | 10/12 [00:09<00:02,  1.27s/it]
Processing speaker spk_1 track 1 of 1:  83%|████████▎ | 10/12 [00:29<00:06,  3.33s/it]
Processing speaker spk_2 track 1 of 8: 100%|██████████| 2/2 [00:00<00:00,  2.07it/s]
Processing speaker spk_2 track 2 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.91it/s]
Processing speaker spk_2 track 3 of 8: 100%|██████████| 2/2 [00:00<00:00,  2.61it/s]
Processing speaker spk_2 track 4 of 8: 100%|██████████| 3/3 [00:16<00:00,  5.40s/it]
Processing speaker spk_3 track 1 of 1:  48%|████▊     | 10/21 [00:21<00:24,  2.20s/it]
Processing speaker spk_4 track 1 of 1:  42%|████▏     | 10/24 [00:40<00:22,  1.59s/it]
Processing speaker spk_5 track 1 of 2:  50%|█████     | 10/20 [00:10<00:14,  1.42s/it]


Starting inference for experiment: E65_bugfix_mdOn0p8_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E65_bugfix_mdOn0p8_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_49
  comment         = AVSR-Override: min_on=0.8s, min_off=0.5s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_49


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  83%|████████▎ | 10/12 [00:09<00:02,  1.25s/it]
Processing speaker spk_1 track 1 of 1:  59%|█████▉    | 10/17 [00:12<00:11,  1.69s/it]
Processing speaker spk_2 track 1 of 8: 100%|██████████| 2/2 [00:02<00:00,  1.14s/it]
Processing speaker spk_2 track 2 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.80it/s]
Processing speaker spk_2 track 3 of 8: 100%|██████████| 1/1 [00:00<00:00,  2.23it/s]
Processing speaker spk_2 track 4 of 8: 100%|██████████| 3/3 [00:16<00:00,  5.39s/it]
Processing speaker spk_2 track 5 of 8:   0%|          | 0/2 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 1:  43%|████▎     | 10/23 [00:20<00:23,  1.81s/it]
Processing speaker spk_4 track 1 of 1:  33%|███▎      | 10/30 [00:29<00:41,  2.07s/it]
Processing speaker spk_5 track 1 of 2:  40%|████      | 10/25 [00:12<00:29,  1.98s/it]


Starting inference for experiment: E66_bugfix_mdOn0p8_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E66_bugfix_mdOn0p8_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_49
  comment         = AVSR-Override: min_on=0.8s, min_off=0.8s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_49


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  83%|████████▎ | 10/12 [00:09<00:02,  1.26s/it]
Processing speaker spk_1 track 1 of 1:  62%|██████▎   | 10/16 [00:12<00:10,  1.69s/it]
Processing speaker spk_2 track 1 of 8: 100%|██████████| 2/2 [00:00<00:00,  2.03it/s]
Processing speaker spk_2 track 2 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.90it/s]
Processing speaker spk_2 track 3 of 8: 100%|██████████| 1/1 [00:00<00:00,  2.20it/s]
Processing speaker spk_2 track 4 of 8: 100%|██████████| 3/3 [00:16<00:00,  5.38s/it]
Processing speaker spk_2 track 5 of 8:   0%|          | 0/2 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 1:  43%|████▎     | 10/23 [00:20<00:24,  1.87s/it]
Processing speaker spk_4 track 1 of 1:  38%|███▊      | 10/26 [00:29<00:25,  1.62s/it]
Processing speaker spk_5 track 1 of 2:  42%|████▏     | 10/24 [00:10<00:20,  1.44s/it]


Starting inference for experiment: E67_bugfix_mdOn0p8_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E67_bugfix_mdOn0p8_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_49
  comment         = AVSR-Override: min_on=0.8s, min_off=1.0s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_49


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  83%|████████▎ | 10/12 [00:09<00:02,  1.28s/it]
Processing speaker spk_1 track 1 of 1:  71%|███████▏  | 10/14 [00:27<00:19,  4.86s/it]
Processing speaker spk_2 track 1 of 8: 100%|██████████| 2/2 [00:00<00:00,  2.05it/s]
Processing speaker spk_2 track 2 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.90it/s]
Processing speaker spk_2 track 3 of 8: 100%|██████████| 1/1 [00:00<00:00,  2.24it/s]
Processing speaker spk_2 track 4 of 8: 100%|██████████| 3/3 [00:16<00:00,  5.38s/it]
Processing speaker spk_2 track 5 of 8:   0%|          | 0/2 [00:00<?, ?it/s]
Processing speaker spk_3 track 1 of 1:  48%|████▊     | 10/21 [00:22<00:24,  2.25s/it]
Processing speaker spk_4 track 1 of 1:  42%|████▏     | 10/24 [00:34<00:21,  1.56s/it]
Processing speaker spk_5 track 1 of 2:  48%|████▊     | 10/21 [00:10<00:16,  1.49s/it]


Starting inference for experiment: E68_bugfix_mdOn0p8_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E68_bugfix_mdOn0p8_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_49
  comment         = AVSR override: min_on=0.8s, min_off=1.2s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_49


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1: total 12 [progress output truncated]
Processing speaker spk_1 track 1 of 1: total 12 [progress output truncated]
Processing speaker spk_2 track 1 of 8: 100%|██████████| 2/2 [00:00<00:00,  2.00it/s]
Processing speaker spk_2 track 2 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.92it/s]
Processing speaker spk_2 track 3 of 8: 100%|██████████| 1/1 [00:00<00:00,  2.22it/s]
Processing speaker spk_2 track 4 of 8: 100%|██████████| 3/3 [00:16<00:00,  5.37s/it]
Processing speaker spk_2 track 5 of 8: total 2 [progress output truncated]
Processing speaker spk_3 track 1 of 1: total 21 [progress output truncated]
Processing speaker spk_4 track 1 of 1: total 23 [progress output truncated]
Processing speaker spk_5 track 1 of 2: total 20 [progress output truncated]


Starting inference for experiment: E69_bugfix_mdOn1p0_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E69_bugfix_mdOn1p0_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_49
  comment         = AVSR override: min_on=1.0s, min_off=0.5s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_49


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1: total 12 [progress output truncated]
Processing speaker spk_1 track 1 of 1: total 16 [progress output truncated]
Processing speaker spk_2 track 1 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.73it/s]
Processing speaker spk_2 track 2 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.88it/s]
Processing speaker spk_2 track 3 of 8: 100%|██████████| 1/1 [00:00<00:00,  2.20it/s]
Processing speaker spk_2 track 4 of 8: 100%|██████████| 3/3 [00:17<00:00,  5.94s/it]
Processing speaker spk_2 track 5 of 8: total 2 [progress output truncated]
Processing speaker spk_3 track 1 of 1: total 23 [progress output truncated]
Processing speaker spk_4 track 1 of 1: total 28 [progress output truncated]
Processing speaker spk_5 track 1 of 2: total 25 [progress output truncated]


Starting inference for experiment: E70_bugfix_mdOn1p0_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E70_bugfix_mdOn1p0_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_49
  comment         = AVSR override: min_on=1.0s, min_off=0.8s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_49


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1: total 12 [progress output truncated]
Processing speaker spk_1 track 1 of 1: total 16 [progress output truncated]
Processing speaker spk_2 track 1 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.75it/s]
Processing speaker spk_2 track 2 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.88it/s]
Processing speaker spk_2 track 3 of 8: 100%|██████████| 1/1 [00:00<00:00,  2.22it/s]
Processing speaker spk_2 track 4 of 8: 100%|██████████| 3/3 [00:16<00:00,  5.40s/it]
Processing speaker spk_2 track 5 of 8: total 2 [progress output truncated]
Processing speaker spk_3 track 1 of 1: total 23 [progress output truncated]
Processing speaker spk_4 track 1 of 1: total 24 [progress output truncated]
Processing speaker spk_5 track 1 of 2: total 24 [progress output truncated]


Starting inference for experiment: E71_bugfix_mdOn1p0_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E71_bugfix_mdOn1p0_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_49
  comment         = AVSR override: min_on=1.0s, min_off=1.0s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_49


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1: total 12 [progress output truncated]
Processing speaker spk_1 track 1 of 1: total 14 [progress output truncated]
Processing speaker spk_2 track 1 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.75it/s]
Processing speaker spk_2 track 2 of 8: 100%|██████████| 1/1 [00:01<00:00,  1.02s/it]
Processing speaker spk_2 track 3 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.05it/s]
Processing speaker spk_2 track 4 of 8: 100%|██████████| 3/3 [00:16<00:00,  5.44s/it]
Processing speaker spk_2 track 5 of 8: total 2 [progress output truncated]
Processing speaker spk_3 track 1 of 1: total 21 [progress output truncated]
Processing speaker spk_4 track 1 of 1: total 22 [progress output truncated]
Processing speaker spk_5 track 1 of 2: total 21 [progress output truncated]


Starting inference for experiment: E72_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E72_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_49
  comment         = AVSR override: min_on=1.0s, min_off=1.2s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_49


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1: total 12 [progress output truncated]
Processing speaker spk_1 track 1 of 1: total 12 [progress output truncated]
Processing speaker spk_2 track 1 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.79it/s]
Processing speaker spk_2 track 2 of 8: 100%|██████████| 1/1 [00:00<00:00,  1.90it/s]
Processing speaker spk_2 track 3 of 8: 100%|██████████| 1/1 [00:00<00:00,  2.24it/s]
Processing speaker spk_2 track 4 of 8: 100%|██████████| 3/3 [00:17<00:00,  5.84s/it]
Processing speaker spk_2 track 5 of 8: total 2 [progress output truncated]
Processing speaker spk_3 track 1 of 1: total 21 [progress output truncated]
Processing speaker spk_4 track 1 of 1: total 21 [progress output truncated]





[Acessing speaker spk_5 track 1 of 2:   0%|          | 0/20 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 2:   5%|▌         | 1/20 [00:00<00:16,  1.13it/s]
[Acessing speaker spk_5 track 1 of 2:  10%|█         | 2/20 [00:01<00:13,  1.35it/s]
[Acessing speaker spk_5 track 1 of 2:  15%|█▌        | 3/20 [00:03<00:19,  1.14s/it]
[Acessing speaker spk_5 track 1 of 2:  20%|██        | 4/20 [00:03<00:15,  1.05it/s]
[Acessing speaker spk_5 track 1 of 2:  25%|██▌       | 5/20 [00:04<00:13,  1.13it/s]
[Acessing speaker spk_5 track 1 of 2:  30%|███       | 6/20 [00:05<00:11,  1.19it/s]
[Acessing speaker spk_5 track 1 of 2:  35%|███▌      | 7/20 [00:05<00:09,  1.32it/s]
[Acessing speaker spk_5 track 1 of 2:  40%|████      | 8/20 [00:07<00:10,  1.14it/s]
[Acessing speaker spk_5 track 1 of 2:  45%|████▌     | 9/20 [00:07<00:09,  1.19it/s]
[Acessing speaker spk_5 track 1 of 2:  50%|█████     | 10/20 [00:10<00:14,  1.42s/it]
[Acessing speaker spk_5 track 1 of 2:  55%|█████▌    | 11/2


########## Starting grid experiments for session_50 ##########

Starting inference for experiment: E56_bugfix_default_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E56_bugfix_default_bs12_len20
  session_dir     = data-bin/dev/session_50
  comment         = Bugfix-default segmentation (no override of min_duration)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_50


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  34%|███▍      | 10/29 [00:08<00:14,  1.29it/s]
Processing speaker spk_1 track 1 of 1:  37%|███▋      | 10/27 [00:22<00:28,  1.69s/it]
Processing speaker spk_2 track 1 of 2:  50%|█████     | 10/20 [00:13<00:09,  1.08it/s]
Processing speaker spk_3 track 1 of 3:  59%|█████▉    | 10/17 [00:10<00:08,  1.19s/it]
Processing speaker spk_4 track 1 of 1:  37%|███▋      | 10/27 [00:08<00:15,  1.09it/s]
Processing speaker spk_5 track 1 of 1:  30%|███       | 10/33 [00:14<00:59,  2.59s/it]


Starting inference for experiment: E57_bugfix_mdOn0p4_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E57_bugfix_mdOn0p4_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_50
  comment         = AVSR override: min_on=0.4s, min_off=0.5s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_50


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  31%|███▏      | 10/32 [00:07<00:17,  1.23it/s]
Processing speaker spk_1 track 1 of 1:  34%|███▍      | 10/29 [00:21<00:16,  1.15it/s]
Processing speaker spk_2 track 1 of 2:  50%|█████     | 10/20 [00:12<00:09,  1.07it/s]
Processing speaker spk_3 track 1 of 3:  42%|████▏     | 10/24 [00:07<00:09,  1.41it/s]
Processing speaker spk_4 track 1 of 1:  32%|███▏      | 10/31 [00:08<00:22,  1.09s/it]
Processing speaker spk_5 track 1 of 1:  26%|██▋       | 10/38 [00:05<00:17,  1.64it/s]


Starting inference for experiment: E58_bugfix_mdOn0p4_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E58_bugfix_mdOn0p4_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_50
  comment         = AVSR override: min_on=0.4s, min_off=0.8s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_50


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  32%|███▏      | 10/31 [00:07<00:17,  1.23it/s]
Processing speaker spk_1 track 1 of 1:  37%|███▋      | 10/27 [00:19<00:14,  1.19it/s]
Processing speaker spk_2 track 1 of 2:  53%|█████▎    | 10/19 [00:14<00:09,  1.04s/it]
Processing speaker spk_3 track 1 of 3:  43%|████▎     | 10/23 [00:07<00:08,  1.48it/s]
Processing speaker spk_4 track 1 of 1:  32%|███▏      | 10/31 [00:08<00:24,  1.15s/it]
Processing speaker spk_5 track 1 of 1:  27%|██▋       | 10/37 [00:05<00:16,  1.59it/s]


Starting inference for experiment: E59_bugfix_mdOn0p4_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E59_bugfix_mdOn0p4_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_50
  comment         = AVSR override: min_on=0.4s, min_off=1.0s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_50


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
Processing speaker spk_0 track 1 of 1:  36%|███▌      | 10/28 [00:07<00:14,  1.22it/s]
Processing speaker spk_1 track 1 of 1:  38%|███▊      | 10/26 [00:25<00:32,  2.02s/it]
Processing speaker spk_2 track 1 of 2:  56%|█████▌    | 10/18 [00:15<00:08,  1.01s/it]
Processing speaker spk_3 track 1 of 3:  45%|████▌     | 10/22 [00:08<00:08,  1.41it/s]
Processing speaker spk_4 track 1 of 1:  32%|███▏      | 10/31 [00:08<00:22,  1.09s/it]
Processing speaker spk_5 track 1 of 1:  29%|██▉       | 10/34 [00:05<00:14,  1.64it/s]


Starting inference for experiment: E60_bugfix_mdOn0p4_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E60_bugfix_mdOn0p4_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_50
  comment         = AVSR override: min_on=0.4s, min_off=1.2s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_50


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 1:   0%|          | 0/28 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 1:   4%|▎         | 1/28 [00:00<00:11,  2.40it/s]
[Acessing speaker spk_0 track 1 of 1:   7%|▋         | 2/28 [00:01<00:15,  1.65it/s]
[Acessing speaker spk_0 track 1 of 1:  11%|█         | 3/28 [00:01<00:17,  1.40it/s]
[Acessing speaker spk_0 track 1 of 1:  14%|█▍        | 4/28 [00:02<00:14,  1.63it/s]
[Acessing speaker spk_0 track 1 of 1:  18%|█▊        | 5/28 [00:03<00:14,  1.61it/s]
[Acessing speaker spk_0 track 1 of 1:  21%|██▏       | 6/28 [00:03<00:12,  1.79it/s]
[Acessing speaker spk_0 track 1 of 1:  25%|██▌       | 7/28 [00:04<00:16,  1.28it/s]
[Acessing speaker spk_0 track 1 of 1:  29%|██▊       | 8/28 [00:05<00:16,  1.20it/s]
[Acessing speaker spk_0 track 1 of 1:  32%|███▏      | 9/28 [00:06<00:17,  1.11it/s]
[Acessing speaker spk_0 track 1 of 1:  36%|███▌      | 10/28 [00:08<00:21,  1.21s/it]
[Acessing speaker spk_0 track 1 of 1:  39%|███▉      | 11/2





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/25 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   4%|▍         | 1/25 [00:10<04:20, 10.85s/it]
[Acessing speaker spk_1 track 1 of 1:   8%|▊         | 2/25 [00:12<02:05,  5.46s/it]
[Acessing speaker spk_1 track 1 of 1:  12%|█▏        | 3/25 [00:13<01:10,  3.19s/it]
[Acessing speaker spk_1 track 1 of 1:  16%|█▌        | 4/25 [00:15<00:59,  2.84s/it]
[Acessing speaker spk_1 track 1 of 1:  20%|██        | 5/25 [00:17<00:49,  2.47s/it]
[Acessing speaker spk_1 track 1 of 1:  24%|██▍       | 6/25 [00:18<00:36,  1.93s/it]
[Acessing speaker spk_1 track 1 of 1:  28%|██▊       | 7/25 [00:18<00:26,  1.45s/it]
[Acessing speaker spk_1 track 1 of 1:  32%|███▏      | 8/25 [00:19<00:20,  1.20s/it]
[Acessing speaker spk_1 track 1 of 1:  36%|███▌      | 9/25 [00:23<00:34,  2.17s/it]
[Acessing speaker spk_1 track 1 of 1:  40%|████      | 10/25 [00:24<00:27,  1.82s/it]
[Acessing speaker spk_1 track 1 of 1:  44%|████▍     | 11/2





[Acessing speaker spk_2 track 1 of 2:   0%|          | 0/17 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 3:   0%|          | 0/21 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/34 [00:00<?, ?it/s]

Starting inference for experiment: E61_bugfix_mdOn0p6_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E61_bugfix_mdOn0p6_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_50
  comment         = AVSR-Override: min_on=0.6s, min_off=0.5s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_50


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 2:   0%|          | 0/20 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 3:   0%|          | 0/23 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/30 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/36 [00:00<?, ?it/s]

Starting inference for experiment: E62_bugfix_mdOn0p6_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E62_bugfix_mdOn0p6_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_50
  comment         = AVSR-Override: min_on=0.6s, min_off=0.8s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_50


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 2:   0%|          | 0/19 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 3:   0%|          | 0/23 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/30 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/35 [00:00<?, ?it/s]

Starting inference for experiment: E63_bugfix_mdOn0p6_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E63_bugfix_mdOn0p6_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_50
  comment         = AVSR-Override: min_on=0.6s, min_off=1.0s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_50


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 1:   0%|          | 0/28 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/26 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 2:   0%|          | 0/18 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 3:   0%|          | 0/22 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/30 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]

Starting inference for experiment: E64_bugfix_mdOn0p6_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E64_bugfix_mdOn0p6_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_50
  comment         = AVSR-Override: min_on=0.6s, min_off=1.2s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_50


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 1:   0%|          | 0/28 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/25 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 2:   0%|          | 0/17 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 3:   0%|          | 0/21 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/28 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]

Starting inference for experiment: E65_bugfix_mdOn0p8_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E65_bugfix_mdOn0p8_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_50
  comment         = AVSR-Override: min_on=0.8s, min_off=0.5s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_50


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/28 [00:00<?, ?it/s]





[Acessing speaker spk_2 track 1 of 2:   0%|          | 0/20 [00:00<?, ?it/s]





[Acessing speaker spk_3 track 1 of 3:   0%|          | 0/21 [00:00<?, ?it/s]





[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]





[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/34 [00:00<?, ?it/s]


Starting inference for experiment: E66_bugfix_mdOn0p8_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E66_bugfix_mdOn0p8_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_50
  comment         = AVSR-Override: min_on=0.8s, min_off=0.8s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_50


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 1:   0%|          | 0/30 [00:00<?, ?it/s]





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/26 [00:00<?, ?it/s]





[Acessing speaker spk_2 track 1 of 2:   0%|          | 0/19 [00:00<?, ?it/s]





[Acessing speaker spk_3 track 1 of 3:   0%|          | 0/21 [00:00<?, ?it/s]





[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]





[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/33 [00:00<?, ?it/s]


Starting inference for experiment: E67_bugfix_mdOn0p8_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E67_bugfix_mdOn0p8_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_50
  comment         = AVSR-Override: min_on=0.8s, min_off=1.0s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_50


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/25 [00:00<?, ?it/s]





[Acessing speaker spk_2 track 1 of 2:   0%|          | 0/18 [00:00<?, ?it/s]





[Acessing speaker spk_3 track 1 of 3:   0%|          | 0/20 [00:00<?, ?it/s]





[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]





[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/30 [00:00<?, ?it/s]


Starting inference for experiment: E68_bugfix_mdOn0p8_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E68_bugfix_mdOn0p8_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_50
  comment         = AVSR-Override: min_on=0.8s, min_off=1.2s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_50


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/24 [00:00<?, ?it/s]





[Acessing speaker spk_2 track 1 of 2:   0%|          | 0/17 [00:00<?, ?it/s]





[Acessing speaker spk_3 track 1 of 3:   0%|          | 0/19 [00:00<?, ?it/s]





[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]





[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/30 [00:00<?, ?it/s]


Starting inference for experiment: E69_bugfix_mdOn1p0_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E69_bugfix_mdOn1p0_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_50
  comment         = AVSR-Override: min_on=1.0s, min_off=0.5s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_50


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]





[Acessing speaker spk_2 track 1 of 2:   0%|          | 0/20 [00:00<?, ?it/s]





[Acessing speaker spk_3 track 1 of 3:   0%|          | 0/17 [00:00<?, ?it/s]





[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]





[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/33 [00:00<?, ?it/s]


Starting inference for experiment: E70_bugfix_mdOn1p0_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E70_bugfix_mdOn1p0_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_50
  comment         = AVSR override: min_on=1.0s, min_off=0.8s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_50


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 1:   0%|          | 0/28 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/25 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 2:   0%|          | 0/19 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 3:   0%|          | 0/17 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]


Starting inference for experiment: E71_bugfix_mdOn1p0_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E71_bugfix_mdOn1p0_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_50
  comment         = AVSR override: min_on=1.0s, min_off=1.0s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_50


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 1:   0%|          | 0/25 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/24 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 2:   0%|          | 0/18 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 3:   0%|          | 0/16 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]


Starting inference for experiment: E72_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E72_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_50
  comment         = AVSR override: min_on=1.0s, min_off=1.2s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_50


Processing speakers:   0%|          | 0/6 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 1:   0%|          | 0/25 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/23 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 2:   0%|          | 0/17 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 3:   0%|          | 0/16 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/26 [00:00<?, ?it/s]
[Acessing speaker spk_5 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]


########## Starting grid experiments for session_54 ##########

Starting inference for experiment: E56_bugfix_default_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E56_bugfix_default_bs12_len20
  session_dir     = data-bin/dev/session_54
  comment         = Bugfix-default segmentation (no override of min_duration)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_54


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/38 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/43 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]


Starting inference for experiment: E57_bugfix_mdOn0p4_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E57_bugfix_mdOn0p4_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_54
  comment         = AVSR override: min_on=0.4s, min_off=0.5s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_54


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/34 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/39 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/44 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/35 [00:00<?, ?it/s]


Starting inference for experiment: E58_bugfix_mdOn0p4_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E58_bugfix_mdOn0p4_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_54
  comment         = AVSR override: min_on=0.4s, min_off=0.8s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_54


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/37 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/42 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]


Starting inference for experiment: E59_bugfix_mdOn0p4_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E59_bugfix_mdOn0p4_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_54
  comment         = AVSR override: min_on=0.4s, min_off=1.0s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_54


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/29 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/33 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/39 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]


Starting inference for experiment: E60_bugfix_mdOn0p4_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E60_bugfix_mdOn0p4_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_54
  comment         = AVSR override: min_on=0.4s, min_off=1.2s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_54


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/28 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/39 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]


Starting inference for experiment: E61_bugfix_mdOn0p6_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E61_bugfix_mdOn0p6_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_54
  comment         = AVSR override: min_on=0.6s, min_off=0.5s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_54


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/33 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/39 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/43 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/34 [00:00<?, ?it/s]


Starting inference for experiment: E62_bugfix_mdOn0p6_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E62_bugfix_mdOn0p6_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_54
  comment         = AVSR override: min_on=0.6s, min_off=0.8s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_54


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/30 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/37 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/41 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]


Starting inference for experiment: E63_bugfix_mdOn0p6_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E63_bugfix_mdOn0p6_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_54
  comment         = AVSR override: min_on=0.6s, min_off=1.0s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_54


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/28 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/33 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/38 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   3%|▎         | 1/38 [00:00<00:26,  1.42it/s]
[Acessing speaker spk_3 track 1 of 1:   5%|▌         | 2/38 [00:02<00:49,  1.36s/it]
[Acessing speaker spk_3 track 1 of 1:   8%|▊         | 3/38 [00:03<00:33,  1.03it/s]
[Acessing speaker spk_3 track 1 of 1:  11%|█         | 4/38 [00:04<00:33,  1.02it/s]
[Acessing speaker spk_3 track 1 of 1:  13%|█▎        | 5/38 [00:05<00:35,  1.07s/it]
[Acessing speaker spk_3 track 1 of 1:  16%|█▌        | 6/38 [00:05<00:30,  1.07it/s]
[Acessing speaker spk_3 track 1 of 1:  18%|█▊        | 7/38 [00:07<00:37,  1.21s/it]
[Acessing speaker spk_3 track 1 of 1:  21%|██        | 8/38 [00:08<00:32,  1.09s/it]
[Acessing speaker spk_3 track 1 of 1:  24%|██▎       | 9/38 [00:10<00:40,  1.39s/it]
[Acessing speaker spk_3 track 1 of 1:  26%|██▋       | 10/38 [00:13<00:55,  1.99s/it]
[Acessing speaker spk_3 track 1 of 1:  29%|██▉       | 11/3





[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   3%|▎         | 1/31 [00:01<00:57,  1.93s/it]
[Acessing speaker spk_4 track 1 of 1:   6%|▋         | 2/31 [00:02<00:36,  1.27s/it]
[Acessing speaker spk_4 track 1 of 1:  10%|▉         | 3/31 [00:03<00:26,  1.06it/s]
[Acessing speaker spk_4 track 1 of 1:  13%|█▎        | 4/31 [00:03<00:19,  1.37it/s]
[Acessing speaker spk_4 track 1 of 1:  16%|█▌        | 5/31 [00:05<00:28,  1.08s/it]
[Acessing speaker spk_4 track 1 of 1:  19%|█▉        | 6/31 [00:06<00:25,  1.02s/it]
[Acessing speaker spk_4 track 1 of 1:  23%|██▎       | 7/31 [00:08<00:37,  1.55s/it]
[Acessing speaker spk_4 track 1 of 1:  26%|██▌       | 8/31 [00:09<00:30,  1.32s/it]
[Acessing speaker spk_4 track 1 of 1:  29%|██▉       | 9/31 [00:11<00:31,  1.43s/it]
[Acessing speaker spk_4 track 1 of 1:  32%|███▏      | 10/31 [00:12<00:25,  1.19s/it]
[Acessing speaker spk_4 track 1 of 1:  35%|███▌      | 11/3


Starting inference for experiment: E64_bugfix_mdOn0p6_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E64_bugfix_mdOn0p6_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_54
  comment         = AVSR-Override: min_on=0.6s, min_off=1.2s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_54


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/27 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 2:   4%|▎         | 1/27 [00:00<00:24,  1.05it/s]
[Acessing speaker spk_0 track 1 of 2:   7%|▋         | 2/27 [00:01<00:17,  1.39it/s]
[Acessing speaker spk_0 track 1 of 2:  11%|█         | 3/27 [00:02<00:15,  1.58it/s]
[Acessing speaker spk_0 track 1 of 2:  15%|█▍        | 4/27 [00:02<00:12,  1.82it/s]
[Acessing speaker spk_0 track 1 of 2:  19%|█▊        | 5/27 [00:03<00:19,  1.16it/s]
[Acessing speaker spk_0 track 1 of 2:  22%|██▏       | 6/27 [00:05<00:26,  1.28s/it]
[Acessing speaker spk_0 track 1 of 2:  26%|██▌       | 7/27 [00:06<00:20,  1.02s/it]
[Acessing speaker spk_0 track 1 of 2:  30%|██▉       | 8/27 [00:09<00:30,  1.58s/it]
[Acessing speaker spk_0 track 1 of 2:  33%|███▎      | 9/27 [00:10<00:28,  1.59s/it]
[Acessing speaker spk_0 track 1 of 2:  37%|███▋      | 10/27 [00:11<00:23,  1.39s/it]
[Acessing speaker spk_0 track 1 of 2:  41%|████      | 11/2





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   4%|▎         | 1/27 [00:02<00:52,  2.01s/it]
[Acessing speaker spk_1 track 1 of 1:   7%|▋         | 2/27 [00:07<01:43,  4.15s/it]
[Acessing speaker spk_1 track 1 of 1:  11%|█         | 3/27 [00:15<02:17,  5.74s/it]
[Acessing speaker spk_1 track 1 of 1:  15%|█▍        | 4/27 [00:21<02:14,  5.84s/it]
[Acessing speaker spk_1 track 1 of 1:  19%|█▊        | 5/27 [00:26<02:02,  5.58s/it]
[Acessing speaker spk_1 track 1 of 1:  22%|██▏       | 6/27 [00:30<01:45,  5.01s/it]
[Acessing speaker spk_1 track 1 of 1:  26%|██▌       | 7/27 [00:32<01:22,  4.12s/it]
[Acessing speaker spk_1 track 1 of 1:  30%|██▉       | 8/27 [00:38<01:27,  4.60s/it]
[Acessing speaker spk_1 track 1 of 1:  33%|███▎      | 9/27 [00:42<01:23,  4.62s/it]
[Acessing speaker spk_1 track 1 of 1:  37%|███▋      | 10/27 [00:43<00:59,  3.50s/it]
[Acessing speaker spk_1 track 1 of 1:  41%|████      | 11/2





[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   3%|▎         | 1/31 [00:02<01:22,  2.75s/it]
[Acessing speaker spk_2 track 1 of 1:   6%|▋         | 2/31 [00:05<01:16,  2.63s/it]
[Acessing speaker spk_2 track 1 of 1:  10%|▉         | 3/31 [00:08<01:24,  3.02s/it]
[Acessing speaker spk_2 track 1 of 1:  13%|█▎        | 4/31 [00:09<00:56,  2.08s/it]
[Acessing speaker spk_2 track 1 of 1:  16%|█▌        | 5/31 [00:11<00:57,  2.20s/it]
[Acessing speaker spk_2 track 1 of 1:  19%|█▉        | 6/31 [00:14<00:58,  2.35s/it]
[Acessing speaker spk_2 track 1 of 1:  23%|██▎       | 7/31 [00:15<00:47,  1.99s/it]
[Acessing speaker spk_2 track 1 of 1:  26%|██▌       | 8/31 [00:16<00:37,  1.65s/it]
[Acessing speaker spk_2 track 1 of 1:  29%|██▉       | 9/31 [00:17<00:28,  1.30s/it]
[Acessing speaker spk_2 track 1 of 1:  32%|███▏      | 10/31 [00:18<00:30,  1.43s/it]
[Acessing speaker spk_2 track 1 of 1:  35%|███▌      | 11/3





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/38 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   3%|▎         | 1/38 [00:00<00:26,  1.42it/s]
[Acessing speaker spk_3 track 1 of 1:   5%|▌         | 2/38 [00:02<00:49,  1.38s/it]
[Acessing speaker spk_3 track 1 of 1:   8%|▊         | 3/38 [00:03<00:34,  1.03it/s]
[Acessing speaker spk_3 track 1 of 1:  11%|█         | 4/38 [00:04<00:33,  1.02it/s]
[Acessing speaker spk_3 track 1 of 1:  13%|█▎        | 5/38 [00:05<00:35,  1.07s/it]
[Acessing speaker spk_3 track 1 of 1:  16%|█▌        | 6/38 [00:05<00:29,  1.07it/s]
[Acessing speaker spk_3 track 1 of 1:  18%|█▊        | 7/38 [00:07<00:37,  1.21s/it]
[Acessing speaker spk_3 track 1 of 1:  21%|██        | 8/38 [00:08<00:34,  1.14s/it]
[Acessing speaker spk_3 track 1 of 1:  24%|██▎       | 9/38 [00:10<00:39,  1.37s/it]
[Acessing speaker spk_3 track 1 of 1:  26%|██▋       | 10/38 [00:13<00:55,  1.98s/it]
[Acessing speaker spk_3 track 1 of 1:  29%|██▉       | 11/3





[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   3%|▎         | 1/31 [00:01<00:57,  1.93s/it]
[Acessing speaker spk_4 track 1 of 1:   6%|▋         | 2/31 [00:02<00:36,  1.27s/it]
[Acessing speaker spk_4 track 1 of 1:  10%|▉         | 3/31 [00:04<00:40,  1.43s/it]
[Acessing speaker spk_4 track 1 of 1:  13%|█▎        | 4/31 [00:05<00:35,  1.30s/it]
[Acessing speaker spk_4 track 1 of 1:  16%|█▌        | 5/31 [00:07<00:37,  1.46s/it]
[Acessing speaker spk_4 track 1 of 1:  19%|█▉        | 6/31 [00:08<00:31,  1.26s/it]
[Acessing speaker spk_4 track 1 of 1:  23%|██▎       | 7/31 [00:10<00:41,  1.71s/it]
[Acessing speaker spk_4 track 1 of 1:  26%|██▌       | 8/31 [00:11<00:32,  1.43s/it]
[Acessing speaker spk_4 track 1 of 1:  29%|██▉       | 9/31 [00:13<00:34,  1.56s/it]
[Acessing speaker spk_4 track 1 of 1:  32%|███▏      | 10/31 [00:14<00:26,  1.28s/it]
[Acessing speaker spk_4 track 1 of 1:  35%|███▌      | 11/3


Starting inference for experiment: E65_bugfix_mdOn0p8_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E65_bugfix_mdOn0p8_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_54
  comment         = AVSR-Override: min_on=0.8s, min_off=0.5s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_54


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/32 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 2:   3%|▎         | 1/32 [00:00<00:29,  1.05it/s]
[Acessing speaker spk_0 track 1 of 2:   6%|▋         | 2/32 [00:01<00:21,  1.38it/s]
[Acessing speaker spk_0 track 1 of 2:   9%|▉         | 3/32 [00:02<00:18,  1.56it/s]
[Acessing speaker spk_0 track 1 of 2:  12%|█▎        | 4/32 [00:02<00:15,  1.81it/s]
[Acessing speaker spk_0 track 1 of 2:  16%|█▌        | 5/32 [00:02<00:14,  1.87it/s]
[Acessing speaker spk_0 track 1 of 2:  19%|█▉        | 6/32 [00:04<00:19,  1.33it/s]
[Acessing speaker spk_0 track 1 of 2:  22%|██▏       | 7/32 [00:06<00:29,  1.20s/it]
[Acessing speaker spk_0 track 1 of 2:  25%|██▌       | 8/32 [00:06<00:23,  1.02it/s]
[Acessing speaker spk_0 track 1 of 2:  28%|██▊       | 9/32 [00:09<00:35,  1.56s/it]
[Acessing speaker spk_0 track 1 of 2:  31%|███▏      | 10/32 [00:10<00:30,  1.40s/it]
[Acessing speaker spk_0 track 1 of 2:  34%|███▍      | 11/3





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   4%|▎         | 1/27 [00:02<00:52,  2.03s/it]
[Acessing speaker spk_1 track 1 of 1:   7%|▋         | 2/27 [00:07<01:46,  4.25s/it]
[Acessing speaker spk_1 track 1 of 1:  11%|█         | 3/27 [00:15<02:17,  5.72s/it]
[Acessing speaker spk_1 track 1 of 1:  15%|█▍        | 4/27 [00:21<02:14,  5.84s/it]
[Acessing speaker spk_1 track 1 of 1:  19%|█▊        | 5/27 [00:26<02:04,  5.66s/it]
[Acessing speaker spk_1 track 1 of 1:  22%|██▏       | 6/27 [00:29<01:40,  4.76s/it]
[Acessing speaker spk_1 track 1 of 1:  26%|██▌       | 7/27 [00:32<01:19,  3.97s/it]
[Acessing speaker spk_1 track 1 of 1:  30%|██▉       | 8/27 [00:37<01:25,  4.51s/it]
[Acessing speaker spk_1 track 1 of 1:  33%|███▎      | 9/27 [00:42<01:20,  4.49s/it]
[Acessing speaker spk_1 track 1 of 1:  37%|███▋      | 10/27 [00:43<00:57,  3.40s/it]
[Acessing speaker spk_1 track 1 of 1:  41%|████      | 11/2





[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/39 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   3%|▎         | 1/39 [00:01<01:01,  1.62s/it]
[Acessing speaker spk_2 track 1 of 1:   5%|▌         | 2/39 [00:02<00:45,  1.24s/it]
[Acessing speaker spk_2 track 1 of 1:   8%|▊         | 3/39 [00:05<01:07,  1.88s/it]
[Acessing speaker spk_2 track 1 of 1:  10%|█         | 4/39 [00:08<01:27,  2.49s/it]
[Acessing speaker spk_2 track 1 of 1:  13%|█▎        | 5/39 [00:09<01:02,  1.83s/it]
[Acessing speaker spk_2 track 1 of 1:  15%|█▌        | 6/39 [00:09<00:46,  1.40s/it]
[Acessing speaker spk_2 track 1 of 1:  18%|█▊        | 7/39 [00:11<00:49,  1.54s/it]
[Acessing speaker spk_2 track 1 of 1:  21%|██        | 8/39 [00:12<00:40,  1.31s/it]
[Acessing speaker spk_2 track 1 of 1:  23%|██▎       | 9/39 [00:14<00:40,  1.37s/it]
[Acessing speaker spk_2 track 1 of 1:  26%|██▌       | 10/39 [00:15<00:37,  1.31s/it]
[Acessing speaker spk_2 track 1 of 1:  28%|██▊       | 11/3





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/43 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   2%|▏         | 1/43 [00:00<00:29,  1.45it/s]
[Acessing speaker spk_3 track 1 of 1:   5%|▍         | 2/43 [00:01<00:42,  1.03s/it]
[Acessing speaker spk_3 track 1 of 1:   7%|▋         | 3/43 [00:02<00:33,  1.18it/s]
[Acessing speaker spk_3 track 1 of 1:   9%|▉         | 4/43 [00:03<00:26,  1.46it/s]
[Acessing speaker spk_3 track 1 of 1:  12%|█▏        | 5/43 [00:04<00:30,  1.26it/s]
[Acessing speaker spk_3 track 1 of 1:  14%|█▍        | 6/43 [00:07<00:57,  1.56s/it]
[Acessing speaker spk_3 track 1 of 1:  16%|█▋        | 7/43 [00:07<00:45,  1.27s/it]
[Acessing speaker spk_3 track 1 of 1:  19%|█▊        | 8/43 [00:09<00:49,  1.43s/it]
[Acessing speaker spk_3 track 1 of 1:  21%|██        | 9/43 [00:10<00:42,  1.24s/it]
[Acessing speaker spk_3 track 1 of 1:  23%|██▎       | 10/43 [00:12<00:47,  1.43s/it]
[Acessing speaker spk_3 track 1 of 1:  26%|██▌       | 11/4





[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/33 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   3%|▎         | 1/33 [00:01<01:01,  1.91s/it]
[Acessing speaker spk_4 track 1 of 1:   6%|▌         | 2/33 [00:02<00:39,  1.26s/it]
[Acessing speaker spk_4 track 1 of 1:   9%|▉         | 3/33 [00:03<00:28,  1.07it/s]
[Acessing speaker spk_4 track 1 of 1:  12%|█▏        | 4/33 [00:03<00:21,  1.38it/s]
[Acessing speaker spk_4 track 1 of 1:  15%|█▌        | 5/33 [00:05<00:31,  1.13s/it]
[Acessing speaker spk_4 track 1 of 1:  18%|█▊        | 6/33 [00:06<00:28,  1.05s/it]
[Acessing speaker spk_4 track 1 of 1:  21%|██        | 7/33 [00:09<00:40,  1.56s/it]
[Acessing speaker spk_4 track 1 of 1:  24%|██▍       | 8/33 [00:09<00:33,  1.33s/it]
[Acessing speaker spk_4 track 1 of 1:  27%|██▋       | 9/33 [00:11<00:34,  1.43s/it]
[Acessing speaker spk_4 track 1 of 1:  30%|███       | 10/33 [00:12<00:27,  1.19s/it]
[Acessing speaker spk_4 track 1 of 1:  33%|███▎      | 11/3


Starting inference for experiment: E66_bugfix_mdOn0p8_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E66_bugfix_mdOn0p8_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_54
  comment         = AVSR-Override: min_on=0.8s, min_off=0.8s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_54


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/29 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 2:   3%|▎         | 1/29 [00:00<00:26,  1.05it/s]
[Acessing speaker spk_0 track 1 of 2:   7%|▋         | 2/29 [00:01<00:19,  1.39it/s]
[Acessing speaker spk_0 track 1 of 2:  10%|█         | 3/29 [00:02<00:16,  1.57it/s]
[Acessing speaker spk_0 track 1 of 2:  14%|█▍        | 4/29 [00:02<00:13,  1.82it/s]
[Acessing speaker spk_0 track 1 of 2:  17%|█▋        | 5/29 [00:02<00:12,  1.88it/s]
[Acessing speaker spk_0 track 1 of 2:  21%|██        | 6/29 [00:04<00:17,  1.35it/s]
[Acessing speaker spk_0 track 1 of 2:  24%|██▍       | 7/29 [00:06<00:26,  1.19s/it]
[Acessing speaker spk_0 track 1 of 2:  28%|██▊       | 8/29 [00:06<00:20,  1.03it/s]
[Acessing speaker spk_0 track 1 of 2:  31%|███       | 9/29 [00:09<00:30,  1.54s/it]
[Acessing speaker spk_0 track 1 of 2:  34%|███▍      | 10/29 [00:11<00:29,  1.56s/it]
[Acessing speaker spk_0 track 1 of 2:  38%|███▊      | 11/2





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   4%|▎         | 1/27 [00:02<00:57,  2.21s/it]
[Acessing speaker spk_1 track 1 of 1:   7%|▋         | 2/27 [00:07<01:47,  4.30s/it]
[Acessing speaker spk_1 track 1 of 1:  11%|█         | 3/27 [00:15<02:18,  5.76s/it]
[Acessing speaker spk_1 track 1 of 1:  15%|█▍        | 4/27 [00:21<02:16,  5.95s/it]
[Acessing speaker spk_1 track 1 of 1:  19%|█▊        | 5/27 [00:26<02:04,  5.65s/it]
[Acessing speaker spk_1 track 1 of 1:  22%|██▏       | 6/27 [00:30<01:46,  5.06s/it]
[Acessing speaker spk_1 track 1 of 1:  26%|██▌       | 7/27 [00:32<01:21,  4.10s/it]
[Acessing speaker spk_1 track 1 of 1:  30%|██▉       | 8/27 [00:38<01:28,  4.65s/it]
[Acessing speaker spk_1 track 1 of 1:  33%|███▎      | 9/27 [00:43<01:22,  4.59s/it]
[Acessing speaker spk_1 track 1 of 1:  37%|███▋      | 10/27 [00:44<00:58,  3.47s/it]
[Acessing speaker spk_1 track 1 of 1:  41%|████      | 11/2





[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/37 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   3%|▎         | 1/37 [00:01<00:57,  1.59s/it]
[Acessing speaker spk_2 track 1 of 1:   5%|▌         | 2/37 [00:02<00:46,  1.33s/it]
[Acessing speaker spk_2 track 1 of 1:   8%|▊         | 3/37 [00:05<01:05,  1.93s/it]
[Acessing speaker spk_2 track 1 of 1:  11%|█         | 4/37 [00:08<01:23,  2.52s/it]
[Acessing speaker spk_2 track 1 of 1:  14%|█▎        | 5/37 [00:09<00:59,  1.85s/it]
[Acessing speaker spk_2 track 1 of 1:  16%|█▌        | 6/37 [00:10<00:43,  1.41s/it]
[Acessing speaker spk_2 track 1 of 1:  19%|█▉        | 7/37 [00:11<00:46,  1.54s/it]
[Acessing speaker spk_2 track 1 of 1:  22%|██▏       | 8/37 [00:14<00:53,  1.84s/it]
[Acessing speaker spk_2 track 1 of 1:  24%|██▍       | 9/37 [00:15<00:46,  1.65s/it]
[Acessing speaker spk_2 track 1 of 1:  27%|██▋       | 10/37 [00:16<00:38,  1.43s/it]
[Acessing speaker spk_2 track 1 of 1:  30%|██▉       | 11/3





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/41 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   2%|▏         | 1/41 [00:00<00:28,  1.43it/s]
[Acessing speaker spk_3 track 1 of 1:   5%|▍         | 2/41 [00:02<00:53,  1.37s/it]
[Acessing speaker spk_3 track 1 of 1:   7%|▋         | 3/41 [00:03<00:36,  1.03it/s]
[Acessing speaker spk_3 track 1 of 1:  10%|▉         | 4/41 [00:04<00:36,  1.02it/s]
[Acessing speaker spk_3 track 1 of 1:  12%|█▏        | 5/41 [00:05<00:38,  1.07s/it]
[Acessing speaker spk_3 track 1 of 1:  15%|█▍        | 6/41 [00:05<00:32,  1.08it/s]
[Acessing speaker spk_3 track 1 of 1:  17%|█▋        | 7/41 [00:07<00:40,  1.20s/it]
[Acessing speaker spk_3 track 1 of 1:  20%|█▉        | 8/41 [00:08<00:35,  1.08s/it]
[Acessing speaker spk_3 track 1 of 1:  22%|██▏       | 9/41 [00:10<00:42,  1.32s/it]
[Acessing speaker spk_3 track 1 of 1:  24%|██▍       | 10/41 [00:13<01:00,  1.94s/it]
[Acessing speaker spk_3 track 1 of 1:  27%|██▋       | 11/4





[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   3%|▎         | 1/31 [00:01<00:57,  1.92s/it]
[Acessing speaker spk_4 track 1 of 1:   6%|▋         | 2/31 [00:02<00:36,  1.27s/it]
[Acessing speaker spk_4 track 1 of 1:  10%|▉         | 3/31 [00:03<00:26,  1.07it/s]
[Acessing speaker spk_4 track 1 of 1:  13%|█▎        | 4/31 [00:03<00:19,  1.37it/s]
[Acessing speaker spk_4 track 1 of 1:  16%|█▌        | 5/31 [00:05<00:29,  1.14s/it]
[Acessing speaker spk_4 track 1 of 1:  19%|█▉        | 6/31 [00:06<00:26,  1.05s/it]
[Acessing speaker spk_4 track 1 of 1:  23%|██▎       | 7/31 [00:09<00:37,  1.57s/it]
[Acessing speaker spk_4 track 1 of 1:  26%|██▌       | 8/31 [00:09<00:30,  1.33s/it]
[Acessing speaker spk_4 track 1 of 1:  29%|██▉       | 9/31 [00:11<00:31,  1.44s/it]
[Acessing speaker spk_4 track 1 of 1:  32%|███▏      | 10/31 [00:12<00:25,  1.20s/it]
[Acessing speaker spk_4 track 1 of 1:  35%|███▌      | 11/3


Starting inference for experiment: E67_bugfix_mdOn0p8_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E67_bugfix_mdOn0p8_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_54
  comment         = AVSR-Override: min_on=0.8s, min_off=1.0s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_54


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/27 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 2:   4%|▎         | 1/27 [00:00<00:24,  1.04it/s]
[Acessing speaker spk_0 track 1 of 2:   7%|▋         | 2/27 [00:01<00:18,  1.39it/s]
[Acessing speaker spk_0 track 1 of 2:  11%|█         | 3/27 [00:02<00:15,  1.58it/s]
[Acessing speaker spk_0 track 1 of 2:  15%|█▍        | 4/27 [00:02<00:12,  1.82it/s]
[Acessing speaker spk_0 track 1 of 2:  19%|█▊        | 5/27 [00:03<00:19,  1.15it/s]
[Acessing speaker spk_0 track 1 of 2:  22%|██▏       | 6/27 [00:05<00:26,  1.28s/it]
[Acessing speaker spk_0 track 1 of 2:  26%|██▌       | 7/27 [00:06<00:20,  1.03s/it]
[Acessing speaker spk_0 track 1 of 2:  30%|██▉       | 8/27 [00:09<00:30,  1.59s/it]
[Acessing speaker spk_0 track 1 of 2:  33%|███▎      | 9/27 [00:10<00:28,  1.60s/it]
[Acessing speaker spk_0 track 1 of 2:  37%|███▋      | 10/27 [00:11<00:23,  1.40s/it]
[Acessing speaker spk_0 track 1 of 2:  41%|████      | 11/2





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   4%|▎         | 1/27 [00:02<00:58,  2.23s/it]
[Acessing speaker spk_1 track 1 of 1:   7%|▋         | 2/27 [00:07<01:45,  4.23s/it]
[Acessing speaker spk_1 track 1 of 1:  11%|█         | 3/27 [00:15<02:17,  5.71s/it]
[Acessing speaker spk_1 track 1 of 1:  15%|█▍        | 4/27 [00:21<02:15,  5.89s/it]
[Acessing speaker spk_1 track 1 of 1:  19%|█▊        | 5/27 [00:26<02:03,  5.60s/it]
[Acessing speaker spk_1 track 1 of 1:  22%|██▏       | 6/27 [00:30<01:45,  5.02s/it]
[Acessing speaker spk_1 track 1 of 1:  26%|██▌       | 7/27 [00:32<01:21,  4.07s/it]
[Acessing speaker spk_1 track 1 of 1:  30%|██▉       | 8/27 [00:38<01:28,  4.66s/it]
[Acessing speaker spk_1 track 1 of 1:  33%|███▎      | 9/27 [00:43<01:23,  4.62s/it]
[Acessing speaker spk_1 track 1 of 1:  37%|███▋      | 10/27 [00:44<00:59,  3.49s/it]
[Acessing speaker spk_1 track 1 of 1:  41%|████      | 11/2





[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/33 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   3%|▎         | 1/33 [00:01<00:51,  1.62s/it]
[Acessing speaker spk_2 track 1 of 1:   6%|▌         | 2/33 [00:02<00:38,  1.25s/it]
[Acessing speaker spk_2 track 1 of 1:   9%|▉         | 3/33 [00:05<00:56,  1.90s/it]
[Acessing speaker spk_2 track 1 of 1:  12%|█▏        | 4/33 [00:08<01:12,  2.51s/it]
[Acessing speaker spk_2 track 1 of 1:  15%|█▌        | 5/33 [00:09<00:51,  1.84s/it]
[Acessing speaker spk_2 track 1 of 1:  18%|█▊        | 6/33 [00:10<00:46,  1.74s/it]
[Acessing speaker spk_2 track 1 of 1:  21%|██        | 7/33 [00:13<00:52,  2.01s/it]
[Acessing speaker spk_2 track 1 of 1:  24%|██▍       | 8/33 [00:15<00:53,  2.16s/it]
[Acessing speaker spk_2 track 1 of 1:  27%|██▋       | 9/33 [00:17<00:44,  1.87s/it]
[Acessing speaker spk_2 track 1 of 1:  30%|███       | 10/33 [00:18<00:36,  1.58s/it]
[Acessing speaker spk_2 track 1 of 1:  33%|███▎      | 11/3





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/38 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   3%|▎         | 1/38 [00:00<00:25,  1.44it/s]
[Acessing speaker spk_3 track 1 of 1:   5%|▌         | 2/38 [00:02<00:48,  1.35s/it]
[Acessing speaker spk_3 track 1 of 1:   8%|▊         | 3/38 [00:02<00:33,  1.04it/s]
[Acessing speaker spk_3 track 1 of 1:  11%|█         | 4/38 [00:03<00:32,  1.03it/s]
[Acessing speaker spk_3 track 1 of 1:  13%|█▎        | 5/38 [00:05<00:34,  1.06s/it]
[Acessing speaker spk_3 track 1 of 1:  16%|█▌        | 6/38 [00:05<00:29,  1.08it/s]
[Acessing speaker spk_3 track 1 of 1:  18%|█▊        | 7/38 [00:07<00:37,  1.21s/it]
[Acessing speaker spk_3 track 1 of 1:  21%|██        | 8/38 [00:08<00:32,  1.09s/it]
[Acessing speaker spk_3 track 1 of 1:  24%|██▎       | 9/38 [00:10<00:39,  1.35s/it]
[Acessing speaker spk_3 track 1 of 1:  26%|██▋       | 10/38 [00:14<01:04,  2.29s/it]
[Acessing speaker spk_3 track 1 of 1:  29%|██▉       | 11/3





[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/30 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   3%|▎         | 1/30 [00:01<00:55,  1.93s/it]
[Acessing speaker spk_4 track 1 of 1:   7%|▋         | 2/30 [00:02<00:35,  1.27s/it]
[Acessing speaker spk_4 track 1 of 1:  10%|█         | 3/30 [00:03<00:25,  1.07it/s]
[Acessing speaker spk_4 track 1 of 1:  13%|█▎        | 4/30 [00:03<00:18,  1.38it/s]
[Acessing speaker spk_4 track 1 of 1:  17%|█▋        | 5/30 [00:05<00:26,  1.07s/it]
[Acessing speaker spk_4 track 1 of 1:  20%|██        | 6/30 [00:06<00:24,  1.01s/it]
[Acessing speaker spk_4 track 1 of 1:  23%|██▎       | 7/30 [00:08<00:35,  1.54s/it]
[Acessing speaker spk_4 track 1 of 1:  27%|██▋       | 8/30 [00:09<00:29,  1.32s/it]
[Acessing speaker spk_4 track 1 of 1:  30%|███       | 9/30 [00:11<00:30,  1.43s/it]
[Acessing speaker spk_4 track 1 of 1:  33%|███▎      | 10/30 [00:12<00:23,  1.20s/it]
[Acessing speaker spk_4 track 1 of 1:  37%|███▋      | 11/3


Starting inference for experiment: E68_bugfix_mdOn0p8_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E68_bugfix_mdOn0p8_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_54
  comment         = AVSR-Override: min_on=0.8s, min_off=1.2s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_54


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/27 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 2:   4%|▎         | 1/27 [00:00<00:24,  1.05it/s]
[Acessing speaker spk_0 track 1 of 2:   7%|▋         | 2/27 [00:01<00:18,  1.39it/s]
[Acessing speaker spk_0 track 1 of 2:  11%|█         | 3/27 [00:02<00:15,  1.58it/s]
[Acessing speaker spk_0 track 1 of 2:  15%|█▍        | 4/27 [00:02<00:12,  1.82it/s]
[Acessing speaker spk_0 track 1 of 2:  19%|█▊        | 5/27 [00:05<00:33,  1.54s/it]
[Acessing speaker spk_0 track 1 of 2:  22%|██▏       | 6/27 [00:07<00:36,  1.72s/it]
[Acessing speaker spk_0 track 1 of 2:  26%|██▌       | 7/27 [00:08<00:26,  1.32s/it]
[Acessing speaker spk_0 track 1 of 2:  30%|██▉       | 8/27 [00:11<00:34,  1.82s/it]
[Acessing speaker spk_0 track 1 of 2:  33%|███▎      | 9/27 [00:12<00:31,  1.75s/it]
[Acessing speaker spk_0 track 1 of 2:  37%|███▋      | 10/27 [00:13<00:25,  1.51s/it]
[Acessing speaker spk_0 track 1 of 2:  41%|████      | 11/2





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   4%|▎         | 1/27 [00:03<01:41,  3.90s/it]
[Acessing speaker spk_1 track 1 of 1:   7%|▋         | 2/27 [00:09<02:02,  4.92s/it]
[Acessing speaker spk_1 track 1 of 1:  11%|█         | 3/27 [00:17<02:28,  6.19s/it]
[Acessing speaker spk_1 track 1 of 1:  15%|█▍        | 4/27 [00:23<02:20,  6.11s/it]
[Acessing speaker spk_1 track 1 of 1:  19%|█▊        | 5/27 [00:28<02:06,  5.75s/it]
[Acessing speaker spk_1 track 1 of 1:  22%|██▏       | 6/27 [00:32<01:47,  5.13s/it]
[Acessing speaker spk_1 track 1 of 1:  26%|██▌       | 7/27 [00:34<01:24,  4.20s/it]
[Acessing speaker spk_1 track 1 of 1:  30%|██▉       | 8/27 [00:40<01:28,  4.67s/it]
[Acessing speaker spk_1 track 1 of 1:  33%|███▎      | 9/27 [00:44<01:23,  4.66s/it]
[Acessing speaker spk_1 track 1 of 1:  37%|███▋      | 10/27 [00:45<01:00,  3.53s/it]
[Acessing speaker spk_1 track 1 of 1:  41%|████      | 11/2





[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   3%|▎         | 1/31 [00:02<01:23,  2.77s/it]
[Acessing speaker spk_2 track 1 of 1:   6%|▋         | 2/31 [00:05<01:16,  2.64s/it]
[Acessing speaker spk_2 track 1 of 1:  10%|▉         | 3/31 [00:08<01:23,  2.99s/it]
[Acessing speaker spk_2 track 1 of 1:  13%|█▎        | 4/31 [00:09<00:55,  2.06s/it]
[Acessing speaker spk_2 track 1 of 1:  16%|█▌        | 5/31 [00:11<00:57,  2.19s/it]
[Acessing speaker spk_2 track 1 of 1:  19%|█▉        | 6/31 [00:14<00:59,  2.36s/it]
[Acessing speaker spk_2 track 1 of 1:  23%|██▎       | 7/31 [00:15<00:47,  2.00s/it]
[Acessing speaker spk_2 track 1 of 1:  26%|██▌       | 8/31 [00:16<00:38,  1.65s/it]
[Acessing speaker spk_2 track 1 of 1:  29%|██▉       | 9/31 [00:17<00:28,  1.31s/it]
[Acessing speaker spk_2 track 1 of 1:  32%|███▏      | 10/31 [00:18<00:30,  1.44s/it]
[Acessing speaker spk_2 track 1 of 1:  35%|███▌      | 11/3





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/38 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   3%|▎         | 1/38 [00:00<00:25,  1.43it/s]
[Acessing speaker spk_3 track 1 of 1:   5%|▌         | 2/38 [00:02<00:48,  1.35s/it]
[Acessing speaker spk_3 track 1 of 1:   8%|▊         | 3/38 [00:02<00:33,  1.04it/s]
[Acessing speaker spk_3 track 1 of 1:  11%|█         | 4/38 [00:03<00:32,  1.03it/s]
[Acessing speaker spk_3 track 1 of 1:  13%|█▎        | 5/38 [00:05<00:34,  1.06s/it]
[Acessing speaker spk_3 track 1 of 1:  16%|█▌        | 6/38 [00:05<00:29,  1.08it/s]
[Acessing speaker spk_3 track 1 of 1:  18%|█▊        | 7/38 [00:07<00:37,  1.20s/it]
[Acessing speaker spk_3 track 1 of 1:  21%|██        | 8/38 [00:08<00:33,  1.13s/it]
[Acessing speaker spk_3 track 1 of 1:  24%|██▎       | 9/38 [00:10<00:39,  1.36s/it]
[Acessing speaker spk_3 track 1 of 1:  26%|██▋       | 10/38 [00:13<00:55,  1.97s/it]
[Acessing speaker spk_3 track 1 of 1:  29%|██▉       | 11/3





[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/30 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   3%|▎         | 1/30 [00:01<00:56,  1.95s/it]
[Acessing speaker spk_4 track 1 of 1:   7%|▋         | 2/30 [00:02<00:35,  1.28s/it]
[Acessing speaker spk_4 track 1 of 1:  10%|█         | 3/30 [00:03<00:25,  1.06it/s]
[Acessing speaker spk_4 track 1 of 1:  13%|█▎        | 4/30 [00:03<00:19,  1.37it/s]
[Acessing speaker spk_4 track 1 of 1:  17%|█▋        | 5/30 [00:05<00:27,  1.08s/it]
[Acessing speaker spk_4 track 1 of 1:  20%|██        | 6/30 [00:06<00:24,  1.02s/it]
[Acessing speaker spk_4 track 1 of 1:  23%|██▎       | 7/30 [00:08<00:35,  1.55s/it]
[Acessing speaker spk_4 track 1 of 1:  27%|██▋       | 8/30 [00:09<00:29,  1.32s/it]
[Acessing speaker spk_4 track 1 of 1:  30%|███       | 9/30 [00:11<00:31,  1.51s/it]
[Acessing speaker spk_4 track 1 of 1:  33%|███▎      | 10/30 [00:12<00:24,  1.25s/it]
[Acessing speaker spk_4 track 1 of 1:  37%|███▋      | 11/3


Starting inference for experiment: E69_bugfix_mdOn1p0_mdOff0p5_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E69_bugfix_mdOn1p0_mdOff0p5_bs12_len20
  session_dir     = data-bin/dev/session_54
  comment         = AVSR-Override: min_on=1.0s, min_off=0.5s (ASD chunks only)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_54


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 2:   3%|▎         | 1/31 [00:00<00:29,  1.03it/s]
[Acessing speaker spk_0 track 1 of 2:   6%|▋         | 2/31 [00:01<00:21,  1.37it/s]
[Acessing speaker spk_0 track 1 of 2:  10%|▉         | 3/31 [00:02<00:17,  1.56it/s]
[Acessing speaker spk_0 track 1 of 2:  13%|█▎        | 4/31 [00:02<00:14,  1.80it/s]
[Acessing speaker spk_0 track 1 of 2:  16%|█▌        | 5/31 [00:02<00:13,  1.87it/s]
[Acessing speaker spk_0 track 1 of 2:  19%|█▉        | 6/31 [00:04<00:18,  1.34it/s]
[Acessing speaker spk_0 track 1 of 2:  23%|██▎       | 7/31 [00:06<00:28,  1.19s/it]
[Acessing speaker spk_0 track 1 of 2:  26%|██▌       | 8/31 [00:08<00:38,  1.68s/it]
[Acessing speaker spk_0 track 1 of 2:  29%|██▉       | 9/31 [00:09<00:32,  1.47s/it]
[Acessing speaker spk_0 track 1 of 2:  32%|███▏      | 10/31 [00:10<00:25,  1.21s/it]
[Acessing speaker spk_0 track 1 of 2:  35%|███▌      | 11/3





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   4%|▎         | 1/27 [00:02<00:52,  2.01s/it]
[Acessing speaker spk_1 track 1 of 1:   7%|▋         | 2/27 [00:07<01:47,  4.31s/it]
[Acessing speaker spk_1 track 1 of 1:  11%|█         | 3/27 [00:15<02:19,  5.82s/it]
[Acessing speaker spk_1 track 1 of 1:  15%|█▍        | 4/27 [00:21<02:16,  5.92s/it]
[Acessing speaker spk_1 track 1 of 1:  19%|█▊        | 5/27 [00:26<02:05,  5.70s/it]
[Acessing speaker spk_1 track 1 of 1:  22%|██▏       | 6/27 [00:29<01:40,  4.79s/it]
[Acessing speaker spk_1 track 1 of 1:  26%|██▌       | 7/27 [00:32<01:19,  3.99s/it]
[Acessing speaker spk_1 track 1 of 1:  30%|██▉       | 8/27 [00:38<01:30,  4.77s/it]
[Acessing speaker spk_1 track 1 of 1:  33%|███▎      | 9/27 [00:43<01:24,  4.68s/it]
[Acessing speaker spk_1 track 1 of 1:  37%|███▋      | 10/27 [00:44<01:00,  3.53s/it]
[Acessing speaker spk_1 track 1 of 1:  41%|████      | 11/2





[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/38 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   3%|▎         | 1/38 [00:01<00:59,  1.61s/it]
[Acessing speaker spk_2 track 1 of 1:   5%|▌         | 2/38 [00:02<00:44,  1.23s/it]
[Acessing speaker spk_2 track 1 of 1:   8%|▊         | 3/38 [00:05<01:05,  1.88s/it]
[Acessing speaker spk_2 track 1 of 1:  11%|█         | 4/38 [00:08<01:24,  2.49s/it]
[Acessing speaker spk_2 track 1 of 1:  13%|█▎        | 5/38 [00:09<01:00,  1.83s/it]
[Acessing speaker spk_2 track 1 of 1:  16%|█▌        | 6/38 [00:09<00:44,  1.40s/it]
[Acessing speaker spk_2 track 1 of 1:  18%|█▊        | 7/38 [00:11<00:47,  1.53s/it]
[Acessing speaker spk_2 track 1 of 1:  21%|██        | 8/38 [00:12<00:39,  1.31s/it]
[Acessing speaker spk_2 track 1 of 1:  24%|██▎       | 9/38 [00:14<00:41,  1.42s/it]
[Acessing speaker spk_2 track 1 of 1:  26%|██▋       | 10/38 [00:16<00:46,  1.67s/it]
[Acessing speaker spk_2 track 1 of 1:  29%|██▉       | 11/3





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/43 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   2%|▏         | 1/43 [00:00<00:29,  1.44it/s]
[Acessing speaker spk_3 track 1 of 1:   5%|▍         | 2/43 [00:01<00:42,  1.03s/it]
[Acessing speaker spk_3 track 1 of 1:   7%|▋         | 3/43 [00:02<00:34,  1.18it/s]
[Acessing speaker spk_3 track 1 of 1:   9%|▉         | 4/43 [00:03<00:26,  1.46it/s]
[Acessing speaker spk_3 track 1 of 1:  12%|█▏        | 5/43 [00:04<00:30,  1.26it/s]
[Acessing speaker spk_3 track 1 of 1:  14%|█▍        | 6/43 [00:05<00:34,  1.07it/s]
[Acessing speaker spk_3 track 1 of 1:  16%|█▋        | 7/43 [00:05<00:30,  1.17it/s]
[Acessing speaker spk_3 track 1 of 1:  19%|█▊        | 8/43 [00:07<00:39,  1.14s/it]
[Acessing speaker spk_3 track 1 of 1:  21%|██        | 9/43 [00:08<00:35,  1.04s/it]
[Acessing speaker spk_3 track 1 of 1:  23%|██▎       | 10/43 [00:10<00:42,  1.29s/it]
[Acessing speaker spk_3 track 1 of 1:  26%|██▌       | 11/4





[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/31 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   3%|▎         | 1/31 [00:02<01:02,  2.09s/it]
[Acessing speaker spk_4 track 1 of 1:   6%|▋         | 2/31 [00:02<00:38,  1.34s/it]
[Acessing speaker spk_4 track 1 of 1:  10%|▉         | 3/31 [00:03<00:27,  1.03it/s]
[Acessing speaker spk_4 track 1 of 1:  13%|█▎        | 4/31 [00:03<00:20,  1.33it/s]
[Acessing speaker spk_4 track 1 of 1:  16%|█▌        | 5/31 [00:05<00:28,  1.09s/it]
[Acessing speaker spk_4 track 1 of 1:  19%|█▉        | 6/31 [00:06<00:25,  1.02s/it]
[Acessing speaker spk_4 track 1 of 1:  23%|██▎       | 7/31 [00:09<00:37,  1.54s/it]
[Acessing speaker spk_4 track 1 of 1:  26%|██▌       | 8/31 [00:09<00:30,  1.32s/it]
[Acessing speaker spk_4 track 1 of 1:  29%|██▉       | 9/31 [00:11<00:31,  1.42s/it]
[Acessing speaker spk_4 track 1 of 1:  32%|███▏      | 10/31 [00:12<00:24,  1.19s/it]
[Acessing speaker spk_4 track 1 of 1:  35%|███▌      | 11/3


Starte Inference für Experiment: E70_bugfix_mdOn1p0_mdOff0p8_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E70_bugfix_mdOn1p0_mdOff0p8_bs12_len20
  session_dir     = data-bin/dev/session_54
  comment         = AVSR-Override: min_on=1.0s, min_off=0.8s (nur ASD-Chunks)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_54


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/28 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 2:   4%|▎         | 1/28 [00:00<00:25,  1.04it/s]
[Acessing speaker spk_0 track 1 of 2:   7%|▋         | 2/28 [00:01<00:18,  1.38it/s]
[Acessing speaker spk_0 track 1 of 2:  11%|█         | 3/28 [00:02<00:16,  1.55it/s]
[Acessing speaker spk_0 track 1 of 2:  14%|█▍        | 4/28 [00:02<00:13,  1.80it/s]
[Acessing speaker spk_0 track 1 of 2:  18%|█▊        | 5/28 [00:02<00:12,  1.87it/s]
[Acessing speaker spk_0 track 1 of 2:  21%|██▏       | 6/28 [00:04<00:16,  1.34it/s]
[Acessing speaker spk_0 track 1 of 2:  25%|██▌       | 7/28 [00:06<00:25,  1.20s/it]
[Acessing speaker spk_0 track 1 of 2:  29%|██▊       | 8/28 [00:09<00:34,  1.72s/it]
[Acessing speaker spk_0 track 1 of 2:  32%|███▏      | 9/28 [00:10<00:32,  1.69s/it]
[Acessing speaker spk_0 track 1 of 2:  36%|███▌      | 10/28 [00:11<00:26,  1.46s/it]
[Acessing speaker spk_0 track 1 of 2:  39%|███▉      | 11/2





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   4%|▎         | 1/27 [00:02<00:52,  2.02s/it]
[Acessing speaker spk_1 track 1 of 1:   7%|▋         | 2/27 [00:07<01:46,  4.27s/it]
[Acessing speaker spk_1 track 1 of 1:  11%|█         | 3/27 [00:17<02:39,  6.66s/it]
[Acessing speaker spk_1 track 1 of 1:  15%|█▍        | 4/27 [00:23<02:28,  6.47s/it]
[Acessing speaker spk_1 track 1 of 1:  19%|█▊        | 5/27 [00:28<02:13,  6.09s/it]
[Acessing speaker spk_1 track 1 of 1:  22%|██▏       | 6/27 [00:32<01:52,  5.35s/it]
[Acessing speaker spk_1 track 1 of 1:  26%|██▌       | 7/27 [00:34<01:25,  4.29s/it]
[Acessing speaker spk_1 track 1 of 1:  30%|██▉       | 8/27 [00:40<01:29,  4.73s/it]
[Acessing speaker spk_1 track 1 of 1:  33%|███▎      | 9/27 [00:45<01:24,  4.70s/it]
[Acessing speaker spk_1 track 1 of 1:  37%|███▋      | 10/27 [00:46<01:00,  3.55s/it]
[Acessing speaker spk_1 track 1 of 1:  41%|████      | 11/2





[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/36 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   3%|▎         | 1/36 [00:01<00:56,  1.61s/it]
[Acessing speaker spk_2 track 1 of 1:   6%|▌         | 2/36 [00:02<00:42,  1.24s/it]
[Acessing speaker spk_2 track 1 of 1:   8%|▊         | 3/36 [00:05<01:04,  1.96s/it]
[Acessing speaker spk_2 track 1 of 1:  11%|█         | 4/36 [00:08<01:20,  2.52s/it]
[Acessing speaker spk_2 track 1 of 1:  14%|█▍        | 5/36 [00:09<00:57,  1.85s/it]
[Acessing speaker spk_2 track 1 of 1:  17%|█▋        | 6/36 [00:10<00:42,  1.42s/it]
[Acessing speaker spk_2 track 1 of 1:  19%|█▉        | 7/36 [00:11<00:44,  1.54s/it]
[Acessing speaker spk_2 track 1 of 1:  22%|██▏       | 8/36 [00:14<00:51,  1.83s/it]
[Acessing speaker spk_2 track 1 of 1:  25%|██▌       | 9/36 [00:15<00:44,  1.65s/it]
[Acessing speaker spk_2 track 1 of 1:  28%|██▊       | 10/36 [00:16<00:37,  1.42s/it]
[Acessing speaker spk_2 track 1 of 1:  31%|███       | 11/3





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/41 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   2%|▏         | 1/41 [00:00<00:27,  1.43it/s]
[Acessing speaker spk_3 track 1 of 1:   5%|▍         | 2/41 [00:02<00:53,  1.36s/it]
[Acessing speaker spk_3 track 1 of 1:   7%|▋         | 3/41 [00:03<00:36,  1.03it/s]
[Acessing speaker spk_3 track 1 of 1:  10%|▉         | 4/41 [00:04<00:36,  1.03it/s]
[Acessing speaker spk_3 track 1 of 1:  12%|█▏        | 5/41 [00:05<00:38,  1.06s/it]
[Acessing speaker spk_3 track 1 of 1:  15%|█▍        | 6/41 [00:05<00:32,  1.07it/s]
[Acessing speaker spk_3 track 1 of 1:  17%|█▋        | 7/41 [00:07<00:40,  1.20s/it]
[Acessing speaker spk_3 track 1 of 1:  20%|█▉        | 8/41 [00:08<00:35,  1.08s/it]
[Acessing speaker spk_3 track 1 of 1:  22%|██▏       | 9/41 [00:10<00:42,  1.33s/it]
[Acessing speaker spk_3 track 1 of 1:  24%|██▍       | 10/41 [00:13<01:00,  1.95s/it]
[Acessing speaker spk_3 track 1 of 1:  27%|██▋       | 11/4





[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/29 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   3%|▎         | 1/29 [00:01<00:55,  1.97s/it]
[Acessing speaker spk_4 track 1 of 1:   7%|▋         | 2/29 [00:02<00:34,  1.29s/it]
[Acessing speaker spk_4 track 1 of 1:  10%|█         | 3/29 [00:03<00:24,  1.05it/s]
[Acessing speaker spk_4 track 1 of 1:  14%|█▍        | 4/29 [00:03<00:18,  1.35it/s]
[Acessing speaker spk_4 track 1 of 1:  17%|█▋        | 5/29 [00:05<00:25,  1.08s/it]
[Acessing speaker spk_4 track 1 of 1:  21%|██        | 6/29 [00:06<00:24,  1.08s/it]
[Acessing speaker spk_4 track 1 of 1:  24%|██▍       | 7/29 [00:09<00:35,  1.60s/it]
[Acessing speaker spk_4 track 1 of 1:  28%|██▊       | 8/29 [00:10<00:28,  1.36s/it]
[Acessing speaker spk_4 track 1 of 1:  31%|███       | 9/29 [00:11<00:29,  1.45s/it]
[Acessing speaker spk_4 track 1 of 1:  34%|███▍      | 10/29 [00:12<00:23,  1.21s/it]
[Acessing speaker spk_4 track 1 of 1:  38%|███▊      | 11/2


Starte Inference für Experiment: E71_bugfix_mdOn1p0_mdOff1p0_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E71_bugfix_mdOn1p0_mdOff1p0_bs12_len20
  session_dir     = data-bin/dev/session_54
  comment         = AVSR-Override: min_on=1.0s, min_off=1.0s (nur ASD-Chunks)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_54


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/26 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 2:   4%|▍         | 1/26 [00:00<00:23,  1.05it/s]
[Acessing speaker spk_0 track 1 of 2:   8%|▊         | 2/26 [00:01<00:17,  1.39it/s]
[Acessing speaker spk_0 track 1 of 2:  12%|█▏        | 3/26 [00:02<00:14,  1.57it/s]
[Acessing speaker spk_0 track 1 of 2:  15%|█▌        | 4/26 [00:02<00:12,  1.82it/s]
[Acessing speaker spk_0 track 1 of 2:  19%|█▉        | 5/26 [00:03<00:18,  1.13it/s]
[Acessing speaker spk_0 track 1 of 2:  23%|██▎       | 6/26 [00:06<00:25,  1.30s/it]
[Acessing speaker spk_0 track 1 of 2:  27%|██▋       | 7/26 [00:08<00:33,  1.77s/it]
[Acessing speaker spk_0 track 1 of 2:  31%|███       | 8/26 [00:10<00:30,  1.72s/it]
[Acessing speaker spk_0 track 1 of 2:  35%|███▍      | 9/26 [00:11<00:25,  1.48s/it]
[Acessing speaker spk_0 track 1 of 2:  38%|███▊      | 10/26 [00:12<00:19,  1.23s/it]
[Acessing speaker spk_0 track 1 of 2:  42%|████▏     | 11/2





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   4%|▎         | 1/27 [00:02<00:57,  2.20s/it]
[Acessing speaker spk_1 track 1 of 1:   7%|▋         | 2/27 [00:07<01:46,  4.24s/it]
[Acessing speaker spk_1 track 1 of 1:  11%|█         | 3/27 [00:15<02:17,  5.72s/it]
[Acessing speaker spk_1 track 1 of 1:  15%|█▍        | 4/27 [00:21<02:15,  5.91s/it]
[Acessing speaker spk_1 track 1 of 1:  19%|█▊        | 5/27 [00:26<02:03,  5.62s/it]
[Acessing speaker spk_1 track 1 of 1:  22%|██▏       | 6/27 [00:30<01:46,  5.05s/it]
[Acessing speaker spk_1 track 1 of 1:  26%|██▌       | 7/27 [00:32<01:21,  4.10s/it]
[Acessing speaker spk_1 track 1 of 1:  30%|██▉       | 8/27 [00:38<01:28,  4.67s/it]
[Acessing speaker spk_1 track 1 of 1:  33%|███▎      | 9/27 [00:43<01:22,  4.61s/it]
[Acessing speaker spk_1 track 1 of 1:  37%|███▋      | 10/27 [00:44<00:59,  3.48s/it]
[Acessing speaker spk_1 track 1 of 1:  41%|████      | 11/2





[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/32 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   3%|▎         | 1/32 [00:01<00:49,  1.59s/it]
[Acessing speaker spk_2 track 1 of 1:   6%|▋         | 2/32 [00:02<00:40,  1.36s/it]
[Acessing speaker spk_2 track 1 of 1:   9%|▉         | 3/32 [00:05<00:56,  1.96s/it]
[Acessing speaker spk_2 track 1 of 1:  12%|█▎        | 4/32 [00:08<01:11,  2.55s/it]
[Acessing speaker spk_2 track 1 of 1:  16%|█▌        | 5/32 [00:09<00:50,  1.87s/it]
[Acessing speaker spk_2 track 1 of 1:  19%|█▉        | 6/32 [00:10<00:37,  1.43s/it]
[Acessing speaker spk_2 track 1 of 1:  22%|██▏       | 7/32 [00:11<00:38,  1.55s/it]
[Acessing speaker spk_2 track 1 of 1:  25%|██▌       | 8/32 [00:14<00:44,  1.85s/it]
[Acessing speaker spk_2 track 1 of 1:  28%|██▊       | 9/32 [00:15<00:38,  1.65s/it]
[Acessing speaker spk_2 track 1 of 1:  31%|███▏      | 10/32 [00:16<00:31,  1.43s/it]
[Acessing speaker spk_2 track 1 of 1:  34%|███▍      | 11/3





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/38 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   3%|▎         | 1/38 [00:00<00:25,  1.43it/s]
[Acessing speaker spk_3 track 1 of 1:   5%|▌         | 2/38 [00:02<00:48,  1.36s/it]
[Acessing speaker spk_3 track 1 of 1:   8%|▊         | 3/38 [00:03<00:33,  1.04it/s]
[Acessing speaker spk_3 track 1 of 1:  11%|█         | 4/38 [00:03<00:33,  1.03it/s]
[Acessing speaker spk_3 track 1 of 1:  13%|█▎        | 5/38 [00:05<00:35,  1.07s/it]
[Acessing speaker spk_3 track 1 of 1:  16%|█▌        | 6/38 [00:05<00:29,  1.07it/s]
[Acessing speaker spk_3 track 1 of 1:  18%|█▊        | 7/38 [00:07<00:37,  1.20s/it]
[Acessing speaker spk_3 track 1 of 1:  21%|██        | 8/38 [00:08<00:32,  1.08s/it]
[Acessing speaker spk_3 track 1 of 1:  24%|██▎       | 9/38 [00:10<00:38,  1.33s/it]
[Acessing speaker spk_3 track 1 of 1:  26%|██▋       | 10/38 [00:13<00:54,  1.93s/it]
[Acessing speaker spk_3 track 1 of 1:  29%|██▉       | 11/3





[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/28 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   4%|▎         | 1/28 [00:02<00:56,  2.10s/it]
[Acessing speaker spk_4 track 1 of 1:   7%|▋         | 2/28 [00:02<00:34,  1.35s/it]
[Acessing speaker spk_4 track 1 of 1:  11%|█         | 3/28 [00:03<00:24,  1.02it/s]
[Acessing speaker spk_4 track 1 of 1:  14%|█▍        | 4/28 [00:03<00:18,  1.33it/s]
[Acessing speaker spk_4 track 1 of 1:  18%|█▊        | 5/28 [00:05<00:25,  1.09s/it]
[Acessing speaker spk_4 track 1 of 1:  21%|██▏       | 6/28 [00:06<00:22,  1.02s/it]
[Acessing speaker spk_4 track 1 of 1:  25%|██▌       | 7/28 [00:10<00:44,  2.13s/it]
[Acessing speaker spk_4 track 1 of 1:  29%|██▊       | 8/28 [00:11<00:34,  1.74s/it]
[Acessing speaker spk_4 track 1 of 1:  32%|███▏      | 9/28 [00:13<00:32,  1.71s/it]
[Acessing speaker spk_4 track 1 of 1:  36%|███▌      | 10/28 [00:14<00:25,  1.39s/it]
[Acessing speaker spk_4 track 1 of 1:  39%|███▉      | 11/2


Starte Inference für Experiment: E72_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  base_model      = avsr_cocktail_finetuned
  model_type      = avsr_cocktail
  checkpoint_path = model-bin/avsr_cocktail_mcorec_finetune
  beam_size       = 12
  max_length      = 20
  output_dir_name = output_E72_bugfix_mdOn1p0_mdOff1p2_bs12_len20
  session_dir     = data-bin/dev/session_54
  comment         = AVSR-Override: min_on=1.0s, min_off=1.2s (nur ASD-Chunks)
Loading avsr_cocktail model...
Loading model from model-bin/avsr_cocktail_mcorec_finetune
avsr_cocktail model loaded successfully!
Inferring 1 sessions using avsr_cocktail model
Processing session session_54


Processing speakers:   0%|          | 0/5 [00:00<?, ?it/s]





[Acessing speaker spk_0 track 1 of 2:   0%|          | 0/26 [00:00<?, ?it/s]
[Acessing speaker spk_0 track 1 of 2:   4%|▍         | 1/26 [00:00<00:23,  1.05it/s]
[Acessing speaker spk_0 track 1 of 2:   8%|▊         | 2/26 [00:01<00:17,  1.40it/s]
[Acessing speaker spk_0 track 1 of 2:  12%|█▏        | 3/26 [00:02<00:14,  1.58it/s]
[Acessing speaker spk_0 track 1 of 2:  15%|█▌        | 4/26 [00:02<00:12,  1.83it/s]
[Acessing speaker spk_0 track 1 of 2:  19%|█▉        | 5/26 [00:03<00:18,  1.16it/s]
[Acessing speaker spk_0 track 1 of 2:  23%|██▎       | 6/26 [00:05<00:25,  1.28s/it]
[Acessing speaker spk_0 track 1 of 2:  27%|██▋       | 7/26 [00:08<00:33,  1.76s/it]
[Acessing speaker spk_0 track 1 of 2:  31%|███       | 8/26 [00:10<00:30,  1.71s/it]
[Acessing speaker spk_0 track 1 of 2:  35%|███▍      | 9/26 [00:11<00:25,  1.48s/it]
[Acessing speaker spk_0 track 1 of 2:  38%|███▊      | 10/26 [00:11<00:19,  1.22s/it]
[Acessing speaker spk_0 track 1 of 2:  42%|████▏     | 11/2





[Acessing speaker spk_1 track 1 of 1:   0%|          | 0/27 [00:00<?, ?it/s]
[Acessing speaker spk_1 track 1 of 1:   4%|▎         | 1/27 [00:01<00:51,  1.99s/it]
[Acessing speaker spk_1 track 1 of 1:   7%|▋         | 2/27 [00:07<01:43,  4.15s/it]
[Acessing speaker spk_1 track 1 of 1:  11%|█         | 3/27 [00:15<02:16,  5.67s/it]
[Acessing speaker spk_1 track 1 of 1:  15%|█▍        | 4/27 [00:22<02:24,  6.28s/it]
[Acessing speaker spk_1 track 1 of 1:  19%|█▊        | 5/27 [00:27<02:09,  5.88s/it]
[Acessing speaker spk_1 track 1 of 1:  22%|██▏       | 6/27 [00:31<01:49,  5.21s/it]
[Acessing speaker spk_1 track 1 of 1:  26%|██▌       | 7/27 [00:33<01:25,  4.26s/it]
[Acessing speaker spk_1 track 1 of 1:  30%|██▉       | 8/27 [00:39<01:29,  4.69s/it]
[Acessing speaker spk_1 track 1 of 1:  33%|███▎      | 9/27 [00:43<01:23,  4.62s/it]
[Acessing speaker spk_1 track 1 of 1:  37%|███▋      | 10/27 [00:44<00:59,  3.49s/it]
[Acessing speaker spk_1 track 1 of 1:  41%|████      | 11/2





[Acessing speaker spk_2 track 1 of 1:   0%|          | 0/30 [00:00<?, ?it/s]
[Acessing speaker spk_2 track 1 of 1:   3%|▎         | 1/30 [00:04<02:14,  4.64s/it]
[Acessing speaker spk_2 track 1 of 1:   7%|▋         | 2/30 [00:07<01:35,  3.41s/it]
[Acessing speaker spk_2 track 1 of 1:  10%|█         | 3/30 [00:10<01:31,  3.40s/it]
[Acessing speaker spk_2 track 1 of 1:  13%|█▎        | 4/30 [00:11<01:00,  2.32s/it]
[Acessing speaker spk_2 track 1 of 1:  17%|█▋        | 5/30 [00:13<00:58,  2.35s/it]
[Acessing speaker spk_2 track 1 of 1:  20%|██        | 6/30 [00:16<00:57,  2.39s/it]
[Acessing speaker spk_2 track 1 of 1:  23%|██▎       | 7/30 [00:17<00:46,  2.01s/it]
[Acessing speaker spk_2 track 1 of 1:  27%|██▋       | 8/30 [00:18<00:36,  1.67s/it]
[Acessing speaker spk_2 track 1 of 1:  30%|███       | 9/30 [00:18<00:27,  1.32s/it]
[Acessing speaker spk_2 track 1 of 1:  33%|███▎      | 10/30 [00:20<00:29,  1.50s/it]
[Acessing speaker spk_2 track 1 of 1:  37%|███▋      | 11/3





[Acessing speaker spk_3 track 1 of 1:   0%|          | 0/38 [00:00<?, ?it/s]
[Acessing speaker spk_3 track 1 of 1:   3%|▎         | 1/38 [00:00<00:25,  1.44it/s]
[Acessing speaker spk_3 track 1 of 1:   5%|▌         | 2/38 [00:02<00:48,  1.35s/it]
[Acessing speaker spk_3 track 1 of 1:   8%|▊         | 3/38 [00:03<00:33,  1.04it/s]
[Acessing speaker spk_3 track 1 of 1:  11%|█         | 4/38 [00:04<00:33,  1.03it/s]
[Acessing speaker spk_3 track 1 of 1:  13%|█▎        | 5/38 [00:05<00:35,  1.07s/it]
[Acessing speaker spk_3 track 1 of 1:  16%|█▌        | 6/38 [00:07<00:42,  1.32s/it]
[Acessing speaker spk_3 track 1 of 1:  18%|█▊        | 7/38 [00:09<00:52,  1.68s/it]
[Acessing speaker spk_3 track 1 of 1:  21%|██        | 8/38 [00:10<00:42,  1.41s/it]
[Acessing speaker spk_3 track 1 of 1:  24%|██▎       | 9/38 [00:12<00:45,  1.55s/it]
[Acessing speaker spk_3 track 1 of 1:  26%|██▋       | 10/38 [00:15<01:00,  2.17s/it]
[Acessing speaker spk_3 track 1 of 1:  29%|██▉       | 11/3





[Acessing speaker spk_4 track 1 of 1:   0%|          | 0/28 [00:00<?, ?it/s]
[Acessing speaker spk_4 track 1 of 1:   4%|▎         | 1/28 [00:01<00:51,  1.92s/it]
[Acessing speaker spk_4 track 1 of 1:   7%|▋         | 2/28 [00:02<00:33,  1.27s/it]
[Acessing speaker spk_4 track 1 of 1:  11%|█         | 3/28 [00:03<00:23,  1.06it/s]
[Acessing speaker spk_4 track 1 of 1:  14%|█▍        | 4/28 [00:03<00:17,  1.36it/s]
[Acessing speaker spk_4 track 1 of 1:  18%|█▊        | 5/28 [00:05<00:25,  1.10s/it]
[Acessing speaker spk_4 track 1 of 1:  21%|██▏       | 6/28 [00:06<00:22,  1.04s/it]
[Acessing speaker spk_4 track 1 of 1:  25%|██▌       | 7/28 [00:09<00:33,  1.59s/it]
[Acessing speaker spk_4 track 1 of 1:  29%|██▊       | 8/28 [00:09<00:27,  1.35s/it]
[Acessing speaker spk_4 track 1 of 1:  32%|███▏      | 9/28 [00:11<00:27,  1.45s/it]
[Acessing speaker spk_4 track 1 of 1:  36%|███▌      | 10/28 [00:12<00:21,  1.21s/it]
[Acessing speaker spk_4 track 1 of 1:  39%|███▉      | 11/2

## 9 – Evaluation & Aggregation

Results are written to a separate CSV (`results_grid_segmentation_by_session.csv`)
rather than the shared experiment CSV, because this grid varies different parameters.

In [12]:
df_by_session = append_eval_results_for_experiments(
    experiments=EXPERIMENTS,
    session_ids=SESSION_IDS,
    target_csv="results_grid_segmentation_by_session.csv",
    session_dir_template="data-bin/dev/{sid}",   # adjust if needed
    label_dir="labels",
)


########## Evaluate für session_40 ##########
Starte Evaluate: /home/josch080/Projektgruppe/mcorec_train/bin/python script/evaluate.py --session_dir data-bin/dev/session_40 --output_dir_name output_ --label_dir_name labels
Evaluating 1 sessions

=== Evaluating session session_40 ===

--- Evaluating output dir: output_E01_bs4_len15 ---
Conversation clustering F1 score: 1.0
Speaker to WER: {'spk_0': 0.564, 'spk_1': 0.4281, 'spk_2': 0.5576, 'spk_3': 0.4283, 'spk_4': 0.4793, 'spk_5': 0.4189}
Speaker clustering F1 score: {'spk_0': 1.0, 'spk_1': 1.0, 'spk_2': 1.0, 'spk_3': 1.0, 'spk_4': 1.0, 'spk_5': 1.0}
Joint ASR-Clustering Error Rate: {'spk_0': 0.282, 'spk_1': 0.21405, 'spk_2': 0.2788, 'spk_3': 0.21415, 'spk_4': 0.23965, 'spk_5': 0.20945}

--- Evaluating output dir: output_E02_bs8_len15 ---
Conversation clustering F1 score: 1.0
Speaker to WER: {'spk_0': 0.561, 'spk_1': 0.4312, 'spk_2': 0.5506, 'spk_3': 0.4283, 'spk_4': 0.5041, 'spk_5': 0.4189}
Speaker clustering F1 score: {'spk_0': 1.0, 

  results_df = pd.concat([results_df, new_df], ignore_index=True)
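
The per-speaker values in the evaluate log above can be reduced to one mean WER per output directory. The following is a minimal sketch, assuming the log keeps the `--- Evaluating output dir: ... ---` and `Speaker to WER: {...}` line formats shown above. `parse_speaker_wer` is a hypothetical helper, not part of `script/evaluate.py`, and whether `append_eval_results_for_experiments` averages in exactly this way is an assumption.

```python
import ast
import re
from statistics import mean

# Matches the per-experiment header line of the evaluate log.
DIR_PATTERN = re.compile(r"^--- Evaluating output dir: (\S+) ---$")

def parse_speaker_wer(log_text):
    """Return {output_dir: {speaker: wer}} parsed from evaluate log text."""
    results = {}
    current_dir = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        header = DIR_PATTERN.match(line)
        if header:
            current_dir = header.group(1)
        elif line.startswith("Speaker to WER:") and current_dir is not None:
            # The dict literal after the first colon is valid Python syntax.
            results[current_dir] = ast.literal_eval(line.split(":", 1)[1].strip())
    return results

# Lines copied from the session_40 output above.
sample_log = """
--- Evaluating output dir: output_E01_bs4_len15 ---
Conversation clustering F1 score: 1.0
Speaker to WER: {'spk_0': 0.564, 'spk_1': 0.4281, 'spk_2': 0.5576, 'spk_3': 0.4283, 'spk_4': 0.4793, 'spk_5': 0.4189}
"""

per_dir = parse_speaker_wer(sample_log)
avg_wer = mean(per_dir["output_E01_bs4_len15"].values())
print(round(avg_wer, 4))  # 0.4794
```

Truncated log lines (such as the cut-off `Speaker clustering F1 score` line above) are skipped, because only complete `Speaker to WER` lines are parsed.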


## 10 – Results Analysis: Top 15 Configurations

In [13]:
# Mean over all 5 sessions per experiment
summary = (df_by_session
           .groupby("model", as_index=False)[["avg_conv_f1","avg_speaker_wer","avg_joint_error"]]
           .mean()
           .sort_values("avg_speaker_wer"))  # ascending: best WER first

# Persist the results
summary.to_csv("results_grid_segmentation_summary.csv", index=False)

display(summary.head(15))  # top 15 by WER
print("Saved:",
      "results_grid_segmentation_by_session.csv",
      "results_grid_segmentation_summary.csv")


| | model | avg_conv_f1 | avg_speaker_wer | avg_joint_error |
|---|-------|-------------|-----------------|-----------------|
| 16 | E72_bugfix_mdOn1p0_mdOff1p2_bs12_len20 | 0.847619 | 0.490223 | 0.321304 |
| 4 | E60_bugfix_mdOn0p4_mdOff1p2_bs12_len20 | 0.847619 | 0.490475 | 0.32143 |
| 12 | E68_bugfix_mdOn0p8_mdOff1p2_bs12_len20 | 0.847619 | 0.490515 | 0.321449 |
| 8 | E64_bugfix_mdOn0p6_mdOff1p2_bs12_len20 | 0.847619 | 0.490828 | 0.321606 |
| 15 | E71_bugfix_mdOn1p0_mdOff1p0_bs12_len20 | 0.847619 | 0.495416 | 0.3239 |
| 11 | E67_bugfix_mdOn0p8_mdOff1p0_bs12_len20 | 0.847619 | 0.495567 | 0.323976 |
| 3 | E59_bugfix_mdOn0p4_mdOff1p0_bs12_len20 | 0.847619 | 0.495808 | 0.324096 |
| 7 | E63_bugfix_mdOn0p6_mdOff1p0_bs12_len20 | 0.847619 | 0.496161 | 0.324272 |
| 14 | E70_bugfix_mdOn1p0_mdOff0p8_bs12_len20 | 0.847619 | 0.515067 | 0.333725 |
| 10 | E66_bugfix_mdOn0p8_mdOff0p8_bs12_len20 | 0.847619 | 0.515175 | 0.33378 |


Saved: results_grid_segmentation_by_session.csv results_grid_segmentation_summary.csv
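
The experiment names encode both grid parameters (`mdOn1p0_mdOff1p2` stands for min_on=1.0 s, min_off=1.2 s), so the sorted list above can be rearranged into a min_on × min_off grid. The sketch below uses only the ten rows visible in the output above; `parse_grid_name` is a hypothetical helper introduced for illustration, and the naming scheme is inferred from the experiment names, not from the notebook's own code.

```python
import re

# "1p0" encodes 1.0; the pattern is inferred from the experiment names above.
NAME_PATTERN = re.compile(r"mdOn(\d+)p(\d+)_mdOff(\d+)p(\d+)")

def parse_grid_name(model):
    """Return (min_on, min_off) in seconds, or None for names without an override."""
    match = NAME_PATTERN.search(model)
    if match is None:
        return None
    on_int, on_frac, off_int, off_frac = match.groups()
    return float(f"{on_int}.{on_frac}"), float(f"{off_int}.{off_frac}")

# (model, avg_speaker_wer) copied from the summary output above.
rows = [
    ("E72_bugfix_mdOn1p0_mdOff1p2_bs12_len20", 0.490223),
    ("E60_bugfix_mdOn0p4_mdOff1p2_bs12_len20", 0.490475),
    ("E68_bugfix_mdOn0p8_mdOff1p2_bs12_len20", 0.490515),
    ("E64_bugfix_mdOn0p6_mdOff1p2_bs12_len20", 0.490828),
    ("E71_bugfix_mdOn1p0_mdOff1p0_bs12_len20", 0.495416),
    ("E67_bugfix_mdOn0p8_mdOff1p0_bs12_len20", 0.495567),
    ("E59_bugfix_mdOn0p4_mdOff1p0_bs12_len20", 0.495808),
    ("E63_bugfix_mdOn0p6_mdOff1p0_bs12_len20", 0.496161),
    ("E70_bugfix_mdOn1p0_mdOff0p8_bs12_len20", 0.515067),
    ("E66_bugfix_mdOn0p8_mdOff0p8_bs12_len20", 0.515175),
]

# grid[min_on][min_off] = WER
grid = {}
for model, wer in rows:
    parsed = parse_grid_name(model)
    if parsed is None:
        continue  # e.g. the default run without override
    min_on, min_off = parsed
    grid.setdefault(min_on, {})[min_off] = wer

off_values = sorted({off for row in grid.values() for off in row})
print("min_on \\ min_off", *[f"{off:>8.1f}" for off in off_values])
for min_on in sorted(grid):
    cells = [f"{grid[min_on][off]:8.4f}" if off in grid[min_on] else "       -"
             for off in off_values]
    print(f"{min_on:>16.1f}", *cells)
```

Reading the grid column by column makes the effect of `min_off` easier to see than the flat ranking does; cells that are not among the ten listed rows are printed as `-`.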


## 11 – Best Model & Comparison with the Default Configuration

In [15]:
# Best WER (summary is sorted by avg_speaker_wer)
best = summary.iloc[0]
print("Best:", best["model"],
      "WER=", best["avg_speaker_wer"],
      "Joint=", best["avg_joint_error"])

# Comparison with the default configuration (bugfix without override)
DEFAULT_MODEL = "E56_bugfix_default_bs12_len20"  # adjust here if your run uses a different name

row_default = summary[summary["model"] == DEFAULT_MODEL]
if row_default.empty:
    print(f"\nWARN: Default '{DEFAULT_MODEL}' nicht in summary gefunden.")
    print("Verfügbare Modelle (Beispiele):", list(summary["model"].head(10)))
else:
    d = row_default.iloc[0]

    # Δ: negative = improvement (lower WER is better)
    delta_wer = best["avg_speaker_wer"] - d["avg_speaker_wer"]
    delta_joint = best["avg_joint_error"] - d["avg_joint_error"]

    rel_delta_wer = (delta_wer / d["avg_speaker_wer"]) * 100
    rel_delta_joint = (delta_joint / d["avg_joint_error"]) * 100

    print(f"\nVergleich zu Default: {DEFAULT_MODEL}")
    print(f"  Default  WER={d['avg_speaker_wer']:.6f}  Joint={d['avg_joint_error']:.6f}")
    print(f"  Best     WER={best['avg_speaker_wer']:.6f}  Joint={best['avg_joint_error']:.6f}")
    print(f"  ΔWER     {delta_wer:+.6f}  ({rel_delta_wer:+.3f}%)")
    print(f"  ΔJoint   {delta_joint:+.6f}  ({rel_delta_joint:+.3f}%)")


Best: E72_bugfix_mdOn1p0_mdOff1p2_bs12_len20 WER= 0.49022333333333334 Joint= 0.32130366666666665

Vergleich zu Default: E56_bugfix_default_bs12_len20
  Default  WER=0.524484  Joint=0.338434
  Best     WER=0.490223  Joint=0.321304
  ΔWER     -0.034261  (-6.532%)
  ΔJoint   -0.017130  (-5.062%)
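
The numbers above are consistent with a joint error of the form 0.5 · WER + 0.5 · (1 − F1). This relation is inferred from the values in this notebook (for example, the evaluate log lists WER 0.564 with F1 1.0 and joint error 0.282); it is not taken from `script/evaluate.py`, so treat it as an assumption. A short check under that assumption, using `avg_conv_f1` from the summary as the F1 term:

```python
def joint_error(wer, f1):
    """Assumed form of the joint ASR-clustering error (inferred from the logs)."""
    return 0.5 * wer + 0.5 * (1.0 - f1)

AVG_CONV_F1 = 0.847619  # identical for all rows shown in the summary

default_joint = joint_error(0.524484, AVG_CONV_F1)  # E56
best_joint = joint_error(0.490223, AVG_CONV_F1)     # E72

print(f"{default_joint:.4f} {best_joint:.4f} {best_joint - default_joint:+.4f}")
# 0.3384 0.3213 -0.0171
```

If the relation holds, a constant F1 means that ΔJoint is exactly half of ΔWER, which matches the reported −0.017130 against −0.034261.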


## 12 – Interpretation

| Configuration | WER | Joint Error | Δ WER vs. E56 |
|--------------|-----|-------------|---------------|
| E56 (bugfix default, beam=12, len=20) | 0.5245 | 0.3384 | – |
| **E72 (min_on=1.0, min_off=1.2)** | **0.4902** | **0.3213** | **−0.034** |
| Pre-bugfix reference (E09, beam=12, len=20) | 0.4954 | 0.3239 | – |

*(Values on the 5-session subset; on all 25 dev sessions: WER 0.4943, see `04_dev_final_results`)*

**E72 even beats the pre-bugfix result** (0.4902 vs. 0.4954).

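### Why do larger min_duration_on/off values help?

- **min_on=1.0 s:** Speech fragments shorter than 1 s are discarded.
  This reduces noise and incomplete utterances in the input.
- **min_off=1.2 s:** Pauses of up to 1.2 s are bridged, so sentences
  with short speaking pauses stay in one segment, which gives the decoder more context.

In the summary table of section 10, `min_off` is the dominant factor: at min_off=1.2 s
the WER stays between 0.4902 and 0.4908 for every tested min_on (0.4 to 1.0 s),
so the additional gain from min_on is small.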
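
The interaction of the two parameters can be illustrated on a list of speech regions. This is a minimal sketch, not the code in `segmentation.py`: `postprocess_regions` and the example regions are hypothetical, the merge-then-filter order and the inclusive comparisons are assumptions, and the real implementation works on frame counts (seconds × 25) rather than on seconds.

```python
def postprocess_regions(regions, min_duration_on, min_duration_off):
    """Bridge short pauses, then drop short speech regions.

    regions: list of (start, end) tuples in seconds.
    """
    merged = []
    for start, end in sorted(regions):
        # Bridge the pause if it is no longer than min_duration_off.
        if merged and start - merged[-1][1] <= min_duration_off:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    # Discard regions that are still shorter than min_duration_on.
    return [(s, e) for s, e in merged if e - s >= min_duration_on]

# Hypothetical active-speaker regions for one track, in seconds.
regions = [(0.0, 0.6), (1.5, 2.0), (5.0, 5.4), (9.0, 11.0)]

print(postprocess_regions(regions, min_duration_on=1.0, min_duration_off=1.2))
# [(0.0, 2.0), (9.0, 11.0)]
print(postprocess_regions(regions, min_duration_on=1.0, min_duration_off=0.5))
# [(9.0, 11.0)]
```

In this example the larger `min_off` joins the first two fragments into one 2 s segment that survives the `min_on` filter, whereas with the smaller `min_off` both fragments are too short and are dropped.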

**This configuration (min_on=1.0, min_off=1.2, beam=12, len=20)**
is used as the final configuration for `04_dev_final_results` and
`04_eval_final_results`.
