# Fine Tuning Pre-trained WAV2VEC Using DSing Train

## Summary

1. Loading the WAV2VEC model
2. Creating a Huggingface dataset for batch processing by cropping the DSing dataset described below (8 songs per split, 1.5 seconds max)

The dataset is cropped to keep the examples in this notebook small enough to process. The actual processing run will look different.
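The cropping described above can be sketched in plain Python. This is a minimal illustration, not the notebook's actual preprocessing: the helper names `crop_example` and `crop_split` are hypothetical, and each example is assumed to be a dictionary with an `audio` entry holding `array` and `sampling_rate`, which is the shape Huggingface audio datasets return.

```python
from typing import Dict, List

MAX_SECONDS = 1.5   # crop length quoted in the summary
MAX_EXAMPLES = 8    # items kept per split, as quoted in the summary


def crop_example(example: Dict, max_seconds: float = MAX_SECONDS) -> Dict:
    """Return a copy of `example` whose audio is cut to at most `max_seconds`."""
    audio = example["audio"]
    max_samples = int(max_seconds * audio["sampling_rate"])
    cropped_audio = {**audio, "array": audio["array"][:max_samples]}
    return {**example, "audio": cropped_audio}


def crop_split(examples: List[Dict], max_examples: int = MAX_EXAMPLES) -> List[Dict]:
    """Keep the first `max_examples` items of a split and crop each one."""
    return [crop_example(ex) for ex in examples[:max_examples]]


# Toy split: 10 two-second "recordings" of silence at 16 kHz.
toy_split = [
    {"audio": {"array": [0.0] * 32000, "sampling_rate": 16000}, "transcription": "la la"}
    for _ in range(10)
]
small_split = crop_split(toy_split)
print(len(small_split), len(small_split[0]["audio"]["array"]))  # 8 24000
```

The original examples are left untouched, so the full-length audio stays available for the uncropped run.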

## DSing Dataset

Reading data from the DSing dataset. The filesystem is laid out as shown below so that it converts easily to a Huggingface dataset.

where: <br>
dev/test/trainX are the dataset splits.<br>
\[split\]_text contains the transcript for each snippet.<br>
\[split\]_spk2gender contains the gender information for each snippet.<br>

Test Split: 480 Utterances, 48 minutes<br>
Dev Split: 482 Utterances, 41 minutes<br>
Train1 Split: 8794 Utterances, 15.1 hours<br>
Train3 Split: 25526 Utterances, 44.7 hours<br>
Train30 Split: 268,392 Utterances, 149.1 hours<br>
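
As a quick sanity check on the figures above, the average utterance length per split follows from dividing the total duration by the utterance count. The sketch below only restates the numbers from the list; the helper name is illustrative.

```python
# (utterance count, total duration in seconds), copied from the list above
SPLITS = {
    "test": (480, 48 * 60),
    "dev": (482, 41 * 60),
    "train1": (8794, 15.1 * 3600),
    "train3": (25526, 44.7 * 3600),
    "train30": (268392, 149.1 * 3600),
}


def average_utterance_seconds(utterances: int, total_seconds: float) -> float:
    """Mean utterance duration in seconds."""
    return total_seconds / utterances


for name, (count, seconds) in SPLITS.items():
    print(f"{name}: {average_utterance_seconds(count, seconds):.2f} s per utterance")
```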

sing_300x30x2/dataset/<br>
├── dev/<br>
├───| metadata.csv<br>
&ensp;&ensp;&ensp;&ensp;&ensp;&ensp;&ensp;&nbsp;| \<audio files\>.wav<br>
├── dev_spk2gender<br>
├── dev_text<br>
├── test/<br>
├───| metadata.csv<br>
&ensp;&ensp;&ensp;&ensp;&ensp;&ensp;&ensp;&nbsp;| \<audio files\>.wav<br>
├── test_spk2gender<br>
├── test_text<br>
├── train1/<br>
├───| metadata.csv<br>
&ensp;&ensp;&ensp;&ensp;&ensp;&ensp;&ensp;&nbsp;| \<audio files\>.wav<br>
├── train1_spk2gender<br>
├── train1_text<br>
├── train3/<br>
├───| metadata.csv<br>
&ensp;&ensp;&ensp;&ensp;&ensp;&ensp;&ensp;&nbsp;| \<audio files\>.wav<br>
├── train3_spk2gender<br>
├── train3_text<br>
├── train30/<br>
├───| metadata.csv<br>
&ensp;&ensp;&ensp;&ensp;&ensp;&ensp;&ensp;&nbsp;| \<audio files\>.wav<br>
├── train30_spk2gender<br>
└── train30_text<br>
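
The `audiofolder` loader pairs each audio file with its metadata through a `file_name` column in `metadata.csv`. The sketch below shows one way such a file could be produced from a `[split]_text` file. It rests on two assumptions that the layout above does not confirm: that `[split]_text` is Kaldi-style (`<utterance id> <transcript>` per line) and that the utterance id matches the `.wav` file stem. The `transcription` column name matches the key used later in this notebook.

```python
import csv
import io


def kaldi_text_to_metadata(text_lines, audio_ext=".wav"):
    """Convert '<utt_id> <transcript>' lines into audiofolder metadata rows."""
    rows = []
    for line in text_lines:
        line = line.strip()
        if not line:
            continue
        # Everything before the first space is the id, the rest is the transcript.
        utt_id, _, transcript = line.partition(" ")
        rows.append({"file_name": utt_id + audio_ext, "transcription": transcript.strip()})
    return rows


def write_metadata_csv(rows, handle):
    """Write rows with the header expected by the audiofolder loader."""
    writer = csv.DictWriter(handle, fieldnames=["file_name", "transcription"])
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical utterance ids and transcripts, for illustration only.
example_text = ["utt_0001 hello from the other side", "utt_0002 la la la"]
example_rows = kaldi_text_to_metadata(example_text)
buffer = io.StringIO()
write_metadata_csv(example_rows, buffer)
print(buffer.getvalue())
```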


# For Colab Training

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
#
# Curated DAMP 300x30x2 dataset (DSing) unpacked and preprocessed into
# child folder dataset/. DSing is around 1.9GB after being preprocessed.
#

!unzip /content/drive/MyDrive/dali_test_utt_dataset_20240414.zip -d /

Streaming output truncated to the last 5000 lines.
  inflating: /dali_datasets/test/5991209bcde54446a428d0d40cfd2d82_45.wav  
  inflating: /dali_datasets/test/4986c16294c247f1a01dd92793a7e2b3_50.wav  
  inflating: /dali_datasets/test/ae91bcda73944695b7756ddc066c3e02_7.wav  
  inflating: /dali_datasets/test/45e0ccbdf76f4060af50f95d93492755_3.wav  
  inflating: /dali_datasets/test/06e8322a90954a028762888dfe3e70cf_47.wav  
  inflating: /dali_datasets/test/6dae25f87a5f45779e0adf97b7537552_11.wav  
  inflating: /dali_datasets/test/9fe1d4ff035d43839e7556cd7293d525_25.wav  
  inflating: /dali_datasets/test/ccd577a699864e469dd77dc04de02955_24.wav  
  inflating: /dali_datasets/test/84cfa398e56f49aa8454ceab82ed933e_6.wav  
  inflating: /dali_datasets/test/57a743bbbc7c472788d258f977c11cee_27.wav  
  inflating: /dali_datasets/test/7357de99882d49cb9a0564a8ce4d60f4_19.wav  
  inflating: /dali_datasets/test/57a743bbbc7c472788d258f977c11cee_33.wav  
  inflating: /dali_datasets/test/38dd5

In [3]:
!pip install transformers==4.39.1
!pip install accelerate==0.28.0
!pip install evaluate
!pip install torchaudio
!pip install flashlight-text
!pip install jiwer

Collecting transformers==4.39.1
  Downloading transformers-4.39.1-py3-none-any.whl (8.8 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m8.8/8.8 MB[0m [31m38.5 MB/s[0m eta [36m0:00:00[0m
Installing collected packages: transformers
  Attempting uninstall: transformers
    Found existing installation: transformers 4.38.2
    Uninstalling transformers-4.38.2:
      Successfully uninstalled transformers-4.38.2
Successfully installed transformers-4.39.1
Collecting accelerate==0.28.0
  Downloading accelerate-0.28.0-py3-none-any.whl (290 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m290.1/290.1 kB[0m [31m7.5 MB/s[0m eta [36m0:00:00[0m
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch>=1.10.0->accelerate==0.28.0)
  Using cached nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch>=1.10.0->accelerate==0.28.0)
  Using cached nvidia_cuda_runtime_cu12-12.1.105-py3-

# Useful Memory Clearing Commands

In [None]:
#del model
#del dsing_train
#del dsing_val
#del data_collator
#del asr_pipeline
#del training_args
#del trainer
#del wer_metric

In [None]:
import torch, gc
gc.collect()
torch.cuda.empty_cache()

# Basic Setup

In [4]:
import os
import sys
import time
import glob
import re
import json
import random
import tqdm

import torch
import torchaudio
import evaluate

import pandas as pd
import numpy as np
import IPython
from IPython.display import display, HTML, Audio

# load_metric was removed from recent `datasets` releases; metrics come from `evaluate` instead.
from datasets import load_dataset, ClassLabel, DatasetDict

from torch.utils.data import DataLoader

from torchaudio.models.decoder import ctc_decoder
from torchaudio.models.decoder import download_pretrained_files

from transformers import Wav2Vec2CTCTokenizer
from transformers import Wav2Vec2FeatureExtractor
from transformers import pipeline

from transformers import TrainingArguments, Trainer
from transformers.utils import logging
from transformers import AutoModel




In [5]:
#############################################
# Special Variables...
#############################################
root_folder = "/content/drive/MyDrive"

# Google Colab
#
tokens_file=f"{root_folder}/tokens.txt"
dataset_folder = "/dali_datasets"

# Local
#
# tokens_file="./tokens.txt"
# dataset_folder = "../sing_300x30x2/damp_dataset"
# root_folder = "./"


# read labels for all song utterances for DSING splits: test, dev (aka validation), train1 (aka train)
BATCH_SIZE=1
CALC_HOURS=False

In [15]:

#
# Create Pipeline
#

model_checkpoint="openai/whisper-small.en"

asr_pipeline = pipeline("automatic-speech-recognition", model=model_checkpoint)
model = asr_pipeline.model

target_sampling_rate = asr_pipeline.feature_extractor.sampling_rate

# Count the number of trainable parameters in the model
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print("Trainable parameters:", trainable_params)



Trainable parameters: 240582144


In [None]:
asr_pipeline.feature_extractor

WhisperFeatureExtractor {
  "chunk_length": 30,
  "feature_extractor_type": "WhisperFeatureExtractor",
  "feature_size": 80,
  "hop_length": 160,
  "n_fft": 400,
  "n_samples": 480000,
  "nb_max_frames": 3000,
  "padding_side": "right",
  "padding_value": 0.0,
  "processor_class": "WhisperProcessor",
  "return_attention_mask": false,
  "sampling_rate": 16000
}

In [None]:
asr_pipeline.tokenizer

## Decoders

In [7]:
files = download_pretrained_files("librispeech-4-gram")
# Found from Fairseq for Wav2Vec2 -
# https://github.com/facebookresearch/fairseq/blob/main/examples/wav2vec/config/finetuning/vox_960h_2_aws.yaml
# Note: I am not using the same lexicon file though...
# LM_WEIGHT = 2.0
# WORD_SCORE = 0
# SIL_SCORE = -1

LM_WEIGHT = 3.23
WORD_SCORE = -0.26
SIL_SCORE = 0

beam_search_decoder = ctc_decoder(
    lexicon=files.lexicon,
    tokens=tokens_file,
    lm=files.lm,
    nbest=1,
    beam_size=512, #per the paper by Ou
    lm_weight=LM_WEIGHT,
    word_score=WORD_SCORE,
    sil_score=SIL_SCORE,
    blank_token='<pad>',
    unk_word='<unk>'
)

greedy_decoder = ctc_decoder(
    lexicon=files.lexicon,
    tokens=tokens_file,
    lm=files.lm,
    nbest=1,
    beam_size=1,
    lm_weight=LM_WEIGHT,
    word_score=WORD_SCORE,
    sil_score=SIL_SCORE,
    blank_token='<pad>',
    unk_word='<unk>'
)

100%|██████████| 4.97M/4.97M [00:00<00:00, 60.1MB/s]
100%|██████████| 57.0/57.0 [00:00<00:00, 96.7kB/s]
100%|██████████| 2.91G/2.91G [00:44<00:00, 70.5MB/s]
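
For readers unfamiliar with CTC decoding, the sketch below shows the plain greedy rule: pick the best token per frame, collapse repeats, then drop blanks. It is not what `greedy_decoder` above does (that object is a lexicon-constrained, LM-weighted search with a beam of one); it only illustrates the collapse step that every CTC decoder builds on. The toy vocabulary and the `|` word separator are assumptions, since the contents of `tokens.txt` are not shown here.

```python
from typing import Sequence


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      tokens: Sequence[str],
                      blank: str = "<pad>") -> str:
    """Greedy CTC decoding: argmax per frame, collapse repeats, remove blanks."""
    # 1. Best-scoring token for every frame.
    best_path = [tokens[max(range(len(scores)), key=scores.__getitem__)]
                 for scores in frame_scores]
    # 2. Collapse runs of the same token into one.
    collapsed = [tok for i, tok in enumerate(best_path)
                 if i == 0 or tok != best_path[i - 1]]
    # 3. Drop blanks; "|" is assumed to mark word boundaries.
    return "".join(tok for tok in collapsed if tok != blank).replace("|", " ").strip()


def one_hot(index: int, size: int):
    """Helper that builds a fake emission row favouring one token."""
    return [1.0 if i == index else 0.0 for i in range(size)]


TOKENS = ["<pad>", "|", "a", "b"]          # hypothetical vocabulary
path = ["a", "a", "<pad>", "a", "b", "|", "b"]
frames = [one_hot(TOKENS.index(tok), len(TOKENS)) for tok in path]
decoded = ctc_greedy_decode(frames, TOKENS)
print(decoded)  # aab b
```

The blank between the two runs of `a` is what lets the decoder emit a doubled letter instead of merging them.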


# Generate Huggingface Dataset Object from DAMP 300x30x2 dataset

Reference: https://huggingface.co/docs/datasets/audio_load

Other pre-processing is heavily inspired by: https://colab.research.google.com/drive/1nCC5Ci-81U5opK_VuXDiZlmcAuATreF2#scrollTo=RBDRAAYxRE6n



# Data Cleaning

In [8]:
def prepare_dataset(batch,tokenizer,feature_extractor):
    """
    Creating a new dataset with the map function to generate the
    keys below.  Padding will occur in the data collator on a per
    batch basis.

    Inputs (i.e. feature extractor):
    input_values   - tensor array for audio samples (shape=(n,) - where n is the number of audio samples)
    attention_mask - used for expressing where there are padded samples

    Outputs (i.e. tokenizer related)
    labels - tensor array for text output tokens (i.e. not transcript).  (shape=(m,) - where m is the number of character tokens)
    """
    # raw string so the backslash escapes reach the regex engine unchanged
    chars_to_ignore_regex = r'[\,\?\.\!\-\;\:\"]'
    batch["transcription"] = re.sub(chars_to_ignore_regex, '', batch["transcription"]).lower() + " "

    audio = batch["audio"]

    # batched output is "un-batched" to ensure mapping is correct

    # Feature Extractor manipulation
    #
    # this object will return a list of lists because the
    # transcriptions are not padded (i.e. as opposed to a
    # Tensor of tensors when using return_tensors='pt').
    # Padding is done per batch to optimize the size for inference and
    # training.
    #
    # data_collator is responsible for padding the data.
    inputs_values_pt = feature_extractor(audio["array"], sampling_rate=audio["sampling_rate"])

    #
    # use this condition because some Wav2Vec2 models require attention_masks
    # for better performance while others do not.
    #
    if "attention_mask" in inputs_values_pt:
        batch["attention_mask"] = inputs_values_pt.attention_mask

    batch["input_values"] = inputs_values_pt.input_values[0]
    batch["input_length"] = len(batch["input_values"])

    # Tokenizer manipulation
    #
    # this object will return a list of lists because the
    # transcriptions are not padded (i.e. as opposed to a
    # Tensor of tensors when using return_tensors='pt').
    # Padding is done per batch to optimize the size for inference and
    # training.
    #
    # data_collator is responsible for padding the data.
    labels_pt = tokenizer(batch["transcription"])
    batch["labels"] = labels_pt['input_ids']

    return batch
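
The transcript normalisation at the top of `prepare_dataset` can be tried in isolation. The sketch below lifts only that step into a standalone helper (the name `normalize_transcript` is illustrative) so its effect on a sample string is easy to see, including the trailing space the function appends.

```python
import re

# Same character class as in prepare_dataset, written as a raw string.
CHARS_TO_IGNORE = r'[\,\?\.\!\-\;\:\"]'


def normalize_transcript(text: str) -> str:
    """Strip the listed punctuation, lowercase, and append a trailing space."""
    return re.sub(CHARS_TO_IGNORE, '', text).lower() + " "


normalized = normalize_transcript('Hello, World! How are you?')
print(repr(normalized))  # 'hello world how are you '
```

Apostrophes are not in the character class, so contractions such as "don't" pass through unchanged.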

In [9]:
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Union

@dataclass
class DataCollatorCTCWithPadding:
    """
    Data collator that will dynamically pad the already tokenized inputs received.
    Args:
        processor (:class:`~transformers.Wav2Vec2Processor`)
            The processor used for processing the data.
        padding (:obj:`bool`, :obj:`str` or :class:`~transformers.tokenization_utils_base.PaddingStrategy`, `optional`, defaults to :obj:`True`):
            Select a strategy to pad the returned sequences (according to the model's padding side and padding index)
            among:
            * :obj:`True` or :obj:`'longest'`: Pad to the longest sequence in the batch (or no padding if only a single
              sequence is provided).

            Other Options in the pad method that are NOT implemented for this class (i.e. I always want to pad to longest for the
            input and the labels)
            * (not implemented) :obj:`'max_length'`: Pad to a maximum length specified with the argument :obj:`max_length` or to the
              maximum acceptable input length for the model if that argument is not provided.
            * (not implemented) :obj:`False` or :obj:`'do_not_pad'` (default): No padding (i.e., can output a batch with sequences of
              different lengths).

    Reference Code here:
    https://huggingface.co/blog/fine-tune-wav2vec2-english


    Note: in the example referenced above, there were parameters for padding max length, etc.  I have created some logic
    in prepare_dataset to support truncation of data for testing and benchmarking.  I do not think I need max_length
    options for the collator at this time.

    """

    # The tokenizer field was annotated with WhisperFeatureExtractor, which is never
    # imported (NameError) and is not a tokenizer type. Wav2Vec2CTCTokenizer is imported above.
    tokenizer: Wav2Vec2CTCTokenizer
    feature_extractor: Wav2Vec2FeatureExtractor
    padding: Union[bool, str] = "longest"

    def __call__(self, features: List[Dict[str, Union[List[int], torch.Tensor]]]) -> Dict[str, torch.Tensor]:

        # Features in this case is a list of batch size that contains DataSet objects from the train split
        # (including pretokenized labels). The output batch has been changed from a list back to a dictionary
        # with the respective data objects.
        #
        # Note for future self:
        # pad is being called from PreTrainedTokenizerBase.pad.  From docs:
        #      Pad a single encoded input or a batch of encoded inputs up to predefined length or to the max sequence length
        #      in the batch.
        #
        #    Padding side (left/right) padding token ids are defined at the tokenizer level (with `self.padding_side`,
        #    `self.pad_token_id` and `self.pad_token_type_id`).
        #
        #    Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the
        #    text followed by a call to the `pad` method to get a padded encoding.
        #
        #         <Tip>
        #
        #         If the `encoded_inputs` passed are dictionary of numpy arrays, PyTorch tensors or TensorFlow tensors, the
        #         result will use the same type unless you provide a different tensor type with `return_tensors`. In the case of
        #         PyTorch tensors, you will lose the specific device of your tensors however.
        #
        #         </Tip>

        # Audio Input Data (not tokenized)
        input_features = [{"input_values": feature["input_values"]} for feature in features]

        # batch is a dictionary-like type.
        batch = self.feature_extractor.pad(
            input_features,
            padding=self.padding,
            return_tensors="pt",
        )

        # Tokenized Transcript Labels (character level tokens)
        label_features = [{"input_ids": feature["labels"]} for feature in features]
        labels_batch = self.tokenizer.pad(
            label_features,
            padding=self.padding,
            return_tensors="pt",
        )

        # replace padding with -100 to ignore loss correctly
        batch["labels"] = labels_batch["input_ids"].masked_fill(labels_batch.attention_mask.ne(1), -100)

        return batch
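
What the collator does to a batch can be shown without any model or tokenizer. The sketch below is a plain-Python stand-in, not the class above: it pads audio to the longest item in the batch, builds the matching attention mask, and fills label padding with -100 so those positions are skipped when the loss is computed. The function name and the use of 0.0 as the audio padding value are assumptions for illustration.

```python
from typing import Dict, List

IGNORE_INDEX = -100  # label value that marks padded positions


def pad_batch(features: List[Dict[str, List]], input_pad: float = 0.0) -> Dict[str, List[List]]:
    """Pad inputs and labels to the longest item in the batch."""
    max_input = max(len(f["input_values"]) for f in features)
    max_label = max(len(f["labels"]) for f in features)
    batch = {"input_values": [], "attention_mask": [], "labels": []}
    for f in features:
        n_in, n_lab = len(f["input_values"]), len(f["labels"])
        # Right-pad the audio and record which positions are real samples.
        batch["input_values"].append(list(f["input_values"]) + [input_pad] * (max_input - n_in))
        batch["attention_mask"].append([1] * n_in + [0] * (max_input - n_in))
        # Right-pad the labels with the ignore index instead of a real token id.
        batch["labels"].append(list(f["labels"]) + [IGNORE_INDEX] * (max_label - n_lab))
    return batch


toy_features = [
    {"input_values": [0.1, 0.2, 0.3], "labels": [5, 6]},
    {"input_values": [0.4], "labels": [7]},
]
toy_batch = pad_batch(toy_features)
print(toy_batch["labels"])  # [[5, 6], [7, -100]]
```

Padding per batch rather than to a global maximum keeps short batches small, which is the reason the notebook defers padding to the collator.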

In [None]:
wer_metric = evaluate.load("wer")

def compute_metrics_dummy(eval_pred):
    return {'wer':1.0}


def compute_metrics(eval_pred,kind='beam',compute=True):
    """
    Calculates WER for a batch of logits and their labels.

    eval_pred - tuple (logit output from the model, token labels from dataset)
    kind - can compare between beam search and greedy search.  both using kenlm

    compute - bool - for training this will compute WER every time it's logged.
                     this is nice for understanding if the training is working.
                     for evaluation, this is set to false so compute is run after
                     all batches are processed.

    output is the WER computed from the batch.  if the model is run multiple times, the
    batch WERs are aggregated.

    Note: add_batch and then doing compute will clear the previously cached batch results.
    """
    logits, labels = eval_pred
    #print(f"Logit Type: {type(logits)}")

    # The input to compute_metrics can be either a NumPy array or a tensor.
    if isinstance(logits, np.ndarray):
        logits = torch.Tensor(logits)
    else:
        # copy this tensor for computing things...
        logits = logits.clone().detach().requires_grad_(False)
    #print(f"Changing Logit Type to: {type(logits)}")
    #print(f"{logits.shape}")

    if kind=='beam':
        # Takes emissions of shape [batch_size, frames, vocab_size] and
        # returns a list of lists of size [batch_size, nbest]
        #
        # Where output[0][0] gives you the CTCHypothesis object.
        #
        # Extract transcript from output[0][0].words (i.e. list of words).
        # May need to join depending on objective.
        #
        predictions = beam_search_decoder(logits)
    elif kind=='greedy':
        # Creates a list of lists that are of size [batch_size,1]
        #
        # Where output[0][0] gives you the CTCHypothesis object.
        #
        # Extract transcript from output[0][0].words (i.e. list of words).
        # May need to join depending on objective.
        #
        predictions = greedy_decoder(logits)
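    else:
        # Raise instead of sys.exit() so a bad argument does not kill the notebook kernel.
        raise ValueError(f"Unknown decoder kind: {kind!r} (expected 'beam' or 'greedy')")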

    ref = asr_pipeline.tokenizer.batch_decode(labels)
    pred = [" ".join(prediction[0].words) for prediction in predictions]

    wer_metric.add_batch(predictions=pred, references=ref)

    if compute:
        return {"wer":wer_metric.compute()}
    else:
        return {"wer":None}
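
Word error rate itself is simple to compute by hand, which helps when checking numbers reported by `wer_metric`. The sketch below is a standalone approximation, assuming whitespace tokenisation and corpus-level aggregation (total word edits divided by total reference words); it is not the `evaluate` implementation, and the function names are illustrative.

```python
from typing import List, Sequence


def word_edit_distance(reference: Sequence[str], hypothesis: Sequence[str]) -> int:
    """Levenshtein distance between two word sequences."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(previous[j] + 1,      # deletion
                               current[j - 1] + 1,   # insertion
                               substitution))
        previous = current
    return previous[-1]


def corpus_wer(predictions: List[str], references: List[str]) -> float:
    """Total word edits divided by total reference words."""
    errors = sum(word_edit_distance(ref.split(), pred.split())
                 for pred, ref in zip(predictions, references))
    total_words = sum(len(ref.split()) for ref in references)
    return errors / total_words


example_wer = corpus_wer(["hello world"], ["hello there world"])
print(f"{example_wer:.3f}")  # 0.333
```

Because the errors are pooled before dividing, long utterances weigh more than short ones, which differs from averaging per-utterance WER.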

# Load Dataset

In [None]:
# Dataset has no get_samples(); select() returns the first 10 examples instead.
dsing_test.select(range(10))

In [None]:
asr_pipeline(sample["audio"].copy(), max_new_tokens=256, generate_kwargs={"task": "transcribe"})

In [10]:
#data_collator = DataCollatorCTCWithPadding(
#    tokenizer=asr_pipeline.tokenizer,
#    feature_extractor=asr_pipeline.feature_extractor,
#)

# Load the first 1000 test utterances and optionally report their total duration
dali_test = load_dataset("audiofolder", data_dir=dataset_folder, split='test[:1000]')
if CALC_HOURS:
    arr_lens = [len(d['array']) for d in dali_test['audio']]
    print(f"Total Hours of Test Data: {np.sum(arr_lens)/ target_sampling_rate / 3600:.2f}")

#dsing_test = dsing_test.to_iterable_dataset()
#dsing_test = dsing_test.with_format('torch')
# make changes to dataset object to prepare for Wav2Vec2 model
#dsing_test = dsing_test.map(
#    prepare_dataset,
#    remove_columns=["audio","transcription"],
#    fn_kwargs={'tokenizer':asr_pipeline.tokenizer, 'feature_extractor':asr_pipeline.feature_extractor}
#)

Resolving data files:   0%|          | 0/10187 [00:00<?, ?it/s]

Generating test split: 0 examples [00:00, ? examples/s]
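
The `CALC_HOURS` branch above converts sample counts to hours by dividing by the sampling rate and then by 3600. A minimal standalone version of that arithmetic, with an illustrative function name and made-up lengths, looks like this:

```python
from typing import Iterable


def total_hours(sample_counts: Iterable[int], sampling_rate: int) -> float:
    """Total audio duration in hours from per-utterance sample counts."""
    return sum(sample_counts) / sampling_rate / 3600


# Two hypothetical 30-minute recordings at 16 kHz add up to one hour.
half_hour_samples = 16000 * 30 * 60
print(f"{total_hours([half_hour_samples, half_hour_samples], 16000):.2f}")  # 1.00
```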

## Evaluation Output

In [14]:
asr_pipeline.tokenizer

WhisperTokenizerFast(name_or_path='openai/whisper-tiny.en', vocab_size=50257, model_max_length=1024, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'bos_token': '<|endoftext|>', 'eos_token': '<|endoftext|>', 'unk_token': '<|endoftext|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|startoftranscript|>', '<|en|>', '<|zh|>', '<|de|>', '<|es|>', '<|ru|>', '<|ko|>', '<|fr|>', '<|ja|>', '<|pt|>', '<|tr|>', '<|pl|>', '<|ca|>', '<|nl|>', '<|ar|>', '<|sv|>', '<|it|>', '<|id|>', '<|hi|>', '<|fi|>', '<|vi|>', '<|iw|>', '<|uk|>', '<|el|>', '<|ms|>', '<|cs|>', '<|ro|>', '<|da|>', '<|hu|>', '<|ta|>', '<|no|>', '<|th|>', '<|ur|>', '<|hr|>', '<|bg|>', '<|lt|>', '<|la|>', '<|mi|>', '<|ml|>', '<|cy|>', '<|sk|>', '<|te|>', '<|fa|>', '<|lv|>', '<|bn|>', '<|sr|>', '<|az|>', '<|sl|>', '<|kn|>', '<|et|>', '<|mk|>', '<|br|>', '<|eu|>', '<|is|>', '<|hy|>', '<|ne|>', '<|mn|>', '<|bs|>', '<|kk|>', '<|sq|>', '<|sw|>', '<|gl|>', '<|mr|>', '<|pa|>', '<|si|>', '<|km|>

In [16]:
wer_metric = evaluate.load("wer")

batch_inference_time = []
total_start = time.time()
#chars_to_ignore_regex = '[\,\?\.\!\-\;\:\"\`\&\(\)]'
for i, batch in enumerate(dali_test):
    start = time.time()
    #batch = {k:v.to('cuda') for k,v in batch.items()}
    result = asr_pipeline(batch['audio']['array'])
    #print(result['text'].lower())
    #batch["transcription"] = re.sub(chars_to_ignore_regex, '', batch["transcription"]).lower() + " "
    #print(batch['transcription'].lower())

    wer_metric.add_batch(predictions=[result['text'].lower()], references=[batch['transcription'].lower()])

    finish = time.time()
    batch_inference_time.append(finish-start)
    #print("*******************")
    print(f"{i}: ({finish-start:.1f}s)")

    #if i == 10:
    #    break


total_finish = time.time()
total_processing_time = total_finish-total_start
total_inference_time = np.sum(batch_inference_time)
total_dataloading_time = total_processing_time - total_inference_time
print(f"Batch Inference took: {total_processing_time:.1f} seconds.")
print(f"Inference: {total_inference_time:.1f} seconds, DataLoading: {total_dataloading_time:.1f} seconds")
print(f"WER for all batches (lower is better): {wer_metric.compute()*100:.1f}%")

0: (1.3s)
1: (1.0s)
2: (1.0s)
3: (1.0s)
4: (0.9s)
5: (1.1s)
6: (1.1s)
7: (0.8s)
8: (1.0s)
9: (0.9s)
10: (0.9s)
11: (1.7s)
12: (1.0s)
13: (0.9s)
14: (0.8s)
15: (1.0s)
16: (1.0s)
17: (0.9s)
18: (0.8s)
19: (0.8s)
20: (0.9s)
21: (0.9s)
22: (0.8s)
23: (1.0s)
24: (1.5s)
25: (1.1s)
26: (0.8s)
27: (1.0s)
28: (1.0s)
29: (1.0s)
30: (1.0s)
31: (1.0s)
32: (11.7s)
33: (0.8s)
34: (1.0s)
35: (11.8s)
36: (0.8s)
37: (1.0s)
38: (1.0s)
39: (1.0s)
40: (0.9s)
41: (1.6s)
42: (1.2s)
43: (0.9s)
44: (1.0s)
45: (1.0s)
46: (0.8s)
47: (1.0s)
48: (0.9s)
49: (0.8s)
50: (1.0s)
51: (0.9s)
52: (0.9s)
53: (1.4s)
54: (1.4s)
55: (1.0s)
56: (0.9s)
57: (1.0s)
58: (1.0s)
59: (0.9s)
60: (1.0s)
61: (1.0s)
62: (0.9s)
63: (0.9s)
64: (1.0s)
65: (1.0s)
66: (1.2s)
67: (1.4s)
68: (0.9s)
69: (0.9s)
70: (0.9s)
71: (1.0s)
72: (1.0s)
73: (0.9s)
74: (0.9s)
75: (1.0s)
76: (0.9s)
77: (0.9s)
78: (1.2s)
79: (1.5s)
80: (0.9s)
81: (0.8s)
82: (1.0s)
83: (0.8s)
84: (0.9s)
85: (0.9s)
86: (0.8s)
87: (0.9s)
88: (0.9s)
89: (0.9s)
90: (0.9s)
91: (1.
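
The timing pattern in the loop above (per-item inference time, with everything else attributed to data loading) can be wrapped in a small helper. The sketch below is one possible arrangement using only the standard library; `timed_loop` and `slow_items` are hypothetical names, and the 5x-median threshold for flagging outliers such as the 11.7 s items in the log is an arbitrary choice.

```python
import statistics
import time
from typing import Callable, Dict, Iterable, List


def timed_loop(items: Iterable, infer: Callable) -> Dict:
    """Run `infer` on every item and split wall time into inference and overhead."""
    per_item: List[float] = []
    total_start = time.perf_counter()
    for item in items:                       # time spent fetching items counts as overhead
        start = time.perf_counter()
        infer(item)
        per_item.append(time.perf_counter() - start)
    total = time.perf_counter() - total_start
    inference = sum(per_item)
    return {"total": total, "inference": inference,
            "overhead": total - inference, "per_item": per_item}


def slow_items(per_item: List[float], factor: float = 5.0) -> List[int]:
    """Indices whose time exceeds `factor` times the median item time."""
    threshold = factor * statistics.median(per_item)
    return [i for i, seconds in enumerate(per_item) if seconds > threshold]


stats = timed_loop(range(5), lambda item: sum(range(1000)))
print(f"Inference: {stats['inference']:.4f}s, overhead: {stats['overhead']:.4f}s")
print(slow_items([1.0, 1.0, 11.7, 0.9]))  # [2]
```

`time.perf_counter` is used instead of `time.time` because it is monotonic and has higher resolution for short intervals.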
