Description:
This problem provides a dataset of audio clips, each containing a sentence spoken in one of several languages.

Dataset:
https://www.tensorflow.org/datasets/catalog/xtreme_s

The dataset comprises several sub-datasets, one per language. We will use a subset of them to train a spoken-language classifier.

Objective:
Using the provided dataset, build a neural-network classifier that infers the language spoken in a clip.

Train two models with different architectures and compare their results:

Convolutional model on the spectrograms of the clips.
Recurrent model on the spectrograms of the clips.

See https://colab.research.google.com/github/FCEIA-AAII/lab11/blob/master/lab11-a.ipynb for an example of how to obtain spectrograms from audio clips.
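
As a rough illustration of what "obtaining a spectrogram" involves, here is a minimal NumPy-only sketch of a magnitude spectrogram computed with a short-time Fourier transform. The function name `get_spectrogram` and the values `frame_length=255`, `frame_step=128` and `fft_length=256` are assumptions chosen for illustration; they are not taken from the assignment. TensorFlow offers `tf.signal.stft` for the same purpose.

```python
import numpy as np


def get_spectrogram(waveform, frame_length=255, frame_step=128, fft_length=256):
    """Return a magnitude spectrogram of shape (frames, fft_length // 2 + 1, 1)."""
    waveform = np.asarray(waveform, dtype=np.float32)
    if waveform.ndim != 1 or waveform.shape[0] < frame_length:
        raise ValueError("waveform must be 1-D and at least frame_length samples long")

    # Slice the signal into overlapping frames (no padding at the end).
    num_frames = 1 + (waveform.shape[0] - frame_length) // frame_step
    starts = frame_step * np.arange(num_frames)[:, None]
    frames = waveform[starts + np.arange(frame_length)[None, :]]

    # Taper each frame with a Hann window, then take the real FFT.
    spectrum = np.fft.rfft(frames * np.hanning(frame_length), n=fft_length, axis=-1)

    # Keep only the magnitude and add a channel axis so the result can be
    # treated like a single-channel image.
    return np.abs(spectrum)[..., np.newaxis]


# One second of a 440 Hz tone sampled at 16 kHz.
tone = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
print(get_spectrogram(tone).shape)  # (124, 129, 1)
```

The first axis is time and the second is frequency, so the same array can feed a convolutional model (as an image) or a recurrent model (as a sequence of frequency vectors).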

# Libraries

In [None]:
!pip install tensorflow


In [None]:
!pip install pydub

Collecting pydub
  Downloading pydub-0.25.1-py2.py3-none-any.whl.metadata (1.4 kB)
Downloading pydub-0.25.1-py2.py3-none-any.whl (32 kB)
Installing collected packages: pydub
Successfully installed pydub-0.25.1


In [None]:
import os
import pathlib
from typing import Optional

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
import tensorflow as tf
import tensorflow_datasets as tfds  # Stable release, published every few months.

from IPython import display
from tensorflow.keras import layers
from tensorflow.keras import models
from tensorflow_datasets.core import dataset_info  # Module that defines DatasetInfo.

# Dataset Construction and Loading


In [None]:
tfds.builder('xtreme_s').info  # Shows the dataset metadata, including full_name



tfds.core.DatasetInfo(
    name='xtreme_s',
    full_name='xtreme_s/fleurs.af_za/2.0.0',
    description="""
    FLEURS is the speech version of the FLORES machine translation benchmark, covering 2000 n-way parallel sentences in n=102 languages.
    XTREME-S covers four task families: speech recognition, classification, speech-to-text translation and retrieval. Covering 102
    languages from 10+ language families, 3 different domains and 4
    task families, XTREME-S aims to simplify multilingual speech
    representation evaluation, as well as catalyze research in “universal” speech representation learning.
    
    In this version, only the FLEURS dataset is provided, which covers speech
    recognition and speech-to-text translation.
    """,
    config_description="""
    FLEURS is the speech version of the FLORES machine translation benchmark, covering 2000 n-way parallel sentences in n=102 languages.
    """,
    homepage='https://arxiv.org/abs/2205.12446',
    data_dir=PosixGPa

In [None]:
DATASET_PATH = 'data'

data_dir = pathlib.Path(DATASET_PATH)
lan = {'español':'es_419', 'ingles': 'en_us', 'frances':'fr_fr', 'japones': 'ja_jp'}

In [None]:
for key, value in lan.items():
    # Check that the dataset folder is not empty (not implemented here)

    # Only 1% of the test split is requested to speed up loading;
    # the download size reported below is still that of the whole language config.
    tfds.load(f'xtreme_s/fleurs.{value}', split='test[:1%]', shuffle_files=False, data_dir=data_dir / key)

Downloading and preparing dataset 2.14 GiB (download: 2.14 GiB, generated: 4.80 GiB, total: 6.93 GiB) to data/español/xtreme_s/fleurs.es_419/2.0.0...


Dl Completed...: 0 url [00:00, ? url/s]

Dl Size...: 0 MiB [00:00, ? MiB/s]

Extraction completed...: 0 file [00:00, ? file/s]

Generating splits...:   0%|          | 0/3 [00:00<?, ? splits/s]

Generating train examples...:   0%|          | 0/2796 [00:00<?, ? examples/s]

Shuffling data/español/xtreme_s/fleurs.es_419/incomplete.GSBY1Q_2.0.0/xtreme_s-train.tfrecord*...:   0%|      …

Generating validation examples...:   0%|          | 0/408 [00:00<?, ? examples/s]

Shuffling data/español/xtreme_s/fleurs.es_419/incomplete.GSBY1Q_2.0.0/xtreme_s-validation.tfrecord*...:   0%| …

Generating test examples...:   0%|          | 0/908 [00:00<?, ? examples/s]

Shuffling data/español/xtreme_s/fleurs.es_419/incomplete.GSBY1Q_2.0.0/xtreme_s-test.tfrecord*...:   0%|       …



Dataset xtreme_s downloaded and prepared to data/español/xtreme_s/fleurs.es_419/2.0.0. Subsequent calls will reuse this data.
Downloading and preparing dataset 1.72 GiB (download: 1.72 GiB, generated: 3.76 GiB, total: 5.48 GiB) to data/ingles/xtreme_s/fleurs.en_us/2.0.0...


Dl Completed...: 0 url [00:00, ? url/s]

Dl Size...: 0 MiB [00:00, ? MiB/s]

Extraction completed...: 0 file [00:00, ? file/s]

Generating splits...:   0%|          | 0/3 [00:00<?, ? splits/s]

Generating train examples...:   0%|          | 0/2602 [00:00<?, ? examples/s]

Shuffling data/ingles/xtreme_s/fleurs.en_us/incomplete.KXUCD4_2.0.0/xtreme_s-train.tfrecord*...:   0%|        …

Generating validation examples...:   0%|          | 0/394 [00:00<?, ? examples/s]

Shuffling data/ingles/xtreme_s/fleurs.en_us/incomplete.KXUCD4_2.0.0/xtreme_s-validation.tfrecord*...:   0%|   …

Generating test examples...:   0%|          | 0/647 [00:00<?, ? examples/s]

Shuffling data/ingles/xtreme_s/fleurs.en_us/incomplete.KXUCD4_2.0.0/xtreme_s-test.tfrecord*...:   0%|         …



Dataset xtreme_s downloaded and prepared to data/ingles/xtreme_s/fleurs.en_us/2.0.0. Subsequent calls will reuse this data.
Downloading and preparing dataset 2.08 GiB (download: 2.08 GiB, generated: 4.73 GiB, total: 6.81 GiB) to data/frances/xtreme_s/fleurs.fr_fr/2.0.0...


Dl Completed...: 0 url [00:00, ? url/s]

Dl Size...: 0 MiB [00:00, ? MiB/s]

Extraction completed...: 0 file [00:00, ? file/s]
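
The download loop mentions checking that the dataset folder is not empty, but does not implement the check. A minimal sketch of such a check follows; `is_prepared` is a hypothetical helper name, not part of TFDS. It is only a heuristic: an interrupted download can leave partial files behind (note the `incomplete.*` directories in the log), and TFDS itself already reuses data that was fully prepared.

```python
from pathlib import Path


def is_prepared(lang_dir):
    """Return True if lang_dir exists and contains at least one file, at any depth."""
    lang_dir = Path(lang_dir)
    return lang_dir.is_dir() and any(p.is_file() for p in lang_dir.rglob('*'))


# Possible use inside the download loop (assumes `lan`, `data_dir` and `tfds`
# are defined as in the cells above):
#
# for key, value in lan.items():
#     if is_prepared(data_dir / key):
#         print(f"{key}: already on disk, skipping")
#         continue
#     tfds.load(f'xtreme_s/fleurs.{value}', split='test[:1%]', data_dir=data_dir / key)
```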

In [None]:
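ds, info = tfds.load('xtreme_s/fleurs.es_419', split='train', data_dir=data_dir / 'español', with_info=True)
tfds.as_dataframe(ds.take(4), info)  # Preview a few examples only; the full split is large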

In [None]:
español = tfds.builder('xtreme_s/fleurs.es_419')

In [None]:

print(español)

<tensorflow_datasets.audio.xtreme_s.xtreme_s.XtremeS object at 0x7bb6fc896aa0>


In [None]:
ingles = tfds.builder('xtreme_s/fleurs.en_us')

In [None]:
ingles.download_and_prepare()

Downloading and preparing dataset 1.72 GiB (download: 1.72 GiB, generated: 3.76 GiB, total: 5.48 GiB) to /root/tensorflow_datasets/xtreme_s/fleurs.en_us/2.0.0...


Dl Completed...: 0 url [00:00, ? url/s]

Dl Size...: 0 MiB [00:00, ? MiB/s]

Extraction completed...: 0 file [00:00, ? file/s]

KeyboardInterrupt: 

In [None]:
datasets = tfds.load("xtreme_s/fleurs.en_us")
train_dataset, test_dataset = datasets["train"], datasets["test"]
assert isinstance(train_dataset, tf.data.Dataset)

Downloading and preparing dataset 1.72 GiB (download: 1.72 GiB, generated: 3.76 GiB, total: 5.48 GiB) to /root/tensorflow_datasets/xtreme_s/fleurs.en_us/2.0.0...


Dl Completed...: 0 url [00:00, ? url/s]

Dl Size...: 0 MiB [00:00, ? MiB/s]

Extraction completed...: 0 file [00:00, ? file/s]

In [None]:
train_ds, val_ds = tf.keras.utils.audio_dataset_from_directory(
    directory=data_dir,
    batch_size=64,
    validation_split=0.2,
    seed=0,
    output_sequence_length=16000,
    subset='both')

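# Class names are taken from the subdirectory names in alphanumeric order, which may differ from the order of lan.keys()
label_names = np.array(train_ds.class_names)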
print()
print("label names:", label_names)

Found 0 files belonging to 4 classes.
Using 0 files for training.


ValueError: No training audio files found in directory data. Allowed format(s): ('.wav',)
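
The error above arises because `audio_dataset_from_directory` looks for `.wav` files inside one subdirectory per class, and the `data` directory contains none: TFDS stores its prepared examples as `tfrecord` shards (see the file names in the download log). One possible workaround is to export the decoded clips to WAV files first. The sketch below uses only NumPy and the standard-library `wave` module. `save_wav` is a hypothetical helper, and it assumes the `audio` feature holds mono integer PCM samples in the 16-bit range at 16 kHz; verify this against the dataset info before relying on it.

```python
import wave
from pathlib import Path

import numpy as np


def save_wav(path, samples, sample_rate=16000):
    """Write a 1-D array of integer PCM samples to a mono 16-bit WAV file."""
    # ASSUMPTION: samples are integers already scaled to the int16 range.
    pcm = np.clip(np.asarray(samples), -32768, 32767).astype('<i2')
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with wave.open(str(path), 'wb') as wav_file:
        wav_file.setnchannels(1)      # mono
        wav_file.setsampwidth(2)      # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())


# Possible use (assumes `ds` is one language's tf.data.Dataset loaded with
# tfds.load, and that each example has an 'audio' key):
#
# for i, example in enumerate(tfds.as_numpy(ds.take(100))):
#     save_wav(Path('wav') / 'español' / f'clip_{i}.wav', example['audio'])
```

With one folder per language under a common root (here the hypothetical `wav` directory), `audio_dataset_from_directory` can be pointed at that root instead of `data`.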

In [None]:
# prompt: unzip 10 files from español folder

import os
import zipfile

def unzip_files(folder_path, num_files):
  """Unzips a specified number of files from a folder.

  Args:
    folder_path: The path to the folder containing the zip files.
    num_files: The number of zip files to unzip.
  """
  zip_files = [f for f in os.listdir(folder_path) if f.endswith('.zip')]
  if not zip_files:
    print(f"No zip files found in {folder_path}")
    return

  files_to_unzip = zip_files[:min(num_files, len(zip_files))]  # Limit to num_files or available files
  for zip_file in files_to_unzip:
    zip_file_path = os.path.join(folder_path, zip_file)
    try:
      with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
        zip_ref.extractall(folder_path)
        print(f"Unzipped: {zip_file}")
    except zipfile.BadZipFile:
      print(f"Error: {zip_file} is not a valid zip file.")
    except Exception as e:
      print(f"Error unzipping {zip_file}: {e}")

# Example usage (replace with your actual folder path and number of files):
unzip_files('data/español', 10)

Error: audio_español.zip is not a valid zip file.


# Exploratory Analysis


In [None]:
train_ds.element_spec

TypeError: unsupported operand type(s) for /: 'PosixPath' and 'set'

# Convolutional Model
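
A possible starting point for this section is sketched below. It is not the notebook's final model. It assumes each clip has been converted to a spectrogram of shape `(124, 129, 1)` (what a 16000-sample clip yields with an STFT of frame length 255 and step 128) and that there are four classes, one per entry in `lan`. The function name `build_cnn` and all layer sizes are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models


def build_cnn(input_shape=(124, 129, 1), num_classes=4):
    """Small 2-D CNN that treats the spectrogram as a single-channel image."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Resizing(32, 32),                  # downsample to speed up training
        layers.Conv2D(32, 3, activation='relu'),
        layers.Conv2D(64, 3, activation='relu'),
        layers.MaxPooling2D(),
        layers.Dropout(0.25),
        layers.Flatten(),
        layers.Dense(128, activation='relu'),
        layers.Dropout(0.5),
        layers.Dense(num_classes),                # raw logits, one per language
    ])


cnn = build_cnn()
cnn.compile(
    optimizer='adam',
    # The last layer outputs logits, so the loss applies the softmax itself.
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'],
)
cnn.summary()
```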

# Recurrent Model
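
A possible starting point for this section is sketched below, again as an illustration rather than the notebook's final model. It reads the spectrogram as a sequence of time steps, each described by a vector of frequency bins. It assumes spectrograms of shape `(124, 129, 1)` and four classes, one per entry in `lan`. The name `build_rnn`, the choice of a bidirectional GRU and the layer sizes are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models


def build_rnn(input_shape=(124, 129, 1), num_classes=4):
    """Bidirectional GRU over the time axis of the spectrogram."""
    time_steps, freq_bins, _ = input_shape
    return models.Sequential([
        layers.Input(shape=input_shape),
        # Drop the channel axis: (time, freq, 1) -> (time, freq).
        layers.Reshape((time_steps, freq_bins)),
        layers.Bidirectional(layers.GRU(64)),     # summarises the whole sequence
        layers.Dropout(0.3),
        layers.Dense(num_classes),                # raw logits, one per language
    ])


rnn = build_rnn()
rnn.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'],
)
rnn.summary()
```

Using the same input shape, loss and metric as a convolutional model keeps the comparison between the two architectures straightforward.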