In [None]:
#| hide
%load_ext autoreload
%autoreload 2

In [None]:
#| default_exp datasets/mics_databases

# Class for RIR measurement databases
> Class to get the acoustic time-series and other meta-data of RIR acoustic measurements 

In [None]:
#| export
import os
from pathlib import Path
import numpy as np
from torchvision.datasets.utils import download_url, extract_archive
# For testing and adding methods to a class as patches
from fastcore.all import patch, test_eq
# For abstract base classes
from abc import ABC, abstractmethod
# For type hinting
from typing import Optional, List, Union, Tuple, ClassVar
from urllib.error import URLError
from scipy.io import loadmat
import json

## A. Helper functions

We will define many class properties with ``@property``. To make sure every attribute is initialized before it is used, we define the following helper function.

In [None]:
#| hide
from nbdev.showdoc import show_doc

import pkg_resources, importlib


In [None]:
#| exporti 
#| hide 

def checked_property(attr_name: str, # string with the name of the protected attribute to access, example: '_fs'
                     attr_type: type = object, # Type of the attribute: for _fs for example is float
                     doc: Optional[str] = None # String containing a description of the class attribute 
                     ):
    """
    Ensures that the attribute is initialized before accessing it.
    """
    def getter(self):
        value = getattr(self, attr_name)
        if value is None:
            raise ValueError(f"Attribute '{attr_name}' is not initialized.")
        return value
    
    prop = property(getter)
    if doc:
        prop.__doc__ = doc
    return prop

In [None]:
show_doc(checked_property)

---

[source](https://github.com/Ramon-PR/DataScience_exploration/blob/main/DataScience_exploration/datasets/mics_databases.py#L24){target="_blank" style="float:right; font-size:smaller"}

### checked_property

>      checked_property (attr_name:str, attr_type:type=<class 'object'>,
>                        doc:Optional[str]=None)

*Ensures that the attribute is initialized before accessing it.*

|    | **Type** | **Default** | **Details** |
| -- | -------- | ----------- | ----------- |
| attr_name | str |  | string with the name of the protected attribute to access, example: '_fs' |
| attr_type | type | object | Type of the attribute: for _fs for example is float |
| doc | Optional | None | String containing a description of the class attribute |

Example of use:  


In [None]:
class Mics(ABC):
    _fs: Optional[float] = None  # not initialized yet
    fs = checked_property('_fs', float)

If we read ``fs`` before ``_fs`` has been initialized, the property raises a ``ValueError``:

In [None]:
mic = Mics()
print(mic._fs)  # ``_fs`` is None,
try:
    print(mic.fs)  # ❌ But the property ``fs`` requires _fs to be initialized to a float value
except ValueError as e:
    print(f"Caught ValueError: {e}")


None
Caught ValueError: Attribute '_fs' is not initialized.
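
Note that `checked_property` accepts `attr_type` but only checks for `None`. As a minimal sketch, assuming a stricter check is wanted, a variant could also validate the type. The names `typed_checked_property` and `Sensor` are hypothetical and are not part of the exported module.

```python
from typing import Optional


def typed_checked_property(attr_name: str, attr_type: type = object,
                           doc: Optional[str] = None) -> property:
    """Hypothetical variant of checked_property that also checks the type."""
    def getter(self):
        value = getattr(self, attr_name)
        if value is None:
            raise ValueError(f"Attribute '{attr_name}' is not initialized.")
        if not isinstance(value, attr_type):
            raise TypeError(
                f"Attribute '{attr_name}' should be {attr_type.__name__}, "
                f"got {type(value).__name__}."
            )
        return value

    return property(getter, doc=doc)


class Sensor:  # hypothetical class, used only for this illustration
    _fs: Optional[float] = None
    fs = typed_checked_property("_fs", float, "Sampling frequency in Hz")


sensor = Sensor()
sensor._fs = 48000  # an int, not a float
try:
    sensor.fs
except TypeError as e:
    print(f"Caught TypeError: {e}")

sensor._fs = 48000.0
print(sensor.fs)
```

A strict check like this would also reject an integer sampling frequency read from a file, so whether it is desirable depends on how the subclasses fill in the attributes.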


## B. Database for microphones
> The base class to handle RIR measurements.



This class defines the properties and methods shared by the different RIR databases that inherit from it.
`DB_microphones` is an abstract class (`from abc import ABC, abstractmethod`):

+ **ABC**: base class used to declare an **A**bstract **B**ase **C**lass  
+ **abstractmethod**: decorator that marks the methods every subclass has to implement  

This is useful because the base class cannot be instantiated, and every subclass is forced to implement the methods marked with `abstractmethod` before it can be used.
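
The effect of `ABC` and `abstractmethod` can be seen in a small standalone sketch. The classes `Database`, `Incomplete` and `Complete` are hypothetical stand-ins, not the classes defined in this notebook.

```python
from abc import ABC, abstractmethod


class Database(ABC):  # hypothetical stand-in for the base class
    @abstractmethod
    def get_mic(self, imic: int) -> list:
        ...


class Incomplete(Database):  # does not override get_mic
    pass


class Complete(Database):  # overrides every abstract method
    def get_mic(self, imic: int) -> list:
        return [0.0, 0.0]


# Both the abstract base class and the incomplete subclass refuse to instantiate
for klass in (Database, Incomplete):
    try:
        klass()
    except TypeError as e:
        print(f"{klass.__name__}: {e}")

# The complete subclass works as usual
print(Complete().get_mic(0))
```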



Inspired by MNIST dataset, we will download the data in a folder structure like `./root/class_name/raw`.

+ **root**: is a parameter passed to the class
+ **class_name**: is the name of the class used to download the database  
+ **raw**: is the subfolder where the raw data is downloaded  

We will also include a `mirrors` list with the urls where the data can be downloaded, and a `resources` list of tuples, each holding the name of a file to download and its md5 checksum.
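
As a rough sketch of this layout, the two helpers below build the raw folder path and the download url of a resource. The helper names, the `example.org` mirror and the `roomA.mat` resource are placeholders chosen for the illustration.

```python
from pathlib import Path


def raw_folder(root: str, class_name: str) -> Path:
    """Folder where the raw files of a database class are stored."""
    return Path(root) / class_name / "raw"


def resource_url(mirror: str, filename: str) -> str:
    """Join mirror and filename with exactly one '/' (urls always use '/')."""
    return mirror.rstrip("/") + "/" + filename


# Placeholder values, only for this illustration
mirrors = ["https://example.org/data/"]
resources = [("roomA.mat", "0" * 32)]  # (filename, md5 checksum)

folder = raw_folder("./data", "ZeaRIR")
for filename, md5 in resources:
    print(f"{resource_url(mirrors[0], filename)} -> {folder / filename} (md5 {md5})")
```

Building the url by string concatenation rather than with `os.path.join` keeps the separator a forward slash on every operating system.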



### Base class

In [None]:
#| export
class DB_microphones(ABC):
    """
        Base class for microphone databases.
        Defines methods: get_mic, get_pos, get_time and class @property such as .fs, .nt, .n_mics, .n_sources, ...
    """

    # ClassVar tells Pylance that these are class variables, not instance variables.
    # They default to empty lists; __init_subclass__ below rejects subclasses that leave them empty.
    mirrors: ClassVar[list[str]] = [] # List of urls to download the data from.
    resources: ClassVar[list[tuple[str, str]]] = [] # List with tuples (filename, md5) for the files to download.

    # This method is called when a subclass is defined. hasattr() would always be True here,
    # because the attributes are inherited from this base class, so we check that they are non-empty.
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if not cls.resources:
            raise NotImplementedError(f"{cls.__name__} must define class attribute 'resources'")
        if not cls.mirrors:
            raise NotImplementedError(f"{cls.__name__} must define class attribute 'mirrors'")


    _fs: Optional[float] # Using Optional to indicate that these attributes can be None until initialized
    _nmics: Optional[int]
    _nt: Optional[int]
    _n_sources: Optional[int]
    _source_id: Optional[int]
    _signal_size: Optional[int]
    _signal_start: Optional[int]
    
    def __init__(self, 
                 root: str = "./data", # Path to the root directory of the database, where the data will be downloaded 
                 dataname: str = "RIR", # String matching the name of the resources to download and load. (if several resources are available, all will be downloaded but only the first one will be loaded). 
                 signal_start: int = 0, # Start index of the signal in the data
                 signal_size: Optional[int] = None, # int or None. Size of the signal to be extracted from the data, if None, the whole signal will be loaded.
                 ):
        
        self.root = root
        self._signal_start = signal_start
        self._signal_size = signal_size
        self._nt = None

        self._fs = None
        self._nmics = None
        self._n_sources = None
        self._source_id = None

        self._resources_to_download =  self._matching_resources(pattern=dataname)
        self._resource_to_load = self._resources_to_download[0][0] if self._resources_to_download else None
        assert self._resource_to_load is not None, f"No resources found matching '{dataname}'."

        self._resource_datapath = os.path.join(self.raw_folder, self._resource_to_load)
        
        # Create the root directory if it does not exist
        Path(self.root).mkdir(parents=True, exist_ok=True)

    @abstractmethod
    def load_data(self, filepath: str):
        """ Load the data from the given filepath."""
        pass

    # Validated properties with documentation
    fs = checked_property("_fs", float, "Sampling frequency in Hz")
    n_mics = checked_property("_nmics", int, "Number of microphones")
    nt = checked_property("_nt", int, "Number of total time samples in the database")
    n_sources = checked_property("_n_sources", int, "Number of sound sources")
    source_id = checked_property("_source_id", int, "ID of the current source")
    signal_size = checked_property("_signal_size", int, "Size of the signal to extract")
    signal_start = checked_property("_signal_start", int, "Start index of the signal")


    @property
    def raw_folder(self) -> str:
        """ Returns the path to the raw data folder. ./data/class_name/raw """
        return os.path.join(self.root, self.__class__.__name__, "raw")

    @property
    def dt(self) -> float:
        return 1.0 / self.fs  
    
    @abstractmethod
    def get_mic(self, imic: int, start: Optional[int]=None, size: Optional[int]=None) -> np.ndarray:
        pass

    @abstractmethod
    def get_pos(self, imic: int) -> np.ndarray:
        pass

    def get_time(self, start: Optional[int]=None, size: Optional[int]=None) -> np.ndarray:        
        start = start if start is not None else self.signal_start
        size = size if size is not None else self.signal_size
        assert isinstance(start, int)
        assert isinstance(size, int)
        return (start + np.arange(size)) * self.dt
    
    def _matching_resources(self,
                         pattern: str, # pattern to look for in resource names
                         ) -> list:
        """ Return the (filename, md5) tuples whose filename contains the pattern (case-insensitive). """

        if not hasattr(self, 'resources'):
            print("No resources found.")
            return []

        # self.resources is a list of tuples (resource_name, md5_checksum)
        matches = [(res, md5) for res, md5 in self.resources if pattern.lower() in res.lower()]
        return matches

    
    def _download_resource(self, 
                           resource_name: str, # name of the resource to download
                            ) -> None:
        
        """ download a resource by its name """
        
        if not hasattr(self, 'resources'):
            print("No resources found.")
            return

        # Check the matching resources
        down_resources = self._matching_resources(pattern = resource_name)
        if not down_resources:
            print(f"No resources found matching '{resource_name}'.")
            return

        for file, md5 in down_resources:
            errors = []
            for mirror in self.mirrors:
                url = f"{mirror.rstrip('/')}/{file}"  # os.path.join would insert backslashes on Windows
                try:
                    if not os.path.isfile(os.path.join(self.raw_folder, file)):
                        print(f"Downloading {file} from {mirror}")
                        download_url(url=url, root=self.raw_folder, filename=file, md5=md5)

                except URLError as e:
                    errors.append(e)
                    continue
                break
            else:
                s = f"Error downloading {file}:\n"
                for mirror, err in zip(self.mirrors, errors):
                    s += f"Tried {mirror}, got:\n{str(err)}\n"
                raise RuntimeError(s)
            
    def _prepare_download_and_unpack(self, 
                          dataname: str, # String matching the name of the resources to download
                          unpack: bool = True
                          ) -> str:
        """
        Common workflow: match resource, download if needed, and unpack.
        Child classes should call this and then load the data with their own load_data.
        """
        matched_res = self._matching_resources(dataname)
        if not matched_res:
            raise ValueError(f"No resources found matching '{dataname}'.")

        print("Matched resources to download:")
        for res, _ in matched_res:
            print(f"- {res}")

        # Download the resource if it does not exist in the raw folder 
        self._download_resource(resource_name=dataname)

        # Unpack the resource if needed
        if unpack:
            self.data_folder = self._unpack_resource() # Unpacked folder with the data
            return self.data_folder
        else:
            self.data_path = self._resource_datapath # If it does not need unpacking, just return the path to the resource file
            return self._resource_datapath


    def _unpack_resource(self) -> str:
        """ Unpack the resource if it is compressed. """

        # Path of the resource before unpacking
        assert os.path.exists(self._resource_datapath), f"Resource {self._resource_datapath} does not exist. Please download it first." 

        # Check if the unpacked folder is already there
        unpacked_folder = os.path.splitext(self._resource_datapath)[0]

        if os.path.exists(unpacked_folder):
            print(f"Unpacked folder {unpacked_folder} already exists. Skipping unpacking.")
            return unpacked_folder
        
        else:
            try:
                extract_archive(from_path=self._resource_datapath)
                print(f"Unpacked {self._resource_datapath} to {unpacked_folder}")
                return unpacked_folder
            
            except RuntimeError as e:
                print(f"Error unpacking {self._resource_datapath}: {e}")
                return unpacked_folder  # Return the folder even if there was an error
                
                
    @classmethod
    def print_resources(cls):
        print(f"Resources for class {cls.__name__}:")
        for name, md5 in cls.resources:
            print(f"- {name} ")


    def __str__(self):
        return (
            f"Database: {self.__class__.__name__}\n"
            f"Download: {[resname for resname, _ in self._resources_to_download] }\n"
            f"Load room: {self._resource_to_load}\n"
            f"Path to raw resource: {self._resource_datapath}\n"
            f"Path to unpacked data folder: {self.data_folder}\n"
            f"Sampling frequency: {self.fs} Hz\n"
            f"Number of microphones: {self.n_mics}\n"
            f"Number of total time samples: {self.nt}\n"
            f"Number of time samples selected: {self.signal_size}\n"
            f"Number of sources: {self.n_sources}\n"
            f"Signal start: {self.signal_start}\n"
            f"Signal size: {self.signal_size}\n"
            f"Source ID: {self.source_id}"
        )
    
    def _check_bounds_in_sample_size(self, number_of_time_samples: int) -> None:
        """ Check if the start and size are within the bounds of the signal size. """

        T = number_of_time_samples
        assert self._signal_start is not None
        start_sample = self._signal_start
        if self._signal_size is None:
            self._signal_size = T - start_sample

        assert self._signal_size is not None
        last_sample = self._signal_start + self._signal_size

        assert (start_sample >= 0 and start_sample < T), f"signal_start should be in [0, {T-1}]."
        assert (last_sample > 0 and last_sample <= T), f"signal_size should be in [1, {T-start_sample}]."
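
The window logic of `_check_bounds_in_sample_size` can be tried in isolation. The function below is a simplified sketch, not the class method: the name `resolve_window` is hypothetical, it raises `ValueError` instead of using `assert`, and it requires a size of at least 1.

```python
from typing import Optional, Tuple


def resolve_window(total: int, start: int = 0,
                   size: Optional[int] = None) -> Tuple[int, int]:
    """Return (first_sample, last_sample_exclusive) for a signal of `total` samples."""
    if not 0 <= start < total:
        raise ValueError(f"start should be in [0, {total - 1}], got {start}.")
    if size is None:
        size = total - start  # default: everything from start to the end
    if not 1 <= size <= total - start:
        raise ValueError(f"size should be in [1, {total - start}], got {size}.")
    return start, start + size


print(resolve_window(3623, start=0, size=128))  # first 128 samples
print(resolve_window(3623, start=3600))         # from sample 3600 to the end
try:
    resolve_window(3623, start=0, size=4000)    # longer than the signal
except ValueError as e:
    print(f"Caught ValueError: {e}")
```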

### Zea database
> Database from [Elias Zea](https://www.sciencedirect.com/science/article/abs/pii/S0022460X19304316). It inherits from DB_microphones.

This is one of the RIR databases. It has to define its own attributes:  

+ `mirrors`  
+ `resources`  
+ microphone spacing (`_dx`)  

And the methods:  

+ To check what resource to load  
+ To download the resources  
+ To unpack the downloaded resources  
+ To load the selected resource (database/dataname)  
+ To get the different attributes in the database: `dx`, `dt`, `fs`, `n_mics`, `n_sources`  
+ And also the data related to the microphone recordings: `imic`, `position`, `time_samples`, `signal`  
     

In [None]:
#| export
class ZeaRIR(DB_microphones):
    """ ZeaRIR database. """

    mirrors = [
            "https://raw.githubusercontent.com/eliaszea/RIRIS/main/dependencies/measurementData/"
        ]

    resources = [
            ("BalderRIR.mat", "bc904010041dc18e54a1a61b23ee3f99"),
            ("FrejaRIR.mat", "1dedf2ab190ad48fbfa9403409418a1d"),
            ("MuninRIR.mat", "5c90de0cbbc61128de332fffc64261c9"),
        ]
    
    _dx = 3e-2  # Distance between microphones in meters, as per the database documentation.

    def __init__(self,
                 root: str = "./data", # Path to the root directory of the database, where the data will be downloaded
                 dataname: str = "Balder", # String matching the name of the resources to download and load. (if several resources are available, all will be downloaded but only the first one will be loaded). 
                 signal_start: int = 0, # Start index of the signal to load.
                 signal_size: Optional[int] = None, # int or None. Size of the signal to be extracted from the data, if None, the whole signal will be loaded.
                 ):
        super().__init__(root, dataname, signal_start, signal_size)

        # Prepare the download and unpack the resource
        filepath = self._prepare_download_and_unpack(dataname, unpack=False)
        
        self.data_folder = self.raw_folder

        # The resource *.mat is not unpacked, so we can load it directly.
        assert isinstance(filepath, str), f"Check if your resource has to be unpacked or not."
        self.load_data(filepath)


    def load_data(self, filepath: str):
        """ Loads all the Matlab data from the given filepath."""
        print(f"Loading the resource {filepath} ...")
        _rawdata = loadmat(filepath, simplify_cells=True)
        self._fs = _rawdata['out']['fs']

        T = _rawdata['out']['T']
        M = _rawdata['out']['M']

        # Check if the signal_start and signal_size are within the bounds of the signal size
        self._check_bounds_in_sample_size(number_of_time_samples=T)
        assert isinstance(self._signal_start, int) and isinstance(self._signal_size, int)
        start_sample = self._signal_start
        last_sample = self._signal_start + self._signal_size

        self._RIR = _rawdata['out']['image'][start_sample:last_sample, :]  # Shape (signal_size, n_mics): rows are time samples, columns are microphones

        self._nmics = M
        self._nt = T
        self._n_sources = 1
        self._source_id = 0

    def get_mic(self, imic: int, start: Optional[int]=None, size: Optional[int]=None) -> np.ndarray:
        """ Returns the signal of the microphone imic, starting at index start and with size size. """
        start = start if start is not None else self.signal_start
        size = size if size is not None else self.signal_size
        assert isinstance(start, int)
        assert isinstance(size, int)
        return self._RIR[start:start + size, imic]
    
    def get_pos(self, imic: int) -> np.ndarray:
        """ Returns the position of the microphone imic in meters (x, y, z) """
        assert 0 <= imic < self.n_mics, f"Microphone index {imic} out of range [0, {self.n_mics - 1}]"
        return  np.array([imic * self._dx, 0, 0])

    def _unpack_resource(self):
        """ .mat files do not need to be uncompressed. To avoid confusion, return the path to the resource file directly. """
        return self._resource_datapath

#### Checks that Zea database works

In [None]:
db = ZeaRIR(root="./data", dataname="RIR", signal_start=0, signal_size=128)

Matched resources to download:
- BalderRIR.mat
- FrejaRIR.mat
- MuninRIR.mat
Loading the resource ./data/ZeaRIR/raw/BalderRIR.mat ...


The class checks which resources match the dataname "RIR" and finds three. It downloads all the matching resources but loads only the first one, "Balder", because each object of this class should return signals from a single room. To load another room, pass a dataname that matches only that resource.  
If the resources are already in the folder, the download is skipped:  

In [None]:
db._download_resource(resource_name="Balder") # Just return (no error message) because "BalderRIR.mat" is in the raw folder 
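
The matching step is a case-insensitive substring test on the file names. A standalone sketch of the same rule, using the `ZeaRIR` resource list and a free function `matching_resources` (a hypothetical name, the class uses the method `_matching_resources`):

```python
from typing import List, Tuple

# Resource list copied from the ZeaRIR class: (filename, md5 checksum)
RESOURCES = [
    ("BalderRIR.mat", "bc904010041dc18e54a1a61b23ee3f99"),
    ("FrejaRIR.mat", "1dedf2ab190ad48fbfa9403409418a1d"),
    ("MuninRIR.mat", "5c90de0cbbc61128de332fffc64261c9"),
]


def matching_resources(pattern: str,
                       resources: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep the resources whose filename contains `pattern`, ignoring case."""
    return [(name, md5) for name, md5 in resources if pattern.lower() in name.lower()]


print([name for name, _ in matching_resources("RIR", RESOURCES)])     # all three rooms
print([name for name, _ in matching_resources("balder", RESOURCES)])  # a single room
print(matching_resources("Odin", RESOURCES))                          # no match
```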

And we can check that the correct room and its parameters are properly loaded

In [None]:
print(db)

Database: ZeaRIR
Download: ['BalderRIR.mat', 'FrejaRIR.mat', 'MuninRIR.mat']
Load room: BalderRIR.mat
Path to raw resource: ./data/ZeaRIR/raw/BalderRIR.mat
Path to unpacked data folder: ./data/ZeaRIR/raw
Sampling frequency: 11250 Hz
Number of microphones: 100
Number of total time samples: 3623
Number of time samples selected: 128
Number of sources: 1
Signal start: 0
Signal size: 128
Source ID: 0


We can check the data that has been loaded into memory and that the main get methods work:

In [None]:
print(f"Loaded chunk of data of size {db._RIR.shape}")
print(f"Output of get_mic  (4 time samples): {db.get_mic(imic=0, start=0, size=4)}")
print(f"Output of get_time (4 time samples): {db.get_time(start=0, size=4)}")
print(f"Test of get_pos: {db.get_pos(imic=1)}")


Loaded chunk of data of size (128, 100)
Output of get_mic  (4 time samples): [ 0.00041836  0.0001148  -0.00129174  0.00162724]
Output of get_time (4 time samples): [0.00000000e+00 8.88888889e-05 1.77777778e-04 2.66666667e-04]
Test of get_pos: [0.03 0.   0.  ]
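
The values returned by `get_time` and `get_pos` follow directly from the sampling frequency and the microphone spacing. The sketch below reproduces the arithmetic with plain Python lists instead of NumPy arrays; the function names are hypothetical, and the constants are the ones printed above for the Balder room.

```python
FS = 11250  # sampling frequency in Hz, as printed by print(db)
DX = 3e-2   # microphone spacing in meters (ZeaRIR._dx)


def time_axis(fs: float, start: int, size: int) -> list:
    """Time in seconds of samples start, start+1, ..., start+size-1."""
    dt = 1.0 / fs
    return [(start + k) * dt for k in range(size)]


def mic_position(imic: int, dx: float = DX) -> tuple:
    """Microphones lie on the x axis, separated by dx."""
    return (imic * dx, 0.0, 0.0)


print(time_axis(FS, start=0, size=4))  # multiples of 1/11250 s
print(mic_position(1))                 # second microphone, 3 cm from the first
```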


Before implementing the download method, I used the following code to test how to download the resources and to compute the MD5 checksum to write for each resource (the given mirror does not provide it).

In [None]:
from torchvision.datasets.utils import calculate_md5, check_md5

In [None]:
# db = ZeaRIR(root="./data")
for file, md5_class in db.resources:
    url = f"{db.mirrors[0].rstrip('/')}/{file}"  # build the url with "/" on every operating system
    download_url(url, root=db.raw_folder, filename=file)
    md5 = calculate_md5(os.path.join(db.raw_folder, file))
    print(f"File: {file}, MD5: {md5}")
    assert check_md5(os.path.join(db.raw_folder, file), md5_class), (
    f"Check the MD5 of the resource '{file}' for the class '{db.__class__.__name__}' "
)

File: BalderRIR.mat, MD5: bc904010041dc18e54a1a61b23ee3f99
File: FrejaRIR.mat, MD5: 1dedf2ab190ad48fbfa9403409418a1d
File: MuninRIR.mat, MD5: 5c90de0cbbc61128de332fffc64261c9
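
If torchvision is not available, an MD5 checksum can also be computed with the standard library. This is a sketch using `hashlib`, not the torchvision implementation; `file_md5` is a hypothetical helper and the temporary file only stands in for a downloaded resource.

```python
import hashlib
import tempfile
from pathlib import Path


def file_md5(path, chunk_size: int = 1024 * 1024) -> str:
    """MD5 hex digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a small temporary file
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"hello")
    checksum = file_md5(sample)

print(f"File: sample.bin, MD5: {checksum}")
```

The resulting string can be compared with the second element of each tuple in `resources`.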


It may be useful to check the names of the resources before instantiating an object (which starts the download process).  
A class method prints the resources that can be downloaded.

In [None]:
ZeaRIR.print_resources()

Resources for class ZeaRIR:
- BalderRIR.mat 
- FrejaRIR.mat 
- MuninRIR.mat 


::: {.callout-note}
I am developing with nbdev, which uses the `patch` decorator from the `fastcore` library. It makes it possible to implement a method outside the class definition by declaring which class the method has to be "patched" onto.  
In the autogenerated .py file the patched method appears in a form that I am not very familiar with, so I use `patch` here only for didactic purposes; the exported code already has the method inside the class definition.
:::



In [None]:
@patch(cls_method=True)  
def print_resources(cls: DB_microphones):
    print(f"!!Method overwritten by a patch!!")
    print(f"Resources for class {cls.__name__}:")
    for name, md5 in cls.resources:
        print(f"- {name} ")



::: {.callout-note}
Pylance does not like `patch` and underlines it as a possible error.  
I have added the method directly to the class (the `patch` code here is just for testing purposes).
([This is a callout from Quarto](https://quarto.org/docs/authoring/callouts.html#callout-types))
:::


I can overwrite the method with `patch` (note the extra first line in the output):

In [None]:
ZeaRIR.print_resources()

!!Method overwritten by a patch!!
Resources for class ZeaRIR:
- BalderRIR.mat 
- FrejaRIR.mat 
- MuninRIR.mat 
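
Conceptually, patching a class method amounts to assigning a new `classmethod` to the class after it has been defined. The sketch below shows that idea in plain Python; it is not fastcore's actual implementation, and `Base` and `Child` are hypothetical classes.

```python
class Base:
    resources = [("a.mat", "md5-a")]

    @classmethod
    def print_resources(cls):
        return [name for name, _ in cls.resources]


class Child(Base):
    resources = [("b.zip", "md5-b")]


def print_resources(cls):
    """Replacement defined outside the class body."""
    return ["patched"] + [name for name, _ in cls.resources]


# Attach the function to Base as a class method; subclasses inherit it
Base.print_resources = classmethod(print_resources)

print(Base.print_resources())
print(Child.print_resources())
```

Because the method is looked up through inheritance, replacing it on the base class also changes what the subclasses see, which matches the behaviour observed with `ZeaRIR` above.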


### MeshRIR database
> Database from [Shoichi Koyama](https://arxiv.org/abs/2106.10801), National Institute of Informatics, Tokyo, Japan. It inherits from DB_microphones.

In [None]:
#| export
class MeshRIR(DB_microphones):

    mirrors = [
        "https://zenodo.org/records/10852693/files/"
    ]

    resources = [
        ("S1-M3969_npy.zip", "2cb598eb44bb9905560c545db7af3432" ),
        ("S32-M441_npy.zip", "9818fc66b36513590e7abd071243d8e9"), 
    ]

    
    def __init__(self,
                 root: str = "./data", # Path to the root directory of the database, where the data will be downloaded
                 dataname: str = "S1", # String matching the name of the resources to download and load. (if several resources are available, all will be downloaded but only the first one will be loaded). 
                 signal_start: int = 0, # Start index of the signal to load.
                 signal_size: Optional[int] = None, # Size of the signal to load. If None, the whole signal will be loaded.
                 source_id: int = 0,
                 ):
        
        super().__init__(root, dataname, signal_start, signal_size)

        # Prepare the download and unpack the resource
        # This database unpacks files in a folder with the same name as the resource without the .zip extension 
        self.data_folder = self._prepare_download_and_unpack(dataname, unpack=True)
        assert isinstance(self.data_folder, str), f"Check if your resource has to be unpacked or not."

        # Load the data from the unpacked folder (also perform checks)
        self._load_database_info()  # Load the database information from the data.json file
        assert (source_id >= 0) and (source_id < self.n_sources) , f"Database has {self.n_sources} sources. Choose source_id in [0, {self.n_sources-1}]. "
        self._source_id = source_id

        # Loads src and mic positions, NOTE: maybe also load signals? load_all_data 
        # self.load_data(filepath=self.data_folder)       
        self.load_src_and_mics_positions()


    def load_data(self, filepath: str):
        filepath=self.data_folder
        self.load_src_and_mics_positions()

    def _load_database_info(self):
        """ Load the database information from the data.json file. """
        json_file = os.path.join(self.data_folder, "data.json")
        with open(json_file, "r") as f:
            json_data = json.load(f)
        
        self._fs = json_data['samplerate']
        T = json_data['ir length']
        self._n_sources = json_data['number of sources']
        self._nmics = json_data['number of points']

        # Check that the folder contains all the signals in the database
        nfiles = len([f for f in os.listdir(self.data_folder) if f.startswith('ir_') and f.endswith('.npy')])
        assert self._nmics == nfiles, f"ir_xxx.npy files = {nfiles}, should be {self._nmics}"

        # Check if the signal_start and signal_size are within the bounds of the signal size
        self._check_bounds_in_sample_size(number_of_time_samples=T)
        assert isinstance(self._signal_start, int) and isinstance(self._signal_size, int)
        self._start_sample = self._signal_start
        self._last_sample = self._signal_start + self._signal_size
        self._nt = T

    def load_src_and_mics_positions(self):
        self.load_src_positions()  # Load source positions from the file
        self.load_mic_positions()  # Load microphone positions from the file

    def load_src_positions(self):
        # Source position in the dataset
        filepath = self.data_folder
        assert isinstance(filepath, str), f"Check if your resource has to be unpacked or not."
        pos_src_path = os.path.join(filepath, 'pos_src.npy')
        self._source_positions = np.load(pos_src_path)

    def load_mic_positions(self):
        # Position of the microphones
        filepath = self.data_folder
        assert isinstance(filepath, str), f"Check if your resource has to be unpacked or not."
        pos_mic_path = os.path.join(filepath, 'pos_mic.npy')
        self._pos_mics = np.load(pos_mic_path) # (nmics, 3)  each row is (x,y,z) for a mic


    def load_all_data(self):
        # Concatenate vectors (source, signal) -> into -> (source, imic, signal)
        filepath = self.data_folder
        assert isinstance(filepath, str), f"Check if your resource has to be unpacked or not."
        data = np.concatenate( 
            [np.load(os.path.join(filepath, f'ir_{i}.npy'))[:,None,:]  
             for i in range(self.n_mics)], # for all mics
             axis = 1 ) # in axis 1 (mics)  (source, mics, signal)
        return data

    def load_mic(self, imic):
        filepath = self.data_folder
        assert isinstance(filepath, str), f"Check if your resource has to be unpacked or not."
        mic_signal = np.load(os.path.join(filepath, f'ir_{imic}.npy')) # (source, signal)
        return mic_signal[self.source_id, :]
        
    def get_pos(self, imic: int)-> np.ndarray:
        """ Returns the position of the microphone imic in meters (x, y, z) """
        if not hasattr(self, "_pos_mics"):
            self.load_mic_positions()
        assert 0 <= imic < self.n_mics, f"Microphone index {imic} out of range [0, {self.n_mics - 1}]"
        return self._pos_mics[imic,:]
     
    def get_src_pos(self):
        if not hasattr(self, "_source_positions"):
            self.load_src_positions()
        return self._source_positions[self.source_id]
    
    def get_mic(self, imic: int, start=None, size=None) -> np.ndarray:
        """ Returns the signal of the microphone imic, starting at index start and with size size. """
        start = start if start is not None else self.signal_start
        size = size if size is not None else self.signal_size
        assert isinstance(start, int)
        assert isinstance(size, int)
        return self.load_mic(imic=imic)[start:start + size]

Now let's check the MeshRIR database implementation:

In [None]:
db2 = MeshRIR(root="./data", dataname="S32", signal_start=0, signal_size=128, source_id=31)

Matched resources to download:
- S32-M441_npy.zip
Unpacked folder ./data/MeshRIR/raw/S32-M441_npy already exists. Skipping unpacking.


Since this is a heavier database, I had already checked beforehand that the download method works.  
This database requires unzipping the resource. The class found that the unpacked folder already exists, so it skips the unpacking step.

In [None]:
print(db2)

Database: MeshRIR
Download: ['S32-M441_npy.zip']
Load room: S32-M441_npy.zip
Path to raw resource: ./data/MeshRIR/raw/S32-M441_npy.zip
Path to unpacked data folder: ./data/MeshRIR/raw/S32-M441_npy
Sampling frequency: 48000 Hz
Number of microphones: 441
Number of total time samples: 32768
Number of time samples selected: 128
Number of sources: 32
Signal start: 0
Signal size: 128
Source ID: 31
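
Most of these values come from the `data.json` file read in `_load_database_info`. The sketch below shows that step on a stand-in file: the keys are the ones used by the class, the values are the ones printed above, and the real `data.json` of the dataset may contain additional fields. `read_database_info` is a hypothetical helper.

```python
import json
import tempfile
from pathlib import Path


def read_database_info(data_folder) -> dict:
    """Read the metadata that the class stores in _fs, _nt, _n_sources and _nmics."""
    with open(Path(data_folder) / "data.json", "r") as f:
        raw = json.load(f)
    return {
        "fs": raw["samplerate"],
        "nt": raw["ir length"],
        "n_sources": raw["number of sources"],
        "n_mics": raw["number of points"],
    }


# Stand-in metadata file written to a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    stand_in = {
        "samplerate": 48000,
        "ir length": 32768,
        "number of sources": 32,
        "number of points": 441,
    }
    (Path(tmp) / "data.json").write_text(json.dumps(stand_in))
    info = read_database_info(tmp)

print(info)
```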


Test of main get methods:

In [None]:
print(f"In this database we do not preload all the database.")
print(f"Output of get_mic  (4 time samples): {db2.get_mic(imic=0, start=0, size=4)}")
print(f"Output of get_time (4 time samples): {db2.get_time(start=0, size=4)}")
print(f"Test of get_pos: {db2.get_pos(imic=1)}")

In this database we do not preload all the database.
Output of get_mic  (4 time samples): [0.00599654 0.00572385 0.00485317 0.00515282]
Output of get_time (4 time samples): [0.00000000e+00 2.08333333e-05 4.16666667e-05 6.25000000e-05]
Test of get_pos: [-0.4 -0.5  0. ]


## Good Practices (after coding)
> Things that I have learnt, or found interesting, while coding this notebook

1. Use of **Inheritance**  
    - There are different experimental databases, so it is useful to create a base class with the methods that I want to use in my applications.
    - In the **base class** I define the common attributes. "Protected" attributes start with an underscore, e.g. ``_fs``. "Private" attributes start with a double underscore, e.g. ``__fs``. 
    - The class can have **getter** methods that return the protected and private attributes. In particular, I use ``@property`` to choose which attributes are exposed: the property ``obj.fs`` gives access to ``obj._fs``.
    - A plain base class can be instantiated even though it lacks the information that only the subclasses provide. To prevent this misuse, the ``abc`` (abstract base class) module provides tools to define the behaviour of classes like this. 
    - Inheriting from ``ABC`` and declaring with ``@abstractmethod`` the methods that each subclass has to implement prevents the instantiation of the abstract class, and of any subclass that does not override all the abstract methods.
    - This reminds you to implement all the abstract methods before you can use a class.
    - The base class contains the commonalities between databases, so I do not have to repeat code.
    - In the base class ``__init__`` I initialize only the strictly necessary attributes. If a set of operations may be needed in several subclasses, I define a method in the base class, like ``_prepare_download_and_unpack(self, ...)``, and call it from each subclass ``__init__``. **This avoids having to review the init logic of the base class whenever a new subclass needs a different init logic**.  

2. Logic and options to **download** the databases
    - Inspired by **MNIST**, each subclass defines the class attributes ``mirrors`` and ``resources``: the urls to download from and the files (resources) to download.
    - From **MNIST** I also borrow part of the download logic and the libraries used to download, unpack and check the data. 
    - When downloading files from GitHub, do not use the url shown in the browser; use the url where GitHub serves the raw data: ``"https://raw.githubusercontent.com/{USER}/{REPO}/{BRANCH}/{PATH_DATA_FOLDER}/"`` substituting ``USER``, ``REPO``, ``BRANCH`` and ``PATH_DATA_FOLDER`` with the values seen in the normal GitHub url of the data.  

3. Use **``assert``** liberally
    - It is very useful for catching errors and for checking that parameters have the expected type or lie within the expected bounds.
    - Sometimes Pylance or other linters report errors in code that is perfectly functional, because they cannot infer the type of a variable. An ``assert`` placed before the flagged line tells Pylance which type the variable has, so it can verify that operations such as ``+`` are valid for it.

4. Use the method ``__str__()``, to print useful information of the object, like different attributes, statistics, etc. Then use it as ``print(obj)``.  
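
Points 3 and 4 can be illustrated together with a small sketch. The class `Window` is hypothetical; it only mimics the pattern of an ``Optional`` attribute that is narrowed with ``assert`` before being used, plus a ``__str__`` method for printing.

```python
from typing import Optional


class Window:  # hypothetical class, used only for this illustration
    def __init__(self, start: int = 0, size: Optional[int] = None):
        self._start = start
        self._size = size  # may stay None until the data is loaded

    def last_sample(self) -> int:
        # Without this assert a type checker complains about "int + None"
        assert self._size is not None, "size must be set before use"
        return self._start + self._size

    def __str__(self) -> str:
        return f"Window(start={self._start}, size={self._size})"


w = Window(start=2, size=5)
print(w)                # uses __str__
print(w.last_sample())  # 2 + 5
```

Keep in mind that Python removes ``assert`` statements when it runs with the ``-O`` flag, so checks that must always run are better written as explicit ``if`` tests that raise an exception.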

In [None]:
#| hide
import nbdev; nbdev.nbdev_export()