ENH: create a class AudacityBbox to be able to handle the non standard format with bbox #222

Closed
shaupert opened this issue Feb 16, 2023 · 1 comment
Labels
ENH: enhancement New feature or request

Comments


shaupert commented Feb 16, 2023

Describe the solution you'd like
I use Audacity, and I would like to be able to use crowsetta with the bounding boxes I draw around my sounds of interest.

Describe alternatives you've considered
Below is code that could add this feature to crowsetta, or that could go in the "how to" tutorial as an example of writing your own class for Audacity bounding boxes.
The class uses the functions read_audacity_annot and write_audacity_annot from the maad package (scikit-maad). Of course, you can also adapt the code to your own needs to avoid the dependency on scikit-maad.
You could also add an example file in this format. Have a look at the data section of scikit-maad; there are files with this format:

I hope it helps.

"""module with functions that handle .txt annotation files
from Audacity in case of bounding box.
"""

import pathlib
from typing import ClassVar, List, Optional

from maad.util import read_audacity_annot, write_audacity_annot

import attr
import pandas as pd

import crowsetta
from crowsetta.typing import PathLike

# TODO: I do not know how to work on that [SH]
# class AudacityBboxSchema(pandera.SchemaModel):

COLUMNS_MAP = {
    "min_t": "begin_time_s",
    "max_t": "end_time_s",
    "min_f": "low_freq_hz",
    "max_f": "high_freq_hz",
}

@crowsetta.formats.register_format
@crowsetta.interface.BBoxLike.register
@attr.define
class AudacityBbox:
    """Class that represents .txt annotation files
    from Audacity (https://www.audacityteam.org/)
    that contain bounding boxes, i.e., labels with
    both a time range and a frequency range.
    Attributes
    ----------
    name: str
        Shorthand name for annotation format: 'audacitybbox'.
    ext: str
        Extension of files in annotation format: '.txt'
    df : pandas.DataFrame
        with annotations loaded into it
    annot_path : str, pathlib.Path
        Path to Audacity .txt file from which annotations were loaded.
    annot_col : str
        Name of column that contains annotations.
    audio_path : str, pathlib.Path
        Path to audio file that the Audacity .txt file annotates.
    """
    name: ClassVar[str] = 'audacitybbox'
    ext: ClassVar[str] = '.txt'

    df: pd.DataFrame
    annot_path: pathlib.Path
    annot_col: str
    audio_path: Optional[pathlib.Path] = attr.field(default=None,
                                                    converter=attr.converters.optional(pathlib.Path))

    @classmethod
    def from_file(cls,
                  annot_path: PathLike,
                  annot_col: str = 'label',
                  audio_path: Optional[PathLike] = None) -> 'Self':
        """Load annotations from an Audacity annotation file
        that contains bounding boxes.
        Parameters
        ----------
        annot_path : str, pathlib.Path
            Path to a .txt file exported from Audacity bbox.
        annot_col : str
            name of column that contains annotations
        audio_path : str, pathlib.Path
            Path to audio file that the Audacity bbox .txt file annotates.
            Optional, defaults to None.
        Examples
        --------
        >>> example = crowsetta.data.get('audacitybbox')
        >>> audacitybbox = crowsetta.formats.bbox.AudacityBbox.from_file(example.annot_path)
        """
        annot_path = pathlib.Path(annot_path)
        crowsetta.validation.validate_ext(annot_path, extension=cls.ext)

        # call the scikit-maad function to read the Audacity bbox file
        df = read_audacity_annot(annot_path)
        if len(df) < 1:
            raise ValueError(
                f'Cannot load annotations, '
                f'there are no rows in Audacity bbox .txt file:\n{df}'
            )
        columns_map = dict(COLUMNS_MAP)  # copy
        columns_map.update({annot_col: 'annotation'})
        df.rename(columns=columns_map, inplace=True)

        return cls(
            df=df,
            annot_path=annot_path,
            annot_col=annot_col,
            audio_path=audio_path,
        )

    def to_bbox(self) -> List[crowsetta.BBox]:
        """Convert this Audacity bbox annotation to a ``list`` of ``crowsetta.BBox``.
        Returns
        -------
        bboxes : list
            of ``crowsetta.BBox``
        Examples
        --------
        >>> example = crowsetta.data.get('audacitybbox')
        >>> audacitybbox = crowsetta.formats.bbox.AudacityBbox.from_file(example.annot_path)
        >>> bboxes = audacitybbox.to_bbox()
        """
        bboxes = []
        for begin_time, end_time, low_freq, high_freq, label in zip(
                self.df.begin_time_s.values,
                self.df.end_time_s.values,
                self.df.low_freq_hz.values,
                self.df.high_freq_hz.values,
                self.df['annotation'].values,
        ):
            bboxes.append(
                crowsetta.BBox(onset=begin_time,
                               offset=end_time,
                               low_freq=low_freq,
                               high_freq=high_freq,
                               label=label)
            )
        return bboxes

    def to_annot(self) -> crowsetta.Annotation:
        """Convert this Audacity bbox annotation to a ``crowsetta.Annotation``.
        Returns
        -------
        annot : crowsetta.Annotation
        Examples
        --------
        >>> example = crowsetta.data.get('audacitybbox')
        >>> audacitybbox = crowsetta.formats.bbox.AudacityBbox.from_file(example.annot_path)
        >>> annot = audacitybbox.to_annot()
        """
        bboxes = self.to_bbox()
        return crowsetta.Annotation(annot_path=self.annot_path,
                                    notated_path=self.audio_path,
                                    bboxes=bboxes)

    def to_file(self,
                annot_path: PathLike) -> None:
        """Make a .txt file that can be read by Audacity
        from this annotation.
        Parameters
        ----------
        annot_path : str, pathlib.Path
             Path including filename where file should be saved.
             Must have extension '.txt'.
        """
        crowsetta.validation.validate_ext(annot_path, extension=self.ext)

        columns_map = {v: k for k, v in COLUMNS_MAP.items()}  # invert the mapping
        columns_map.update({'annotation': self.annot_col})
        df_out = self.df.rename(columns=columns_map)
        write_audacity_annot(annot_path, df_out)
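
For anyone who wants to drop the scikit-maad dependency, here is a minimal standard-library sketch of the reading and writing steps. It is not the scikit-maad implementation. It assumes the file layout that Audacity appears to use when labels carry a frequency range: each label takes two tab-separated lines, `start<TAB>end<TAB>label` followed by a line that starts with a backslash, `\<TAB>low_freq<TAB>high_freq`. The names `Box`, `parse_audacity_bbox`, and `format_audacity_bbox` are hypothetical; the field names `min_t`, `max_t`, `min_f`, and `max_f` mirror the keys of COLUMNS_MAP above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Box:
    """One labeled region (hypothetical helper; fields mirror COLUMNS_MAP keys)."""
    label: str
    min_t: float
    max_t: float
    min_f: Optional[float] = None
    max_f: Optional[float] = None


def parse_audacity_bbox(text: str) -> List[Box]:
    """Parse label text, assuming the two-lines-per-label layout described above."""
    boxes: List[Box] = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if fields[0] == "\\":
            # Frequency line: attach the bounds to the preceding label.
            if not boxes:
                raise ValueError("frequency line appears before any label line")
            boxes[-1].min_f = float(fields[1])
            boxes[-1].max_f = float(fields[2])
        else:
            # Time line: start, end, and an optional label.
            label = fields[2] if len(fields) > 2 else ""
            boxes.append(Box(label=label,
                             min_t=float(fields[0]),
                             max_t=float(fields[1])))
    return boxes


def format_audacity_bbox(boxes: List[Box]) -> str:
    """Write boxes back out in the same assumed layout."""
    lines = []
    for box in boxes:
        lines.append(f"{box.min_t:.6f}\t{box.max_t:.6f}\t{box.label}")
        if box.min_f is not None and box.max_f is not None:
            lines.append(f"\\\t{box.min_f:.6f}\t{box.max_f:.6f}")
    return "\n".join(lines) + "\n"


# A one-label example written by hand in the assumed layout.
example = "0.500000\t1.250000\tbird\n\\\t2000.000000\t8000.000000\n"
boxes = parse_audacity_bbox(example)
```

The resulting list could be turned into a DataFrame with columns `label`, `min_t`, `max_t`, `min_f`, `max_f` and passed through the same rename step that from_file uses, so the rest of the class would stay unchanged.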

@NickleDave (Collaborator)

Thank you very much @shaupert! I really appreciate your help, will start with your implementation and make sure to give you credit.

I'm going to close this as a duplicate of #213 but like I said your code above is a great start and I will be sure to reference this issue there and add you as a contributor.
