ENH: Store Annotation data in consistent format #259

Open
sammlapp opened this issue Sep 19, 2023 · 3 comments
Labels
ENH: enhancement New feature or request

Comments

@sammlapp

Is your feature request related to a problem? Please describe.
I find it confusing that the base class for all annotations (Annotation) stores data in different ways for "sequence" and "bounding box" type annotations, since these fundamentally contain similar (overlapping) data types.

Describe the solution you'd like
The Annotation class could be refactored to store annotations in a dataframe with columns for time0, time1, freq0, and freq1. It would provide methods or properties that let the user get either a list of bboxes or a sequence (or list of sequences). This would preserve all of the current functionality while storing data in a consistent format regardless of the source of the annotation.

class Annotation:
    def __init__(self, df, annotation_file, notated_file):
        self.df = df  # columns time0, time1, freq0, freq1 etc
        self.annotation_file = annotation_file
        self.notated_file = notated_file

    @classmethod
    def from_bboxes(cls, bboxes, annotation_file, notated_file):
        ...
        return cls(df, annotation_file, notated_file)

    @classmethod
    def from_sequences(cls, sequences, annotation_file, notated_file):
        ...
        return cls(df, annotation_file, notated_file)

    @property
    def sequences(self):
        return self._make_sequences_from_df()

    def _make_sequences_from_df(self):
        ...
        return list_of_sequences

    @property
    def bboxes(self):
        return self._make_bboxes_from_df()

    def _make_bboxes_from_df(self):
        ...
        return list_of_bboxes

# example usage with sequences
anns = Annotation.from_sequences(sequences, annotation_file, notated_file)
anns.sequences  # gives list of sequences

# example usage with bboxes
anns = Annotation.from_bboxes(bboxes, annotation_file, notated_file)
anns.bboxes  # gives list of bboxes
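
To make the idea concrete, here is a minimal runnable sketch of the dataframe-backed design. It rests on several assumptions that are not part of the existing code base: `Box` is a hypothetical stand-in for a bounding box (not the library's `BBox`), a sequence is simplified to a list of `(label, onset, offset)` tuples, a `label` column is added alongside the four proposed columns, and segments without frequency bounds are stored with NaN in `freq0` and `freq1`.

```python
from dataclasses import dataclass

import pandas as pd

# Assumed column layout: the four proposed columns plus a label.
COLUMNS = ["label", "time0", "time1", "freq0", "freq1"]


@dataclass
class Box:
    """Hypothetical bounding box, used only for this sketch."""
    label: str
    time0: float
    time1: float
    freq0: float
    freq1: float


class Annotation:
    def __init__(self, df, annotation_file, notated_file):
        self.df = df
        self.annotation_file = annotation_file
        self.notated_file = notated_file

    @classmethod
    def from_bboxes(cls, bboxes, annotation_file, notated_file):
        # One row per box; dataclass fields map directly onto columns.
        df = pd.DataFrame([vars(b) for b in bboxes], columns=COLUMNS)
        return cls(df, annotation_file, notated_file)

    @classmethod
    def from_segments(cls, segments, annotation_file, notated_file):
        # segments: iterable of (label, onset, offset) tuples.
        df = pd.DataFrame(segments, columns=["label", "time0", "time1"])
        # Frequency bounds are unknown for segments, so they become NaN.
        df = df.reindex(columns=COLUMNS)
        return cls(df, annotation_file, notated_file)

    @property
    def bboxes(self):
        # Only rows with both frequency bounds can be returned as boxes.
        rows = self.df.dropna(subset=["freq0", "freq1"])
        return [Box(**row) for row in rows.to_dict("records")]

    @property
    def segments(self):
        # Drop the frequency columns and order the rows by onset time.
        ordered = self.df.sort_values("time0")
        cols = ordered[["label", "time0", "time1"]]
        return list(cols.itertuples(index=False, name=None))
```

With this layout, an annotation built from boxes can still be read back as segments by ignoring the frequency columns, while one built from segments returns an empty list from `bboxes`.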

Describe alternatives you've considered
As an alternative to storing annotations in a dataframe, Annotation could store a list of BBox objects, and a sequences property could call a method _bbox_to_seq() to convert the list of BBox objects to a sequence.
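
A rough sketch of this alternative, under the same kind of assumptions: `Box` and `BoxListAnnotation` are hypothetical names for illustration, the attribute names on `Box` are invented, and a sequence is simplified to a list of `(label, onset, offset)` tuples. The conversion only drops the frequency bounds and sorts by onset; it does not check whether boxes overlap in time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical bounding box, used only for this sketch."""
    label: str
    onset: float
    offset: float
    low_freq: float
    high_freq: float


class BoxListAnnotation:
    def __init__(self, bboxes, annotation_file, notated_file):
        # The list of boxes is the single stored representation.
        self.bboxes = list(bboxes)
        self.annotation_file = annotation_file
        self.notated_file = notated_file

    @property
    def sequence(self):
        return self._bbox_to_seq()

    def _bbox_to_seq(self):
        # Keep only the time extent of each box, ordered by onset.
        ordered = sorted(self.bboxes, key=lambda b: b.onset)
        return [(b.label, b.onset, b.offset) for b in ordered]
```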

Additional context
This vocalpy thread contains discussion of this idea https://forum.vocalpy.org/t/use-of-both-sequence-and-bbox-in-annotation-class/63/6

@sammlapp sammlapp added the ENH: enhancement New feature or request label Sep 19, 2023
@sammlapp sammlapp changed the title ENH: ENH: Store Annotation data in consistent format Sep 19, 2023
@NickleDave
Collaborator

Hi @sammlapp thank you for raising this issue and linking back to the thread in the forum.

I know my reply there was a bit lengthy 😅

Long story short:
I think you are right that we should implement a class representing bounding boxes the way you suggest, as a pandas DataFrame.
If you would still like to take a stab at implementing that, it would be more than welcome.
I will raise a separate issue about that shortly.
There's some other housekeeping we'll need to do (probably drop the BBox class) that I will outline on that issue.

Unfortunately we can't represent sequences the same way, because by definition a sequence is a series of non-overlapping line segments.
I think we also need to change the way we represent sequences. I've been giving this some thought as I do some related work on VocalPy itself. I'll make a separate issue discussing how to change the implementation of sequences, and I'll link related issues from VocalPy here and on that issue.
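
To illustrate the non-overlap constraint, here is a small sketch of a check that a set of time segments could form a sequence. The function name `segments_overlap` and the `(onset, offset)` tuple representation are assumptions for illustration, not existing API. Two bounding boxes at different frequencies can share the same time span, which is why a table of boxes would fail this check.

```python
def segments_overlap(segments):
    """Return True if any two (onset, offset) segments overlap in time."""
    # After sorting by onset, any overlap shows up between neighbors.
    ordered = sorted(segments)
    return any(nxt[0] < cur[1] for cur, nxt in zip(ordered, ordered[1:]))


# Segments that only touch end to end do not overlap.
touching = [(0.0, 0.5), (0.5, 0.9)]
# Two boxes covering the same time span at different frequencies do.
stacked_boxes = [(0.0, 0.5), (0.0, 0.5)]
```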

@NickleDave
Collaborator

@all-contributors please add @sammlapp for ideas

@allcontributors
Contributor

@NickleDave

I've put up a pull request to add @sammlapp! 🎉
