Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix register_format decorator #119

Closed
NickleDave opened this issue Dec 2, 2021 · 6 comments · Fixed by #176
Closed

fix register_format decorator #119

NickleDave opened this issue Dec 2, 2021 · 6 comments · Fixed by #176
Labels
ENH: enhancement New feature or request
Projects

Comments

@NickleDave
Copy link
Collaborator

NickleDave commented Dec 2, 2021

why?

It's not easy to test whether entry points are installed (#94) but all formats are currently installed as entry points, which defines what the formats.show() function returns. This can lead to bugs (#47) and also makes it hard to test whether that function works well (#92) and to iterate over all the formats in tests (#66)

We could use the pyproject.toml as a single source of truth for tests, but why not just declare what the formats are in code?
#92 (comment)
This way it would be clear without needing a file that's just meant for development.

The dirt simple way to do this is just define a module-level dict mapping format names to functions / classes:
https://play.pixelblaster.ro/blog/2017/12/18/a-quick-and-dirty-mini-plugin-system-for-python/

but now we have a module-level dict just sitting there in the open

even worse, we now have to remember to update the dict every time we add a format

An alternative would be to let formats register themselves with a decorator

Kinda like this:
https://realpython.com/python37-new-features/#customization-of-module-attributes

This way there's no need to update a module-level dict or wherever else we store the 'name': callable pairs

With this approach we might not even need entry points -- a user would just import the format decorator, guaranteeing that everything would be in place for them to add their format with it (because the format.Formats module-level class would get instantiated on import)

Also I think it might be possible to later make it so the same decorator would wrap any Format class (#99) passed to the format decorator in a Format base class? Or maybe that's not useful and I need to think about it more.

how

the annot_format decorator will register formats

Formats will be registered as a property of a class that gets called by a separate function formats.show, because I am paranoid about just storing them in a dictionary mapping `.
Something like:

"""formats/formats.py"""
class AnnotFormats():
    def __init__(self):
        self.__formats__ = {}

    def available(self, caller):
        return self.__formats__

    def add(self, format_name, format_func):
        self.__formats__[format_name] = format_func

# singleton, module-level class instance used to track formats
annot_formats = AnnotFormats()

def show():
    return annot_formats.available()

class FormatAdder
# another singleton, we use to enforce who can add a format
show_formats = ShowFormats()


def annot_format(format_name):
    def decorator(format_func):
        annot_formats.add(format_name, format_func)
        return format_func
    return decorator

... or something like this

@NickleDave NickleDave added the ENH: enhancement New feature or request label Dec 2, 2021
@NickleDave
Copy link
Collaborator Author

Took a pass at this.

One issue is that Transcriber uses instances of the Meta class to determine how it should dispatch function calls.

This might be easier once an interface is in place (#105 ), that does much of the same work?

Or maybe I will try to replace Meta with an interface and everything will blow up

@NickleDave
Copy link
Collaborator Author

I should just test with toy code first if it lets me get the benefits I want or if there's something I'm not grokking

@NickleDave
Copy link
Collaborator Author

NickleDave commented Feb 2, 2022

A decorator still seems easier to use than entry points.

One thing to be aware of is that decorators affect introspection -- losing access to attributes like __name__ of the wrapped function: https://github.com/GrahamDumpleton/wrapt/blob/develop/blog/01-how-you-implemented-your-python-decorator-is-wrong.md

This wrapt library aims to restore many of those attributes.
The build is failing though 🤔

@NickleDave NickleDave changed the title replace format entry points with annot_format decorator decide if we can / want to use register_format decorator Mar 28, 2022
@NickleDave
Copy link
Collaborator Author

in the branch in progress I have a register_format decorator -- need to test whether it actually works though

@NickleDave NickleDave added this to To do in ENH May 14, 2022
@NickleDave
Copy link
Collaborator Author

Chose to not use register_format for built-in classes because it was complicated.

But it would be nice if a user could use register_format.

If this is easy -- e.g. if I can get it to work by re-writing the existing example user annotation format to be a registered sub-class of sequence, and then calling register_format on it -- then I'm +1 on it. But if not then just pull out this functionality for now

@NickleDave NickleDave changed the title decide if we can / want to use register_format decorator fix register_format decorator May 15, 2022
@NickleDave
Copy link
Collaborator Author

NickleDave commented May 15, 2022

Tested and it is apparently that easy.

Re-wrote example user format in tests and registered it.

>>> import example
>>> import crowsetta
>>> crowsetta.formats.as_list()
['raven',
 'birdsong-recognition-dataset',
 'generic-seq',
 'notmat',
 'simple-seq',
 'yarden',
 'textgrid',
 'timit',
 'example-custom-format']
>>> scribe = crowsetta.Transcriber(format='example-custom-format')
>>> example = scribe.from_file('bird1_annotation.mat')
>>> annots = example.to_annot()
>>> annots[0]
Annotation(annot_path=PosixPath('bird1_annotation.mat'), notated_path=PosixPath('lbr3009_0005_2017_04_27_06_14_46.wav'), seq=<Sequence with 15 segments>)

Needed to fix register_format function so that it uses Format.name class variable instead of Format.__name__.

Re-naming this issue to "fix" the function, will go ahead and use

NickleDave added a commit that referenced this issue May 15, 2022
so it adds a format class to
crowsetta.formats.FORMATS` with
the class' `name` attribute
instead of using `__name__`.
This makes it consistent with how built-in
classes are listed when
calling `crowsetta.formats.as_list()`
@NickleDave NickleDave moved this from To do to In progress in ENH May 15, 2022
@NickleDave NickleDave moved this from In progress to Done in ENH May 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ENH: enhancement New feature or request
Projects
ENH
Done
Development

Successfully merging a pull request may close this issue.

1 participant