Customizing Base Models #125

Open
chadlagore opened this issue May 22, 2018 · 1 comment
Collaborator

chadlagore commented May 22, 2018

Customizing a BaseModel is pretty simple at the moment. Something like:

from minutes import BaseModel

class MyBaseMinutesModel(BaseModel):
    def fit(self, **kwargs):
        X_train, X_test, y_train, y_test = self._generate_training_data()
        # Design and train your model, then assign it to self.model.

I don't think we need to make this any more convenient for the user, but the README should document it.

Collaborator Author

chadlagore commented May 25, 2018

I'm going to flesh out this approach a bit to show how you customize a base model and then version control it here in the repository for others to use. The code below is untested and you may need to tweak it, but it's correct in spirit to how Minutes should behave.

import os

from keras.models import Sequential

from minutes.base import BaseModel
from minutes import Speaker


class MyBaseMinutesModel(BaseModel):
    """Subclass the BaseModel and override the fit method
    (that's all you need to do).
    """

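    def fit(self, **kwargs):
        X_train, X_test, y_train, y_test = self._generate_training_data()
        # Design and train your model, then assign it to self.model.
        self.model = Sequential([
            # ...
        ])
        # Assumed optimizer and loss; match them to your label encoding.
        self.model.compile(optimizer='adam', loss='categorical_crossentropy')
        # Forward keyword arguments (e.g. verbose=2) to Keras.
        self.model.fit(X_train, y_train,
                       validation_data=(X_test, y_test), **kwargs)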


# Define hyperparameters (there is one at the moment, and it determines
# the type of data generated by `_generate_training_data`).
new_base = MyBaseMinutesModel(
    'human-readable-name',
    ms_per_observation=1000
)

# Add many speakers.
speakers_dir = '/path/to/speakers/'
for folder in os.listdir(speakers_dir):
    # Assuming the folder name is the speaker's name.
    s = Speaker(folder)

    # Assuming the folder contains audio for that speaker.
    s.add_audio(os.path.join(speakers_dir, folder))

    new_base.add_speaker(s)

new_base.fit(verbose=2)    # Takes a while :)
new_base.save()

Note that new_base.save() saves the fixed model parameters and the Keras model to the location specified by the MINUTES_MODELS_DIRECTORY environment variable. If MINUTES_MODELS_DIRECTORY is not set, the location defaults to /minutes/models/human-readable-name inside the minutes repository. Then it's as simple as:

git checkout -b new-model-type
git add minutes/models/*
git commit -m "My new model got 99%!"
git push origin HEAD
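
For anyone wondering where save() ends up writing, the lookup described above can be sketched roughly as below. This is not the actual Minutes implementation: `resolve_model_path` and `DEFAULT_MODELS_DIRECTORY` are hypothetical names, and writing the default relative to the repository root is an assumption.

```python
import os

# Hypothetical default: the models folder inside the minutes repository,
# written here relative to the repository root (an assumption).
DEFAULT_MODELS_DIRECTORY = os.path.join("minutes", "models")


def resolve_model_path(name, environ=None):
    """Return the directory a model called `name` would be saved to.

    Prefers MINUTES_MODELS_DIRECTORY when it is set and non-empty,
    otherwise falls back to the default directory.
    """
    environ = os.environ if environ is None else environ
    base = environ.get("MINUTES_MODELS_DIRECTORY") or DEFAULT_MODELS_DIRECTORY
    return os.path.join(base, name)
```

Passing `environ` explicitly (for example `resolve_model_path('human-readable-name', {})`) makes the fallback easy to check without touching the real environment.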

We have work to do to standardize the naming of the output models. Currently it uses the name specified by the user, but it should have some other features to make it unique/informative/serialized.
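
One possible direction for the naming, purely as a sketch: keep the user's name for readability and append a short hash of the hyperparameters so that two models with the same name but different settings do not collide. `model_directory_name` is a hypothetical helper, not something Minutes provides, and the choice of SHA-1 truncated to 8 characters is arbitrary.

```python
import hashlib
import json
import re


def model_directory_name(name, hyperparameters):
    """Build a directory name from a user-supplied name and hyperparameters.

    The name is lowercased and reduced to a slug; the hyperparameters are
    serialized with sorted keys so the same settings always hash the same.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    payload = json.dumps(hyperparameters, sort_keys=True).encode("utf-8")
    digest = hashlib.sha1(payload).hexdigest()[:8]
    return "{}-{}".format(slug, digest)
```

For example, `model_directory_name('human-readable-name', {'ms_per_observation': 1000})` gives the original name followed by a hyphen and an 8-character suffix that changes whenever ms_per_observation does.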

@chadlagore chadlagore mentioned this issue Jun 1, 2018
2 tasks