Hyperparameter optimization #264

Open
lars-reimann opened this issue May 4, 2023 · 0 comments
Labels
enhancement 💡 New feature or request

lars-reimann commented May 4, 2023

Is your feature request related to a problem?

Finding appropriate values for hyperparameters by hand is tedious. There should be automation to try different combinations of values.

Desired solution

  1. For every model hyperparameter of type T, it should also be possible to pass a Choice[T] (see feat: add Choice class for possible values of hyperparameter #325). Example:
# Before
class KNearestNeighbors(Classifier):
  def __init__(self, number_of_neighbors: int) -> None:
    ...

# After
class KNearestNeighbors(Classifier):
  def __init__(self, number_of_neighbors: int | Choice[int]) -> None:
    ...

# Usage
KNearestNeighbors(number_of_neighbors = Choice(1, 10, 100))
  2. Adjust the getters (Getters for hyperparameters of models #260) accordingly.
  3. When a user calls fit on a model that contains a Choice at any level (Choices can be nested), raise an exception. The error message should point to the correct method (see 4.).
  4. Add a new method fit_by_exhaustive_search to Classifier and its subclasses with the parameter:
    • optimization_metric: The metric used to find the best model. It should have type ClassifierMetric, an enum with one value for each available classifier metric:
    class ClassifierMetric(Enum):
        ACCURACY = "accuracy"
        PRECISION = "precision"
        RECALL = "recall"
        F1_SCORE = "f1_score"
    The parameter should be required.
  5. Add a new method fit_by_exhaustive_search to Regressor and its subclasses with the parameter:
    • optimization_metric: The metric used to find the best model. It should have type RegressorMetric, an enum with one value for each available regressor metric:
    class RegressorMetric(Enum):
        MEAN_SQUARED_ERROR = "mean_squared_error"
        MEAN_ABSOLUTE_ERROR = "mean_absolute_error"
    The parameter should be required.
  6. Both methods should collect the Choices inside the model and its children. For each possible combination of values, they should create a model without Choices, fit it, and compute the given metric on it. They should keep track of the best fitted model according to the metric and return it at the end. GridSearchCV of scikit-learn can be useful for this.
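
The search described in the last step could be sketched roughly as follows. This is only an illustration under several assumptions: the Choice class here is a minimal stand-in and not the class from #325, ToyModel, expand_choices, and search_exhaustively are hypothetical names, only top-level (non-nested) Choices are expanded, and a plain scoring callback stands in for fitting a model and computing the selected metric.

```python
from __future__ import annotations

from dataclasses import dataclass, fields, replace
from itertools import product
from typing import Callable, Generic, Iterator, TypeVar

T = TypeVar("T")


class Choice(Generic[T]):
    """Minimal stand-in for the Choice class (assumption, not the real API)."""

    def __init__(self, *values: T) -> None:
        self.values = list(values)


@dataclass(frozen=True)
class ToyModel:
    """Hypothetical model with two hyperparameters, for illustration only."""

    number_of_neighbors: int | Choice[int]
    weight: float | Choice[float] = 1.0


def expand_choices(model: ToyModel) -> Iterator[ToyModel]:
    """Yield one Choice-free copy of `model` per combination of values."""
    # Only top-level fields are inspected; nested models are not handled here.
    names = [f.name for f in fields(model) if isinstance(getattr(model, f.name), Choice)]
    candidates = [getattr(model, name).values for name in names]
    for combination in product(*candidates):
        yield replace(model, **dict(zip(names, combination)))


def search_exhaustively(
    model: ToyModel,
    score: Callable[[ToyModel], float],
    maximize: bool = True,
) -> tuple[ToyModel, float]:
    """Return the best candidate and its score.

    `score` stands in for "fit the candidate, then compute the metric".
    Set `maximize=False` for error metrics such as mean squared error.
    """
    best_model: ToyModel | None = None
    best_score: float | None = None
    for candidate in expand_choices(model):
        current = score(candidate)
        is_better = best_score is None or (
            current > best_score if maximize else current < best_score
        )
        if is_better:
            best_model, best_score = candidate, current
    if best_model is None or best_score is None:
        raise ValueError("At least one Choice has no candidate values.")
    return best_model, best_score


# Usage with a made-up scoring function that prefers 10 neighbors and a small weight.
template = ToyModel(number_of_neighbors=Choice(1, 10, 100), weight=Choice(0.5, 2.0))
best_model, best_score = search_exhaustively(
    template,
    score=lambda m: -abs(m.number_of_neighbors - 10) - m.weight,
)
```

The number of candidates is the product of the sizes of all Choices (here 3 × 2 = 6), so the cost of an exhaustive search grows quickly with each additional Choice.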
lars-reimann added the enhancement 💡 New feature or request label May 4, 2023
lars-reimann self-assigned this May 5, 2023
lars-reimann added a commit that referenced this issue May 26, 2023
### Summary of Changes

Add a class to represent possible choices for the value of a
hyperparameter. This is in preparation for #264.
lars-reimann removed their assignment May 26, 2023
lars-reimann pushed a commit that referenced this issue Jun 1, 2023
## [0.13.0](v0.12.0...v0.13.0) (2023-06-01)

### Features

* add `Choice` class for possible values of hyperparameter ([#325](#325)) ([d511c3e](d511c3e)), closes [#264](#264)
* Add `RangeScaler` transformer ([#310](#310)) ([f687840](f687840)), closes [#141](#141)
* Add methods that tell which columns would be affected by a transformer ([#304](#304)) ([3933b45](3933b45)), closes [#190](#190)
* Getters for hyperparameters of Regression and Classification models ([#306](#306)) ([5c7a662](5c7a662)), closes [#260](#260)
* improve error handling of table ([#308](#308)) ([ef87cc4](ef87cc4)), closes [#147](#147)
* Remove warnings thrown in new `Transformer` methods ([#324](#324)) ([ca046c4](ca046c4)), closes [#323](#323)
Projects
Status: Backlog