[ENH] Change GGS to inherit from BaseSeriesAnnotator #5315

Merged: 5 commits, Nov 17, 2023


Alex-JG3
Contributor

Reference Issues/PRs

#5296 must be merged before this PR can.

What does this implement/fix? Explain your changes.

Updates GreedyGaussianSegmentation to inherit from BaseSeriesAnnotator, following the HMM class as a guide.

Does your contribution introduce a new dependency? If yes, which one?

No.

What should a reviewer concentrate their feedback on?

  • Currently type check and conversion is done in the _predict method. It would be good to refactor this out into the BaseSeriesAnnotator class but I think that is a job for another PR.
  • This PR makes no attempt to convert the GGS adaptee class to inherit from the BaseSeriesAnnotator class. It would be good for this to be done but again I think this is a job for another PR.

Did you add any tests for the change?

No, but I had to add the type conversions in the _predict method to get the annotation tests to pass locally.

Any other comments?

This change means that, similarly to the HMM class, the fit method has to be called before the predict method even though fit does not contain any essential logic. Perhaps these classes should only have a fit_predict method and neither a fit nor a predict method. Here is the example from the HMM docstring demonstrating this.

>>> from sktime.annotation.hmm import HMM
>>> from scipy.stats import norm
>>> from numpy import asarray
>>> # define the emission probs for our HMM model:
>>> centers = [3.5,-5]
>>> sd = [.25 for i in centers]
>>> emi_funcs = [(norm.pdf, {'loc': mean,
...  'scale': sd[ind]}) for ind, mean in enumerate(centers)]
>>> hmm_est = HMM(emi_funcs, asarray([[0.25,0.75], [0.666, 0.333]]))
>>> # generate synthetic data (or of course use your own!)
>>> obs = asarray([3.7,3.2,3.4,3.6,-5.1,-5.2,-4.9])
>>> hmm_est = hmm_est.fit(obs)
>>> labels = hmm_est.predict(obs)

PR checklist

For all contributions
  • I've added myself to the list of contributors with any new badges I've earned :-)
    How to: add yourself to the all-contributors file in the sktime root directory (not the CONTRIBUTORS.md). Common badges: code - fixing a bug, or adding code logic. doc - writing or improving documentation or docstrings. bug - reporting or diagnosing a bug (get this plus code if you also fixed the bug in the PR). maintenance - CI, test framework, release.
    See here for full badge reference
  • Optionally, I've added myself and possibly others to the CODEOWNERS file - do this if you want to become the owner or maintainer of an estimator you added.
    See here for further details on the algorithm maintainer role.
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.
For new estimators
  • I've added the estimator to the API reference - in docs/source/api_reference/taskname.rst, follow the pattern.
  • I've added one or more illustrative usage examples to the docstring, in a pydocstyle compliant Examples section.
  • If the estimator relies on a soft dependency, I've set the python_dependencies tag and ensured
    dependency isolation, see the estimator dependencies guide.

@Alex-JG3 Alex-JG3 changed the title from "[ENH] Change ggs to baseseriesannotator" to "[ENH] Change GGS to inherit from BaseSeriesAnnotator" Sep 24, 2023
@Alex-JG3 Alex-JG3 marked this pull request as draft September 24, 2023 14:33
@fkiraly
Collaborator

fkiraly commented Oct 1, 2023

Adding a pointer to my comment in the merged PR re get_params / set_params here: #5296 (comment), just in case you don't see it because it is merged.

@fkiraly
Collaborator

fkiraly commented Oct 1, 2023

> Currently type check and conversion is done in the _predict method. It would be good to refactor this out into the BaseSeriesAnnotator class but I think that is a job for another PR.

Agreed. It is also related to the interesting question of what the final interface should look like.

> It would be good for this to be done but again I think this is a job for another PR.

I don't know whether this should be done at all - it might be too invasive to the existing code.

@fkiraly
Collaborator

fkiraly commented Oct 1, 2023

> This change means that, similarly to the HMM class, the fit method has to be called before the predict method even though fit does not contain any essential logic. Perhaps these classes should only have a fit_predict method and neither a fit nor a predict method.

Btw, this has been a frequent point of discussion for transformations too - some transformers do not have any essential logic in fit, and these classes have the fit_is_empty tag.

In order to adhere to the "strategy pattern", all objects of the same type must have the same interface, so even if fit is empty one must be able to call it. https://en.wikipedia.org/wiki/Strategy_pattern

This may seem counterintuitive from a parsimony perspective if looking only at the individual class, but the point is to ensure all classes fit to the same base interface.
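
A minimal sketch of the shared-interface point above. The class names (`BaseAnnotatorSketch`, `ThresholdAnnotator`, `MeanAnnotator`) and the helper `run` are hypothetical and only illustrate the idea; they are not sktime's actual base classes or tags.

```python
class BaseAnnotatorSketch:
    """Hypothetical base class; not sktime's BaseSeriesAnnotator."""

    def fit(self, X):
        self._fit(X)
        self._is_fitted = True
        return self

    def predict(self, X):
        if not getattr(self, "_is_fitted", False):
            raise RuntimeError("call fit before predict")
        return self._predict(X)

    def _fit(self, X):
        # Default: nothing to learn (the "empty fit" case).
        pass

    def _predict(self, X):
        raise NotImplementedError


class ThresholdAnnotator(BaseAnnotatorSketch):
    """Empty fit: labels points above a fixed, user-supplied threshold."""

    def __init__(self, threshold=0.0):
        self.threshold = threshold

    def _predict(self, X):
        return [int(x > self.threshold) for x in X]


class MeanAnnotator(BaseAnnotatorSketch):
    """Non-empty fit: learns the mean, then labels points above it."""

    def _fit(self, X):
        self.mean_ = sum(X) / len(X)

    def _predict(self, X):
        return [int(x > self.mean_) for x in X]


def run(annotator, X):
    # Caller code is identical whether or not fit does any work.
    return annotator.fit(X).predict(X)
```

Because both classes expose the same `fit` and `predict` calls, code such as `run` can swap one for the other without special-casing the estimator whose fit is empty.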

@Alex-JG3 Alex-JG3 closed this Oct 18, 2023
@Alex-JG3 Alex-JG3 force-pushed the change-ggs-to-baseseriesannotator branch from df0844d to f8c31e7 on October 18, 2023 12:55
@Alex-JG3 Alex-JG3 reopened this Oct 18, 2023
@Alex-JG3
Contributor Author

This PR was branched off #5296 which has now been merged. To avoid merge conflicts from squashing the commits in #5296, I have removed commit df0844d, merged with main, and put the contents of commit df0844d into a new commit, ffde895.

The get_params and set_params methods from the parent BaseObject class
are sufficient.
@Alex-JG3
Contributor Author

Alex-JG3 commented Oct 18, 2023

> Adding a pointer to my comment in the merged PR re get_params / set_params here: #5296 (comment), just in case you don't see it because it is merged.

@fkiraly, thanks for pointing that out. I think you are right, the get_params and set_params methods can be removed from the GreedyGaussianSegmentation class since:

  • The get_params method in BaseObject will only get the parameters of the GreedyGaussianSegmentation class, not those of the GGS adaptee class. The implementation of get_params in GreedyGaussianSegmentation was only there to remove the parameters from the GGS adaptee class.
  • My concern with set_params was that the parameters would not be passed on to the adaptee class. However, the set_params method in BaseObject calls self.__init__, which will create a new instance of the GGS adaptee class with the new parameters.

Removed get_params and set_params from the GreedyGaussianSegmentation class in commit f0e392b.
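
A rough sketch of why the inherited methods suffice, under stated assumptions: `AdapteeSketch` and `WrapperSketch` are hypothetical stand-ins, the parameter names are illustrative, and the `get_params` / `set_params` bodies below only mimic the behaviour described in the two bullets above. They are not the BaseObject implementation.

```python
import inspect


class AdapteeSketch:
    """Hypothetical stand-in for the wrapped GGS adaptee class."""

    def __init__(self, k_max, lamb):
        self.k_max = k_max
        self.lamb = lamb


class WrapperSketch:
    """Hypothetical stand-in for the estimator that wraps the adaptee."""

    def __init__(self, k_max=10, lamb=1.0):
        self.k_max = k_max
        self.lamb = lamb
        # The adaptee is rebuilt every time __init__ runs.
        self._adaptee = AdapteeSketch(k_max=k_max, lamb=lamb)

    def get_params(self):
        # Only names in the __init__ signature count as parameters,
        # so `_adaptee` is never reported.
        names = inspect.signature(type(self).__init__).parameters
        return {name: getattr(self, name) for name in names if name != "self"}

    def set_params(self, **params):
        # Merge old and new values, then re-run __init__, which also
        # creates a fresh adaptee carrying the new values.
        new_params = {**self.get_params(), **params}
        self.__init__(**new_params)
        return self
```

In this sketch, calling `WrapperSketch(k_max=3).set_params(k_max=7)` leaves the wrapper holding a new adaptee with `k_max == 7`, so no custom override is needed to keep the two in sync.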

The _intermediate_change_points and _intermediate_ll are attributes
according to the GreedyGaussianSegmentation docstring but since they
were attributes of the GGS adaptee class they were not accessible. This
commit makes them into properties so they are accessible.
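
The commit message above can be sketched roughly as follows. `InnerSegmenter` and `OuterEstimator` are hypothetical stand-ins for the adaptee and the wrapping estimator, and the values set in `run` are made-up placeholders, not real GGS output.

```python
class InnerSegmenter:
    """Hypothetical stand-in for the GGS adaptee class."""

    def __init__(self):
        self._intermediate_change_points = None
        self._intermediate_ll = None

    def run(self):
        # Placeholder values, used only to show the delegation.
        self._intermediate_change_points = [[0, 7], [0, 4, 7]]
        self._intermediate_ll = [-12.5, -8.0]


class OuterEstimator:
    """Hypothetical stand-in for the estimator wrapping the adaptee."""

    def __init__(self):
        self._adaptee = InnerSegmenter()

    @property
    def _intermediate_change_points(self):
        # Read-only view of the adaptee's attribute.
        return self._adaptee._intermediate_change_points

    @property
    def _intermediate_ll(self):
        # Read-only view of the adaptee's attribute.
        return self._adaptee._intermediate_ll
```

Using properties rather than copying the values keeps the wrapper in step with the adaptee, since each access reads the adaptee's current state.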
@fkiraly
Collaborator

fkiraly commented Oct 19, 2023

> thanks for pointing that out. I think you are right

Yes - the default get_params and set_params should do the trick, although this may not have been the case when the GGS was implemented (that was pre-skbase)

@Alex-JG3 Alex-JG3 marked this pull request as ready for review October 26, 2023 07:17
Collaborator

@fkiraly fkiraly left a comment


Thanks!

One blocking but minor thing: we should not override __repr__, as the BaseObject also handles that (and it plays a role in the html representation of nested estimators afaik)

Collaborator

@fkiraly fkiraly left a comment


Thanks, didn't spot that you fixed the small request.
(if you click the circling arrows next to the reviewer symbol, you re-request a review)

@fkiraly fkiraly merged commit b031687 into sktime:main Nov 17, 2023
49 checks passed
@fkiraly
Collaborator

fkiraly commented Nov 17, 2023

Re having fit or not - interesting question.
Why is the Gaussian fitting logic not in fit?
