Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] adapter for scipy distributions #227

Closed
fkiraly opened this issue Mar 31, 2024 · 6 comments · Fixed by #287
Closed

[ENH] adapter for scipy distributions #227

fkiraly opened this issue Mar 31, 2024 · 6 comments · Fixed by #287
Labels
enhancement feature request New feature or request good first issue Good for newcomers module:probability&simulation probability distributions and simulators

Comments

@fkiraly
Copy link
Collaborator

fkiraly commented Mar 31, 2024

Adapting scipy distributions is very formulaic and could easily be dealt with by an adapter class. This also avoids duplication in implementation efforts, as scipy is a core dependency.

This may require some abstraction around methods, but it seems mostly like a delegator class, as scipy does its own broadcasting.

Further discrepancies to be mindful of:

  • energy and similar integrals are not implemented in scipy
  • the class parameterization is different, scipy uses class methods whereas skpro uses __init__ based parameterization

Good first issue with a design/architecture flavour, can be leveraged to cover a lot of ground in #22.

@fkiraly fkiraly added enhancement module:probability&simulation probability distributions and simulators feature request New feature or request good first issue Good for newcomers labels Mar 31, 2024
@malikrafsan
Copy link
Contributor

Hi @fkiraly I am very interested in being assigned to this issue!

My solution to "the class parameterization" is that we store the parameter of the class and then pass it to the methods. I have created a preliminary solution for this. I have compared it to the existing distribution class and it has a similar result

Regarding the "energy and similar integrals are not implemented in scipy", I haven't found any solution for this, do you have any suggestions? Thank you so much!

Screenshot from 2024-04-09 22-59-52

@fkiraly
Copy link
Collaborator Author

fkiraly commented Apr 10, 2024

Yes, this is exactly what I meant - although on an adatper level, rather than for each distribution. That is, inheriting from the adapter and putting the scipy class in a field should do the job.

Similar to _PytsAdapter and _TslearnAdapter in sktime.

@malikrafsan
Copy link
Contributor

Hi @fkiraly , firstly I want to apologize for my recent absence from SKPro project. I have quite a lot of college assigments and exams previously and unfortunately, I wasn't able to contribute as much as I would have liked 😅. However, currently there is not a lot of things on my way and I am eager to continue contributing 😅

Similar to _PytsAdapter and _TslearnAdapter in sktime.

I will check those two ASAP. Looking forward to catching up and contributing!

@malikrafsan
Copy link
Contributor

What I understand from _PytsAdapter and _TslearnAdapter is that we create an Adapter Class that will be a base class for wrapping another library functionality.

We create the base class to have some methods that the child classes can override by returning the specific class/function from the library that we want to wrap

# _PytsAdapter
    def _get_pyts_class(self):
        raise NotImplementedError("abstract method")

# RocketPyts
    def _get_pyts_class(self):
        from pyts.transformation.rocket import ROCKET
        return ROCKET

We then initialize the class and store it on the attribute. When we want to the methods needed, we use that initialized attribute and use the functionality

# _PytsAdapter
    def _predict(self, X, y=None):
        pyts_est = getattr(self, self._estimator_attr)
        return self._call_with_y_optional(pyts_est.predict, X, y)

What I am confused about is that, the _PytsAdapter and _TslearnAdapter initialized the estimator attribute on the _fit method. But in the scipy adapter, there is no method bound to be called first (unlike _fit method). Do we need to move the initialization on the constructor method for scipy adapter?

I will try to raise a draft PR containing my proposed adapter ASAP

@fkiraly
Copy link
Collaborator Author

fkiraly commented Apr 28, 2024

Welcome back, @malikrafsan!

Generally, no need to apologize - as long as you are not paid or have not otherwise committed, there is no expectation that you deliver. Most contributions are on a volunteer or reseach basis, so there is absolutely no pressure. Thanks a lot for your previous work!

Re the adapter, the main idea would be avoiding duplication in classes interfacing scipy classes - these are already vectorized methods, but do not store parameters. The design principle here is being "DRY" - don't repeat yourself.

The precise way to achieve this depends on the inner and outer interface of the adapter, so pyts / tslearn adapters are not 100% comparable - as you say, there is no fit in scipy distributions.

So, one would have to design something where the scipy distribution, with some information about parameter names, sits inside the adapter.

What tends to be helpful is to interface a few instances and then abstract from them:

  • one example, and the one I would use as a template, is the Fisk distribution (Fisk).
  • similar to that, sort of a first try, is Poisson. This will need to be refactored.

My strategy would be: start with Fisk and try to make it more general. Make a copy and turn it into an adapter class. Then have Fisk inherit from the adapter, and simplify as much as possible. Then create a second distribution, and then refactor Poisson.

@malikrafsan
Copy link
Contributor

Hi @fkiraly, thank you so much for your understanding!

I have created a new draft PR containing the proposed base Scipy adapter class and inherited adapter class for the Fisk and Poisson distributions here. Please let me know your thoughts about it, I would be more than happy to get your feedback!

Once again, thank you so much for having me!

fkiraly pushed a commit that referenced this issue May 3, 2024
Fixes #227 

#### What does this implement/fix? Explain your changes.
<!--
A clear and concise description of what you have implemented.
-->

- Adapter for Scipy distributions
- Fisk Distribution using Scipy Adapter
- Poisson Distribution using Scipy Adapter
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement feature request New feature or request good first issue Good for newcomers module:probability&simulation probability distributions and simulators
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants