Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Adapter for Scipy Distributions #287

Merged
merged 11 commits into from
May 3, 2024
Merged

Conversation

malikrafsan
Copy link
Contributor

@malikrafsan malikrafsan commented Apr 29, 2024

Reference Issues/PRs

Fixes #227

What does this implement/fix? Explain your changes.

  • Adapter for Scipy distributions
  • Fisk Distribution using Scipy Adapter
  • Poisson Distribution using Scipy Adapter

Does your contribution introduce a new dependency? If yes, which one?

no

What should a reviewer concentrate their feedback on?

  • Adapter structure

Did you add any tests for the change?

  • I try to install this library locally using pip install --editable .[dev,test] and create a simple driver program to use this new distribution
  • I try to compare several methods of the existing Fisk distribution with proposed one and they produce the same result

Any other comments?

PR checklist

For all contributions
  • I've added myself to the list of contributors with any new badges I've earned :-)
    How to: add yourself to the all-contributors file in the skpro root directory (not the CONTRIBUTORS.md). Common badges: code - fixing a bug, or adding code logic. doc - writing or improving documentation or docstrings. bug - reporting or diagnosing a bug (get this plus code if you also fixed the bug in the PR).maintenance - CI, test framework, release.
    See here for full badge reference
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.
For new estimators
  • I've added the estimator to the API reference - in docs/source/api_reference/taskname.rst, follow the pattern.
  • I've added one or more illustrative usage examples to the docstring, in a pydocstyle compliant Examples section.
  • If the estimator relies on a soft dependency, I've set the python_dependencies tag and ensured
    dependency isolation, see the estimator dependencies guide.

Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

This was extremely quick and exactly what I meant! Really nice!

Some comments:

  • good idea to have separate discrete/continuous. Currently, the interface indeed is different. There is also the issue of not having pmf in the base class.
    • for the time being, I think we can leave the handling of pmf and log_pmf as you suggest. However, you have certainly noted the missing handling of pmf in the base class, and that's why you tried to work around it wiht the second class. I would recommend, could you open an issue on implementing the pmf in the base class? Not necessarily sth to work on right now, but to keep track of what we still need to do. You could also pick it up later in a separate PR, if you'd like.
  • what I would do is replace entirely the current Fisk and Poisson distribution implementations with the scipy based one. There is no difference in the logic, only the code is more streamlined through the adapter.
    • don't worry about delete/replace, the classes were written expressly with the idea in mind to later refactor them to a single adapter. What is important to check that this does not change anything, so is a proper refactor - see below.
  • what could be helpful is to add some tests for the adapter class, e.g., take Fisk and check its outputs against direct outputs from scipy. That way we test that the adapter class is really adapting 1:1, and we can check that "before and after" are the same, for this refactor.
  • finally, may I suggest to pick one more distribution from scipy and add it as an skpro object? How about a beta distribution, given that it is on the priority list in [ENH] roadmap of probability distributions to implement #22?
    • with the adapter, it should now be easy to add dozens of distributions with minimal work overhead, by using scipy. I would leave this to separate PR. And of course the more exotic properties like energy will not be provided by scipy.
  • make sure you adhere to code formatting standards, here is a guide how to ensure this locally on your computer: https://www.sktime.net/en/stable/developer_guide/coding_standards.html - this is also requierd for the tests on remote to start.
    • you can also test your distributions for interface compliance by using check_estimator (from skpro.utils)

@fkiraly fkiraly added enhancement module:probability&simulation probability distributions and simulators labels Apr 29, 2024
@malikrafsan
Copy link
Contributor Author

malikrafsan commented May 1, 2024

I would recommend, could you open an issue on implementing the pmf in the base class?

Sure! I would love to do it

I have opened the new issue here #289 Please let me know if you have any feedback for the issue!

@malikrafsan
Copy link
Contributor Author

don't worry about delete/replace, the classes were written expressly with the idea in mind to later refactor them to a single adapter.

Ahh I see, previously, I was a bit afraid to delete them 😅 Okay, I will replace them!

@malikrafsan
Copy link
Contributor Author

what could be helpful is to add some tests for the adapter class, e.g., take Fisk and check its outputs against direct outputs from scipy. That way we test that the adapter class is really adapting 1:1, and we can check that "before and after" are the same, for this refactor.

I tried to run the test, but it turns out that, after changes, Fisk and Poisson failed the test. I will try to understand the test cases and pytest first as this is also my first time having hands-on on pytest.

Also for confirmation, from what I understand, it is better for me to make new test file to compare the result of the scipy adapter (let's say Fisk) vs result directly from scipy right? I will try to make it after fixing the test first

Screenshot from 2024-05-01 17-33-02

@fkiraly
Copy link
Collaborator

fkiraly commented May 1, 2024

I tried to run the test, but it turns out that, after changes, Fisk and Poisson failed the test. I will try to understand the test cases and pytest first as this is also my first time having hands-on on pytest.

That is odd. I thought your tests passed with your added distribution, so renaming and moving them should not have broken the tests. Do you have a good guess for the reason of the failures?

I think I found your problem - _ppf should have arg p, not q.

Further, your type annotations are incorrect, these should be np.ndarray, not pd.DataFrame - but that will not fail tests. I would also suggest not to annotate, as that will unnecessarily couple the module with a numpy import. Instead, if you want, you can copy-paste the docstrings from the extension template (extension_template/distributions), that is much clearer info for developers.

@fkiraly
Copy link
Collaborator

fkiraly commented May 1, 2024

Also for confirmation, from what I understand, it is better for me to make new test file to compare the result of the scipy adapter (let's say Fisk) vs result directly from scipy right?

Yes, exactly. I would add a new file in distributions.adapter.scipy.tests for this.

@fkiraly
Copy link
Collaborator

fkiraly commented May 1, 2024

(I fixed the typo in the _ppf arg, hope that is ok)

Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All looks good, unfortunately the failures are on 3.8, from type annotations, square brackets are not supported. Would you be so kind to remove these? These are internal annotations anyway, in private methods.

(we will drop 3.8 support soon, but only in late May or June)

@malikrafsan
Copy link
Contributor Author

malikrafsan commented May 3, 2024

(I fixed the typo in the _ppf arg, hope that is ok)

Sure! Absolutely no problem for me! Thank you for fixing it 😄

All looks good, unfortunately the failures are on 3.8, from type annotations, square brackets are not supported. Would you be so kind to remove these? These are internal annotations anyway, in private methods.

Thank you so much for pointing this out! I have removed the type annotations square brackets and the test work fine after that! 😄

what could be helpful is to add some tests for the adapter class

I have also added the pytest for scipy adapter class, please let me know if you have any feedback for the test!

I also changed the scipy adapter by removing the scipy discrete adapter -> if the method is not supported for the discrete/continuous distribution, then it will return 0 (from what I understand on #229 ). Please let me know if it is correct or not😅😅

@malikrafsan malikrafsan marked this pull request as ready for review May 3, 2024 08:14
@malikrafsan malikrafsan requested a review from fkiraly May 3, 2024 08:14
Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - this is great!

In a separate step, we may want to move the "return 0" logic into the BaseDistribution class, and base it on the tag distribution_type.

@fkiraly
Copy link
Collaborator

fkiraly commented May 3, 2024

What would be great is adding a new distribution, to check whether it is indeed easy to use this adapter - how about beta? You can do this in another PR based on this one.

@fkiraly fkiraly changed the title [ENH] Add Adapter for Scipy Distributions [ENH] Adapter for Scipy Distributions May 3, 2024
@fkiraly fkiraly merged commit 79dccf2 into sktime:main May 3, 2024
36 checks passed
@malikrafsan
Copy link
Contributor Author

What would be great is adding a new distribution, to check whether it is indeed easy to use this adapter - how about beta? You can do this in another PR based on this one.

Sure, I would like to do that!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement module:probability&simulation probability distributions and simulators
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ENH] adapter for scipy distributions
2 participants