
[ENH] Benchmarking interface v2 based on kotsu package #2977

Merged: 15 commits merged into sktime:main on Feb 5, 2023

Conversation

@alex-hh (Contributor) commented on Jul 11, 2022:

Reference Issues/PRs

See previous discussions #2884 and #2805, which include earlier examples of integrating kotsu into sktime by @TNTran92 and @DBCerigo.
The current PR is intended to provide an example of how benchmarking could be implemented by wrapping kotsu in such a way that users only need to deal with sktime objects.

What does this implement/fix? Explain your changes.

This PR provides a candidate benchmarking interface for sktime estimators/datasets by wrapping kotsu.
It currently supports only forecasters, but is designed to be flexible.
It allows users to create their own benchmarks by adding sets of estimators and tasks.

Does your contribution introduce a new dependency? If yes, which one?

kotsu

What should a reviewer concentrate their feedback on?

  • To avoid sktime users interacting directly with kotsu, we interface kotsu via Benchmark objects designed to handle sktime objects (see the usage sketch after this list)
  • Currently the only supported output is a DataFrame / CSV file; however, it might be desirable to provide simple ways to save models, predictions, etc., by passing appropriate arguments to kotsu's benchmarking
  • Only a forecasting benchmark and a base benchmark are currently implemented. The idea is that other types of modelling problem could extend the base benchmark in similar ways, but it wasn't clear to me how best to do this and there may be other suggestions!
  • No tests are implemented currently
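
A rough usage sketch of the intended interface, pieced together from the description above and the example notebook. Only benchmark.run("./forecasting_results.csv") is taken from the PR; the module path, class, method, and argument names below are assumptions and may differ from the merged code:

```python
from sktime.benchmarking.forecasting import ForecastingBenchmark  # assumed path
from sktime.datasets import load_airline
from sktime.forecasting.model_selection import ExpandingWindowSplitter
from sktime.forecasting.naive import NaiveForecaster
from sktime.performance_metrics.forecasting import MeanSquaredPercentageError

benchmark = ForecastingBenchmark()

# register an sktime estimator to benchmark
benchmark.add_estimator(NaiveForecaster(strategy="mean", sp=12))

# register a task: dataset loader + resampling scheme + scorers
benchmark.add_task(
    dataset_loader=load_airline,
    cv_splitter=ExpandingWindowSplitter(initial_window=24, step_length=12, fh=12),
    scorers=[MeanSquaredPercentageError()],
)

# run all estimator/task combinations and persist results to a CSV file
results_df = benchmark.run("./forecasting_results.csv")
```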

Any other comments?

PR checklist

For all contributions
  • I've added myself to the list of contributors.
  • Optionally, I've updated sktime's CODEOWNERS to receive notifications about future changes to these files.
  • I've added unit tests and made sure they pass locally.
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG] indicating whether the PR topic is related to enhancement, maintenance, documentation, or bug.
For new estimators
  • I've added the estimator to the online documentation.
  • I've updated the existing example notebooks or provided a new one to showcase how my estimator works.


@alex-hh changed the title Reimplement benchmarking with kotsu v2 → [ENH] Reimplement benchmarking with kotsu v2 on Jul 11, 2022
(Review comment on the example notebook, at the line: results_df = benchmark.run("./forecasting_results.csv"))
@TNTran92 (Contributor) commented on Jul 12, 2022:
In my run, forecasting_results.csv shows up in the root directory of the repo. Is this by design? If yes, I think it's better in the working directory.

Though I can easily pd.read_csv this file, is there a reason to write to an output file?

Contributor reply:

> In my run, forecasting_results.csv shows up in the root directory of the repo. Is this by design? If yes, I think it's better in the working directory.

I don't think much thought has gone into where it should be saved yet; the example notebook is intended as an example, and the hope is that users would configure where they want to save their results.

> Though I can easily pd.read_csv this file, is there a reason to write to an output file?

When writing kotsu we generally wanted results to be persisted: you run some experiments one day, and you want those results to still be there when you come back to the project a week later. Leaving results in memory isn't a reliable way of achieving that.

@DBCerigo (Contributor) commented:

(Slides from dev-days-2022 talk on 12-7-22 on "sktime - easy benchmarking", which is relevant to this PR)

@fkiraly self-assigned this on Jul 12, 2022
@ltsaprounis (Contributor)

"""
from typing import Callable, Optional, Type, Union

import kotsu
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest moving the import inside the constructor and inside run.

To complete soft dependency encapsulation, kindly add a _check_soft_dependencies call at the start of __init__.
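
A minimal sketch of the suggested isolation pattern, assuming sktime's _check_soft_dependencies helper (import path and exact signature may differ by sktime version); the class body is illustrative only:

```python
from sktime.utils.validation._dependencies import _check_soft_dependencies


class BaseBenchmark:
    """Base class wrapping kotsu benchmarking (illustrative sketch)."""

    def __init__(self):
        # fail early with an informative error if the soft dependency is missing
        _check_soft_dependencies("kotsu")
        # import kotsu lazily, inside the constructor, so importing this module
        # does not require kotsu to be installed
        import kotsu

        self._kotsu = kotsu
```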

Contributor reply:

Done in 4e18ac4

@fkiraly (Collaborator) left a review:

Looks great as a minimal example!

I have suggestions above; the blocking one is about isolating the soft dependency.
The other would be nicer from a pattern/design perspective, imo.

@fkiraly (Collaborator) left a review:

I also think it is a problem that estimators and their parameters are separated.

I'd rather have objects passed than classes and their parameters separately.
Unfitted objects carry the same information - they are the class plus their kwargs (can be retrieved by get_params)

See discussion in #2804
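
To illustrate the point with a standard sktime forecaster (example chosen for illustration, not taken from the PR):

```python
from sktime.forecasting.naive import NaiveForecaster

# an unfitted object already bundles the class and its constructor kwargs
forecaster = NaiveForecaster(strategy="mean", sp=12)

print(type(forecaster).__name__)  # NaiveForecaster
print(forecaster.get_params())    # {'sp': 12, 'strategy': 'mean', 'window_length': None}

# so passing `forecaster` carries the same information as passing the pair
# (NaiveForecaster, {"strategy": "mean", "sp": 12}) separately
```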

@TNTran92 (Contributor) commented on Jul 14, 2022:

Looks like at runtime, the run method turns model specs into objects by calling model = model_spec.make().
This method returns an entity of the form factory(**_kwargs).
Do you think we can pass objects directly to run instead of having to go through a spec?
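
A simplified sketch of the spec pattern described above; this is only the shape of it, not kotsu's actual implementation:

```python
class ModelSpec:
    """Stores a factory plus kwargs; make() instantiates the model lazily."""

    def __init__(self, factory, kwargs=None):
        self.factory = factory
        self._kwargs = kwargs or {}

    def make(self):
        # equivalent to factory(**_kwargs), as noted above
        return self.factory(**self._kwargs)
```

Passing an already-constructed object would skip this factory/kwargs indirection, which is essentially what is being suggested.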

@DBCerigo (Contributor) commented on Jul 24, 2022:

Now ready for re-review @fkiraly.

Some replies were made directly on the review comments above, and another in #2804 (comment).

PR checklist

  • I've added myself to the list of contributors.
  • Optionally, I've updated sktime's CODEOWNERS to receive notifications about future changes to these files.
  • I've added unit tests and made sure they pass locally.
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG] indicating whether the PR topic is related to enhancement, maintenance, documentation, or bug.

@alex-hh marked this pull request as ready for review on July 24, 2022 13:04
@alex-hh requested a review from aiwalter as a code owner on July 24, 2022 13:04
@DBCerigo force-pushed the kotsu-benchmarking-v2 branch 3 times, most recently from b09507d to e094aec, on July 27, 2022 17:49
@fkiraly (Collaborator) left a review:

I suppose this is good, as a design study!

Blocking change request:

  • please add docstrings to public methods and user facing classes

Questions:

  • how would carrying out tests integrate with this? You need access to individual predictions for that.
  • how would storing partial results on the HD or via a data interface class work?
  • I do not fully understand what type or signature you require for dataset_loader; could you elaborate?

@DBCerigo (Contributor) commented on Aug 3, 2022:

> Blocking change request:
> • please add docstrings to public methods and user facing classes

Just to confirm, you'd like the docstrings extended to contain Parameters sections etc., right?

> Questions:
> • how would carrying out tests integrate with this? You need access to individual predictions for that.

What do you mean by "tests"?
If you mean statistical tests and the like: anything that is for a single estimator can be achieved using the system in this PR. Anything that is cross/multi-estimator will require storing the predictions, which we'll do in a subsequent PR.
If you mean unit tests: such tests are present in this PR.

> • how would storing partial results on the HD or via a data interface class work?

Can you clarify what you mean by "partial results"?

> • I do not fully understand what type or signature you require for `dataset_loader`, could you elaborate?

This implementation isn't intending to add a new object or interface for a dataset_loader; it follows what is already defined in sktime.datasets.
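
For concreteness, the loaders in sktime.datasets are plain callables that return the data when called with no arguments, e.g.:

```python
from sktime.datasets import load_airline

dataset_loader = load_airline  # a zero-argument callable returning the series
y = dataset_loader()
print(y.shape)  # (144,)
```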

@TNTran92 mentioned this pull request on Aug 3, 2022
@fkiraly (Collaborator) commented on Aug 7, 2022:

> Just to confirm, you'd like the docstrings extended to contain Parameters sections etc., right?

Yes, Parameters and, if appropriate, Returns. These end up in the public documentation.
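
A short numpydoc-style sketch of what such a docstring could look like (the method name, parameter names, and descriptions are illustrative, not taken from the PR):

```python
def add_task(self, dataset_loader, cv_splitter, scorers):
    """Register a benchmark task.

    Parameters
    ----------
    dataset_loader : Callable
        A loader from sktime.datasets, called with no arguments to obtain the data.
    cv_splitter : BaseSplitter
        Re-sampling scheme defining the train/test splits to evaluate on.
    scorers : list of metric objects
        Performance metrics computed for each estimator/task combination.

    Returns
    -------
    None
    """
```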

> how would carrying out tests integrate with this? You need access to individual predictions for that.

> What do you mean by "tests"?

Apologies, I meant statistical hypothesis tests comparing the performance of estimators. Not a blocking comment, anyway.

> Can you clarify what you mean by "partial results"?

For instance, predictions of m out of n estimators involved in the experiment - similar to the old framework.

> This implementation isn't intending to add a new object or interface for a dataset_loader, it was intending to follow what was defined in sktime.datasets.

Ah, I see - docstrings that detail the assumptions would be helpful, then.

@DBCerigo (Contributor) commented:

@fkiraly this is ready for (possibly final) re-review now. Docstrings all updated in 3e28d29. Nice one.

@TNTran92 (Contributor) commented:

Great work. Thanks a lot @DBCerigo

The classification factory is also up (#3278); I'm still working on its unit tests.

@fkiraly (Collaborator) commented on Jan 12, 2023:

🎉 welcome back, @DBCerigo 🎉

@DBCerigo (Contributor) commented:

@fkiraly good to be back 😸 This is now ready for re-review. The change since the last review is c25de09, as discussed on Slack.

@fkiraly (Collaborator) commented on Jan 30, 2023:

Great! Good to have you back.

There's one point to address: dependency isolation.

If kotsu becomes a soft dependency, dependent functionality should be isolated. To see examples of how this is done, search for:

  • pytest.mark.skipif
  • _check_soft_dependencies

in the code base. Also see
https://www.sktime.org/en/stable/developer_guide/dependencies.html
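
A sketch of the test-skip pattern referred to above, assuming the severity="none" mode of _check_soft_dependencies, which returns a bool instead of raising (the exact helper signature may vary by sktime version):

```python
import pytest

from sktime.utils.validation._dependencies import _check_soft_dependencies


@pytest.mark.skipif(
    not _check_soft_dependencies("kotsu", severity="none"),
    reason="skip test if required soft dependency kotsu is not available",
)
def test_forecasting_benchmark_runs():
    # test body that depends on kotsu goes here
    ...
```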

@DBCerigo (Contributor) commented on Feb 2, 2023:

@fkiraly thanks for the pointers for the test skips - all implemented now.

Let's wait until we have 2+ subclasses of the base class before doing any more design changes, as having multiple examples will make it plainer to see and verify the usefulness of any further changes. So I would suggest merging this PR as is, then implementing and merging 1+ more benchmarks (I think @TNTran92 has a PR for classifiers already, though it likely needs some re-tweaking now), and then reviewing the implementation as a whole.

@fkiraly (Collaborator) left a review:

Massive PR, extremely useful functionality!

Comments are either addressed or make sense to address separately, e.g., minor reworks to the interface.

fkiraly added a commit that referenced this pull request on Feb 5, 2023:

This fixes some merge conflicts and formatting in `.all-contributorsrc`.

Includes contributors from #2977 to remove conflicts with main.
fkiraly added a commit that referenced this pull request on Feb 5, 2023:

#2977 cannot be pushed to due to branch lock, but has conflicts on `.all-contributorsrc`. Therefore we will carry out the merge as follows:

1. overwrite main with `.all-contributorsrc` from #2977
2. merge #2977
3. revert 1
@fkiraly merged commit fcf8a4e into sktime:main on Feb 5, 2023
@fkiraly changed the title [ENH] Reimplement benchmarking with kotsu v2 → [ENH] Benchmarking interface v2 based on kotsu package on Feb 5, 2023
fkiraly added a commit that referenced this pull request on Feb 5, 2023:

Reverts #4206; see there for an explanation of the strategy to resolve the merge conflict with #2977.
Labels: module:metrics&benchmarking (metrics and benchmarking modules)
6 participants