Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pairwise transformers tests #1096

Merged
merged 9 commits into from Jul 5, 2021

Conversation

chrisholder
Copy link
Collaborator

Reference Issues/PRs

Tests to go along side the PR #1071 @fkiraly

What does this implement/fix? Explain your changes.

Implements basic tests for general transformers and specific tests for the scipy_dist pairwise transformer.

Does your contribution introduce a new dependency? If yes, which one?

What should a reviewer concentrate their feedback on?

Is this enough I can add more but felt for now keep it simple until we see where this goes fully.

Any other comments?

PR checklist

For all contributions
  • I've added myself to the list of contributors.
  • Optionally, I've updated sktime's CODEOWNERS to receive notifications about future changes to these files.
  • I've added unit tests and made sure they pass locally.
For new estimators
  • I've added the estimator to the online documentation.
  • I've updated the existing example notebooks or provided a new one to showcase how my estimator works.

@chrisholder chrisholder requested a review from mloning as a code owner July 2, 2021 19:42
@fkiraly
Copy link
Collaborator

fkiraly commented Jul 3, 2021

I think this is still failing because the automated tests in the _make_args logic - I think this needs adapting for the test still?

@chrisholder
Copy link
Collaborator Author

chrisholder commented Jul 3, 2021

Yea I was going to ask you about this since that was what I was about to fix and realised I didn't know what you wanted. Because this is inheriting BaseEstimator there are standard test expectations such as _is_fittedm fit(), predict() etc. Was just wondering why it is inheriting BaseEstimator if these aren't being used?

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 3, 2021

Yea I was going to ask you about this since that was what I was about to fix and realised I didn't know what you wanted. Because this is inheriting BaseEstimator there are standard test expectations such as _is_fittedm fit(), predict() etc. Was just wondering why it is inheriting BaseEstimator if these aren't being used?

The useful functionality we want is get_params and the associated nested parameter access from scikit-learn.
Indeed, fitting is not relevant for most distances. This is also true for some other objects, which is why @RNKuhns has been trying to create a BaseObject which is BaseEstimator minus fitting, see #877 and #891.

I think the distances set the precedent for objects that do not need fitting, so it may require careful thinking what to do about the tests. An easy way may be to just manually specify the methods that the distance is expected to have - or, if that is not easy, just skip some of the tests if you see a distance/kernel.

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 4, 2021

quick question: do you know what to do here, or should I have a look?

@chrisholder
Copy link
Collaborator Author

Sorry yea should be all working now had to write some extra tests for the panel data

@chrisholder
Copy link
Collaborator Author

chrisholder commented Jul 4, 2021

I've made some changes to BasePairwiseTransformerPanel that you may not like btw. The comment for what BasePairwiseTransformerPanel supports was numpy or List of dataframes, but it was only supporting List of dataframes so I just added a method to allow it support List of dataframes, Numpy of dataframes, and 3d Numpy - maybe wasn't your intention can remove very easily if you don't want it.

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 5, 2021

Yes, I'm fine with the changes to the input conversion - this is something we may anyway refactor in the near future, but good for now as it covers a potential problem. Also happy with the changes to the p condition.

Regarding the __init__, I don't like that you moved np.mean to the default args.
I think a general good practice recommendation is to avoid non-primitive arguments in the init, or arguments that are reliant on imports.
This is because having mutable arguments in the init (like lists or data frames) may lead to easy to commit but hard to track down bugs. Now np.mean is not mutable, but it's imported, so potentially open to that issue. It's also easier to keep the rule in mind if you memorize that only strings, numbers etc go as defaults.

Maybe being a bit overcautious here, but I'd consider some adherence with "defensive programming" principles as good practice.

Feel free to merge either way though.

@chrisholder chrisholder merged commit 8d570b7 into sktime:trafo-distances Jul 5, 2021
@chrisholder chrisholder deleted the trafo-distances branch July 5, 2021 14:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants