transformers refactor: summarizer and base class #1663
Just tested the panel data portion of the SummaryTransformer; it all works fine with univariate data. With an input of shape (20, 1, 24) I get an output of shape (20, 7), and I can build a sklearn classifier with no issue.
Multivariate data seems to have an issue, however. With an input of shape (10, 6, 100) I get an output of shape (60, 7). The conversion process has given me 50 extra cases; I would expect (10, 42).
If we need this in to stay on track for release, I'm not going to push for the above to be sorted now, as you could probably reshape the output into the correct format (even though that would be annoying). Will approve, as the other changes are good, and let you decide if more should be added to the PR.
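For reference, the reshape workaround mentioned above could look roughly like the sketch below. It uses a dummy NumPy array in place of the real transformer output, and it assumes the 60 rows are ordered instance-major (all 6 variables of the first case, then all 6 of the second, and so on). That ordering is an assumption, not something confirmed in this thread; if the rows are ordered variable-major instead, a plain reshape would silently mix up cases.

```python
import numpy as np

n_instances, n_variables, n_features = 10, 6, 7

# Dummy stand-in for the (60, 7) output: one row per (instance, variable) pair.
# ASSUMPTION: rows are ordered instance-major, i.e. all variables of
# instance 0 come first, then all variables of instance 1, and so on.
long_output = np.arange(
    n_instances * n_variables * n_features, dtype=float
).reshape(n_instances * n_variables, n_features)

# Fold the variable axis into the columns, leaving one row per instance.
wide_output = long_output.reshape(n_instances, n_variables * n_features)

print(long_output.shape)  # (60, 7)
print(wide_output.shape)  # (10, 42)
```

Under that ordering, columns 0 to 6 of each row hold the features of the first variable, columns 7 to 13 those of the second, and so on.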
What should the result be in the multivariate case, in your opinion? Would you expect a multi-index across rows, or across columns? Or something else?
@MatthewMiddlehurst, I opened an issue here; I think it's primarily a design question. FYI, the vectorization over the variables isn't coming from me, i.e., it is not automated by the base class. If you didn't intend it, it's probably coming from apply-like
This refactors the `SummaryTransformer` to use the new transformer template.
Some minor changes there: `transform` was still present and overwrote `_transform`, and the "inner type" tags were set incorrectly.
In addition, this fixes two bugs:
- `Panel` is passed to a `Series`-wise transform that outputs `Primitives`: in this case, the different rows were not concatenated properly (some index issues).
- `fit` is called before `transform`, even if the `fit-in-transform` flag is True (which means `fit` is empty and can be skipped, but `fit` still needs to be called). As a shorthand for "transform right away", `fit_transform` should be called, not `transform`.
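To illustrate the calling contract described above, here is a minimal, self-contained sketch. It is not the actual base class from this PR: `BaseTransformerSketch`, `MeanSummary`, the `fit_in_transform` class attribute, and the `_is_fitted` flag are all hypothetical names chosen for illustration. The idea is that the public `transform` lives only in the base class and delegates to `_transform`, and that `fit` must always be called first, even when fitting is a no-op.

```python
class BaseTransformerSketch:
    """Hypothetical stand-in for a transformer base class (illustration only)."""

    # Hypothetical tag: True means fitting is empty for this transformer.
    fit_in_transform = False

    def __init__(self):
        self._is_fitted = False

    def fit(self, X):
        # The inner _fit can be skipped when the tag is set,
        # but the fitted state is still recorded.
        if not self.fit_in_transform:
            self._fit(X)
        self._is_fitted = True
        return self

    def transform(self, X):
        # Subclasses implement _transform and leave this method alone.
        if not self._is_fitted:
            raise RuntimeError("call fit or fit_transform before transform")
        return self._transform(X)

    def fit_transform(self, X):
        # Shorthand for "transform right away".
        return self.fit(X).transform(X)

    def _fit(self, X):
        pass

    def _transform(self, X):
        raise NotImplementedError


class MeanSummary(BaseTransformerSketch):
    """Toy summarizer: one mean per series, with nothing to fit."""

    fit_in_transform = True

    def _transform(self, X):
        return [sum(series) / len(series) for series in X]


means = MeanSummary().fit_transform([[1, 2, 3], [4, 6]])
print(means)  # [2.0, 5.0]
```

In this sketch, calling `MeanSummary().transform(...)` without a prior `fit` raises a `RuntimeError`, which mirrors the rule that `fit_transform`, not `transform`, is the shorthand for transforming right away.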