add isconstant param to mstump #871
137d831 to
ad32d18
Codecov Report
Coverage Diff
            main      #871     +/-
Coverage    99.12%    99.13%
Files       83        83
Lines       13940     14068    +128
Hits        13818     13946    +128
Misses      122       122
NimaSarajpoor
left a comment
@seanlaw
Could you please check out the only comment I have left for now? Afterwards, I will start by adding the test function and then add the param isconstant to the performant module.
…ion. update test functions
NimaSarajpoor
left a comment
@seanlaw
I have a few comments. Can you please go through them and let me know if you have any feedback?
|
There is still one function that needs to support the parameter
In this function, we can see the following lines:
where the variables
Hence, the D in line 193, i.e.
Now the question is: How should we apply the "isconstant" concept here? One straightforward approach is to just say if both
@seanlaw What do you think? Is there anything that I am missing here? |
That sounds reasonable to me.
Yeah, that sounds fine. At the end of the day, the user providing
In case it matters, you would likely apply the same logic to |
Right! I should check that out in the main branch and just get some sense about the output and see what happens in default.
Yep! I will go through it more carefully to make sure that we are not messing with the default mode, i.e., the output of the existing version.
Right! I missed that while I was going through the script. Thanks for pointing that out. |
…ven if provided as df
|
I have been thinking about the discretization process and whether we should be worried about how it handles (user-defined) constant subseqs. I now think it should be straightforward. This is what I noticed:
Before discretization, we first z-normalize the subseqs. Something very cool happens here: the mean of a z-normalized subseq is zero. This means that the z-normalized subseq is either ALL zero or has BOTH positive and negative values (otherwise, the mean cannot be zero). Therefore, when we discretize a subseq S, I think we cannot have identical values in the discretized version unless the non-discretized version, i.e.
Now, what if the user defines constant subseqs by passing |
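The zero-mean argument above can be checked with a small NumPy sketch. This is an illustration only, not STUMPY's implementation: the helper `z_normalize` is a hypothetical name, and mapping a zero-variance subseq to all zeros is an assumption made for the example.

```python
import numpy as np


def z_normalize(subseq):
    """Z-normalize a 1-D subsequence (hypothetical helper, not STUMPY's code).

    Assumption: a subsequence with zero standard deviation is mapped to all zeros.
    """
    subseq = np.asarray(subseq, dtype=np.float64)
    std = subseq.std()
    if std == 0:
        return np.zeros_like(subseq)
    return (subseq - subseq.mean()) / std


rng = np.random.default_rng(0)
non_constant = z_normalize(rng.normal(size=8))  # a generic, non-constant subseq
constant = z_normalize(np.full(8, 3.0))         # a truly constant subseq

# A non-constant z-normalized subseq has mean ~0, so it must contain
# both positive and negative values.
has_both_signs = bool((non_constant > 0).any() and (non_constant < 0).any())

# A constant subseq collapses to all zeros under the assumed convention.
is_all_zero = not constant.any()
```

Here `has_both_signs` and `is_all_zero` capture the two cases described in the comment: a z-normalized subseq either mixes signs or is identically zero.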
Ahh, I see what you mean. Since we z-normalize prior to discretization, we should be mostly safe (at least in the case where the subsequence is "truly" constant). In the case where the user provides
If so, then it sounds like this greatly simplifies things! Thanks for thinking through it @NimaSarajpoor. It's been so long that I've forgotten all of the inner workings. Hopefully, the code/logic wasn't too painful to understand? |
That is correct!
I think it was good overall :) |
@seanlaw
I think we are close to finishing this PR. I left a few comments. Some of them are notes for myself, so you can ignore those; I will take care of them unless you have a suggestion or a different opinion.
|
@seanlaw |
Thank you. I will find some time to review it |
seanlaw
left a comment
@NimaSarajpoor Let's start with these comments
|
@NimaSarajpoor Do I need to review anything yet? |
No, I haven't made the changes yet. Hopefully, I will push them in upcoming days. |
45b965a to
4e0bd04
|
@seanlaw |
Thank you. I will find some time to review it |
|
@NimaSarajpoor this was a more challenging one. Thank you for your contribution and persistence! |
Thank you for the guidance! |