mihailescum
Contributor

I updated the fast pattern matching tutorial and welcome any suggestions.

A question from my side: I wrote this a while ago, when there was still an excl_zone parameter. I wanted to demonstrate that by setting excl_zone to zero in match, you get the same behavior as the manual pattern search in the first part of the tutorial. This is no longer possible. Do you have any suggestions on what to do? It feels natural to be able to disable the exclusion zone when necessary, but that would require something like an ignore_trivial parameter, which does not exist.

@mihailescum
Contributor Author

Also, please ignore the ton of commits. They are just some residue. I guess you will squash the history in the end.

@seanlaw
Contributor

seanlaw commented Jun 14, 2021

A question from my side: I wrote this a while ago, when there was still an excl_zone parameter. I wanted to demonstrate that by setting excl_zone to zero in match, you get the same behavior as the manual pattern search in the first part of the tutorial. This is no longer possible. Do you have any suggestions on what to do? It feels natural to be able to disable the exclusion zone when necessary, but that would require something like an ignore_trivial parameter, which does not exist.

I'm not entirely sure but let me take a look and see if I can understand what you mean.

@seanlaw
Contributor

seanlaw commented Jun 14, 2021

@mexxexx I went through the tutorial really, really quickly and I love what you've done and how you've laid everything out! Great work! I will need to find some time to go over it in more detail.

Regarding the excl_zone, if I understand correctly, you can still set the exclusion zone to zero by doing:

import numpy as np
import stumpy

stumpy.config.STUMPY_EXCL_ZONE_DENOM = np.inf

Then, internally inside of match(), excl_zone = int(np.ceil(m / STUMPY_EXCL_ZONE_DENOM)) should resolve to zero. So, the exclusion zone is now set globally rather than locally. This ensures that the user doesn't "forget" to set it across multiple functions. Of course, the only constraint here is that the exclusion zone must ALWAYS be expressed relative to m but I think this is a reasonable tradeoff. Does that make sense? Hopefully, I'm not misunderstanding your meaning.
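To make the arithmetic concrete, here is a minimal sketch of that expression. It uses the standard library's math module in place of NumPy so it runs without dependencies; the helper name exclusion_zone and the window length m = 50 are made up for illustration and are not part of STUMPY.

```python
import math


def exclusion_zone(m, denom):
    # Same shape as int(np.ceil(m / denom)), written with the standard library
    return int(math.ceil(m / denom))


m = 50  # hypothetical window length

default_zone = exclusion_zone(m, 4)          # 50 / 4 = 12.5, rounded up
disabled_zone = exclusion_zone(m, math.inf)  # 50 / inf = 0.0, so no exclusion zone

print(default_zone, disabled_zone)
```

With a denominator of 4 the zone is 13 for this m, and with an infinite denominator it collapses to 0.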

@mihailescum
Contributor Author

Sure, take your time and then we can see if anything needs adjustment.

Regarding the excl_zone, if I understand correctly, you can still set the exclusion zone to zero by doing:

That is what I needed :) I did not think about setting the denominator to np.inf! It would still make sense to reset the denominator afterwards, right? Otherwise it might screw up the following computations.

What about adding a small context manager to config.py looking something like this:

import contextlib
import sys


@contextlib.contextmanager
def set_environment_variable(name, value):
    thismodule = sys.modules[__name__]

    environment_vars = [v for v in dir(thismodule) if v.startswith("STUMPY")]
    if name not in environment_vars:
        raise AttributeError("No STUMPY environment variable with the name {} exists.".format(name))

    value_old = getattr(thismodule, name)
    setattr(thismodule, name, value)
    try:
        yield
    finally:
        # Restore the previous value even if the body raises an exception
        setattr(thismodule, name, value_old)

Then one could do the following:

with stumpy.config.set_environment_variable("STUMPY_EXCL_ZONE_DENOM", np.inf):
    dosomething()

And then it would automatically reset to the original value:

with stumpy.config.set_environment_variable("STUMPY_EXCL_ZONE_DENOM", np.inf):
    print(stumpy.config.STUMPY_EXCL_ZONE_DENOM) # prints inf

print(stumpy.config.STUMPY_EXCL_ZONE_DENOM) # prints 4

However, I realize that this might be a bit over the top and only for a very specific use case.
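For anyone who wants to try the idea outside of STUMPY, here is a self-contained sketch. It is a variation, not the proposed config.py code: the config object is a hypothetical stand-in namespace, temporary_setting is an illustrative name, and the restore step sits in a finally clause so it also runs if the body raises.

```python
import contextlib
import types

# Hypothetical stand-in for a config module; not the real stumpy.config
config = types.SimpleNamespace(STUMPY_EXCL_ZONE_DENOM=4)


@contextlib.contextmanager
def temporary_setting(namespace, name, value):
    if not hasattr(namespace, name):
        raise AttributeError("No setting with the name {} exists.".format(name))
    old = getattr(namespace, name)
    setattr(namespace, name, value)
    try:
        yield
    finally:
        # Restore whatever value was in place when the block was entered
        setattr(namespace, name, old)


seen = []
with temporary_setting(config, "STUMPY_EXCL_ZONE_DENOM", float("inf")):
    seen.append(config.STUMPY_EXCL_ZONE_DENOM)
    with temporary_setting(config, "STUMPY_EXCL_ZONE_DENOM", 8):
        seen.append(config.STUMPY_EXCL_ZONE_DENOM)
    # Leaving the inner block restores the outer block's value, not the default
    seen.append(config.STUMPY_EXCL_ZONE_DENOM)
seen.append(config.STUMPY_EXCL_ZONE_DENOM)

print(seen)
```

Note that nested blocks restore the previously set value rather than the library default, which may or may not be what a user expects.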

@seanlaw
Contributor

seanlaw commented Jun 15, 2021

What about adding a small context manager to config.py

Initially, I thought that using a context manager was an interesting idea. However, as I thought through it:

  1. It does feel somewhat over-the-top and it feels like a lot of unnecessary coding
  2. I don't love that the reset behavior feels implicit (i.e., it isn't clear to first-time users that the variable will revert to the previously set value and not necessarily to the ORIGINAL value)

It feels a lot more compact and explicit to do:

import numpy as np
import stumpy

stumpy.config.STUMPY_EXCL_ZONE_DENOM = np.inf
do_something()
stumpy.config.STUMPY_EXCL_ZONE_DENOM = 4  # Set it back to the default value

My vote is to not use a context manager.
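One detail worth checking with the explicit approach is how the setting gets assigned. The sketch below uses a hypothetical stand-in module (fake_config, not stumpy.config) and assumes the library reads the setting from the module at call time. Under that assumption, assigning to a name obtained via from ... import only rebinds the local name, while assigning to the attribute on the module changes what the library function sees.

```python
import math
import sys
import types

# Hypothetical stand-in for a config module; registered so it can be imported
fake_config = types.ModuleType("fake_config")
fake_config.EXCL_ZONE_DENOM = 4
sys.modules["fake_config"] = fake_config


def excl_zone(m):
    # Looks the setting up on the module every time it is called
    return int(math.ceil(m / fake_config.EXCL_ZONE_DENOM))


from fake_config import EXCL_ZONE_DENOM

EXCL_ZONE_DENOM = math.inf  # rebinds the local name only
after_local_rebind = excl_zone(50)

fake_config.EXCL_ZONE_DENOM = math.inf  # updates the module attribute
after_module_set = excl_zone(50)

fake_config.EXCL_ZONE_DENOM = 4  # set it back to the default value
after_reset = excl_zone(50)

print(after_local_rebind, after_module_set, after_reset)
```

Only the assignment through the module attribute changes the computed exclusion zone here.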

@seanlaw
Contributor

seanlaw commented Jun 15, 2021

Also, if you wouldn't mind handling this, I've added a Matplotlib style file so the lines:

plt.rcParams["figure.figsize"] = [20, 6]  # width, height
plt.rcParams['xtick.direction'] = 'out'

can now be replaced with:

plt.style.use('stumpy.mplstyle')

@review-notebook-app

review-notebook-app bot commented Jun 22, 2021

seanlaw commented on 2021-06-22T01:52:15Z
----------------------------------------------------------------

Please replace core.mass with stumpy.mass 


@review-notebook-app
seanlaw commented on 2021-06-22T01:52:15Z
----------------------------------------------------------------

Please replace core.mass with stumpy.mass


@review-notebook-app
seanlaw commented on 2021-06-22T01:52:16Z
----------------------------------------------------------------

maybe "STUMPY provides you with a super powerful function called match() that does even more work for you. One benefit of using match is that, as it discovers each new neighbor, it applies an exclusion zone around it and this ensures that every match that is returned is actually a unique occurrence of your input query."


@review-notebook-app

review-notebook-app bot commented Jun 22, 2021

seanlaw commented on 2021-06-22T01:52:16Z
----------------------------------------------------------------

When you said "Above" I thought you meant the last paragraph, so maybe let's be specific and say, "Earlier, we manually sorted the distance profile..."

"patients" should be "patient's"

"For example, if you have EEG data of a patient's heartbeat and want to match one specific beat, then you may consider using a smaller threshold since your time series may be highly regular."

"Let's plot all of the discovered matches to see if we need to adjust our threshold"


@review-notebook-app

review-notebook-app bot commented Jun 22, 2021

seanlaw commented on 2021-06-22T01:52:17Z
----------------------------------------------------------------

"While some of the main features are somewhat conserved across all subsequences, there seem to be a lot of artifacts as well.

With stumpy.match, you have two options for controlling the threshold. You can either specify a constant value (e.g., max_distance=5.0) or provide a custom function. This function has to take one parameter, which will be the distance profile, D, between Q and T. This way, you can encode some dependency on the distance profile into your maximum distance threshold. The default maximum distance is max_distance = max(np.mean(D) - 2 * np.std(D), np.min(D)). This is the typical "two standard deviations below the mean".

Typically, one has to experiment a bit with what an acceptable maximum distance is, so let's try changing it to "four standard deviations below the mean" (i.e., a smaller maximum distance)."
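As a rough illustration of the threshold rule described above, here is a small sketch that evaluates it on a made-up distance profile. It uses only the standard library (statistics.pstdev stands in for np.std, which defaults to the population standard deviation); the function name threshold, the parameter k, and the values in D are assumptions for illustration and do not come from the tutorial.

```python
import statistics


def threshold(D, k=2.0):
    # max(mean(D) - k * std(D), min(D)): k standard deviations below the mean,
    # but never lower than the smallest distance in the profile
    return max(statistics.mean(D) - k * statistics.pstdev(D), min(D))


D = [1.0, 5.0, 6.0, 7.0, 6.0, 5.0, 6.0, 7.0]  # made-up distance profile

two_sigma = threshold(D, k=2.0)
four_sigma = threshold(D, k=4.0)

print(two_sigma, four_sigma)
```

For this profile, the four-standard-deviation rule falls below every observed distance, so the floor at min(D) takes over and the threshold equals the smallest distance. A custom function of the same shape, taking D as its only argument, is what the max_distance parameter described above expects.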


@review-notebook-app

review-notebook-app bot commented Jun 22, 2021

seanlaw commented on 2021-06-22T01:52:17Z
----------------------------------------------------------------

Please change core.mass to stumpy.mass 


@review-notebook-app
seanlaw commented on 2021-06-22T01:52:18Z
----------------------------------------------------------------

Looks like there is an error :(


@review-notebook-app

review-notebook-app bot commented Jun 22, 2021

seanlaw commented on 2021-06-22T01:52:19Z
----------------------------------------------------------------

Please use stumpy.mass instead of core.mass


@seanlaw
Contributor

seanlaw commented Jun 22, 2021

@mexxexx I really enjoyed your addition! Great work! I provided some minor edits for you to consider

@seanlaw
Contributor

seanlaw commented Jul 14, 2021

We are adding a Binder link to the top of each tutorial. Would you mind adding the following below the main H1 header:

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/TDAmeritrade/stumpy/main?filepath=notebooks/Tutorial_Pattern_Searching.ipynb)

Right now it links to Tutorial_Pattern_Searching but you may change that accordingly

@codecov-commenter

codecov-commenter commented Jul 27, 2021

Codecov Report

Merging #407 (34d2744) into main (bf66a4b) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main      #407   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           35        35           
  Lines         2764      2765    +1     
=========================================
+ Hits          2764      2765    +1     
Impacted Files Coverage Δ
stumpy/motifs.py 100.00% <100.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@mihailescum
Contributor Author

mihailescum commented Jul 27, 2021

@seanlaw I finally got around to integrating your feedback into the tutorial. What do you think?

@seanlaw
Contributor

seanlaw commented Jul 28, 2021

@mexxexx I think everything looks great! I found some minor things but I can go over them after this PR gets merged. Thank you for this contribution!

@seanlaw seanlaw marked this pull request as ready for review July 28, 2021 00:30
@seanlaw seanlaw merged commit 1a82dc3 into stumpy-dev:main Jul 28, 2021
@Meng6

Meng6 commented Jul 30, 2021

Also, if you wouldn't mind handling this, I've added a Matplotlib style file so the lines:

plt.rcParams["figure.figsize"] = [20, 6]  # width, height
plt.rcParams['xtick.direction'] = 'out'

can now be replaced with:

plt.style.use('stumpy.mplstyle')

Hi @seanlaw ,

I installed the latest version of the stumpy package (v1.9.2) via Conda, but I got an error while running the following line of code:

plt.style.use('stumpy.mplstyle')

Here is the error message:

File "/Users/xxx/opt/anaconda3/lib/python3.7/site-packages/matplotlib/style/core.py", line 127, in use
    "available styles".format(style)) from err
OSError: 'stumpy.mplstyle' not found in the style library and input is not a valid URL or path; see `style.available` for list of available styles

I think that's because stumpy's Matplotlib style file cannot be found. Do you know how to solve this problem?

Thanks so much for your help!
Meng

@seanlaw
Contributor

seanlaw commented Jul 30, 2021

@Meng6 I think what you are experiencing is a separate issue. Would you mind posting your question in our GitHub Discussions:

https://github.com/TDAmeritrade/stumpy/discussions

And please describe what it is that you are trying to do? The style file is not needed for STUMPY (it is only there to keep our plots looking somewhat consistent). If you are trying to copy the steps in the tutorial, you can safely comment out or ignore that line, or you can save a copy of our style file and place it in the same directory as your notebook.

@seanlaw
Contributor

seanlaw commented Jul 31, 2021

@Meng6 Alternatively, you can replace:

plt.style.use('stumpy.mplstyle')

with:

plt.style.use('https://raw.githubusercontent.com/TDAmeritrade/stumpy/main/docs/stumpy.mplstyle')

And it should work. I've updated all of the tutorials with that now.

@Meng6

Meng6 commented Aug 2, 2021

Hi @seanlaw, it works! Thanks so much for your help & quick response!

@mihailescum mihailescum deleted the mexxexx/issue358 branch August 9, 2021 19:07