Faster sax #3

Closed
wants to merge 4 commits into from

@rhsimplex rhsimplex commented Jul 25, 2018

Optimizations and a couple of tricks to make the runtime linear in the length of the sequence:

In [3]: %timeit -r1 find_discords_hotsax(np.random.random(5000), num_discords=2)
8.85 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

In [4]: %timeit -r1 find_discords_hotsax(np.random.random(10000), num_discords=2)
17.8 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

In [5]: %timeit -r1 find_discords_hotsax(np.random.random(20000), num_discords=2)
36.1 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

In [6]: %timeit -r1 find_discords_hotsax(np.random.random(40000), num_discords=2)
1min 18s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

Tricks:

  • precompute all znorms (old code does a lot of redundant calculations)
  • don't search over most common PAA sequence (in sequences with no anomalies, these will dominate by far, and are the least interesting regions)
  • limit random search. As noted in the paper, random search is O(n^2)
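
The three tricks above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper names (`znorm_windows`, `sax_words`, `candidate_indices`, `capped_nn_distance`), the `max_random` cap, and the breakpoints in the usage example are all assumptions made up for the example, and capping the random search makes the nearest-neighbor distance approximate.

```python
import numpy as np
from collections import Counter


def znorm_windows(series, window, eps=1e-8):
    """Z-normalize every sliding window once, up front (hypothetical helper)."""
    series = np.asarray(series, dtype=float)
    # Shape (n - window + 1, window); requires NumPy >= 1.20.
    windows = np.lib.stride_tricks.sliding_window_view(series, window)
    mean = windows.mean(axis=1, keepdims=True)
    std = windows.std(axis=1, keepdims=True)
    # Avoid dividing by ~0 on flat windows.
    std = np.where(std < eps, 1.0, std)
    return (windows - mean) / std


def sax_words(normed, paa_size, breakpoints):
    """Reduce each window with PAA, then discretize into a word (tuple of ints)."""
    n, window = normed.shape
    # Assumes window is divisible by paa_size.
    paa = normed.reshape(n, paa_size, window // paa_size).mean(axis=2)
    symbols = np.digitize(paa, breakpoints)
    return [tuple(row) for row in symbols]


def candidate_indices(words):
    """Skip every window whose word is the single most common one."""
    most_common_word, _ = Counter(words).most_common(1)[0]
    return [i for i, w in enumerate(words) if w != most_common_word]


def capped_nn_distance(normed, i, max_random=100, rng=None):
    """Approximate nearest-neighbor distance using at most max_random draws."""
    rng = np.random.default_rng(rng)
    n, window = normed.shape
    js = rng.integers(0, n, size=max_random)
    # Drop trivial matches that overlap window i.
    js = js[np.abs(js - i) >= window]
    if js.size == 0:
        return np.inf
    return np.linalg.norm(normed[js] - normed[i], axis=1).min()
```

A possible way to string these together, again only as an assumed usage pattern:

```python
series = np.random.random(5000)
normed = znorm_windows(series, window=100)
words = sax_words(normed, paa_size=4, breakpoints=[-0.43, 0.43])
scores = {i: capped_nn_distance(normed, i, rng=0) for i in candidate_indices(words)}
```

Because each candidate costs a fixed number of distance computations, the total work in this sketch grows linearly with the number of windows.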

@rhsimplex rhsimplex closed this Jul 25, 2018
@rhsimplex
Author

Sorry, meant to push this to my own fork. Can still make a PR if you're interested in stuff like this =)
