[ENH] New similarity search module #724

TonyBagnall · 2023-09-05T18:39:21Z

This PR introduces a new (experimental) module, similarity search, by @baraline.

Overview of the content:

Changed base shape expected for X to (n_samples, n_channels, n_timestamps), and for Q to (n_channels, q_length) to handle the more general case.
The choice of how a candidate of X is matched with Q is left to the child classes (e.g. the TopKSimilaritySearch class asks for a k argument and returns the top-k matches). One possibility would be to add a base class for distance profiles, to force them to share the same API and to expose attributes indicating whether they are normalized.
Normalization happens in the predict method every time it is called. It stores the mean and std of each channel of Q and of all possible candidates of X as attributes (e.g. _X_means), so that the child classes of the base class can use them when the distance profile is normalized. It may be possible to avoid recomputing them by checking that the length of Q and the stored input _X have not changed.
Added naive Euclidean and normalized naive Euclidean distance profiles with basic tests
Added top-k similarity search class with basic tests. predict returns an array of tuples of the form (id_sample, id_timestamp); with k=1 this is [(id_sample, id_timestamp)].
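
To make the shapes and the return format above concrete, here is a minimal NumPy sketch of a naive (optionally normalized) Euclidean distance profile followed by a top-k selection. It is not the code from this PR: the function names `naive_euclidean_profile`, `top_k_matches` and `_znorm`, the `eps` threshold, and the handling of constant channels are assumptions made for illustration only.

```python
import numpy as np


def _znorm(a, eps=1e-8):
    """Z-normalise each channel (row) of a; constant channels are only centred."""
    std = a.std(axis=1, keepdims=True)
    return (a - a.mean(axis=1, keepdims=True)) / np.where(std > eps, std, 1.0)


def naive_euclidean_profile(X, q, normalize=False):
    """Distance between q and every candidate subsequence of X.

    X has shape (n_samples, n_channels, n_timestamps) and q has shape
    (n_channels, q_length). The result has shape
    (n_samples, n_timestamps - q_length + 1).
    """
    # Convert to float so that integer input normalises correctly.
    X = np.asarray(X, dtype=float)
    q = np.asarray(q, dtype=float)
    n_samples, _, n_timestamps = X.shape
    q_length = q.shape[1]
    n_candidates = n_timestamps - q_length + 1
    if normalize:
        q = _znorm(q)

    profile = np.empty((n_samples, n_candidates))
    for i in range(n_samples):
        for j in range(n_candidates):
            candidate = X[i, :, j : j + q_length]
            if normalize:
                candidate = _znorm(candidate)  # returns a new array, X is untouched
            profile[i, j] = np.sqrt(np.sum((candidate - q) ** 2))
    return profile


def top_k_matches(profile, k=1):
    """Return the k smallest distances as (id_sample, id_timestamp) tuples."""
    flat_order = np.argsort(profile, axis=None)[:k]
    ids = np.unravel_index(flat_order, profile.shape)
    return [(int(i), int(j)) for i, j in zip(*ids)]


# Two univariate series of length 6; the query is hidden in sample 1 at t=2.
X = np.zeros((2, 1, 6))
X[1, 0, 2:5] = [1.0, 2.0, 3.0]
q = np.array([[1.0, 2.0, 3.0]])
print(top_k_matches(naive_euclidean_profile(X, q), k=1))  # [(1, 2)]
```

The double loop recomputes the candidate mean and std on every call, which is the cost the stored `_X_means`-style attributes are meant to avoid.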

TODOs left for future PRs:

Make documentation for module and estimator
Decide how to detect that neither the input nor the query length has changed, to avoid computing the normalization again
Refine testing scenarios and add them to the CI pipeline (if this is not done automatically?)
Add a non-naïve Euclidean distance profile using Mueen's algorithm.
Add a template for new cases
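
For the non-naïve distance profile item above, the usual idea is to replace the inner loop with an FFT-based sliding dot product. The sketch below illustrates only that idea for a univariate, non-normalized Euclidean profile; the names `sliding_dot_product` and `fft_euclidean_profile` are hypothetical, and the z-normalization and multichannel handling that a full implementation would need are left out.

```python
import numpy as np


def sliding_dot_product(x, q):
    """Dot product of q with every window of x of length len(q), via FFT."""
    n, m = len(x), len(q)
    size = n + m - 1  # length of the full linear convolution
    # Convolving x with the reversed query yields the sliding dot products.
    conv = np.fft.irfft(np.fft.rfft(x, size) * np.fft.rfft(q[::-1], size), size)
    return conv[m - 1 : n]


def fft_euclidean_profile(x, q):
    """Euclidean distance profile of a 1D query q against a 1D series x."""
    x = np.asarray(x, dtype=float)
    q = np.asarray(q, dtype=float)
    m = len(q)
    # Sum of squares of each window, obtained from a cumulative sum.
    csum = np.concatenate(([0.0], np.cumsum(x**2)))
    window_sq = csum[m:] - csum[:-m]
    # ||w - q||^2 = ||w||^2 + ||q||^2 - 2 <w, q>
    sq_dist = window_sq + np.sum(q**2) - 2.0 * sliding_dot_product(x, q)
    return np.sqrt(np.maximum(sq_dist, 0.0))  # clip tiny negative round-off
```

A quick sanity check is to compare the result with the naive profile on random data, e.g. with `np.allclose`.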

…an distance (#756) * Adding TopKSimilarity Search class with euclidean distance profiles * Removing old normalization attributes * Removing old normalization attributes in predict --------- Co-authored-by: MatthewMiddlehurst <m.middlehurst@uea.ac.uk>

TonyBagnall

Minor points; the maths in the docstring is the main one. Forgot this was my own PR :)

aeon/similarity_search/top_k_similarity.py

aeon/similarity_search/base.py

aeon/similarity_search/distance_profiles/_commons.py

aeon/similarity_search/distance_profiles/naive_euclidean.py

review-notebook-app · 2023-10-16T21:28:56Z

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

hadifawaz1999

LGTM, great work!

Suggestions for a future PR:

1- Add a test for the base class with a dummy similarity search module?
2- Add an example in the docstring of topksimsearch

MatthewMiddlehurst

I have no issue with introducing the module. I have not really looked at the specifics of how it is implemented, and am content to let it evolve as usage and implementations increase.

Most of the comments concern the documentation. I did not check the examples, but may come back to do so (or make a PR later).

aeon/registry/tests/test_lookup.py

aeon/similarity_search/base.py

aeon/similarity_search/distance_profiles/naive_euclidean.py

aeon/similarity_search/distance_profiles/tests/test_naive_euclidean.py

docs/index.md

docs/api_reference/similarity_search.rst

docs/getting_started.md

baraline

The PR should now be in a stable state after the modifications from the previous review. If everyone agrees, we can move forward and build on it in future PRs.

TonyBagnall and others added 6 commits September 5, 2023 15:15

base similarity search

ba32f9d

slow search example

1ad9eca

Merge branch 'main' of https://github.com/aeon-toolkit/aeon into expe…

e798d68

…rimental/similarity_search

Merge branch 'main' into experimental/similarity_search

b6491be

Merge branch 'experimental/similarity_search' of https://github.com/a…

5ed662f

…eon-toolkit/aeon into experimental/similarity_search

MatthewMiddlehurst mentioned this pull request Sep 26, 2023

[ENH] Similarity search base class and TopK search with naïve Euclidean distance #756

Merged

TonyBagnall changed the title ~~[ENH] Experimental new similarity search~~ [ENH] New similarity search module Sep 26, 2023

TonyBagnall and others added 9 commits September 26, 2023 15:54

Merge branch 'main' into experimental/similarity_search

015dc9f

format

c258de7

add init

1141032

Merge branch 'main' into experimental/similarity_search

ac04c74

call constructor

ae4b49a

Merge branch 'main' into experimental/similarity_search

98f95db

add similarity base to register

a64e29b

add similarity-search to tagging

1a1858b

Bugfixes for constant case and input alteration during normalization

33557bc

TonyBagnall commented Oct 6, 2023

TonyBagnall and others added 11 commits October 7, 2023 14:04

Merge branch 'main' into experimental/similarity_search

ba70e87

typo

cf72421

[pre-commit.ci lite] apply automatic fixes

fbda755

typo

f440c3e

Merge branch 'main' into experimental/similarity_search

2ea98a6

Merge branch 'experimental/similarity_search' of https://github.com/a…

1f74035

…eon-toolkit/aeon into experimental/similarity_search

docstrings

9fcd4d3

docstrings

55ebc86

docstrings

bc75368

docstrings

7398eac

docstrings

c7d927f

TonyBagnall marked this pull request as ready for review October 7, 2023 17:24

baraline and others added 15 commits October 20, 2023 17:40

Adding parameters for self matches, typos in example notebook

810de65

typo in import, replace Q with q

23f29cf

switch test example for pipeline

787fe10

switch test example for pipeline

174fff5

Merge branch 'main' of https://github.com/aeon-toolkit/aeon

10f79f2

Add mask to distance profile, move exclusion zoneto base class, some …

2c66919

…numpy docs

Merge branch 'main' into experimental/similarity_search

a63c21c

Add distance profile and speedups notebooks, exclusion factor value c…

b310735

…heck, some utils functions

Merge branch 'main' of https://github.com/aeon-toolkit/aeon

e4d4b3e

Merge branch 'main' of https://github.com/aeon-toolkit/aeon

bd75ab3

Merge branch 'main' of https://github.com/aeon-toolkit/aeon

dc4d82c

Merge branch 'main' into experimental/similarity_search

8d4a3bd

Fixing tests and docs that where not updated after previous changes

e0c82fd

Force float convertion of input to avoid issues with normalization of…

a625f9f

… int types

Merge branch 'main' into experimental/similarity_search

610ac00

TonyBagnall mentioned this pull request Oct 24, 2023

AEP 01 - Similarity Search Module aeon-toolkit/aeon-admin#1

Open

hadifawaz1999 previously approved these changes Oct 25, 2023

TonyBagnall added 2 commits October 25, 2023 18:36

Merge branch 'main' of https://github.com/aeon-toolkit/aeon

60027c5

Merge branch 'main' into experimental/similarity_search

b0a38ba

MatthewMiddlehurst added the enhancement and implementing framework labels Oct 25, 2023

MatthewMiddlehurst reviewed Oct 26, 2023

Adding dummy class and test, correcting some docstrings

b6d322f

baraline dismissed hadifawaz1999’s stale review via b6d322f October 27, 2023 15:49

Fixes from Matthew review

6771ead

baraline approved these changes Oct 28, 2023

MatthewMiddlehurst approved these changes Oct 30, 2023

TonyBagnall merged commit de2da96 into main Oct 30, 2023
18 checks passed

TonyBagnall deleted the experimental/similarity_search branch October 30, 2023 10:14
