Baseline Similarity Evaluation with Annoy #364

Merged
merged 380 commits into from Jul 21, 2021
4f70ffc
Adds bulk endpoint for similar recordings
aidanlw17 Jun 25, 2019
79ff1c2
Add bulk get functionality to index model
aidanlw17 Jun 25, 2019
27164c6
Parse parameters to index to check that they are possible
aidanlw17 Jun 25, 2019
243e297
Adds bulk endpoint for similar recordings
aidanlw17 Jun 25, 2019
5f8a723
Add bulk get functionality to index model
aidanlw17 Jun 25, 2019
34f5af1
removes postgres and hybrid similarity
aidanlw17 Jul 4, 2019
b91171a
Add blueprint for similarity api
aidanlw17 Jul 4, 2019
3e634e7
Add endpoint to get distance between recordings
aidanlw17 Jul 4, 2019
fa34f08
Remove whitespace
aidanlw17 Jul 4, 2019
6045a44
Alters parameter checking for similarity indices
aidanlw17 Jul 6, 2019
0dc187e
Alters exception for getting lowlevel id
aidanlw17 Jul 6, 2019
f230312
Add unit testing for get_similar_recordings
aidanlw17 Jul 6, 2019
aaa4adb
Adds tests for bulk similarity lookup
aidanlw17 Jul 6, 2019
a5d0f04
Uppercase MBID in docstring
aidanlw17 Jul 6, 2019
3350e0a
Remove connection property for AnnoyModel class
aidanlw17 Jul 7, 2019
a41e4d8
Uses mock AnnoyModel to test similarity lookup endpoints
aidanlw17 Jul 7, 2019
371c4d8
Fix typo in bulk get
aidanlw17 Jul 7, 2019
c639549
Fix issue with loading index function call
aidanlw17 Jul 7, 2019
5b8bd7e
Add test cases for similarity between recordings
aidanlw17 Jul 7, 2019
c9e1270
Fix error in similarity between return value
aidanlw17 Jul 7, 2019
dee8cda
Refactor to move sql queries into db module
aidanlw17 Jul 7, 2019
3757394
Remove unneeded imports in similarity/script.py
aidanlw17 Jul 8, 2019
bc1e568
Adds and improves docstrings
aidanlw17 Jul 8, 2019
1caf227
Adds utilities for querying
aidanlw17 Jun 25, 2019
a1533d0
Add cli command for removing index
aidanlw17 Jun 25, 2019
420b181
Adds bulk endpoint for similar recordings
aidanlw17 Jun 25, 2019
b5f648a
Add bulk get functionality to index model
aidanlw17 Jun 25, 2019
d2b7c1e
Parse parameters to index to check that they are possible
aidanlw17 Jun 25, 2019
782feb5
Adds bulk endpoint for similar recordings
aidanlw17 Jun 25, 2019
3eeff0b
Add bulk get functionality to index model
aidanlw17 Jun 25, 2019
b9b5c2a
removes postgres and hybrid similarity
aidanlw17 Jul 4, 2019
89bdce1
Add endpoint to get distance between recordings
aidanlw17 Jul 4, 2019
163d8d4
Alters parameter checking for similarity indices
aidanlw17 Jul 6, 2019
f7634d6
Adds tests for bulk similarity lookup
aidanlw17 Jul 6, 2019
31c911c
Remove connection property for AnnoyModel class
aidanlw17 Jul 7, 2019
4789663
Uses mock AnnoyModel to test similarity lookup endpoints
aidanlw17 Jul 7, 2019
ebae5e2
Fix issue with loading index function call
aidanlw17 Jul 7, 2019
ba7e9ed
Add test cases for similarity between recordings
aidanlw17 Jul 7, 2019
1045e3c
Refactor to move sql queries into db module
aidanlw17 Jul 7, 2019
f205b47
Adds and improves docstrings
aidanlw17 Jul 8, 2019
2d6fe33
Remove duplicate function after rebase
aidanlw17 Jul 10, 2019
ff95070
Fix n_neighbours parsing
aidanlw17 Jul 10, 2019
c40bf72
Add unit tests and refactor
aidanlw17 Jul 11, 2019
a0b1704
Fix bug in unit test
aidanlw17 Jul 12, 2019
d307c72
Refactors to avoid circular imports and remove unnecessary function
aidanlw17 Jul 12, 2019
805e5ae
Small bug fix and changes to unit test mocks
aidanlw17 Jul 16, 2019
c3cb64f
Change exception for python2
aidanlw17 Jul 16, 2019
554650f
Add tests for build/save/load cycle and init in index model
aidanlw17 Jul 16, 2019
d175107
Finishes test suite for index_model
aidanlw17 Jul 17, 2019
a76885f
Adds similarity eval table and fks in update
aidanlw17 Jul 17, 2019
5299bd1
Drop similarity_eval table and fks in script
aidanlw17 Jul 17, 2019
aa9e440
Add route and template for similarity metrics
aidanlw17 Jul 17, 2019
c85ba28
Remove check for visible metrics
aidanlw17 Jul 17, 2019
ad61f42
adds schema, templates, and routes
aidanlw17 Jul 18, 2019
754b8db
Add support for similarity schema
aidanlw17 Jul 26, 2019
2fcc9b1
Improve appearance of similar recordings link
aidanlw17 Jul 26, 2019
02d663e
Adds similarity schema
aidanlw17 Jul 23, 2019
d048cab
Fix references to similarity tables
aidanlw17 Jul 23, 2019
712df43
Fix typo
aidanlw17 Jul 26, 2019
6577837
Upgrade annoy to 1.16.0
aidanlw17 Jul 28, 2019
bea8059
Improve speed of adding indices and remove default for params
aidanlw17 Jul 28, 2019
fafc086
Add utility to initialize indices
aidanlw17 Jul 28, 2019
e7d4640
Add message for error
aidanlw17 Jul 31, 2019
8323939
Adds forms for evaluation
aidanlw17 Jul 31, 2019
b5b5645
Adds script for similarity eval submission
aidanlw17 Jul 31, 2019
1421b98
Alter evaluation to adhere to schema and forms
aidanlw17 Jul 31, 2019
27986e5
Add utility for offset
aidanlw17 Jul 31, 2019
5dba8b7
Build route functions for displaying and submitting eval
aidanlw17 Jul 31, 2019
d34cb41
Remove unnecessary table
aidanlw17 Aug 1, 2019
4225a79
Populate similarity.eval_params table
aidanlw17 Aug 1, 2019
810938a
Naming changes and unique constraints
aidanlw17 Aug 1, 2019
1d71b90
Add evaluation and submit eval results functions
aidanlw17 Aug 1, 2019
404c82f
Alter index model get_nns query to return distances
aidanlw17 Aug 1, 2019
d36ab1a
Alter eval view functions to collect submission info
aidanlw17 Aug 1, 2019
29a24f1
Update sql scripts with eval update
aidanlw17 Aug 2, 2019
e276a64
Fix similarity schema references
aidanlw17 Aug 2, 2019
294d42f
Fix typo in test
aidanlw17 Aug 2, 2019
b485a46
Refactor utils to avoid circular imports
aidanlw17 Aug 2, 2019
d481da8
Fix exceptions in api endpoint
aidanlw17 Aug 2, 2019
f6d93b4
Fix process batch size and conflicts for submitting eval_results
aidanlw17 Aug 9, 2019
4fdc4cf
Alter get_nns functions to support eval_results
aidanlw17 Aug 9, 2019
9a0f35e
Remove use of time library
aidanlw17 Aug 9, 2019
1bb650d
Update similarity api to match get_nns changes
aidanlw17 Aug 9, 2019
f570506
get_similar submits correct info about eval
aidanlw17 Aug 9, 2019
009ad8b
Refactor for similarity ui blueprint
aidanlw17 Aug 9, 2019
7407efd
Add constraint on eval feedback - one form per user
aidanlw17 Aug 22, 2019
baf0629
Alter get_mbids_by_ids to preserve order
aidanlw17 Aug 22, 2019
7f218e9
Add function to check for user in eval submissions
aidanlw17 Aug 22, 2019
a6e0800
Add offset validation utility
aidanlw17 Aug 22, 2019
4c7b891
Adds evaluation and metrics templates
aidanlw17 Aug 22, 2019
5ece7e7
Add evaluation with react
aidanlw17 Aug 22, 2019
44c1d91
Add similarity eval routes
aidanlw17 Aug 22, 2019
9a6fdc8
Remove unneeded comments
aidanlw17 Aug 24, 2019
a705fbd
Compute stats before submitting similarity
aidanlw17 Aug 24, 2019
40c5bc5
Catching error displays error message
aidanlw17 Aug 26, 2019
967a68a
Adds patch for musicbrainz
aidanlw17 Aug 26, 2019
53cfb01
Removes unneeded comment
aidanlw17 Aug 26, 2019
6efd9b5
Add docs for eval and similarity stats
aidanlw17 Aug 27, 2019
28ff00d
Add add-indices to init command and reorganize command options
aidanlw17 Aug 27, 2019
f6e1558
Use logger instead of print
aidanlw17 Aug 27, 2019
3331e5e
Edits docstring
aidanlw17 Aug 27, 2019
1abcc9c
Alter error messages and stats cast count to float
aidanlw17 Sep 27, 2019
61c6135
Add time to logging info
aidanlw17 Sep 27, 2019
46f850f
Fix error handling for Key/Scale and remove print statement
aidanlw17 Sep 27, 2019
364c7ca
Add annoy as a dependency
aidanlw17 Jun 24, 2019
de9d445
Creates index_model class and init function
aidanlw17 Jun 24, 2019
261d2e1
Adds exceptions for similarity module and get_vector_dimension
aidanlw17 Jun 24, 2019
6bf1890
Add build/save/load functionality
aidanlw17 Jun 24, 2019
69a6364
Adds functions to add items to index
aidanlw17 Jun 24, 2019
f0fb034
Add script for building a single index
aidanlw17 Jun 25, 2019
7b9a28d
Add functionality to query by mbid or by id
aidanlw17 Jun 25, 2019
7b1fd65
Adds scripts to query annoy and postgres for similar recordings
aidanlw17 Jun 25, 2019
069ef94
Parse parameters to index to check that they are possible
aidanlw17 Jun 25, 2019
7742ea5
Adds endpoint for similar recordings to a single (MBID, offset) combi…
aidanlw17 Jun 25, 2019
ed7acf7
Adds bulk endpoint for similar recordings
aidanlw17 Jun 25, 2019
5446582
Add bulk get functionality to index model
aidanlw17 Jun 25, 2019
1da450e
Parse parameters to index to check that they are possible
aidanlw17 Jun 25, 2019
04d420d
Adds bulk endpoint for similar recordings
aidanlw17 Jun 25, 2019
c22f6c4
Add bulk get functionality to index model
aidanlw17 Jun 25, 2019
75a3bb6
removes postgres and hybrid similarity
aidanlw17 Jul 4, 2019
b541028
Add blueprint for similarity api
aidanlw17 Jul 4, 2019
a0130e8
Add endpoint to get distance between recordings
aidanlw17 Jul 4, 2019
7c83a33
Remove whitespace
aidanlw17 Jul 4, 2019
d2e17c0
Alters parameter checking for similarity indices
aidanlw17 Jul 6, 2019
60b6054
Alters exception for getting lowlevel id
aidanlw17 Jul 6, 2019
3b802c9
Add unit testing for get_similar_recordings
aidanlw17 Jul 6, 2019
0308ecb
Adds tests for bulk similarity lookup
aidanlw17 Jul 6, 2019
6938b93
Uppercase MBID in docstring
aidanlw17 Jul 6, 2019
b30f993
Remove connection property for AnnoyModel class
aidanlw17 Jul 7, 2019
c6e03ab
Uses mock AnnoyModel to test similarity lookup endpoints
aidanlw17 Jul 7, 2019
b70b6e3
Fix typo in bulk get
aidanlw17 Jul 7, 2019
5a15a89
Fix issue with loading index function call
aidanlw17 Jul 7, 2019
2842860
Add test cases for similarity between recordings
aidanlw17 Jul 7, 2019
fe71833
Fix error in similarity between return value
aidanlw17 Jul 7, 2019
5ac6cc1
Refactor to move sql queries into db module
aidanlw17 Jul 7, 2019
dae1330
Remove unneeded imports in similarity/script.py
aidanlw17 Jul 8, 2019
cc4138b
Adds and improves docstrings
aidanlw17 Jul 8, 2019
ae208db
Adds utilities for querying
aidanlw17 Jun 25, 2019
0548ebc
Add cli command for removing index
aidanlw17 Jun 25, 2019
5a35db3
Adds bulk endpoint for similar recordings
aidanlw17 Jun 25, 2019
2043a8e
Add bulk get functionality to index model
aidanlw17 Jun 25, 2019
c7cfa15
Parse parameters to index to check that they are possible
aidanlw17 Jun 25, 2019
7af3351
Adds bulk endpoint for similar recordings
aidanlw17 Jun 25, 2019
2b2f214
Add bulk get functionality to index model
aidanlw17 Jun 25, 2019
f42b851
removes postgres and hybrid similarity
aidanlw17 Jul 4, 2019
4365a57
Add endpoint to get distance between recordings
aidanlw17 Jul 4, 2019
48c3913
Alters parameter checking for similarity indices
aidanlw17 Jul 6, 2019
5ef0497
Adds tests for bulk similarity lookup
aidanlw17 Jul 6, 2019
10e59ee
Remove connection property for AnnoyModel class
aidanlw17 Jul 7, 2019
fdd029a
Uses mock AnnoyModel to test similarity lookup endpoints
aidanlw17 Jul 7, 2019
93b1f0f
Fix issue with loading index function call
aidanlw17 Jul 7, 2019
6820a64
Add test cases for similarity between recordings
aidanlw17 Jul 7, 2019
d4e9ef5
Refactor to move sql queries into db module
aidanlw17 Jul 7, 2019
7a36aae
Adds and improves docstrings
aidanlw17 Jul 8, 2019
3d1df6d
Remove duplicate function after rebase
aidanlw17 Jul 10, 2019
7716f79
Fix n_neighbours parsing
aidanlw17 Jul 10, 2019
0bc9095
Add unit tests and refactor
aidanlw17 Jul 11, 2019
f8d8f09
Fix bug in unit test
aidanlw17 Jul 12, 2019
35e834a
Refactors to avoid circular imports and remove unnecessary function
aidanlw17 Jul 12, 2019
8435fee
Small bug fix and changes to unit test mocks
aidanlw17 Jul 16, 2019
069e571
Change exception for python2
aidanlw17 Jul 16, 2019
54f1c5d
Add tests for build/save/load cycle and init in index model
aidanlw17 Jul 16, 2019
be4ee39
Finishes test suite for index_model
aidanlw17 Jul 17, 2019
05ab59d
Adds similarity eval table and fks in update
aidanlw17 Jul 17, 2019
900b474
Drop similarity_eval table and fks in script
aidanlw17 Jul 17, 2019
43986ad
Add route and template for similarity metrics
aidanlw17 Jul 17, 2019
b7d2d2e
Remove check for visible metrics
aidanlw17 Jul 17, 2019
7818115
adds schema, templates, and routes
aidanlw17 Jul 18, 2019
c1d599a
Add support for similarity schema
aidanlw17 Jul 26, 2019
f0b56a3
Improve appearance of similar recordings link
aidanlw17 Jul 26, 2019
4cdb2a6
Adds similarity schema
aidanlw17 Jul 23, 2019
65c1d45
Fix references to similarity tables
aidanlw17 Jul 23, 2019
88c2c6c
Fix typo
aidanlw17 Jul 26, 2019
aa6b182
Upgrade annoy to 1.16.0
aidanlw17 Jul 28, 2019
246b6f1
Improve speed of adding indices and remove default for params
aidanlw17 Jul 28, 2019
98fe919
Add utility to initialize indices
aidanlw17 Jul 28, 2019
637383f
Add message for error
aidanlw17 Jul 31, 2019
a725869
Adds forms for evaluation
aidanlw17 Jul 31, 2019
7059961
Adds script for similarity eval submission
aidanlw17 Jul 31, 2019
3045e5c
Alter evaluation to adhere to schema and forms
aidanlw17 Jul 31, 2019
aba69ab
Add utility for offset
aidanlw17 Jul 31, 2019
fd032c1
Build route functions for displaying and submitting eval
aidanlw17 Jul 31, 2019
fba7559
Remove unnecessary table
aidanlw17 Aug 1, 2019
0c0cbfa
Populate similarity.eval_params table
aidanlw17 Aug 1, 2019
c235370
Naming changes and unique constraints
aidanlw17 Aug 1, 2019
f0a989e
Add evaluation and submit eval results functions
aidanlw17 Aug 1, 2019
4337c22
Alter index model get_nns query to return distances
aidanlw17 Aug 1, 2019
6920b0b
Alter eval view functions to collect submission info
aidanlw17 Aug 1, 2019
a5ae26c
Update sql scripts with eval update
aidanlw17 Aug 2, 2019
1aa214b
Fix similarity schema references
aidanlw17 Aug 2, 2019
0437fa0
Fix typo in test
aidanlw17 Aug 2, 2019
e4fa9ca
Refactor utils to avoid circular imports
aidanlw17 Aug 2, 2019
526e238
Fix exceptions in api endpoint
aidanlw17 Aug 2, 2019
0d565c4
Fix process batch size and conflicts for submitting eval_results
aidanlw17 Aug 9, 2019
b872063
Alter get_nns functions to support eval_results
aidanlw17 Aug 9, 2019
534d35c
Remove use of time library
aidanlw17 Aug 9, 2019
e029538
Update similarity api to match get_nns changes
aidanlw17 Aug 9, 2019
a5ceaf0
get_similar submits correct info about eval
aidanlw17 Aug 9, 2019
a80927f
Refactor for similarity ui blueprint
aidanlw17 Aug 9, 2019
294f3ef
Add constraint on eval feedback - one form per user
aidanlw17 Aug 22, 2019
b5d675a
Alter get_mbids_by_ids to preserve order
aidanlw17 Aug 22, 2019
70e0fdc
Add function to check for user in eval submissions
aidanlw17 Aug 22, 2019
eb770dd
Add offset validation utility
aidanlw17 Aug 22, 2019
5262f0b
Adds evaluation and metrics templates
aidanlw17 Aug 22, 2019
092a26a
Add evaluation with react
aidanlw17 Aug 22, 2019
4129b36
Add similarity eval routes
aidanlw17 Aug 22, 2019
30357ea
Remove unneeded comments
aidanlw17 Aug 24, 2019
388cc1e
Compute stats before submitting similarity
aidanlw17 Aug 24, 2019
901f820
Catching error displays error message
aidanlw17 Aug 26, 2019
cd54fcf
Adds patch for musicbrainz
aidanlw17 Aug 26, 2019
0901d6b
Removes unneeded comment
aidanlw17 Aug 26, 2019
49b3579
Add add-indices to init command and reorganize command options
aidanlw17 Aug 27, 2019
19cc5f7
Use logger instead of print
aidanlw17 Aug 27, 2019
8ae37ab
Edits docstring
aidanlw17 Aug 27, 2019
bd2f643
Merge remote-tracking branch 'origin/eval' into eval
aidanlw17 Sep 27, 2019
d133b28
Add batch_size default as constant
aidanlw17 Sep 27, 2019
e2c09d4
Fix count issue
aidanlw17 Sep 27, 2019
f734763
Use count(id) instead of max(id) for total
aidanlw17 Sep 27, 2019
24c0a9a
Remove check for vector existence when adding to index
aidanlw17 Sep 29, 2019
a871cfe
Add newline at end of files
aidanlw17 Sep 29, 2019
699e37e
Adds chunks as a utility
aidanlw17 Sep 29, 2019
e0ec01e
Add utility to add multiple rows
aidanlw17 Sep 29, 2019
2ce5c58
Fix equality
aidanlw17 Sep 29, 2019
16336b0
Change process for empty rows and query beforehand
aidanlw17 Sep 29, 2019
39ae5d9
Split submit_similarity_by_id into bulk and non-bulk methods
aidanlw17 Sep 29, 2019
09bbc0e
Rename metric dimension to metric dimensionality
aidanlw17 Sep 29, 2019
348021c
Rename get_similarity_row_mbid to get_similarity_by_mbid
aidanlw17 Sep 29, 2019
acf6ac6
Merge remote-tracking branch 'aidanlw17/eval' into master
alastair Jun 9, 2020
70b8273
Merge remote-tracking branch 'aidanlw17/sim-docs' into eval
alastair Sep 18, 2020
defedc6
Fix similarity docs table
alastair Sep 18, 2020
b19b63c
Add more detailed documentation about how similarity indicies work
alastair Sep 23, 2020
ee55f85
Merge remote-tracking branch 'origin/master' into eval
alastair Sep 23, 2020
8808dc7
Generate similarity metrics in bulk
alastair Sep 23, 2020
4cb5229
flake8 cleanups
alastair Sep 23, 2020
033c048
Ignore annoy indicies during docker build
alastair Sep 24, 2020
f9f9e7b
Only enable similarity endpoints if feature flag is set
alastair Sep 24, 2020
90d6e31
Improve returned data and speed of similarity endpoints
alastair Oct 1, 2020
e31ed15
Add remove_dups parameter to remove duplicate mbids with the same dis…
alastair Oct 1, 2020
95f8089
Improve docs
alastair Oct 2, 2020
021a4d8
Merge branch 'master' into eval
alastair Jul 2, 2021
d0be2a6
Add similarity data location, and config to consul template
alastair Jul 20, 2021
d13ff89
Fix similarity init method
alastair Jul 20, 2021
1fa50df
Fix similarity tests
alastair Jul 20, 2021
766d257
Allow duplicates to be removed even if they have different scores
alastair Jul 21, 2021
7165b4f
Remove the single-mbid similarity method
alastair Jul 21, 2021
0f04359
Disable evaluation submission
alastair Jul 21, 2021
cdbdb7c
Remove youtube player
alastair Jul 21, 2021
100a0d6
Use a single column in similarity.eval_results table to store results
alastair Jul 21, 2021
578bf7a
Fix similarity tests
alastair Jul 21, 2021
db7d8af
Enable similarity in config file so that tests run
alastair Jul 21, 2021
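Two of the commits above concern a remove_dups parameter that drops duplicate MBIDs from similarity results. The following is a minimal sketch of that idea, not the code from this PR: the function name, the (mbid, offset, distance) tuple layout, and the assumption that results arrive sorted by ascending distance are all hypothetical.

```python
def remove_duplicate_mbids(results):
    """Keep only the first (closest) entry for each MBID.

    `results` is assumed to be a list of (mbid, offset, distance) tuples
    already sorted by ascending distance. This layout is hypothetical.
    """
    seen = set()
    unique = []
    for mbid, offset, distance in results:
        if mbid in seen:
            # A later submission of the same recording: skip it,
            # regardless of whether its distance differs.
            continue
        seen.add(mbid)
        unique.append((mbid, offset, distance))
    return unique
```

Deduplicating on the MBID alone, rather than on the (MBID, distance) pair, matches the later commit that allows duplicates to be removed even when their scores differ.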
@@ -1,3 +1,4 @@
.git
node_modules
annoy_indices
data
@@ -38,3 +38,7 @@ node_modules

# Other stuff
*.swp

cache_namespaces
package-lock.json

@@ -115,4 +115,39 @@ ALTER TABLE feedback
FOREIGN KEY (user_id)
REFERENCES "user" (id);

ALTER TABLE similarity.similarity
ADD CONSTRAINT similarity_fk_lowlevel
FOREIGN KEY (id)
REFERENCES lowlevel (id);

ALTER TABLE similarity.similarity_stats
ADD CONSTRAINT similarity_stats_fk_metric
FOREIGN KEY (metric)
REFERENCES similarity.similarity_metrics (metric);

ALTER TABLE similarity.eval_params
ADD CONSTRAINT eval_params_fk_metric
FOREIGN KEY (metric)
REFERENCES similarity.similarity_metrics (metric);

ALTER TABLE similarity.eval_results
ADD CONSTRAINT eval_results_fk_lowlevel
FOREIGN KEY (query_id)
REFERENCES lowlevel (id);

ALTER TABLE similarity.eval_results
ADD CONSTRAINT eval_results_fk_eval_params
FOREIGN KEY (params)
REFERENCES similarity.eval_params (id);

ALTER TABLE similarity.eval_feedback
ADD CONSTRAINT eval_feedback_fk_user
FOREIGN KEY (user_id)
REFERENCES "user" (id);

ALTER TABLE similarity.eval_feedback
ADD CONSTRAINT eval_feedback_fk_query_id
FOREIGN KEY (eval_id)
REFERENCES similarity.eval_results (id);

COMMIT;
@@ -20,5 +20,10 @@ ALTER TABLE challenge ADD CONSTRAINT challenge_pkey PRIMARY KEY (id);
ALTER TABLE dataset_eval_challenge ADD CONSTRAINT dataset_eval_challenge_pkey PRIMARY KEY (dataset_eval_job, challenge_id);
ALTER TABLE api_key ADD CONSTRAINT api_key_pkey PRIMARY KEY (value);
ALTER TABLE feedback ADD CONSTRAINT feedback_pkey PRIMARY KEY (user_id, highlevel_model_id);
ALTER TABLE similarity.similarity ADD CONSTRAINT similarity_pkey PRIMARY KEY (id);
ALTER TABLE similarity.similarity_metrics ADD CONSTRAINT similarity_metrics_pkey PRIMARY KEY (metric);
ALTER TABLE similarity.similarity_stats ADD CONSTRAINT similarity_stats_pkey PRIMARY KEY (metric);
ALTER TABLE similarity.eval_params ADD CONSTRAINT eval_params_pkey PRIMARY KEY (id);
ALTER TABLE similarity.eval_results ADD CONSTRAINT eval_results_pkey PRIMARY KEY (id);

COMMIT;
@@ -0,0 +1 @@
CREATE SCHEMA similarity;
@@ -156,4 +156,59 @@ CREATE TABLE feedback (
suggestion TEXT
);

CREATE TABLE similarity.similarity (
id INTEGER, -- PK, FK to lowlevel
mfccs DOUBLE PRECISION[] NOT NULL,
mfccsw DOUBLE PRECISION[] NOT NULL,
gfccs DOUBLE PRECISION[] NOT NULL,
gfccsw DOUBLE PRECISION[] NOT NULL,
key DOUBLE PRECISION[] NOT NULL,
bpm DOUBLE PRECISION[] NOT NULL,
onsetrate DOUBLE PRECISION[] NOT NULL,
moods DOUBLE PRECISION[] NOT NULL,
instruments DOUBLE PRECISION[] NOT NULL,
dortmund DOUBLE PRECISION[] NOT NULL,
rosamerica DOUBLE PRECISION[] NOT NULL,
tzanetakis DOUBLE PRECISION[] NOT NULL
);

CREATE TABLE similarity.similarity_metrics (
metric TEXT, -- PK
is_hybrid BOOLEAN,
description TEXT,
category TEXT,
visible BOOLEAN
);

CREATE TABLE similarity.similarity_stats (
metric TEXT,
means DOUBLE PRECISION[],
stddevs DOUBLE PRECISION[]
);

CREATE TABLE similarity.eval_params (
id SERIAL, -- PK
metric TEXT, -- FK to similarity_metrics
distance_type TEXT,
n_trees INTEGER
);
ALTER TABLE similarity.eval_params ADD CONSTRAINT unique_params_constraint UNIQUE(metric, distance_type, n_trees);

CREATE TABLE similarity.eval_results (
id SERIAL, -- PK
query_id INTEGER, -- FK to lowlevel
results JSONB,
params INTEGER -- FK to eval_params
);
ALTER TABLE similarity.eval_results ADD CONSTRAINT unique_eval_query_constraint UNIQUE(query_id, params);

CREATE TABLE similarity.eval_feedback (
user_id INTEGER, -- FK to user
eval_id INTEGER, -- FK to eval_results
result_id INTEGER,
rating similarity.eval_type,
suggestion TEXT
);
ALTER TABLE similarity.eval_feedback ADD CONSTRAINT unique_eval_user_constraint UNIQUE(user_id, eval_id, result_id);

COMMIT;
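The similarity.similarity_stats table stores per-metric means and stddevs arrays. One plausible use, assumed here rather than shown in this diff, is to standardise each feature vector element-wise before it is added to an index. A minimal sketch with a hypothetical normalize helper:

```python
def normalize(vector, means, stddevs):
    """Standardise a feature vector element-wise: (x - mean) / stddev.

    `means` and `stddevs` stand in for one row of similarity_stats.
    Dimensions with zero deviation map to 0.0 to avoid dividing by zero
    (an assumption of this sketch, not behaviour taken from the PR).
    """
    if not (len(vector) == len(means) == len(stddevs)):
        raise ValueError("vector, means and stddevs must have equal length")
    return [
        (x - m) / s if s else 0.0
        for x, m, s in zip(vector, means, stddevs)
    ]
```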
@@ -3,3 +3,4 @@ CREATE TYPE model_status AS ENUM ('hidden', 'evaluation', 'show');
CREATE TYPE version_type AS ENUM ('lowlevel', 'highlevel');
CREATE TYPE eval_location_type AS ENUM ('local', 'remote');
CREATE TYPE gid_type AS ENUM ('mbid', 'msid');
CREATE TYPE similarity.eval_type AS ENUM ('less similar', 'accurate', 'more similar');
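Application code that accepts feedback would need to reject ratings outside the three values of similarity.eval_type before they reach the database. A small sketch of such a check follows; the EvalRating class and parse_rating function are hypothetical names, not part of this PR.

```python
from enum import Enum


class EvalRating(Enum):
    """Mirrors the values of the similarity.eval_type enum."""
    LESS_SIMILAR = "less similar"
    ACCURATE = "accurate"
    MORE_SIMILAR = "more similar"


def parse_rating(value):
    """Return the matching EvalRating, or raise ValueError for unknown input."""
    try:
        return EvalRating(value)
    except ValueError:
        allowed = ", ".join(repr(r.value) for r in EvalRating)
        raise ValueError(
            "invalid rating %r, expected one of: %s" % (value, allowed)
        ) from None
```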
@@ -0,0 +1,5 @@
BEGIN;

DROP SCHEMA IF EXISTS similarity CASCADE;

COMMIT;
@@ -0,0 +1,24 @@
BEGIN;

DROP TABLE IF EXISTS highlevel_model CASCADE;
DROP TABLE IF EXISTS highlevel_meta CASCADE;
DROP TABLE IF EXISTS highlevel CASCADE;
DROP TABLE IF EXISTS model CASCADE;
DROP TABLE IF EXISTS lowlevel_json CASCADE;
DROP TABLE IF EXISTS lowlevel CASCADE;
DROP TABLE IF EXISTS version CASCADE;
DROP TABLE IF EXISTS statistics CASCADE;
DROP TABLE IF EXISTS incremental_dumps CASCADE;
DROP TABLE IF EXISTS dataset_snapshot CASCADE;
DROP TABLE IF EXISTS dataset_eval_jobs CASCADE;
DROP TABLE IF EXISTS dataset_class_member CASCADE;
DROP TABLE IF EXISTS dataset_class CASCADE;
DROP TABLE IF EXISTS dataset CASCADE;
DROP TABLE IF EXISTS dataset_eval_sets CASCADE;
DROP TABLE IF EXISTS "user" CASCADE;
DROP TABLE IF EXISTS api_key CASCADE;
DROP TABLE IF EXISTS challenge CASCADE;
DROP TABLE IF EXISTS dataset_eval_challenge CASCADE;
DROP TABLE IF EXISTS feedback CASCADE;

COMMIT;
@@ -0,0 +1,10 @@
BEGIN;

DROP TYPE IF EXISTS eval_job_status CASCADE;
DROP TYPE IF EXISTS model_status CASCADE;
DROP TYPE IF EXISTS version_type CASCADE;
DROP TYPE IF EXISTS eval_location_type CASCADE;
DROP TYPE IF EXISTS gid_type CASCADE;
DROP TYPE IF EXISTS similarity.eval_type CASCADE;

COMMIT;
@@ -0,0 +1,30 @@
BEGIN;

-- Add base metrics when db is initialized, before similarity stats are computed
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('mfccs', 'FALSE', 'MFCCs', 'timbre');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('mfccsw', 'FALSE', 'MFCCs (weighted)', 'timbre');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('gfccs', 'FALSE', 'GFCCs', 'timbre');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('gfccsw', 'FALSE', 'GFCCs (weighted)', 'timbre');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('key', 'FALSE', 'Key/Scale', 'rhythm');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('bpm', 'FALSE', 'BPM', 'rhythm');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('onsetrate', 'FALSE', 'OnsetRate', 'timbre');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('moods', 'FALSE', 'Moods', 'high-level');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('instruments', 'FALSE', 'Instruments', 'high-level');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('dortmund','FALSE', 'Genre (dortmund model)', 'high-level');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('rosamerica', 'FALSE', 'Genre (rosamerica model)', 'high-level');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('tzanetakis', 'FALSE', 'Genre (tzanetakis model)', 'high-level');

INSERT INTO similarity.eval_params (metric, distance_type, n_trees) VALUES ('mfccs', 'angular', 10);
INSERT INTO similarity.eval_params (metric, distance_type, n_trees) VALUES ('mfccsw', 'angular', 10);
INSERT INTO similarity.eval_params (metric, distance_type, n_trees) VALUES ('gfccs', 'angular', 10);
INSERT INTO similarity.eval_params (metric, distance_type, n_trees) VALUES ('gfccsw', 'angular', 10);
INSERT INTO similarity.eval_params (metric, distance_type, n_trees) VALUES ('key', 'angular', 10);
INSERT INTO similarity.eval_params (metric, distance_type, n_trees) VALUES ('bpm', 'angular', 10);
INSERT INTO similarity.eval_params (metric, distance_type, n_trees) VALUES ('onsetrate', 'angular', 10);
INSERT INTO similarity.eval_params (metric, distance_type, n_trees) VALUES ('moods', 'angular', 10);
INSERT INTO similarity.eval_params (metric, distance_type, n_trees) VALUES ('instruments', 'angular', 10);
INSERT INTO similarity.eval_params (metric, distance_type, n_trees) VALUES ('dortmund', 'angular', 10);
INSERT INTO similarity.eval_params (metric, distance_type, n_trees) VALUES ('rosamerica', 'angular', 10);
INSERT INTO similarity.eval_params (metric, distance_type, n_trees) VALUES ('tzanetakis', 'angular', 10);

COMMIT;
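Every eval_params row above uses the 'angular' distance type. Annoy documents this metric as the Euclidean distance between normalised vectors, sqrt(2 * (1 - cos(u, v))). The sketch below computes that distance in pure Python and runs an exact brute-force search. It illustrates the metric only: it does not use Annoy, whose search is approximate and tree-based, and the function names are hypothetical.

```python
import math


def angular_distance(u, v):
    """Angular distance as Annoy defines it: sqrt(2 * (1 - cos(u, v)))."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("angular distance is undefined for a zero vector")
    # Clamp to [-1, 1] to guard against floating-point drift.
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.sqrt(2.0 * (1.0 - cos))


def nearest_neighbours(query, vectors, n):
    """Exact search over `vectors`, a dict mapping item id to vector.

    Returns up to n (item_id, distance) pairs, closest first.
    """
    scored = sorted(
        (angular_distance(query, vec), key) for key, vec in vectors.items()
    )
    return [(key, dist) for dist, key in scored[:n]]
```

The distance ranges from 0 for vectors pointing the same way to 2 for opposite vectors, so it depends only on direction and not on magnitude.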
@@ -0,0 +1,66 @@
BEGIN;

CREATE SCHEMA similarity;

CREATE TABLE similarity.similarity (
id INTEGER, -- PK, FK to lowlevel
mfccs DOUBLE PRECISION[] NOT NULL,
mfccsw DOUBLE PRECISION[] NOT NULL,
gfccs DOUBLE PRECISION[] NOT NULL,
gfccsw DOUBLE PRECISION[] NOT NULL,
key DOUBLE PRECISION[] NOT NULL,
bpm DOUBLE PRECISION[] NOT NULL,
onsetrate DOUBLE PRECISION[] NOT NULL,
moods DOUBLE PRECISION[] NOT NULL,
instruments DOUBLE PRECISION[] NOT NULL,
dortmund DOUBLE PRECISION[] NOT NULL,
rosamerica DOUBLE PRECISION[] NOT NULL,
tzanetakis DOUBLE PRECISION[] NOT NULL
);

ALTER TABLE similarity.similarity
ADD CONSTRAINT similarity_fk_lowlevel
FOREIGN KEY (id)
REFERENCES lowlevel (id);

CREATE TABLE similarity.similarity_metrics (
metric TEXT, -- PK
is_hybrid BOOLEAN,
description TEXT,
category TEXT
);

ALTER TABLE similarity.similarity_metrics ADD CONSTRAINT similarity_metrics_pkey PRIMARY KEY (metric);
-- Add base metrics when db is initialized, before similarity stats are computed
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('mfccs', 'FALSE', 'MFCCs', 'timbre');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('mfccsw', 'FALSE', 'MFCCs (weighted)', 'timbre');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('gfccs', 'FALSE', 'GFCCs', 'timbre');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('gfccsw', 'FALSE', 'GFCCs (weighted)', 'timbre');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('key', 'FALSE', 'Key/Scale', 'rhythm');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('bpm', 'FALSE', 'BPM', 'rhythm');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('onsetrate', 'FALSE', 'OnsetRate', 'timbre');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('moods', 'FALSE', 'Moods', 'high-level');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('instruments', 'FALSE', 'Instruments', 'high-level');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('dortmund','FALSE', 'Genre (dortmund model)', 'high-level');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('rosamerica', 'FALSE', 'Genre (rosamerica model)', 'high-level');
INSERT INTO similarity.similarity_metrics (metric, is_hybrid, description, category) VALUES ('tzanetakis', 'FALSE', 'Genre (tzanetakis model)', 'high-level');


CREATE TABLE similarity.similarity_stats (
metric TEXT, -- FK to metric
means DOUBLE PRECISION[],
stddevs DOUBLE PRECISION[]
);

ALTER TABLE similarity.similarity_stats ADD CONSTRAINT similarity_stats_pkey PRIMARY KEY (metric);

ALTER TABLE similarity.similarity_stats
ADD CONSTRAINT similarity_stats_fk_metric
FOREIGN KEY (metric)
REFERENCES similarity.similarity_metrics (metric);

-- Primary keys for similarity_metrics and similarity_stats are already added above;
-- adding them a second time would abort the transaction.
ALTER TABLE similarity.similarity ADD CONSTRAINT similarity_pkey PRIMARY KEY (id);

COMMIT;
@@ -0,0 +1,65 @@
BEGIN;

CREATE TYPE similarity.eval_type AS ENUM ('less similar', 'accurate', 'more similar');

CREATE TABLE similarity.eval_params (
id SERIAL, -- PK
metric TEXT, -- FK to similarity_metrics
distance_type TEXT,
n_trees INTEGER
);

ALTER TABLE similarity.eval_params ADD CONSTRAINT unique_params_constraint UNIQUE(metric, distance_type, n_trees);
ALTER TABLE similarity.eval_params ADD CONSTRAINT eval_params_pkey PRIMARY KEY (id);

ALTER TABLE similarity.eval_params
ADD CONSTRAINT eval_params_fk_metric
FOREIGN KEY (metric)
REFERENCES similarity.similarity_metrics (metric);


CREATE TABLE similarity.eval_results (
id SERIAL, -- PK
query_id INTEGER, -- FK to lowlevel
results JSONB,
params INTEGER -- FK to eval_params
);

ALTER TABLE similarity.eval_results ADD CONSTRAINT unique_eval_query_constraint UNIQUE(query_id, params);
ALTER TABLE similarity.eval_results ADD CONSTRAINT eval_results_pkey PRIMARY KEY (id);

ALTER TABLE similarity.eval_results
ADD CONSTRAINT eval_results_fk_lowlevel
FOREIGN KEY (query_id)
REFERENCES lowlevel (id);

ALTER TABLE similarity.eval_results
ADD CONSTRAINT eval_results_fk_eval_params
FOREIGN KEY (params)
REFERENCES similarity.eval_params (id);


CREATE TABLE similarity.eval_feedback (
user_id INTEGER, -- FK to user
eval_id INTEGER, -- FK to eval_results
result_id INTEGER,
rating similarity.eval_type,
suggestion TEXT
);

ALTER TABLE similarity.eval_feedback ADD CONSTRAINT unique_eval_user_constraint UNIQUE(user_id, eval_id, result_id);

ALTER TABLE similarity.eval_feedback
ADD CONSTRAINT eval_feedback_fk_user
FOREIGN KEY (user_id)
REFERENCES "user" (id);

ALTER TABLE similarity.eval_feedback
ADD CONSTRAINT eval_feedback_fk_query_id
FOREIGN KEY (eval_id)
REFERENCES similarity.eval_results (id);

-- Primary keys for eval_params and eval_results are already added above,
-- directly after each table is created.

COMMIT;
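The UNIQUE(query_id, params) constraint on similarity.eval_results means a second submission for the same query and parameter set conflicts with the first. The sketch below shows one way to handle that, using an in-memory SQLite database as a stand-in for PostgreSQL. The function names are hypothetical. In PostgreSQL the equivalent of INSERT OR IGNORE is INSERT ... ON CONFLICT DO NOTHING; this diff does not show whether the project handles the conflict that way.

```python
import sqlite3


def create_schema(conn):
    """Create a simplified eval_results table (SQLite stand-in)."""
    conn.execute(
        """
        CREATE TABLE eval_results (
            id INTEGER PRIMARY KEY,
            query_id INTEGER,
            results TEXT,      -- JSONB in the PostgreSQL schema
            params INTEGER,
            UNIQUE (query_id, params)
        )
        """
    )


def submit_eval_result(conn, query_id, results_json, params):
    """Insert a result row; return False if the pair already exists."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO eval_results (query_id, results, params) "
        "VALUES (?, ?, ?)",
        (query_id, results_json, params),
    )
    # rowcount is 0 when the unique constraint caused the row to be skipped.
    return cur.rowcount == 1
```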
@@ -57,6 +57,7 @@ RATELIMIT_WINDOW = 10

DATASET_DIR = "/data/datasets"
FILE_STORAGE_DIR = "/data/files"
SIMILARITY_INDEX_DIR = "/data/annoy_indices"

#Feature Flags
# Choose a server to perform the evaluation on
@@ -66,6 +67,11 @@ FEATURE_EVAL_FILTERING = True
# Choose settings used for model training
FEATURE_EVAL_MODEL_SELECTION = False

# Enable similarity API endpoints and webpages
FEATURE_SIMILARITY = True
# Allow submission of feedback on the quality of similarity results
FEATURE_SIMILARITY_FEEDBACK = False

DEBUG_TB_INTERCEPT_REDIRECTS = False

# maximum number of recordings in the dataset for which the download dataset button is shown
@@ -48,6 +48,7 @@ LOG_SENTRY = {

DATASET_DIR = '''{{template "KEY" "dataset_dir"}}'''
FILE_STORAGE_DIR = '''{{template "KEY" "file_storage_dir"}}'''
SIMILARITY_INDEX_DIR = '''{{template "KEY" "similarity_index_dir"}}'''

#Feature Flags
# Choose a server to perform the evaluation on
@@ -56,6 +57,10 @@ FEATURE_EVAL_LOCATION = {{template "KEY" "feature/eval_location"}}
FEATURE_EVAL_FILTERING = {{template "KEY" "feature/eval_filtering"}}
# Choose settings used for model training
FEATURE_EVAL_MODEL_SELECTION = {{template "KEY" "feature/eval_model_selection"}}
# Enable similarity API endpoints and webpages
FEATURE_SIMILARITY = {{template "KEY" "feature/similarity"}}
# Allow submission of feedback on the quality of similarity results
FEATURE_SIMILARITY_FEEDBACK = {{template "KEY" "feature/similarity_feedback"}}

RATELIMIT_PER_IP = {{template "KEY" "ratelimit_per_ip"}} # number of requests per ip
RATELIMIT_WINDOW = {{template "KEY" "ratelimit_window"}} # window size in seconds
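The FEATURE_SIMILARITY and FEATURE_SIMILARITY_FEEDBACK flags gate the similarity endpoints and feedback submission. The sketch below shows how such gating could work with a plain dictionary and a decorator. It is framework-independent and hypothetical: require_feature, FeatureDisabled, and both example functions are illustration names, and the real project may check the flags differently.

```python
import functools


class FeatureDisabled(Exception):
    """Raised when a call needs a feature flag that is switched off."""


def require_feature(config, flag):
    """Decorator factory: allow the call only if config[flag] is truthy."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not config.get(flag, False):
                raise FeatureDisabled(flag)
            return func(*args, **kwargs)
        return wrapper
    return decorator


# Values taken from the default config shown in the diff above.
config = {"FEATURE_SIMILARITY": True, "FEATURE_SIMILARITY_FEEDBACK": False}


@require_feature(config, "FEATURE_SIMILARITY")
def similar_recordings(mbid):
    # Placeholder body; a real endpoint would query the index.
    return {"mbid": mbid, "similar": []}


@require_feature(config, "FEATURE_SIMILARITY_FEEDBACK")
def submit_feedback(mbid, rating):
    # Placeholder body; a real endpoint would store the rating.
    return {"mbid": mbid, "rating": rating}
```

Because the flag is read on every call rather than at import time, changing the configuration takes effect without redefining the functions.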