## Demo notebook for FoNN similarity search tools

In [1]:
# imports

import os

from FoNN.similarity_search import PatternSimilarity

Initialize a PatternSimilarity object to run similarity searches.

In [2]:
# set input corpus path
mtc_ann_corpus_path = '../mtc_ann_corpus'

# set up PatternSimilarity class instance:

# Args:
# corpus_path -- root directory of the input corpus.
# level -- granularity of the input data: 'duration_weighted', 'note', or 'accent'.
# n -- length of the representative search term pattern(s) extracted from the query tune by the 'motif' similarity method. Must be an integer between 3 and 12.
# query_tune -- name of the query tune for the similarity search. Must be chosen from the filenames in the original corpus, in this case the '../mtc_ann_corpus/krn' dir.
# feature -- the musical feature for which pattern data has been extracted. For a list of the 16 features extracted by FoNN's ingest pipeline, see NgramPatternCorpus.FEATURES or ./README.md.

similarity_search = PatternSimilarity(
    corpus_path=mtc_ann_corpus_path,
    level='duration_weighted',
    n=5,
    query_tune='NLB015569_01',
    feature='diatonic_scale_degree'
)

Run similarity search using 'TFIDF' method.

In [3]:
similarity_search.run_similarity_search(mode='tfidf')

Query tune: NLB015569_01.
Similarity search mode: TFIDF
              Cosine similarity
NLB015569_01           1.000000
NLB070089_01           0.107788
NLB074769_02           0.092102
NLB072311_01           0.085999
NLB072883_01           0.083984
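
For readers unfamiliar with the scoring above, the following is a minimal, self-contained sketch of how TF-IDF weighting and cosine similarity can rank tunes by shared n-gram patterns. It is not FoNN's implementation: the toy tunes, the helper names (`ngrams`, `tfidf_vectors`, `cosine`) and the smoothed idf formula are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus: each tune is a sequence of scale degrees.
tunes = {
    "tune_a": [1, 2, 3, 1, 2, 3, 5],
    "tune_b": [1, 2, 3, 5, 4, 3, 2],
    "tune_c": [5, 4, 3, 2, 1, 7, 1],
}


def ngrams(seq, n):
    """Return all contiguous length-n patterns in seq as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def tfidf_vectors(corpus, n):
    """Map each tune name to a sparse {pattern: tf-idf weight} dict."""
    counts = {name: Counter(ngrams(seq, n)) for name, seq in corpus.items()}
    n_docs = len(counts)
    # Document frequency: in how many tunes does each pattern occur?
    doc_freq = Counter(p for c in counts.values() for p in c)
    vectors = {}
    for name, c in counts.items():
        total = sum(c.values())
        vectors[name] = {
            # Assumed smoothed idf: patterns found in every tune keep a small weight.
            p: (k / total) * (math.log((1 + n_docs) / (1 + doc_freq[p])) + 1)
            for p, k in c.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(p, 0.0) for p, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = tfidf_vectors(tunes, n=3)
query = "tune_a"
ranking = sorted(
    ((name, cosine(vectors[query], vec)) for name, vec in vectors.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}  {score:.6f}")
```

In this sketch the query always scores 1 against itself, and tunes that share no length-n pattern with the query score 0.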


Run similarity search using 'motif' method.

In [4]:
similarity_search.run_similarity_search(mode='motif')

Query tune: NLB015569_01.
Search mode: motif (normalized)
          title  normalized_count
0  NLB071957_03             0.667
1  NLB015569_01             0.532
2  NLB075532_01             0.378
3  NLB070089_01             0.279
4  NLB072311_01             0.221
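
As a rough illustration of a motif-style search, the sketch below picks the most frequent length-n pattern(s) in a query tune and counts how often they occur in every tune of a toy corpus. How FoNN actually selects its representative patterns and normalizes the counts is not shown in this notebook, so both choices here (most frequent n-gram as the motif, division by the number of patterns in each tune) are assumptions, and `motif_scores` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical toy corpus of scale-degree sequences.
tunes = {
    "query": [1, 2, 3, 4, 1, 2, 3, 5, 1, 2, 3],
    "short": [1, 2, 3, 1, 2, 3],
    "other": [5, 4, 3, 2, 1, 7, 6],
}


def ngrams(seq, n):
    """Return all contiguous length-n patterns in seq as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def motif_scores(corpus, query_name, n):
    """Rank tunes by how often the query's top pattern(s) occur in them."""
    query_counts = Counter(ngrams(corpus[query_name], n))
    # Assumption: the representative motifs are the query's most frequent n-grams.
    top = max(query_counts.values())
    motifs = {p for p, k in query_counts.items() if k == top}
    scores = {}
    for name, seq in corpus.items():
        patterns = ngrams(seq, n)
        hits = sum(1 for p in patterns if p in motifs)
        # Assumption: normalize by the number of patterns in the tune.
        scores[name] = round(hits / len(patterns), 3) if patterns else 0.0
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


results = motif_scores(tunes, "query", n=3)
for title, normalized_count in results:
    print(f"{title:>6}  {normalized_count:.3f}")
```

With length normalization of this kind, a short tune dominated by the motif can score higher than the query tune itself, as `short` does in this toy corpus.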


Run similarity search using the 'incipit and cadence' method, first with the default Levenshtein distance metric, then with the two alternative edit distance metrics.

In [5]:
similarity_search.run_similarity_search(mode='incipit_and_cadence', edit_dist_metric='levenshtein')

# alternate edit distance metrics can be selected as follows:
# Hamming distance
similarity_search.run_similarity_search(mode='incipit_and_cadence', edit_dist_metric='hamming')
# custom-weighted Hamming distance
similarity_search.run_similarity_search(mode='incipit_and_cadence', edit_dist_metric='custom_weighted_hamming')

Query tune: NLB015569_01.
Similarity search mode: incipit and cadence (Levenshtein distance)
              Levenshtein distance
NLB015569_01                     0
NLB138219_01                     8
NLB072886_01                     9
NLB071666_01                    10
NLB072886_02                    10
Query tune: NLB015569_01.
Similarity search mode: incipit and cadence (Hamming distance)
              NLB015569_01
NLB015569_01      0.000000
NLB070089_01      0.302979
NLB072311_01      0.363525
NLB072946_01      0.424316
NLB075635_01      0.515137
Query tune: NLB015569_01.
Similarity search mode: incipit and cadence (Custom Weighted Hamming distance)
              Hamming distance (weighted)
NLB015569_01                          0.0
NLB138219_01                          9.0
NLB072886_01                         12.0
NLB071666_01                         12.0
NLB071441_01                         13.0
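
The two basic edit distances named above can be sketched in a few lines of plain Python. This shows the textbook definitions only: it does not reproduce FoNN's implementation, its custom weighting, or any normalization it may apply, and the example sequences are made up.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x
                curr[j - 1] + 1,     # insert y
                prev[j - 1] + cost,  # substitute x with y (free if equal)
            ))
        prev = curr
    return prev[-1]


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical scale-degree sequences: the second is the first shifted by one note.
original = [1, 2, 3, 4, 5]
shifted = [2, 3, 4, 5, 1]

print("Levenshtein:", levenshtein(original, shifted))
print("Hamming:    ", hamming(original, shifted))
```

The shifted pair shows the practical difference between the two metrics: Levenshtein distance absorbs the shift with one deletion and one insertion, while Hamming distance compares position by position and counts every note as a mismatch.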
