This repository has been archived by the owner on Nov 23, 2022. It is now read-only.

cenkbircanoglu/similarity-py

Similarity Py

Installation

Install the package with pip:

    $ pip install similarityPy

Dependencies

enum

Distance Algorithms

 Numerical Data

  Norm

      Data: [{x, y, z}]
      Formula: sqrt(x^2 + y^2 + z^2)

  Manhattan Distance

      Data: [{a, b, c}, {x, y, z}]
      Formula: |a-x| + |b-y| + |c-z|

  Euclidean Distance

      Data: [{a, b, c}, {x, y, z}]
      Formula: sqrt((a-x)^2 + (b-y)^2 + (c-z)^2)

  Squared Euclidean Distance

      Data: [{a, b, c}, {x, y, z}]
      Formula: (a-x)^2 + (b-y)^2 + (c-z)^2
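
The norm and the Manhattan, Euclidean, and squared Euclidean distances can be sketched in plain Python as follows. This is an illustrative sketch of the standard definitions, not the similarityPy API; the function names are hypothetical.

```python
import math


def norm(u):
    """Euclidean (L2) norm of a single vector."""
    return math.sqrt(sum(x * x for x in u))


def manhattan(u, v):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


def squared_euclidean(u, v):
    """Sum of squared coordinate differences."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def euclidean(u, v):
    """Square root of the squared Euclidean distance."""
    return math.sqrt(squared_euclidean(u, v))


u, v = (1, 2, 3), (4, 6, 3)
print(norm((3, 4, 0)), manhattan(u, v), euclidean(u, v), squared_euclidean(u, v))
# prints: 5.0 7 5.0 25
```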

  Normalized Squared Euclidean Distance

      Data: [{a, b}, {x, y}]
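      Formula: with u = {a, b}, v = {x, y}: 1/2 * Norm[(u-Mean[u])-(v-Mean[v])]^2 / (Norm[u-Mean[u]]^2 + Norm[v-Mean[v]]^2)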
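
One common definition of the normalized squared Euclidean distance centers each vector on its own mean, then divides half the squared distance between the centered vectors by the sum of their squared norms. The sketch below assumes that definition and Python 3 division; the function name is hypothetical, and the package may implement the measure differently.

```python
def normalized_squared_euclidean(u, v):
    """Half the squared distance between the mean-centered vectors,
    divided by the sum of their squared norms (assumed definition)."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    uc = [a - mean_u for a in u]
    vc = [b - mean_v for b in v]
    numerator = sum((a - b) ** 2 for a, b in zip(uc, vc))
    denominator = sum(a * a for a in uc) + sum(b * b for b in vc)
    # Raises ZeroDivisionError when both vectors are constant.
    return 0.5 * numerator / denominator


print(normalized_squared_euclidean((1, 2, 3), (3, 2, 1)))
```

Under this definition the result lies between 0 (vectors that differ only by a constant offset) and 1 (centered vectors pointing in opposite directions).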

  Chessboard Distance

      Data: [{a, b, c}, {x, y, z}]
      Formula: max(|a-x|, |b-y|, |c-z|)

  Bray Curtis Distance

      Data: [{a, b, c}, {x, y, z}]
      Formula: (|a-x| + |b-y| + |c-z|) / (|a+x| + |b+y| + |c+z|)

  Canberra Distance

      Data: [{a, b, c}, {x, y, z}]
      Formula: |a-x|/(|a|+|x|) + |b-y|/(|b|+|y|) + |c-z|/(|c|+|z|)
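
The chessboard, Bray Curtis, and Canberra distances all work on coordinate-wise absolute differences. A minimal sketch of the standard definitions follows; the function names are hypothetical, and skipping 0/0 terms in the Canberra sum is an assumption borrowed from common practice, not something this README specifies.

```python
def chessboard(u, v):
    """Largest absolute coordinate difference (Chebyshev distance)."""
    return max(abs(a - b) for a, b in zip(u, v))


def bray_curtis(u, v):
    """Sum of |u_i - v_i| divided by sum of |u_i + v_i|."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(abs(a + b) for a, b in zip(u, v))
    return numerator / denominator


def canberra(u, v):
    """Sum of |u_i - v_i| / (|u_i| + |v_i|).

    Assumption: terms where both coordinates are zero contribute 0.
    """
    total = 0.0
    for a, b in zip(u, v):
        scale = abs(a) + abs(b)
        if scale != 0:
            total += abs(a - b) / scale
    return total


u, v = (1, 2, 3), (4, 6, 3)
print(chessboard(u, v), bray_curtis(u, v), canberra(u, v))
```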

  Cosine Distance

      Data: [{a, b, c}, {x, y, z}]
      Formula: 1 - (a*x + b*y + c*z) / (sqrt(a^2 + b^2 + c^2) * sqrt(x^2 + y^2 + z^2))

  Correlation Distance

      Data: [{a, b, c}, {x, y, z}]
      Formula: with u = {a, b, c}, v = {x, y, z}: 1 - (u-Mean[u]).(v-Mean[v]) / (Norm[u-Mean[u]] * Norm[v-Mean[v]])
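
Cosine distance is one minus the cosine of the angle between two vectors, and correlation distance is the same quantity computed after subtracting each vector's mean. The sketch below follows those standard definitions under Python 3; the function names are hypothetical and are not taken from the package.

```python
import math


def cosine_distance(u, v):
    """1 - (u . v) / (|u| * |v|); undefined for zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1 - dot / (norm_u * norm_v)


def correlation_distance(u, v):
    """Cosine distance between the mean-centered vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine_distance([a - mean_u for a in u],
                           [b - mean_v for b in v])


print(cosine_distance((1, 0), (0, 1)))
print(correlation_distance((1, 2, 3), (3, 2, 1)))
```

Both functions return values near 0 for vectors pointing the same way and near 2 for vectors pointing in opposite directions, up to floating-point rounding.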

 Boolean Data

  Jaccard Dissimilarity

      Data: [{True,False,True}, {True,True,False}]
      Explanation: [u,v] is equivalent to (n10+n01)/(n11+n10+n01), where nij is the number of corresponding pairs of elements in u and v respectively equal to i and j.

  Matching Dissimilarity

      Data: [{True,False,True}, {True,True,False}]
      Explanation: [u,v] is equivalent to (n10+n01)/Length[u], where nij is the number of corresponding pairs of elements in u and v respectively equal to i and j.
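
All boolean measures in this section are built from the four pair counts n11, n10, n01, and n00. The sketch below counts them and computes the matching dissimilarity as stated above, together with the Jaccard dissimilarity in its usual form (n10+n01)/(n11+n10+n01). The helper names are hypothetical and Python 3 division is assumed.

```python
def pair_counts(u, v):
    """Return (n11, n10, n01, n00), where nij counts positions with
    u equal to i and v equal to j."""
    n11 = n10 = n01 = n00 = 0
    for a, b in zip(u, v):
        if a and b:
            n11 += 1
        elif a:
            n10 += 1
        elif b:
            n01 += 1
        else:
            n00 += 1
    return n11, n10, n01, n00


def matching_dissimilarity(u, v):
    """Fraction of positions where u and v disagree."""
    n11, n10, n01, n00 = pair_counts(u, v)
    return (n10 + n01) / len(u)


def jaccard_dissimilarity(u, v):
    """Disagreements divided by positions where at least one is True."""
    n11, n10, n01, n00 = pair_counts(u, v)
    return (n10 + n01) / (n11 + n10 + n01)


u = [True, False, True]
v = [True, True, False]
print(pair_counts(u, v))  # prints: (1, 1, 1, 0)
print(matching_dissimilarity(u, v), jaccard_dissimilarity(u, v))
```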

  Dice Dissimilarity

      Data: [{True,False,True}, {True,True,False}]
      Explanation: [u,v] is equivalent to (n10+n01)/(2*n11+n10+n01), where nij is the number of corresponding pairs of elements in u and v respectively equal to i and j.

  Rogers Tanimoto Dissimilarity

      Data: [{True,False,True}, {True,True,False}]
      Explanation: [u,v] is equivalent to 2*(n10+n01)/(n11+2*(n10+n01)+n00), where nij is the number of corresponding pairs of elements in u and v respectively equal to i and j.

  Russell Rao Dissimilarity

      Data: [{True,False,True}, {True,True,False}]
      Explanation: [u,v] is equivalent to (n10+n01+n00)/Length[u], where nij is the number of corresponding pairs of elements in u and v respectively equal to i and j.

  Sokal Sneath Dissimilarity

      Data: [{True,False,True}, {True,True,False}]
      Explanation: [u,v] is equivalent to 2*(n10+n01)/(n11+2*(n10+n01)), where nij is the number of corresponding pairs of elements in u and v respectively equal to i and j.

  Yule Dissimilarity

      Data: [{True,False,True}, {True,True,False}]
      Explanation: [u,v] is equivalent to 2*n10*n01/(n11*n00+n10*n01), where nij is the number of corresponding pairs of elements in u and v respectively equal to i and j.
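
The Dice, Rogers Tanimoto, Russell Rao, Sokal Sneath, and Yule dissimilarities can all be computed from the same pair counts. The sketch below uses their commonly cited formulas; it is not the package's code, the function names are hypothetical, and zero denominators are left unhandled.

```python
def pair_counts(u, v):
    """Return (n11, n10, n01, n00) for two boolean sequences."""
    n11 = sum(1 for a, b in zip(u, v) if a and b)
    n10 = sum(1 for a, b in zip(u, v) if a and not b)
    n01 = sum(1 for a, b in zip(u, v) if not a and b)
    n00 = sum(1 for a, b in zip(u, v) if not a and not b)
    return n11, n10, n01, n00


def dice(u, v):
    n11, n10, n01, n00 = pair_counts(u, v)
    return (n10 + n01) / (2 * n11 + n10 + n01)


def rogers_tanimoto(u, v):
    n11, n10, n01, n00 = pair_counts(u, v)
    return 2 * (n10 + n01) / (n11 + 2 * (n10 + n01) + n00)


def russell_rao(u, v):
    n11, n10, n01, n00 = pair_counts(u, v)
    return (n10 + n01 + n00) / len(u)


def sokal_sneath(u, v):
    n11, n10, n01, n00 = pair_counts(u, v)
    return 2 * (n10 + n01) / (n11 + 2 * (n10 + n01))


def yule(u, v):
    n11, n10, n01, n00 = pair_counts(u, v)
    return 2 * n10 * n01 / (n11 * n00 + n10 * n01)


u = [True, False, True]
v = [True, True, False]
for measure in (dice, rogers_tanimoto, russell_rao, sokal_sneath, yule):
    print(measure.__name__, measure(u, v))
```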

 String Data

  Hamming Distance

      Data: [{a, b, c}, {x, y, z}]
      Explanation: [u,v] gives the number of elements whose values disagree in u and v.
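
Counting the positions where two equal-length sequences disagree takes only a few lines. This is a minimal sketch with a hypothetical function name; raising an error on unequal lengths is an assumption, since the README does not say how the package treats that case.

```python
def hamming_distance(u, v):
    """Number of positions at which u and v differ."""
    if len(u) != len(v):
        # Assumption: unequal lengths are treated as an error.
        raise ValueError("sequences must have the same length")
    return sum(1 for a, b in zip(u, v) if a != b)


print(hamming_distance("karolin", "kathrin"))  # prints: 3
print(hamming_distance(["a", "b", "c"], ["x", "y", "z"]))  # prints: 3
```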

  Edit Distance

      Data: [{a, b, c}, {x, y, z}]
      Explanation: [u,v] gives the number of one-element deletions, insertions, and substitutions required to transform u to v.
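
The minimum number of deletions, insertions, and substitutions is usually computed with dynamic programming (the Levenshtein algorithm). The sketch below keeps only two rows of the table; the function name is hypothetical and the package's own implementation may differ.

```python
def edit_distance(u, v):
    """Levenshtein distance between two sequences."""
    # prev[j] is the distance between the processed prefix of u
    # and the first j elements of v.
    prev = list(range(len(v) + 1))
    for i, a in enumerate(u, start=1):
        curr = [i]
        for j, b in enumerate(v, start=1):
            cost = 0 if a == b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))  # prints: 3
```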

  Damerau Levenshtein Distance

      Data: [{a, b, c}, {x, y, z}]
      Explanation: [u,v] gives the number of one-element deletions, insertions, substitutions, and transpositions required to transform u to v.
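
Adding transpositions of adjacent elements to the edit-distance recurrence gives a Damerau Levenshtein style distance. The sketch below implements the restricted variant, often called optimal string alignment, in which no substring is edited more than once. Whether the package uses this variant or the unrestricted one is not stated here, so treat the choice and the function name as assumptions.

```python
def osa_distance(u, v):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(u) + 1, len(v) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of u[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of v[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if u[i - 1] == v[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            # Transposition of two adjacent elements.
            if (i > 1 and j > 1
                    and u[i - 1] == v[j - 2]
                    and u[i - 2] == v[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("ab", "ba"))  # prints: 1
```

The two variants can disagree: for "ca" and "abc" this restricted version returns 3, whereas the unrestricted Damerau Levenshtein distance is 2.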

  Needleman Wunsch Similarity (Not Implemented Yet)

      Data: [{a, b, c}, {x, y, z}]
      Explanation: [u,v] finds an optimal global alignment between the elements of u and v, and returns the number of one-element matches.

  Smith Waterman Similarity (Not Implemented Yet)

      Data: [{a, b, c}, {x, y, z}]
      Explanation: [u,v] finds an optimal local alignment between the elements of u and v, and returns the number of one-element matches.

Testing

Run all tests:

    $ python -m unittest discover -s tests -p '*_test.py'

Run the tests with nose and generate an HTML coverage report:

    $ nosetests --with-cov --cov-report html --cov similarityPy tests/