Commit
Merge pull request #3377 from sanuj/euclidean_dist
add cookbook for euclidean distance
Showing 2 changed files with 67 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
================== | ||
Euclidean Distance | ||
================== | ||
|
||
The Euclidean distance for real-valued features is the square root of the sum of squared differences between the corresponding feature dimensions of two data points. | ||
|
||
.. math:: | ||
    d({\bf x},{\bf x'})= \sqrt{\sum_{i=1}^{d}|{\bf x_i}-{\bf x'_i}|^2} | ||
|
||
where :math:`\bf x` and :math:`\bf x'` are :math:`d`-dimensional feature vectors. | ||
|
||
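As a minimal sketch of the formula above, the distance between two vectors of equal dimension can be computed in plain Python. The function name `euclidean_distance` is hypothetical and chosen only for illustration; it is not part of Shogun.

```python
import math


def euclidean_distance(x, y):
    """Square root of the sum of squared coordinate differences.

    Hypothetical helper for illustration only; not a Shogun function.
    """
    if len(x) != len(y):
        raise ValueError("both vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# The points (0, 0) and (3, 4) are the endpoints of a 3-4-5 triangle's hypotenuse.
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # 5.0
```

The sum runs over all :math:`d` coordinates, so two identical vectors are at distance zero.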
------- | ||
Example | ||
------- | ||
|
||
Imagine we have files with data. We create :sgclass:`CDenseFeatures` (here 64-bit floats, aka RealFeatures) as follows: | ||
|
||
.. sgexample:: euclidean.sg:create_features | ||
|
||
We create an instance of :sgclass:`CEuclideanDistance` by passing it the two :sgclass:`CDenseFeatures` objects to compare (here the same features twice). | ||
|
||
.. sgexample:: euclidean.sg:create_instance | ||
|
||
The distance matrix can be extracted as follows: | ||
|
||
.. sgexample:: euclidean.sg:extract_distance | ||
|
||
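To make the idea of a distance matrix concrete, here is a plain-Python sketch that does not use Shogun. It assumes that entry ``(i, j)`` holds the distance between the i-th vector of the first set and the j-th vector of the second set; the helper name `distance_matrix` and the sample points are hypothetical.

```python
import math


def distance_matrix(lhs, rhs):
    """Return D with D[i][j] = Euclidean distance between lhs[i] and rhs[j].

    Hypothetical helper for illustration only; not a Shogun function.
    """
    return [
        [math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y))) for y in rhs]
        for x in lhs
    ]


# Three 2-dimensional points lying on one line through the origin.
points = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]

# Comparing a set with itself gives a symmetric matrix with a zero diagonal.
for row in distance_matrix(points, points):
    print(row)
# [0.0, 5.0, 10.0]
# [5.0, 0.0, 5.0]
# [10.0, 5.0, 0.0]
```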
We can reuse the same instance with new :sgclass:`CDenseFeatures` to compute the distances between two different sets of features. | ||
|
||
.. sgexample:: euclidean.sg:refresh_distance | ||
|
||
If desired, the squared distances can be extracted as follows: | ||
|
||
.. sgexample:: euclidean.sg:extract_sq_distance | ||
|
||
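The relation between the squared and the plain distance can be sketched in plain Python, independent of Shogun. The sketch assumes that disabling the square root simply leaves the sum of squared differences in place; the helper name `squared_distance_matrix` and the two sample sets are hypothetical.

```python
import math


def squared_distance_matrix(lhs, rhs):
    """Return S with S[i][j] = sum of squared differences of lhs[i] and rhs[j].

    Hypothetical helper for illustration only; not a Shogun function.
    """
    return [[sum((a - b) ** 2 for a, b in zip(x, y)) for y in rhs] for x in lhs]


# Two different sets, so the result is a rectangular 2 x 3 matrix.
set_a = [[0.0, 0.0], [1.0, 1.0]]
set_b = [[3.0, 4.0], [1.0, 2.0], [0.0, 0.0]]

squared = squared_distance_matrix(set_a, set_b)
print(squared)  # [[25.0, 5.0, 0.0], [13.0, 1.0, 2.0]]

# Taking the square root entry by entry recovers the Euclidean distances.
distances = [[math.sqrt(value) for value in row] for row in squared]
```

Because the square root is monotonic, squared distances rank pairs of points in the same order as the distances themselves, which is often all that a nearest-neighbour style comparison needs.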
---------- | ||
References | ||
---------- | ||
:wiki:`Euclidean_distance` | ||
|
||
.. bibliography:: ../../references.bib | ||
:filter: docname in docnames |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
CSVFile f_feats_a("../../data/fm_train_real.dat") | ||
CSVFile f_feats_b("../../data/fm_test_real.dat") | ||
|
||
#![create_features] | ||
RealFeatures features_a(f_feats_a) | ||
RealFeatures features_b(f_feats_b) | ||
#![create_features] | ||
|
||
#![create_instance] | ||
EuclideanDistance distance(features_a, features_a) | ||
#![create_instance] | ||
|
||
#![extract_distance] | ||
RealMatrix distance_matrix_aa = distance.get_distance_matrix() | ||
#![extract_distance] | ||
|
||
#![refresh_distance] | ||
distance.init(features_a, features_b) | ||
#![refresh_distance] | ||
|
||
#![extract_sq_distance] | ||
distance.set_disable_sqrt(True) | ||
RealMatrix distance_matrix_ab = distance.get_distance_matrix() | ||
#![extract_sq_distance] |