# Measure Differences

I need a procedure to measure the difference between two vectors: one from a baseline (a known reference) and one from an implementation.

Here is a sample scenario:
1. We have one method (procedure) with two modes.
1. In the first mode, the method outputs the vector \[ 0.1, 0.2, 0.3, 0.4, 0.5, 0.6 ].
1. In the second mode, it outputs \[ 0.1, 0.2, 0.3, 0.4, **0.6**, 0.6 ].
1. The performance of the method is measured by how close the two outputs are. In a real application, the first mode might be real data and the second predicted data.
1. Then we modify the method.
1. Now the first mode outputs \[ 1, 2, 3, 4, 5, 6 ].
1. The second mode outputs \[ 1, 2, 3, 4, **4**, 6 ].

At step 6 you might ask why the output is different. One possible answer is that the modified method represents its output in a different way.
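
To make the problem concrete, here is a minimal sketch that encodes the four scenario vectors and compares them with a plain Euclidean distance. The variable names (`ref_a`, `out_a`, `ref_b`, `out_b`) are hypothetical and chosen only for this illustration.

```python
import numpy as np
from scipy.spatial import distance

# Outputs before the method is modified (steps 2 and 3).
ref_a = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
out_a = np.array([0.1, 0.2, 0.3, 0.4, 0.6, 0.6])

# Outputs after the method is modified (steps 6 and 7).
ref_b = np.array([1, 2, 3, 4, 5, 6], dtype=float)
out_b = np.array([1, 2, 3, 4, 4, 6], dtype=float)

# A raw Euclidean distance depends on how the output is represented.
raw_a = distance.euclidean(ref_a, out_a)
raw_b = distance.euclidean(ref_b, out_b)
print(raw_a, raw_b)
```

In both cases a single element is off by one step of the representation, yet the raw distances differ by roughly a factor of ten (about 0.1 versus 1.0), which is why a measure that ignores the representation is needed.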

## Requirements
- Should produce the same measure when both vectors are multiplied by the same scalar.
- Should produce the same measure when the same scalar is added to both vectors.
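
The two requirements can be turned into an automated check. The helper below, `check_invariance`, is a hypothetical name introduced for this sketch; it compares a metric before and after scaling and shifting, using `np.isclose` to tolerate floating-point round-off. The default `scale` and `shift` values are arbitrary choices.

```python
import numpy as np
from scipy.spatial import distance

def check_invariance(metric, u, v, scale=10.0, shift=15.0):
    """Return (scale_ok, shift_ok) for a distance function `metric`."""
    base = metric(u, v)
    # Requirement 1: multiplying both vectors by the same scalar.
    scale_ok = bool(np.isclose(base, metric(u * scale, v * scale)))
    # Requirement 2: adding the same scalar to both vectors.
    shift_ok = bool(np.isclose(base, metric(u + shift, v + shift)))
    return scale_ok, shift_ok

u = np.array([1, 2, 3, 4, 5, 6], dtype=float)
v = np.array([2, 2, 3, 4, 6, 6], dtype=float)

for metric in (distance.euclidean, distance.correlation, distance.cosine):
    print(metric.__name__, check_invariance(metric, u, v))
```

A metric satisfies both requirements only if it reports `(True, True)`.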

In [1]:
import numpy as np
import scipy.spatial

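def GetDistance( s1, s2, f ):
    # Standardize each vector (zero mean, unit standard deviation) so that
    # a shared scale or offset does not affect the result, then apply f.
    s1x = ( s1 - s1.mean()) / s1.std()
    s2x = ( s2 - s2.mean()) / s2.std()
    return f( s1x, s2x )

def GetDistance_without_normalization( s1, s2, f ):
    # Apply the metric f directly to the raw vectors.
    return f( s1, s2 )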

def DoTest( s1, s2, s3, s4, s5, s6, fd ):
    print( "scipy.spatial.distance.euclidean: {},{},{}".format(
        fd( s1, s2, scipy.spatial.distance.euclidean )
        , fd( s3, s4, scipy.spatial.distance.euclidean )
        , fd( s5, s6, scipy.spatial.distance.euclidean ) 
        )
    )

    print( "scipy.spatial.distance.correlation: {},{},{}".format(
        fd( s1, s2, scipy.spatial.distance.correlation )
        , fd( s3, s4, scipy.spatial.distance.correlation )
        , fd( s5, s6, scipy.spatial.distance.correlation ) 
        )
    )

    print( "scipy.spatial.distance.cosine: {},{},{}".format(
        fd( s1, s2, scipy.spatial.distance.cosine )
        , fd( s3, s4, scipy.spatial.distance.cosine )
        , fd( s5, s6, scipy.spatial.distance.cosine ) 
        )
    )
        
s1 = np.array( [1, 2, 3, 4, 5, 6 ])
s2 = np.array( [2, 2, 3, 4, 6, 6 ])

# Multiply
s3 = s1 * 10
s4 = s2 * 10

# Addition
s5 = s1 + 15
s6 = s2 + 15

print( "----- With Normalization -----")
DoTest( s1, s2, s3, s4, s5, s6, GetDistance )

print( "----- Without Normalization -----")
DoTest( s1, s2, s3, s4, s5, s6, GetDistance_without_normalization )



----- With Normalization -----
scipy.spatial.distance.euclidean: 0.6810612496922059,0.6810612496922059,0.6810612496922059
scipy.spatial.distance.correlation: 0.03865370215269248,0.03865370215269248,0.03865370215269237
scipy.spatial.distance.cosine: 0.03865370215269248,0.03865370215269248,0.03865370215269237
----- Without Normalization -----
scipy.spatial.distance.euclidean: 1.4142135623730951,14.142135623730951,1.4142135623730951
scipy.spatial.distance.correlation: 0.03865370215269237,0.03865370215269259,0.03865370215269237
scipy.spatial.distance.cosine: 0.007669388831071378,0.0076693888310716,0.0003203812416273655


## Conclusion
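- Use **scipy.spatial.distance.correlation**. Of the three metrics tested, it is the only one whose value stays the same (up to floating-point round-off) under both multiplication and addition without prior normalization.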
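
The result can also be derived from the definition of the correlation distance. The symbols below ($u$, $v$, $a$, $b$, $n$, and the tilde for standardized vectors) are notation introduced for this sketch and do not come from the notebook.

$$
d(u, v) = 1 - \frac{(u - \bar{u}) \cdot (v - \bar{v})}{\lVert u - \bar{u} \rVert_2 \, \lVert v - \bar{v} \rVert_2}
$$

For $u' = a u + b$ and $v' = a v + b$ with $a \neq 0$, the offset $b$ cancels in the centering step, so $u' - \bar{u}' = a (u - \bar{u})$ and likewise for $v'$. The numerator and the denominator are then both multiplied by $a^2$, which leaves $d(u', v') = d(u, v)$.

The same definition suggests why the normalized Euclidean distance is stable as well. Assuming the population standard deviation (the NumPy default for `std()`), each standardized vector of length $n$ has squared norm $n$, so

$$
\lVert \tilde{u} - \tilde{v} \rVert_2 = \sqrt{2 n \, d(u, v)}
$$

With $n = 6$ and $d \approx 0.03865$ this gives about $0.6811$, which agrees with the normalized Euclidean value printed above.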

## References

- https://docs.scipy.org/doc/scipy-0.14.0/reference/spatial.distance.html