A fast, scalable and light-weight C++ Fréchet and DTW distance library, exposed to python and focused on clustering of polygonal curves.
import Fred as fred
- for verbosity, set
fred.config.verbosity
, default is0
, possible values0,1,2,3
By default, Fred will automatically determine the number of threads to use. If you want to set an upper limit, set fred.config.number_threads
. Set to -1
to enable dynamic mode again.
- signature:
fred.Curve(np.ndarray)
,fred.Curve(np.ndarray, str name)
- properties:
fred.Curve.values
: curves asnp.ndarray
,fred.Curve.name
: get name of curve,fred.Curve.dimensions
: dimension of curve,fred.Curve.complexity
: number of points of curve
- signature:
fred.Curves()
- methods:
fred.Curves.add(curve)
: add curve,fred.Curves[i]
: get ith curve,len(fred.Curves)
: number curves,fred.Curves + fred.Curves
: add two sets of curves,fred.Curves.simplify(l)
: return set of simplified curves - properties:
fred.Curves.m
: maximum complexity of the contained curves,fred.Curves.values
: curves asnp.ndarray
- signature:
fred.continuous_frechet(curve1, curve2)
- returns:
fred.Continuous_Frechet_Result
with membersvalue
,time_bounds
: running-time for upper and lower bound,number_searches
: number of free space diagrams built,time_searches
: running-time for free spaces
- approximation error in percent of distance:
fred.config.continuous_frechet_error
, which defaults to 1
- signature:
fred.discrete_frechet(curve1, curve2)
- returns:
fred.Discrete_Frechet_Result
with membersvalue
andtime
- signature:
fred.discrete_dynamic_time_warping(curve1, curve2)
- returns:
fred.Discrete_Dynamic_Time_Warping_Distance
with membersvalue
andtime
All simplifications are vertex-restricted!
- graph approach from Polygonal Approximations of a Curve — Formulations and Algorithms
- signature:
fred.frechet_minimum_error_simplification(fred.Curve, int complexity)
- returns:
fred.Curve
that uses input curves vertices, withcomplexity
number of vertices and that has minimum distance to input curve
- algorithm "FS" from Near-Linear Time Approximation Algorithms for Curve Simplification
- signature:
fred.frechet_approximate_minimum_link_simplification(fred.Curve, double error)
- returns:
fred.Curve
that uses input curves vertices, is of small complexity and with distance to input curve at mosterror
- binary search on
fred.frechet_approximate_minimum_link_simplification
- signature:
fred.frechet_approximate_minimum_error_simplification(fred.Curve, int complexity)
- returns:
fred.Curve
that uses input curves vertices, withcomplexity
number of vertices and that has small distance to input curve
- approach from Approximating Length-Restricted Means Under Dynamic Time Warping
- signature:
fred.dtw_approximate_minimum_error_simplification(fred.Curve, int complexity)
- returns:
fred.Curve
that uses input curves vertices, withcomplexity
number of vertices and that has small distance to input curve
- Set
fred.config.use_distance_matrix
toFalse
if already computed distances should not be stored. This makes sense for massive data sets, especially when so much memory is consumed that the OS kills the process.
The parameter distance_func
controls which distance function to use. Possible values:
0
: continuous Fréchet distance1
: discrete Fréchet distance2
: discrete dynamic time warping distance (algorithms are then of heuristic nature)
The parameter consecutive_call
controls whether distances and simplifications already computed in a previous clustering call are resused (set to true
then); defaults to false
.
Has no effect when fred.config.use_distance_matrix == False
.
- from Approximating (k,l)-center clustering for curves
- signature:
fred.discrete_klcenter(int k, int l, fred.Curves, bool local_search, bool consecutive_call, bool random_first_center, bool fast_simplification, int distance_func)
with parametersk
: number of centersl
: maximum complexity of the centerslocal_search
: number of iterations of local search to improve solution, defaults to0
random_first_center
: determines if first center is chosen uniformly at random or first curve is used as first center, optional, defaults to truefast_simplification
: determines whether to use the minimum error simplification or the faster approximate minimum error simplification, defaults tofalse
- returns:
fred.Clustering_Result
with mebersvalue
: objective valuetime
: running-timeassignment
: empty if compute_assignment has not been called
- Algorithm from section 4.3 in Geometric Approximation Algorithms + simplification
- signature:
fred.discrete_klmedian(int k, int l, fred.Curves, bool consecutive_call, bool fast_simplification, int distance_func)
with parametersk
: number of centersl
: maximum complexity of the centersfast_simplification
: determines whether to use the minimum error simplification or the faster approximate minimum error simplification, defaults tofalse
- returns:
fred.Clustering_Result
with mebersvalue
: objective valuetime
: running-timeassignment
: empty if compute_assignment has not been called
- signature:
fred.Clustering_Result
- methods:
len(fred.Clustering_Result)
: number of centersfred.Clustering_Result[i]
: get ith centerfred.Clustering_Result.compute_assignment(fred.Curves, bool consecutive_call)
: assigns every curve to its nearest center with parameterconsecutive_call
, which defaults tofalse
; set to true, if you want to assign the curves used for clusteringfred.Clustering_Result.optimize(fred.Curves, bool consecutive_call)
: (heuristically) optimizes cluster centers using a stabbing algorithm
- members:
value
: objective valuetime
: running-timeassignment
: empty if compute_assignment was not called
- signature:
fred.Cluster_Assignment
- methods:
len(fred.Cluster_Assignment)
: number of centersfred.Cluster_Assignment.count(i)
: number of curves assigned to centeri
fred.Cluster_Assignment.get(i,j)
: get index ofj
th curve assigned to centeri
fred.Cluster_Assignment.distance(i,j)
: get distance ofj
th curve assigned to centeri
to centeri
- Section 2 in Random Projections and Sampling Algorithms for Clustering of High Dimensional Polygonal Curves
- signature:
fred.dimension_reduction(curves, epsilon, empirical_constant)
with parametersepsilon
: (1+epsilon) approximation parameter,empirical_constant
: use constant of empirical study (faster, but less accurate), defaults toTrue
- returns:
fred.Curves
collection of curves
You have to have installed:
- C++ compiler
- CMake
- git
- optional: openmp available (should be a part of your compiler)
That's it!
It's best to use the Ubuntu subsystem, which is easy to install through powershell. Once you have this installed, follow this procedure:
sudo apt update
sudo apt upgrade
sudo apt install python3-pip cmake git gcc
pip3 install --user fred-frechet
Apple's clang does not really support openmp, which is now kind of an integral part of Fred. You can try to install libomp via homebrew, but the best way would be to get a virtual ubuntu running. Fred does install out of the box on Mac though, but without omp support.
- Variant 1: simply run
pip install Fred-Frechet
- Variant 2: clone repository and run
make
for installation into userdir
Just run python py/test.py
.
import Fred as fred
import numpy as np
import pandas as pd
curve1d = fred.Curve([1., 2.]) # Curve stores a polygonal curve with
# at least two points of at least one
# and equal number of dimensions
curve2d1 = fred.Curve([[1., 0.], [2., 1.], [3., 0.]]) # any number of dimensions and points works
curve2d2 = fred.Curve([[1., -1.], [2., -2.], [3., -1.]], "optional name, e.g. displayed in plot")
print(curve2d1)
fred.plot_curve(curve2d1, curve2d2)
fred.plot_curve(curve2d2, fred.frechet_minimum_error_simplification(curve2d2, 2))
print("distance is {}".format(fred.continuous_frechet(curve2d1, curve2d2).value))
print("download HUGE curves")
import requests, zipfile, io # you can use all libraries
# that work with numpy to read data into fred
re = requests.get("https://archive.ics.uci.edu/ml/machine-learning-databases/00447/data.zip", stream=True)
zf = zipfile.ZipFile(io.BytesIO(re.content))
ps1 = fred.Curve(pd.read_csv(zf.open('PS1.txt'), delimiter="\t", header=None).values[:50], "PS1")
ps2 = fred.Curve(pd.read_csv(zf.open('PS2.txt'), delimiter="\t", header=None).values[:50], "PS2")
ps3 = fred.Curve(pd.read_csv(zf.open('PS3.txt'), delimiter="\t", header=None).values[:50], "PS3")
ps4 = fred.Curve(pd.read_csv(zf.open('PS4.txt'), delimiter="\t", header=None).values[:50], "PS4")
ps5 = fred.Curve(pd.read_csv(zf.open('PS5.txt'), delimiter="\t", header=None).values[:50], "PS5")
ps6 = fred.Curve(pd.read_csv(zf.open('PS6.txt'), delimiter="\t", header=None).values[:50], "PS6")
curves = fred.Curves() # for clustering or if you want to apply dimension reduction
# you need to encapsulate your curves in a Curves object
curves.add(ps1)
curves.add(ps2)
curves.add(ps3)
curves.add(ps4)
curves.add(ps5)
curves.add(ps6)
fred.plot_curve(curves)
curves = fred.dimension_reduction(curves, 0.95) # fred is pretty fast but with high dimensional data
# a dimension reduction massively improves running-time
# even for smaller values of epsilon
fred.plot_curve(curves)
# Oneshot clustering - if you already know the value of k
clustering = fred.discrete_klcenter(2, 10, curves) # fast but coarse
clustering = fred.discrete_klmedian(2, 10, curves) # slow but better results
print("clustering cost is {}".format(clustering.value))
for i, center in enumerate(clustering):
print(f"center {i} is {center}")
fred.plot_curve(clustering)
# Multiple clustering calls - if you need to find a suitable value for k
for k in range(2, 6):
clustering = fred.discrete_klcenter(k, 10, curves, consecutive_call=True)
print(f"clustering cost is {clustering.value}")
clustering = fred.discrete_klmedian(k, 10, curves, consecutive_call=True)
print(f"clustering cost is {clustering.value}")
clustering.compute_assignment(curves, consecutive_call=True) # use consecutive_call = False when computing assignment for curves other
# than those used for computing the clustering
for i in range(len(clustering)):
for j in range(clustering.assignment.count(i)):
print(f"{curves[clustering.assignment.get(i,j)].name} was assigned to center {clustering[i].name}")
clustering.optimize_centers(curves, consecutive_call = True) # uses stabbing to get better centers
fred.plot_clustering(clustering, curves)
distance_func = 0 # try 2 for DTW
import Fred as fred
curves = fred.Curves()
curves.add(fred.Curve([[-10.93943942, -37.06242838],[-10.93944232, -37.06256992],[-10.93944232, -37.06256992],[-10.93944232, -37.06256992],[-10.93899402, -37.06280406],[-10.93860314, -37.06296434],[-10.93835807, -37.06278978],[-10.93829299, -37.06237239],[-10.93854427, -37.06176238],[-10.93889765, -37.06105771],[-10.9392863 , -37.06034013],[-10.93958482, -37.0597757 ],[-10.93981259, -37.05935465],[-10.94003912, -37.0589944 ],[-10.94029375, -37.05890855],[-10.94044568, -37.05899324],[-10.94071019, -37.05920933],[-10.94091498, -37.05938079],[-10.94092187, -37.05938262],[-10.94111489, -37.05955812],[-10.94143233, -37.05984643],[-10.94189287, -37.06029526],[-10.94234691, -37.06074624],[-10.9425804 , -37.06096802],[-10.94263953, -37.06101946],[-10.94263953, -37.06101946],[-10.94263953, -37.06101946],[-10.94266097, -37.0610417 ],[-10.94286097, -37.0612328 ],[-10.94315305, -37.06155233],[-10.94350765, -37.06192153],[-10.94410138, -37.0621534 ],[-10.94468481, -37.06230469],[-10.94527759, -37.06251654],[-10.94594673, -37.06281271],[-10.94671859, -37.06308869],[-10.94730157, -37.06335312],[-10.94778644, -37.06381369],[-10.94844587, -37.06445785],[-10.94904832, -37.06500803],[-10.94979403, -37.06571115],[-10.9504376 , -37.06632879],[-10.95101859, -37.06687168],[-10.95164801, -37.06724984],[-10.95194617, -37.06738994],[-10.95245685, -37.06766077],[-10.95311692, -37.06801289],[-10.95373597, -37.06834647],[-10.95425643, -37.06861906],[-10.95456037, -37.06876904],[-10.9551056 , -37.06906634],[-10.95571837, -37.06937444],[-10.95645998, -37.06951516],[-10.95722549, -37.06945968],[-10.9582645 , -37.06937347],[-10.959267 , -37.06930038],[-10.96033403, -37.06924749],[-10.9616459 , -37.069179 ],[-10.96270941, -37.06910438],[-10.96397319, -37.06903497],[-10.96497853, -37.06897429],[-10.9656844 , -37.06850669],[-10.96592044, -37.06768898],[-10.96610809, -37.06688631],[-10.96630181, -37.06600845],[-10.96646721, -37.06534033],[-10.96698213, -37.06499147],[-10.96764198, -37.06528425],[-10.96814327, -37.06557494],[-10.9687186 , -37.06599832],[-10.96927369, -37.0664603 ],[-10.96945204, -37.06660118],[-10.96959754, -37.06671691],[-10.96985275, -37.0669367 ],[-10.97016947, -37.0671957 ],[-10.97041451, -37.06739896],[-10.97048234, -37.06735379],[-10.97032484, -37.06715337],[-10.97020309, -37.06704726],[-10.97024076, -37.066873 ],[-10.97046391, -37.0665994 ],[-10.97074272, -37.06632107],[-10.97105451, -37.06647656],[-10.97140669, -37.06677885],[-10.97186827, -37.06717748],[-10.97201951, -37.06732553],[-10.97215358, -37.06744312],[-10.9723854 , -37.06761853],[-10.97258704, -37.06777718],[-10.97286671, -37.067999 ],[-10.97318196, -37.06824471],[-10.97329008, -37.06835596],[-10.97331396, -37.0683913 ],[-10.97331396, -37.0683913 ],[-10.97340625, -37.06848176],[-10.97355518, -37.06857843],[-10.97388026, -37.06881238],[-10.97409123, -37.06894244],[-10.97418474, -37.06886148],[-10.97428593, -37.0687222 ],[-10.97436295, -37.06863122],[-10.97442223, -37.06855982],[-10.97447997, -37.06848969],[-10.97454607, -37.06841423],[-10.97467334, -37.06824215],[-10.97479694, -37.06806764],[-10.97487819, -37.06798559],[-10.97494334, -37.06790972],[-10.97494334, -37.06790972],[-10.97496155, -37.06788944],[-10.97496155, -37.06788944],[-10.97496155, -37.06788944],[-10.97496155, -37.06788944]]))
curves.add(fred.Curve([[-10.90889296, -37.05237154],[-10.90904785, -37.05260632],[-10.90973536, -37.05266121],[-10.91010518, -37.05261827],[-10.91000414, -37.05256681],[-10.91048646, -37.05260013],[-10.91135943, -37.05269498],[-10.91229152, -37.05277322],[-10.91342033, -37.05270886],[-10.91431659, -37.05278532],[-10.91506897, -37.05266128],[-10.91544909, -37.05242861],[-10.91593664, -37.05267365],[-10.91667623, -37.05265982],[-10.91748816, -37.05281022],[-10.91801454, -37.05269153],[-10.91787201, -37.05279041],[-10.91785134, -37.05282591],[-10.91792023, -37.052902 ],[-10.91837675, -37.05296211],[-10.91850017, -37.0533763 ],[-10.91849212, -37.05358395],[-10.91849799, -37.05357595],[-10.9184368 , -37.05413224],[-10.91850316, -37.05443409],[-10.91850858, -37.05446375],[-10.9185052 , -37.05446315],[-10.91839066, -37.05517003],[-10.91841456, -37.05612296],[-10.9185024 , -37.05687158],[-10.9181955 , -37.05773603],[-10.91861413, -37.05873733],[-10.91855651, -37.05895014],[-10.91835643, -37.05889317],[-10.91854341, -37.05965251],[-10.91848843, -37.06081932],[-10.91843979, -37.06219825],[-10.91841746, -37.06266105],[-10.91837862, -37.06279615],[-10.91831496, -37.06300964],[-10.9183473 , -37.06299321],[-10.91837899, -37.06298774],[-10.91837791, -37.06298758],[-10.918378 , -37.06298698],[-10.91837822, -37.06298652],[-10.91841453, -37.06321011],[-10.91847828, -37.06394916],[-10.91861829, -37.06397188],[-10.91861297, -37.06398652],[-10.91861178, -37.06398627],[-10.91861355, -37.06398464],[-10.91861037, -37.06398528],[-10.91861033, -37.06398535],[-10.91861101, -37.06398616],[-10.91861229, -37.06398567],[-10.91861452, -37.06398519],[-10.91852107, -37.06433786],[-10.91854463, -37.06535452],[-10.91862404, -37.06630359],[-10.91879595, -37.06735107],[-10.91863274, -37.06821768],[-10.91867692, -37.06827064],[-10.91864985, -37.06846654],[-10.91861474, -37.06844821],[-10.91861158, -37.06845233],[-10.91861335, -37.06844958],[-10.91863853, -37.06863415],[-10.91866434, -37.06936379],[-10.91857642, -37.06997517],[-10.91855849, -37.07080263],[-10.91839213, -37.07178178],[-10.91835998, -37.07254682],[-10.91817896, -37.07257863],[-10.91805938, -37.07302097],[-10.91824085, -37.07327488],[-10.91825536, -37.07351718],[-10.9182579 , -37.07365122],[-10.91824237, -37.07365167],[-10.91824648, -37.07364797],[-10.9182473 , -37.07364716],[-10.91824587, -37.07364717],[-10.9182463 , -37.07364646],[-10.91824667, -37.07364626],[-10.91824711, -37.07364598],[-10.91829794, -37.07417675],[-10.91832389, -37.07471123],[-10.91834816, -37.07553953],[-10.91838258, -37.07596903],[-10.91830836, -37.076009 ],[-10.91831556, -37.07602092],[-10.91835704, -37.07625065],[-10.91840705, -37.07718778],[-10.91836099, -37.07770235],[-10.91843053, -37.07811014],[-10.91859487, -37.07820678],[-10.91832849, -37.07835976],[-10.91841839, -37.07871173],[-10.91858561, -37.07912875],[-10.91853693, -37.08029196],[-10.91847022, -37.08132272],[-10.91864122, -37.08232053],[-10.91878421, -37.08390708],[-10.91902161, -37.08549171],[-10.91957646, -37.08678353],[-10.91990603, -37.08762652],[-10.92011327, -37.08811781],[-10.92026249, -37.08852046],[-10.92027593, -37.08896067],[-10.92028812, -37.08971096],[-10.92086275, -37.09022295],[-10.92119806, -37.09125864],[-10.92158668, -37.09245244],[-10.9219998 , -37.09341251],[-10.92227813, -37.09389165],[-10.92251669, -37.09491899],[-10.9227356 , -37.09633969],[-10.92225014, -37.09771002],[-10.92184819, -37.09906076],[-10.92144139, -37.09987985],[-10.92144481, -37.09995651],[-10.92130557, -37.10039003],[-10.92115916, -37.1016734 ],[-10.92137987, -37.10308411],[-10.92125626, -37.10455606],[-10.92161594, -37.10520467],[-10.92292039, -37.10514096],[-10.92374638, -37.10509667],[-10.92379817, -37.10503713],[-10.92420939, -37.10468552]]))
curves.add(fred.Curve([[-10.90885896, -37.05223934],[-10.90887542, -37.05237009],[-10.90888581, -37.05236599],[-10.9088792 , -37.05236119],[-10.908879 , -37.0523613 ],[-10.90887897, -37.05236133],[-10.90887908, -37.05236184],[-10.90887925, -37.05236193],[-10.90887916, -37.05236232],[-10.90888031, -37.05238124],[-10.90901423, -37.05261118],[-10.90985044, -37.05264657],[-10.9107936 , -37.05260385],[-10.91128793, -37.05269182],[-10.91130952, -37.05271182],[-10.91131643, -37.05271003],[-10.91131638, -37.05270946],[-10.91131644, -37.05270941],[-10.91143227, -37.05271666],[-10.91178154, -37.05270394],[-10.91236017, -37.05273264],[-10.91239461, -37.05271299],[-10.91241785, -37.05269624],[-10.91241363, -37.0526976 ],[-10.91241079, -37.05269947],[-10.91287302, -37.05276819],[-10.91292392, -37.05277988],[-10.91327659, -37.05276669],[-10.91415117, -37.05275218],[-10.91495188, -37.05273417],[-10.91496974, -37.05215649],[-10.9149677 , -37.05136187],[-10.91497338, -37.05058764],[-10.91496995, -37.05030794],[-10.91496585, -37.04945366],[-10.91495289, -37.04837212],[-10.91492666, -37.04793266],[-10.9154707 , -37.04776878],[-10.9164315 , -37.04755438],[-10.91754675, -37.04716886],[-10.91769218, -37.04709118],[-10.91810692, -37.0469302 ],[-10.91830395, -37.0470265 ],[-10.91830991, -37.0471823 ],[-10.9183291 , -37.04789938],[-10.91833195, -37.04793058],[-10.91834546, -37.04793107],[-10.91834466, -37.04793116],[-10.91835246, -37.04833939],[-10.91834766, -37.04877383],[-10.91838263, -37.04880066],[-10.91838436, -37.04879317],[-10.91838504, -37.04883652],[-10.9183568