# Timings for encore vs reencore

In [1]:
import MDAnalysis as mda
from MDAnalysis.tests.datafiles import PSF, DCD, DCD2
from MDAnalysis.analysis import encore, reencore, pca, align, clustering
from MDAnalysis.analysis.clustering.methods import AffinityPropagation
from sklearn.decomposition import PCA

Load both trajectories into memory and align them on the CA atoms so that the timings are easier to compare.

In [3]:
u1 = mda.Universe(PSF, DCD, in_memory=True)
u2 = mda.Universe(PSF, DCD2, in_memory=True)
ens = mda.Ensemble([u1, u2]).select_atoms("name CA")

align.AlignTraj(u1, u1, select="name CA").run()
align.AlignTraj(u2, u1, select="name CA").run()

<MDAnalysis.analysis.align.AlignTraj at 0x7f8e316d68d0>

## Harmonic ensemble similarity

In [3]:
%timeit encore.hes([u1, u2], select="name CA")

352 ms ± 24.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [4]:
%timeit reencore.hes(ens)

335 ms ± 15.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


## Clustering ensemble similarity

In [7]:
%timeit encore.ces([u1, u2], select="name CA")

5.71 s ± 123 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [8]:
%timeit reencore.ces(ens, AffinityPropagation)

2.31 s ± 399 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


Note: `reencore.ces` uses the Python `DistanceMatrix` to calculate distances, which is likely considerably slower than the C `PureRMSD` used in normal `encore`. Given that the distance matrix calculation takes ~99% of the function time, you could shorten the run time considerably by supplying your own precomputed matrix.
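The precomputed-matrix idea in the note above can be sketched with plain NumPy. This is a minimal illustration, not the `DistanceMatrix` or `PureRMSD` code: it assumes the frames are already aligned (no per-pair superposition), and `pairwise_rmsd` and the random `demo_coords` are hypothetical names and data made up for the example. How a precomputed matrix is handed to `reencore.ces` is not shown in this notebook, so that step is left out.

```python
import numpy as np


def pairwise_rmsd(coords):
    """Return the (n_frames, n_frames) RMSD matrix for pre-aligned frames.

    coords: array of shape (n_frames, n_atoms, 3). No superposition is done
    here; the frames are assumed to be aligned already.
    """
    coords = np.asarray(coords, dtype=np.float64)
    n_frames, n_atoms, _ = coords.shape
    flat = coords.reshape(n_frames, -1)

    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b, evaluated for all pairs at once
    sq_norms = np.einsum("ij,ij->i", flat, flat)
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (flat @ flat.T)

    # Rounding can leave tiny negative values; clip them before the sqrt
    np.maximum(sq_dist, 0.0, out=sq_dist)

    rmsd = np.sqrt(sq_dist / n_atoms)
    np.fill_diagonal(rmsd, 0.0)
    return rmsd


# Hypothetical stand-in data: 5 frames of 10 atoms
rng = np.random.default_rng(0)
demo_coords = rng.normal(size=(5, 10, 3))
demo_matrix = pairwise_rmsd(demo_coords)
```

With real data, the coordinate array would come from the aligned CA positions of both trajectories stacked along the frame axis. The single matrix product replaces the explicit loop over frame pairs.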

## Dimension reduction ensemble similarity

In [13]:
%timeit encore.dres([u1, u2], select="name CA")

5.75 s ± 83.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [14]:
%timeit reencore.dres(ens, pca.PCA, n_components=3)

556 ms ± 41.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


Note: The `reencore.dres` call above uses the Python `pca.PCA` class for the dimension reduction, instead of the C `StochasticProximityEmbedding` used in normal `encore`. Again, you could make this considerably faster by calculating your own embedding. To demonstrate, here is the same call with the scikit-learn `PCA` implementation:

In [18]:
%timeit reencore.dres(ens, PCA, n_components=3)

118 ms ± 5.11 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
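For reference, "calculating your own embedding" can be as small as the sketch below. It swaps in a NumPy SVD for both `pca.PCA` and scikit-learn's `PCA`, so it is an illustration of the technique rather than either library's implementation. `pca_embedding` and the random `demo_frames` are hypothetical, and the frames are assumed to be aligned already.

```python
import numpy as np


def pca_embedding(coords, n_components=3):
    """Project frames onto their first principal components via SVD.

    coords: array of shape (n_frames, n_atoms, 3), assumed pre-aligned.
    Returns an array of shape (n_frames, n_components).
    """
    coords = np.asarray(coords, dtype=np.float64)
    flat = coords.reshape(coords.shape[0], -1)

    # Center each coordinate column on its mean over all frames
    centered = flat - flat.mean(axis=0)

    # Rows of vt are the principal axes, ordered by decreasing variance
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Hypothetical stand-in data: 20 frames of 10 atoms
rng = np.random.default_rng(0)
demo_frames = rng.normal(size=(20, 10, 3))
demo_embedding = pca_embedding(demo_frames, n_components=3)
```

The sign of each component is arbitrary, so an embedding computed this way may be mirrored along some axes relative to another PCA implementation.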


## ces_convergence

In [4]:
%timeit encore.ces_convergence(u1, select="name CA", window_size=10)

2.15 s ± 29.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [5]:
%timeit reencore.ces_convergence(u1, select="name CA", window_size=10)

476 ms ± 22.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


## dres_convergence

In [6]:
%timeit encore.dres_convergence(u1, select="name CA", window_size=10)

2.27 s ± 109 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [7]:
%timeit reencore.dres_convergence(u1, select="name CA", window_size=10)

395 ms ± 25.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
