In [1]:
import atomtypes
import MDAnalysis as mda
import numpy as np
import pandas as pd

In [2]:
u = mda.Universe('big.gro')

In [3]:
oldag = u.atoms
newag = atomtypes.StrucAtomGroup(atomtypes.convert(u.atoms))

# Let the games begin 

## Let's get some attributes
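
Every timing cell in this notebook reports the same statistic: the mean of the runs that fall below the 95th percentile, which discards the slowest outliers. A minimal sketch of that statistic as a reusable helper follows; the name `trimmed_mean` is hypothetical and not part of the notebook, and it uses `np.percentile` on the assumption that its default linear interpolation matches the `pd.Series.quantile` default.

```python
import numpy as np


def trimmed_mean(run_times, q=0.95):
    """Mean of the runs strictly below the q-th quantile.

    Hypothetical helper: mirrors s[s < s.quantile(q)].mean() from the cells.
    """
    times = np.asarray(run_times, dtype=float)
    # Linear interpolation between order statistics (NumPy's default method).
    cutoff = np.percentile(times, q * 100.0)
    # Strict comparison: runs at or above the cutoff are dropped.
    return times[times < cutoff].mean()


# Four fast runs and one slow outlier; the outlier is excluded from the mean.
example_runs = [1.0, 2.0, 3.0, 4.0, 100.0]
example_mean = trimmed_mean(example_runs)
```

With a `%timeit -o` result `a`, this would be called as `trimmed_mean(a.all_runs)`.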

In [32]:
a = %timeit -n1 -r50 -o newag.names()
s = pd.Series(a.all_runs)
print "Mean time of 95 quantile: ", s[s < s.quantile(.95)].mean()

The slowest run took 4.43 times longer than the fastest. This could mean that an intermediate result is being cached 
1 loops, best of 50: 5.01 µs per loop
Mean time of 95 quantile:  6.48295625727e-06


In [17]:
a = %timeit -n1 -r50 -o oldag.names()
s = pd.Series(a.all_runs)
print "Mean time of 95 quantile: ", s[s < s.quantile(.95)].mean()

1 loops, best of 50: 313 ms per loop
Mean time of 95 quantile:  0.319286437745


In [39]:
a = %timeit -n1 -r50 -o newag.charges()
s = pd.Series(a.all_runs)
print "Mean time of 95 quantile: ", s[s < s.quantile(.95)].mean()

The slowest run took 7.00 times longer than the fastest. This could mean that an intermediate result is being cached 
1 loops, best of 50: 2.86 µs per loop
Mean time of 95 quantile:  4.23066159512e-06


In [19]:
a = %timeit -n1 -r50 -o oldag.charges()
s = pd.Series(a.all_runs)
print "Mean time of 95 quantile: ", s[s < s.quantile(.95)].mean()

1 loops, best of 50: 229 ms per loop
Mean time of 95 quantile:  0.232743126281


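Getting values such as names and charges is roughly 50,000 to 80,000 times faster with a structured array (microseconds versus hundreds of milliseconds), which is close to 5 orders of magnitude. Note that `%timeit` warns that an intermediate result may be cached in the structured-array runs, so the microsecond figures should be read with some caution.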
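
One plausible reason for a gap of this size is that selecting a field from a NumPy structured array returns a view on the existing buffer, while a list-based backend has to visit every atom object. The sketch below illustrates this with a hypothetical record layout and a stand-in `Atom` class; neither is taken from `atomtypes` or MDAnalysis, whose internals are not shown in this notebook.

```python
import numpy as np

# Hypothetical record layout; the real atomtypes dtype is not shown here.
atom_dtype = np.dtype([("name", "U4"), ("charge", "f8")])

n_atoms = 1000
atoms = np.zeros(n_atoms, dtype=atom_dtype)
atoms["name"] = "OW"
atoms["charge"] = np.linspace(-1.0, 1.0, n_atoms)


class Atom(object):
    """Stand-in for a per-atom object in a list-based backend."""

    def __init__(self, name, charge):
        self.name = name
        self.charge = charge


atom_list = [Atom(str(n), float(q))
             for n, q in zip(atoms["name"], atoms["charge"])]

# Structured array: field selection returns a view, with no per-atom work.
charges_view = atoms["charge"]

# List backend: every object is visited and a new array is built.
charges_from_list = np.array([a.charge for a in atom_list])

# The view tracks later changes to the array; the gathered copy does not.
atoms["charge"][0] = 5.0
view_sees_change = bool(charges_view[0] == 5.0)
copy_sees_change = bool(charges_from_list[0] == 5.0)
```

Because the view shares memory with the array, its cost does not grow with the number of atoms, whereas the list comprehension scales linearly.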

## Let's set some attributes

In [22]:
charges = np.random.random(len(oldag))

In [23]:
a = %timeit -n1 -r50 -o newag.set_charges(charges)
s = pd.Series(a.all_runs)
print "Mean time of 95 quantile: ", s[s < s.quantile(.95)].mean()

1 loops, best of 50: 13.9 ms per loop
Mean time of 95 quantile:  0.0141457851897


In [25]:
a = %timeit -n1 -r50 -o oldag.set_charge(charges)
s = pd.Series(a.all_runs)
print "Mean time of 95 quantile: ", s[s < s.quantile(.95)].mean()

1 loops, best of 50: 671 ms per loop
Mean time of 95 quantile:  0.687485415885


Setting values such as charges is roughly 50 times faster with a structured array (about 14 ms versus 0.67 s), which is more than an order of magnitude.
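
The difference between the two setters can be sketched as one bulk copy into a field versus one Python-level attribute assignment per atom. The functions, the `Atom` class, and the dtype below are hypothetical illustrations, not the actual `set_charges` or `set_charge` implementations.

```python
import numpy as np

# Hypothetical record layout; the real atomtypes dtype is not shown here.
atom_dtype = np.dtype([("name", "U4"), ("charge", "f8")])


class Atom(object):
    """Stand-in for a per-atom object in a list-based backend."""

    def __init__(self, charge=0.0):
        self.charge = charge


def set_charges_structured(atoms, charges):
    """Copy all values into the 'charge' field in a single NumPy operation."""
    atoms["charge"] = charges


def set_charges_list(atom_list, charges):
    """Assign one attribute per atom in a Python loop."""
    for atom, q in zip(atom_list, charges):
        atom.charge = q


n_atoms = 1000
new_charges = np.random.random(n_atoms)

atoms = np.zeros(n_atoms, dtype=atom_dtype)
atom_list = [Atom() for _ in range(n_atoms)]

set_charges_structured(atoms, new_charges)
set_charges_list(atom_list, new_charges)
```

Unlike reading a field, setting one still has to copy every value, which may explain why the speed-up for setting is much smaller than for getting.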

## Let's try some fancy indexing

In [27]:
idx = np.random.randint(0, 1500000, size=25000)

In [44]:
a = %timeit -n1 -r50 -o newag[idx]
s = pd.Series(a.all_runs)
print "Mean time of 95 quantile: ", s[s < s.quantile(.95)].mean()

1 loops, best of 50: 7.18 ms per loop
Mean time of 95 quantile:  0.00750946998596


In [45]:
a = %timeit -n1 -r50 -o oldag[idx]
s = pd.Series(a.all_runs)
print "Mean time of 95 quantile: ", s[s < s.quantile(.95)].mean()

1 loops, best of 50: 6.35 ms per loop
Mean time of 95 quantile:  0.00713543181724


Fancy indexing performs comparably with both backends: selecting 25,000 random atoms takes about 7 ms with either the structured array or the list.
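
A possible explanation is that both backends do work proportional to the number of selected indices: NumPy fancy indexing copies the selected records into a new array, and a list backend collects one reference per index. The sketch below uses a hypothetical record layout and plain integers as stand-ins for atom objects; it illustrates the two access patterns and is not the notebook's actual `__getitem__` code.

```python
import numpy as np

# Hypothetical record layout; the real atomtypes dtype is not shown here.
atom_dtype = np.dtype([("name", "U4"), ("charge", "f8")])

n_atoms = 1000
atoms = np.zeros(n_atoms, dtype=atom_dtype)
atoms["charge"] = np.arange(n_atoms)

# Plain integers stand in for the atom objects of a list-based backend.
atom_list = list(range(n_atoms))

idx = np.random.randint(0, n_atoms, size=50)

# Structured array: fancy indexing copies the selected records.
subset_array = atoms[idx]

# List backend: one lookup per index.
subset_list = [atom_list[i] for i in idx]

# The subset is a copy, so editing it leaves the original array untouched.
subset_array["charge"][0] = -1.0
original_unchanged = bool(atoms["charge"][idx[0]] == idx[0])
```

Since the selection is a copy rather than a view, the structured array loses the advantage it has for whole-field access.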