Add self-diffusivity calculation #24

xhgchen · 2023-07-11T18:27:34Z

Fixes #7

Changes made in this Pull Request:

Add self-diffusivity calculation method sd() in class VelocityAutocorr

xhgchen · 2023-07-11T18:30:25Z

Definitely want to discuss what is best for testing this, perhaps during our meeting on Thursday. I did have some ideas in my proposal, though none of them are as straightforward to put into practice as the tests written thus far.

codecov · 2023-07-11T18:31:09Z

Codecov Report

Merging #24 (3c66b5d) into main (bebcfa1) will not change coverage.
The diff coverage is 100.00%.


hmacdope · 2023-07-12T03:46:21Z

The step traj should have an analytically solvable integral, though what it is I am not sure.


xhgchen · 2023-07-12T20:36:15Z

Is it really easily solvable analytically? I'm thinking about trying a different numerical integration method, e.g. scipy.integrate.quad, for the tests.

hmacdope · 2023-07-12T21:20:37Z

Try making a few notebooks to play around in and we can go over it when we meet soon. 👍

orionarcher · 2023-07-13T23:38:06Z

@ReviewNB

This should make it easier to integrate notebooks into review/version control if you want to do that.

xhgchen · 2023-07-14T23:48:44Z

> @ReviewNB
>
> This should make it easier to integrate notebooks into review/version control if you want to do that.

Thank you, I'll look into it!

xhgchen · 2023-07-18T22:32:03Z

Added tests for both the simple windowed calculation and the FFT calculation, but I'm not sure if both are necessary. Please let me know your thoughts. I'm using scipy.integrate.simpson for the tests to start, and changed NSTEP from 5000 to 5001 to work with that method.


xhgchen · 2023-07-19T02:25:36Z

We could probably do without the non-FFT tests for self-diffusivity, since we test that already in the VACF tests. Am I good to remove them?

hmacdope · 2023-07-19T04:22:07Z

@xhgchen I would check the full code path; testing is cheap, but having undetected issues is expensive.


xhgchen · 2023-07-20T01:17:01Z

Made the applicable changes @orionarcher requested in #25 here, namely changing _parse_dim_type() to a static method and removing the unnecessary type casting.

hmacdope

I think we need to validate against existing MSD code.

hmacdope · 2023-07-21T07:28:56Z

transport_analysis/tests/test_velocityautocorr.py

+            )
+            / tdim_factor
+        )
+        # 7705160166.66 (exp) agrees with 7705162888.88 (act) to 6 sig figs


Is this really the self-diffusivity? It seems very high. Try the equivalent trajectory in position space with the existing MSD module from MDAnalysis and see what you get.

Tests have been added (though they are very rough). Will try to make the separate class next!
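
For context on why the MSD check is a fair comparison: the Einstein (MSD) and Green-Kubo (VACF) expressions for the self-diffusivity are equivalent in the long-time limit. A sketch of the two forms, where $d$ is the dimensionality and the symbols are generic notation rather than names from the code:

```latex
D = \frac{1}{2d} \lim_{t \to \infty} \frac{\mathrm{d}}{\mathrm{d}t}
    \left\langle \lvert \mathbf{r}(t) - \mathbf{r}(0) \rvert^{2} \right\rangle
  = \frac{1}{d} \int_{0}^{\infty}
    \left\langle \mathbf{v}(0) \cdot \mathbf{v}(t) \right\rangle \, \mathrm{d}t
```

Both forms assume diffusive motion. For a synthetic test trajectory that is not diffusive, neither expression has to settle on a physically meaningful constant, so a large number there is not by itself evidence of a bug; agreement between the two routes is the more useful check.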

hmacdope

Looking great. We should also add something to plot the running integral of the ACF, as is done in Fig. 12 of https://livecomsjournal.org/index.php/livecoms/article/view/v1i1e6324/937. The right function for this is probably scipy.integrate.cumtrapz.
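
To make the running-integral idea concrete, here is a minimal dependency-free sketch of a cumulative trapezoid rule applied to a made-up exponentially decaying VACF. The helper name and the sample data are assumptions for illustration only; in the package itself the SciPy routine mentioned above (named `cumulative_trapezoid` in newer SciPy releases) would be the natural choice.

```python
import math


def cumulative_trapezoid(y, x):
    """Running trapezoid integral of samples y at points x.

    The first entry is 0.0, which matches passing initial=0 to SciPy's
    cumulative_trapezoid.
    """
    running = [0.0]
    for i in range(1, len(y)):
        area = 0.5 * (y[i] + y[i - 1]) * (x[i] - x[i - 1])
        running.append(running[-1] + area)
    return running


# Made-up VACF: exp(-t), sampled on [0, 10]; its full integral is 1 - exp(-10).
times = [0.1 * i for i in range(101)]
vacf = [math.exp(-t) for t in times]

running_integral = cumulative_trapezoid(vacf, times)
```

Plotting `running_integral` against `times` should show the curve flattening into a plateau once the VACF has decayed, which is the visual convergence check the figure in the linked article illustrates.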

hmacdope · 2023-07-24T05:15:08Z

transport_analysis/tests/test_velocityautocorr.py

-        assert_almost_equal(v_simple.results.timeseries, poly, decimal=3)
+        assert_almost_equal(v_fft.results.timeseries, poly, decimal=3)
+
+    @pytest.mark.parametrize(


You can move all the tests that use this fixture into a separate class and then parametrize over the class rather than repeating the parametrization on each test.

Those tests would include the old ones for VACF too, right?

hmacdope · 2023-07-24T05:17:38Z

transport_analysis/velocityautocorr.py

@@ -262,3 +261,68 @@ def plot_vacf(self, start=0, stop=0, step=1):
            self.times[start:stop:step],
            self.results.timeseries[start:stop:step],
        )
+
+    def sd(self, start=0, stop=0, step=1):


This is a Green-Kubo integral of the VACF, so we should use the proper name.

Also will need citations.

Does sd_gk() sound good? I feel that self-diffusivity_greenkubo() is too long. I thought we were doing citations after we finish the bulk of the implementations? Or would it be better to do them now?

sd_gk sounds good to me. Either is fine on citation, up to you, just make a note or issue.

Left a note in Issue #20.
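
As a rough illustration of what a Green-Kubo self-diffusivity calculation does, the sketch below integrates a VACF over time with the trapezoid rule and divides by the dimensionality. The function names, the exponential test VACF, and the parameter values are all assumptions for illustration; this is not the code in `velocityautocorr.py`.

```python
import math


def trapezoid(y, x):
    """Composite trapezoid rule for samples y at points x."""
    return sum(
        0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(y) - 1)
    )


def green_kubo_diffusivity(vacf, times, dim_fac):
    """Integrate the VACF over time and divide by the dimensionality."""
    return trapezoid(vacf, times) / dim_fac


# Made-up VACF: c0 * exp(-t / tau), whose integral from 0 to infinity is c0 * tau.
c0, tau, dim_fac = 3.0, 0.5, 3
times = [0.01 * i for i in range(2001)]  # 0 to 20, i.e. 40 decay times
vacf = [c0 * math.exp(-t / tau) for t in times]

d_numeric = green_kubo_diffusivity(vacf, times, dim_fac)
d_exact = c0 * tau / dim_fac
```

With a known closed-form integral like this one, `d_numeric` can be compared against `d_exact` directly, which is one way to get a test value that does not depend on a second numerical method.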

hmacdope · 2023-07-24T05:18:27Z

transport_analysis/velocityautocorr.py

+        `numpy.float64`
+            The calculated self-diffusivity value for the analysis.
+        """
+        stop = self.n_frames if stop == 0 else stop


This should raise an exception if .run() hasn't been called yet; track that with some kind of cache variable.

I'm not too sure where to start here. Do you think you'd be able to explain a bit more, or would you be good with setting up a meeting to go over this?

Yep happy to meet.

Something like the following should work.

    def __init__(self):
        self._run_called = False
        ...

    def _conclude(self):
        self._run_called = True
        ...

    def sd_gk(self):
        if not self._run_called:
            raise RuntimeError("blah blah")

Or self._has_run?

Something like, "VelocityAutocorrelation.run() must be called before 'sd_gk'"
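
Putting the suggestions in this thread together, a self-contained sketch of the guard might look like the following. The class is a hypothetical stand-in, not the real `VelocityAutocorr`: the `run()` body and the stored values are placeholders, and only the flag-and-raise pattern is the point.

```python
class VelocityAutocorrSketch:
    """Hypothetical stand-in for the analysis class (illustration only)."""

    def __init__(self):
        # Cache variable recording whether run() has completed.
        self._run_called = False
        self.timeseries = None

    def run(self):
        # Placeholder for the real analysis; stores a made-up VACF.
        self.timeseries = [1.0, 0.5, 0.25]
        self._conclude()
        return self

    def _conclude(self):
        self._run_called = True

    def self_diffusivity_gk(self):
        if not self._run_called:
            raise RuntimeError(
                "VelocityAutocorrSketch.run() must be called before "
                "self_diffusivity_gk()"
            )
        # Placeholder return value; the real method would integrate the VACF.
        return self.timeseries


analysis = VelocityAutocorrSketch()
try:
    analysis.self_diffusivity_gk()
    raised_before_run = False
except RuntimeError:
    raised_before_run = True

result_after_run = analysis.run().self_diffusivity_gk()
```

Setting the flag in `_conclude()` means it only flips once the whole run has finished, so a run that fails partway through still leaves the guard in place.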

hmacdope · 2023-07-24T05:18:59Z

transport_analysis/velocityautocorr.py

+            / self.dim_fac
+        )
+
+    def sd_odd(self, start=0, stop=0, step=1):


We will probably remove this IMHO, as trapezoid or Gquad is the standard. We can keep it around for testing, but I would make it a private method with a _ prefix.

I thought Simpson's rule was better when appropriate? Here's a link to my source: https://math.dartmouth.edu/~m3cod/klbookLectures/406unit/trap.pdf

Is keeping the standard more important than having an option for better accuracy?

Need for an odd number of samples is a bit of a pain.

I feel having the option available is better than not providing it, but I am no expert. Is it really preferable to remove it rather than just make it clear that trapezoid is the standard/default?

In general you want to aim for the lowest possible API surface so that people don't get confused and do the wrong thing.

The idea being you should plan for users to get frustrated as soon as the first thing goes wrong, meaning you should take away the ability for them to annoy themselves if that makes sense.

If users do the wrong thing (use the method with an even number of samples), scipy.integrate.simpson actually handles it well by default using the equations in Cartwright's paper for the last interval. On second thought, considering that it is pretty much the same as trapezoid but better, what are your thoughts on using it as the default instead?

Source: https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.simpson.html
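
To put rough numbers on the accuracy argument, here is a dependency-free sketch comparing the composite trapezoid rule with composite Simpson's rule on a smooth, made-up integrand sampled at an odd number of points. Both helpers are hand-written for illustration and are not SciPy's implementations; unlike the SciPy behaviour described above, this Simpson helper simply rejects an even sample count.

```python
import math


def trapezoid(y, h):
    """Composite trapezoid rule for equally spaced samples with spacing h."""
    return h * (0.5 * y[0] + sum(y[1:-1]) + 0.5 * y[-1])


def simpson(y, h):
    """Composite Simpson's rule; needs an odd number of equally spaced samples."""
    if len(y) % 2 == 0:
        raise ValueError("composite Simpson's rule needs an odd number of samples")
    odd_terms = sum(y[1:-1:2])   # interior samples with weight 4
    even_terms = sum(y[2:-1:2])  # interior samples with weight 2
    return h / 3 * (y[0] + 4 * odd_terms + 2 * even_terms + y[-1])


# Made-up smooth integrand: exp(-t) on [0, 5], 11 samples (10 intervals).
h = 0.5
samples = [math.exp(-h * i) for i in range(11)]
exact = 1 - math.exp(-5)

trapezoid_error = abs(trapezoid(samples, h) - exact)
simpson_error = abs(simpson(samples, h) - exact)
```

On this integrand the Simpson error comes out well below the trapezoid error, which supports the accuracy point for smooth data. It does not settle the API question, since a noisy VACF from a real trajectory may not benefit in the same way.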


orionarcher · 2023-07-27T17:13:54Z

transport_analysis/tests/test_velocityautocorr.py

+def step_vtraj_pos(NSTEP):
+    x = np.arange(NSTEP).astype(np.float64)


Can we make the name here a bit more descriptive? Maybe unit_velocity_traj?

Does this matter enough to be worth the effort? It's used in a lot of places, so the change would be a little messy.

orionarcher · 2023-07-27T17:19:43Z

transport_analysis/velocityautocorr.py

+        `numpy.float64`
+            The calculated self-diffusivity value for the analysis.
+        """
+        stop = self.n_frames if stop == 0 else stop


Or self._has_run?

Something like, "VelocityAutocorrelation.run() must be called before 'sd_gk'"

orionarcher · 2023-07-27T17:21:24Z

transport_analysis/velocityautocorr.py

@@ -262,3 +266,78 @@ def plot_vacf(self, start=0, stop=0, step=1):
            self.times[start:stop:step],
            self.results.timeseries[start:stop:step],
        )
+
+    def sd_gk(self, start=0, stop=0, step=1):


Another nitpick, but generally I'm not in favor of super-abbreviated naming. sd_gk means something to you now, as a developer, but it probably won't mean much to a new user. Maybe self_diffusivity_gk or self_diffusivity_green_kubo or diffusivity_green_kubo?

Good point, I'll go with self_diffusivity_gk then.


hmacdope

Great work @xhgchen! You've addressed all my comments. I would say merge and let's kick some more goals. 💯

xhgchen added 2 commits July 12, 2023 12:03

Add self-diffusivity calculation method

fde9cbf

* Method is `sd()` in class `VelocityAutocorr`

Fix sd() to divide by dimensionality

b7e2dd3

xhgchen force-pushed the self-diffusivity branch from 29a3e4a to b7e2dd3 Compare July 12, 2023 18:04

xhgchen added the enhancement New feature or request label Jul 14, 2023

xhgchen added 3 commits July 18, 2023 13:46

Add basic tests for self-diffusivity

3542652

Add self-diffusivity tests for start, stop, step

796a1b5

Add FFT self-diffusivity tests

560a9a3

Add self-diffusivity calc for odd # of data points

bdf6095

* Add `sd_odd()` using `scipy.integrate.simpson`
* Add tests for `sd_odd()`

Change v_simple to v_fft in FFT tests

916cbd6

xhgchen added 3 commits July 19, 2023 18:52

Refactor _parse_dim_type() to static method

29f2805

Remove unnecessary type casting

067cb40

* Change `self._velocity_array` to `self._velocities` for readability

Reformat velocityautocorr.py with Black

fb661c6

xhgchen marked this pull request as ready for review July 20, 2023 01:14

xhgchen requested review from hmacdope and orionarcher July 20, 2023 01:14

hmacdope requested changes Jul 21, 2023

View reviewed changes

hmacdope requested changes Jul 24, 2023

View reviewed changes

xhgchen added 2 commits July 24, 2023 14:26

Add self-diffusivity tests against MSD

e39ecdc

* Create position trajectory corresponding to unit velocity trajectory
* Write tests against MSD-based self-diffusivity calculation from the MDAnalysis MSD module (Einstein method)

Check if .run() called with cache variable

bdbde05

* Track whether analysis is run with `self._run_called`
* Raise exceptions in self-diffusivity and plotting functions if analysis not run
* Add tests for new exception raises

xhgchen added 2 commits July 26, 2023 14:56

Add parameter in sd_gk_odd() exception test

cdbda5f

Refactor all dims tests to separate class

54d7a32

* Parameterization occurs over class `TestAllDims` to test calculations over all possible dimensions

orionarcher reviewed Jul 27, 2023

View reviewed changes

xhgchen added 5 commits July 27, 2023 17:38

Add plot_running_integral() for VACF class

3dc6fd0

* Update units in plotting functions

Add tests for plot_running_integral()

0cf4a26

* Make VACF plotting function and tests unitless

Shorten plot_running_integral() ylabel

3909512

Change sd_gk() to self_diffusivity_gk()

fd74977

* Make `plot_running_integral()` docstring more clear

Add units to plot_vacf()

3c66b5d

* Units are angstroms squared per picoseconds squared

xhgchen requested review from hmacdope and orionarcher July 30, 2023 19:50

hmacdope approved these changes Aug 1, 2023

View reviewed changes

xhgchen merged commit 1211874 into MDAnalysis:main Aug 2, 2023
24 checks passed

