Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: analysis.calc_rg and analysis.calc_re break when particles are in different periodic images #4689

Closed
pm-blanco opened this issue Mar 17, 2023 · 3 comments · Fixed by #4698

Comments

@pm-blanco
Copy link
Contributor

pm-blanco commented Mar 17, 2023

While debugging a student code I found out the following behavior which looks like a bug to me. If one initializes the system with particles in different periodic images (for example some inside the simulation box and some in the next periodic image) and one calculates the radius of gyration or end-to-end distance, one obtains a value that disregards the periodicity of the system. Here you can find a minimal script that reproduces the bug:

import espressomd

box_l=20
bug=True # Switch on/off the bug

espresso_system=espressomd.System(box_l=[box_l]*3, periodicity=[True,True,True])

if bug:
    # Initialize the system with one particle inside the simulation box and another in a periodic image
    # Both particles are separated only by a distance of 0.1 by PBC
    espresso_system.part.add(id=0, pos = [0.1,0,0])
    espresso_system.part.add(id=1, pos = [box_l+0.2,box_l,box_l])
else:
    # Initialize the system with both particles inside the same periodic image
    espresso_system.part.add(id=0, pos = [box_l+0.1,box_l,box_l])
    espresso_system.part.add(id=1, pos = [box_l+0.2,box_l,box_l])

dist = espresso_system.distance(espresso_system.part.by_id(0),espresso_system.part.by_id(1))
Rg = espresso_system.analysis.calc_rg(chain_start=0, number_of_chains=1, chain_length=2)
Re = espresso_system.analysis.calc_re(chain_start=0, number_of_chains=1, chain_length=2)
print(f'The distance between the two particles is {dist:.4g}')
print(f'Their end-to-end distance is {Re[0]:.4g} and their radius of gyration is {Rg[0]:.4g}')

Actual output:

The distance between the two particles is 0.1
Their end-to-end distance is 34.7 and their radius of gyration is 17.35

Expected output:

The distance between the two particles is 0.1
Their end-to-end distance is 0.1 and their radius of gyration is 0.05
@jngrad
Copy link
Member

jngrad commented Mar 17, 2023

It also seems to be broken with Lees-Edwards boundary conditions:

import espressomd.lees_edwards
box_l=20
espresso_system=espressomd.System(box_l=[box_l]*3, periodicity=[True,True,True])
espresso_system.time_step = 1
espresso_system.cell_system.skin = 0.1
espresso_system.part.add(id=0, pos = [0.1,0,0])
espresso_system.part.add(id=1, pos = [box_l-0.2,0,0], v=[0.2,0,0])
params_lin = {'initial_pos_offset': -2., 'time_0': 0., 'shear_velocity': 0.}
lin_protocol = espressomd.lees_edwards.LinearShear(**params_lin)
espresso_system.lees_edwards.set_boundary_conditions(
    shear_direction="y", shear_plane_normal="x", protocol=lin_protocol)
espresso_system.integrator.run(2)
dist = espresso_system.distance(espresso_system.part.by_id(0),espresso_system.part.by_id(1))
Rg = espresso_system.analysis.calc_rg(chain_start=0, number_of_chains=1, chain_length=2)
Re = espresso_system.analysis.calc_re(chain_start=0, number_of_chains=1, chain_length=2)
print(f'pos=\n{espresso_system.part.all().pos}')
print(f'The distance between the two particles is {dist:.4g}')
print(f'Their end-to-end distance is {Re[0]:.4g} and their radius of gyration is {Rg[0]:.4g}')

Actual output:

pos=
[[ 0.1  0.   0. ]
 [20.2  2.   0. ]]
The distance between the two particles is 2.002
Their end-to-end distance is 20.2 and their radius of gyration is 10.1

Expected output:

pos=
[[ 0.1  0.   0. ]
 [20.2  2.   0. ]]
The distance between the two particles is 2.002
Their end-to-end distance is 2.002 and their radius of gyration is 1.001

All polymer-based distance observables rely on unfolded_position() since ca0367e (4.2.0 release). This is most likely the source of the error. These functions should rely on box_geo.get_mi_vector() instead, which returns the smallest distance while taking both periodicity and Lees-Edwards boundary conditions into account.

@jngrad jngrad added this to the Espresso 4.2.1 milestone Mar 17, 2023
@jngrad
Copy link
Member

jngrad commented Mar 20, 2023

I looked into the three polymer observables (end-to-end distance, radius of gyration, hydrodynamic radius), and it's not trivial. There are many edge cases to consider.

Center of mass definition in periodic boundaries

The weighted barycenter of a collection of points in a periodic system needs to be properly folded. This can be achieved by choosing an arbitrary point in the cloud that serves as a reference point, from which we calculate the distance vector to all other points using the minimum image convention. We compute the mass-average of these distance vectors to get the center of mass, and then offset it by the reference point coordinates. The cluster analysis code uses this algorithm (method Cluster::center_of_mass_subcluster()).

With Lees-Edwards boundary conditions, the aforementioned algorithm double-counts the position offset when the reference particle is in the central box. When the reference particle crosses the sheared boundary, the behavior is different. Likewise, the cluster radius of gyration double-counts the position offset while the corresponding polymer observables single-counts it. The GSL fractal dimension seems to underestimate the shear offset by a factor of 1.585.

Self-interaction through periodic images

When a polymer grows longer than half the box size, it interacts with its periodic image, at which points the definition of our three polymer observables no longer hold. It is easy to show it with the end-to-end distance with a polymer that grows beyond the unit cell; in the plot below, a box of size 10 is populated with a linear polymer that grows along the x-axis through the insertion of particles that are separated by a distance of 1. Method system.distance() takes periodicity into account and has a periodic profile while method system.analysis.calc_re() works on unfolded coordinates and increases linearly.

Plot of the end-to-end distance measured with two observables

Outlook

Working on unfolded coordinates seems to be the correct approach here. The discrepancy between folded and unfolded coordinates should not arise in a simulation that is properly set up, i.e. where the box dimensions are always larger than the polymer diameter. Polymer self-interactions through periodic images are generally not desirable, as they introduce unphysical correlations in the polymer dynamics. The only setup I can think of where such correlations would be useful is in energy minimization simulations of polymer crystals.

The bug mentioned by @pm-blanco could probably be caught by comparing the distance between consecutive particles in unfolded coordinates against the half box size. If the inter-particle distance is greater, it means the two particles are interacting through periodic images, and thus the polymer was instantiated with inconsistent image_box values. We could also use the min_global_cut, although it is not always guaranteed to be defined (e.g. not before interactions are defined). For the special case where a polymer is grown using a chemical reaction, care should be taken to insert particles with a suitable image_box value (this is currently not done automatically).

We should also update the user guide chapter on polymer observables to reflect that they aren't well-defined when self-interactions via periodic images are involved, or when Lees-Edwards periodic boundaries are used.

@pm-blanco @RudolfWeeber @kosovan @mebrito @schlaicha @bindgens1 do you have a different take on this matter?

@jngrad
Copy link
Member

jngrad commented Mar 21, 2023

Meeting discussion summary:

  • keep the current unfolded coordinates algorithms
  • document polymer observable caveats in the user guide
  • don't check for inter-particle distances being larger than a box length, since there would be false positives for polymer melts (the self-interaction through periodic images are acceptable due to fast decorrelation)
  • investigate Lees-Edwards issues further and open dedicated tickets

@kodiakhq kodiakhq bot closed this as completed in #4698 Mar 29, 2023
kodiakhq bot added a commit that referenced this issue Mar 29, 2023
Fixes #4689

Description of changes:
- disable cluster analysis methods when Lees-Edwards boundary conditions are used
- document caveats from chain analysis methods
- cleanup doxygen and comment blocks
jngrad pushed a commit to jngrad/espresso that referenced this issue Mar 29, 2023
Fixes espressomd#4689

Description of changes:
- disable cluster analysis methods when Lees-Edwards boundary conditions are used
- document caveats from chain analysis methods
- cleanup doxygen and comment blocks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants