Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix normalization of accumulated quantities (originally: PMFTXYZ is missing normalization by equiv orientations) #616

Open
vyasr opened this issue May 18, 2020 · 6 comments · Fixed by #619

Comments

@vyasr
Copy link
Collaborator

vyasr commented May 18, 2020

If equiv_orientations are provided to PMFTXYZ, it currently will not divide by the number of equivalent orientations, leading to overcounting. It looks like this issue was introduced in #300 when the redundant logic between various PMFT classes was consolidated. There is unfortunately an API question here, not something that can be immediately fixed, because right now the equivalent orientations are provided in the call to compute and therefore could change between calls. If this is desirable, we will need to use a histogram that support non-integral bins so that we can perform a normalization during accumulation. If it's not desirable, then we should probably accept those as constructor arguments instead. For the 2.x line, I think the most sensible solution is to simply throw an error in compute if these differ in subsequent calls where reset is not called.

Update: The issue originally raised here is resolved by #619. However, the broader problems along this line are discussed in @bdice's comment below and remain unresolved.

@bdice
Copy link
Member

bdice commented May 19, 2020

@vyasr Is this issue only a symptom of a wider problem around normalizing accumulated data? We've discussed this a bit before but haven't really handled it well yet.

Looking at accumulateGeneral in BondHistogramCompute.cc, I see it stores the latest number of points and query points (m_n_points and m_n_query_points) and the latest box, which are then used later in the reduction. The general problem becomes apparent if any of these quantities change during accumulation:

  • the numbers of points
  • the number of query points
  • the box volume
  • the number of equivalent orientations (for PMFTXYZ)

https://github.com/glotzerlab/freud/pull/619/files#diff-a29cf673bb4b7780b92ebabd433936a5R71-R73

float inv_num_dens = m_box.getVolume() / (float) m_n_query_points;
float norm_factor = (float) 1.0 / ((float) m_frame_counter * (float) m_n_points * (float) m_num_equiv_orientations);
float prefactor = inv_num_dens * norm_factor;

The broader solutions I could imagine would be to do one of the following:

  • Normalize after each accumulation as you suggest above.
  • Have a separate counter tracking a running sum of the factors needed for normalization.

@vyasr
Copy link
Collaborator Author

vyasr commented May 19, 2020

Yes, I recognized this problem going into 2.0 and punted on it because it wasn't obvious exactly how we should solve it. I don't think the second solution (keeping a counter) is the right solution because in terms of what the calculation actually is, I think you have to normalize each time. While you could accomplish this by keeping a counter, you would have to compute the appropriate weighted sums (e.g. a/x + b/y = (ay + bx)/xy) since the counts and the normalization factor will be different each frame and that sum could become arbitrarily complicated over many frames.

Additionally, the current design is a bit awkward because the BondHistogramCompute class tracks a bunch of things that aren't quite within its purview because all it does is accumulate total counts. Currently all the subclasses happen to use reduceOverThreadsPerBin to reduce into an auxiliary array, and so the parent tracks all things that are used for that normalization. However, if we do intermediate reductions, then we'd also need to reset the counts between accumulations so that we don't double count. Is that what we want, or do we need to keep a separate list of the counts since the last accumulation?

@vyasr
Copy link
Collaborator Author

vyasr commented Aug 30, 2020

@bdice once we've resolved some of the other open PRs, let's chat and try to make a call on this?

@bdice
Copy link
Member

bdice commented Sep 10, 2020

@vyasr I recommend that we move forward by raising an error if the number of equivalent orientations change without resetting, as you did in #619. In this case, I think that's entirely reasonable to expect that the symmetry group of a PMFT's particle is remaining constant.

This solution doesn't help us with the broader considerations about normalization with changing boxes and numbers of points/query points, but this is a safe and minimally-breaking solution for equivalent orientations.

@vyasr
Copy link
Collaborator Author

vyasr commented Sep 11, 2020

Right. I'll rename this issue and update the description to clarify that the purpose has changed a bit, then resolve that PR.

@vyasr vyasr changed the title PMFTXYZ is missing normalization by equiv orientations Fix normalization of accumulated quantities (originally: PMFTXYZ is missing normalization by equiv orientations) Sep 11, 2020
@vyasr
Copy link
Collaborator Author

vyasr commented Oct 29, 2020

I renamed and updated the description of this issue, but forgot to remove the link when merging #619. The broader issues discussed here still need to be addressed.

@vyasr vyasr reopened this Oct 29, 2020
@bdice bdice mentioned this issue Sep 20, 2021
16 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants