Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-37737: Fix non-deterministic behavior #10

Merged
merged 2 commits into from Feb 3, 2023
Merged

Conversation

cmsaunders
Copy link
Collaborator

Since c++ sets of pointers are sorted by the pointer address, this adds an after-the-fact sorting step (using the RA of the match) to ensure that results are identical run to run. It also adds a fixed seed for the fitDefaulted method. Lastly, a function has been added to get the parameters for a given map, which is helpful for debugging.

@cmsaunders cmsaunders force-pushed the tickets/DM-37737 branch 3 times, most recently from ff52309 to 4a2d6a3 Compare January 31, 2023 17:45
include/FoF.h Outdated
d_average /= this->size();
average[d] = d_average;
}
averageUpdated = true;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I did recommend caching the average, I don't think you can easily ensure that the cached value is always correct. Since Match is a derived from list rather than composed, you'd need to override every function that could possibly mutate it (std::list has 13 functions under Modifiers). Additionally, I don't think you can guarantee that the P class is immutable. Yes, Match stores const P &, but they can be mutated elsewhere.

Assuming that the performance gains from caching are worthwhile, you could perhaps add a bool forceUpdate=false arg to allow users to ignore the value of averageUpdated if they're certain that (or unsure if) they've mutated the Match some other way.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am just going to change the name of this method to calculateAverage and get rid of the caching. It's not the most efficient, but this is only used once in the code base, and this whole part of the code is going to be superseded in a couple months by using the outputs of IsolatedStarAssociationTask.

// Now loop through matches in this catalog
for (PointCat::const_iterator j = pcat.begin(); j != pcat.end(); ++j) {
for (auto j = matchVector.begin(); j != matchVector.end(); ++j) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want a const iterator you should use cbegin and cend. Alternatively, you can use the range-based for loop syntax const auto& j : matchVector (with the bonus that you won't need to keep dereferencing j).


// Sort the matches by their ra position to ensure deterministic behavior downstream.
const auto matchSort = [](fof::Match<Point, 2> *a, fof::Match<Point, 2> *b) {
return a->getAverage()[0] < b->getAverage()[0];
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you only care about the average RA, you might want to add a TODO to support getting the average for one dimension, again assuming that's a worthwhile optimization.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, we don't need to worry about optimization since this will be replaced soon.

return a->getAverage()[0] < b->getAverage()[0];
};
std::vector<fof::Match<Point, 2> *> matchVector(pcat.size());
matchVector.assign(pcat.begin(), pcat.end());
Copy link

@taranu taranu Jan 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you note why you need to copy pcat here? I'm guessing it's because it's a const ref to a container? If it were mutable, presumably you could just std::sort it directly, no?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am copying it because pcat is an instance of a child-class of set and you can't re-sort a set (and I don't want to impose this sorting on the class in general). I copied it into a vector because the requirements for the comparison function for std::sort<vector> are simpler than for a set.

if (mel.atom) {
int nSub = mel.nParams;
DVector mparams = mel.atom->getParams().subVector(0, nSub);
DVector outP = mparams.subVector(20, 30);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really follow what this function is doing, and in particular where these magic numbers of 888, 20 and 30 came from.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, sorry, I forgot to remove some debugging stuff. I added some comments.

@cmsaunders cmsaunders merged commit fbcbe60 into lsst-dev Feb 3, 2023
@cmsaunders cmsaunders deleted the tickets/DM-37737 branch February 3, 2023 19:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants