DM-43238: Refactor calcDistances.py for a 50-100x speed increase in AnalyzeMatchedVisitCore #216

Merged
merged 5 commits into from Mar 11, 2024

@erykoff (Contributor) commented Mar 8, 2024

This refactors the loops in calcDistances to be vectorized, which gives roughly a 10x speed increase. In addition, it adds downsampling, which gives another 5-10x speed increase in the densest fields.

One algorithmic change from the old version: sepResiduals is now computed as separations - mean(separations) rather than separations - median(separations), because the mean is much easier to vectorize.
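To illustrate why the mean is easier to vectorize than the median, here is a minimal sketch; it is not the code from this PR. The array names separations and matchedPairInd follow the diff, while the toy values and the helper mean_residuals are assumptions made for illustration. A per-group mean needs only two accumulations (a sum and a count), whereas a per-group median needs a sort or partition inside every group.

```python
import numpy as np


def mean_residuals(separations, matchedPairInd, nGroups):
    """Return separations minus the mean separation of each group.

    Hypothetical helper for illustration; not the function in this PR.
    """
    sepSum = np.zeros(nGroups)
    nSep = np.zeros(nGroups, dtype=np.int64)
    # Unbuffered accumulation: repeated indices are all counted.
    np.add.at(sepSum, matchedPairInd, separations)
    np.add.at(nSep, matchedPairInd, 1)
    # Empty groups get NaN instead of triggering a divide-by-zero warning.
    sepMean = np.full(nGroups, np.nan)
    hasData = nSep > 0
    sepMean[hasData] = sepSum[hasData] / nSep[hasData]
    return separations - sepMean[matchedPairInd]


# Toy data: group 0 has three separations, group 1 has two.
separations = np.array([1.0, 2.0, 6.0, 10.0, 20.0])
matchedPairInd = np.array([0, 0, 0, 1, 1])

vectorized = mean_residuals(separations, matchedPairInd, nGroups=2)

# Explicit per-group loop computing the same quantity, for comparison.
looped = np.concatenate([
    separations[matchedPairInd == g] - np.mean(separations[matchedPairInd == g])
    for g in range(2)
])
```

In this toy data, group 0 has a mean of 3.0 but a median of 2.0, so for skewed groups the residuals shift relative to the old median-based definition.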

There is also a small fix for empty tracts.

else:
    raRotated = np.array(data[self.raKey])

np.add.at(meanRa, groupId, raRotated)
Contributor

This is cool, and new to me!

Contributor Author

The at methods of NumPy universal functions (ufunc.at) are my favorite thing, and they were even made faster in the version that will be in the next rubin-env.
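For readers who, like the reviewer, have not met ufunc.at before, the sketch below shows what it buys over plain fancy-index addition. The names groupId, raRotated and meanRa follow the diff; the toy values and the nObs counter are assumptions for illustration, not taken from this PR.

```python
import numpy as np

# Toy inputs; names follow the diff, values are made up for illustration.
groupId = np.array([0, 0, 1, 1, 1])
raRotated = np.array([10.0, 12.0, 20.0, 22.0, 24.0])

# Buffered fancy-index addition: each repeated index is written only once,
# so the group totals are wrong whenever groupId contains duplicates.
buffered = np.zeros(2)
buffered[groupId] += raRotated

# Unbuffered ufunc.at: every element is accumulated into its group.
meanRa = np.zeros(2)
nObs = np.zeros(2, dtype=np.int64)
np.add.at(meanRa, groupId, raRotated)
np.add.at(nObs, groupId, 1)
meanRa /= nObs  # per-group sum divided by per-group count
```

Here meanRa ends up holding the per-group means, while buffered holds only one value per group because the repeated indices overwrite each other.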

bad2 = nSep <= 2
sepMean[bad2] = np.nan
sepResiduals = separations - sepMean[matchedPairInd]
sepResiduals = sepResiduals[np.isfinite(sepResiduals)]

if len(rmsDistances) == 0:
Contributor

I don't think you need this anymore, since you already returned if good.sum() == 0
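The hunk under review drops residuals from groups with two or fewer separations by setting their mean to NaN and then filtering on np.isfinite. A minimal sketch of that pattern follows; the last four statements mirror the diff, while the toy data and the way nSep and sepMean are built (with np.bincount) are assumptions for illustration.

```python
import numpy as np

# Toy data: group 0 has four separations, group 1 has only two.
separations = np.array([1.0, 3.0, 5.0, 9.0, 4.0, 6.0])
matchedPairInd = np.array([0, 0, 0, 0, 1, 1])
nGroups = 2

# Per-group counts and means (assumed construction, not from the PR).
nSep = np.bincount(matchedPairInd, minlength=nGroups)
sepSum = np.bincount(matchedPairInd, weights=separations, minlength=nGroups)
sepMean = sepSum / np.maximum(nSep, 1)

# Groups with too few separations get a NaN mean, which propagates
# through the subtraction and is removed by the isfinite filter.
bad2 = nSep <= 2
sepMean[bad2] = np.nan
sepResiduals = separations - sepMean[matchedPairInd]
sepResiduals = sepResiduals[np.isfinite(sepResiduals)]
```

Because the NaN propagates through the subtraction, no explicit loop over groups is needed to exclude the small ones. In this sketch the result can be empty if every group is small, so downstream code should tolerate a zero-length array.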

@erykoff erykoff merged commit e86d30d into main Mar 11, 2024
8 checks passed
@erykoff erykoff deleted the tickets/DM-43238 branch March 11, 2024 22:49