New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-30024: Implement matching in DRP DiaAssociation task #526
Conversation
47a81e5
to
fa60904
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And squash the "fixup comments" commit.
@@ -0,0 +1,79 @@ | |||
import healpy as hp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://developer.lsst.io/python/style.html?highlight=order#standard-code-order-should-be-followed
- Add license statement
- module-docstring
- imports
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This set functions was copied over from the LSSTDESC/dia_pipe directory. Didn't want worry about getting this up to spec for now. I can make a ticket to document it up to standards.
|
||
|
||
def eq2vec(ra, dec): | ||
"""Convert equatorial ra,dec in degrees to x,y,z on the unit sphere parameters""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't have been able to guess just by the name that this is the vectorized version of eq2xyz
. Maybe eq2xyzVec
or eqVec2xyz
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
@@ -85,17 +90,33 @@ class DrpAssociationPipeTask(pipeBase.PipelineTask): | |||
""" | |||
ConfigClass = DrpAssociationPipeConfig | |||
_DefaultName = "drpAssociation" | |||
RunnerClass = pipeBase.ButlerInitializedTaskRunner |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a Gen3-only task, right? RunnerClasses are only needed for Gen2
@@ -161,13 +182,19 @@ def run(self, diaSourceTables, skyMap, tractId, patchId): | |||
# self.associator.addCatalog() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this wasn't added on this changeset, but it's gotta go eventually
---------- | ||
diaSrc : `pandas.Series` | ||
Full unassociated DiaSource to create a DiaObject from. | ||
diaSources : `pandas.DataFrame` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the object that's modified in place, right? Would be clearer to me if the parameter descriptions explicitly said who is modified
diaObjCat, | ||
diaObjCoords, | ||
healPixIndices): | ||
"""Create a new DiaObject and append its data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doc string is the same as addNewDiaObject
. I'm skeptical.
"nearbyObj2": 0, | ||
"nearbyObj3": 0, | ||
"flags": 0} | ||
for f in ["u", "g", "r", "i", "z", "y"]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be worth making the band list a config parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't need this here actually. This is all needed in the Apdb as these are columns that aren't nullable.
Parameters | ||
---------- | ||
diaSources : `pandas.DataFrame` | ||
DiaSources to spatially associate into DiaObjects |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could say a little more about the format of the diaSources
it's expecting. Like is it one big concatenation of all the visits? What does it group by and loop over? ccdVisitId?
|
||
Represents a simple, brute force algorithm matching DiaSources into | ||
DiaObjects. Algorithm picks the nearest, first match DiaObject to | ||
associate a source to. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could say a little more about how it works. Like I'm assuming it loops over visits and 2-way matches each visit's diaSourceTable to the modified diaObject table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does nearest first match mean? Does it stop checking once it finds one within some radius? Or check everything and take the nearest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First match is "the fist match found withing radius". I'll update the comment a bit.
|
||
self.hpIndices = [toIndex(simpleAssoc.config.nside, | ||
diaObj["ra"], | ||
diaObj["decl"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you use a mixture of decl and dec as abbreviations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
decl is the column name in the DPDD and our Apdb schema, hence use of it.
Finish docs and rename to camel case.
Debug SimpleAssociationTask. Debug add and update unittests. Debug unittests. Add run test. switch variables to camel case. Debug SimpleAssociationTask variable names. Task worked with very little proding. I'm good at my job :) Add full run test. Debug simple association run test.
Debug pipeline and linting fixes. Debug pipeline. Confim outputs on ap_verify ci datasets. Fixup docs. Update docs on test_simpleAssociation.py Add quick comment to associationUtils.py. Fixup comments.
3a28745
to
3106f82
Compare
tests/test_simpleAssociation.py
Outdated
self.diaObjDecs = np.linspace(45, 46, self.nDiaObjects) | ||
# Copy a coord to get multiple matches. | ||
self.diaObjRas[3] = self.diaObjRas[2] + 0.1 / 3600 | ||
self.diaObjDecs[3] = self.diaObjDecs[2] + 0.1 / 3600 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Formatting. No spaces around /.
"ra": ra, | ||
"decl": decl} | ||
for idx, (ra, decl) in enumerate(zip(self.diaObjRas, | ||
self.diaObjDecs))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is another example of formatting that I find hard to follow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, since it's linting and passing standards I'm likely just going to keep it this way.
Add drpAssociation to diffImDRP subset. Fix linting. Fix white space around division.
ee62333
to
26f8951
Compare
No description provided.