Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-30024: Implement matching in DRP DiaAssociation task #526

Merged
merged 4 commits into from Jun 9, 2021

Conversation

morriscb
Copy link
Contributor

@morriscb morriscb commented Jun 2, 2021

No description provided.

@morriscb morriscb force-pushed the tickets/DM-30024 branch 2 times, most recently from 47a81e5 to fa60904 Compare June 2, 2021 21:39
Copy link
Contributor

@yalsayyad yalsayyad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And squash the "fixup comments" commit.

@@ -0,0 +1,79 @@
import healpy as hp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This set functions was copied over from the LSSTDESC/dia_pipe directory. Didn't want worry about getting this up to spec for now. I can make a ticket to document it up to standards.



def eq2vec(ra, dec):
"""Convert equatorial ra,dec in degrees to x,y,z on the unit sphere parameters"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't have been able to guess just by the name that this is the vectorized version of eq2xyz. Maybe eq2xyzVec or eqVec2xyz

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above.

@@ -85,17 +90,33 @@ class DrpAssociationPipeTask(pipeBase.PipelineTask):
"""
ConfigClass = DrpAssociationPipeConfig
_DefaultName = "drpAssociation"
RunnerClass = pipeBase.ButlerInitializedTaskRunner
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a Gen3-only task, right? RunnerClasses are only needed for Gen2

@@ -161,13 +182,19 @@ def run(self, diaSourceTables, skyMap, tractId, patchId):
# self.associator.addCatalog()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this wasn't added on this changeset, but it's gotta go eventually

----------
diaSrc : `pandas.Series`
Full unassociated DiaSource to create a DiaObject from.
diaSources : `pandas.DataFrame`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the object that's modified in place, right? Would be clearer to me if the parameter descriptions explicitly said who is modified

diaObjCat,
diaObjCoords,
healPixIndices):
"""Create a new DiaObject and append its data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doc string is the same as addNewDiaObject. I'm skeptical.

"nearbyObj2": 0,
"nearbyObj3": 0,
"flags": 0}
for f in ["u", "g", "r", "i", "z", "y"]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be worth making the band list a config parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't need this here actually. This is all needed in the Apdb as these are columns that aren't nullable.

Parameters
----------
diaSources : `pandas.DataFrame`
DiaSources to spatially associate into DiaObjects
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could say a little more about the format of the diaSources it's expecting. Like is it one big concatenation of all the visits? What does it group by and loop over? ccdVisitId?


Represents a simple, brute force algorithm matching DiaSources into
DiaObjects. Algorithm picks the nearest, first match DiaObject to
associate a source to.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could say a little more about how it works. Like I'm assuming it loops over visits and 2-way matches each visit's diaSourceTable to the modified diaObject table.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does nearest first match mean? Does it stop checking once it finds one within some radius? Or check everything and take the nearest?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First match is "the fist match found withing radius". I'll update the comment a bit.


self.hpIndices = [toIndex(simpleAssoc.config.nside,
diaObj["ra"],
diaObj["decl"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you use a mixture of decl and dec as abbreviations?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

decl is the column name in the DPDD and our Apdb schema, hence use of it.

Finish docs and rename to camel case.
Debug SimpleAssociationTask.

Debug add and update unittests.

Debug unittests.

Add run test.

switch variables to camel case.

Debug SimpleAssociationTask variable names.

Task worked with very little proding. I'm good at my job :)

Add full run test.

Debug simple association run test.
Debug pipeline and linting fixes.

Debug pipeline.

Confim outputs on ap_verify ci datasets.

Fixup docs.

Update docs on test_simpleAssociation.py

Add quick comment to associationUtils.py.

Fixup comments.
self.diaObjDecs = np.linspace(45, 46, self.nDiaObjects)
# Copy a coord to get multiple matches.
self.diaObjRas[3] = self.diaObjRas[2] + 0.1 / 3600
self.diaObjDecs[3] = self.diaObjDecs[2] + 0.1 / 3600
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formatting. No spaces around /.

"ra": ra,
"decl": decl}
for idx, (ra, decl) in enumerate(zip(self.diaObjRas,
self.diaObjDecs))]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another example of formatting that I find hard to follow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, since it's linting and passing standards I'm likely just going to keep it this way.

Add drpAssociation to diffImDRP subset.

Fix linting.

Fix white space around division.
@morriscb morriscb merged commit e1a9e10 into master Jun 9, 2021
@morriscb morriscb deleted the tickets/DM-30024 branch June 9, 2021 22:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants