Move catalog readers to GCRCatalogs #170
Comments
I'm modifying the catalog reader for the DC2 static coadds to return values specified in the Data Products Definition Document. The last value I'm adding before submitting a pull request is the covariance matrix The DPDD specifies
@TallJimbo do you know what method of calculating the covariance |
On a related note, while GCR does support multidimensional numpy arrays, the user won't be able to directly convert GCR's return value to, say, a pandas DataFrame. This may or may not be important, but we should keep it in mind. @djperrefort, also, can you review this PR first and then create your PR on top of it? |
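To illustrate the DataFrame concern above, here is a minimal sketch of one possible workaround: splitting each multidimensional column into scalar columns before building the DataFrame. The helper `flatten_columns`, the column names `ra` and `cov`, and the plain dict standing in for a reader's return value are all hypothetical and not part of GCR or GCRCatalogs.

```python
import numpy as np
import pandas as pd


def flatten_columns(data):
    """Split every multidimensional column into 1-D scalar columns.

    `data` maps column names to arrays whose first axis indexes objects.
    (Hypothetical helper, not part of GCR.)
    """
    flat = {}
    for name, values in data.items():
        values = np.asarray(values)
        if values.ndim <= 1:
            flat[name] = values
            continue
        # One output column per element index, e.g. cov_0_1 for cov[:, 0, 1].
        for index in np.ndindex(*values.shape[1:]):
            suffix = "_".join(str(i) for i in index)
            flat[f"{name}_{suffix}"] = values[(slice(None),) + index]
    return flat


# Stand-in for a reader's return value: three objects, one 2x2 matrix each.
data = {
    "ra": np.array([10.0, 10.1, 10.2]),
    "cov": np.zeros((3, 2, 2)),
}
df = pd.DataFrame(flatten_columns(data))
```

The resulting frame has one row per object and one column per matrix element, at the cost of the user having to reassemble the matrix if they need it as an array.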
I believe the intent of the DPDD is for The DM pipelines currently produce moments for all bands, of course; whether they will continue to depends on how we do deblending across bands (a naive configuration of Scarlet, for instance, would guarantee that the shape for all bands would be identical). I think it's highly likely that we'll also have per-band
Note that the DPDD is very much a conceptual document; the presence of arrays in the tables there should not be taken as an indication that we will use arrays in the actual database schema. So there's no actual gain from using arrays in your interfaces now if they're problematic. |
@TallJimbo This makes sense, thank you. To clarify one last detail, This relates to @yymao's earlier comment on converting to a pandas data frame since there would only be a single matrix for the whole table and not one entry per object. |
I'm not totally sure I understand your question, but I'd say that |
Thank you for clarifying. Do you know the column name for the error in the second moments? I'm using The code is in place to calculate the covariance, it's just a matter of specifying the correct values to use. |
The The two implementations are pretty similar, so you're welcome to use the SDSS one instead if you do care about having uncertainties. But there's really nothing you can do to compute the uncertainties if they're not reported, unless you go back to the pixels - these are not empirical covariances, they're uncertainties propagated from the pixel uncertainties. When they're not provided, it's best to just set them to zero. |
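As a rough sketch of the "set them to zero" advice above: the function below builds one covariance matrix per object from whichever uncertainty columns are present and leaves every other entry at zero. The function name, the column names `ixx_err`, `iyy_err`, `ixy_err`, the treatment of reported values as 1-sigma errors, and the zero off-diagonal terms are all assumptions made for illustration; they are not taken from the DM schema or the DPDD.

```python
import numpy as np


def shape_covariance(columns, error_names, n_objects):
    """Return an (n_objects, k, k) array of diagonal covariance matrices.

    `columns` maps column names to per-object values. Each name in
    `error_names` is assumed to hold a 1-sigma uncertainty; names that
    are missing from `columns` contribute zeros.
    """
    k = len(error_names)
    cov = np.zeros((n_objects, k, k))
    for i, name in enumerate(error_names):
        if name in columns:
            # Variance on the diagonal; off-diagonal terms stay zero.
            cov[:, i, i] = np.asarray(columns[name], dtype=float) ** 2
    return cov


# Hypothetical catalog where only one of three uncertainties is reported.
columns = {"ixx_err": [0.1, 0.2]}
cov = shape_covariance(columns, ["ixx_err", "iyy_err", "ixy_err"], n_objects=2)
```

Zeros here mean "not reported" rather than "perfectly measured", so downstream users would need to be told which case applies.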
In case you're wondering why there are "slot" variables, it's because they support the python API `src.getShape()` without knowing which algorithm was used (ditto for e.g. centroids). |
Earlier in #157 I created a GCR reader in this repo for @wmwv's merged catalogs as a demonstration; however, now that we plan to validate the merged catalogs in DESCQA and also advertise them to the Collaboration, we should move the reader to https://github.com/LSSTDESC/gcr-catalogs as GCRCatalogs is a python package that is installed in the DESC shared environment and also directly used by DESCQA. I believe @djperrefort has been working on this.
Similarly, we should move @slosar's #169 to https://github.com/LSSTDESC/gcr-catalogs as well.
(cc @fjaviersanchez as this issue is related to #168)