DC2 DM catalog validation #168

Closed
fjaviersanchez opened this issue Jun 6, 2018 · 20 comments

Comments

@fjaviersanchez

Since this repo is already very crowded, I am making a specialized repo dealing with the DC2 DM catalog validation here: https://github.com/LSSTDESC/DC2_DMCatalog_Validation

The idea is to try to organize things there and make a summary in this issue.

@wmwv
Contributor

wmwv commented Jun 7, 2018

What do you see the role of DC2_DMCatalog_Validation vs DC2_Repo/Validation?

@fjaviersanchez
Author

fjaviersanchez commented Jun 7, 2018

I see the DC2_DMCatalog_Validation repo more as an issue/discussion tracker and documentation development repository rather than code development. But I am happy to switch back here if you think it is confusing/unnecessary.

@egawiser

egawiser commented Jun 7, 2018

It strikes me as weird to not have all DC2 repos within "DC2_Repo". But I'm not a GH expert...

@fjaviersanchez
Author

@egawiser That's fine. I was thinking that maybe for discussing the DM catalog validation independently it was better to have a separate repository. However, I don't have any strong opinion about this so, I'll follow your advice and have everything centralized in this repo. I'll switch back here and summarize the conversation with @yymao:

I asked what was missing in the infrastructure so we can run the catalogs prepared by @wmwv in DESCQA.

@yymao said:

"There are (at least) two things to do to run this in DESCQA:

1. We need to move the reader into GCRCatalogs. (At this moment the reader for the merged catalog is sitting alone in DC2_Repo.)

2. We need to map more quantities to the GCRCatalogs schema. Right now I only implemented a minimal set."

There is nobody assigned to this now but he's happy to do it once he has some free time.

If there's somebody else interested in doing this task, I think it's a good way to get started in DESCQA and it is a self-contained task.

Also, apologies for the back and forth/confusion between repositories.

@wmwv
Contributor

wmwv commented Jun 7, 2018

I nominate @djperrefort

@djperrefort

I can work on this. I’ll start taking a look over this weekend.

@fjaviersanchez
Author

Here I put a document trying to summarize the catalog validation process and the to-do's.

Please, feel free to make any additions/comments.

@djperrefort Have you been able to start this? Do you need any help?

@djperrefort

@fjaviersanchez I've forked the GCRCatalogs repo and copied over the coadd reader. A few minor exceptions were raised when iteratively parsing the coadd data, but I've updated the reader to address this.

Using input from @wmwv I'm currently mapping more quantities to the GCRCatalogs schema. NERSC is performing maintenance today, so I'll work more on the mapping tomorrow.

@fjaviersanchez
Author

@djperrefort That sounds great! Thanks a lot for the update and the work!

@fjaviersanchez
Author

fjaviersanchez commented Jun 20, 2018

I am putting an updated list of tasks to validate the DM output catalogs, along with who is currently working on or interested in them. I am also adding some tasks for which we are looking for help. If you are interested in participating in any of these tasks, please feel free to contact the participants and write down your name in the table below.

| Task | Participants (or potentially interested) | Progress tracking | Looking for participants? | If looking for participants, relevant skills? |
| --- | --- | --- | --- | --- |
| Ingestion of DM outputs into DESCQA | @wmwv @yymao @djperrefort | #166 #157 | ? | Experience with DESCQA and DM |
| Mapping DM output columns to GCR schema | @wmwv @yymao @djperrefort | ? | ? | Experience with DESCQA and DM |
| Ingestion of centroid files from imSim and PhoSim into DESCQA | | | yes | Great project if you want to get started in GCRCatalogs |
| Review suitability/validation criteria of available tests at DESCQA for the DM outputs | (@rmjarvis) | | yes | Experience with data and/or range of acceptability of different key measured quantities |
| Add new tests into DESCQA | (@rmjarvis) | | yes | Test development can be a good project for newcomers |
| Adapt existing QA codes to DESCQA | @fjaviersanchez | here | yes | Great project if you want to get started in DESCQA |
| Perform simple end-to-end analysis | @EiffL @fjaviersanchez | | yes | If you already have some analysis code from HSC or DC1, feel free to apply it to DC2! You can also contact us if you want to participate |

Please feel free to add more tasks to the list!

@djperrefort

Here is a table of the values I am currently mapping from the DM output to the GCR schema. Three values in particular that are still not being mapped are the ra / dec error, the chi squared statistic of the model fit, and the point source covariance matrix. I have been told the first two exist as native values, but haven't found the correct column names. I don't believe the point source covariance matrix is a native value, so it will have to be calculated by the GCR if it is to be included.

Note that any u-band values in the table below have analogous columns in each of the ugrizy bands.

| Homogenized Value | Native Value |
| --- | --- |
| objectId | id |
| parentObjectId | parent |
| ra | coord_ra |
| dec | coord_dec |
| centroidX | slot_Centroid_x |
| centroidY | slot_Centroid_y |
| centroidX_err | slot_Centroid_xSigma |
| centroidY_err | slot_Centroid_ySigma |
| centroid_flag | slot_Centroid_flag |
| psNdata | base_PsfFlux_area |
| extendedness | base_ClassificationExtendedness_value |
| u_magLSST | u_mag |
| u_magLSST_err | u_mag_err |
| u_psFlux | u_slot_ModelFlux_flux |
| u_psFlux_flag | u_slot_ModelFlux_flag |
| u_psFlux_err | u_slot_ModelFlux_fluxSigma |
| u_I_flag | u_slot_Shape_flag |
| u_Ixx | u_slot_Shape_xx |
| u_IxxPSF | u_slot_PsfShape_xx |
| u_Ixy | u_slot_Shape_xy |
| u_IxyPSF | u_slot_PsfShape_xy |
| u_Iyy | u_slot_Shape_yy |
| u_IyyPSF | u_slot_PsfShape_yy |
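
For readers who want to see how a mapping like this expands to all six bands, here is a minimal sketch in plain Python. It is not the GCRCatalogs reader: the names `COMMON`, `PER_BAND`, `build_mapping`, and `translate` are hypothetical, and only a subset of the rows from the table is included to keep it short.

```python
BANDS = "ugrizy"

# Band-independent columns copied from the table (schema name -> native name).
COMMON = {
    "objectId": "id",
    "parentObjectId": "parent",
    "ra": "coord_ra",
    "dec": "coord_dec",
    "extendedness": "base_ClassificationExtendedness_value",
}

# Per-band columns written as templates; "{b}" stands for the band letter.
PER_BAND = {
    "{b}_magLSST": "{b}_mag",
    "{b}_magLSST_err": "{b}_mag_err",
    "{b}_psFlux": "{b}_slot_ModelFlux_flux",
    "{b}_psFlux_err": "{b}_slot_ModelFlux_fluxSigma",
    "{b}_Ixx": "{b}_slot_Shape_xx",
}


def build_mapping(bands=BANDS):
    """Return the schema-name -> native-name dictionary for all bands."""
    mapping = dict(COMMON)
    for b in bands:
        for schema, native in PER_BAND.items():
            mapping[schema.format(b=b)] = native.format(b=b)
    return mapping


def translate(native_row, mapping):
    """Rename the columns of one native record; absent columns are skipped."""
    return {schema: native_row[native]
            for schema, native in mapping.items() if native in native_row}


mapping = build_mapping()
row = {"id": 42, "coord_ra": 0.95, "g_mag": 23.1}  # made-up values
print(translate(row, mapping))
```

Writing the per-band entries as templates keeps the u-band row as the single source of truth, so adding a column means editing one line rather than six.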

@fjaviersanchez, if preferred I can create a dedicated issue for this in the forked repo so that it can be tracked in your table above.

@fjaviersanchez
Author

This looks good! Thanks a lot @djperrefort! I think that including some link would be great whenever you can. Please, feel free to either use a new issue in the forked repo, a link to the forked repo itself, or to an open PR in GCRCatalogs if you have already one. Whatever is easier/more comfortable for you.

@fjaviersanchez
Author

fjaviersanchez commented Jul 31, 2018

Below I am putting an example of some calexps that I checked. Overall, the results look good. I am suspecting that the reference catalog is not great to analyze y-band though.

Some details about the matching: I am comparing the calexp results with the reference catalog for 1.2p, using nearest-neighbor matching and restricting the matches to be within a 5-pixel radius (1 arcsecond) and within 1 magnitude (I did the analysis in individual bands). There's a slight disagreement between the reference and PhoSim's outputs (I believe this is still due to the extinction issue?). The images look reasonable, including some nice features. I'll take a look at the coadds whenever they become available.
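
As an illustration of the matching described above, here is a rough sketch in plain Python. It is not the code used for the attached slides: the function names and the tuple layout are hypothetical, the separation uses a flat-sky small-angle approximation, and the brute-force loop is only meant for small inputs (a k-d tree would be the usual choice at catalog scale).

```python
import math

ARCSEC_PER_DEG = 3600.0


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Small-angle separation in arcseconds; inputs in degrees.

    Flat-sky approximation; RA wrap-around at 0/360 is not handled.
    """
    dra = (ra1 - ra2) * math.cos(math.radians(0.5 * (dec1 + dec2)))
    ddec = dec1 - dec2
    return math.hypot(dra, ddec) * ARCSEC_PER_DEG


def match_to_reference(measured, reference, max_sep=1.0, max_dmag=1.0):
    """Match each measured source to its nearest reference object.

    Both inputs are lists of (ra_deg, dec_deg, mag) tuples in one band.
    A match is kept only if it lies within `max_sep` arcseconds and
    `max_dmag` magnitudes. Returns (measured_index, reference_index) pairs.
    """
    pairs = []
    for i, (ra, dec, mag) in enumerate(measured):
        best_j, best_sep = None, float("inf")
        for j, (rra, rdec, _) in enumerate(reference):
            sep = separation_arcsec(ra, dec, rra, rdec)
            if sep < best_sep:
                best_j, best_sep = j, sep
        if best_j is None or best_sep > max_sep:
            continue  # no reference object inside the search radius
        if abs(mag - reference[best_j][2]) > max_dmag:
            continue  # spatial match, but magnitudes disagree too much
        pairs.append((i, best_j))
    return pairs


# Made-up positions and magnitudes, for illustration only.
reference = [(55.0, -30.0, 20.0), (55.01, -30.0, 22.0)]
measured = [(55.0001, -30.0, 20.1), (55.01, -30.0001, 24.5), (55.5, -30.0, 21.0)]
print(match_to_reference(measured, reference))
```

Note that this sketch applies the magnitude cut after choosing the spatially nearest neighbor, which is one possible reading of the description; applying the cut before the neighbor search would give different matches in crowded fields.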

phosim_1p2_validation.pdf

@johannct
Contributor

Hi @fjaviersanchez, this is great, thanks a lot! @boutigny: it might be worth comparing your validation. @fjaviersanchez: there is one coadd ready in the same directory structure, with the caveat that the pixel size is wrong and that the flats were not available.

@boutigny

Thanks @fjaviersanchez for this validation work. I am very surprised that you get fields that do not match the reference catalog (or perhaps I don't understand the comments associated with the figures). There should be some references, otherwise the astrometry would have failed and the calexp / src would not have been produced. The reference catalog should also be almost perfect (apart from a slight smearing in RA, DEC and in flux) for all the bands, as it is derived from the catalog used for the image simulations.
Could you please provide more details on what you are plotting?

@fjaviersanchez
Author

fjaviersanchez commented Jul 31, 2018

In the left panels I am just taking a look at the images, and I show the centroids for objects in the reference catalog (marked with X) and the centroids measured by DM (marked with +) for PhoSim images (the legend is wrong, sorry). I color code the markers using the input magnitude for the Xs and the measured magnitude for the +s. Then, in the top right panel, I plot magPSF - mag_input. I only select objects that have base_ClassificationExtendedness_value==0 and have been matched to a star (so there are no issues about aperture choice and the like when comparing magnitudes). The matching has been done as I described above. I compared the smeared magnitudes to the measured magnitudes (so the dispersion that we see is just due to the measurement process and not a combination of the smearing + measurement). I only select objects with magnitude < 26 to perform the comparison (so I don't include objects that are too faint). The middle panel is just the projection onto the y-axis of this 2D histogram, and the bottom histogram is just the ratio of the number of "measured stars" (i.e., extendedness==0) to input stars as a function of true magnitude.
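
To make the two quantities in the right-hand panels concrete, here is a small sketch in plain Python. It is not the analysis code behind the slides: the dictionary keys, function names, and bin edges are hypothetical, and it assumes the magnitude < 26 cut is applied to the input magnitude.

```python
def psf_mag_residuals(matches, mag_limit=26.0):
    """Return magPSF - mag_input for matched point sources.

    `matches` is a list of dicts with the hypothetical keys 'mag_psf',
    'mag_input', 'extendedness', and 'is_star' (True when the matched
    reference object is a star).
    """
    return [m["mag_psf"] - m["mag_input"]
            for m in matches
            if m["extendedness"] == 0
            and m["is_star"]
            and m["mag_input"] < mag_limit]


def star_ratio(measured_star_mags, input_star_mags, edges):
    """Ratio of measured to input star counts per magnitude bin.

    Both lists hold true (input) magnitudes; the first contains only the
    stars recovered with extendedness == 0. Bins are half-open
    [edges[k], edges[k+1]); bins with no input stars yield None.
    """
    ratios = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        n_meas = sum(lo <= m < hi for m in measured_star_mags)
        n_in = sum(lo <= m < hi for m in input_star_mags)
        ratios.append(n_meas / n_in if n_in else None)
    return ratios


# Made-up numbers, for illustration only.
matches = [
    {"mag_psf": 20.5, "mag_input": 20.0, "extendedness": 0, "is_star": True},
    {"mag_psf": 21.25, "mag_input": 21.0, "extendedness": 1, "is_star": True},
    {"mag_psf": 26.5, "mag_input": 26.25, "extendedness": 0, "is_star": True},
]
print(psf_mag_residuals(matches))
print(star_ratio([20.5, 21.5], [20.2, 20.7, 21.5, 22.5], [20, 21, 22, 23]))
```

Returning `None` for empty bins avoids a division by zero and makes it obvious in a plot where the input catalog simply has no stars.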

I think that in y-band, for some reason, I am getting things that are a close match spatially but they aren't a good match magnitude-wise. I suspect it might be bright shredded sources but I'll look into it. Also, the reference catalog doesn't include sprinkled objects, and I think that these might also be interfering in the matching (so they are matched to a star nearby but they aren't actually stars).

@boutigny Please, let me know if this makes sense to you or you need any additional details.

@boutigny

@fjaviersanchez I suggest restricting the comparison to mag < 22-23, since these are single exposures (not coadds). If you still see some strange things in the y band, it would be worth investigating further, as there may be a problem in the astrometry.

@fjaviersanchez
Author

fjaviersanchez commented Aug 1, 2018

@boutigny thanks for the suggestion. It looks like I had a bug and I was comparing y-band magnitudes to z-band magnitudes. That's now fixed and the results look reasonable (see the last slide in the document below).

phosim_1p2_validation.pdf

@fjaviersanchez
Author

@jchiang87 Below I am adding the latest updates about the imSim 1.2i data validation. I still have to cross-check with PhoSim 1.2p (which will serve as validation of both PhoSim and imSim).
imSim_validation.pdf

@wmwv
Contributor

wmwv commented Jan 22, 2021

Work completed. Focus has moved on to Run 2.2i.

@wmwv wmwv closed this as completed Jan 22, 2021