Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM catalog validation #124

Merged
merged 13 commits into from
Aug 22, 2018
Merged

DM catalog validation #124

merged 13 commits into from
Aug 22, 2018

Conversation

fjaviersanchez
Copy link
Contributor

With this PR I am creating a test that matches two catalogs and plots the difference in magnitude and position. @yymao I tried to follow your advice and set up the tests in conclude_test. Right now, I am reading the magnitudes from mag_true_$BAND_lsst but I think it would be useful to use something like mag_$BAND as an alias for the magnitudes that we want to compare.

I am using the simple KDTree nearest neighbor matcher (spatial_closest) from https://github.com/LSSTDESC/CatalogMatcher for speed purposes but I am open to use any other matching technique.

For DM catalogs we also need to set up the selection in the config file (and this selection cuts should be transparent somehow to the truth catalogs). Right now, I have as a placeholder ra < 500.

So, as to-do list:

Generalize magnitude reading.
Generalize selection.

@fjaviersanchez
Copy link
Contributor Author

An example of the plots that this test generates (using proto-dc2_v2.0_test and proto-dc2_v3.0_test)

screen shot 2018-06-29 at 12 41 06 pm

screen shot 2018-06-29 at 12 40 47 pm

(Since the galaxies in these catalogs are unrelated we can expect that we do not find any correlation between positions nor magnitudes).

@fjaviersanchez
Copy link
Contributor Author

Success! Thanks @yymao:

screen shot 2018-07-27 at 11 13 10 am

screen shot 2018-07-27 at 11 12 59 am

@yymao yymao self-requested a review July 27, 2018 15:42
@fjaviersanchez
Copy link
Contributor Author

I was comparing with PSF mag, now comparing with cModel mag :)

screen shot 2018-07-27 at 12 23 41 pm

@fjaviersanchez
Copy link
Contributor Author

@rmandelb
Copy link

rmandelb commented Jul 29, 2018

@fjaviersanchez - do you understand why your red points in the cmodel mag plot are offset by ~0.1-0.2 magnitudes from the bright part of the 2D histogram (which appears to be centered at Delta mag~0)?

Could there be a few outliers that are throwing the mean off, in which case the median (or a mean determined with outlier rejection) would be more stable?

@fjaviersanchez
Copy link
Contributor Author

@rmandelb I think those are bad matches. I am only using nearest-neighbor matching (because it's the fastest) and I think that some sources are being wrongly associated to a fainter source creating pretty big outliers. I am including the median here:

image

I think that I'll include both mean and median (the more information the better I guess).

Copy link
Member

@yymao yymao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fjaviersanchez sorry for being late on reviewing this. I directly pushed a few changes to make the PR works with the CI test. And made a few comments for code review.

try:
assert(len(self.ra.keys())==2) # Making sure that we have *just* two catalogs
except AssertionError:
print('The test can compare two catalogs only!')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

raise an error if the test shouldn't work

        if len(self.ra) != 2: # Making sure that we have *just* two catalogs
            raise ValueError('The test can compare two catalogs only!')

except AssertionError:
print('The test can compare two catalogs only!')

cat_names = list(self.ra.keys()) # This is an auxiliary list to easily get the catalogs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary .keys()

cat_names = list(self.ra)

cat_names = list(self.ra.keys()) # This is an auxiliary list to easily get the catalogs
print(cat_names)
cat_len = [len(self.ra[cat_names[0]]),len(self.ra[cat_names[1]])]
cat_names = np.array(cat_names)[np.argsort(cat_len)].tolist() # We find the catalog with less objects to build the tree
Copy link
Member

@yymao yymao Aug 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lines 104-105, simplify the logic

cat_names = sorted(cat_names, key=lambda name: len(self.ra[name]))

-1, 1, self.nbins, '%s' % band, r'$\Delta %s$' % band, photo_savename, bin_stat=True)
n_true, _ = np.histogram(self.magnitude[(cat_names[1], band)], bins=50, range=(10, 30))
n_meas, bin_edges = np.histogram(self.magnitude[(cat_names[0], band)], bins=50, range=(10, 30))
plt.figure()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Store the figure object so that we can close it later.

fig = plt.figure()

plt.ylim(0,1)
plt.tight_layout()
photo_savename = os.path.join(output_dir, 'mag_ratio_%s_%s_%s.png' % (cat_names[0], cat_names[1], band))
plt.savefig(photo_savename)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add after Line 139

plt.close(fig)

so that matplotlib will not waste memory on open figures.

n_true, _ = np.histogram(self.magnitude[(cat_names[1], band)], bins=50, range=(10, 30))
n_meas, bin_edges = np.histogram(self.magnitude[(cat_names[0], band)], bins=50, range=(10, 30))
plt.figure()
plt.plot(0.5*(bin_edges[1:]+bin_edges[:-1]), 1.0*n_meas/n_true,'o')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of 1.0*n_meas/n_true, do n_meas.astype(float) / n_true to be explicit what the code does.

rect_histx = [left, bottom_h, width, 0.2]
rect_histy = [left_h, bottom, 0.2, height]

plt.figure()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

store the figure object so that we can close it later

fig = plt.figure()

axHisty.autoscale(tight=True)
axHistx.set_xlim(axScatter.get_xlim())
axHisty.set_ylim(axScatter.get_ylim())
plt.savefig(savename)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add after Line 90

plt.close(fig)

@yymao yymao merged commit fdc39f1 into LSSTDESC:master Aug 22, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants