# Table of Contents
 <p><div class="lev1 toc-item"><a href="#Introduction" data-toc-modified-id="Introduction-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Introduction</a></div><div class="lev1 toc-item"><a href="#Results" data-toc-modified-id="Results-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Results</a></div><div class="lev1 toc-item"><a href="#Imports" data-toc-modified-id="Imports-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Imports</a></div><div class="lev1 toc-item"><a href="#Load-the-data" data-toc-modified-id="Load-the-data-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Load the data</a></div><div class="lev1 toc-item"><a href="#Check-if-a-flag-is-true-for-all-the-objects" data-toc-modified-id="Check-if-a-flag-is-true-for-all-the-objects-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Check if a flag is true for all the objects</a></div><div class="lev1 toc-item"><a href="#Check-if-a-flag-is-False-for-all-the-objects" data-toc-modified-id="Check-if-a-flag-is-False-for-all-the-objects-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Check if a flag is False for all the objects</a></div><div class="lev1 toc-item"><a href="#Compare-good-vs-bad-using-false-flags" data-toc-modified-id="Compare-good-vs-bad-using-false-flags-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>Compare good vs bad using false flags</a></div><div class="lev1 toc-item"><a href="#Look-at-an-object-and-determine-good-or-bad" data-toc-modified-id="Look-at-an-object-and-determine-good-or-bad-8"><span class="toc-item-num">8&nbsp;&nbsp;</span>Look at an object and determine good or bad</a></div>

# Introduction
Date: Oct 31, 2019
```
1. Jedisim output:  lsst_z1.5_000.fits (z=1.5 ngals = 10k)
2. DMSTACK output: src_lsst_z1.5_000.csv (90 flags and 76 params, lots of nans, index90 is id)
3. Clean nans and filtering: src_lsst_z1.5_000.txt (only few columns id, x,y, xerr,yerr,e1,e2,ellip,flux,radius)
4. Combine m,m9,l,l9 to get LC catalog final.cat using mergecats
5. Combine 100 final catalogs to get final_text.txt


The text file final_text.txt has columns gm and gc for monochromatic ellipticity and chromatic ellipticity.

When I plotted the density of gm squared (gm_sq), I saw a bump for gm_sq values between 0.7 and 1.0.

```
![](images/gm_sq_kde.png)

NOTES:
```
- final_text.txt has only 42 columns
- flux gmsq and gcsq are created later.


all objects = 183,832 (from final_text.txt) 
bad objects = 18,386  (0.7 < gm_sq < 1.0)
bad objects percentage = 10.00%
```
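The counts above come from a simple range selection on gm_sq. A minimal sketch of that selection on toy data, assuming a dataframe with a `gm_sq` column (the toy values are made up and are not taken from final_text.txt):

```python
import pandas as pd

# Toy stand-in for final_text.txt; only a gm_sq column is assumed here.
df = pd.DataFrame(
    {"gm_sq": [0.05, 0.10, 0.30, 0.75, 0.20, 0.90, 0.40, 0.15, 0.60, 0.02]}
)

# Objects inside the bump, using the same open interval as in the notes.
in_bump = (df["gm_sq"] > 0.7) & (df["gm_sq"] < 1.0)

n_all = len(df)
n_bad = int(in_bump.sum())
pct_bad = 100.0 * n_bad / n_all

print(f"all objects = {n_all:,}")
print(f"bad objects = {n_bad:,}")
print(f"bad objects percentage = {pct_bad:.2f}%")
```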

NOTES:
```
Using the file number and object id of the bad-density objects in final_text.txt, I went back to the
original DMSTACK csv files and created two dataframes:

df_good_all.csv # rest
df_bad_all.csv # 0.7 < gm_sq < 1.0
```
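One way to do this kind of split is a keyed merge with an indicator column. The sketch below is only an illustration under assumptions: the column names `file_number` and `id`, and both toy tables, are hypothetical stand-ins and not the actual layout of final_text.txt or the DMSTACK csv files.

```python
import pandas as pd

# Hypothetical stand-in for final_text.txt: one row per object.
final = pd.DataFrame({
    "file_number": [0, 0, 1, 1],
    "id": [10, 11, 10, 12],
    "gm_sq": [0.20, 0.80, 0.95, 0.10],
})

# Hypothetical stand-in for the concatenated DMSTACK catalogs.
src = pd.DataFrame({
    "file_number": [0, 0, 0, 1, 1, 1],
    "id": [10, 11, 12, 10, 11, 12],
    "base_SdssCentroid_flag": [1.0, 0.0, 1.0, 0.0, 1.0, 1.0],
})

# Keys (file number, object id) of the objects inside the bump.
in_bump = (final["gm_sq"] > 0.7) & (final["gm_sq"] < 1.0)
bump_keys = final.loc[in_bump, ["file_number", "id"]]

# A left merge keeps every catalog row; "_merge" says whether its key is in the bump.
merged = src.merge(bump_keys, on=["file_number", "id"], how="left", indicator=True)
is_bad = merged["_merge"] == "both"

df_bad = merged.loc[is_bad].drop(columns="_merge")
df_good = merged.loc[~is_bad].drop(columns="_merge")  # "rest"

print(len(df_good), len(df_bad))
```

With this construction, "rest" includes catalog rows that never appear in the filtered text file at all.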

# Results

Of the 90 flags, 28 are False for every bad object, whereas each of these
28 flags is True for at least one good object.

```
1 base_GaussianCentroid_flag
2 base_GaussianCentroid_flag_resetToPeak
3 base_SdssCentroid_flag
4 base_SdssCentroid_flag_edge
5 base_SdssCentroid_flag_almostNoSecondDerivative
6 base_SdssCentroid_flag_notAtMaximum
7 base_SdssCentroid_flag_resetToPeak
8 base_SdssShape_flag_unweightedBad
9 base_SdssShape_flag_unweighted
10 base_SdssShape_flag_maxIter
11 ext_shapeHSM_HsmPsfMoments_flag
12 ext_shapeHSM_HsmPsfMoments_flag_galsim
13 ext_shapeHSM_HsmSourceMoments_flag
14 ext_shapeHSM_HsmSourceMoments_flag_galsim
15 base_CircularApertureFlux_3_0_flag
16 base_CircularApertureFlux_4_5_flag
17 base_CircularApertureFlux_4_5_flag_sincCoeffsTruncated
18 base_CircularApertureFlux_6_0_flag
19 base_CircularApertureFlux_6_0_flag_sincCoeffsTruncated
20 base_CircularApertureFlux_9_0_flag
21 base_CircularApertureFlux_12_0_flag
22 base_CircularApertureFlux_12_0_flag_apertureTruncated
23 base_CircularApertureFlux_17_0_flag
24 base_CircularApertureFlux_17_0_flag_apertureTruncated
25 base_GaussianFlux_flag
26 base_PsfFlux_flag
27 base_PsfFlux_flag_edge
28 base_ClassificationExtendedness_flag

```

```
if all 28 flags are False:
    object is bad
else:
    object is good
```
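The rule above can be applied per object with a row-wise `any`. A minimal sketch on toy data, assuming one row per object and boolean flag columns; only three of the 28 flag names are used to keep the example small, and the toy values are made up:

```python
import pandas as pd

# Three of the 28 flags, used here only to keep the toy example small.
cols_subset = [
    "base_GaussianCentroid_flag",
    "base_SdssCentroid_flag",
    "base_PsfFlux_flag",
]

# Toy flag table: one row per object.
flags = pd.DataFrame(
    [
        [False, True, True],
        [False, False, False],
        [True, False, False],
    ],
    columns=cols_subset,
)

# An object is predicted bad when none of the selected flags is set.
predicted_bad = ~flags[cols_subset].any(axis=1)
labels = predicted_bad.map({True: "bad", False: "good"})

print(labels.tolist())
```

On the real data, `cols_subset` would be replaced by the full list of 28 flags.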


# Imports

In [67]:
import numpy as np
import pandas as pd
import seaborn as sns
sns.set(color_codes=True)

import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
sns.set(context='notebook', style='whitegrid', rc={'figure.figsize': (12,8)})
plt.style.use('ggplot') # better than sns styles.
matplotlib.rcParams['figure.figsize'] = 12,8

import os
import time

# random state
random_state=100
np.random.seed(random_state)

# Jupyter notebook settings for pandas
#pd.set_option('display.float_format', '{:,.2g}'.format) # numbers sep by comma
from pandas.api.types import CategoricalDtype
np.set_printoptions(precision=3)
pd.set_option('display.max_columns', 100)
pd.set_option('display.max_rows', 100) # None for all the rows
pd.set_option('display.max_colwidth', 200)

import IPython
from IPython.display import display, HTML, Image, Markdown

print([(x.__name__,x.__version__) for x in [np, pd,sns,matplotlib]])

[('numpy', '1.15.4'), ('pandas', '0.24.2'), ('seaborn', '0.9.0'), ('matplotlib', '2.2.4')]


In [68]:
%%javascript
IPython.OutputArea.auto_scroll_threshold = 9999;

<IPython.core.display.Javascript object>

# Load the data

In [69]:
tmp = pd.read_csv('data/df_good_all.csv',nrows=0)

cols = tmp.columns
cols

Index(['Unnamed: 0', '# calib_detected', 'calib_psfCandidate', 'calib_psfUsed',
       'calib_psfReserved', 'flags_negative', 'deblend_deblendedAsPsf',
       'deblend_tooManyPeaks', 'deblend_parentTooBig', 'deblend_masked',
       ...
       'base_PsfFlux_flux', 'base_PsfFlux_fluxSigma', 'base_Variance_value',
       'base_PsfFlux_apCorr', 'base_PsfFlux_apCorrSigma',
       'base_GaussianFlux_apCorr', 'base_GaussianFlux_apCorrSigma',
       'base_ClassificationExtendedness_value', 'footprint', 'ellip'],
      dtype='object', length=169)
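Reading with `nrows=0` parses only the header, so the column names are available without loading any rows; a subset can then be passed to `usecols`. A small self-contained sketch of the pattern, assuming an in-memory CSV with placeholder column names (`idx`, `flag_a`, `flag_b` are not DMSTACK columns):

```python
import io
import pandas as pd

# Toy CSV with placeholder column names.
csv_text = (
    "idx,flag_a,flag_b,flux,ellip\n"
    "0,0.0,1.0,12.5,0.3\n"
    "1,0.0,0.0,8.1,0.1\n"
)

# Header-only read: zero data rows, but all column names.
header = pd.read_csv(io.StringIO(csv_text), nrows=0)
cols = header.columns

# Keep the index column plus the flag columns, then load only those.
cols_wanted = list(cols[:3])
df = pd.read_csv(io.StringIO(csv_text), usecols=cols_wanted, index_col=0)

# Flags stored as 0.0/1.0 convert cleanly to booleans.
df = df.astype(bool)
print(df.shape)
```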

In [70]:
cols_flags = cols[:91]
cols_flags

Index(['Unnamed: 0', '# calib_detected', 'calib_psfCandidate', 'calib_psfUsed',
       'calib_psfReserved', 'flags_negative', 'deblend_deblendedAsPsf',
       'deblend_tooManyPeaks', 'deblend_parentTooBig', 'deblend_masked',
       'deblend_skipped', 'deblend_rampedTemplate', 'deblend_patchedTemplate',
       'deblend_hasStrayFlux', 'base_GaussianCentroid_flag',
       'base_GaussianCentroid_flag_noPeak',
       'base_GaussianCentroid_flag_resetToPeak', 'base_NaiveCentroid_flag',
       'base_NaiveCentroid_flag_noCounts', 'base_NaiveCentroid_flag_edge',
       'base_NaiveCentroid_flag_resetToPeak', 'base_SdssCentroid_flag',
       'base_SdssCentroid_flag_edge',
       'base_SdssCentroid_flag_noSecondDerivative',
       'base_SdssCentroid_flag_almostNoSecondDerivative',
       'base_SdssCentroid_flag_notAtMaximum',
       'base_SdssCentroid_flag_resetToPeak', 'base_SdssShape_flag',
       'base_SdssShape_flag_unweightedBad', 'base_SdssShape_flag_unweighted',
       'base_SdssShape_flag_

In [71]:
df_good_flags = pd.read_csv('data/df_good_all.csv', usecols=cols_flags,index_col=0)

print(df_good_flags.shape)
df_good_flags.tail()

(959917, 90)


Unnamed: 0,# calib_detected,calib_psfCandidate,calib_psfUsed,calib_psfReserved,flags_negative,deblend_deblendedAsPsf,deblend_tooManyPeaks,deblend_parentTooBig,deblend_masked,deblend_skipped,deblend_rampedTemplate,deblend_patchedTemplate,deblend_hasStrayFlux,base_GaussianCentroid_flag,base_GaussianCentroid_flag_noPeak,base_GaussianCentroid_flag_resetToPeak,base_NaiveCentroid_flag,base_NaiveCentroid_flag_noCounts,base_NaiveCentroid_flag_edge,base_NaiveCentroid_flag_resetToPeak,base_SdssCentroid_flag,base_SdssCentroid_flag_edge,base_SdssCentroid_flag_noSecondDerivative,base_SdssCentroid_flag_almostNoSecondDerivative,base_SdssCentroid_flag_notAtMaximum,base_SdssCentroid_flag_resetToPeak,base_SdssShape_flag,base_SdssShape_flag_unweightedBad,base_SdssShape_flag_unweighted,base_SdssShape_flag_shift,base_SdssShape_flag_maxIter,base_SdssShape_flag_psf,ext_shapeHSM_HsmPsfMoments_flag,ext_shapeHSM_HsmPsfMoments_flag_no_pixels,ext_shapeHSM_HsmPsfMoments_flag_not_contained,ext_shapeHSM_HsmPsfMoments_flag_galsim,ext_shapeHSM_HsmShapeRegauss_flag,ext_shapeHSM_HsmShapeRegauss_flag_no_pixels,ext_shapeHSM_HsmShapeRegauss_flag_not_contained,ext_shapeHSM_HsmShapeRegauss_flag_parent_source,ext_shapeHSM_HsmShapeRegauss_flag_galsim,ext_shapeHSM_HsmSourceMoments_flag,ext_shapeHSM_HsmSourceMoments_flag_no_pixels,ext_shapeHSM_HsmSourceMoments_flag_not_contained,ext_shapeHSM_HsmSourceMoments_flag_galsim,base_CircularApertureFlux_3_0_flag,base_CircularApertureFlux_3_0_flag_apertureTruncated,base_CircularApertureFlux_3_0_flag_sincCoeffsTruncated,base_CircularApertureFlux_4_5_flag,base_CircularApertureFlux_4_5_flag_apertureTruncated,base_CircularApertureFlux_4_5_flag_sincCoeffsTruncated,base_CircularApertureFlux_6_0_flag,base_CircularApertureFlux_6_0_flag_apertureTruncated,base_CircularApertureFlux_6_0_flag_sincCoeffsTruncated,base_CircularApertureFlux_9_0_flag,base_CircularApertureFlux_9_0_flag_apertureTruncated,base_CircularApertureFlux_9_0_flag_sincCoeffsTruncated,base_CircularApertureFlux_12_0_flag,base_CircularApertureFlux_12_0_flag_apertureTruncated,base_CircularApertureFlux_17_0_flag,base_CircularApertureFlux_17_0_flag_apertureTruncated,base_CircularApertureFlux_25_0_flag,base_CircularApertureFlux_25_0_flag_apertureTruncated,base_CircularApertureFlux_35_0_flag,base_CircularApertureFlux_35_0_flag_apertureTruncated,base_CircularApertureFlux_50_0_flag,base_CircularApertureFlux_50_0_flag_apertureTruncated,base_CircularApertureFlux_70_0_flag,base_CircularApertureFlux_70_0_flag_apertureTruncated,base_GaussianFlux_flag,base_PixelFlags_flag,base_PixelFlags_flag_offimage,base_PixelFlags_flag_edge,base_PixelFlags_flag_interpolated,base_PixelFlags_flag_saturated,base_PixelFlags_flag_cr,base_PixelFlags_flag_bad,base_PixelFlags_flag_suspect,base_PixelFlags_flag_interpolatedCenter,base_PixelFlags_flag_saturatedCenter,base_PixelFlags_flag_crCenter,base_PixelFlags_flag_suspectCenter,base_PsfFlux_flag,base_PsfFlux_flag_noGoodPixels,base_PsfFlux_flag_edge,base_Variance_flag,base_Variance_flag_emptyFootprint,base_PsfFlux_flag_apCorr,base_GaussianFlux_flag_apCorr,base_ClassificationExtendedness_flag
7942,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7944,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7945,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7946,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7950,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,1.0,0.0,1.0,1.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,1.0,0.0,0.0,0.0,1.0


In [72]:
# count NaNs in the flag columns (0 means there is nothing to remove)
df_good_flags.isnull().sum().sum()

0

In [73]:
# sort by index and convert the 0.0/1.0 flag values to boolean
df_good_flags = df_good_flags.sort_index()
df_good_flags = df_good_flags.astype(bool)
df_good_flags.head()

Unnamed: 0,# calib_detected,calib_psfCandidate,calib_psfUsed,calib_psfReserved,flags_negative,deblend_deblendedAsPsf,deblend_tooManyPeaks,deblend_parentTooBig,deblend_masked,deblend_skipped,deblend_rampedTemplate,deblend_patchedTemplate,deblend_hasStrayFlux,base_GaussianCentroid_flag,base_GaussianCentroid_flag_noPeak,base_GaussianCentroid_flag_resetToPeak,base_NaiveCentroid_flag,base_NaiveCentroid_flag_noCounts,base_NaiveCentroid_flag_edge,base_NaiveCentroid_flag_resetToPeak,base_SdssCentroid_flag,base_SdssCentroid_flag_edge,base_SdssCentroid_flag_noSecondDerivative,base_SdssCentroid_flag_almostNoSecondDerivative,base_SdssCentroid_flag_notAtMaximum,base_SdssCentroid_flag_resetToPeak,base_SdssShape_flag,base_SdssShape_flag_unweightedBad,base_SdssShape_flag_unweighted,base_SdssShape_flag_shift,base_SdssShape_flag_maxIter,base_SdssShape_flag_psf,ext_shapeHSM_HsmPsfMoments_flag,ext_shapeHSM_HsmPsfMoments_flag_no_pixels,ext_shapeHSM_HsmPsfMoments_flag_not_contained,ext_shapeHSM_HsmPsfMoments_flag_galsim,ext_shapeHSM_HsmShapeRegauss_flag,ext_shapeHSM_HsmShapeRegauss_flag_no_pixels,ext_shapeHSM_HsmShapeRegauss_flag_not_contained,ext_shapeHSM_HsmShapeRegauss_flag_parent_source,ext_shapeHSM_HsmShapeRegauss_flag_galsim,ext_shapeHSM_HsmSourceMoments_flag,ext_shapeHSM_HsmSourceMoments_flag_no_pixels,ext_shapeHSM_HsmSourceMoments_flag_not_contained,ext_shapeHSM_HsmSourceMoments_flag_galsim,base_CircularApertureFlux_3_0_flag,base_CircularApertureFlux_3_0_flag_apertureTruncated,base_CircularApertureFlux_3_0_flag_sincCoeffsTruncated,base_CircularApertureFlux_4_5_flag,base_CircularApertureFlux_4_5_flag_apertureTruncated,base_CircularApertureFlux_4_5_flag_sincCoeffsTruncated,base_CircularApertureFlux_6_0_flag,base_CircularApertureFlux_6_0_flag_apertureTruncated,base_CircularApertureFlux_6_0_flag_sincCoeffsTruncated,base_CircularApertureFlux_9_0_flag,base_CircularApertureFlux_9_0_flag_apertureTruncated,base_CircularApertureFlux_9_0_flag_sincCoeffsTruncated,base_CircularApertureFlux_12_0_flag,base_CircularApertureFlux_12_0_flag_apertureTruncated,base_CircularApertureFlux_17_0_flag,base_CircularApertureFlux_17_0_flag_apertureTruncated,base_CircularApertureFlux_25_0_flag,base_CircularApertureFlux_25_0_flag_apertureTruncated,base_CircularApertureFlux_35_0_flag,base_CircularApertureFlux_35_0_flag_apertureTruncated,base_CircularApertureFlux_50_0_flag,base_CircularApertureFlux_50_0_flag_apertureTruncated,base_CircularApertureFlux_70_0_flag,base_CircularApertureFlux_70_0_flag_apertureTruncated,base_GaussianFlux_flag,base_PixelFlags_flag,base_PixelFlags_flag_offimage,base_PixelFlags_flag_edge,base_PixelFlags_flag_interpolated,base_PixelFlags_flag_saturated,base_PixelFlags_flag_cr,base_PixelFlags_flag_bad,base_PixelFlags_flag_suspect,base_PixelFlags_flag_interpolatedCenter,base_PixelFlags_flag_saturatedCenter,base_PixelFlags_flag_crCenter,base_PixelFlags_flag_suspectCenter,base_PsfFlux_flag,base_PsfFlux_flag_noGoodPixels,base_PsfFlux_flag_edge,base_Variance_flag,base_Variance_flag_emptyFootprint,base_PsfFlux_flag_apCorr,base_GaussianFlux_flag_apCorr,base_ClassificationExtendedness_flag
0,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,True,False,False,False,False,True,False,True,False,False,False,True,False,False,False,False,False,False,False,False,True,False,False,False,True,False,False,True,False,False,True,False,False,True,False,True,True,False,True,False,True,True,True,True,True,True,True,True,False,False,False,True,False,False,False,False,False,False,False,False,False,True,False,True,True,False,False,False,True
0,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,True,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,True,False,False,False,True,False,False,True,False,True,True,False,True,True,False,True,True,False,True,True,True,True,True,True,True,True,True,True,False,False,False,True,False,False,False,False,False,False,False,False,False,True,False,True,True,False,False,False,True
0,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,True,False,False,False,False,True,False,False,True,False,False,True,False,False,False,False,False,False,False,False,True,False,False,False,True,False,False,True,False,False,True,False,False,True,False,True,True,False,True,False,True,True,True,True,True,True,True,True,False,False,False,True,False,False,False,False,False,False,False,False,False,True,False,True,True,False,False,False,True
0,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,True,False,False,False,False,True,False,True,False,False,False,True,False,False,False,False,False,False,False,False,True,False,False,False,True,False,False,True,False,False,True,False,False,True,False,True,True,False,True,False,True,True,True,True,True,True,True,True,False,False,False,True,False,False,False,False,False,False,False,False,False,True,False,False,True,False,False,False,True
0,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,True,False,False,False,False,True,False,False,True,False,False,True,False,False,False,False,False,False,False,False,True,False,False,False,True,False,False,True,False,True,True,False,True,True,False,True,True,False,True,True,True,True,True,True,True,True,True,True,False,False,False,True,False,False,False,False,False,False,False,False,False,True,False,True,True,False,False,False,True


In [74]:
# df_good_flags.to_csv('df_good_flags.csv',index=True)

In [75]:
# bad density objects
#

In [76]:
df_bad_flags = pd.read_csv('data/df_bad_all.csv', usecols=cols_flags, index_col=0)
print(df_bad_flags.shape)

df_bad_flags = df_bad_flags.sort_index()
df_bad_flags = df_bad_flags.astype(bool)
# df_bad_flags.to_csv('data/df_bad_flags.csv',index=True)

df_bad_flags.head()

(73544, 90)


Unnamed: 0,# calib_detected,calib_psfCandidate,calib_psfUsed,calib_psfReserved,flags_negative,deblend_deblendedAsPsf,deblend_tooManyPeaks,deblend_parentTooBig,deblend_masked,deblend_skipped,deblend_rampedTemplate,deblend_patchedTemplate,deblend_hasStrayFlux,base_GaussianCentroid_flag,base_GaussianCentroid_flag_noPeak,base_GaussianCentroid_flag_resetToPeak,base_NaiveCentroid_flag,base_NaiveCentroid_flag_noCounts,base_NaiveCentroid_flag_edge,base_NaiveCentroid_flag_resetToPeak,base_SdssCentroid_flag,base_SdssCentroid_flag_edge,base_SdssCentroid_flag_noSecondDerivative,base_SdssCentroid_flag_almostNoSecondDerivative,base_SdssCentroid_flag_notAtMaximum,base_SdssCentroid_flag_resetToPeak,base_SdssShape_flag,base_SdssShape_flag_unweightedBad,base_SdssShape_flag_unweighted,base_SdssShape_flag_shift,base_SdssShape_flag_maxIter,base_SdssShape_flag_psf,ext_shapeHSM_HsmPsfMoments_flag,ext_shapeHSM_HsmPsfMoments_flag_no_pixels,ext_shapeHSM_HsmPsfMoments_flag_not_contained,ext_shapeHSM_HsmPsfMoments_flag_galsim,ext_shapeHSM_HsmShapeRegauss_flag,ext_shapeHSM_HsmShapeRegauss_flag_no_pixels,ext_shapeHSM_HsmShapeRegauss_flag_not_contained,ext_shapeHSM_HsmShapeRegauss_flag_parent_source,ext_shapeHSM_HsmShapeRegauss_flag_galsim,ext_shapeHSM_HsmSourceMoments_flag,ext_shapeHSM_HsmSourceMoments_flag_no_pixels,ext_shapeHSM_HsmSourceMoments_flag_not_contained,ext_shapeHSM_HsmSourceMoments_flag_galsim,base_CircularApertureFlux_3_0_flag,base_CircularApertureFlux_3_0_flag_apertureTruncated,base_CircularApertureFlux_3_0_flag_sincCoeffsTruncated,base_CircularApertureFlux_4_5_flag,base_CircularApertureFlux_4_5_flag_apertureTruncated,base_CircularApertureFlux_4_5_flag_sincCoeffsTruncated,base_CircularApertureFlux_6_0_flag,base_CircularApertureFlux_6_0_flag_apertureTruncated,base_CircularApertureFlux_6_0_flag_sincCoeffsTruncated,base_CircularApertureFlux_9_0_flag,base_CircularApertureFlux_9_0_flag_apertureTruncated,base_CircularApertureFlux_9_0_flag_sincCoeffsTruncated,base_CircularApertureFlux_12_0_flag,base_CircularApertureFlux_12_0_flag_apertureTruncated,base_CircularApertureFlux_17_0_flag,base_CircularApertureFlux_17_0_flag_apertureTruncated,base_CircularApertureFlux_25_0_flag,base_CircularApertureFlux_25_0_flag_apertureTruncated,base_CircularApertureFlux_35_0_flag,base_CircularApertureFlux_35_0_flag_apertureTruncated,base_CircularApertureFlux_50_0_flag,base_CircularApertureFlux_50_0_flag_apertureTruncated,base_CircularApertureFlux_70_0_flag,base_CircularApertureFlux_70_0_flag_apertureTruncated,base_GaussianFlux_flag,base_PixelFlags_flag,base_PixelFlags_flag_offimage,base_PixelFlags_flag_edge,base_PixelFlags_flag_interpolated,base_PixelFlags_flag_saturated,base_PixelFlags_flag_cr,base_PixelFlags_flag_bad,base_PixelFlags_flag_suspect,base_PixelFlags_flag_interpolatedCenter,base_PixelFlags_flag_saturatedCenter,base_PixelFlags_flag_crCenter,base_PixelFlags_flag_suspectCenter,base_PsfFlux_flag,base_PsfFlux_flag_noGoodPixels,base_PsfFlux_flag_edge,base_Variance_flag,base_Variance_flag_emptyFootprint,base_PsfFlux_flag_apCorr,base_GaussianFlux_flag_apCorr,base_ClassificationExtendedness_flag
3,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,True,True,True,True,True,True,True,True,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
3,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,True,True,True,True,True,True,True,True,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
3,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,True,True,True,True,True,True,True,True,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
4,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,True,True,True,True,True,True,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
4,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,True,True,True,True,True,True,True,True,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False


# Check if a flag is true for all the objects

In [77]:
df_bad_flags.eq(True).all()

# calib_detected                                          False
calib_psfCandidate                                        False
calib_psfUsed                                             False
calib_psfReserved                                         False
flags_negative                                            False
deblend_deblendedAsPsf                                    False
deblend_tooManyPeaks                                      False
deblend_parentTooBig                                      False
deblend_masked                                            False
deblend_skipped                                           False
deblend_rampedTemplate                                    False
deblend_patchedTemplate                                   False
deblend_hasStrayFlux                                      False
base_GaussianCentroid_flag                                False
base_GaussianCentroid_flag_noPeak                         False
base_GaussianCentroid_flag_resetToPeak  

In [78]:
df_good_flags.eq(True).all()

# calib_detected                                          False
calib_psfCandidate                                        False
calib_psfUsed                                             False
calib_psfReserved                                         False
flags_negative                                            False
deblend_deblendedAsPsf                                    False
deblend_tooManyPeaks                                      False
deblend_parentTooBig                                      False
deblend_masked                                            False
deblend_skipped                                           False
deblend_rampedTemplate                                    False
deblend_patchedTemplate                                   False
deblend_hasStrayFlux                                      False
base_GaussianCentroid_flag                                False
base_GaussianCentroid_flag_noPeak                         False
base_GaussianCentroid_flag_resetToPeak  

In [79]:
df_good_flags.eq(True).all().sum()

0

In [80]:
df_bad_flags.eq(True).all().sum()

0

# Check if a flag is False for all the objects

In [81]:
df_good_flags.eq(False).all()

# calib_detected                                          False
calib_psfCandidate                                         True
calib_psfUsed                                              True
calib_psfReserved                                          True
flags_negative                                             True
deblend_deblendedAsPsf                                    False
deblend_tooManyPeaks                                       True
deblend_parentTooBig                                       True
deblend_masked                                             True
deblend_skipped                                            True
deblend_rampedTemplate                                    False
deblend_patchedTemplate                                    True
deblend_hasStrayFlux                                       True
base_GaussianCentroid_flag                                False
base_GaussianCentroid_flag_noPeak                          True
base_GaussianCentroid_flag_resetToPeak  

In [82]:
df_bad_flags.eq(False).all()

# calib_detected                                          False
calib_psfCandidate                                         True
calib_psfUsed                                              True
calib_psfReserved                                          True
flags_negative                                             True
deblend_deblendedAsPsf                                    False
deblend_tooManyPeaks                                       True
deblend_parentTooBig                                       True
deblend_masked                                             True
deblend_skipped                                            True
deblend_rampedTemplate                                    False
deblend_patchedTemplate                                    True
deblend_hasStrayFlux                                       True
base_GaussianCentroid_flag                                 True
base_GaussianCentroid_flag_noPeak                          True
base_GaussianCentroid_flag_resetToPeak  

# Compare good vs bad using false flags

In [83]:
# Count the flags that are False for every object.
# .eq(False).all() is True for a column only when no object has that flag set.
m = df_good_flags.eq(False).all().sum()  # 45 flags are never set among good objects
n = df_bad_flags.eq(False).all().sum()   # 73 flags are never set among bad objects

df_good_flags.shape[1], m, n, n - m

(90, 45, 73, 28)
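The difference `n - m` equals the number of distinguishing flags only if every flag that is never set among good objects is also never set among bad objects. A minimal sketch of an explicit check with set differences, on toy data (the column names `f1` to `f4` are placeholders, not DMSTACK flags):

```python
import pandas as pd

# Toy flag tables; rows are objects, columns are flags.
good = pd.DataFrame({
    "f1": [False, False],
    "f2": [True, False],
    "f3": [False, True],
    "f4": [False, False],
})
bad = pd.DataFrame({
    "f1": [False, False],
    "f2": [False, False],
    "f3": [False, False],
    "f4": [True, False],
})

def never_set(df):
    """Return the set of columns that are False for every row."""
    all_false = df.eq(False).all()
    return set(all_false[all_false].index)

never_good = never_set(good)
never_bad = never_set(bad)

only_bad = sorted(never_bad - never_good)    # all-False for bad objects only
only_good = sorted(never_good - never_bad)   # all-False for good objects only

print(len(never_bad) - len(never_good), only_bad, only_good)
```

In this toy case the count difference is 1 although three flags differ between the two tables, because `f4` is all-False only for the good objects.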

In [84]:
df1 = df_good_flags.eq(False).all().to_frame().rename(columns={0:'good'})
df2 = df_bad_flags.eq(False).all().to_frame().rename(columns={0:'bad'})

df_flags = pd.concat([df1,df2], axis=1)

df_flags

Unnamed: 0,good,bad
# calib_detected,False,False
calib_psfCandidate,True,True
calib_psfUsed,True,True
calib_psfReserved,True,True
flags_negative,True,True
deblend_deblendedAsPsf,False,False
deblend_tooManyPeaks,True,True
deblend_parentTooBig,True,True
deblend_masked,True,True
deblend_skipped,True,True


In [85]:
df_flags[df_flags['good'] != df_flags['bad']]

Unnamed: 0,good,bad
base_GaussianCentroid_flag,False,True
base_GaussianCentroid_flag_resetToPeak,False,True
base_SdssCentroid_flag,False,True
base_SdssCentroid_flag_edge,False,True
base_SdssCentroid_flag_almostNoSecondDerivative,False,True
base_SdssCentroid_flag_notAtMaximum,False,True
base_SdssCentroid_flag_resetToPeak,False,True
base_SdssShape_flag_unweightedBad,False,True
base_SdssShape_flag_unweighted,False,True
base_SdssShape_flag_maxIter,False,True


In [86]:
cols_imp = df_flags[df_flags['good'] != df_flags['bad']].index.to_numpy()

print(len(cols_imp))
cols_imp

28


array(['base_GaussianCentroid_flag',
       'base_GaussianCentroid_flag_resetToPeak', 'base_SdssCentroid_flag',
       'base_SdssCentroid_flag_edge',
       'base_SdssCentroid_flag_almostNoSecondDerivative',
       'base_SdssCentroid_flag_notAtMaximum',
       'base_SdssCentroid_flag_resetToPeak',
       'base_SdssShape_flag_unweightedBad',
       'base_SdssShape_flag_unweighted', 'base_SdssShape_flag_maxIter',
       'ext_shapeHSM_HsmPsfMoments_flag',
       'ext_shapeHSM_HsmPsfMoments_flag_galsim',
       'ext_shapeHSM_HsmSourceMoments_flag',
       'ext_shapeHSM_HsmSourceMoments_flag_galsim',
       'base_CircularApertureFlux_3_0_flag',
       'base_CircularApertureFlux_4_5_flag',
       'base_CircularApertureFlux_4_5_flag_sincCoeffsTruncated',
       'base_CircularApertureFlux_6_0_flag',
       'base_CircularApertureFlux_6_0_flag_sincCoeffsTruncated',
       'base_CircularApertureFlux_9_0_flag',
       'base_CircularApertureFlux_12_0_flag',
       'base_CircularApertureFlux_12_0_fl

In [87]:
df_good_flags[['base_GaussianCentroid_flag']].eq(False).all()

base_GaussianCentroid_flag    False
dtype: bool

In [88]:
df_bad_flags[['base_GaussianCentroid_flag']].eq(False).all()

base_GaussianCentroid_flag    True
dtype: bool

In [89]:
df_good_flags[['base_GaussianCentroid_flag','base_GaussianCentroid_flag_resetToPeak']].eq(False).all().all()

False

In [90]:
df_bad_flags[['base_GaussianCentroid_flag','base_GaussianCentroid_flag_resetToPeak']].eq(False).all().all()

True

In [91]:
df_good_flags[cols_imp].eq(False).all().all()

False

In [92]:
df_bad_flags[cols_imp].eq(False).all().all()

True

In [93]:
# df_bad_flags[cols_imp] # every element is False

In [94]:
# df_good_flags[cols_imp] # each flag is True for at least one object
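The checks above are column-wise: they show that each of the 28 flags is set for at least one good object, not that every good object has one of them set. A row-wise cross-tabulation is one way to see how an "all selected flags False" rule would label individual objects. A minimal sketch on toy data (the flag names `flag_a` and `flag_b` and all values are placeholders):

```python
import pandas as pd

cols_subset = ["flag_a", "flag_b"]  # placeholder flag names

# Toy flag tables; the last good object has no flag set.
good = pd.DataFrame({
    "flag_a": [True, False, False],
    "flag_b": [False, True, False],
})
bad = pd.DataFrame({
    "flag_a": [False, False],
    "flag_b": [False, False],
})

# Stack both tables and remember where each row came from.
both = pd.concat(
    [good.assign(label="good"), bad.assign(label="bad")],
    ignore_index=True,
)

# Predict "bad" when none of the selected flags is set for that object.
no_flag_set = ~both[cols_subset].any(axis=1)
both["predicted"] = no_flag_set.map({True: "bad", False: "good"})

# Rows: true label, columns: predicted label.
table = pd.crosstab(both["label"], both["predicted"])
print(table)
```

In the `tail()` output of `df_good_flags` shown earlier, the good object with index 7945 has every flag equal to 0.0, so a check of this kind on the full data seems worthwhile.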

# Look at an object and determine good or bad

In [101]:
a = df_good_flags[cols_imp].head().T

b = df_bad_flags[cols_imp].head().T

pd.concat([a,b],axis=1)

# the bad objects have all of these flags False
# the good objects shown here have at least one flag True

Unnamed: 0,0,0,0,0,0,3,3,3,4,4
base_GaussianCentroid_flag,False,False,False,False,False,False,False,False,False,False
base_GaussianCentroid_flag_resetToPeak,False,False,False,False,False,False,False,False,False,False
base_SdssCentroid_flag,True,True,True,True,True,False,False,False,False,False
base_SdssCentroid_flag_edge,True,True,True,True,True,False,False,False,False,False
base_SdssCentroid_flag_almostNoSecondDerivative,False,False,False,False,False,False,False,False,False,False
base_SdssCentroid_flag_notAtMaximum,False,False,False,False,False,False,False,False,False,False
base_SdssCentroid_flag_resetToPeak,False,False,False,False,False,False,False,False,False,False
base_SdssShape_flag_unweightedBad,False,False,False,False,False,False,False,False,False,False
base_SdssShape_flag_unweighted,True,False,False,True,False,False,False,False,False,False
base_SdssShape_flag_maxIter,False,False,False,False,False,False,False,False,False,False
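
When two transposed heads are concatenated side by side, the column labels are just the original row indices, so it is easy to lose track of which columns came from which frame. One possible way to keep them apart is the `keys` argument of `pd.concat`, sketched here on hypothetical stand-in frames (`flag_a`, `flag_b`, and the values are made up):

```python
import pandas as pd

# Hypothetical stand-ins for df_good_flags[cols_imp].head().T and df_bad_flags[cols_imp].head().T
a = pd.DataFrame({0: [False, True], 1: [False, True]}, index=["flag_a", "flag_b"])
b = pd.DataFrame({3: [False, False], 4: [False, False]}, index=["flag_a", "flag_b"])

# keys= adds an outer column level, so each column records which frame it came from.
side_by_side = pd.concat([a, b], axis=1, keys=["good", "bad"])
print(side_by_side)

# Each group can then be selected by its outer label.
bad_all_false = bool(side_by_side["bad"].eq(False).all().all())
print(bad_all_false)
```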


In [103]:
# look at good only
df_good_flags[cols_imp].head()

Unnamed: 0,base_GaussianCentroid_flag,base_GaussianCentroid_flag_resetToPeak,base_SdssCentroid_flag,base_SdssCentroid_flag_edge,base_SdssCentroid_flag_almostNoSecondDerivative,base_SdssCentroid_flag_notAtMaximum,base_SdssCentroid_flag_resetToPeak,base_SdssShape_flag_unweightedBad,base_SdssShape_flag_unweighted,base_SdssShape_flag_maxIter,ext_shapeHSM_HsmPsfMoments_flag,ext_shapeHSM_HsmPsfMoments_flag_galsim,ext_shapeHSM_HsmSourceMoments_flag,ext_shapeHSM_HsmSourceMoments_flag_galsim,base_CircularApertureFlux_3_0_flag,base_CircularApertureFlux_4_5_flag,base_CircularApertureFlux_4_5_flag_sincCoeffsTruncated,base_CircularApertureFlux_6_0_flag,base_CircularApertureFlux_6_0_flag_sincCoeffsTruncated,base_CircularApertureFlux_9_0_flag,base_CircularApertureFlux_12_0_flag,base_CircularApertureFlux_12_0_flag_apertureTruncated,base_CircularApertureFlux_17_0_flag,base_CircularApertureFlux_17_0_flag_apertureTruncated,base_GaussianFlux_flag,base_PsfFlux_flag,base_PsfFlux_flag_edge,base_ClassificationExtendedness_flag
0,False,False,True,True,False,False,False,False,True,False,True,False,True,False,True,True,False,True,False,True,True,False,True,False,False,True,True,True
0,False,False,True,True,False,False,False,False,False,False,True,False,True,False,True,True,True,True,True,True,True,False,True,True,False,True,True,True
0,False,False,True,True,False,False,False,False,False,False,True,False,True,False,True,True,False,True,False,True,True,False,True,False,False,True,True,True
0,False,False,True,True,False,False,False,False,True,False,True,False,True,False,True,True,False,True,False,True,True,False,True,False,False,True,False,True
0,False,False,True,True,False,False,False,False,False,False,True,False,True,False,True,True,True,True,True,True,True,False,True,True,False,True,True,True


In [108]:
# count the True values per flag (is any column all True?)
a = df_good_flags[cols_imp]

print(a.shape)
a.sum().sort_values()
# good objects: 959k
# base_SdssShape_flag_unweighted  flag True : 42k
# base_SdssShape_flag_unweighted  flag False: 917k (most are False)
#
# for bad objects, all flags are False.

(959917, 28)


ext_shapeHSM_HsmPsfMoments_flag_galsim                        3
base_SdssShape_flag_unweightedBad                             3
base_SdssCentroid_flag_resetToPeak                            9
ext_shapeHSM_HsmSourceMoments_flag_galsim                    16
base_GaussianFlux_flag                                       16
base_SdssShape_flag_maxIter                                 219
base_SdssCentroid_flag_notAtMaximum                         228
base_CircularApertureFlux_12_0_flag_apertureTruncated       283
base_SdssCentroid_flag_almostNoSecondDerivative             341
base_GaussianCentroid_flag                                  374
base_GaussianCentroid_flag_resetToPeak                      374
base_CircularApertureFlux_6_0_flag_sincCoeffsTruncated     7193
base_CircularApertureFlux_4_5_flag_sincCoeffsTruncated     7193
base_CircularApertureFlux_17_0_flag_apertureTruncated      9930
base_PsfFlux_flag_edge                                    12628
base_SdssCentroid_flag_edge             
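
Summing a boolean frame counts the `True` values per column, and the mean gives the fraction. A minimal sketch on a hypothetical toy frame (`toy`, `flag_a`, `flag_b`, `flag_c` are made-up names) shows how the same idea answers whether any column is entirely `True` or entirely `False`:

```python
import pandas as pd

# Hypothetical toy flags; not the catalog data.
toy = pd.DataFrame({
    "flag_a": [True, False, False, False],
    "flag_b": [True, True, True, False],
    "flag_c": [False, False, False, False],
})

n_true = toy.sum().sort_values()      # number of True values per flag, ascending
frac_true = toy.mean().sort_values()  # fraction of True values per flag, ascending

# Columns that are True for every row, and columns that are never True.
always_true = toy.columns[toy.all().to_numpy()].tolist()
never_true = toy.columns[(~toy.any()).to_numpy()].tolist()

print(n_true)
print(always_true, never_true)
```

A column is all `True` exactly when its count equals the number of rows, so comparing `n_true` with `len(toy)` gives the same answer as `toy.all()`.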

In [120]:
# we know that all bad objects have all 28 of these flags False
# does any good object have all 28 of these flags False?

In [122]:
a = df_good_flags[cols_imp]
a.head(2)

Unnamed: 0,base_GaussianCentroid_flag,base_GaussianCentroid_flag_resetToPeak,base_SdssCentroid_flag,base_SdssCentroid_flag_edge,base_SdssCentroid_flag_almostNoSecondDerivative,base_SdssCentroid_flag_notAtMaximum,base_SdssCentroid_flag_resetToPeak,base_SdssShape_flag_unweightedBad,base_SdssShape_flag_unweighted,base_SdssShape_flag_maxIter,ext_shapeHSM_HsmPsfMoments_flag,ext_shapeHSM_HsmPsfMoments_flag_galsim,ext_shapeHSM_HsmSourceMoments_flag,ext_shapeHSM_HsmSourceMoments_flag_galsim,base_CircularApertureFlux_3_0_flag,base_CircularApertureFlux_4_5_flag,base_CircularApertureFlux_4_5_flag_sincCoeffsTruncated,base_CircularApertureFlux_6_0_flag,base_CircularApertureFlux_6_0_flag_sincCoeffsTruncated,base_CircularApertureFlux_9_0_flag,base_CircularApertureFlux_12_0_flag,base_CircularApertureFlux_12_0_flag_apertureTruncated,base_CircularApertureFlux_17_0_flag,base_CircularApertureFlux_17_0_flag_apertureTruncated,base_GaussianFlux_flag,base_PsfFlux_flag,base_PsfFlux_flag_edge,base_ClassificationExtendedness_flag
0,False,False,True,True,False,False,False,False,True,False,True,False,True,False,True,True,False,True,False,True,True,False,True,False,False,True,True,True
0,False,False,True,True,False,False,False,False,False,False,True,False,True,False,True,True,True,True,True,True,True,False,True,True,False,True,True,True


In [128]:
a.all(axis=1).sum() # rows with all 28 flags True: none (the all-False test would be a.eq(False).all(axis=1).sum())

0
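
Row-wise reductions answer per-object questions, and "all flags True" is a different test from "all flags False". A minimal sketch on a hypothetical toy frame (the names `toy`, `flag_a`, and `flag_b` are made up) contrasts the three counts:

```python
import pandas as pd

# Hypothetical toy flags; one row per object.
toy = pd.DataFrame({
    "flag_a": [True, False, True],
    "flag_b": [True, False, False],
})

n_all_true = int(toy.all(axis=1).sum())             # objects where every flag is True
n_all_false = int(toy.eq(False).all(axis=1).sum())  # objects where every flag is False
n_any_true = int(toy.any(axis=1).sum())             # objects with at least one True flag

print(n_all_true, n_all_false, n_any_true)
```

Here `n_all_false + n_any_true` equals the number of rows, since an object either has no `True` flag or has at least one.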

In [129]:
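# conclusion:
# if all 28 flags are False, the object is bad
# else: good
# (this assumes no good object has all 28 flags False, which needs a row-wise all-False count to confirm)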
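
The rule stated in the conclusion could be applied to a catalog roughly as follows. This is only a sketch under the assumption that the flag columns are boolean; `label_objects`, `toy`, `flag_a`, and `flag_b` are hypothetical names introduced for illustration.

```python
import pandas as pd

def label_objects(df, flag_cols):
    """Label each row 'good' if any listed flag is True, otherwise 'bad'."""
    has_any_true = df[flag_cols].any(axis=1)
    return has_any_true.map({True: "good", False: "bad"})

# Hypothetical toy catalog with two flag columns and one non-flag column.
toy = pd.DataFrame({
    "id": [10, 11, 12],
    "flag_a": [True, False, False],
    "flag_b": [False, False, True],
})

labels = label_objects(toy, ["flag_a", "flag_b"])
print(labels.tolist())
```

In the notebook's setting, `flag_cols` would be `cols_imp` and `df` the frame holding the 28 flag columns.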

In [130]:
for i,j in enumerate(cols_imp,1):
    print(i,j)

1 base_GaussianCentroid_flag
2 base_GaussianCentroid_flag_resetToPeak
3 base_SdssCentroid_flag
4 base_SdssCentroid_flag_edge
5 base_SdssCentroid_flag_almostNoSecondDerivative
6 base_SdssCentroid_flag_notAtMaximum
7 base_SdssCentroid_flag_resetToPeak
8 base_SdssShape_flag_unweightedBad
9 base_SdssShape_flag_unweighted
10 base_SdssShape_flag_maxIter
11 ext_shapeHSM_HsmPsfMoments_flag
12 ext_shapeHSM_HsmPsfMoments_flag_galsim
13 ext_shapeHSM_HsmSourceMoments_flag
14 ext_shapeHSM_HsmSourceMoments_flag_galsim
15 base_CircularApertureFlux_3_0_flag
16 base_CircularApertureFlux_4_5_flag
17 base_CircularApertureFlux_4_5_flag_sincCoeffsTruncated
18 base_CircularApertureFlux_6_0_flag
19 base_CircularApertureFlux_6_0_flag_sincCoeffsTruncated
20 base_CircularApertureFlux_9_0_flag
21 base_CircularApertureFlux_12_0_flag
22 base_CircularApertureFlux_12_0_flag_apertureTruncated
23 base_CircularApertureFlux_17_0_flag
24 base_CircularApertureFlux_17_0_flag_apertureTruncated
25 base_GaussianFlux_flag
26 b