<a href="https://colab.research.google.com/github/jbkalmbach/pzflow-paper-2021/blob/main/photo-z/CMNN_cat_creation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Catalog Creation

This notebook uses the catalog creation tools from the `issue/22/set_seed_train_test` branch of [CMNN_Photoz_Estimator](https://github.com/dirac-institute/CMNN_Photoz_Estimator) to create the training and test catalogs for the photo-z experiments in the paper.

We start with a mock galaxy catalog of true redshifts and magnitudes and use the CMNN code to generate a mock catalog with observed apparent magnitudes and errors consistent with a 10-year LSST survey.

In [None]:
# Get CMNN Version needed
! git clone https://github.com/dirac-institute/CMNN_Photoz_Estimator.git --branch issue/22/set_seed_train_test

Cloning into 'CMNN_Photoz_Estimator'...
remote: Enumerating objects: 227, done.
remote: Counting objects: 100% (190/190), done.
remote: Compressing objects: 100% (142/142), done.
remote: Total 227 (delta 117), reused 102 (delta 48), pack-reused 37
Receiving objects: 100% (227/227), 86.84 MiB | 39.05 MiB/s, done.
Resolving deltas: 100% (125/125), done.


In [None]:
import numpy as np
import os
import sys
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
# Add CMNN Photo z to path
path_to_cmnn_estimation_code = '/content/CMNN_Photoz_Estimator/'
sys.path.append(path_to_cmnn_estimation_code)

# Import tool we need
from cmnn_catalog import make_test_and_train

In [None]:
verbose = True

# 5-sigma limiting magnitudes (m5) and magnitude cuts (mcut), one value per
# filter (ugrizy); values taken from the CMNN defaults
train_m5 = [26.100, 27.400, 27.500, 26.800, 26.100, 24.900]

test_m5 = [26.100, 27.400, 27.500, 26.800, 26.100, 24.900]

test_mcut = [26.100, 27.400, 27.500, 25.000, 26.100, 24.900]

train_mcut = [26.100, 27.400, 27.500, 25.000, 26.100, 24.900]

runid = '1'

force_idet = True

force_gridet = True

# Specify size of training and test sets
train_N = 250000

test_N = 50000

# Minimum number of colors required per galaxy.
# To start with, we want test and training sets in which every galaxy has all 5 colors.
cmnn_minNc = 5

# Specify location for output
# (assumes Google Drive is already mounted at /content/drive)
output_dir = f'/content/drive/MyDrive/DIRAC/pzflow/output/run_{runid}'
os.makedirs(output_dir, exist_ok=True)
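
As a quick way to read the settings above, the sketch below pairs each depth and cut with a filter and reports how far each cut sits above the 5-sigma depth. The helper name `summarize_depths` and the `ugrizy` ordering of the lists are assumptions made for illustration; they are not part of the CMNN code.

```python
# Assumed filter order for the six-element lists (not stated in the notebook).
BANDS = ["u", "g", "r", "i", "z", "y"]


def summarize_depths(m5, mcut, bands=BANDS):
    """Pair each band with its depth, its cut, and the margin m5 - mcut."""
    if not (len(m5) == len(mcut) == len(bands)):
        raise ValueError("expected one m5 and one mcut value per band")
    return {
        band: {"m5": depth, "mcut": cut, "margin": round(depth - cut, 3)}
        for band, depth, cut in zip(bands, m5, mcut)
    }


# Same values as in the configuration cell.
test_m5 = [26.100, 27.400, 27.500, 26.800, 26.100, 24.900]
test_mcut = [26.100, 27.400, 27.500, 25.000, 26.100, 24.900]

summary = summarize_depths(test_m5, test_mcut)
for band, row in summary.items():
    print(band, row)
```

With these values, the fourth entry is the only one where the cut is brighter than the depth; every other cut equals its depth.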

In [None]:
# Download base catalog
!gdown --id 1OJ0vRtzwJptyF-f4_34j4hvv4_E77mUE

Downloading...
From: https://drive.google.com/uc?id=1OJ0vRtzwJptyF-f4_34j4hvv4_E77mUE
To: /content/LSST_galaxy_catalog_i25p3.dat.gz
111MB [00:02, 44.8MB/s]


In [None]:
# Extract base catalog file
!gunzip LSST_galaxy_catalog_i25p3.dat.gz

In [None]:
# Create the local output directory that the CMNN code writes into
! mkdir -p output/run_{runid}/
make_test_and_train(verbose, runid, test_m5, train_m5, test_mcut, train_mcut,
                    force_idet, force_gridet, test_N, train_N, cmnn_minNc,
                    'LSST_galaxy_catalog_i25p3.dat', seed=42)
# Copy the resulting catalogs to the output location
! cp output/run_{runid}/*.cat {output_dir}/

 
Starting cmnn_catalog.make_test_and_train(),  2021-08-26 19:35:08.645160
Read the mock catalog of true redshifts and magnitudes.
Calculating magnitude errors.
Calculating observed apparent magnitudes.
Applying the magnitude cuts.
Calculating colors.
Opening and writing to  output/run_1/test.cat
Opening and writing to  output/run_1/train.cat
Wrote:  output/run_1/test.cat, output/run_1/train.cat
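
The steps named in the log (magnitude errors, observed magnitudes, magnitude cuts, colors) can be illustrated with a minimal sketch. It is not the CMNN implementation. It assumes the standard LSST photometric error model, the per-band `GAMMA` values listed, a 0.005 mag error floor, and a `ugrizy` band order. The function names `mag_error` and `simulate_catalog` are hypothetical.

```python
import numpy as np

# Assumed per-band gamma values (u, g, r, i, z, y) for the LSST error model.
GAMMA = np.array([0.037, 0.038, 0.039, 0.039, 0.040, 0.040])


def mag_error(true_mag, m5, gamma=GAMMA, floor=0.005):
    """Magnitude error from the 5-sigma depth, with an assumed error floor."""
    x = 10.0 ** (0.4 * (np.asarray(true_mag) - np.asarray(m5)))
    sigma = np.sqrt((0.04 - gamma) * x + gamma * x**2)
    return np.maximum(sigma, floor)


def simulate_catalog(true_mags, m5, mcut, seed=42):
    """Scatter true magnitudes, apply the cuts, and compute adjacent colors."""
    rng = np.random.default_rng(seed)
    true_mags = np.asarray(true_mags, dtype=float)
    err = mag_error(true_mags, m5)
    # Observed magnitude = true magnitude plus Gaussian noise of width err.
    obs = true_mags + rng.normal(size=true_mags.shape) * err
    # Simplification: keep only galaxies that pass the cut in every band,
    # which corresponds to requiring all five colors.
    keep = np.all(obs <= np.asarray(mcut), axis=1)
    obs, err = obs[keep], err[keep]
    # Adjacent-band colors, with errors added in quadrature.
    colors = obs[:, :-1] - obs[:, 1:]
    color_err = np.sqrt(err[:, :-1] ** 2 + err[:, 1:] ** 2)
    return obs, err, colors, color_err


m5 = [26.1, 27.4, 27.5, 26.8, 26.1, 24.9]
mcut = [26.1, 27.4, 27.5, 25.0, 26.1, 24.9]
true_mags = np.full((3, 6), 24.0)  # three toy galaxies, 24th mag in every band
obs, err, colors, color_err = simulate_catalog(true_mags, m5, mcut)
```

In this model a galaxy exactly at the 5-sigma depth gets an error of 0.2 mag, and very bright galaxies sit at the assumed floor.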


In [None]:
!head /content/drive/MyDrive/DIRAC/pzflow/output/run_1/train.cat

    5789583 0.90058196 25.448562  0.124823 25.635542  0.042729 25.454899  0.033699 24.888279  0.036172 24.562720  0.050165 24.472106  0.150306 -0.186980  0.131934  0.180643  0.054419  0.566619  0.049438  0.325560  0.061846  0.090614  0.158456 
    9645863 2.01064400 24.952113  0.075436 24.813600  0.021939 24.856853  0.019434 24.923800  0.036793 24.752957  0.063623 24.804617  0.185715  0.138514  0.078562 -0.043254  0.029309 -0.066947  0.041610  0.170843  0.073496 -0.051660  0.196311 
    1341356 0.55477380 25.836928  0.174945 25.816929  0.048413 24.947092  0.021429 24.600678  0.029065 24.472055  0.047449 24.657549  0.139597  0.019999  0.181521  0.869837  0.052944  0.346415  0.036111  0.128623  0.055644 -0.185494  0.147441 
   13103201 0.62702690 24.177630  0.039932 24.087804  0.013474 23.529282  0.007172 23.085825  0.008598 22.980726  0.011430 22.932603  0.033431  0.089827  0.042143  0.558522  0.015264  0.443456  0.011197  0.105100  0.014303  0.048122  0.035331 
   11756538 0.90088370 2

In [None]:
!head /content/drive/MyDrive/DIRAC/pzflow/output/run_1/test.cat

    5282600 0.64819190 24.773679  0.066454 24.486278  0.018180 23.715850  0.008217 23.031361  0.008153 22.725347  0.008862 22.603023  0.023841  0.287402  0.068896  0.770428  0.019951  0.684489  0.011576  0.306015  0.012042  0.122324  0.025435 
    4530583 1.54744500 24.848797  0.067937 24.918624  0.023465 24.950455  0.021093 24.856045  0.033861 24.572990  0.048171 24.269002  0.109074 -0.069827  0.071875 -0.031831  0.031552  0.094410  0.039893  0.283055  0.058882  0.303988  0.119237 
    5166333 0.98921716 25.269208  0.093680 25.284407  0.032201 25.073436  0.023355 24.519189  0.027131 23.974104  0.029300 23.839821  0.082798 -0.015200  0.099060  0.210971  0.039779  0.554247  0.035799  0.545085  0.039932  0.134283  0.087829 
    2029680 0.65405510 24.949006  0.071989 24.243395  0.014922 23.126170  0.005467 22.293219  0.005033 21.992805  0.005000 21.866349  0.012312  0.705611  0.073520  1.117225  0.015892  0.832951  0.007431  0.300414  0.007095  0.126456  0.013289 
   12802329 0.42824006 2
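
Judging from the rows printed above, each catalog line appears to hold an ID, the true redshift, six magnitude/error pairs, and five color/error pairs (24 columns). The sketch below reads that layout into named columns. The column names, the `ugrizy` band order, and the helper `read_cmnn_catalog` are assumptions for illustration. The sample row is the first line of the `test.cat` output shown above.

```python
import io

import numpy as np

BANDS = ["u", "g", "r", "i", "z", "y"]  # assumed band order
COLORS = [f"{a}-{b}" for a, b in zip(BANDS[:-1], BANDS[1:])]


def read_cmnn_catalog(source):
    """Read a whitespace-delimited catalog into a dict of named columns."""
    data = np.atleast_2d(np.loadtxt(source))
    cat = {"id": data[:, 0].astype(int), "redshift": data[:, 1]}
    # Columns 2..13: magnitude and error for each band.
    for k, band in enumerate(BANDS):
        cat[band] = data[:, 2 + 2 * k]
        cat[f"{band}_err"] = data[:, 3 + 2 * k]
    # Columns 14..23: color and error for each adjacent band pair.
    for k, color in enumerate(COLORS):
        cat[color] = data[:, 14 + 2 * k]
        cat[f"{color}_err"] = data[:, 15 + 2 * k]
    return cat


sample = io.StringIO(
    "5282600 0.64819190 24.773679 0.066454 24.486278 0.018180 "
    "23.715850 0.008217 23.031361 0.008153 22.725347 0.008862 "
    "22.603023 0.023841 0.287402 0.068896 0.770428 0.019951 "
    "0.684489 0.011576 0.306015 0.012042 0.122324 0.025435\n"
)
cat = read_cmnn_catalog(sample)
print(cat["redshift"][0], cat["u-g"][0], cat["u"][0] - cat["g"][0])
```

The same function accepts a file path in place of the `StringIO` object, for example the `train.cat` file written above. For the sample row, each color matches the difference of the adjacent magnitudes, and each color error matches the two magnitude errors added in quadrature, both to within rounding.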