Hsc y3 #351

Merged
merged 16 commits into from
May 29, 2024
451 changes: 451 additions & 0 deletions examples/hscy3/config.yml


132 changes: 132 additions & 0 deletions examples/hscy3/hsc_pipeline.yaml
@@ -0,0 +1,132 @@
# Stages to run
stages:
#- name: TXSourceDiagnosticPlots
#- name: TXRoweStatistics
#- name: TXTauStatistics
#- name: TXPSFDiagnostics
- name: TXSimpleMask
- name: TXAuxiliaryLensMaps
#- name: TXSourceNoiseMaps
- name: TXTwoPoint # Compute real-space 2-point correlations
threads_per_process: 128
- name: TXTracerMetadata
- name: FlowCreator # Simulate a spectroscopic population
- name: GridSelection # Simulate a spectroscopic sample
- name: TXParqetToHDF # Convert the spec sample format
- name: TXSourceSelectorHSC # select and split objects into source bins
- name: TXShearCalibration
- name: TXTruthLensSelector
- name: PZPrepareEstimatorLens
classname: Inform_BPZ_lite
- name: PZEstimatorLens
classname: BPZ_lite
- name: TXMeanLensSelector # select objects for lens bins from the PDFs
- name: Inform_NZDirLens # Prepare the DIR method inputs for the lens sample
classname: Inform_NZDir
- name: Inform_NZDirSource # Prepare the DIR method inputs for the source sample
classname: Inform_NZDir
- name: TXJackknifeCenters # Split the area into jackknife regions
#- name: PZRailSummarizeLens # Run the DIR method on the lens sample to find n(z)
# classname: PZRailSummarize
#- name: PZRailSummarizeSource # Run the DIR method on the source sample to find n(z)
# classname: PZRailSummarize
#- name: TXPhotozSourceStack
# classname: TXPhotozStack
#- name: TXPhotozLensStack
# classname: TXPhotozStack
#- name: PZRailSummarizeSource # Run the DIR method on the source sample to find n(z)
# classname: PZRailSummarize
#- name: Inform_NZDirSource # Prepare the DIR method inputs for the source sample
# classname: Inform_NZDir
#- name: PZPrepareEstimatorLens # Prepare the p(z) estimator
# classname: Inform_BPZ_lite
#- name: PZEstimatorLens # Measure lens galaxy PDFs
# classname: BPZ_lite
# threads_per_process: 1
#- name: TXMeanLensSelector # select objects for lens bins from the PDFs
#- name: Inform_NZDirLens # Prepare the DIR method inputs for the lens sample
# classname: Inform_NZDir
#- name: PZRailSummarizeLens # Run the DIR method on the lens sample to find n(z)
# classname: PZRailSummarize
#- name: PZRailSummarizeSource # Run the DIR method on the source sample to find n(z)
# classname: PZRailSummarize
#- name: TXSourceSelectorMetacal # select and split objects into source bins
#- name: Inform_NZDirSource # Prepare the DIR method inputs for the source sample
# classname: Inform_NZDir
#- name: TXLensCatalogSplitter


# modules and packages to import that have pipeline
# stages defined in them
modules: >
txpipe
rail.creation.degradation.grid_selection
rail.creation.engines.flowEngine
rail.estimation.algos.NZDir
rail.estimation.algos.bpz_lite


# where to find any modules that are not in this repo,
# and any other code we need.
python_paths:
- submodules/WLMassMap/python/desc/
- submodules/TJPCov
- submodules/FlexZPipe
- submodules/RAIL

# Where to put outputs
output_dir: data/hsc-y3/outputs/shearsys/

# How to run the pipeline: mini, parsl, or cwl
launcher:
name: mini
interval: 1.0

# Where to run the pipeline: cori-interactive, cori-batch, or local
site:
name: local
max_threads: 128

# python modules to import to search for stages
#modules: txpipe

# configuration settings
config: examples/hscy3/config.yml

# On NERSC, set this before running:
# export DATA=${LSST}/groups/WL/users/zuntz/data/metacal-testbed

inputs:
# The following files are REQUIRED files
shear_catalog : /global/cfs/cdirs/lsst/groups/WL/projects/txpipe-sys-tests/hsc-y3/shear/txpipe_allfield_shear.h5
calibration_table : data/example/inputs/sample_cosmodc2_w10year_errors.dat
star_catalog : /global/cfs/cdirs/lsst/groups/WL/users/yomori/repo/nulltests_txpipe/hscy1/star_catalog_hscy1_allfields.h5
fiducial_cosmology : data/fiducial_cosmology.yml
random_cats_source : Null
random_cats : /global/cfs/cdirs/lsst/groups/WL/projects/txpipe-sys-tests/hsc-y3/shear/random_cats.hdf5

binned_random_catalog : /global/cfs/cdirs/lsst/groups/WL/projects/txpipe-sys-tests/hsc-y3/shear/random_cats.hdf5
binned_random_catalog_sub: /global/cfs/cdirs/lsst/groups/WL/projects/txpipe-sys-tests/hsc-y3/shear/random_cats.hdf5

# The following are extracted from other pipelines and never get used in cosmic shear-only analyses
flow : data/example/inputs/example_flow.pkl
photometry_catalog : data/example/inputs/photometry_catalog.hdf5
lens_tomography_catalog : /global/cfs/cdirs/lsst/groups/WL/users/yomori/repo/aaa/TXPipe/data/desy3a/outputs/lens_tomography_catalog_unweighted.hdf5
shear_photoz_stack : /global/cfs/cdirs/lsst/groups/WL/projects/txpipe-sys-tests/hscy1/shear_pz_stack.hdf5
lens_photoz_stack : /global/cfs/cdirs/lsst/groups/WL/projects/txpipe-sys-tests/hscy1/lens_pz_stack.hdf5
binned_lens_catalog : /global/cfs/cdirs/lsst/groups/WL/users/yomori/repo/aaa/TXPipe/data/desy3a/outputs/binned_lens_catalog.hdf5
#patch_centers : /global/cfs/cdirs/lsst/groups/WL/users/yomori/repo/aaa/TXPipe/data/desy3a/outputs/patch_centers.txt


# if supported by the launcher, restart the pipeline where it left off
# if interrupted
resume: True
# where to put output logs for individual stages
log_dir: data/hsc-y3/logs
# where to put an overall parsl pipeline log
pipeline_log: data/hsc-y3/log.txt
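As a sanity check before launching, a stage list like the one in the pipeline file above can be parsed with PyYAML to enumerate the active (uncommented) stages and their per-stage options. The YAML string below is a hypothetical miniature of the file, not the full config, and the "class defaults to the stage name" convention is inferred from how `classname` is used in the entries above.

```python
import yaml

# Hypothetical miniature of a ceci-style pipeline file: each stage entry
# has a required "name" and optional keys such as "classname" or
# "threads_per_process".
PIPELINE_YAML = """
stages:
  - name: TXSimpleMask
  - name: TXTwoPoint
    threads_per_process: 128
  - name: PZEstimatorLens
    classname: BPZ_lite
"""

config = yaml.safe_load(PIPELINE_YAML)
for stage in config["stages"]:
    # A stage runs the class of the same name unless "classname" overrides it.
    cls = stage.get("classname", stage["name"])
    threads = stage.get("threads_per_process", 1)
    print(f"{stage['name']}: class={cls}, threads={threads}")
```

Commented-out entries (`#- name: ...`) are plain YAML comments, so they simply disappear from the parsed `stages` list.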


183 changes: 183 additions & 0 deletions examples/hscy3/pipeline.yml
@@ -0,0 +1,183 @@

# Stages to run
stages:
#- name: TXBrighterFatterPlot # Make plots tracking the brighter-fatter effect
#- name: TXPSFDiagnostics # Compute and plots other PSF diagnostics
#- name: TXRoweStatistics # Compute and plot Rowe statistics
# threads_per_process: 2
#- name: TXAuxiliaryLensMaps # make depth and bright object maps
#- name: TXSimpleMask # combine maps to make a simple mask
#- name: TXSourceSelectorMetacal # select and split objects into source bins
#- name: TXSourceNoiseMaps # Compute shear noise using rotations
#- name: TXShearCalibration # Calibrate and split the source sample tomographically
#- name: TXSourceMaps # make source g1 and g2 maps
#- name: TXConvergenceMaps # Make convergence kappa maps from g1, g2 maps
#- name: TXAuxiliarySourceMaps # make PSF and flag maps
#- name: FlowCreator # Simulate a spectroscopic population
#- name: GridSelection # Simulate a spectroscopic sample
#- name: TXParqetToHDF # Convert the spec sample format
#- name: PZPrepareEstimatorLens # Prepare the p(z) estimator
# classname: Inform_BPZ_lite
#- name: PZEstimatorLens # Measure lens galaxy PDFs
# classname: BPZ_lite
#- name: TXMeanLensSelector # select objects for lens bins from the PDFs
#- name: Inform_NZDirLens # Prepare the DIR method inputs for the lens sample
# classname: Inform_NZDir
#- name: Inform_NZDirSource # Prepare the DIR method inputs for the source sample
# classname: Inform_NZDir
#- name: PZRailSummarizeLens # Run the DIR method on the lens sample to find n(z)
# classname: PZRailSummarize
#- name: PZRailSummarizeSource # Run the DIR method on the source sample to find n(z)
# classname: PZRailSummarize
#- name: TXLensCatalogSplitter # Split the lens sample tomographically
#- name: TXLSSWeightsUnit # add systematic weights to the lens sample (weight=1 for this example)
#- name: TXTracerMetadata # collate metadata
#- name: TXLensNoiseMaps # Compute lens noise using half-splits
#- name: TXLensMaps # make source lens and n_gal maps
#- name: TXDensityMaps # turn mask and ngal maps into overdensity maps
#- name: TXTwoPoint # Compute real-space 2-point correlations
# threads_per_process: 2
#- name: TXTwoPointFourier # Compute power spectra C_ell
#- name: TXpureB # Compute pureB
#- name: TXStarCatalogSplitter # Split the star catalog into separate bins (psf/non-psf)
#- name: TXSourceMaps # make source g1 and g2 maps
#- name: TXLensMaps # make source lens and n_gal maps
#- name: TXAuxiliarySourceMaps # make PSF and flag maps
#- name: TXAuxiliaryLensMaps # make depth and bright object maps
#- name: TXSimpleMask # combine maps to make a simple mask
#- name: TXLSSWeightsUnit # add systematic weights to the lens sample (weight=1 for this example)
#- name: TXSourceNoiseMaps # Compute shear noise using rotations
#- name: TXDensityMaps # turn mask and ngal maps into overdensity maps
#- name: TXTracerMetadata # collate metadata
#- name: TXRandomCat # generate lens bin random catalogs
#- name: TXJackknifeCenters # Split the area into jackknife regions
#- name: TXBlinding # Blind the data following Muir et al
# threads_per_process: 2
#- name: TXTwoPointTheoryReal # compute theory using CCL to save in sacc file and plot later
#- name: TXTwoPointPlotsTheory # Make plots of 2pt correlations
#- name: TXSourceDiagnosticPlots # Make a suite of diagnostic plots
#- name: TXLensDiagnosticPlots # Make a suite of diagnostic plots
#- name: TXGammaTFieldCenters # Compute and plot gamma_t around center points
# threads_per_process: 2
#- name: TXGammaTStars # Compute and plot gamma_t around bright stars
# threads_per_process: 2
#- name: TXGammaTRandoms # Compute and plot gamma_t around randoms
# threads_per_process: 2
#- name: TXGalaxyStarDensity # Compute and plot the star-galaxy density cross-correlation
#- name: TXGalaxyStarShear # Compute and plot the star-galaxy shear cross-correlation
#- name: TXPhotozPlotSource # Plot the bin n(z)
# classname: TXPhotozPlot
#- name: TXPhotozPlotLens # Plot the bin n(z)
# classname: TXPhotozPlot
#- name: TXConvergenceMaps # Make convergence kappa maps from g1, g2 maps
#- name: TXConvergenceMapPlots # Plot the convergence map
#- name: TXMapCorrelations # plot the correlations between systematics and data
#- name: TXApertureMass # Compute aperture-mass statistics
# threads_per_process: 2
#- name: TXTwoPointFourier # Compute power spectra C_ell
# Disabling this since it is not yet synchronised with the new TreeCorr MPI
# - name: TXSelfCalibrationIA # Self-calibration intrinsic alignments of galaxies
# Disabling these as they take too long for a quick test.
# - name: TXRealGaussianCovariance # Compute covariance of real-space correlations
# - name: TXTwoPointFourier # Compute power spectra C_ell
# - name: TXFourierNamasterCovariance # Compute the C_ell covariance
# - name: TXFourierTJPCovariance # Compute the C_ell covariance with TJPCov


# modules and packages to import that have pipeline
# stages defined in them
modules: >
txpipe
rail.creation.degradation.grid_selection
rail.creation.engines.flowEngine
rail.estimation.algos.NZDir
rail.estimation.algos.bpz_lite

# where to find any modules that are not in this repo,
# and any other code we need.
python_paths:
- submodules/WLMassMap/python/desc/

# Where to put outputs
output_dir: data/hsc-y3/outputs

# How to run the pipeline: mini, parsl, or cwl
launcher:
name: mini
interval: 1.0

# Where to run the pipeline: cori-interactive, cori-batch, or local
site:
name: local
max_threads: 2

# configuration settings
config: examples/desy3/config_Bmodes.yml

# These are overall inputs to the whole pipeline, not generated within it
inputs:
# See README for paths to download these files
shear_catalog : /global/cfs/cdirs/lsst/groups/WL/projects/txpipe-sys-tests/hsc-y3/shear_catalog_desy3_unmasked_withfakez_v2.h5
star_catalog : /global/cfs/cdirs/lsst/groups/WL/projects/txpipe-sys-tests/hsc-y3/DES_psf_y3_catalog.hdf5
#lens_tomography_catalog: /global/cfs/cdirs/lsst/groups/WL/users/yomori/repo/aaa/TXPipe/data/desy3a/outputs/lens_tomography_catalog_unweighted.hdf5 ##############<----- manually add this line
# This file comes with the code
fiducial_cosmology: data/fiducial_cosmology.yml
shear_photoz_stack: data/example/outputs/shear_photoz_stack_manual.hdf5
photometry_catalog: data/example/inputs/photometry_catalog.hdf5
calibration_table : data/example/inputs/sample_cosmodc2_w10year_errors.dat
exposures : data/example/inputs/exposures.hdf5
flow : data/example/inputs/example_flow.pkl
random_cats_source: Null

# These are overall inputs to the whole pipeline, not generated within it
#inputs:
# shear_catalog: /global/cfs/cdirs/lsst/groups/WL/projects/txpipe-sys-tests/des-y3/shear_catalog_desy3_unmasked_withfakez_v2.h5
# photometry_catalog:
# #shear_tomography_catalog: /global/cfs/cdirs/desc-wl/projects/txpipe-sys-tests/des-y3/shear_tomography_desy3_unmasked_test.h5
# lens_tomography_catalog: /global/cfs/cdirs/lsst/groups/WL/projects/txpipe-sys-tests/des-y1/lens_tomography_catalog_desy1_RM.h5
# lens_photoz_stack:
# mask: /global/cfs/cdirs/lsst/groups/WL/projects/txpipe-sys-tests/des-y1/mask_desy1.h5
# star_catalog: /global/cfs/cdirs/lsst/groups/WL/projects/txpipe-sys-tests/des-y3/DES_psf_y3_catalog.h5
# calibration_table: data/example/inputs/sample_cosmodc2_w10year_errors.dat
# fiducial_cosmology: data/fiducial_cosmology.yml
# random_cats_source: Null
# flow: data/example/inputs/example_flow.pkl


# if supported by the launcher, restart the pipeline where it left off
# if interrupted
resume: True
# where to put output logs for individual stages
log_dir: data/hsc-y3/logs
# where to put an overall parsl pipeline log
pipeline_log: data/hsc-y3/log.txt





###########################################################################################

#'/global/cfs/cdirs/desc-wl/users/yomori/repo/aaa/TXPipe/data/desy3a/outputs/manual_shear_photoz_stack.ipynb'

#from astropy.io import fits
#import numpy as np
#import h5py

## http://desdr-server.ncsa.illinois.edu/despublic/y3a2_files/datavectors/2pt_NG_final_2ptunblind_02_26_21_wnz_maglim_covupdate.fits
#d=fits.open('2pt_NG_final_2ptunblind_02_26_21_wnz_maglim_covupdate.fits')

#bins=np.append(d['nz_source'].data['Z_LOW'],d['nz_source'].data['Z_HIGH'][-1])

#nz1 = d['nz_source'].data['BIN1']
#nz2 = d['nz_source'].data['BIN2']
#nz3 = d['nz_source'].data['BIN3']
#nz4 = d['nz_source'].data['BIN4']

#with h5py.File('shear_photoz_stack_manual.hdf5', 'w') as f:
# f.create_group("qp")
# f.create_group("qp/data")
# f.create_group("qp/meta")
# f['qp/meta/bins'] = bins[np.newaxis]
# f['qp/meta/pdf_name'] = np.array([b'hist'], dtype='|S4')
# f['qp/meta/pdf_version'] = [0]
# f['qp/data/pdfs'] = (np.c_[nz1,nz2,nz3,nz4]).T
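The commented recipe above can be exercised without the DES 2pt FITS file; the sketch below substitutes synthetic n(z) histograms for the four `nz_source` bins and writes the same qp-style "hist" layout with h5py. The group paths and dataset names are taken from the recipe; the bin count (300) and the random PDFs are illustrative stand-ins.

```python
import os
import tempfile

import h5py
import numpy as np

# Synthetic stand-ins for the four source-bin n(z) histograms that the
# recipe reads from the DES Y3 2pt FITS file: 300 bins -> 301 edges.
bins = np.linspace(0.0, 3.0, 301)
nz = np.random.default_rng(0).random((4, 300))

path = os.path.join(tempfile.mkdtemp(), "shear_photoz_stack_manual.hdf5")
with h5py.File(path, "w") as f:
    # qp "hist" layout: bin edges and PDF name/version under qp/meta,
    # the stacked PDFs (one row per tomographic bin) under qp/data.
    f.create_group("qp/data")   # intermediate "qp" group is created implicitly
    f.create_group("qp/meta")
    f["qp/meta/bins"] = bins[np.newaxis]
    f["qp/meta/pdf_name"] = np.array([b"hist"], dtype="|S4")
    f["qp/meta/pdf_version"] = [0]
    f["qp/data/pdfs"] = nz
```

Reading the file back with `h5py.File(path, "r")` should show `qp/data/pdfs` with shape `(4, 300)` and `qp/meta/bins` with shape `(1, 301)`.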

28 changes: 15 additions & 13 deletions txpipe/calibrate.py
@@ -42,6 +42,7 @@ class TXShearCalibration(PipelineStage):
"chunk_rows": 100_000,
"subtract_mean_shear": True,
"extra_cols": [""],
"shear_catalog_type": ''
}

def run(self):
@@ -67,13 +68,17 @@ def run(self):
with self.open_input("shear_catalog", wrapper=True) as f:
cat_cols, renames = f.get_primary_catalog_names()

cat_cols += [f"00/{c}" for c in extra_cols]
renames.update({f"00/{c}":c for c in extra_cols})

if cat_type=='des':
cat_cols += [f"00/{c}" for c in extra_cols]
renames.update({f"00/{c}":c for c in extra_cols})
else:
cat_cols += [f"{c}" for c in extra_cols]
renames.update({f"{c}":c for c in extra_cols})

if cat_type!='hsc':
output_cols = ["ra", "dec", "weight", "g1", "g2"] + extra_cols
else:
output_cols = ["ra", "dec", "weight", "g1", "g2","c1","c2"] + extra_cols
output_cols = ["ra", "dec", "weight", "g1", "g2", "c1", "c2"] + extra_cols

# We parallelize by bin. This isn't ideal but we don't know the number
# of objects in each bin per chunk, so we can't parallelize in full. This
@@ -104,6 +109,7 @@ def run(self):

#  Main loop
for s, e, data in rename_iterated(it, renames):


if self.rank == 0:
print(f"Rank 0 processing data {s:,} - {e:,}")
@@ -123,19 +129,15 @@
# otherwise just objects in this bin
w = np.where(data["bin"] == b)
cal = cals[b]

# Cut down the data to just this selection for output
d = {name: data[name][w] for name in output_cols}

# Calibrate the shear columns
if cat_type=='hsc':
d["g1"], d["g2"] = cal.apply(
d["g1"], d["g2"], d["c1"], d["c2"]
)
d["g1"], d["g2"] = cal.apply(d["g1"], d["g2"], d["c1"], d["c2"], d['aselepsf1'], d['aselepsf2'], d['msel'], subtract_mean=subtract_mean_shear)
else:
d["g1"], d["g2"] = cal.apply(
d["g1"], d["g2"], subtract_mean=subtract_mean_shear
)
d["g1"], d["g2"] = cal.apply(d["g1"], d["g2"], subtract_mean=subtract_mean_shear)

# Write output, keeping track of sizes
splitter.write_bin(d, b)
Expand All @@ -149,7 +151,7 @@ def setup_output(self, extra_cols):
counts = f["tomography/counts"][:]
count2d = f["tomography/counts_2d"][0]
nbin = len(counts)

# Prepare the calibrated output catalog
f = self.open_output("binned_shear_catalog", parallel=True)

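The column-selection change in the calibrate.py diff above can be isolated into a small helper. The "des" branch and the `00/` group prefix follow the diff; the function itself is a hypothetical sketch for illustration, not TXPipe API.

```python
def primary_catalog_columns(extra_cols, cat_type):
    """Build extra column paths and the rename map for a shear catalog.

    DES-style catalogs store columns under a "00/" HDF5 group, so paths
    are prefixed and then renamed back to bare names on read; other
    catalog types (e.g. HSC) use the bare column names directly.
    """
    if cat_type == "des":
        cat_cols = [f"00/{c}" for c in extra_cols]
        renames = {f"00/{c}": c for c in extra_cols}
    else:
        cat_cols = list(extra_cols)
        renames = {c: c for c in extra_cols}
    return cat_cols, renames

cols, renames = primary_catalog_columns(["psf_g1", "psf_g2"], "des")
print(cols)  # ['00/psf_g1', '00/psf_g2']
```

Keeping the identity mapping in the non-DES branch lets the caller run the same rename pass regardless of catalog type, which is why the diff mirrors the two branches rather than skipping the rename for HSC.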