Producing DESI-Y1 Lya mocks with quickquasars. #578

HiramHerrera · 2023-11-14T21:32:20Z

This PR introduces the necessary tools to produce DESI-Y1 Lya mocks.

The major changes are:

Introduction of the survey_release.py module and the SurveyRelease class, which provide the tools needed to create mock catalogs that quickquasars can read.
- Included the generate_survey_release_catalogs.ipynb notebook with a brief tutorial on making these catalogs.
New flags for quickquasars:
- --raw-mock [saclay, london, ohio] tells quickquasars which raw mock team produced the input, so that Y5 mocks can be generated directly from quickquasars.
- --dn_dzdm was reworked into a boolean flag, used when the redshift and magnitude distributions are to be downsampled by quickquasars directly.
- --year1-throughput adds the Y1 throughput model to the mock spectra, including the 440 nm dip observed during Y1.
- --from-catalog reads a pre-processed input mock catalog generated by SurveyRelease.
- --metal-strengths changes the default metal strengths used for the --metals flag. It must have the same length and the same order as the --metals list. The lya_spectra.py script was modified to make this flag work.

An example quickquasars command to run DESI-Y1 mocks would be:

quickquasars -i <input_path> --outdir <output_path> --zbest  --bbflux  --save-continuum --zmin 1.7 --dla file --balprob 0.16 --metals LYB LY3 LY4 LY5 SiII\(1260\) SiIII\(1207\) SiII\(1193\) SiII\(1190\) --metal-strengths 0.1901 0.0697 0.0335 0.0187 1.3e-03 3.5e-03 0.7e-03 1.4e-03 --from-catalog <input_preprocessed_y1catalog> --seed <seed>
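The requirement that --metal-strengths matches --metals in length and order can be illustrated with a short sketch. The helper name pair_metal_strengths is hypothetical and is not part of desisim; the line names and values are the ones from the command above.

```python
def pair_metal_strengths(metals, strengths):
    """Pair each metal line name with its strength, by position.

    Hypothetical helper for illustration only; it is not the desisim code.
    """
    if len(metals) != len(strengths):
        raise ValueError(
            f"got {len(metals)} metals but {len(strengths)} strengths"
        )
    return dict(zip(metals, strengths))


metals = ["LYB", "LY3", "LY4", "LY5",
          "SiII(1260)", "SiIII(1207)", "SiII(1193)", "SiII(1190)"]
strengths = [0.1901, 0.0697, 0.0335, 0.0187,
             1.3e-03, 3.5e-03, 0.7e-03, 1.4e-03]

metal_strengths = pair_metal_strengths(metals, strengths)
```

Because the pairing is purely positional, swapping two entries in one list but not the other silently assigns the wrong strength to a line; only a length mismatch can be detected automatically.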

Important notes:

The --year1-throughput flag works only if a file called desiY1.yaml exists within specsim/data/config, pointing to specpsf/psf-quicksim.fits and to the thru-[brz]_y1measured.fits files, which include the dip feature at 440 nm. We can take advantage of specsim#125 (Simulate DESI spectra with the same output wavelength binning as the pipeline) to include this file. However, if I'm correct, that PR changes the defaults of the desi.yaml file to those of DESI-Y1. I think it is important to keep desi.yaml pointing to a throughput model without the 440 nm dip (default DESIMODEL), as we might want mocks without this feature.
The SurveyRelease downsampling by NPASS relies on a pixel map currently stored in /global/cfs/cdirs/desi/users/hiramk/desi/quickquasars/sampling_tests/npass_pixmap.fits. This file is too large (1.1 GB) to be included in this PR. The code should work as long as the current path (under my user directory) is not modified. However, we should probably copy this file somewhere else. Is there a recommended place to store it? Maybe DESIMODEL?
We probably want to deprecate the --dn_dzdm and --raw-mock flags, since SurveyRelease already does this and it is probably more efficient to downsample and assign magnitudes in pre-processing than within quickquasars. I kept them because @andreicuceu's DESI-Y5 mocks were produced using these flags.


coveralls · 2023-11-15T00:23:19Z

coverage: 44.724% (-0.8%) from 45.526%
when pulling 599de34 on HiramHerrera:y1mocks_v3
into 50998bf on desihub:main.

andreufont · 2023-11-15T15:25:13Z

I won't be able to review this until Nov 24, but you can merge without me if others have approved it.

I was a bit surprised to see the --raw-mock option in QQ. What does it do? Is it just to create a picca-readable quasar catalog to be used in the raw analysis? Because we don't want to create spectra files in the raw analysis...

HiramHerrera · 2023-11-15T16:22:39Z

The --raw-mock flag was introduced to tell quickquasars what kind of raw mocks are being input, mainly saclay or lyacolore. This was done for two main reasons, since these mocks are handled differently:

The downsampling fractions needed to match the observed redshift distribution differ between saclay and lyacolore, while ohio mocks don't require downsampling. Making quickquasars compute these fractions internally was memory-inefficient, as it requires opening the complete raw-mock master catalog, and the way quickquasars is coded would make this happen on every node we use. To avoid this, we computed the fractions in pre-processing and saved them in a FITS file that is read by the --dn_dzdm flag according to the raw mock team given. This is no longer required if we make the catalogs in pre-processing for all the kinds of mocks we produce (including Y5).
The relative metal strengths used by the quickquasars method (--metals) are different for saclay and lyacolore mocks. The previous version of the code used the lyacolore strengths by default, while the information for saclay mocks was on another branch. Now the --metal-strengths flag changes the defaults, so maybe this is no longer a problem.

These are the reasons why I say the --dn_dzdm and --raw-mock flags can be deprecated.

andreufont · 2023-11-15T17:34:13Z

Oh, I understand now. Thanks @HiramHerrera ! I thought these were related to the "raw analysis". I see now that you are just specifying which type of transmission files are being provided.

alxogm · 2023-11-16T02:05:54Z

Regarding this question:

The SurveyRelease downsampling by NPASS functionality relies on a pixel map currently stored in /global/cfs/cdirs/desi/users/hiramk/desi/quickquasars/sampling_tests/npass_pixmap.fits this file is large enough (1.1GB) to not be included in this PR. The code should work as long as the current path (my local cfdirs) is not modified. However, we should probably copy this file to somewhere else. Is there a recommended place to store it? Maybe DESIMODEL?

@sbailey Since this is some sort of a template for the Y1 lya mock production, could we store it in /global/cfs/cdirs/desi/spectro/templates/ where other templates used by quickquasars are? maybe in a new directory (name TBD, suggestions accepted)?

p-slash · 2023-12-01T16:15:10Z

py/desisim/scripts/quickquasars.py

+                bins = ((z - zmin)/(zmax - zmin) * len(zcenters) + 0.5).astype(np.int64)
+                selection_z = np.random.uniform(size=z.size) < fraction[bins]
+
+                np.random.set_state(rnd_state)


Why do you restore the old random state?

HiramHerrera · 2023-12-01

I used this to restore the quickquasars random state to what it would be if we didn't downsample to match the redshift distribution, i.e. without passing the --dn_dzdm flag. This is probably not needed, since in either case the realizations would differ in the number of objects, and therefore all the random calls made by the code would be affected. So I think it can be removed if requested.
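The save-and-restore pattern discussed in this thread can be sketched as follows. This is a minimal illustration using the legacy global NumPy state, not the quickquasars code; the function name and the dedicated selection seed are assumptions made for the example.

```python
import numpy as np


def select_without_disturbing_global_state(n, fraction, selection_seed):
    """Draw a random selection mask, then restore the global random state."""
    saved_state = np.random.get_state()   # remember where the global stream is
    np.random.seed(selection_seed)        # hypothetical seed for the selection
    mask = np.random.uniform(size=n) < fraction
    np.random.set_state(saved_state)      # later draws continue as if untouched
    return mask


np.random.seed(42)
expected_next = np.random.uniform(size=3)  # what the stream yields with no selection

np.random.seed(42)
mask = select_without_disturbing_global_state(1000, 0.3, selection_seed=7)
actual_next = np.random.uniform(size=3)    # same values, because the state was restored
```

As the reply points out, restoring the state only keeps later draws identical when the number of objects is unchanged, which is not the case once the catalog has been downsampled.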

p-slash · 2024-01-03T17:25:39Z

py/desisim/survey_release.py

+        mask_footprint=desimodel.footprint.is_point_in_desi(tiles,mock['RA'],mock['DEC'])
+        mock=mock[mask_footprint]
+        log.info(f"Keeping {sum(mask_footprint)}  mock QSOs in footprint TILES")
+        if release is not None:


Better to write if release == "iron": ... else: raise NotImplementedError

p-slash · 2024-01-03T17:26:24Z

py/desisim/survey_release.py

+            log.info(f"Downsampling by NPASSES fraction in {release} release")
+            # TODO: Implement this in desimodel instead of local path
+            pixmap=Table.read('/global/cfs/cdirs/desi/users/hiramk/desi/quickquasars/sampling_tests/npass_pixmap.fits')
+            mock_pixels = hp.ang2pix(1024, np.radians(90-mock['DEC']),np.radians(mock['RA']),nest=True)


Why are you using nside=1024? That is very high resolution
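For scale, the pixel count and approximate pixel size at a given nside follow from the standard HEALPix relation npix = 12 * nside**2 with equal-area pixels. The sketch below uses only the standard library; the function name is illustrative.

```python
import math


def healpix_pixel_stats(nside):
    """Return (number of pixels, approximate pixel size in arcmin)."""
    npix = 12 * nside**2
    pixel_area_sr = 4.0 * math.pi / npix          # equal-area pixels over the sphere
    resolution_arcmin = math.degrees(math.sqrt(pixel_area_sr)) * 60.0
    return npix, resolution_arcmin


npix_1024, res_1024 = healpix_pixel_stats(1024)   # 12,582,912 pixels
npix_64, res_64 = healpix_pixel_stats(64)         # 49,152 pixels
```

A full-sky map at nside=1024 therefore has about 12.6 million entries, each roughly 3.4 arcmin across.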

p-slash · 2024-01-03T17:27:54Z

py/desisim/survey_release.py

+            mock_pixels = hp.ang2pix(1024, np.radians(90-mock['DEC']),np.radians(mock['RA']),nest=True)
+            try:
+                data_pixels = hp.ang2pix(1024,np.radians(90-self.data['TARGET_DEC']),np.radians(self.data['TARGET_RA']),nest=True)
+            except: 


Not a good use of try-except. I recommend if 'TARGET_DEC' in self.data.dtype.names or similar

Also, never use naked except!
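A minimal sketch of the suggested if-based column check, using a NumPy structured array so that it runs without the DESI stack. The helper name get_ra_dec is hypothetical; the column names are the ones that appear in the diff.

```python
import numpy as np


def get_ra_dec(catalog):
    """Return (ra, dec) from a structured array with either naming convention.

    Hypothetical helper; it is not part of survey_release.py.
    """
    names = catalog.dtype.names
    if "TARGET_RA" in names and "TARGET_DEC" in names:
        return catalog["TARGET_RA"], catalog["TARGET_DEC"]
    if "RA" in names and "DEC" in names:
        return catalog["RA"], catalog["DEC"]
    raise KeyError("catalog has neither TARGET_RA/TARGET_DEC nor RA/DEC columns")


data = np.array([(10.0, -5.0), (20.0, 15.0)],
                dtype=[("TARGET_RA", "f8"), ("TARGET_DEC", "f8")])
mock = np.array([(30.0, 40.0)], dtype=[("RA", "f8"), ("DEC", "f8")])

data_ra, data_dec = get_ra_dec(data)
mock_ra, mock_dec = get_ra_dec(mock)
```

Unlike a bare except, this fails loudly with a specific error when neither pair of columns is present, and it cannot hide unrelated failures such as a wrong array shape. For an astropy Table the same test can be written against colnames.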

p-slash · 2024-01-03T17:29:58Z

py/desisim/survey_release.py

+            mock_pass_counts = np.bincount(mock_passes,minlength=8)
+            mock['NPASS'] = mock_passes
+            downsampling=np.divide(data_pass_counts,mock_pass_counts,out=np.zeros(8),where=mock_pass_counts>0)
+            rand = np.random.uniform(size=len(mock))


Better to have a seed here or leave a detailed comment why not.
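One way to make this downsampling reproducible is to pass an explicit seed to a local generator. The sketch below reuses the bincount and divide steps from the diff; the final selection rule (rand < fraction[npass]), the function name, and the toy inputs are assumptions made for illustration.

```python
import numpy as np


def downsample_by_npass(mock_passes, data_pass_counts, seed):
    """Boolean mask keeping mock objects so pass counts approach the data's.

    Illustrative only: how the fractions are applied is an assumption.
    """
    rng = np.random.default_rng(seed)              # explicit, reproducible stream
    mock_pass_counts = np.bincount(mock_passes, minlength=8)
    fraction = np.divide(data_pass_counts, mock_pass_counts,
                         out=np.zeros(8), where=mock_pass_counts > 0)
    rand = rng.uniform(size=len(mock_passes))
    return rand < fraction[mock_passes]


# Toy inputs: 1000 mock objects with NPASS in 1..7 and made-up data counts.
toy_rng = np.random.default_rng(0)
mock_passes = toy_rng.integers(1, 8, size=1000)
data_pass_counts = np.array([0, 50, 40, 30, 20, 10, 5, 1])

keep_a = downsample_by_npass(mock_passes, data_pass_counts, seed=123)
keep_b = downsample_by_npass(mock_passes, data_pass_counts, seed=123)
```

Calling the function twice with the same seed returns the same mask, which is what makes a mock realization repeatable.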

p-slash · 2024-01-03T17:33:59Z

py/desisim/survey_release.py

+            filename=os.path.join(os.path.dirname(desisim.__file__),'data/dn_dzdM_EDR.fits')
+            zcenters=fitsio.FITS(filename)['Z_CENTERS'][:]
+            dz = zcenters[1]-zcenters[0]
+            rmagcenters=fitsio.FITS(filename)['RMAG_CENTERS'][:]
+            dn_dzdm=fitsio.FITS(filename)['dn_dzdm'][:,:]


Should be written as
with fitsio.FITS(filename) as fts:
    zcenters = fts['Z_CENTERS'].read()
    rmagcenters = fts['RMAG_CENTERS'].read()
    dn_dzdm = fts['dn_dzdm'].read()
dz = zcenters[1] - zcenters[0]

p-slash · 2024-01-03T17:35:48Z

py/desisim/survey_release.py

+            if self.data is None:
+                raise ValueError("No data catalog was provided")
+            dz = 0.1
+            zbins = np.arange(0,10,dz)


It is not recommended to use np.arange with a non-integer step. Better use np.linspace
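A short sketch of the linspace alternative for the redshift bins in the diff. The variable names zmin, zmax and nbins are introduced here for illustration.

```python
import numpy as np

zmin, zmax, dz = 0.0, 10.0, 0.1
nbins = int(round((zmax - zmin) / dz))         # 100 bins of width dz

# Bin edges including the right-most edge at zmax (nbins + 1 values).
zbins = np.linspace(zmin, zmax, nbins + 1)
zcenters = 0.5 * (zbins[1:] + zbins[:-1])

# The left edges that np.arange(zmin, zmax, dz) is meant to produce,
# but with a length that does not depend on floating-point rounding.
left_edges = np.linspace(zmin, zmax, nbins, endpoint=False)
```

With linspace the number of elements is stated explicitly, whereas np.arange derives it from ceil((stop - start)/step), which can be off by one when the step is not exactly representable in floating point.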

p-slash · 2024-01-03T17:36:03Z

py/desisim/survey_release.py

+            dz = 0.1
+            zbins = np.arange(0,10,dz)
+            zcenters=0.5*(zbins[1:]+zbins[:-1])
+            rmagbins = np.arange(15,25,0.1)


Same here. Use np.linspace

p-slash · 2024-01-03T17:36:42Z

py/desisim/survey_release.py

+        for i,z_bin in enumerate(zcenters):
+            w_z = (self.mockcatalog['Z'] > z_bin-0.5*dz) & (self.mockcatalog['Z'] <= z_bin+0.5*dz)
+            if np.sum(w_z)==0: continue
+            rand = np.random.uniform(size=np.sum(w_z))


p-slash · 2024-01-03T17:37:25Z

py/desisim/survey_release.py

+            mock=self.mockcatalog
+            try:
+                data_pixels = hp.ang2pix(1024,np.radians(90-self.data['TARGET_DEC']),np.radians(self.data['TARGET_RA']),nest=True)
+            except: 


Bad use of try-except again. Use an if statement

Also, Never use naked except!

p-slash · 2024-01-03T17:40:37Z

py/desisim/survey_release.py

+                random_variable = rv_discrete(values=(np.arange(1,len(pdf)+1),pdf))
+                is_pass = self.mockcatalog['NPASS'] == tile_pass
+                exptime_mock[is_lya_mock&is_pass]=1000*random_variable.rvs(size=np.sum(is_pass&is_lya_mock))


Are you generating random variables? What is the seed?
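A seeded discrete draw can be sketched with a NumPy Generator, as one alternative to the rv_discrete call in the diff (scipy's rvs also accepts a random_state argument). The function name and the example probabilities are made up for illustration; the values 1..len(pdf) and the factor of 1000 follow the diff.

```python
import numpy as np


def draw_exptime(pdf, size, rng):
    """Draw 1000 times an integer in 1..len(pdf), weighted by pdf."""
    pdf = np.asarray(pdf, dtype=float)
    pdf = pdf / pdf.sum()                        # make sure probabilities sum to 1
    values = np.arange(1, len(pdf) + 1)
    return 1000 * rng.choice(values, size=size, p=pdf)


pdf = [0.5, 0.3, 0.2]                            # made-up probabilities
exptime_a = draw_exptime(pdf, size=10, rng=np.random.default_rng(2024))
exptime_b = draw_exptime(pdf, size=10, rng=np.random.default_rng(2024))
```

Passing the generator in as an argument keeps the seed in one place and lets the caller decide whether successive draws share a stream.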

p-slash · 2024-01-03T17:44:51Z

py/desisim/survey_release.py

+            if 'RMAG' in self.data.colnames:
+                dn_dzdm=np.histogram2d(self.data['Z'],self.data['RMAG'],bins=(zbins,rmagbins))[0]
+            elif 'FLUX_R' in self.data.colnames:
+                dn_dzdm=np.histogram2d(self.data['Z'],22.5-2.5*np.log10(self.data['FLUX_R']),bins=(zbins,rmagbins))[0]
+            else:
+                raise ValueError("No magnitude information in data catalog")


Write
if 'RMAG' in self.data.colnames:
    data_rmag = self.data['RMAG']
elif 'FLUX_R' in self.data.colnames:
    data_rmag = 22.5-2.5*np.log10(self.data['FLUX_R'])
else:
    raise KeyError("No magnitude information in data catalog")
dn_dzdm=np.histogram2d(self.data['Z'], data_rmag, bins=(zbins,rmagbins))[0]
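The flux-to-magnitude conversion used in the FLUX_R branch can be sketched on its own. This assumes the flux is in nanomaggies, which is what the 22.5 zero point implies; the function name and the NaN handling for non-positive fluxes are additions for illustration and are not in the diff.

```python
import numpy as np


def flux_to_mag(flux_nanomaggies):
    """Convert flux to magnitude with the 22.5 zero point used in the diff.

    Non-positive fluxes have no defined magnitude and are returned as NaN
    (an assumption of this sketch).
    """
    flux = np.asarray(flux_nanomaggies, dtype=float)
    mag = np.full(flux.shape, np.nan)
    positive = flux > 0
    mag[positive] = 22.5 - 2.5 * np.log10(flux[positive])
    return mag


rmag = flux_to_mag([1.0, 100.0, 0.0])
```

Note that np.histogram2d raises an error if the input contains NaN or infinite values when the range has to be inferred, and silently drops values outside explicit bin edges, so it may be worth checking how many objects have non-positive FLUX_R before building dn_dzdm.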

p-slash · 2024-01-03T17:45:48Z

py/desisim/survey_release.py

+            dist = Table.read(os.path.join(os.path.dirname(desisim.__file__),'data/redshift_dist_chaussidon2022.ecsv')) 
+            dz=dist['z'][1]-dist['z'][0]
+            factor=0.1/dz # Account for redshift bin width difference.
+            zbins = np.arange(0,10.1,0.1)


use np.linspace instead

p-slash · 2024-01-03T17:45:56Z

py/desisim/survey_release.py

+            w_z = (zcenters-0.5*dz > zmin) & (zcenters+0.5*dz <= zmax)
+            dndz=dndz[w_z]
+            zcenters=zcenters[w_z]
+            zbins = np.arange(zmin,zmax+dz,dz,)


np.linspace

p-slash · 2024-01-03T17:47:17Z

py/desisim/survey_release.py

+            dn_dzdr=fitsio.FITS(filename)[3][:,:]
+            dndz=np.sum(dn_dzdr,axis=1)
+
+            zcenters=fitsio.FITS(filename)[1][:]


Use a with statement. Read zcenters first.
with fitsio.FITS(filename) as fts:
    zcenters = fts[1].read()
    dn_dzdr = fts[3].read()

p-slash · 2024-01-03T17:48:07Z

py/desisim/survey_release.py

+        try: 
+            pixels = hp.ang2pix(nside, np.radians(90-catalog['DEC']),np.radians(catalog['RA']),nest=True)
+        except KeyError:
+            pixels = hp.ang2pix(nside, np.radians(90-catalog['TARGET_DEC']),np.radians(catalog['TARGET_RA']),nest=True)


Bad use of try-except again. Use an if statement

p-slash · 2024-01-03T17:51:01Z

py/desisim/survey_release.py

+        self.mockcatalog['TARGETID']=self.mockcatalog['MOCKID']
+        self.mockcatalog['Z']=self.mockcatalog['Z_QSO_RSD']
+        self.invert=invert
+        np.random.seed(seed)


Ah this is the seed. I recommend creating a member random engine rng_engine = np.random.default_rng(seed) and calling it to generate the random numbers you need. default_rng is recommended since numpy 1.17
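A minimal sketch of the member-generator pattern recommended here. SeededSampler is a toy class written for this example and is not the real SurveyRelease; only np.random.default_rng and Generator.uniform are actual NumPy API.

```python
import numpy as np


class SeededSampler:
    """Toy class that owns its random generator (not the real SurveyRelease)."""

    def __init__(self, seed=None):
        # One Generator per instance; no global np.random.seed call is needed.
        self.rng = np.random.default_rng(seed)

    def uniform(self, size):
        return self.rng.uniform(size=size)

    def downsample_mask(self, n, fraction):
        return self.rng.uniform(size=n) < fraction


first = SeededSampler(seed=1234)
second = SeededSampler(seed=1234)
draws_first = first.uniform(5)
draws_second = second.uniform(5)
```

Because each instance carries its own stream, two objects built with the same seed give identical draws, and nothing else in the process that uses np.random can shift them.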

p-slash · 2024-01-03T17:53:59Z

py/desisim/scripts/quickquasars.py

@@ -433,7 +489,8 @@ def simulate_one_healpix(ifilename,args,model,obsconditions,decam_and_wise_filte
            DZ_FOG = DZ_FOG[indices]
            nqso = args.nmax

-    if args.dn_dzdm is not None:
+    if args.dn_dzdm:
+        # TODO: Deprecate this option. It will be faster to do this on preprocessing.


You can add warnings.warn("dn_dzdm will be deprecated. It is faster to do this in preprocessing", DeprecationWarning)
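A self-contained sketch of how the suggested warning behaves. The wrapper function is hypothetical; the message text is the one proposed in the review.

```python
import warnings


def warn_dn_dzdm_deprecated():
    """Emit the deprecation warning suggested in the review."""
    warnings.warn(
        "dn_dzdm will be deprecated. It is faster to do this in preprocessing",
        DeprecationWarning,
        stacklevel=2,   # attribute the warning to the caller
    )


# DeprecationWarning is ignored by default unless triggered in __main__,
# so force it on here in order to capture it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_dn_dzdm_deprecated()

messages = [str(w.message) for w in caught]
```

Since Python hides DeprecationWarning by default for code outside __main__, users running the quickquasars script may not see it unless warnings are enabled; logging the same message is one alternative if visibility matters.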

p-slash

Sorry for the delay. I went through the code and left comments for best practices and recommended functions. They are minor changes but if unfixed they can introduce critical bugs. For example, a naked except can be really bad.

Also, why is the npass file so large?

julienguy · 2024-03-05T20:51:42Z

I am merging as is because this is the version used for KP6 mocks. We will then tag. BUT the comments can still be addressed by reopening this PR after we merged and tagged.

sbailey · 2024-03-05T21:48:20Z

The post-merge version with a docstring fix has been tagged as desisim 0.38.0. I also installed this at NERSC.

HiramHerrera added 11 commits March 1, 2023 13:13

QQ is now able to reproduce EDR survey stage

922276f

Added functionality to switch the metal coefficients used based on which type of raw mock is inputed

64147bf

Made it possible to use downsampling and dn_dzdm

40c25da

Fixed wrong Saclay metal value

34df81f

Updated --dn_dzdm flag to just boolean. Quickquasars is now capable of recieving a preprocessed catalog. Introduction of survey release module to produce Y1 mocks

e4d6cbe

Reworked SurveyRelease instance module. Added commentaries to quickquasars and implemented notebook to produce preprocessing mock catalogs to pass to quickquasars.

3e85457

Fixed documentation, and bug on redshift distribution target_selection method.

213a3aa

Refactor of metal strengths code.

891c35e

Deprecated --reproduce-survey flag. Added flag to use Y1 throughput.

56aa6e7

Updated SurveyRelease with invert option

225cdef

Added redshift distribution to desisim/data

02289d3

HiramHerrera added the enhancement label Nov 14, 2023

HiramHerrera requested review from andreufont, julienguy, p-slash and alxogm November 14, 2023 21:32

Small code fix for clarity

6fb7741

HiramHerrera added 5 commits November 15, 2023 14:06

Changed SurveyRelease __init__ function.

c6020a3

Made gen_qso_catalog executable

cd02e23

Fixed path to NPASS pixmap

63856f5

Updated mock catalog generation tutorial

9af008b

Removed solved commentaries

599de34

p-slash reviewed Dec 1, 2023


p-slash reviewed Jan 3, 2024


p-slash requested changes Jan 3, 2024


julienguy merged commit 69dc6e9 into desihub:main Mar 5, 2024
6 of 8 checks passed

HiramHerrera added a commit to HiramHerrera/desisim that referenced this pull request Mar 21, 2024

Addressed changes requested in PR desihub#578

284b3c0

HiramHerrera mentioned this pull request Mar 22, 2024

Pending changes requested on previous PR, from_catalog fix for Ohio mocks #580

Merged
