Abhijeet per camera prior gpu #255

abhi0395 · 2023-11-01T18:45:37Z

This branch combines PCA with archetypes and a per-camera modelling approach. It is an important step in improving Redrock, particularly for galaxy spectral fitting and redshift estimation. Thanks to @craigwarner-ufastro for porting some of the time-consuming steps to GPUs; it has made a significant difference.
@craigwarner-ufastro I merged your branch locally and added the prior option to it, so this branch includes all your GPU updates to the archetype method. Please feel free to close your PR.

In its current form, redrock occasionally yields nonphysical model fits for galaxies and struggles to adequately account for errors introduced by the spectral reduction pipeline. To address this persistent challenge, the proposed approach combines redrock with galaxy archetypes and introduces per-camera modelling for spectral fitting and redshift estimation.

I have run extensive tests on many tiles, including SV3 visual tiles, repeat observations, and large test runs on iron data, and compared them with the redrock results from the iron data reduction. The method shows several important improvements over redrock, such as mitigating nonphysical/negative emission-line fits. Without changing any of the quality cuts that redrock currently uses, it also reduces the number of sky fibers assigned very robust redshifts by 25-50% and reduces the spurious clustering of redshifts in the redshift vs fiber plane. It also slightly increases the redshift success rate for most target classes.

There are a few new options added to rrdesi:

--archetypes: file/directory containing the archetypes
--per-camera: to do the spectral fitting in each camera (i.e. b, r, z)
-deg_legendre: number of Legendre polynomials to be used in archetype mode
--nminima: number of redshifts on which archetype model should be implemented to estimate the final redshifts
--prior_sigma: variance to be added in the final linear equation before solving for the coefficients
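
To illustrate what a prior on the fit coefficients can do, here is a minimal sketch of a two-parameter weighted least-squares fit (one template amplitude plus a constant term standing in for a degree-0 Legendre polynomial). It is not redrock's implementation: the function name is hypothetical, and the choice to add 1/sigma^2 to the diagonal entry of the polynomial term is an assumption about how such a prior is commonly applied.

```python
def solve_with_prior(template, flux, ivar, prior_sigma=None):
    """Fit flux ~ a*template + b by weighted least squares.

    If prior_sigma is given, a zero-mean Gaussian prior of that width is
    applied to the constant term b only (ASSUMPTION, for illustration).
    """
    # Accumulate the 2x2 normal equations M x = y for x = (a, b).
    m00 = m01 = m11 = y0 = y1 = 0.0
    for t, f, w in zip(template, flux, ivar):
        m00 += w * t * t
        m01 += w * t
        m11 += w
        y0 += w * t * f
        y1 += w * f

    # ASSUMPTION: the prior enters as 1/sigma^2 on the diagonal entry of
    # the polynomial term; the template amplitude is left unconstrained.
    if prior_sigma is not None:
        m11 += 1.0 / prior_sigma**2

    # Solve the 2x2 system with Cramer's rule.
    det = m00 * m11 - m01 * m01
    a = (y0 * m11 - m01 * y1) / det
    b = (m00 * y1 - m01 * y0) / det
    return a, b


template = [1.0, 2.0, 3.0, 4.0]
flux = [3.0, 5.0, 7.0, 9.0]      # exactly 2*template + 1
ivar = [1.0, 1.0, 1.0, 1.0]

a_free, b_free = solve_with_prior(template, flux, ivar)
a_prior, b_prior = solve_with_prior(template, flux, ivar, prior_sigma=0.1)
print(a_free, b_free)
print(a_prior, b_prior)
```

In this toy case a small prior_sigma pulls the constant term toward zero, which is the general mechanism by which a prior can stop polynomial terms from absorbing real spectral features.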

The new archetypes are saved in /global/cfs/cdirs/desi/users/abhijeet/new-archetypes/

The most important bash commands to run rrdesi with archetypes are detailed in the README file so anyone can run and analyze the outputs. For the current test runs, we have used the following bash commands:

Without priors:
rrdesi -i <spectra_file> --archetypes <archetype_dir or archetype_file> -o <output_file> -d <details_file.h5> -deg_legendre 2 --nminima 9 --per-camera

With priors:

rrdesi -i <spectra_file> --archetypes <archetype_dir or archetype_file> -o <output_file> -d <details_file.h5> -deg_legendre 2 --nminima 9 --per-camera --prior_sigma 0.1
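
For scripted runs, the two command lines above can be assembled programmatically. The sketch below only builds the argument list, using the flags exactly as written in this description; the helper name and the example file names are hypothetical.

```python
import shlex


def rrdesi_command(spectra, archetypes, output, details,
                   deg_legendre=2, nminima=9, prior_sigma=None):
    """Build the rrdesi argument list used for the test runs in this PR."""
    cmd = [
        "rrdesi", "-i", spectra,
        "--archetypes", archetypes,
        "-o", output,
        "-d", details,
        "-deg_legendre", str(deg_legendre),
        "--nminima", str(nminima),
        "--per-camera",
    ]
    # The prior is optional: omit the flag entirely for the no-prior runs.
    if prior_sigma is not None:
        cmd += ["--prior_sigma", str(prior_sigma)]
    return cmd


# Hypothetical file names, for illustration only.
no_prior = rrdesi_command("coadd.fits", "archetypes/", "redrock.fits", "rrdetails.h5")
with_prior = rrdesi_command("coadd.fits", "archetypes/", "redrock.fits", "rrdetails.h5",
                            prior_sigma=0.1)
print(shlex.join(with_prior))
```

On a system where redrock is installed, the resulting list could be passed to subprocess.run(cmd, check=True).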

The main directory (${ABHIJEET_TEST_DIR}) where test results are saved:
ABHIJEET_TEST_DIR=/global/cfs/cdirs/desi/users/abhijeet/large_test_run

Results without prior are in: REF_DIR=${ABHIJEET_TEST_DIR}/no_prior_runs
Results with prior are in: REF_DIR=${ABHIJEET_TEST_DIR}/prior_runs

The subdirectory structures are the following (in both directories):

visual-tiles: ${REF_DIR}/visual_tiles
pernight tiles (repeat observations): ${REF_DIR}/pernight
iron tiles (large runs from iron data nights): ${REF_DIR}/iron_nights
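
When comparing runs from a script, the layout above can be captured in a small helper. This is only a convenience sketch; the function name and the short keys are hypothetical, while the directory names are those listed above.

```python
from pathlib import PurePosixPath

TEST_DIR = PurePosixPath("/global/cfs/cdirs/desi/users/abhijeet/large_test_run")

# Short keys (hypothetical) mapped to the subdirectory names listed above.
SUBDIRS = {
    "visual": "visual_tiles",
    "pernight": "pernight",
    "iron": "iron_nights",
}


def result_dir(kind, with_prior):
    """Return the directory holding one class of test results."""
    ref_dir = TEST_DIR / ("prior_runs" if with_prior else "no_prior_runs")
    return ref_dir / SUBDIRS[kind]


print(result_dir("iron", with_prior=True))
print(result_dir("visual", with_prior=False))
```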

For comparison, the iron data reduction with redrock is at: /global/cfs/cdirs/desi/spectro/redux/iron/tiles/cumulative

Many thanks to @julienguy, @sbailey, John and members of the DESI data team for their help, suggestions and feedback on the progress.

coveralls · 2023-11-01T18:46:59Z

coverage: 33.313% (-2.2%) from 35.501%
when pulling 14bebc6 on abhijeet_per_camera_prior_gpu
into ad32b87 on main.


sbailey

When running without archetypes, this branch runs in the same time as current main and produces the same output, except for changing the SUBTYPE size from 20 to 32 characters. Is that change necessary or leftover from something else?

When running with archetypes without legendre+priors, the overall runtime is ~40% slower, which is unfortunate but could be acceptable (spending previous GPU speedups to enable better algorithms).

When running with archetypes and legendre+priors, the runtime becomes 12x slower (!). That doesn't have to be a blocking factor for merging, but it is effectively a blocking factor for using those options, so we'll need to revisit that.

I put various comments inline. Most are minor and/or apply only to archetypes, but I would like to standardize / disambiguate the option names before merging.

I think we should also resolve + merge #259 to get various GPU changes into this branch before merging this branch into main.

Heads up: I also haven't reviewed the output formats yet. I think we should more cleanly separate three sets of coefficients: those from the original PCA (or NMF) scan, the archetype coefficients, and the Legendre coefficients. I don't have a specific suggestion yet. That might be a post-merge update. Thinking.

py/redrock/archetypes.py

py/redrock/external/desi.py

py/redrock/fitz.py

py/redrock/utils.py

py/redrock/zfind.py

sbailey · 2023-11-28T23:06:01Z

Looking at /global/cfs/cdirs/desi/users/abhijeet/new-archetypes/rrarchetype-galaxy.fits, there is an unnamed HDU 2 with a binary table that has units and special characters in the column names:

  column info:
    LOGMSTAR [M_sol]
                        f8  
    LOGSSFR [yr^-1]     f8  
    AV_ISM [mag]        f8  
    TEMPLATEID          i4  
    SUBTYPE            S23

Although that is technically valid, it is somewhat unusual; it would probably be better to have column names like "LOGMSTAR", "LOGSSFR", and "AV_ISM" and to put the units in TUNITnn keywords or in the comments. Please also add an EXTNAME to HDU 2.
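
As a rough sketch of the suggested cleanup, the column names shown above can be split into a bare name and a unit string, which would then map onto TTYPEn and TUNITn header keywords. The helper names are hypothetical and the snippet does not write a FITS file; it only shows the name/unit separation.

```python
import re

# A bare column name, optionally followed by a unit in square brackets.
_COLUMN = re.compile(r"^\s*(?P<name>[A-Za-z0-9_]+)\s*(?:\[(?P<unit>[^\]]*)\])?\s*$")


def split_name_and_unit(column):
    """Split 'LOGMSTAR [M_sol]' into ('LOGMSTAR', 'M_sol'); unit may be None."""
    match = _COLUMN.match(column)
    if match is None:
        raise ValueError(f"unrecognized column name: {column!r}")
    return match.group("name"), match.group("unit")


def header_cards(columns):
    """Return (keyword, value) pairs with names in TTYPEn and units in TUNITn."""
    cards = []
    for i, column in enumerate(columns, start=1):
        name, unit = split_name_and_unit(column)
        cards.append((f"TTYPE{i}", name))
        if unit:
            cards.append((f"TUNIT{i}", unit))
    return cards


columns = ["LOGMSTAR [M_sol]", "LOGSSFR [yr^-1]", "AV_ISM [mag]",
           "TEMPLATEID", "SUBTYPE"]
cards = header_cards(columns)
for keyword, value in cards:
    print(f"{keyword:8s} = {value}")
```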


abhi0395 · 2023-12-04T20:34:21Z

@sbailey
#255 (comment)

Thanks for the suggestions. I have updated the file now.

Added an always_return_array argument to transmission_Lyman, True by default. Calls to transmission_Lyman throughout redrock (archetypes, fitz, templates) now pass always_return_array=False as an optimization: when no transmission correction applies, the function returns None, so there is no need to generate an array of ones and multiply by it. With always_return_array=True, an array of ones is returned in that case instead of None.
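
The pattern described in this commit message can be sketched as follows. This is a toy stand-in, not redrock's transmission_Lyman: the function names are hypothetical, plain lists replace arrays, and the 0.5 attenuation is a placeholder rather than a physical model. Only the None-versus-ones behaviour mirrors the description.

```python
LYA_REST = 1215.67  # Lyman-alpha rest wavelength in Angstrom


def transmission_sketch(z, wave, always_return_array=True):
    """Toy stand-in for a Lyman transmission function (NOT redrock's code)."""
    edge = LYA_REST * (1.0 + z)
    if min(wave) > edge:
        # No pixel lies blueward of Lyman-alpha: transmission is identically 1.
        return [1.0] * len(wave) if always_return_array else None
    # PLACEHOLDER attenuation blueward of the edge, for illustration only.
    return [0.5 if w < edge else 1.0 for w in wave]


def apply_transmission(model, z, wave):
    """Multiply the model by the transmission only when one is needed."""
    t = transmission_sketch(z, wave, always_return_array=False)
    if t is None:
        return model  # skip allocating and multiplying by an array of ones
    return [m * ti for m, ti in zip(model, t)]


wave = [3600.0, 5000.0]
model = [2.0, 2.0]
low_z = apply_transmission(model, 0.5, wave)    # edge at ~1823.5 A, no effect
high_z = apply_transmission(model, 2.5, wave)   # edge at ~4254.8 A
print(low_z, high_z)
```

Callers that opt in to None must check for it before multiplying, which is why the default keeps returning an array.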

abhi0395 added 30 commits July 11, 2023 17:22

per camera fitting approach added in archetype mode

347c1b8

arguments added for archetype mode

fef9f24

syntax error removed

f40ba30

updated for archetype mode

cd13648

added archetype in per camera mode

377be48

print statement added

6b79e6f

print removed

4bb4eae

per camera mode corrected for same archetype

b5369e3

print statements removed

b2016f7

made easier for readers

623855a

nearest archetype method added

97bd843

print statement added

813615a

cleaned up

9caa4f5

print removed

fa89b46

cleaned up

0c29dbe

print statement removed

6d84ee3

print statements removed

e67af26

nearest archetypes approach added

664ec12

normal archetypes

3bf2a12

new archetype method with --per-camera option

e0d0a66

updated

0c7651f

removed unused file

20639b3

more details added

bd9953c

more details added

29a3e21

more details added

3281546

typo corrected

b58408e

cleaned up scripts

930286e

indent corrected

5ff5d17

indentation corrected

55b04ef

indentation corrected

a56874a

abhi0395 closed this Nov 1, 2023

abhi0395 reopened this Nov 1, 2023

Update README.rst

af37f26

abhi0395 marked this pull request as ready for review November 1, 2023 23:12

abhi0395 and others added 4 commits November 2, 2023 11:58

Merge branch 'abhijeet_per_camera' into abhijeet_per_camera_gpu

0cec7ee

nearest nbh argument corrected

ced3c01

nearest nbh argument corrected

c1528e1

Added NNLS as an option to solve_matrices in zscan.

a4b8b80

Rolled back changes by Abhijeet done while attempting to merge to restore per camera batch method.

sbailey mentioned this pull request Nov 28, 2023

Merge Craig's GPU changes into abhijeet_per_camera_prior_gpu branch #259

Merged

sbailey requested changes Nov 28, 2023


Merged abhijeet_per_camera_gpu and abhijeet_per_camera_gpu and abhije…

bd5983a

…et_per_camera_prior_gpu. Added prior as optional arg to calc_zchi2_batch to do this.

craigwarner-ufastro and others added 11 commits December 4, 2023 17:48

args added for default archetype values, removed hard coded ncamera

c679027

cleaned up rrdesi

6487a3f

cleaned up rrdesi

f7fc3df

Update README.rst

3a8ee19

Update README.rst

4784a04

Update README.rst

3a8e5cd

fix crash in gpu mode with archetype legendre priors

8133f74

Merge branch 'main' into abhijeet_per_camera_prior_gpu

00d89ba

added archetypes-no-legendre flag

f7b907e

only print archetype info on rank 0 and if using archetypes

14bebc6

sbailey approved these changes Jan 4, 2024


sbailey merged commit fc6e099 into main Jan 4, 2024
10 of 12 checks passed

sbailey deleted the abhijeet_per_camera_prior_gpu branch January 4, 2024 01:04

This was referenced Jan 4, 2024

Abhijeet per camera gpu #253

Closed

adjust coeff size for legendre only for archetypes #266

Merged
