Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Abhijeet per camera prior gpu #255

Merged
merged 70 commits into from Jan 4, 2024
Merged

Abhijeet per camera prior gpu #255

merged 70 commits into from Jan 4, 2024

Conversation

abhi0395
Copy link
Member

@abhi0395 abhi0395 commented Nov 1, 2023

This branch combines the PCA with archetypes and a per-camera modelling approach. It is an important step in improving the current Redrock, particularly for galaxy spectral fitting and estimating their redshifts. Thanks to @craigwarner-ufastro for importing some of the time-consuming steps to GPUs; it has made a significant difference.
@craigwarner-ufastro I merged your branch locally and added prior option to it so this branch includes all your GPU updates to the archetype method. So please feel free to close your PR.

In the current form, redrock occasionally yields nonphysical model fits for galaxies and struggles to account for errors introduced by the spectral reduction pipeline adequately. To address this persistent challenge, our proposed approach combines redrock with galaxy archetypes and introduces per-camera modelling for spectral fitting and redshift estimation.

I have run extensive tests on many tiles, including SV3 visual tiles, repeat observations and large test runs on iron data and compared them with results from iron data reduction redrock. The method has shown several important improvements over redrock, such as mitigating non-physical/negative emission line fitting. It also reduces the number of sky fibers having very robust redshifts by a significant fraction (25-50%) and reduces the spurious clustering of redshifts in the redshift vs fiber plane without changing any quality cuts that redrock currently uses. It also shows a slight increase in the redshift success rate for most target classes.

There are a few new options added to rrdesi:

  • --archetypes: file/directory containing the archetypes
  • --per_camera: to do the spectral fitting in each camera (i.e. b, r, z)
  • -deg_legendre: number of Legendre polynomials to be used in archetype mode
  • --nminima: number of redshifts on which archetype model should be implemented to estimate the final redshifts
  • --prior_sigma: variance to be added in the final linear equation before solving for the coefficients

The new archetypes are saved in /global/cfs/cdirs/desi/users/abhijeet/new-archetypes/

The most important bash commands to run rrdesi with archetypes are detailed in the README file so anyone can run and analyze the outputs. For the current test runs, we have used the following bash commands:

Without priors:
rrdesi -i <spectra_file> --archetypes <archetype_dir or archetype_file> -o <output_file> -d <details_file.h5> -deg_legendre 2 --nminima 9 --per-camera

With priors:

rrdesi -i <spectra_file> --archetypes <archetype_dir or archetype_file> -o <output_file> -d <details_file.h5> -deg_legendre 2 --nminima 9 --per-camera --prior_sigma 0.1

The main directory ($ABHIJEET_TEST_DIR) where test results are saved:
$ABHIJEET_TEST_DIR=/global/cfs/cdirs/desi/users/abhijeet/large_test_run

  • Results without prior are in : REF_DIR = ${ABHIJEET_TEST_DIR}/no_prior_runs

  • Results with prior are in : REF_DIR = ${ABHIJEET_TEST_DIR}/prior_runs

The subdirectory structures are the following (in both directories):

  • visual-tiles: ${REF_DIR}/visual_tiles
  • pernight tiles (repeat observations): ${REF_DIR}/pernight
  • iron tiles (large runs from iron data nights): ${REF_DIR}/iron_nights

For comparison, the iron data reduction with redrock is at: /global/cfs/cdirs/desi/spectro/redux/iron/tiles/cumulative

Many thanks to @julienguy, @sbailey, John and members of the DESI data team for their help, suggestions and feedback on the progress.

@coveralls
Copy link

coveralls commented Nov 1, 2023

Coverage Status

coverage: 33.313% (-2.2%) from 35.501%
when pulling 14bebc6 on abhijeet_per_camera_prior_gpu
into ad32b87 on main.

@abhi0395 abhi0395 closed this Nov 1, 2023
@abhi0395 abhi0395 reopened this Nov 1, 2023
@abhi0395 abhi0395 marked this pull request as ready for review November 1, 2023 23:12
Copy link
Collaborator

@sbailey sbailey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When running without archetypes, this branch runs in the same time as current main and produces the same output, except for changing the SUBTYPE size from 20 to 32 characters. Is that change necessary or leftover from something else?

When running with archetypes without legendre+priors, the overall runtime is ~40% slower, which is unfortunate but could be acceptable (spending previous GPU speedups to enable better algorithms).

When running with archetypes and legendre+priors, the runtime becomes 12x slower (!). That doesn't have to be a blocking factor for merging, but it is effectively a blocking factor for using those options, so we'll need to revisit that.

I put various comments inline. Most are minor and/or apply only to archetypes, but I would like to standardize / dis-ambiguate the option names before merging.

I think we should also resolve + merge #259 to get various GPU changes into this branch before merging this branch into main.

Heads up: I also haven't reviewed the output formats yet. I think we should more cleanly separate the coefficients from the original PCA (or NMF) scan from the archetype coefficients from the legendre coefficients, but I don't have a specific suggestion yet. That might be a post-merge update. Thinking.

py/redrock/archetypes.py Show resolved Hide resolved
py/redrock/external/desi.py Outdated Show resolved Hide resolved
py/redrock/fitz.py Outdated Show resolved Hide resolved
py/redrock/utils.py Show resolved Hide resolved
py/redrock/zfind.py Outdated Show resolved Hide resolved
py/redrock/zfind.py Outdated Show resolved Hide resolved
@sbailey
Copy link
Collaborator

sbailey commented Nov 28, 2023

Looking at /global/cfs/cdirs/desi/users/abhijeet/new-archetypes/rrarchetype-galaxy.fits, there is an unnamed HDU 2 with a binary table that has units and special characters in the column names:

  column info:
    LOGMSTAR [M_sol]
                        f8  
    LOGSSFR [yr^-1]     f8  
    AV_ISM [mag]        f8  
    TEMPLATEID          i4  
    SUBTYPE            S23  

Although that is technically valid, it is somewhat unusual, and would probably be better to have column names like "LOGMSTAR", "LOGSSFR", and "AV_ISM" and put the units in TUNITnn keywords or in the comments. Please also add an EXTNAME to HDU 2.

…et_per_camera_prior_gpu. Added prior as optional arg to calc_zchi2_batch to do this.
@abhi0395
Copy link
Member Author

abhi0395 commented Dec 4, 2023

@sbailey
#255 (comment)

Thanks for the suggestions. I have updated the file now.

craigwarner-ufastro and others added 11 commits December 4, 2023 17:48
Added always_return_array arg to transmission_Lyman that is True by default.
Changed calls to transmission_Lyman throughout redrock in archetypes, fitz,
templates to call with always_return_array=False to optimize because no need
to generate additional arrays of all 1 and multiply by them.
If always_return_array is True, an array of all ones will be returned in this
case instead of None.
@sbailey sbailey merged commit fc6e099 into main Jan 4, 2024
10 of 12 checks passed
@sbailey sbailey deleted the abhijeet_per_camera_prior_gpu branch January 4, 2024 01:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants