Abhijeet per camera prior gpu #255

Merged Jan 4, 2024 (70 commits)

Commits on Jul 12, 2023

  1. 347c1b8
  2. fef9f24
  3. syntax error removed

    abhi0395 committed Jul 12, 2023
    f40ba30
  4. updated for archetype mode

    abhi0395 committed Jul 12, 2023
    cd13648
  5. 377be48
  6. print statement added

    abhi0395 committed Jul 12, 2023
    6b79e6f
  7. print removed

    abhi0395 committed Jul 12, 2023
    4bb4eae

Commits on Aug 2, 2023

  1. b5369e3

Commits on Aug 4, 2023

  1. print statements removed

    abhi0395 committed Aug 4, 2023
    b2016f7

Commits on Aug 7, 2023

  1. made easier for readers

    abhi0395 committed Aug 7, 2023
    623855a

Commits on Aug 9, 2023

  1. 97bd843

Commits on Aug 14, 2023

  1. print statement added

    abhi0395 committed Aug 14, 2023
    813615a

Commits on Aug 15, 2023

  1. cleaned up

    abhi0395 committed Aug 15, 2023
    9caa4f5
  2. print removed

    abhi0395 committed Aug 15, 2023
    fa89b46

Commits on Aug 16, 2023

  1. cleaned up

    abhi0395 committed Aug 16, 2023
    0c29dbe

Commits on Aug 17, 2023

  1. print statement removed

    abhi0395 committed Aug 17, 2023
    6d84ee3

Commits on Aug 18, 2023

  1. print statements removed

    abhi0395 committed Aug 18, 2023
    e67af26
  2. 664ec12

Commits on Aug 25, 2023

  1. normal archetypes

    abhi0395 committed Aug 25, 2023
    3bf2a12

Commits on Aug 26, 2023

  1. e0d0a66
  2. updated

    abhi0395 committed Aug 26, 2023
    0c7651f
  3. removed unused file

    abhi0395 committed Aug 26, 2023
    20639b3
  4. more details added

    abhi0395 committed Aug 26, 2023
    bd9953c
  5. more details added

    abhi0395 committed Aug 26, 2023
    29a3e21
  6. more details added

    abhi0395 committed Aug 26, 2023
    3281546
  7. typo corrected

    abhi0395 committed Aug 26, 2023
    b58408e

Commits on Aug 29, 2023

  1. cleaned up scripts

    abhi0395 committed Aug 29, 2023
    930286e
  2. indent corrected

    abhi0395 committed Aug 29, 2023
    5ff5d17
  3. indentation corrected

    abhi0395 committed Aug 29, 2023
    55b04ef
  4. indentation corrected

    abhi0395 committed Aug 29, 2023
    a56874a
  5. indentation corrected

    abhi0395 committed Aug 29, 2023
    8d62c16
  6. bf33257

Commits on Sep 1, 2023

  1. some time details added

    abhi0395 committed Sep 1, 2023
    d0395f5

Commits on Sep 15, 2023

  1. print statement added

    abhi0395 committed Sep 15, 2023
    3c46911

Commits on Sep 19, 2023

  1. GPU-accelerated archetypes.get_best_archetype().

    Use batch rebin, transmission_Lyman, and calc_zchi2 operations.
    
    Speed gains: without archetypes, redrock on 4 GPU / 4 CPU takes 14.8s
    reported total run time, 7.3s of which is in the fine redshift scan.
    Comparatively with 64 CPU and 0 GPU the base redrock runs in 40.0s
    reported total run time with 6.7s spent in fine redshift scan.
    
    Adding the base archetypes option (without per-camera or nearest
    neighbor) raises CPU times to 63.8s overall and 28.1s in fine z scan so
    about 60% increase overall and about 4x slower in fine z scan.
    
    With the new code the "batch" CPU mode slightly improves this to 60.0s
    and 24.2s.
    
    With the new GPU code, it runs on 4 GPU / 4 CPU in 22.8s total and 14.3s
    for the fine z scan: about a 50% overall increase but only a 2x slowdown
    in the fine z scan, a big improvement over the CPU times.
    
    Also updated transmission_Lyman to return None when given a scalar z, to
    match its behavior when given an array of redshifts, in the case where
    the wavelength range is not affected by Lyman transmission.  There is no
    need in this case to calculate an array of all ones and then multiply
    the rebinned data by it.
    
    The --per-camera and -n_nearest options have yet to be updated, so for
    now there are placeholders that simply loop over the existing CPU-mode
    logic so that nothing crashes.  These will be updated shortly.
    craigwarner-ufastro committed Sep 19, 2023
    055aa12
  2. nz = 15 line in fitz.py must have been deleted by accident on the abhijeet_per_camera branch, and this broke a couple of unit tests.

    Restored.
    craigwarner-ufastro committed Sep 19, 2023
    0397d18
  3. nz added

    abhi0395 committed Sep 19, 2023
    2e0bc04
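
The percentages quoted in commit 055aa12 can be rechecked from the timings it reports. The sketch below only redoes that arithmetic; the dictionary keys are labels made up for this illustration (they are not redrock option names), and the numbers are copied from the commit message.

```python
# Timings in seconds, copied from the 055aa12 commit message.
# The keys are illustrative labels, not redrock options.
timings = {
    "gpu_base":       {"total": 14.8, "fine_z": 7.3},   # 4 GPU / 4 CPU, no archetypes
    "cpu_base":       {"total": 40.0, "fine_z": 6.7},   # 64 CPU, no archetypes
    "cpu_archetypes": {"total": 63.8, "fine_z": 28.1},  # 64 CPU, archetypes
    "cpu_batch":      {"total": 60.0, "fine_z": 24.2},  # 64 CPU, archetypes, "batch" mode
    "gpu_archetypes": {"total": 22.8, "fine_z": 14.3},  # 4 GPU / 4 CPU, archetypes
}


def ratio(new, old, key):
    """Return how many times longer `new` takes than `old` for the given timer."""
    return timings[new][key] / timings[old][key]


# Cost of switching archetypes on, relative to the matching base run.
cpu_total = ratio("cpu_archetypes", "cpu_base", "total")    # roughly 1.6x
cpu_fine = ratio("cpu_archetypes", "cpu_base", "fine_z")    # roughly 4.2x
gpu_total = ratio("gpu_archetypes", "gpu_base", "total")    # roughly 1.5x
gpu_fine = ratio("gpu_archetypes", "gpu_base", "fine_z")    # roughly 2.0x

# GPU run versus CPU run, both with archetypes enabled.
gpu_vs_cpu = ratio("cpu_archetypes", "gpu_archetypes", "total")  # roughly 2.8x

for name, value in [("CPU total", cpu_total), ("CPU fine z", cpu_fine),
                    ("GPU total", gpu_total), ("GPU fine z", gpu_fine),
                    ("CPU/GPU total with archetypes", gpu_vs_cpu)]:
    print(f"{name}: {value:.2f}x")
```

Read this way, the "2x" figure for the GPU fine z scan is a slowdown relative to the GPU run without archetypes, not a speed-up.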

Commits on Sep 22, 2023

  1. prior option added

    abhi0395 committed Sep 22, 2023
    846f2ba

Commits on Sep 25, 2023

  1. prior added

    abhi0395 committed Sep 25, 2023
    201274e

Commits on Sep 26, 2023

  1. * One possible bug remains that I have been unable to track down but have concluded is unrelated to the modification in this PR.
    
    Steps to reproduce:
    srun -n 4 -c 2 --gpu-bind=map_gpu:3,2,1,0 rrdesi_mpi --gpu --max-gpuprocs 4 -n_nearest 4 --archetypes new-archetypes/ -i $CFS/desi/spectro/redux/fuji/tiles/cumulative/100/20210505/coadd-0-100-thru20210505.fits     -o $SCRATCH/abhijeet.fits
    srun -n 64 -c 2 rrdesi_mpi -n_nearest 4 --archetypes new-archetypes/ -i $CFS/desi/spectro/redux/fuji/tiles/cumulative/100/20210505/coadd-0-100-thru20210505.fits     -o $SCRATCH/abhijeet2.fits
    These two results will be np.allclose() but not equal, which is as expected.  The CPU version will be equal to Abhijeet's original code, also as expected.
    
    Now rerun with -n_nearest 5: the CPU version is still equal to Abhijeet's original code, but a small number (~10) of zzchi2 and zzcoeff values differ between GPU and CPU by more than the np.allclose tolerance (by as much as 3.0).  Going back to Abhijeet's original nearest_neighbor_model code and running that method on the CPU with the GPU get_best_archetype, the same small differences are still there, despite the fact that the output is np.allclose when not using -n_nearest (or even when using -n_nearest 4).
    
    Most, possibly all, of the differences seem to be in QSO templates.  But the trans array shows no difference: as far as I've been able to tell, we are sending the same input tdata to calc_zchi2_one on the CPU and getting a slightly different chi2, but only for -n_nearest == 5.
    
    However, since this is independent of the code changes in the current PR (I get the exact same differences in this version as when rolling back to the previous one), it makes sense to proceed with this PR.
    
    ----------------------
    
    - Added legendre function to Target class so that legendre of a certain degree
    can be calculated (and optionally copied to GPU) once without additional
    overhead.
    
    - Added default value of nz=15 for fitz()
    
    - In fitz() use Target.legendre to calculate legendre.  Pass target object instead of spectra to get_best_archetype, which allows for simplification as spectra, gpuweights, gpuflux, etc are all members of Target class.  Store trans dict and pass that to get_best_archetype to eliminate need to re-calculate transmission for the same wavelength regime.
    
    - In archetypes, added properties gpuwave and gpuflux to copy and cache data on the GPU.  These are then used in rebin_template_batch.
    
    - In get_best_archetype, pass target instead of spectra, which allows for elimination of dedges and legendre as args.  Get spectra, gpuweights, gpuflux, gpuwflux, dedges, and legendre directly from the target object.  Copying trans and using Target.legendre reduce runtime by about 1s on 4 GPUs.
    
    - Vectorized and GPUized nearest_neighbor_model.  Instead of just passing trans to it, get_best_archetype now passes the binned dict, which already has the rebinned flux multiplied by trans.  Using Target.legendre also saves time.  Since the tdata arrays are small (nbasis of a few), it is faster to keep the calc_zchi2_batch operations on the CPU, as in fitz().
    
    Timing notes: adding nearest_neighbor_model is a bigger hit on the GPU than on the CPU, but since get_best_archetype is so much faster on the GPU, the combined time is still a good speed-up.
    craigwarner-ufastro committed Sep 26, 2023
    6ece174
  2. Unit test failed again, complaining that nz is None.

    This makes no sense, as nz should now default to 15, but adding the
    line "if nz is None: nz = 15" solves the issue.
    craigwarner-ufastro committed Sep 26, 2023
    90ec0a2
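
One way the nz behavior in commit 90ec0a2 can arise in Python, offered as a possible explanation rather than a confirmed diagnosis: a default argument value only applies when the argument is omitted. If a caller passes nz=None explicitly (for example by forwarding its own optional argument), the default of 15 is never used. The function names below are hypothetical stand-ins, not redrock's fitz() or its callers.

```python
def scan_without_guard(nz=15):
    """Hypothetical stand-in: relies on the default value alone."""
    return nz


def scan_with_guard(nz=15):
    """Hypothetical stand-in: also handles an explicit None."""
    if nz is None:
        nz = 15
    return nz


def caller(scan, nz=None):
    """Hypothetical caller that forwards its own optional argument."""
    # nz is passed explicitly here, so the callee's default is bypassed.
    return scan(nz=nz)


print(scan_without_guard())          # default applies because nz is omitted
print(caller(scan_without_guard))    # explicit None overrides the default
print(caller(scan_with_guard))       # guard restores the intended value
```

Under this assumption the guard is the usual fix, since the callee cannot control what its callers forward.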

Commits on Sep 27, 2023

  1. prior added

    abhi0395 committed Sep 27, 2023
    44c2f61
  2. Merge branch 'abhijeet_per_camera_gpu' of https://github.com/desihub/redrock into abhijeet_per_camera_gpu

    abhi0395 committed Sep 27, 2023
    a0f8652

Commits on Sep 28, 2023

  1. prior removed

    abhi0395 committed Sep 28, 2023
    efd96b0

Commits on Oct 3, 2023

  1. prior added

    abhi0395 committed Oct 3, 2023
    96b38e3

Commits on Oct 4, 2023

  1. Added GPU acceleration for per_camera logic *except for BVLS*.

    Modified solve_matrices algorithm to accept PCA or BVLS as a method.
    Reorganized logic for GPU to take advantage of the existing
    calc_zchi2_batch algorithm.
    For some strange reason, refactoring tdata to add rows for color data in
    batch in a 3d array on the CPU takes far longer than a Python loop over
    2d arrays, so for now I left two methods for per_camera, one for GPU and
    one for CPU, while this gets resolved.
    
    This is not the cleanest but it is the fastest and we can revisit if we
    change our mind on that.
    craigwarner-ufastro committed Oct 4, 2023
    15d587e
  2. prior removed

    abhi0395 committed Oct 4, 2023
    f045d89

Commits on Oct 6, 2023

  1. Found a way to combine per_camera CPU and GPU options into one method

    cleanly without a speed loss in runtime.
    
    Removed old methods Tbs_for_archetypes,
    per_camera_coeff_with_least_square, and
    per_camera_coeff_with_least_square_cpu and renamed the remaining method
    to per_camera_coeff_with_least_square_batch.
    
    Updated 1->n_nbh where Abhijeet pointed out errors had been made.
    craigwarner-ufastro committed Oct 6, 2023
    a868be7

Commits on Oct 13, 2023

  1. bb472d1
  2. Merge branch 'abhijeet_per_camera_gpu' into abhijeet_per_camera

    Merging gpu version "abhijeet_per_camera_gpu" TO abhijeet_per_camera
    abhi0395 committed Oct 13, 2023
    201399a

Commits on Nov 1, 2023

  1. prior_sigma arg added

    abhi0395 committed Nov 1, 2023
    d430674
  2. print removed

    abhi0395 committed Nov 1, 2023
    bdc961f
  3. 5ba6769
  4. Update README.rst

    abhi0395 committed Nov 1, 2023
    af37f26

Commits on Nov 2, 2023

  1. 0cec7ee

Commits on Nov 3, 2023

  1. ced3c01
  2. c1528e1

Commits on Nov 15, 2023

  1. Added NNLS as an option to solve_matrices in zscan.

    Rolled back changes by Abhijeet done while attempting to merge to
    restore per camera batch method.
    craigwarner-ufastro committed Nov 15, 2023
    a4b8b80

Commits on Nov 29, 2023

  1. Merged abhijeet_per_camera_gpu and abhijeet_per_camera_gpu and abhijeet_per_camera_prior_gpu. Added prior as optional arg to calc_zchi2_batch to do this.

    craigwarner-ufastro committed Nov 29, 2023
    bd5983a
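
Commit bd5983a mentions an optional prior argument but does not say what form it takes. The sketch below assumes one common choice: a Gaussian prior of width sigma on selected coefficients, which adds 1/sigma^2 to the matching diagonal entries of the normal-equations matrix M before solving M a = y. All function names are hypothetical, the 2x2 system is made up for illustration, and nothing here is taken from calc_zchi2_batch itself.

```python
def solve_2x2(M, y):
    """Solve M a = y for a 2x2 matrix M using Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    a0 = (y[0] * M[1][1] - M[0][1] * y[1]) / det
    a1 = (M[0][0] * y[1] - y[0] * M[1][0]) / det
    return [a0, a1]


def diagonal_prior(sigmas):
    """Build a diagonal prior matrix with 1/sigma^2 entries.

    A sigma of None means no prior on that coefficient (entry stays 0).
    """
    n = len(sigmas)
    return [[1.0 / sigmas[i] ** 2 if i == j and sigmas[i] is not None else 0.0
             for j in range(n)] for i in range(n)]


def solve_with_prior(M, y, prior=None):
    """Hypothetical helper: optionally add a prior matrix to M, then solve."""
    if prior is not None:
        M = [[M[i][j] + prior[i][j] for j in range(2)] for i in range(2)]
    return solve_2x2(M, y)


# Made-up normal equations, as would come from M = T^T W T and y = T^T W d.
M = [[2.0, 1.0], [1.0, 2.0]]
y = [3.0, 3.0]

no_prior = solve_with_prior(M, y)
# Prior with sigma = 1 on the second coefficient only.
with_prior = solve_with_prior(M, y, prior=diagonal_prior([None, 1.0]))

print(no_prior)    # unconstrained solution
print(with_prior)  # second coefficient is pulled toward zero
```

Making the prior optional with a default of None keeps the no-prior path unchanged, which fits the commit's description of adding it as an optional argument.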

Commits on Dec 4, 2023

  1. Added docstring to archetypes:rebin_template_batch

    Added always_return_array arg to transmission_Lyman that is True by default.
    Changed calls to transmission_Lyman throughout redrock (in archetypes, fitz,
    and templates) to pass always_return_array=False as an optimization, because
    there is no need to generate additional arrays of all ones and multiply by them.
    If always_return_array is True, an array of all ones is returned instead of
    None when the wavelength range is not affected by Lyman transmission.
    craigwarner-ufastro committed Dec 4, 2023
    af13669
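
The always_return_array behavior described in commit af13669 can be illustrated with a toy function. This is not redrock's transmission_Lyman: the single Lyman-alpha cutoff and the constant 0.8 absorption factor are made-up simplifications, and both function names are hypothetical. Only the None-versus-ones return convention follows the commit message.

```python
LYA_REST = 1215.67  # Lyman-alpha rest-frame wavelength in Angstrom


def toy_transmission(wave, z, always_return_array=True):
    """Toy stand-in for a Lyman transmission function (illustrative only)."""
    rest = [w / (1.0 + z) for w in wave]
    if min(rest) >= LYA_REST:
        # No pixel is blueward of Lyman-alpha, so nothing is absorbed.
        return [1.0] * len(wave) if always_return_array else None
    # Made-up absorption: a flat factor blueward of Lyman-alpha.
    return [0.8 if r < LYA_REST else 1.0 for r in rest]


def apply_transmission(flux, trans):
    """Multiply flux by trans, skipping the work entirely when trans is None."""
    if trans is None:
        return flux
    return [f * t for f, t in zip(flux, trans)]


wave = [3600.0, 5000.0, 9800.0]   # observed wavelengths in Angstrom
flux = [1.0, 2.0, 3.0]

low_z = toy_transmission(wave, 0.5, always_return_array=False)
high_z = toy_transmission(wave, 3.0, always_return_array=False)

print(low_z)                             # None: wavelength range unaffected
print(apply_transmission(flux, low_z))   # flux returned untouched
print(apply_transmission(flux, high_z))  # bluest pixel is attenuated
```

Callers that pass always_return_array=False must check for None before multiplying, which is the trade-off for skipping the array of ones.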

Commits on Dec 6, 2023

  1. c679027
  2. cleaned up rrdesi

    abhi0395 committed Dec 6, 2023
    6487a3f
  3. cleaned up rrdesi

    abhi0395 committed Dec 6, 2023
    f7fc3df
  4. Update README.rst

    abhi0395 committed Dec 6, 2023
    3a8ee19
  5. Update README.rst

    abhi0395 committed Dec 6, 2023
    4784a04
  6. Update README.rst

    abhi0395 committed Dec 6, 2023
    3a8e5cd

Commits on Jan 3, 2024

  1. fix crash in gpu mode with archetype legendre priors

    Stephen Bailey authored and Stephen Bailey committed Jan 3, 2024
    8133f74
  2. 00d89ba
  3. f7b907e

Commits on Jan 4, 2024

  1. only print archetype info on rank 0 and if using archetypes

    Stephen Bailey authored and Stephen Bailey committed Jan 4, 2024
    14bebc6