Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

changed hvg with PR to work with numba #2612

Merged
merged 14 commits into from Aug 22, 2023
Merged

changed hvg with PR to work with numba #2612

merged 14 commits into from Aug 22, 2023

Conversation

Intron7
Copy link
Member

@Intron7 Intron7 commented Aug 14, 2023

The pearson residuals implementation for hvg is currently very slow and memory inefficient. I switched it to a numba kernel for csc and dense F-continous arrays.

It's based on the cuda-kernel in rapids-singlecell. For 90000 cells we go from 24 seconds to less than 5 with the new implementation. For smaller datasets we don't see a speedup.

@Intron7 Intron7 requested review from jlause and Zethson August 14, 2023 11:54
@Intron7 Intron7 added this to the 1.10.0 milestone Aug 14, 2023
@Zethson Zethson requested review from flying-sheep and removed request for Zethson August 14, 2023 12:07
@codecov
Copy link

codecov bot commented Aug 14, 2023

Codecov Report

Merging #2612 (c21bee3) into master (386f0e3) will decrease coverage by 0.28%.
Report is 6 commits behind head on master.
The diff coverage is 23.94%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2612      +/-   ##
==========================================
- Coverage   72.27%   72.00%   -0.28%     
==========================================
  Files         105      105              
  Lines       11752    11840      +88     
==========================================
+ Hits         8494     8525      +31     
- Misses       3258     3315      +57     
Files Changed Coverage Δ
scanpy/experimental/pp/_highly_variable_genes.py 63.69% <23.94%> (-33.37%) ⬇️

... and 6 files with indirect coverage changes

Copy link
Member

@flying-sheep flying-sheep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d like to see some deduplication and maintainability changes.

Please never create functions that don’t have a * after the 3rd parameter at the latest (unless the function is called rgba I guess).

Numba sadly has weird behavior around keyword-only arguments, but even if it ignores them (allowing to specify arguments positionally that shouldn’t), you should still call the function as if everything was working correctly. numba/numba#5655

>>> @numba.njit()
... def test(x, *, y):
...     return x + y 
>>> test(1, 2)
3  # no! bad numba! that should be a TypeError

scanpy/experimental/pp/_highly_variable_genes.py Outdated Show resolved Hide resolved
scanpy/experimental/pp/_highly_variable_genes.py Outdated Show resolved Hide resolved
scanpy/experimental/pp/_highly_variable_genes.py Outdated Show resolved Hide resolved
scanpy/experimental/pp/_highly_variable_genes.py Outdated Show resolved Hide resolved
scanpy/experimental/pp/_highly_variable_genes.py Outdated Show resolved Hide resolved
Copy link
Member

@flying-sheep flying-sheep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see unresolved conversations

@flying-sheep
Copy link
Member

flying-sheep commented Aug 22, 2023

Perf measurements for the use case of running the HVG tests on my machine (not very accurate, and not very reminiscent of how users use it).

Tests get a bit slower, real world gets faster.

  • scanpy master:

    $ git switch master
    $ perf stat -r 10 -B hatch run +py=3.11 test:run -n0 scanpy/tests/test_highly_variable_genes.py
       Performance counter stats for 'hatch run +py=3.11 test:run -n0 scanpy/tests/test_highly_variable_genes.py' (10 runs):
      
               71.915,07 msec task-clock:u                     #   14,035 CPUs utilized               ( +-  9,53% )
                       0      context-switches:u               #    0,000 /sec
                       0      cpu-migrations:u                 #    0,000 /sec
               1.168.035      page-faults:u                    #   29,496 K/sec                       ( +-  9,58% )
         191.815.791.770      cycles:u                         #    4,844 GHz                         ( +-  9,53% )  (83,37%)
          10.610.492.234      stalled-cycles-frontend:u        #   10,05% frontend cycles idle        ( +-  9,44% )  (83,34%)
          59.853.476.395      stalled-cycles-backend:u         #   56,69% backend cycles idle         ( +-  9,56% )  (83,32%)
         257.750.810.841      instructions:u                   #    2,44  insn per cycle
                                                               #    0,13  stalled cycles per insn     ( +-  9,57% )  (83,33%)
          45.773.330.764      branches:u                       #    1,156 G/sec                       ( +-  9,58% )  (83,33%)
           1.147.567.613      branch-misses:u                  #    4,56% of all branches             ( +-  9,54% )  (83,37%)
      
                  5,1241 +- 0,0242 seconds time elapsed  ( +-  0,47% )
  • this PR:

    $ git switch hvg_PR_numba
    $ perf stat -r 10 -B hatch run +py=3.11 test:run -n0 scanpy/tests/test_highly_variable_genes.py
       Performance counter stats for 'hatch run +py=3.11 test:run -n0 scanpy/tests/test_highly_variable_genes.py' (10 runs):
      
              113.085,21 msec task-clock:u                     #   15,789 CPUs utilized               ( +-  9,56% )
                       0      context-switches:u               #    0,000 /sec
                       0      cpu-migrations:u                 #    0,000 /sec
               1.636.606      page-faults:u                    #   26,373 K/sec                       ( +-  9,55% )
         310.410.832.165      cycles:u                         #    5,002 GHz                         ( +-  9,55% )  (83,35%)
          14.117.222.045      stalled-cycles-frontend:u        #    8,30% frontend cycles idle        ( +-  9,46% )  (83,38%)
          75.813.970.243      stalled-cycles-backend:u         #   44,56% backend cycles idle         ( +-  9,57% )  (83,35%)
         373.047.679.552      instructions:u                   #    2,19  insn per cycle
                                                               #    0,11  stalled cycles per insn     ( +-  9,57% )  (83,34%)
          67.830.590.839      branches:u                       #    1,093 G/sec                       ( +-  9,58% )  (83,35%)
           1.702.825.180      branch-misses:u                  #    4,56% of all branches             ( +-  9,56% )  (83,28%)
      
                  7,1623 +- 0,0560 seconds time elapsed  ( +-  0,78% )

@flying-sheep flying-sheep merged commit 32a3567 into master Aug 22, 2023
10 checks passed
@flying-sheep flying-sheep deleted the hvg_PR_numba branch August 22, 2023 15:20
flying-sheep added a commit that referenced this pull request Sep 7, 2023
Co-authored-by: Philipp A. <flying-sheep@web.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants