Scale #2457

Sakshi-2797 · 2023-03-24T10:54:23Z

This contribution holds modifications to the existing 'scale function' which turns out to be a faster implementation than the original one. We have introduced flavors - default and use_fastpp , where use_fastpp is our faster implementation of the scale function.The scale function modification provides an overall speedup of approx ~2x in comparison to the already existing 'scale' function.
Usage : sc.pp.scale(adata,max_value=10,flavor='use_fastpp')

codecov · 2023-03-24T11:11:17Z

Codecov Report

Merging #2457 (e749022) into master (f03c5b4) will decrease coverage by 0.12%.
The diff coverage is 25.00%.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #2457      +/-   ##
==========================================
- Coverage   71.88%   71.76%   -0.12%     
==========================================
  Files          98       98              
  Lines       11510    11537      +27     
==========================================
+ Hits         8274     8280       +6     
- Misses       3236     3257      +21

Impacted Files	Coverage Δ
scanpy/preprocessing/_simple.py	`74.59% <25.00%> (-3.52%)`	⬇️

Zethson · 2023-04-17T14:30:28Z

@Sakshi-2797 is there a reason why you'd introduce a flavor here instead of outright replacing the implementation?

Sakshi-2797 · 2023-04-17T15:24:44Z

Hi @Zethson ,
The reason why we introduced flavors here is that we wanted the traditional implementation to be present in case anyone wanted to use it. We just introduced our implementation as a faster alternative to the already available one. In case replacing the code is required, we can do that as well.

Zethson · 2023-05-03T16:02:24Z

Unfortunately, I run into

__________________________________________________________________________________ test_scale[use_fastpp] ___________________________________________________________________________________

flavor = 'use_fastpp'

    @pytest.mark.parametrize("flavor", ["default", "use_fastpp"])
    def test_scale(flavor):
        adata = pbmc68k_reduced()
        adata.X = adata.raw.X
        v = adata[:, 0 : adata.shape[1] // 2]
        # Should turn view to copy https://github.com/scverse/anndata/issues/171#issuecomment-508689965
        assert v.is_view
        with pytest.warns(Warning, match="view"):
>           sc.pp.scale(v, flavor=flavor)

scanpy/tests/test_preprocessing.py:127: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../miniconda3/envs/scanpy/lib/python3.9/functools.py:888: in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
scanpy/preprocessing/_simple.py:888: in scale_anndata
    X, adata.var["mean"], adata.var["std"] = do_scale(
../../miniconda3/envs/scanpy/lib/python3.9/site-packages/numba/core/dispatcher.py:468: in _compile_for_args
    error_rewrite(e, 'typing')
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

e = TypingError('Failed in nopython mode pipeline (step: nopython frontend)\nnon-precise type pyobject\nDuring: typing of ...y the following argument(s):\n- argument 0: Cannot determine Numba type of <class \'scipy.sparse._csr.csr_matrix\'>\n')
issue_type = 'typing'

    def error_rewrite(e, issue_type):
        """
        Rewrite and raise Exception `e` with help supplied based on the
        specified issue_type.
        """
        if config.SHOW_HELP:
            help_msg = errors.error_extras[issue_type]
            e.patch_message('\n'.join((str(e).rstrip(), help_msg)))
        if config.FULL_TRACEBACKS:
            raise e
        else:
>           raise e.with_traceback(None)
E           numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
E           non-precise type pyobject
E           During: typing of argument at /home/zeth/PycharmProjects/scanpy/scanpy/preprocessing/_simple.py (763)
E           
E           File "scanpy/preprocessing/_simple.py", line 763:
E           def do_scale(X, maxv, nthr):
E               <source elided>
E               # t0= time.time()
E               s = np.zeros((nthr, X.shape[1]))
E               ^ 
E           
E           This error may have been caused by the following argument(s):
E           - argument 0: Cannot determine Numba type of <class 'scipy.sparse._csr.csr_matrix'>

../../miniconda3/envs/scanpy/lib/python3.9/site-packages/numba/core/dispatcher.py:409: TypingError

When trying to use the new flavor with the existing test.

Sakshi-2797 · 2023-05-09T11:43:14Z

Unfortunately, I run into

__________________________________________________________________________________ test_scale[use_fastpp] ___________________________________________________________________________________

flavor = 'use_fastpp'

    @pytest.mark.parametrize("flavor", ["default", "use_fastpp"])
    def test_scale(flavor):
        adata = pbmc68k_reduced()
        adata.X = adata.raw.X
        v = adata[:, 0 : adata.shape[1] // 2]
        # Should turn view to copy https://github.com/scverse/anndata/issues/171#issuecomment-508689965
        assert v.is_view
        with pytest.warns(Warning, match="view"):
>           sc.pp.scale(v, flavor=flavor)

scanpy/tests/test_preprocessing.py:127: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../miniconda3/envs/scanpy/lib/python3.9/functools.py:888: in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
scanpy/preprocessing/_simple.py:888: in scale_anndata
    X, adata.var["mean"], adata.var["std"] = do_scale(
../../miniconda3/envs/scanpy/lib/python3.9/site-packages/numba/core/dispatcher.py:468: in _compile_for_args
    error_rewrite(e, 'typing')
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

e = TypingError('Failed in nopython mode pipeline (step: nopython frontend)\nnon-precise type pyobject\nDuring: typing of ...y the following argument(s):\n- argument 0: Cannot determine Numba type of <class \'scipy.sparse._csr.csr_matrix\'>\n')
issue_type = 'typing'

    def error_rewrite(e, issue_type):
        """
        Rewrite and raise Exception `e` with help supplied based on the
        specified issue_type.
        """
        if config.SHOW_HELP:
            help_msg = errors.error_extras[issue_type]
            e.patch_message('\n'.join((str(e).rstrip(), help_msg)))
        if config.FULL_TRACEBACKS:
            raise e
        else:
>           raise e.with_traceback(None)
E           numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
E           non-precise type pyobject
E           During: typing of argument at /home/zeth/PycharmProjects/scanpy/scanpy/preprocessing/_simple.py (763)
E           
E           File "scanpy/preprocessing/_simple.py", line 763:
E           def do_scale(X, maxv, nthr):
E               <source elided>
E               # t0= time.time()
E               s = np.zeros((nthr, X.shape[1]))
E               ^ 
E           
E           This error may have been caused by the following argument(s):
E           - argument 0: Cannot determine Numba type of <class 'scipy.sparse._csr.csr_matrix'>

../../miniconda3/envs/scanpy/lib/python3.9/site-packages/numba/core/dispatcher.py:409: TypingError

When trying to use the new flavor with the existing test.

Hi @Zethson ,
We are not able to see this issue with the latest commit. Can you please retry with the latest commit in scale branch.

Signed-off-by: zethson <lukas.heumos@posteo.net>

Zethson · 2023-05-09T11:56:22Z

Please adapt the corresponding test to:

@pytest.mark.parametrize("flavor", ["default", "use_fastpp"])
def test_scale(flavor):
    adata = pbmc68k_reduced()
    adata.X = adata.raw.X
    v = adata[:, 0 : adata.shape[1] // 2]
    # Should turn view to copy https://github.com/scverse/anndata/issues/171#issuecomment-508689965
    assert v.is_view
    with pytest.warns(Warning, match="view"):
        sc.pp.scale(v, flavor=flavor)
    assert not v.is_view
    assert_allclose(v.X.var(axis=0), np.ones(v.shape[1]), atol=0.01)
    assert_allclose(v.X.mean(axis=0), np.zeros(v.shape[1]), atol=0.00001)

It fails for me with FAILED scanpy/tests/test_preprocessing.py::test_scale[use_fastpp] - numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)

Zethson · 2023-05-10T07:51:08Z

I pushed to your branch. It failed yesterday while Github was having issues. The test should pass.

Zethson · 2023-05-24T12:41:44Z

@Sakshi-2797 were you able to find the fix to make it work or do you need more information? I'm happy to take over your branch as soon as it works to glue things together

Sakshi-2797 · 2023-05-29T08:43:43Z

@pytest.mark.parametrize("flavor", ["default", "use_fastpp"])

Hi @Zethson ,
I tried running the modified testcase mentioned above , but it seems it is failing because sparse matrix is being passed in it as a parameter. As of now, our scale function is not implemented for the sparse matrices. It is expected that these tests will fail.

Zethson · 2023-05-30T07:17:27Z

Sparse matrices are absolutely essential for single-cell to ensure that RAM usage is kept low. Is there a way to make your implementation work with sparse matrices?

sanchit-misra · 2023-12-07T05:44:01Z

Sorry for the delay in responding here. Our resources got pulled into other projects. We are looking to make it work with sparse matrices.

Sakshi-2797 and others added 6 commits February 9, 2023 00:54

Leiden Louvain with Katana

6277587

Merge branch 'master' into master

cd45af8

commented some parts

f604c9c

some changes in leiden

3e4a775

Merge branch 'scverse:master' into master

0a9cfa3

Merge branch 'master' into master

686e6a9

Sakshi-2797 force-pushed the scale branch 2 times, most recently from c450af3 to 0cf93d9 Compare March 27, 2023 06:01

Removing Master Changes

75921bb

Sakshi-2797 force-pushed the scale branch from 0cf93d9 to 75921bb Compare March 27, 2023 06:20

Sakshi-2797 added 3 commits March 27, 2023 01:42

Some changes in Scale

72d9446

Merge branch 'scale' of https://github.com/Sakshi-2797/scanpy into scale

b162be6

Merge branch 'scale' of https://github.com/Sakshi-2797/scanpy into scale

7ffde34

Sakshi-2797 force-pushed the scale branch 3 times, most recently from d415487 to e57a948 Compare March 27, 2023 10:43

Changes in Scale function

383c8a8

Sakshi-2797 force-pushed the scale branch from e57a948 to 383c8a8 Compare March 28, 2023 06:24

Add fastpp flavour to scale test

4f83e77

Signed-off-by: zethson <lukas.heumos@posteo.net>

Merge branch 'master' into scale

e749022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Scale #2457

Scale #2457

Sakshi-2797 commented Mar 24, 2023 •

edited

codecov bot commented Mar 24, 2023 •

edited

Zethson commented Apr 17, 2023

Sakshi-2797 commented Apr 17, 2023

Zethson commented May 3, 2023

Sakshi-2797 commented May 9, 2023

Zethson commented May 9, 2023

Zethson commented May 10, 2023

Zethson commented May 24, 2023

Sakshi-2797 commented May 29, 2023

Zethson commented May 30, 2023

sanchit-misra commented Dec 7, 2023

Scale #2457

Are you sure you want to change the base?

Scale #2457

Conversation

Sakshi-2797 commented Mar 24, 2023 • edited

codecov bot commented Mar 24, 2023 • edited

Codecov Report

Zethson commented Apr 17, 2023

Sakshi-2797 commented Apr 17, 2023

Zethson commented May 3, 2023

Sakshi-2797 commented May 9, 2023

Zethson commented May 9, 2023

Zethson commented May 10, 2023

Zethson commented May 24, 2023

Sakshi-2797 commented May 29, 2023

Zethson commented May 30, 2023

sanchit-misra commented Dec 7, 2023

Sakshi-2797 commented Mar 24, 2023 •

edited

codecov bot commented Mar 24, 2023 •

edited