[BUG] Cellxpose returns 400 Bas request for some requests if data is 16-bit #2379

arogozhnikov · 2021-08-12T20:41:10Z

Describe the bug

Try saving test file after converting to fp16:

adata.X = adata.X.astype('float16')

Server normally starts when given this data (but prints some warnigns about conversion from 64bits).

However when you select gene or run differential expression, 400 appears.

Version affected: 0.17.0, within docker

In my case cellxgene first downloads file from the cloud, so 16bit format for storage is very reasonable - almost halves the space.

The text was updated successfully, but these errors were encountered:

bkmartinjr · 2021-08-20T23:14:59Z

@arogozhnikov - I am exploring options. One important consideration: we use Numba internally for speed, and numba does not support float16.

I see two options - if you have any thoughts, please add them to this issue.

convert X to float32 at startup (conversion in memory - will not change the H5AD). Or,
fall back on somewhat slower Python code that will support float16.

Option 1 is quite simple to implement and we could have it incorporated into our next release. Option 2 will take more exploration to support without a significant performance penalty.

Please let me know if you have any input on the trade-off.

arogozhnikov · 2021-08-23T08:46:53Z

In the context of this question, memory was also a big issue (multiple servers on the same host, each responsible for one dataset), so I'd recommend to provide a full support of fp16, even if that takes more time to implement.

bkmartinjr · 2021-08-23T14:42:38Z

To clarify - full FP16 support it is likely to run slower as well (the aforementioned performance penalty).

arogozhnikov · 2021-08-23T23:37:12Z

Still better have proper fp16.

While I did not study cellxgene code in detail, I believe all or almost all operations are column-wise.
So possibly you can convert column to fp32 on-the-fly before passing it further. This should be memory efficient, and overhead is very reasonable. Do you think so?

bkmartinjr · 2021-08-23T23:47:19Z

I have a PR almost ready for full fp16 support. Tonight or tomorrow, so it will make the next release.

You are right that you will often get better performance if you encode the matrix in a column-friendly format (eg, CSC if it is sparse)

arogozhnikov · 2022-02-01T07:02:24Z

@bkmartinjr can you please check this once again?
I was about to transition datasets to fp16, but I'm still seeing same issues (400 on diffexp, and expression of genes can't be loaded).
I've tried both csc and csr formats for X, but neither worked for me

arogozhnikov · 2022-02-01T07:10:49Z

cellxgene==1.0.0, reproduced on mac and linux if that helps

blrnw3 · 2022-02-04T19:07:36Z

I can reproduce this too. I think the get_X_array method needs updating to support float16

blrnw3 · 2022-02-07T19:49:22Z

According to this scipy issue, float16 is not supported in sparse matrices, which presumably means cellxgene cannot support float16 sparse data. Sure we could convert to dense or to float32, but that would negate all the memory savings immediately.
Flagging @bkmartinjr in case I've missed something here.

arogozhnikov · 2022-02-07T20:14:30Z

Good catch. Used float16 sparse for some

but that would negate all the memory savings immediately

h5ads still take enough space + download time, so I see benefits even if it is an immediate conversion

blrnw3 · 2022-02-07T20:44:07Z

h5ads still take enough space + download time, so I see benefits even if it is an immediate conversion

Is it not easy enough to just convert the h5ad to float32 after the download?

bkmartinjr · 2022-02-07T20:46:06Z

Suggest a cast in diffex code if the set1/set2 slices are fp16, as that is where the dependency lives. Pretty sure the rest of the code base does not have this constraint.

As it is already sliced out of the main matrix, you will retain most of the memory savings (ie, the cast to 32 bit will be only for the duration of the t-test compute)

blrnw3 · 2022-02-07T23:30:12Z

@bkmartinjr the problem is slicing the rows for diffex out of the X matrix. Scipy does not support complex slices for float16.
e.g. X[4] is ok but X[[1,2,4]] is not.

The workaround is to slice individual rows one at a time and then vstack them together, but this is prohibitively slow on anything larger than a few hundred rows.

arogozhnikov · 2022-02-08T00:05:43Z

Is it not easy enough to just convert the h5ad to float32 after the download?

Well, I use paths in s3, not downloading myself.

I see the issues you're running into, I agree it would be non-trivial to deal with non-supported scipy format. Probably an optimal one would be to add conversion of X.data somewhere here https://github.com/chanzuckerberg/cellxgene/blob/main/server/data_anndata/anndata_adaptor.py#L325

blrnw3 · 2022-02-08T01:05:22Z

Ah I see. Let's see what @bkmartinjr thinks about just converting the full float16 matrix to float32 up front (or perhaps only when X matrix is requested). So there'd be no memory saving in normal operation but you'd save download bandwidth.

bkmartinjr · 2022-02-08T01:56:18Z

I think it is a good compromise. Thumbs-up!

Convert to float32 on startup unless backed, in which case error. scipy does not support complex slicing from float16 data so this is the easiest fix for now.

* Fix float16 support [#2379] Convert to float32 on startup unless backed, in which case error. scipy does not support complex slicing from float16 data so this is the easiest fix for now. * minor msg change

blrnw3 · 2022-02-28T23:35:46Z

Fixed in 1.0.1

arogozhnikov added the bug label Aug 12, 2021

bkmartinjr self-assigned this Aug 19, 2021

bkmartinjr mentioned this issue Aug 23, 2021

X float16 support #2406

Merged

bkmartinjr closed this as completed in #2406 Aug 25, 2021

blrnw3 reopened this Feb 4, 2022

blrnw3 self-assigned this Feb 7, 2022

blrnw3 added a commit that referenced this issue Feb 9, 2022

Fix float16 support [#2379]

02f8f97

Convert to float32 on startup unless backed, in which case error. scipy does not support complex slicing from float16 data so this is the easiest fix for now.

blrnw3 closed this as completed Feb 28, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] Cellxpose returns 400 Bas request for some requests if data is 16-bit #2379

[BUG] Cellxpose returns 400 Bas request for some requests if data is 16-bit #2379

arogozhnikov commented Aug 12, 2021

bkmartinjr commented Aug 20, 2021

arogozhnikov commented Aug 23, 2021

bkmartinjr commented Aug 23, 2021

arogozhnikov commented Aug 23, 2021

bkmartinjr commented Aug 23, 2021

arogozhnikov commented Feb 1, 2022

arogozhnikov commented Feb 1, 2022

blrnw3 commented Feb 4, 2022

blrnw3 commented Feb 7, 2022

arogozhnikov commented Feb 7, 2022

blrnw3 commented Feb 7, 2022

bkmartinjr commented Feb 7, 2022

blrnw3 commented Feb 7, 2022 •

edited

arogozhnikov commented Feb 8, 2022

blrnw3 commented Feb 8, 2022

bkmartinjr commented Feb 8, 2022

blrnw3 commented Feb 28, 2022

[BUG] Cellxpose returns 400 Bas request for some requests if data is 16-bit #2379

[BUG] Cellxpose returns 400 Bas request for some requests if data is 16-bit #2379

Comments

arogozhnikov commented Aug 12, 2021

bkmartinjr commented Aug 20, 2021

arogozhnikov commented Aug 23, 2021

bkmartinjr commented Aug 23, 2021

arogozhnikov commented Aug 23, 2021

bkmartinjr commented Aug 23, 2021

arogozhnikov commented Feb 1, 2022

arogozhnikov commented Feb 1, 2022

blrnw3 commented Feb 4, 2022

blrnw3 commented Feb 7, 2022

arogozhnikov commented Feb 7, 2022

blrnw3 commented Feb 7, 2022

bkmartinjr commented Feb 7, 2022

blrnw3 commented Feb 7, 2022 • edited

arogozhnikov commented Feb 8, 2022

blrnw3 commented Feb 8, 2022

bkmartinjr commented Feb 8, 2022

blrnw3 commented Feb 28, 2022

blrnw3 commented Feb 7, 2022 •

edited