CellMapper linear fails Open Problems NeurIPS 2022 ADT2GEX

It looks like my custom fast CCA implementation fails on this dataset and introduces NaNs into the data. I'm not sure why this happens, and it's hard to debug without the full dataset, so I'll look for a simpler solution.

```
WARNING Using sklearn for neighbor search with large dataset (92324 cells).   
         Consider using approximate k-NN search (e.g. pynndescent) or GPU       
         acceleration (e.g. faiss or rapids)                                   
INFO    Using sklearn to compute 30 neighbors.                                 
Traceback (most recent call last):
  File "/tmp/nxf.1gyxC2spNC/.viash_script.py", line 58, in <module>
    cmap.compute_neighbors(
  File "/usr/local/lib/python3.11/site-packages/cellmapper/model/cellmapper.py", line 217, in compute_neighbors
    self.knn.compute_neighbors(
  File "/usr/local/lib/python3.11/site-packages/cellmapper/model/kernel.py", line 183, in compute_neighbors
    backend_x.fit(self.xrep)
  File "/usr/local/lib/python3.11/site-packages/cellmapper/model/_knn_backend.py", line 46, in fit
    self._nn.fit(data)
  File "/usr/local/lib/python3.11/site-packages/sklearn/base.py", line 1365, in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sklearn/neighbors/_unsupervised.py", line 179, in fit
    return self._fit(X)
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sklearn/neighbors/_base.py", line 526, in _fit
    X = validate_data(
        ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sklearn/utils/validation.py", line 2954, in validate_data
    out = check_array(X, input_name="X", **check_params)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sklearn/utils/validation.py", line 1105, in check_array
    _assert_all_finite(
  File "/usr/local/lib/python3.11/site-packages/sklearn/utils/validation.py", line 120, in _assert_all_finite
    _assert_all_finite_element_wise(
  File "/usr/local/lib/python3.11/site-packages/sklearn/utils/validation.py", line 169, in _assert_all_finite_element_wise
    raise ValueError(msg_err)
ValueError: Input X contains NaN.
NearestNeighbors does not accept missing values encoded as NaN natively. For supervised learning, you might want to consider sklearn.ensemble.HistGradientBoostingClassifier and Regressor which accept missing values encoded as NaNs natively. Alternatively, it is possible to preprocess the data, for instance by using an imputer transformer in a pipeline or drop samples with missing values. See https://scikit-learn.org/stable/modules/impute.html You can find a list of all estimators that handle NaN values at the following page: https://scikit-learn.org/stable/modules/impute.html#estimators-that-handle-nan-value
```
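Since the traceback only says that the array handed to sklearn contains NaN, a first debugging step could be to check where the non-finite values sit in the representation before the neighbor search. Below is a minimal sketch using plain NumPy. The helper `summarize_nonfinite` and the toy `xrep` array are hypothetical and not part of CellMapper; the name `xrep` is only borrowed from `self.xrep` in the traceback. The constant-column check is a guess at one possible cause (a zero-variance feature leading to a division by zero during scaling), not a confirmed diagnosis.

```python
import numpy as np


def summarize_nonfinite(x):
    """Report where NaN/inf entries sit in a 2-D array (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    bad = ~np.isfinite(x)
    return {
        # total number of NaN/inf entries
        "n_bad": int(bad.sum()),
        # rows (cells) and columns (components) with at least one bad entry
        "bad_rows": np.flatnonzero(bad.any(axis=1)),
        "bad_cols": np.flatnonzero(bad.any(axis=0)),
        # columns whose non-NaN values are all identical (zero variance)
        "constant_cols": np.flatnonzero(
            np.nanmax(x, axis=0) == np.nanmin(x, axis=0)
        ),
    }


# Toy stand-in for the real representation, which is not available here.
rng = np.random.default_rng(0)
xrep = rng.normal(size=(6, 4))
xrep[2, 1] = np.nan  # inject a single NaN
xrep[:, 3] = 1.0     # inject a constant column

report = summarize_nonfinite(xrep)
print(report)
```

If the bad entries cluster in a few columns, that points at specific components of the embedding; if they cluster in a few rows, it points at specific cells in the input.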



