Add scArches #9

scottgigante · 2020-07-20T16:04:58Z

No description provided.

scottgigante · 2020-08-31T15:52:01Z

@M0hammadL Malte suggested adding scArches at a minimum to fill this out as three methods

M0hammadL · 2020-09-13T11:20:57Z

@scottgigante I don't think scArches will be possible here, since the method does not perform classification by itself. It aligns the query data to the reference, after which we can use a simple kNN classifier to carry labels from the reference to the query. So I am not sure whether it can be considered a classification method or not!

dburkhardt · 2020-09-13T19:35:15Z

I think that having scArches + kNN classifier would be a great baseline to have. Thumbing through the preprint, I think these results are compelling:

Building upon the query-reference embedding, we investigated the transfer of cell-type labels from the reference dataset. We approached this classification problem by first training a simple kNN classifier on the latent space representation of the reference TS. Then each cell in the query TM was annotated using its closest neighbors in the reference dataset. Additionally, our classification pipeline provides an uncertainty score for each cell while reporting cells with more than 50 % uncertainty as unknown (see Methods). Our model transferred the labels from the reference atlas to the query atlas with ≈ 89% accuracy for all the tissues except tracheal cells (Figure 3d). Moreover, all misclassified cells and cells from the out-of-distribution tissue received high uncertainty scores (Figure 3e-f). Overall, the classification results across tissues indicated a robust prediction accuracy across most tissues (Figure 3g) while highlighting which cells were not mappable to the reference. The robust performance of a simple KNN classifier on the integrated latent space demonstrates that scArches can successfully merge large and complex query datasets into reference atlases.

I understand you would typically include some manual fine-tuning, but I would love to see these results added to Open Problems.
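For reference, the kNN label-transfer step described in the quoted passage could be sketched roughly as below. This is only an illustration under several assumptions: the latent arrays are synthetic stand-ins for whatever embedding scArches would produce, `knn_label_transfer` is a hypothetical helper name, and the uncertainty here is simply one minus the largest neighbor-vote fraction, which may differ from the score defined in the preprint's Methods.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier


def knn_label_transfer(ref_latent, ref_labels, query_latent,
                       n_neighbors=15, max_uncertainty=0.5):
    """Transfer labels from reference to query cells in a shared latent space.

    Hypothetical helper: uncertainty is 1 - (largest neighbor vote fraction),
    an assumption rather than the preprint's exact definition.
    """
    clf = KNeighborsClassifier(n_neighbors=n_neighbors)
    clf.fit(ref_latent, ref_labels)

    # Fraction of the k nearest reference neighbors voting for each class.
    proba = clf.predict_proba(query_latent)
    uncertainty = 1.0 - proba.max(axis=1)

    # Use an object array so "Unknown" is not truncated to the label width.
    predicted = clf.classes_[proba.argmax(axis=1)].astype(object)
    predicted[uncertainty > max_uncertainty] = "Unknown"
    return predicted, uncertainty


# Synthetic stand-in for an integrated latent space: two well-separated groups.
rng = np.random.default_rng(0)
ref_latent = np.vstack([rng.normal(0, 0.5, (100, 10)),
                        rng.normal(5, 0.5, (100, 10))])
ref_labels = np.array(["T cell"] * 100 + ["B cell"] * 100)
query_latent = np.vstack([rng.normal(0, 0.5, (20, 10)),
                          rng.normal(5, 0.5, (20, 10))])

labels, uncertainty = knn_label_transfer(ref_latent, ref_labels, query_latent)
```

In a real submission the two latent arrays would come from the trained model rather than a random generator, and `n_neighbors` and the 0.5 cutoff would presumably need tuning per dataset.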

scottgigante added this to the CZI Webinar milestone Jul 20, 2020

scottgigante assigned M0hammadL Jul 20, 2020

scottgigante added the method label Aug 18, 2020

dburkhardt mentioned this issue Aug 31, 2020

To do by Sept 7 2020 #47

Closed

9 tasks

scottgigante modified the milestones: CZI Webinar, Sept 7 2020 Aug 31, 2020

scottgigante changed the title ~~Add Label Projection SoTA methods~~ Add scArches Sep 4, 2020

dburkhardt modified the milestones: Sept 7 2020, October 13 - Community Call Sep 21, 2020

lazappi added a commit to michalk8/SingleCellOpenProblems that referenced this issue May 4, 2021

Merge pull request openproblems-bio#9 from michalk8/method-ivis-task-…

3da30f9

…dimred Add IVIS method

scottgigante-immunai closed this as completed Dec 1, 2022

rcannood pushed a commit that referenced this issue Sep 4, 2024

Merge pull request #9 from openproblems-bio/feat/label-projection-met…

f95dc3d

…hods-and-metrics Feat/label projection methods and metrics Former-commit-id: 44d0805
