CiteSeqDataset #40

Edouard360 · 2018-06-15T19:05:12Z

I suggest this to complete CbmcDataset information.

Change the CbmcDataset dataset to CiteSeqDataset, with a name option ("cbmc" or "pbmc") and ADT counts (protein markers) added as attributes of the dataset.

These epitope measurements are useful as additional labelling information.

Edouard360 · 2018-06-15T19:06:16Z

I haven't had the time to create the preprocessed files for unit tests, but:

CiteSeqDataset("pbmc") and CiteSeqDataset("cbmc") both work, and both expose the epitope information as attributes.
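For readers unfamiliar with the idea, here is a minimal sketch of what "ADT counts as attributes of the dataset" could look like. It is not the code from this PR: the class `ToyCiteSeqDataset`, its attribute names (`X`, `adt_expression`, `protein_markers`), and the helper `marker_counts` are all hypothetical, and the numbers are made up.

```python
import numpy as np


class ToyCiteSeqDataset:
    """Hypothetical stand-in for illustration; not scVI's CiteSeqDataset."""

    def __init__(self, rna_counts, adt_counts, protein_markers):
        rna_counts = np.asarray(rna_counts)
        adt_counts = np.asarray(adt_counts)
        # Both matrices are cells x features and must cover the same cells.
        if rna_counts.shape[0] != adt_counts.shape[0]:
            raise ValueError("RNA and ADT matrices must describe the same cells")
        if adt_counts.shape[1] != len(protein_markers):
            raise ValueError("one marker name is needed per ADT column")
        self.X = rna_counts                  # gene expression counts
        self.adt_expression = adt_counts     # assumed attribute name
        self.protein_markers = list(protein_markers)  # assumed attribute name

    def marker_counts(self, marker):
        """Return the ADT counts of one protein marker across all cells."""
        return self.adt_expression[:, self.protein_markers.index(marker)]


# Made-up toy data: 2 cells, 3 genes, 2 protein markers.
toy = ToyCiteSeqDataset(
    rna_counts=[[3, 0, 1], [0, 2, 5]],
    adt_counts=[[120, 4], [7, 95]],
    protein_markers=["CD3", "CD19"],
)
cd3 = toy.marker_counts("CD3")
```

Keeping the protein counts next to the gene counts, rather than in a separate object, means any per-cell labelling derived from the markers stays aligned with the rows of the expression matrix.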

jeff-regier · 2018-06-15T19:41:15Z

tests/test_scvi.py

@@ -30,7 +30,7 @@ def test_retina():


 def test_cbmc():
-    run_benchmarks("cbmc", n_epochs=1, show_batch_mixing=False, save_path='tests/data/')
+    run_benchmarks("cite_seq_cbmc", n_epochs=1, show_batch_mixing=False, save_path='tests/data/')


Is cite_seq_cbmc the same data as cbmc once it's loaded? Can we just continue to call it cbmc then?

Edouard360 · 2018-06-26T12:59:46Z

- Datasets might have multiple URLs to download from (e.g. the CITE-seq data), so a dataset can specify either the `url` and `download_name` attributes or `urls` and `download_names`. The check for whether a file already exists moves into `_download`.
- `CbmcDataset()` -> `CiteSeqDataset('cbmc')`, with information about epitopes.
- `PbmcDataset()` can be obtained with:

      gene_dataset = concat_datasets(
          Dataset10X("pbmc8k", save_path=save_path),
          Dataset10X("pbmc4k", save_path=save_path)
      )

  So I removed data/PBMC.
- The CITE-seq methods actually provide 3 datasets (cbmc, pbmc, and cd8). Since there are also 10X pbmc datasets, the `pbmc` name is misleading in the `load_datasets` function. For now we keep Romain's initial pbmc dataset as the default; it consists of the concatenation of `pbmc8k` and `pbmc4k`.
- `concat_datasets` test
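The multi-URL download behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: the class `ToyDownloadableDataset`, the injected `fetch` callable, and the example URLs and file names are hypothetical. Only the attribute names `urls` and `download_names` and the idea of checking for an existing file inside `_download` come from the comment.

```python
import os
import tempfile


class ToyDownloadableDataset:
    """Hypothetical sketch; behaviour is assumed, not taken from scVI."""

    def __init__(self, save_path, urls, download_names, fetch):
        if len(urls) != len(download_names):
            raise ValueError("each url needs a matching download name")
        self.save_path = save_path
        self.urls = list(urls)
        self.download_names = list(download_names)
        self._fetch = fetch  # injected so the sketch needs no network access

    def download(self):
        for url, name in zip(self.urls, self.download_names):
            self._download(url, name)

    def _download(self, url, download_name):
        target = os.path.join(self.save_path, download_name)
        # The existence check lives here, so every file is checked on its own.
        if os.path.exists(target):
            return
        os.makedirs(self.save_path, exist_ok=True)
        with open(target, "wb") as handle:
            handle.write(self._fetch(url))


fetched = []


def fake_fetch(url):
    """Stand-in for a real HTTP request; records which URLs were requested."""
    fetched.append(url)
    return b"placeholder"


with tempfile.TemporaryDirectory() as tmp:
    dataset = ToyDownloadableDataset(
        save_path=tmp,
        urls=["https://example.invalid/rna.csv.gz",
              "https://example.invalid/adt.csv.gz"],
        download_names=["rna.csv.gz", "adt.csv.gz"],
        fetch=fake_fetch,
    )
    dataset.download()
    dataset.download()  # second call finds both files and fetches nothing
    saved_files = sorted(os.listdir(tmp))
```

Placing the check inside `_download` instead of in the caller means a partially downloaded dataset only re-fetches the files that are missing.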

codecov-io · 2018-06-26T13:09:39Z

Codecov Report

Merging #40 into master will increase coverage by 2.33%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master      #40      +/-   ##
==========================================
+ Coverage   89.08%   91.42%   +2.33%     
==========================================
  Files          33       32       -1     
  Lines        1393     1388       -5     
==========================================
+ Hits         1241     1269      +28     
+ Misses        152      119      -33

| Impacted Files | Coverage Δ | |
|---|---|---|
| scvi/dataset/cite_seq.py | `100% <100%> (ø)` | |
| scvi/dataset/dataset10X.py | `100% <100%> (ø)` | ⬆️ |
| scvi/dataset/dataset.py | `93.51% <100%> (+9.68%)` | ⬆️ |
| scvi/dataset/__init__.py | `100% <100%> (ø)` | ⬆️ |
| scvi/dataset/utils.py | `95.65% <100%> (+17.87%)` | ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 597ccd3...dfb610d.

Edouard360 · 2018-06-26T13:33:07Z

closes #47

jeff-regier · 2018-06-26T16:41:03Z

Very nice!

change CbmcDataset dataset to CiteSeqDataset with name option "cbmc" …

7b40950

…or "pbmc" and info about ADT counts (proteins markers), added as attributes of the dataset. This is the data from epitopes useful for having further labelling information.

jeff-regier reviewed Jun 15, 2018

Bump version: 0.1.2 → 0.1.2

9c9f8f7

scverse deleted a comment from imyiningliu Jun 19, 2018

jeff-regier mentioned this pull request Jun 19, 2018

add ADT counts (proteins markers) to PbmcDataset and CbmcDataset #47

Closed

jeff-regier and others added 2 commits June 19, 2018 16:17

Merge branch 'master' of github.com:YosefLab/scVI

d226deb

change CbmcDataset dataset to CiteSeqDataset with name option "cbmc" …

c60c76e

…or "pbmc" and info about ADT counts (proteins markers), added as attributes of the dataset. This is the data from epitopes useful for having further labelling information.

jeff-regier force-pushed the citeSeq branch from 7b40950 to c60c76e Compare June 19, 2018 23:23

Edouard360 added 4 commits June 23, 2018 10:02

Merge branch 'citeSeq' of https://github.com/YosefLab/scVI into citeSeq

1faaf45

Concat datasets correction + Cite_Seq download centered log ratio for…

51d4d12

… ADTs

Merge branch 'master' into citeSeq

93c3978

fix flake8

dfb610d

jeff-regier merged commit 6f2a934 into master Jun 26, 2018

jeff-regier deleted the citeSeq branch June 26, 2018 16:41
