Adding MAGIC to Scanpy #187

scottgigante · 2018-07-02T23:23:03Z

Hi,

We spoke a while ago about adding PHATE and eventually MAGIC to Scanpy. MAGIC has just been submitted to CRAN and is in a stable state.

How should a tool such as MAGIC interact with Scanpy? Do you currently have any imputation methods included in the package that I can use to model the API?

Thanks,
Scott

falexwolf · 2018-07-03T19:12:00Z

Hi Scott,

Sure, I remember! 😄 For some reason, I forgot to mention you personally in the release notes; that is now fixed. Sorry about that!

You could add MAGIC as a preprocessing function, similar to DCA, in the imputation section: http://scanpy.readthedocs.io/en/latest/api/index.html#preprocessing-pp.

In terms of code, I would also adopt the conventions of DCA: https://github.com/theislab/scanpy/blob/master/scanpy/preprocessing/dca.py

We had some discussions on how to do this best: #142 and #186.

If you think you have better conventions, happy to adopt these. DCA is also not yet released...

Best,
Alex
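A minimal sketch of the kind of preprocessing-style convention discussed above, assuming the usual Scanpy pattern of modifying the data object in place and returning `None` unless `copy=True`. The names `impute_like_pp` and `toy_smooth` are hypothetical, the data object is a plain stand-in with an `.X` attribute rather than a real `AnnData`, and the smoothing function is a toy placeholder, not MAGIC or DCA:

```python
from copy import deepcopy
from types import SimpleNamespace

import numpy as np


def toy_smooth(X):
    """Toy stand-in for an imputation method (NOT MAGIC or DCA).

    Averages each cell with the mean expression profile of all cells.
    """
    return 0.5 * X + 0.5 * X.mean(axis=0)


def impute_like_pp(adata, impute_fn, copy=False):
    """Hypothetical wrapper following a preprocessing-style convention.

    Overwrites `adata.X` in place and returns None, unless `copy=True`,
    in which case the input is left untouched and a modified copy is returned.
    """
    target = deepcopy(adata) if copy else adata
    target.X = impute_fn(np.asarray(target.X, dtype=float))
    return target if copy else None


# Stand-in for an AnnData object: 2 cells x 2 genes.
adata = SimpleNamespace(X=np.array([[0.0, 2.0], [2.0, 0.0]]))
imputed = impute_like_pp(adata, toy_smooth, copy=True)  # original is unchanged
impute_like_pp(adata, toy_smooth)                       # now modified in place
```

The in-place default keeps memory use low for large matrices, while `copy=True` lets a user compare the raw and imputed values side by side.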

wangjiawen2013 · 2018-07-09T08:50:23Z

@falexwolf
MAGIC uses a square root transformation rather than the more common log transformation, which makes it incompatible with batch correction methods such as CCA and MNN. Is DCA compatible with MNN and CCA?

scottgigante · 2018-07-09T14:35:48Z

@wangjiawen2013 we recommend using square root transform with MAGIC but it's certainly not incompatible. So long as the inputs have been library size normalized and transformed with any of log, sqrt, arcsinh or some other sublinear transformation, MAGIC will work just fine.
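The preprocessing described above can be sketched in plain NumPy. This is only an illustration under stated assumptions: the function names are hypothetical, scaling to the median library size is one common choice rather than something stated in this thread, and `log` is implemented as `log1p` (a pseudocount of 1) so that zero counts stay finite:

```python
import numpy as np


def library_size_normalize(counts, target_sum=None):
    """Scale each cell (row) so that all cells have the same total count.

    `target_sum` defaults to the median library size (an assumption).
    """
    counts = np.asarray(counts, dtype=float)
    lib_size = counts.sum(axis=1, keepdims=True)
    if target_sum is None:
        target_sum = np.median(lib_size)
    # Avoid division by zero for empty cells; their rows stay all-zero.
    safe_lib_size = np.where(lib_size == 0, 1.0, lib_size)
    return counts / safe_lib_size * target_sum


# Sublinear transforms named in the comment above.
TRANSFORMS = {
    "sqrt": np.sqrt,
    "log": np.log1p,      # log(1 + x); the pseudocount is an assumption
    "arcsinh": np.arcsinh,
}


def preprocess_for_imputation(counts, transform="sqrt"):
    """Library-size normalize, then apply the chosen sublinear transform."""
    return TRANSFORMS[transform](library_size_normalize(counts))


# Two cells with the same relative expression but different depths.
counts = np.array([[1, 3], [10, 30]])
prepared = preprocess_for_imputation(counts, transform="sqrt")
```

After normalization the two example cells have identical profiles, so any of the three transforms maps them to identical rows.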

scottgigante · 2018-07-09T14:36:58Z

Also I'm surprised to see I never left a note on your message @falexwolf : thanks! I'm working on the API now, will send in a PR when it's done or leave a note here if I think the DCA API could do with some modification.

wangjiawen2013 · 2018-07-12T01:16:04Z

I found some negative values in the imputed data. How were they generated?

scottgigante · 2018-07-12T01:19:46Z

@wangjiawen2013 if you have any questions about MAGIC, I recommend you post them in the MAGIC repo. The negative values are an artifact of the imputation process, but the absolute values of expression are not really important, since normalized scRNAseq data is only really a measure of relative expression anyway.

Using the progress bar from tqdm.auto causes an `ImportError` when `ipywidgets` is not installed.

```console
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
~/anaconda3/envs/test_scanpy/lib/python3.8/site-packages/tqdm/notebook.py in status_printer(_, total, desc, ncols)
     97             else:  # No total? Show info style bar with no progress tqdm status
---> 98                 pbar = IProgress(min=0, max=1)
     99                 pbar.value = 1

NameError: name 'IProgress' is not defined

During handling of the above exception, another exception occurred:

ImportError                               Traceback (most recent call last)
<ipython-input-5-ec5b1e8cd660> in <module>
----> 1 sc.datasets.moignard15()

~/anaconda3/envs/test_scanpy/lib/python3.8/site-packages/scanpy/datasets/__init__.py in moignard15()
    108     filename = settings.datasetdir / 'moignard15/nbt.3154-S3.xlsx'
    109     backup_url = 'http://www.nature.com/nbt/journal/v33/n3/extref/nbt.3154-S3.xlsx'
--> 110     adata = sc.read(filename, sheet='dCt_values.txt', backup_url=backup_url)
    111     # filter out 4 genes as in Haghverdi et al. (2016)
    112     gene_subset = ~np.in1d(adata.var_names, ['Eif2b1', 'Mrpl19', 'Polr2a', 'Ubc'])

~/anaconda3/envs/test_scanpy/lib/python3.8/site-packages/scanpy/readwrite.py in read(filename, backed, sheet, ext, delimiter, first_column_names, backup_url, cache, **kwargs)
     92     filename = Path(filename)  # allow passing strings
     93     if is_valid_filename(filename):
---> 94         return _read(
     95             filename, backed=backed, sheet=sheet, ext=ext,
     96             delimiter=delimiter, first_column_names=first_column_names,

~/anaconda3/envs/test_scanpy/lib/python3.8/site-packages/scanpy/readwrite.py in _read(filename, backed, sheet, ext, delimiter, first_column_names, backup_url, cache, suppress_cache_warning, **kwargs)
    489     else:
    490         ext = is_valid_filename(filename, return_ext=True)
--> 491     is_present = check_datafile_present_and_download(
    492         filename,
    493         backup_url=backup_url,

~/anaconda3/envs/test_scanpy/lib/python3.8/site-packages/scanpy/readwrite.py in check_datafile_present_and_download(path, backup_url)
    745         path.parent.mkdir(parents=True)
    746
--> 747     download(backup_url, path)
    748     return True
    749

~/anaconda3/envs/test_scanpy/lib/python3.8/site-packages/scanpy/readwrite.py in download(url, path)
    722
    723     path.parent.mkdir(parents=True, exist_ok=True)
--> 724     with tqdm(unit='B', unit_scale=True, miniters=1, desc=path.name) as t:
    725         def update_to(b=1, bsize=1, tsize=None):
    726             if tsize is not None:

~/anaconda3/envs/test_scanpy/lib/python3.8/site-packages/tqdm/notebook.py in __init__(self, *args, **kwargs)
    206         unit_scale = 1 if self.unit_scale is True else self.unit_scale or 1
    207         total = self.total * unit_scale if self.total else self.total
--> 208         self.container = self.status_printer(
    209             self.fp, total, self.desc, self.ncols)
    210         self.sp = self.display

~/anaconda3/envs/test_scanpy/lib/python3.8/site-packages/tqdm/notebook.py in status_printer(_, total, desc, ncols)
    101         except NameError:
    102             # #187 #451 #558
--> 103             raise ImportError(
    104                 "FloatProgress not found. Please update jupyter and ipywidgets."
    105                 " See https://ipywidgets.readthedocs.io/en/stable"

ImportError: FloatProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
```
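One possible workaround for the failure above, sketched under the assumption that falling back to the plain text progress bar is acceptable: pick `tqdm.auto` only when `ipywidgets` can be found, and the base `tqdm` module otherwise. The helper names are hypothetical and this is not necessarily how the issue was resolved in Scanpy:

```python
import importlib
import importlib.util


def tqdm_module_name(ipywidgets_available=None):
    """Return the name of the tqdm module to import.

    If `ipywidgets_available` is None, probe for the package without
    importing it. `tqdm.auto` may select the notebook widget bar, which
    needs ipywidgets; plain `tqdm` always uses the text bar.
    """
    if ipywidgets_available is None:
        ipywidgets_available = importlib.util.find_spec("ipywidgets") is not None
    return "tqdm.auto" if ipywidgets_available else "tqdm"


def load_tqdm():
    """Import and return a tqdm class that should not need ipywidgets
    unless ipywidgets is actually installed."""
    module = importlib.import_module(tqdm_module_name())
    return module.tqdm
```

A caller would then write `tqdm = load_tqdm()` once and use it exactly as in the `download` function shown in the traceback.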

wflynny mentioned this issue Jul 3, 2018

Imputation methods #189

Open

scottgigante mentioned this issue Jul 12, 2018

Add new MAGIC API #195

Merged

falexwolf closed this as completed in #195 Jul 13, 2018
