Key error with gene_id-1 #21

Closed
Vyoming opened this issue Jul 12, 2019 · 5 comments

Vyoming commented Jul 12, 2019

I am following the notebook to better understand the steps of single-cell RNA-seq analysis. I am running it on macOS 10.12.6 with R 3.6.1, using the original tutorial and the data you have kindly provided. I had to change the original line (below)

from gprofiler import GProfiler

to

from gprofiler import gprofiler

Everything was working great until I attempted to concatenate to the main adata object. At that point a KeyError appears, claiming that the key "gene_id-1" does not exist. The error message is below. Thank you so much for your help!


KeyError                                  Traceback (most recent call last)
/Applications/miniconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2656             try:
-> 2657                 return self._engine.get_loc(key)
   2658             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'gene_id-1'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-7-a74b7db26376> in <module>
     30     # Concatenate to main adata object
     31     adata = adata.concatenate(adata_tmp, batch_key='sample_id')
---> 32     adata.var['gene_id'] = adata.var['gene_id-1']
     33     adata.var.drop(columns = ['gene_id-1', 'gene_id-0'], inplace=True)
     34     adata.obs.drop(columns=['sample_id'], inplace=True)

/Applications/miniconda3/lib/python3.7/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2925             if self.columns.nlevels > 1:
   2926                 return self._getitem_multilevel(key)
-> 2927             indexer = self.columns.get_loc(key)
   2928             if is_integer(indexer):
   2929                 indexer = [indexer]

/Applications/miniconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2657                 return self._engine.get_loc(key)
   2658             except KeyError:
-> 2659                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2660         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2661         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'gene_id-1'

LuckyMD commented Jul 12, 2019

Hi @Vyoming,

Thanks for your error report. The gprofiler module that you installed was probably python-gprofiler and not gprofiler-official. The latter is the updated version used in the newest notebook. The older notebooks use python-gprofiler, which is no longer maintained. I would suggest you switch to gprofiler-official to run the notebook.
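
To check which of the two packages is present in an environment, a minimal sketch along these lines may help. It assumes Python 3.8 or newer (for `importlib.metadata`) and uses the distribution names exactly as given in this thread; the helper name `installed_gprofiler_distributions` is hypothetical.

```python
from importlib import metadata

# Distribution names as mentioned in this thread. Treat them as assumptions
# if your package index lists the packages under different names.
CANDIDATES = ("gprofiler-official", "python-gprofiler")


def installed_gprofiler_distributions():
    """Return a dict mapping each installed candidate distribution to its version."""
    found = {}
    for name in CANDIDATES:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # this distribution is not installed
    return found


print(installed_gprofiler_distributions())
```

If only python-gprofiler is listed, that would be consistent with `from gprofiler import GProfiler` failing as described in the original report.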

Secondly, your error is related to a recent update to scanpy, where concatenate no longer duplicates the var columns. It should work with your version of scanpy if you comment out the following lines:

#adata.var['gene_id'] = adata.var['gene_id-1']
#adata.var.drop(columns = ['gene_id-1', 'gene_id-0'], inplace=True)

Let me know if that solves your issues.
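
As an alternative to commenting the lines out, the column handling could be made conditional so the same cell runs whether or not the suffixed columns exist. The following is only a sketch: the helper names `suffixed_columns` and `consolidate_gene_id` are hypothetical, and `var` is assumed to be a pandas DataFrame such as `adata.var`.

```python
def suffixed_columns(columns, base="gene_id"):
    """Return column names of the form '<base>-<digits>', e.g. 'gene_id-0'."""
    prefix = base + "-"
    return [
        c for c in columns
        if isinstance(c, str) and c.startswith(prefix) and c[len(prefix):].isdigit()
    ]


def consolidate_gene_id(var, base="gene_id"):
    """Collapse batch-suffixed copies of `base` into one column, in place.

    `var` is assumed to be a pandas DataFrame such as adata.var.
    If no suffixed copies exist, the frame is left untouched.
    """
    suffixed = suffixed_columns(var.columns, base)
    if suffixed:
        # Mirror the notebook, which copies from '<base>-1' when it exists.
        preferred = f"{base}-1"
        source = preferred if preferred in suffixed else suffixed[0]
        var[base] = var[source]
        var.drop(columns=suffixed, inplace=True)
    return var
```

In the notebook loop this would be called as `consolidate_gene_id(adata.var)` right after the `concatenate` call, replacing the two lines above.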

Vyoming commented Jul 15, 2019

Thanks, it worked perfectly!

davidtourigny commented May 27, 2020

Hi, I have encountered what I assume is a related issue using the latest version of the notebook and following the environment setup instructions precisely. I am running on macOS 10.13.6, which I understand may have conda build issues, but for me this was not a problem. I very much doubt the error is macOS-related, but I would be pleased to be corrected. The issue is also mentioned briefly in #28 and "solved" by commenting out the lines of code responsible, but I would prefer not to do that! Please let me know if I have missed something obvious, or if this should be opened as a separate issue.

The error message returned by the notebook (run in a conda environment) is

... reading from cache file cache/..-data-Haber-et-al_mouse-intestinal-epithelium-GSE92332_RAW-GSM2836574_Regional_Duo_M2_matrix.h5ad

---------------------------------------------------------------------------
InvalidIndexError                         Traceback (most recent call last)
<ipython-input-15-01aaadebece1> in <module>
     30 
     31     # Concatenate to main adata object
---> 32     adata = adata.concatenate(adata_tmp, batch_key='sample_id')
     33     adata.var['gene_id'] = adata.var['gene_id-1']
     34     adata.var.drop(columns = ['gene_id-1', 'gene_id-0'], inplace=True)

~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/anndata/_core/anndata.py in concatenate(self, join, batch_key, batch_categories, uns_merge, index_unique, fill_value, *adatas)
   1696         all_adatas = (self,) + tuple(adatas)
   1697 
-> 1698         out = concat(
   1699             all_adatas,
   1700             join=join,

~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/anndata/_core/merge.py in concat(adatas, join, batch_key, batch_categories, uns_merge, index_unique, fill_value)
    454 
    455     var_names = resolve_index([a.var_names for a in adatas], join=join)
--> 456     reindexers = [
    457         gen_reindexer(var_names, a.var_names, fill_value=fill_value) for a in adatas
    458     ]

~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/anndata/_core/merge.py in <listcomp>(.0)
    455     var_names = resolve_index([a.var_names for a in adatas], join=join)
    456     reindexers = [
--> 457         gen_reindexer(var_names, a.var_names, fill_value=fill_value) for a in adatas
    458     ]
    459 

~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/anndata/_core/merge.py in gen_reindexer(new_var, cur_var, fill_value)
    255     new_size = len(new_var)
    256     old_size = len(cur_var)
--> 257     new_pts = new_var.get_indexer(cur_var)
    258     cur_pts = np.arange(len(new_pts))
    259 

~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_indexer(self, target, method, limit, tolerance)
   2731 
   2732         if not self.is_unique:
-> 2733             raise InvalidIndexError(
   2734                 "Reindexing only valid with uniquely valued Index objects"
   2735             )

InvalidIndexError: Reindexing only valid with uniquely valued Index objects

UPDATE: I suppose what I am asking for is clarification on which lines in the notebook can be commented out, why, and what effect this will have on downstream analysis. I am slightly confused by the response given to #25.

LuckyMD commented May 27, 2020

Hi @davidtourigny,

This looks like a separate issue from the one closed here. Could you please open a separate issue for that? It appears to be related to the concatenation rather than the var column duplication (which is no longer done in newer anndata versions).
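
The traceback above ends in "Reindexing only valid with uniquely valued Index objects", which pandas raises when an index contains repeated values. One way to check for that before concatenating is sketched below. The helper name `duplicated_names` and the gene symbols are made up for illustration, and whether duplicated variable names are the actual cause here is not confirmed in this thread.

```python
from collections import Counter


def duplicated_names(names):
    """Return the sorted names that occur more than once in `names`."""
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


# Made-up gene symbols for illustration only.
example = ["Gm1", "Actb", "Gm1", "Xkr4"]
print(duplicated_names(example))  # ['Gm1']
```

Calling `duplicated_names(adata_tmp.var_names)` before the `concatenate` step would show whether any names repeat. AnnData objects also provide a `var_names_make_unique()` method that appends suffixes to repeated names, which may be worth trying if duplicates turn up.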

@davidtourigny

Sorry, just saw your post immediately after updating my own. Sure, I will open this as a new issue.
