Key error with gene_id-1 #21

Vyoming · 2019-07-12T20:38:46Z

I am following the notebook to better understand the steps of single-cell RNA-seq analysis. I am running it on macOS 10.12.6 with R 3.6.1, using the original tutorial and the data you have kindly provided. I had to change the original import line (below)

from gprofiler import GProfiler

to

from gprofiler import gprofiler

Everything was working great until I attempted to concatenate to the main adata object, at which point a KeyError appears claiming that the key "gene_id-1" does not exist. The error message is below. Thank you so much for your help!


KeyError                                  Traceback (most recent call last)
/Applications/miniconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2656             try:
-> 2657                 return self._engine.get_loc(key)
   2658             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'gene_id-1'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-7-a74b7db26376> in <module>
     30     # Concatenate to main adata object
     31     adata = adata.concatenate(adata_tmp, batch_key='sample_id')
---> 32     adata.var['gene_id'] = adata.var['gene_id-1']
     33     adata.var.drop(columns = ['gene_id-1', 'gene_id-0'], inplace=True)
     34     adata.obs.drop(columns=['sample_id'], inplace=True)

/Applications/miniconda3/lib/python3.7/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2925             if self.columns.nlevels > 1:
   2926                 return self._getitem_multilevel(key)
-> 2927             indexer = self.columns.get_loc(key)
   2928             if is_integer(indexer):
   2929                 indexer = [indexer]

/Applications/miniconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2657                 return self._engine.get_loc(key)
   2658             except KeyError:
-> 2659                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2660         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2661         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'gene_id-1'

LuckyMD · 2019-07-12T21:34:57Z

Hi @Vyoming,

Thanks for your error report. The gprofiler module that you installed was probably python-gprofiler and not gprofiler-official. The latter is the updated version used in the newest notebook. The older notebooks use python-gprofiler, which is no longer updated. I would suggest you switch to gprofiler-official to run the notebook.
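
As a rough way to check which of the two packages an environment actually has, the sketch below inspects the importable `gprofiler` module. It assumes, based only on the two import lines quoted in this thread, that gprofiler-official exposes a `GProfiler` class and python-gprofiler exposes a lowercase `gprofiler` function; the helper name `detect_gprofiler_flavor` is hypothetical.

```python
import importlib


def detect_gprofiler_flavor():
    """Guess which g:Profiler client is installed (hypothetical helper).

    Assumption from the thread: gprofiler-official provides `GProfiler`,
    while python-gprofiler provides a lowercase `gprofiler` function.
    """
    try:
        module = importlib.import_module("gprofiler")
    except ImportError:
        return None  # neither package is installed
    # Check for the class first: it is the marker of the newer package.
    if hasattr(module, "GProfiler"):
        return "gprofiler-official"
    if hasattr(module, "gprofiler"):
        return "python-gprofiler"
    return "unknown"


flavor = detect_gprofiler_flavor()
print("Detected g:Profiler client:", flavor)
```

If this reports python-gprofiler, the newest notebook's `from gprofiler import GProfiler` line would be expected to fail until gprofiler-official is installed instead.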

Secondly, your error is related to a recent update to scanpy, where concatenate no longer duplicates the var columns. It should work with your version of scanpy if you comment out the following lines:

#adata.var['gene_id'] = adata.var['gene_id-1']
#adata.var.drop(columns = ['gene_id-1', 'gene_id-0'], inplace=True)

Let me know if that solves your issues.
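
An alternative to commenting the lines out is to guard them, so the same cell runs whether or not the suffixed columns exist. The sketch below uses a plain pandas DataFrame as a stand-in for `adata.var`; the function name `tidy_gene_id_columns` and the placeholder IDs are hypothetical and not part of the notebook.

```python
import pandas as pd


def tidy_gene_id_columns(var: pd.DataFrame) -> pd.DataFrame:
    """Collapse suffixed gene_id columns when present; otherwise change nothing."""
    var = var.copy()
    # Older behaviour: concatenate produced 'gene_id-0' and 'gene_id-1'.
    if "gene_id-1" in var.columns:
        var["gene_id"] = var["gene_id-1"]
    # Drop only the suffixed columns that actually exist.
    suffixed = [c for c in ("gene_id-0", "gene_id-1") if c in var.columns]
    return var.drop(columns=suffixed)


# Stand-in for adata.var with duplicated, suffixed columns (placeholder IDs).
old_style = pd.DataFrame(
    {"gene_id-0": ["id_a", "id_b"], "gene_id-1": ["id_a", "id_b"]},
    index=["GeneA", "GeneB"],
)
# Stand-in for adata.var where the column was not duplicated.
new_style = pd.DataFrame({"gene_id": ["id_a", "id_b"]}, index=["GeneA", "GeneB"])

print(tidy_gene_id_columns(old_style).columns.tolist())
print(tidy_gene_id_columns(new_style).columns.tolist())
```

In the notebook, the equivalent would presumably be `adata.var = tidy_gene_id_columns(adata.var)` right after the concatenation step, though that usage is untested here.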

Vyoming · 2019-07-15T16:23:38Z

Thanks, it worked perfectly!

davidtourigny · 2020-05-27T23:15:04Z

Hi, I have encountered what I assume is a related issue using the latest version of the notebook and following the environment setup instructions precisely. I am running macOS 10.13.6, which I understand may have conda build issues, but that was not a problem for me. I very much doubt the error is macOS-related, but I am happy to be corrected. The issue is also mentioned briefly in #28 and "solved" by commenting out the lines of code responsible, but I would prefer not to do that! Please let me know if I have missed something obvious, or if this should be opened as a separate issue.

The error message returned by the notebook (run in the conda environment) is:

... reading from cache file cache/..-data-Haber-et-al_mouse-intestinal-epithelium-GSE92332_RAW-GSM2836574_Regional_Duo_M2_matrix.h5ad

---------------------------------------------------------------------------
InvalidIndexError                         Traceback (most recent call last)
<ipython-input-15-01aaadebece1> in <module>
     30 
     31     # Concatenate to main adata object
---> 32     adata = adata.concatenate(adata_tmp, batch_key='sample_id')
     33     adata.var['gene_id'] = adata.var['gene_id-1']
     34     adata.var.drop(columns = ['gene_id-1', 'gene_id-0'], inplace=True)

~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/anndata/_core/anndata.py in concatenate(self, join, batch_key, batch_categories, uns_merge, index_unique, fill_value, *adatas)
   1696         all_adatas = (self,) + tuple(adatas)
   1697 
-> 1698         out = concat(
   1699             all_adatas,
   1700             join=join,

~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/anndata/_core/merge.py in concat(adatas, join, batch_key, batch_categories, uns_merge, index_unique, fill_value)
    454 
    455     var_names = resolve_index([a.var_names for a in adatas], join=join)
--> 456     reindexers = [
    457         gen_reindexer(var_names, a.var_names, fill_value=fill_value) for a in adatas
    458     ]

~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/anndata/_core/merge.py in <listcomp>(.0)
    455     var_names = resolve_index([a.var_names for a in adatas], join=join)
    456     reindexers = [
--> 457         gen_reindexer(var_names, a.var_names, fill_value=fill_value) for a in adatas
    458     ]
    459 

~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/anndata/_core/merge.py in gen_reindexer(new_var, cur_var, fill_value)
    255     new_size = len(new_var)
    256     old_size = len(cur_var)
--> 257     new_pts = new_var.get_indexer(cur_var)
    258     cur_pts = np.arange(len(new_pts))
    259 

~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_indexer(self, target, method, limit, tolerance)
   2731 
   2732         if not self.is_unique:
-> 2733             raise InvalidIndexError(
   2734                 "Reindexing only valid with uniquely valued Index objects"
   2735             )

InvalidIndexError: Reindexing only valid with uniquely valued Index objects

UPDATE: I suppose what I am asking for is clarification on which lines in the notebook can be commented out, why, and what effect this will have on downstream analysis. I am slightly confused by the response given to #25.

LuckyMD · 2020-05-27T23:28:18Z

Hi @davidtourigny,

This looks like a separate issue from the one closed here. Could you please open a separate issue for that? It appears to be related to the concatenation rather than the var column duplication (which is no longer done in newer anndata versions).
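
The traceback above shows pandas raising `InvalidIndexError` because the index being reindexed is not unique, which here is built from `var_names`. The sketch below is one way to list repeated names before concatenating. It works on plain pandas and does not touch anndata; the helper name `duplicated_names` and the example gene names are hypothetical.

```python
import pandas as pd


def duplicated_names(names):
    """Return the names that occur more than once, each listed once, sorted."""
    index = pd.Index(names)
    # duplicated() flags every occurrence after the first one.
    return sorted(index[index.duplicated()].unique())


example_var_names = ["GeneA", "GeneB", "GeneA", "GeneC"]  # placeholder names
print(pd.Index(example_var_names).is_unique)
print(duplicated_names(example_var_names))
```

With a real object, the input would be `adata.var_names`. AnnData also provides `var_names_make_unique()`, which is commonly called before concatenation; whether it resolves this particular case is not confirmed in this thread.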

davidtourigny · 2020-05-27T23:30:14Z

Sorry, just saw your post immediately after updating my own. Sure, I will open this as a new issue.

LuckyMD closed this as completed Jul 16, 2019

LuckyMD mentioned this issue Aug 18, 2019

Error in samples concatenation #25

Closed

federicaress mentioned this issue Mar 5, 2020

Exception: Data must be 1-dimensional when plotting new marker genes in jupyter notebook #28

Closed

davidtourigny mentioned this issue May 27, 2020

Problem with adata concatenation #41

Closed
