write() function error: 'reserved name for dataframe columns' #255

rfenouil · 2020-07-22T13:24:51Z

Hello, I get an error message when trying to save intermediate results as a binary file using the adata.write() function.
The error seems to occur only when using the Seurat wrapper found here, not when following the tutorial with the 'pancreas' dataset.

See below for R and Python code to reproduce:

library(Seurat)
library(SeuratDisk)
library(SeuratWrappers)

curl::curl_download(url = 'http://pklab.med.harvard.edu/velocyto/mouseBM/SCG71.loom', destfile = "/data.loom")

ldat <- ReadVelocity(file = "/data.loom")
bm <- as.Seurat(x = ldat) 
bm[["RNA"]] <- bm[["spliced"]]
bm <- SCTransform(bm)
bm <- RunPCA(bm)
bm <- RunUMAP(bm, dims = 1:20)
bm <- FindNeighbors(bm, dims = 1:20)
bm <- FindClusters(bm)
DefaultAssay(bm) <- "RNA"
SaveH5Seurat(bm, filename = "/mouseBM.h5Seurat")
Convert("/mouseBM.h5Seurat", dest = "h5ad")

import scvelo as scv

scv.settings.verbosity = 3  # show errors(0), warnings(1), info(2), hints(3)
scv.settings.presenter_view = True  # set max width size for presenter view
scv.settings.set_figure_params('scvelo')  # for beautified visualization

adata = scv.read("/mouseBM.h5ad")

adata.write("/mouseBM_processed.h5ad")

Error

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/anndata/_io/utils.py", line 188, in func_wrapper
    return func(elem, key, val, *args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/anndata/_io/h5ad.py", line 241, in write_dataframe
    raise ValueError(f"{reserved!r} is a reserved name for dataframe columns.")
ValueError: '_index' is a reserved name for dataframe columns.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/dist-packages/anndata/_core/anndata.py", line 1852, in write_h5ad
    as_dense=as_dense,
  File "/usr/local/lib/python3.7/dist-packages/anndata/_io/h5ad.py", line 104, in write_h5ad
    write_attribute(f, "raw", adata.raw, dataset_kwargs=dataset_kwargs)
  File "/usr/lib/python3.7/functools.py", line 827, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/usr/local/lib/python3.7/dist-packages/anndata/_io/h5ad.py", line 126, in write_attribute_h5ad
    _write_method(type(value))(f, key, value, *args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/anndata/_io/h5ad.py", line 135, in write_raw
    write_attribute(f, "raw/var", value.var, dataset_kwargs=dataset_kwargs)
  File "/usr/lib/python3.7/functools.py", line 827, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/usr/local/lib/python3.7/dist-packages/anndata/_io/h5ad.py", line 126, in write_attribute_h5ad
    _write_method(type(value))(f, key, value, *args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/anndata/_io/utils.py", line 195, in func_wrapper
    ) from e
ValueError: '_index' is a reserved name for dataframe columns.

Above error raised while writing key 'raw/var' of <class 'h5py._hl.files.File'> from /.

Versions:

scvelo==0.2.1 scanpy==1.5.1 anndata==0.7.4 loompy==3.0.6 numpy==1.19.0 scipy==1.5.1 matplotlib==3.2.2 sklearn==0.23.1 pandas==1.0.5

Thank you for the great work and your help.


VolkerBergen · 2020-10-06T14:01:53Z

Running it directly in scvelo works fine

adata = scv.read('data/SCG71.loom',  backup_url='http://pklab.med.harvard.edu/velocyto/mouseBM/SCG71.loom')
adata.write('data/SCG71.h5ad')

Hence, something in the object produced via Seurat triggers that error. Could you please print adata and check whether there is any entry named '_index'?
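The check requested above can be sketched as a small helper. This is only an illustration: `find_reserved_columns` is a hypothetical name, and a `SimpleNamespace` stand-in is used instead of a real AnnData object so the snippet runs with pandas alone. It assumes the object exposes `obs`, `var` and (optionally) `raw.var` as pandas dataframes, as AnnData does.

```python
from types import SimpleNamespace

import pandas as pd


def find_reserved_columns(adata, reserved="_index"):
    """Return the names of annotation tables that have a column called `reserved`."""
    tables = {
        "obs": getattr(adata, "obs", None),
        "var": getattr(adata, "var", None),
    }
    raw = getattr(adata, "raw", None)
    if raw is not None:
        # raw.var is not listed when the object is printed, so check it explicitly
        tables["raw.var"] = raw.var
    return [
        name
        for name, df in tables.items()
        if df is not None and reserved in df.columns
    ]


# Stand-in for an AnnData object (hypothetical data, not from the reporter's file)
demo = SimpleNamespace(
    obs=pd.DataFrame({"seurat_clusters": [0, 1]}),
    var=pd.DataFrame({"features": ["geneA", "geneB"]}),
    raw=SimpleNamespace(var=pd.DataFrame({"_index": ["geneA", "geneB"]})),
)

hits = find_reserved_columns(demo)
print(hits)
```

With a real object one would call `find_reserved_columns(adata)`; any table it reports is a candidate for the failing key named in the traceback (here 'raw/var').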

davisidarta · 2020-11-11T02:00:01Z

Any updates on this? I'm also having this issue when saving .h5ad files derived from .h5ad files created using SeuratDisk, but only after running scv.pp.moments(adata). The same error does not occur when saving the same .h5ad file after performing additional analysis in scanpy; it appears only after calculating moments within scvelo.

mihem · 2020-11-16T16:12:13Z

I'm also having the same issue when running
adata.write(filename = "scvelo.h5ad")

For me it also fails before running any scvelo functions:

adata = scv.read("SeuratObject.h5ad")
adata.write(filename = "scvelo.h5ad")

raises:

ValueError: '_index' is a reserved name for dataframe columns.

It works fine with the dataset that VolkerBergen suggested, though.

@VolkerBergen could you maybe specify where you would expect the "_index" entry to be?

My AnnData object looks like this in the summary:

obs: 'orig.ident', 'nCount_spliced', 'nFeature_spliced', 'nCount_unspliced', 'nFeature_unspliced', 'nCount_ambiguous', 'nFeature_ambiguous', 'nCount_RNA', 'nFeature_RNA', 'library', 'tissue', 'percent_mt', 'seurat_clusters', 'spliced_snn_res.0.3', 'label_new'
    var: 'features', 'ambiguous_features', 'spliced_features', 'unspliced_features'
    obsm: 'X_umap'
    layers: 'ambiguous', 'spliced', 'unspliced'

mariafiruleva · 2020-12-07T15:37:40Z

I guess the source of the problem is the content of adata.__dict__['_raw'].__dict__.
Specifically, adata.__dict__['_raw'].__dict__['_var'] contains a dataframe with all features as rows and a column named _index.
Renaming that column resolves the issue.

adata.__dict__['_raw'].__dict__['_var'] = adata.__dict__['_raw'].__dict__['_var'].rename(columns={'_index': 'features'})

zehualilab · 2021-01-03T17:50:06Z

I guess, the source of problem is content of the df.__dict__['_raw'].__dict__.
Specifically, df.__dict__['_raw'].__dict__['_var'] contains dataframe with all features as rows and _index as column name.
Renaming resolves the issue.
adata.__dict__['_raw'].__dict__['_var'] = adata.__dict__['_raw'].__dict__['_var'].rename(columns={'_index': 'features'})

OMG!!!!OMG!!!!!OMG!!!!OMG!!!!!PROBLEM SOLVED!!!!!!!PROBLEM SOLVED!!!!!!!THX!!!!!!THX!!!!!!!!!!!

genecell · 2021-10-12T08:59:29Z

I guess, the source of problem is content of the df.__dict__['_raw'].__dict__. Specifically, df.__dict__['_raw'].__dict__['_var'] contains dataframe with all features as rows and _index as column name. Renaming resolves the issue.
adata.__dict__['_raw'].__dict__['_var'] = adata.__dict__['_raw'].__dict__['_var'].rename(columns={'_index': 'features'})

This works for me for saving the AnnData object as an h5ad file, but I got the following error when drawing a dotplot:

f"Could not find keys '{not_found}' in columns of `adata.{dim}` or in"
KeyError: "Could not find keys '['AC004791.2', 'ALKBH5', 'APOBEC3A', 'ATHL1', 'BANK1', 'BCL9L', 'BST1', 'C1QA', 'C1QC', 'C1QTNF4', 'CALB2', 'CCR8', 'CD1C', 'CD8B', 'CDK15', 'CLEC10A', 'CMTM8', 'CXCL13', 'CYB561', 'DERL3', 'EOMES', 'FCER1A', 'FCGR3A', 'FGFBP2', 'FOXP3', 'FSCN1', 'GALNT2', 'GNG4', 'GZMK', 'HOXC6', 'HSPA6', 'IDO1', 'IFIT1', 'IFIT3', 'IGFL2', 'IGHG4', 'IL1B', 'IL1RN', 'IL7R', 'KLRF1', 'KRT5', 'KRT86', 'LAD1', 'LEF1', 'LINC00926', 'METRNL', 'MKI67', 'MS4A1', 'MTRNR2L8', 'MZB1', 'NR4A2', 'P2RY6', 'PASK', 'PEMT', 'PTGS2', 'PTPN13', 'PTPRS', 'RNASE1', 'ROR1.AS1', 'RP11.138A9.1', 'RP11.354E11.2', 'RP11.89C3.4', 'RPL34', 'RPL36A', 'RRM2', 'RSAD2', 'RTKN2', 'TLDC2', 'TLR8', 'TOR4A', 'TUBA4A', 'UBE2C', 'ZNF331']' in columns of `adata.obs` or in adata.raw.var_names."

I tried to delete the adata.raw:

del adata.raw

and now I can save the AnnData file, and the dotplot function works as well.
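Before deleting adata.raw, it may help to see which requested names are actually missing from each set of variable names. The following is a minimal sketch, not a diagnosis of the error above: `missing_from` is a hypothetical helper, and the two indexes are made-up values rather than the reporter's data. With a real object, `adata.var_names` and `adata.raw.var_names` would be passed instead.

```python
import pandas as pd


def missing_from(names, index):
    """Return the entries of `names` that are absent from `index`, keeping order."""
    present = set(index)
    return [name for name in names if name not in present]


# Hypothetical values for illustration only
genes = ["CD8B", "ROR1.AS1", "MKI67"]
var_names = pd.Index(["CD8B", "ROR1.AS1", "MKI67"])
raw_var_names = pd.Index(["CD8B", "ROR1-AS1", "MKI67"])

missing_in_var = missing_from(genes, var_names)
missing_in_raw = missing_from(genes, raw_var_names)
print(missing_in_var, missing_in_raw)
```

If names are found in `var_names` but not in the raw names, that would be consistent with the plot working once adata.raw is removed.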

paulitikka · 2022-07-06T13:58:10Z

If anyone still gets this error when saving, also run the following after applying the solution above (adata.__dict__['_raw'].__dict__['_var'] = adata.__dict__['_raw'].__dict__['_var'].rename(columns={'_index': 'features'}); del(adata.raw)):
del(adata.var['_index'])

YY-SONG0718 · 2022-08-02T21:46:53Z

del(adata.var['_index'])

I recently ran into this error again after using the original solution for a while. This solved the issue, thanks!

paulitikka · 2022-08-15T10:59:45Z

You are welcome Yuyao!

Mayank0512 · 2022-10-23T04:33:18Z

I guess, the source of problem is content of the df.__dict__['_raw'].__dict__. Specifically, df.__dict__['_raw'].__dict__['_var'] contains dataframe with all features as rows and _index as column name. Renaming resolves the issue.
adata.__dict__['_raw'].__dict__['_var'] = adata.__dict__['_raw'].__dict__['_var'].rename(columns={'_index': 'features'})

Damn man that works....thanks so much....u are a true savior!!!
Thank youuuuuuuuu again

weir12 · 2023-03-20T12:53:38Z

I guess, the source of problem is content of the df.__dict__['_raw'].__dict__. Specifically, df.__dict__['_raw'].__dict__['_var'] contains dataframe with all features as rows and _index as column name. Renaming resolves the issue.
adata.__dict__['_raw'].__dict__['_var'] = adata.__dict__['_raw'].__dict__['_var'].rename(columns={'_index': 'features'})

BRAVO ! !!

maximilianh · 2023-03-22T15:33:28Z

Oh boy, @mariafiruleva so many thanks!!

This command is a little easier to read, for me at least, and seems to do the same thing:

adata._raw._var.rename(columns={'_index': 'features'}, inplace=True)
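The fixes suggested in this thread (renaming `_index` in the raw var table and removing it from `adata.var`) can be combined into one helper. This is a sketch under assumptions: `rename_reserved` and `clean_for_write` are hypothetical names, dropping the column when 'features' already exists is a choice made here to avoid duplicate column names, and it assumes `adata.raw.var` returns the underlying dataframe rather than a copy (otherwise the `adata._raw._var` form above would be needed). A `SimpleNamespace` stand-in replaces a real AnnData object so the snippet runs with pandas alone.

```python
from types import SimpleNamespace

import pandas as pd


def rename_reserved(df, reserved="_index", new_name="features"):
    """Rename `reserved` in place, or drop it if `new_name` is already taken."""
    if reserved not in df.columns:
        return False
    if new_name in df.columns:
        # Avoid two columns with the same name
        df.drop(columns=[reserved], inplace=True)
    else:
        df.rename(columns={reserved: new_name}, inplace=True)
    return True


def clean_for_write(adata):
    """Remove the reserved column from var and raw.var; return the tables changed."""
    changed = []
    if rename_reserved(adata.var):
        changed.append("var")
    raw = getattr(adata, "raw", None)
    if raw is not None and rename_reserved(raw.var):
        changed.append("raw.var")
    return changed


# Stand-in for an AnnData object (hypothetical data)
demo = SimpleNamespace(
    var=pd.DataFrame({"features": ["geneA", "geneB"], "_index": ["geneA", "geneB"]}),
    raw=SimpleNamespace(var=pd.DataFrame({"_index": ["geneA", "geneB"]})),
)

changed = clean_for_write(demo)
print(changed)
```

With a real object one would call `clean_for_write(adata)` before `adata.write(...)`.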

Tianran1998 · 2024-01-03T02:32:27Z

I guess, the source of problem is content of the df.__dict__['_raw'].__dict__. Specifically, df.__dict__['_raw'].__dict__['_var'] contains dataframe with all features as rows and _index as column name. Renaming resolves the issue.
adata.__dict__['_raw'].__dict__['_var'] = adata.__dict__['_raw'].__dict__['_var'].rename(columns={'_index': 'features'})

It works！Thank you very much！

rfenouil added the bug label Jul 22, 2020

WeilerP closed this as completed May 30, 2021

WeilerP removed the bug label May 30, 2021

majorkazer mentioned this issue May 16, 2022

Issue with saving h5ad file after Convert mojaveazure/seurat-disk#71

Closed

martinkim0 mentioned this issue Jul 21, 2023

Issue converting scanpy back to seurat scverse/scvi-tools#2196

Closed

kuang-da mentioned this issue Sep 11, 2023

ValueError: '_index' is a reserved name for dataframe columns. kuang-da/sc-transmogrifier#5

Open

Bisho2122 mentioned this issue Mar 29, 2024

ValueError: '_index' is a reserved name for dataframe columns when writing multiple tables in Zarr scverse/spatialdata#516

Closed
