This repository has been archived by the owner on Jan 9, 2023. It is now read-only.

Problem with saving DataFrames to root files when they contain dtype object #38

Closed
FerdinandEiteneuer opened this issue Mar 3, 2017 · 9 comments

Comments

@FerdinandEiteneuer

FerdinandEiteneuer commented Mar 3, 2017

Hi,

I have several text files containing strings and floats. I can load them into a pandas DataFrame without problems.
I also need to save them in ROOT format and wanted to use this library for that.

However, when I call the to_root function on my DataFrame I get the following:

UserWarning: converter for dtype('O') is not implemented skipping

And indeed, when I later load the ROOT file with read_root, the columns containing strings are missing.

I then tried this:

>>> for c in df.columns:
...     if df[c].dtype == object:
...         df[c] = df[c].astype(str)

Now I did not get any warning from to_root. However, when I later load the ROOT file into a pandas DataFrame, the string columns are still missing.

How to fix this?
Thank you very much

@maxnoe
Contributor

maxnoe commented Mar 5, 2017

You need to convert the str columns to the numpy string type S.

However, there seems to be a bug in pandas that prevents this on reassignment.
I filed an issue for it here: pandas-dev/pandas#15575

In the meantime, you might need to copy your data:

# Build a fresh DataFrame so the converted dtype is not lost on reassignment
df2 = pd.DataFrame()
for name, column in df.items():
    if column.dtype == object:
        df2[name] = column.astype('S')
    else:
        df2[name] = column

The S column type requires a maximum string length; by default it takes the length of the longest string
in the series. If you want to set the maximum length yourself, you can use column.astype('S10') for a maximum length of 10.
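
For illustration, here is a minimal sketch of the same conversion done on the NumPy side, so that it does not depend on how pandas stores the column. It only shows how the S dtype and its width behave. The helper name to_structured and its width parameter are made up for this example, root_numpy is not called, and whether root_numpy accepts the resulting array is an assumption based on the comment above, not something verified here.

```python
import numpy as np
import pandas as pd


def to_structured(frame, width=None):
    """Copy a DataFrame into a NumPy structured array, storing
    object (string) columns as fixed-width bytes ('S').

    Hypothetical helper for illustration only; not part of
    root_pandas or root_numpy.
    """
    # 'S' lets NumPy pick the longest string; 'S10' forces a width of 10.
    string_dtype = 'S' if width is None else 'S{}'.format(width)

    columns = {}
    for name in frame.columns:
        values = frame[name].to_numpy()
        if values.dtype == object:
            # Assumes every entry is an ASCII-encodable string.
            values = values.astype(string_dtype)
        columns[str(name)] = values

    dtype = [(name, values.dtype) for name, values in columns.items()]
    out = np.empty(len(frame), dtype=dtype)
    for name, values in columns.items():
        out[name] = values
    return out


df = pd.DataFrame({'name': ['Hello', 'World'], 'x': [1.0, 2.0]})
arr = to_structured(df)
print(arr.dtype)
print(to_structured(df, width=10).dtype)
```

Note that strings longer than an explicitly given width are silently truncated by NumPy, and non-ASCII strings raise an encoding error in this sketch.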

@maxnoe
Contributor

maxnoe commented Mar 5, 2017

They actually consider it a bug that the dtype on new assignments is not object.

@FerdinandEiteneuer
Author

Hi,

thanks for taking the time. Unfortunately, your workaround does not work for me :(
Even if I create this new df2 and call .astype('S') or .astype('S10'), the column stays of type object.
I also tried what you did in pandas-dev/pandas#15575, and there too my output of

import pandas as pd
df = pd.DataFrame({'a': ['Hello', 'World']})
df['a'] = df['a'].astype('S')
df['b'] = df['a'].astype('S')
print(df.dtypes)

is simply object for df['b'] as well.

@wiso mentioned this issue Aug 30, 2017

@MatousVozak

Hi All,

Has this been resolved at all? I ran into the same issue when saving strings to a ROOT file.

UserWarning: converter for dtype('O') is not implemented (skipping)
  cobj = _librootnumpy.array2tree_toCObj(arr, name=name, tree=incobj)

Best,
Mat

@chrisburr
Member

@MatousVozak What are the contents of the column that is an object? Strings, arrays or something else? The issue is inside root_numpy, but it's unlikely to be fixed unless you're willing to make a pull request, as it has been effectively deprecated in favour of uproot.

That said, a better question is whether you need to save to a ROOT file at all. You might be better served by a file format natively supported by pandas, like HDF5.

@MatousVozak

Hi @chrisburr, yes, it is a string, and I needed to save it into a ROOT file. I simply wanted to change the entries of one branch of type char/string. As this was a hot fix and I couldn't find a quick workaround, I eventually turned to PyROOT to do the job.

Best,
Mat

@goi42

goi42 commented Jul 1, 2020

Was a workaround ever found for this?

@eduardo-rodrigues
Member

Kind ping to @chrisburr ... though you may want to look at the uproot package at this stage?

@eduardo-rodrigues
Member

As has been explicitly stated in the README for a while, root_pandas, and root_numpy on which it depends, have been deprecated and effectively unmaintained for quite some time. We decided to close anything outstanding as "won't do" and archive the package at this point.
