This repository was archived by the owner on Jan 9, 2023. It is now read-only.

root_pandas randomly shuffles index of columns #81

@zhangzc11

I recently noticed that when a DataFrame is constructed with root_pandas.read_root, the order of its columns is randomly shuffled. To reproduce, first download the example file:

wget http://scikit-hep.org/uproot/examples/HZZ.root

Here is the test.py code:

#!/usr/bin/env python
import uproot
import root_pandas as rp

variables = ['MET_px', 'MET_py', 'EventWeight']

# Read the same three branches with root_pandas ...
df = rp.read_root('HZZ.root', 'events', columns=variables)

# ... and with uproot, for comparison.
events = uproot.open("HZZ.root")["events"]
df2 = events.pandas.df(variables, flatten=False)

# Print the first row of each DataFrame.
print(df.values[0])
print(df2.values[0])

If you run test.py several times, the order of the values printed from the root_pandas DataFrame (df) changes between runs, while the DataFrame from uproot (df2) is always the same and follows the order of the TBranch name list.

zhicaiz@zhicaiz ~$ python test.py
[2.5636332  5.912771   0.00927101]
[5.912771   2.5636332  0.00927101]
zhicaiz@zhicaiz ~$ python test.py
[0.00927101 2.5636332  5.912771  ]
[5.912771   2.5636332  0.00927101]

root_pandas version I used: v0.6.1
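
A possible workaround, sketched here as an assumption rather than a confirmed fix: if the column labels still match their data and only the order differs, selecting the columns by the requested list should restore a stable order. The helper name in_requested_order is hypothetical, and the small DataFrame below is a stand-in built by hand (using the first-row values shown above) instead of one returned by read_root.

```python
import pandas as pd


def in_requested_order(df, columns):
    """Return df with its columns selected in exactly the order of `columns`.

    Hypothetical helper: assumes every name in `columns` is a column of df
    and that each label is still attached to the right data.
    """
    return df[list(columns)]


variables = ['MET_px', 'MET_py', 'EventWeight']

# Stand-in for a DataFrame whose columns came back in a different order
# than requested (not produced by root_pandas here).
shuffled = pd.DataFrame({
    'EventWeight': [0.00927101],
    'MET_py': [2.5636332],
    'MET_px': [5.912771],
})

fixed = in_requested_order(shuffled, variables)
print(list(fixed.columns))
print(fixed.values[0])
```

In the script above this would amount to writing df = df[variables] right after the read_root call. It does not explain why the order varies between runs; it only makes downstream code independent of it.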
