Reading multiple large ROOT files (each about 1 GB), one file at a time, into a pandas DataFrame for a groupby operation #925
Unanswered · sbdrchauhan asked this question in Q&A
I know that to open large files, it is best to use `uproot.iterate()` with the `step_size` option. But the appropriate `step_size` is different for each file. I want to perform an operation for each eventID, so I want to build a pandas DataFrame and run the analysis on each groupby group, one per eventID. If I use some fixed `step_size`, won't it chop the DataFrame off in the middle of an event? Then I won't be able to do the groupby analysis correctly, because the chunk boundaries don't fall on eventID boundaries. Something like this:
*(screenshot of the analysis code: a loop over chunks that builds a DataFrame per eventID and calls calculateUVW() on it)*
Here the calculateUVW() method should receive the DataFrame for a single eventID, but `step_size` might cut a chunk in the middle of an eventID, and my analysis might be incorrect.
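One pattern that may help (a minimal sketch, not tested against your trees): hold back the rows belonging to the last, possibly incomplete, eventID in each chunk and prepend them to the next chunk, so groupby only ever sees complete events. The file list, branch names, and `calculateUVW` below are placeholders for your own; the sketch assumes each eventID's rows are contiguous in the tree and that an eventID never recurs in non-adjacent chunks.

```python
import uproot
import pandas as pd

# Placeholder names -- substitute your own files, tree, and branches.
files = ["file1.root:mytree", "file2.root:mytree"]
branches = ["eventID", "x", "y", "z"]

def calculateUVW(group: pd.DataFrame) -> None:
    """Stand-in for the real per-event analysis."""
    ...

leftover = None  # rows of the last eventID seen in the previous chunk

for chunk in uproot.iterate(files, branches, step_size="100 MB", library="pd"):
    if leftover is not None:
        # Re-attach rows that may belong to an event split across chunks.
        chunk = pd.concat([leftover, chunk], ignore_index=True)

    # The final eventID in this chunk may be cut off, so hold it back
    # and process it together with the next chunk instead.
    last_id = chunk["eventID"].iloc[-1]
    leftover = chunk[chunk["eventID"] == last_id]
    complete = chunk[chunk["eventID"] != last_id]

    for event_id, group in complete.groupby("eventID", sort=False):
        calculateUVW(group)

# Whatever remains after the last chunk is, by construction, a full event.
if leftover is not None and not leftover.empty:
    for event_id, group in leftover.groupby("eventID", sort=False):
        calculateUVW(group)
```

With this carry-over, memory stays bounded at roughly one chunk plus one event, regardless of how `step_size` lines up with event boundaries.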
Thank you