Add option to not parse blocks #21

maxnoe · 2022-10-26T10:36:50Z

When loading large files in bulk, it's much faster to accumulate the low-level float arrays and parse them once at the end than to parse each event directly.

Added an option to just keep the float array in the event loop.

codecov · 2022-10-26T10:37:35Z

Codecov Report

Base: 96.09% // Head: 95.98% // Decreases project coverage by -0.12% ⚠️

Coverage data is based on head (aae021f) compared to base (e13ef5f).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #21      +/-   ##
==========================================
- Coverage   96.09%   95.98%   -0.12%     
==========================================
  Files          19       20       +1     
  Lines         384      473      +89     
==========================================
+ Hits          369      454      +85     
- Misses         15       19       +4

Impacted Files	Coverage Δ
corsikaio/file.py	`97.54% <100.00%> (+0.19%)`	⬆️
tests/test_file.py	`100.00% <100.00%> (ø)`
corsikaio/subblocks/run_header.py	`84.00% <0.00%> (-3.50%)`	⬇️
corsikaio/subblocks/data.py	`100.00% <0.00%> (ø)`
corsikaio/subblocks/dtypes.py	`100.00% <0.00%> (ø)`
corsikaio/subblocks/run_end.py	`100.00% <0.00%> (ø)`
corsikaio/subblocks/__init__.py	`100.00% <0.00%> (ø)`
corsikaio/subblocks/event_end.py	`100.00% <0.00%> (ø)`
corsikaio/subblocks/longitudinal.py	`100.00% <0.00%> (ø)`
tests/test_units.py	`100.00% <0.00%> (ø)`
... and 1 more


corsikaio/file.py

The-Ludwig · 2023-02-15T11:45:33Z

Hi @maxnoe! I played around with this over the last week and have been using it successfully in https://github.com/The-Ludwig/PANAMA .
I parse CORSIKA DAT files to pandas dataframes, and using noparse I am way faster.

I think this is fine to merge, actually. Should I add a test?

Only comment I have is: also checking noparse in the derived classes from CorsikaFile seems unneeded, since

maxnoe · 2023-02-15T12:59:55Z

@orelgueta Could you give a quick review here? It's a nice-to-have feature for @The-Ludwig and shouldn't interfere with our usage.

orelgueta

The added code all looks good.
Just for my understanding, though: if this option is used, the arrays remain unparsed and the user has to parse them with their own code after reading all the events. Is that correct?

The-Ludwig · 2023-02-16T10:21:56Z

@orelgueta Yes, that is correct. In my testing, I am around 5 times faster if I don't parse the particle blocks but instead put them into a Python list, make a pandas DataFrame out of them, and then name the columns. Of course it depends on the size and structure of the file itself, but there are definitely some good use cases.

maxnoe · 2023-02-16T10:23:20Z

Not necessarily with their own code; the functions here can be used. The difference is basically this: for the use case of reading all events in a file into a single data structure, instead of parsing n arrays and then stacking them, you stack n simple arrays first and then parse once.
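
To make the "stack first, parse once" idea concrete, here is a minimal NumPy-only sketch. It does not use this package's API: `record_dtype`, `parse_block`, and the block sizes are hypothetical stand-ins chosen for illustration, and the real record layout may differ. It only shows that reinterpreting one concatenated float array gives the same records as reinterpreting each block and concatenating afterwards.

```python
import numpy as np

# Hypothetical record layout, for illustration only.
# This is NOT the dtype defined by the library.
record_dtype = np.dtype([
    ("particle_description", "f4"),
    ("px", "f4"),
    ("py", "f4"),
    ("pz", "f4"),
    ("x", "f4"),
    ("y", "f4"),
    ("t", "f4"),
])
FLOATS_PER_RECORD = len(record_dtype.names)


def parse_block(raw):
    """Reinterpret a flat, contiguous float32 array as structured records."""
    return raw.view(record_dtype)


# Fake "unparsed" blocks: flat float32 arrays, a few records each.
rng = np.random.default_rng(0)
n_blocks, records_per_block = 100, 5
blocks = [
    rng.random(FLOATS_PER_RECORD * records_per_block, dtype=np.float32)
    for _ in range(n_blocks)
]

# Variant 1: parse every block, then stack the parsed results.
parsed_then_stacked = np.concatenate([parse_block(b) for b in blocks])

# Variant 2: stack the raw float arrays first, then parse once.
stacked_then_parsed = parse_block(np.concatenate(blocks))
```

Both variants hold the same records; the second performs a single parse call on one large array instead of one call per block, which is where the speed-up discussed above would come from.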

Add option to not parse blocks

4e659d4

The-Ludwig reviewed Feb 15, 2023

corsikaio/file.py (outdated, resolved)

The-Ludwig approved these changes Feb 15, 2023

Add test for parse_blocks=False, remove unneeded code

aae021f

maxnoe force-pushed the noparse branch from a7c0be6 to aae021f Compare February 15, 2023 12:57

maxnoe requested a review from orelgueta February 15, 2023 12:59

maxnoe marked this pull request as ready for review February 15, 2023 13:53

orelgueta approved these changes Feb 16, 2023

maxnoe merged commit 029bcb6 into main Feb 16, 2023

maxnoe deleted the noparse branch February 16, 2023 15:01

This was referenced Mar 27, 2023

pycorsikaio to astropy Qtable #28

Open

Documentation #30

Open
