Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why a bitmask of extensions? #151

Closed
t184256 opened this issue Nov 22, 2017 · 6 comments
Closed

Why a bitmask of extensions? #151

t184256 opened this issue Nov 22, 2017 · 6 comments
Assignees
Labels
major change non-backwards compatible change
Milestone

Comments

@t184256
Copy link

t184256 commented Nov 22, 2017

I have a handful of questions I've asked @ax3l by email some time ago. I've got no answer, so maybe I'll try asking them here.

openPMDextension is a measly uint32 bitmask, why? Why isn't it an set of attributes, an array of strings, or basically anything more flexible/extensible? It may seem to be an non-issue now, but the more it will become one, the harder it'll be to change.

@ax3l ax3l added the question label Nov 22, 2017
@ax3l ax3l added this to the 1.0.0: Initial Release milestone Nov 22, 2017
@ax3l ax3l self-assigned this Nov 22, 2017
@ax3l ax3l modified the milestones: 1.0.0: Initial Release, 2.0.0: Second Major Release Nov 22, 2017
@ax3l
Copy link
Member

ax3l commented Nov 22, 2017

Hi Alexander,

sorry for the late response - I indeed have not forgotten your mail and just did not get to it yet (also because our lab evaluation period just starts in various fields...). I will still try to answer your other questions in the e-mail, but indeed the openPMD issue tracker is the right place to ask: here we can publicly archive, link, check and improve orthogonal questions and improvements :)

openPMDextension: openPMD extensions are in principle openPMD extensions could be stacked and used at the same time. E.g. one could write a "plasma" extension that defines that "B" is always naming records that are magnetic fields and on top a PIC and a Hydro extension that define namings for used solvers/methods.

A bitmask is just the most efficient representation of such a possible combination and was taken for the sake of very simple parsing (in C/C++/Fortran). With the property we used they are "extensible" enough. Indeed, we could also use a list of (variable length) strings. (E.g. in some places such as the ED-PIC extension we alsways use comma-separated lists.)

The Proposed Change to a String

This would improve human readability, make writing not much more complicated, introduces a bit of negligible overhead during meta data read and makes parsing a bit more lines in strongly typed languages. Also, named collisions are less likely to occur than number collisions if someone is writing an extension and forgets to ping the issue tracker first for an ID. 🚀 ✨ 👍

@ax3l ax3l added major change non-backwards compatible change and removed question labels Nov 22, 2017
@ax3l
Copy link
Member

ax3l commented Nov 22, 2017

CCing @DavidSagan and @CFGrote as they suggest the same change. Thx!

@t184256
Copy link
Author

t184256 commented Nov 22, 2017

I cannot resist the temptation to answer.

With the property we used they are extensible enough.

A maximum of 32 extensions doesn't sound future-proof.

A bitmask is just the most efficient representation of such a possible combination and was taken for the sake of very simple parsing

Oh yay, let's save a thousand CPU cycles of overhead and a few dozen of bytes. In a file with megabytes or gigabytes of finely gridded data, where each particle or grid point takes way more than 4 bytes! Sorry, that doesn't sound reasonable.

@ax3l
Copy link
Member

ax3l commented Nov 22, 2017

Ah I should cross-check my writing before submit!
I just wanted to say With the property we used they are "extensible". and continue what I wrote below, that this extensibility is very limited. I do not know how enough slipped in there, I was not trying to defend the excellent nerdiness and human unreadability of bitmasks :)

was taken for the sake of very simple parsing

Oh you got me wrong, I very much agree with changing that since parsing an attribute is negligible in cost :) Sorry if that came across the wrong way since I forgot to remove my enough. Mea culpa!

I thought the last paragraph states I fully agree with the proposal:

This would improve human readability, make writing not much more complicated, introduces a bit of negligible overhead during meta data read and makes parsing a bit more lines in strongly typed languages. Also, named collisions are less likely to occur then number collisions if someone is writing an extension and forgets to ping the issue tracker first for an ID.

I only see positive points in changing it to a string list! ✨ If nobody insists it will go into 2.0.0

@ax3l
Copy link
Member

ax3l commented Nov 22, 2017

I updated the post above to cross the enough and add more emojis with 👍 after the last paragraph :)

@ax3l ax3l modified the milestones: 2.0: Second Major Release, 1.0: First Major Release Nov 24, 2017
@RemiLehe RemiLehe closed this as completed Apr 3, 2018
@RemiLehe
Copy link
Member

RemiLehe commented Apr 3, 2018

Fixed by PR #185

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
major change non-backwards compatible change
Projects
Status: Implemented
Development

No branches or pull requests

3 participants