Software metadata in LH5 files #6

Open
gipert opened this issue Jul 12, 2022 · 3 comments
Labels
enhancement (New feature or request), good first issue (Good for newcomers)

Comments

@gipert
Member

gipert commented Jul 12, 2022

We should consider consistently storing software metadata relevant for data preservation in all LH5 files. One obvious candidate is pygama.__version__, but I'm not sure whether the versions of some dependencies, like NumPy or Numba, would also be relevant.

We should also retain the possibility to do checksum-based file comparisons. Storing things like file creation time would make this more difficult and would at least require custom checksumming utilities.
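To illustrate what a "custom checksumming utility" might look like, here is a minimal sketch, not an existing pygama or LH5 tool. It hashes a plain nested dict standing in for a file's metadata and skips fields that change between otherwise identical runs. The names VOLATILE_KEYS and stable_digest, and the choice of "creation_date" as the volatile field, are assumptions made for illustration only.

```python
import hashlib
import json

# Hypothetical: attribute names that differ between otherwise identical files.
VOLATILE_KEYS = {"creation_date"}


def stable_digest(metadata: dict) -> str:
    """Return a SHA-256 hex digest of `metadata`, ignoring volatile keys.

    Volatile keys are dropped at every nesting level, and the remainder is
    serialized with sorted keys so the digest does not depend on dict order.
    """

    def strip(obj):
        if isinstance(obj, dict):
            return {k: strip(v) for k, v in obj.items() if k not in VOLATILE_KEYS}
        return obj

    canonical = json.dumps(strip(metadata), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two metadata blocks that differ only in their creation time.
a = {"provenance": {"creation_date": "240101Z120000",
                    "module_versions": {"numpy": "1.26.0"}}}
b = {"provenance": {"creation_date": "240202Z130000",
                    "module_versions": {"numpy": "1.26.0"}}}
```

With this approach stable_digest(a) and stable_digest(b) agree, while a change in any non-volatile field (for example a different NumPy version) yields a different digest. A real utility would additionally have to walk the datasets in the file, which this sketch does not attempt.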

We should also consider versioning the LH5 specification and storing the version each file was created with.

gipert added the "good first issue" label on Aug 4, 2022
gipert added the "enhancement" label on Sep 22, 2022
gipert transferred this issue from legend-exp/pygama on May 23, 2023
@jasondet
Contributor

jasondet commented Dec 19, 2023

@iguinn and I suggest adding a "provenance" key to the attributes, holding a dict with this information. For a typical DSP output column, for example, it would contain:

provenance:
    creation_date: YYMMDDZhhmmss
    processor: module.processor_name
    module_versions:
        numpy: v...
        dspeed: v...
        etc.
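
As a rough sketch of how such a dict could be assembled in Python, the snippet below uses only the standard library. The function name build_provenance is hypothetical, the timestamp format is one possible reading of the YYMMDDZhhmmss placeholder above, and recording "unknown" for packages that are not installed is an assumption rather than part of the proposal.

```python
from datetime import datetime, timezone
from importlib.metadata import PackageNotFoundError, version


def build_provenance(processor: str, modules: list[str]) -> dict:
    """Assemble a provenance dict in the layout proposed above (hypothetical helper)."""
    module_versions = {}
    for name in modules:
        try:
            # Look up the installed distribution's version string.
            module_versions[name] = version(name)
        except PackageNotFoundError:
            # Assumption: record a placeholder instead of failing.
            module_versions[name] = "unknown"

    return {
        # UTC timestamp; one interpretation of the YYMMDDZhhmmss placeholder.
        "creation_date": datetime.now(timezone.utc).strftime("%y%m%dZ%H%M%S"),
        "processor": processor,
        "module_versions": module_versions,
    }


provenance = build_provenance("module.processor_name", ["numpy", "dspeed"])
```

Note that the creation_date field is exactly the kind of value that breaks plain checksum-based file comparisons, as raised in the original post, so any comparison tool would need to ignore it.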

@MoritzNeuberger
Contributor

Do I understand correctly that this "provenance" information would be added to the appropriate table at the level of the build_dsp, build_hit, ... scripts, and that no change to LH5Store would be necessary?

@gipert
Member Author

gipert commented Jan 18, 2024

Yes, the issue mentioned by @jasondet should be moved to dspeed.
