Adjust file naming convention #143

Merged
merged 6 commits into from Mar 18, 2019


Contributor

@XeBoris XeBoris commented Mar 7, 2019

I suggest adjusting the XENONnT naming convention so that file names are unique. My implementation derives the file name convention from the directory name (dirname) at the backend level.
For a cleaner solution, we may later improve the frontend-backend communication so that all necessary information comes from the database itself (or a similar interface).
I tested this by running a simple strax script that processed several plugins on the fly and returned a high-level dataframe. The data structure looks good to me.

@JelleAalbers I chose to write the "more complicated" looking NameConvention class so that a string can be built from keywords by simply handing over dictionaries to fill them in (similar to *.format() in Python), without relying on a full keyword list being passed. Maybe over-engineered, but quite handy.
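The idea of filling a template from dictionaries without needing every keyword can be illustrated with a minimal sketch. This is not the NameConvention class from this PR: the helper `fill_keywords`, the class `_KeepMissing`, and the keyword names in the template are all hypothetical and chosen only for illustration.

```python
class _KeepMissing(dict):
    """Dictionary that leaves unknown keywords as '{keyword}' placeholders."""

    def __missing__(self, key):
        # Re-emit the placeholder so it can be filled in by a later call.
        return "{" + key + "}"


def fill_keywords(template, *mappings):
    """Fill whatever keywords are available; keep the rest untouched.

    Limitation of this sketch: placeholders with format specs
    (e.g. '{chunk_id:06d}') are not supported for missing keywords.
    """
    merged = _KeepMissing()
    for mapping in mappings:
        merged.update(mapping)
    return template.format_map(merged)


# Hypothetical template and keyword names.
template = "{data_type}-{lineage_hash}-{chunk_id}"

# Only one keyword is known at first; the others stay as placeholders.
partial = fill_keywords(template, {"data_type": "raw_records"})

# The remaining keywords can be supplied later, from one or more dictionaries.
full = fill_keywords(partial, {"lineage_hash": "abc123"}, {"chunk_id": "000000"})
```

Compared with a plain `str.format()` call, which raises `KeyError` when a keyword is missing, this approach allows the name to be assembled in stages as information becomes available.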

Member

@JelleAalbers JelleAalbers left a comment


Thanks a lot Boris! This will be important for making strax files easily transferable through rucio. Do I understand correctly that you propose this new filename convention:

  • metadata.json -> metadata-DATATYPE-HASH-json
  • CHUNKID -> DATATYPE-HASH-CHUNKID

This sounds fine. Are we sure this is unique enough for rucio though? The lineage hash does not include the run id, so files from two runs processed with the same configuration will still have the same filenames.
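To make the uniqueness concern concrete, here is a rough sketch of building chunk file names from the directory name. It assumes a directory layout of `RUNID-DATATYPE-HASH` with no hyphens inside the parts and a six-digit zero-padded chunk id; both are assumptions for illustration, as are the helper names and the example run ids and hash. It is not the code from this PR.

```python
def split_dirname(dirname):
    """Split an assumed 'RUNID-DATATYPE-HASH' directory name into its parts."""
    parts = dirname.split("-")
    if len(parts) != 3:
        raise ValueError(f"Expected RUNID-DATATYPE-HASH, got {dirname!r}")
    run_id, data_type, lineage_hash = parts
    return run_id, data_type, lineage_hash


def chunk_filename(dirname, chunk_id):
    """Build DATATYPE-HASH-CHUNKID; the run id is deliberately left out."""
    _, data_type, lineage_hash = split_dirname(dirname)
    # Six-digit zero padding is an assumption of this sketch.
    return f"{data_type}-{lineage_hash}-{chunk_id:06d}"


# Two hypothetical runs processed with the same configuration (same hash).
name_run_1 = chunk_filename("run0001-raw_records-abc123", 0)
name_run_2 = chunk_filename("run0002-raw_records-abc123", 0)

# The names collide, because the lineage hash does not encode the run.
names_collide = name_run_1 == name_run_2
```

With these inputs both calls yield `raw_records-abc123-000000`, so the file name alone does not distinguish the two runs.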

Could you delete the code that is currently unused, i.e. the classes you added? I will merge using the ordinary merge mode, so the code will be preserved in version control should we want to use it in the future.

One of the tests seems to be failing because it duplicates the file name construction code. I can have a look at this if you like.

Member

@JelleAalbers JelleAalbers left a comment


Just to write down what Boris mentioned to me: the run id is not included in the prefix because it already occurs in the rucio scope name. Files only need to be unique within the rucio scope. We do need the data type name and the hash, because one run has multiple data types, potentially several with the same name but different settings.
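As a small illustration of this argument, the sketch below treats each file as a (scope, name) pair and looks for duplicates. The scope names, data types, and hashes are made up, and `find_duplicates` is a hypothetical helper; no rucio API is used.

```python
from collections import Counter


def find_duplicates(identifiers):
    """Return the identifiers that occur more than once."""
    counts = Counter(identifiers)
    return [item for item, n in counts.items() if n > 1]


# Hypothetical (scope, name) pairs; the scope is assumed to carry the run id.
files = [
    ("run0001", "raw_records-abc123-000000"),
    ("run0002", "raw_records-abc123-000000"),  # same name, different scope
    ("run0001", "peaks-def456-000000"),
    ("run0001", "peaks-789abc-000000"),  # same data type, different settings
]

# Within each scope every name is unique, so no pair is duplicated.
duplicates_with_scope = find_duplicates(files)

# Ignoring the scope, the raw_records name appears twice.
duplicates_without_scope = find_duplicates([name for _, name in files])
```

The two `peaks` entries show why the hash is still needed: within one scope, the data type name alone would not separate results produced with different settings.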

2 participants