[Feature request]: Homogenization of data structures and physical representations #104
Open
1 of 6 tasks
Labels
code maintenance
Issue/PR for refactors, code clean up, etc.
data
Issues related to data loading, pipelining, etc.
good first issue
Good for newcomers
Feature/behavior summary
To ensure consistency in modeling, each dataset in Open MatSciML Toolkit should provide uniform (or near-uniform) kinds of data. For example, every dataset should state whether its coordinates are fractional or Cartesian, and should carry enough information to represent each data sample in a physically meaningful way, such as periodic boundary conditions (needed for, e.g., shift vectors).
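To make the fractional vs. Cartesian distinction concrete, here is a minimal sketch using plain NumPy. The function names, the row-vector lattice convention, and the definition of shifts as integer cell offsets are all assumptions for illustration; none of this is existing Open MatSciML Toolkit API.

```python
import numpy as np


def frac_to_cart(frac_coords, lattice):
    """Convert fractional to Cartesian coordinates.

    Assumes the rows of `lattice` are the cell vectors a, b, c.
    """
    return np.asarray(frac_coords, dtype=float) @ np.asarray(lattice, dtype=float)


def cart_to_frac(cart_coords, lattice):
    """Convert Cartesian to fractional coordinates (same row convention)."""
    inv = np.linalg.inv(np.asarray(lattice, dtype=float))
    return np.asarray(cart_coords, dtype=float) @ inv


def wrap_with_shifts(cart_coords, lattice, pbc=(True, True, True)):
    """Wrap atoms into the home cell along periodic axes.

    Returns the wrapped Cartesian coordinates and the integer cell
    offsets that were subtracted (hypothetical "shift" convention).
    """
    frac = cart_to_frac(cart_coords, lattice)
    periodic = np.asarray(pbc, dtype=bool)
    # Only shift along axes flagged as periodic; leave the others untouched.
    shifts = np.where(periodic, np.floor(frac), 0.0)
    wrapped = frac_to_cart(frac - shifts, lattice)
    return wrapped, shifts.astype(int)


if __name__ == "__main__":
    lattice = 2.0 * np.eye(3)  # simple cubic cell, edge length 2
    print(frac_to_cart([[0.5, 0.5, 0.5]], lattice))
    print(wrap_with_shifts([[3.0, -1.0, 1.0]], lattice))
```

The point of the sketch is that neither conversion is possible without the lattice, and wrapping is ambiguous without the `pbc` flags, which is why each sample would need to carry both alongside the coordinates.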
Request attributes
Related issues
No response
Solution description
A good place to start would be to make sure each devset, and subsequently any serialized datasets we have, conform to the following:
We should also check other projects, like Colabfit, to see to what extent we can conform to community standards, too.
Additional notes
Can't assign Bin yet, but it would be good for Bin to aggregate information, and then for him and @melo-gonzo to help craft PRs to address things once the survey is done.