stress-test of workflows and API - the 1000 glass challenge #178

@ltalirz

We've talked in the past about the stress test we would like the package to pass before we feel confident putting it into the hands of users.

Here is my proposal:

We pick all glasses from a database with measured

  1. densities
  2. elastic constants
  3. CTE
  4. (optional, where present) high-temperature viscosity

We do some basic data engineering to filter out outliers and then pick ~1000 diverse compositions based on a distance between their element vectors.
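One possible way to pick a diverse subset is greedy farthest-point sampling. The sketch below is only an illustration under assumptions: the names `element_vector`, `distance` and `pick_diverse` are hypothetical, compositions are assumed to be dicts of element fractions, the distance is assumed to be Euclidean, and the toy compositions are made up. Outlier filtering is not shown.

```python
import math


def element_vector(composition, elements):
    """Return the composition as a fixed-order vector of element fractions."""
    return [composition.get(el, 0.0) for el in elements]


def distance(a, b):
    """Euclidean distance between two element vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pick_diverse(compositions, k):
    """Greedy farthest-point sampling.

    Starts from the first composition (arbitrary seed) and repeatedly adds
    the composition whose nearest already-selected neighbour is farthest away.
    Returns the indices of the selected compositions.
    """
    if not compositions or k <= 0:
        return []
    elements = sorted({el for comp in compositions for el in comp})
    vectors = [element_vector(comp, elements) for comp in compositions]

    selected = [0]
    # min_dist[i] is the distance from composition i to its nearest selected one
    min_dist = [distance(v, vectors[0]) for v in vectors]

    while len(selected) < min(k, len(vectors)):
        candidates = [i for i in range(len(vectors)) if i not in selected]
        nxt = max(candidates, key=lambda i: min_dist[i])
        selected.append(nxt)
        min_dist = [
            min(d, distance(v, vectors[nxt])) for d, v in zip(min_dist, vectors)
        ]
    return selected


# Made-up toy compositions (atomic fractions), for illustration only
glasses = [
    {"Si": 0.33, "O": 0.67},
    {"Si": 0.32, "O": 0.66, "Na": 0.02},
    {"B": 0.40, "O": 0.60},
    {"Si": 0.25, "O": 0.60, "Na": 0.15},
]
chosen = pick_diverse(glasses, 3)
```

In this toy example the near-duplicate of the seed (index 1) is the one left out. For ~12k candidates the quadratic cost of this naive version is still manageable, but a vectorised implementation would likely be preferable.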

Then we run them through our pipeline and compare.

Which database to draw from?

I first looked into SciGlass, since it is open source, it is used by the guys in Jena, and there would be no issue publishing all the data (compositions, measurements) alongside the calculations.

The downside: I only find 570 compositions with all properties 1-3 (before we do any further filtering), while I find ~12k such compositions in InterGlaD.

The InterGlaD terms of use do mention the case of papers using information from InterGlaD, but they do not explicitly allow e.g. publication of small subsets.
We would need to ask (I know the director), but I think it is quite likely that they would say no.

My feeling is that the SciGlass dataset is not large enough... we could of course compromise and mix: take as much diverse data as we can get from SciGlass, top up to 1000 from InterGlaD, and publish only the SciGlass subset (this could anyhow be a start, with SciGlass calculations running on the BAM side and InterGlaD on the SCHOTT side).

What do you think, @Atilaac @Gitdowski?

P.S. One could in principle relax the constraint to glasses with measured density OR elastic constants OR CTE. But with 1000 comparisons per property, that means up to 3x the number of calculations, and you can no longer necessarily look at different properties predicted for the same glass and see how their errors differ.
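To make the trade-off concrete, here is a minimal sketch of the strict (AND) versus relaxed (OR) selection. The record layout, the field names `density`, `elastic` and `cte`, and all values are made up for illustration; they are not the schema of SciGlass or InterGlaD.

```python
PROPERTIES = ("density", "elastic", "cte")


def has_all(record):
    """Strict constraint: every property has a measured value."""
    return all(record.get(p) is not None for p in PROPERTIES)


def has_any(record):
    """Relaxed constraint: at least one property has a measured value."""
    return any(record.get(p) is not None for p in PROPERTIES)


# Made-up records; None marks a missing measurement
records = [
    {"id": "g1", "density": 2.5, "elastic": 70.0, "cte": 9.0},
    {"id": "g2", "density": 2.2, "elastic": None, "cte": None},
    {"id": "g3", "density": None, "elastic": 80.0, "cte": None},
    {"id": "g4", "density": None, "elastic": None, "cte": 5.5},
    {"id": "g5", "density": None, "elastic": None, "cte": None},
]

strict = [r["id"] for r in records if has_all(r)]
relaxed = [r["id"] for r in records if has_any(r)]

# Comparisons available per property if every relaxed glass is calculated
comparisons = {
    p: sum(r.get(p) is not None for r in records if has_any(r))
    for p in PROPERTIES
}
```

Here the relaxed set needs four calculations to yield two comparisons per property, while only `g1` allows the errors of all three properties to be compared on the same glass.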
