Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UUID for run_id #2192

Merged
merged 20 commits into from
Feb 4, 2021
Merged

UUID for run_id #2192

merged 20 commits into from
Feb 4, 2021

Conversation

johnomotani
Copy link
Contributor

Addition for #2149:

  • Adds a header that generates UUIDs: based on https://github.com/mariusbancila/stduuid, but hacked to be compatible with C++14, rather than requiring C++17/20. stduuid is MIT licensed, so I think it's OK to just copy the code, with copyright and disclaimer added to the top of the file.
  • Converts run_id and run_restart_from to UUIDs.
  • Also adds an option to save them at every output timestep, to help with concatenating output from several runs, until we implement something more elegant like a dimension with one entry per restart.

Would be nice to add a test. Theres a uuid module, so we could check that run_id is a correctly formatted UUID with uuid.UUID(run_id) (if not it will raise an exception), but at the moment collect() doesn't support strings, so the test will have to wait.

Diff is big because I needed to merge next to pick up some recent features. Diff of the new additions here d46c7e3...uuid-run-id.

https://github.com/mariusbancila/stduuid is a header-only library for
creating UUIDs, but requires C++20 features. This commit copies the
header, but hacked to work with C++11. When BOUT++ requires C++20, we
should replace this header with mariusbancila/stduuid in a submodule.
Saving run_id and run_restart_from as time-evolving variables would make
it easier to concatenate data from several consecutive runs. A bit
hacky, but do not have a better way to create a per-run dimension at the
moment.
@bendudson
Copy link
Contributor

Looks like the test-restart-io and test-restart-io-hdf5 tests failing, missing run_id:

Missing data for run_id in input

@johnomotani
Copy link
Contributor Author

Looks like the test-restart-io and test-restart-io-hdf5 tests failing, missing run_id

I'm working on it. I think fixing this requires string support in boututils.datafile, etc.

@johnomotani
Copy link
Contributor Author

johnomotani commented Jan 3, 2021

I've fixed the test failure, and added tests for run_id and run_restart_from in test-restart-io and test-restart-io_hdf5. The tests require boutproject/boututils#15 and boutproject/boutdata#25 though, so they won't pass yet in the CI.

Once boutproject/boututils#14 is merged (needed for cmake tests to work with the boututils repo version), I'll open a PR to delete the boututils and boutdata code from pylib and replace it with the independent github repos as submodules, so we can directly use the updates from there.

@ZedThree
Copy link
Member

ZedThree commented Jan 5, 2021

uuid.h still has #include <string_view> which is C++17, although it doesn't actually use it.

We might also want to define UUID_SYSTEM_GENERATOR if the system headers are available in order to use the fancier UUID generators (that use /dev/urandom and/or the MAC address) -- I'm happy to look at doing that.

include/bout/sys/uuid.h Outdated Show resolved Hide resolved
sting_view is C++17, and not used in this version of uuid.h.
@johnomotani
Copy link
Contributor Author

We might also want to define UUID_SYSTEM_GENERATOR if the system headers are available in order to use the fancier UUID generators (that use /dev/urandom and/or the MAC address) -- I'm happy to look at doing that.

That would be great! Please do 👍

@johnomotani
Copy link
Contributor Author

I've merged next which now has #2196 to include boututils and boutdata from the boutproject repos, so the tests should now pass.

Only thing left is to look at UUID_SYSTEM_GENERATOR.

@johnomotani
Copy link
Contributor Author

The tests are failing in test-restart-io_hdf5. It seems to be because our HDF5 string writing is buggy. If the length of the string is bigger than nx then the write fails (test-io_hdf5 passes because the test string is short). I wondered if this was because I used the H5T_C_S1 type, which I think might be for variable-length strings, but changing it to H5T_NATIVE_CHAR didn't fix the error. I don't want to spend any more time trying to debug HDF5... Can we just delete the H5format code now instead of keeping spending time debugging it?

* next-run-id:
  Note removing support for HDF5 in changelog
  Better notation for indexing zShift
  Update note on zShift in grid-files section of manual
  Print error for HDF5 macros when running bout-v5-macro-upgrader.py
  Support macros being removed in bout-v5-macro-upgrader.py
  Remove HDF5 support
@ZedThree
Copy link
Member

ZedThree commented Feb 4, 2021

So there are two main versions of UUIDs: version 1 which basically hashes the current time and the system's MAC address, and pretty much leaks that information; and version 4, which is completely random.

The library we're using here, stduuid, doesn't properly implement version 1 (and even using the system generators, it only wraps the v4 APIs), so we're using version 4. I had been thinking that v1 UUIDs would be good so that we'd be able to identify runs as coming from a particular machine, but maybe that's not good/useful.

I'm happy to go ahead using the completely random v4, but I just thought I'd highlight this as a possible design decision.

@johnomotani
Copy link
Contributor Author

I can't remember what the number was, but I remember there was some quote on the chance of random collisions from v4 UUIDs being ludicrously small, so I'm happy to stick with v4 IDs, especially if v1 are not easy to implement right now.

@ZedThree
Copy link
Member

ZedThree commented Feb 4, 2021

I've merged next into #2149, and #2149 into this, so the diff should be just be the diff! :)

The configure option defaults to preferring the system UUID generator if available, but happily falls-back to the stduuid version if not.

@ZedThree
Copy link
Member

ZedThree commented Feb 4, 2021

/clang-format

@ZedThree
Copy link
Member

ZedThree commented Feb 4, 2021

I think clang-tidy-review is failing because it gets configured in an environment with libuuid installed, and then clang-tidy runs in a container where it doesn't. Not sure how best to fix that, and I'm not sure I can be bother right now -- the uuid header isn't ours and hopefully we won't have to maintain it (much), so clang-tidy won't run on it after these two PRs.

Fedora still failing due to bug in Fedora.

I'm going to merge this into #2149, so we can discuss things further there

@ZedThree ZedThree merged commit 59ff2b5 into next-run-id Feb 4, 2021
@ZedThree ZedThree deleted the uuid-run-id branch February 4, 2021 17:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants