-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: Read Inconsistent Zero Pads #1118
Conversation
11a40b8
to
311f107
Compare
756e9ae
to
0670508
Compare
I've reverted your changes and pushed a suggested implementation that should fix this
The trick was to modify the regex a bit that parses a directory's filenames
In consequence, this regex now accepts |
0670508
to
bbc3306
Compare
bbc3306
to
83730a0
Compare
Cool, I think that works well! I think a general thing we could also consider is that prefixes that are weirdly named, e.g., Should we maybe add a warning if the user-provided name does end in a number of the prefix before the |
I pushed test bug it looks like that one passes already before the patch - as long as we use Maybe that's all that is needed in openPMD-viewer then. |
Ah, I think the problem was not in |
Theoretically we could start soft-enforcing the underscore by giving a warning if anything but an underscore is used? A prefix
Yep, if using |
I've pushed a commit that does this. In case we want to only warn if the prefix ends in a digit, that should be easy enough to fix. Apparently, there is an instance of
|
Not sure if we need to warn on not ending in an underscore, ending in a digit is maybe worth a warning.
…On November 1, 2021 8:12:24 AM PDT, "Franz Pöschel" ***@***.***> wrote:
> Theoretically we could start soft-enforcing the underscore by giving a warning if anything but an underscore is used?
I've pushed a commit that does this. In case we want to only warn if the prefix ends in a digit, that should be easy enough to fix.
Apparently, there is an instance of `data%T` in our test suite:
```
[Warning] In file-based iteration encoding, it is strongly recommended
to prepend the expansion pattern of the filename with an underscore '_'.
Example: 'data_%T.json' or 'simOutput_%06T.h5'
Given file pattern: 'data%T'
```
|
63aa78a
to
c4b0cde
Compare
Some codes mess up the zero-padding in `fileBased` encoding, e.g., when specifying padding to 5 digits but creating >100'000 output steps. Files like those cannot yet be parsed and fell back to no padding, which fails to open the file: ``` openpmd_00000.h5 openpmd_02000.h5 openpmd_101000.h5 openpmd_01000.h5 openpmd_100000.h5 openpmd_104000.h5 ``` Error: ``` RuntimeError: [HDF5] Failed to open HDF5 file diags/diag1/openpmd_0.h5 ```
Parse iteration numbers that are longer than their padding Read inconsistent zero padding
f614f39
to
c6dca59
Compare
8c8f534
to
970c2f4
Compare
* So even when opening the series with a padding of 1, | ||
* that iteration will be opened. | ||
*/ | ||
REQUIRE(o.iterations.count(123456) == 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was initially thinking here: should we read 123456
if our padding was specifically asked to be 00
?
But if we would skip it, then this would be a bit inconsistent with reading unpadded numbers.
Thus, I think the logic change here in the test that you added is the proper way to handle this 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we read 123456 if our padding was specifically asked to be 00?
Yep, it should, because that number is not padded ;)
@franzpoeschel, thanks a lot for your help and updates 💖 |
CI: |
* [Draft] Fix: Read Inconsistent Zero Pads Some codes mess up the zero-padding in `fileBased` encoding, e.g., when specifying padding to 5 digits but creating >100'000 output steps. Files like those cannot yet be parsed and fell back to no padding, which fails to open the file: ``` openpmd_00000.h5 openpmd_02000.h5 openpmd_101000.h5 openpmd_01000.h5 openpmd_100000.h5 openpmd_104000.h5 ``` Error: ``` RuntimeError: [HDF5] Failed to open HDF5 file diags/diag1/openpmd_0.h5 ``` * Revert previous changes except for test Parse iteration numbers that are longer than their padding Read inconsistent zero padding * Overflow Padding: Read Test * Warn if the prefix does end in a digit * Fix: Don't let oversize numbers accidentally bump the padding * Update test * Issue warnings on misleading patterns also when writing * Minor Style Update Co-authored-by: Franz Pöschel <franz.poeschel@gmail.com>
* Read: time/dt also in long double (#1096) * Python: time/dt round-trip Test writing and reading time and dt on an iteration via properties. * Fix: Iteration read of long double time Support reading of `dt` and `time` attributes if they are of type `long double`. (openPMD standard: all `floatX` supported) * Executables: CXX_STANDARD/EXTENSIONS (#1102) Set `CXX_EXTENSIONS OFF` and `CXX_STANDARD_REQUIRED ON` for created executables. This mitigates issues with NVCC 11.0 and C++17 builds seen as added `-std=gnu++17` flags that lead to ``` nvcc fatal : Value 'gnu++17' is not defined for option 'std' ``` when using `nvcc` as CXX compiler directly. * Doc: More Locations -DPython_EXECUTABLE (#1104) Mention the `-DPython_EXECUTABLE` twice more in build examples. * NVCC + C++17 (#1103) * NVCC + C++17 Work-around a build issue with NVCC in C++17 builds. ``` include/openPMD/backend/Attributable.hpp(437): error #289: no instance of constructor "openPMD::Attribute::Attribute" matches the argument list argument types are: (std::__cxx11::string) detected during instantiation of "__nv_bool openPMD::AttributableInterface::setAttribute(const std::__cxx11::string &, T) [with T=std::__cxx11::string]" ``` from ``` inline bool AttributableInterface::setAttribute( std::string const & key, char const value[] ) { return this->setAttribute(key, std::string(value)); } ``` Seen with: - NVCC 11.0.2 + GCC 8.3.0 - NVCC 11.0.2 + GCC 7.5.0 * NVCC 11.0.2 C++17 work-around: Add Comment * Lazy parsing: Make findable in docs and use in openpmd-ls (#1111) * Use deferred iteration parsing in openpmd-ls * Make lazy/deferred parsing searchable * Add a way to search for usesteps key * HDF5: Document HDF5_USE_FILE_LOCKING (#1106) Document a HDF5 read work-around that we currently need on OLCF Jupyter (https://jupyter.olcf.ornl.gov), due to a mounting issue of GPFS in the Jupyter serice (OLCFHELP-3685). From the HDF5 1.10.1 Release Notes: ``` Other New Features and Enhancements =================================== Library ------- - Added a mechanism for disabling the SWMR file locking scheme. The file locking calls used in HDF5 1.10.0 (including patch1) will fail when the underlying file system does not support file locking or where locks have been disabled. To disable all file locking operations, an environment variable named HDF5_USE_FILE_LOCKING can be set to the five-character string 'FALSE'. This does not fundamentally change HDF5 library operation (aside from initial file open/create, SWMR is lock-free), but users will have to be more careful about opening files to avoid problematic access patterns (i.e.: multiple writers) that the file locking was designed to prevent. Additionally, the error message that is emitted when file lock operations set errno to ENOSYS (typical when file locking has been disabled) has been updated to describe the problem and potential resolution better. (DER, 2016/10/26, HDFFV-9918) ``` This also exists as a compilation option for HDF5 in CMake, where it defaults to ``TRUE`` by default, which is also what distributions/ package managers ship. Disabling from Bash: ```bash export HDF5_USE_FILE_LOCKING=FALSE ``` Disabling from Python: ```py import os os.environ['HDF5_USE_FILE_LOCKING'] = "FALSE" ``` * Avoid object slicing when deriving from Series class (#1107) * Make Series class final * Use private constructor to avoid object slicing * Doc: OMPI_MCA_io Control (#1114) Document OpenMPI MPI-I/O backend control. We have documented this long in #446. * openPMD.hpp: Include auxiliary StringManip (#1124) Include this, handy functions. * CXX Std: Remember <variant> Impl. (#1128) We use `<variant>` or `<mpark/variant.hpp>` in our public API interface for datatypes, depending on the C++ standard. This pull request makes sure that the same implementation is used in downstream code, even if the C++ standard is switched. This avoids ABI issues when, e.g., using a C++14 built openPMD-api in a C++17 downstream code. * Spack: No More `load -r` (#1125) The `-r` argument was removed from `spack load` and is now implied. * Fix AppVeyor: Python Executable (#1127) * GH Action: Add MSVC & ClangCL on Win * Fix AppVeyor: Python Executable * Avoid mismatching system Python and Conda Python * Conda: Fix Numpy * CMake: Skip Pipe Test Written in a too special way, we cannot assume SH is always present * Test 8b (Bench Read Parallel): Support Variable encoding, Fix Bugs (#1131) * added support to read variable encoding, plus fixed some bugs * fixed style * Update examples/8b_benchmark_read_parallel.cpp remove commented out code Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja> * Update examples/8b_benchmark_read_parallel.cpp Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja> * Update examples/8b_benchmark_read_parallel.cpp Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja> * Update examples/8b_benchmark_read_parallel.cpp Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja> * Update examples/8b_benchmark_read_parallel.cpp Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja> * Update examples/8b_benchmark_read_parallel.cpp Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja> * Update examples/8b_benchmark_read_parallel.cpp Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja> * Update examples/8b_benchmark_read_parallel.cpp Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja> * Update examples/8b_benchmark_read_parallel.cpp Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja> * Update examples/8b_benchmark_read_parallel.cpp Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja> * Update examples/8b_benchmark_read_parallel.cpp Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja> * removed commented line * updated 8b env option Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja> * HDF5 I/O optimizations (#1129) * Include HDF5 optimization options * Fix code style check * Fix validations and include checks * Fix style check * Remove unecessary strict check * Update documentation with HDF5 tuning options * Update contributions * Fix Guards for H5Pset_all_coll_metadata* * MPI Guard: H5Pset_all_coll_metadata* * Remove duplicated variable Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja> * Include known issues section for HDF5 (#1132) * Update known issues with HDF5 and collective metadata operations * Fix rst link and tiny typo * Add targeted bugfix releases. Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja> * Include check for paged allocation (#1133) * Include check for paged allocation * Update ParallelHDF5IOHandler.cpp * libfabric 1.6+: Document SST Work-Arounds (#1134) * libfabric 1.6+: Document SST Work-Arounds Document work-arounds for libfabric 1.6+ on Cray systems when using data staging / streaming with ADIOS2 SST. Co-authored-by: Franz Pöschel <franz.poeschel@gmail.com> * Fix: Read Inconsistent Zero Pads (#1118) * [Draft] Fix: Read Inconsistent Zero Pads Some codes mess up the zero-padding in `fileBased` encoding, e.g., when specifying padding to 5 digits but creating >100'000 output steps. Files like those cannot yet be parsed and fell back to no padding, which fails to open the file: ``` openpmd_00000.h5 openpmd_02000.h5 openpmd_101000.h5 openpmd_01000.h5 openpmd_100000.h5 openpmd_104000.h5 ``` Error: ``` RuntimeError: [HDF5] Failed to open HDF5 file diags/diag1/openpmd_0.h5 ``` * Revert previous changes except for test Parse iteration numbers that are longer than their padding Read inconsistent zero padding * Overflow Padding: Read Test * Warn if the prefix does end in a digit * Fix: Don't let oversize numbers accidentally bump the padding * Update test * Issue warnings on misleading patterns also when writing * Minor Style Update Co-authored-by: Franz Pöschel <franz.poeschel@gmail.com> * Release: 0.14.3 Co-authored-by: Franz Pöschel <franz.poeschel@gmail.com> Co-authored-by: guj <guj@users.noreply.github.com> Co-authored-by: Jean Luca Bez <jeanlucabez@gmail.com> Co-authored-by: Jean Luca Bez <jlbez@lbl.gov>
@camille12225 reports that even with this port in 0.14.3, she still sees issues with inconsistent padding in files. I added a report to #1173 |
Some codes mess up the zero-padding in
fileBased
encoding, e.g., when specifying padding to 5 digits but creating >100'000 output steps.Files like those cannot yet be parsed using an
openpmd_%T.h5
file open string and fell back to no padding, which fails to open the file:Error:
To Do