general HDF5 module support #663
Maybe we can just flesh out details at our next meeting, and I can update the issue once it's clearer what the plan is.
If we are going to combine module data, that could take some care. We probably want it handled in some kind of aggregator that stores it in the … For HDF5 logs, we currently only have …
Tyler had some good comments in today's meeting about collecting testing logs for this issue which I don't think my comment above captured. He suggested that we collect/create logs with known or easy-to-determine values (say we know it opened 5 files) for testing. Since we can't use the original darshan reports to confirm our outputs for HDF5-related things, this would give us a lot more confidence moving forward, especially if the data post-processing gets complicated.
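A tiny sketch of what such a known-value workload generator might look like (the file counts, sizes, and names here are invented for illustration; this is not code from the darshan repos):

```python
# Illustrative sketch only: generate I/O with known, easy-to-verify totals
# so a Darshan log captured around this run has predictable counters.
# The defaults (5 files, 100 bytes each) are arbitrary choices.
import os
import tempfile


def generate_known_io(n_files=5, n_bytes=100):
    """Open exactly n_files files and write exactly n_bytes to each."""
    tmpdir = tempfile.mkdtemp()
    payload = b"x" * n_bytes
    for i in range(n_files):
        with open(os.path.join(tmpdir, f"known_{i}.dat"), "wb") as f:
            f.write(payload)
    # Expected (if Darshan instruments this run): opens == n_files,
    # total bytes written == n_files * n_bytes.
    return n_files, n_files * n_bytes


print(generate_known_io())  # (5, 500)
```

With totals this simple, the expected counter values can be written down right next to the stored log.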
A Python-based approach like the one in darshan-hpc/darshan-logs#22 and https://github.com/tylerjereddy/heatmap_diagonal could likely be used with …
* Adds `H5F` and `H5D` specific sections to `plot_opcounts()` aggregator functions `agg_ioops()` and `gather_count_data()`. Current method does not combine `H5F` and `H5D` module data together.
* Add `H5F` and `H5D` test cases to `test_plot_exp_common` tests `test_xticks_and_labels` and `test_bar_heights`
* Contributes to issue darshan-hpc#663
Unrelated to my last comment, my understanding of how we want to combine the …

For each module the total "Opens" is 6, so if we sum these values together, we get a total of 12 "Opens" for HDF5. Is that how we want to combine the HDF5 module data? I ask because when these values are separate, I get the understanding that there were 6 instances where a file was opened and its dataset(s) were accessed. But with the values summed to 12 under a single "HDF5" banner, I feel like it gives the impression that 12 files were opened, at least at a glance. I know we are going to lose some granularity either way, but would it make more sense to use the …
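To make the two options concrete, here is a toy sketch with invented counter values (6 file-level opens, 6 dataset-level opens); these numbers are illustrative, not from a real log:

```python
# Toy counters standing in for the H5F (file-level) and H5D
# (dataset-level) darshan module data.
h5f_counts = {"Open": 6, "Read": 0, "Write": 0}
h5d_counts = {"Open": 6, "Read": 12, "Write": 12}

# Option 1: sum the modules -- "Open" inflates to 12, which at a glance
# reads as 12 file opens even though only 6 files were opened.
summed = {op: h5f_counts[op] + h5d_counts[op] for op in h5f_counts}

# Option 2: keep H5D counts for dataset operations but take "Open" from
# H5F only, preserving the file-open count.
blended = dict(h5d_counts)
blended["Open"] = h5f_counts["Open"]

print(summed["Open"], blended["Open"])  # 12 6
```

Either way some granularity is lost; the sketch just shows where the double counting comes from.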
To address #663 (comment) first, I was able to reproduce the read-invisibility of … :

```python
from mpi4py import MPI
import numpy as np
from numpy.testing import assert_array_equal
import h5py


def main():
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    # round trip HDF5 IO test
    file_path = f"./test_rank_{rank}.hdf5"
    n_bytes = 10 * (rank + 1)
    bytes_to_write = np.ones(shape=n_bytes, dtype=np.int8)
    with h5py.File(file_path, "w") as f:
        f.create_dataset("dataset", data=bytes_to_write)
    with h5py.File(file_path, "r") as g:
        retrieved_data = np.asarray(g['dataset'])
        assert_array_equal(retrieved_data, bytes_to_write)


if __name__ == "__main__":
    main()
```

The ramped-up bytes written per rank is correctly recorded, and the increase in bytes read is real according to the NumPy assertion, but completely invisible to the darshan-runtime (see below the fold). In fact, …
I'm also wondering if …
For reference, Shane gave an update on how to combine the HDF5 modules in this comment: …
As far as "ground truth" HDF5 testing logs go, these are the current cases we've been able to create successfully: …

I think we would still want to supplement these with some others where …
I suspect a single directory would be ok for some of these simple cooked-up HDF5 cases. The Python code is almost small enough to put in there too (though we haven't been doing that), or just link to the small source snippets I suppose. I guess the team might even be "ok" with just a short description of how each log was generated without needing the source connections/links.
* Update list of supported modules in `get_io_cost_df()` to include `H5F` and `H5D` modules
* Add function `combine_hdf5_modules()` to aggregate the HDF5 modules (`H5F` and `H5D`) together for the I/O cost figure
* Fix test case for `test_get_io_cost_df()` as it now includes the HDF5 module
* Add test `test_combine_hdf5_data()` to verify the HDF5 module data is being combined appropriately
* Contributes to issue darshan-hpc#663
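A hedged sketch of what a `combine_hdf5_modules()`-style aggregation could look like; the dataframe shape, column names, and values below are invented for illustration and may not match `get_io_cost_df()` exactly:

```python
import pandas as pd

# Hypothetical I/O cost layout: one row per module, average seconds spent
# in each phase. Real column/index names in get_io_cost_df() may differ.
io_cost = pd.DataFrame(
    {"Read": [1.0, 0.0, 2.5], "Write": [3.0, 0.0, 1.5], "Meta": [0.5, 0.25, 0.25]},
    index=["POSIX", "H5F", "H5D"],
)


def combine_hdf5_rows(df):
    """Collapse any H5F/H5D rows into a single combined HDF5 row."""
    hdf5_mods = [m for m in ("H5F", "H5D") if m in df.index]
    if not hdf5_mods:
        return df
    combined = df.loc[hdf5_mods].sum()  # element-wise sum of the two rows
    out = df.drop(index=hdf5_mods)
    out.loc["HDF5"] = combined
    return out


print(combine_hdf5_rows(io_cost).loc["HDF5"].tolist())  # [2.5, 1.5, 0.5]
```

Summing time columns avoids the double-counting concern raised for operation counts, since time spent in `H5F` and `H5D` routines is genuinely additive.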
* Add `H5D` or `H5F` to operation count figure modules. Only 1 module is added since the `H5D` figure encompasses both HDF5 modules' data.
* Correct expected figure count in `test_main_without_args` for the `ior_hdf5_example.darshan` case since the `H5D` section now contains an operation counts figure.
* Contributes to issue darshan-hpc#663
* Change the per-module report section titles to `Per-Module Statistics: HDF5` for `H5F` and `H5D` figures
* Fix issue darshan-hpc#663
Closing due to merge of gh-707 -- if there are residual issues discussed here, it may be easier to split them off into other issues, though I believe at least Shane's initial comment is dealt with.
* MAINT: PR 571 revisions
* Simplify `img` tag counting in `test_main_without_args`
* MAINT: Update mypy config to ignore all pandas imports
* Change mypy pandas config to ignore all imports
* Remove unnecessary type ignores for `pandas.testing` imports
* ENH: Add log path retrieval machinery
* Add module `log_utils.py` containing new function `get_log_path` for easy retrieval of logs from either local sources or the darshan-logs repo, using only the filename
* Fixes #566
* MAINT: PR 587 revisions
* Fix type hint issue
* MAINT: PR 587 revisions
* Update type hints to use shorthand versions
* MAINT: PR 587 revisions
* Make pytest import contingent on whether a pytest session is ongoing
* MAINT: PR 587 revisions
* Simplify log retrieval logic into single function
* MAINT: PR 587 revisions
* `_locate_log()` no longer moves up too many directory levels; add a related `TODO` comment to get rid of the requirement to move up levels altogether
* MAINT: PR 587 revisions
* `log_utils.py` no longer depends on a fragile environment variable `PYTEST_CURRENT_TEST` to detect if we are in a `pytest` run; this was failing even for usage of `get_log_path()` within the body of a test
* `test_main_without_args()` was adjusted to use `get_log_path()`, including one new logs repo-based log file; it appears to be running the extra test case when the logs repo is available, and skipping when the logs repo is absent
* MAINT: PR 587 revisions
* improve `get_log_path()` documentation to include a note about not using the func in pytest decorators
* ENH: Add common access table
* Add module `plot_common_access_table.py` for generating the per-module common access table
* Add testing module `test_plot_common_access_table.py` with unit tests for each function in `plot_common_access_table.py`
* MAINT: PR 600 revisions
* Remove unnecessary type ignore for `pandas.testing` import
* Fix typo
* MAINT, TST: PR 600 revisions
* Simplify `test_common_access_table`:
  - Change input from report to log path
  - Assign redundant column names inside of test
* Add comments to POSIX and MPI-IO cases mentioning values are from original report code
* Add new test case using `nonmpi_partial_modules.darshan` with comment mentioning the values are from the original report
* MAINT: PR 600 revisions
* Change name of `test_general` to `test_misc_funcs`
* Simplify `test_misc_funcs` by assigning column names for dataframes in test body
* MAINT: PR 600 revisions
* Leverage new log retrieval machinery to simplify `test_common_access_table`
* TST: PR 600 revisions
* Add `imbalanced-io.darshan` test case for further verification of new common access table outputs
* BUG, BENCH: fix time_plot_heatmap_builtin_logs
* fix the call signature of `plot_heatmap()` in `time_plot_heatmap_builtin_logs()` so that the benchmark no longer fails
* TST: tests for get_log_path()
* add some basic regression tests for `get_log_path()`; focuses on the usage in the `pytest` context, which is the main use case
* no noticeable test suite slowdown measured locally with logs repo present
* BENCH: asv benchmark for get_log_path()
* add a benchmark for repeated calls to `get_log_path()`, which is a situation likely to arise in the context of the full `pytest` suite where many log files are retrieved
* benchmark runs in 42 seconds locally: `time asv run -e -b "time_get_log_path_repeat"`

```
· Creating environments
· Discovering benchmarks
· Running 1 total benchmarks (1 commits * 1 environments * 1 benchmarks)
[  0.00%] · For pydarshan commit 6d0a48e <pydarshan-devel>:
[  0.00%] ·· Benchmarking virtualenv-py3.9-cffi-numpy-pytest
[ 50.00%] ··· Running (log_utils.GetLogPath.time_get_log_path_repeat--).
[100.00%] ··· log_utils.GetLogPath.time_get_log_path_repeat            ok
[100.00%] ··· =========== =============
              --           filename
              ----------- -------------
               num_calls   dxt.darshan
              =========== =============
                   1        97.5±0.8ms
                  10         972±8ms
                  25        2.43±0.02s
              =========== =============

asv run -e -b "time_get_log_path_repeat"  19.79s user 30.18s system 116% cpu 42.835 total
```

* MAINT: PR 609 revisions
* add an `__init__.py` in `examples/darshan-graph/` so that `setuptools` will install the `*.darshan` files in that path as well
* ENH: faster get_log_path()
* this is a precursor to using `get_log_path()` ubiquitously in the test suite, which itself is a precursor to being able to run the test suite from any directory (including a user/developer with a `spack` install of `py-darshan`, which is our primary planned vehicle for initial delivery of the new report capability)
* achieve nearly constant-time log filepath retrieval using a dictionary/hash table
* the speedups on repeated calls, the most important scenario for running in `pytest`, are several orders of magnitude using the new benchmark in gh-610: `asv continuous pydarshan-devel treddy_get_log_path_dict -e -b "time_get_log_path_repeat"`

```
       before           after         ratio
     [6d0a48e]       [47e9a462]
     <pydarshan-devel>       <treddy_get_log_path_dict>
-      97.6±2ms       962±100ns     0.00  log_utils.GetLogPath.time_get_log_path_repeat(1, 'dxt.darshan')
-       974±7ms      3.14±0.1μs     0.00  log_utils.GetLogPath.time_get_log_path_repeat(10, 'dxt.darshan')
-       2.43±0s      7.31±0.3μs     0.00  log_utils.GetLogPath.time_get_log_path_repeat(25, 'dxt.darshan')

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
```

* the new tests from gh-609 also seem to pass with this branch, though it would probably be prudent to review/fix/merge that first..
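The dictionary/hash-table idea can be sketched roughly like this (function names and the `lru_cache` approach are illustrative, not PyDarshan's actual implementation):

```python
# Sketch of constant-time log path retrieval: walk the log directories
# once, then answer repeat lookups from a dict. Names here are invented,
# not the real log_utils.get_log_path() internals.
import os
from functools import lru_cache


@lru_cache(maxsize=None)
def _build_log_index(root):
    """Walk root once and map each *.darshan filename to its full path."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            if fname.endswith(".darshan"):
                # keep the first match for a given filename
                index.setdefault(fname, os.path.join(dirpath, fname))
    return index


def get_log_path_cached(filename, root="."):
    """O(1) lookup after the first call for a given root."""
    return _build_log_index(root).get(filename)
```

The one-time `os.walk` cost is amortized over the many lookups a full `pytest` run performs, which is consistent with the microsecond-scale repeat timings in the benchmark above.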
* MAINT: PR 611 revisions
* remove extraneous space highlighted by reviewer
* MAINT: Close dxt heatmap matplotlib figures
* Fixes issue when running tests where many figures are opened and not explicitly closed, generating a pytest warning
* TST: Add `pytest.ini`
* Add `pytest.ini` to upgrade `matplotlib`-generated RuntimeWarnings into errors
* MAINT: Remove unused import in `test_plot_common_access_table.py`
* MAINT: refactor assets
* move `examples` and `tests` folders inside `darshan` package so they can be easily installed as elements of the package, and prevent us from having to search more broadly on the filesystem to locate `*.darshan` file assets (and to avoid monopolizing `import tests`, `import examples`, which is crazy)
* adjust setup.py accordingly--there is now only a `darshan` package to `install`, not separate `tests` and `examples` packages
* migrate the test suite over to `get_log_path()` as needed to empower a portable approach to running tests (i.e., `pytest --pyargs darshan` from any directory)
* `_locate_log()` no longer needs to go searching around on the filesystem outside of our own packages, so get rid of that..
* on this branch, in some random directory, `pytest --pyargs darshan -n 8` produces: `255 passed in 88.93s (0:01:28)`
* this is in line with latest `pydarshan-devel` run of the full suite before migration to `get_log_path()` and reorganization: `pytest --import-mode=importlib -n 8` `255 passed in 87.78s (0:01:27)`
* enforce support for `pytest --pyargs darshan` from any dir in the CI (using a random temp dir there)
* things reviewers could do to help here:
  - check the large diff for potential errors
  - check/fix up the `asv` benchmark suite for the effects of this branch
  - check if the coverage report stuff is still working "ok" with new `pytest` portability/incantation
* MAINT: PR 612 revisions
* rename `example-logs` to `example_logs` for `mypy`, which is best practice anyway
* ignore some old tutorial file that `mypy` doesn't like
* MAINT: PR 612 revisions
* adjust the CI `mypy` check to only verify the `darshan` folder, because the testing folder is now inside the main package
* MAINT: PR 612 revisions
* changes to make `asv check` pass; we really probably should just use `get_log_path()` at this point though...
* MAINT: fixup setup.py for example_logs
* MAINT: PR 612 revisions
* apply Nik's suggested patch to `test_issue_590()`
* remove a redundant `--cov=` command in the `pytest` CI incantation, as suggested by Nik
* CI: debug work for codecov.
* MAINT: move remaining tests and remove `/examples`
* Move tests `test_plot_common_access_table.py` and `test_log_utils.py` into `darshan/tests`
* Remove unnecessary `__init__.py` from `examples/darshan-graph`
* Fix issue #626
* ENH: Add dataframe/table figure support
* Update method `generate_img` name to more general `generate_fig` and update documentation to reflect the changes
* Generalize `generate_fig` to support figure functions that generate pandas dataframes
* Add test `test_generate_fig_unsupported_fig_type` to check appropriate error is raised when input function does not generate the supported figure types
* Fix issue #550
* BUG: PR 618 revisions
* Change default `to_html` arguments to remove index instead of header
* MAINT: PR 618 revisions
* Add object `DarshanReportTable` to `plot_common_access_table.py` for storing tables in dataframe and html format. Should set a standard that allows access for testing but also puts the burden on the figure function to generate a `DarshanReportTable` that contains the desired html table for the report.
* Change `plot_common_access_table` to return `DarshanReportTable` and update documentation
* Fix `test_common_access_table` due to above change
* `ReportFigure.generate_fig` changes:
  - When checking for `matplotlib`/`seaborn` figures, check for the attribute `savefig` instead of checking figure type
  - When checking for table figures, check for `DarshanReportTable` type instead of `pd.DataFrame`
  - Remove `supported_mpl_fig_types` and update error message to simply state that a given figure type is not supported
* Remove unused imports, add import for `plot_common_access_table.py`
* MAINT: rm redundant sample.darshan
* remove a redundant copy of `sample.darshan`
* ENH: Add common access size table to darshan summary report
* Add "Per-Module Stats" section to report containing the common access size table (to start)
* Extend `test_main_without_args` to check for number of tables generated as a regression guard
* MAINT: Update return type for `plot_common_access_table`
* Update return type for `plot_common_access_table` from generic `Any` to `DarshanReportTable`
* BUG: correct section closing tag in `base.html`
* Relocate section closing tag to outside of figure loop
* TST: Add regression guard for section closing tags
* Add check to test `test_main_without_args` to verify the number of close/open section tags match
* TST: Add regression guard for section closing tags
* Add check to test `test_main_all_logs_repo_files` to verify the number of close/open section tags match for all log files
* ENH: Add partial module data flag
* Add module `plot_flag` with functions for generating a generic warning flag with configurable warning message
* Add warning flags to summary report module table for modules marked as having partial module data
* `test_module_table()` changes:
  - Add tests for logs `partial_data_stdio.darshan` and `partial_data_dxt.darshan` for checking cases with partial module data
  - Adjust actual and expected module dataframes to reflect changes in the module table
* Fix gh-498
* MAINT: purge example2.darshan
* `example2.darshan` is a duplicate of `example.darshan`, which itself is a duplicate of `sample.darshan`
* for now, just purge `example2.darshan` as a small cleanup; I have not touched the two Jupyter notebooks that used `example2.darshan`, but I suspect Jakob is "ok" with that for now
* MAINT: cleanups after main merge.
* MAINT: PR 634 revisions
* Remove module `plot_flag.py`
* Change partial flag to use unicode warning symbol over matplotlib figure
* Update `test_module_table` to check for unicode warning symbol
* ENH: Add access size histogram to summary report
* Add access size histogram figure from `plot_access_histogram` to summary report
* Update expected `img` tag counts in `test_main_without_args` to account for new access size histogram figures
* Rework table counting in `test_main_without_args` to ignore extraneous instances of the word "table", such as the instance in the figure caption for the access size histogram
* Contributes to issue #465
* MAINT: Correct `nonmpi_partial_modules.darshan` filename
* Correct all instances of `nonmpi_partial_modules.darshan` to `nonmpi_dxt_anonymized.darshan` after the following darshan-logs PR was merged: darshan-hpc/darshan-logs#19
* MAINT: PR 634 cleanup
* Fix comment in `test_module_table()`
* ENH: Add `H5D` module support to "Access Size Histogram"
* Add `H5D` to supported module list for `mod_agg_iohist`
* Add `H5D` access histogram to summary report
* Add test cases for `ior_hdf5_example.darshan`:
  - `test_main_without_args()` in `test_summary.py`
  - `test_xticks_and_labels()` and `test_bar_heights()` in `test_plot_exp_common.py`
* Add pydarshan low-level backend function to access HEATMAP records.
* Indentation fix, missing mod_name value.
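The `DarshanReportTable` idea above can be sketched as a small wrapper that keeps the dataframe (easy to test against) and the rendered HTML (what the report embeds) side by side; the attribute names and constructor here are guesses for illustration, not the actual PyDarshan API:

```python
import pandas as pd


class DarshanReportTable:
    """Illustrative sketch: pair a dataframe with its HTML rendering."""

    def __init__(self, df, **to_html_kwargs):
        self.df = df  # structured data, convenient for assertions in tests
        self.html = df.to_html(**to_html_kwargs)  # markup for the report


table = DarshanReportTable(
    pd.DataFrame({"Access Size": [1024], "Count": [7]}),
    index=False,  # drop the index column, as in the PR 618 revision
)
print("1024" in table.html)  # True
```

This matches the stated design goal: tests assert on `df`, while the figure function owns producing the exact HTML the report needs.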
* Add heatmap support to backend.log_get_record
* ENH: Add grid formatting to darshan summary report
* Add and implement new CSS class to allow for grid formatting of figures in each summary report section
* Contributes to issue #517
* ENH: Update summary report section titles
* Change "Log Details" to "Job Summary"
* Change "Module Breakdown" to "Darshan Log Information"
* Change "I/O Operations" to "I/O Summary"
* Change "Per-Module Stats" to "Per-Module Statistics"
* Contributes to issue #641
* ENH: Add Darshan website link to summary report footer
* Contributes to issue #641
* Add HEATMAP support to DarshanReport class. Add heatmap datatype.
* Update devel build recipe to also generate configure scripts using prepare.sh needed for HEATMAP support to work.
* Do not use self.data[] on report.
* CI: remove `type: ignore` instances
* ENH: Update summary report figure captions
* Update captions for DXT heat map, I/O cost, and Access Size figures according to suggestions from Phil
* Contributes to issue #641
* MAINT: PR 647 Revisions
* Change I/O cost graph caption from "by process type" to "by access type"
* Fix regression if/elif leading to lustre records being handled wrongly.
* Raise qualified exceptions.
* Fix: Local variable name_records is assigned but never used.
* BUG: Change DXT heatmap to use `nprocs` for y-axis scale and correct bar graph anomalies
* Use `nprocs` to scale y-axis for heatmap figure instead of number of ranks with IO activity
* Add new function `get_filled_hmap_df` for filling data gaps in heatmap dataframe. If a rank has no data it will now have a dedicated row filled with zeros.
* Alter method of y-axis flipping to simply flip the order of the input data array since `invert_yaxis()` creates inconsistencies between heatmap and horizontal bar graph. This was observed in the following log file for the `DXT_POSIX` module: `snyder_acme.exe_id1253318_9-27-24239-1515303144625770178_2.darshan`
* Correct tests for cases where not all ranks had non-zero bins
* Fixes issue #576
* Contributes to issue #575
* MAINT: PR 622 revisions
* Remove `get_filled_hmap_df` and function calls
* Add `nprocs` parameter to `get_heatmap_df` and update documentation
* Update `get_heatmap_df` to return dense dataframe based on `nprocs`
* Correct tests in `test_heatmap_handling.py` to reflect dense heatmap dataframe output
* Revert y-axis flipping method back to `ax.invert_yaxis()`
* MAINT: PR 622 revisions
* Correct misc grammar mistakes
* MAINT: PR 622 revisions
* fix `time_get_heatmap_df()` for function signature change
* BENCH: peakmem benchmarks for heatmap
* add some peak memory tracking `asv` benchmarks for the DXT heatmap code, because the decision between sparse and dense data structures requires statistically robust measurements of both performance and memory impact
* the `GetHeatMapDf` benchmarks on synthetic data now include a spread of data densities to better reflect the reality that many MPI applications only have a subset of IO-active ranks
* Include heatmap in read_all_records, add guard for HEATMAP.
* Doc: fix incomplete sentence, remove comments from backend left in for potential addition of resolve name_record as part of dict.
* Remove changes that increased diff to files unrelated.
* Don't add self.data['heatmaps'] to report.
* Add hash for MPIIO:heatmap to resolve table.
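The densification step described above (every rank gets a row, zeros for ranks with no I/O) can be sketched without pandas; the real `get_heatmap_df()` works on dataframes, so the input/output shapes here are simplified stand-ins:

```python
# Illustrative sketch: densify per-rank heatmap bins so the y-axis can be
# scaled by nprocs. sparse_bins maps rank -> per-time-bin byte counts;
# ranks absent from the mapping recorded no I/O.
def densify_heatmap(sparse_bins, nprocs, nbins):
    """Return one row per rank 0..nprocs-1, zero-filled where data is missing."""
    return [list(sparse_bins.get(rank, [0] * nbins)) for rank in range(nprocs)]


dense = densify_heatmap({0: [4, 0, 1], 3: [0, 2, 2]}, nprocs=4, nbins=3)
print(dense)  # [[4, 0, 1], [0, 0, 0], [0, 0, 0], [0, 2, 2]]
```

In pandas terms this is essentially a reindex over `range(nprocs)` with a zero fill value, which is what makes the heatmap rows line up with the `nprocs`-scaled axis.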
* ENH: Add operation counts figure to summary report
* Add operation counts figure from `plot_opcounts.py` to summary report
* Update expected `img` tag counts in `test_main_without_args` to account for new operation count figures
* Fix `img` tag string such that extraneous instances of "img" are not included in total count
* Close issue #465
* MAINT: PR 654 revisions
* Change "Operation Counts" figure caption to "Histogram of I/O operation frequency."
* MAINT: Update darshan summary report metadata table runtime datatype
* Change method `ReportData.get_runtime()` to use integers instead of floats
* Correct tests `test_metadata_table` and `test_get_runtime` to use integer values
* MAINT: Update partial flag message
* Change partial flag message to "Module data incomplete due to runtime memory or record count limits"
* Correct test `test_module_table` to reflect updated partial data flag message
* MAINT: Update summary report metadata table executable description
* Change executable description to "Command Line"
* Update row names for expected dataframes in test `test_metadata_table`
* MAINT: Add suffix to module data table entries
* Suffix module descriptions in "Darshan Log Information" table with "Module Data"
* Correct test `test_module_table` such that expected dataframes contain the updated row names
* Typos and switch runtime to ValueError.
* TST: xfail HEATMAP tests
* mark the two test cases requiring `HEATMAP` module support as known failures (`xfail`) for now, so that PRs in the main and logs repos can pass CI until the support is added in
* because the two tests are not parametrized, the `xfail`s are imperative, which means that only those inputs that precede the `xfail`ed logs will run for now, so I wouldn't want to leave these in for long, just long enough to avoid rushing the HEATMAP support addition/review
* an alternative is to simply `pass`/`continue` for these test cases, which would allow all of the other cases for the two tests to run as usual, though it would leave out the convenient `pytest` reminder that the tests have known failures in them
* MAINT: Reorganize summary report metadata sections
* Move "Log Filename", "Runtime Library Version", and "Log Format Version" metadata into "Darshan Log Information" section of summary report
* Correct tests `test_metadata_table` and `test_module_table` to reflect new organization of log metadata
* Close issue #641
* TST: graceful HEATMAP xfails
* only test cases that require runtime `HEATMAP` module support are `xfail`ed in the test suite now, instead of also skipping the cases that also come after them in iterable tests
* while the original design was meant to be temporary, and was made intentionally on the assumption that HEATMAP support would be added shortly, this introduces parametrization into the two tests that are involved and does away with the old fixtures
* on `pydarshan-devel`: `300 passed, 2 xfailed in 38.83s`
* on this branch: `318 passed, 6 xfailed in 96.56s (0:01:36)`
* there are 3 runtime heatmap logs, and two tests that need to avoid them, so the xfail arithmetic looks correct there, along with the restoration of a bunch more logs cases...
* MAINT: PR 675 revisions
* handle the absence of the logs repo with new `HEATMAP` `xfail` machinery..
* Add pydarshan low-level backend function to access HEATMAP records.
* Indentation fix, missing mod_name value.
* Add heatmap support to backend.log_get_record
* Add HEATMAP support to DarshanReport class. Add heatmap datatype.
* Update devel build recipe to also generate configure scripts using prepare.sh needed for HEATMAP support to work.
* Do not use self.data[] on report.
* Fix regression if/elif leading to lustre records being handled wrongly.
* Raise qualified exceptions.
* Fix: Local variable name_records is assigned but never used.
* Include heatmap in read_all_records, add guard for HEATMAP.
* Doc: fix incomplete sentence, remove comments from backend left in for potential addition of resolve name_record as part of dict.
* Remove changes that increased diff to files unrelated.
* Don't add self.data['heatmaps'] to report.
* Add hash for MPIIO:heatmap to resolve table.
* Typos and switch runtime to ValueError.
* MAINT: PR 615 revisions
* add some tests for runtime HEATMAP functionality
* WIP, ENH: add HDF5 support to `plot_opcounts()` aggregators
* Adds `H5F` and `H5D` specific sections to `plot_opcounts()` aggregator functions `agg_ioops()` and `gather_count_data()`. Current method does not combine `H5F` and `H5D` module data together.
* Add `H5F` and `H5D` test cases to `test_plot_exp_common` tests `test_xticks_and_labels` and `test_bar_heights`
* Contributes to issue #663
* For newly introduced exception to indicate 'ModuleNotInDarshanLog' inherit from ValueError.
* TST: remove heatmap xfail
* we no longer need to mark runtime `HEATMAP` tests as known failures, since the binding support has been merged now
* MAINT: PR 691 revisions
* restore a narrower `xfail` because `e3sm_io_heatmap_and_dxt.darshan` uses too much memory for CI, for now
* MAINT: PR 685 revisions
* Update `plot_opcounts.gather_count_data()` to return combined data for `H5D` module
* Correct `H5D` test cases in `test_xticks_and_labels` and `test_bar_heights`
* Add `mod` parameter to `plot_opcounts` docstring with a note explaining the special behavior when `H5D` is input
* MAINT: refactor `get_by_avg_series()`
* Refactor `get_by_avg_series()` such that all modules output `pd.Series` objects of the same shape. This is specifically targeted at the `H5F` module, which previously produced a data array of length 1 since it contains only "Meta" data.
* ENH: Add HDF5 support to I/O cost figure
* Update list of supported modules in `get_io_cost_df()` to include `H5F` and `H5D` modules
* Add function `combine_hdf5_modules()` to aggregate the HDF5 modules (`H5F` and `H5D`) together for the I/O cost figure
* Fix test case for `test_get_io_cost_df()` as it now includes the HDF5 module
* Add test `test_combine_hdf5_data()` to verify the HDF5 module data is being combined appropriately
* Contributes to issue #663
* MAINT: PR 685 revisions
* Change labels to use the module name + operation name (i.e. `H5F Open`, `H5D Read`, etc.)
* WIP: draft work on data access by filesystem section
* MAINT: plot adjustments
* Filesystem -> Category for plot titles as a generalization based on recent comments from Phil
* initial placement of counter values next to horizontal bars in the read/write count plots (probably not perfect yet)
* DEBUG: more debug work
* ENH: PR 397 revisions
* draft in support for counting/plotting bytes, including a number of new functions to further reduce code duplication
* BUG: PR 397 revisions
* added a regression test for `empty_series_handler()` not actually filling in the index:value pairs when a filesystem is missing from the series index, but is present in other parts of the control flow
* fix the bug in `empty_series_handler()`
* MAINT: PR 397 revisions
* the `empty_series_handler()` function can be simplified to a single `reindex` operation, so do that
* MAINT: PR 397 revisions
* fixed the spelling of `Parameters` in several docstrings
* removed the `:` after some `Parameters` and `Returns` docstring fields, for consistency
* MAINT: PR 397 revisions
* fixed an issue where the `df_reads` parameter was repeated in the `process_byte_counts()` function docstring
* TST: add more tests
* add a regression test for the function `convert_file_path_to_root_path()`
* add a regression test for the function `convert_file_id_to_path()`
* add a regression test for the function `identify_filesystems()`
* add a regression test for the function `rec_to_rw_counter_dfs_with_cols()`
* add a regression test for the function `check_empty_series()`
* add a regression test for the function `process_byte_counts()`
* TST: PR 397 revisions
* add regression test for the function `process_unique_files()`
* add regression test for the function `unique_fs_rw_counter()`
* TST, MAINT: simplify index naming PR 397
* a few tests have been simplified in PR 397 to name the index directly inside the pytest parameters instead of the more verbose setting inside the tests proper
* TST, MAINT: move plot_data under test
* place the main plotting function, `plot_data()`, under crude initial test
* MAINT: PR 397 revisions
* move more plotting code/logic into the main code module to provide a more convenient plotting entrypoint
* TST, ENH: remove right side spline
* add a regression test and fix for the issue of value labels overlapping with the axis right side spline, as reported in the review of PR 397
* BUG: scale font size y annotation
* add a regression test and fontsize-adjusting fix for the case where the log file `noposixopens.darshan` causes the y axis labels (annotations) to overlap with the plot itself (because the strings were longer)
* BUG: fix 0 count positions
* add a regression test and fix for the case where there is no POSIX read/write activity and text labels are observed too far to the right side of the subplots
* MAINT: PR 397 revisions
* add regression test and fix for appropriate error handling when the `POSIX` module data is completely absent from a darshan log
* MAINT: PR 397 revisions
* address reviewer comment to simplify `ax_filesystem_bytes.barh()` calls
* MAINT: PR 397 revisions
* removed debug prints from `plot_data()` function
* address reviewer comments related to `verbose` mode for the function `plot_with_log_file()`
* address `mypy` `List`->`Sequence` parameter reviewer comments
* MAINT: PR 397 revisions
* `df_reads` -> `df_writes` for incorrectly named parameter in the function `process_unique_files()`
* add type hints to `plot_data()` function, ignoring the pointless `pandas` "types" for now
* add a basic docstring to the `plot_data()` function
* MAINT: PR 397 revisions
* add a missing error to the `Raises` section of the `unique_fs_rw_counter()` function
* MAINT: PR 397 revisions
* fix the spacing in `plot_data()` such that the regression test provided here passes: #397 (comment)
* adjust `test_plot_data()` to match the spacing changes above
* MAINT: PR 397 revisions
* adjust the integer file counters such that they are displayed in scientific notation by `plot_data()`, as requested during PR review
* adjust `test_plot_data()` to require scientific notation for the unique file counters
* MAINT: PR 397 revisions
* add the performance improvements suggested here: #397 (comment)
* fix some issues with those changes
* MAINT: PR 397 revisions
* enforce shared axes in a column for "data access by category"
* use a log scale for the x axes for column differences greater than two orders of magnitude
* MAINT: PR 397 revisions
* we no longer ignore the `STD..` data streams in the data access by category analysis/plots
* these streams are now displayed using their proper name or `anonymized` if they are coming from an anonymized darshan log file
* add regression tests for the above behavior
* MiB Conversion: Replaced 2**20 approximation with 1048576
* Function Docstring: Added a docstring to `convert_id_to_arrays`
* Filesystem Variables: Use assigned filesystem variables "files_read" and "files_written" instead of multiple instances of series indexing.
* All Branch CI: Enabled CI on all branches for automated testing on pull requests.
* MiB Conversion: Replaced approximate conversion factor with actual value. Temporarily enabled CI for all branches.
* Restore CI: Removed temporary CI activation.
* MAINT: PR 397 revisions
* `plot_data()` now labels the x axis for plot columns that use log scaling so that usage of log scales is clearer to the user
* `plot_data()` now uses a symmetric log function to avoid artefacts near zero for the log axis (plot spanning far too many orders of magnitude)
* the `ratio` used to determine whether a column of plots should be log scaled now avoids division by zero, and makes an assessment based on `MiB` values instead of raw bytes
* add a new test fixture for selecting individual log repo files by name
* add a new regression test, `test_log_scale_display()`, for ensuring that the x axis gets properly labelled when a log scale is appropriate
* MAINT: PR 397 revisions
* `test_log_scale_display()` is now properly skipped when the darshan-logs project is not available
* MAINT: PR 397 revisions
* `rec_to_rw_counter_dfs()` no longer uses the "deprecated" `mod_read_all_records(..., dtype=...)` approach
* TST, MAINT: more log scaling fixes
* only scale `ratio` in `plot_data()` for bytes
* `test_plot_data_shared_x_axis()` now tests a case where symmetric log scaling is used in both columns
* MAINT: PR 397 revisions
* `mypy` fixups for my branch after rebase, since I ignore more `pandas` imports by default than `pydarshan-devel`
* MAINT: PR 397 revisions
* add a fix and regression test for scaling the vertical size of the plots based on the number of categories/file systems
* MAINT: PR 397 revisions
* allow plotting the first `N` categories
* Fixup after rebasing.
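The log-scaling heuristic those bullets describe (compare in `MiB`, guard against division by zero, switch to a symmetric log axis past ~two orders of magnitude) could be sketched roughly as follows. This is only an illustration; `needs_log_scale` and its threshold parameter are hypothetical names, not the actual `plot_data()` internals:

```python
import math

MIB = 1048576  # bytes per MiB, matching the exact conversion factor noted above


def needs_log_scale(byte_counts, max_orders=2):
    """Return True when the nonzero values in a column span more than
    `max_orders` orders of magnitude (compared in MiB), guarding against
    division by zero for all-zero columns."""
    mib_vals = [b / MIB for b in byte_counts if b > 0]
    if not mib_vals:
        return False  # no nonzero data: no meaningful ratio, keep a linear axis
    ratio = max(mib_vals) / min(mib_vals)
    return math.log10(ratio) > max_orders
```

With matplotlib, a column flagged this way would then get `ax.set_xscale("symlog")`, which behaves linearly near zero and so avoids the artefacts a plain log axis produces for zero-valued bars.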
* MAINT: PR 397 revisions
* adjust the `test_data_access_by_filesystem` module to use the newer `get_log_path()` machinery for retrieving the absolute paths to darshan test log files
* MAINT: PR 397 revisions
* remove extraneous `mypy` type ignores for `pytest` now that we require a newer `pytest` version that provides types in the main repo
* MAINT: PR 397
* adjust `plot_data()` to use `va="center"` for annotations per reviewer request, and enforce via regression test
* MAINT: PR 397 revisions
* adjust the `plot_data()` `text()` read/write labels to use a centered vertical alignment per reviewer request; add a regression test to enforce this
* MAINT: PR 397 revisions
* `plot_with_log_file()` has been renamed to `plot_with_report()` and accepts a `DarshanReport` object instead of a log file path, based on reviewer suggestions
* `plot_data()` now caps the number of subplots based on `num_cats`, and this is now enforced with a regression test that also checks for "collapsed" layouts
* `plot_with_report()` no longer saves a `.png` file to disk (enforced with a test), based on reviewer feedback, for consistency with many of the other report-related Python plotting functions
* remove the now-extraneous `plot_filename` argument from `plot_with_report()`
* remove extraneous `tmpdir` usage in several tests, because `plot_with_report()` no longer generates a file artifact by default
* MAINT: PR 397 revisions
* `plot_with_report()` now plots data in descending order of bytes IO activity, per reviewer request; includes a regression test
* MAINT: PR 397 revisions
* `convert_file_path_to_root_path()` has been adjusted to handle anonymized files on the root file system, based on feedback from Shane, along with a regression test for the plotted results no longer having `//<digits>` root paths
* some fixes to `test_cat_labels_std_streams()` because of the reverse sorting of categories by bytes IO
* fixes to `test_plot_with_report_proper_sort()` related to handling anonymous files on the
root path, and for using the correct index for the third row category label
* MAINT: PR 397 revisions
* use `get_log_path()` instead of some recently-removed `pytest` fixtures in `test_data_access_by_filesystem.py`
* MAINT: PR 397 revisions
* draft adjustments and testing to account for `STDIO` in the "data access by category" analysis/plots
* MAINT: PR 700 revisions
* Restrict the regex pattern in `combine_hdf5_modules()` to only collect `H5F` and `H5D` module names
* Clean up an import in `test_plot_io_cost.py`
* Fix a comment in `test_combine_hdf5_modules`
* MAINT: Use runtime for DXT heatmap x-axis scaling
* Update `plot_heatmap()` to use the runtime for scaling the x-axis
* Add a `tmax` parameter to `get_heatmap_df` to allow for setting the heatmap maximum time. Default behavior is to use the final DXT segment end time.
* Add logic to `plot_heatmap` to prevent truncation of data when using the job runtime for setting plotting boundaries
* Change `set_x_axis_ticks_and_labels()` and `get_x_axis_tick_labels()` to use the runtime for x-axis labels
* Update `set_x_axis_ticks_and_labels()` parameters and expected xticklabels in `test_set_x_axis_ticks_and_labels()`
* Contributes to the x-axis rescaling portion of issue #575
* MAINT: PR 696 revisions
* Revert changes to `get_heatmap_df()`
* Correct the synthetic test case and add full `runtime` calculation logic to the test `test_set_x_axis_ticks_and_labels`
* MAINT: PR 696 revisions
* Add a formal parameter `bin_max` to `set_x_axis_ticks_and_labels` and `get_x_axis_ticks` to allow for easy scaling of x-axis tick mark locations, and update the relevant documentation
* Update `test_set_x_axis_ticks_and_labels`:
  - Add complete x-axis limit setting logic
  - Correct the `expected_xticks` to more realistic values
  - Relax the tolerances on the actual/expected xticks comparison
* ENH: Add HDF5 operation counts figures to summary report
* Add `H5D` or `H5F` to operation count figure modules.
Only 1 module is added, since the `H5D` figure encompasses both HDF5 modules' data.
* Correct the expected figure count in `test_main_without_args` for the `ior_hdf5_example.darshan` case, since the `H5D` section now contains an operation counts figure.
* Contributes to issue #663
* MAINT: Change `H5D` and `H5F` figure section titles to HDF5
* Change the per-module report section titles to `Per-Module Statistics: HDF5` for `H5F` and `H5D` figures
* Fix issue #663
* MAINT: Include `H5D` columns in `H5F`-only operation counts figure
* Change the operation counts figure to always list columns for both the `H5F` and `H5D` modules. When `H5D` module data is not present, the values are set to zero.
* Contributes to issue #706
* CI: bump GHA action versions
* follow NumPy in bumping to the latest versions of the actions
* TST: Close figures in `test_bar_heights`
* Add the appropriate close statement to ensure each figure in `test_plot_exp_common.test_bar_heights` is closed.
* Fix issue darshan-hpc/darshan-logs#34
* TST: Add HDF5 test cases using "ground truth" logs
* Add "ground truth" testing log cases to `test_common_access_table` and change the number of rows to be configurable based on the number of rows in the input `expected_df`
* Add "ground truth" testing log cases to the `test_plot_exp_common.py` tests `test_xticks_and_labels` and `test_bar_heights` to check results for `plot_opcounts` and `plot_access_histogram`
* Fix issue #706
* ENH: Runtime heatmap multiple operation support
* Add multiple operation (i.e., read and write) support to the `Heatmap.to_df()` method. This mirrors the way some DXT heatmap functions operate, which will allow for a more seamless integration of the runtime heatmap into `plot_heatmap()`.
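The zero-filling behavior described above (always list `H5D` columns, report zeros when that module is absent) can be illustrated with a small sketch. The field names and dict-based records here are made up for illustration; they are not the actual darshan counter names or figure code:

```python
from typing import Dict, Optional

# hypothetical operation names standing in for real H5D counters
H5D_FIELDS = ("Opens", "Reads", "Writes", "Flushes")


def hdf5_op_columns(h5f: Dict[str, int], h5d: Optional[Dict[str, int]]) -> Dict[str, int]:
    """Build one combined set of operation-count columns that always lists
    both H5F and H5D entries; when no H5D data is present, its columns are
    reported as zero rather than omitted."""
    combined = {f"H5F {name}": count for name, count in h5f.items()}
    h5d = h5d or {}
    for name in H5D_FIELDS:
        combined[f"H5D {name}"] = h5d.get(name, 0)
    return combined
```

This keeps the figure layout stable across logs regardless of whether dataset-level instrumentation was enabled.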
* Correct call signatures for the `Heatmap.to_df()` method
* Add a test, `test_heatmap_operations`, for checking different operation configurations and comparing their results
* Fix issue #697
* MAINT: PR 702 revisions
* Simplify dataframe summation in `Heatmap.to_df()`
* TST: PR 702 revisions
* Add a test, `test_heatmap_df_invalid_operation`, for verifying that `ValueError` is raised appropriately in the `Heatmap.to_df()` method
* ENH: summary with runtime heatmaps
* add the runtime (`HEATMAP` module) heatmaps to the pydarshan/Python summary reports
* adjust some tests to reflect the new runtime `HEATMAP` support and to test some components of it in a bit more detail
* design considerations:
  - at the moment, the runtime heatmaps are placed first, but it may be helpful to grid, e.g., the POSIX versions of the DXT and runtime maps side-by-side to make the comparisons less jarring
  - it is a bit awkward that `DXT` data is directly associated with the module data source like `DXT_POSIX`, while for `HEATMAP` we index into a "submodule" dictionary--this awkwardness should be clear in the diff--not sure if we want to do anything about that
  - the new `determine_hmap_runtime()` function was added to provide a common abstraction when plotting heatmap data (i.e., this gives you access to DXT time resolution if available, even when only calling `plot_heatmap()` for the `HEATMAP` module); this should be "ok" even if in the future we disable plotting the `DXT` data for large cases, because it is mostly the plotting-related expansion of the data that causes the memory explosion rather than raw `DXT` access
  - it is a bit awkward that some of this machinery now resides in `plot_dxt_heatmap.py` but now supports modes of operation that don't require DXT data at all--since this is mostly a naming issue it might be deferred for a while
* BUG, TST: Fix issue 717
* Fix issue #717
* Add regression test `test_issue_717`
* WIP, ENH: summary with data access cats
* add the "Data Access by Category" figure to the Python
summary report, and adjust tests to reflect the additional figure and support for both POSIX and STDIO with this fig
* some issues to think about:
  - the new figure is ridiculously large compared to the previous figure in the report, and the option to specify the width of the figure in the `ReportFigure` has no effect
  - if we have a sample log that lacks both `POSIX` and `STDIO`, then `test_posix_absent` might be suitably replaced instead of deleted (may help if test coverage isn't 100% anymore)
  - the code changes here contain comments about an eventual API redesign; the issue there should hopefully be clear from the diff
  - the test suite ran in 3:34 on 6 cores locally--didn't quantify the slowdown relative to `pydarshan-devel` with the new analysis/figs
* MAINT: PR 718 revisions
* use a more reasonable pixel width in the summary report now that the value actually gets passed through
* adjust the new figure caption based on reviewer feedback
* MAINT: PR 715 revisions
* adjust the "missing heatmap" warning message, and related tests, based on reviewer feedback
* add code/tests to enforce side-by-side heatmap module comparisons--it seems to produce the described grid layout when both `HEATMAP` and `DXT` are present; note, however, that neither is currently constrained to a single-column layout when just one type of heatmap data is present
* minor simplification to use `self.report.heatmaps`
* placing an LRU cache on `determine_hmap_runtime()` cuts local processing time for `e3sm_io_heatmap_and_dxt.darshan` down from `2:49.53` to `2:22.12`
* improving the algorithm used by `get_heatmap_df()` allows this branch to vastly outperform `pydarshan-devel` for processing `e3sm_io_heatmap_and_dxt.darshan`; this branch is now `1:15.89` vs.
`2:08.91`--that's almost twice as fast while still doing more work; this goes back to my request: as I mentioned almost a year ago here (#396 (comment)), this DXT processing code is just a draft that deserved careful algorithmic improvement
* MAINT: try mask indices for 3.6
* MAINT: PR 715 revisions
* add a Python `3.6` codepath to `get_heatmap_df()`
* BUG: cats only allow POSIX/STDIO
* only allow `POSIX` and `STDIO` records to get passed through the control flow that produces the "Data Access by Category" plots
* add one new test case and adjust old cases to reflect the more selective plotting of categories
* MAINT: PR 678 revisions
* try pinning pytest for now
* Revert "MAINT: PR 678 revisions" (this reverts commit 3108230)
* Try older pytest pin.
* Revert "Try older pytest pin." (this reverts commit cf9f4a3)
* MAINT: PR 722 revisions
* don't use `get_log_path()` in a `pytest` decorator, it won't skip properly
* remove stale file missed in previous merge
* small whitespace fixes
* missed a stale autoconf file
* change CI git branch to main

Co-authored-by: Nikolaus Awtrey <nawtrey@lanl.gov>
Co-authored-by: Tyler Reddy <treddy@lanl.gov>
Co-authored-by: Tyler Reddy <tyler.je.reddy@gmail.com>
Co-authored-by: Jakob Luettgau <jluettga@utk.edu>
Co-authored-by: JacobDickens <Lcjacobpd@gmail.com>
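The commit notes above credit the `get_heatmap_df()` speedup to algorithmic improvement without spelling out the change. The core operation that routine has to perform--proportionally distributing each DXT segment's duration over uniform time bins--can be sketched in plain NumPy like this (a simplified stand-in, not the actual PyDarshan implementation):

```python
import numpy as np


def bin_segment_durations(starts, ends, tmax, nbins):
    """Distribute each [start, end] I/O segment across `nbins` uniform time
    bins spanning [0, tmax], crediting each bin with the overlapping time."""
    edges = np.linspace(0.0, tmax, nbins + 1)
    totals = np.zeros(nbins)
    for start, end in zip(starts, ends):
        # overlap of [start, end] with each bin [edges[i], edges[i + 1]];
        # negative values mean no overlap and are clipped to zero
        overlap = np.minimum(end, edges[1:]) - np.maximum(start, edges[:-1])
        totals += np.clip(overlap, 0.0, None)
    return totals
```

Vectorizing this kind of per-bin bookkeeping (rather than looping in pure Python) is the sort of change that can produce the roughly 2x improvement quoted above, though the actual strategy used in the branch is not described in these notes.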
We've made a ton of progress recently in getting v1 of the summary reports completed, but I wanted to create an issue so we don't forget to include general HDF5 support in our figures. I imagine it's not too difficult to extend the different figures to include HDF5 module data. I notice we already have some HDF5 data related to access sizes, but we could probably also include it in the op counts graphs, as well as in the I/O cost graph.
Thinking about it more, the HDF5 module probably requires a bit of special care, since it's really two modules (`H5F` for file access, and `H5D` for dataset access). For the I/O cost and op count graphs, we will probably want to just combine the info from each module into a single "HDF5" field. Similarly, the per-module section of the report should probably have an "HDF5" module section that characterizes both the `H5F` and `H5D` modules.
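As a concrete illustration of the "combine into a single HDF5 field" idea, a naive sketch might just sum the matching counters from the two modules. The dicts below are made-up stand-ins for the real `H5F`/`H5D` records; note that summing can inflate semantically overlapping counters (e.g., file opens plus dataset opens would both land in a combined "Opens" total), which is part of why this needs some care:

```python
def combine_into_hdf5(h5f_counts, h5d_counts):
    """Merge per-module operation counts into one "HDF5" record by summing
    counters that appear in either module (naive approach; see caveat above)."""
    combined = dict(h5f_counts)
    for op, count in h5d_counts.items():
        combined[op] = combined.get(op, 0) + count
    return combined
```

An alternative would be to keep the `H5F` values for file-level counters like opens and take dataset-level counters from `H5D` only, trading one kind of granularity loss for another.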