general HDF5 module support #663
Maybe we can just flesh out details at our next meeting, and I can update the issue once it's clearer what the plan is.
If we are going to combine module data, that could take some care. We probably want it handled in some kind of aggregator that stores it in the … For HDF5 logs, we currently only have …
Tyler had some good comments in today's meeting about collecting testing logs for this issue which I don't think my comment above captured. He suggested that we collect/create logs with known or easy-to-determine values (say we know it opened 5 files) for testing. Since we can't use the original darshan reports to confirm our outputs for HDF5-related things, this would give us a lot more confidence moving forward, especially if the data post-processing gets complicated.
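A tiny sketch of what such a known-value workload generator might look like (the file counts, sizes, and names here are invented for illustration; this is not code from the darshan repos):

```python
# Illustrative sketch only: generate I/O with known, easy-to-verify totals
# so a Darshan log captured around this run has predictable counters.
# The defaults (5 files, 100 bytes each) are arbitrary choices.
import os
import tempfile


def generate_known_io(n_files=5, n_bytes=100):
    """Open exactly n_files files and write exactly n_bytes to each."""
    tmpdir = tempfile.mkdtemp()
    payload = b"x" * n_bytes
    for i in range(n_files):
        with open(os.path.join(tmpdir, f"known_{i}.dat"), "wb") as f:
            f.write(payload)
    # Expected (if Darshan instruments this run): opens == n_files,
    # total bytes written == n_files * n_bytes.
    return n_files, n_files * n_bytes


print(generate_known_io())  # (5, 500)
```

With totals this simple, the expected counter values can be written down right next to the stored log.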
A Python-based approach like the one in darshan-hpc/darshan-logs#22 and https://github.com/tylerjereddy/heatmap_diagonal could likely be used with …
* Adds `H5F` and `H5D` specific sections to `plot_opcounts()` aggregator functions `agg_ioops()` and `gather_count_data()`. Current method does not combine `H5F` and `H5D` module data together.
* Add `H5F` and `H5D` test cases to `test_plot_exp_common` tests `test_xticks_and_labels` and `test_bar_heights`
* Contributes to issue darshan-hpc#663
Unrelated to my last comment, my understanding of how we want to combine the …

For each module the total "Opens" is 6, so if we sum these values together, we get a total of 12 "Opens" for HDF5. Is that how we want to combine the HDF5 module data? I ask because when these values are separate, I get the understanding that there were 6 instances where a file was opened and its dataset(s) were accessed. But with the values summed to 12 under a single "HDF5" banner, I feel like it gives the impression that 12 files were opened, at least at a glance. I know we are going to lose some granularity either way, but would it make more sense to use the …
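To make the two options concrete, here is a toy sketch with invented counter values (6 file-level opens, 6 dataset-level opens); these numbers are illustrative, not from a real log:

```python
# Toy counters standing in for the H5F (file-level) and H5D
# (dataset-level) darshan module data.
h5f_counts = {"Open": 6, "Read": 0, "Write": 0}
h5d_counts = {"Open": 6, "Read": 12, "Write": 12}

# Option 1: sum the modules -- "Open" inflates to 12, which at a glance
# reads as 12 file opens even though only 6 files were opened.
summed = {op: h5f_counts[op] + h5d_counts[op] for op in h5f_counts}

# Option 2: keep H5D counts for dataset operations but take "Open" from
# H5F only, preserving the file-open count.
blended = dict(h5d_counts)
blended["Open"] = h5f_counts["Open"]

print(summed["Open"], blended["Open"])  # 12 6
```

Either way some granularity is lost; the sketch just shows where the double counting comes from.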
To address #663 (comment) first, I was able to reproduce the read-invisibility of … :

```python
from mpi4py import MPI
import numpy as np
from numpy.testing import assert_array_equal
import h5py


def main():
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    # round trip HDF5 IO test
    file_path = f"./test_rank_{rank}.hdf5"
    n_bytes = 10 * (rank + 1)
    bytes_to_write = np.ones(shape=n_bytes, dtype=np.int8)
    with h5py.File(file_path, "w") as f:
        f.create_dataset("dataset", data=bytes_to_write)
    with h5py.File(file_path, "r") as g:
        retrieved_data = np.asarray(g['dataset'])
        assert_array_equal(retrieved_data, bytes_to_write)


if __name__ == "__main__":
    main()
```

The ramped-up bytes written per rank is correctly recorded, and the increase in bytes read is real according to the NumPy assertion, but completely invisible to the darshan-runtime (see below the fold). In fact, …
I'm also wondering if …
For reference, Shane gave an update on how to combine the HDF5 modules in this comment: …
As far as "ground truth" HDF5 testing logs go, these are the current cases we've been able to create successfully: …

I think we would still want to supplement these with some others where …
I suspect a single directory would be ok for some of these simple cooked-up HDF5 cases. The Python code is almost small enough to put in there too (though we haven't been doing that), or just link to the small source snippets I suppose. I guess the team might even be "ok" with just a short description of how each log was generated without needing the source connections/links.
* Update list of supported modules in `get_io_cost_df()` to include `H5F` and `H5D` modules
* Add function `combine_hdf5_modules()` to aggregate the HDF5 modules (`H5F` and `H5D`) together for the I/O cost figure
* Fix test case for `test_get_io_cost_df()` as it now includes the HDF5 module
* Add test `test_combine_hdf5_data()` to verify the HDF5 module data is being combined appropriately
* Contributes to issue darshan-hpc#663
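A hedged sketch of what a `combine_hdf5_modules()`-style aggregation could look like; the dataframe shape, column names, and values below are invented for illustration and may not match `get_io_cost_df()` exactly:

```python
import pandas as pd

# Hypothetical I/O cost layout: one row per module, average seconds spent
# in each phase. Real column/index names in get_io_cost_df() may differ.
io_cost = pd.DataFrame(
    {"Read": [1.0, 0.0, 2.5], "Write": [3.0, 0.0, 1.5], "Meta": [0.5, 0.25, 0.25]},
    index=["POSIX", "H5F", "H5D"],
)


def combine_hdf5_rows(df):
    """Collapse any H5F/H5D rows into a single combined HDF5 row."""
    hdf5_mods = [m for m in ("H5F", "H5D") if m in df.index]
    if not hdf5_mods:
        return df
    combined = df.loc[hdf5_mods].sum()  # element-wise sum of the two rows
    out = df.drop(index=hdf5_mods)
    out.loc["HDF5"] = combined
    return out


print(combine_hdf5_rows(io_cost).loc["HDF5"].tolist())  # [2.5, 1.5, 0.5]
```

Summing time columns avoids the double-counting concern raised for operation counts, since time spent in `H5F` and `H5D` routines is genuinely additive.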
* Add `H5D` or `H5F` to operation count figure modules. Only 1 module is added since the `H5D` figure encompasses both HDF5 modules' data.
* Correct expected figure count in `test_main_without_args` for the `ior_hdf5_example.darshan` case since the `H5D` section now contains an operation counts figure.
* Contributes to issue darshan-hpc#663
* Change the per-module report section titles to `Per-Module Statistics: HDF5` for `H5F` and `H5D` figures
* Fix issue darshan-hpc#663
Closing due to merge of gh-707 -- if there are residual issues discussed here, it may be easier to split them off into other issues, though I believe at least Shane's initial comment is dealt with.
* MAINT: PR 571 revisions
* Simplify `img` tag counting in `test_main_without_args`
* MAINT: Update mypy config to ignore all pandas imports
* Change mypy pandas config to ignore all imports
* Remove unnecessary type ignores for `pandas.testing` imports
* ENH: Add log path retrieval machinery
* Add module `log_utils.py` containing new function `get_log_path` for easy retrieval of logs from either local sources or the darshan-logs repo, using only the filename
* Fixes #566
* MAINT: PR 587 revisions
* Fix type hint issue
* MAINT: PR 587 revisions
* Update type hints to use shorthand versions
* MAINT: PR 587 revisions
* Make pytest import contingent on whether a pytest session is ongoing
* MAINT: PR 587 revisions
* Simplify log retrieval logic into single function
* MAINT: PR 587 revisions
* `_locate_log()` no longer moves up too many directory levels; add a related `TODO` comment to get rid of the requirement to move up levels altogether
* MAINT: PR 587 revisions
* `log_utils.py` no longer depends on a fragile environment variable `PYTEST_CURRENT_TEST` to detect if we are in a `pytest` run; this was failing even for usage of `get_log_path()` within the body of a test
* `test_main_without_args()` was adjusted to use `get_log_path()`, including one new logs repo-based log file; it appears to be running the extra test case when the logs repo is available, and skipping when the logs repo is absent
* MAINT: PR 587 revisions
* improve `get_log_path()` documentation to include a note about not using the func in pytest decorators
* ENH: Add common access table
* Add module `plot_common_access_table.py` for generating the per-module common access table
* Add testing module `test_plot_common_access_table.py` with unit tests for each function in `plot_common_access_table.py`
* MAINT: PR 600 revisions
* Remove unnecessary type ignore for `pandas.testing` import
* Fix typo
* MAINT, TST: PR 600 revisions
* Simplify `test_common_access_table`:
  - Change input from report to log path
  - Assign redundant column names inside of test
* Add comments to POSIX and MPI-IO cases mentioning values are from original report code
* Add new test case using `nonmpi_partial_modules.darshan` with comment mentioning the values are from the original report
* MAINT: PR 600 revisions
* Change name of `test_general` to `test_misc_funcs`
* Simplify `test_misc_funcs` by assigning column names for dataframes in test body
* MAINT: PR 600 revisions
* Leverage new log retrieval machinery to simplify `test_common_access_table`
* TST: PR 600 revisions
* Add `imbalanced-io.darshan` test case for further verification of new common access table outputs
* BUG, BENCH: fix time_plot_heatmap_builtin_logs
* fix the call signature of `plot_heatmap()` in `time_plot_heatmap_builtin_logs()` so that the benchmark no longer fails
* TST: tests for get_log_path()
* add some basic regression tests for `get_log_path()`; focuses on the usage in the `pytest` context, which is the main use case
* no noticeable test suite slowdown measured locally with logs repo present
* BENCH: asv benchmark for get_log_path()
* add a benchmark for repeated calls to `get_log_path()`, which is a situation likely to arise in the context of the full `pytest` suite where many log files are retrieved
* benchmark runs in 42 seconds locally: `time asv run -e -b "time_get_log_path_repeat"`

```
· Creating environments
· Discovering benchmarks
· Running 1 total benchmarks (1 commits * 1 environments * 1 benchmarks)
[  0.00%] · For pydarshan commit 6d0a48e <pydarshan-devel>:
[  0.00%] ·· Benchmarking virtualenv-py3.9-cffi-numpy-pytest
[ 50.00%] ··· Running (log_utils.GetLogPath.time_get_log_path_repeat--).
[100.00%] ··· log_utils.GetLogPath.time_get_log_path_repeat            ok
[100.00%] ··· =========== =============
              --           filename
              ----------- -------------
               num_calls   dxt.darshan
              =========== =============
                   1        97.5±0.8ms
                  10         972±8ms
                  25        2.43±0.02s
              =========== =============

asv run -e -b "time_get_log_path_repeat"  19.79s user 30.18s system 116% cpu 42.835 total
```

* MAINT: PR 609 revisions
* add an `__init__.py` in `examples/darshan-graph/` so that `setuptools` will install the `*.darshan` files in that path as well
* ENH: faster get_log_path()
* this is a precursor to using `get_log_path()` ubiquitously in the test suite, which itself is a precursor to being able to run the test suite from any directory (including a user/developer with a `spack` install of `py-darshan`, which is our primary planned vehicle for initial delivery of the new report capability)
* achieve nearly constant-time log filepath retrieval using a dictionary/hash table
* the speedups on repeated calls, the most important scenario for running in `pytest`, are several orders of magnitude using the new benchmark in gh-610: `asv continuous pydarshan-devel treddy_get_log_path_dict -e -b "time_get_log_path_repeat"`

```
       before           after         ratio
     [6d0a48e]       [47e9a462]
     <pydarshan-devel>       <treddy_get_log_path_dict>
-      97.6±2ms       962±100ns     0.00  log_utils.GetLogPath.time_get_log_path_repeat(1, 'dxt.darshan')
-       974±7ms      3.14±0.1μs     0.00  log_utils.GetLogPath.time_get_log_path_repeat(10, 'dxt.darshan')
-       2.43±0s      7.31±0.3μs     0.00  log_utils.GetLogPath.time_get_log_path_repeat(25, 'dxt.darshan')

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
```

* the new tests from gh-609 also seem to pass with this branch, though it would probably be prudent to review/fix/merge that first..
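The dictionary/hash-table idea can be sketched roughly like this (function names and the `lru_cache` approach are illustrative, not PyDarshan's actual implementation):

```python
# Sketch of constant-time log path retrieval: walk the log directories
# once, then answer repeat lookups from a dict. Names here are invented,
# not the real log_utils.get_log_path() internals.
import os
from functools import lru_cache


@lru_cache(maxsize=None)
def _build_log_index(root):
    """Walk root once and map each *.darshan filename to its full path."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            if fname.endswith(".darshan"):
                # keep the first match for a given filename
                index.setdefault(fname, os.path.join(dirpath, fname))
    return index


def get_log_path_cached(filename, root="."):
    """O(1) lookup after the first call for a given root."""
    return _build_log_index(root).get(filename)
```

The one-time `os.walk` cost is amortized over the many lookups a full `pytest` run performs, which is consistent with the microsecond-scale repeat timings in the benchmark above.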
* MAINT: PR 611 revisions
* remove extraneous space highlighted by reviewer
* MAINT: Close dxt heatmap matplotlib figures
* Fixes issue when running tests where many figures are opened and not explicitly closed, generating a pytest warning
* TST: Add `pytest.ini`
* Add `pytest.ini` to upgrade `matplotlib`-generated RuntimeWarnings into errors
* MAINT: Remove unused import in `test_plot_common_access_table.py`
* MAINT: refactor assets
* move `examples` and `tests` folders inside `darshan` package so they can be easily installed as elements of the package, and prevent us from having to search more broadly on the filesystem to locate `*.darshan` file assets (and to avoid monopolizing `import tests`, `import examples`, which is crazy)
* adjust setup.py accordingly--there is now only a `darshan` package to `install`, not separate `tests` and `examples` packages
* migrate the test suite over to `get_log_path()` as needed to empower a portable approach to running tests (i.e., `pytest --pyargs darshan` from any directory)
* `_locate_log()` no longer needs to go searching around on the filesystem outside of our own packages, so get rid of that..
* on this branch, in some random directory, `pytest --pyargs darshan -n 8` produces: `255 passed in 88.93s (0:01:28)`
* this is in line with latest `pydarshan-devel` run of the full suite before migration to `get_log_path()` and reorganization: `pytest --import-mode=importlib -n 8` `255 passed in 87.78s (0:01:27)`
* enforce support for `pytest --pyargs darshan` from any dir in the CI (using a random temp dir there)
* things reviewers could do to help here:
  - check the large diff for potential errors
  - check/fix up the `asv` benchmark suite for the effects of this branch
  - check if the coverage report stuff is still working "ok" with new `pytest` portability/incantation
* MAINT: PR 612 revisions
* rename `example-logs` to `example_logs` for `mypy`, which is best practice anyway
* ignore some old tutorial file that `mypy` doesn't like
* MAINT: PR 612 revisions
* adjust the CI `mypy` check to only verify the `darshan` folder, because the testing folder is now inside the main package
* MAINT: PR 612 revisions
* changes to make `asv check` pass; we really probably should just use `get_log_path()` at this point though...
* MAINT: fixup setup.py for example_logs
* MAINT: PR 612 revisions
* apply Nik's suggested patch to `test_issue_590()`
* remove a redundant `--cov=` command in the `pytest` CI incantation, as suggested by Nik
* CI: debug work for codecov.
* MAINT: move remaining tests and remove `/examples`
* Move tests `test_plot_common_access_table.py` and `test_log_utils.py` into `darshan/tests`
* Remove unnecessary `__init__.py` from `examples/darshan-graph`
* Fix issue #626
* ENH: Add dataframe/table figure support
* Update method `generate_img` name to more general `generate_fig` and update documentation to reflect the changes
* Generalize `generate_fig` to support figure functions that generate pandas dataframes
* Add test `test_generate_fig_unsupported_fig_type` to check appropriate error is raised when input function does not generate the supported figure types
* Fix issue #550
* BUG: PR 618 revisions
* Change default `to_html` arguments to remove index instead of header
* MAINT: PR 618 revisions
* Add object `DarshanReportTable` to `plot_common_access_table.py` for storing tables in dataframe and html format. Should set a standard that allows access for testing but also puts the burden on the figure function to generate a `DarshanReportTable` that contains the desired html table for the report.
* Change `plot_common_access_table` to return `DarshanReportTable` and update documentation
* Fix `test_common_access_table` due to above change
* `ReportFigure.generate_fig` changes:
  - When checking for `matplotlib`/`seaborn` figures, check for the attribute `savefig` instead of checking figure type
  - When checking for table figures, check for `DarshanReportTable` type instead of `pd.DataFrame`
  - Remove `supported_mpl_fig_types` and update error message to simply state that a given figure type is not supported
* Remove unused imports, add import for `plot_common_access_table.py`
* MAINT: rm redundant sample.darshan
* remove a redundant copy of `sample.darshan`
* ENH: Add common access size table to darshan summary report
* Add "Per-Module Stats" section to report containing the common access size table (to start)
* Extend `test_main_without_args` to check for number of tables generated as a regression guard
* MAINT: Update return type for `plot_common_access_table`
* Update return type for `plot_common_access_table` from generic `Any` to `DarshanReportTable`
* BUG: correct section closing tag in `base.html`
* Relocate section closing tag to outside of figure loop
* TST: Add regression guard for section closing tags
* Add check to test `test_main_without_args` to verify the number of close/open section tags match
* TST: Add regression guard for section closing tags
* Add check to test `test_main_all_logs_repo_files` to verify the number of close/open section tags match for all log files
* ENH: Add partial module data flag
* Add module `plot_flag` with functions for generating a generic warning flag with configurable warning message
* Add warning flags to summary report module table for modules marked as having partial module data
* `test_module_table()` changes:
  - Add tests for logs `partial_data_stdio.darshan` and `partial_data_dxt.darshan` for checking cases with partial module data
  - Adjust actual and expected module dataframes to reflect changes in the module table
* Fix gh-498
* MAINT: purge example2.darshan
* `example2.darshan` is a duplicate of `example.darshan`, which itself is a duplicate of `sample.darshan`
* for now, just purge `example2.darshan` as a small cleanup; I have not touched the two Jupyter notebooks that used `example2.darshan`, but I suspect Jakob is "ok" with that for now
* MAINT: cleanups after main merge.
* MAINT: PR 634 revisions
* Remove module `plot_flag.py`
* Change partial flag to use unicode warning symbol over matplotlib figure
* Update `test_module_table` to check for unicode warning symbol
* ENH: Add access size histogram to summary report
* Add access size histogram figure from `plot_access_histogram` to summary report
* Update expected `img` tag counts in `test_main_without_args` to account for new access size histogram figures
* Rework table counting in `test_main_without_args` to ignore extraneous instances of the word "table", such as the instance in the figure caption for the access size histogram
* Contributes to issue #465
* MAINT: Correct `nonmpi_partial_modules.darshan` filename
* Correct all instances of `nonmpi_partial_modules.darshan` to `nonmpi_dxt_anonymized.darshan` after the following darshan-logs PR was merged: darshan-hpc/darshan-logs#19
* MAINT: PR 634 cleanup
* Fix comment in `test_module_table()`
* ENH: Add `H5D` module support to "Access Size Histogram"
* Add `H5D` to supported module list for `mod_agg_iohist`
* Add `H5D` access histogram to summary report
* Add test cases for `ior_hdf5_example.darshan`:
  - `test_main_without_args()` in `test_summary.py`
  - `test_xticks_and_labels()` and `test_bar_heights()` in `test_plot_exp_common.py`
* Add pydarshan low-level backend function to access HEATMAP records.
* Indentation fix, missing mod_name value.
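The `DarshanReportTable` idea above can be sketched as a small wrapper that keeps the dataframe (easy to test against) and the rendered HTML (what the report embeds) side by side; the attribute names and constructor here are guesses for illustration, not the actual PyDarshan API:

```python
import pandas as pd


class DarshanReportTable:
    """Illustrative sketch: pair a dataframe with its HTML rendering."""

    def __init__(self, df, **to_html_kwargs):
        self.df = df  # structured data, convenient for assertions in tests
        self.html = df.to_html(**to_html_kwargs)  # markup for the report


table = DarshanReportTable(
    pd.DataFrame({"Access Size": [1024], "Count": [7]}),
    index=False,  # drop the index column, as in the PR 618 revision
)
print("1024" in table.html)  # True
```

This matches the stated design goal: tests assert on `df`, while the figure function owns producing the exact HTML the report needs.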
* Add heatmap support to backend.log_get_record
* ENH: Add grid formatting to darshan summary report
* Add and implement new CSS class to allow for grid formatting of figures in each summary report section
* Contributes to issue #517
* ENH: Update summary report section titles
* Change "Log Details" to "Job Summary"
* Change "Module Breakdown" to "Darshan Log Information"
* Change "I/O Operations" to "I/O Summary"
* Change "Per-Module Stats" to "Per-Module Statistics"
* Contributes to issue #641
* ENH: Add Darshan website link to summary report footer
* Contributes to issue #641
* Add HEATMAP support to DarshanReport class. Add heatmap datatype.
* Update devel build recipe to also generate configure scripts using prepare.sh needed for HEATMAP support to work.
* Do not use self.data[] on report.
* CI: remove `type: ignore` instances
* ENH: Update summary report figure captions
* Update captions for DXT heat map, I/O cost, and Access Size figures according to suggestions from Phil
* Contributes to issue #641
* MAINT: PR 647 Revisions
* Change I/O cost graph caption from "by process type" to "by access type"
* Fix regression if/elif leading to lustre records being handled wrongly.
* Raise qualified exceptions.
* Fix: Local variable name_records is assigned but never used.
* BUG: Change DXT heatmap to use `nprocs` for y-axis scale and correct bar graph anomalies
* Use `nprocs` to scale y-axis for heatmap figure instead of number of ranks with IO activity
* Add new function `get_filled_hmap_df` for filling data gaps in heatmap dataframe. If a rank has no data it will now have a dedicated row filled with zeros.
* Alter method of y-axis flipping to simply flip the order of the input data array since `invert_yaxis()` creates inconsistencies between heatmap and horizontal bar graph. This was observed in the following log file for the `DXT_POSIX` module: `snyder_acme.exe_id1253318_9-27-24239-1515303144625770178_2.darshan`
* Correct tests for cases where not all ranks had non-zero bins
* Fixes issue #576
* Contributes to issue #575
* MAINT: PR 622 revisions
* Remove `get_filled_hmap_df` and function calls
* Add `nprocs` parameter to `get_heatmap_df` and update documentation
* Update `get_heatmap_df` to return dense dataframe based on `nprocs`
* Correct tests in `test_heatmap_handling.py` to reflect dense heatmap dataframe output
* Revert y-axis flipping method back to `ax.invert_yaxis()`
* MAINT: PR 622 revisions
* Correct misc grammar mistakes
* MAINT: PR 622 revisions
* fix `time_get_heatmap_df()` for function signature change
* BENCH: peakmem benchmarks for heatmap
* add some peak memory tracking `asv` benchmarks for the DXT heatmap code, because the decision between sparse and dense data structures requires statistically robust measurements of both performance and memory impact
* the `GetHeatMapDf` benchmarks on synthetic data now include a spread of data densities to better reflect the reality that many MPI applications only have a subset of IO-active ranks
* Include heatmap in read_all_records, add guard for HEATMAP.
* Doc: fix incomplete sentence, remove comments from backend left in for potential addition of resolve name_record as part of dict.
* Remove changes that increased diff to files unrelated.
* Don't add self.data['heatmaps'] to report.
* Add hash for MPIIO:heatmap to resolve table.
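The densification step described above (every rank gets a row, zeros for ranks with no I/O) can be sketched without pandas; the real `get_heatmap_df()` works on dataframes, so the input/output shapes here are simplified stand-ins:

```python
# Illustrative sketch: densify per-rank heatmap bins so the y-axis can be
# scaled by nprocs. sparse_bins maps rank -> per-time-bin byte counts;
# ranks absent from the mapping recorded no I/O.
def densify_heatmap(sparse_bins, nprocs, nbins):
    """Return one row per rank 0..nprocs-1, zero-filled where data is missing."""
    return [list(sparse_bins.get(rank, [0] * nbins)) for rank in range(nprocs)]


dense = densify_heatmap({0: [4, 0, 1], 3: [0, 2, 2]}, nprocs=4, nbins=3)
print(dense)  # [[4, 0, 1], [0, 0, 0], [0, 0, 0], [0, 2, 2]]
```

In pandas terms this is essentially a reindex over `range(nprocs)` with a zero fill value, which is what makes the heatmap rows line up with the `nprocs`-scaled axis.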
* ENH: Add operation counts figure to summary report
* Add operation counts figure from `plot_opcounts.py` to summary report
* Update expected `img` tag counts in `test_main_without_args` to account for new operation count figures
* Fix `img` tag string such that extraneous instances of "img" are not included in total count
* Close issue #465
* MAINT: PR 654 revisions
* Change "Operation Counts" figure caption to "Histogram of I/O operation frequency."
* MAINT: Update darshan summary report metadata table runtime datatype
* Change method `ReportData.get_runtime()` to use integers instead of floats
* Correct tests `test_metadata_table` and `test_get_runtime` to use integer values
* MAINT: Update partial flag message
* Change partial flag message to "Module data incomplete due to runtime memory or record count limits"
* Correct test `test_module_table` to reflect updated partial data flag message
* MAINT: Update summary report metadata table executable description
* Change executable description to "Command Line"
* Update row names for expected dataframes in test `test_metadata_table`
* MAINT: Add suffix to module data table entries
* Suffix module descriptions in "Darshan Log Information" table with "Module Data"
* Correct test `test_module_table` such that expected dataframes contain the updated row names
* Typos and switch runtime to ValueError.
* TST: xfail HEATMAP tests
* mark the two test cases requiring `HEATMAP` module support as known failures (`xfail`) for now, so that PRs in the main and logs repos can pass CI until the support is added in
* because the two tests are not parametrized, the `xfail`s are imperative, which means that only those inputs that precede the `xfail`ed logs will run for now, so I wouldn't want to leave these in for long, just long enough to avoid rushing the HEATMAP support addition/review
* an alternative is to simply `pass`/`continue` for these test cases, which would allow all of the other cases for the two tests to run as usual, though it would leave out the convenient `pytest` reminder that the tests have known failures in them
* MAINT: Reorganize summary report metadata sections
* Move "Log Filename", "Runtime Library Version", and "Log Format Version" metadata into "Darshan Log Information" section of summary report
* Correct tests `test_metadata_table` and `test_module_table` to reflect new organization of log metadata
* Close issue #641
* TST: graceful HEATMAP xfails
* only test cases that require runtime `HEATMAP` module support are `xfail`ed in the test suite now, instead of also skipping the cases that also come after them in iterable tests
* while the original design was meant to be temporary, and was made intentionally on the assumption that HEATMAP support would be added shortly, this introduces parametrization into the two tests that are involved and does away with the old fixtures
* on `pydarshan-devel`: `300 passed, 2 xfailed in 38.83s`
* on this branch: `318 passed, 6 xfailed in 96.56s (0:01:36)`
* there are 3 runtime heatmap logs, and two tests that need to avoid them, so the xfail arithmetic looks correct there, along with the restoration of a bunch more logs cases...
* MAINT: PR 675 revisions
* handle the absence of the logs repo with new `HEATMAP` `xfail` machinery..
* Add pydarshan low-level backend function to access HEATMAP records.
* Indentation fix, missing mod_name value.
* Add heatmap support to backend.log_get_record
* Add HEATMAP support to DarshanReport class. Add heatmap datatype.
* Update devel build recipe to also generate configure scripts using prepare.sh needed for HEATMAP support to work.
* Do not use self.data[] on report.
* Fix regression if/elif leading to lustre records being handled wrongly.
* Raise qualified exceptions.
* Fix: Local variable name_records is assigned but never used.
* Include heatmap in read_all_records, add guard for HEATMAP.
* Doc: fix incomplete sentence, remove comments from backend left in for potential addition of resolve name_record as part of dict.
* Remove changes that increased diff to files unrelated.
* Don't add self.data['heatmaps'] to report.
* Add hash for MPIIO:heatmap to resolve table.
* Typos and switch runtime to ValueError.
* MAINT: PR 615 revisions
* add some tests for runtime HEATMAP functionality
* WIP, ENH: add HDF5 support to `plot_opcounts()` aggregators
* Adds `H5F` and `H5D` specific sections to `plot_opcounts()` aggregator functions `agg_ioops()` and `gather_count_data()`. Current method does not combine `H5F` and `H5D` module data together.
* Add `H5F` and `H5D` test cases to `test_plot_exp_common` tests `test_xticks_and_labels` and `test_bar_heights`
* Contributes to issue #663
* For newly introduced exception to indicate 'ModuleNotInDarshanLog' inherit from ValueError.
* TST: remove heatmap xfail
* we no longer need to mark runtime `HEATMAP` tests as known failures, since the binding support has been merged now
* MAINT: PR 691 revisions
* restore a narrower `xfail` because `e3sm_io_heatmap_and_dxt.darshan` uses too much memory for CI, for now
* MAINT: PR 685 revisions
* Update `plot_opcounts.gather_count_data()` to return combined data for `H5D` module
* Correct `H5D` test cases in `test_xticks_and_labels` and `test_bar_heights`
* Add `mod` parameter to `plot_opcounts` docstring with a note explaining the special behavior when `H5D` is input
* MAINT: refactor `get_by_avg_series()`
* Refactor `get_by_avg_series()` such that all modules output `pd.Series` objects of the same shape. This is specifically targeted at the `H5F` module, which previously produced a data array of length 1 since it contains only "Meta" data.
* ENH: Add HDF5 support to I/O cost figure
* Update list of supported modules in `get_io_cost_df()` to include `H5F` and `H5D` modules
* Add function `combine_hdf5_modules()` to aggregate the HDF5 modules (`H5F` and `H5D`) together for the I/O cost figure
* Fix test case for `test_get_io_cost_df()` as it now includes the HDF5 module
* Add test `test_combine_hdf5_data()` to verify the HDF5 module data is being combined appropriately
* Contributes to issue #663
* MAINT: PR 685 revisions
* Change labels to use the module name + operation name (i.e. `H5F Open`, `H5D Read`, etc.)
* WIP: draft work on data access by filesystem section
* MAINT: plot adjustments
* Filesystem -> Category for plot titles as a generalization based on recent comments from Phil
* initial placement of counter values next to horizontal bars in the read/write count plots (probably not perfect yet)
* DEBUG: more debug work
* ENH: PR 397 revisions
* draft in support for counting/plotting bytes, including a number of new functions to further reduce code duplication
* BUG: PR 397 revisions
* added a regression test for `empty_series_handler()` not actually filling in the index:value pairs when a filesystem is missing from the series index, but is present in other parts of the control flow
* fix the bug in `empty_series_handler()`
* MAINT: PR 397 revisions
* the `empty_series_handler()` function can be simplified to a single `reindex` operation, so do that
* MAINT: PR 397 revisions
* fixed the spelling of `Parameters` in several docstrings
* removed the `:` after some `Parameters` and `Returns` docstring fields, for consistency
* MAINT: PR 397 revisions
* fixed an issue where the `df_reads` parameter was repeated in the `process_byte_counts()` function docstring
* TST: add more tests
* add a regression test for the function `convert_file_path_to_root_path()`
* add a regression test for the function `convert_file_id_to_path()`
* add a regression test for the function `identify_filesystems()`
* add a regression test for the function `rec_to_rw_counter_dfs_with_cols()`
* add a regression test for the function `check_empty_series()`
* add a regression test for the function `process_byte_counts()`
* TST: PR 397 revisions
* add regression test for the function `process_unique_files()`
* add regression test for the function `unique_fs_rw_counter()`
* TST, MAINT: simplify index naming PR 397
* a few tests have been simplified in PR 397 to name the index directly inside the pytest parameters instead of the more verbose setting inside the tests proper
* TST, MAINT: move plot_data under test
* place the main plotting function, `plot_data()`, under crude initial test
* MAINT: PR 397 revisions
* move more plotting code/logic into the main code module to provide a more convenient plotting entrypoint
* TST, ENH: remove right side spline
* add a regression test and fix for the issue of value labels overlapping with the axis right side spline, as reported in the review of PR 397
* BUG: scale font size y annotation
* add a regression test and fontsize-adjusting fix for the case where the log file `noposixopens.darshan` causes the y axis labels (annotations) to overlap with the plot itself (because the strings were longer)
* BUG: fix 0 count positions
* add a regression test and fix for the case where there is no POSIX read/write activity and text labels are observed too far to the right side of the subplots
* MAINT: PR 397 revisions
* add regression test and fix for appropriate error handling when the `POSIX` module data is completely absent from a darshan log
* MAINT: PR 397 revisions
* address reviewer comment to simplify `ax_filesystem_bytes.barh()` calls
* MAINT: PR 397 revisions
* removed debug prints from `plot_data()` function
* address reviewer comments related to `verbose` mode for the function `plot_with_log_file()`
* address `mypy` `List`->`Sequence` parameter reviewer comments
* MAINT: PR 397 revisions
* `df_reads` -> `df_writes` for incorrectly named parameter in the function `process_unique_files()`
* add type hints to `plot_data()` function, ignoring the pointless `pandas` "types" for now
* add a basic docstring to the `plot_data()` function
* MAINT: PR 397 revisions
* add a missing error to the `Raises` section of the `unique_fs_rw_counter()` function
* MAINT: PR 397 revisions
* fix the spacing in `plot_data()` such that the regression test provided here passes: #397 (comment)
* adjust `test_plot_data()` to match the spacing changes above
* MAINT: PR 397 revisions
* adjust the integer file counters such that they are displayed in scientific notation by `plot_data()`, as requested during PR review
* adjust `test_plot_data()` to require scientific notation for the unique file counters
* MAINT: PR 397 revisions
* add the performance improvements suggested here: #397 (comment)
* fix some issues with those changes
* MAINT: PR 397 revisions
* enforce shared axes in a column for "data access by category"
* use a log scale for the x axes for column differences greater than two orders of magnitude
* MAINT: PR 397 revisions
* we no longer ignore the `STD..` data streams in the data access by category analysis/plots
* these streams are now displayed using their proper name or `anonymized` if they are coming from an anonymized darshan log file
* add regression tests for the above behavior
* MiB Conversion: Replaced 2**20 approximation with 1048576
* Function Docstring: Added a docstring to `convert_id_to_arrays`
* Filesystem Variables: Use assigned filesystem variables "files_read" and "files_written" instead of multiple instances of series indexing.
* All Branch CI: Enabled CI on all branches for automated testing on pull requests.
* MiB Conversion: Replaced approximate conversion factor with actual value. Temporarily enabled CI for all branches.
* Restore CI: Removed temporary CI activation.
* MAINT: PR 397 revisions
* `plot_data()` now labels the x axis for plot columns that use log scaling so that usage of log scales is clearer to the user
* `plot_data()` now uses a symmetric log function to avoid artefacts near zero for the log axis (plot spanning far too many orders of magnitude)
* the `ratio` used to determine whether a column of plots should be log scaled now avoids division by zero, and makes an assessment based on `MiB` values instead of raw bytes
* add a new test fixture for selecting individual log repo files by name
* add a new regression test, `test_log_scale_display()`, for ensuring that the x axis gets properly labelled when a log scale is appropriate
* MAINT: PR 397 revisions
* `test_log_scale_display()` is now properly skipped when the darshan-logs project is not available
* MAINT: PR 397 revisions
* `rec_to_rw_counter_dfs()` no longer uses the "deprecated" `mod_read_all_records(..., dtype=...)` approach
* TST, MAINT: more log scaling fixes
* only scale `ratio` in `plot_data()` for bytes
* `test_plot_data_shared_x_axis()` now tests a case where symmetric log scaling is used in both columns
* MAINT: PR 397 revisions
* `mypy` fixups for my branch after rebase, since I ignore more `pandas` imports by default than `pydarshan-devel`
* MAINT: PR 397 revisions
* add a fix and regression test for scaling the vertical size of the plots based on the number of categories/file systems
* MAINT: PR 397 revisions
* allow plotting the first `N` categories
* Fixup after rebasing.
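The log-scaling heuristic those bullets describe (compare in `MiB`, guard against division by zero, switch to a symmetric log axis past ~two orders of magnitude) could be sketched roughly as follows. This is only an illustration; `needs_log_scale` and its threshold parameter are hypothetical names, not the actual `plot_data()` internals:

```python
import math

MIB = 1048576  # bytes per MiB, matching the exact conversion factor noted above


def needs_log_scale(byte_counts, max_orders=2):
    """Return True when the nonzero values in a column span more than
    `max_orders` orders of magnitude (compared in MiB), guarding against
    division by zero for all-zero columns."""
    mib_vals = [b / MIB for b in byte_counts if b > 0]
    if not mib_vals:
        return False  # no nonzero data: no meaningful ratio, keep a linear axis
    ratio = max(mib_vals) / min(mib_vals)
    return math.log10(ratio) > max_orders
```

With matplotlib, a column flagged this way would then get `ax.set_xscale("symlog")`, which behaves linearly near zero and so avoids the artefacts a plain log axis produces for zero-valued bars.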
* MAINT: PR 397 revisions
* adjust the `test_data_access_by_filesystem` module to use the newer `get_log_path()` machinery for retrieving the absolute paths to darshan test log files
* MAINT: PR 397 revisions
* remove extraneous `mypy` type ignores for `pytest` now that we require a newer `pytest` version that provides types in the main repo
* MAINT: PR 397
* adjust `plot_data()` to use `va="center"` for annotations per reviewer request, and enforce via regression test
* MAINT: PR 397 revisions
* adjust the `plot_data()` `text()` read/write labels to use a centered vertical alignment per reviewer request; add a regression test to enforce this
* MAINT: PR 397 revisions
* `plot_with_log_file()` has been renamed to `plot_with_report()` and accepts a `DarshanReport` object instead of a log file path, based on reviewer suggestions
* `plot_data()` now caps the number of subplots based on `num_cats`, and this is now enforced with a regression test that also checks for "collapsed" layouts
* `plot_with_report()` no longer saves a `.png` file to disk (enforced with a test), based on reviewer feedback, for consistency with many of the other report-related Python plotting functions
* remove the now-extraneous `plot_filename` argument from `plot_with_report()`
* remove extraneous `tmpdir` usage in several tests, because `plot_with_report()` no longer generates a file artifact by default
* MAINT: PR 397 revisions
* `plot_with_report()` now plots data in descending order of bytes IO activity, per reviewer request; includes a regression test
* MAINT: PR 397 revisions
* `convert_file_path_to_root_path()` has been adjusted to handle anonymized files on the root file system, based on feedback from Shane, along with a regression test for the plotted results no longer having `//<digits>` root paths
* some fixes to `test_cat_labels_std_streams()` because of the reverse sorting of categories by bytes IO
* fixes to `test_plot_with_report_proper_sort()` related to handling anonymous files on the
root path, and for using the correct index for the third row category label
* MAINT: PR 397 revisions
* use `get_log_path()` instead of some recently-removed `pytest` fixtures in `test_data_access_by_filesystem.py`
* MAINT: PR 397 revisions
* draft adjustments and testing to account for `STDIO` in the "data access by category" analysis/plots
* MAINT: PR 700 revisions
* Restrict the regex pattern in `combine_hdf5_modules()` to only collect `H5F` and `H5D` module names
* Clean up an import in `test_plot_io_cost.py`
* Fix a comment in `test_combine_hdf5_modules`
* MAINT: Use runtime for DXT heatmap x-axis scaling
* Update `plot_heatmap()` to use the runtime for scaling the x-axis
* Add a `tmax` parameter to `get_heatmap_df` to allow for setting the heatmap maximum time. Default behavior is to use the final DXT segment end time.
* Add logic to `plot_heatmap` to prevent truncation of data when using the job runtime for setting plotting boundaries
* Change `set_x_axis_ticks_and_labels()` and `get_x_axis_tick_labels()` to use the runtime for x-axis labels
* Update `set_x_axis_ticks_and_labels()` parameters and expected xticklabels in `test_set_x_axis_ticks_and_labels()`
* Contributes to the x-axis rescaling portion of issue #575
* MAINT: PR 696 revisions
* Revert changes to `get_heatmap_df()`
* Correct the synthetic test case and add full `runtime` calculation logic to the test `test_set_x_axis_ticks_and_labels`
* MAINT: PR 696 revisions
* Add a formal parameter `bin_max` to `set_x_axis_ticks_and_labels` and `get_x_axis_ticks` to allow for easy scaling of x-axis tick mark locations, and update the relevant documentation
* Update `test_set_x_axis_ticks_and_labels`:
  - Add complete x-axis limit setting logic
  - Correct the `expected_xticks` to more realistic values
  - Relax the tolerances on the actual/expected xticks comparison
* ENH: Add HDF5 operation counts figures to summary report
* Add `H5D` or `H5F` to operation count figure modules.
Only 1 module is added, since the `H5D` figure encompasses both HDF5 modules' data.
* Correct the expected figure count in `test_main_without_args` for the `ior_hdf5_example.darshan` case, since the `H5D` section now contains an operation counts figure.
* Contributes to issue #663
* MAINT: Change `H5D` and `H5F` figure section titles to HDF5
* Change the per-module report section titles to `Per-Module Statistics: HDF5` for `H5F` and `H5D` figures
* Fix issue #663
* MAINT: Include `H5D` columns in `H5F`-only operation counts figure
* Change the operation counts figure to always list columns for both the `H5F` and `H5D` modules. When `H5D` module data is not present, the values are set to zero.
* Contributes to issue #706
* CI: bump GHA action versions
* follow NumPy in bumping to the latest versions of the actions
* TST: Close figures in `test_bar_heights`
* Add the appropriate close statement to ensure each figure in `test_plot_exp_common.test_bar_heights` is closed.
* Fix issue darshan-hpc/darshan-logs#34
* TST: Add HDF5 test cases using "ground truth" logs
* Add "ground truth" testing log cases to `test_common_access_table` and change the number of rows to be configurable based on the number of rows in the input `expected_df`
* Add "ground truth" testing log cases to the `test_plot_exp_common.py` tests `test_xticks_and_labels` and `test_bar_heights` to check results for `plot_opcounts` and `plot_access_histogram`
* Fix issue #706
* ENH: Runtime heatmap multiple operation support
* Add multiple operation (i.e., read and write) support to the `Heatmap.to_df()` method. This mirrors the way some DXT heatmap functions operate, which will allow for a more seamless integration of the runtime heatmap into `plot_heatmap()`.
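The zero-filling behavior described above (always list `H5D` columns, report zeros when that module is absent) can be illustrated with a small sketch. The field names and dict-based records here are made up for illustration; they are not the actual darshan counter names or figure code:

```python
from typing import Dict, Optional

# hypothetical operation names standing in for real H5D counters
H5D_FIELDS = ("Opens", "Reads", "Writes", "Flushes")


def hdf5_op_columns(h5f: Dict[str, int], h5d: Optional[Dict[str, int]]) -> Dict[str, int]:
    """Build one combined set of operation-count columns that always lists
    both H5F and H5D entries; when no H5D data is present, its columns are
    reported as zero rather than omitted."""
    combined = {f"H5F {name}": count for name, count in h5f.items()}
    h5d = h5d or {}
    for name in H5D_FIELDS:
        combined[f"H5D {name}"] = h5d.get(name, 0)
    return combined
```

This keeps the figure layout stable across logs regardless of whether dataset-level instrumentation was enabled.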
* Correct call signatures for the `Heatmap.to_df()` method
* Add a test, `test_heatmap_operations`, for checking different operation configurations and comparing their results
* Fix issue #697
* MAINT: PR 702 revisions
* Simplify dataframe summation in `Heatmap.to_df()`
* TST: PR 702 revisions
* Add a test, `test_heatmap_df_invalid_operation`, for verifying that `ValueError` is raised appropriately in the `Heatmap.to_df()` method
* ENH: summary with runtime heatmaps
* add the runtime (`HEATMAP` module) heatmaps to the pydarshan/Python summary reports
* adjust some tests to reflect the new runtime `HEATMAP` support and to test some components of it in a bit more detail
* design considerations:
  - at the moment, the runtime heatmaps are placed first, but it may be helpful to grid, e.g., the POSIX versions of the DXT and runtime maps side-by-side to make the comparisons less jarring
  - it is a bit awkward that `DXT` data is directly associated with the module data source like `DXT_POSIX`, while for `HEATMAP` we index into a "submodule" dictionary--this awkwardness should be clear in the diff--not sure if we want to do anything about that
  - the new `determine_hmap_runtime()` function was added to provide a common abstraction when plotting heatmap data (i.e., this gives you access to DXT time resolution if available, even when only calling `plot_heatmap()` for the `HEATMAP` module); this should be "ok" even if in the future we disable plotting the `DXT` data for large cases, because it is mostly the plotting-related expansion of the data that causes the memory explosion rather than raw `DXT` access
  - it is a bit awkward that some of this machinery now resides in `plot_dxt_heatmap.py` but now supports modes of operation that don't require DXT data at all--since this is mostly a naming issue it might be deferred for a while
* BUG, TST: Fix issue 717
* Fix issue #717
* Add regression test `test_issue_717`
* WIP, ENH: summary with data access cats
* add the "Data Access by Category" figure to the Python
summary report, and adjust tests to reflect the additional figure and support for both POSIX and STDIO with this fig
* some issues to think about:
  - the new figure is ridiculously large compared to the previous figure in the report, and the option to specify the width of the figure in the `ReportFigure` has no effect
  - if we have a sample log that lacks both `POSIX` and `STDIO`, then `test_posix_absent` might be suitably replaced instead of deleted (may help if test coverage isn't 100% anymore)
  - the code changes here contain comments about an eventual API redesign; the issue there should hopefully be clear from the diff
  - the test suite ran in 3:34 on 6 cores locally--didn't quantify the slowdown relative to `pydarshan-devel` with the new analysis/figs
* MAINT: PR 718 revisions
* use a more reasonable pixel width in the summary report now that the value actually gets passed through
* adjust the new figure caption based on reviewer feedback
* MAINT: PR 715 revisions
* adjust the "missing heatmap" warning message, and related tests, based on reviewer feedback
* add code/tests to enforce side-by-side heatmap module comparisons--it seems to produce the described grid layout when both `HEATMAP` and `DXT` are present; note, however, that neither is currently constrained to a single-column layout when just one type of heatmap data is present
* minor simplification to use `self.report.heatmaps`
* placing an LRU cache on `determine_hmap_runtime()` cuts local processing time for `e3sm_io_heatmap_and_dxt.darshan` down from `2:49.53` to `2:22.12`
* improving the algorithm used by `get_heatmap_df()` allows this branch to vastly outperform `pydarshan-devel` for processing `e3sm_io_heatmap_and_dxt.darshan`; this branch is now `1:15.89` vs.
`2:08.91`--that's almost twice as fast while still doing more work; this goes back to my request: as I mentioned almost a year ago here (#396 (comment)), this DXT processing code is just a draft that deserved careful algorithmic improvement
* MAINT: try mask indices for 3.6
* MAINT: PR 715 revisions
* add a Python `3.6` codepath to `get_heatmap_df()`
* BUG: cats only allow POSIX/STDIO
* only allow `POSIX` and `STDIO` records to get passed through the control flow that produces the "Data Access by Category" plots
* add one new test case and adjust old cases to reflect the more selective plotting of categories
* MAINT: PR 678 revisions
* try pinning pytest for now
* Revert "MAINT: PR 678 revisions" (this reverts commit 3108230)
* Try older pytest pin.
* Revert "Try older pytest pin." (this reverts commit cf9f4a3)
* MAINT: PR 722 revisions
* don't use `get_log_path()` in a `pytest` decorator, it won't skip properly
* remove stale file missed in previous merge
* small whitespace fixes
* missed a stale autoconf file
* change CI git branch to main

Co-authored-by: Nikolaus Awtrey <nawtrey@lanl.gov>
Co-authored-by: Tyler Reddy <treddy@lanl.gov>
Co-authored-by: Tyler Reddy <tyler.je.reddy@gmail.com>
Co-authored-by: Jakob Luettgau <jluettga@utk.edu>
Co-authored-by: JacobDickens <Lcjacobpd@gmail.com>
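The commit notes above credit the `get_heatmap_df()` speedup to algorithmic improvement without spelling out the change. The core operation that routine has to perform--proportionally distributing each DXT segment's duration over uniform time bins--can be sketched in plain NumPy like this (a simplified stand-in, not the actual PyDarshan implementation):

```python
import numpy as np


def bin_segment_durations(starts, ends, tmax, nbins):
    """Distribute each [start, end] I/O segment across `nbins` uniform time
    bins spanning [0, tmax], crediting each bin with the overlapping time."""
    edges = np.linspace(0.0, tmax, nbins + 1)
    totals = np.zeros(nbins)
    for start, end in zip(starts, ends):
        # overlap of [start, end] with each bin [edges[i], edges[i + 1]];
        # negative values mean no overlap and are clipped to zero
        overlap = np.minimum(end, edges[1:]) - np.maximum(start, edges[:-1])
        totals += np.clip(overlap, 0.0, None)
    return totals
```

Vectorizing this kind of per-bin bookkeeping (rather than looping in pure Python) is the sort of change that can produce the roughly 2x improvement quoted above, though the actual strategy used in the branch is not described in these notes.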
We've made a ton of progress recently in getting v1 of the summary reports completed, but I wanted to create an issue so we don't forget to include general HDF5 support in our figures. I imagine it's not too difficult to extend the different figures to include HDF5 module data. I notice we already have some HDF5 data related to access sizes, but we could probably also include it in the op counts graphs, as well as in the I/O cost graph.
Thinking about it more, the HDF5 module probably requires a bit of special care, since it's really two modules (`H5F` for file access, and `H5D` for dataset access). For the I/O cost and op count graphs, we will probably want to just combine the info from each module into a single "HDF5" field. Similarly, the per-module section of the report should probably have an "HDF5" module section that characterizes both the `H5F` and `H5D` modules.
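As a concrete illustration of the "combine into a single HDF5 field" idea, a naive sketch might just sum the matching counters from the two modules. The dicts below are made-up stand-ins for the real `H5F`/`H5D` records; note that summing can inflate semantically overlapping counters (e.g., file opens plus dataset opens would both land in a combined "Opens" total), which is part of why this needs some care:

```python
def combine_into_hdf5(h5f_counts, h5d_counts):
    """Merge per-module operation counts into one "HDF5" record by summing
    counters that appear in either module (naive approach; see caveat above)."""
    combined = dict(h5f_counts)
    for op, count in h5d_counts.items():
        combined[op] = combined.get(op, 0) + count
    return combined
```

An alternative would be to keep the `H5F` values for file-level counters like opens and take dataset-level counters from `H5D` only, trading one kind of granularity loss for another.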