
Make hpctoolkitv4 reader sparse #544

Open
wants to merge 9 commits into base: next

Conversation

lithomas1 (Contributor)

I made the hpctoolkit reader have the option to output in sparse format, gated behind a keyword argument.

When we get the rest of the methods working with the sparse format, we should flip the default value from False to True.

Then, before the release, we should remove the keyword.
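
For illustration, here is a minimal sketch of how the keyword-gated call might look from the user's side. The sparse_format name comes from the diff below; the exact GraphFrame.from_hpctoolkit signature shown here is an assumption, not the final API:

import hatchet as ht

# Dense output (current default): every node gets a full set of
# (rank, thread) rows, with unmeasured entries padded with zeros.
gf_dense = ht.GraphFrame.from_hpctoolkit("path/to/hpctoolkit-v4-database")

# Sparse output (opt-in for now, per this PR): only rows that actually
# appear in the HPCToolkit v4 database are kept.
gf_sparse = ht.GraphFrame.from_hpctoolkit(
    "path/to/hpctoolkit-v4-database", sparse_format=True
)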

lithomas1 added labels: status: ready for review, area: graphframe, priority: high (Apr 21, 2024)
ocnkr self-requested a review (Apr 22, 2024)
@@ -144,9 +147,10 @@ def from_hpctoolkit(dirname):
        from .readers.hpctoolkit_v4_reader import HPCToolkitV4Reader

        if "experiment.xml" in os.listdir(dirname):
            # TODO: Make old hpctoolkit outputs sparse?
Collaborator

I don't think we should make the old hpctoolkit outputs sparse. We can discuss this with Abhinav.
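
For context, a hedged sketch of a dispatch that keeps old databases dense and passes the keyword only to the v4 reader. The reader class names come from the diff above; the constructor and read() signatures are assumptions:

import os

def from_hpctoolkit(dirname, sparse_format=False):
    from .readers.hpctoolkit_reader import HPCToolkitReader
    from .readers.hpctoolkit_v4_reader import HPCToolkitV4Reader

    if "experiment.xml" in os.listdir(dirname):
        # Old (pre-v4) databases: keep reading them densely for now.
        return HPCToolkitReader(dirname).read()

    # v4 databases: honor the opt-in sparse_format keyword.
    return HPCToolkitV4Reader(dirname, sparse_format=sparse_format).read()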

value=0,
)

if not self.sparse_format:
Collaborator

We can move this if statement to line 1609 because we don't need to use not_visited_nodes in the sparse format.

lithomas1 (Contributor, Author)

Updated.
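
For context, a hedged, standalone sketch of the guard being discussed: in dense mode the reader back-fills zero rows for nodes that were never visited, while in sparse mode that work is skipped entirely. The helper name and its parameters are hypothetical; only not_visited_nodes, the zero fill, and the sparse_format flag come from the diff:

def fill_unvisited_rows(rows, not_visited_nodes, num_ranks, num_threads,
                        metric_names, sparse_format):
    # Sparse mode: keep only the rows that were actually measured.
    if sparse_format:
        return rows

    # Dense mode: every node ends up with num_ranks * num_threads rows,
    # padding unmeasured metrics with a dummy value of 0.
    for node in not_visited_nodes:
        for rank in range(num_ranks):
            for thread in range(num_threads):
                row = {"node": node, "rank": rank, "thread": thread}
                row.update({metric: 0 for metric in metric_names})
                rows.append(row)
    return rows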

@@ -76,6 +80,11 @@ def test_graphframe(data_dir, calc_pi_hpct_db):
        elif col in ("name", "type", "file", "module", "node"):
            assert gf.dataframe[col].dtype == object

    # In the sparse format, check that we are not inserting dummy values
    # into the dataframe.
    if sparse_format:
Collaborator

We know how many rows each node should have in the dense format: number of ranks * number of threads. To test the sparse format, maybe we can check whether some nodes have fewer rows than that? What do you think?

lithomas1 (Contributor, Author)

Good idea, and you actually caught an issue in my test (I was using old, pre-v4 hpctoolkit data :D)
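
Along the lines of that suggestion, a hedged sketch of what such a row-count test could look like. It assumes the dataframe is indexed by node, rank, and thread (as in other Hatchet readers) and uses a hypothetical hpct_v4_db fixture pointing at a real v4 database:

import hatchet as ht

def test_sparse_row_counts(hpct_v4_db):  # hpct_v4_db: hypothetical v4 database fixture
    gf_dense = ht.GraphFrame.from_hpctoolkit(str(hpct_v4_db))
    gf_sparse = ht.GraphFrame.from_hpctoolkit(str(hpct_v4_db), sparse_format=True)

    num_ranks = gf_dense.dataframe.index.get_level_values("rank").nunique()
    num_threads = gf_dense.dataframe.index.get_level_values("thread").nunique()

    # Dense: every node has exactly num_ranks * num_threads rows.
    dense_counts = gf_dense.dataframe.groupby(level="node").size()
    assert (dense_counts == num_ranks * num_threads).all()

    # Sparse: no node has more rows than the dense layout, and at least one
    # node has fewer, i.e. some (rank, thread) rows were genuinely absent.
    sparse_counts = gf_sparse.dataframe.groupby(level="node").size()
    assert (sparse_counts <= num_ranks * num_threads).all()
    assert (sparse_counts < num_ranks * num_threads).any()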
