[Utils][NFC] Clean up update_mc_test_checks.py. #164454

kosarev · 2025-10-21T16:09:43Z

Refine the code a bit to make it easier to comprehend the logic.

kosarev · 2025-10-21T16:10:03Z

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

[Utils][NFC] Clean up update_mc_test_checks.py. #164454 👈 (View in Graphite)
[Utils][update_mc_test_checks] Support updating round-trip tests. #164425 : 1 other dependent PR (#164426 )
[Utils][update_mc_test_checks] Generate check lines in alphabetical order. #164424
main

This stack of pull requests is managed by Graphite. Learn more about stacking.

llvmbot · 2025-10-21T16:10:41Z

@llvm/pr-subscribers-testing-tools

Author: Ivan Kosarev (kosarev)

Changes

Refine the code a bit to make it easier to comprehend the logic.

Full diff: https://github.com/llvm/llvm-project/pull/164454.diff

1 Files Affected:

(modified) llvm/utils/update_mc_test_checks.py (+20-36)

diff --git a/llvm/utils/update_mc_test_checks.py b/llvm/utils/update_mc_test_checks.py
index c830833a11a14..fb2f7e0b78730 100755
--- a/llvm/utils/update_mc_test_checks.py
+++ b/llvm/utils/update_mc_test_checks.py
@@ -223,9 +223,6 @@ def update_test(ti: common.TestInfo):
     testlines = list(dict.fromkeys(testlines))
     common.debug("Valid test line found: ", len(testlines))
 
-    run_list_size = len(run_list)
-    testnum = len(testlines)
-
     raw_output = []
     raw_prefixes = []
     for (
@@ -267,14 +264,12 @@ def update_test(ti: common.TestInfo):
     prefix_set = set([prefix for p in run_list for prefix in p[0]])
     common.debug("Rewriting FileCheck prefixes:", str(prefix_set))
 
-    for test_id in range(testnum):
-        input_line = testlines[test_id]
-
+    for test_id, input_line in enumerate(testlines):
         # a {prefix : output, [runid] } dict
         # insert output to a prefix-key dict, and do a max sorting
         # to select the most-used prefix which share the same output string
         p_dict = {}
-        for run_id in range(run_list_size):
+        for run_id in range(len(run_list)):
             out = raw_output[run_id][test_id]
 
             if hasErr(out):
@@ -282,45 +277,34 @@ def update_test(ti: common.TestInfo):
             else:
                 o = getOutputString(out)
 
-            prefixes = raw_prefixes[run_id]
-
-            for p in prefixes:
+            for p in raw_prefixes[run_id]:
                 if p not in p_dict:
                     p_dict[p] = o, [run_id]
-                else:
-                    if p_dict[p] == (None, []):
-                        continue
+                    continue
 
-                    prev_o, run_ids = p_dict[p]
-                    if o == prev_o:
-                        run_ids.append(run_id)
-                        p_dict[p] = o, run_ids
-                    else:
-                        # conflict, discard
-                        p_dict[p] = None, []
+                if p_dict[p] == (None, []):
+                    continue
 
-        p_dict_sorted = dict(sorted(p_dict.items(), key=lambda item: -len(item[1][1])))
+                prev_o, run_ids = p_dict[p]
+                if o == prev_o:
+                    run_ids.append(run_id)
+                    p_dict[p] = o, run_ids
+                else:
+                    # conflict, discard
+                    p_dict[p] = None, []
 
         # prefix is selected and generated with most shared output lines
         # each run_id can only be used once
-        used_runid = set()
-
+        used_run_ids = set()
         selected_prefixes = set()
-        for prefix, tup in p_dict_sorted.items():
-            o, run_ids = tup
-
-            if len(run_ids) == 0:
-                continue
-
-            skip = False
-            for i in run_ids:
-                if i in used_runid:
-                    skip = True
-                else:
-                    used_runid.add(i)
-            if not skip:
+        get_num_runs = lambda item: len(item[1][1])
+        p_dict_sorted = sorted(p_dict.items(), key=get_num_runs, reverse=True)
+        for prefix, (o, run_ids) in p_dict_sorted:
+            if run_ids and used_run_ids.isdisjoint(run_ids):
                 selected_prefixes.add(prefix)
 
+            used_run_ids.update(run_ids)
+
         # Generate check lines in alphabetical order.
         check_lines = []
         for prefix in sorted(selected_prefixes):

Sisyph · 2025-10-21T17:21:53Z

llvm/utils/update_mc_test_checks.py

+            if run_ids and used_run_ids.isdisjoint(run_ids):
                selected_prefixes.add(prefix)

+            used_run_ids.update(run_ids)


Nit: Don't know whether its more efficient to put used_run_ids.add(run_ids) inside the if or .update outside it. Probably doesn't matter.

I believe that would be a functional change. The original code adds all the ids regardless of whether there are common ones, though I'm not convinced it's perfectly correct. Maybe subject to further work.

Performance-wise, I don't see how this would make any noticeable difference.

Maybe I'm misunderstanding. used_run_ids.update(run_ids) does not add items in the argument to the object if they are already present (no duplicates in used_run_ids).
This seems to achieve the same thing as the following code, which is to not add duplicates.

if i in used_runid: skip = True else: used_runid.add(i)

I don't see how this is relevant.

What I'm saying is, putting the used_run_ids.update(run_ids) under the if run_ids and used_run_ids.isdisjoint(run_ids) as you suggest would be a functional change, because the original code updates used_run_ids with run_ids regardless of whether they were disjoint.

So given we have something like p_dict = {'GFX8': (..., [1, 2, 3]), 'GFX9': (..., [3, 4, 5]), 'GFX10': (..., [4, 5])}, we currently should end up having GFX8 as the only member of selected_prefixes, whereas with the update() under the if it would contain GFX8 and GFX10.

There could be case that the current run_ids list and the used_run_ids is not completely disjoint. In this case we will not generate the prefix check string, but still we want to mark these run_id as used.

This is actually an error case. Noted that one run_id can only generate one output, when duplicated run_id is saw in the prefix iteration, it means there are other run_id shares the same output, but with different prefix.

An example be like:

RUNLINE1 -check-prefix=A RUNLINE2 -check-prefix=A,B RUNLINE3 -check-prefix=B

runline1,2,3 are the same runline and generate the same output. The dictionary be like:

A: output, [RUNLINE1, RUNLINE2] B: output, [RUNLINE2, RUNLINE3]

RUNLINE2 will be a duplicated run_id. If you generate check-string for both A and B prefix, the llvm-lit will fail on RUNLINE2, since it will expects two check-string from RUNLINE2, but one runline only generates one output.

The current code will not generate check-string for B. But I agree we should throw an error here like what the other update_script does

should throw an error here like what the other update_script does

Yep.

I see, thanks for the explanation. 'run_ids list and the used_run_ids is not completely disjoint' This was the confusion I was having. Still LGTM

Refine the code a bit to make it easier to comprehend the logic.

kosarev marked this pull request as ready for review October 21, 2025 16:09

This was referenced Oct 21, 2025

[Utils][update_mc_test_checks] Generate check lines in alphabetical order. #164424

Merged

[AMDGPU][MC][NFC] Fix True16 instructions in the literals test. #164426

Open

kosarev mentioned this pull request Oct 21, 2025

[Utils][update_mc_test_checks] Support updating round-trip tests. #164425

Open

llvmbot added the testing-tools label Oct 21, 2025

kosarev requested review from Sisyph and broxigarchen October 21, 2025 16:11

Sisyph approved these changes Oct 21, 2025

View reviewed changes

broxigarchen approved these changes Oct 21, 2025

View reviewed changes

kosarev force-pushed the users/kosarev/support-updating-roundtrip-tests branch from a649818 to 6f52ae6 Compare October 22, 2025 11:01

kosarev force-pushed the users/kosarev/cleanup-mc-script branch 2 times, most recently from 3959b3b to 42b3eda Compare October 23, 2025 10:23

kosarev force-pushed the users/kosarev/support-updating-roundtrip-tests branch from 6f52ae6 to 08f1cd2 Compare October 23, 2025 10:23

[Utils][NFC] Clean up update_mc_test_checks.py.

548424c

Refine the code a bit to make it easier to comprehend the logic.

kosarev force-pushed the users/kosarev/cleanup-mc-script branch from 42b3eda to 548424c Compare October 24, 2025 13:15

kosarev force-pushed the users/kosarev/support-updating-roundtrip-tests branch from 08f1cd2 to 61a3a85 Compare October 24, 2025 13:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Utils][NFC] Clean up update_mc_test_checks.py. #164454

[Utils][NFC] Clean up update_mc_test_checks.py. #164454

kosarev commented Oct 21, 2025

Uh oh!

kosarev commented Oct 21, 2025

Uh oh!

llvmbot commented Oct 21, 2025

Uh oh!

Sisyph Oct 21, 2025

Uh oh!

kosarev Oct 22, 2025

Uh oh!

Sisyph Oct 22, 2025

Uh oh!

kosarev Oct 22, 2025

Uh oh!

broxigarchen Oct 22, 2025 •

edited

Loading

Uh oh!

kosarev Oct 22, 2025

Uh oh!

Sisyph Oct 22, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

[Utils][NFC] Clean up update_mc_test_checks.py. #164454

Are you sure you want to change the base?

[Utils][NFC] Clean up update_mc_test_checks.py. #164454

Conversation

kosarev commented Oct 21, 2025

Uh oh!

kosarev commented Oct 21, 2025

Uh oh!

llvmbot commented Oct 21, 2025

Uh oh!

Sisyph Oct 21, 2025

Choose a reason for hiding this comment

Uh oh!

kosarev Oct 22, 2025

Choose a reason for hiding this comment

Uh oh!

Sisyph Oct 22, 2025

Choose a reason for hiding this comment

Uh oh!

kosarev Oct 22, 2025

Choose a reason for hiding this comment

Uh oh!

broxigarchen Oct 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kosarev Oct 22, 2025

Choose a reason for hiding this comment

Uh oh!

Sisyph Oct 22, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

broxigarchen Oct 22, 2025 •

edited

Loading