Skip to content

Conversation

joker-eph
Copy link
Collaborator

Python multiprocessing is limited to 60 workers at most:

https://github.com/python/cpython/blob/6bc65c30ff1fd0b581a2c93416496fc720bc442c/Lib/concurrent/futures/process.py#L669-L672

The limit being per thread pool, we can work around it by using multiple pools on windows when we want to actually use more workers.

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements a workaround for Windows multiprocessing limitations that cap worker processes at 60 per pool. Instead of restricting the total number of workers, the solution creates multiple process pools when more than 60 workers are requested on Windows.

Key changes:

  • Removed the hard limit of 60 workers from usable_core_count() function
  • Implemented multi-pool architecture in test execution that distributes workers and tests across multiple pools
  • Added logging to inform users when the workaround is active

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
llvm/utils/lit/lit/util.py Removes Windows-specific worker count limitation from usable_core_count()
llvm/utils/lit/lit/run.py Implements multi-pool workaround for Windows 60-worker limit with test distribution logic
Comments suppressed due to low confidence (1)

llvm/utils/lit/lit/run.py:1

  • The test indexing is incorrect when using multiple pools. The idx from async_results enumeration doesn't correspond to the correct index in self.tests because tests are distributed across pools with different starting indices.
import multiprocessing

@llvmbot
Copy link
Member

llvmbot commented Sep 9, 2025

@llvm/pr-subscribers-testing-tools

Author: Mehdi Amini (joker-eph)

Changes

Python multiprocessing is limited to 60 workers at most:

https://github.com/python/cpython/blob/6bc65c30ff1fd0b581a2c93416496fc720bc442c/Lib/concurrent/futures/process.py#L669-L672

The limit being per thread pool, we can work around it by using multiple pools on windows when we want to actually use more workers.


Full diff: https://github.com/llvm/llvm-project/pull/157759.diff

2 Files Affected:

  • (modified) llvm/utils/lit/lit/run.py (+45-10)
  • (modified) llvm/utils/lit/lit/util.py (-5)
diff --git a/llvm/utils/lit/lit/run.py b/llvm/utils/lit/lit/run.py
index 62070e824e87f..b24a50911eb71 100644
--- a/llvm/utils/lit/lit/run.py
+++ b/llvm/utils/lit/lit/run.py
@@ -72,25 +72,60 @@ def _execute(self, deadline):
             if v is not None
         }
 
-        pool = multiprocessing.Pool(
-            self.workers, lit.worker.initialize, (self.lit_config, semaphores)
+        # Windows has a limit of 60 workers per pool, so we need to use multiple pools
+        # if we have more than 60 workers requested
+        max_workers_per_pool = 60 if os.name == "nt" else self.workers
+        num_pools = max(
+            1, (self.workers + max_workers_per_pool - 1) // max_workers_per_pool
         )
+        workers_per_pool = min(self.workers, max_workers_per_pool)
 
-        async_results = [
-            pool.apply_async(
-                lit.worker.execute, args=[test], callback=self.progress_callback
+        if num_pools > 1:
+            self.lit_config.note(
+                "Using %d pools with %d workers each (Windows worker limit workaround)"
+                % (num_pools, workers_per_pool)
             )
-            for test in self.tests
-        ]
-        pool.close()
+
+        # Create multiple pools
+        pools = []
+        for i in range(num_pools):
+            pool = multiprocessing.Pool(
+                workers_per_pool, lit.worker.initialize, (self.lit_config, semaphores)
+            )
+            pools.append(pool)
+
+        # Distribute tests across pools
+        tests_per_pool = (len(self.tests) + num_pools - 1) // num_pools
+        async_results = []
+        test_to_pool_map = {}
+
+        for pool_idx, pool in enumerate(pools):
+            start_idx = pool_idx * tests_per_pool
+            end_idx = min(start_idx + tests_per_pool, len(self.tests))
+            pool_tests = self.tests[start_idx:end_idx]
+
+            for test in pool_tests:
+                ar = pool.apply_async(
+                    lit.worker.execute, args=[test], callback=self.progress_callback
+                )
+                async_results.append(ar)
+                test_to_pool_map[ar] = pool
+
+        # Close all pools
+        for pool in pools:
+            pool.close()
 
         try:
             self._wait_for(async_results, deadline)
         except:
-            pool.terminate()
+            # Terminate all pools on exception
+            for pool in pools:
+                pool.terminate()
             raise
         finally:
-            pool.join()
+            # Join all pools
+            for pool in pools:
+                pool.join()
 
     def _wait_for(self, async_results, deadline):
         timeout = deadline - time.time()
diff --git a/llvm/utils/lit/lit/util.py b/llvm/utils/lit/lit/util.py
index b03fd8bc22693..b1552385ccc53 100644
--- a/llvm/utils/lit/lit/util.py
+++ b/llvm/utils/lit/lit/util.py
@@ -121,11 +121,6 @@ def usable_core_count():
     except AttributeError:
         n = os.cpu_count() or 1
 
-    # On Windows with more than 60 processes, multiprocessing's call to
-    # _winapi.WaitForMultipleObjects() prints an error and lit hangs.
-    if platform.system() == "Windows":
-        return min(n, 60)
-
     return n
 
 def abs_path_preserve_drive(path):

Python multiprocessing is limited to 60 workers at most:

https://github.com/python/cpython/blob/6bc65c30ff1fd0b581a2c93416496fc720bc442c/Lib/concurrent/futures/process.py#L669-L672

The limit being per thread pool, we can work around it by using
multiple pools on windows when we want to actually use more workers.
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

@joker-eph joker-eph force-pushed the lit_windows branch 2 times, most recently from 2ba9107 to b3e4997 Compare September 16, 2025 20:46
@joker-eph joker-eph requested a review from Copilot September 16, 2025 20:47
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Copy link
Contributor

@boomanaiden154 boomanaiden154 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you able to add unit tests for the logic that figures out the number of pools to create/workers per pool, and the logic that distributes the tests? It looks correct to me after spending some time thinking about it, but I would rather not rely on my thinking being correct.

This adds a bit of complexity, but I think it's very much worth it given high core count systems are becoming increasingly popular. Thanks for working on this.

@joker-eph
Copy link
Collaborator Author

Are you able to add unit tests for the logic that figures out the number of pools to create/workers per pool, and the logic that distributes the tests? It looks correct to me after spending some time thinking about it, but I would rather not rely on my thinking being correct.

Right, I tried it locally by changing the WINDOWS_MAX_WORKERS_PER_POOL value to a smaller number, using various number of workers, and checking the distribution of workers in the pools :)

I added a test!

Copy link
Contributor

@boomanaiden154 boomanaiden154 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for pushing this through.

If you have benchmarking numbers on a high core count Windows machine, it would be nice if you could throw them in this thread.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants