Skip to content

Conversation

paultiq
Copy link
Contributor

@paultiq paultiq commented Sep 19, 2025

This arose from discussion

Edit: Summary

duckdb-python tests "pollute" the global state by: modifying the default duckdb connection, using local file names that aren't randomized, or otherwise break if run in the "wrong" order.

This PR fixes that:

  • using the tmp_path fixture to use unique, test-local databases
  • closing any pre-existing default connection before using it
  • addressing issues with the query interruption test that arose while testing.

This PR also ensures that test dependencies aren't introduced by randomizing their order & parallelizing their running.

PR

While running tests for free-threading, I found some test "contamination": tests that are order-dependent or lack isolation of resources.

* This PR makes no functional changes, only changes to tests / testing.

This PR addresses the individual tests and adds steps to detect in the future:

  • Randomizing their order: to surface any order dependencies
  • Enabling multi-process parallelism to tests, identifying any dependencies and also reducing overall test time
  • Using unique path fixtures or table names to eliminate conflicts across concurrent tests
  • Moved a slow test (10M rows) to tests/slow
  • Reworked test_query_interruption.
  • added a 10 minute timeout for pytests

Added Plugins

pytest-xdist

Disabled by default. This PR adds -n 2 to packaging_wheels to run two parallel tests at a time.

Comments:

  • One concern would be overloading the test runners. I haven't found this to happen: -n auto was surprisingly fine. I believe Runners on public repos are 4 cores.
  • I suggest -n 2 as a first step in case any issues with the more performance intensive tests.
  • When doing performance testing, don't pass -n.
pytest-randomly

Enabled by default. Randomizes the order of tests.

Comments:

  • Random order is good for surfacing test dependencies and assumptions
  • This identified a number of test dependencies, especially around use of a "dirty" default connection or reusing file names

@paultiq paultiq mentioned this pull request Oct 9, 2025
@paultiq paultiq marked this pull request as draft October 12, 2025 17:05
@paultiq
Copy link
Contributor Author

paultiq commented Oct 12, 2025

If interested, I'll reapply these changes to main... there's a lot of merge work to do given the recent changes.

Generally, what's needed is:

  • Make sure default_con is closed before re-using, so tests aren't exposed to prior tests
  • Use tmp_path or tmp_path_factory for any directory paths
  • Fix the test_query_interrupt, which seems broken/incorrect
  • Enable randomly and xdist, which will make sure test dependencies are exposed during regular tests.

@paultiq paultiq closed this Oct 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant