Test (cleanup) improvements #1619
Merged
jmchilton merged 4 commits into galaxyproject:master on Feb 23, 2026
Use a separate SQLite database for the Celery message broker (amqp_internal_connection) to avoid write lock contention between gunicorn and Celery workers during Galaxy startup. Gravity starts gunicorn, a Celery worker, and Celery beat by default. All three processes build Galaxy app instances that access the same SQLite database. With isolation_level=IMMEDIATE, concurrent write transactions cause exclusive lock contention that can deadlock Galaxy startup, particularly when heavier initialization (like custom tool_data_table loading) widens the contention window.
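The contention described above can be reproduced with nothing but the standard library. The following is a minimal sketch, not code from this PR: the helper name and the file names are hypothetical, and it uses explicit BEGIN IMMEDIATE statements to stand in for what isolation_level=IMMEDIATE does at the SQLAlchemy level.

```python
import os
import sqlite3
import tempfile


def can_hold_write_locks_together(path_a, path_b):
    """Return True if two connections can both be inside a write transaction.

    Hypothetical helper for illustration only. timeout=0 makes SQLite fail
    immediately instead of waiting for the other writer to finish.
    """
    # isolation_level=None puts sqlite3 in autocommit mode so that the
    # explicit BEGIN IMMEDIATE below is the only transaction control.
    a = sqlite3.connect(path_a, timeout=0, isolation_level=None)
    b = sqlite3.connect(path_b, timeout=0, isolation_level=None)
    try:
        a.execute("BEGIN IMMEDIATE")  # first writer takes the write lock
        try:
            b.execute("BEGIN IMMEDIATE")  # second writer needs the same lock
        except sqlite3.OperationalError:
            return False  # "database is locked"
        return True
    finally:
        # Closing a connection rolls back its open transaction.
        a.close()
        b.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        main_db = os.path.join(d, "galaxy.sqlite")      # assumed file name
        broker_db = os.path.join(d, "control.sqlite")   # assumed file name
        print(can_hold_write_locks_together(main_db, main_db))
        print(can_hold_write_locks_together(main_db, broker_db))
```

With a shared file the second writer is refused; with one file per role both writers proceed, which is the effect the separate broker database is meant to achieve.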
The sleep() timeout parameter was compared against an iteration counter (count > timeout), but each iteration takes ~1.5s (connect timeout + sleep wait), so timeout=300 actually meant ~450s of wall time. This exceeded the 360s pytest-timeout, preventing the internal timeout handler (which prints Galaxy log contents) from ever running. Use time.time() - start_time instead so timeout=300 means 300 actual seconds, giving _serve's exception handler (with log_contents) time to fire before pytest-timeout kills the test.
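To make the arithmetic concrete: 300 iterations at roughly 1.5s each is about 450s, which is past the 360s pytest-timeout. A rough sketch of the wall-clock variant follows; the function name, the predicate argument, and the injectable clock and sleep parameters are assumptions for illustration, not the signature used in the test suite.

```python
import time


def wait_until(predicate, timeout, interval=1.0, clock=time.time, sleep=time.sleep):
    """Poll predicate() until it returns True or `timeout` seconds have elapsed.

    The limit is checked against elapsed wall time rather than an iteration
    counter, so slow iterations cannot stretch the effective timeout.
    """
    start_time = clock()
    while not predicate():
        if clock() - start_time > timeout:
            return False  # timed out; the caller can now dump logs
        sleep(interval)
    return True
```

Because the check uses elapsed time, the overshoot is bounded by the duration of a single iteration, no matter how long each connection attempt takes.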
The galaxy.yml config was being written with unresolved ${temp_directory}
template variables in property values. These were only resolved in the
GALAXY_CONFIG_OVERRIDE_* environment variables, but Gravity may not
propagate those env vars to gunicorn workers. This caused Galaxy workers
to fail during startup because paths like new_file_path, job_working_directory,
etc. contained literal "${temp_directory}" strings instead of actual paths.
Resolve all template variables in properties before writing them to the
YAML config file, making it self-contained and independent of env var
propagation.
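A minimal sketch of the resolution step, assuming the ${...} placeholders follow string.Template syntax; the function name and the sample values are hypothetical and not taken from the changed code.

```python
from string import Template


def resolve_properties(properties, template_args):
    """Return a copy of `properties` with ${name} placeholders substituted.

    safe_substitute leaves unknown placeholders untouched instead of raising,
    so values that reference other variables survive unchanged.
    """
    resolved = {}
    for key, value in properties.items():
        if isinstance(value, str):
            value = Template(value).safe_substitute(template_args)
        resolved[key] = value
    return resolved


if __name__ == "__main__":
    props = {
        "new_file_path": "${temp_directory}/tmp",
        "job_working_directory": "${temp_directory}/job_working_directory",
    }
    print(resolve_properties(props, {"temp_directory": "/tmp/example"}))
```

Writing the resolved dictionary to the YAML file means the workers read real paths regardless of which environment variables reach them.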
ServeTestCase shares a single galaxy_root across all test methods. When test_serve_multiple_tool_data_tables starts a Galaxy daemon referencing temp .xml.test files, the base CliTestCase.tearDown() deletes those temp files and then sends SIGINT to the gunicorn process. However, SIGINT doesn't stop the gravity supervisor, which respawns workers that crash on the now-deleted files. This pollutes the shared galaxy_root and causes test_serve_workflow to fail when starting Galaxy. Fix by registering a cleanup hook that kills the whole process group.
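One way such a hook could look on POSIX systems, as a sketch only: the helper names are hypothetical, and it assumes the daemon is launched in its own session so that the supervisor and every worker it spawns share one process group.

```python
import os
import signal
import subprocess


def start_in_own_group(cmd):
    """Start cmd as the leader of a new session (and thus a new process group)."""
    return subprocess.Popen(cmd, start_new_session=True)


def kill_process_group(proc, sig=signal.SIGKILL):
    """Signal every process in proc's group, then reap proc itself.

    Because proc is a session leader, its process group id equals its pid.
    """
    try:
        os.killpg(proc.pid, sig)
    except ProcessLookupError:
        pass  # the group is already gone
    proc.wait()
```

In a unittest-style test case this could be registered with self.addCleanup(kill_process_group, proc), so the supervisor cannot respawn workers after the temp files have been removed.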
jmchilton approved these changes on Feb 23, 2026
Not sure this will really fix the timeouts we see for test_serve_multiple_tool_data_tables, but it should at least be correct.