Skip to content

Fix intermittent hot-reload test failures caused by metadata initialization race condition#3302

Merged
aaronburtle merged 9 commits intomainfrom
dev/aaronburtle/hot-reload-test-internalservererror
Mar 24, 2026
Merged

Fix intermittent hot-reload test failures caused by metadata initialization race condition#3302
aaronburtle merged 9 commits intomainfrom
dev/aaronburtle/hot-reload-test-internalservererror

Conversation

@aaronburtle
Copy link
Copy Markdown
Contributor

Why make this change?

Related to #2992

Some flakey tests were being masked by faster initialization when the task they were in was isolated. When we refactored the tests to lower the overall runtime in this PR #3245 these flakey tests were eventually revealed.

The root cause was a race condition: after a successful hot-reload, WaitForConditionAsync detects the "Validated hot-reloaded configuration file" console message and returns, but the engine's metadata providers have not fully re-initialized yet. An immediate HTTP request can arrive before the metadata is ready, causing a 500 error.

What is this change?

Added retry logic to the three non-ignored tests that make HTTP calls expecting success responses immediately after hot-reload:

  • HotReloadConfigRuntimePathsEndToEndTest: REST and GraphQL calls now retry up to 5 times with 1 second delays
  • HotReloadConfigConnectionString: REST call now uses WaitForRestEndpointAsync helper
  • HotReloadConfigDatabaseType: REST call now uses WaitForRestEndpointAsync helper

Added a shared WaitForRestEndpointAsync helper method that polls a REST endpoint until it returns the expected status code (5 retries, 1 second delay).

This follows the same retry pattern already established in HotReloadConfigDataSource, which had this exact fix applied previously.

How was this tested?

Ran a batch of 10 pipeline runs which all succeeded.

image

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Improves reliability of MSSQL hot-reload integration tests by adding post–hot-reload readiness polling/retry logic, addressing a race where config validation completes before metadata providers are fully re-initialized.

Changes:

  • Added REST polling helper (WaitForRestEndpointAsync) and updated tests to wait for REST readiness after hot-reload.
  • Added retry loop for the GraphQL request in HotReloadConfigRuntimePathsEndToEndTest to tolerate transient post–hot-reload failures.
  • Updated connection-string and database-type hot-reload tests to use the shared REST polling helper.

Comment thread src/Service.Tests/Configuration/HotReload/ConfigurationHotReloadTests.cs Outdated
@aaronburtle
Copy link
Copy Markdown
Contributor Author

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 6 pipeline(s).

Copy link
Copy Markdown
Contributor

@souvikghosh04 souvikghosh04 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the fix. approved with suggestions.

Comment thread src/Service.Tests/Configuration/HotReload/ConfigurationHotReloadTests.cs Outdated
@aaronburtle
Copy link
Copy Markdown
Contributor Author

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 6 pipeline(s).

Copy link
Copy Markdown
Collaborator

@Aniruddh25 Aniruddh25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should dispose the last response.

@github-project-automation github-project-automation Bot moved this from In Progress to Review In Progress in Data API builder Mar 20, 2026
@aaronburtle aaronburtle enabled auto-merge (squash) March 23, 2026 21:59
@aaronburtle aaronburtle merged commit 16fdda3 into main Mar 24, 2026
12 checks passed
@aaronburtle aaronburtle deleted the dev/aaronburtle/hot-reload-test-internalservererror branch March 24, 2026 22:22
@github-project-automation github-project-automation Bot moved this from Review In Progress to Done in Data API builder Mar 24, 2026
RubenCerna2079 added a commit that referenced this pull request May 6, 2026
## Why make this change?

Closes #3517

Ports into release/2.0


* Reliable cleanup of MSSQL integration tests by properly disposing the
config file watcher and FileSystemRuntimeConfigLoader, eliminating
intermittent IOException failures on `File.Delete`.
* Faster, more balanced MSSQL integration test pipeline: tests are split
into two parallel Windows jobs (combined and configuration), with a
shared mssql-test-steps.yml template, and DB-independent unit tests are
removed from the MSSQL category so they run in the unit-test pipeline
instead.
* Hot-reload test stability: replaced naive timeout waits with polling
helpers (`WaitForRestEndpointAsync` and retry logic to cover the race
between hot-reload validation completing and metadata providers
finishing re-initialization.
* Autoentity hot-reload test now uses the same polling pattern, removing
the last known race condition in that area.Reliable cleanup of MSSQL
integration tests by properly disposing the config file watcher and
FileSystemRuntimeConfigLoader, eliminating intermittent IOException
failures on `File.Delete`




## What is this change?

Cherry-picked PRs (chronological order on main):

* Fix IOException in MSSQL pipeline by disposing config file watchers
#3243
* Split MSSQL integration tests into parallel pipeline jobs and remove
MSSQL category from generic unit tests #3245
* Fix intermittent hot-reload test failures caused by metadata
initialization race condition #3302
* Fix hot-reload race condition in Autoentities test #3320

## How was this tested?

Existing and newly added unit/integration tests from each source PR.

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Souvik Ghosh <souvikofficial04@gmail.com>
Co-authored-by: RubenCerna2079 <32799214+RubenCerna2079@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants