Skip to content

Improvements to test workflow#1804

Merged
TedHartMS merged 16 commits into
mainfrom
tedhar/test-wf-fixes
May 19, 2026
Merged

Improvements to test workflow#1804
TedHartMS merged 16 commits into
mainfrom
tedhar/test-wf-fixes

Conversation

@TedHartMS
Copy link
Copy Markdown
Contributor

@TedHartMS TedHartMS commented May 14, 2026

This pull request introduces several improvements and fixes to the CI workflows and module loading logic. The main highlights include more efficient module dependency loading, enhanced test result reporting with detailed summaries, and updates to test configurations to support new test categories.

CI/CD Workflow Improvements and Test Reporting:

  • Added a new test-summary job in .github/workflows/ci.yml to generate combined Markdown summaries of test results for each category (standalone, cluster, tsavorite), including statistics, failure lists, duration distributions, and slowest tests. This helps quickly assess test health and performance.
  • Updated artifact naming conventions for test results to be more descriptive (dotnet-standalone-results-*, dotnet-cluster-results-*, etc.), and ensured correct artifact download patterns for summarization. [1] [2] [3]
  • Improved the ci-bdnbenchmark.yml workflow by adding concurrency grouping to avoid overlapping runs, and removed the random sleep step for better determinism. [1] [2]
  • Added the NUnit.DisplayName=FullName option to dotnet test for Tsavorite tests to improve test result clarity.
  • Expanded the Tsavorite nightly test matrix to include a new Tsavorite.test.stress test, and updated mappings and solution files accordingly. [1] [2] [3]
  • Improved the BDN benchmark CI workflow by adding a concurrency group to prevent overlapping runs, and removed a random sleep step that was previously used to stagger inserts. [1] [2]
  • Minor cleanup in the CI workflow steps for better readability and maintainability.
  • Tsavorite CI tests now display the Fully Qualified TestName

Local test run parallization:

  • Add a runsettings file to allow both Garnet.test(.cluster) and Tsavorite.test to run in parallel
  • Modify Garnet.test(.cluster) so each sub-testproject uses its own port for the GarnetServer instance (necessary for parallelism).
  • On my machine all tests now run in under 10 minutes (vs. over an hour).

Module Loading Optimization:

  • Introduced ModuleUtils.LoadModuleDependencies, which loads only those assemblies that a module references (transitively), rather than all DLLs in the module directory. This prevents unnecessary loading of unnecessary modules which can be very slow.
  • Updated the NetworkModuleLoad logic in AdminCommands.cs to use the new dependency loader, improving performance and reliability of module loading. [1] [2]

Minor Fixes and Cleanups:

  • Clarified a comment in ClusterContext.cs about port assignment to avoid confusion over shared state.
  • Minor formatting and path cleanups in workflow files. [1] [2] [3]

Co-authored-by: Vasileios Zois vazois@microsoft.com:

  • Fix TRX parsing: skip malformed files instead of failing CI
  • Wrap XmlDocument.Load() in try/catch so corrupt or truncated TRX files are skipped with a warning rather than crashing the test-summary job and blocking Garnet CI (Complete).
  • Fix distribution bars: show minimum 1 bar for non-zero counts
  • Small bucket counts relative to the max were rounding to 0 bars, making them appear empty. Now non-zero counts always show at least one block.
  • fix naming of result artifacts

These changes collectively improve CI efficiency, test result visibility, and the reliability of module extension loading.This pull request introduces a new Tsavorite.test.stress project to enable stress testing for the Tsavorite storage engine, integrates it into the nightly CI workflow, and makes related improvements to the test infrastructure. The main focus is on adding dedicated stress tests for read cache multi-threading scenarios, ensuring they are built, executed, and reported in CI.

The most important changes are:

Stress Test Project Addition and Integration:

  • Added a new Tsavorite.test.stress project, including its own csproj file with dependencies and project references, and registered it in the solution file (Tsavorite.slnx). This project contains new stress test classes for multi-threaded read cache scenarios. [1] [2] [3]
  • Updated the nightly CI workflow to run the new stress tests by adding Tsavorite.test.stress to the test matrix and directory mapping. These tests are not run in the CI build because they take too long on the CI machines. [1] [2]

Test Infrastructure and Access Control:

  • Updated InternalsVisibleTo attributes in relevant test projects to allow the new stress test project to access internal members required for testing. [1] [2]

Test Implementation Refactoring:

  • Refactored existing multi-threaded read cache test methods in ReadCacheChainTests.cs to extract worker methods, enabling code reuse in the new stress test project. Also, reduced the number of threads tested in the base tests to avoid excessive load during regular runs. [1] [2]

Move high thread count Tsavorite stress tests to a separate sub-project that does not run in the CI build, only nightly, as they take too long on the CI machines
Copilot AI review requested due to automatic review settings May 14, 2026 20:25
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a dedicated stress-test project for Tsavorite read-cache multi-threading scenarios and wires it into nightly CI, while refactoring existing tests to share the core workers.

Changes:

  • Introduced new Tsavorite.test.stress NUnit project and registered it in the Tsavorite solution.
  • Refactored ReadCacheChainTests to expose internal worker methods and reduced thread-count permutations for regular test runs.
  • Updated nightly workflow matrix/mapping to run the stress test project; improved BDN benchmark workflow by removing random sleeps and adding a concurrency group.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
libs/storage/Tsavorite/cs/test/test.stress/Tsavorite.test.stress.csproj Adds new stress test project and references required test/core projects.
libs/storage/Tsavorite/cs/test/test.stress/ReadCacheStressTests.cs Adds stress fixtures that delegate to refactored worker methods.
libs/storage/Tsavorite/cs/test/test.session/Tsavorite.test.session.csproj Grants internals visibility to stress project for worker reuse.
libs/storage/Tsavorite/cs/test/test.session/ReadCacheChainTests.cs Extracts internal worker methods; reduces base test thread permutations.
libs/storage/Tsavorite/cs/test/Tsavorite.test.csproj Grants internals visibility to stress project.
libs/storage/Tsavorite/cs/Tsavorite.slnx Registers the new stress test project.
.github/workflows/nightly.yml Adds stress project to nightly test matrix and directory mapping.
.github/workflows/ci-bdnbenchmark.yml Adds concurrency group; removes random sleep; minor artifact upload cleanup.

Comment thread libs/storage/Tsavorite/cs/test/test.stress/ReadCacheStressTests.cs
Comment thread libs/storage/Tsavorite/cs/test/test.stress/ReadCacheStressTests.cs
Comment thread .github/workflows/ci-bdnbenchmark.yml Outdated
TedHartMS and others added 8 commits May 14, 2026 16:03
… Tsavorite test runs will run all sub-projects in parallel.
Wrap XmlDocument.Load() in try/catch so corrupt or truncated TRX files
are skipped with a warning rather than crashing the test-summary job
and blocking Garnet CI (Complete).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Small bucket counts relative to the max were rounding to 0 bars,
making them appear empty. Now non-zero counts always show at least
one block.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
TedHartMS added 6 commits May 16, 2026 19:48
…be reused

Make Tsavorite CI tests display the Fully Qualified TestName
When loading a module load the transitive closure of dependencies in the module directory rather than loading all modules in the directory (which may be hundreds of files). Many of these dependencies may already be loaded.
…nvertMetadata exception handler was throwing a NullRef that hid the original failure; handle GetLogFileSize exception in CheckpointStore
@TedHartMS TedHartMS merged commit 29fad5a into main May 19, 2026
187 checks passed
@TedHartMS TedHartMS deleted the tedhar/test-wf-fixes branch May 19, 2026 01:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants